Deploy Foundry Local as an Azure Arc extension

This article shows you how to use the Azure CLI to set up Foundry Local as an extension on an Azure Kubernetes Service (AKS) cluster enabled by Azure Arc. Helm is also a supported deployment option, and Helm installation instructions are provided during preview access onboarding.

Important

  • Foundry Local is available in preview. Preview releases provide early access to features that are in active development.
  • Features, approaches, and processes can change or have limited capabilities before general availability (GA).

Prerequisites

Before you begin, make sure you have:

Important

The open-source Ingress-NGINX controller is deprecated as of March 2026. Microsoft currently supports NGINX annotations, and this solution is tested with the AKS managed NGINX ingress controller.

GPU prerequisites

If you plan to run GPU workloads, also make sure:

  • NVIDIA GPU nodes are available in your cluster with CUDA drivers installed on the nodes.
  • The Kubernetes device plugin for NVIDIA is configured so the cluster can schedule GPU workloads.

For more information, see NVIDIA GPU Operator.
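Before you deploy, you can confirm that the cluster actually exposes schedulable GPUs. The following sketch is illustrative: the `report_gpu_nodes` helper is not part of the product, and the `kubectl` jsonpath query shown in the comment is a standard way to list each node's allocatable `nvidia.com/gpu` capacity.

```shell
# Sketch: list nodes that advertise allocatable NVIDIA GPUs.
# The helper reads "node-name gpu-count" lines on stdin; in a live cluster,
# feed it from the standard jsonpath query:
#   kubectl get nodes -o jsonpath='{range .items[*]}{.metadata.name}{" "}{.status.allocatable.nvidia\.com/gpu}{"\n"}{end}'
report_gpu_nodes() {
  found=1
  while read -r node gpus; do
    # Nodes without the NVIDIA device plugin report no nvidia.com/gpu resource.
    if [ -n "$gpus" ] && [ "$gpus" -gt 0 ] 2>/dev/null; then
      echo "$node: $gpus allocatable GPU(s)"
      found=0
    fi
  done
  return $found   # nonzero if no GPU-capable node was found
}
```

If the helper finds no GPU-capable node, the NVIDIA device plugin is likely not installed or the CUDA drivers are missing on the nodes.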

Step 1: Install cert-manager and trust-manager

Foundry Local enabled by Azure Arc requires cert-manager and trust-manager for automated certificate management.

Use the Azure CLI to create the cert-manager extension on your cluster:

az k8s-extension create \
    --cluster-name <your_arc_cluster_name> \
    --name "azure-cert-manager" \
    --resource-group <resource_group_of_the_arc_cluster> \
    --cluster-type connectedClusters \
    --extension-type Microsoft.CertManagement \
    --scope cluster \
    --release-train stable \
    --config config.enableGatewayAPI=true \
    --config cert-manager.crds.keep=true \
    --config trust-manager.defaultPackage.enabled=false \
    --config trust-manager.secretTargets.enabled=true \
    --config trust-manager.secretTargets.authorizedSecretsAll=true
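Extension installation is asynchronous, so you might want to wait for it to complete before moving to the next step. The following is a minimal sketch: `wait_for_state` is an illustrative helper (not an az or kubectl command) that repeatedly runs a command printing the provisioning state, which you can obtain with `az k8s-extension show --query provisioningState -o tsv`.

```shell
# Sketch: poll an extension's provisioningState until it reports "Succeeded".
# wait_for_state is an illustrative helper, not part of the Azure CLI.
wait_for_state() {
  # usage: wait_for_state <max-attempts> <delay-seconds> <command...>
  # <command...> must print the current provisioning state on stdout.
  attempts=$1; delay=$2; shift 2
  i=1
  while [ "$i" -le "$attempts" ]; do
    state=$("$@")
    if [ "$state" = "Succeeded" ]; then
      echo "extension provisioned"
      return 0
    fi
    sleep "$delay"
    i=$((i + 1))
  done
  echo "timed out; last state: $state" >&2
  return 1
}

# Live usage (placeholders as in the command above):
# wait_for_state 30 20 az k8s-extension show \
#     --cluster-name <your_arc_cluster_name> \
#     --resource-group <resource_group_of_the_arc_cluster> \
#     --cluster-type connectedClusters \
#     --name azure-cert-manager \
#     --query provisioningState -o tsv
```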

Step 2: Install the inference operator

Use the Azure CLI to deploy the inference operator extension:

az k8s-extension create \
    --resource-group <resource_group_of_the_arc_cluster> \
    --cluster-name <arc_cluster_name> \
    --name "inference-operator" \
    --extension-type Microsoft.Foundry \
    --scope cluster \
    --release-namespace "foundry-local-operator" \
    --cluster-type connectedClusters \
    --auto-upgrade-minor-version true \
    --release-train stable \
    --config entraAuth.tenantId="<azure_tenant_id>" \
    --config entraAuth.clientId="<the_client_id_of_the_app_registration>"

Additional installation parameters

You can configure the following optional parameters during inference operator installation:

  • entraAuth.enabled: Boolean. When enabled, the Entra Auth SDK sidecar and msi-adapter sidecar are injected into inference pods for JWT validation and ARM RBAC authorization. When disabled, the entraAuth.tenantId and entraAuth.clientId parameters are optional. Default: true. For more information, see Configure authentication for Foundry Local enabled by Azure Arc.
  • watch.namespaces: Array of strings. Configure this parameter if you want the operator to manage resources across multiple namespaces. By default, the operator manages the foundry-local-operator namespace, where models and inference workloads are deployed. Pass the namespaces to the installation command as: --config watch.namespaces[0]="NS1" --config watch.namespaces[1]="NS2". For more information, see Namespace configuration for model deployments.
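Because entraAuth.tenantId and entraAuth.clientId must be valid Microsoft Entra GUIDs, a quick format check before running the installation command can catch copy-paste mistakes early. This is a sketch: `is_guid` is an illustrative helper, not part of the Azure CLI.

```shell
# Sketch: sanity-check that the tenant and client IDs look like GUIDs
# before passing them to az k8s-extension create.
is_guid() {
  printf '%s' "$1" | grep -Eq \
    '^[0-9a-fA-F]{8}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{12}$'
}

# Example gate (hypothetical variable names):
# is_guid "$TENANT_ID" && is_guid "$CLIENT_ID" || {
#   echo "tenantId/clientId must be GUIDs" >&2; exit 1; }
```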

Step 3: Verify the operator

Verify that the inference operator extension is installed and that all pods are running. Use the following commands to check the operator status:

kubectl get pods -n foundry-local-operator
kubectl get crd | grep foundry

Wait until all pods show a Running or Completed status before you proceed.

The following screenshots show an example of the expected output:

Screenshot of terminal output from kubectl get pods command showing five pods in the foundry-local-operator namespace with Running or Completed status.

Screenshot of terminal output from kubectl get crd command showing four Foundry Local custom resource definitions registered in the cluster.
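The wait-until-ready check can also be scripted. The following is a minimal sketch: `all_pods_ready` is an illustrative helper, and the `kubectl` jsonpath query in the comment is a standard way to emit each pod's name and phase.

```shell
# Sketch: succeed only when every pod in the namespace is Running or Completed.
# The helper reads "pod-name phase" lines on stdin; in a live cluster, feed it
# from the standard jsonpath query:
#   kubectl get pods -n foundry-local-operator \
#     -o jsonpath='{range .items[*]}{.metadata.name}{" "}{.status.phase}{"\n"}{end}'
all_pods_ready() {
  ready=0
  while read -r pod phase; do
    case "$phase" in
      Running|Succeeded) ;;   # Completed pods report phase Succeeded
      *) echo "still waiting on $pod ($phase)"; ready=1 ;;
    esac
  done
  return $ready
}
```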

Troubleshoot your deployment

Use the following commands to troubleshoot issues with your deployment.

Check ModelDeployment status and events:

kubectl describe mdep <name>

Check operator logs:

kubectl logs -f deployment/inference-operator -n foundry-local-operator

Check pod status:

kubectl get pods -l app.kubernetes.io/managed-by=inference-operator
kubectl describe pod <pod-name>
kubectl logs <pod-name>

List all resources created by a deployment:

kubectl get deploy,svc,ing -l foundry.azure.com/deployment=<name>

Check the catalog ConfigMap:

kubectl get configmap foundry-local-catalog -n foundry-local-operator -o yaml

Verify a Model CR exists:

kubectl get models
kubectl describe model <name>
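The troubleshooting checks above can be bundled into one helper so you collect everything for a deployment in a single pass. This is a sketch: `foundry_diag` is illustrative, not a product command; it only chains the kubectl commands shown above (using `--tail` instead of `-f` so the output terminates).

```shell
# Sketch: gather diagnostics for one ModelDeployment in a single call.
foundry_diag() {
  name=$1
  ns=${2:-foundry-local-operator}   # assumed default namespace from this article
  kubectl describe mdep "$name" -n "$ns"
  kubectl logs deployment/inference-operator -n "$ns" --tail=50
  kubectl get deploy,svc,ing -l "foundry.azure.com/deployment=$name" -n "$ns"
  kubectl get pods -l app.kubernetes.io/managed-by=inference-operator -n "$ns"
}

# Usage: foundry_diag <name>
```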

Next step