Deploy AI Models on Microsoft Azure

Complete guide to deploying open-source AI models on Azure

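Microsoft Azure offers three main ways to host open-source AI models: GPU virtual machines, Azure Kubernetes Service (AKS), and Azure Machine Learning managed endpoints. This guide covers each option, then cost, monitoring, and security.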

Prerequisites

  • Azure subscription
  • Azure CLI installed
  • Resource group created
  • Basic understanding of Azure services

Deployment Options

1. Azure Virtual Machines

Best for: Direct control and flexibility

Step 1: Create GPU VM

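# The UbuntuLTS image alias has been removed from recent Azure CLI releases; Ubuntu2204 replaces it
az vm create \
  --resource-group ai-models-rg \
  --name ai-model-vm \
  --image Ubuntu2204 \
  --size Standard_NC6s_v3 \
  --admin-username azureuser \
  --generate-ssh-keys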

Step 2: Install NVIDIA Drivers

# SSH into VM
ssh azureuser@<vm-ip>

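# Install the NVIDIA driver, then reboot so the kernel module loads
sudo apt update
sudo apt install -y nvidia-driver-535
sudo reboot

# After reconnecting, confirm the GPU is visible
nvidia-smi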

2. Azure Kubernetes Service (AKS)

Best for: Production-grade deployments

Step 1: Create AKS Cluster

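# --generate-ssh-keys creates a key pair if none exists in ~/.ssh
az aks create \
  --resource-group ai-models-rg \
  --name ai-cluster \
  --node-count 3 \
  --enable-cluster-autoscaler \
  --min-count 1 \
  --max-count 10 \
  --vm-set-type VirtualMachineScaleSets \
  --generate-ssh-keys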

Step 2: Add GPU Node Pool

az aks nodepool add \
  --resource-group ai-models-rg \
  --cluster-name ai-cluster \
  --name gpupool \
  --node-count 1 \
  --node-vm-size Standard_NC6s_v3 \
  --enable-cluster-autoscaler \
  --min-count 0 \
  --max-count 5

3. Azure Machine Learning

Best for: Managed ML workflows

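from azure.ai.ml import MLClient
from azure.ai.ml.entities import ManagedOnlineEndpoint, ManagedOnlineDeployment
from azure.identity import DefaultAzureCredential

# Reads workspace details from a config.json in the working directory or a parent
ml_client = MLClient.from_config(credential=DefaultAzureCredential())

# Create the endpoint and wait for provisioning to finish
endpoint = ManagedOnlineEndpoint(
    name="llama-endpoint",
    description="LLaMA model endpoint",
)
ml_client.online_endpoints.begin_create_or_update(endpoint).result()

# Placeholder: a model already registered in the workspace (name and version are examples).
# Non-MLflow models also need an environment and a scoring script (code_configuration).
model = "azureml:llama-model:1"

# Deploy the model to the endpoint
deployment = ManagedOnlineDeployment(
    name="llama-deployment",
    endpoint_name="llama-endpoint",
    model=model,
    instance_type="Standard_NC6s_v3",
    instance_count=1,
)
ml_client.online_deployments.begin_create_or_update(deployment).result()

# New deployments receive no traffic until it is routed to them
endpoint.traffic = {"llama-deployment": 100}
ml_client.online_endpoints.begin_create_or_update(endpoint).result()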
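Once the deployment is live, clients call the endpoint over HTTPS. The following is a minimal sketch of building such a request with the Python standard library, assuming key-based authentication. The URL, the key, and the payload shape are placeholders: the real scoring URI and key should be retrievable from `ml_client.online_endpoints.get("llama-endpoint").scoring_uri` and `ml_client.online_endpoints.get_keys("llama-endpoint").primary_key`, and the JSON schema is whatever the deployment's scoring script expects.

```python
import json
import urllib.request


def build_scoring_request(scoring_uri: str, api_key: str, payload: dict) -> urllib.request.Request:
    """Build a POST request for an online endpoint that uses key authentication."""
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        scoring_uri,
        data=body,
        headers={
            "Content-Type": "application/json",
            # Endpoint keys are sent as a bearer token
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )


# Placeholder values for illustration only; the payload schema is hypothetical.
request = build_scoring_request(
    "https://example.invalid/score",
    "<api-key>",
    {"prompt": "Hello"},
)

# To send the request against a real endpoint:
# with urllib.request.urlopen(request, timeout=60) as response:
#     print(json.loads(response.read()))
```

Building the request separately from sending it keeps the authentication and serialization logic easy to check without network access.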

Cost Optimization

Spot VMs

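Spot VMs can cost up to 90% less than pay-as-you-go VMs, but Azure can evict them whenever it needs the capacity back, so use them for interruptible workloads such as batch inference or experiments. Setting --max-price to -1 caps the price at the pay-as-you-go rate, so the VM is never evicted for price reasons:

az vm create \
  --resource-group ai-models-rg \
  --name spot-vm \
  --image Ubuntu2204 \
  --size Standard_NC6s_v3 \
  --admin-username azureuser \
  --generate-ssh-keys \
  --priority Spot \
  --max-price -1 \
  --eviction-policy Deallocate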

Reserved Instances

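  • 1-year term: roughly 40% savings compared with pay-as-you-go
  • 3-year term: roughly 60% savings; actual discounts vary by VM series and region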
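To see what these percentages mean for a monthly bill, the sketch below applies the discounts quoted in this section to a single always-on VM. The hourly rate is a hypothetical figure chosen for easy arithmetic, not a quoted Azure price, and 730 hours per month is a common billing approximation.

```python
HOURS_PER_MONTH = 730  # 8,760 hours per year divided by 12


def monthly_cost(hourly_rate: float, discount: float = 0.0,
                 hours: float = HOURS_PER_MONTH) -> float:
    """Return the monthly cost after a fractional discount (0.4 means 40%)."""
    if not 0.0 <= discount <= 1.0:
        raise ValueError("discount must be between 0 and 1")
    return round(hourly_rate * hours * (1.0 - discount), 2)


# Hypothetical pay-as-you-go rate in USD per hour; look up the real rate
# for your VM size and region before budgeting.
EXAMPLE_RATE = 3.00

SCENARIOS = [
    ("pay-as-you-go", 0.00),
    ("1-year reserved", 0.40),
    ("3-year reserved", 0.60),
    ("spot (best case)", 0.90),
]

for label, discount in SCENARIOS:
    print(f"{label:>16}: ${monthly_cost(EXAMPLE_RATE, discount):,.2f} per month")
```

Reservations are billed whether or not the VM runs, so they pay off only for steady workloads; spot pricing suits jobs that tolerate eviction.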

Auto-shutdown

az vm auto-shutdown \
  --resource-group ai-models-rg \
  --name ai-model-vm \
  --time 1900
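The --time value is interpreted as UTC in hhmm format, so 1900 above means 19:00 UTC rather than 19:00 local time. The helper below is a small sketch for converting a local shutdown time into that format; it takes a fixed UTC offset as an assumption, so the offset has to be adjusted by hand when daylight saving time changes.

```python
from datetime import date, datetime, time, timedelta, timezone


def to_utc_hhmm(local_time: time, utc_offset_hours: float) -> str:
    """Convert a local wall-clock time to a UTC 'hhmm' string."""
    local_tz = timezone(timedelta(hours=utc_offset_hours))
    # The calendar date is arbitrary; only the time of day matters here.
    local_dt = datetime.combine(date(2000, 1, 1), local_time, tzinfo=local_tz)
    return local_dt.astimezone(timezone.utc).strftime("%H%M")


# 19:00 in a zone five hours behind UTC is midnight UTC
print(to_utc_hhmm(time(19, 0), -5))
```

The returned string can be passed directly as the --time argument.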

Monitoring

Azure Monitor

az monitor metrics alert create \
  --name high-cpu-alert \
  --resource-group ai-models-rg \
  --scopes /subscriptions/{subscription-id}/resourceGroups/ai-models-rg/providers/Microsoft.Compute/virtualMachines/ai-model-vm \
  --condition "avg Percentage CPU > 80" \
  --window-size 5m
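The --scopes value is the full resource ID of the VM, which is easy to mistype. As an illustration, the sketch below assembles that ID and the rest of the alert command from its parts. The function names are hypothetical helpers and the all-zero subscription ID is a placeholder; only flags that appear in the command above are used.

```python
import shlex


def vm_resource_id(subscription_id: str, resource_group: str, vm_name: str) -> str:
    """Build the Azure resource ID of a virtual machine."""
    return (
        f"/subscriptions/{subscription_id}"
        f"/resourceGroups/{resource_group}"
        f"/providers/Microsoft.Compute/virtualMachines/{vm_name}"
    )


def cpu_alert_command(subscription_id: str, resource_group: str,
                      vm_name: str, threshold: int = 80) -> list[str]:
    """Return the az CLI arguments for a CPU alert on one VM."""
    return [
        "az", "monitor", "metrics", "alert", "create",
        "--name", "high-cpu-alert",
        "--resource-group", resource_group,
        "--scopes", vm_resource_id(subscription_id, resource_group, vm_name),
        "--condition", f"avg Percentage CPU > {threshold}",
        "--window-size", "5m",
    ]


# Placeholder subscription ID; substitute your own.
cmd = cpu_alert_command(
    "00000000-0000-0000-0000-000000000000", "ai-models-rg", "ai-model-vm"
)
print(" ".join(shlex.quote(part) for part in cmd))
```

Returning the command as a list keeps the quoted condition intact if it is later passed to `subprocess.run`.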

Security

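  • Store secrets such as API keys and connection strings in Azure Key Vault
  • Enable Microsoft Entra ID (formerly Azure AD) authentication
  • Restrict inbound traffic with Network Security Groups
  • Use Private Endpoints to keep service traffic off the public internet
  • Enable Microsoft Defender for Cloud (formerly Azure Defender)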

Troubleshooting

GPU Not Available

Check that the VM size includes a GPU (for example, an NC-series size) and that the NVIDIA driver is installed and loaded.
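One way to script this check is to query nvidia-smi and parse its CSV output. The sketch below assumes nvidia-smi is on the PATH when the driver is installed; the helper names are illustrative, and an empty result means either that the tool is missing or that it reported an error.

```python
import shutil
import subprocess

# Query fields supported by nvidia-smi's --query-gpu option
QUERY = [
    "nvidia-smi",
    "--query-gpu=name,driver_version,memory.total",
    "--format=csv,noheader",
]


def parse_gpu_query(output: str) -> list[dict[str, str]]:
    """Parse 'name, driver_version, memory.total' CSV lines into dicts."""
    gpus = []
    for line in output.strip().splitlines():
        fields = [field.strip() for field in line.split(",")]
        if len(fields) != 3:
            continue  # skip lines that do not match the expected layout
        name, driver_version, memory_total = fields
        gpus.append({
            "name": name,
            "driver_version": driver_version,
            "memory_total": memory_total,
        })
    return gpus


def list_gpus() -> list[dict[str, str]]:
    """Return the GPUs visible to the driver, or an empty list if none."""
    if shutil.which("nvidia-smi") is None:
        return []  # driver tools not installed
    result = subprocess.run(QUERY, capture_output=True, text=True, check=False)
    if result.returncode != 0:
        return []  # driver installed but not working, or no GPU attached
    return parse_gpu_query(result.stdout)


if __name__ == "__main__":
    found = list_gpus()
    if found:
        for gpu in found:
            print(f"{gpu['name']} (driver {gpu['driver_version']}, {gpu['memory_total']})")
    else:
        print("No GPU detected: check the VM size and the driver installation.")
```

If the script reports no GPU on a GPU-enabled size, reinstalling the driver and rebooting is usually the next step.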

Deployment Failures

Review the Azure Activity Log for the resource group and the deployment logs in the Azure portal to find the failing operation and its error message.

Production Checklist

  • [ ] Set up Azure Monitor
  • [ ] Configure Application Insights
  • [ ] Enable auto-scaling
  • [ ] Implement Azure Load Balancer
  • [ ] Set up backup with Azure Backup
  • [ ] Configure Virtual Network
  • [ ] Enable encryption
  • [ ] Set up Azure DevOps pipeline