TL;DR
- Azure provides deployment what-if for pre-deployment validation — use it in CI before every apply
- Azurite emulates Blob, Queue, and Table storage locally, which is faster than real Azure for storage-heavy tests
- The #1 mistake: skipping Azure Policy testing until deployment fails in production
Best for: Teams deploying to Azure with Terraform, Bicep, or ARM templates
Skip if: You’re on AWS/GCP only or using Azure PaaS without infrastructure code
Read time: 10 minutes
Your Terraform plan looks clean. Azure deployment starts. Twenty minutes later, it fails: “Azure Policy evaluation failed.” You spend an hour figuring out which policy blocked the deployment, then another hour refactoring to comply. Meanwhile, the team is blocked.
Azure infrastructure testing has unique challenges. Azure Policy enforcement happens at deployment time. Resource naming rules vary by resource type. Eventual consistency in Azure AD propagation causes intermittent failures. Understanding these patterns makes the difference between smooth CI and constant firefighting.
The Real Problem
Azure introduces testing challenges different from AWS:
Azure Policy: Enterprise Azure subscriptions have policies that block non-compliant deployments. You don’t know about violations until terraform apply or az deployment fails.
Resource Provider registration: First-time use of a service in a subscription requires provider registration. Tests fail unexpectedly in clean subscriptions.
Azure AD propagation delays: Service principals, managed identities, and role assignments take time to propagate. Tests that work locally fail in CI.
Naming constraints: Azure resource naming has complex rules — storage accounts must be globally unique, 3-24 lowercase alphanumeric characters. Key Vaults have different rules. VMs have different rules again.
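The storage account rule above is easy to check before a plan ever runs. The following is a minimal sketch, assuming you generate test names in Python; `is_valid_storage_account_name` and `unique_storage_account_name` are hypothetical helper names, and global uniqueness can only be confirmed by Azure itself.

```python
import re
import secrets
import string

# Rule from the text: 3-24 characters, lowercase letters and digits only.
STORAGE_ACCOUNT_RE = re.compile(r"[a-z0-9]{3,24}")


def is_valid_storage_account_name(name: str) -> bool:
    """Return True if the name satisfies the length and character rules."""
    return STORAGE_ACCOUNT_RE.fullmatch(name) is not None


def unique_storage_account_name(prefix: str, suffix_length: int = 8) -> str:
    """Build a lowercase name with a random suffix and validate it locally.

    The random suffix only reduces the chance of a global name collision;
    it does not guarantee the name is free.
    """
    alphabet = string.ascii_lowercase + string.digits
    suffix = "".join(secrets.choice(alphabet) for _ in range(suffix_length))
    name = prefix.lower() + suffix
    if not is_valid_storage_account_name(name):
        raise ValueError(f"invalid storage account name: {name!r}")
    return name
```

Calling `unique_storage_account_name("sttest")` yields a 14-character name such as `sttest` followed by eight random lowercase characters. Other resource types would need their own patterns.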
Deployment What-If
Azure’s what-if operation validates deployments before execution:
# ARM/Bicep what-if
az deployment group what-if \
--resource-group myResourceGroup \
--template-file main.bicep \
--parameters @params.json
# Subscription-level deployment
az deployment sub what-if \
--location eastus \
--template-file main.bicep
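To gate CI on what-if results rather than read them by eye, the output can be parsed. This is a rough sketch, assuming the command is run with `--no-pretty-print` so it emits JSON, and assuming that JSON has a top-level `changes` array whose entries carry `changeType` and `resourceId`; verify the shape against your Azure CLI version. The function names are hypothetical.

```python
import json
from collections import Counter


def summarize_what_if(what_if_json: str) -> Counter:
    """Count planned changes by type (for example Create, Modify, Delete)."""
    result = json.loads(what_if_json)
    return Counter(
        change.get("changeType", "Unknown")
        for change in result.get("changes", [])
    )


def destructive_changes(what_if_json: str, blocked=("Delete",)) -> list:
    """Return resource IDs whose change type is in the blocked set."""
    result = json.loads(what_if_json)
    return [
        change["resourceId"]
        for change in result.get("changes", [])
        if change.get("changeType") in blocked
    ]
```

A CI step could fail the build whenever `destructive_changes` returns a non-empty list, leaving creates and modifications for human review.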
For Terraform, combine plan with Azure-specific validation:
# Generate plan
terraform plan -out=tfplan
# Convert to JSON for analysis
terraform show -json tfplan > tfplan.json
# Re-evaluate Azure Policy for resources already deployed in the group (requires Azure CLI; this does not evaluate the plan itself)
az policy state trigger-scan --resource-group myResourceGroup
# Or use Checkov with Azure rules
checkov -f tfplan.json --framework terraform_plan
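The plan JSON can also be inspected directly for rules your subscription enforces. Below is a small sketch, assuming a policy that requires HTTPS-only storage accounts. It reads `resource_changes` from the `terraform show -json` output; the attribute names `https_traffic_only_enabled` and `enable_https_traffic_only` are assumptions that depend on your AzureRM provider version.

```python
import json

# Assumed attribute names across AzureRM provider versions; check yours.
HTTPS_ATTRS = ("https_traffic_only_enabled", "enable_https_traffic_only")


def load_plan(path: str) -> dict:
    """Load the JSON produced by `terraform show -json tfplan`."""
    with open(path, encoding="utf-8") as handle:
        return json.load(handle)


def storage_accounts_without_https(plan: dict) -> list:
    """Return addresses of storage accounts planned with HTTPS-only disabled."""
    offenders = []
    for resource in plan.get("resource_changes", []):
        if resource.get("type") != "azurerm_storage_account":
            continue
        # "after" is null when the resource is being deleted.
        after = resource.get("change", {}).get("after") or {}
        if any(after.get(attr) is False for attr in HTTPS_ATTRS):
            offenders.append(resource["address"])
    return offenders
```

In CI, `storage_accounts_without_https(load_plan("tfplan.json"))` returning anything other than an empty list would be a reason to stop before apply.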
Terratest for Azure
Terratest has Azure-specific modules:
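package test
import (
"strings"
"testing"
"github.com/gruntwork-io/terratest/modules/azure"
"github.com/gruntwork-io/terratest/modules/random"
"github.com/gruntwork-io/terratest/modules/terraform"
"github.com/stretchr/testify/assert"
)
func TestAzureStorageAccount(t *testing.T) {
t.Parallel()
subscriptionID := azure.GetSubscriptionID()
// random.UniqueId() returns mixed-case characters; storage account names must be lowercase
uniqueID := strings.ToLower(random.UniqueId())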
terraformOptions := &terraform.Options{
TerraformDir: "../modules/storage-account",
Vars: map[string]interface{}{
"resource_group_name": "rg-test-" + uniqueID,
"storage_account_name": "sttest" + uniqueID,
"location": "eastus",
},
}
defer terraform.Destroy(t, terraformOptions)
terraform.InitAndApply(t, terraformOptions)
// Get outputs
resourceGroupName := terraform.Output(t, terraformOptions, "resource_group_name")
storageAccountName := terraform.Output(t, terraformOptions, "storage_account_name")
// Verify storage account exists and has correct properties
exists := azure.StorageAccountExists(t, storageAccountName, resourceGroupName, subscriptionID)
assert.True(t, exists)
// Check storage account properties
storageAccount := azure.GetStorageAccount(t, storageAccountName, resourceGroupName, subscriptionID)
assert.Equal(t, "Standard_LRS", string(storageAccount.Sku.Name))
assert.True(t, *storageAccount.EnableHTTPSTrafficOnly)
}
func TestAzureVirtualNetwork(t *testing.T) {
t.Parallel()
subscriptionID := azure.GetSubscriptionID()
uniqueID := random.UniqueId()
terraformOptions := &terraform.Options{
TerraformDir: "../modules/virtual-network",
Vars: map[string]interface{}{
"resource_group_name": "rg-test-" + uniqueID,
"vnet_name": "vnet-test-" + uniqueID,
"address_space": []string{"10.0.0.0/16"},
"location": "eastus",
},
}
defer terraform.Destroy(t, terraformOptions)
terraform.InitAndApply(t, terraformOptions)
vnetName := terraform.Output(t, terraformOptions, "vnet_name")
resourceGroupName := terraform.Output(t, terraformOptions, "resource_group_name")
// Verify VNet exists
exists := azure.VirtualNetworkExists(t, vnetName, resourceGroupName, subscriptionID)
assert.True(t, exists)
// Check subnets
subnets := azure.GetVirtualNetworkSubnets(t, vnetName, resourceGroupName, subscriptionID)
assert.GreaterOrEqual(t, len(subnets), 1)
}
Azurite for Local Storage Testing
Azurite emulates Azure Storage services locally:
# Install via npm
npm install -g azurite
# Start all services
azurite --silent --location ./azurite-data --debug ./azurite-debug.log
# Or via Docker
docker run -d \
-p 10000:10000 \
-p 10001:10001 \
-p 10002:10002 \
-v azurite-data:/data \
mcr.microsoft.com/azure-storage/azurite
The AzureRM provider cannot target Azurite, so Terraform's role here is limited to launching application tests against the emulator:
provider "azurerm" {
features {}
# For Azurite, override storage endpoints
# Note: Full AzureRM doesn't support Azurite directly
# Use this pattern for app code testing, not full Terraform
}
# For application testing with storage
resource "null_resource" "test_storage" {
provisioner "local-exec" {
command = <<-EOT
export AZURE_STORAGE_CONNECTION_STRING="DefaultEndpointsProtocol=http;AccountName=devstoreaccount1;AccountKey=Eby8vdM02xNOcqFlqUwJPLlmEtlCDXJ1OUzFT50uSRZ6IFsuFq2UVErCz4I6tq/K1SZFPTOtr/KBHBeksoGMGw==;BlobEndpoint=http://127.0.0.1:10000/devstoreaccount1;QueueEndpoint=http://127.0.0.1:10001/devstoreaccount1;TableEndpoint=http://127.0.0.1:10002/devstoreaccount1"
python test_storage_operations.py
EOT
}
}
Python tests with Azurite:
import os
from azure.storage.blob import BlobServiceClient

def test_blob_operations():
    # Azurite connection string
    connection_string = os.environ.get(
        "AZURE_STORAGE_CONNECTION_STRING",
        "DefaultEndpointsProtocol=http;AccountName=devstoreaccount1;AccountKey=Eby8vdM02xNOcqFlqUwJPLlmEtlCDXJ1OUzFT50uSRZ6IFsuFq2UVErCz4I6tq/K1SZFPTOtr/KBHBeksoGMGw==;BlobEndpoint=http://127.0.0.1:10000/devstoreaccount1"
    )
    blob_service = BlobServiceClient.from_connection_string(connection_string)
    # Create container
    container_client = blob_service.create_container("test-container")
    # Upload blob
    blob_client = container_client.get_blob_client("test-blob.txt")
    blob_client.upload_blob("Hello, Azure!", overwrite=True)
    # Download and verify
    downloaded = blob_client.download_blob().readall()
    assert downloaded == b"Hello, Azure!"
    # Cleanup
    container_client.delete_container()
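The long connection string is easy to mistype, and it changes when Azurite runs under a different host name, such as a container service name in CI. One possible way to centralize it is sketched below; `azurite_connection_string` is a hypothetical helper, and the account name and key are the well-known Azurite development defaults already used above.

```python
# Well-known Azurite development credentials (not secrets).
AZURITE_ACCOUNT = "devstoreaccount1"
AZURITE_KEY = (
    "Eby8vdM02xNOcqFlqUwJPLlmEtlCDXJ1OUzFT50uSRZ6IFsuFq2UVErCz4I6tq/"
    "K1SZFPTOtr/KBHBeksoGMGw=="
)


def azurite_connection_string(
    host: str = "127.0.0.1",
    blob_port: int = 10000,
    queue_port: int = 10001,
    table_port: int = 10002,
) -> str:
    """Build a connection string for the Blob, Queue, and Table endpoints."""
    parts = [
        "DefaultEndpointsProtocol=http",
        f"AccountName={AZURITE_ACCOUNT}",
        f"AccountKey={AZURITE_KEY}",
        f"BlobEndpoint=http://{host}:{blob_port}/{AZURITE_ACCOUNT}",
        f"QueueEndpoint=http://{host}:{queue_port}/{AZURITE_ACCOUNT}",
        f"TableEndpoint=http://{host}:{table_port}/{AZURITE_ACCOUNT}",
    ]
    return ";".join(parts)
```

A test could then pass `azurite_connection_string(host="azurite")` to `BlobServiceClient.from_connection_string` when the emulator runs as a sidecar container; the host name `azurite` is only an example.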
Bicep Testing with What-If
For Bicep deployments, integrate what-if into CI:
# azure-pipelines.yml
trigger:
  paths:
    include:
      - infra/**
stages:
  - stage: Validate
    jobs:
      - job: BicepValidation
        pool:
          vmImage: ubuntu-latest
        steps:
          - task: AzureCLI@2
            displayName: 'Bicep Lint'
            inputs:
              azureSubscription: 'MyServiceConnection'
              scriptType: bash
              scriptLocation: inlineScript
              inlineScript: |
                az bicep build --file infra/main.bicep --stdout > /dev/null
          - task: AzureCLI@2
            displayName: 'What-If Analysis'
            inputs:
              azureSubscription: 'MyServiceConnection'
              scriptType: bash
              scriptLocation: inlineScript
              inlineScript: |
                az deployment group what-if \
                  --resource-group $(ResourceGroup) \
                  --template-file infra/main.bicep \
                  --parameters infra/params.$(Environment).json
  - stage: Deploy
    dependsOn: Validate
    condition: and(succeeded(), eq(variables['Build.SourceBranch'], 'refs/heads/main'))
    jobs:
      - deployment: DeployInfra
        environment: production
        strategy:
          runOnce:
            deploy:
              steps:
                - task: AzureCLI@2
                  inputs:
                    azureSubscription: 'MyServiceConnection'
                    scriptType: bash
                    scriptLocation: inlineScript
                    inlineScript: |
                      az deployment group create \
                        --resource-group $(ResourceGroup) \
                        --template-file infra/main.bicep \
                        --parameters infra/params.$(Environment).json
Azure Policy Testing
Test Azure Policy compliance before deployment:
func TestAzurePolicyCompliance(t *testing.T) {
t.Parallel()
subscriptionID := azure.GetSubscriptionID()
terraformOptions := &terraform.Options{
TerraformDir: "../modules/storage-account",
Vars: map[string]interface{}{
"resource_group_name": "rg-policy-test",
// Storage account names must be lowercase; random.UniqueId() is mixed case
"storage_account_name": "stpolicytest" + strings.ToLower(random.UniqueId()),
// Intentionally non-compliant for testing
"enable_https_only": false,
},
}
// Don't auto-destroy - we want to check policy state
terraform.Init(t, terraformOptions)
// Plan should succeed
terraform.Plan(t, terraformOptions)
// But apply should fail due to policy
_, err := terraform.ApplyE(t, terraformOptions)
// Assert that the error is policy-related; Azure reports deny-policy blocks as RequestDisallowedByPolicy
if assert.Error(t, err) {
assert.Contains(t, err.Error(), "RequestDisallowedByPolicy")
}
// Clean up the failed deployment
terraform.Destroy(t, terraformOptions)
}
Query policy compliance programmatically:
# Trigger policy evaluation
az policy state trigger-scan --resource-group myResourceGroup
# Check compliance state
az policy state list \
--resource-group myResourceGroup \
--filter "complianceState eq 'NonCompliant'" \
--query "[].{Resource:resourceId, Policy:policyDefinitionName}"
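For a CI report it often helps to group non-compliant resources by the policy that flagged them. The sketch below assumes the JSON comes from `az policy state list --output json` without the `--query` projection, so each record still carries `complianceState`, `policyDefinitionName`, and `resourceId`; `non_compliant_by_policy` is a hypothetical helper name.

```python
import json
from collections import defaultdict


def non_compliant_by_policy(policy_states_json: str) -> dict:
    """Map each policy definition name to the resource IDs it flags."""
    grouped = defaultdict(list)
    for state in json.loads(policy_states_json):
        if state.get("complianceState") == "NonCompliant":
            grouped[state["policyDefinitionName"]].append(state["resourceId"])
    return dict(grouped)
```

An empty dictionary means nothing in the scanned scope is currently non-compliant, which makes a simple pass/fail condition for a pipeline step.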
CI/CD Integration
GitHub Actions for Azure infrastructure:
name: Azure Infrastructure
on:
  pull_request:
    paths:
      - 'terraform/**'
permissions:
  id-token: write
  contents: read
  pull-requests: write
jobs:
  validate:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Azure Login (OIDC)
        uses: azure/login@v2
        with:
          client-id: ${{ secrets.AZURE_CLIENT_ID }}
          tenant-id: ${{ secrets.AZURE_TENANT_ID }}
          subscription-id: ${{ secrets.AZURE_SUBSCRIPTION_ID }}
      - name: Setup Terraform
        uses: hashicorp/setup-terraform@v3
      - name: Terraform Init
        run: terraform init
        working-directory: terraform
      - name: Terraform Validate
        run: terraform validate
        working-directory: terraform
      - name: Terraform Plan
        id: plan
        run: terraform plan -out=tfplan -no-color
        working-directory: terraform
        continue-on-error: true
      - name: Run Checkov
        uses: bridgecrewio/checkov-action@v12
        with:
          directory: terraform/
          framework: terraform
      - name: Comment PR
        uses: actions/github-script@v7
        with:
          script: |
            const output = `#### Terraform Plan 📖
            \`\`\`
            ${{ steps.plan.outputs.stdout }}
            \`\`\`
            `;
            github.rest.issues.createComment({
              issue_number: context.issue.number,
              owner: context.repo.owner,
              repo: context.repo.repo,
              body: output
            })
  terratest:
    runs-on: ubuntu-latest
    needs: validate
    steps:
      - uses: actions/checkout@v4
      - name: Azure Login
        uses: azure/login@v2
        with:
          client-id: ${{ secrets.AZURE_CLIENT_ID }}
          tenant-id: ${{ secrets.AZURE_TENANT_ID }}
          subscription-id: ${{ secrets.AZURE_SUBSCRIPTION_ID }}
      - name: Setup Go
        uses: actions/setup-go@v5
        with:
          go-version: '1.22'
      - name: Run Terratest
        run: go test -v -timeout 30m ./tests/...
        env:
          ARM_CLIENT_ID: ${{ secrets.AZURE_CLIENT_ID }}
          ARM_TENANT_ID: ${{ secrets.AZURE_TENANT_ID }}
          ARM_SUBSCRIPTION_ID: ${{ secrets.AZURE_SUBSCRIPTION_ID }}
          ARM_USE_OIDC: true
AI-Assisted Approaches
Azure has complex naming rules and policy interactions. AI tools help navigate this.
What AI does well:
- Generating compliant resource names for Azure naming conventions
- Translating Azure Policy definitions into test assertions
- Creating Terratest code from Azure resource specifications
- Explaining Azure-specific error messages and solutions
What still needs humans:
- Understanding organizational Azure Policy requirements
- Designing test architecture for complex Azure Landing Zones
- Deciding which tests need real Azure vs local emulation
- Debugging Azure AD propagation timing issues
Useful prompt:
I have an Azure Terraform module that creates:
- Resource Group
- Storage Account with blob containers
- Key Vault with access policies
- Azure Functions with managed identity
Generate:
1. Terratest code to validate all resources
2. Azure Policy checks I should include
3. Common Azure-specific pitfalls to test for
4. Azurite tests for storage operations
When This Breaks Down
Azure infrastructure testing has limitations:
Azure AD timing: Role assignments and managed identity propagation can take minutes. Tests need retry logic and delays.
Regional differences: Some services aren’t available in all regions. Tests that work in eastus fail in other regions.
Subscription-level resources: Management groups, subscriptions, and some policies require elevated permissions that CI service principals may not have.
Cost of cleanup: Failed Terraform destroys leave orphaned resources. Azure doesn’t have the same cleanup tooling as AWS.
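The retry logic mentioned for Azure AD timing can be as simple as exponential backoff around the check that depends on propagation. This is a minimal sketch, not a tuned implementation: `retry_with_backoff` is a hypothetical helper, and the default delays are illustrative guesses rather than measured propagation times.

```python
import time
from typing import Callable, Tuple, Type, TypeVar

T = TypeVar("T")


def retry_with_backoff(
    operation: Callable[[], T],
    attempts: int = 6,
    initial_delay: float = 5.0,
    factor: float = 2.0,
    max_delay: float = 60.0,
    retry_on: Tuple[Type[BaseException], ...] = (Exception,),
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call operation until it succeeds or attempts are exhausted.

    The wait grows by `factor` after each failure, capped at `max_delay`.
    The final failure is re-raised so the test still fails loudly.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    delay = initial_delay
    for attempt in range(1, attempts + 1):
        try:
            return operation()
        except retry_on:
            if attempt == attempts:
                raise
            sleep(delay)
            delay = min(delay * factor, max_delay)
    raise AssertionError("unreachable")
```

Wrapping only the propagation-sensitive call, for example the first data-plane request made with a freshly assigned role, keeps the rest of the test fast. Narrowing `retry_on` to the authorization error your SDK raises avoids masking unrelated failures.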
Consider complementary approaches:
- Multi-cloud testing for portable patterns
- Azure DevTest Labs for isolated test environments
- Policy as Code for pre-deployment validation
Decision Framework
Use Azurite when:
- Testing application code that uses Azure Storage
- Speed is critical (Azurite runs locally, so there are no network round trips)
- Network isolation required
Use what-if when:
- Validating Bicep/ARM deployments
- Checking Azure Policy compliance
- Pre-deployment change review
Use Terratest with real Azure when:
- Testing complete infrastructure modules
- Validating cross-resource integrations
- Final validation before production
Measuring Success
| Metric | Before | After | How to Track |
|---|---|---|---|
| Azure Policy failures in CI | Frequent | 0 | Deployment logs |
| Test execution time | 20+ min | <10 min | CI metrics |
| Orphaned test resources | Unknown | 0 | Azure Cost Management |
| First-deploy success rate | 60% | 95%+ | Deployment history |
Warning signs it’s not working:
- what-if passes but deploy fails
- Tests flaky due to Azure AD timing
- Growing list of manual cleanup tasks
- Teams bypassing CI for “quick” deployments
What’s Next
Start with validation, then expand to integration:
- Add az deployment what-if to every PR
- Implement Checkov for Azure policy scanning
- Set up Azurite for local storage testing
- Add Terratest for critical infrastructure modules
- Configure cleanup automation for failed tests
- Track Azure-specific test metrics
The goal is catching Azure-specific issues before deployment, not after.
Related articles:
- AWS Infrastructure Testing with LocalStack
- Multi-Cloud Infrastructure Testing
- Policy as Code Testing: OPA vs Sentinel
- Terraform Testing and Validation Strategies
