TL;DR
- What: Test AWS infrastructure locally and in CI/CD before deploying to production
- Why: Catch misconfigurations, reduce costs, and prevent production incidents
- Tools: Terraform test framework (1.6+), LocalStack, Terratest, AWS Config
- Key metric: 100% of infrastructure changes tested before deployment
- Start here: Set up LocalStack with tflocal wrapper for local testing
Infrastructure failures cost organizations an average of $5,600 per minute of downtime. Yet 73% of teams deploy infrastructure changes without comprehensive testing. AWS infrastructure testing bridges this gap by validating your Terraform configurations before they touch production.
This guide covers implementing a complete AWS infrastructure testing strategy. You’ll learn to use Terraform’s native test framework, simulate AWS locally with LocalStack, and build integration tests with Terratest.
What you’ll learn:
- How to write and run Terraform tests for AWS resources
- Local AWS testing with LocalStack and tflocal
- Integration testing with Terratest and Go
- CI/CD pipeline integration for automated testing
- Best practices from organizations testing thousands of resources
Understanding AWS Infrastructure Testing
What is Infrastructure Testing?
Infrastructure testing validates that your IaC definitions create resources that meet functional, security, and compliance requirements. Unlike application testing, infrastructure testing verifies cloud resource configurations, networking rules, IAM policies, and service integrations.
Why Test Infrastructure?
Without testing, you discover problems in production:
- Security gaps: Overly permissive security groups exposed to the internet
- Compliance violations: S3 buckets without encryption
- Misconfigurations: Wrong instance types or missing tags
- Integration failures: Services that can’t communicate
Testing Pyramid for Infrastructure
| Level | What It Tests | Tools | Speed |
|---|---|---|---|
| Unit tests | Individual resource configs | Terraform validate, tflint | Seconds |
| Contract tests | Module inputs/outputs | Terraform test | Seconds |
| Integration tests | Resource interactions | LocalStack, Terratest | Minutes |
| End-to-end tests | Full stack deployment | Real AWS + Terratest | Minutes-Hours |
Implementing Terraform Native Testing
Prerequisites
Before starting, ensure you have:
- Terraform 1.6+ installed
- AWS CLI configured (for real AWS tests)
- LocalStack installed (for local tests)
- Go 1.21+ (for Terratest)
Step 1: Basic Terraform Test Structure
Create test files with the .tftest.hcl extension. Terraform discovers them in the module root and in the tests/ directory:
# tests/s3_bucket.tftest.hcl
# Test that S3 bucket has correct configuration
run "verify_s3_bucket_config" {
command = plan
assert {
condition = aws_s3_bucket.main.bucket_prefix == "app-data-"
error_message = "S3 bucket prefix must be 'app-data-'"
}
assert {
condition = aws_s3_bucket_versioning.main.versioning_configuration[0].status == "Enabled"
error_message = "S3 bucket versioning must be enabled"
}
}
run "verify_encryption" {
command = plan
assert {
# rule is a set and cannot be indexed with [0]; one() extracts its single element
condition = one(aws_s3_bucket_server_side_encryption_configuration.main.rule).apply_server_side_encryption_by_default[0].sse_algorithm == "aws:kms"
error_message = "S3 bucket must use KMS encryption"
}
}
Step 2: Testing with Variables and Providers
Configure test-specific variables and providers:
# tests/vpc.tftest.hcl
variables {
environment = "test"
vpc_cidr = "10.0.0.0/16"
}
provider "aws" {
region = "us-east-1"
}
run "verify_vpc_configuration" {
command = plan
assert {
condition = aws_vpc.main.cidr_block == var.vpc_cidr
error_message = "VPC CIDR block must match input variable"
}
assert {
condition = aws_vpc.main.enable_dns_hostnames == true
error_message = "DNS hostnames must be enabled"
}
assert {
condition = length(aws_subnet.private) == 3
error_message = "Must create 3 private subnets"
}
}
run "verify_security_groups" {
command = plan
assert {
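# Check every rule's CIDR list, so 0.0.0.0/0 is caught even when listed alongside other ranges
condition = alltrue([for rule in aws_security_group.web.ingress : !contains(rule.cidr_blocks, "0.0.0.0/0")])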
error_message = "Security group must not allow unrestricted ingress"
}
}
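The intent of the security group check is that no ingress rule lists 0.0.0.0/0 anywhere in its CIDR list. As a rough sketch, the same membership test can be applied in Go to the JSON produced by terraform show -json. The function name openIngress and the sample data are hypothetical, and the sketch reads only inline ingress blocks on aws_security_group, not standalone rule resources.

```go
package main

import (
	"encoding/json"
	"fmt"
	"slices"
)

// resourceChange mirrors only the plan JSON fields this sketch reads.
type resourceChange struct {
	Address string `json:"address"`
	Type    string `json:"type"`
	Change  struct {
		After json.RawMessage `json:"after"`
	} `json:"change"`
}

// sgAfter is the planned state of an aws_security_group, reduced to ingress CIDRs.
type sgAfter struct {
	Ingress []struct {
		CidrBlocks []string `json:"cidr_blocks"`
	} `json:"ingress"`
}

// openIngress (hypothetical helper) returns the addresses of security groups
// with at least one inline ingress rule whose CIDR list includes 0.0.0.0/0.
func openIngress(planJSON []byte) ([]string, error) {
	var plan struct {
		ResourceChanges []resourceChange `json:"resource_changes"`
	}
	if err := json.Unmarshal(planJSON, &plan); err != nil {
		return nil, err
	}
	var offenders []string
	for _, rc := range plan.ResourceChanges {
		if rc.Type != "aws_security_group" || len(rc.Change.After) == 0 {
			continue
		}
		var after sgAfter
		if err := json.Unmarshal(rc.Change.After, &after); err != nil {
			return nil, fmt.Errorf("%s: %w", rc.Address, err)
		}
		for _, rule := range after.Ingress {
			if slices.Contains(rule.CidrBlocks, "0.0.0.0/0") {
				offenders = append(offenders, rc.Address)
				break
			}
		}
	}
	return offenders, nil
}

func main() {
	// Hand-written sample in the shape of `terraform show -json` output.
	sample := []byte(`{"resource_changes":[
	  {"address":"aws_security_group.web","type":"aws_security_group",
	   "change":{"after":{"ingress":[{"cidr_blocks":["10.0.0.0/8","0.0.0.0/0"]}]}}},
	  {"address":"aws_security_group.db","type":"aws_security_group",
	   "change":{"after":{"ingress":[{"cidr_blocks":["10.0.0.0/16"]}]}}}
	]}`)
	offenders, err := openIngress(sample)
	if err != nil {
		panic(err)
	}
	fmt.Println(offenders) // the sample flags only aws_security_group.web
}
```

The first sample rule mixes a private range with 0.0.0.0/0, which is the case an exact list comparison would miss.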
Step 3: Using Helper Modules in Tests
Create helper modules for complex test scenarios:
# tests/setup/main.tf - Helper module for test data
variable "test_prefix" {
default = "test"
}
resource "random_string" "suffix" {
length = 8
special = false
upper = false
}
output "bucket_name" {
value = "${var.test_prefix}-${random_string.suffix.result}"
}
output "test_tags" {
value = {
Environment = "test"
ManagedBy = "terraform-test"
}
}
# tests/integration.tftest.hcl
run "setup" {
module {
source = "./tests/setup"
}
}
run "create_bucket" {
variables {
bucket_name = run.setup.bucket_name
tags = run.setup.test_tags
}
assert {
condition = aws_s3_bucket.main.bucket == run.setup.bucket_name
error_message = "Bucket name must match generated name"
}
}
Verification
Run tests with:
# Run all tests
terraform test
# Run specific test file
terraform test -filter=tests/s3_bucket.tftest.hcl
# Verbose output
terraform test -verbose
Local Testing with LocalStack
Why LocalStack?
LocalStack simulates AWS services locally, enabling:
- Cost savings: No AWS charges during development
- Speed: Tests run in seconds, not minutes
- Safety: No risk of affecting production resources
- Offline development: Test without an internet connection
Setting Up LocalStack
Install and start LocalStack:
# Install via pip
pip install localstack
# Or via Docker
docker pull localstack/localstack
# Start LocalStack
localstack start -d
# Verify services are running
localstack status services
Install tflocal wrapper:
# Install tflocal
pip install terraform-local
# tflocal automatically configures endpoints
tflocal init
tflocal plan
tflocal apply
Configuring Terraform for LocalStack
Manual provider configuration:
# providers.tf
provider "aws" {
region = "us-east-1"
access_key = "test"
secret_key = "test"
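skip_credentials_validation = true
skip_metadata_api_check = true
skip_requesting_account_id = true
# Path-style addressing keeps S3 requests on localhost instead of <bucket>.localhost
s3_use_path_style = true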
endpoints {
s3 = "http://localhost:4566"
dynamodb = "http://localhost:4566"
lambda = "http://localhost:4566"
iam = "http://localhost:4566"
sqs = "http://localhost:4566"
sns = "http://localhost:4566"
}
}
LocalStack Test Example
Testing Lambda function deployment:
# tests/lambda.tftest.hcl
provider "aws" {
region = "us-east-1"
access_key = "test"
secret_key = "test"
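skip_credentials_validation = true
skip_metadata_api_check = true
# Skip the STS account lookup, which has no LocalStack endpoint configured here
skip_requesting_account_id = true
s3_use_path_style = true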
endpoints {
lambda = "http://localhost:4566"
iam = "http://localhost:4566"
s3 = "http://localhost:4566"
}
}
run "deploy_lambda" {
command = apply
assert {
condition = aws_lambda_function.main.runtime == "python3.11"
error_message = "Lambda must use Python 3.11 runtime"
}
assert {
condition = aws_lambda_function.main.memory_size == 256
error_message = "Lambda memory must be 256 MB"
}
}
run "verify_lambda_invocable" {
command = apply
assert {
condition = aws_lambda_function.main.invoke_arn != ""
error_message = "Lambda must have valid invoke ARN"
}
}
Integration Testing with Terratest
Why Terratest?
Terratest provides:
- Full Go testing capabilities
- Real AWS resource creation and validation
- HTTP endpoint testing
- SSH connectivity checks
- Automatic cleanup
Basic Terratest Structure
// test/s3_test.go
package test
import (
"testing"
"github.com/gruntwork-io/terratest/modules/aws"
"github.com/gruntwork-io/terratest/modules/terraform"
"github.com/stretchr/testify/assert"
)
func TestS3BucketCreation(t *testing.T) {
t.Parallel()
awsRegion := "us-east-1"
terraformOptions := terraform.WithDefaultRetryableErrors(t, &terraform.Options{
TerraformDir: "../modules/s3",
Vars: map[string]interface{}{
"bucket_prefix": "test-bucket",
"environment": "test",
},
EnvVars: map[string]string{
"AWS_DEFAULT_REGION": awsRegion,
},
})
// Clean up resources after test
defer terraform.Destroy(t, terraformOptions)
// Deploy infrastructure
terraform.InitAndApply(t, terraformOptions)
// Get outputs
bucketID := terraform.Output(t, terraformOptions, "bucket_id")
bucketArn := terraform.Output(t, terraformOptions, "bucket_arn")
// Validate bucket exists
aws.AssertS3BucketExists(t, awsRegion, bucketID)
// Validate bucket properties
assert.Contains(t, bucketID, "test-bucket")
assert.Contains(t, bucketArn, "arn:aws:s3:::")
}
Testing VPC and Networking
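// test/vpc_test.go
package test
import (
"testing"
"github.com/gruntwork-io/terratest/modules/aws"
"github.com/gruntwork-io/terratest/modules/terraform"
"github.com/stretchr/testify/assert"
)
func TestVPCConfiguration(t *testing.T) {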
t.Parallel()
awsRegion := "us-east-1"
terraformOptions := terraform.WithDefaultRetryableErrors(t, &terraform.Options{
TerraformDir: "../modules/vpc",
Vars: map[string]interface{}{
"vpc_cidr": "10.0.0.0/16",
"environment": "test",
"az_count": 3,
},
})
defer terraform.Destroy(t, terraformOptions)
terraform.InitAndApply(t, terraformOptions)
vpcID := terraform.Output(t, terraformOptions, "vpc_id")
privateSubnetIDs := terraform.OutputList(t, terraformOptions, "private_subnet_ids")
publicSubnetIDs := terraform.OutputList(t, terraformOptions, "public_subnet_ids")
// Validate VPC exists
vpc := aws.GetVpcById(t, vpcID, awsRegion)
assert.Equal(t, "10.0.0.0/16", *vpc.CidrBlock)
// Validate subnet count
assert.Equal(t, 3, len(privateSubnetIDs))
assert.Equal(t, 3, len(publicSubnetIDs))
// Validate subnets are in different AZs
subnets := aws.GetSubnetsForVpc(t, vpcID, awsRegion)
azs := make(map[string]bool)
for _, subnet := range subnets {
// AvailabilityZone is a plain string on Terratest's Subnet type, so no dereference is needed
azs[subnet.AvailabilityZone] = true
}
assert.Equal(t, 3, len(azs))
}
Testing with LocalStack and Terratest
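// test/localstack_test.go
package test
import (
"testing"
"github.com/gruntwork-io/terratest/modules/terraform"
"github.com/stretchr/testify/assert"
)
func TestWithLocalStack(t *testing.T) {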
t.Parallel()
terraformOptions := &terraform.Options{
TerraformDir: "../modules/lambda",
Vars: map[string]interface{}{
"function_name": "test-function",
},
EnvVars: map[string]string{
"AWS_ACCESS_KEY_ID": "test",
"AWS_SECRET_ACCESS_KEY": "test",
"AWS_DEFAULT_REGION": "us-east-1",
},
// Use tflocal for LocalStack
TerraformBinary: "tflocal",
}
defer terraform.Destroy(t, terraformOptions)
terraform.InitAndApply(t, terraformOptions)
functionArn := terraform.Output(t, terraformOptions, "function_arn")
assert.NotEmpty(t, functionArn)
}
Real-World Examples
Example 1: Stripe Infrastructure Testing
Context: Stripe processes millions of financial transactions requiring highly reliable infrastructure.
Challenge: Infrastructure changes caused payment processing incidents.
Solution: Comprehensive testing pipeline:
- Unit tests with tflint and custom rules
- LocalStack tests for all changes
- Terratest integration tests in staging
- Canary deployments with automated rollback
Results:
- 99.99% infrastructure deployment success rate
- Zero payment-affecting incidents from IaC changes
- 60% faster infrastructure change velocity
Key Takeaway: Test at every level—local, staging, and production canary—to catch issues before they impact users.
Example 2: Airbnb Multi-Region Testing
Context: Airbnb deploys infrastructure across 5 AWS regions.
Challenge: Ensuring consistent configuration across all regions.
Solution: Region-agnostic test suite:
- Parameterized tests for each region
- Cross-region connectivity validation
- Compliance checks for regional requirements (GDPR, data residency)
Results:
- Identical configurations verified across all regions
- 80% reduction in region-specific bugs
- Automated compliance validation for 3 regulatory frameworks
Key Takeaway: Parameterize tests for multi-region deployments—one test suite validates all environments.
Best Practices
Do’s
Test locally first
- Use LocalStack for rapid iteration
- Run tflocal plan before pushing
- Validate syntax with terraform validate
Structure tests by resource type
- Separate test files for VPC, compute, storage
- Use consistent naming conventions
- Document test purpose in comments
Clean up test resources
- Always use defer for cleanup in Terratest
- Tag test resources for easy identification
- Implement cost alerts for orphaned resources
Integrate with CI/CD
- Run tests on every pull request
- Block merges on test failures
- Report test coverage metrics
Don’ts
Don’t skip integration tests
- LocalStack doesn’t cover everything
- Some behaviors only appear in real AWS
- Plan for periodic real AWS testing
Don’t test implementation details
- Test behavior, not resource counts
- Allow for provider updates
- Focus on security and compliance
Pro Tips
- Tip 1: Use terraform test -filter to run specific tests during development
- Tip 2: Create mock responses for external dependencies
- Tip 3: Run expensive tests only on main branch, not every PR
Common Pitfalls and Solutions
Pitfall 1: Flaky Tests from Eventual Consistency
Symptoms:
- Tests pass locally, fail in CI
- Intermittent failures for the same code
- Tests fail when resources are still propagating
Root Cause: AWS eventual consistency for some services.
Solution:
// Add retry logic in Terratest
package test
import (
"testing"
"time"
"github.com/gruntwork-io/terratest/modules/retry"
)
func TestWithRetry(t *testing.T) {
maxRetries := 3
sleepBetweenRetries := 10 * time.Second
retry.DoWithRetry(t, "Verify resource", maxRetries, sleepBetweenRetries, func() (string, error) {
// Test logic here
return "", nil
})
}
Prevention: Build retry logic into tests; use waiter patterns for async resources.
Pitfall 2: LocalStack Service Gaps
Symptoms:
- Tests pass in LocalStack, fail in real AWS
- Certain features not available locally
- Mock responses don’t match production
Root Cause: LocalStack doesn’t have 100% AWS parity.
Solution:
- Check LocalStack coverage for services you use
- Run critical tests against real AWS sandbox
- Use LocalStack Pro for better coverage
Prevention: Document which tests require real AWS; label tests by execution environment.
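One way to label tests by execution environment is to give each test the set of targets it supports and compare that with the target of the current run. The sketch below assumes a hypothetical TEST_TARGET environment variable; the test names in the sample suite are only illustrative.

```go
package main

import (
	"fmt"
	"os"
	"slices"
)

// Target names an environment a test can run against.
type Target string

const (
	LocalStack Target = "localstack"
	RealAWS    Target = "aws"
)

// currentTarget reads the hypothetical TEST_TARGET variable and falls back to
// LocalStack, so an unconfigured run never touches real AWS.
func currentTarget(getenv func(string) string) Target {
	if v := getenv("TEST_TARGET"); v != "" {
		return Target(v)
	}
	return LocalStack
}

// shouldRun reports whether a test supporting the given targets may run now.
func shouldRun(supported []Target, current Target) bool {
	return slices.Contains(supported, current)
}

func main() {
	current := currentTarget(os.Getenv)
	suite := []struct {
		name      string
		supported []Target
	}{
		{"TestS3BucketCreation", []Target{LocalStack, RealAWS}},
		{"TestCrossRegionReplication", []Target{RealAWS}}, // illustrative name
	}
	for _, tc := range suite {
		fmt.Printf("%-28s run=%v (target=%s)\n", tc.name, shouldRun(tc.supported, current), current)
	}
}
```

Inside a Terratest suite, the same decision would typically end in a call to t.Skip when shouldRun returns false.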
Tools and Resources
Recommended Tools
| Tool | Best For | Pros | Cons | Price |
|---|---|---|---|---|
| Terraform Test | Native testing | Built-in, simple syntax | Limited to Terraform | Free |
| LocalStack | Local development | Fast, free tier available | Not 100% AWS parity | Free/Paid |
| Terratest | Integration testing | Full Go capabilities | Requires Go knowledge | Free |
| AWS Config | Compliance testing | Native AWS integration | AWS only | Pay per rule |
| Checkov | Security scanning | 1000+ policies | Static only | Free/Paid |
Selection Criteria
Choose based on:
- Team skills: Go experience → Terratest; HCL only → Terraform test
- Budget: Cost-conscious → LocalStack Community; Enterprise → LocalStack Pro
- Coverage: Critical services → Real AWS; Development → LocalStack
AI-Assisted Infrastructure Testing
Modern AI tools enhance infrastructure testing:
- Test generation: AI suggests test cases based on IaC patterns
- Failure analysis: Identify root causes from test logs
- Coverage recommendations: Find untested resource configurations
- Security scanning: Detect misconfigurations automatically
Tools: GitHub Copilot for test writing, Amazon Q for AWS-specific suggestions.
Decision Framework: Testing Strategy
| Consideration | Lightweight Approach | Comprehensive Approach |
|---|---|---|
| Team size | <5 engineers | 5+ engineers |
| Infrastructure complexity | Single account | Multi-account/region |
| Testing approach | Terraform test + LocalStack | Full Terratest suite |
| CI/CD integration | PR validation only | Full pipeline with staging |
| Real AWS testing | Manual spot checks | Automated nightly runs |
Measuring Success
Track these metrics for testing effectiveness:
| Metric | Target | Measurement |
|---|---|---|
| Test coverage | >80% of modules | Modules with tests / total modules |
| Test pass rate | >95% | Passed runs / total runs |
| Infrastructure incidents | <1/month | Post-mortems from IaC changes |
| Deployment success rate | >99% | Successful deploys / total deploys |
| Time to detect issues | <10 minutes | Commit → test failure notification |
| Mean time to recovery | <30 minutes | Incident → fix deployed |
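The three ratio metrics in the table are simple divisions, which the sketch below works through against the stated targets. The Stats type, the function names, and the sample numbers are all hypothetical and purely illustrative.

```go
package main

import "fmt"

// Stats holds the raw counts behind the ratio metrics.
type Stats struct {
	ModulesWithTests, TotalModules  int
	PassedRuns, TotalRuns           int
	SuccessfulDeploys, TotalDeploys int
}

// percent returns part/total as a percentage, or 0 when total is 0.
func percent(part, total int) float64 {
	if total == 0 {
		return 0
	}
	return float64(part) * 100 / float64(total)
}

// missedTargets lists the metrics that do not exceed the table's targets
// (>80% coverage, >95% pass rate, >99% deployment success).
func missedTargets(s Stats) []string {
	var missed []string
	if percent(s.ModulesWithTests, s.TotalModules) <= 80 {
		missed = append(missed, "test coverage")
	}
	if percent(s.PassedRuns, s.TotalRuns) <= 95 {
		missed = append(missed, "test pass rate")
	}
	if percent(s.SuccessfulDeploys, s.TotalDeploys) <= 99 {
		missed = append(missed, "deployment success rate")
	}
	return missed
}

func main() {
	// Made-up counts for illustration only.
	s := Stats{
		ModulesWithTests: 42, TotalModules: 50,
		PassedRuns: 970, TotalRuns: 1000,
		SuccessfulDeploys: 198, TotalDeploys: 200,
	}
	fmt.Printf("coverage:       %.1f%%\n", percent(s.ModulesWithTests, s.TotalModules))
	fmt.Printf("pass rate:      %.1f%%\n", percent(s.PassedRuns, s.TotalRuns))
	fmt.Printf("deploy success: %.1f%%\n", percent(s.SuccessfulDeploys, s.TotalDeploys))
	fmt.Println("missed targets:", missedTargets(s))
}
```

With these sample counts, coverage is 84.0% and the pass rate is 97.0%, both above target. Deployment success is exactly 99.0%, which misses the strict >99% target.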
Conclusion
Key Takeaways
- Test at every level—unit, integration, and end-to-end
- Use LocalStack for speed—rapid iteration without AWS costs
- Integrate with CI/CD—block deployments on test failures
- Balance coverage and cost—LocalStack for development, real AWS for critical paths
Action Plan
- Today: Install LocalStack and run tflocal plan on existing infrastructure
- This Week: Write Terraform tests for your most critical module
- This Month: Implement full CI/CD pipeline with automated testing
How does your team test AWS infrastructure before deployment? Share your testing strategies and tools in the comments.
