Infrastructure as Code Automation with Terraform
Automate infrastructure provisioning using GitOps workflows, version control integration, and continuous deployment pipelines.
Overview
Terraform enables full infrastructure automation through code versioning, CI/CD integration, and automated deployments. Infrastructure changes are tracked like application code, enabling code reviews, rollbacks, and audit trails.
GitOps Workflow with Cloud Build
YAML - cloudbuild.yaml
steps:
  # Step 1: Initialize and validate the Terraform configuration
  # Authentication uses the Cloud Build service account through
  # Application Default Credentials, so no key file is mounted.
  - name: 'hashicorp/terraform'
    entrypoint: 'sh'
    args:
      - '-c'
      - |
        echo "Validating Terraform configuration..."
        terraform init -input=false
        terraform validate
  # Step 2: Plan infrastructure changes (/workspace persists between steps)
  - name: 'hashicorp/terraform'
    entrypoint: 'sh'
    args:
      - '-c'
      - |
        echo "Planning Terraform changes..."
        terraform plan -input=false -out=tfplan
  # Step 3: Save plan for manual review (on non-main branches)
  - name: 'gcr.io/cloud-builders/gsutil'
    entrypoint: 'bash'
    args:
      - '-c'
      - |
        if [ "$BRANCH_NAME" != "main" ]; then
          gsutil cp tfplan "gs://my-bucket/tfplan-$BUILD_ID"
          echo "Plan saved for review at: gs://my-bucket/tfplan-$BUILD_ID"
        fi
  # Step 4: Apply changes (only on main branch)
  - name: 'hashicorp/terraform'
    entrypoint: 'sh'
    args:
      - '-c'
      - |
        if [ "$BRANCH_NAME" = "main" ]; then
          echo "Applying Terraform changes to production..."
          terraform apply -input=false -auto-approve tfplan
        else
          echo "Branch is not main. Apply skipped pending review."
        fi
timeout: '3600s'
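The branch gate in steps 3 and 4 can also be expressed outside Cloud Build. The following Python sketch is illustrative only: `build_commands` is a hypothetical helper, not part of Terraform or Cloud Build, and it assumes `main` is the only branch allowed to apply.

```python
def build_commands(branch: str, plan_file: str = "tfplan") -> list[list[str]]:
    """Return the Terraform commands a pipeline run would execute for a branch.

    Hypothetical helper: every branch initializes, validates, and plans,
    but only the main branch applies the saved plan.
    """
    commands = [
        ["terraform", "init", "-input=false"],
        ["terraform", "validate"],
        ["terraform", "plan", "-input=false", f"-out={plan_file}"],
    ]
    # Only the main branch applies; other branches stop after the plan.
    if branch == "main":
        commands.append(["terraform", "apply", "-input=false", plan_file])
    return commands
```

Keeping the gate in one function makes the rule easy to unit test before it is wired into a real pipeline.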
Remote State Management with Cloud Storage
HCL - backend.tf
terraform {
  backend "gcs" {
    bucket = "my-company-terraform-state"
    prefix = "prod"
    # Backend blocks cannot reference resources, so the KMS key is given as a
    # literal resource name. PROJECT_ID is a placeholder for your project.
    kms_encryption_key = "projects/PROJECT_ID/locations/us/keyRings/terraform-state/cryptoKeys/terraform-state-key"
  }
}

# KMS key for state encryption
resource "google_kms_key_ring" "terraform" {
  name     = "terraform-state"
  location = "us"
}

resource "google_kms_crypto_key" "terraform_state" {
  name            = "terraform-state-key"
  key_ring        = google_kms_key_ring.terraform.id
  rotation_period = "7776000s" # 90 days
}

# GCS bucket for state
resource "google_storage_bucket" "terraform_state" {
  name          = "my-company-terraform-state"
  location      = "US"
  force_destroy = false

  versioning {
    enabled = true
  }

  uniform_bucket_level_access = true

  encryption {
    default_kms_key_name = google_kms_crypto_key.terraform_state.id
  }

  # Delete old state versions once 10 newer versions exist
  lifecycle_rule {
    condition {
      num_newer_versions = 10
    }
    action {
      type = "Delete"
    }
  }
}
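To make the effect of `num_newer_versions = 10` concrete, here is a small Python simulation. It is a sketch of the documented condition (a version matches once at least N newer versions exist, counting the live one), not a call to the Cloud Storage API; `versions_to_delete` and the integer generation numbers are hypothetical.

```python
def versions_to_delete(generations: list[int], num_newer_versions: int) -> list[int]:
    """Return the generations that match the num_newer_versions condition.

    Assumes a higher generation number means a newer version and that the
    newest generation is the live object.
    """
    # Newest first, so each index equals the count of newer versions.
    ordered = sorted(generations, reverse=True)
    return [gen for idx, gen in enumerate(ordered) if idx >= num_newer_versions]


# Twelve state versions with a threshold of 10: the two oldest are removed,
# leaving the live version plus nine noncurrent versions.
print(versions_to_delete(list(range(1, 13)), 10))
```

Under this reading, the rule keeps at most ten copies of each state object in total, which bounds storage growth while leaving recent history available for rollback.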
HCP Terraform Integration
HCL - cloud configuration
terraform {
  cloud {
    organization = "my-company"

    workspaces {
      tags = ["automated", "gcp"]
    }
  }

  required_providers {
    google = {
      source  = "hashicorp/google"
      version = "~> 5.0"
    }
  }
}

# HCP Terraform automatically manages:
# - State versioning and locking
# - Team access control
# - Cost estimation
# - Policy enforcement
# - Run notifications
Sentinel Policy Enforcement
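Sentinel - Cost Limit Policy
import "tfplan/v2" as tfplan

# Maximum monthly cost threshold (USD)
max_cost = 5000

# Rough monthly cost per resource instance (USD)
# Compute estimate: $0.24/hour * 730 hours/month, rounded to $175
compute_cost = 175
# Cloud SQL estimate: $30/month per instance
sql_cost = 30

# Calculate estimated monthly cost.
# Each entry in resource_changes is a single resource instance,
# so count and for_each are already expanded.
estimated_cost = 0
for tfplan.resource_changes as _, rc {
	# Skip data sources and resources that are only being deleted
	if rc.mode is "managed" and rc.change.actions is not ["delete"] {
		if rc.type is "google_compute_instance" {
			estimated_cost += compute_cost
		}
		if rc.type is "google_sql_database_instance" {
			estimated_cost += sql_cost
		}
	}
}

main = rule {
	estimated_cost < max_cost
}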
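The same arithmetic can be checked locally before a policy is uploaded. The Python sketch below applies the per-instance estimates from the policy above to the `resource_changes` list produced by `terraform show -json`. The function names are hypothetical, the prices are the rough figures quoted above rather than real billing data, and skipping pure deletions is an assumption of this sketch.

```python
# Rough monthly prices in USD, taken from the policy comments above.
MONTHLY_COST_USD = {
    "google_compute_instance": 175,      # about $0.24/hour * 730 hours
    "google_sql_database_instance": 30,
}
MAX_COST_USD = 5000


def estimate_monthly_cost(plan: dict) -> int:
    """Sum the rough monthly cost of every resource instance in a plan."""
    total = 0
    for rc in plan.get("resource_changes", []):
        actions = rc.get("change", {}).get("actions", [])
        if actions == ["delete"]:
            continue  # Assumption: resources being removed cost nothing
        total += MONTHLY_COST_USD.get(rc.get("type"), 0)
    return total


def policy_passes(plan: dict, max_cost: int = MAX_COST_USD) -> bool:
    """Mirror the rule: the estimate must stay strictly below the limit."""
    return estimate_monthly_cost(plan) < max_cost
```

With these figures, 28 compute instances (4,900) pass and 29 (5,075) fail, which is a quick way to sanity-check the threshold.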
Automated Drift Detection
Python - drift-monitor.py
#!/usr/bin/env python3
import json
import subprocess
from datetime import datetime, timezone

import google.cloud.logging


def check_terraform_drift():
    """Detect infrastructure drift from current state"""
    logging_client = google.cloud.logging.Client()
    logger = logging_client.logger("terraform-drift-detection")

    # Run terraform plan to detect drift.
    # With -json, Terraform prints one JSON object per line, not one document.
    result = subprocess.run(
        ["terraform", "plan", "-json", "-input=false"],
        capture_output=True,
        text=True,
    )
    if result.returncode != 0:
        raise RuntimeError(f"terraform plan exited with code {result.returncode}")

    drift_detected = {
        "resource_changes": [],
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "status": "clean",
    }

    for line in result.stdout.splitlines():
        try:
            event = json.loads(line)
        except json.JSONDecodeError:
            continue  # Skip any line that is not a JSON event
        if event.get("type") == "resource_drift":
            change = event.get("change", {})
            drift_detected["status"] = "drift_detected"
            drift_detected["resource_changes"].append({
                "resource": change.get("resource", {}).get("addr"),
                "change": change.get("action"),
                "reason": "Resource changed outside of Terraform",
            })

    # Log the result; use WARNING only when drift was found
    drifted = drift_detected["status"] == "drift_detected"
    logger.log_struct(drift_detected, severity="WARNING" if drifted else "INFO")

    # Alert if drift detected
    if drifted:
        alert_team(drift_detected)
    return drift_detected


def alert_team(drift_info):
    """Send alert to ops team"""
    # Implementation: Send Slack/PagerDuty notification
    print(f"ALERT: Drift detected in {len(drift_info['resource_changes'])} resources")


if __name__ == "__main__":
    check_terraform_drift()
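A coarser signal is also available without parsing any output: `terraform plan -detailed-exitcode` exits with 0 when the plan is empty, 1 on error, and 2 when changes are present. The sketch below wraps that convention; `classify_exit_code` and `plan_status` are hypothetical names. Note that exit code 2 covers any difference between configuration and real infrastructure, including configuration changes that have not been applied yet, so it is broader than drift alone.

```python
import subprocess


def classify_exit_code(code: int) -> str:
    """Map a `terraform plan -detailed-exitcode` exit status to a label."""
    if code == 0:
        return "clean"    # Plan succeeded with an empty diff
    if code == 2:
        return "changes"  # Plan succeeded with a non-empty diff
    return "error"        # 1, or anything unexpected: the plan itself failed


def plan_status(workdir: str = ".") -> str:
    """Run the plan in workdir and return 'clean', 'changes', or 'error'."""
    result = subprocess.run(
        ["terraform", "plan", "-detailed-exitcode", "-input=false", "-no-color"],
        cwd=workdir,
        capture_output=True,
        text=True,
    )
    return classify_exit_code(result.returncode)
```

A scheduled job could call `plan_status()` first and run the more detailed JSON parsing only when the result is `"changes"`.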
Deployment Workflow
Bash - deployment-pipeline.sh
#!/bin/bash
set -euo pipefail

ENV=${1:-dev}
echo "Deploying to $ENV environment..."

# 1. Initialize and validate
echo "Step 1: Validating Terraform configuration..."
terraform init -input=false
terraform validate

# 2. Format check
echo "Step 2: Checking code formatting..."
terraform fmt -check

# 3. Plan
echo "Step 3: Planning infrastructure changes..."
terraform plan -input=false -var-file="$ENV.tfvars" -out="tfplan-$ENV"

# 4. Security scan (tfsec exits non-zero on findings, which stops the script)
echo "Step 4: Scanning for security issues..."
tfsec . --format json > tfsec-report.json

# 5. Plan summary (terraform show lists changes, not prices; cost figures
#    need a separate tool or HCP Terraform cost estimation)
echo "Step 5: Summarizing planned changes..."
terraform show -no-color "tfplan-$ENV" | grep -E "resource|Plan:" || true

# 6. Apply (requires manual approval for prod)
if [ "$ENV" = "prod" ]; then
  read -r -p "Review changes above. Apply to PRODUCTION? (yes/no): "
  if [ "$REPLY" = "yes" ]; then
    terraform apply "tfplan-$ENV"
  else
    echo "Deployment cancelled"
    exit 1
  fi
else
  terraform apply "tfplan-$ENV"
fi

echo "Deployment to $ENV completed successfully"
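Step 5 filters the human-readable `terraform show` output with grep, which is fragile. A structured alternative is to read the JSON form of the plan (`terraform show -json tfplan-dev > plan.json`) and count actions per resource. The following Python sketch assumes that JSON has already been loaded into a dict; `summarize_actions` is a hypothetical helper.

```python
from collections import Counter


def summarize_actions(plan: dict) -> dict:
    """Count planned actions across all resource instances in a plan."""
    counts = Counter()
    for rc in plan.get("resource_changes", []):
        actions = rc.get("change", {}).get("actions", [])
        if "create" in actions and "delete" in actions:
            # A replacement is reported as a delete plus a create.
            counts["replace"] += 1
        else:
            for action in actions:
                counts[action] += 1
    return dict(counts)
```

A reviewer-facing script could print this summary next to the approval prompt, or refuse to continue when the `delete` or `replace` count is unexpectedly high.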
Best Practices
- Store all Terraform code in Git with proper version control
- Require code review before any change is applied
- Use separate branches for different environments (dev, staging, prod)
- Enable state file versioning and encryption
- Implement state file locking to prevent concurrent modifications
- Use Sentinel policies to enforce organizational compliance
- Monitor drift between desired and actual infrastructure state
- Run cost estimation before applying changes
- Maintain audit logs of all infrastructure changes
- Automate checks with tfsec, terraform validate, and terraform fmt -check