Infrastructure Automation with Terraform: Enterprise-Scale Cloud Resource Management
Infrastructure as Code (IaC) has become essential for modern cloud operations, enabling teams to manage infrastructure with the same rigor and practices used for application code. Terraform, as a leading IaC tool, provides a declarative approach to infrastructure management across multiple cloud providers. This comprehensive guide explores enterprise-scale Terraform implementations, advanced patterns, and best practices for production environments.
Terraform Enterprise Architecture
Project Structure and Organization
# Directory structure for enterprise Terraform projects
terraform-infrastructure/
├── environments/
│ ├── dev/
│ │ ├── main.tf
│ │ ├── variables.tf
│ │ ├── outputs.tf
│ │ └── terraform.tfvars
│ ├── staging/
│ └── production/
├── modules/
│ ├── vpc/
│ ├── eks/
│ ├── rds/
│ ├── security-groups/
│ └── iam/
├── shared/
│ ├── backend.tf
│ ├── providers.tf
│ └── versions.tf
├── policies/
│ ├── sentinel/
│ └── opa/
└── scripts/
├── deploy.sh
├── plan.sh
└── destroy.sh
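Each environment directory is a thin composition root: it wires the shared modules together with environment-specific values while the modules carry the logic. A minimal sketch of what environments/dev/main.tf might contain (the module inputs and the vpc_id/private_subnet_ids outputs are assumptions that must match the module definitions shown later):
# environments/dev/main.tf -- illustrative composition of the shared modules
module "vpc" {
  source = "../../modules/vpc"

  name                  = "dev"
  vpc_cidr              = "10.0.0.0/16"
  public_subnet_count   = 3
  private_subnet_count  = 3
  database_subnet_count = 3
  enable_nat_gateway    = true
  single_nat_gateway    = true # one shared NAT gateway keeps dev costs down

  tags = {
    Environment = "dev"
    Project     = "platform"
  }
}

module "eks" {
  source = "../../modules/eks"

  cluster_name = "platform"
  environment  = "dev"
  vpc_id       = module.vpc.vpc_id
  subnet_ids   = module.vpc.private_subnet_ids

  node_groups = {
    general = {
      instance_types = ["t3.large"]
    }
  }

  tags = {
    Environment = "dev"
    Project     = "platform"
  }
}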
Backend Configuration with State Management
# shared/backend.tf
terraform {
required_version = ">= 1.5.0"
backend "s3" {
bucket = "terraform-state-company-prod"
key = "infrastructure/terraform.tfstate"
region = "us-west-2"
encrypt = true
dynamodb_table = "terraform-state-lock"
# Workspace-specific state files
workspace_key_prefix = "workspaces"
# Additional security
kms_key_id = "arn:aws:kms:us-west-2:123456789012:key/12345678-1234-1234-1234-123456789012"
# Note: bucket versioning and server-side encryption are properties of
# the S3 bucket itself, not arguments of the backend block; they are
# configured on the aws_s3_bucket resource shown below
}
}
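The backend block cannot create the bucket it points at, so the state bucket is provisioned separately, typically in a one-time bootstrap configuration. A sketch of that bucket with versioning and KMS encryption enabled (the bucket name matches the backend above; the KMS key is the one defined later in this file):
# State bucket (bootstrap configuration)
resource "aws_s3_bucket" "terraform_state" {
  bucket = "terraform-state-company-prod"

  lifecycle {
    prevent_destroy = true # guard against accidental state loss
  }

  tags = {
    Name        = "terraform-state-company-prod"
    Environment = "shared"
    Purpose     = "terraform-state-storage"
  }
}

# State file versioning
resource "aws_s3_bucket_versioning" "terraform_state" {
  bucket = aws_s3_bucket.terraform_state.id

  versioning_configuration {
    status = "Enabled"
  }
}

# Server-side encryption with the state KMS key
resource "aws_s3_bucket_server_side_encryption_configuration" "terraform_state" {
  bucket = aws_s3_bucket.terraform_state.id

  rule {
    apply_server_side_encryption_by_default {
      kms_master_key_id = aws_kms_key.terraform_state.arn
      sse_algorithm     = "aws:kms"
    }
  }
}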
# State locking with DynamoDB
resource "aws_dynamodb_table" "terraform_state_lock" {
name = "terraform-state-lock"
billing_mode = "PAY_PER_REQUEST"
hash_key = "LockID"
attribute {
name = "LockID"
type = "S"
}
server_side_encryption {
enabled = true
kms_key_arn = aws_kms_key.terraform_state.arn
}
point_in_time_recovery {
enabled = true
}
tags = {
Name = "terraform-state-lock"
Environment = "shared"
Purpose = "terraform-state-locking"
}
}
# KMS key for state encryption
resource "aws_kms_key" "terraform_state" {
description = "KMS key for Terraform state encryption"
deletion_window_in_days = 7
enable_key_rotation = true
policy = jsonencode({
Version = "2012-10-17"
Statement = [
{
Sid = "Enable IAM User Permissions"
Effect = "Allow"
Principal = {
AWS = "arn:aws:iam::123456789012:root"
}
Action = "kms:*"
Resource = "*"
},
{
Sid = "Allow Terraform Service Role"
Effect = "Allow"
Principal = {
AWS = [
"arn:aws:iam::123456789012:role/TerraformExecutionRole",
"arn:aws:iam::123456789012:role/TerraformPlanRole"
]
}
Action = [
"kms:Encrypt",
"kms:Decrypt",
"kms:ReEncrypt*",
"kms:GenerateDataKey*",
"kms:DescribeKey"
]
Resource = "*"
}
]
})
tags = {
Name = "terraform-state-key"
Environment = "shared"
Purpose = "terraform-state-encryption"
}
}
resource "aws_kms_alias" "terraform_state" {
name = "alias/terraform-state"
target_key_id = aws_kms_key.terraform_state.key_id
}
Provider Configuration
# shared/providers.tf
terraform {
required_providers {
aws = {
source = "hashicorp/aws"
version = "~> 5.0"
}
azurerm = {
source = "hashicorp/azurerm"
version = "~> 3.0"
}
google = {
source = "hashicorp/google"
version = "~> 4.0"
}
kubernetes = {
source = "hashicorp/kubernetes"
version = "~> 2.0"
}
helm = {
source = "hashicorp/helm"
version = "~> 2.0"
}
random = {
source = "hashicorp/random"
version = "~> 3.0"
}
tls = {
source = "hashicorp/tls"
version = "~> 4.0"
}
}
}
# AWS Provider Configuration
provider "aws" {
region = var.aws_region
# Assume role for cross-account access
assume_role {
role_arn = var.aws_assume_role_arn
session_name = "terraform-${var.environment}"
external_id = var.aws_external_id
}
# Default tags for all resources
default_tags {
tags = {
Environment = var.environment
Project = var.project_name
ManagedBy = "terraform"
Owner = var.team_name
CostCenter = var.cost_center
# A CreatedDate tag via timestamp() is omitted deliberately: it changes
# on every run and would cause perpetual tag drift on all resources
}
}
# Retry configuration
retry_mode = "adaptive"
max_retries = 3
}
# Azure Provider Configuration
provider "azurerm" {
features {
key_vault {
purge_soft_delete_on_destroy = true
recover_soft_deleted_key_vaults = true
}
resource_group {
prevent_deletion_if_contains_resources = false
}
virtual_machine {
delete_os_disk_on_deletion = true
graceful_shutdown = false
skip_shutdown_and_force_delete = false
}
}
# Service Principal authentication
client_id = var.azure_client_id
client_secret = var.azure_client_secret
tenant_id = var.azure_tenant_id
subscription_id = var.azure_subscription_id
# Skip provider registration
skip_provider_registration = true
}
# Google Cloud Provider Configuration
provider "google" {
project = var.gcp_project_id
region = var.gcp_region
zone = var.gcp_zone
# Service account key
credentials = var.gcp_credentials_file
# Request timeout
request_timeout = "60s"
# Batching configuration
batching {
send_after = "10s"
enable_batching = true
}
}
# Kubernetes Provider Configuration
provider "kubernetes" {
host = data.aws_eks_cluster.cluster.endpoint
cluster_ca_certificate = base64decode(data.aws_eks_cluster.cluster.certificate_authority[0].data)
token = data.aws_eks_cluster_auth.cluster.token
# Alternative: using kubeconfig
# config_path = "~/.kube/config"
# config_context = "production-cluster"
}
# Helm Provider Configuration
provider "helm" {
kubernetes {
host = data.aws_eks_cluster.cluster.endpoint
cluster_ca_certificate = base64decode(data.aws_eks_cluster.cluster.certificate_authority[0].data)
token = data.aws_eks_cluster_auth.cluster.token
}
# Helm repository cache
repository_cache = "/tmp/.helmcache"
repository_config_path = "/tmp/.helmrc"
# Debug mode
debug = var.helm_debug
}
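The provider blocks above reference several input variables and two EKS data sources that are assumed to be declared alongside them. A condensed sketch of those declarations (types and defaults are illustrative; in practice the EKS data sources can only resolve once the cluster exists, so they often live in a configuration applied after the cluster):
# shared/variables.tf (excerpt) -- declarations assumed by the providers above
variable "aws_region" {
  type    = string
  default = "us-west-2"
}

variable "aws_assume_role_arn" {
  type = string
}

variable "aws_external_id" {
  type      = string
  sensitive = true
}

variable "environment" {
  type = string
}

variable "eks_cluster_name" {
  type = string
}

# Data sources assumed by the kubernetes and helm providers
data "aws_eks_cluster" "cluster" {
  name = var.eks_cluster_name
}

data "aws_eks_cluster_auth" "cluster" {
  name = var.eks_cluster_name
}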
Advanced Terraform Modules
VPC Module with Advanced Networking
# modules/vpc/main.tf
locals {
# Calculate subnet CIDRs automatically
public_subnet_cidrs = [
for i in range(var.public_subnet_count) :
cidrsubnet(var.vpc_cidr, 8, i)
]
private_subnet_cidrs = [
for i in range(var.private_subnet_count) :
cidrsubnet(var.vpc_cidr, 8, i + var.public_subnet_count)
]
database_subnet_cidrs = [
for i in range(var.database_subnet_count) :
cidrsubnet(var.vpc_cidr, 8, i + var.public_subnet_count + var.private_subnet_count)
]
# Availability zones
azs = slice(data.aws_availability_zones.available.names, 0, max(
var.public_subnet_count,
var.private_subnet_count,
var.database_subnet_count
))
}
# Data sources
data "aws_availability_zones" "available" {
state = "available"
filter {
name = "opt-in-status"
values = ["opt-in-not-required"]
}
}
data "aws_region" "current" {}
# VPC
resource "aws_vpc" "main" {
cidr_block = var.vpc_cidr
enable_dns_hostnames = var.enable_dns_hostnames
enable_dns_support = var.enable_dns_support
# IPv6 support
assign_generated_ipv6_cidr_block = var.enable_ipv6
tags = merge(var.tags, {
Name = "${var.name}-vpc"
Type = "vpc"
})
}
# Internet Gateway
resource "aws_internet_gateway" "main" {
count = var.create_igw ? 1 : 0
vpc_id = aws_vpc.main.id
tags = merge(var.tags, {
Name = "${var.name}-igw"
Type = "internet-gateway"
})
}
# Egress-only Internet Gateway for IPv6
resource "aws_egress_only_internet_gateway" "main" {
count = var.enable_ipv6 && var.create_egress_only_igw ? 1 : 0
vpc_id = aws_vpc.main.id
tags = merge(var.tags, {
Name = "${var.name}-eigw"
Type = "egress-only-internet-gateway"
})
}
# Public Subnets
resource "aws_subnet" "public" {
count = var.public_subnet_count
vpc_id = aws_vpc.main.id
cidr_block = local.public_subnet_cidrs[count.index]
availability_zone = local.azs[count.index]
map_public_ip_on_launch = var.map_public_ip_on_launch
# IPv6 support
ipv6_cidr_block = var.enable_ipv6 ? cidrsubnet(aws_vpc.main.ipv6_cidr_block, 8, count.index) : null
assign_ipv6_address_on_creation = var.enable_ipv6 ? var.assign_ipv6_address_on_creation : false
tags = merge(var.tags, {
Name = "${var.name}-public-${local.azs[count.index]}"
Type = "public"
Tier = "public"
"kubernetes.io/role/elb" = "1"
})
}
# Private Subnets
resource "aws_subnet" "private" {
count = var.private_subnet_count
vpc_id = aws_vpc.main.id
cidr_block = local.private_subnet_cidrs[count.index]
availability_zone = local.azs[count.index]
# IPv6 support
ipv6_cidr_block = var.enable_ipv6 ? cidrsubnet(aws_vpc.main.ipv6_cidr_block, 8, count.index + var.public_subnet_count) : null
assign_ipv6_address_on_creation = var.enable_ipv6 ? var.assign_ipv6_address_on_creation : false
tags = merge(var.tags, {
Name = "${var.name}-private-${local.azs[count.index]}"
Type = "private"
Tier = "private"
"kubernetes.io/role/internal-elb" = "1"
})
}
# Database Subnets
resource "aws_subnet" "database" {
count = var.database_subnet_count
vpc_id = aws_vpc.main.id
cidr_block = local.database_subnet_cidrs[count.index]
availability_zone = local.azs[count.index]
tags = merge(var.tags, {
Name = "${var.name}-database-${local.azs[count.index]}"
Type = "database"
Tier = "database"
})
}
# Elastic IPs for NAT Gateways
resource "aws_eip" "nat" {
count = var.enable_nat_gateway ? (var.single_nat_gateway ? 1 : var.private_subnet_count) : 0
domain = "vpc"
depends_on = [aws_internet_gateway.main]
tags = merge(var.tags, {
Name = "${var.name}-nat-eip-${count.index + 1}"
Type = "nat-gateway-eip"
})
}
# NAT Gateways
resource "aws_nat_gateway" "main" {
count = var.enable_nat_gateway ? (var.single_nat_gateway ? 1 : var.private_subnet_count) : 0
allocation_id = aws_eip.nat[count.index].id
subnet_id = aws_subnet.public[var.single_nat_gateway ? 0 : count.index].id
depends_on = [aws_internet_gateway.main]
tags = merge(var.tags, {
Name = "${var.name}-nat-gateway-${count.index + 1}"
Type = "nat-gateway"
})
}
# Route Tables
resource "aws_route_table" "public" {
count = var.public_subnet_count > 0 ? 1 : 0
vpc_id = aws_vpc.main.id
tags = merge(var.tags, {
Name = "${var.name}-public-rt"
Type = "public-route-table"
})
}
resource "aws_route_table" "private" {
count = var.private_subnet_count
vpc_id = aws_vpc.main.id
tags = merge(var.tags, {
Name = "${var.name}-private-rt-${count.index + 1}"
Type = "private-route-table"
})
}
resource "aws_route_table" "database" {
count = var.database_subnet_count > 0 && var.create_database_subnet_route_table ? 1 : 0
vpc_id = aws_vpc.main.id
tags = merge(var.tags, {
Name = "${var.name}-database-rt"
Type = "database-route-table"
})
}
# Routes
resource "aws_route" "public_internet_gateway" {
count = var.create_igw && var.public_subnet_count > 0 ? 1 : 0
route_table_id = aws_route_table.public[0].id
destination_cidr_block = "0.0.0.0/0"
gateway_id = aws_internet_gateway.main[0].id
timeouts {
create = "5m"
}
}
resource "aws_route" "public_internet_gateway_ipv6" {
count = var.create_igw && var.enable_ipv6 && var.public_subnet_count > 0 ? 1 : 0
route_table_id = aws_route_table.public[0].id
destination_ipv6_cidr_block = "::/0"
gateway_id = aws_internet_gateway.main[0].id
timeouts {
create = "5m"
}
}
resource "aws_route" "private_nat_gateway" {
count = var.enable_nat_gateway ? var.private_subnet_count : 0
route_table_id = aws_route_table.private[count.index].id
destination_cidr_block = "0.0.0.0/0"
nat_gateway_id = aws_nat_gateway.main[var.single_nat_gateway ? 0 : count.index].id
timeouts {
create = "5m"
}
}
resource "aws_route" "private_ipv6_egress" {
count = var.enable_ipv6 && var.create_egress_only_igw ? var.private_subnet_count : 0
route_table_id = aws_route_table.private[count.index].id
destination_ipv6_cidr_block = "::/0"
egress_only_gateway_id = aws_egress_only_internet_gateway.main[0].id
timeouts {
create = "5m"
}
}
# Route Table Associations
resource "aws_route_table_association" "public" {
count = var.public_subnet_count
subnet_id = aws_subnet.public[count.index].id
route_table_id = aws_route_table.public[0].id
}
resource "aws_route_table_association" "private" {
count = var.private_subnet_count
subnet_id = aws_subnet.private[count.index].id
route_table_id = aws_route_table.private[count.index].id
}
resource "aws_route_table_association" "database" {
count = var.database_subnet_count > 0 && var.create_database_subnet_route_table ? var.database_subnet_count : 0
subnet_id = aws_subnet.database[count.index].id
route_table_id = aws_route_table.database[0].id
}
# Database Subnet Group
resource "aws_db_subnet_group" "database" {
count = var.database_subnet_count > 0 && var.create_database_subnet_group ? 1 : 0
name = "${var.name}-database-subnet-group"
subnet_ids = aws_subnet.database[*].id
tags = merge(var.tags, {
Name = "${var.name}-database-subnet-group"
Type = "database-subnet-group"
})
}
# VPC Flow Logs
resource "aws_flow_log" "vpc" {
count = var.enable_flow_log ? 1 : 0
iam_role_arn = aws_iam_role.flow_log[0].arn
log_destination = aws_cloudwatch_log_group.vpc_flow_log[0].arn
traffic_type = var.flow_log_traffic_type
vpc_id = aws_vpc.main.id
tags = merge(var.tags, {
Name = "${var.name}-vpc-flow-log"
Type = "vpc-flow-log"
})
}
resource "aws_cloudwatch_log_group" "vpc_flow_log" {
count = var.enable_flow_log ? 1 : 0
name = "/aws/vpc/flow-logs/${var.name}"
retention_in_days = var.flow_log_retention_in_days
kms_key_id = var.flow_log_kms_key_id
tags = merge(var.tags, {
Name = "${var.name}-vpc-flow-log-group"
Type = "cloudwatch-log-group"
})
}
resource "aws_iam_role" "flow_log" {
count = var.enable_flow_log ? 1 : 0
name = "${var.name}-vpc-flow-log-role"
assume_role_policy = jsonencode({
Version = "2012-10-17"
Statement = [
{
Action = "sts:AssumeRole"
Effect = "Allow"
Principal = {
Service = "vpc-flow-logs.amazonaws.com"
}
}
]
})
tags = merge(var.tags, {
Name = "${var.name}-vpc-flow-log-role"
Type = "iam-role"
})
}
resource "aws_iam_role_policy" "flow_log" {
count = var.enable_flow_log ? 1 : 0
name = "${var.name}-vpc-flow-log-policy"
role = aws_iam_role.flow_log[0].id
policy = jsonencode({
Version = "2012-10-17"
Statement = [
{
Action = [
"logs:CreateLogGroup",
"logs:CreateLogStream",
"logs:PutLogEvents",
"logs:DescribeLogGroups",
"logs:DescribeLogStreams"
]
Effect = "Allow"
Resource = "*"
}
]
})
}
# VPC Endpoints
resource "aws_vpc_endpoint" "s3" {
count = var.enable_s3_endpoint ? 1 : 0
vpc_id = aws_vpc.main.id
service_name = "com.amazonaws.${data.aws_region.current.name}.s3"
tags = merge(var.tags, {
Name = "${var.name}-s3-endpoint"
Type = "vpc-endpoint"
})
}
resource "aws_vpc_endpoint_route_table_association" "s3_private" {
count = var.enable_s3_endpoint ? var.private_subnet_count : 0
vpc_endpoint_id = aws_vpc_endpoint.s3[0].id
route_table_id = aws_route_table.private[count.index].id
}
resource "aws_vpc_endpoint" "dynamodb" {
count = var.enable_dynamodb_endpoint ? 1 : 0
vpc_id = aws_vpc.main.id
service_name = "com.amazonaws.${data.aws_region.current.name}.dynamodb"
tags = merge(var.tags, {
Name = "${var.name}-dynamodb-endpoint"
Type = "vpc-endpoint"
})
}
resource "aws_vpc_endpoint_route_table_association" "dynamodb_private" {
count = var.enable_dynamodb_endpoint ? var.private_subnet_count : 0
vpc_endpoint_id = aws_vpc_endpoint.dynamodb[0].id
route_table_id = aws_route_table.private[count.index].id
}
# Interface VPC Endpoints
resource "aws_vpc_endpoint" "interface_endpoints" {
for_each = var.interface_endpoints
vpc_id = aws_vpc.main.id
service_name = "com.amazonaws.${data.aws_region.current.name}.${each.key}"
vpc_endpoint_type = "Interface"
subnet_ids = aws_subnet.private[*].id
security_group_ids = [aws_security_group.vpc_endpoint[0].id]
private_dns_enabled = true
policy = each.value.policy
tags = merge(var.tags, {
Name = "${var.name}-${each.key}-endpoint"
Type = "vpc-endpoint"
})
}
resource "aws_security_group" "vpc_endpoint" {
count = length(var.interface_endpoints) > 0 ? 1 : 0
name_prefix = "${var.name}-vpc-endpoint-"
vpc_id = aws_vpc.main.id
ingress {
from_port = 443
to_port = 443
protocol = "tcp"
cidr_blocks = [var.vpc_cidr]
}
egress {
from_port = 0
to_port = 0
protocol = "-1"
cidr_blocks = ["0.0.0.0/0"]
}
tags = merge(var.tags, {
Name = "${var.name}-vpc-endpoint-sg"
Type = "security-group"
})
}
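Callers such as the environment composition shown earlier need a few outputs from this module. A sketch of modules/vpc/outputs.tf consistent with the resources above:
# modules/vpc/outputs.tf -- outputs assumed by callers of this module
output "vpc_id" {
  description = "ID of the VPC"
  value       = aws_vpc.main.id
}

output "public_subnet_ids" {
  description = "IDs of the public subnets"
  value       = aws_subnet.public[*].id
}

output "private_subnet_ids" {
  description = "IDs of the private subnets"
  value       = aws_subnet.private[*].id
}

output "database_subnet_ids" {
  description = "IDs of the database subnets"
  value       = aws_subnet.database[*].id
}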
EKS Module with Advanced Configuration
# modules/eks/main.tf
locals {
cluster_name = "${var.cluster_name}-${var.environment}"
# Node group configurations
node_groups = {
for k, v in var.node_groups : k => merge({
instance_types = ["t3.medium"]
capacity_type = "ON_DEMAND"
ami_type = "AL2_x86_64"
release_version = null
version = null
scaling_config = {
desired_size = 2
max_size = 10
min_size = 1
}
update_config = {
max_unavailable_percentage = 25
}
# Kubernetes labels
labels = {}
# Kubernetes taints
taints = []
# Launch template settings; disk size and type are expressed here as
# block_device_mappings because disk_size conflicts with launch templates
block_device_mappings = []
metadata_options = null
enable_monitoring = true
# Bootstrap arguments and extra user data for the node bootstrap script
bootstrap_arguments = ""
user_data = ""
# Security groups
additional_security_group_ids = []
# Subnets
subnet_ids = []
# Tags
tags = {}
}, v)
}
}
# Data sources
data "aws_caller_identity" "current" {}
data "aws_region" "current" {}
data "aws_partition" "current" {}
data "aws_iam_policy_document" "cluster_assume_role_policy" {
statement {
actions = ["sts:AssumeRole"]
principals {
type = "Service"
identifiers = ["eks.amazonaws.com"]
}
}
}
data "aws_iam_policy_document" "node_group_assume_role_policy" {
statement {
actions = ["sts:AssumeRole"]
principals {
type = "Service"
identifiers = ["ec2.amazonaws.com"]
}
}
}
# KMS key for EKS cluster encryption
resource "aws_kms_key" "eks" {
count = var.create_kms_key ? 1 : 0
description = "EKS Secret Encryption Key for ${local.cluster_name}"
deletion_window_in_days = var.kms_key_deletion_window_in_days
enable_key_rotation = var.enable_kms_key_rotation
policy = jsonencode({
Version = "2012-10-17"
Statement = [
{
Sid = "Enable IAM User Permissions"
Effect = "Allow"
Principal = {
AWS = "arn:${data.aws_partition.current.partition}:iam::${data.aws_caller_identity.current.account_id}:root"
}
Action = "kms:*"
Resource = "*"
},
{
Sid = "Allow EKS Service"
Effect = "Allow"
Principal = {
Service = "eks.amazonaws.com"
}
Action = [
"kms:Encrypt",
"kms:Decrypt",
"kms:ReEncrypt*",
"kms:GenerateDataKey*",
"kms:DescribeKey"
]
Resource = "*"
}
]
})
tags = merge(var.tags, {
Name = "${local.cluster_name}-eks-key"
Type = "kms-key"
})
}
resource "aws_kms_alias" "eks" {
count = var.create_kms_key ? 1 : 0
name = "alias/${local.cluster_name}-eks"
target_key_id = aws_kms_key.eks[0].key_id
}
# EKS Cluster IAM Role
resource "aws_iam_role" "cluster" {
name = "${local.cluster_name}-cluster-role"
assume_role_policy = data.aws_iam_policy_document.cluster_assume_role_policy.json
tags = merge(var.tags, {
Name = "${local.cluster_name}-cluster-role"
Type = "iam-role"
})
}
resource "aws_iam_role_policy_attachment" "cluster_AmazonEKSClusterPolicy" {
policy_arn = "arn:${data.aws_partition.current.partition}:iam::aws:policy/AmazonEKSClusterPolicy"
role = aws_iam_role.cluster.name
}
resource "aws_iam_role_policy_attachment" "cluster_AmazonEKSVPCResourceController" {
policy_arn = "arn:${data.aws_partition.current.partition}:iam::aws:policy/AmazonEKSVPCResourceController"
role = aws_iam_role.cluster.name
}
# Additional cluster policies
resource "aws_iam_role_policy" "cluster_additional" {
count = length(var.cluster_additional_policies) > 0 ? 1 : 0
name = "${local.cluster_name}-cluster-additional-policy"
role = aws_iam_role.cluster.id
policy = jsonencode({
Version = "2012-10-17"
Statement = var.cluster_additional_policies
})
}
# EKS Cluster Security Group
resource "aws_security_group" "cluster" {
name_prefix = "${local.cluster_name}-cluster-"
vpc_id = var.vpc_id
description = "EKS cluster security group"
# Allow all outbound traffic
egress {
from_port = 0
to_port = 0
protocol = "-1"
cidr_blocks = ["0.0.0.0/0"]
}
tags = merge(var.tags, {
Name = "${local.cluster_name}-cluster-sg"
Type = "security-group"
})
}
# Cluster security group rules
resource "aws_security_group_rule" "cluster_ingress_workstation_https" {
count = length(var.cluster_endpoint_private_access_cidrs) > 0 ? 1 : 0
description = "Allow workstation to communicate with the cluster API Server"
type = "ingress"
from_port = 443
to_port = 443
protocol = "tcp"
cidr_blocks = var.cluster_endpoint_private_access_cidrs
security_group_id = aws_security_group.cluster.id
}
# Node group security group
resource "aws_security_group" "node_group" {
name_prefix = "${local.cluster_name}-node-group-"
vpc_id = var.vpc_id
description = "EKS node group security group"
# Allow all outbound traffic
egress {
from_port = 0
to_port = 0
protocol = "-1"
cidr_blocks = ["0.0.0.0/0"]
}
tags = merge(var.tags, {
Name = "${local.cluster_name}-node-group-sg"
Type = "security-group"
"kubernetes.io/cluster/${local.cluster_name}" = "owned"
})
}
# Node group security group rules
resource "aws_security_group_rule" "node_group_ingress_self" {
description = "Allow node to communicate with each other"
type = "ingress"
from_port = 0
to_port = 0
protocol = "-1" # all traffic; ports must be 0 when protocol is -1
source_security_group_id = aws_security_group.node_group.id
security_group_id = aws_security_group.node_group.id
}
resource "aws_security_group_rule" "node_group_ingress_cluster_https" {
description = "Allow pods to communicate with the cluster API Server"
type = "ingress"
from_port = 443
to_port = 443
protocol = "tcp"
source_security_group_id = aws_security_group.cluster.id
security_group_id = aws_security_group.node_group.id
}
resource "aws_security_group_rule" "node_group_ingress_cluster_kubelet" {
description = "Allow cluster control plane to communicate with worker node kubelet"
type = "ingress"
from_port = 10250
to_port = 10250
protocol = "tcp"
source_security_group_id = aws_security_group.cluster.id
security_group_id = aws_security_group.node_group.id
}
resource "aws_security_group_rule" "cluster_ingress_node_group_https" {
description = "Allow pods to communicate with the cluster API Server"
type = "ingress"
from_port = 443
to_port = 443
protocol = "tcp"
source_security_group_id = aws_security_group.node_group.id
security_group_id = aws_security_group.cluster.id
}
# EKS Cluster
resource "aws_eks_cluster" "main" {
name = local.cluster_name
role_arn = aws_iam_role.cluster.arn
version = var.cluster_version
vpc_config {
subnet_ids = var.subnet_ids
endpoint_private_access = var.cluster_endpoint_private_access
endpoint_public_access = var.cluster_endpoint_public_access
public_access_cidrs = var.cluster_endpoint_public_access_cidrs
security_group_ids = [aws_security_group.cluster.id]
}
# Encryption configuration
dynamic "encryption_config" {
for_each = var.cluster_encryption_config
content {
provider {
key_arn = var.create_kms_key ? aws_kms_key.eks[0].arn : encryption_config.value.provider_key_arn
}
resources = encryption_config.value.resources
}
}
# Logging configuration
enabled_cluster_log_types = var.cluster_enabled_log_types
# Add-ons will be managed separately
depends_on = [
aws_iam_role_policy_attachment.cluster_AmazonEKSClusterPolicy,
aws_iam_role_policy_attachment.cluster_AmazonEKSVPCResourceController,
aws_cloudwatch_log_group.cluster,
]
tags = merge(var.tags, {
Name = local.cluster_name
Type = "eks-cluster"
})
}
# CloudWatch Log Group for EKS cluster logs
resource "aws_cloudwatch_log_group" "cluster" {
name = "/aws/eks/${local.cluster_name}/cluster"
retention_in_days = var.cloudwatch_log_group_retention_in_days
kms_key_id = var.cloudwatch_log_group_kms_key_id
tags = merge(var.tags, {
Name = "${local.cluster_name}-cluster-logs"
Type = "cloudwatch-log-group"
})
}
# EKS Node Group IAM Role
resource "aws_iam_role" "node_group" {
name = "${local.cluster_name}-node-group-role"
assume_role_policy = data.aws_iam_policy_document.node_group_assume_role_policy.json
tags = merge(var.tags, {
Name = "${local.cluster_name}-node-group-role"
Type = "iam-role"
})
}
resource "aws_iam_role_policy_attachment" "node_group_AmazonEKSWorkerNodePolicy" {
policy_arn = "arn:${data.aws_partition.current.partition}:iam::aws:policy/AmazonEKSWorkerNodePolicy"
role = aws_iam_role.node_group.name
}
resource "aws_iam_role_policy_attachment" "node_group_AmazonEKS_CNI_Policy" {
policy_arn = "arn:${data.aws_partition.current.partition}:iam::aws:policy/AmazonEKS_CNI_Policy"
role = aws_iam_role.node_group.name
}
resource "aws_iam_role_policy_attachment" "node_group_AmazonEC2ContainerRegistryReadOnly" {
policy_arn = "arn:${data.aws_partition.current.partition}:iam::aws:policy/AmazonEC2ContainerRegistryReadOnly"
role = aws_iam_role.node_group.name
}
resource "aws_iam_role_policy_attachment" "node_group_AmazonSSMManagedInstanceCore" {
count = var.enable_ssm ? 1 : 0
policy_arn = "arn:${data.aws_partition.current.partition}:iam::aws:policy/AmazonSSMManagedInstanceCore"
role = aws_iam_role.node_group.name
}
# Additional node group policies
resource "aws_iam_role_policy" "node_group_additional" {
count = length(var.node_group_additional_policies) > 0 ? 1 : 0
name = "${local.cluster_name}-node-group-additional-policy"
role = aws_iam_role.node_group.id
policy = jsonencode({
Version = "2012-10-17"
Statement = var.node_group_additional_policies
})
}
# Launch template for node groups
resource "aws_launch_template" "node_group" {
for_each = local.node_groups
name_prefix = "${local.cluster_name}-${each.key}-"
vpc_security_group_ids = concat(
[aws_security_group.node_group.id],
each.value.additional_security_group_ids
)
user_data = base64encode(templatefile("${path.module}/user_data.sh", {
cluster_name = local.cluster_name
cluster_endpoint = aws_eks_cluster.main.endpoint
cluster_ca = aws_eks_cluster.main.certificate_authority[0].data
bootstrap_arguments = each.value.bootstrap_arguments
user_data_script = each.value.user_data
}))
dynamic "block_device_mappings" {
for_each = each.value.block_device_mappings
content {
device_name = block_device_mappings.value.device_name
ebs {
volume_size = block_device_mappings.value.ebs.volume_size
volume_type = block_device_mappings.value.ebs.volume_type
iops = block_device_mappings.value.ebs.iops
throughput = block_device_mappings.value.ebs.throughput
encrypted = block_device_mappings.value.ebs.encrypted
kms_key_id = block_device_mappings.value.ebs.kms_key_id
delete_on_termination = block_device_mappings.value.ebs.delete_on_termination
}
}
}
dynamic "metadata_options" {
for_each = each.value.metadata_options != null ? [each.value.metadata_options] : []
content {
http_endpoint = metadata_options.value.http_endpoint
http_tokens = metadata_options.value.http_tokens
http_put_response_hop_limit = metadata_options.value.http_put_response_hop_limit
instance_metadata_tags = metadata_options.value.instance_metadata_tags
}
}
dynamic "monitoring" {
for_each = each.value.enable_monitoring ? [1] : []
content {
enabled = true
}
}
tag_specifications {
resource_type = "instance"
tags = merge(var.tags, each.value.tags, {
Name = "${local.cluster_name}-${each.key}-node"
Type = "eks-node"
})
}
tag_specifications {
resource_type = "volume"
tags = merge(var.tags, each.value.tags, {
Name = "${local.cluster_name}-${each.key}-volume"
Type = "eks-node-volume"
})
}
tags = merge(var.tags, {
Name = "${local.cluster_name}-${each.key}-lt"
Type = "launch-template"
})
}
# EKS Node Groups
resource "aws_eks_node_group" "main" {
for_each = local.node_groups
cluster_name = aws_eks_cluster.main.name
node_group_name = "${local.cluster_name}-${each.key}"
node_role_arn = aws_iam_role.node_group.arn
subnet_ids = length(each.value.subnet_ids) > 0 ? each.value.subnet_ids : var.subnet_ids
instance_types = each.value.instance_types
capacity_type = each.value.capacity_type
# disk_size is omitted intentionally: it conflicts with launch_template,
# so volumes are declared via the launch template's block_device_mappings
ami_type = each.value.ami_type
release_version = each.value.release_version
version = each.value.version
scaling_config {
desired_size = each.value.scaling_config.desired_size
max_size = each.value.scaling_config.max_size
min_size = each.value.scaling_config.min_size
}
update_config {
max_unavailable_percentage = each.value.update_config.max_unavailable_percentage
}
# Launch template
launch_template {
id = aws_launch_template.node_group[each.key].id
version = aws_launch_template.node_group[each.key].latest_version
}
# Labels
labels = merge(each.value.labels, {
"node-group" = each.key
})
# Taints
dynamic "taint" {
for_each = each.value.taints
content {
key = taint.value.key
value = taint.value.value
effect = taint.value.effect
}
}
depends_on = [
aws_iam_role_policy_attachment.node_group_AmazonEKSWorkerNodePolicy,
aws_iam_role_policy_attachment.node_group_AmazonEKS_CNI_Policy,
aws_iam_role_policy_attachment.node_group_AmazonEC2ContainerRegistryReadOnly,
]
tags = merge(var.tags, each.value.tags, {
Name = "${local.cluster_name}-${each.key}"
Type = "eks-node-group"
})
}
# EKS Add-ons
resource "aws_eks_addon" "main" {
for_each = var.cluster_addons
cluster_name = aws_eks_cluster.main.name
addon_name = each.key
addon_version = each.value.addon_version
resolve_conflicts_on_create = each.value.resolve_conflicts_on_create
resolve_conflicts_on_update = each.value.resolve_conflicts_on_update
service_account_role_arn = each.value.service_account_role_arn
depends_on = [aws_eks_node_group.main]
tags = merge(var.tags, {
Name = "${local.cluster_name}-${each.key}-addon"
Type = "eks-addon"
})
}
# OIDC Identity Provider
data "tls_certificate" "cluster" {
url = aws_eks_cluster.main.identity[0].oidc[0].issuer
}
resource "aws_iam_openid_connect_provider" "cluster" {
count = var.enable_irsa ? 1 : 0
client_id_list = ["sts.amazonaws.com"]
thumbprint_list = [data.tls_certificate.cluster.certificates[0].sha1_fingerprint]
url = aws_eks_cluster.main.identity[0].oidc[0].issuer
tags = merge(var.tags, {
Name = "${local.cluster_name}-oidc-provider"
Type = "oidc-provider"
})
}
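With enable_irsa set, the OIDC provider above can back IAM Roles for Service Accounts (IRSA), which scope AWS permissions to individual Kubernetes service accounts. A sketch of such a role, assuming enable_irsa = true; the namespace and service account name are illustrative:
# Hypothetical IRSA role trusted only by a specific service account
data "aws_iam_policy_document" "irsa_assume" {
  statement {
    actions = ["sts:AssumeRoleWithWebIdentity"]
    effect  = "Allow"

    principals {
      type        = "Federated"
      identifiers = [aws_iam_openid_connect_provider.cluster[0].arn]
    }

    # Restrict the role to one service account in one namespace
    condition {
      test     = "StringEquals"
      variable = "${replace(aws_eks_cluster.main.identity[0].oidc[0].issuer, "https://", "")}:sub"
      values   = ["system:serviceaccount:kube-system:aws-load-balancer-controller"]
    }
  }
}

resource "aws_iam_role" "irsa_example" {
  name               = "${local.cluster_name}-alb-controller"
  assume_role_policy = data.aws_iam_policy_document.irsa_assume.json
}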
Security and Compliance
Terraform Security Scanning
# .github/workflows/terraform-security.yml
name: Terraform Security Scan
on:
push:
branches: [ main, develop ]
pull_request:
branches: [ main ]
jobs:
security-scan:
runs-on: ubuntu-latest
steps:
- name: Checkout code
uses: actions/checkout@v4
- name: Setup Terraform
uses: hashicorp/setup-terraform@v3
with:
terraform_version: 1.5.0
- name: Terraform Format Check
run: terraform fmt -check -recursive
- name: Terraform Init
run: terraform init -backend=false
- name: Terraform Validate
run: terraform validate
- name: Run Checkov
uses: bridgecrewio/checkov-action@master
with:
directory: .
framework: terraform
output_format: sarif
output_file_path: checkov-results.sarif
- name: Upload Checkov results to GitHub Security
uses: github/codeql-action/upload-sarif@v2
if: always()
with:
sarif_file: checkov-results.sarif
- name: Run TFSec
uses: aquasecurity/tfsec-action@v1.0.3
with:
soft_fail: true
- name: Run Terrascan
uses: tenable/terrascan-action@main
with:
iac_type: terraform
iac_version: v14
policy_type: aws
only_warn: true
- name: Setup Infracost
uses: infracost/actions/setup@v2
with:
api-key: ${{ secrets.INFRACOST_API_KEY }}
- name: Generate Infracost breakdown
run: |
infracost breakdown --path . \
--format json \
--out-file infracost-base.json
- name: Post Infracost comment
uses: infracost/actions/comment@v1
with:
path: infracost-base.json
behavior: update
Policy as Code with Sentinel
# policies/sentinel/aws-security-policies.sentinel
import "tfplan/v2" as tfplan
import "strings"
import "types"
# Helper functions
get_resources = func(resource_type) {
resources = {}
for tfplan.resource_changes as address, rc {
if rc.type is resource_type and
rc.mode is "managed" and
(rc.change.actions contains "create" or rc.change.actions contains "update") {
resources[address] = rc
}
}
return resources
}
# Policy: Ensure S3 buckets are encrypted
s3_buckets_encrypted = rule {
all get_resources("aws_s3_bucket") as address, rc {
rc.change.after.server_side_encryption_configuration is not null and
length(rc.change.after.server_side_encryption_configuration) > 0
}
}
# Policy: Ensure RDS instances are encrypted
rds_instances_encrypted = rule {
all get_resources("aws_db_instance") as address, rc {
rc.change.after.storage_encrypted is true
}
}
# Policy: Ensure EBS volumes are encrypted
ebs_volumes_encrypted = rule {
all get_resources("aws_ebs_volume") as address, rc {
rc.change.after.encrypted is true
}
}
# Policy: Ensure security groups don't allow 0.0.0.0/0 on port 22
no_ssh_from_anywhere = rule {
all get_resources("aws_security_group") as address, rc {
all rc.change.after.ingress as ingress {
not (ingress.from_port <= 22 and ingress.to_port >= 22 and
ingress.protocol is "tcp" and
"0.0.0.0/0" in ingress.cidr_blocks)
}
}
}
# Policy: Ensure security groups don't allow 0.0.0.0/0 on port 3389
no_rdp_from_anywhere = rule {
all get_resources("aws_security_group") as address, rc {
all rc.change.after.ingress as ingress {
not (ingress.from_port <= 3389 and ingress.to_port >= 3389 and
ingress.protocol is "tcp" and
"0.0.0.0/0" in ingress.cidr_blocks)
}
}
}
# Policy: Ensure IAM policies don't grant admin access
no_admin_policies = rule {
all get_resources("aws_iam_policy") as address, rc {
# Rule bodies must be single expressions, so unmarshal inline
all json.unmarshal(rc.change.after.policy).Statement as statement {
not (statement.Effect is "Allow" and
statement.Action contains "*" and
statement.Resource contains "*")
}
}
}
# Policy: Ensure resources have required tags
required_tags = ["Environment", "Project", "Owner", "CostCenter"]
resources_have_required_tags = rule {
all get_resources("aws_instance") as address, rc {
all required_tags as tag {
rc.change.after.tags contains tag
}
} and
all get_resources("aws_s3_bucket") as address, rc {
all required_tags as tag {
rc.change.after.tags contains tag
}
} and
all get_resources("aws_rds_instance") as address, rc {
all required_tags as tag {
rc.change.after.tags contains tag
}
}
}
# Policy: Ensure EKS clusters have logging enabled
eks_logging_enabled = rule {
all get_resources("aws_eks_cluster") as address, rc {
rc.change.after.enabled_cluster_log_types is not null and
length(rc.change.after.enabled_cluster_log_types) > 0
}
}
# Policy: Ensure VPC flow logs are enabled
# Collect resources outside the rule (rule bodies must be one expression)
vpcs = get_resources("aws_vpc")
flow_logs = get_resources("aws_flow_log")
vpc_flow_logs_enabled = rule {
all vpcs as vpc_address, vpc_rc {
any flow_logs as fl_address, fl_rc {
fl_rc.change.after.vpc_id is vpc_rc.change.after.id
}
}
}
# Main policy
main = rule {
s3_buckets_encrypted and
rds_instances_encrypted and
ebs_volumes_encrypted and
no_ssh_from_anywhere and
no_rdp_from_anywhere and
no_admin_policies and
resources_have_required_tags and
eks_logging_enabled and
vpc_flow_logs_enabled
}
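In Terraform Cloud/Enterprise, a policy set manifest attaches policies like the one above to runs. A minimal sentinel.hcl sketch (hard-mandatory blocks applies on any failure):
# policies/sentinel/sentinel.hcl -- policy set manifest
policy "aws-security-policies" {
  source            = "./aws-security-policies.sentinel"
  enforcement_level = "hard-mandatory"
}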
CI/CD Integration
GitLab CI Pipeline for Terraform
# .gitlab-ci.yml
stages:
- validate
- plan
- security-scan
- apply
- destroy
variables:
TF_ROOT: ${CI_PROJECT_DIR}
TF_ADDRESS: ${CI_API_V4_URL}/projects/${CI_PROJECT_ID}/terraform/state/${CI_ENVIRONMENT_NAME}
TF_IN_AUTOMATION: "true"
TF_INPUT: "false"
TF_CLI_ARGS: "-no-color"
cache:
key: "${CI_COMMIT_REF_SLUG}"
paths:
- ${TF_ROOT}/.terraform
before_script:
- cd ${TF_ROOT}
- terraform --version
- terraform init -backend-config="address=${TF_ADDRESS}" -backend-config="lock_address=${TF_ADDRESS}/lock" -backend-config="unlock_address=${TF_ADDRESS}/lock" -backend-config="username=${GITLAB_USER_LOGIN}" -backend-config="password=${CI_JOB_TOKEN}" -backend-config="lock_method=POST" -backend-config="unlock_method=DELETE" -backend-config="retry_wait_min=5"
validate:
stage: validate
script:
- terraform fmt -check -recursive
- terraform validate
rules:
- if: $CI_PIPELINE_SOURCE == "merge_request_event"
- if: $CI_COMMIT_BRANCH == $CI_DEFAULT_BRANCH
plan:
stage: plan
script:
- terraform plan -var-file="environments/${CI_ENVIRONMENT_NAME}.tfvars" -out="planfile"
- terraform show -json planfile > plan.json
artifacts:
name: plan
paths:
- ${TF_ROOT}/planfile
- ${TF_ROOT}/plan.json
expire_in: 1 week
rules:
- if: $CI_PIPELINE_SOURCE == "merge_request_event"
- if: $CI_COMMIT_BRANCH == $CI_DEFAULT_BRANCH
security-scan:
stage: security-scan
image: bridgecrew/checkov:latest
# Skip the global before_script: this image has no terraform binary
before_script: []
script:
- checkov -f plan.json --framework terraform_plan --output cli --output junitxml --output-file-path console,checkov-report.xml
artifacts:
reports:
junit: checkov-report.xml
paths:
- checkov-report.xml
expire_in: 1 week
dependencies:
- plan
rules:
- if: $CI_PIPELINE_SOURCE == "merge_request_event"
- if: $CI_COMMIT_BRANCH == $CI_DEFAULT_BRANCH
cost-estimation:
stage: security-scan
image: infracost/infracost:ci-0.10
# Skip the global before_script: this image has no terraform binary
before_script: []
script:
- infracost breakdown --path plan.json --format json --out-file infracost.json
- infracost output --path infracost.json --format table
- infracost output --path infracost.json --format html --out-file infracost-report.html
artifacts:
paths:
- infracost.json
- infracost-report.html
expire_in: 1 week
dependencies:
- plan
rules:
- if: $CI_PIPELINE_SOURCE == "merge_request_event"
- if: $CI_COMMIT_BRANCH == $CI_DEFAULT_BRANCH
apply:
stage: apply
script:
- terraform apply planfile
dependencies:
- plan
rules:
- if: $CI_COMMIT_BRANCH == $CI_DEFAULT_BRANCH
when: manual
environment:
name: ${CI_ENVIRONMENT_NAME}
action: start
destroy:
stage: destroy
script:
- terraform destroy -var-file="environments/${CI_ENVIRONMENT_NAME}.tfvars" -auto-approve
rules:
- if: $CI_COMMIT_BRANCH == $CI_DEFAULT_BRANCH
when: manual
environment:
name: ${CI_ENVIRONMENT_NAME}
action: stop
Automated Deployment Script
#!/bin/bash
# scripts/deploy.sh
set -e
# Configuration
SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
PROJECT_ROOT="$(dirname "$SCRIPT_DIR")"
ENVIRONMENTS_DIR="$PROJECT_ROOT/environments"
MODULES_DIR="$PROJECT_ROOT/modules"
# Default values
ENVIRONMENT=""
ACTION="plan"
AUTO_APPROVE=false
WORKSPACE=""
VAR_FILE=""
BACKEND_CONFIG=""
PARALLELISM=10
REFRESH=true
# Colors for output
RED='\033[0;31m'
GREEN='\033[0;32m'
YELLOW='\033[1;33m'
BLUE='\033[0;34m'
NC='\033[0m' # No Color
# Logging functions
log_info() {
echo -e "${BLUE}[INFO]${NC} $1"
}
log_success() {
echo -e "${GREEN}[SUCCESS]${NC} $1"
}
log_warning() {
echo -e "${YELLOW}[WARNING]${NC} $1"
}
log_error() {
echo -e "${RED}[ERROR]${NC} $1"
}
# Help function
show_help() {
cat << EOF
Terraform Deployment Script
Usage: $0 [OPTIONS]
OPTIONS:
-e, --environment ENVIRONMENT Target environment (dev, staging, production)
-a, --action ACTION Action to perform (plan, apply, destroy)
-w, --workspace WORKSPACE Terraform workspace to use
-f, --var-file FILE Variables file to use
-b, --backend-config FILE Backend configuration file
-p, --parallelism NUMBER Number of parallel operations (default: 10)
--auto-approve Auto approve apply/destroy operations
--no-refresh Skip refresh during plan/apply
-h, --help Show this help message
EXAMPLES:
$0 -e dev -a plan
$0 -e production -a apply --auto-approve
$0 -e staging -a destroy --auto-approve
$0 -e dev -w feature-branch -a plan
EOF
}
# Parse command line arguments
while [[ $# -gt 0 ]]; do
case $1 in
-e|--environment)
ENVIRONMENT="$2"
shift 2
;;
-a|--action)
ACTION="$2"
shift 2
;;
-w|--workspace)
WORKSPACE="$2"
shift 2
;;
-f|--var-file)
VAR_FILE="$2"
shift 2
;;
-b|--backend-config)
BACKEND_CONFIG="$2"
shift 2
;;
-p|--parallelism)
PARALLELISM="$2"
shift 2
;;
--auto-approve)
AUTO_APPROVE=true
shift
;;
--no-refresh)
REFRESH=false
shift
;;
-h|--help)
show_help
exit 0
;;
*)
log_error "Unknown option: $1"
show_help
exit 1
;;
esac
done
# Validate required parameters
if [[ -z "$ENVIRONMENT" ]]; then
log_error "Environment is required. Use -e or --environment option."
exit 1
fi
# Set default var file if not specified (matches environments/<env>/terraform.tfvars)
if [[ -z "$VAR_FILE" ]]; then
VAR_FILE="$ENVIRONMENTS_DIR/$ENVIRONMENT/terraform.tfvars"
fi
# Validate environment directory exists
ENV_DIR="$ENVIRONMENTS_DIR/$ENVIRONMENT"
if [[ ! -d "$ENV_DIR" ]]; then
log_error "Environment directory not found: $ENV_DIR"
exit 1
fi
# Validate var file exists
if [[ ! -f "$VAR_FILE" ]]; then
log_error "Variables file not found: $VAR_FILE"
exit 1
fi
# Change to environment directory
cd "$ENV_DIR"
log_info "Starting Terraform deployment for environment: $ENVIRONMENT"
log_info "Action: $ACTION"
log_info "Variables file: $VAR_FILE"
# Initialize Terraform
log_info "Initializing Terraform..."
INIT_ARGS=()
if [[ -n "$BACKEND_CONFIG" ]]; then
INIT_ARGS+=("-backend-config=$BACKEND_CONFIG")
fi
if ! terraform init "${INIT_ARGS[@]}"; then
log_error "Terraform initialization failed"
exit 1
fi
# Select or create workspace
if [[ -n "$WORKSPACE" ]]; then
log_info "Selecting workspace: $WORKSPACE"
terraform workspace select "$WORKSPACE" || terraform workspace new "$WORKSPACE"
fi
# Perform the requested action
case $ACTION in
plan)
log_info "Running Terraform plan..."
PLAN_ARGS=(
"-var-file=$VAR_FILE"
"-parallelism=$PARALLELISM"
"-out=tfplan"
)
if [[ "$REFRESH" == "false" ]]; then
PLAN_ARGS+=("-refresh=false")
fi
if terraform plan "${PLAN_ARGS[@]}"; then
log_success "Terraform plan completed successfully"
# Show plan summary
log_info "Plan summary:"
terraform show -json tfplan | jq -r '
.resource_changes[] |
select(.change.actions != ["no-op"]) |
"\(.change.actions | join(",")) \(.address)"
' | sort
else
log_error "Terraform plan failed"
exit 1
fi
;;
apply)
log_info "Running Terraform apply..."
APPLY_ARGS=(
"-var-file=$VAR_FILE"
"-parallelism=$PARALLELISM"
)
if [[ "$REFRESH" == "false" ]]; then
APPLY_ARGS+=("-refresh=false")
fi
if [[ "$AUTO_APPROVE" == "true" ]]; then
APPLY_ARGS+=("-auto-approve")
fi
# Use an existing plan file if present (a saved plan cannot be combined
# with -var-file or -auto-approve, but -parallelism still applies)
if [[ -f "tfplan" ]]; then
log_info "Using existing plan file"
APPLY_ARGS=("-parallelism=$PARALLELISM" "tfplan")
fi
if terraform apply "${APPLY_ARGS[@]}"; then
log_success "Terraform apply completed successfully"
# Show outputs
log_info "Terraform outputs:"
terraform output
else
log_error "Terraform apply failed"
exit 1
fi
;;
destroy)
log_warning "This will destroy all resources in environment: $ENVIRONMENT"
if [[ "$AUTO_APPROVE" != "true" ]]; then
read -p "Are you sure you want to continue? (yes/no): " -r
if [[ ! $REPLY =~ ^[Yy][Ee][Ss]$ ]]; then
log_info "Destroy operation cancelled"
exit 0
fi
fi
log_info "Running Terraform destroy..."
DESTROY_ARGS=(
"-var-file=$VAR_FILE"
"-parallelism=$PARALLELISM"
)
if [[ "$AUTO_APPROVE" == "true" ]]; then
DESTROY_ARGS+=("-auto-approve")
fi
if terraform destroy "${DESTROY_ARGS[@]}"; then
log_success "Terraform destroy completed successfully"
else
log_error "Terraform destroy failed"
exit 1
fi
;;
*)
log_error "Unknown action: $ACTION"
log_info "Supported actions: plan, apply, destroy"
exit 1
;;
esac
log_success "Deployment script completed successfully"
Best Practices and Operational Excellence
State Management Best Practices
- Remote State Storage: Always use remote state with encryption and versioning
- State Locking: Implement state locking to prevent concurrent modifications
- State Backup: Regular automated backups of state files
- Workspace Strategy: Use workspaces for environment isolation
- State File Security: Restrict access to state files containing sensitive data (hardening sketched below)
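The state-file security point can itself be expressed in Terraform. A sketch building on the state bucket from the backend section (the TLS-only bucket policy is a common baseline, not a prescribed setup):
# Block all public access to the state bucket
resource "aws_s3_bucket_public_access_block" "terraform_state" {
  bucket                  = aws_s3_bucket.terraform_state.id
  block_public_acls       = true
  block_public_policy     = true
  ignore_public_acls      = true
  restrict_public_buckets = true
}

# Deny any access that does not use TLS
resource "aws_s3_bucket_policy" "terraform_state" {
  bucket = aws_s3_bucket.terraform_state.id
  policy = jsonencode({
    Version = "2012-10-17"
    Statement = [
      {
        Sid       = "DenyInsecureTransport"
        Effect    = "Deny"
        Principal = "*"
        Action    = "s3:*"
        Resource = [
          aws_s3_bucket.terraform_state.arn,
          "${aws_s3_bucket.terraform_state.arn}/*"
        ]
        Condition = {
          Bool = { "aws:SecureTransport" = "false" }
        }
      }
    ]
  })
}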
Module Development Guidelines
- Single Responsibility: Each module should have a clear, single purpose
- Versioning: Use semantic versioning for module releases
- Documentation: Comprehensive README with examples and variable descriptions
- Testing: Implement automated testing with tools like Terratest
- Validation: Enforce input validation and output consistency (see the sketch below)
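For the validation guideline, variable validation blocks reject bad inputs at plan time rather than at apply. A sketch (the allowed environment names are illustrative):
variable "environment" {
  description = "Deployment environment name"
  type        = string

  validation {
    condition     = contains(["dev", "staging", "production"], var.environment)
    error_message = "Environment must be one of: dev, staging, production."
  }
}

variable "vpc_cidr" {
  description = "CIDR block for the VPC"
  type        = string

  validation {
    condition     = can(cidrhost(var.vpc_cidr, 0))
    error_message = "vpc_cidr must be a valid IPv4 CIDR block."
  }
}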
Security Hardening
- Least Privilege: Apply principle of least privilege to IAM roles and policies
- Encryption: Enable encryption at rest and in transit for all resources
- Network Security: Implement proper network segmentation and security groups
- Secrets Management: Use dedicated secret management services instead of variables files (example below)
- Compliance: Regular compliance scanning and policy enforcement
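For the secrets-management point, a common pattern is to generate credentials in Terraform and push them to a secret manager instead of committing them to tfvars. A sketch with illustrative names; note the generated value still lands in state, which is one more reason state encryption matters:
# Generate the credential instead of passing it in via tfvars
resource "random_password" "db_master" {
  length  = 32
  special = false
}

resource "aws_secretsmanager_secret" "db_master" {
  name = "production/rds/master-password" # illustrative secret name
}

resource "aws_secretsmanager_secret_version" "db_master" {
  secret_id     = aws_secretsmanager_secret.db_master.id
  secret_string = random_password.db_master.result
}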
Performance Optimization
- Parallelism: Optimize Terraform parallelism settings
- Resource Dependencies: Minimize unnecessary dependencies
- Provider Caching: Use provider plugin caching to speed up init (CLI config sketched below)
- State Refresh: Optimize state refresh operations
- Resource Targeting: Use targeted operations when appropriate
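For the caching and targeting points, the Terraform CLI configuration file enables a shared plugin cache so repeated init runs stop re-downloading providers. A sketch (the cache path follows the documented suggestion):
# ~/.terraformrc -- Terraform CLI configuration (HCL syntax)
plugin_cache_dir = "$HOME/.terraform.d/plugin-cache"

# Targeted operations (use sparingly; they can hide drift elsewhere):
#   terraform plan -target=module.vpc -parallelism=30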
Conclusion
Terraform provides a powerful foundation for Infrastructure as Code, enabling teams to manage complex cloud environments with consistency, reliability, and security. This comprehensive guide covers enterprise-scale implementations, from basic project structure to advanced security patterns and CI/CD integration.
Key takeaways for successful Terraform adoption:
- Start with solid foundations: Proper project structure, state management, and security practices
- Embrace modularity: Develop reusable, well-tested modules for common patterns
- Implement governance: Use policy as code and automated security scanning
- Automate everything: CI/CD pipelines, testing, and deployment processes
- Monitor and optimize: Continuous improvement of performance and cost efficiency
By following these practices and patterns, organizations can build robust, scalable infrastructure automation that supports their cloud-native journey and operational excellence goals.