The Problem

I kept rebuilding the same infrastructure patterns for different projects: VPC setup, load balancers, ECS clusters, RDS databases. I'd copy-paste Terraform from old projects, then spend hours debugging why it didn't work in the new environment.

I wanted a playground where I could experiment with different compute patterns (EC2, ECS, EKS) without starting from scratch each time. Something that handles the boring networking setup and lets me focus on the interesting parts.

What I Built

A modular Terraform lab that supports three different compute patterns (EC2 Auto Scaling, ECS, EKS) with the same networking and database modules. You can deploy the same application architecture using different compute platforms just by changing a variable.

The Structure

terraform-playground/
  modules/
    networking/        # VPC, subnets, security groups
    asg/               # EC2 Auto Scaling setup
    ecs/               # ECS cluster and services
    eks/               # EKS cluster
    database/          # RDS MySQL
    loadbalancer/      # ALB with blue-green target groups
    parameter-store/   # Parameter Store secrets
    ssm/               # SSM Session Manager access
  environments/
    global/            # Shared resources (ECR, OIDC provider)
    terraform/         # Workspace-driven dev/staging/production

State Management

I use Terraform workspaces with a single S3 backend. Each environment (dev/staging/production) gets its own workspace and tfvars file:

terraform workspace select dev
terraform apply -var-file=working_ecs_dev.tfvars

terraform workspace select staging  
terraform apply -var-file=working_ecs_staging.tfvars

Same modules, different configurations. No more hoping staging matches production.
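
For a sense of what differs between environments, here is an illustrative pair of tfvars files; the variable names are assumptions, not necessarily the exact ones in the repo:

# working_ecs_dev.tfvars (illustrative)
environment   = "dev"
instance_type = "t3.micro"
desired_count = 1

# working_ecs_staging.tfvars (illustrative)
environment   = "staging"
instance_type = "t3.micro"
desired_count = 2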

Multi-Platform Architecture

The interesting part is conditional resource creation: you can deploy the same app on different compute platforms just by flipping a few variables. The implementation details follow.

Implementation Details

Platform Switching

The same networking and database modules work with any compute platform:

# In your tfvars file, just toggle these
platform = "ecs"  # or "eks" or "asg"
enable_ecs = true
enable_eks = false
enable_asg = false

# Conditional resource creation
resource "aws_ecs_cluster" "main" {
  count = var.enable_ecs ? 1 : 0
  name  = "${var.environment}-cluster"
  
  setting {
    name  = "containerInsights"
    value = "enabled"
  }
}

resource "aws_eks_cluster" "main" {
  count   = var.enable_eks ? 1 : 0
  name    = "${var.environment}-cluster"
  version = "1.27"
}

I can test ECS in dev, EKS in staging, and switch between them without rebuilding the entire infrastructure.
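
Because the clusters are created with count, anything that references them has to tolerate the disabled case. A minimal sketch of how that can look; the output names here are assumptions, not necessarily the repo's:

# Sketch: outputs that return null when a platform is disabled
output "ecs_cluster_name" {
  value = var.enable_ecs ? aws_ecs_cluster.main[0].name : null
}

output "eks_cluster_name" {
  value = var.enable_eks ? aws_eks_cluster.main[0].name : null
}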

Blue-Green Deployment Support

The load balancer module supports blue-green deployments with traffic switching:

# Two target groups for blue-green
resource "aws_lb_target_group" "blue" {
  name     = "${var.environment}-blue"
  port     = 80
  protocol = "HTTP"
  vpc_id   = var.vpc_id
  
  health_check {
    path                = "/health"
    healthy_threshold   = 2
    unhealthy_threshold = 2
  }
}

resource "aws_lb_target_group" "green" {
  name     = "${var.environment}-green"
  port     = 80
  protocol = "HTTP"
  vpc_id   = var.vpc_id
}

I have scripts that switch traffic between blue and green target groups for zero-downtime deployments.
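
The switch itself boils down to pointing the ALB listener's default action at whichever target group is live. A rough sketch, assuming an aws_lb.main resource and an active_target_group variable (both names are illustrative):

resource "aws_lb_listener" "http" {
  load_balancer_arn = aws_lb.main.arn
  port              = 80
  protocol          = "HTTP"

  default_action {
    type             = "forward"
    target_group_arn = var.active_target_group == "blue" ? aws_lb_target_group.blue.arn : aws_lb_target_group.green.arn
  }
}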

Global vs Environment Resources

The global environment handles shared resources that don't change between environments:

# Global: environments/global/main.tf
resource "aws_ecr_repository" "app" {
  name = "tf-playground-app"
}

resource "aws_iam_openid_connect_provider" "github" {
  url = "https://token.actions.githubusercontent.com"
}

# Environment: environments/terraform/main.tf  
data "terraform_remote_state" "global" {
  backend = "s3"
  config = {
    bucket = "tf-playground-state-vexus"
    key    = "global/terraform.tfstate"
    region = "us-east-2"
  }
}

# Reference global ECR repo
locals {
  ecr_repository_url = data.terraform_remote_state.global.outputs.ecr_repository_url
}

This prevents accidentally destroying the ECR repository when tearing down dev environments.
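
Downstream, the shared repository URL feeds straight into the container image reference. A hedged example of what that can look like in an ECS task definition; the task definition itself is illustrative, not copied from the repo:

resource "aws_ecs_task_definition" "app" {
  family = "${var.environment}-app"

  container_definitions = jsonencode([{
    name      = "app"
    image     = "${local.ecr_repository_url}:latest"
    cpu       = 256
    memory    = 512
    essential = true
    portMappings = [{ containerPort = 80, hostPort = 80 }]
  }])
}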

Session Manager Instead of SSH

No SSH keys anywhere. EC2 instances use SSM Session Manager for shell access:

# Instance role allows SSM access
resource "aws_iam_role" "ec2_role" {
  name = "${var.environment}-ec2-role"
  
  assume_role_policy = jsonencode({
    Version = "2012-10-17"
    Statement = [{
      Action = "sts:AssumeRole"
      Effect = "Allow"
      Principal = {
        Service = "ec2.amazonaws.com"
      }
    }]
  })
}

resource "aws_iam_role_policy_attachment" "ssm_managed_instance_core" {
  role       = aws_iam_role.ec2_role.name
  policy_arn = "arn:aws:iam::aws:policy/AmazonSSMManagedInstanceCore"
}
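
One piece the snippet above leaves out is attaching the role to the instance, which goes through an instance profile. A small sketch with illustrative resource names:

resource "aws_iam_instance_profile" "ec2_profile" {
  name = "${var.environment}-ec2-profile"
  role = aws_iam_role.ec2_role.name
}

# Then reference aws_iam_instance_profile.ec2_profile.name from the
# launch template or aws_instance via iam_instance_profile.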

Access instances with: aws ssm start-session --target i-1234567890abcdef0

No key pairs, no open SSH ports, full audit trail.

What I Learned

Cost Optimization Tricks

The lab costs about $55/month total across all environments. Key savings:

  • Parameter Store instead of Secrets Manager (free vs $0.40/secret/month; sketched after this list)
  • t3.micro instances everywhere
  • Public subnets in dev (no NAT gateway costs)
  • Single ECR repo shared across environments
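
The Parameter Store swap is a one-resource change. Roughly, with an illustrative parameter name and variable:

resource "aws_ssm_parameter" "db_password" {
  name  = "/${var.environment}/db/password"
  type  = "SecureString"
  value = var.db_password
}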

Workspace vs Directory Structure

Terraform workspaces work better than separate directories for environment isolation. Same backend, same modules, different state files. Much cleaner than copying entire directory structures.
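
Concretely, a single backend block covers every workspace, and Terraform namespaces each non-default workspace's state under an env:/ prefix in the bucket. A sketch using the bucket from the remote-state example above (the key is an assumption):

terraform {
  backend "s3" {
    bucket = "tf-playground-state-vexus"
    key    = "terraform.tfstate"
    region = "us-east-2"
  }
}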

Blue-Green Deployment Reality

I implemented traffic switching between blue and green target groups. It works well for testing deployment patterns, but the real value is understanding how ALB listener rules work.

Module Dependencies

The networking module outputs are used by compute modules, which are used by database modules. Getting the dependency graph right took several iterations, but now adding new compute platforms is straightforward.
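
In sketch form, the chain looks something like this; the module interfaces and output names are assumptions, not the exact ones in the repo:

module "networking" {
  source      = "../modules/networking"
  environment = var.environment
}

module "compute" {
  source     = "../modules/compute"
  subnet_ids = module.networking.private_subnet_ids
}

module "database" {
  source        = "../modules/database"
  subnet_ids    = module.networking.private_subnet_ids
  allowed_sg_id = module.compute.app_security_group_id
}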

Why This Approach Works

Modular Terraform lets you experiment safely. Want to test EKS? Change enable_eks = true and redeploy. Want to see how blue-green works? Run the failover scripts.

The global/environment split means I can destroy dev environments without losing the ECR repository or OIDC provider. That's crucial for cost management when experimenting.

Real Project Applications

I’ve used pieces of this lab in actual projects:

  • The OIDC module for GitHub Actions authentication
  • The ECS patterns for containerized applications
  • The blue-green load balancer setup for zero-downtime deployments

The lab validates the patterns before using them with real applications.

View the complete source code on GitHub →