The Problem

Every DevOps engineer knows the credential management nightmare. AWS access keys stored in GitHub secrets. Rotation schedules nobody follows. That moment of panic when someone leaves the team and you realize they could still have access. And worst of all, giving CI/CD pipelines AdministratorAccess because figuring out exact permissions takes forever.

I needed a better way. Something secure, maintainable, and actually practical for real projects.

The Solution: GitHub Actions OIDC

What is OIDC?

OpenID Connect (OIDC) lets GitHub Actions authenticate directly with AWS without storing any long-lived credentials. GitHub issues a signed JWT for each workflow run, AWS STS checks the signature and claims against the role's trust policy, and then returns temporary credentials that expire in minutes, not years.
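When a trust policy rejects a run, it helps to look at the claims inside the JWT. The following is a minimal sketch, not part of the original setup: `decode_jwt_payload` is a hypothetical helper name, and it only decodes the payload for inspection without verifying the signature.

```shell
# Decode the payload (middle segment) of a JWT so its claims can be read.
# Debugging aid only: this does NOT verify the signature.
decode_jwt_payload() (
  # Take the second dot-separated segment and map base64url to base64.
  payload=$(printf '%s' "$1" | cut -d. -f2 | tr '_-' '/+')
  # base64url drops '=' padding; restore it so base64 -d accepts the input.
  case $(( ${#payload} % 4 )) in
    2) payload="${payload}==" ;;
    3) payload="${payload}=" ;;
  esac
  printf '%s' "$payload" | base64 -d
)
```

The claims that matter for the trust policy are `aud` (the audience) and `sub` (the repository plus the branch, tag, or event that triggered the run).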

The Authentication Flow

sequenceDiagram
    participant GHA as GitHub Actions
    participant OIDC as GitHub OIDC
    participant STS as AWS STS
    participant IAM as IAM Role
    participant AWS as AWS Services
    GHA->>OIDC: Request JWT token
    OIDC->>GHA: Signed JWT with repo/branch info
    GHA->>STS: AssumeRoleWithWebIdentity + JWT
    STS->>IAM: Verify trust policy
    IAM->>STS: Validation success
    STS->>GHA: Temporary credentials (15 min)
    GHA->>AWS: Access resources with temp creds

No secrets stored. No rotation needed. Just cryptographic proof that the request came from your GitHub repository.
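To make the flow concrete, here is a rough sketch of the exchange that a credentials action performs on your behalf. The function names are hypothetical; `ACTIONS_ID_TOKEN_REQUEST_URL` and `ACTIONS_ID_TOKEN_REQUEST_TOKEN` are the variables the runner sets when the job has `id-token: write`. The `sed` extraction is a simplification and assumes the response is a single-line JSON object with a `value` field.

```shell
# Step 1: ask GitHub's OIDC endpoint for a JWT with the STS audience.
fetch_oidc_token() {
  curl -sS -H "Authorization: bearer ${ACTIONS_ID_TOKEN_REQUEST_TOKEN}" \
    "${ACTIONS_ID_TOKEN_REQUEST_URL}&audience=sts.amazonaws.com" |
    sed -n 's/.*"value" *: *"\([^"]*\)".*/\1/p'
}

# Step 2: trade the JWT for temporary credentials.
# $1 = role ARN, $2 = session name (both supplied by the caller).
assume_role_with_token() {
  aws sts assume-role-with-web-identity \
    --role-arn "$1" \
    --role-session-name "$2" \
    --web-identity-token "$(fetch_oidc_token)" \
    --duration-seconds 900 \
    --output json
}
```

The JSON that comes back contains an access key, a secret key, and a session token, all of which stop working once the requested duration (here 900 seconds, the STS minimum) has passed.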

Architecture Overview

I built a three-tier permission model:

  1. Global Environment: Contains OIDC provider and base IAM roles (manually managed)
  2. Environment Roles: Separate roles for dev, staging, and production
  3. Branch Restrictions: Production only from main, dev from any branch

Implementation Details

Step 1: Setting Up the OIDC Provider

The OIDC provider is the trust relationship between GitHub and AWS. Here’s my actual Terraform configuration:

# GitHub OIDC Provider (conditional to avoid conflicts)
resource "aws_iam_openid_connect_provider" "github" {
  count = var.create_oidc_provider ? 1 : 0
  url   = "https://token.actions.githubusercontent.com"

  client_id_list = [
    "sts.amazonaws.com"
  ]

  thumbprint_list = [
    "6938fd4d98bab03faadb97b34396831e3780aea1",
    "1c58a3a8518e8759bf075b76b750d4f2df264fcd"
  ]

  tags = {
    Name        = "github-actions-oidc"
    Environment = var.environment
    Project     = "tf-playground"
    ManagedBy   = "terraform"
  }
}

An AWS account can hold only one OIDC provider per issuer URL, so the conditional creation prevents conflicts when multiple environments try to create the same provider. Create once, reference everywhere.
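One way to decide the value of `create_oidc_provider` is to ask the account whether the provider already exists. This is a sketch under assumptions: `should_create_oidc_provider` is a hypothetical helper, and it relies on the AWS CLI being authenticated already. If the CLI call fails it prints `true`, so check credentials first.

```shell
# Prints "false" when the account already has a GitHub OIDC provider
# (Terraform should look it up), "true" when one still has to be created.
should_create_oidc_provider() {
  if aws iam list-open-id-connect-providers \
       --query 'OpenIDConnectProviderList[].Arn' --output text |
     grep -q 'oidc-provider/token\.actions\.githubusercontent\.com'; then
    echo false
  else
    echo true
  fi
}
```

The result can be handed to Terraform through the environment, for example `export TF_VAR_create_oidc_provider="$(should_create_oidc_provider)"`.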

Step 2: Creating Environment-Specific IAM Roles

Each environment gets its own IAM role with tailored permissions:

# IAM Role for GitHub Actions (from modules/oidc/main.tf)
resource "aws_iam_role" "github_actions" {
  name = "github-actions-${var.environment}"

  assume_role_policy = jsonencode({
    Version = "2012-10-17"
    Statement = [
      {
        Action = "sts:AssumeRoleWithWebIdentity"
        Effect = "Allow"
        Principal = {
          Federated = var.create_oidc_provider ? aws_iam_openid_connect_provider.github[0].arn : data.aws_iam_openid_connect_provider.github[0].arn
        }
        Condition = {
          StringEquals = {
            "token.actions.githubusercontent.com:aud" : "sts.amazonaws.com"
          }
          StringLike = {
            # This is the magic - it accepts any workflow from my repo
            "token.actions.githubusercontent.com:sub" : "repo:${var.github_repository}:*"
          }
        }
      }
    ]
  })
}

The wildcard pattern (repo:owner/repo:*) accepts any branch, tag, or pull request in the repository. For production, you’d restrict the subject to repo:owner/repo:ref:refs/heads/main.
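To check a pattern before deploying it, you can approximate the StringLike comparison locally. This is only an illustration: `sub_matches` is a hypothetical helper, and shell `case` globbing is a close stand-in for IAM's `*` and `?` wildcards, not the same engine.

```shell
# Returns 0 when subject $2 matches pattern $1, non-zero otherwise.
# The pattern is left unquoted on purpose so that * and ? act as wildcards.
sub_matches() {
  case "$2" in
    $1) return 0 ;;
    *)  return 1 ;;
  esac
}
```

For example, `sub_matches 'repo:owner/repo:*' 'repo:owner/repo:ref:refs/heads/feature-x'` succeeds, while the main-only pattern rejects the same subject.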

Step 3: Configuring GitHub Actions Workflows

Here’s the workflow:

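name: Production Terraform

# Trigger on pushes to main (path filters are covered under Challenge 3)
on:
  push:
    branches: [main]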

env:
  TF_VERSION: "1.12.0"
  AWS_REGION: "us-east-2"
  ENVIRONMENT: "production"
  WORKSPACE: "production"

permissions:
  id-token: write      # This is crucial for OIDC
  contents: read
  pull-requests: write

jobs:
  terraform-apply:
    name: Terraform Apply
    runs-on: ubuntu-latest
    
    steps:
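      - name: Checkout
        uses: actions/checkout@v4

      # Install the pinned Terraform version (TF_VERSION is otherwise unused)
      - name: Setup Terraform
        uses: hashicorp/setup-terraform@v3
        with:
          terraform_version: ${{ env.TF_VERSION }}
          terraform_wrapper: false  # keep plain stdout for scripted `terraform output` calls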

      # This is where the magic happens. No secrets!
      - name: Configure AWS Credentials
        uses: aws-actions/configure-aws-credentials@v4
        with:
          role-to-assume: arn:aws:iam::123324351829:role/github-actions-global
          aws-region: ${{ env.AWS_REGION }}
          role-session-name: production-apply-${{ github.run_id }}-${{ github.run_number }}

      - name: Terraform Init
        working-directory: environments/terraform
        run: terraform init

      - name: Select Production Workspace
        working-directory: environments/terraform
        run: terraform workspace select production

      - name: Terraform Apply
        working-directory: environments/terraform
        run: |
          terraform apply \
            -var-file=working_ecs_production.tfvars \
            -auto-approve

Notice: no AWS credentials anywhere. The role-to-assume ARN isn't a secret; without a GitHub-signed token that satisfies the trust policy, knowing it grants nothing.

My Permission Discovery Method

The Trial-and-Error Approach That Actually Works

Instead of guessing permissions or using overly broad policies, I discovered exactly what my Terraform deployments need through systematic failure analysis:

# From modules/oidc/main.tf. These are the ACTUAL permissions I discovered
resource "aws_iam_role_policy" "terraform_permissions" {
  name = "terraform-permissions-${var.environment}"
  role = aws_iam_role.github_actions.id

  policy = jsonencode({
    Version = "2012-10-17"
    Statement = [
      # S3 permissions for state management
      {
        Effect = "Allow"
        Action = [
          "s3:GetObject",
          "s3:PutObject",
          "s3:DeleteObject",
          "s3:ListBucket"
        ]
        Resource = [
          "arn:aws:s3:::${var.state_bucket}",
          "arn:aws:s3:::${var.state_bucket}/*"
        ]
      },
      # DynamoDB for state locking. Discovered I needed these three
      {
        Effect = "Allow"
        Action = [
          "dynamodb:GetItem",
          "dynamodb:PutItem",
          "dynamodb:DeleteItem"
        ]
        Resource = "arn:aws:dynamodb:${var.aws_region}:*:table/${var.state_lock_table}"
      },
      # EC2/VPC/ELB (yeah, these need wildcards for Terraform)
      {
        Effect = "Allow"
        Action = [
          "ec2:*",
          "elasticloadbalancing:*",
          "autoscaling:*"
        ]
        Resource = "*"
      },
      # IAM (but ONLY for specific role patterns)
      {
        Effect = "Allow"
        Action = [
          "iam:GetRole",
          "iam:PassRole",
          "iam:CreateRole",
          "iam:CreatePolicy", # NOTE: acts on policy ARNs, so it needs a policy/* resource to take effect
          "iam:AttachRolePolicy",
          # ... more IAM actions
        ]
        Resource = [
          "arn:aws:iam::*:role/tf-playground-*",
          "arn:aws:iam::*:role/github-actions-*",
          "arn:aws:iam::*:role/staging-*",
          "arn:aws:iam::*:role/dev-*",
          "arn:aws:iam::*:role/production-*"
        ]
      }
    ]
  })
}

The Discovery Process

  1. Start with minimal permissions (just S3 for state)
  2. Run terraform apply in GitHub Actions
  3. Read the error message: It tells you exactly what’s missing
  4. Add that permission to the IAM policy
  5. Deploy the updated policy (manual terraform apply in global environment)
  6. Re-run the failed job
  7. Repeat until success

Example error message that guides you:

Error: creating EKS Cluster: AccessDeniedException: 
User: arn:aws:sts::123456789:assumed-role/github-actions-global/dev-apply-123-1 
is not authorized to perform: eks:CreateCluster

Clear, actionable, and no guessing required.
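When a run fails on several permissions at once, pulling the action names out of the log saves some scrolling. A small sketch, assuming the log contains messages in the format shown above; `missing_actions` is a hypothetical helper name.

```shell
# Reads Terraform/AWS output on stdin and prints each distinct IAM action
# named in a "not authorized to perform" message, one per line.
missing_actions() {
  grep -oE 'not authorized to perform: [A-Za-z0-9-]+:[A-Za-z0-9]+' |
    sed 's/.*perform: //' |
    sort -u
}
```

Feed it a saved log, for example `missing_actions < apply.log`, and add the printed actions to the policy before re-running the job.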

Environment-Specific Security Controls

Different environments need different security postures:

# This is how you actually lock it down
Condition = {
  StringEquals = {
    "token.actions.githubusercontent.com:aud" = "sts.amazonaws.com"
  }
  StringLike = {
    # Dev environment (any branch)
    "token.actions.githubusercontent.com:sub" = "repo:KajiMaster/terraform-playground:*"
    
    # Production. Main branch only (uncomment for production)
    # "token.actions.githubusercontent.com:sub" = "repo:KajiMaster/terraform-playground:ref:refs/heads/main"
  }
}

Key Implementation Challenges & Solutions

Challenge 1: Docker & ECR Authentication

Docker builds need to push to ECR, but docker login expects credentials. Here’s how OIDC handles it:

# Get ECR repository URL from Terraform output
cd environments/terraform
ECR_REPO=$(terraform output -raw ecr_repository_url 2>/dev/null || echo "")

# Stop early if the output is missing, rather than logging in to an empty registry
if [ -z "$ECR_REPO" ]; then
  echo "ecr_repository_url output not found" >&2
  exit 1
fi

# Extract registry URL from ECR repository URL
REGISTRY_URL=$(echo "$ECR_REPO" | cut -d'/' -f1)

# Login to ECR using the OIDC credentials
aws ecr get-login-password --region ${{ env.AWS_REGION }} | \
  docker login --username AWS --password-stdin "$REGISTRY_URL"

# Now you can push
docker push "$ECR_REPO:latest"

The configure-aws-credentials step exports the temporary credentials as environment variables, so the AWS CLI picks them up automatically when it generates the docker login token.

Challenge 2: Managing Multiple Environments

I use Terraform workspaces for environments. Each workflow selects its workspace:

- name: Select Workspace
  working-directory: environments/terraform
  run: terraform workspace select production

Benefits:

  • Single S3 bucket for all environments
  • Isolated state files prevent cross-environment conflicts
  • Easy environment promotion with consistent workflows
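One caveat with the workspace selection step: `terraform workspace select` fails if the workspace does not exist yet, which typically happens on the first run of a new environment. A minimal sketch of a fallback, assuming you are happy for the pipeline to create workspaces on demand; `use_workspace` is a hypothetical helper name.

```shell
# Select the workspace named $1, creating it on first use.
# `terraform workspace new` also switches to the workspace it creates.
use_workspace() {
  terraform workspace select "$1" 2>/dev/null ||
    terraform workspace new "$1"
}
```

In a workflow step this would be called as `use_workspace production`. For production you may prefer the plain `select`, so that a typo in the name fails loudly instead of creating an empty workspace.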

Challenge 3: Avoiding Unnecessary Deployments

Don’t trigger infrastructure deploys for docs changes:

on:
  push:
    branches: [main]
    paths-ignore:
      - 'docs/**'
      - 'scripts/**'
      - '**.md'
      - 'LICENSE'

This saves CI/CD minutes and keeps documentation-only commits from triggering infrastructure deployments.

Results & Benefits

Security Improvements

| Metric | Before OIDC | After OIDC | Improvement |
| --- | --- | --- | --- |
| Stored Credentials | ~8 secrets | 0 secrets | 100% reduction |
| Credential Lifetime | Years/Never expire | 15 minutes | Effectively temporary |
| Permission Scope | Often overly broad | Discovered via errors | Right-sized |
| Audit Trail | Basic | CloudTrail events with session names | Full traceability |
| Rotation Burden | Manual quarterly | None needed | Zero maintenance |

Operational Benefits

  1. Zero Credential Management

    • No rotation schedules
    • No panic when employees leave
    • No secrets in developer environments
  2. Precise Permissions

    • Discovered through actual usage
    • Environment-specific restrictions
    • Branch-based access control
  3. Complete Audit Trail

    • Every action tagged with workflow run ID
    • CloudTrail logs for compliance
    • Clear attribution of all changes
  4. Developer Experience

    • No local AWS credentials needed
    • Same workflows across all environments
    • Clear error messages guide permission fixes

Cost Analysis

  • Setup Time: 1 weekend (8 hours)
  • Monthly Savings: 2-3 hours of credential management
  • Security Value: Eliminated the risk of leaked long-lived credentials

Lessons Learned

What Worked Well

The trial-and-error permission discovery.
Sounds hacky, but Terraform errors are so clear that it’s actually faster than trying to predict permissions upfront.

Global environment pattern.
Having OIDC infrastructure separate from application infrastructure prevents circular dependencies.

Workspace-based environment isolation.
Simple, effective, and works with existing Terraform patterns.

What I’d Do Differently

Start with dev environment only.
Get the pattern working before adding staging and production complexity.

Document discovered permissions immediately.
I had to rediscover some permissions when setting up new services.

Consider AWS IAM Identity Center (formerly AWS SSO) for human access too.
OIDC solved CI/CD, but developers still use long-lived credentials locally.

Implementation Guide

Quick Start for Your Project

  1. Create OIDC Provider (one-time setup)

     terraform apply -target=aws_iam_openid_connect_provider.github

  2. Create IAM Role with trust relationship to GitHub

     "token.actions.githubusercontent.com:sub": "repo:YOUR_ORG/YOUR_REPO:*"

  3. Update GitHub Workflow

     - uses: aws-actions/configure-aws-credentials@v4
       with:
         role-to-assume: arn:aws:iam::ACCOUNT:role/github-actions
         aws-region: us-east-1

  4. Iterate on Permissions using the failure messages

Common Pitfalls to Avoid

  • Don’t forget id-token: write permission in workflow
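  • Thumbprints occasionally change (GitHub provides updates), although AWS now validates GitHub's certificate against its own trusted root CAs, so a stale thumbprint is less likely to break authentication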
  • Role session names help with CloudTrail debugging
  • Start with wildcards, tighten after it works

Next Steps

This OIDC implementation is just the beginning. Future enhancements could include:

  • Federated access for developers using GitHub as identity provider
  • Cost allocation tags based on workflow metadata
  • Automated permission discovery using CloudTrail analysis
  • Multi-cloud support (Azure and GCP have similar OIDC capabilities)

“Security doesn’t have to be complex. Sometimes the simplest solution (stop storing secrets) is the best solution. OIDC proved that for me.”

Want to implement this in your infrastructure? Check out the complete code: terraform-playground on GitHub

Need help with your CI/CD security? Contact me for consultation.