The Staging Environment Bottleneck
Does this sound familiar? Your team has one, maybe two, shared staging environments. Developer A is using it for a big feature test. Developer B needs to verify a quick bug fix, but has to wait. A product manager wants to review a new UI, but the environment is currently broken. This contention creates a massive bottleneck, slowing down review cycles and frustrating everyone.
Shared, static environments are a relic of a past era. We decided to kill our staging server and replace it with something far more powerful: dynamic, on-demand ephemeral environments created for every single pull request.
The Pull Request-Powered Workflow
The concept is simple but transformative: the pull request becomes the source of truth for a live, running version of the code. The lifecycle is completely automated by GitHub Actions and managed by Terraform.
Here’s how it works:
graph TD
A[Dev Opens PR] --> B{GitHub Action Triggered};
B -- on `pull_request` --> C[Terraform Plan & Apply];
C --> D[Create Isolated Environment];
D --> E{Post URL to PR Comment};
E --> F[Reviewers Test Live Preview];
F --> G{PR Merged/Closed};
G --> H{GitHub Action Triggered};
H -- on `closed` --> I[Terraform Destroy];
I --> J[Tear Down Environment];
How We Built It: Terraform Workspaces + GitHub Actions
This entire system hinges on two key technologies:
- Terraform Workspaces: This is the secret sauce. Workspaces let you manage multiple, distinct states of the same Terraform configuration. We create a new workspace for each pull request (e.g., pr-123), which lets Terraform spin up a completely isolated copy of our infrastructure (databases, services, etc.) without interfering with production or other PRs, provided resource names include the workspace name so that copies never collide.
- GitHub Actions: The orchestrator that ties everything together. A workflow file listens for pull_request events and runs the appropriate Terraform commands.
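To make the workspace-per-PR idea concrete, here is a minimal shell sketch. The helper names workspace_name and ensure_workspace are hypothetical and only for illustration. The Terraform commands themselves (terraform workspace select and terraform workspace new) are standard CLI subcommands.

```shell
# Sketch: map a PR number to a Terraform workspace.
# Helper names are hypothetical; written to run under POSIX sh or bash.

# Build the workspace name for a PR, e.g. 123 -> pr-123.
workspace_name() {
  case "$1" in
    # Reject empty input and anything containing a non-digit.
    ''|*[!0-9]*) echo "invalid PR number: $1" >&2; return 1 ;;
  esac
  printf 'pr-%s\n' "$1"
}

# Select the PR's workspace, creating it if it does not exist yet.
ensure_workspace() {
  ws="$(workspace_name "$1")" || return 1
  terraform workspace select "$ws" 2>/dev/null || terraform workspace new "$ws"
}
```

A typical call would be ensure_workspace 123 followed by terraform apply -auto-approve. Within the configuration, the built-in terraform.workspace value can then be interpolated into resource names to keep each PR's resources distinct.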
Here’s what the GitHub Actions workflow looks like:
# .github/workflows/preview-env.yml
name: Preview Environment
on:
  pull_request:
    # "reopened" recreates the environment if a closed PR is reopened
    types: [opened, reopened, synchronize, closed]
jobs:
  preview:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v3
      - name: Configure AWS Credentials
        uses: aws-actions/configure-aws-credentials@v1
        with:
          aws-access-key-id: ${{ secrets.AWS_ACCESS_KEY_ID }}
          aws-secret-access-key: ${{ secrets.AWS_SECRET_ACCESS_KEY }}
          aws-region: us-east-1
      - name: Setup Terraform
        uses: hashicorp/setup-terraform@v2
      - name: 'Terraform: Create or Destroy Environment'
        env:
          # Unique workspace name for this PR, e.g. pr-123
          WORKSPACE: pr-${{ github.event.number }}
        run: |
          terraform init
          # Select the PR's workspace, creating it on the first run
          terraform workspace select "$WORKSPACE" || terraform workspace new "$WORKSPACE"
          if [ "${{ github.event.action }}" = "closed" ]; then
            echo "PR closed, destroying environment..."
            terraform destroy -auto-approve
            # Remove the now-empty workspace so stale ones don't accumulate
            terraform workspace select default
            terraform workspace delete "$WORKSPACE"
          else
            echo "PR opened/updated, creating environment..."
            terraform apply -auto-approve
          fi
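The workflow above does not show the step that posts the preview URL back to the pull request. One way to do it, sketched here under stated assumptions, is to read a Terraform output and comment with the GitHub CLI. The output name preview_url and the helper names are hypothetical. The commands terraform output -raw and gh pr comment --body are existing CLI features.

```shell
# Sketch: post the preview URL as a PR comment.
# "preview_url" and the helper names are assumptions for illustration.

# Build the comment body for a given preview URL.
preview_comment() {
  printf 'Preview environment is ready: %s\n' "$1"
}

# Read the URL from Terraform and comment on the PR via the GitHub CLI.
post_preview_url() {
  pr_number="$1"
  # Assumes the configuration declares an output named "preview_url".
  url="$(terraform output -raw preview_url)" || return 1
  gh pr comment "$pr_number" --body "$(preview_comment "$url")"
}
```

In a workflow step this would run after terraform apply, for example as post_preview_url "${{ github.event.number }}". The GitHub CLI needs a token in the GH_TOKEN environment variable, and the job needs permission to write to pull requests. If the setup-terraform wrapper script interferes with the captured output, disabling it with terraform_wrapper: false is one option.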
As a safety net for abandoned PRs, we also add a ttl (time-to-live) tag to every resource. A cleanup script automatically destroys any environment older than 24 hours, which keeps costs under control.
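The expiry check behind such a cleanup script can be quite small. The sketch below is illustrative and not our actual script: the function names and the "workspace created_epoch" input format are assumptions. How creation times are collected (for example, from the ttl tag) is left out.

```shell
# Sketch: TTL-based cleanup. Function names and input format are assumed.

# is_expired CREATED_EPOCH NOW_EPOCH [TTL_HOURS]
# Succeeds (exit 0) when the environment is strictly older than the TTL
# (24 hours by default).
is_expired() {
  ttl_seconds=$(( ${3:-24} * 3600 ))
  [ $(( $2 - $1 )) -gt "$ttl_seconds" ]
}

# Reads "workspace created_epoch" pairs from stdin and destroys every
# environment whose age exceeds the default TTL.
cleanup_expired() {
  now="$(date +%s)"
  while read -r ws created; do
    if is_expired "$created" "$now"; then
      echo "destroying expired environment: $ws"
      # </dev/null keeps terraform from consuming the loop's stdin.
      terraform workspace select "$ws" </dev/null &&
        terraform destroy -auto-approve </dev/null
    fi
  done
}
```

For instance, is_expired 0 90000 succeeds because 25 hours have elapsed, while is_expired 0 3600 fails because only one hour has passed.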
The Results: Unlocking Developer Velocity
Switching to ephemeral environments had a massive impact on our workflow:
- Zero Environment Contention: Every PR gets its own isolated stack. No more waiting.
- High-Fidelity Reviews: Product managers, designers, and QA can test changes in a live, production-like environment, leading to better feedback.
- Faster Feedback Loops: Developers get feedback in hours, not days.
- Cost Control: We only pay for what we use, and automated teardown prevents runaway cloud bills.
This isn't just an infrastructure improvement; it's a fundamental upgrade to the developer experience that enables teams to ship better products, faster.