Kubernetes Ephemeral Test Environments: Spin Up a Fresh Cluster Per PR, Tear Down After Merge

The shared staging environment problem is universal: Team A is testing a database migration, Team B is running load tests, and Team C is demonstrating a new feature to a customer. All three teams compete for the same environment, which is now simultaneously broken in three different ways.

Ephemeral environments solve this by giving every pull request or branch its own disposable, isolated environment. Spin it up when the PR opens, run tests against it, use it for review, and tear it down when the PR is merged or closed. The infrastructure cost is temporary and proportional to the number of active PRs.

This guide covers implementing ephemeral environments with Kubernetes namespace isolation: each PR gets its own namespace in a shared cluster rather than a dedicated cluster.

Architecture Overview

flowchart TD
    A[Developer opens PR] --> B[CI workflow triggered]
    B --> C[Build Docker images]
    C --> D[Create Kubernetes namespace:\npr-42-feature-auth]
    D --> E[Deploy services to namespace]
    E --> F[Run automated tests]
    F --> G{Tests pass?}
    G -->|Yes| H[Post PR URL as comment]
    G -->|No| I[Post failure report]
    H --> J[Team uses environment for review]
    J --> K[PR merged or closed]
    K --> L[Delete namespace + all resources]
    L --> M[Environment gone]

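Each namespace runs its own copy of the stack (frontend, backend, database, and worker services), so PRs share no application state. Namespaces separate resources logically; blocking network traffic between them additionally requires NetworkPolicy rules.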

Namespace Isolation Pattern

The core Kubernetes primitive is the namespace. Each ephemeral environment gets a namespace derived from the PR number:

# deploy/ephemeral/namespace.yaml
apiVersion: v1
kind: Namespace
metadata:
  name: pr-${PR_NUMBER}
  labels:
    environment: ephemeral
    pr-number: '${PR_NUMBER}'
    created-by: 'ci'
  annotations:
    # Auto-cleanup annotation (requires custom controller or scheduled job)
    ttl: '7d'
    created-at: '${CREATED_AT}'
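
Namespace names must be valid DNS-1123 labels: lowercase alphanumerics or `-`, starting and ending with an alphanumeric, and at most 63 characters. The manifest above uses `pr-${PR_NUMBER}`, while the flowchart shows a name that also carries the branch (`pr-42-feature-auth`). The following is a minimal sketch of deriving either form safely; `namespaceForPr` is a hypothetical helper, not part of the original scripts.

```typescript
// Kubernetes namespace names must be valid DNS-1123 labels (max 63 chars).
const MAX_NAMESPACE_LENGTH = 63;

// Hypothetical helper: returns "pr-<number>" or "pr-<number>-<branch-slug>".
function namespaceForPr(prNumber: number, branch?: string): string {
  if (!Number.isInteger(prNumber) || prNumber <= 0) {
    throw new Error(`Invalid PR number: ${prNumber}`);
  }
  const base = `pr-${prNumber}`;
  if (!branch) {
    return base;
  }
  const slug = branch
    .toLowerCase()
    .replace(/[^a-z0-9]+/g, "-") // collapse runs of disallowed characters into '-'
    .replace(/^-+|-+$/g, ""); // trim leading and trailing '-'
  if (slug === "") {
    return base;
  }
  // Cut to the length limit, then drop any '-' left dangling by the cut.
  return `${base}-${slug}`.slice(0, MAX_NAMESPACE_LENGTH).replace(/-+$/, "");
}
```

For example, `namespaceForPr(42, "feature/auth")` yields `pr-42-feature-auth`, and omitting the branch yields the plain `pr-42` used in the rest of this guide.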

#!/bin/bash
# deploy/scripts/create-ephemeral-env.sh
set -e

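# Fail with a usage message if the PR number argument is missing
PR_NUMBER="${1:?Usage: create-ephemeral-env.sh PR_NUMBER [IMAGE_TAG]}"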
IMAGE_TAG=${2:-"pr-${PR_NUMBER}"}
NAMESPACE="pr-${PR_NUMBER}"

echo "Creating ephemeral environment for PR #${PR_NUMBER}..."

# Create namespace
kubectl create namespace "$NAMESPACE" --dry-run=client -o yaml | kubectl apply -f -

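# Label for lifecycle management (--overwrite keeps re-runs idempotent)
kubectl label namespace "$NAMESPACE" --overwrite \
  environment=ephemeral \
  pr-number="${PR_NUMBER}"

# Timestamps contain ':' which is not allowed in label values, so store
# created-at as an annotation (the cleanup job reads it from there).
# Note: --overwrite refreshes the timestamp on every redeploy.
kubectl annotate namespace "$NAMESPACE" --overwrite \
  created-at="$(date -u +%Y-%m-%dT%H:%M:%SZ)"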

# Deploy all services with PR-specific image tags
helm upgrade --install "scanly-pr-${PR_NUMBER}" ./helm/scanly \
  --namespace "$NAMESPACE" \
  --set image.tag="${IMAGE_TAG}" \
  --set environment=ephemeral \
  --set database.url="${EPHEMERAL_DB_URL}" \
  --set ingress.host="pr-${PR_NUMBER}.staging.scanlyapp.com" \
  --wait \
  --timeout 5m

echo "Environment ready: https://pr-${PR_NUMBER}.staging.scanlyapp.com"
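
The deploy script and the teardown job must agree on the namespace, the Helm release name, and the hostname, all of which derive from the PR number. One way to keep them in sync is to compute them in a single place. The sketch below mirrors the values used in the script above; `describeEphemeralEnv` and `helmUpgradeArgs` are hypothetical names, and the `database.url` setting is left out for brevity.

```typescript
interface EphemeralEnv {
  namespace: string;
  helmRelease: string;
  host: string;
  url: string;
}

// Values copied from the deploy script above.
const BASE_DOMAIN = "staging.scanlyapp.com";
const RELEASE_PREFIX = "scanly";

// Hypothetical helper: every name derived from the PR number, in one place.
function describeEphemeralEnv(prNumber: number): EphemeralEnv {
  if (!Number.isInteger(prNumber) || prNumber <= 0) {
    throw new Error(`Invalid PR number: ${prNumber}`);
  }
  const namespace = `pr-${prNumber}`;
  const host = `${namespace}.${BASE_DOMAIN}`;
  return {
    namespace,
    helmRelease: `${RELEASE_PREFIX}-${namespace}`,
    host,
    url: `https://${host}`,
  };
}

// Builds the argument list for `helm`, matching the flags in the script.
function helmUpgradeArgs(env: EphemeralEnv, imageTag: string): string[] {
  return [
    "upgrade", "--install", env.helmRelease, "./helm/scanly",
    "--namespace", env.namespace,
    "--set", `image.tag=${imageTag}`,
    "--set", "environment=ephemeral",
    "--set", `ingress.host=${env.host}`,
    "--wait",
    "--timeout", "5m",
  ];
}
```

Passing the arguments as an array (for instance to a process-spawning API) rather than as one interpolated string also avoids shell-quoting problems.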

CI/CD Integration

# .github/workflows/ephemeral-env.yml
name: Ephemeral Environment
on:
  pull_request:
    types: [opened, synchronize, reopened, closed]

jobs:
  deploy-ephemeral:
    if: github.event.action != 'closed'
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4

      - name: Configure kubectl
        uses: azure/k8s-set-context@v3
        with:
          kubeconfig: ${{ secrets.KUBECONFIG_STAGING }}

      - name: Build and push images
        run: |
          docker build -t registry.example.com/scanly:pr-${{ github.event.pull_request.number }} .
          docker push registry.example.com/scanly:pr-${{ github.event.pull_request.number }}

      - name: Deploy ephemeral environment
        run: |
          bash deploy/scripts/create-ephemeral-env.sh \
            ${{ github.event.pull_request.number }} \
            "pr-${{ github.event.pull_request.number }}"

      - name: Run smoke tests
        run: |
          PR_URL="https://pr-${{ github.event.pull_request.number }}.staging.scanlyapp.com"

          # Wait for health check
          npx wait-on "$PR_URL/api/health" --timeout 120000

          # Run smoke test suite
          BASE_URL="${PR_URL}" npx playwright test tests/smoke/

      - name: Comment PR with environment URL
        uses: actions/github-script@v7
        with:
          script: |
            const url = `https://pr-${{ github.event.pull_request.number }}.staging.scanlyapp.com`;
            await github.rest.issues.createComment({
              owner: context.repo.owner,
              repo: context.repo.repo,
              issue_number: context.issue.number,
              body: `🚀 **Preview environment ready:** [${url}](${url})\n\nAutomatically tears down when PR is closed.`
            });

  teardown-ephemeral:
    if: github.event.action == 'closed'
    runs-on: ubuntu-latest
    steps:
      - name: Configure kubectl
        uses: azure/k8s-set-context@v3
        with:
          kubeconfig: ${{ secrets.KUBECONFIG_STAGING }}

      - name: Tear down ephemeral environment
        run: |
          NAMESPACE="pr-${{ github.event.pull_request.number }}"
          helm uninstall "scanly-${NAMESPACE}" --namespace "$NAMESPACE" --ignore-not-found
          kubectl delete namespace "$NAMESPACE" --ignore-not-found
          echo "Environment torn down."
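
Because the workflow also runs on `synchronize`, the comment step as written posts a new comment on every push. A possible refinement is to update the existing comment instead. The sketch below shows only the decision logic, as a pure function over comments already fetched (for example via `github.rest.issues.listComments`); `planPreviewComment` and its types are hypothetical, and it recognises the workflow's own comment by the heading text used above.

```typescript
// Heading text taken from the comment body in the workflow above.
const PREVIEW_HEADING = "**Preview environment ready:**";

interface PrComment {
  id: number;
  body: string;
}

type CommentPlan =
  | { action: "create"; body: string }
  | { action: "update"; commentId: number; body: string };

// Decide whether to create a new preview comment or update an earlier one.
function planPreviewComment(existing: PrComment[], url: string): CommentPlan {
  const body =
    `🚀 ${PREVIEW_HEADING} [${url}](${url})\n\n` +
    "Automatically tears down when PR is closed.";
  const previous = existing.find((c) => c.body.includes(PREVIEW_HEADING));
  if (previous) {
    return { action: "update", commentId: previous.id, body };
  }
  return { action: "create", body };
}
```

In a `github-script` step, an `update` plan would map to `github.rest.issues.updateComment` with `comment_id`, and a `create` plan to the `createComment` call already shown.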

Database Isolation Strategies

Each ephemeral environment needs its own data store. Four common approaches:

Strategy	Setup Time	Isolation	Cost	Complexity
Separate DB per namespace	2-3 min	Full	Medium	Low
DB namespace (schema per PR)	30s	Logical	Low	Medium
Shared DB with RLS per env	0s	Logical	Minimal	High
Point-in-time snapshot branch	1-2 min	Full + real data	Medium	Medium

The schema-per-PR approach (the "DB namespace" row) usually offers the best balance of setup speed and complexity for CI:

-- Create a schema-per-PR namespace
-- Run on test DB cluster during environment setup

DO $$
DECLARE
  pr_schema TEXT := 'pr_' || {PR_NUMBER};
BEGIN
  EXECUTE format('CREATE SCHEMA IF NOT EXISTS %I', pr_schema);
  EXECUTE format('SET search_path TO %I, public', pr_schema);
  -- Run migrations in this schema
END $$;
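
A per-PR schema lives in the shared database, so deleting the Kubernetes namespace does not remove it; teardown needs a matching `DROP SCHEMA`. The sketch below generates both statements from a validated PR number, which also keeps arbitrary text out of the SQL. The function names are hypothetical, and running the statements against the database is left to whatever client the pipeline already uses.

```typescript
// Hypothetical helper: schema name for a PR, matching the 'pr_' prefix above.
function schemaForPr(prNumber: number): string {
  // Accept only positive integers so nothing else can reach the SQL text.
  if (!Number.isInteger(prNumber) || prNumber <= 0) {
    throw new Error(`Invalid PR number: ${prNumber}`);
  }
  return `pr_${prNumber}`;
}

// Standard SQL identifier quoting: wrap in double quotes, double any inside.
function quoteIdent(ident: string): string {
  return '"' + ident.replace(/"/g, '""') + '"';
}

// Statement to run during environment setup.
function createSchemaSql(prNumber: number): string {
  return `CREATE SCHEMA IF NOT EXISTS ${quoteIdent(schemaForPr(prNumber))};`;
}

// Statement to run during teardown; CASCADE removes the schema's tables too.
function dropSchemaSql(prNumber: number): string {
  return `DROP SCHEMA IF EXISTS ${quoteIdent(schemaForPr(prNumber))} CASCADE;`;
}
```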

Resource Limits and Cost Control

Ephemeral environments should have strict resource limits to prevent cost runaway:

# deploy/ephemeral/resource-quota.yaml
apiVersion: v1
kind: ResourceQuota
metadata:
  name: ephemeral-quota
  namespace: pr-${PR_NUMBER}
spec:
  hard:
    requests.cpu: '2'
    requests.memory: 4Gi
    limits.cpu: '4'
    limits.memory: 8Gi
    pods: '20'
    services: '10'
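
The quota caps what a single environment can reserve, so the worst-case reservation across the cluster is simply the quota multiplied by the number of open PRs. A small sketch of that arithmetic follows; the request values come from the manifest above, while the function names, PR counts, and cluster sizes are illustrative assumptions.

```typescript
interface EnvQuota {
  requestsCpu: number; // CPU cores
  requestsMemoryGi: number; // GiB
}

// requests.cpu and requests.memory from the ResourceQuota above.
const EPHEMERAL_QUOTA: EnvQuota = { requestsCpu: 2, requestsMemoryGi: 4 };

// Upper bound on what all open PR environments can request together.
function worstCaseRequests(openPrs: number, quota: EnvQuota): { cpu: number; memoryGi: number } {
  return {
    cpu: openPrs * quota.requestsCpu,
    memoryGi: openPrs * quota.requestsMemoryGi,
  };
}

// How many fully loaded environments fit in the given allocatable capacity.
function maxEnvironments(clusterCpu: number, clusterMemoryGi: number, quota: EnvQuota): number {
  return Math.floor(
    Math.min(clusterCpu / quota.requestsCpu, clusterMemoryGi / quota.requestsMemoryGi)
  );
}
```

With 15 open PRs, the ceiling is 30 CPU cores and 60 GiB of requested memory. A cluster with 32 allocatable cores and 128 GiB would be CPU-bound at 16 such environments.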

#!/bin/bash
# Cleanup stale ephemeral environments (runs as daily cron job).
# Applies a fixed MAX_AGE_HOURS; the ttl annotation is not consulted here.
# Requires jq and GNU date (for `date -d`).

# Find namespaces older than MAX_AGE_HOURS
MAX_AGE_HOURS=48

kubectl get namespaces -l environment=ephemeral -o json | \
  jq -r '.items[] |
    select(.metadata.annotations["created-at"] != null) |
    .metadata.name + " " + .metadata.annotations["created-at"]' | \
  while read -r namespace created_at; do
    age_hours=$(( ($(date +%s) - $(date -d "$created_at" +%s)) / 3600 ))

    if [ "$age_hours" -gt "$MAX_AGE_HOURS" ]; then
      echo "Deleting stale namespace: $namespace (age: ${age_hours}h)"
      kubectl delete namespace "$namespace"
    fi
  done
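
The staleness rule in the script is easy to get subtly wrong (for example at exactly 48 hours), so it can help to express it as a pure function that is unit-testable without a cluster. This is a sketch of the same rule under the same assumptions: namespaces without a `created-at` value are skipped, and age is measured in whole hours. `NamespaceInfo` and `staleNamespaces` are hypothetical names.

```typescript
const MS_PER_HOUR = 60 * 60 * 1000;

interface NamespaceInfo {
  name: string;
  createdAt?: string; // ISO 8601 timestamp from the created-at annotation
}

// Age in whole hours, like the integer division in the shell script.
function ageInHours(createdAt: string, now: Date): number {
  const created = Date.parse(createdAt);
  if (Number.isNaN(created)) {
    throw new Error(`Unparseable timestamp: ${createdAt}`);
  }
  return Math.floor((now.getTime() - created) / MS_PER_HOUR);
}

// Names of namespaces strictly older than maxAgeHours.
function staleNamespaces(
  namespaces: NamespaceInfo[],
  now: Date,
  maxAgeHours: number = 48
): string[] {
  return namespaces
    .filter((ns) => ns.createdAt !== undefined && ageInHours(ns.createdAt, now) > maxAgeHours)
    .map((ns) => ns.name);
}
```

Taking `now` as a parameter rather than reading the clock inside the function is what makes the boundary cases reproducible in tests.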

Testing Within Ephemeral Environments

The ephemeral environment enables tests that are impractical on shared staging, such as a full new-user journey that relies on the environment's own mailbox and worker:

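// tests/smoke/ephemeral.test.ts
// Runs in the ephemeral environment context.
// Relative URLs such as '/signup' assume baseURL is set from BASE_URL in the Playwright config.
import { test, expect } from '@playwright/test';
// Hypothetical project helper (not shown here) that reads the OTP from the environment's mailbox
import { getEmailOtp } from '../helpers/email';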

test('full signup → onboarding → first scan flow', async ({ page }) => {
  const uniqueEmail = `e2e-${Date.now()}@test.example.com`;

  // Test the complete new-user journey in isolated env
  await page.goto('/signup');
  await page.fill('[data-testid="email"]', uniqueEmail);
  await page.fill('[data-testid="password"]', 'TestPassword123!');
  await page.click('[data-testid="signup-btn"]');

  // Complete email verification (ephemeral env has own mailbox)
  const otp = await getEmailOtp(uniqueEmail);
  await page.fill('[data-testid="otp-input"]', otp);
  await page.click('[data-testid="verify-btn"]');

  // Complete onboarding
  await page.fill('[data-testid="project-name"]', 'My Test Project');
  await page.fill('[data-testid="project-url"]', 'https://example.com');
  await page.click('[data-testid="start-scan-btn"]');

  // Wait for first scan to complete (ephemeral env has own worker)
  await expect(page.locator('[data-testid="scan-status"]')).toHaveText('Complete', { timeout: 60_000 });
});
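
The test calls `getEmailOtp`, which this guide does not show. How it fetches the message depends entirely on the mailbox service the environment uses, but the parsing half might look like the sketch below. It assumes a six-digit numeric code, which is an assumption rather than something stated above; `extractOtp` is a hypothetical name.

```typescript
// Hypothetical helper: pull a six-digit code out of an email's text body.
// The six-digit format is an assumption; adjust the pattern to the real one.
function extractOtp(messageText: string): string {
  // \b on both sides keeps longer digit runs (e.g. order numbers) from matching.
  const match = messageText.match(/\b\d{6}\b/);
  const code = match ? match[0] : undefined;
  if (code === undefined) {
    throw new Error("No six-digit code found in message");
  }
  return code;
}
```

Throwing when no code is found makes a missing or malformed email fail the test at the point of the problem, rather than later when an empty value is typed into the form.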

Ephemeral Kubernetes environments are one of the most effective testing infrastructure patterns available to SaaS teams. Once set up, they change the nature of pre-merge review: every PR has a live, testable, throwaway instance of the full system.
