Test Data Management: How to Stop Hardcoded Fixtures From Breaking Your CI Pipeline

Master test data management with strategies for creating, managing, and securing test data including fixtures, synthetic data generation, data privacy compliance, and database seeding techniques.

Michael Chen

QA Architect specializing in test infrastructure and data management

15 min read

Related articles: Also see database testing practices that keep your test data consistent, AI-generated test data as a scalable source for your data management strategy, and generating the realistic personas your test data strategy needs.

Your test suite was green yesterday. Today, 47 tests are failing. Nothing in the code changed.

After hours of debugging, you discover the issue: someone manually deleted test user accounts from the shared database during exploratory testing. Your tests depend on those specific email addresses existing. Now everything is broken.

Sound familiar?

Poor test data management is one of the most common, yet least discussed, causes of flaky tests, slow suites, and maintenance headaches. As your application grows, so does the number of entities, relationships, and environments your test data has to cover.

This guide covers strategies for creating, managing, and securing test data: static fixtures, factory functions, synthetic data generation and database seeding, privacy-compliant masking of production data, API-based setup, and cleanup. The goal throughout is tests that stay fast, reliable, and maintainable.

The Test Data Challenge

Why Test Data Is Hard

1. Data Dependencies

Tests often require:

  • Specific user accounts with specific permissions
  • Related data (orders → line items → products)
  • Historical state (created timestamps, modification history)
  • Edge cases (expired trials, overdue invoices, etc.)

2. Environment Management

  • Dev sandbox vs. staging vs. local environments
  • Shared vs. isolated test databases
  • Production-like data vs. synthetic data

3. Privacy and Compliance

  • GDPR, CCPA, HIPAA requirements
  • PII (Personally Identifiable Information) handling
  • Data anonymization and masking

4. Performance

  • Large datasets slow down tests
  • Database seeding takes time
  • Cleanup can be as complex as setup

Test Data Strategy Overview

graph TD
    A[Test Data Strategy] --> B[Static Fixtures]
    A --> C[Dynamic Factories]
    A --> D[Database Seeding]
    A --> E[Synthetic Data]
    A --> F[API-Based Setup]

    B --> B1[JSON files]
    B --> B2[Code constants]

    C --> C1[Factory functions]
    C --> C2[Faker.js]

    D --> D1[SQL scripts]
    D --> D2[ORM seeders]

    E --> E1[AI-generated]
    E --> E2[Production sampling]

    F --> F1[REST APIs]
    F --> F2[GraphQL mutations]

Strategy 1: Test Fixtures

Best for: Simple, predictable data that rarely changes.

JSON Fixtures

// fixtures/users.json
{
  "admin": {
    "id": "user_admin_001",
    "email": "admin@example.com",
    "name": "Admin User",
    "role": "admin",
    "createdAt": "2025-01-01T00:00:00Z"
  },
  "regularUser": {
    "id": "user_regular_001",
    "email": "user@example.com",
    "name": "Regular User",
    "role": "user",
    "createdAt": "2025-01-01T00:00:00Z"
  },
  "trialUser": {
    "id": "user_trial_001",
    "email": "trial@example.com",
    "name": "Trial User",
    "role": "user",
    "subscription": {
      "plan": "trial",
      "expiresAt": "2026-12-31T23:59:59Z"
    }
  }
}

// tests/fixtures.ts
import users from './fixtures/users.json';
import projects from './fixtures/projects.json';

export function getFixture<T>(type: string, key: string): T {
  const fixtures: Record<string, any> = {
    users,
    projects,
  };

  if (!fixtures[type] || !fixtures[type][key]) {
    throw new Error(`Fixture not found: ${type}.${key}`);
  }

  return fixtures[type][key] as T;
}

// Usage in tests
test('admin can delete projects', async ({ page }) => {
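  // Pass the expected shape explicitly; otherwise T is inferred as unknown
  const admin = getFixture<{ email: string; role: string }>('users', 'admin');
  const project = getFixture<{ id: string }>('projects', 'testProject');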

  await loginAs(page, admin);
  await deleteProject(page, project.id);
});

Pros:

  • Version controlled
  • Easy to debug (inspect JSON directly)
  • Fast (no database calls)
  • Predictable

Cons:

  • Brittle (hard-coded IDs, timestamps)
  • Not dynamic (can't generate random data)
  • Can become stale as schema evolves
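The staleness problem can be caught early by checking fixture shapes before any test runs. The sketch below is one way to do that without extra dependencies; `validateUserFixture`, `assertFixturesValid`, and the required-field list are hypothetical names chosen for illustration, and the field set assumes the `users.json` shape shown above.

```typescript
// Hypothetical shape check for user fixtures loaded from JSON.
// The required fields mirror the users.json example; adjust to your schema.
const REQUIRED_STRING_FIELDS = ['id', 'email', 'name', 'role'];
const KNOWN_ROLES = ['admin', 'user'];

// Returns a list of problems; an empty list means the fixture looks current.
function validateUserFixture(key: string, value: unknown): string[] {
  if (typeof value !== 'object' || value === null) {
    return [`${key}: expected an object`];
  }
  const record = value as Record<string, unknown>;
  const problems: string[] = [];

  for (const field of REQUIRED_STRING_FIELDS) {
    if (typeof record[field] !== 'string') {
      problems.push(`${key}: missing or non-string field "${field}"`);
    }
  }

  const role = record.role;
  if (typeof role === 'string' && !KNOWN_ROLES.includes(role)) {
    problems.push(`${key}: unknown role "${role}"`);
  }
  return problems;
}

// Validate every entry once, for example from a global setup hook.
function assertFixturesValid(fixtures: Record<string, unknown>): void {
  const problems = Object.entries(fixtures).flatMap(([key, value]) =>
    validateUserFixture(key, value),
  );
  if (problems.length > 0) {
    throw new Error(`Stale fixtures detected:\n${problems.join('\n')}`);
  }
}
```

Running a check like this once per test run turns a schema drift into a single clear error instead of dozens of unrelated test failures.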

Code-Based Fixtures

// fixtures/users.ts
// Users and projects are separate constants so that `projects` can reference
// `users` without `fixtures` referring to itself inside its own initializer
// (which TypeScript reports as an implicit-any circularity in strict mode).
const users = {
  admin: () => ({
    id: 'admin-001',
    email: 'admin@example.com',
    name: 'Admin User',
    role: 'admin',
    permissions: ['read', 'write', 'delete', 'admin'],
    createdAt: new Date('2025-01-01'),
  }),

  user: () => ({
    id: 'user-001',
    email: 'user@example.com',
    name: 'Regular User',
    role: 'user',
    permissions: ['read', 'write'],
    createdAt: new Date('2025-01-01'),
  }),
};

const projects = {
  active: () => ({
    id: 'proj-001',
    name: 'Test Project',
    status: 'active',
    ownerId: users.user().id,
    createdAt: new Date('2025-01-01'),
  }),

  archived: () => ({
    id: 'proj-002',
    name: 'Archived Project',
    status: 'archived',
    ownerId: users.user().id,
    createdAt: new Date('2024-01-01'),
    archivedAt: new Date('2024-12-01'),
  }),
};

export const fixtures = { users, projects };

type Fixtures = typeof fixtures;

// Type-safe fixture getter: the return type follows the chosen category and name
export function getFixture<K extends keyof Fixtures, N extends keyof Fixtures[K]>(
  category: K,
  name: N,
): Fixtures[K][N] extends () => infer R ? R : never {
  // Cast needed: TypeScript cannot prove that every indexed member is callable
  const create = fixtures[category][name] as unknown as () => any;
  return create();
}

Strategy 2: Factory Functions

Best for: Dynamic test data with randomization and overrides.

Building a Test Factory

// factories/user.factory.ts
import { faker } from '@faker-js/faker';

interface User {
  id: string;
  email: string;
  name: string;
  role: 'user' | 'admin' | 'moderator';
  subscription?: {
    plan: string;
    status: 'active' | 'cancelled' | 'expired';
    expiresAt: Date;
  };
  createdAt: Date;
}

export class UserFactory {
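  // Static defaults only. Time-dependent fields such as createdAt are computed
  // per call in build(); storing new Date() here would freeze one timestamp
  // for every user created by the singleton.
  private defaults: Partial<User> = {
    role: 'user',
  };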

  /**
   * Create a user with random data
   */
  build(overrides?: Partial<User>): User {
    return {
      id: faker.string.uuid(),
      email: faker.internet.email(),
      name: faker.person.fullName(),
      role: 'user',
      createdAt: new Date(),
      ...this.defaults,
      ...overrides,
    };
  }

  /**
   * Create multiple users
   */
  buildList(count: number, overrides?: Partial<User>): User[] {
    return Array.from({ length: count }, () => this.build(overrides));
  }

  /**
   * Presets for common user types
   */
  admin(overrides?: Partial<User>): User {
    return this.build({
      role: 'admin',
      ...overrides,
    });
  }

  trialUser(overrides?: Partial<User>): User {
    const expiresAt = new Date();
    expiresAt.setDate(expiresAt.getDate() + 14); // 14-day trial

    return this.build({
      subscription: {
        plan: 'trial',
        status: 'active',
        expiresAt,
      },
      ...overrides,
    });
  }

  expiredTrial(overrides?: Partial<User>): User {
    return this.build({
      subscription: {
        plan: 'trial',
        status: 'expired',
        expiresAt: new Date('2025-01-01'), // Past date
      },
      ...overrides,
    });
  }

  premiumUser(overrides?: Partial<User>): User {
    const expiresAt = new Date();
    expiresAt.setFullYear(expiresAt.getFullYear() + 1); // 1 year subscription

    return this.build({
      subscription: {
        plan: 'premium',
        status: 'active',
        expiresAt,
      },
      ...overrides,
    });
  }
}

// Export singleton instance
export const userFactory = new UserFactory();

Using Factories in Tests

import { test, expect } from '@playwright/test';
import { userFactory } from './factories/user.factory';
import { projectFactory } from './factories/project.factory';

test('admin can view all projects', async ({ page }) => {
  // Create test data
  const admin = userFactory.admin();
  const projects = projectFactory.buildList(5, { ownerId: admin.id });

  // Seed database
  await seedDatabase({ users: [admin], projects });

  // Test
  await loginAs(page, admin);
  await page.goto('/projects');

  // All projects should be visible
  for (const project of projects) {
    await expect(page.getByText(project.name)).toBeVisible();
  }
});

test('user only sees own projects', async ({ page }) => {
  // Create two users with their own projects
  const user1 = userFactory.build();
  const user2 = userFactory.build();

  const user1Projects = projectFactory.buildList(3, { ownerId: user1.id });
  const user2Projects = projectFactory.buildList(2, { ownerId: user2.id });

  await seedDatabase({
    users: [user1, user2],
    projects: [...user1Projects, ...user2Projects],
  });

  // User1 should only see their projects
  await loginAs(page, user1);
  await page.goto('/projects');

  for (const project of user1Projects) {
    await expect(page.getByText(project.name)).toBeVisible();
  }

  for (const project of user2Projects) {
    await expect(page.getByText(project.name)).not.toBeVisible();
  }
});

Advanced Factory with Builder Pattern

// factories/project.factory.ts
import { faker } from '@faker-js/faker';
interface Project {
  id: string;
  name: string;
  description: string;
  ownerId: string;
  status: 'draft' | 'active' | 'archived';
  tags: string[];
  createdAt: Date;
  updatedAt: Date;
}

export class ProjectBuilder {
  private project: Partial<Project> = {
    status: 'active',
    tags: [],
    createdAt: new Date(),
    updatedAt: new Date(),
  };

  withName(name: string): this {
    this.project.name = name;
    return this;
  }

  withOwner(ownerId: string): this {
    this.project.ownerId = ownerId;
    return this;
  }

  active(): this {
    this.project.status = 'active';
    return this;
  }

  archived(): this {
    this.project.status = 'archived';
    return this;
  }

  withTags(...tags: string[]): this {
    this.project.tags = tags;
    return this;
  }

  createdDaysAgo(days: number): this {
    const date = new Date();
    date.setDate(date.getDate() - days);
    this.project.createdAt = date;
    return this;
  }

  build(): Project {
    return {
      id: faker.string.uuid(),
      name: faker.company.name(),
      description: faker.lorem.paragraph(),
      ownerId: faker.string.uuid(),
      status: 'active',
      tags: [],
      createdAt: new Date(),
      updatedAt: new Date(),
      ...this.project,
    } as Project;
  }
}

// Usage
test('filter projects by tag', async ({ page }) => {
  const user = userFactory.build();

  const projects = [
    new ProjectBuilder().withOwner(user.id).withTags('urgent', 'bug').build(),

    new ProjectBuilder().withOwner(user.id).withTags('feature', 'enhancement').build(),

    new ProjectBuilder().withOwner(user.id).withTags('urgent', 'feature').build(),
  ];

  await seedDatabase({ users: [user], projects });
  await loginAs(page, user);

  // Filter by "urgent" tag
  await page.goto('/projects?tag=urgent');

  // Should see 2 projects
  await expect(page.locator('.project-card')).toHaveCount(2);
});
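The earlier tests call `projectFactory.buildList(...)`, but the builder file shown here does not define a `projectFactory`. One possible shape is sketched below. It is an assumption, not the author's implementation: the field set is trimmed, and it uses Node's `randomUUID` instead of Faker so that it runs without extra dependencies.

```typescript
import { randomUUID } from 'node:crypto';

// Trimmed version of the Project shape, for illustration only.
interface Project {
  id: string;
  name: string;
  ownerId: string;
  status: 'draft' | 'active' | 'archived';
  tags: string[];
  createdAt: Date;
}

// Hypothetical factory exposing the build/buildList API used in the tests.
class ProjectFactory {
  build(overrides: Partial<Project> = {}): Project {
    const id = randomUUID();
    return {
      id,
      // Derive the name from the id so two projects never share a name,
      // which keeps text-based locators unambiguous.
      name: `Project ${id.slice(0, 8)}`,
      ownerId: randomUUID(),
      status: 'active',
      tags: [],
      createdAt: new Date(),
      ...overrides,
    };
  }

  buildList(count: number, overrides: Partial<Project> = {}): Project[] {
    return Array.from({ length: count }, () => this.build(overrides));
  }
}

const projectFactory = new ProjectFactory();
```

Overrides are applied last, so `projectFactory.buildList(3, { ownerId: user.id })` yields three distinct projects that all belong to the same owner.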

Strategy 3: Synthetic Data Generation

Best for: Large datasets, realistic data, testing at scale.

Using Faker.js for Realistic Data

// utilities/data-generator.ts
import { faker } from '@faker-js/faker';

export class DataGenerator {
  /**
   * Generate realistic user profiles
   */
  generateUser() {
    const firstName = faker.person.firstName();
    const lastName = faker.person.lastName();

    return {
      id: faker.string.uuid(),
      email: faker.internet.email({ firstName, lastName }),
      name: `${firstName} ${lastName}`,
      username: faker.internet.username({ firstName, lastName }), // called userName() before Faker v9
      avatar: faker.image.avatar(),
      bio: faker.person.bio(),
      website: faker.internet.url(),
      company: faker.company.name(),
      jobTitle: faker.person.jobTitle(),
      phone: faker.phone.number(),
      address: {
        street: faker.location.streetAddress(),
        city: faker.location.city(),
        state: faker.location.state(),
        zip: faker.location.zipCode(),
        country: faker.location.country(),
      },
      createdAt: faker.date.past({ years: 2 }),
    };
  }

  /**
   * Generate e-commerce orders
   */
  generateOrder(userId: string) {
    const itemCount = faker.number.int({ min: 1, max: 10 });
    const items = Array.from({ length: itemCount }, () => this.generateOrderItem());

    const subtotal = items.reduce((sum, item) => sum + item.price * item.quantity, 0);
    const tax = subtotal * 0.08;
    const shipping = subtotal > 50 ? 0 : 9.99;
    const total = subtotal + tax + shipping;

    return {
      id: faker.string.uuid(),
      userId,
      orderNumber: faker.string.alphanumeric(10).toUpperCase(),
      status: faker.helpers.arrayElement(['pending', 'shipped', 'delivered', 'cancelled']),
      items,
      subtotal,
      tax,
      shipping,
      total,
      shippingAddress: {
        name: faker.person.fullName(),
        street: faker.location.streetAddress(),
        city: faker.location.city(),
        state: faker.location.state(),
        zip: faker.location.zipCode(),
        country: 'USA',
      },
      createdAt: faker.date.recent({ days: 90 }),
    };
  }

  private generateOrderItem() {
    return {
      id: faker.string.uuid(),
      productId: faker.string.uuid(),
      name: faker.commerce.productName(),
      description: faker.commerce.productDescription(),
      price: parseFloat(faker.commerce.price({ min: 10, max: 500 })),
      quantity: faker.number.int({ min: 1, max: 5 }),
      image: faker.image.url(),
    };
  }

  /**
   * Generate time-series data (e.g., analytics)
   */
  generateMetrics(days: number = 30) {
    const metrics = [];
    const endDate = new Date();

    for (let i = 0; i < days; i++) {
      const date = new Date(endDate);
      date.setDate(date.getDate() - i);

      metrics.push({
        date: date.toISOString().split('T')[0],
        pageViews: faker.number.int({ min: 100, max: 10000 }),
        uniqueVisitors: faker.number.int({ min: 50, max: 5000 }),
        bounceRate: faker.number.float({ min: 0.2, max: 0.8, fractionDigits: 2 }), // replaces the older `precision` option
        avgSessionDuration: faker.number.int({ min: 30, max: 600 }), // seconds
        conversions: faker.number.int({ min: 0, max: 100 }),
      });
    }

    return metrics.reverse(); // Oldest first
  }

  /**
   * Generate large dataset for performance testing
   */
  async generateLargeDataset(count: number) {
    console.log(`Generating ${count} records...`);
    const users = [];

    for (let i = 0; i < count; i++) {
      users.push(this.generateUser());

      if (i % 1000 === 0) {
        console.log(`Generated ${i}/${count} records`);
      }
    }

    return users;
  }
}

export const dataGenerator = new DataGenerator();

Seeding Database with Synthetic Data

// scripts/seed-test-data.ts
import { createClient } from '@supabase/supabase-js';
import { dataGenerator } from '../utilities/data-generator';

const supabase = createClient(process.env.SUPABASE_URL!, process.env.SUPABASE_SERVICE_ROLE_KEY!);

async function seedTestData() {
  console.log('Seeding test data...');

  // Generate users
  console.log('Generating users...');
  const users = Array.from({ length: 100 }, () => dataGenerator.generateUser());

  const { data: insertedUsers, error: usersError } = await supabase.from('users').insert(users).select();

  if (usersError) throw usersError;
  console.log(`✓ Inserted ${insertedUsers.length} users`);

  // Generate orders for each user
  console.log('Generating orders...');
  const orders = [];
  for (const user of insertedUsers) {
    const orderCount = Math.floor(Math.random() * 5); // 0-4 orders per user
    for (let i = 0; i < orderCount; i++) {
      orders.push(dataGenerator.generateOrder(user.id));
    }
  }

  const { data: insertedOrders, error: ordersError } = await supabase.from('orders').insert(orders).select();

  if (ordersError) throw ordersError;
  console.log(`✓ Inserted ${insertedOrders.length} orders`);

  // Generate metrics
  console.log('Generating metrics...');
  const metrics = dataGenerator.generateMetrics(90); // 90 days of data

  const { error: metricsError } = await supabase.from('daily_metrics').insert(metrics);

  if (metricsError) throw metricsError;
  console.log(`✓ Inserted ${metrics.length} days of metrics`);

  console.log('✓ Test data seeded successfully!');
}

seedTestData().catch((error) => {
  console.error(error);
  process.exit(1); // Exit non-zero so a failed seed fails the CI step
});
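For larger datasets, sending every row in a single insert call may run into request-size or timeout limits, depending on the backend. A common workaround is to insert in batches. The sketch below assumes nothing about the database client: `insertBatch` is a hypothetical callback standing in for the real insert call, and the batch size is something to tune for your setup.

```typescript
// Split rows into fixed-size batches so each insert request stays small.
function chunk<T>(rows: T[], size: number): T[][] {
  if (!Number.isInteger(size) || size <= 0) {
    throw new RangeError('size must be a positive integer');
  }
  const batches: T[][] = [];
  for (let i = 0; i < rows.length; i += size) {
    batches.push(rows.slice(i, i + size));
  }
  return batches;
}

// Insert batches one after another and return the number of rows sent.
// insertBatch is a stand-in for the real database call.
async function insertInBatches<T>(
  rows: T[],
  size: number,
  insertBatch: (batch: T[]) => Promise<void>,
): Promise<number> {
  let inserted = 0;
  for (const batch of chunk(rows, size)) {
    await insertBatch(batch);
    inserted += batch.length;
  }
  return inserted;
}
```

An empty input produces no batches, so no insert call is made when, for example, no orders were generated.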

Strategy 4: Privacy-Compliant Test Data

Data Masking

// utilities/data-masking.ts
import { createHash } from 'crypto';

export class DataMasker {
  /**
   * Mask email addresses
   */
  maskEmail(email: string): string {
    const [username, domain] = email.split('@');
    const maskedUsername = username.charAt(0) + '***' + username.charAt(username.length - 1);
    return `${maskedUsername}@${domain}`;
  }

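  /**
   * Mask phone numbers (all digits except the last four)
   */
  maskPhone(phone: string): string {
    // The lookahead skips separators, so "555-123-4567" becomes "***-***-4567"
    return phone.replace(/\d(?=(?:\D*\d){4})/g, '*');
  }

  /**
   * Mask credit card numbers (all digits except the last four)
   */
  maskCreditCard(cardNumber: string): string {
    // Works for both "4111111111111111" and "4111 1111 1111 1111"
    return cardNumber.replace(/\d(?=(?:\D*\d){4})/g, '*');
  }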

  /**
   * Hash PII for deterministic anonymization
   */
  hashPII(value: string, salt: string = 'test-salt'): string {
    return createHash('sha256')
      .update(value + salt)
      .digest('hex')
      .substring(0, 16);
  }

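  /**
   * Anonymize production data for testing.
   * Note: the spread copies every field that is not overridden below, so any
   * other PII columns (for example bio, username, or zip) must be masked here too.
   */
  anonymizeUser(user: any) {
    return {
      ...user,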
      email: `test-${this.hashPII(user.email)}@example.com`,
      name: `Test User ${this.hashPII(user.id)}`,
      phone: this.maskPhone(user.phone || '555-0000'),
      address: {
        ...user.address,
        street: 'Test Street',
        city: 'Test City',
      },
      // Keep non-PII fields
      role: user.role,
      subscription: user.subscription,
      createdAt: user.createdAt,
    };
  }
}

export const dataMasker = new DataMasker();
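Phone and card numbers often arrive with spaces or dashes. A pattern such as `/\d(?=\d{4})/g` only masks a digit when four digits follow it directly, so a value like `4111 1111 1111 1111` would pass through unchanged. The sketch below takes a counting approach instead; `maskDigits` is a hypothetical helper name, and keeping four visible digits is an assumption.

```typescript
// Mask every digit except the last `visible` ones, leaving separators intact.
function maskDigits(value: string, visible = 4): string {
  const totalDigits = (value.match(/\d/g) ?? []).length;
  let seen = 0;
  return value.replace(/\d/g, (digit) => {
    seen += 1;
    // Digits up to (total - visible) are hidden; the rest stay readable.
    return seen <= totalDigits - visible ? '*' : digit;
  });
}

console.log(maskDigits('555-123-4567')); // ***-***-4567
console.log(maskDigits('4111 1111 1111 1111')); // **** **** **** 1111
console.log(maskDigits('4111111111111111')); // ************1111
```

Values with four or fewer digits are returned unmasked by this sketch, so very short inputs may need a stricter rule.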

Copying Production Data Safely

// scripts/copy-prod-data-safely.ts
import { createClient } from '@supabase/supabase-js';
import { dataMasker } from '../utilities/data-masking';

const prodSupabase = createClient(process.env.PROD_SUPABASE_URL!, process.env.PROD_SUPABASE_SERVICE_KEY!);

const testSupabase = createClient(process.env.TEST_SUPABASE_URL!, process.env.TEST_SUPABASE_SERVICE_KEY!);

async function copyProductionDataSafely() {
  console.log('Copying production data with anonymization...');

  // Fetch sample of production users
  const { data: prodUsers, error } = await prodSupabase.from('users').select('*').limit(1000);

  if (error) throw error;

  // Anonymize PII
  const anonymizedUsers = prodUsers.map((user) => dataMasker.anonymizeUser(user));

  // Insert into test database
  const { error: insertError } = await testSupabase.from('users').insert(anonymizedUsers);

  if (insertError) throw insertError;

  console.log(`✓ Copied and anonymized ${anonymizedUsers.length} users`);
}

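copyProductionDataSafely().catch((error) => {
  console.error(error);
  process.exit(1); // Exit non-zero so a failed copy is not reported as success
});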
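One caveat with hashing PII using a fixed, well-known salt: anyone who knows the salt can hash candidate emails and match them against the test data. A keyed hash (HMAC) with a secret that lives outside the repository is a common hardening step. The sketch below uses Node's built-in `crypto` module; `pseudonymize` is a hypothetical name, and where the key comes from (for example an environment variable in the anonymization job) is left as an assumption.

```typescript
import { createHmac } from 'node:crypto';

// Keyed, deterministic pseudonym: the same input and key always give the
// same output, but the mapping cannot be recomputed without the key.
function pseudonymize(value: string, secretKey: string): string {
  if (secretKey.length === 0) {
    throw new Error('secretKey must not be empty');
  }
  return createHmac('sha256', secretKey)
    .update(value.trim().toLowerCase()) // normalize so "A@x.com" and "a@x.com" match
    .digest('hex')
    .slice(0, 16);
}
```

Because the output is deterministic for a given key, relationships between tables survive anonymization: the same production email maps to the same test email everywhere it appears.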

Strategy 5: API-Based Test Data Setup

Best for: E2E tests that need realistic workflows.

// utilities/test-setup.ts
import { APIRequestContext } from '@playwright/test';
import { faker } from '@faker-js/faker';

export class TestDataSetup {
  constructor(private request: APIRequestContext) {}

  /**
   * Create user via API (faster than UI)
   */
  async createUser(userData: { email: string; password: string; name: string }) {
    const response = await this.request.post('/api/auth/signup', {
      data: userData,
    });

    if (!response.ok()) {
      throw new Error(`Failed to create user: ${await response.text()}`);
    }

    return response.json();
  }

  /**
   * Log in and return a bearer token for authenticated API calls.
   * Assumes the login endpoint responds with { token }.
   */
  async getAuthToken(email: string, password: string): Promise<string> {
    const response = await this.request.post('/api/auth/login', {
      data: { email, password },
    });

    if (!response.ok()) {
      throw new Error(`Failed to log in: ${await response.text()}`);
    }

    const { token } = await response.json();
    return token;
  }

  /**
   * Create project via API as the authenticated user
   */
  async createProject(token: string, data: { name: string; description: string }) {
    // Pass the token per request; spreading an APIRequestContext does not
    // copy its methods, so a "cloned" context would not work.
    const response = await this.request.post('/api/projects', {
      data,
      headers: { Authorization: `Bearer ${token}` },
    });

    if (!response.ok()) {
      throw new Error(`Failed to create project: ${await response.text()}`);
    }

    return response.json();
  }

  /**
   * Setup complete test scenario
   */
  async setupTestScenario() {
    const credentials = {
      email: faker.internet.email(),
      password: 'TestPassword123!',
      name: faker.person.fullName(),
    };

    // Create user
    const user = await this.createUser(credentials);

    // Log in so the projects are created on behalf of this user
    const token = await this.getAuthToken(credentials.email, credentials.password);

    // Create projects
    const projects = await Promise.all([
      this.createProject(token, { name: 'Project A', description: 'Test project A' }),
      this.createProject(token, { name: 'Project B', description: 'Test project B' }),
    ]);

    return { user, projects };
  }
}

// Usage in Playwright test
test('user can view their projects', async ({ page, request }) => {
  const setup = new TestDataSetup(request);
  const { user, projects } = await setup.setupTestScenario();

  // Now navigate UI with pre-setup data
  await page.goto('/login');
  await page.fill('#email', user.email);
  await page.fill('#password', 'TestPassword123!');
  await page.click('button[type="submit"]');

  await page.waitForURL('/dashboard');

  // Verify projects are visible
  for (const project of projects) {
    await expect(page.getByText(project.name)).toBeVisible();
  }
});

Test Data Cleanup Strategies

Strategy A: Isolated Test Databases

Use a fresh database per test run:

// playwright.config.ts
import { defineConfig } from '@playwright/test';

export default defineConfig({
  globalSetup: require.resolve('./global-setup'),
  globalTeardown: require.resolve('./global-teardown'),
});

// global-setup.ts
// runMigrations() and dropDatabase() are project-specific helpers (not shown)

export default async function globalSetup() {
  const testDbName = `test_${Date.now()}`;

  // Create the isolated test database and run migrations on it
  await runMigrations(testDbName);

  // Store DB name for tests and for global teardown
  process.env.TEST_DB_NAME = testDbName;
}

// global-teardown.ts
export default async function globalTeardown() {
  // Drop test database (skip if setup never got far enough to create one)
  const testDbName = process.env.TEST_DB_NAME;
  if (testDbName) {
    await dropDatabase(testDbName);
  }
}
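A name built only from `Date.now()` can collide when two CI jobs start in the same millisecond against the same database server. Adding a random suffix makes that very unlikely. The sketch below is one way to build such a name; `makeTestDbName` is a hypothetical helper, and the lowercase-identifier rule is an assumption that suits PostgreSQL-style unquoted names.

```typescript
import { randomBytes } from 'node:crypto';

// Build a database name such as "test_20250102030405_9f3a1c2b".
function makeTestDbName(prefix = 'test', now: Date = new Date()): string {
  // Restrict the prefix to characters that are safe in an unquoted identifier.
  if (!/^[a-z][a-z0-9_]*$/.test(prefix)) {
    throw new Error(`Invalid database name prefix: ${prefix}`);
  }
  // Keep the first 14 digits of the ISO timestamp: YYYYMMDDHHMMSS (UTC).
  const stamp = now.toISOString().replace(/\D/g, '').slice(0, 14);
  // 4 random bytes give 8 hex characters, enough to separate parallel jobs.
  const suffix = randomBytes(4).toString('hex');
  return `${prefix}_${stamp}_${suffix}`;
}
```

The timestamp part keeps names sortable, which helps when sweeping up databases left behind by aborted runs.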

Strategy B: Transactional Tests

Roll back database changes after each test:

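// fixtures/database.ts
import { test as base, expect } from '@playwright/test';
import type { Knex } from 'knex';
import { db } from '../lib/database'; // assumed to export a Knex instance

export const test = base.extend<{ db: Knex.Transaction }>({
  db: async ({}, use) => {
    // Open a transaction pinned to one pooled connection. Issuing a raw BEGIN
    // on the pool would not guarantee later queries use the same connection.
    const trx = await db.transaction();

    await use(trx);

    // Rollback after test
    await trx.rollback();
  },
});

// Usage
test('create project rolls back', async ({ db }) => {
  const [before] = await db('projects').count({ count: '*' });

  await db('projects').insert({ name: 'Test Project' });

  const [after] = await db('projects').count({ count: '*' });
  // Some drivers (for example PostgreSQL) return counts as strings
  expect(Number(after.count)).toBe(Number(before.count) + 1);

  // After test, transaction rolls back, no cleanup needed
});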

Strategy C: Cleanup Helpers

// utilities/test-cleanup.ts
export class TestCleanup {
  private createdIds: Map<string, string[]> = new Map();

  track(entity: string, id: string) {
    if (!this.createdIds.has(entity)) {
      this.createdIds.set(entity, []);
    }
    this.createdIds.get(entity)!.push(id);
  }

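  async cleanupAll() {
    // Delete in reverse tracking order so child rows (e.g. projects) are
    // removed before the parents they reference (e.g. users)
    const entries = [...this.createdIds.entries()].reverse();
    for (const [entity, ids] of entries) {
      await this.cleanup(entity, ids);
    }
    this.createdIds.clear();
  }

  private async cleanup(entity: string, ids: string[]) {
    // Delete from database (assumes a Supabase client named `supabase` is in scope)
    const { error } = await supabase.from(entity).delete().in('id', ids);
    if (error) throw error;
    console.log(`Cleaned up ${ids.length} ${entity} records`);
  }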
}

// Usage
test('test with auto-cleanup', async ({ page }) => {
  const cleanup = new TestCleanup();

  try {
    const user = await createUser({ email: 'test@example.com' });
    cleanup.track('users', user.id);

    const project = await createProject({ name: 'Test', ownerId: user.id });
    cleanup.track('projects', project.id);

    // Run test...
  } finally {
    await cleanup.cleanupAll();
  }
});
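Two details tend to matter in cleanup helpers: the order of deletion when tables reference each other, and what happens when one delete fails. The sketch below is a variant under stated assumptions: `OrderedCleanup` and `DeleteFn` are hypothetical names, and the delete call is injected so the class does not depend on a specific database client. It deletes newest-first and keeps going after a failure, reporting all errors at the end.

```typescript
// Stand-in for the real delete call, e.g. a wrapper around your DB client.
type DeleteFn = (entity: string, ids: string[]) => Promise<void>;

class OrderedCleanup {
  private created: { entity: string; id: string }[] = [];

  constructor(private deleteRows: DeleteFn) {}

  track(entity: string, id: string): void {
    this.created.push({ entity, id });
  }

  // Returns the entity names in the order deletion was attempted.
  async cleanupAll(): Promise<string[]> {
    // Group ids per entity, walking newest-first so rows created later
    // (typically children) are deleted before the rows they depend on.
    const grouped = new Map<string, string[]>();
    for (const { entity, id } of [...this.created].reverse()) {
      if (!grouped.has(entity)) grouped.set(entity, []);
      grouped.get(entity)!.push(id);
    }

    const failures: string[] = [];
    for (const [entity, ids] of grouped) {
      try {
        await this.deleteRows(entity, ids);
      } catch (error) {
        // Keep going so one failed delete does not leave everything behind.
        failures.push(`${entity}: ${String(error)}`);
      }
    }

    this.created = [];
    if (failures.length > 0) {
      throw new Error(`Cleanup failed for:\n${failures.join('\n')}`);
    }
    return [...grouped.keys()];
  }
}
```

With users tracked before projects, the projects are deleted first, which avoids foreign-key violations in schemas where projects reference their owner.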

Best Practices Checklist

  • Never use production data directly - Always anonymize/mask PII
  • Use factories over fixtures for dynamic data needs
  • Seed minimal data - Only what the test needs
  • Isolate test data - Each test should have its own data
  • Clean up after tests - Don't leave test debris
  • Version control fixtures - Track changes to test data
  • Document data dependencies - Make relationships clear
  • Use realistic data - Faker.js produces varied names, addresses, and formats that hand-written values tend to miss
  • Test with large datasets - Validate performance at scale
  • Automate data generation - Don't manually create test data

Conclusion

Test data management is the unsexy-but-critical foundation of reliable test automation. The strategies in this guide—from simple fixtures to sophisticated synthetic data generation—give you a complete toolkit for any testing scenario.

The key is choosing the right strategy for your context:

  • Fixtures for simple, stable data
  • Factories for dynamic, varied test cases
  • Synthetic data for scale and realism
  • API setup for speed
  • Proper cleanup for reliability

Invest in test data infrastructure early. Your future self (and your team) will thank you when tests are fast, reliable, and maintainable.
