Test Data Management: How to Stop Hardcoded Fixtures From Breaking Your CI Pipeline
Your test suite was green yesterday. Today, 47 tests are failing. Nothing in the code changed.
After hours of debugging, you discover the issue: someone manually deleted test user accounts from the shared database during exploratory testing. Your tests depend on those specific email addresses existing. Now everything is broken.
Sound familiar?
Poor test data management is one of the most common, and least discussed, causes of flaky tests, slow suites, and maintenance overhead. As your application grows, every new entity and relationship adds state that tests have to set up and tear down.
This guide covers strategies for creating, managing, and securing test data: static fixtures, factory functions, synthetic data generation, privacy-compliant handling of production data (GDPR and similar), API-based setup, and cleanup techniques that keep tests fast, reliable, and maintainable.
The Test Data Challenge
Why Test Data Is Hard
1. Data Dependencies
Tests often require:
- Specific user accounts with specific permissions
- Related data (orders → line items → products)
- Historical state (created timestamps, modification history)
- Edge cases (expired trials, overdue invoices, etc.)
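Because related rows must exist before the rows that reference them, the order in which data is seeded matters. The following is a minimal sketch, assuming a hypothetical hand-maintained dependency map (the table names and the `seedOrder` helper are illustrative, not part of any library). It uses a depth-first topological sort to derive a safe insert order.

```typescript
// Hypothetical dependency map: each table lists the tables it references.
type DependencyMap = Record<string, string[]>;

function seedOrder(dependsOn: DependencyMap): string[] {
  const order: string[] = [];
  const state = new Map<string, 'visiting' | 'done'>();

  const visit = (table: string): void => {
    const current = state.get(table);
    if (current === 'done') return;
    if (current === 'visiting') {
      throw new Error(`Circular dependency involving "${table}"`);
    }
    state.set(table, 'visiting');
    for (const parent of dependsOn[table] ?? []) {
      visit(parent); // parents must be inserted first
    }
    state.set(table, 'done');
    order.push(table);
  };

  for (const table of Object.keys(dependsOn)) {
    visit(table);
  }
  return order;
}

const insertOrder = seedOrder({
  line_items: ['orders', 'products'],
  orders: ['users'],
  products: [],
  users: [],
});
console.log(insertOrder);
```

Reversing the same list gives a safe delete order for cleanup, since child rows are then removed before their parents.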
2. Environment Management
- Dev sandbox vs. staging vs. local environments
- Shared vs. isolated test databases
- Production-like data vs. synthetic data
3. Privacy and Compliance
- GDPR, CCPA, HIPAA requirements
- PII (Personally Identifiable Information) handling
- Data anonymization and masking
4. Performance
- Large datasets slow down tests
- Database seeding takes time
- Cleanup can be as complex as setup
Test Data Strategy Overview
graph TD
A[Test Data Strategy] --> B[Static Fixtures]
A --> C[Dynamic Factories]
A --> D[Database Seeding]
A --> E[Synthetic Data]
A --> F[API-Based Setup]
B --> B1[JSON files]
B --> B2[Code constants]
C --> C1[Factory functions]
C --> C2[Faker.js]
D --> D1[SQL scripts]
D --> D2[ORM seeders]
E --> E1[AI-generated]
E --> E2[Production sampling]
F --> F1[REST APIs]
F --> F2[GraphQL mutations]
Strategy 1: Test Fixtures
Best for: Simple, predictable data that rarely changes.
JSON Fixtures
// fixtures/users.json
{
"admin": {
"id": "user_admin_001",
"email": "admin@example.com",
"name": "Admin User",
"role": "admin",
"createdAt": "2025-01-01T00:00:00Z"
},
"regularUser": {
"id": "user_regular_001",
"email": "user@example.com",
"name": "Regular User",
"role": "user",
"createdAt": "2025-01-01T00:00:00Z"
},
"trialUser": {
"id": "user_trial_001",
"email": "trial@example.com",
"name": "Trial User",
"role": "user",
"subscription": {
"plan": "trial",
"expiresAt": "2026-12-31T23:59:59Z"
}
}
}
// tests/fixtures.ts
import users from './fixtures/users.json';
import projects from './fixtures/projects.json';
export function getFixture<T>(type: string, key: string): T {
const fixtures: Record<string, any> = {
users,
projects,
};
if (!fixtures[type] || !fixtures[type][key]) {
throw new Error(`Fixture not found: ${type}.${key}`);
}
return fixtures[type][key] as T;
}
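// Usage in tests
// Pass the expected shape explicitly: T cannot be inferred from the string arguments
test('admin can delete projects', async ({ page }) => {
const admin = getFixture<{ email: string; role: string }>('users', 'admin');
const project = getFixture<{ id: string }>('projects', 'testProject');
await loginAs(page, admin);
await deleteProject(page, project.id);
});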
Pros:
- Version controlled
- Easy to debug (inspect JSON directly)
- Fast (no database calls)
- Predictable
Cons:
- Brittle (hard-coded IDs, timestamps)
- Not dynamic (can't generate random data)
- Can become stale as schema evolves
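One way to catch stale fixtures early is to check them against the expected shape once, before any test runs. The sketch below is an assumption about how that could look. The `UserFixture` interface and both function names are hypothetical, and the checks are hand-written rather than generated from a real schema.

```typescript
// Hypothetical fixture shape; adjust the fields to match your own schema.
interface UserFixture {
  id: string;
  email: string;
  role: 'admin' | 'user';
  createdAt: string; // ISO 8601 timestamp
}

function validateUserFixture(key: string, value: unknown): UserFixture {
  const errors: string[] = [];
  const record = (typeof value === 'object' && value !== null ? value : {}) as Record<string, unknown>;
  const { id, email, role, createdAt } = record;

  if (typeof id !== 'string' || id.length === 0) errors.push('id must be a non-empty string');
  if (typeof email !== 'string' || !email.includes('@')) errors.push('email must contain "@"');
  if (role !== 'admin' && role !== 'user') errors.push('role must be "admin" or "user"');
  if (typeof createdAt !== 'string' || Number.isNaN(Date.parse(createdAt))) {
    errors.push('createdAt must be a parseable date string');
  }

  if (errors.length > 0) {
    throw new Error(`Invalid fixture "${key}": ${errors.join('; ')}`);
  }
  return record as unknown as UserFixture;
}

// Validate every entry once, for example from a global setup hook.
function validateUserFixtures(fixtures: Record<string, unknown>): void {
  for (const key of Object.keys(fixtures)) {
    validateUserFixture(key, fixtures[key]);
  }
}
```

Failing at startup with the fixture key in the message is usually easier to diagnose than a test that fails later because a renamed column made a fixture field meaningless.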
Code-Based Fixtures
// fixtures/users.ts
export const fixtures = {
users: {
admin: () => ({
id: 'admin-001',
email: 'admin@example.com',
name: 'Admin User',
role: 'admin',
permissions: ['read', 'write', 'delete', 'admin'],
createdAt: new Date('2025-01-01'),
}),
user: () => ({
id: 'user-001',
email: 'user@example.com',
name: 'Regular User',
role: 'user',
permissions: ['read', 'write'],
createdAt: new Date('2025-01-01'),
}),
},
projects: {
active: () => ({
id: 'proj-001',
name: 'Test Project',
status: 'active',
ownerId: fixtures.users.user().id,
createdAt: new Date('2025-01-01'),
}),
archived: () => ({
id: 'proj-002',
name: 'Archived Project',
status: 'archived',
ownerId: fixtures.users.user().id,
createdAt: new Date('2024-01-01'),
archivedAt: new Date('2024-12-01'),
}),
},
};
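// Type-safe fixture getter
type Fixtures = typeof fixtures;
type FixtureResult<K extends keyof Fixtures, N extends keyof Fixtures[K]> =
Fixtures[K][N] extends () => infer R ? R : never;
export function getFixture<K extends keyof Fixtures, N extends keyof Fixtures[K]>(
category: K,
name: N,
): FixtureResult<K, N> {
// The cast is needed because TypeScript cannot prove that every entry is callable
const build = fixtures[category][name] as unknown as () => FixtureResult<K, N>;
return build();
}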
Strategy 2: Factory Functions
Best for: Dynamic test data with randomization and overrides.
Building a Test Factory
// factories/user.factory.ts
import { faker } from '@faker-js/faker';
interface User {
id: string;
email: string;
name: string;
role: 'user' | 'admin' | 'moderator';
subscription?: {
plan: string;
status: 'active' | 'cancelled' | 'expired';
expiresAt: Date;
};
createdAt: Date;
}
export class UserFactory {
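private defaults: Partial<User> = {
role: 'user',
// No createdAt here: a Date stored in defaults is created once and would
// overwrite the fresh timestamp that build() sets on every call.
};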
/**
* Create a user with random data
*/
build(overrides?: Partial<User>): User {
return {
id: faker.string.uuid(),
email: faker.internet.email(),
name: faker.person.fullName(),
role: 'user',
createdAt: new Date(),
...this.defaults,
...overrides,
};
}
/**
* Create multiple users
*/
buildList(count: number, overrides?: Partial<User>): User[] {
return Array.from({ length: count }, () => this.build(overrides));
}
/**
* Presets for common user types
*/
admin(overrides?: Partial<User>): User {
return this.build({
role: 'admin',
...overrides,
});
}
trialUser(overrides?: Partial<User>): User {
const expiresAt = new Date();
expiresAt.setDate(expiresAt.getDate() + 14); // 14-day trial
return this.build({
subscription: {
plan: 'trial',
status: 'active',
expiresAt,
},
...overrides,
});
}
expiredTrial(overrides?: Partial<User>): User {
return this.build({
subscription: {
plan: 'trial',
status: 'expired',
expiresAt: new Date('2025-01-01'), // Past date
},
...overrides,
});
}
premiumUser(overrides?: Partial<User>): User {
const expiresAt = new Date();
expiresAt.setFullYear(expiresAt.getFullYear() + 1); // 1 year subscription
return this.build({
subscription: {
plan: 'premium',
status: 'active',
expiresAt,
},
...overrides,
});
}
}
// Export singleton instance
export const userFactory = new UserFactory();
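Random factory data makes a failing test hard to reproduce unless the randomness can be replayed. Faker provides `faker.seed()` for this. The dependency-free sketch below shows the same idea with a small seeded generator. The `createSeededRandom` and `createUserFactory` names and the `TestUser` shape are illustrative assumptions.

```typescript
// Small deterministic pseudo-random generator (mulberry32-style).
// Not cryptographically secure; intended only for reproducible test data.
function createSeededRandom(seed: number): () => number {
  let state = seed >>> 0;
  return () => {
    state = (state + 0x6d2b79f5) >>> 0;
    let t = state;
    t = Math.imul(t ^ (t >>> 15), t | 1);
    t ^= t + Math.imul(t ^ (t >>> 7), t | 61);
    return ((t ^ (t >>> 14)) >>> 0) / 4294967296;
  };
}

interface TestUser {
  id: string;
  email: string;
  role: 'user' | 'admin';
}

function createUserFactory(seed: number) {
  const random = createSeededRandom(seed);
  let sequence = 0;
  return {
    build(overrides: Partial<TestUser> = {}): TestUser {
      sequence += 1;
      const suffix = Math.floor(random() * 1e9).toString(36);
      return {
        id: `user-${sequence}`,
        email: `user-${sequence}-${suffix}@example.com`,
        role: 'user',
        ...overrides,
      };
    },
  };
}

// The same seed always yields the same sequence of users.
const first = createUserFactory(42).build();
const second = createUserFactory(42).build();
console.log(first.email === second.email);
```

In practice the seed would be logged at the start of each run and read back from an environment variable, so a CI failure can be replayed locally with identical data.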
Using Factories in Tests
import { test, expect } from '@playwright/test';
import { userFactory } from './factories/user.factory';
import { projectFactory } from './factories/project.factory';
test('admin can view all projects', async ({ page }) => {
// Create test data
const admin = userFactory.admin();
const projects = projectFactory.buildList(5, { ownerId: admin.id });
// Seed database
await seedDatabase({ users: [admin], projects });
// Test
await loginAs(page, admin);
await page.goto('/projects');
// All projects should be visible
for (const project of projects) {
await expect(page.getByText(project.name)).toBeVisible();
}
});
test('user only sees own projects', async ({ page }) => {
// Create two users with their own projects
const user1 = userFactory.build();
const user2 = userFactory.build();
const user1Projects = projectFactory.buildList(3, { ownerId: user1.id });
const user2Projects = projectFactory.buildList(2, { ownerId: user2.id });
await seedDatabase({
users: [user1, user2],
projects: [...user1Projects, ...user2Projects],
});
// User1 should only see their projects
await loginAs(page, user1);
await page.goto('/projects');
for (const project of user1Projects) {
await expect(page.getByText(project.name)).toBeVisible();
}
for (const project of user2Projects) {
await expect(page.getByText(project.name)).not.toBeVisible();
}
});
Advanced Factory with Builder Pattern
// factories/project.factory.ts
import { faker } from '@faker-js/faker';
interface Project {
id: string;
name: string;
description: string;
ownerId: string;
status: 'draft' | 'active' | 'archived';
tags: string[];
createdAt: Date;
updatedAt: Date;
}
export class ProjectBuilder {
private project: Partial<Project> = {
status: 'active',
tags: [],
createdAt: new Date(),
updatedAt: new Date(),
};
withName(name: string): this {
this.project.name = name;
return this;
}
withOwner(ownerId: string): this {
this.project.ownerId = ownerId;
return this;
}
active(): this {
this.project.status = 'active';
return this;
}
archived(): this {
this.project.status = 'archived';
return this;
}
withTags(...tags: string[]): this {
this.project.tags = tags;
return this;
}
createdDaysAgo(days: number): this {
const date = new Date();
date.setDate(date.getDate() - days);
this.project.createdAt = date;
return this;
}
build(): Project {
return {
id: faker.string.uuid(),
name: faker.company.name(),
description: faker.lorem.paragraph(),
ownerId: faker.string.uuid(),
status: 'active',
tags: [],
createdAt: new Date(),
updatedAt: new Date(),
...this.project,
} as Project;
}
}
// Usage
test('filter projects by tag', async ({ page }) => {
const user = userFactory.build();
const projects = [
new ProjectBuilder().withOwner(user.id).withTags('urgent', 'bug').build(),
new ProjectBuilder().withOwner(user.id).withTags('feature', 'enhancement').build(),
new ProjectBuilder().withOwner(user.id).withTags('urgent', 'feature').build(),
];
await seedDatabase({ users: [user], projects });
await loginAs(page, user);
// Filter by "urgent" tag
await page.goto('/projects?tag=urgent');
// Should see 2 projects
await expect(page.locator('.project-card')).toHaveCount(2);
});
Strategy 3: Synthetic Data Generation
Best for: Large datasets, realistic data, testing at scale.
Using Faker.js for Realistic Data
// utilities/data-generator.ts
import { faker } from '@faker-js/faker';
export class DataGenerator {
/**
* Generate realistic user profiles
*/
generateUser() {
const firstName = faker.person.firstName();
const lastName = faker.person.lastName();
return {
id: faker.string.uuid(),
email: faker.internet.email({ firstName, lastName }),
name: `${firstName} ${lastName}`,
username: faker.internet.userName({ firstName, lastName }),
avatar: faker.image.avatar(),
bio: faker.person.bio(),
website: faker.internet.url(),
company: faker.company.name(),
jobTitle: faker.person.jobTitle(),
phone: faker.phone.number(),
address: {
street: faker.location.streetAddress(),
city: faker.location.city(),
state: faker.location.state(),
zip: faker.location.zipCode(),
country: faker.location.country(),
},
createdAt: faker.date.past({ years: 2 }),
};
}
/**
* Generate e-commerce orders
*/
generateOrder(userId: string) {
const itemCount = faker.number.int({ min: 1, max: 10 });
const items = Array.from({ length: itemCount }, () => this.generateOrderItem());
const subtotal = items.reduce((sum, item) => sum + item.price * item.quantity, 0);
const tax = subtotal * 0.08;
const shipping = subtotal > 50 ? 0 : 9.99;
const total = subtotal + tax + shipping;
return {
id: faker.string.uuid(),
userId,
orderNumber: faker.string.alphanumeric(10).toUpperCase(),
status: faker.helpers.arrayElement(['pending', 'shipped', 'delivered', 'cancelled']),
items,
subtotal,
tax,
shipping,
total,
shippingAddress: {
name: faker.person.fullName(),
street: faker.location.streetAddress(),
city: faker.location.city(),
state: faker.location.state(),
zip: faker.location.zipCode(),
country: 'USA',
},
createdAt: faker.date.recent({ days: 90 }),
};
}
private generateOrderItem() {
return {
id: faker.string.uuid(),
productId: faker.string.uuid(),
name: faker.commerce.productName(),
description: faker.commerce.productDescription(),
price: parseFloat(faker.commerce.price({ min: 10, max: 500 })),
quantity: faker.number.int({ min: 1, max: 5 }),
image: faker.image.url(),
};
}
/**
* Generate time-series data (e.g., analytics)
*/
generateMetrics(days: number = 30) {
const metrics = [];
const endDate = new Date();
for (let i = 0; i < days; i++) {
const date = new Date(endDate);
date.setDate(date.getDate() - i);
metrics.push({
date: date.toISOString().split('T')[0],
pageViews: faker.number.int({ min: 100, max: 10000 }),
uniqueVisitors: faker.number.int({ min: 50, max: 5000 }),
bounceRate: faker.number.float({ min: 0.2, max: 0.8, fractionDigits: 2 }), // `precision` is deprecated in newer Faker releases
avgSessionDuration: faker.number.int({ min: 30, max: 600 }), // seconds
conversions: faker.number.int({ min: 0, max: 100 }),
});
}
return metrics.reverse(); // Oldest first
}
/**
* Generate large dataset for performance testing
*/
async generateLargeDataset(count: number) {
console.log(`Generating ${count} records...`);
const users = [];
for (let i = 0; i < count; i++) {
users.push(this.generateUser());
if (i % 1000 === 0) {
console.log(`Generated ${i}/${count} records`);
}
}
return users;
}
}
export const dataGenerator = new DataGenerator();
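The order generator above computes tax and totals with floating-point dollars, which can produce fractional cents that a real checkout would never store. If your assertions compare totals exactly, one option is to generate amounts in integer cents. This is a sketch under that assumption. The constants mirror the 8% tax, the $50 free-shipping threshold, and the $9.99 shipping fee used above, and `computeTotals` is a hypothetical helper.

```typescript
interface LineItem {
  unitPriceCents: number;
  quantity: number;
}

interface OrderTotals {
  subtotalCents: number;
  taxCents: number;
  shippingCents: number;
  totalCents: number;
}

const TAX_RATE = 0.08;
const FREE_SHIPPING_THRESHOLD_CENTS = 5000;
const SHIPPING_CENTS = 999;

function computeTotals(items: LineItem[]): OrderTotals {
  const subtotalCents = items.reduce((sum, item) => sum + item.unitPriceCents * item.quantity, 0);
  // Round once, at the point where a fraction of a cent can appear.
  const taxCents = Math.round(subtotalCents * TAX_RATE);
  const shippingCents = subtotalCents > FREE_SHIPPING_THRESHOLD_CENTS ? 0 : SHIPPING_CENTS;
  return {
    subtotalCents,
    taxCents,
    shippingCents,
    totalCents: subtotalCents + taxCents + shippingCents,
  };
}

const totals = computeTotals([
  { unitPriceCents: 1999, quantity: 2 },
  { unitPriceCents: 500, quantity: 1 },
]);
console.log(totals); // subtotal 4498, tax 360, shipping 999, total 5857
```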
Seeding Database with Synthetic Data
// scripts/seed-test-data.ts
import { createClient } from '@supabase/supabase-js';
import { dataGenerator } from '../utilities/data-generator';
const supabase = createClient(process.env.SUPABASE_URL!, process.env.SUPABASE_SERVICE_ROLE_KEY!);
async function seedTestData() {
console.log('Seeding test data...');
// Generate users
console.log('Generating users...');
const users = Array.from({ length: 100 }, () => dataGenerator.generateUser());
const { data: insertedUsers, error: usersError } = await supabase.from('users').insert(users).select();
if (usersError) throw usersError;
console.log(`✓ Inserted ${insertedUsers.length} users`);
// Generate orders for each user
console.log('Generating orders...');
const orders = [];
for (const user of insertedUsers) {
const orderCount = Math.floor(Math.random() * 5); // 0-4 orders per user
for (let i = 0; i < orderCount; i++) {
orders.push(dataGenerator.generateOrder(user.id));
}
}
const { data: insertedOrders, error: ordersError } = await supabase.from('orders').insert(orders).select();
if (ordersError) throw ordersError;
console.log(`✓ Inserted ${insertedOrders.length} orders`);
// Generate metrics
console.log('Generating metrics...');
const metrics = dataGenerator.generateMetrics(90); // 90 days of data
const { error: metricsError } = await supabase.from('daily_metrics').insert(metrics);
if (metricsError) throw metricsError;
console.log(`✓ Inserted ${metrics.length} days of metrics`);
console.log('✓ Test data seeded successfully!');
}
seedTestData().catch((error) => {
console.error(error);
process.exit(1); // fail the CI step instead of exiting with code 0
});
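The script above sends each table in a single insert call. With larger datasets it may be safer to split rows into batches so that no single request grows too large. A minimal sketch, assuming a hypothetical `insertBatch` callback that wraps whatever database client you use:

```typescript
// Split an array into consecutive batches of at most `size` items.
function chunk<T>(items: readonly T[], size: number): T[][] {
  if (!Number.isInteger(size) || size < 1) {
    throw new RangeError('size must be a positive integer');
  }
  const batches: T[][] = [];
  for (let i = 0; i < items.length; i += size) {
    batches.push(items.slice(i, i + size));
  }
  return batches;
}

// insertBatch is a hypothetical callback, e.g. a thin wrapper around your client's insert call.
async function insertInBatches<T>(
  rows: readonly T[],
  batchSize: number,
  insertBatch: (batch: T[]) => Promise<void>,
): Promise<number> {
  let inserted = 0;
  for (const batch of chunk(rows, batchSize)) {
    await insertBatch(batch); // sequential, to keep database load predictable
    inserted += batch.length;
  }
  return inserted;
}
```

A suitable batch size depends on row width and on the limits of your database or API gateway, so treat any specific number as something to measure rather than assume.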
Strategy 4: Privacy-Compliant Test Data
Data Masking
// utilities/data-masking.ts
import { createHash } from 'crypto';
export class DataMasker {
/**
* Mask email addresses
*/
maskEmail(email: string): string {
const [username, domain] = email.split('@');
const maskedUsername = username.charAt(0) + '***' + username.charAt(username.length - 1);
return `${maskedUsername}@${domain}`;
}
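/**
* Mask phone numbers (all but the last four digits)
*/
maskPhone(phone: string): string {
// The lookahead skips separators, so formatted values such as 555-123-4567 are masked too
return phone.replace(/\d(?=(?:\D*\d){4})/g, '*');
}
/**
* Mask credit card numbers (all but the last four digits)
*/
maskCreditCard(cardNumber: string): string {
// Also handles grouped input such as 4111 1111 1111 1111
return cardNumber.replace(/\d(?=(?:\D*\d){4})/g, '*');
}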
/**
* Hash PII for deterministic anonymization
*/
hashPII(value: string, salt: string = 'test-salt'): string {
return createHash('sha256')
.update(value + salt)
.digest('hex')
.substring(0, 16);
}
/**
* Anonymize production data for testing
*/
anonymizeUser(user: any) {
return {
...user,
email: `test-${this.hashPII(user.email)}@example.com`,
name: `Test User ${this.hashPII(user.id)}`,
phone: this.maskPhone(user.phone || '555-0000'),
address: {
...user.address,
street: 'Test Street',
city: 'Test City',
},
// Keep non-PII fields
role: user.role,
subscription: user.subscription,
createdAt: user.createdAt,
};
}
}
export const dataMasker = new DataMasker();
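`hashPII` above relies on a default salt that lives in the source code, so anyone with repository access could hash a list of candidate emails and match them against the test data. A keyed hash avoids that while staying deterministic, which keeps joins between anonymized tables intact. The sketch below uses Node's built-in HMAC. The function names are illustrative, and where the secret comes from (for example a CI secret) is left as an assumption.

```typescript
import { createHmac } from 'crypto';

// Deterministic, keyed pseudonym: same input and secret always give the same token.
function pseudonymize(value: string, secret: string): string {
  return createHmac('sha256', secret)
    .update(value.trim().toLowerCase()) // normalize so "A@x.com" and "a@x.com" map together
    .digest('hex')
    .slice(0, 16);
}

function pseudonymousEmail(email: string, secret: string): string {
  return `test-${pseudonymize(email, secret)}@example.com`;
}

// Hypothetical secret for illustration only; load a real one from your secret store.
const exampleSecret = 'replace-with-a-secret-from-your-ci';
console.log(pseudonymousEmail('Jane.Doe@example.org', exampleSecret));
```

Truncating to 16 hex characters keeps the values readable at the cost of a small collision risk, so check that against your row counts before relying on uniqueness.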
Copying Production Data Safely
// scripts/copy-prod-data-safely.ts
import { createClient } from '@supabase/supabase-js';
import { dataMasker } from '../utilities/data-masking';
const prodSupabase = createClient(process.env.PROD_SUPABASE_URL!, process.env.PROD_SUPABASE_SERVICE_KEY!);
const testSupabase = createClient(process.env.TEST_SUPABASE_URL!, process.env.TEST_SUPABASE_SERVICE_KEY!);
async function copyProductionDataSafely() {
console.log('Copying production data with anonymization...');
// Fetch sample of production users
const { data: prodUsers, error } = await prodSupabase.from('users').select('*').limit(1000);
if (error) throw error;
// Anonymize PII
const anonymizedUsers = prodUsers.map((user) => dataMasker.anonymizeUser(user));
// Insert into test database
const { error: insertError } = await testSupabase.from('users').insert(anonymizedUsers);
if (insertError) throw insertError;
console.log(`✓ Copied and anonymized ${anonymizedUsers.length} users`);
}
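copyProductionDataSafely().catch((error) => {
console.error(error);
process.exit(1); // fail the job so a partial copy is not mistaken for success
});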
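Note that `anonymizeUser` starts from `...user`, so any column added to the users table later (a date of birth, say) would be copied unmasked. A more defensive variant copies only columns that are explicitly allowed. This is a sketch of that idea. The allowlist contents and the `pickAllowed` helper are assumptions to adapt to your schema.

```typescript
// Columns considered safe to copy as-is. Everything else is dropped unless
// it is explicitly regenerated or masked elsewhere.
const USER_COLUMN_ALLOWLIST: readonly string[] = ['id', 'role', 'subscription', 'createdAt'];

function pickAllowed(
  row: Record<string, unknown>,
  allowlist: readonly string[],
): Record<string, unknown> {
  const result: Record<string, unknown> = {};
  for (const key of allowlist) {
    if (Object.prototype.hasOwnProperty.call(row, key)) {
      result[key] = row[key];
    }
  }
  return result;
}

const productionRow = {
  id: 'u_123',
  email: 'jane@example.org',
  dateOfBirth: '1990-04-01', // a column added after the masking code was written
  role: 'user',
  createdAt: '2024-06-01T10:00:00Z',
};

const safeRow = pickAllowed(productionRow, USER_COLUMN_ALLOWLIST);
console.log(Object.keys(safeRow)); // id, role, createdAt
```

With an allowlist, forgetting to handle a new column results in missing test data, which is noticeable, rather than leaked PII, which often is not.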
Strategy 5: API-Based Test Data Setup
Best for: E2E tests that need realistic workflows.
// utilities/test-setup.ts
import { APIRequestContext } from '@playwright/test';
import { faker } from '@faker-js/faker';
type AuthHeaders = { Authorization: string };
export class TestDataSetup {
constructor(private request: APIRequestContext) {}
/**
* Create user via API (faster than UI)
*/
async createUser(userData: { email: string; password: string; name: string }) {
const response = await this.request.post('/api/auth/signup', {
data: userData,
});
if (!response.ok()) {
throw new Error(`Failed to create user: ${await response.text()}`);
}
return response.json();
}
/**
* Log in and return the headers that authenticate later API calls.
* An APIRequestContext cannot be copied with object spread, so the token
* is passed with each request instead.
*/
async getAuthHeaders(email: string, password: string): Promise<AuthHeaders> {
const response = await this.request.post('/api/auth/login', {
data: { email, password },
});
if (!response.ok()) {
throw new Error(`Failed to log in: ${await response.text()}`);
}
const { token } = await response.json();
return { Authorization: `Bearer ${token}` };
}
/**
* Create project via API
*/
async createProject(data: { name: string; description: string }, headers?: AuthHeaders) {
const response = await this.request.post('/api/projects', {
data,
headers,
});
if (!response.ok()) {
throw new Error(`Failed to create project: ${await response.text()}`);
}
return response.json();
}
/**
* Setup complete test scenario
*/
async setupTestScenario() {
const email = faker.internet.email();
const password = 'TestPassword123!';
// Create user
const user = await this.createUser({ email, password, name: faker.person.fullName() });
// Authenticate as the new user
const headers = await this.getAuthHeaders(email, password);
// Create projects owned by that user
const projects = await Promise.all([
this.createProject({ name: 'Project A', description: 'Test project A' }, headers),
this.createProject({ name: 'Project B', description: 'Test project B' }, headers),
]);
return { user, projects };
}
}
// Usage in Playwright test
test('user can view their projects', async ({ page, request }) => {
const setup = new TestDataSetup(request);
const { user, projects } = await setup.setupTestScenario();
// Now navigate UI with pre-setup data
await page.goto('/login');
await page.fill('#email', user.email);
await page.fill('#password', 'TestPassword123!');
await page.click('button[type="submit"]');
await page.waitForURL('/dashboard');
// Verify projects are visible
for (const project of projects) {
await expect(page.getByText(project.name)).toBeVisible();
}
});
Test Data Cleanup Strategies
Strategy A: Isolated Test Databases
Use a fresh database per test run:
// playwright.config.ts
import { defineConfig } from '@playwright/test';
export default defineConfig({
globalSetup: require.resolve('./global-setup'),
globalTeardown: require.resolve('./global-teardown'),
});
// global-setup.ts
// runMigrations and dropDatabase are project-specific helpers (not shown here).
// runMigrations is assumed to create the database if it does not exist yet.
export default async function globalSetup() {
const testDbName = `test_${Date.now()}`;
// Create isolated test database and run migrations on it
await runMigrations(testDbName);
// Store DB name for tests
process.env.TEST_DB_NAME = testDbName;
}
// global-teardown.ts
export default async function globalTeardown() {
// Drop test database (skip if setup failed before the name was stored)
const testDbName = process.env.TEST_DB_NAME;
if (testDbName) {
await dropDatabase(testDbName);
}
}
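A timestamp-based name works for one run at a time. When several CI jobs or parallel workers need their own databases, the name has to be unique per worker and still be a valid identifier. The sketch below assumes PostgreSQL (which Supabase uses) and its 63-byte identifier limit. `testDatabaseName` is a hypothetical helper, and the run ID is whatever your CI system provides.

```typescript
const MAX_IDENTIFIER_LENGTH = 63; // PostgreSQL truncates longer identifiers

// Build a database name that is unique per CI run and per worker.
function testDatabaseName(runId: string, workerIndex: number): string {
  const prefix = 'test_';
  const suffix = `_w${workerIndex}`;
  // Keep only characters that are safe in an unquoted identifier.
  const safeRunId = runId.toLowerCase().replace(/[^a-z0-9_]/g, '_');
  const room = MAX_IDENTIFIER_LENGTH - prefix.length - suffix.length;
  return `${prefix}${safeRunId.slice(0, room)}${suffix}`;
}

console.log(testDatabaseName('PR-123/abc', 2)); // test_pr_123_abc_w2
```

In a Playwright suite the worker number could come from `testInfo.workerIndex`. Trimming the run ID rather than the suffix keeps the worker number intact even for long branch names.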
Strategy B: Transactional Tests
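Roll back database changes after each test. This only undoes writes made through the test's own connection, so it suits tests that query the database directly rather than browser-driven tests whose writes go through the application server:
// fixtures/database.ts
import { test as base, expect } from '@playwright/test';
import type { Knex } from 'knex';
import { db } from '../lib/database'; // assumed to be a Knex instance
export const test = base.extend<{ db: Knex.Transaction }>({
db: async ({}, use) => {
// Start a transaction; unlike a raw BEGIN, this pins every query to one pooled connection
const trx = await db.transaction();
await use(trx);
// Rollback after test
await trx.rollback();
},
});
// Usage
test('create project rolls back', async ({ db }) => {
await db('projects').insert({ name: 'Test Project' });
// count() resolves to rows such as [{ count: '1' }], not a bare number
const [{ count }] = await db('projects').where({ name: 'Test Project' }).count({ count: '*' });
expect(Number(count)).toBe(1);
// After test, transaction rolls back, no cleanup needed
});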
Strategy C: Cleanup Helpers
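// utilities/test-cleanup.ts
import { createClient } from '@supabase/supabase-js';
const supabase = createClient(process.env.SUPABASE_URL!, process.env.SUPABASE_SERVICE_ROLE_KEY!);
export class TestCleanup {
private createdIds: Map<string, string[]> = new Map();
track(entity: string, id: string) {
if (!this.createdIds.has(entity)) {
this.createdIds.set(entity, []);
}
this.createdIds.get(entity)!.push(id);
}
async cleanupAll() {
// Delete in reverse creation order so child rows (projects) go before their parents (users)
const entries = [...this.createdIds.entries()].reverse();
for (const [entity, ids] of entries) {
await this.cleanup(entity, ids);
}
this.createdIds.clear();
}
private async cleanup(entity: string, ids: string[]) {
// Delete from database
const { error } = await supabase.from(entity).delete().in('id', ids);
if (error) throw error;
console.log(`Cleaned up ${ids.length} ${entity} records`);
}
}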
// Usage
test('test with auto-cleanup', async ({ page }) => {
const cleanup = new TestCleanup();
try {
const user = await createUser({ email: 'test@example.com' });
cleanup.track('users', user.id);
const project = await createProject({ name: 'Test', ownerId: user.id });
cleanup.track('projects', project.id);
// Run test...
} finally {
await cleanup.cleanupAll();
}
});
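Tracking IDs per table covers database rows, but tests often create other things too, such as uploaded files or feature-flag overrides. A more general sketch registers a teardown callback alongside each creation step and runs the callbacks in reverse order, continuing past individual failures. `CleanupStack` is a hypothetical helper, not part of Playwright.

```typescript
type CleanupTask = () => Promise<void>;

class CleanupStack {
  private tasks: { label: string; run: CleanupTask }[] = [];

  // Register a teardown step right after the resource is created.
  defer(label: string, run: CleanupTask): void {
    this.tasks.push({ label, run });
  }

  // Run teardown steps last-in, first-out; collect failures instead of stopping.
  async runAll(): Promise<void> {
    const failures: string[] = [];
    while (this.tasks.length > 0) {
      const task = this.tasks.pop()!;
      try {
        await task.run();
      } catch (err) {
        const reason = err instanceof Error ? err.message : String(err);
        failures.push(`${task.label}: ${reason}`);
      }
    }
    if (failures.length > 0) {
      throw new Error(`Cleanup failed for ${failures.join('; ')}`);
    }
  }
}

// Usage with stand-in teardown steps
const executed: string[] = [];
const stack = new CleanupStack();
stack.defer('user', async () => { executed.push('user'); });
stack.defer('project', async () => { executed.push('project'); });
const done = stack.runAll(); // runs "project" first, then "user"
```

Continuing after a failed step means one stuck record does not leave everything created before it behind, while the final error still makes the leftover visible.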
Best Practices Checklist
- Never use production data directly - Always anonymize/mask PII
- Use factories over fixtures for dynamic data needs
- Seed minimal data - Only what the test needs
- Isolate test data - Each test should have its own data
- Clean up after tests - Don't leave test debris
- Version control fixtures - Track changes to test data
- Document data dependencies - Make relationships clear
- Use realistic data - Faker.js for varied values, plus hand-written cases for known edge conditions
- Test with large datasets - Validate performance at scale
- Automate data generation - Don't manually create test data
Conclusion
Test data management is the unsexy-but-critical foundation of reliable test automation. The strategies in this guide—from simple fixtures to sophisticated synthetic data generation—give you a complete toolkit for any testing scenario.
The key is choosing the right strategy for your context:
- Fixtures for simple, stable data
- Factories for dynamic, varied test cases
- Synthetic data for scale and realism
- API setup for speed
- Proper cleanup for reliability
Invest in test data infrastructure early. Your future self (and your team) will thank you when tests are fast, reliable, and maintainable.
