Mutation Testing: Are Your Tests Actually Effective? A Practical Guide

Code coverage isn't enough. Discover how mutation testing reveals whether your tests actually catch bugs by systematically introducing defects and measuring if your test suite detects them. Learn to use StrykerJS to improve test quality.

Scanly App

Related articles: Also see coverage metrics mutation testing exposes as misleading, property-based testing as another technique for finding coverage gaps, and design patterns that produce the kind of tests mutation testing rewards.

You have 95% code coverage. Your CI pipeline is green. But are your tests actually good? Do they catch bugs, or are they just exercising code without truly validating behavior?

This is where mutation testing comes in: a powerful technique that puts your tests to the test. Instead of asking "do my tests run?", mutation testing asks "do my tests detect bugs?"

The concept is simple but profound: introduce small, deliberate bugs (mutations) into your code, then check if your tests catch them. If a mutation survives (tests still pass despite the bug), you have a weakness in your test suite.

The Problem with Code Coverage

Code coverage measures which lines of code are executed during testing. It's a useful metric, but it has a critical flaw: it doesn't measure the quality of assertions.

Consider this example:

function calculateDiscount(price, discountPercent) {
  if (discountPercent > 100) {
    throw new Error('Invalid discount');
  }
  return price - (price * discountPercent) / 100;
}

// A poor test that executes the discount calculation but asserts nothing
test('calculateDiscount runs without error', () => {
  calculateDiscount(100, 20);
  // No assertions! Test passes but doesn't validate anything
});

This test executes the discount calculation, so a coverage report marks that line as covered, yet it never verifies the result. (It doesn't even reach the error branch.) Code coverage can't tell you this test is worthless.
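To make the gap concrete, here is a minimal hand-rolled sketch in plain JavaScript with no test runner. The names `mutantDiscount`, `weakCheck`, and `strongCheck` are hypothetical and introduced only for illustration; the mutant flips a single operator, much as a mutation tool might.

```javascript
// Original function from the example above.
function calculateDiscount(price, discountPercent) {
  if (discountPercent > 100) {
    throw new Error('Invalid discount');
  }
  return price - (price * discountPercent) / 100;
}

// Hypothetical mutant: the subtraction was changed to an addition.
function mutantDiscount(price, discountPercent) {
  if (discountPercent > 100) {
    throw new Error('Invalid discount');
  }
  return price + (price * discountPercent) / 100;
}

// Mirrors the assertion-free test: it passes as long as nothing throws.
function weakCheck(fn) {
  fn(100, 20);
  return true;
}

// Asserts on the result, so a wrong calculation fails the check.
function strongCheck(fn) {
  return fn(100, 20) === 80;
}

console.log('weak check, original:', weakCheck(calculateDiscount)); // true
console.log('weak check, mutant:', weakCheck(mutantDiscount)); // true: the mutant survives
console.log('strong check, original:', strongCheck(calculateDiscount)); // true
console.log('strong check, mutant:', strongCheck(mutantDiscount)); // false: the mutant is killed
```

Both checks execute exactly the same lines, so coverage cannot tell them apart; only the check with an assertion notices the broken calculation.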

Metric          What It Measures                            What It Misses
Code Coverage   Lines/branches executed by tests            Whether assertions actually validate logic
Mutation Score  % of mutations detected (killed) by tests   Much less: it gauges test quality directly, though equivalent mutants can skew it

What is Mutation Testing?

Mutation testing works by:

  1. Creating mutants: Automated tools introduce small changes (mutations) to your code, such as changing operators, modifying conditions, or removing statements.
  2. Running tests: Your test suite runs against each mutant.
  3. Scoring results:
    • Killed mutant: Tests failed (good! Your tests detected the bug)
    • Survived mutant: Tests passed (bad! Your tests missed the bug)
    • Timeout/error mutant: Mutation caused infinite loops or crashes

The mutation score is:

$$ \text{Mutation Score} = \frac{\text{Killed Mutants}}{\text{Total Mutants}} \times 100\% $$

A higher score means more effective tests.
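As a quick worked example, the formula can be sketched in a few lines of JavaScript. The function name `mutationScore` and the sample results are hypothetical; returning 0 for an empty run is an assumption made to avoid dividing by zero.

```javascript
// Computes the mutation score as defined above: killed / total * 100.
function mutationScore(results) {
  if (results.length === 0) {
    return 0; // assumption: an empty run scores 0 instead of dividing by zero
  }
  const killed = results.filter((status) => status === 'killed').length;
  return (killed / results.length) * 100;
}

// Hypothetical run: 8 mutants, 6 killed, 2 survived.
const exampleResults = [
  'killed', 'killed', 'killed', 'survived',
  'killed', 'killed', 'survived', 'killed',
];

console.log(mutationScore(exampleResults)); // 75
```

Real tools refine this basic ratio. StrykerJS, for instance, counts timed-out mutants as detected and leaves mutants that fail to compile out of the total.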

Common Mutation Operators

Mutation testing tools apply various mutation operators to your code:

Operator Type         Example Mutation                       Purpose
Arithmetic            + → -, * → /                           Test calculation logic
Conditional           > → >=, === → !==                      Test boundary conditions
Logical               && → ||, !condition → condition        Test boolean logic
Statement Removal     Remove return, remove function calls   Test essential behavior
Constant Replacement  true → false, 0 → 1, "" → "Stryker"    Test data validation
Assignment            x = y → x = 0                          Test variable assignments
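The sketch below shows, in a deliberately simplified form, how operators like these turn one expression into several mutants. It works on a pre-split token list, which is an assumption made to keep it short; real tools operate on a parsed syntax tree. The names `operatorSwaps` and `generateMutants` are hypothetical.

```javascript
// A few operator swaps taken from the table above (a real tool has many more).
const operatorSwaps = [
  { type: 'Arithmetic', from: '+', to: '-' },
  { type: 'Conditional', from: '>', to: '>=' },
  { type: 'Logical', from: '&&', to: '||' },
];

// Returns one mutant per matching token; each mutant changes exactly one token.
function generateMutants(tokens) {
  const mutants = [];
  tokens.forEach((token, index) => {
    for (const swap of operatorSwaps) {
      if (token === swap.from) {
        const mutated = [...tokens];
        mutated[index] = swap.to;
        mutants.push({ type: swap.type, code: mutated.join(' ') });
      }
    }
  });
  return mutants;
}

// Hypothetical expression, already split into tokens.
const expressionTokens = ['a', '+', 'b', '>', 'limit', '&&', 'enabled'];

for (const mutant of generateMutants(expressionTokens)) {
  console.log(`${mutant.type}: ${mutant.code}`);
}
// Arithmetic: a - b > limit && enabled
// Conditional: a + b >= limit && enabled
// Logical: a + b > limit || enabled
```

Each mutant differs from the original by a single change, which is what lets a surviving mutant point at one specific untested behavior.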

Introducing StrykerJS

StrykerJS is the leading mutation testing framework for JavaScript and TypeScript. It supports multiple test runners (Jest, Mocha, Jasmine, Vitest) and provides detailed HTML reports.

Installation

npm install --save-dev @stryker-mutator/core
npx stryker init

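The init command creates a configuration file tailored to your project. This guide calls it stryker.conf.json; newer StrykerJS releases may name it stryker.config.json instead, so check which file init generated.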

Basic Configuration

{
  "$schema": "./node_modules/@stryker-mutator/core/schema/stryker-schema.json",
  "packageManager": "npm",
  "testRunner": "jest",
  "coverageAnalysis": "perTest",
  "mutate": ["src/**/*.js", "!src/**/*.test.js", "!src/**/*.spec.js"],
  "concurrency": 4,
  "timeoutMS": 10000
}

Running Mutation Tests

npx stryker run

Stryker will:

  1. Run your tests once to establish a baseline
  2. Create mutations of your source code
  3. Run tests against each mutant
  4. Generate a detailed report
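The four steps can be sketched as a toy loop in plain JavaScript. This is not how Stryker is implemented; it is a minimal illustration under the assumption that mutants are hand-written functions and that a "test" is a function that throws on failure. All names here are hypothetical.

```javascript
// Hypothetical unit under test, plus two hand-written mutants of it.
const original = (a, b) => a + b;
const mutants = [
  { name: 'a - b', fn: (a, b) => a - b },
  { name: 'a * b', fn: (a, b) => a * b },
];

// A tiny "test suite": each test throws when its expectation fails.
const suite = [
  (add) => {
    if (add(2, 2) !== 4) throw new Error('expected 2 + 2 to equal 4');
  },
];

// Runs every test against one implementation; true means all passed.
function suitePasses(implementation) {
  try {
    suite.forEach((runTest) => runTest(implementation));
    return true;
  } catch {
    return false;
  }
}

function runMutationTesting() {
  // Step 1: the baseline run must pass on unmodified code.
  if (!suitePasses(original)) {
    throw new Error('Baseline failed: fix the tests before mutating');
  }
  // Steps 2 and 3: run the suite against each mutant.
  // Step 4: report a status per mutant.
  return mutants.map((mutant) => ({
    name: mutant.name,
    status: suitePasses(mutant.fn) ? 'survived' : 'killed',
  }));
}

console.log(runMutationTesting());
```

The `a * b` mutant survives because 2 * 2 is also 4. Adding a test such as `add(2, 3) === 5` would kill it, which is exactly the kind of gap a mutation report surfaces.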

Practical Example: Testing a User Validator

Let's test a simple user validation function:

// src/userValidator.js
export function validateUser(user) {
  if (!user) {
    return { valid: false, error: 'User is required' };
  }

  if (!user.email || !user.email.includes('@')) {
    return { valid: false, error: 'Invalid email' };
  }

  if (typeof user.age !== 'number' || user.age < 18) {
    return { valid: false, error: 'User must be 18+' };
  }

  if (!user.username || user.username.length < 3) {
    return { valid: false, error: 'Username must be 3+ characters' };
  }

  return { valid: true };
}

Weak Tests (Low Mutation Score)

// Poor tests - focus on happy path only
describe('validateUser - weak tests', () => {
  test('accepts valid user', () => {
    const result = validateUser({
      email: 'test@example.com',
      age: 25,
      username: 'testuser',
    });
    expect(result.valid).toBe(true);
  });

  test('rejects user without email', () => {
    const result = validateUser({
      age: 25,
      username: 'testuser',
    });
    expect(result.valid).toBe(false);
  });
});

Mutation score: ~40%

Stryker would create mutations like:

  • Changing user.age < 18 → user.age <= 18 (survives!)
  • Changing username.length < 3 → username.length <= 3 (survives!)
  • Removing !user.email.includes('@') (survives!)
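To see why the age mutant survives, it helps to isolate the condition and compare it with its mutated form. The sketch below is illustrative only; `rejectsAge` and `rejectsAgeMutant` are hypothetical names for the original and mutated checks.

```javascript
const rejectsAge = (age) => age < 18; // original condition
const rejectsAgeMutant = (age) => age <= 18; // mutant: < changed to <=

// The only age the weak tests ever pass in.
const weakTestAges = [25];
const weakTestsAgree = weakTestAges.every(
  (age) => rejectsAge(age) === rejectsAgeMutant(age)
);
console.log(weakTestsAgree); // true: the weak tests cannot tell the two apart

// Scan a range of ages for inputs on which the two conditions disagree.
const distinguishingAges = [];
for (let age = 0; age <= 100; age++) {
  if (rejectsAge(age) !== rejectsAgeMutant(age)) {
    distinguishingAges.push(age);
  }
}
console.log(distinguishingAges); // [ 18 ]
```

Only the boundary value separates the original from the mutant, so only a test that uses an age of exactly 18 can kill it.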

Strong Tests (High Mutation Score)

// Comprehensive tests - cover edge cases
describe('validateUser - strong tests', () => {
  test('accepts valid user', () => {
    const result = validateUser({
      email: 'test@example.com',
      age: 25,
      username: 'testuser',
    });
    expect(result.valid).toBe(true);
    expect(result.error).toBeUndefined();
  });

  test('rejects null user', () => {
    const result = validateUser(null);
    expect(result.valid).toBe(false);
    expect(result.error).toContain('required');
  });

  test('rejects email without @', () => {
    const result = validateUser({
      email: 'bademail',
      age: 25,
      username: 'testuser',
    });
    expect(result.valid).toBe(false);
    expect(result.error).toContain('email');
  });

  test('rejects user aged exactly 17', () => {
    const result = validateUser({
      email: 'test@example.com',
      age: 17,
      username: 'testuser',
    });
    expect(result.valid).toBe(false);
    expect(result.error).toContain('18+');
  });

  test('accepts user aged exactly 18', () => {
    const result = validateUser({
      email: 'test@example.com',
      age: 18,
      username: 'testuser',
    });
    expect(result.valid).toBe(true);
  });

  test('rejects username of length 2', () => {
    const result = validateUser({
      email: 'test@example.com',
      age: 25,
      username: 'ab',
    });
    expect(result.valid).toBe(false);
  });

  test('accepts username of exactly 3 characters', () => {
    const result = validateUser({
      email: 'test@example.com',
      age: 25,
      username: 'abc',
    });
    expect(result.valid).toBe(true);
  });
});

Mutation score: ~95%

These tests cover boundary conditions, validate error messages, and test both sides of each conditional.
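Even a strong suite can leave a few mutants alive. As one plausible example (an assumption about what a tool might generate, not output from a real run), consider a mutant that replaces the `typeof user.age !== 'number'` check with `false`. The sketch below uses a hypothetical `makeValidator` factory so the original and the mutant share the rest of the logic.

```javascript
// The validator from this guide, with the type check made switchable.
function makeValidator({ checkAgeType }) {
  return function validateUser(user) {
    if (!user) {
      return { valid: false, error: 'User is required' };
    }
    if (!user.email || !user.email.includes('@')) {
      return { valid: false, error: 'Invalid email' };
    }
    const wrongType = checkAgeType ? typeof user.age !== 'number' : false;
    if (wrongType || user.age < 18) {
      return { valid: false, error: 'User must be 18+' };
    }
    if (!user.username || user.username.length < 3) {
      return { valid: false, error: 'Username must be 3+ characters' };
    }
    return { valid: true };
  };
}

const originalValidator = makeValidator({ checkAgeType: true });
const typeCheckMutant = makeValidator({ checkAgeType: false }); // typeof check replaced by false

const baseUser = { email: 'test@example.com', age: 25, username: 'testuser' };

// The inputs used by the strong tests above.
const strongInputs = [
  baseUser,
  null,
  { ...baseUser, email: 'bademail' },
  { ...baseUser, age: 17 },
  { ...baseUser, age: 18 },
  { ...baseUser, username: 'ab' },
  { ...baseUser, username: 'abc' },
];

const agreeOnAll = strongInputs.every(
  (input) =>
    JSON.stringify(originalValidator(input)) === JSON.stringify(typeCheckMutant(input))
);
console.log(agreeOnAll); // true: none of these inputs can kill the mutant

// A non-numeric age separates the two.
const stringAgeUser = { ...baseUser, age: '25' };
console.log(originalValidator(stringAgeUser).valid); // false
console.log(typeCheckMutant(stringAgeUser).valid); // true
```

A test that passes a string age and expects rejection would close this gap.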

The Mutation Testing Workflow

graph TD
    A[Write Initial Tests] --> B[Run Mutation Testing];
    B --> C{Review Mutation Report};
    C --> D[Identify Survived Mutants];
    D --> E{Is Mutant Valid?};
    E -- "Bug in Code" --> F[Fix Application Code];
    E -- "Missing Test" --> G[Add/Improve Tests];
    E -- "Equivalent Mutant" --> H[Document & Skip];
    F --> B;
    G --> B;
    H --> I[Accept Current Score];

Interpreting Results

When you find survived mutants:

  1. Missing test cases: Add tests for uncovered scenarios
  2. Weak assertions: Strengthen existing tests with more specific assertions
  3. Equivalent mutants: Sometimes mutations don't change behavior (e.g., i++ → ++i in certain contexts). These are false positives.
  4. Actual bugs: Occasionally, survived mutants reveal real bugs in your code!
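An equivalent mutant is easiest to recognize with a concrete case. In the sketch below, the hypothetical functions `sumOriginal` and `sumMutant` differ only in the loop's update expression. Because the value of that expression is discarded, no input can make them behave differently.

```javascript
function sumOriginal(values) {
  let total = 0;
  for (let i = 0; i < values.length; i++) {
    total += values[i];
  }
  return total;
}

// Mutant: i++ became ++i. The result of the update expression is unused,
// so the loop visits exactly the same indices.
function sumMutant(values) {
  let total = 0;
  for (let i = 0; i < values.length; ++i) {
    total += values[i];
  }
  return total;
}

const samples = [[], [5], [1, 2, 3], [-4, 4, 10]];
for (const sample of samples) {
  console.log(sumOriginal(sample) === sumMutant(sample)); // true for every sample
}
```

No test can kill a mutant like this, so the right response is to document it and move on rather than to keep adding tests.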

Best Practices

1. Start Small

Don't run mutation testing on your entire codebase at once. Start with:

  • Critical business logic functions
  • Utility libraries
  • Bug-prone areas

2. Set Realistic Targets

Code Type                Target Mutation Score
Critical business logic  90-100%
Utility functions        80-95%
UI components            60-80%
Integration code         50-70%
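One way to act on these targets is a small check that compares a measured score with the lower bound for its code type. The sketch below is a hypothetical helper, not part of StrykerJS; `minimumScores` and `meetsTarget` are names introduced here, and the numbers are the lower ends of the ranges in the table.

```javascript
// Lower bounds of the target ranges from the table above.
const minimumScores = {
  'critical business logic': 90,
  'utility functions': 80,
  'ui components': 60,
  'integration code': 50,
};

// Returns true when the measured score reaches the minimum for its code type.
function meetsTarget(codeType, score) {
  const minimum = minimumScores[codeType.toLowerCase()];
  if (minimum === undefined) {
    throw new Error(`Unknown code type: ${codeType}`);
  }
  return score >= minimum;
}

console.log(meetsTarget('Utility functions', 84)); // true
console.log(meetsTarget('Critical business logic', 84)); // false
```

For a single project-wide floor, StrykerJS also offers a `thresholds` option (with `high`, `low`, and `break` values) in its configuration file.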

3. Integrate into CI (Carefully)

Mutation testing is slow. Instead of running on every commit:

# .github/workflows/mutation-tests.yml
name: Mutation Testing
on:
  schedule:
    - cron: '0 2 * * 1' # Weekly, Monday 2 AM
  workflow_dispatch: # Manual trigger

jobs:
  mutation:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-node@v4
      - run: npm ci
      - run: npx stryker run
      - uses: actions/upload-artifact@v4
        with:
          name: mutation-report
          path: reports/mutation # parent folder, so the HTML report is included wherever your Stryker version writes it

4. Use Incremental Mode

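Stryker can run incrementally, reusing results from the previous run and re-testing only the mutants affected by changed code or tests. Keep the incremental file outside .stryker-tmp, because Stryker uses that folder as a temporary directory and by default cleans it up after a successful run:

{
  "incremental": true,
  "incrementalFile": "reports/stryker-incremental.json"
}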

5. Exclude Low-Value Code

Don't waste time mutating:

  • Trivial getters/setters
  • Configuration files
  • Auto-generated code
  • Boilerplate
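Whole files can be excluded with negated globs in `mutate`, as in the basic configuration earlier. For individual lines, StrykerJS documents comment directives; the sketch below assumes the `// Stryker disable next-line` form is available in your version, so check the documentation before relying on it. The `Account` class is hypothetical.

```javascript
class Account {
  constructor(owner, balance) {
    this.owner = owner;
    this.balance = balance;
  }

  // Stryker disable next-line all: trivial getter, not worth mutating
  getOwner() { return this.owner; }

  // Business rule worth mutating: both boundaries matter here.
  canWithdraw(amount) {
    return amount > 0 && amount <= this.balance;
  }
}

const account = new Account('Ada', 100);
console.log(account.getOwner()); // Ada
console.log(account.canWithdraw(100)); // true
console.log(account.canWithdraw(101)); // false
```

The directive is an ordinary comment, so the code runs unchanged; it only tells the mutation tool to skip the following line.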

Mutation Testing vs. Other Techniques

Technique               Strengths                      Use Case
Code Coverage           Fast, simple to understand     Baseline quality check
Mutation Testing        Validates assertion quality    Critical logic validation
Property-Based Testing  Explores wide input space      Pure functions, algorithms
Snapshot Testing        Detects unintended UI changes  Component output verification

Mutation testing is most valuable when combined with other techniques, not as a replacement.

Limitations

  1. Performance: Mutation testing is computationally expensive (10-100x slower than normal tests)
  2. Equivalent mutants: Some mutations don't actually change behavior, inflating survival rates
  3. Diminishing returns: Getting from 80% to 100% mutation score may not be worth the effort
  4. Doesn't replace other testing: Mutation testing improves unit tests but doesn't catch integration issues
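A back-of-the-envelope estimate shows where the performance cost comes from. The sketch below assumes the worst case, in which every mutant triggers a full test-suite run; the function name and the sample numbers are hypothetical. Options such as per-test coverage analysis and incremental mode exist precisely to beat this bound.

```javascript
// Rough upper bound: every mutant runs the whole suite once,
// and the work is spread evenly across the workers.
function estimateRunSeconds({ mutantCount, suiteSeconds, concurrency }) {
  return (mutantCount * suiteSeconds) / concurrency;
}

// Hypothetical project: 1200 mutants, a 6-second suite, 4 workers.
const estimatedSeconds = estimateRunSeconds({
  mutantCount: 1200,
  suiteSeconds: 6,
  concurrency: 4,
});

console.log(`${estimatedSeconds / 60} minutes`); // 30 minutes
```

A suite that takes seconds on its own can turn into a half-hour mutation run, which is why the earlier advice is to schedule it rather than run it on every commit.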

Conclusion

Mutation testing shifts the conversation from "do we have tests?" to "are our tests effective?" It's a reality check for your test suite, revealing weaknesses that code coverage can't see.

While it's not a silver bullet, mutation testing is invaluable for critical code paths where bugs have high costs. By systematically introducing defects and checking if your tests catch them, you build confidence that your test suite is truly protecting your users.

Start small, focus on high-value code, and use mutation scores as a guide, not an obsession. Your goal isn't a 100% mutation score; it's tests that actually catch bugs.

Ready to elevate your testing strategy? Sign up for ScanlyApp and integrate advanced QA techniques into your workflow.
