Database Locks and Deadlocks: How to Diagnose the Silent Killer of API Performance
The checkout flow worked perfectly in your staging environment. Thousands of automated test runs, never a failure. Then you released to production and started seeing sporadic 500 errors with the message: ERROR: deadlock detected.
Deadlocks and lock contention failures are the quintessential concurrency bugs: you cannot reproduce them single-threaded, they depend on precise timing, and they only emerge at production volumes. Yet they are entirely testable once you know the patterns.
This guide focuses on PostgreSQL, which is the database underlying most modern SaaS platforms (including Supabase), but the concepts apply broadly.
What Is a Deadlock?
A deadlock occurs when two or more transactions each hold a lock that the other transaction needs, and none can proceed:
sequenceDiagram
participant T1 as Transaction 1
participant A as Row A
participant B as Row B
participant T2 as Transaction 2
T1->>A: LOCK (acquired)
T2->>B: LOCK (acquired)
T1->>B: WAITING for lock...
T2->>A: WAITING for lock...
Note over T1,T2: 💀 DEADLOCK<br/>PostgreSQL kills one transaction
PostgreSQL detects this cycle and terminates one of the transactions with error 40P01 (deadlock detected). The losing transaction must be retried; if your application does not handle that retry, the user sees a 500.
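Conceptually, the deadlock detector walks the waits-for graph between transactions looking for a cycle. A toy sketch of that cycle check in TypeScript (the graph representation and function names are illustrative, not PostgreSQL internals):

```typescript
// Toy waits-for graph: maps each transaction to the one it is waiting on.
// PostgreSQL's real detector inspects its shared lock tables; this only
// illustrates the cycle check that separates "deadlocked" from "just waiting".
type WaitsFor = Map<string, string>;

function findDeadlockCycle(graph: WaitsFor, start: string): string[] | null {
  const path: string[] = [];
  const seen = new Set<string>();
  let current: string | undefined = start;
  while (current !== undefined) {
    if (seen.has(current)) {
      // Revisited a transaction: everything from its first occurrence is the cycle
      return path.slice(path.indexOf(current));
    }
    seen.add(current);
    path.push(current);
    current = graph.get(current);
  }
  return null; // the wait chain ends at a transaction that isn't waiting
}

// T1 waits on T2 and T2 waits on T1: the scenario from the diagram above
const graph: WaitsFor = new Map([
  ['T1', 'T2'],
  ['T2', 'T1'],
]);
console.log(findDeadlockCycle(graph, 'T1')); // → [ 'T1', 'T2' ]
```

A chain of waiters with no cycle (T1 waits on T2, T2 runs freely) returns null: that is ordinary lock contention, which resolves by itself, not a deadlock.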
Lock Types in PostgreSQL
Not all locks conflict with each other. Understanding the lock matrix is essential for writing correct concurrent queries:
| Lock Mode | Conflicts With |
|---|---|
| ACCESS SHARE (SELECT) | ACCESS EXCLUSIVE only |
| ROW SHARE (SELECT FOR UPDATE) | EXCLUSIVE, ACCESS EXCLUSIVE |
| ROW EXCLUSIVE (INSERT/UPDATE/DELETE) | SHARE, SHARE ROW EXCLUSIVE, EXCLUSIVE, ACCESS EXCLUSIVE |
| SHARE (CREATE INDEX) | ROW EXCLUSIVE, SHARE ROW EXCLUSIVE, EXCLUSIVE, ACCESS EXCLUSIVE |
| EXCLUSIVE | Everything except ACCESS SHARE |
| ACCESS EXCLUSIVE (ALTER TABLE, DROP) | Everything |
The practical implication: SELECT FOR UPDATE takes row-level locks, so concurrent UPDATEs of the same rows must wait; and ALTER TABLE takes ACCESS EXCLUSIVE, which blocks every other statement on the table, including plain SELECTs.
Common Deadlock Patterns in Application Code
Pattern 1: Inconsistent Update Order
The most common deadlock in application code: two transactions updating the same rows in different orders:
// ❌ DEADLOCK RISK: Two paths update rows in different orders
// Handler A: processes order first, then inventory
async function handleOrderA(orderId: string) {
await db.transaction(async (trx) => {
await trx('orders').where({ id: orderId }).update({ status: 'processing' });
await trx('inventory').where({ product_id: '123' }).decrement('quantity', 1);
});
}
// Handler B: updates inventory first, then order
async function handleOrderB(orderId: string) {
await db.transaction(async (trx) => {
await trx('inventory').where({ product_id: '123' }).decrement('quantity', 1);
await trx('orders').where({ id: orderId }).update({ status: 'confirmed' });
});
}
Fix: Always update tables/rows in the same canonical order throughout your codebase:
// ✅ Consistent order: inventory first, then orders — always
async function handleOrderConsistent(orderId: string) {
await db.transaction(async (trx) => {
// Always acquire locks in alphabetical or defined canonical order
await trx('inventory').where({ product_id: '123' }).decrement('quantity', 1);
await trx('orders').where({ id: orderId }).update({ status: 'processing' });
});
}
Pattern 2: Missing Index on FK with Concurrent Deletes
-- Parent-child delete where the FK column has no index
DELETE FROM orders WHERE user_id = $1;
-- If order_items.order_id has no index, the foreign-key trigger must
-- sequentially scan all of order_items for every deleted order row,
-- holding locks far longer and widening the deadlock window.
-- Fix: CREATE INDEX CONCURRENTLY ON order_items (order_id);
Testing for Deadlocks
Concurrent Request Simulation with Playwright
// tests/concurrency/deadlock.test.ts
import { test, expect } from '@playwright/test';
test('concurrent checkout does not deadlock', async ({ browser }) => {
// Create two independent contexts (two browser sessions)
const contextA = await browser.newContext();
const contextB = await browser.newContext();
const pageA = await contextA.newPage();
const pageB = await contextB.newPage();
// Authenticate both sessions
await authenticateAs(pageA, 'user-a@test.com');
await authenticateAs(pageB, 'user-b@test.com');
// Navigate both to the same limited-stock product
await Promise.all([pageA.goto('/products/limited-item'), pageB.goto('/products/limited-item')]);
// Both attempt to add to cart simultaneously
await Promise.all([pageA.click('[data-testid="add-to-cart"]'), pageB.click('[data-testid="add-to-cart"]')]);
// Both trigger checkout simultaneously; register the response waits
// before clicking so no response is missed (checkout selector illustrative)
const [responseA, responseB] = await Promise.all([
pageA.waitForResponse('/api/checkout/initiate'),
pageB.waitForResponse('/api/checkout/initiate'),
pageA.click('[data-testid="checkout"]'),
pageB.click('[data-testid="checkout"]'),
]);
// Both should succeed or at most one should fail gracefully (not 500)
const statuses = [responseA.status(), responseB.status()];
// At least one should succeed
expect(statuses).toContain(200);
// Neither should return a 500 (unhandled deadlock)
expect(statuses).not.toContain(500);
await contextA.close();
await contextB.close();
});
API-Level Concurrent Request Test
// tests/concurrency/api-concurrent.test.ts
import { test, expect } from '@playwright/test';
test('inventory decrement is safe under concurrent requests', async ({ request }) => {
const productId = 'test-product-with-qty-1';
// Reset inventory to exactly 1 unit
await request.post('/api/test/reset-inventory', {
data: { productId, quantity: 1 },
});
// Fire 10 concurrent requests to purchase the single item
const responses = await Promise.all(
Array.from({ length: 10 }, () =>
request.post('/api/cart/purchase', {
data: { productId, quantity: 1 },
}),
),
);
const statusCodes = responses.map((r) => r.status());
const successes = statusCodes.filter((s) => s === 200).length;
const expectedFailures = statusCodes.filter((s) => s === 409).length; // Out of stock
const serverErrors = statusCodes.filter((s) => s >= 500).length;
// Exactly 1 should succeed (only 1 unit in stock)
expect(successes).toBe(1);
// The rest should get a graceful 409, not a 500
expect(serverErrors).toBe(0);
expect(expectedFailures).toBe(9);
});
PostgreSQL Deadlock Diagnostics
When a deadlock occurs in production, PostgreSQL logs it. Here's how to enable and parse those logs:
# Enable lock-wait logging in postgresql.conf (conf files use # comments)
# (or via Supabase dashboard → Database → Connection Pool → Logs)
log_lock_waits = on          # log any wait lasting longer than deadlock_timeout
deadlock_timeout = 500ms     # also when the deadlock check first runs (default 1s)
-- Query current locks and what's blocking what
-- (for a single backend, pg_blocking_pids(pid) is a simpler alternative)
SELECT
  blocked_locks.pid AS blocked_pid,
  blocking_locks.pid AS blocking_pid,
  blocked_activity.query AS blocked_query,
  blocking_activity.query AS blocking_query
FROM pg_catalog.pg_locks blocked_locks
JOIN pg_catalog.pg_locks blocking_locks
  ON blocking_locks.locktype = blocked_locks.locktype
  AND blocking_locks.relation = blocked_locks.relation
  AND blocking_locks.pid != blocked_locks.pid
  AND blocking_locks.granted
  AND NOT blocked_locks.granted
JOIN pg_catalog.pg_stat_activity blocked_activity
  ON blocked_activity.pid = blocked_locks.pid
JOIN pg_catalog.pg_stat_activity blocking_activity
  ON blocking_activity.pid = blocking_locks.pid;
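When a deadlock fires, the log's DETAIL section lists each waiter and blocker in a fixed sentence form, e.g. `Process 9673 waits for ShareLock on transaction 1234; blocked by process 9674.` A small parser (hypothetical helper; the sentence format follows PostgreSQL's standard deadlock message) can turn those lines into an edge list for alerting:

```typescript
// Hypothetical log helper: extract (waiter, blocker) pid pairs from the
// DETAIL lines of a "deadlock detected" log entry.
type WaitEdge = { waiter: number; blocker: number };

function parseDeadlockDetail(detail: string): WaitEdge[] {
  const edges: WaitEdge[] = [];
  const re = /Process (\d+) waits for .*?; blocked by process (\d+)\./g;
  for (const m of detail.matchAll(re)) {
    edges.push({ waiter: Number(m[1]), blocker: Number(m[2]) });
  }
  return edges;
}

const detail = `Process 9673 waits for ShareLock on transaction 1234; blocked by process 9674.
Process 9674 waits for ShareLock on transaction 1233; blocked by process 9673.`;
console.log(parseDeadlockDetail(detail));
// → [ { waiter: 9673, blocker: 9674 }, { waiter: 9674, blocker: 9673 } ]
```

Feeding these edges into your alerting pipeline tells you which backends deadlocked; joining the pids against pg_stat_activity (or your application's connection tagging) identifies the offending queries.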
Connection Pool Configuration
Connection pool exhaustion produces errors that look like database errors but are actually infrastructure issues:
// Recommended Supabase/pg connection pool configuration
// server/db.ts
import { Pool } from 'pg';
const pool = new Pool({
connectionString: process.env.DATABASE_URL,
// Max connections this service instance will open.
// Classic server-side sizing heuristic: (CPU cores × 2) + disk spindles;
// keep (instances × max) well below the server's max_connections.
max: 10,
// Kill idle connections after 10s to free pool slots
idleTimeoutMillis: 10_000,
// Fail fast if connection cannot be obtained in 5s
// (better to fail immediately than have users waiting)
connectionTimeoutMillis: 5_000,
// Retry queries on transient errors (including deadlocks)
// Pool-level retry is not built-in; use application-level retry below
});
// Application-level retry wrapper for deadlock-safe queries
export async function withRetry<T>(fn: () => Promise<T>, maxRetries = 3, delayMs = 100): Promise<T> {
for (let attempt = 0; attempt < maxRetries; attempt++) {
try {
return await fn();
} catch (err: unknown) {
// Prefer SQLSTATE codes over message matching:
// 40P01 = deadlock_detected, 40001 = serialization_failure
const code = (err as { code?: string }).code;
const isRetryable = code === '40P01' || code === '40001';
if (isRetryable && attempt < maxRetries - 1) {
await new Promise((resolve) => setTimeout(resolve, delayMs * 2 ** attempt));
continue;
}
throw err;
}
}
throw new Error('Unreachable');
}
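The retry behaviour can be sanity-checked without a database by driving the same loop with a stub that fails a fixed number of times. A standalone sketch (the wrapper is re-declared inline so the snippet runs on its own; the thrown error mimics a node-postgres deadlock error with a message and SQLSTATE code):

```typescript
// Standalone simulation: a stub "query" deadlocks twice, then succeeds.
async function retrying<T>(fn: () => Promise<T>, maxRetries = 3, delayMs = 10): Promise<T> {
  for (let attempt = 0; attempt < maxRetries; attempt++) {
    try {
      return await fn();
    } catch (err: unknown) {
      const e = err as { message?: string; code?: string };
      const retryable = e.code === '40P01' || e.message?.includes('deadlock detected');
      if (retryable && attempt < maxRetries - 1) {
        // Exponential backoff: delayMs, 2×delayMs, 4×delayMs, ...
        await new Promise((r) => setTimeout(r, delayMs * 2 ** attempt));
        continue;
      }
      throw err;
    }
  }
  throw new Error('Unreachable');
}

let failuresLeft = 2;
async function flakyQuery(): Promise<string> {
  if (failuresLeft-- > 0) {
    // Same shape as a pg deadlock error: message plus SQLSTATE code
    throw Object.assign(new Error('deadlock detected'), { code: '40P01' });
  }
  return 'committed';
}

retrying(flakyQuery).then((result) => console.log(result)); // → committed
```

With the defaults the two waits are 10ms and 20ms; in production you would likely add jitter so retrying transactions do not collide again in lockstep.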
Related articles: see the full database performance optimisation checklist beyond lock management, the testing practices that catch concurrency issues before production, and the distributed transaction challenges in microservices that cause deadlocks.
Prevention Checklist
| Category | Action | Priority |
|---|---|---|
| Query order | Canonical lock acquisition order | 🔴 Critical |
| Indexes | FK indexes on all foreign keys | 🔴 Critical |
| Transactions | Keep transactions short-lived | 🔴 Critical |
| Retry logic | Handle deadlock errors + exponential backoff | 🔴 Critical |
| DDL changes | Run schema changes in maintenance windows | 🟡 High |
| SELECT FOR UPDATE | Use SKIP LOCKED for queue patterns | 🟡 High |
| Monitoring | Log and alert on lock waits > 1s | 🟡 High |
Deadlocks and lock contention are reproducible and testable when you instrument your staging database correctly and run concurrent load tests.
Test your database layer under concurrent load: Try ScanlyApp free and validate your API endpoints stay healthy under simultaneous user sessions.
