Database Performance Tuning: A 10-Step Checklist That Cuts Slow Query Times in Half
Your application is slow. Users are complaining. You check the logs and see it: database queries taking 5 seconds, 10 seconds, sometimes timing out entirely. Your server resources are maxed out, but the database is the bottleneck.
Sound familiar?
Database performance issues are among the most common, and most fixable, problems in software development. A poorly optimized query can bring an entire application to its knees. But with the right techniques, you can transform that 10-second query into a 10-millisecond query, handling 100x more load on the same hardware.
This comprehensive guide provides a systematic approach to database performance tuning, covering query optimization, indexing strategies, connection management, and diagnostic tools for PostgreSQL, MySQL, and MongoDB.
The Performance Tuning Mindset
Before diving into specific techniques, understand this fundamental principle:
Premature optimization is the root of all evil, but measurement is not.
Always:
- Measure first: Identify slow queries with real data
- Optimize strategically: Focus on the queries that matter most
- Test changes: Verify improvements with benchmarks
- Monitor continuously: Performance degrades over time
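The "measure first" principle can start right in application code. Here is a minimal sketch of a wrapper that times every query and logs the slow ones; the 200ms default threshold and the `(sql, params)` call shape are assumptions to adapt to your client library.

```javascript
// A sketch of "measure first" at the application layer: wrap an async
// query function so every call is timed and slow calls are logged.
// The 200ms default threshold and the (sql, params) call shape are
// assumptions; adapt them to your client library.
function withSlowQueryLog(queryFn, thresholdMs = 200, log = console.warn) {
  return async function timedQuery(...args) {
    const start = Date.now();
    try {
      return await queryFn(...args);
    } finally {
      const elapsedMs = Date.now() - start;
      if (elapsedMs >= thresholdMs) {
        log(`Slow query (${elapsedMs}ms): ${String(args[0])}`);
      }
    }
  };
}
```

Usage might look like `const query = withSlowQueryLog((sql, params) => pool.query(sql, params));` (assuming a `pool` such as the one shown in the connection pooling section below).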
Query Optimization Fundamentals
The Query Execution Pipeline
graph LR
A[SQL Query] --> B[Parser];
B --> C[Query Planner];
C --> D[Execution Engine];
D --> E[Storage Layer];
E --> F[Results];
style C fill:#ffff99
style E fill:#ffcccc
Most performance issues occur at the Query Planner (choosing execution strategy) or Storage Layer (disk I/O).
The N+1 Query Problem
The most common performance killer.
// ❌ BAD: N+1 queries (1 + N where N = number of users)
const users = await db.query('SELECT * FROM users LIMIT 100');
for (const user of users) {
const posts = await db.query('SELECT * FROM posts WHERE user_id = ?', [user.id]);
user.posts = posts;
}
// Result: 101 queries for 100 users!
// ✅ GOOD: Single query with JOIN
const usersWithPosts = await db.query(`
SELECT u.*, p.id as post_id, p.title, p.content
FROM users u
LEFT JOIN posts p ON p.user_id = u.id
LIMIT 100
`);
// Result: 1 query
In ORMs:
// Sequelize: Use eager loading
const users = await User.findAll({
include: [{ model: Post }], // Prevents N+1
limit: 100,
});
// Prisma: Use include
const users = await prisma.user.findMany({
include: { posts: true },
take: 100,
});
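One wrinkle with the raw-JOIN approach: the database returns one flattened row per post, so each user repeats across rows. If you are not using an ORM, a small helper can regroup the rows into the nested shape the N+1 version produced. Column names follow the JOIN example above; `name` is an assumed `users` column.

```javascript
// Regroup flattened JOIN rows (one row per user/post pair) into
// users with a nested posts array. post_id is null for users with
// no posts because of the LEFT JOIN.
function groupUsersWithPosts(rows) {
  const byId = new Map();
  for (const row of rows) {
    if (!byId.has(row.id)) {
      byId.set(row.id, { id: row.id, name: row.name, posts: [] });
    }
    if (row.post_id != null) {
      byId.get(row.id).posts.push({ id: row.post_id, title: row.title });
    }
  }
  return [...byId.values()];
}
```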
Indexing Strategies
Indexes are the single most powerful performance tool. But they're not free: they slow writes and consume storage.
Index Types Comparison
| Index Type | Use Case | PostgreSQL | MySQL | MongoDB |
|---|---|---|---|---|
| B-Tree | Equality, range queries, sorting | ✅ Default | ✅ Default | ✅ |
| Hash | Exact equality only (WHERE col = value) | ✅ | ⚠️ MEMORY/NDB engines only | ✅ Hashed |
| GIN/Full-Text | Text search, JSONB, arrays | ✅ GIN | ✅ Full-Text | ✅ Text |
| Partial | Index only subset of rows (WHERE condition) | ✅ | ❌ | ✅ |
| Covering | Index includes all queried columns | ✅ INCLUDE (11+) | ✅ | ✅ |
| Geospatial | Location-based queries | ✅ PostGIS | ✅ Spatial | ✅ 2dsphere |
When to Add an Index
-- ✅ Index on WHERE clause columns
SELECT * FROM orders WHERE customer_id = 123;
CREATE INDEX idx_orders_customer_id ON orders(customer_id);
-- ✅ Index on JOIN columns
SELECT * FROM orders o JOIN customers c ON o.customer_id = c.id;
CREATE INDEX idx_orders_customer_id ON orders(customer_id);
CREATE INDEX idx_customers_id ON customers(id); -- Primary key, likely already indexed
-- ✅ Index on ORDER BY columns
SELECT * FROM posts ORDER BY created_at DESC LIMIT 10;
CREATE INDEX idx_posts_created_at ON posts(created_at DESC);
-- ✅ Composite index for multi-column queries
SELECT * FROM orders WHERE customer_id = 123 AND status = 'pending';
CREATE INDEX idx_orders_customer_status ON orders(customer_id, status);
Index Column Order Matters
For composite indexes, order columns by selectivity (most selective first):
-- ❌ BAD: status (low selectivity) first
CREATE INDEX idx_bad ON orders(status, customer_id);
-- Only 3-5 statuses (pending/shipped/delivered)
-- ✅ GOOD: customer_id (high selectivity) first
CREATE INDEX idx_good ON orders(customer_id, status);
-- Thousands of customers, better filtering
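The selectivity heuristic can be made concrete: score each column as distinct values divided by total rows, and lead with the higher score. The counts would come from something like `SELECT COUNT(DISTINCT status), COUNT(*) FROM orders` (table and column names assumed).

```javascript
// Rough selectivity score: the fraction of rows that remain distinct.
// A higher score means a single value filters out more rows, so that
// column usually belongs first in a composite index.
function selectivity(distinctCount, totalRows) {
  if (totalRows === 0) return 0;
  return distinctCount / totalRows;
}

// 5 statuses vs 40,000 customers across 1,000,000 orders:
selectivity(5, 1_000_000);      // 0.000005 → poor leading column
selectivity(40_000, 1_000_000); // 0.04     → better leading column
```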
Covering Indexes (INCLUDE Columns)
Include non-filter columns in the index to avoid table lookups:
-- Query: SELECT product_name, price FROM products WHERE category_id = 5;
-- ❌ Without covering index: Index scan + table lookup
CREATE INDEX idx_products_category ON products(category_id);
-- ✅ With covering index: Index-only scan (faster!)
CREATE INDEX idx_products_category_covering
ON products(category_id) INCLUDE (product_name, price);
Partial Indexes
Index only the rows you query:
-- Only index active users (saves space, faster writes)
CREATE INDEX idx_users_active_email
ON users(email) WHERE status = 'active';
-- Query must match the WHERE condition to use index
SELECT * FROM users WHERE email = 'user@example.com' AND status = 'active';
Analyzing Query Performance
PostgreSQL: EXPLAIN ANALYZE
EXPLAIN ANALYZE
SELECT p.*, c.name AS category_name
FROM products p
JOIN categories c ON p.category_id = c.id
WHERE p.price > 50
ORDER BY p.created_at DESC
LIMIT 20;
Key things to look for:
| Indicator | Meaning | Action |
|---|---|---|
| Seq Scan | Full table scan (slow for large tables) | Add index |
| Index Scan | Using index (good!) | Monitor selectivity |
| Nested Loop | Join method (can be slow for large datasets) | Consider Hash Join |
| High cost | Query planner estimates expensive operation | Analyze statistics, add indexes |
| High actual time | Measured execution time | Focus optimization here |
MySQL: EXPLAIN
EXPLAIN SELECT * FROM orders WHERE customer_id = 123;
Look for:
- `type: ALL` → Full table scan (bad)
- `type: index` → Full index scan (acceptable for small tables)
- `type: range` → Index range scan (good)
- `type: ref` → Index lookup (very good)
- `type: const` → Primary key lookup (excellent)
MongoDB: explain()
db.orders.find({ customer_id: 123 }).explain('executionStats');
Check:
- `executionStats.totalDocsExamined` → should be close to `nReturned`
- `IXSCAN` in `winningPlan` → using index (good)
- `COLLSCAN` in `winningPlan` → collection scan (bad)
Query Optimization Patterns
1. Avoid SELECT *
-- ❌ BAD: Fetches all columns (more I/O, more network transfer)
SELECT * FROM users WHERE id = 123;
-- ✅ GOOD: Only fetch needed columns
SELECT id, email, name FROM users WHERE id = 123;
2. Use LIMIT
-- ❌ BAD: Fetches all matching rows (could be millions)
SELECT * FROM logs WHERE severity = 'INFO';
-- ✅ GOOD: Limit results, paginate if needed
SELECT * FROM logs WHERE severity = 'INFO'
ORDER BY created_at DESC LIMIT 100;
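Offset-based pagination degrades as pages get deep, because the database still scans and discards every skipped row. A hedged sketch of keyset (seek) pagination, which stays fast at any depth; the `logs(created_at, id)` columns and PostgreSQL-style `$n` placeholders are assumptions.

```javascript
// Keyset (seek) pagination: instead of OFFSET, filter past the last row
// already seen, so the index seeks directly to the cursor. Assumes an
// index on (created_at DESC, id DESC); id breaks ties on created_at.
function keysetPageQuery(cursor, pageSize = 100) {
  if (cursor == null) {
    // First page: no cursor yet.
    return {
      text: 'SELECT * FROM logs ORDER BY created_at DESC, id DESC LIMIT $1',
      values: [pageSize],
    };
  }
  return {
    text: `SELECT * FROM logs
           WHERE (created_at, id) < ($1, $2)
           ORDER BY created_at DESC, id DESC LIMIT $3`,
    values: [cursor.created_at, cursor.id, pageSize],
  };
}
```

The cursor is simply the `created_at` and `id` of the last row on the previous page.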
3. Avoid Functions in WHERE Clauses
-- ❌ BAD: Function prevents index usage
SELECT * FROM users WHERE LOWER(email) = 'user@example.com';
-- ✅ GOOD: Use functional index OR store lowercase
CREATE INDEX idx_users_email_lower ON users(LOWER(email));
-- Or simply:
SELECT * FROM users WHERE email = 'user@example.com';
4. Use EXISTS Instead of COUNT
-- ❌ BAD: Counts all rows (slow)
SELECT IF(COUNT(*) > 0, 'exists', 'not exists')
FROM orders WHERE customer_id = 123;
-- ✅ GOOD: Stops at first match
SELECT EXISTS(SELECT 1 FROM orders WHERE customer_id = 123 LIMIT 1);
5. Batch Inserts/Updates
-- ❌ BAD: 1000 separate INSERT statements
INSERT INTO logs (message) VALUES ('Log 1');
INSERT INTO logs (message) VALUES ('Log 2');
-- ... 998 more
-- ✅ GOOD: Single batch INSERT
INSERT INTO logs (message) VALUES
('Log 1'), ('Log 2'), ... ('Log 1000');
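A batch INSERT can be assembled in application code while staying parameterized (never concatenate values into the SQL string). This sketch assumes PostgreSQL-style `$n` placeholders and a single column; MySQL drivers use `?` placeholders instead.

```javascript
// Build one multi-row INSERT with numbered placeholders so the values
// travel separately from the SQL text (parameterized, injection-safe).
function batchInsertQuery(table, column, values) {
  const placeholders = values.map((_, i) => `($${i + 1})`).join(', ');
  return {
    text: `INSERT INTO ${table} (${column}) VALUES ${placeholders}`,
    values,
  };
}

// batchInsertQuery('logs', 'message', ['Log 1', 'Log 2']).text
// → "INSERT INTO logs (message) VALUES ($1), ($2)"
```

Note that `table` and `column` are interpolated directly, so they must come from trusted code, not user input.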
Connection Pooling
Database connections are expensive to create. Reuse them with connection pooling.
Node.js Example (pg)
const { Pool } = require('pg');
const pool = new Pool({
host: 'localhost',
database: 'mydb',
user: 'myuser',
password: 'mypassword',
max: 20, // Maximum pool size
idleTimeoutMillis: 30000, // Close idle connections after 30s
connectionTimeoutMillis: 2000, // Fail fast if no connection available
});
// Use the pool
async function getUser(id) {
const client = await pool.connect();
try {
const result = await client.query('SELECT * FROM users WHERE id = $1', [id]);
return result.rows[0];
} finally {
client.release(); // Return connection to pool
}
}
Pool Sizing
Rule of thumb: connections = (core_count * 2) + effective_spindle_count
For cloud databases:
- PostgreSQL: 10-20 connections per application server
- MySQL: 50-100 connections per application server
- MongoDB: 100-200 connections per application server
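The rule of thumb above, written out as a function. "Spindles" meant physical disks; for SSD or cloud storage a small constant (often 1) is a common substitution. Treat the result as a starting point to tune under load, not a hard rule.

```javascript
// connections = (core_count * 2) + effective_spindle_count
// With SSDs, an effective spindle count of 1 is a common assumption.
function suggestedPoolSize(coreCount, effectiveSpindleCount = 1) {
  return coreCount * 2 + effectiveSpindleCount;
}

// suggestedPoolSize(4) → 9: a reasonable starting `max` for the pg
// Pool above on a 4-core application server.
```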
Database-Specific Optimizations
PostgreSQL
Vacuum and Analyze
-- Reclaim space and update statistics
VACUUM ANALYZE users;
-- Aggressive vacuum (slower, more thorough)
VACUUM FULL users;
-- Auto-vacuum tuning (postgresql.conf)
autovacuum = on
autovacuum_naptime = 30s
autovacuum_vacuum_scale_factor = 0.05
Prepared Statements
// Reduces query planning overhead
const preparedQuery = {
name: 'get-user',
text: 'SELECT * FROM users WHERE id = $1',
};
const result = await client.query(preparedQuery, [123]);
MySQL
Query Cache (removed in MySQL 8.0; use Redis instead)
// Application-level caching with Redis
const redis = require('redis').createClient();
async function getUser(id) {
const cached = await redis.get(`user:${id}`);
if (cached) return JSON.parse(cached);
const user = await db.query('SELECT * FROM users WHERE id = ?', [id]);
await redis.setex(`user:${id}`, 3600, JSON.stringify(user));
return user;
}
Query Optimization
-- Show slow queries
SHOW VARIABLES LIKE 'slow_query_log';
SET GLOBAL slow_query_log = 'ON';
SET GLOBAL long_query_time = 1; -- Log queries > 1 second
MongoDB
Aggregation Pipeline Optimization
// ❌ BAD: $lookup (join) without indexes
db.orders.aggregate([
{ $lookup: { from: 'customers', localField: 'customer_id', foreignField: '_id', as: 'customer' } }
]);
// ✅ GOOD: Ensure indexes on both sides
db.orders.createIndex({ customer_id: 1 });
db.customers.createIndex({ _id: 1 }); // Already exists for _id
// Also: Use $match early to filter data before expensive operations
db.orders.aggregate([
{ $match: { status: 'pending' } }, // Filter first
{ $lookup: { ... } } // Then join
]);
Monitoring and Diagnostics
Key Metrics to Track
| Metric | What It Means | Target |
|---|---|---|
| Query response time | How long queries take | p95 < 100ms |
| Slow query count | Number of queries > threshold | < 1% of total |
| Connection pool usage | % of connections in use | < 80% |
| Cache hit ratio | % of queries served from cache | > 90% |
| Deadlock frequency | Database lock conflicts | Near zero |
| Disk I/O wait | Time spent waiting for disk | < 10% of CPU time |
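The p95 target in the table can be computed from raw timings with a nearest-rank percentile. A small sketch, assuming you already collect per-query response times in milliseconds:

```javascript
// Nearest-rank percentile: sort the samples and take the value at
// rank ceil(p/100 * n). Feed it the last N query response times (ms).
function percentile(values, p) {
  if (values.length === 0) return NaN;
  const sorted = [...values].sort((a, b) => a - b);
  const rank = Math.ceil((p / 100) * sorted.length);
  return sorted[Math.max(0, rank - 1)];
}

// With latencies of 1..100 ms, percentile(latencies, 95) → 95
```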
Tools
PostgreSQL:
- `pg_stat_statements` extension
- pgBadger log analyzer
- pg_top for real-time monitoring
MySQL:
- Slow query log
- Performance Schema
- MySQL Workbench query analyzer
MongoDB:
- Database Profiler (`db.setProfilingLevel(1)`)
- MongoDB Compass
- `mongostat` and `mongotop`
The Performance Tuning Checklist
1. Identify slow queries (logs, APM tools)
2. Analyze with EXPLAIN (understand execution plan)
3. Add indexes strategically (WHERE, JOIN, ORDER BY columns)
4. Eliminate N+1 queries (use JOINs or eager loading)
5. Fetch only needed columns (avoid SELECT *)
6. Use connection pooling (reuse connections)
7. Batch operations (bulk inserts/updates)
8. Cache frequently accessed data (Redis, Memcached)
9. Update statistics regularly (VACUUM ANALYZE, ANALYZE TABLE)
10. Monitor continuously (set up alerts for slow queries)
Conclusion
Database performance tuning is an iterative process. Start with the slow queries that impact users most, measure their performance, apply optimizations systematically, and verify improvements with real data.
Remember: a well-optimized database isn't just faster; it's cheaper to run, easier to scale, and more reliable under load. Invest time in tuning now, and you'll reap the benefits for years.
Ready to optimize your entire stack? Sign up for ScanlyApp and integrate performance testing into your development workflow.
