
Caching Strategies That Cut Response Times by 90%: A Practical Web Developer Guide

Effective caching can reduce database load by 90% and slash response times from seconds to milliseconds. Learn battle-tested caching strategies using Redis, CDN, and application-level caching—with code examples and decision frameworks.

15 min read

Your database is melting. Every page load triggers 20 queries. Response times hover around 800ms on a good day and spike to 3 seconds during traffic bursts. Your infrastructure costs climb as you scale up database instances. Sound familiar?

Then you implement caching. Suddenly:

  • Database queries drop by 95%
  • Response times plummet to 50ms
  • Your servers handle 10x the traffic
  • Infrastructure costs decrease

Caching is often called "the closest thing to magic in computer science"—it's one of the few optimization techniques that can deliver 10-100x performance improvements with relatively straightforward implementation. But caching isn't just "add Redis and hope for the best." The wrong caching strategy can make things worse, serving stale data, introducing race conditions, or consuming memory without providing benefits.

This guide covers battle-tested caching strategies for modern web applications, from browser caching to distributed Redis patterns, with practical code examples and decision frameworks to choose the right approach for your use case.

The Caching Hierarchy

Modern web applications have multiple caching layers:

graph TD
    A[User Request] --> B{Browser Cache}
    B -->|Miss| C{CDN Cache}
    C -->|Miss| D{Application Cache<br/>Redis/Memory}
    D -->|Miss| E{Database Query Cache}
    E -->|Miss| F[Database]

    B -->|Hit| G[Return Cached]
    C -->|Hit| G
    D -->|Hit| G
    E -->|Hit| G
    F --> G

    style B fill:#c5e1a5
    style C fill:#bbdefb
    style D fill:#fff9c4
    style E fill:#ffccbc
    style F fill:#f8bbd0

Each layer has different characteristics:

| Layer | Speed | Scope | Size Limit | Control | Best For |
| --- | --- | --- | --- | --- | --- |
| Browser Cache | Fastest (0ms) | Per-user | ~100MB | Low | Static assets, public content |
| CDN Cache | Very fast (< 50ms) | Global | Large | Medium | Static assets, public APIs |
| Application Cache (in-memory) | Fast (< 1ms) | Per-server | Limited by RAM | High | Server-side computations |
| Application Cache (Redis) | Fast (< 5ms) | Shared | Large | High | Session data, computed results |
| Database Query Cache | Medium (10-50ms) | Per-DB | Moderate | Low | Repeated queries |

Core Caching Patterns

1. Cache-Aside (Lazy Loading)

The application manages the cache explicitly. On read: check cache, if miss, fetch from database, populate cache.

graph LR
    A[Request Data] --> B{Check Cache}
    B -->|Hit| C[Return Cached Data]
    B -->|Miss| D[Query Database]
    D --> E[Store in Cache]
    E --> F[Return Data]

    style B fill:#c5e1a5
    style D fill:#ffccbc

Implementation:

// cache-aside.ts
import { Redis } from 'ioredis';

const redis = new Redis({
  host: process.env.REDIS_HOST,
  port: 6379,
  db: 0,
});

interface User {
  id: string;
  name: string;
  email: string;
}

async function getUserById(userId: string): Promise<User | null> {
  const cacheKey = `user:${userId}`;

  // 1. Try cache first
  const cached = await redis.get(cacheKey);
  if (cached) {
    console.log('Cache hit');
    return JSON.parse(cached);
  }

  // 2. Cache miss - fetch from database
  console.log('Cache miss - fetching from DB');
  // `db` is assumed to be a query helper that returns a single row (or null)
  const user = await db.query('SELECT * FROM users WHERE id = $1', [userId]);

  if (user) {
    // 3. Store in cache with expiration
    await redis.setex(cacheKey, 3600, JSON.stringify(user)); // 1 hour TTL
  }

  return user;
}

// Usage
const user = await getUserById('user_123');

Pros:

  • Simple to implement and understand
  • Works well for read-heavy workloads
  • Cache failures don't break the application

Cons:

  • Cache miss penalty (extra latency)
  • Potential cache stampede on popular items
  • Stale data possible if not invalidated

2. Write-Through Cache

Data is written to cache and database simultaneously. Cache is always consistent with the database.

// write-through.ts
async function updateUser(userId: string, updates: Partial<User>): Promise<User> {
  const cacheKey = `user:${userId}`;

  // 1. Update database
  const updatedUser = await db.query('UPDATE users SET name = $1, email = $2 WHERE id = $3 RETURNING *', [
    updates.name,
    updates.email,
    userId,
  ]);

  // 2. Immediately update cache (or invalidate)
  if (updatedUser) {
    await redis.setex(cacheKey, 3600, JSON.stringify(updatedUser));
  }

  return updatedUser;
}

Pros:

  • Cache always consistent
  • Reduces cache miss rate

Cons:

  • Write latency (must write to both)
  • Cache pollution (writing data that's never read)

3. Write-Behind (Write-Back) Cache

Write to the cache immediately and persist to the database asynchronously, maximizing write performance.

// write-behind.ts
import { Queue, Worker } from 'bullmq';

const writeQueue = new Queue('database-writes', {
  connection: { host: 'redis', port: 6379 },
});

async function updateUserWriteBehind(userId: string, updates: Partial<User>): Promise<void> {
  const cacheKey = `user:${userId}`;

  // 1. Update cache immediately
  const currentUser = JSON.parse((await redis.get(cacheKey)) || '{}');
  const updatedUser = { ...currentUser, ...updates };
  await redis.setex(cacheKey, 3600, JSON.stringify(updatedUser));

  // 2. Queue database write (async)
  await writeQueue.add('update-user', {
    userId,
    updates,
    timestamp: Date.now(),
  });
}

// Background worker persists to database
const worker = new Worker(
  'database-writes',
  async (job) => {
    const { userId, updates } = job.data;

    await db.query('UPDATE users SET name = $1, email = $2 WHERE id = $3', [updates.name, updates.email, userId]);
  },
  {
    connection: { host: 'redis', port: 6379 },
  },
);

Pros:

  • Extremely fast writes
  • Can batch database writes

Cons:

  • Risk of data loss if cache fails
  • Complex to implement correctly
  • Eventual consistency

4. Read-Through Cache

Cache sits between application and database. Application only talks to cache; cache handles database fetches.

// read-through.ts
class ReadThroughCache<T> {
  constructor(
    private redis: Redis,
    private loader: (key: string) => Promise<T | null>,
    private ttl: number = 3600,
  ) {}

  async get(key: string): Promise<T | null> {
    // Check cache
    const cached = await this.redis.get(key);
    if (cached) {
      return JSON.parse(cached);
    }

    // Cache miss - load from source
    const value = await this.loader(key);

    if (value) {
      // Populate cache
      await this.redis.setex(key, this.ttl, JSON.stringify(value));
    }

    return value;
  }
}

// Usage
const userCache = new ReadThroughCache<User>(
  redis,
  async (userId) => {
    return await db.query('SELECT * FROM users WHERE id = $1', [userId]);
  },
  3600,
);

const user = await userCache.get('user:123');

Advanced Caching Strategies

Cache Warming

Pre-populate cache with frequently accessed data before traffic arrives:

// cache-warming.ts
import cron from 'node-cron';

async function warmPopularUserCache() {
  console.log('Starting cache warming...');

  // Get top 1000 most active users
  const popularUsers = await db.query(`
    SELECT user_id, COUNT(*) as activity_count
    FROM user_activity
    WHERE created_at > NOW() - INTERVAL '24 hours'
    GROUP BY user_id
    ORDER BY activity_count DESC
    LIMIT 1000
  `);

  // Pre-load into cache
  const promises = popularUsers.map(async ({ user_id }) => {
    const user = await db.query('SELECT * FROM users WHERE id = $1', [user_id]);
    if (user) {
      await redis.setex(`user:${user_id}`, 3600, JSON.stringify(user));
    }
  });

  await Promise.all(promises);
  console.log(`Warmed cache with ${popularUsers.length} users`);
}

// Run cache warming daily at 5am (before traffic peak)
cron.schedule('0 5 * * *', warmPopularUserCache);

// Also warm on application startup
warmPopularUserCache();

Cache Stampede Prevention

When a popular cache key expires, multiple requests might simultaneously try to refresh it, overwhelming the database.

Solution: Locking and Early Recomputation

// cache-stampede-prevention.ts
async function getWithStampedePrevention<T>(key: string, loader: () => Promise<T>, ttl: number = 3600): Promise<T> {
  const lockKey = `lock:${key}`;
  const lockTTL = 10; // 10 second lock

  // Try to get from cache
  const cached = await redis.get(key);
  if (cached) {
    return JSON.parse(cached);
  }

  // Acquire lock
  const lockAcquired = await redis.set(lockKey, '1', 'EX', lockTTL, 'NX');

  if (lockAcquired) {
    // We got the lock - we're responsible for loading
    try {
      const value = await loader();
      await redis.setex(key, ttl, JSON.stringify(value));
      return value;
    } finally {
      await redis.del(lockKey);
    }
  } else {
    // Someone else is loading - wait briefly and retry
    // (a production version would cap retries or fall back to a stale value)
    await new Promise((resolve) => setTimeout(resolve, 100));
    return getWithStampedePrevention(key, loader, ttl);
  }
}

// Usage
const user = await getWithStampedePrevention(
  'user:123',
  () => db.query('SELECT * FROM users WHERE id = $1', ['123']),
  3600,
);

Probabilistic Early Expiration

Refresh cache before it expires for popular items:

// probabilistic-early-refresh.ts
async function getWithProbabilisticRefresh<T>(key: string, loader: () => Promise<T>, ttl: number = 3600): Promise<T> {
  const cached = await redis.get(key);
  const ttlRemaining = await redis.ttl(key);

  if (cached) {
    // Probabilistically refresh before expiration
    const delta = ttl - ttlRemaining;
    const probability = delta / ttl;

    // As key gets older, higher chance of refresh
    if (Math.random() < probability) {
      // Refresh asynchronously (don't wait)
      loader().then((value) => {
        redis.setex(key, ttl, JSON.stringify(value));
      });
    }

    return JSON.parse(cached);
  }

  // Cache miss - load and cache
  const value = await loader();
  await redis.setex(key, ttl, JSON.stringify(value));
  return value;
}

Multi-Tier Caching

Combine in-memory (L1) and Redis (L2) for best performance:

// multi-tier-cache.ts
import NodeCache from 'node-cache';

const l1Cache = new NodeCache({
  stdTTL: 60, // 1 minute in-memory
  checkperiod: 120,
  useClones: false, // For performance
});

async function getFromL1L2Cache<T>(key: string, loader: () => Promise<T>): Promise<T> {
  // L1 check (in-memory)
  const l1Value = l1Cache.get<T>(key);
  if (l1Value !== undefined) {
    console.log('L1 cache hit');
    return l1Value;
  }

  // L2 check (Redis)
  const l2Value = await redis.get(key);
  if (l2Value) {
    console.log('L2 cache hit');
    const parsed = JSON.parse(l2Value);

    // Populate L1
    l1Cache.set(key, parsed);
    return parsed;
  }

  // Full cache miss
  console.log('Cache miss - loading from source');
  const value = await loader();

  // Populate both layers
  l1Cache.set(key, value);
  await redis.setex(key, 3600, JSON.stringify(value));

  return value;
}

// Usage
const product = await getFromL1L2Cache('product:123', () => db.query('SELECT * FROM products WHERE id = $1', ['123']));

Cache Invalidation Strategies

"There are only two hard things in Computer Science: cache invalidation and naming things." — Phil Karlton

Time-Based Expiration (TTL)

Simplest approach: let cache entries expire after a fixed time:

// TTL-based expiration
await redis.setex('user:123', 300, JSON.stringify(user)); // 5 minutes

Pros: Simple; staleness is bounded by the TTL

Cons: The TTL is arbitrary; data can be inconsistent until it expires
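One refinement worth noting: if many keys are written with the same fixed TTL (after a deploy or a cache-warming run), they all expire at the same instant and produce a miss storm. Adding random jitter to each TTL spreads expirations out. A minimal sketch (the function name and the 10% default are illustrative):

```typescript
// Return a TTL randomized within +/- jitterRatio of the base value, so keys
// written together do not all expire together
function ttlWithJitter(baseSeconds: number, jitterRatio = 0.1): number {
  const spread = baseSeconds * jitterRatio;
  // Uniform jitter in [-spread, +spread], rounded to whole seconds
  return Math.round(baseSeconds + (Math.random() * 2 - 1) * spread);
}
```

Pass the result anywhere a TTL is expected, e.g. `redis.setex(cacheKey, ttlWithJitter(3600), JSON.stringify(user))`.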

Event-Based Invalidation

Invalidate cache when source data changes:

// event-based-invalidation.ts
import { EventEmitter } from 'events';

const cacheInvalidator = new EventEmitter();

// Invalidate on user update
async function updateUser(userId: string, updates: Partial<User>) {
  const updatedUser = await db.query('UPDATE users SET name = $1, email = $2 WHERE id = $3 RETURNING *', [
    updates.name,
    updates.email,
    userId,
  ]);

  // Invalidate all related caches
  const cacheKeys = [`user:${userId}`, `user:${userId}:profile`, `user:${userId}:settings`, `user:${userId}:projects`];

  await redis.del(...cacheKeys);

  // Emit event for distributed invalidation
  cacheInvalidator.emit('user:updated', userId);

  return updatedUser;
}
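Note that EventEmitter only reaches listeners in the same Node.js process. For multi-server deployments, invalidations can be broadcast over Redis pub/sub so every instance evicts its own local copies. A sketch under stated assumptions: the channel name is arbitrary, and `Publisher`/`Subscriber` are minimal structural stand-ins for ioredis clients (an ioredis connection used for subscribing must be dedicated to subscribe mode):

```typescript
// Minimal structural stand-ins for ioredis clients, so the sketch is self-contained
interface Publisher {
  publish(channel: string, message: string): Promise<number>;
}
interface Subscriber {
  subscribe(channel: string): void;
  on(event: 'message', handler: (channel: string, message: string) => void): void;
}

const INVALIDATION_CHANNEL = 'cache:invalidate'; // any name shared by all servers

// Writer side: broadcast the keys that just changed
async function publishInvalidation(pub: Publisher, keys: string[]): Promise<void> {
  await pub.publish(INVALIDATION_CHANNEL, JSON.stringify(keys));
}

// Each server runs this once at startup: evict broadcast keys from its local L1
function subscribeToInvalidations(sub: Subscriber, l1Cache: Map<string, unknown>): void {
  sub.subscribe(INVALIDATION_CHANNEL);
  sub.on('message', (_channel, message) => {
    for (const key of JSON.parse(message) as string[]) {
      l1Cache.delete(key);
    }
  });
}
```

This pairs naturally with the multi-tier cache described later: shared Redis entries are deleted directly, while the pub/sub message clears each server's in-memory layer.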

Tag-Based Invalidation

Group related cache entries by tags:

// tag-based-invalidation.ts
class TaggedCache {
  constructor(private redis: Redis) {}

  async set(key: string, value: any, ttl: number, tags: string[]) {
    // Store the value
    await this.redis.setex(key, ttl, JSON.stringify(value));

    // Associate with tags
    const tagPromises = tags.map((tag) => this.redis.sadd(`tag:${tag}`, key));
    await Promise.all(tagPromises);
  }

  async invalidateByTag(tag: string) {
    // Get all keys with this tag
    const keys = await this.redis.smembers(`tag:${tag}`);

    if (keys.length > 0) {
      // Delete all tagged keys
      await this.redis.del(...keys);
    }

    // Delete the tag set itself
    await this.redis.del(`tag:${tag}`);
  }
}

// Usage
const cache = new TaggedCache(redis);

await cache.set('user:123', user, 3600, ['user', 'user_123', 'org_456']);
await cache.set('project:789', project, 3600, ['project', 'user_123', 'org_456']);

// Invalidate all cache entries for organization 456
await cache.invalidateByTag('org_456');

Cache Versioning

Use version numbers in cache keys to invalidate without deletion:

// cache-versioning.ts
let cacheVersion = 1;

function getCacheKey(type: string, id: string): string {
  return `v${cacheVersion}:${type}:${id}`;
}

async function invalidateAllCaches() {
  // Increment version - old caches become inaccessible
  cacheVersion++;

  // Store new version in Redis for distributed systems
  await redis.set('cache:version', cacheVersion);
}

// On app startup, get current version
const storedVersion = await redis.get('cache:version');
cacheVersion = storedVersion ? parseInt(storedVersion, 10) : 1;
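Bumping the global version throws away every cached entry at once. A finer-grained variant keeps one version counter per namespace (e.g. per user), so a bump orphans only that namespace's keys. The sketch below uses an in-memory map for clarity; in production the counters would live in Redis (an `INCR` on a key like `user:123:version`) so all servers agree:

```typescript
// Per-namespace cache versioning: bumping a namespace's version makes all of
// its keys unreachable without touching any other namespace
class NamespaceVersions {
  private versions = new Map<string, number>();

  // Build a versioned cache key, e.g. "user:123:v1:profile"
  key(namespace: string, suffix: string): string {
    const v = this.versions.get(namespace) ?? 1;
    return `${namespace}:v${v}:${suffix}`;
  }

  bump(namespace: string): void {
    // Old keys are never deleted explicitly; they age out via their TTLs
    this.versions.set(namespace, (this.versions.get(namespace) ?? 1) + 1);
  }
}
```

Because orphaned keys are only reclaimed by expiration, every entry written under this scheme still needs a TTL.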

CDN and Browser Caching

HTTP Cache Headers

// express-cache-headers.ts
import crypto from 'crypto';
import express from 'express';

const app = express();

// Static assets: aggressive caching
app.use(
  '/static',
  express.static('public', {
    maxAge: '1y', // 1 year
    immutable: true,
  }),
);

// API responses: conditional caching
app.get('/api/products', (req, res) => {
  res.set({
    'Cache-Control': 'public, max-age=300', // 5 minutes
    ETag: generateETag(products),
    Vary: 'Accept-Encoding',
  });

  res.json(products);
});

// User-specific data: no caching
app.get('/api/user/profile', (req, res) => {
  res.set({
    'Cache-Control': 'private, no-cache, no-store, must-revalidate',
    Pragma: 'no-cache',
    Expires: '0',
  });

  res.json(userProfile);
});

// Conditional requests (ETags)
function generateETag(data: any): string {
  const hash = crypto.createHash('md5').update(JSON.stringify(data)).digest('hex');
  return `"${hash}"`;
}

app.get('/api/data', (req, res) => {
  const data = getData();
  const etag = generateETag(data);

  // Check if client has current version
  if (req.headers['if-none-match'] === etag) {
    res.status(304).end(); // Not Modified
    return;
  }

  res.set('ETag', etag);
  res.json(data);
});

Cache-Control Directive Reference

| Directive | Meaning | Use Case |
| --- | --- | --- |
| `public` | Can be cached by any cache | Public, non-sensitive content |
| `private` | Cache in browser only, not CDN | User-specific data |
| `no-cache` | Must revalidate on every use | Frequently changing data |
| `no-store` | Never cache | Sensitive data |
| `max-age=300` | Cache for 300 seconds | Moderately fresh data |
| `s-maxage=3600` | CDN cache for 1 hour | Different TTL for CDN |
| `immutable` | Never revalidate | Fingerprinted assets |
| `must-revalidate` | Cache must revalidate when stale | Ensure freshness |

Stale-While-Revalidate

Serve stale content while fetching fresh data in background:

// stale-while-revalidate.ts
app.get('/api/slow-endpoint', async (req, res) => {
  res.set({
    'Cache-Control': 'max-age=60, stale-while-revalidate=300',
  });

  // Takes 2 seconds to compute
  const data = await expensiveComputation();

  res.json(data);
});

// Client gets:
// - First request: waits 2 seconds
// - Within 60s: instant (cached)
// - 60s-360s: instant (stale) + background refresh
// - After 360s: waits 2 seconds (stale expired)
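The header above delegates stale-while-revalidate to the browser and CDN, but the same idea works inside the application: once a cached value passes a soft TTL, serve it immediately and refresh it in the background. A minimal sketch backed by a Map for clarity (in production the store would be Redis, with the timestamp serialized alongside the value):

```typescript
// Application-level stale-while-revalidate: reads within softTtlMs are fresh;
// older reads return the stale value instantly and trigger one background refresh
interface Entry<T> {
  value: T;
  storedAt: number;
  refreshing?: boolean;
}

class SwrCache<T> {
  private store = new Map<string, Entry<T>>();

  constructor(
    private softTtlMs: number,
    private loader: (key: string) => Promise<T>,
  ) {}

  async get(key: string): Promise<T> {
    const entry = this.store.get(key);
    if (!entry) {
      const value = await this.loader(key); // cold miss: caller waits
      this.store.set(key, { value, storedAt: Date.now() });
      return value;
    }
    if (Date.now() - entry.storedAt > this.softTtlMs && !entry.refreshing) {
      entry.refreshing = true; // stale: refresh in background, serve old value now
      this.loader(key)
        .then((value) => this.store.set(key, { value, storedAt: Date.now() }))
        .catch(() => { entry.refreshing = false; }); // keep serving stale on failure
    }
    return entry.value;
  }
}
```

The design choice mirrors the HTTP semantics: only the very first request (and requests after a hard eviction) pay the full load latency; everyone else gets a cached answer at most one refresh interval old.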

Caching Strategy Decision Tree

graph TD
    A[Need to Cache?] --> B{Data Changes?}
    B -->|Rarely| C[Long TTL<br/>1 hour - 1 day]
    B -->|Occasionally| D[Medium TTL<br/>5-30 minutes]
    B -->|Frequently| E[Short TTL<br/>30-300 seconds]
    B -->|Real-time| F[No Cache or<br/>Stale-While-Revalidate]

    C --> G{Shareable?}
    D --> G
    E --> G

    G -->|Yes| H[Redis/CDN]
    G -->|No| I[In-Memory/Browser]

    H --> J{Invalidation Needed?}
    J -->|Yes| K[Event-Based Invalidation]
    J -->|No| L[TTL Only]

    style C fill:#c5e1a5
    style D fill:#fff9c4
    style E fill:#ffccbc
    style K fill:#bbdefb

Performance Impact: Before and After Caching

Real-world example from a typical web application:

| Metric | Before Caching | After Caching | Improvement |
| --- | --- | --- | --- |
| Avg Response Time | 850ms | 45ms | 18.9x faster |
| P95 Response Time | 2.3s | 120ms | 19.2x faster |
| Database Queries/sec | 1,250 | 85 | 93% reduction |
| Max Concurrent Users | 500 | 5,000+ | 10x capacity |
| Infrastructure Cost | $2,800/mo | $800/mo | 71% savings |

Common Pitfalls and How to Avoid Them

1. cache.set() Without TTL

// ❌ BAD: No TTL - cache grows forever
await redis.set('user:123', JSON.stringify(user));

// ✅ GOOD: Always set TTL
await redis.setex('user:123', 3600, JSON.stringify(user));

2. Caching Errors

// ❌ BAD: Caching the result unconditionally - a failure caches null
let data;
try {
  data = await fetchData();
} catch (error) {
  data = null;
}
await redis.setex('data', 300, JSON.stringify(data)); // may cache "null" for 5 minutes!
return data;

// ✅ GOOD: Only cache success
const data = await fetchData();
if (data) {
  await redis.setex('data', 300, JSON.stringify(data));
}
return data;

3. Thundering Herd

// ❌ BAD: All requests refresh simultaneously
const data = await redis.get('popular:data');
if (!data) {
  // 1000 concurrent requests all fetch from DB
  return await expensiveQuery();
}

// ✅ GOOD: Use locking (see stampede prevention above)
return await getWithStampedePrevention('popular:data', expensiveQuery);

Monitoring Cache Effectiveness

// cache-metrics.ts
import { Counter, Histogram } from 'prom-client';

const cacheHits = new Counter({
  name: 'cache_hits_total',
  help: 'Total number of cache hits',
  labelNames: ['cache_type', 'key_prefix'],
});

const cacheMisses = new Counter({
  name: 'cache_misses_total',
  help: 'Total number of cache misses',
  labelNames: ['cache_type', 'key_prefix'],
});

const cacheLatency = new Histogram({
  name: 'cache_operation_duration_seconds',
  help: 'Cache operation latency',
  labelNames: ['operation', 'cache_type'],
  buckets: [0.001, 0.005, 0.01, 0.05, 0.1, 0.5],
});

async function getWithMetrics<T>(key: string, loader: () => Promise<T>, cacheType: string = 'redis'): Promise<T> {
  const keyPrefix = key.split(':')[0];
  const timer = cacheLatency.startTimer({ operation: 'get', cache_type: cacheType });

  const cached = await redis.get(key);
  timer();

  if (cached) {
    cacheHits.inc({ cache_type: cacheType, key_prefix: keyPrefix });
    return JSON.parse(cached);
  }

  cacheMisses.inc({ cache_type: cacheType, key_prefix: keyPrefix });

  const value = await loader();
  await redis.setex(key, 3600, JSON.stringify(value));

  return value;
}

Key Metrics to Track:

  • Hit Rate: hits / (hits + misses) — should be > 80%
  • Miss Rate: misses / (hits + misses) — should be < 20%
  • Eviction Rate: How often cache is full
  • Average TTL: How long items stay cached
  • Cache Latency: p50, p95, p99 response times
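The hit-rate thresholds above are straightforward to compute from the Prometheus counters, or from any raw hit/miss counts. A small helper (the function names are illustrative):

```typescript
// Hit rate as a fraction in [0, 1]; guard against division by zero on a cold cache
function hitRate(hits: number, misses: number): number {
  const total = hits + misses;
  return total === 0 ? 0 : hits / total;
}

// Flag caches that fall below the 80% hit-rate target mentioned above
function cacheIsHealthy(hits: number, misses: number): boolean {
  return hitRate(hits, misses) >= 0.8;
}
```

An alerting rule on `cacheIsHealthy` (or its PromQL equivalent over the `cache_hits_total` and `cache_misses_total` counters) catches regressions such as a deploy that silently changed key names.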

Conclusion

Effective caching requires understanding:

  1. What to cache: High-read, low-write data with acceptable staleness
  2. Where to cache: Choose the right layer (browser, CDN, app, database)
  3. How long to cache: Balance freshness vs. performance
  4. When to invalidate: Event-based, time-based, or tag-based

The most successful caching strategies combine multiple approaches:

  • Browser/CDN caching for static assets (aggressive)
  • Application caching for computed data (moderate)
  • Database query caching as last resort
  • Proper invalidation to balance performance and freshness

Start simple with cache-aside and TTL-based expiration, then layer in advanced strategies as needed. Monitor cache effectiveness and iterate based on actual hit rates and performance metrics.

Ready to supercharge your application with intelligent caching strategies? Sign up for ScanlyApp and get comprehensive performance monitoring and caching recommendations integrated into your development workflow.

Related articles: see our 2026 web performance guide, where caching is a key pillar; our guide to testing caching rules and verifying cache invalidation; and our TTFB deep dive, where caching delivers the largest single improvement.
