Performance

Caching Strategies & Architecture

The difference between a sluggish application and a lightning-fast experience lies in how effectively you cache data.

Why Cache?

Caching trades memory usage for reduced latency and lower database load.

In a distributed system, fetching data from a database is slow (disk I/O, network). Fetching from a cache (memory) is orders of magnitude faster. A proper caching layer protects your database from being overwhelmed by read traffic.

🚀 Key Benefits
  • Latency: ~1ms (Redis) vs ~100ms (Database).
  • Throughput: Serve 100k+ req/sec from memory.
  • Cost: Reduce DB provisioned IOPS.
📍 Cache Locations
  • Browser/Client: HTTP Cache (Images, CSS).
  • CDN (Edge): Geographic caching of static assets.
  • Application: Local server memory (heap).
  • Distributed: Shared Redis/Memcached cluster.

Caching Patterns

How do we load data into the cache? Choosing the right pattern is critical for consistency.

1. Cache-Aside (Lazy Loading)

The application is responsible for reading from and writing to the cache; the cache never talks to the database itself.

  1. App requests data from Cache.
  2. Miss? App reads from Database.
  3. App writes data to Cache.
  4. App returns data.
Pros: Resilient to cache failure. Only requested data is cached.
Cons: Initial latency (3 trips). Data can become stale.

2. Write-Through

The application writes to the Cache, and the Cache writes to the Database synchronously.

Pros: Data in cache is never stale.
Cons: Higher write latency (two writes). Cold cache on startup (needs cache warming).
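The write-through flow above can be sketched in a few lines. This is a minimal in-memory illustration, not a real cache client: the `cache` and `db` dicts are stand-ins for Redis and a database, and `write_through`/`read` are hypothetical helper names.

```python
# Write-through sketch: every write updates the cache AND synchronously
# persists to the backing store before returning to the caller.
cache = {}
db = {}

def write_through(key, value):
    cache[key] = value  # 1. update the cache
    db[key] = value     # 2. synchronously write to the "database";
                        #    both writes complete before we return

def read(key):
    # Reads always come from the cache, which is never stale
    # relative to the database (at the cost of slower writes).
    return cache.get(key)

write_through("user:101", {"name": "Brijesh"})
```

The trade-off is visible in the code: `write_through` does two writes on the critical path, which is exactly the "higher write latency" cost noted above.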

3. Write-Back (Write-Behind)

App writes only to Cache (fast). Cache asynchronously writes to Database later.

Risk: If the Cache crashes before syncing, DATA IS LOST. Use only for non-critical data (e.g., like counts, analytics).
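Write-back can be sketched with a queue and a background flusher. This is a toy single-process model (a real system would use Redis plus a worker); the queue, `write_back`, and `flush_worker` names are illustrative, and the `write_queue.join()` at the end exists only so the demo is deterministic.

```python
import queue
import threading

cache = {}
db = {}
write_queue = queue.Queue()

def write_back(key, value):
    cache[key] = value             # fast path: memory only
    write_queue.put((key, value))  # defer the database write

def flush_worker():
    # Background flusher: drains queued writes into the "database".
    # If the process dies before this runs, those writes are lost --
    # the data-loss risk described above.
    while True:
        key, value = write_queue.get()
        db[key] = value
        write_queue.task_done()

threading.Thread(target=flush_worker, daemon=True).start()

write_back("likes:42", 1000)  # returns immediately
write_queue.join()            # demo only: wait for the async flush
```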

Eviction Policies

Cache memory is expensive and finite. When full, what do we delete?

| Policy | Description | Use Case |
| --- | --- | --- |
| LRU (Least Recently Used) | Removes items that haven't been accessed for the longest time. | Most common. Good for social media (recent content is hot). |
| LFU (Least Frequently Used) | Removes items with the fewest hits. | Good when popularity is stable (popular items stay popular). |
| FIFO (First In, First Out) | Removes the oldest inserted item. | Simple, but performs poorly (evicts hot-but-old data). |
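LRU is simple enough to sketch with `collections.OrderedDict`, which remembers insertion order and lets us move a key to the end on every access. This is an illustration of the policy, not of how Redis implements it (Redis uses an approximated LRU).

```python
from collections import OrderedDict

class LRUCache:
    """Tiny LRU cache: when full, evict the least recently used key."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.data = OrderedDict()  # ordered oldest -> newest access

    def get(self, key):
        if key not in self.data:
            return None
        self.data.move_to_end(key)  # mark as most recently used
        return self.data[key]

    def put(self, key, value):
        if key in self.data:
            self.data.move_to_end(key)
        self.data[key] = value
        if len(self.data) > self.capacity:
            self.data.popitem(last=False)  # evict the LRU entry

lru = LRUCache(capacity=2)
lru.put("a", 1)
lru.put("b", 2)
lru.get("a")      # "a" is now the most recently used
lru.put("c", 3)   # cache is full: evicts "b", not "a"
```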

Common Pitfalls

🛑 Thundering Herd (Cache Stampede)

When a popular cache key expires, thousands of concurrent requests can all see a MISS at the same moment and query the database simultaneously, overwhelming it.

Solution:

  • Mutex Lock: Only let 1 request query the DB; others wait.
  • Probabilistic Early Expiration: Refresh the key slightly before it actually expires.
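The mutex-lock solution can be demonstrated in-process with `threading.Lock` and double-checked locking: only the first thread to miss rebuilds the key, and the waiters re-check the cache instead of hitting the database. The `cache` dict, `rebuild_lock`, and `load_from_db` are stand-in names; in a distributed setup you would hold the lock in Redis itself (e.g., `SET key value NX EX ...`) rather than in process memory.

```python
import threading

cache = {}
rebuild_lock = threading.Lock()
db_queries = 0  # counts how many requests actually reach the "DB"

def load_from_db(key):
    global db_queries
    db_queries += 1  # simulated expensive database read
    return f"value-for-{key}"

def get_with_lock(key):
    # Fast path: a cache hit needs no locking at all.
    value = cache.get(key)
    if value is not None:
        return value
    # Miss: serialize the rebuild so only one thread queries the DB.
    with rebuild_lock:
        value = cache.get(key)  # double-check after acquiring the lock
        if value is None:
            value = load_from_db(key)
            cache[key] = value
    return value

# 50 concurrent requests for the same cold key
threads = [threading.Thread(target=get_with_lock, args=("hot",))
           for _ in range(50)]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

After the simulation, `db_queries` is 1: the lock plus the double-check turned 50 simultaneous misses into a single database read.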

❄️ Cache Avalanche

When many keys expire at the same instant (e.g., identical fixed TTLs set during a bulk load or deploy), the database absorbs a burst of misses all at once.

Solution: Add "Jitter" (randomness) to TTLs. E.g., `TTL = 300s + rand(0-60s)`.
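The jitter formula above translates directly to code. A tiny helper (the name `ttl_with_jitter` is illustrative) spreads expirations across a window instead of a single instant:

```python
import random

def ttl_with_jitter(base=300, jitter=60):
    # Spread expirations over [base, base + jitter] seconds so keys
    # written together don't all expire in the same instant.
    return base + random.randint(0, jitter)

# Usage with redis-py: r.setex(cache_key, ttl_with_jitter(), payload)
```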

Python Implementation: Cache-Aside with Redis

A minimal working example using `redis-py`, covering the "Check -> DB -> Set" logic with a TTL (the database read is simulated).

import redis
import json
import time

# Connect to Redis
r = redis.Redis(host='localhost', port=6379, decode_responses=True)

def get_user_profile(user_id):
    cache_key = f"user:{user_id}"
    
    # 1. Try Cache
    cached_data = r.get(cache_key)
    if cached_data:
        print("Cache HIT")
        return json.loads(cached_data)
    
    # 2. Cache MISS - Fetch User from DB (Simulated)
    print("Cache MISS - Reading DB...")
    # db_data = db.query("SELECT * FROM users WHERE id = ?", user_id)
    time.sleep(0.1) # Simulate DB latency
    db_data = {"id": user_id, "name": "Brijesh", "role": "Architect"}
    
    # 3. Write to Cache with a 60s TTL (Time To Live)
    #    so stale data eventually ages out
    r.setex(cache_key, 60, json.dumps(db_data))
    
    return db_data

# Simulation
get_user_profile(101) # Miss -> DB -> Cache
get_user_profile(101) # Hit -> Cache

Summary

  • Use **Cache-Aside** for general read-heavy apps.
  • Use **LRU** eviction for most use cases.
  • Always set a **TTL** (Time To Live) to expire data.
  • Add **Jitter** to TTLs to prevent Avalanches.