Why Cache?
Caching trades memory usage for reduced latency and lower database load.
In a distributed system, fetching data from a database is slow (disk I/O, network). Fetching from a cache (memory) is orders of magnitude faster. A proper caching layer protects your database from being overwhelmed by read traffic.
🚀 Key Benefits
- ✓ Latency: ~1ms for a Redis hit vs. tens of milliseconds for a typical database query.
- ✓ Throughput: Serve 100k+ req/sec from memory.
- ✓ Cost: Reduce DB provisioned IOPS.
📍 Cache Locations
- Browser/Client: HTTP Cache (Images, CSS).
- CDN (Edge): Geographic caching of static assets.
- Application: Local server memory (heap).
- Distributed: Shared Redis/Memcached cluster.
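The application-level (local heap) cache above can be as simple as in-process memoization. A minimal sketch using Python's built-in `functools.lru_cache`; the `load_config` function and its return value are illustrative stand-ins for any expensive lookup:

```python
from functools import lru_cache

@lru_cache(maxsize=256)
def load_config(name):
    # The expensive lookup (DB/file read) runs only on the first call;
    # repeat calls with the same argument are served from local memory.
    return {"name": name, "retries": 3}

load_config("payments")           # miss: executes the function
load_config("payments")           # hit: served from the in-process cache
print(load_config.cache_info())   # hits=1, misses=1
```

This is the fastest cache location of all (no network hop), but each server process holds its own copy, which is why shared state usually goes in a distributed cache instead.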
Caching Patterns
How do we load data into the cache? Choosing the right pattern is critical for consistency.
1. Cache-Aside (Lazy Loading)
The application is responsible for reading from and writing to the cache.
- App requests data from Cache.
- Miss? App reads from Database.
- App writes data to Cache.
- App returns data.
Cons: A miss costs three trips (cache check, DB read, cache write). Cached data can go stale until its TTL expires.
2. Write-Through
The application writes to the Cache, and the Cache writes to the Database synchronously.
Cons: Higher write latency (two writes). Cold cache on startup (needs cache warming).
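The write-through flow can be sketched with the cache and database stubbed as plain dictionaries (no real Redis or DB involved; all names are illustrative):

```python
cache = {}
database = {}

def write_through(key, value):
    # Both writes happen synchronously; the caller waits for each,
    # which is why write latency is higher than in cache-aside.
    cache[key] = value       # 1. write to the cache
    database[key] = value    # 2. write to the backing store

def read(key):
    # Reads always hit the cache, which is never stale relative to the DB.
    return cache.get(key)

write_through("user:1", {"name": "Ada"})
print(read("user:1") == database["user:1"])  # True: cache and DB agree
```

The payoff for the slower write is that reads never see a value the database doesn't also have.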
3. Write-Back (Write-Behind)
App writes only to the Cache (fast). The cache asynchronously writes to the Database later.
Cons: Risk of data loss if the cache node fails before pending writes are flushed to the database.
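A write-back sketch using an in-memory queue to stand in for the async flusher; a real system would flush in batches on a background thread or worker (all names here are illustrative):

```python
from collections import deque

cache = {}
database = {}
dirty = deque()  # keys written to the cache but not yet flushed to the DB

def write_back(key, value):
    # Fast path: only the cache is touched on the write.
    cache[key] = value
    dirty.append(key)

def flush():
    # Runs later/asynchronously; anything still in `dirty` is lost
    # if the cache node dies before this point.
    while dirty:
        key = dirty.popleft()
        database[key] = cache[key]

write_back("user:1", {"name": "Ada"})
print(database)   # {} - the DB write has not happened yet
flush()
print(database)   # now contains user:1
```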
Eviction Policies
Cache memory is expensive and finite. When full, what do we delete?
| Policy | Description | Use Case |
|---|---|---|
| LRU (Least Recently Used) | Removes items that haven't been used for the longest time. | Most common. Good for social media (recent news is hot). |
| LFU (Least Frequently Used) | Removes items with the fewest hits. | Good for general content that doesn't change often. |
| FIFO (First In First Out) | Removes the oldest item inserted. | Simple, but poor performance (evicts hot old data). |
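LRU, the most common policy in the table above, is straightforward to sketch with Python's `collections.OrderedDict`, which tracks insertion/recency order:

```python
from collections import OrderedDict

class LRUCache:
    def __init__(self, capacity):
        self.capacity = capacity
        self.data = OrderedDict()

    def get(self, key):
        if key not in self.data:
            return None
        self.data.move_to_end(key)  # mark as most recently used
        return self.data[key]

    def put(self, key, value):
        if key in self.data:
            self.data.move_to_end(key)
        self.data[key] = value
        if len(self.data) > self.capacity:
            self.data.popitem(last=False)  # evict the least recently used

lru = LRUCache(2)
lru.put("a", 1)
lru.put("b", 2)
lru.get("a")         # touch "a", so "b" is now the eviction candidate
lru.put("c", 3)      # capacity exceeded: "b" is evicted
print(lru.get("b"))  # None
print(lru.get("a"))  # 1
```

Production caches (Redis's `allkeys-lru`, for example) use approximations of this idea rather than an exact ordered structure, but the eviction logic is the same.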
Common Pitfalls
🛑 Thundering Herd (Cache Stampede)
When a popular cache key expires, thousands of concurrent requests might get a "MISS" and ALL try to query the database simultaneously, crashing it.
Solution:
- Mutex Lock: Only let 1 request query the DB; others wait.
- Probabilistic Early Expiration: Refresh the key slightly before it actually expires.
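The mutex approach can be sketched with a per-process lock and a double-check; in a multi-server deployment you would hold a distributed lock instead (e.g. a Redis `SET ... NX` key). `fetch_from_db` is a hypothetical stand-in for the real query:

```python
import threading

cache = {}
lock = threading.Lock()

def fetch_from_db(key):
    # Stand-in for the expensive database query.
    return f"value-for-{key}"

def get_with_lock(key):
    value = cache.get(key)
    if value is not None:
        return value
    # Only one thread rebuilds the key; the others block briefly,
    # then find the freshly cached value on the re-check below.
    with lock:
        value = cache.get(key)  # double-check after acquiring the lock
        if value is None:
            value = fetch_from_db(key)
            cache[key] = value
    return value
```

The re-check inside the lock is the important part: without it, every waiting request would still hit the database once the lock was released.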
❄️ Cache Avalanche
When many keys expire at the exact same time (e.g., a batch of keys cached together with the same fixed TTL).
Solution: Add "Jitter" (randomness) to TTLs. E.g., `TTL = 300s + rand(0-60s)`.
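Adding jitter is a one-liner; the base TTL and jitter window are tunable to taste:

```python
import random

BASE_TTL = 300    # seconds
JITTER_MAX = 60   # seconds

def ttl_with_jitter():
    # Each key expires at a slightly different time, so a batch of keys
    # written together does not expire together.
    return BASE_TTL + random.randint(0, JITTER_MAX)

# Usage: r.setex(key, ttl_with_jitter(), payload)
print(ttl_with_jitter())  # somewhere in [300, 360]
```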
Python Implementation: Cache-Aside with Redis
A minimal example using `redis-py` covering the "Check -> DB -> Set" logic with a TTL. The database read is simulated with a `sleep`; swap in a real query in practice.
```python
import redis
import json
import time

# Connect to Redis
r = redis.Redis(host='localhost', port=6379, decode_responses=True)

def get_user_profile(user_id):
    cache_key = f"user:{user_id}"

    # 1. Try Cache
    cached_data = r.get(cache_key)
    if cached_data:
        print("Cache HIT")
        return json.loads(cached_data)

    # 2. Cache MISS - Fetch User from DB (Simulated)
    print("Cache MISS - Reading DB...")
    # db_data = db.query("SELECT * FROM users WHERE id = ?", user_id)
    time.sleep(0.1)  # Simulate DB latency
    db_data = {"id": user_id, "name": "Brijesh", "role": "Architect"}

    # 3. Write to Cache with TTL (Time To Live)
    # Expire after 60s so stale data is never served forever
    r.setex(cache_key, 60, json.dumps(db_data))
    return db_data

# Simulation
get_user_profile(101)  # Miss -> DB -> Cache
get_user_profile(101)  # Hit -> Cache
```
Summary
- Use **Cache-Aside** for general read-heavy apps.
- Use **LRU** eviction for most use cases.
- Always set a **TTL** (Time To Live) to expire data.
- Add **Jitter** to TTLs to prevent Avalanches.