Three caching patterns, three failure modes. The one we use most, the one that bit us, and the rule that decides which pattern fits which workload.

Caching Patterns — Read-Through, Write-Through, Cache-Aside in Practice

Most teams use caching wrong because the differences between caching patterns aren't usually explained in a way that maps to actual decisions. The textbook descriptions say "read-through caches read from the cache; if missing, fetch from DB and populate" — accurate but useless. What matters is the failure modes: what happens when the cache is stale, when the cache is down, when two writes race, when the cache fills up.

What follows is the working version, after running each of the three main patterns in production.

The three patterns, briefly #

Cache-aside (lazy loading). The application reads from the cache directly. On a miss, it reads from the DB, writes the result to the cache, and returns it. On write, the application writes to the DB and either invalidates the cache entry or writes a fresh value.

Read-through. Application reads from the cache via an abstraction; the cache layer is responsible for fetching from the DB on miss. Application doesn't know about misses.

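Write-through. Writes go to the cache first; the cache propagates the write to the DB synchronously, so the write completes only once both are updated. That is what keeps cache and DB in sync. If the cache library instead does "write to cache, then async DB," that is write-behind, not write-through.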

There's also write-behind (writes go to the cache; DB writes are deferred and batched) and write-around (writes skip the cache and go straight to the DB; the cache populates only on a read miss). I only touch on these briefly.

Cache-aside: the default #

This is what most teams should use, and what we use for ~80% of our caching. The flow:

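```python
# `cache`, `db`, and `User` are the application's cache client, DB session,
# and ORM model; they are assumed to be configured elsewhere.

def get_user(user_id):
    cache_key = f"user:{user_id}"
    cached = cache.get(cache_key)
    if cached is not None:
        return cached  # cache hit
    # Cache miss: read from the DB and populate the cache.
    user = db.query(User).filter_by(id=user_id).one_or_none()
    if user is not None:
        cache.set(cache_key, user, ttl=300)  # 5 min TTL
    return user

def update_user(user_id, data):
    db.update(User, user_id, data)
    cache.delete(f"user:{user_id}")  # invalidate after the DB write succeeds
```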

Advantages:

Application has full control over what gets cached and when.
Cache outage = degraded performance, not failure, provided cache errors are caught and treated as misses. Direct DB reads still work.
Simple to reason about.
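
The "degraded, not failed" property only holds if the read path survives a cache error. A minimal sketch of that, assuming a cache client whose get and set raise when the cache is unreachable; FlakyCache, get_with_fallback, and load_from_db are hypothetical names used for illustration, not part of any library:

```python
import logging

logger = logging.getLogger(__name__)


class FlakyCache:
    """Hypothetical stand-in for a cache client that is currently unreachable."""

    def get(self, key):
        raise ConnectionError("cache unavailable")

    def set(self, key, value, ttl=None):
        raise ConnectionError("cache unavailable")


def get_with_fallback(cache, key, load_from_db, ttl=300):
    """Cache-aside read that treats any cache error as a miss."""
    try:
        cached = cache.get(key)
        if cached is not None:
            return cached
    except Exception:
        logger.warning("cache read failed for %s; falling back to DB", key)

    value = load_from_db()  # hypothetical zero-argument DB loader
    if value is not None:
        try:
            cache.set(key, value, ttl=ttl)
        except Exception:
            logger.warning("cache write failed for %s", key)
    return value
```

With the cache down, every read becomes a DB read, so this only degrades gracefully if the DB can absorb the extra load.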

Disadvantages:

Thundering herd on cache miss: if 100 concurrent requests miss for the same key, all 100 hit the DB.
Stale data window: with TTL-only expiration, a changed row can be served stale until the entry expires.
Manual invalidation discipline required.

The thundering herd is the failure mode that bites teams. The fix is request coalescing (single-flight): when one request misses, others wait for it instead of all hitting the DB.
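
Request coalescing can be sketched with a lock and a per-key event. This is a minimal in-process version, not a specific library's API; SingleFlight and its do method are names I am assuming for illustration. It coalesces callers within one process only; coalescing across application instances would need a distributed lock.

```python
import threading


class _Call:
    """State shared between the leader and the callers waiting on it."""

    def __init__(self):
        self.done = threading.Event()
        self.result = None
        self.error = None


class SingleFlight:
    """Coalesce concurrent calls for the same key into one execution."""

    def __init__(self):
        self._lock = threading.Lock()
        self._in_flight = {}  # key -> _Call

    def do(self, key, fn):
        with self._lock:
            call = self._in_flight.get(key)
            leader = call is None
            if leader:
                call = _Call()
                self._in_flight[key] = call

        if not leader:
            # Someone else is already loading this key: wait for their result.
            call.done.wait()
            if call.error is not None:
                raise call.error
            return call.result

        try:
            call.result = fn()
        except Exception as exc:
            call.error = exc
            raise
        finally:
            # Remove the entry first so later callers start a new flight.
            with self._lock:
                del self._in_flight[key]
            call.done.set()
        return call.result
```

In the cache-aside read path, the DB fetch on a miss would be wrapped as something like flights.do(cache_key, load_user), so 100 concurrent misses produce one query.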

Read-through: rare in our stack #

Read-through wraps the cache + DB behind a single read interface. The application code looks like:

```python
def get_user(user_id):
    return read_through_cache.get(f"user:{user_id}")  # may fetch from DB internally
```

The cache library handles miss → fetch → populate.
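
To make "miss → fetch → populate" concrete, here is a minimal in-process sketch of what such a layer does internally. ReadThroughCache and its loader argument are hypothetical; real libraries add eviction, locking, and error handling on top of this.

```python
import time


class ReadThroughCache:
    """Hypothetical read-through cache: callers only ever call get()."""

    def __init__(self, loader, ttl=300, clock=time.monotonic):
        self._loader = loader    # called as loader(key) on a miss
        self._ttl = ttl
        self._clock = clock
        self._entries = {}       # key -> (value, expires_at)

    def get(self, key):
        entry = self._entries.get(key)
        if entry is not None and entry[1] > self._clock():
            return entry[0]      # hit: the caller never sees the difference
        value = self._loader(key)  # miss: fetch from the backing store
        if value is not None:
            self._entries[key] = (value, self._clock() + self._ttl)
        return value
```

The TTL and the loader live in one place, which is the centralization advantage; the cost is that every read now depends on this object being healthy.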

Advantages:

Cleaner application code.
Centralized cache-population logic (one place to change TTL, key format, etc.).
Often comes with built-in single-flight, expiration policies.

Disadvantages:

Tighter coupling between cache and DB. Cache outage = read failure unless the library handles fallback.
Less flexibility for ad-hoc reads.
Many "read-through" caches in practice need fallback paths anyway.

We use read-through in one specific service where the cache and DB are managed by the same library (Hibernate's second-level cache). For most services it's overkill.

Write-through: niche #

Writes go to the cache first; the cache writes to the DB:

```python
def update_user(user_id, data):
    write_through_cache.set(f"user:{user_id}", data)  # cache propagates to DB
```

Advantages:

Cache is always fresh on writes (no stale window).
Simpler invalidation logic (cache library handles it).

Disadvantages:

Write latency = DB latency + cache latency. Slower than cache-aside (where you can write to DB and async-invalidate).
DB outage during write = write fails, and so does a cache outage: the cache becomes a single point of failure for the write path.
All writes flow through the cache, which becomes a load bottleneck.

Write-through fits when reads vastly outnumber writes AND consistency on reads is critical. For typical apps, cache-aside is better.
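
A minimal sketch of the synchronous write path, assuming a hypothetical store object with save and load methods (DictStore stands in for the DB). The sketch persists to the DB before updating its own entry, so the cache never holds a value the DB rejected; that ordering is a choice made for this example, not something every library guarantees.

```python
class DictStore:
    """Hypothetical stand-in for the database."""

    def __init__(self):
        self.rows = {}

    def save(self, key, value):
        self.rows[key] = value

    def load(self, key):
        return self.rows.get(key)


class WriteThroughCache:
    """Hypothetical write-through cache layered over a store."""

    def __init__(self, store):
        self._store = store
        self._entries = {}

    def set(self, key, value):
        self._store.save(key, value)   # synchronous: raises if the DB is down
        self._entries[key] = value     # reached only after the DB write succeeded

    def get(self, key):
        if key in self._entries:
            return self._entries[key]
        value = self._store.load(key)
        if value is not None:
            self._entries[key] = value
        return value
```

The caller pays for both writes on every set, which is the latency cost listed above.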

Write-behind / write-back: dangerous #

Write-behind queues DB writes and applies them asynchronously. Writes are fast, but anything still queued is lost if the cache crashes before flushing.

We don't use this for anything where data loss matters. It's appropriate for high-throughput append-mostly counters where you've decided you can lose 30 seconds of writes if the cache crashes.
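
For the counter case, the shape is roughly the following. This is a single-threaded sketch under assumed names (WriteBehindCounter, flush_fn); a real version would call flush on a timer and guard the pending map with a lock.

```python
from collections import Counter


class WriteBehindCounter:
    """Hypothetical write-behind buffer for increment-only counters."""

    def __init__(self, flush_fn):
        self._flush_fn = flush_fn    # called with {key: increment} to write to the DB
        self._pending = Counter()

    def incr(self, key, amount=1):
        self._pending[key] += amount   # fast path: memory only

    def flush(self):
        """Write all pending increments as one batch; return how many keys were flushed."""
        if not self._pending:
            return 0
        batch = self._pending
        self._pending = Counter()
        try:
            self._flush_fn(dict(batch))
        except Exception:
            self._pending.update(batch)  # keep the increments for the next attempt
            raise
        return len(batch)
```

Everything sitting in the pending map between two flushes is what you lose if the process dies, so the flush interval is the data-loss budget.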

The consistency window #

All caching patterns have a stale window: a time during which the cache has old data while the DB has new data. The window's size depends on the pattern:

Cache-aside with TTL only: stale window = up to the TTL. 5 min cache means up to 5 min stale.
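Cache-aside with explicit invalidation on write: stale window = the time between the write and the invalidation (milliseconds usually). A reader that fetched the old row just before the write can still repopulate the cache after the invalidation, so keep a TTL as a backstop.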
Write-through: stale window = ~0 (cache and DB updated together).

For most apps, "up to TTL" stale is fine for individual reads but bad for things like "the user just updated their profile and reloaded, sees old data." That's the case where explicit invalidation on write matters.
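
The difference between the two cache-aside windows is easy to see with a fake clock. This is a toy simulation, not production code; TTLCache, read_name, and the dict standing in for the DB are all assumptions made for the example.

```python
class TTLCache:
    """Toy TTL cache with an injectable clock so time can be advanced by hand."""

    def __init__(self, clock):
        self._clock = clock
        self._entries = {}  # key -> (value, expires_at)

    def get(self, key):
        entry = self._entries.get(key)
        if entry is None or entry[1] <= self._clock():
            return None
        return entry[0]

    def set(self, key, value, ttl):
        self._entries[key] = (value, self._clock() + ttl)

    def delete(self, key):
        self._entries.pop(key, None)


def read_name(cache, db, key, ttl=300):
    """Cache-aside read against a dict that stands in for the DB."""
    cached = cache.get(key)
    if cached is not None:
        return cached
    value = db[key]
    cache.set(key, value, ttl)
    return value


now = [0.0]
db = {"user:42": "Ada"}
cache = TTLCache(clock=lambda: now[0])

first = read_name(cache, db, "user:42")            # populates the cache
db["user:42"] = "Ada L."                           # write with no invalidation
now[0] = 299.0
stale = read_name(cache, db, "user:42")            # still inside the 5 min TTL
now[0] = 300.0
fresh_after_ttl = read_name(cache, db, "user:42")  # entry expired, re-read from DB

db["user:42"] = "Ada Lovelace"
cache.delete("user:42")                            # explicit invalidation on write
fresh_after_delete = read_name(cache, db, "user:42")
```

Here stale is still "Ada" 299 seconds after the write, while fresh_after_delete reflects the second write immediately.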

The cache-stampede problem #

When a hot cache entry expires, the first request after expiration goes to the DB to repopulate. While it's running, every other request for the same key also misses and tries to repopulate. Result: a stampede on the DB.

Fixes:

Single-flight (request coalescing). Only one request triggers the DB fetch; others wait for it. Some caching libraries provide this out of the box (Go's golang.org/x/sync/singleflight package and loading caches such as Caffeine are examples); plain Redis clients generally do not, so there you build it with a per-key mutex.
Stale-while-revalidate. Serve stale content while fetching fresh in the background. The first request after expiration gets the stale value and triggers an async refresh; requests that arrive after the refresh completes see the fresh value.
Probabilistic early refresh. Add jitter: refresh keys before they expire with some probability. This spreads the refresh load across time.
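
One common formulation of probabilistic early refresh is the "XFetch" rule: refresh when now plus a random, exponentially distributed head start reaches the expiry time, with the head start scaled by how long a recompute takes. The sketch below is that rule, not necessarily the exact variant any given library uses; should_refresh_early and its parameters are names I am assuming.

```python
import math
import random


def should_refresh_early(now, expires_at, recompute_seconds, beta=1.0, rng=random.random):
    """Return True if this request should refresh the key before it expires.

    recompute_seconds is roughly how long a refresh takes; beta > 1 refreshes
    earlier, beta < 1 later. rng must return a float in [0, 1).
    """
    # 1.0 - rng() lies in (0, 1], so log() never receives 0.
    head_start = -recompute_seconds * beta * math.log(1.0 - rng())
    return now + head_start >= expires_at
```

Far from expiry the check almost never fires; close to expiry it fires with increasing probability, so different requests refresh at different moments instead of all at the deadline.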

We use stale-while-revalidate for the things that matter; single-flight for everything else.
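
A minimal in-process sketch of stale-while-revalidate, under assumed names (SWRCache, loader, fresh_for). It has no hard expiry, so a value can be served stale indefinitely if refreshes keep failing; a real implementation would cap that.

```python
import threading
import time


class SWRCache:
    """Hypothetical stale-while-revalidate cache with background refresh."""

    def __init__(self, loader, fresh_for=60, clock=time.monotonic):
        self._loader = loader        # called as loader(key)
        self._fresh_for = fresh_for
        self._clock = clock
        self._lock = threading.Lock()
        self._entries = {}           # key -> (value, fresh_until)
        self._refreshing = set()     # keys with a refresh already running

    def get(self, key):
        with self._lock:
            entry = self._entries.get(key)
            if entry is not None:
                value, fresh_until = entry
                if self._clock() >= fresh_until and key not in self._refreshing:
                    # Stale: start one background refresh, serve the old value now.
                    self._refreshing.add(key)
                    threading.Thread(
                        target=self._refresh, args=(key,), daemon=True
                    ).start()
                return value
        # Cold miss: there is nothing to serve, so load synchronously.
        return self._store(key, self._loader(key))

    def _refresh(self, key):
        try:
            self._store(key, self._loader(key))
        finally:
            with self._lock:
                self._refreshing.discard(key)

    def _store(self, key, value):
        with self._lock:
            self._entries[key] = (value, self._clock() + self._fresh_for)
        return value
```

No request ever waits on the DB once the key is warm, which is why we reserve this for reads where latency matters more than a few seconds of staleness.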

Cache invalidation by tag (not just key) #

Invalidating by exact key is fine until you have related caches. "User updated their profile" should invalidate:

The user record
The list of all users containing them
The team page including them
Possibly the dashboard summary

Listing these keys by hand at every write site does not scale. With tag-based invalidation, cache writes specify tags (user:42, team:5), and invalidating a tag clears every key that carries it.

Not all cache libraries support this. We use it where available; manual key listing elsewhere.
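
Where the library has no tag support, the core of it is a reverse index from tag to keys. A minimal in-memory sketch under assumed names (TaggedCache, invalidate_tag); against a shared cache the index would have to live in the cache itself, for example as one set per tag.

```python
from collections import defaultdict


class TaggedCache:
    """Hypothetical in-memory cache with tag-based invalidation."""

    def __init__(self):
        self._values = {}
        self._keys_by_tag = defaultdict(set)  # tag -> keys written with that tag

    def set(self, key, value, tags=()):
        self._values[key] = value
        for tag in tags:
            self._keys_by_tag[tag].add(key)

    def get(self, key):
        return self._values.get(key)

    def invalidate_tag(self, tag):
        """Drop every key written with this tag; return how many keys were listed."""
        keys = self._keys_by_tag.pop(tag, set())
        for key in keys:
            # A key may already be gone via another tag; pop() tolerates that.
            self._values.pop(key, None)
        return len(keys)
```

The write path then only needs to know which entities a cached value depends on, not every key that happens to include them.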

When NOT to cache #

A few patterns where caching is wrong:

User-specific content with low reuse. Caching user 42's dashboard for 5 minutes only helps if user 42 hits the page within 5 minutes. Many users won't.
Write-heavy workloads. If you invalidate the cache on every write, you're doing extra work without read benefit.
Easily-derivable values. Caching now() or simple computations costs more than it saves.
Things changing more often than your TTL. With invalidation on write, cache hit rate ≈ 0 and you're just adding latency; without it, most hits are stale.

Start without caching. Add it when measurements show it's worth it.

What we monitor #

Cache hit rate per key prefix. Should be > 80% for hot caches; lower means the cache isn't doing its job.
Cache memory usage. Eviction during normal traffic = cache is undersized.
Stale-while-revalidate ratio. How often we serve stale and refresh in the background. It shows whether SWR is actually absorbing expirations on hot keys.
Average miss latency. What does a cold read cost? Surfaces issues with the upstream DB.
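
Hit rate per key prefix is cheap to collect at the call site. A minimal sketch, assuming keys follow the prefix:id convention used above; HitRateTracker is a hypothetical helper, and in practice these counters would be emitted to whatever metrics system is already in place.

```python
from collections import Counter


class HitRateTracker:
    """Hypothetical per-prefix hit/miss counter for keys shaped like 'prefix:id'."""

    def __init__(self):
        self._hits = Counter()
        self._misses = Counter()

    @staticmethod
    def _prefix(key):
        return key.split(":", 1)[0]

    def record(self, key, hit):
        counter = self._hits if hit else self._misses
        counter[self._prefix(key)] += 1

    def hit_rate(self, prefix):
        """Return hits / (hits + misses), or None if the prefix was never seen."""
        total = self._hits[prefix] + self._misses[prefix]
        return self._hits[prefix] / total if total else None
```

Calling record(cache_key, cached is not None) right after each cache read is enough to see which prefixes fall below the 80% line.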

What to read next #

CDN cache invalidation strategies — the same patterns applied at the edge
Postgres connection pooling — PgBouncer in front of RDS — when cache hits become "DB is the bottleneck"
Edge databases for low-latency apps — when the cache itself becomes the database
Multi-provider LLM routing — failover, cost, load balancing — same caching patterns applied to LLM responses

The pattern doesn't matter as much as the discipline around stale windows, single-flight, and explicit invalidation. Cache-aside with a good invalidation story is enough for most teams. Reach for the more exotic patterns only when measurements point at a specific bottleneck.

About Kiril Urbonas