"There are two hard problems in computer science: cache invalidation and naming things." We've been working on the cache-invalidation one for a while.
CDN caches are great until you need to update something you've cached. The "send a purge" approach works until you realize purges are slow, rate-limited, and sometimes don't propagate to every edge. After running CloudFront for a few years across a range of content types, this post collects the patterns that actually hold up, and the ones that look clean in theory but fail in production.
Different content types need different invalidation strategies:
- Static assets (/app.a1b2c3.js): a solved problem; the hash IS the version. Cache forever; new version, new URL.
- Semi-dynamic content (/api/v1/products/123): mostly handled by ordinary HTTP Cache-Control. When the product changes, you need the cache to refresh.
- User-specific content: usually shouldn't be CDN-cached at all (Cache-Control: private), though sometimes it is.

Each shape has different patterns. Treating them all the same is where teams get into trouble.
Every static asset has a content-hash in its filename. app.css becomes app.a1b2c3.css. When you change the source, the hash changes, the URL changes, the CDN treats it as a new resource.
Cache-Control on these:
Cache-Control: public, max-age=31536000, immutable
One year, immutable (browsers and CDNs will never re-check). The hash gives you correctness; the long TTL gives you performance.
This is the gold standard. If you can move your problem into this shape, do it. Most front-end build tools (Webpack, Vite, esbuild) produce hashed asset names automatically.
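What those build tools do reduces to a few lines. A minimal sketch using Node's crypto module (hashedName is our name for illustration, not a build-tool API):

```typescript
import { createHash } from "crypto";

// Derive a short content hash and splice it into the filename,
// e.g. app.css -> app.3f2a9c1b.css. Any change to the bytes changes
// the hash, so the URL changes and the CDN sees a brand-new resource.
function hashedName(filename: string, contents: string): string {
  const hash = createHash("sha256").update(contents).digest("hex").slice(0, 8);
  const dot = filename.lastIndexOf(".");
  return `${filename.slice(0, dot)}.${hash}${filename.slice(dot)}`;
}
```

Because the hash is derived purely from the content, builds are deterministic: unchanged files keep their URLs (and stay cached), changed files get new ones.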
The HTML that references those assets needs a short TTL, because it changes when you deploy. But that HTML is small and there is far less of it, which makes it much easier to invalidate.
For content that can't be content-hashed (blog post pages, dynamic API endpoints):
Cache-Control: public, s-maxage=3600, stale-while-revalidate=86400
What this means:
- s-maxage=3600 — the CDN keeps the response fresh for 1 hour
- stale-while-revalidate=86400 — for another 24 hours after that, the CDN serves the stale copy and asynchronously fetches a fresh one in the background

The stale-while-revalidate is the underrated piece. Users always get a fast response (cached, even if stale); freshness updates happen behind the scenes. We use this on most semi-dynamic content.
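The decision the CDN makes can be modeled as a pure function of the response's age. A simplified sketch of the RFC 5861 semantics (cacheState is our name, not a real API):

```typescript
type CacheState = "fresh" | "stale-while-revalidate" | "expired";

// Classify a cached response under
// Cache-Control: public, s-maxage=3600, stale-while-revalidate=86400,
// given the seconds elapsed since it was stored.
function cacheState(ageSeconds: number, sMaxage = 3600, swr = 86400): CacheState {
  if (ageSeconds <= sMaxage) return "fresh"; // serve from cache, no origin contact
  if (ageSeconds <= sMaxage + swr) return "stale-while-revalidate"; // serve stale now, refresh in background
  return "expired"; // must block on a fetch from origin
}
```

The key property: the only requests that ever block on origin are the ones in the "expired" band, which a long stale-while-revalidate window makes rare for popular content.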
The TTL is tuned per content type; the cheat sheet at the end of this post lists the values we settled on.
For content that needs to change immediately (a published blog post that had a typo; a withdrawn product), explicit invalidation.
CloudFront supports two mechanisms:
Invalidation paths. aws cloudfront create-invalidation --paths "/blog/the-typo-post". Propagates to all edges within a few minutes. Free for first 1000/month; $0.005 each after that. Use for one-offs.
Versioned origin paths. Instead of invalidating, change the origin URL. Blog post becomes /blog/the-typo-post?v=2 (or with a content-hash in the path). CDN treats it as a new resource. Same caching, no invalidation needed.
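The pricing of the first mechanism is easy to misjudge at volume. A back-of-the-envelope helper using the numbers quoted above (verify against current AWS pricing before relying on it):

```typescript
// CloudFront invalidations: first 1,000 paths per month free,
// then $0.005 per path (figures from the paragraph above; check
// the AWS pricing page for current numbers).
function monthlyInvalidationCost(paths: number, freeTier = 1000, perPath = 0.005): number {
  return Math.max(0, paths - freeTier) * perPath;
}
```

At a few invalidations per day this rounds to zero; at one invalidation per content edit on a busy CMS it quietly becomes a real line item, which is when versioned paths start to win.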
We mostly use invalidation paths because they require no client-side code changes. For very high-frequency content updates, versioned origin paths scale better.
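The versioned-path approach needs no CDN API at all, just a version that changes with the content. A sketch with an in-memory registry standing in for wherever you'd persist versions (all names here are ours):

```typescript
// content id -> current version. In production this would live in a
// database; a Map keeps the sketch self-contained.
const versions = new Map<string, number>();

// Called whenever the content is edited; old cached copies simply
// age out instead of being purged.
function bumpVersion(contentId: string): number {
  const next = (versions.get(contentId) ?? 1) + 1;
  versions.set(contentId, next);
  return next;
}

// Every URL rendered after a bump points at a new cache key.
function versionedUrl(path: string, contentId: string): string {
  return `${path}?v=${versions.get(contentId) ?? 1}`;
}
```

The trade-off versus purging: updates are instant and free, but stale copies linger in the cache until their TTL expires, which is harmless as long as nothing links to the old URL.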
The pattern we wish CloudFront supported natively: tag responses with logical IDs, invalidate by tag.
For example, every product page gets a Surrogate-Key: product-123 header. When product 123 changes, you "purge tag product-123" and every cached response with that tag is invalidated — across many URLs.
Fastly supports this directly (Surrogate-Key header, purge-key API call). Cloudflare supports it via Cache Tags (Enterprise only). CloudFront doesn't have a direct equivalent; you have to track URLs and invalidate them explicitly.
We've worked around CloudFront's gap by maintaining a small mapping table (in DynamoDB) of "content_id → list of URLs cached." When content changes, look up the URLs, issue an invalidation for each.
It works but it's clunky. For teams whose content has complex many-URLs-per-thing relationships, Fastly or Cloudflare's tag-based invalidation is genuinely better.
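A sketch of that workaround, with a plain Map standing in for the DynamoDB table and a stub where the CloudFront invalidation call would go (all names are ours):

```typescript
// content_id -> URLs currently cached for it (DynamoDB in our setup;
// a Map here to keep the sketch self-contained).
const urlsByContent = new Map<string, Set<string>>();

// Record a URL as it is rendered/cached for a given piece of content.
function registerCachedUrl(contentId: string, url: string): void {
  if (!urlsByContent.has(contentId)) urlsByContent.set(contentId, new Set());
  urlsByContent.get(contentId)!.add(url);
}

// Stub: real code would call the CloudFront CreateInvalidation API
// (aws cloudfront create-invalidation --paths ...) with these paths.
function invalidatePaths(paths: string[]): string[] {
  return paths;
}

// Tag-style invalidation built on path invalidation: look up every
// URL associated with the content id and purge them all at once.
function invalidateByContentId(contentId: string): string[] {
  return invalidatePaths(Array.from(urlsByContent.get(contentId) ?? []));
}
```

The failure mode to watch: any render path that forgets to call registerCachedUrl leaves an untracked URL that never gets purged, which is exactly the bookkeeping native surrogate keys spare you.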
Our specific stack and what we cache:
Cache-Control headers from Next.js
- Blog post HTML: s-maxage=3600, stale-while-revalidate=86400 (1h fresh, 1d stale-OK)
- Sitemap and RSS: s-maxage=21600, stale-while-revalidate=86400 (6h fresh)
- OG / social-share images: s-maxage=604800 (1 week — they don't change)
- Hashed static assets: max-age=31536000, immutable
- Explicit invalidation on content edits (Next.js revalidatePath)

When an admin edits a post, the path-level invalidation runs immediately and the cached copy drops on the next request. Without explicit invalidation, users would wait up to 1 hour to see the edit live.
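Those per-type values reduce to a small lookup table in code. A sketch (cacheControl and the type names are ours; the header values mirror the list above):

```typescript
type ContentType = "blog-post" | "sitemap" | "og-image" | "hashed-asset" | "user-specific";

// One place to decide the Cache-Control header per content class.
// Values mirror the list above; tune to your own change rates.
const cacheControl: Record<ContentType, string> = {
  "blog-post": "public, s-maxage=3600, stale-while-revalidate=86400",
  "sitemap": "public, s-maxage=21600, stale-while-revalidate=86400",
  "og-image": "public, s-maxage=604800",
  "hashed-asset": "public, max-age=31536000, immutable",
  "user-specific": "private",
};
```

Centralizing the mapping matters more than the exact numbers: when every route pulls its header from one table, you can tighten or loosen a whole content class in a single change.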
For sitewide content changes (a new design, an updated nav), we just deploy — the deploy busts all caches via the new asset hashes.
A few patterns we tried and abandoned:
Short TTLs everywhere. "Just cache for 60 seconds so it's always fresh-ish." Sounds safe. Result: every page request hits origin within a minute of the previous one, which negates most of the caching benefit. The cache only helps if the TTL is meaningfully longer than the inter-request interval for popular content.
Invalidating wildcards aggressively. aws cloudfront create-invalidation --paths "/*" to clear everything after a deploy. Works once. Twice. Then you blow through the free quota and start spending real money on invalidations. Worse: wildcard invalidation is slow (15-30 minutes for full propagation). For deploys, hash your assets so invalidation isn't necessary.
Trying to cache user-specific content. "Just cache it for 10 seconds per user so it's mostly fresh." Bad ideas compound: if the user logs out during those 10 seconds, the wrong content gets served. We now use Cache-Control: private for anything user-specific; the CDN doesn't touch it.
Cookie-based cache keys. "Vary on the auth cookie so logged-in users get their own cache." Quickly explodes cache count (one per session); CDN cache hit rate drops to near zero. Just don't cache logged-in content.
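The short-TTL trap in particular is easy to quantify. A toy simulation with uniformly spaced requests (real traffic is burstier, which makes short TTLs look even worse):

```typescript
// Fraction of requests served from cache when requests arrive every
// `intervalSec` seconds and cached entries expire after `ttlSec`.
// The first request after expiry always misses and refills the cache.
function hitRatio(ttlSec: number, intervalSec: number, requests = 10000): number {
  let hits = 0;
  let expiresAt = -1; // nothing cached yet
  for (let i = 0; i < requests; i++) {
    const now = i * intervalSec;
    if (now < expiresAt) hits++;
    else expiresAt = now + ttlSec; // miss: fetch from origin, cache again
  }
  return hits / requests;
}
```

With a 60-second TTL and a request every 50 seconds, every other request goes to origin; stretch the TTL to an hour and the same traffic is almost entirely cache hits. The TTL has to dwarf the inter-request interval before caching pays off.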
What we monitor: cache hit ratio above all (misconfiguration shows up there first), plus the usual request and error counts. CloudFront exports its metrics to CloudWatch; we pull them into Grafana for the dashboards.
A cheat sheet:
| Content type | Strategy |
|---|---|
| Compiled JS/CSS with hash | immutable; max-age=1yr |
| Logo, favicons | max-age=1day (changes rarely; deploy bumps if needed) |
| Blog post HTML | s-maxage=1hr, stale-while-revalidate=1day + explicit invalidation on edit |
| Sitemap, RSS | s-maxage=6hr |
| OG images / social-share images | s-maxage=1week |
| User-specific (logged-in pages) | Cache-Control: private — don't cache at CDN |
| API: stable list endpoints | s-maxage=5min |
| API: user-specific | Cache-Control: private |
| API: real-time data | Cache-Control: no-cache |
These are starting points. Adjust based on your content's actual change rate.
Content-hash everything you can. Static asset names with hashes = the invalidation problem solved.
Cache-Control with stale-while-revalidate. The single most underused HTTP cache directive. Read about it.
Different TTLs per content type. Don't pick one number; classify your content.
Explicit invalidation only for user-visible content edits. Not for deploys (use hashing); not for sitewide changes (use deploys).
Don't cache logged-in content at the CDN. The complexity vs benefit isn't worth it.
Monitor cache hit ratio. Catches misconfiguration faster than anything else.
CDN cache invalidation isn't a hard problem per content type — it's a hard problem when you have many content types and one strategy. Pick the right tool per type; the operational pain mostly goes away.