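Cache invalidation, the task of keeping cached data consistent with the authoritative data source, is notoriously one of the hardest problems in computer science. The challenge is to keep the cache from serving stale data while balancing performance, consistency, and complexity. Several strategies and pitfalls are worth understanding.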
The core problem
When the source data changes, the cached copy becomes STALE.
→ Serve stale data? (fast but wrong) vs invalidate? (consistent but complex/slower)
→ "There are only two hard things in Computer Science: cache invalidation and naming things." (commonly attributed to Phil Karlton)
The difficulty: knowing WHEN and WHAT to invalidate, across distributed systems,
without races, while keeping good cache hit rates.
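To make the stale-read problem concrete, here is a minimal sketch. It assumes two plain dictionaries as hypothetical stand-ins for the database and the cache; `read` is an illustrative helper, not any real library's API.

```python
# Hypothetical in-memory stand-ins for a database and a cache.
db = {"user:1": "Alice"}
cache = {}

def read(key):
    """Read-through: serve from the cache if present, else load from the source."""
    if key not in cache:
        cache[key] = db[key]
    return cache[key]

first = read("user:1")       # cache miss: loads "Alice" from db
db["user:1"] = "Bob"         # the source changes, but the cache is not told
stale = read("user:1")       # cache hit: still returns "Alice"

cache.pop("user:1", None)    # explicit invalidation of that one key
fresh = read("user:1")       # miss again: reloads "Bob"
```

The second read is fast but wrong. Only after the entry is dropped does the reader see the new value, which is why every write path has to reach the cache somehow.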
Invalidation strategies
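1. TTL (expiration): data auto-expires after a set time → eventually consistent
✓ Simple, no tracking ✗ Stale until expiry; pick the TTL by how much staleness you can tolerate
2. EXPLICIT invalidation: delete/update the cache entry when the source data changes
✓ Fresh quickly ✗ Must reliably catch ALL writes; easy to miss some paths
3. WRITE-THROUGH: update cache and DB together on writes → cache stays fresh
✓ Reads see fresh data ✗ Slower writes; the cache and DB updates are not atomic unless coordinated
4. EVENT-BASED: invalidate via events/CDC (e.g. DB change → invalidation message)
✓ Catches writes from any code path ✗ Asynchronous, so a brief stale window remains; needs extra infrastructure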
→ Often combined: explicit invalidation + a TTL as a safety net.
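The combination of explicit invalidation with a TTL safety net could be sketched as follows. This is an illustration under stated assumptions, not a production design: `TTLCache`, `update_price`, the dictionary `db`, and the fake clock are all hypothetical names introduced here, and a real deployment would more likely sit on a shared cache server than on an in-process dictionary.

```python
import time

class TTLCache:
    """Cache-aside store with explicit invalidation plus a TTL safety net."""

    def __init__(self, loader, ttl_seconds, clock=time.monotonic):
        self._loader = loader      # hypothetical function: key -> fresh value
        self._ttl = ttl_seconds
        self._clock = clock        # injectable so expiry can be demonstrated
        self._entries = {}         # key -> (value, expires_at)

    def get(self, key):
        entry = self._entries.get(key)
        if entry is not None and self._clock() < entry[1]:
            return entry[0]        # hit: entry exists and has not expired
        value = self._loader(key)  # miss or expired: reload from the source
        self._entries[key] = (value, self._clock() + self._ttl)
        return value

    def invalidate(self, key):
        """Call on every write path that changes the source data."""
        self._entries.pop(key, None)


db = {"price:42": 100}             # stand-in for the real data source
now = [0.0]                        # fake clock so the demo needs no sleeping
cache = TTLCache(loader=db.__getitem__, ttl_seconds=60, clock=lambda: now[0])

def update_price(key, value):
    db[key] = value                # write the source first...
    cache.invalidate(key)          # ...then drop the cached copy

a = cache.get("price:42")          # loads 100
update_price("price:42", 120)
b = cache.get("price:42")          # explicit invalidation took effect: 120
db["price:42"] = 150               # a write path that forgot to invalidate
c = cache.get("price:42")          # stale 120 until the TTL runs out
now[0] += 61
d = cache.get("price:42")          # TTL expired: reloads 150
```

The forgotten write path is the case the TTL exists for: the stale value survives for at most `ttl_seconds` instead of indefinitely. Note that this sketch does not address races between concurrent readers and writers.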
