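Designing a large-scale system means combining many concepts: handling heavy load, choosing an appropriate architecture, picking databases, caching, and managing trade-offs. A concrete example (a social media feed) shows how these pieces fit together.
Example: social media news feed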
Requirements: millions of users; post content; see a feed of followed users' posts;
read-HEAVY (far more feed views than posts); low latency; high availability.
High-level components:
→ CLIENTS → LOAD BALANCER → APPLICATION servers (stateless, horizontally scaled)
→ DATABASES → user/post data (sharded); a graph of follows
→ CACHING (Redis) → hot feeds, posts, user data (crucial for read-heavy load)
→ CDN → media (images/videos)
→ MESSAGE QUEUES → async work (fan-out, notifications)
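To make the caching component concrete, here is a minimal cache-aside sketch for feed reads. It is illustrative only: `FeedCache`, `get_feed`, and `load_from_db` are hypothetical names, and a plain in-memory dict stands in for Redis so the example runs without any external service.

```python
import time
from typing import Callable, Dict, List, Tuple


class FeedCache:
    """In-memory stand-in for a cache such as Redis (hypothetical helper)."""

    def __init__(self, ttl_seconds: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.ttl = ttl_seconds
        self.clock = clock  # injectable so expiry can be tested deterministically
        self._entries: Dict[str, Tuple[float, List[str]]] = {}
        self.hits = 0
        self.misses = 0

    def get_feed(self, user_id: str,
                 load_from_db: Callable[[str], List[str]]) -> List[str]:
        """Cache-aside read: serve from cache, fall back to the database on a miss."""
        now = self.clock()
        entry = self._entries.get(user_id)
        if entry is not None and entry[0] > now:
            self.hits += 1
            return entry[1]
        # Miss or expired entry: load from the slower store and repopulate.
        self.misses += 1
        feed = load_from_db(user_id)
        self._entries[user_id] = (now + self.ttl, feed)
        return feed

    def invalidate(self, user_id: str) -> None:
        """Drop a cached feed, e.g. after it is known to be stale."""
        self._entries.pop(user_id, None)
```

With a read-heavy workload most calls to `get_feed` should be hits, so the database sees only a small share of the feed traffic. The TTL bounds how stale a cached feed can get; the right value depends on how fresh the feed needs to be.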
Key design decision: feed generation
FAN-OUT ON WRITE (push) → when a user posts, push it to all followers' precomputed feeds:
✓ fast feed READS (precomputed) ✗ expensive for users with millions of followers
(write amplification)
FAN-OUT ON READ (pull) → build the feed when requested (query followed users' posts):
✓ cheap writes ✗ slower reads (compute on demand)
HYBRID → push for most; pull for celebrities (huge followings) → balances the trade-offs
→ illustrates weighing trade-offs against the system's specific scale and access pattern.
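The three strategies above can be sketched in one small in-memory model. This is a sketch under stated assumptions, not a production design: `FeedService`, `CELEBRITY_THRESHOLD`, and the method names are hypothetical, the threshold is tiny so the behavior is easy to follow, and it ignores persistence, unfollows, and backfilling old posts for new followers.

```python
from collections import defaultdict
from itertools import count
from typing import Dict, List, Set, Tuple

CELEBRITY_THRESHOLD = 3  # hypothetical cutoff; real systems would use a far larger value

Post = Tuple[int, str, str]  # (sequence number, author, text)


class FeedService:
    def __init__(self, celebrity_threshold: int = CELEBRITY_THRESHOLD) -> None:
        self.threshold = celebrity_threshold
        self.followers: Dict[str, Set[str]] = defaultdict(set)
        self.following: Dict[str, Set[str]] = defaultdict(set)
        self.posts_by_author: Dict[str, List[Post]] = defaultdict(list)
        self.precomputed: Dict[str, List[Post]] = defaultdict(list)
        self._seq = count(1)  # monotonically increasing stand-in for a timestamp

    def follow(self, follower: str, followee: str) -> None:
        self.followers[followee].add(follower)
        self.following[follower].add(followee)

    def is_celebrity(self, user: str) -> bool:
        return len(self.followers[user]) >= self.threshold

    def post(self, author: str, text: str) -> int:
        """Store a post; return how many follower feeds were written (fan-out cost)."""
        item: Post = (next(self._seq), author, text)
        self.posts_by_author[author].append(item)
        if self.is_celebrity(author):
            return 0  # pull path: followers fetch this post at read time
        # Push path (fan-out on write): one write per follower.
        for follower in self.followers[author]:
            self.precomputed[follower].append(item)
        return len(self.followers[author])

    def feed(self, user: str, limit: int = 10) -> List[Tuple[str, str]]:
        """Merge the precomputed feed with celebrity posts pulled on demand."""
        # Keyed by sequence number so a post that was pushed before its author
        # crossed the threshold is not shown twice.
        merged: Dict[int, Post] = {p[0]: p for p in self.precomputed[user]}
        for followee in self.following[user]:
            if self.is_celebrity(followee):
                for p in self.posts_by_author[followee]:
                    merged[p[0]] = p
        newest_first = sorted(merged.values(), reverse=True)
        return [(author, text) for _, author, text in newest_first[:limit]]
```

The return value of `post` makes the write amplification visible: it grows with the follower count for ordinary users and stays at zero for accounts above the threshold, whose cost moves to `feed` instead.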
