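Choosing a partitioning strategy (how events are distributed across a topic's partitions) is an important Kafka design decision that affects ordering, parallelism, and load distribution. Both the partition key and the number of partitions must be chosen carefully.
How partitioning works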
The producer assigns each record to a partition based on its key:
→ with a KEY → hash(key) mod partition count → determines the partition (same key → same partition, as long as the partition count does not change)
→ no key → spread across partitions (round-robin, or sticky batching in newer clients)
→ the KEY choice determines ordering and distribution
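The routing rules above can be sketched in a few lines of Python. This is an illustration only, not Kafka's implementation: the class name `SketchPartitioner` is hypothetical, MD5 stands in for the murmur2 hash that the Java client's default partitioner applies to the serialized key, and the keyless path uses plain round-robin rather than sticky batching.

```python
import hashlib
from typing import Optional


class SketchPartitioner:
    """Toy model of key-based partition assignment (hypothetical, not Kafka's code)."""

    def __init__(self, num_partitions: int) -> None:
        if num_partitions <= 0:
            raise ValueError("num_partitions must be positive")
        self.num_partitions = num_partitions
        self._next = 0  # round-robin cursor for records without a key

    def partition(self, key: Optional[bytes]) -> int:
        if key is None:
            # No key: rotate through the partitions in order.
            chosen = self._next
            self._next = (self._next + 1) % self.num_partitions
            return chosen
        # Keyed: hash the key bytes, then reduce modulo the partition count.
        # ASSUMPTION: MD5 is a stand-in hash chosen for this sketch.
        digest = hashlib.md5(key).digest()
        return int.from_bytes(digest[:4], "big") % self.num_partitions


partitioner = SketchPartitioner(num_partitions=6)
first = partitioner.partition(b"user-42")
second = partitioner.partition(b"user-42")   # same key, same partition
keyless = [partitioner.partition(None) for _ in range(7)]
```

Because the result is taken modulo the partition count, adding partitions later can move existing keys to different partitions, which is one reason the partition count deserves as much care as the key.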
Choosing the partition key
The KEY determines two crucial things:
✓ ORDERING → all events with the same key go to the same partition → Kafka preserves their relative order (ordering is guaranteed only within a partition)
(e.g. key=userId → all of a user's events are ordered)
✓ DISTRIBUTION → keys should spread evenly across partitions (high cardinality, no dominant key) → balanced load
PITFALLS:
✗ LOW cardinality / skewed keys → HOT partitions (one partition overloaded) → bottleneck
✗ WRONG ordering scope → if you need per-X ordering, key by X, and accept that all events for one X land on one partition (no parallelism within X)
→ choose a key giving the ORDERING you need AND EVEN distribution
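One way to check a candidate key before committing to it is to replay a sample of keys through a hash and compare the busiest partition with the average. The sketch below does that under stated assumptions: the helper names (`partition_of`, `partition_load`, `skew_ratio`) are hypothetical, MD5 again stands in for Kafka's real hash, and the sample keys are made up for illustration.

```python
import hashlib
from collections import Counter
from typing import Iterable


def partition_of(key: str, num_partitions: int) -> int:
    # ASSUMPTION: MD5 stands in for the producer's actual key hash.
    digest = hashlib.md5(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big") % num_partitions


def partition_load(keys: Iterable[str], num_partitions: int) -> Counter:
    # Start every partition at zero so empty partitions count toward the mean.
    load = Counter({p: 0 for p in range(num_partitions)})
    for key in keys:
        load[partition_of(key, num_partitions)] += 1
    return load


def skew_ratio(load: Counter) -> float:
    # Busiest partition divided by the mean load; 1.0 means perfectly even.
    mean = sum(load.values()) / len(load)
    return max(load.values()) / mean if mean else 0.0


# High-cardinality key: 10,000 distinct (made-up) user ids over 8 partitions.
user_keys = [f"user-{i}" for i in range(10_000)]
# Low-cardinality key: only two distinct values, so at most two partitions get data.
country_keys = ["US", "DE"] * 5_000

user_skew = skew_ratio(partition_load(user_keys, 8))
country_skew = skew_ratio(partition_load(country_keys, 8))
```

With only two distinct keys, at least six of the eight partitions stay empty, so the busiest partition carries at least four times the average load. That is the hot-partition pitfall in miniature; a ratio close to 1.0 suggests the key spreads load well.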
