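Exactly-once semantics (EOS) ensure that each message is processed exactly once, with no loss and no duplicates, even across failures and retries. Kafka achieves this with idempotent producers and transactions, but the mechanism is complex and carries a performance cost.
The challenge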
Exactly-once is HARD in distributed systems (failures, retries, duplicates are inevitable):
→ producer retries → duplicate messages; consumer reprocessing → duplicate effects
→ naive at-least-once → duplicates; at-most-once → loss
→ exactly-once requires careful mechanisms to avoid BOTH loss AND duplicates.
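To make the trade-off above concrete, here is a minimal sketch in Python. It is a toy model, not Kafka code: the "broker" is a plain list, and the function names and the sets of lost messages and dropped acks are hypothetical, chosen only for illustration.

```python
# Toy model (not Kafka code): the "broker" is a list; failures are injected by hand.

def send(log, message, drop_ack):
    """Append the message to the log; return True if the ack reaches the sender."""
    log.append(message)
    return not drop_ack

def at_most_once(messages, lost_in_transit):
    """Fire and forget: no retries, so a message lost in transit is gone for good."""
    log = []
    for m in messages:
        if m not in lost_in_transit:
            log.append(m)
    return log

def at_least_once(messages, ack_dropped_for):
    """Retry until acked: a write whose ack was lost is written a second time."""
    log = []
    for m in messages:
        first_try = True
        while True:
            acked = send(log, m, drop_ack=(first_try and m in ack_dropped_for))
            first_try = False
            if acked:
                break
    return log

messages = ["a", "b", "c"]
print(at_most_once(messages, lost_in_transit={"b"}))   # ['a', 'c']
print(at_least_once(messages, ack_dropped_for={"b"}))  # ['a', 'b', 'b', 'c']
```

In the second case the write succeeded but the sender could not know it, so the retry produced a duplicate. Exactly-once has to close that gap without giving up retries.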
How Kafka achieves exactly-once
1. IDEMPOTENT PRODUCER (enable.idempotence=true):
→ the broker dedupes producer retries (producer ID + per-partition sequence numbers) → sending the same
message twice (due to a retry) results in ONE write (no duplicates from retries)
2. TRANSACTIONS:
→ group writes (and consumer offset commits) into an ATOMIC transaction → all succeed
or all roll back
→ enables "consume-process-produce" atomically: read, process, write results + commit
offsets together → exactly-once across the pipeline (within Kafka)
→ Kafka Streams uses these for end-to-end exactly-once processing
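
The two mechanisms above can be sketched as a toy model in Python. This is an illustration under loud assumptions, not Kafka's implementation: `ToyBroker`, `ToyTransaction`, and `process_batch` are hypothetical names, the transaction simply buffers writes in memory (real Kafka writes records to the log and hides aborted ones from read_committed consumers), and the sequence check only ignores duplicates rather than validating ordering.

```python
# Toy model of the two mechanisms; none of these classes are Kafka APIs.

class ToyBroker:
    def __init__(self):
        self.log = []        # committed records, visible to readers
        self.offsets = {}    # committed consumer offset per group
        self.last_seq = {}   # (producer_id, partition) -> last sequence accepted

    def append(self, producer_id, partition, seq, record):
        """Idempotent append: a retry carrying an already-seen sequence is ignored."""
        key = (producer_id, partition)
        if seq <= self.last_seq.get(key, -1):
            return False     # duplicate from a retry, not written again
        self.last_seq[key] = seq
        self.log.append(record)
        return True

class ToyTransaction:
    """Buffers output records and an offset commit; applies both or neither."""
    def __init__(self, broker):
        self.broker = broker
        self.records = []
        self.offset = None

    def send(self, record):
        self.records.append(record)

    def send_offsets(self, group, offset):
        self.offset = (group, offset)

    def commit(self):
        self.broker.log.extend(self.records)
        if self.offset is not None:
            group, offset = self.offset
            self.broker.offsets[group] = offset

    def abort(self):
        self.records, self.offset = [], None

def process_batch(broker, group, inputs, fail=False):
    """Consume from the committed offset, transform, write results + offset atomically."""
    start = broker.offsets.get(group, 0)
    txn = ToyTransaction(broker)
    try:
        for value in inputs[start:]:
            txn.send(value.upper())
        if fail:
            raise RuntimeError("simulated crash before commit")
        txn.send_offsets(group, len(inputs))
        txn.commit()
    except RuntimeError:
        txn.abort()

broker = ToyBroker()
broker.append("p1", 0, seq=0, record="x")
broker.append("p1", 0, seq=0, record="x")      # retry of the same send: ignored
print(broker.log)                              # ['x']

inputs = ["a", "b"]
process_batch(broker, "g", inputs, fail=True)  # aborted: no output, no offset
process_batch(broker, "g", inputs)             # retried: processed once
process_batch(broker, "g", inputs)             # nothing new to process
print(broker.log, broker.offsets)              # ['x', 'A', 'B'] {'g': 2}
```

Because the output records and the consumer offset become visible together, a crash before commit leaves neither behind, and the retry starts from the old offset without producing duplicates. In a real pipeline, downstream consumers would also need isolation.level=read_committed so they never read records from aborted transactions.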
