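Apache Kafka is a distributed event streaming platform: a high-throughput, durable system for publishing, storing, and processing streams of events (records). It is used for messaging, real-time data pipelines, event-driven architectures, and large-scale stream processing.
What Kafka is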
Kafka = a distributed, durable, high-throughput EVENT STREAMING platform:
→ PUBLISH events (producers write) and SUBSCRIBE to them (consumers read)
→ STORE streams of events durably (a distributed, replicated commit LOG)
→ PROCESS streams in real time
→ think of it as a durable, scalable, append-only LOG of events that many systems can
write to and read from
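To make the "append-only log" picture concrete, here is a minimal in-memory sketch in Python. It is a toy model, not Kafka's client API: the class ToyLog, its append and poll methods, and the reader names are all hypothetical, and it ignores partitions, replication, and persistence. It only shows that producers append records at increasing offsets and that each reader tracks its own position.

```python
from collections import defaultdict


class ToyLog:
    """In-memory stand-in for one append-only log (hypothetical, not Kafka's API)."""

    def __init__(self):
        self._records = []                    # records are only ever appended
        self._next_offset = defaultdict(int)  # reader name -> next offset to read

    def append(self, record):
        """Producer side: append a record and return its offset."""
        self._records.append(record)
        return len(self._records) - 1

    def poll(self, reader, max_records=10):
        """Consumer side: return unread records for `reader` and advance its offset."""
        start = self._next_offset[reader]
        batch = self._records[start:start + max_records]
        self._next_offset[reader] = start + len(batch)
        return batch


log = ToyLog()
for event in ["signup:alice", "login:alice", "signup:bob"]:
    log.append(event)

# Two independent readers each see the full stream; reading does not delete records.
analytics_batch = log.poll("analytics")
billing_batch = log.poll("billing", max_records=2)
billing_rest = log.poll("billing")
print(analytics_batch)
print(billing_batch, billing_rest)
```

The point of the sketch is that reading is non-destructive: the "billing" reader can lag behind "analytics" without affecting it, which is what lets many systems share one stream.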
What Kafka is used for
✓ MESSAGING / event streaming → decoupled communication between systems (pub/sub at scale)
✓ DATA PIPELINES → move/stream data between systems reliably (ingestion, ETL)
✓ EVENT-DRIVEN architecture → services react to events; event sourcing
✓ REAL-TIME processing → analytics, monitoring, stream processing (Kafka Streams)
✓ LOG aggregation, metrics, activity tracking, change data capture (CDC)
→ widely used in production by large companies for high-volume, real-time data (Kafka was originally developed at LinkedIn)
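The event sourcing use case mentioned above can be illustrated with a small Python sketch. It assumes a hypothetical event shape of (kind, account, amount) tuples and a hypothetical replay function; no Kafka client is involved. The idea shown is that current state is derived by replaying an ordered event log from the start, so the log, not the derived state, is the source of truth.

```python
def replay(events):
    """Rebuild account balances by applying events in log order."""
    balances = {}
    for kind, account, amount in events:
        current = balances.get(account, 0)
        if kind == "deposit":
            balances[account] = current + amount
        elif kind == "withdraw":
            balances[account] = current - amount
        else:
            # Unknown event kinds are rejected rather than silently skipped.
            raise ValueError(f"unknown event kind: {kind}")
    return balances


# Hypothetical event log; in a real system these would be records read from a topic.
events = [
    ("deposit", "alice", 100),
    ("withdraw", "alice", 30),
    ("deposit", "bob", 50),
]

current_state = replay(events)
state_after_first_event = replay(events[:1])
print(current_state)
print(state_after_first_event)
```

Because the events are retained, the state at any earlier point can be recomputed by replaying a prefix of the log, as the second call does.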
