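Operating RabbitMQ means monitoring a handful of key metrics (queue depth, message rates, consumer health, resources) and using the management tools. Understanding both monitoring and management is essential to running RabbitMQ reliably.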
Key metrics to monitor
✓ QUEUE DEPTH (length) → growing queues = consumers can't keep up (a key signal!), the
RabbitMQ counterpart of consumer lag; investigate (add consumers, fix slow processing)
✓ MESSAGE RATES → publish rate vs deliver/ack rate (in vs out — are they balanced?)
✓ CONSUMER count and health → are consumers connected and processing?
✓ UNACKED messages → many unacked = slow/stuck consumers
✓ RESOURCES → memory, disk, CPU, connections, file descriptors (RabbitMQ raises memory and
free-disk-space alarms that block publishing connections once their thresholds are hit!)
✓ DEAD LETTER queue size → failed messages accumulating (signals problems)
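
The checks above can be sketched against the management plugin's HTTP API, which exposes per-queue and per-node stats as JSON at /api/queues and /api/nodes. This is a minimal sketch, not a production monitor: the thresholds, the ".dlq" naming convention for dead letter queues, and the helper names (fetch_json, queue_warnings, node_warnings) are all assumptions made for illustration. The JSON field names (messages, messages_unacknowledged, consumers, message_stats, mem_alarm, disk_free_alarm) are the ones the management API reports.

```python
import base64
import json
import urllib.request


def fetch_json(base_url, path, user, password):
    """GET a management API path (e.g. /api/queues) and decode the JSON body."""
    request = urllib.request.Request(f"{base_url}{path}")
    token = base64.b64encode(f"{user}:{password}".encode()).decode()
    request.add_header("Authorization", f"Basic {token}")
    with urllib.request.urlopen(request, timeout=10) as response:
        return json.load(response)


def rate(queue, key):
    """Read a rate such as publish_details.rate; 0.0 when stats are absent."""
    return queue.get("message_stats", {}).get(key, {}).get("rate", 0.0)


def queue_warnings(queues, max_depth=10_000, max_unacked=1_000,
                   dlq_suffix=".dlq", out_key="ack_details"):
    """Return warnings for the queue-level signals in the list above.

    Thresholds and dlq_suffix are hypothetical defaults; tune them per system.
    With auto-ack consumers there are no acks, so pass
    out_key="deliver_get_details" to compare against deliveries instead.
    """
    warnings = []
    for q in queues:
        name = q["name"]
        depth = q.get("messages", 0)
        unacked = q.get("messages_unacknowledged", 0)
        if name.endswith(dlq_suffix):
            # Any message in a dead letter queue is a failed message.
            if depth > 0:
                warnings.append(f"{name}: {depth} dead-lettered messages")
            continue
        if q.get("consumers", 0) == 0:
            warnings.append(f"{name}: no consumers connected")
        if depth > max_depth:
            warnings.append(f"{name}: depth {depth} exceeds {max_depth}")
        if unacked > max_unacked:
            warnings.append(f"{name}: {unacked} unacked messages")
        if rate(q, "publish_details") > rate(q, out_key):
            warnings.append(f"{name}: publish rate exceeds outgoing rate")
    return warnings


def node_warnings(nodes):
    """Return a warning for every node with an active resource alarm."""
    warnings = []
    for n in nodes:
        if n.get("mem_alarm"):
            warnings.append(f"{n.get('name')}: memory alarm, publishers blocked")
        if n.get("disk_free_alarm"):
            warnings.append(f"{n.get('name')}: disk alarm, publishers blocked")
    return warnings


# Example against a local broker (assumed URL and credentials):
#   base = "http://localhost:15672"
#   print(queue_warnings(fetch_json(base, "/api/queues", "guest", "guest")))
#   print(node_warnings(fetch_json(base, "/api/nodes", "guest", "guest")))
```

A single rate sample is noisy, so a real check would compare rates over a window and alert on sustained growth, not on one reading. Keeping the evaluation functions separate from the HTTP call also lets them run against saved JSON.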
