Kleppmann's treatment of message streams, logs, and stream processing is the deepest available. Start with the first half (message brokers). Also: ByteByteGo's "Message Queue" video for visual intuition.
Synchronous architectures are brittle. If your checkout service calls the email service directly, a slow email server makes checkout slow. A crashed email service can crash checkout. Message queues decouple services in time and space — producers enqueue work and move on; consumers process it independently.
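A minimal sketch of that decoupling, using Python's in-process `queue.Queue` as a stand-in for a real broker. The names `checkout`, `email_worker`, and the 50 ms delay are illustrative assumptions, not part of any particular system:

```python
import queue
import threading
import time

email_queue = queue.Queue()   # stand-in for a real broker such as SQS
sent = []                     # emails the worker has "delivered"

def email_worker():
    """Consumer: drains the queue at its own pace."""
    while True:
        address = email_queue.get()
        if address is None:       # sentinel value: shut down
            break
        time.sleep(0.05)          # simulate a slow email server
        sent.append(address)

def checkout(address):
    """Producer: enqueue the email job and return immediately."""
    email_queue.put(address)
    return "order confirmed"

worker = threading.Thread(target=email_worker)
worker.start()

# Checkout returns right away even though each email takes 50 ms to send.
results = [checkout(f"user{i}@example.com") for i in range(5)]

email_queue.put(None)   # ask the worker to stop once the backlog is drained
worker.join()
```

Checkout latency here no longer depends on the email server. If the worker crashed, the jobs would wait in the queue rather than failing the order (a real broker would also persist them, which this in-memory sketch does not).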
One producer → Queue → One consumer
Each message is delivered to exactly one consumer.
Use: task distribution, work queues
Example: "Process this image"
Tools: SQS, RabbitMQ (default)
Publisher → Topic → Many subscribers
Each subscriber gets a copy of every message.
Use: event broadcasting, fanout
Example: "User signed up" → email, analytics, CRM
Tools: Kafka, Google Pub/Sub, SNS
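The difference between the two models can be sketched in a few lines. The `WorkQueue` and `Topic` classes below are hypothetical in-memory toys, not the API of any of the tools listed:

```python
from collections import deque

class WorkQueue:
    """Point-to-point: each message goes to exactly one consumer."""
    def __init__(self):
        self._messages = deque()

    def send(self, message):
        self._messages.append(message)

    def receive(self):
        # Taking a message removes it, so no other consumer sees it.
        return self._messages.popleft() if self._messages else None

class Topic:
    """Publish/subscribe: every subscriber gets its own copy."""
    def __init__(self):
        self._subscribers = {}

    def subscribe(self, name):
        self._subscribers[name] = []
        return self._subscribers[name]

    def publish(self, message):
        for inbox in self._subscribers.values():
            inbox.append(message)

# Two workers compete for jobs: each job is handled once.
jobs = WorkQueue()
jobs.send("resize image 1")
jobs.send("resize image 2")
worker_a = jobs.receive()
worker_b = jobs.receive()

# Two subscribers each see the same event.
signups = Topic()
email_inbox = signups.subscribe("email")
analytics_inbox = signups.subscribe("analytics")
signups.publish("user signed up: alice")
```

After this runs, `worker_a` and `worker_b` hold different jobs, while `email_inbox` and `analytics_inbox` both contain the signup event.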
| Feature | Kafka | RabbitMQ / SQS |
|---|---|---|
| Model | Distributed log — messages persisted and replayed | Queue — messages deleted after consumption |
| Consumer groups | Multiple groups each get all messages independently | Competing consumers share the queue |
| Ordering | Strict ordering within a partition | FIFO within a queue (SQS FIFO queues, RabbitMQ); standard SQS queues are only best-effort ordered |
| Throughput | Millions of messages/second | Thousands to low millions |
| Replay | Yes — reprocess from any offset | No — consumed messages are gone |
| Complexity | High — needs cluster management, ZooKeeper/KRaft | Low — managed services available |
| Use when | Event streaming, audit logs, data pipelines | Task queues, simple async processing |
Use Kafka when you need replay, high throughput, or fan-out to multiple consumer groups. Use SQS/RabbitMQ for simpler task distribution. Don't over-engineer — SQS works for most things.
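To make the log model concrete, here is a toy single-partition log in which reads never delete records and each consumer group tracks its own offset. The `Log` class and its `append`/`poll`/`seek` methods are illustrative assumptions that only loosely mirror Kafka concepts; offsets are advanced automatically on every poll, which is a simplification:

```python
class Log:
    """Append-only log: reading never deletes; each group has its own offset."""
    def __init__(self):
        self._records = []
        self._offsets = {}   # consumer group -> next offset to read

    def append(self, record):
        self._records.append(record)
        return len(self._records) - 1   # offset of the new record

    def poll(self, group, max_records=10):
        start = self._offsets.get(group, 0)
        batch = self._records[start:start + max_records]
        self._offsets[group] = start + len(batch)
        return batch

    def seek(self, group, offset):
        # Rewind (or skip ahead) so the group re-reads from `offset`.
        self._offsets[group] = offset

log = Log()
for event in ["order-1", "order-2", "order-3"]:
    log.append(event)

billing_first = log.poll("billing")       # billing reads all three records
analytics_first = log.poll("analytics")   # analytics independently reads all three
log.seek("billing", 1)                    # replay from offset 1
billing_replay = log.poll("billing")
```

Both groups receive every record, and the billing group can rewind and reprocess from offset 1. In a delete-on-consume queue the second group would find nothing left to read.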
| Guarantee | Meaning | Trade-off |
|---|---|---|
| At-most-once | Message delivered 0 or 1 times. May be lost. | Fastest. OK for metrics, logs where loss is acceptable. |
| At-least-once | Message delivered 1 or more times. May be duplicated. | Most common. Consumer must be idempotent. |
| Exactly-once | Each message takes effect exactly once. | Most expensive. Needs transactions or deduplication on top of at-least-once delivery. Kafka supports it (idempotent producers plus transactions) with noticeable overhead. |
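In practice, the first two guarantees often come down to when the consumer acknowledges a message relative to processing it. The simulation below is a simplified sketch under that assumption: the `run` function, its single simulated crash, and the in-memory `pending` list are all hypothetical, not a real broker protocol:

```python
def run(messages, ack_first, crash_on):
    """Simulate one consumer that crashes once while handling `crash_on`."""
    pending = list(messages)   # unacknowledged messages held by the broker
    processed = []             # side effects the consumer actually performed
    crashed = False
    while pending:
        msg = pending[0]
        if ack_first:
            pending.pop(0)                       # ack, then process
            if msg == crash_on and not crashed:
                crashed = True                   # crash before processing: lost
                continue
            processed.append(msg)
        else:
            processed.append(msg)                # process, then ack
            if msg == crash_on and not crashed:
                crashed = True                   # crash before ack: redelivered
                continue
            pending.pop(0)
    return processed

at_most_once = run(["a", "b", "c"], ack_first=True, crash_on="b")
at_least_once = run(["a", "b", "c"], ack_first=False, crash_on="b")
```

Acknowledging first loses `"b"` (`at_most_once` is `["a", "c"]`); acknowledging last processes it twice (`at_least_once` is `["a", "b", "b", "c"]`).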
With at-least-once delivery (the default in most brokers), your consumer must handle duplicate messages safely. Charge a credit card twice? That's a serious bug. Use a unique message ID to deduplicate: `if already_processed(message_id): return`.
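A slightly fuller sketch of that deduplication check. The message shape (`id`, `customer`, `amount_cents`), the `handle_payment` function, and the in-memory `processed_ids` set are assumptions for illustration; a real consumer would keep the IDs in a durable store:

```python
processed_ids = set()   # assumption: in production this is a durable store
charges = []            # stand-in for calls to a payment provider

def already_processed(message_id):
    return message_id in processed_ids

def handle_payment(message):
    """Charge at most once per message ID, even if the broker redelivers."""
    message_id = message["id"]
    if already_processed(message_id):
        return False    # duplicate delivery: skip the side effect
    charges.append((message["customer"], message["amount_cents"]))
    processed_ids.add(message_id)
    return True

msg = {"id": "msg-123", "customer": "alice", "amount_cents": 4999}
first = handle_payment(msg)    # performs the charge
second = handle_payment(msg)   # same message redelivered: ignored
```

One caveat: if the consumer crashes between the charge and recording the ID, the duplicate can slip through on redelivery. Systems that need stronger protection usually record the ID and the side effect in the same database transaction, or pass the message ID to the downstream service as an idempotency key.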
If consumers are slower than producers, the queue grows without bound until memory or disk runs out. This is a backpressure problem. Mitigations: