Kafka vs RabbitMQ — When to Use a Log vs a Queue
Kafka is for streaming data at scale; RabbitMQ is for reliable message routing. Pick Kafka if you need logs, RabbitMQ if you need queues.
Kafka
Kafka's log-based architecture handles massive, real-time data streams at a scale RabbitMQ's queue model was never designed for. It's the clear winner for modern data pipelines.
Log vs Queue: Different Philosophies
Kafka and RabbitMQ aren't direct competitors—they solve different problems. Kafka is a distributed log designed for streaming data at scale, where you append messages and consumers read at their own pace. RabbitMQ is a message broker built for reliable message routing between applications, with features like exchanges and queues. If you think of Kafka as a never-ending tape recorder and RabbitMQ as a postal service, you're on the right track. Most people compare them because both handle messages, but that's like comparing a cargo ship to a delivery van.
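To make the tape recorder vs postal service idea concrete, here is a minimal in-memory sketch. The `Log` and `Queue` classes are hypothetical toy models, not Kafka or RabbitMQ client code; they only illustrate that reading from a log leaves the records in place, while taking from a queue removes the message.

```python
from collections import deque


class Log:
    """Append-only log: reads never remove records; each consumer tracks its own offset."""

    def __init__(self):
        self.records = []
        self.offsets = {}  # consumer name -> next offset to read

    def append(self, record):
        self.records.append(record)
        return len(self.records) - 1  # offset of the new record

    def poll(self, consumer, max_records=10):
        start = self.offsets.get(consumer, 0)
        batch = self.records[start:start + max_records]
        self.offsets[consumer] = start + len(batch)
        return batch

    def seek(self, consumer, offset):
        self.offsets[consumer] = offset  # rewind to replay history


class Queue:
    """Work queue: a message is handed to one consumer and then it is gone."""

    def __init__(self):
        self.messages = deque()

    def publish(self, message):
        self.messages.append(message)

    def get(self):
        return self.messages.popleft() if self.messages else None


log = Log()
queue = Queue()
for event in ["signup", "login", "purchase"]:
    log.append(event)
    queue.publish(event)

analytics_view = log.poll("analytics")
billing_view = log.poll("billing")   # an independent reader sees the same records
log.seek("analytics", 0)
replayed = log.poll("analytics")     # rewinding replays the full history

first_worker = queue.get()           # "signup" goes to one worker only
second_worker = queue.get()          # the next worker gets "login"
```

Both readers of the log see all three events and can rewind; the queue hands each message out once and shrinks as it goes.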
Where Kafka Wins
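Kafka dominates in high-throughput, real-time data streaming. A well-provisioned cluster can handle millions of messages per second because topics are split into partitions that are replicated across brokers; try that with RabbitMQ and watch it choke. Log compaction keeps at least the latest value for each key, which suits changelog topics and rebuilding current state (full event sourcing, where every event must survive, needs ordinary time-based retention instead). Consumer groups split a topic's partitions across their members, so you scale reads horizontally by adding consumers, up to the partition count, while separate groups each read the whole topic independently. For use cases like clickstream analytics, IoT data ingestion, or building a data lake, Kafka is the only sane choice. RabbitMQ's best effort might hit roughly 50k messages/sec per node on a good day, and that figure swings with message size, persistence settings, and queue type; Kafka laughs at that limit.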
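The three mechanisms named above (partitioning, consumer groups, log compaction) can be sketched in a few lines. This is an illustration under stated assumptions, not Kafka's implementation: `partition_for`, `assign`, and `compact` are hypothetical helpers, CRC32 stands in for the murmur2 hash Kafka's default partitioner uses, and the round-robin assignment is only one of several strategies a real group coordinator can apply.

```python
import zlib


def partition_for(key: str, num_partitions: int) -> int:
    # Stand-in hash: Kafka's default partitioner uses murmur2, not CRC32.
    # The point is only that the same key always lands on the same partition.
    return zlib.crc32(key.encode("utf-8")) % num_partitions


def assign(num_partitions: int, members: list[str]) -> dict[str, list[int]]:
    """Spread partitions round-robin across the members of one consumer group."""
    assignment = {member: [] for member in members}
    for partition in range(num_partitions):
        assignment[members[partition % len(members)]].append(partition)
    return assignment


def compact(records: list[tuple[str, str]]) -> list[tuple[str, str]]:
    """Keep only the newest value per key, preserving the order of surviving records."""
    last_index = {key: i for i, (key, _) in enumerate(records)}
    return [rec for i, rec in enumerate(records) if last_index[rec[0]] == i]


stable = partition_for("user-42", 6) == partition_for("user-42", 6)
group = assign(6, ["c1", "c2", "c3"])
compacted = compact([("u1", "a"), ("u2", "b"), ("u1", "c")])
```

With six partitions and three consumers, each consumer owns two partitions; a fourth to sixth consumer would still get work, but a seventh would sit idle, which is why the partition count caps read parallelism within a group.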
Where RabbitMQ Holds Its Own
RabbitMQ excels at complex routing and guaranteed delivery. Its exchange types (direct, topic, fanout, headers) let you route messages by routing key, wildcard pattern, or header values, something Kafka's simpler topics can't do natively. For request-reply patterns or work queues where each message should be handed to exactly one worker, RabbitMQ's acknowledgments and dead-letter queues are more straightforward. Note that this is at-least-once delivery: an unacknowledged message is redelivered, so handlers should tolerate duplicates. It's also easier to set up: spin up a Docker container and you're done, while Kafka requires ZooKeeper (or KRaft mode) and more tuning. If you're building a microservices app that needs reliable task distribution, RabbitMQ is still the go-to.
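As a rough illustration of pattern-based routing, the sketch below reimplements the matching rule of a topic exchange in plain Python: routing keys are dot-separated words, `*` matches exactly one word, and `#` matches zero or more words. The `topic_matches` and `route` functions and the `bindings` table are hypothetical names for this example; a real broker does this matching for you when you bind queues to a topic exchange.

```python
def topic_matches(pattern: str, routing_key: str) -> bool:
    """AMQP-style topic match: '*' is exactly one word, '#' is zero or more words."""

    def match(p, k):
        if not p:
            return not k  # pattern exhausted: match only if the key is too
        head, rest = p[0], p[1:]
        if head == "#":
            # '#' may absorb anywhere from zero words to all remaining words
            return any(match(rest, k[i:]) for i in range(len(k) + 1))
        if not k:
            return False
        return (head == "*" or head == k[0]) and match(rest, k[1:])

    return match(pattern.split("."), routing_key.split("."))


# Hypothetical bindings: queue name -> binding patterns on one topic exchange
bindings = {
    "billing": ["order.*.paid"],
    "audit": ["order.#"],
    "eu-alerts": ["*.eu.failed"],
}


def route(routing_key: str) -> list[str]:
    """Return the queues that would receive a message with this routing key."""
    return sorted(
        queue
        for queue, patterns in bindings.items()
        if any(topic_matches(p, routing_key) for p in patterns)
    )


paid = route("order.eu.paid")
failed = route("order.eu.failed")
bare = route("order")
```

Getting the same fan-out on Kafka usually means one topic per category or filtering on the consumer side, which is the trade-off the paragraph above describes.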
The Gotcha: Switching Costs Are Brutal
Moving from RabbitMQ to Kafka isn't an upgrade; it's a rewrite. Kafka's log model means you can't just swap brokers: you have to redesign producers and consumers around partitioning, offset management, and the lack of built-in routing. RabbitMQ users will miss per-message TTLs and priority queues, which Kafka doesn't support natively. Kafka has its own price, too: disk-based storage lets you retain data for days or weeks, but that retention shows up as higher infrastructure costs. If you pick the wrong tool, you'll spend months untangling it. Most teams that regret their choice picked on hype instead of their actual data pattern.
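One example of the redesign work: with no per-message TTL in Kafka, expiry typically moves into the consumer. The sketch below assumes the producer attaches a `ttl_ms` value to each record (a hypothetical convention, not a Kafka feature) and that the consumer compares it against the record timestamp before processing. `Record` and `drop_expired` are illustrative names only.

```python
from dataclasses import dataclass


@dataclass
class Record:
    key: str
    value: str
    timestamp_ms: int  # when the record was produced
    ttl_ms: int        # hypothetical per-message TTL set by the producer


def drop_expired(records, now_ms):
    """Consumer-side TTL: skip records that outlived their own ttl_ms."""
    return [r for r in records if now_ms - r.timestamp_ms <= r.ttl_ms]


batch = [
    Record("a", "fresh", timestamp_ms=9_000, ttl_ms=5_000),
    Record("b", "stale", timestamp_ms=1_000, ttl_ms=5_000),
]
live = drop_expired(batch, now_ms=10_000)
live_values = [r.value for r in live]
```

The expired record still sits in the log until topic retention removes it; the consumer only chooses to ignore it, which is a different guarantee from a broker discarding the message.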
If You're Starting Today...
Ask one question: Do you need a log or a queue? For streaming data—like user activity logs, sensor data, or real-time analytics—pick Kafka. Use Confluent Cloud (from $1.50/hour) or self-host with Apache Kafka (free, but painful). For task queues, RPC, or microservices messaging, pick RabbitMQ. Use CloudAMQP (from $20/month) or self-host. Don't try to force Kafka into a queue role with hacks like single-partition topics; you'll hate life. And if you're small-scale, consider Redis for simple queues—it's cheaper and faster for low-volume cases.
What Most Comparisons Get Wrong
Everyone obsesses over throughput numbers, but the real difference is data retention. Kafka keeps messages on disk for as long as you configure (hours to years), whether or not anyone has read them, so it can act as a source of truth. A RabbitMQ queue deletes a message once a consumer acknowledges it; durable queues do write persistent messages to disk, but no history is left to replay. This makes Kafka better for replayability and auditing, but usually a worse fit when per-message latency matters most. Also, Kafka's exactly-once semantics are overhyped: they add complexity and aren't needed for most apps. RabbitMQ's at-least-once delivery, paired with consumers that tolerate duplicates, is simpler and sufficient for 90% of use cases. Stop benchmarking and think about your data lifecycle.
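At-least-once delivery means a message can arrive twice, for instance when an acknowledgment is lost and the broker redelivers. A common way to live with that is an idempotent consumer that remembers which message IDs it has already applied. The sketch below is a minimal in-memory version; `IdempotentHandler` and the message IDs are hypothetical, and a production system would keep the seen-ID set in durable storage.

```python
class IdempotentHandler:
    """Applies each message at most once by tracking processed message IDs."""

    def __init__(self):
        self.seen = set()
        self.balance = 0

    def handle(self, message_id, amount):
        if message_id in self.seen:
            return False  # duplicate delivery: acknowledge and skip
        self.balance += amount
        self.seen.add(message_id)
        return True


# "m1" is delivered twice, as if its first acknowledgment had been lost
deliveries = [("m1", 10), ("m2", 5), ("m1", 10)]

handler = IdempotentHandler()
applied = [handler.handle(message_id, amount) for message_id, amount in deliveries]
```

The balance ends at 15 rather than 25, which is the effect most applications actually want from "exactly once" without needing broker-level transactions.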
Quick Comparison
| Factor | Kafka | RabbitMQ |
|---|---|---|
| Core Architecture | Distributed log with partitioning | Message broker with exchanges/queues |
| Max Throughput | Millions of messages/sec per cluster | ~50k messages/sec per node (workload-dependent) |
| Data Retention | Disk-based, configurable (hours to years) | Until acknowledged; durable queues persist to disk, but consumed messages are deleted |
| Message Routing | Simple topics, no built-in routing | Complex routing via exchange types |
| Ease of Setup | Requires ZooKeeper/KRaft, more configuration | Single Docker container, minimal config |
| Pricing (Managed) | Confluent Cloud from $1.50/hour | CloudAMQP from $20/month |
| Exactly-Once Semantics | Supported with transactions | Not supported natively |
| Use Case Sweet Spot | Real-time data streaming, event sourcing | Task queues, microservices messaging |
The Verdict
Use Kafka if: You're handling high-volume data streams (like logs or IoT data) and need replayability.
Use RabbitMQ if: You're building a microservices app with complex routing or need simple task queues.
Consider: Redis for low-volume, in-memory queues—it's faster and cheaper if you don't need advanced features.
Disagree? nice@nicepick.dev