404: Expertise Not Found

Kafka Isn’t a Task Queue – Stop Making It Pretend to Be One

Lessons from the Kitchen

Back in the mid-90s, I worked in a busy hotel kitchen. Orders flew in from the restaurant, and Freddie (the head chef) had a smart system: whoever was free grabbed the next dish. Simple, efficient, and no one drowned in pasta.

Now imagine if they’d done it the ‘organized’ way: Chef A gets all the pasta, Chef B all the steaks, Chef C all the salads. Sounds OK until 90% of the guests want pasta and Chef A is sweating bullets while the others are scrolling Instagram.

That’s what can happen when you try to make Kafka behave like a task queue. It’s not built to spread the love evenly; it’s built for ordering and consistency. If you expect perfect load balancing, you’ll end up with Chef A in meltdown mode and Chef B Googling “how to look busy in a kitchen.”


Why People Expect Kafka to Balance Work

Kafka has “consumers,” and consumers sound like workers. So naturally, people assume Kafka will distribute tasks evenly across them. Spoiler alert: it doesn’t, at least not in the way most people think.

Kafka does provide partitioning strategies to spread events, tasks, or work (whatever your tipple is) across partitions:

  • Default partitioner: Hashes the message key, so every message with the same key lands in the same partition, preserving per-key ordering.
  • RoundRobin partitioner: Distributes messages evenly across partitions, regardless of key.

So far, so good. But here’s the catch: consumers are assigned partitions, not individual messages. This means even if partitions receive the same number of messages, the actual processing time can vary wildly.
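As a toy illustration, partition selection boils down to something like the sketch below. This is plain Python, not the actual client code: real Kafka producers use murmur2 hashing rather than CRC32, and `NUM_PARTITIONS`, `default_partition`, and `round_robin_partition` are made-up names.

```python
import zlib
from itertools import count

NUM_PARTITIONS = 3  # toy topic with partitions P0..P2

def default_partition(key: bytes) -> int:
    """Keyed messages: hash the key, so the same key always
    lands in the same partition (per-key ordering)."""
    return zlib.crc32(key) % NUM_PARTITIONS

_counter = count()

def round_robin_partition() -> int:
    """Unkeyed round-robin: cycle through partitions evenly."""
    return next(_counter) % NUM_PARTITIONS

# Same key -> same partition, every time.
assert default_partition(b"order-42") == default_partition(b"order-42")

# Round robin walks P0, P1, P2, P0, ...
print([round_robin_partition() for _ in range(6)])  # [0, 1, 2, 0, 1, 2]
```

Either way, the partitioner only decides where a message *lands*. It knows nothing about how expensive the message will be to process, which is exactly where the trouble starts.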


Example

Imagine:

  • 3 partitions: P0, P1, P2
  • 3 consumers: C0, C1, C2
  • RoundRobin distributes 10K messages to each partition.

Sounds balanced, right? But what if the messages in P0 involve heavy computation (e.g., complex transformations), while P1 and P2 handle lightweight tasks?

Result:

  • C0 (handling P0) is overloaded and runs for hours.
  • C1 and C2 finish quickly and sit idle.
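A quick back-of-the-envelope calculation makes the skew concrete. The per-message costs here (50 ms vs 1 ms) are purely illustrative numbers, not measurements:

```python
# Same message count per partition, very different per-message cost.
messages_per_partition = 10_000
cost_ms = {"P0": 50, "P1": 1, "P2": 1}  # heavy vs lightweight tasks

# Each consumer owns exactly one partition, so its runtime is fixed
# by that partition's workload -- no one can help.
runtime_s = {p: messages_per_partition * ms / 1000 for p, ms in cost_ms.items()}
print(runtime_s)  # {'P0': 500.0, 'P1': 10.0, 'P2': 10.0}
```

Equal message counts, yet C0 grinds away for over eight minutes while C1 and C2 are done in ten seconds.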

Kafka doesn’t dynamically redistribute partitions based on processing time, because its priority is ordering and consistency, not fairness or throughput optimization. If you need that kind of flexibility, you’d have to introduce patterns like work stealing, where idle workers grab tasks from overloaded ones, but Kafka doesn’t support this.
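For the curious, work stealing looks roughly like this in miniature. This is a single-threaded toy, not anything Kafka offers; the queues and the `next_task` helper are invented for illustration:

```python
from collections import deque

# Hypothetical per-worker queues after a static assignment:
# worker 0 got all the heavy pasta orders.
queues = [deque(f"pasta-{i}" for i in range(6)), deque(), deque()]

def next_task(worker: int):
    """Work stealing: take from your own queue first; if it's empty,
    steal from the back of the fullest queue."""
    if queues[worker]:
        return queues[worker].popleft()
    victim = max(range(len(queues)), key=lambda w: len(queues[w]))
    return queues[victim].pop() if queues[victim] else None

print(next_task(1))  # idle worker 1 steals a pasta order -> 'pasta-5'
```

With Kafka, worker 1 would simply sit idle: its partition is empty, and worker 0’s backlog is off limits.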


The Load Balancing Problem

Even with perfect message distribution, Kafka can’t guarantee balanced work across consumers because:

  • Partition assignment is static.
  • Processing complexity varies.
  • Kafka doesn’t do dynamic load balancing.
  • Fine-grained work stealing is not supported.

This is why Kafka is not a task queue. Rather, it’s a distributed log designed for durability and replayability (is there such a word?).


Why This Isn’t a Bug … It’s a Feature

Kafka was designed for high-throughput event streaming, durability, and replayability. It is not designed for fair task scheduling. It’s brilliant at what it does, but “evenly spreading tasks” isn’t on the menu.


Better Patterns for Task Distribution

If you need true load balancing for tasks:

  • Use a dedicated task queue (RabbitMQ, SQS, Celery) with competing consumers, where workers pull tasks from a shared queue as they become available.

Kafka can still play an important role; it’s excellent for ingesting and organizing event data into your system. But if you need work spread more evenly across workers, consider introducing a task queue with competing consumers to handle that part.
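The competing-consumers pattern is what brokers like RabbitMQ or SQS give you out of the box. Here it is sketched with stdlib primitives only, no real broker client involved:

```python
import queue
import threading

# One shared queue; workers pull the next task whenever they're free.
tasks = queue.Queue()
for i in range(9):
    tasks.put(f"task-{i}")

done = []
lock = threading.Lock()

def worker():
    while True:
        try:
            task = tasks.get_nowait()  # grab the next available task
        except queue.Empty:
            return  # nothing left to do
        with lock:
            done.append(task)  # "process" the task

threads = [threading.Thread(target=worker) for _ in range(3)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(len(done))  # 9 -- every task processed by whichever worker was free
```

Notice there’s no partition assignment at all: a slow task delays one worker, not a whole partition’s worth of messages behind it.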


When Kafka Is Still the Right Choice

Kafka shines when you need:

  • Event streaming at scale.
  • Replayability for audit or recovery.
  • Strong ordering guarantees within partitions.
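That “distributed log” nature is worth seeing in miniature. The sketch below is a toy in-memory model (nothing like the real broker internals), but it captures why replay is cheap: consuming is just reading from an offset, and nothing is ever removed.

```python
# A partition is an append-only log; consumers just track an offset.
log = []

def produce(event):
    log.append(event)
    return len(log) - 1  # the event's offset

def consume(offset):
    """Read everything from `offset` onward; the log itself is untouched."""
    return log[offset:]

for e in ["created", "paid", "shipped"]:
    produce(e)

print(consume(0))  # full replay: ['created', 'paid', 'shipped']
print(consume(2))  # resume from offset 2: ['shipped']
```

A task queue deletes a task once it’s acknowledged; a log keeps it, which is exactly what makes audit and recovery replays possible.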

Closing Thought

If you want fairness, you need chefs who can grab the next dish when they’re free, not chefs locked into cooking only pasta or only steaks. Kafka has many strengths; it’s fantastic at organizing and delivering events reliably. But it won’t step in when one chef is drowning in complex orders while others stand idle.