Context
Starting with Apache Kafka 2.4, the default partitioning strategy for messages without a key changed from round-robin to sticky partitioning. This change was introduced to improve batching efficiency and throughput, but it can have unintended consequences for certain workloads—especially batch job orchestration.
Partitioning Strategies in Kafka
| Strategy | Description |
|---|---|
| Key-Based | Messages with the same key go to the same partition (preserves order). |
| Round-Robin | Rotates messages evenly across partitions (default for keyless messages before 2.4). |
| Sticky | Sends keyless messages to the same partition until the current batch is full or `linger.ms` expires, then switches to another partition (default since 2.4). |
| Custom | Developers define their own logic for partition assignment. |
Risk of Sticky Partitioning for Batch Jobs
Sticky partitioning can lead to uneven load distribution in scenarios like:
- Start-of-day batch jobs where a burst of messages is produced.
- Long-running tasks (e.g., each message takes ~1–2 minutes to process).
- Multiple consumers expecting parallelism.
Example Scenario
- Topic has 4 partitions and 4 consumers.
- At start of day, 100 messages are produced.
- Each message takes ~1–2 minutes to process.
With sticky partitioning:
- The Kafka producer groups messages into per-partition batches before sending them.
- If the majority of messages are produced quickly, they may all be included in the same batch.
- That batch is sent to a single partition, meaning most messages are handled by one consumer.
- Other consumers remain idle, causing delays and underutilization of resources.
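
The cost of that skew can be estimated with a little arithmetic. The sketch below is a back-of-the-envelope model, not Kafka client code: it assumes the worst case where the entire burst fits in one batch, one consumer per partition, sequential processing, and a processing time of 1.5 minutes per message (the midpoint of the range above). The class and method names are made up for illustration.

```java
import java.util.Arrays;

// Back-of-the-envelope model of the scenario above; not Kafka client code.
class StickySkewEstimate {
    // Minutes until the busiest consumer finishes, assuming one consumer per
    // partition and messages processed one at a time.
    static double drainMinutes(int[] messagesPerPartition, double minutesPerMessage) {
        int busiest = Arrays.stream(messagesPerPartition).max().orElse(0);
        return busiest * minutesPerMessage;
    }

    // Assumed worst case for sticky partitioning: the whole burst fits in one
    // batch, so every message lands on a single partition.
    static int[] stickyWorstCase(int messages, int partitions) {
        int[] counts = new int[partitions];
        counts[0] = messages;
        return counts;
    }

    // Round-robin: message i goes to partition i mod partitions.
    static int[] roundRobin(int messages, int partitions) {
        int[] counts = new int[partitions];
        for (int i = 0; i < messages; i++) {
            counts[i % partitions]++;
        }
        return counts;
    }

    public static void main(String[] args) {
        double minutesPerMessage = 1.5; // assumed midpoint of the 1-2 minute range
        int[] sticky = stickyWorstCase(100, 4);
        int[] rr = roundRobin(100, 4);
        System.out.println("sticky      " + Arrays.toString(sticky)
                + " -> " + drainMinutes(sticky, minutesPerMessage) + " min");
        System.out.println("round-robin " + Arrays.toString(rr)
                + " -> " + drainMinutes(rr, minutesPerMessage) + " min");
    }
}
```

Under these assumptions the single loaded consumer needs about 150 minutes to drain the burst, while an even spread of 25 messages per partition finishes in about 37.5 minutes.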
Factors That Influence Sticky Partitioning Behavior
| Factor | Description | Impact |
|---|---|---|
| `linger.ms` | Time to wait before sending a batch | Higher values allow more messages to accumulate in a batch, increasing the chance they go to the same partition |
| `batch.size` | Max size of a batch in bytes | Larger batch sizes delay partition switching |
| Message Generation Pattern | Whether messages are bursty or steady | Bursty patterns (e.g., start-of-day jobs) are more likely to overload a single partition |
| Kafka Producer Version | Version 2.4+ uses sticky by default | Older versions use round-robin |
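
As a rough illustration of the two producer settings in the table, the sketch below builds a configuration that keeps batches small and short-lived. The values are assumptions chosen for illustration, not recommendations; the class name `BurstProducerSettings` and the broker address are placeholders. Only `java.util.Properties` is used, so the snippet runs without the Kafka client on the classpath; the resulting object could then be passed to a `KafkaProducer` constructor.

```java
import java.util.Properties;

class BurstProducerSettings {
    // Illustrative values only; tune against your own message sizes and burst pattern.
    static Properties build(String bootstrapServers) {
        Properties props = new Properties();
        props.put("bootstrap.servers", bootstrapServers);
        props.put("key.serializer",
                "org.apache.kafka.common.serialization.StringSerializer");
        props.put("value.serializer",
                "org.apache.kafka.common.serialization.StringSerializer");
        // Do not hold a batch back to wait for additional records.
        props.put("linger.ms", "0");
        // Smaller than the 16384-byte default, so a batch fills (and the sticky
        // partitioner moves on) after fewer records.
        props.put("batch.size", "4096");
        return props;
    }

    public static void main(String[] args) {
        // "localhost:9092" is a placeholder broker address.
        build("localhost:9092").forEach((k, v) -> System.out.println(k + "=" + v));
    }
}
```

Smaller batches trade throughput for a more even spread, so these settings tend to suit low-volume, long-running jobs better than high-volume streams.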
Simple Mitigation Strategies
To avoid uneven load distribution caused by sticky partitioning:
- Add a key to your messages: keyed messages are assigned by key hash, so they spread across partitions as long as the keys are sufficiently varied.
- Switch back to round-robin partitioning: reverts to the pre-2.4 behavior where messages are evenly rotated across partitions.
- Use a custom partitioning strategy (if needed): gives you full control over how messages are assigned to partitions.
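
Two of these options can be sketched in a few lines. The first method sets `partitioner.class` to the `RoundRobinPartitioner` that ships with the Java client. The second is a hypothetical `RotatingPartitionChooser` (not part of Kafka) that shows the kind of counter logic a custom strategy might use. The enclosing class name and the assumption that the partition count is known up front are illustrative only.

```java
import java.util.Properties;
import java.util.concurrent.atomic.AtomicInteger;

class EvenSpreadOptions {
    // Option 1: copy an existing producer config and select the round-robin
    // partitioner bundled with the Java client.
    static Properties withRoundRobin(Properties base) {
        Properties props = new Properties();
        props.putAll(base);
        props.put("partitioner.class",
                "org.apache.kafka.clients.producer.RoundRobinPartitioner");
        return props;
    }

    // Option 2 (hypothetical helper): rotate through partitions with a
    // thread-safe counter and pick the partition explicitly per record.
    static final class RotatingPartitionChooser {
        private final AtomicInteger counter = new AtomicInteger();
        private final int numPartitions;

        RotatingPartitionChooser(int numPartitions) {
            if (numPartitions <= 0) {
                throw new IllegalArgumentException("numPartitions must be positive");
            }
            this.numPartitions = numPartitions;
        }

        int next() {
            // floorMod keeps the result non-negative even if the counter overflows.
            return Math.floorMod(counter.getAndIncrement(), numPartitions);
        }
    }

    public static void main(String[] args) {
        RotatingPartitionChooser chooser = new RotatingPartitionChooser(4);
        for (int i = 0; i < 8; i++) {
            System.out.print(chooser.next() + " "); // cycles 0 1 2 3 0 1 2 3
        }
        System.out.println();
    }
}
```

The value returned by `next()` could be passed as the partition argument of `new ProducerRecord<>(topic, partition, key, value)`, or the same logic could live inside an implementation of the `Partitioner` interface. Whichever option is used, it is worth checking the resulting per-partition message counts on your client version before relying on it.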