How to guarantee ordering in a Kafka topic with multiple producers

Giulliano Bueno
Jan 17, 2023


Kafka is a distributed streaming platform built for real-time processing of large volumes of data. One of its key features is the ability to handle many producers and consumers at once, which makes it well suited to event-driven architectures and microservices. However, Kafka only guarantees ordering within a single partition, so when several producers write to the same topic it can be challenging to guarantee the order of messages.

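One way to ensure ordering with multiple producers is to give every message a key and have all producers use the same key for messages that must be processed in order, for example the ID of the entity they describe. Kafka routes all messages with the same key to the same partition (as long as every producer uses the same partitioner and the partition count does not change), so they are stored and consumed in the order in which that partition received them. Alternatively, setting the partition count to 1 gives a single order for the whole topic, at the cost of parallelism, since only one consumer per consumer group can read from a partition. In both cases the order is the order in which messages arrive at the broker, not the order in which the producers created them.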

Another common scenario is when creates and updates for the same object are sent by different producers and the order of those events matters. In this case, you can use the ID of the object as the message key and attach a timestamp (or a version number) to each message. For example, for a user object, you could use the user ID as the key and treat the timestamp as a kind of subkey. All events for that user then land in the same partition, and the consumer can use the timestamp to work out the correct order, provided the producers’ clocks are reasonably well synchronized.

In Java, you set the key when creating a ProducerRecord. Kafka has no built-in notion of a subkey, so the timestamp has to travel with the message some other way: as the record timestamp (ProducerRecord accepts one in its constructor), as a header, or as a field in the value.
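As a rough illustration of the header approach, the sketch below builds a record keyed by user ID and copies the event time into a header. It assumes the kafka-clients library is on the classpath; the topic name `user-events`, the header name `event-time-ms`, and the class `UserEventRecords` are hypothetical names chosen for this example.

```java
import java.nio.charset.StandardCharsets;
import org.apache.kafka.clients.producer.ProducerRecord;

final class UserEventRecords {
    // Hypothetical names, not taken from the article.
    static final String TOPIC = "user-events";
    static final String EVENT_TIME_HEADER = "event-time-ms";

    static ProducerRecord<String, String> build(String userId, String payload, long eventTimeMs) {
        // Arguments: topic, partition (null lets the partitioner choose from the key),
        // record timestamp, key, value.
        ProducerRecord<String, String> record =
                new ProducerRecord<>(TOPIC, null, eventTimeMs, userId, payload);
        // Carry the event time as a header too, so consumers can read it
        // without deserializing the value.
        record.headers().add(EVENT_TIME_HEADER,
                Long.toString(eventTimeMs).getBytes(StandardCharsets.UTF_8));
        return record;
    }
}
```

The resulting record can be passed to `KafkaProducer.send` as usual. The key decides the partition; the header and timestamp are only metadata, and Kafka does not sort by either of them.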

It’s important to note that setting a key, with or without a timestamp acting as a subkey, does not make Kafka sort anything. The producer’s partitioner uses the key only to decide which partition a message is written to; by default it hashes the key, so equal keys map to the same partition. If no key is provided, the default partitioner spreads messages across partitions (round-robin in older clients, sticky batching since Kafka 2.4). Once messages are in a partition, they are stored in the order in which they were received.
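To make the distinction between routing and sorting concrete, here is a small in-memory model using only the Java standard library. It is a sketch, not Kafka's code: `PartitionModel` is a hypothetical class, and `Arrays.hashCode` stands in for the murmur2 hash that the real default partitioner applies to the serialized key.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class PartitionModel {
    // Each partition is modeled as an append-only list.
    private final List<List<String>> partitions = new ArrayList<>();

    PartitionModel(int numPartitions) {
        for (int i = 0; i < numPartitions; i++) {
            partitions.add(new ArrayList<>());
        }
    }

    // Stand-in for the partitioner: hash the key bytes, force the result
    // non-negative, and take it modulo the partition count.
    int partitionFor(String key) {
        int hash = Arrays.hashCode(key.getBytes(StandardCharsets.UTF_8));
        return (hash & 0x7fffffff) % partitions.size();
    }

    // Appends in arrival order and returns the chosen partition.
    int append(String key, String value) {
        int partition = partitionFor(key);
        partitions.get(partition).add(key + ":" + value);
        return partition;
    }

    List<String> log(int partition) {
        return partitions.get(partition);
    }
}
```

In this model, appending `updated` and then `created` for the key `user-42` puts both in the same partition, and the log keeps them in exactly that arrival order. Nothing reorders them after the fact.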

To process messages in key or timestamp order rather than arrival order, you need an additional sorting step in your consumer. This can be done by buffering messages in a data structure such as a TreeMap or PriorityQueue and releasing them in sorted order once they are old enough that no earlier message is expected. This adds latency, and a message that arrives later than the buffer allows will still be processed out of order. Alternatively, you could use a stream processing framework such as Kafka Streams, which processes each partition in the order it is received and provides building blocks, such as state stores and windowing, on which to implement the reordering.
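A consumer-side sorting step could look roughly like the following sketch, which uses a PriorityQueue and only the Java standard library. The `Event` shape, the `ReorderBuffer` class, and the `maxDelayMs` parameter are assumptions made for this example; in a real consumer the events would come from the records returned by `poll`.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.PriorityQueue;

final class ReorderBuffer {
    // Hypothetical event shape: key, event timestamp, payload.
    static final class Event {
        final String key;
        final long timestampMs;
        final String payload;

        Event(String key, long timestampMs, String payload) {
            this.key = key;
            this.timestampMs = timestampMs;
            this.payload = payload;
        }
    }

    // Min-heap ordered by event timestamp.
    private final PriorityQueue<Event> heap =
            new PriorityQueue<>(Comparator.comparingLong((Event e) -> e.timestampMs));
    private final long maxDelayMs;
    private long maxSeenMs = Long.MIN_VALUE;

    ReorderBuffer(long maxDelayMs) {
        this.maxDelayMs = maxDelayMs;
    }

    // Buffers the event, then returns every buffered event that is at least
    // maxDelayMs older than the newest timestamp seen, in timestamp order.
    List<Event> offer(Event event) {
        heap.add(event);
        maxSeenMs = Math.max(maxSeenMs, event.timestampMs);
        List<Event> ready = new ArrayList<>();
        while (!heap.isEmpty() && heap.peek().timestampMs <= maxSeenMs - maxDelayMs) {
            ready.add(heap.poll());
        }
        return ready;
    }

    // Releases everything still buffered, in timestamp order (e.g. on shutdown).
    List<Event> flush() {
        List<Event> rest = new ArrayList<>();
        while (!heap.isEmpty()) {
            rest.add(heap.poll());
        }
        return rest;
    }
}
```

With `maxDelayMs` set to 100, offering events with timestamps 200, 150 and 300 in that order releases nothing for the first two calls and then releases 150 followed by 200 on the third. The trade-off is the one described above: a larger delay tolerates more disorder but holds messages back longer, and events with equal timestamps come out in no particular order.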

Another, more pattern-driven option is the Transactional Outbox pattern described by Chris Richardson in his book ‘Microservices Patterns’ (a must-read). Each service writes its events to an outbox table in the same database transaction as the state change, and a separate relay process reads that table and publishes the events to Kafka in the order they were committed. This will not fix all your problems, but it ties the order of events to the order of database commits. Depending on how you implement it, a single relay can replace the multiple producers altogether, although that relay may become a bottleneck in a high-volume scenario.
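The idea behind the outbox can be sketched in memory with the Java standard library alone. Everything here is a simplification and an assumption of this example: `OutboxSketch` is a hypothetical class, the map and list stand in for database tables, `synchronized` stands in for a database transaction, and the `publisher` callback stands in for a Kafka producer.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.BiConsumer;

final class OutboxSketch {
    static final class OutboxRow {
        final long seq;
        final String key;
        final String payload;

        OutboxRow(long seq, String key, String payload) {
            this.seq = seq;
            this.key = key;
            this.payload = payload;
        }
    }

    private final Map<String, String> users = new HashMap<>();  // stands in for the business table
    private final List<OutboxRow> outbox = new ArrayList<>();   // stands in for the outbox table
    private long nextSeq = 1;
    private int nextToRelay = 0;  // index of the first row not yet published

    // Stands in for one database transaction: the state change and the
    // outbox row are written together, so the sequence follows commit order.
    synchronized void saveUser(String userId, String state) {
        users.put(userId, state);
        outbox.add(new OutboxRow(nextSeq++, userId, state));
    }

    // The single relay: publishes unpublished rows in sequence order and
    // returns how many it published.
    synchronized int relay(BiConsumer<String, String> publisher) {
        int published = 0;
        while (nextToRelay < outbox.size()) {
            OutboxRow row = outbox.get(nextToRelay);
            publisher.accept(row.key, row.payload);
            nextToRelay++;  // advance only after the publish call returns
            published++;
        }
        return published;
    }
}
```

Because only the relay publishes, there is a single writer per outbox and the topic sees events in commit order. A real relay that crashes between publishing and recording its progress would publish a row again after restart, so consumers should be prepared for duplicates.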

In summary, ordering of messages in a Kafka topic with multiple producers can be achieved by having all producers use the same key for related messages, so that they share a partition, or by using a single partition. If arrival order at the broker is not enough, messages can be reordered by timestamp in an additional sorting step in the consumer or in a stream processing framework, or the multiple producers can be replaced by a transactional outbox relay.
