
Kafka Best Practices πŸ†


Table Of Contents

  • Managing Partitions Count
  • Managing Messages
  • Managing Consumer Settings
  • Managing Resiliency

Managing Partitions Count

Partitioning Sizing

A topic's partition count can be increased after creation but never decreased, and increasing it changes how keys map to partitions, so care should be taken in choosing the right count up front. If the number of partitions is too low, fewer brokers are involved in handling the topic and much of the processing is serialized. As a result, there can be producer lag and consumer lag.

Some brokers and consumers may be overworked handling large partitions while others are starved for work because they have no partitions assigned. On the other hand, if the number of partitions is too high, brokers consume more resources, such as file handles and memory, and replication overhead grows.

Recommendations

  • Plan partition counts before creating topics (the count can only be increased later, and increasing it remaps keys to partitions); a topic-creation sketch follows this list.
  • Make sure the partition count is at least the expected number of consumers (partition count >= number of consumers in a group).
  • Spread partition leaders and replicas across as many brokers in the cluster as possible.
  • Brokers are limited in how many replicas (leaders and followers combined) they can handle. Keep the total number of replicas per broker below 1,000; if needed, increase the number of brokers.
  • Messages are distributed among partitions based on keys, so the number of possible unique key values should be greater than the number of partitions.
  • Perform load testing to understand the performance characteristics of the specific application, based on:
    • Overall peak load
    • Partition lag
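
A minimal sketch of the planning step, assuming the Java AdminClient, a hypothetical orders topic, and a local bootstrap address. The partition count (12) and replication factor (3) are illustrative; derive real values from the sizing guidance above.

```java
import java.util.List;
import java.util.Properties;

import org.apache.kafka.clients.admin.AdminClient;
import org.apache.kafka.clients.admin.AdminClientConfig;
import org.apache.kafka.clients.admin.NewTopic;

public class CreateTopicExample {
    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        // Hypothetical bootstrap address; replace with your brokers.
        props.put(AdminClientConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");

        try (AdminClient admin = AdminClient.create(props)) {
            // Partition count and replication factor are fixed here at
            // creation time; choose them from the expected consumer count,
            // key cardinality, and load-test results.
            NewTopic topic = new NewTopic("orders", 12, (short) 3);
            admin.createTopics(List.of(topic)).all().get();
        }
    }
}
```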

Managing Messages

Recommendations

  • Keep message sizes small.
  • Message formats (Kafka, Avro Serialization, and the Schema Registry)
    • Choose a standard binary format such as Avro.
    • Use a schema registry to share message schemas between producers and consumers.
  • Message Keys
    • Choose keys with many unique values = better partitioning.
    • Even distribution of key values = better partitioning.
    • Use keys only if message ordering is needed (without keys, Kafka distributes messages evenly across partitions, e.g. round-robin); a keyed-producer sketch follows this list.
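
A minimal keyed-producer sketch (Java) for the recommendations above, assuming a local broker and the same hypothetical orders topic. Records sharing a key hash to the same partition, which preserves per-key ordering; in an Avro setup you would swap in Avro serializers and point the producer at your Schema Registry.

```java
import java.util.Properties;

import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerConfig;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.common.serialization.StringSerializer;

public class KeyedProducerExample {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
        props.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());
        props.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());

        try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
            // Every record with key "customer-42" lands on the same partition,
            // so this customer's events stay in order. Records sent without a
            // key are spread evenly across partitions instead.
            producer.send(new ProducerRecord<>("orders", "customer-42", "order payload"));
        }
    }
}
```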

Managing Consumer Settings

Recommendations

  • Choose fetch sizes (e.g. fetch.min.bytes, fetch.max.bytes) based on the expected incoming message volume and the desired latency.
  • Choose batching parameters based on use case.
    • Batch processing: use larger batch sizes to reduce network round trips.
    • Real-time stream processing: use smaller batch sizes and shorter polling intervals to minimize latency.
  • Use manual commits to ensure reliable data processing (a manual-commit sketch follows this list).
  • Use non-blocking processing for higher throughput.
  • Test failure conditions (such as a consumer or a broker going down) to verify the desired outcomes (messages are not lost and reprocessing happens as expected).
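
A minimal manual-commit consumer sketch (Java), assuming the same orders topic and a hypothetical group id. Auto-commit is disabled and offsets are committed only after the batch is processed, so a crash causes redelivery rather than loss; max.poll.records is the batching knob to tune per use case.

```java
import java.time.Duration;
import java.util.List;
import java.util.Properties;

import org.apache.kafka.clients.consumer.ConsumerConfig;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.common.serialization.StringDeserializer;

public class ManualCommitConsumer {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
        props.put(ConsumerConfig.GROUP_ID_CONFIG, "order-processors");
        props.put(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class.getName());
        props.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class.getName());
        // Disable auto-commit so offsets advance only after successful processing.
        props.put(ConsumerConfig.ENABLE_AUTO_COMMIT_CONFIG, "false");
        // Smaller batches lower latency; larger batches raise throughput.
        props.put(ConsumerConfig.MAX_POLL_RECORDS_CONFIG, "100");

        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
            consumer.subscribe(List.of("orders"));
            while (true) {
                ConsumerRecords<String, String> records = consumer.poll(Duration.ofMillis(500));
                for (ConsumerRecord<String, String> record : records) {
                    // Hypothetical business logic stands in here.
                    System.out.printf("partition=%d offset=%d value=%s%n",
                            record.partition(), record.offset(), record.value());
                }
                // Commit after the whole batch succeeds; a crash before this
                // point means redelivery, not data loss.
                consumer.commitSync();
            }
        }
    }
}
```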

Managing Resiliency

Recommendations

  • Use topic replication to safeguard against broker failures (a replication factor of at least 3 is recommended to maintain safe replicas).
  • Distribute Kafka brokers across different racks.
  • Use producer acknowledgements (acks=all) to guarantee delivery of data to Kafka (see the producer sketch after this list).
  • Use manual commits in consumers to ensure offsets are committed only after data is successfully processed.
  • Use a ZooKeeper ensemble for high availability (an odd number of nodes, typically 3 or 5; for production, dedicated hosts are preferable to co-locating ZooKeeper with the brokers).
  • Use mirroring with MirrorMaker if geo-redundancy is required.
  • Test resiliency use cases to make sure the system recovers as desired by the use case when failures happen.
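
A minimal guaranteed-delivery producer sketch (Java), assuming the same local broker and orders topic. acks=all waits for the in-sync replicas before confirming a write, and idempotence keeps retries from producing duplicates.

```java
import java.util.Properties;

import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerConfig;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.common.serialization.StringSerializer;

public class ReliableProducerExample {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
        props.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());
        props.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());
        // Wait for all in-sync replicas to acknowledge each write.
        props.put(ProducerConfig.ACKS_CONFIG, "all");
        // Idempotence prevents duplicates when retries kick in.
        props.put(ProducerConfig.ENABLE_IDEMPOTENCE_CONFIG, "true");

        try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
            producer.send(new ProducerRecord<>("orders", "customer-42", "order payload"),
                    (metadata, exception) -> {
                        if (exception != null) {
                            // Delivery failed despite retries; surface it for
                            // alerting or replay.
                            exception.printStackTrace();
                        }
                    });
        }
    }
}
```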