Kafka Consumer throughput limited to ~2MB/s #262

kouknick opened this issue Nov 29, 2024 · 1 comment

kouknick commented Nov 29, 2024

I have been testing Kafka consumers against Azure Event Hub, but the throughput is not utilizing the Event Hub's full capacity. I have tried both the Confluent Kafka Go example and the Sarama Go example, and both top out at around 1-2MB/s regardless of the partitions and throughput units (TUs) I configure in Event Hub.

Test Setup:

  • I used the latest versions of the Confluent Kafka Go and Sarama libraries.
  • The app is deployed on Azure and the Event Hub is in the same region.
  • The Event Hub has 40 throughput units (TUs) and 10 partitions.
  • The theoretical maximum Event Hub consumer throughput should be 10 partitions * 2MB/s per partition = 20MB/s.
  • The code to fetch events was copy-pasted from the examples; I just added a log to print the consumed MB/s while events are fetched (see the code snippet below).

Results:

  • My tests report a maximum throughput of just ~2MB/s, regardless of how many partitions are being consumed. I tried with both 1 partition and 10 partitions; either way it stayed at ~2MB/s.
  • I validated that the app is correctly connecting to all 10 partitions and consuming events from each partition, but the throughput doesn’t exceed 2MB/s.

Experiment:
I used the Azure Event Hubs Go SDK in the same app and noticed I could easily fetch up to 25MB/s from the Event Hub, which indicates the Event Hub itself is not the limiting factor.
However, when using Kafka consumers, the throughput never goes above 2MB/s; even after adjusting the consumer configuration settings I couldn't achieve anything more.
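The SDK-based reader was essentially a per-partition receive loop along the following lines (a minimal sketch, assuming the github.com/Azure/azure-sdk-for-go/sdk/messaging/azeventhubs package; eventHubName, the batch size and the single-partition loop are illustrative, not the exact test code):

// Illustrative per-partition reader using the Event Hubs SDK (azeventhubs).
func eventHubsSDKReader(ctx context.Context, partitionID string) error {
	consumer, err := azeventhubs.NewConsumerClientFromConnectionString(
		connectionString, eventHubName, azeventhubs.DefaultConsumerGroup, nil)
	if err != nil {
		return err
	}
	defer consumer.Close(ctx)

	partClient, err := consumer.NewPartitionClient(partitionID, nil)
	if err != nil {
		return err
	}
	defer partClient.Close(ctx)

	start := time.Now()
	totalBytes := 0
	for {
		// Pull up to 500 events per call; the batch size is illustrative.
		events, err := partClient.ReceiveEvents(ctx, 500, nil)
		if err != nil {
			return err
		}
		for _, ev := range events {
			totalBytes += len(ev.Body)
		}
		fmt.Println("MB/s -", float64(totalBytes)/(1024*1024)/time.Since(start).Seconds())
	}
}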

Issue:
I would like to continue using Kafka for my consumers but cannot seem to achieve the expected throughput. I have tried playing with different configurations but haven't been able to improve the throughput.
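For example, raising librdkafka's fetch-related settings along these lines (values illustrative; the rest of the ConfigMap is identical to the code snippet below) did not move the throughput:

c, err := kafka1.NewConsumer(&kafka1.ConfigMap{
	// ...same bootstrap/auth settings as in the code snippet below...
	"fetch.min.bytes":            1,
	"fetch.wait.max.ms":          100,
	"fetch.message.max.bytes":    10485760,  // per-partition fetch size (10MB)
	"fetch.max.bytes":            52428800,  // max bytes per fetch request (50MB)
	"receive.message.max.bytes":  104857600, // must be >= fetch.max.bytes + 512
	"queued.min.messages":        1000000,   // local prefetch queue (messages)
	"queued.max.messages.kbytes": 1048576,   // local prefetch queue (KB)
})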

Question:
Is there any advice or configuration change that could help me fully utilize the throughput of the Event Hub using Kafka consumers, or is there a known issue with Kafka's integration with Azure Event Hub that could be causing this limitation?

Code snippet

import (
	"context"
	"fmt"
	"os"
	"time"

	kafka1 "github.com/confluentinc/confluent-kafka-go/v2/kafka"
)

// kafkaBrokers, consumerGroup, connectionString and topic are package-level
// variables defined elsewhere in the test app.
func confluentKafkaConsumer(ctx context.Context) (string, error) {
	c, err := kafka1.NewConsumer(&kafka1.ConfigMap{
		"bootstrap.servers":        kafkaBrokers[0],
		"broker.address.family":    "v4",
		"group.id":                 consumerGroup,
		"session.timeout.ms":       30000,
		"auto.offset.reset":        "earliest",
		"security.protocol":        "SASL_SSL",
		"sasl.mechanisms":          "PLAIN",
		"sasl.username":            "$ConnectionString",
		"sasl.password":            connectionString,
		"enable.auto.offset.store": false,
		"enable.auto.commit":       false,
	})
	if err != nil {
		fmt.Fprintf(os.Stderr, "Failed to create consumer: %s\n", err)
		return "", err
	}
	defer c.Close()

	start := time.Now()
	totalBytes := 0
	totalMessages := 0

	if err = c.SubscribeTopics([]string{topic}, nil); err != nil {
		fmt.Fprintf(os.Stderr, "Failed to subscribe to topics: %s\n", err)
		return "", err
	}

	for {
		// Block until the next message arrives (-1 = no timeout).
		msg, err := c.ReadMessage(-1)
		if err != nil {
			fmt.Fprintf(os.Stderr, "Consumer error: %v (%v)\n", err, msg)
			break
		}
		totalBytes += len(msg.Value)
		totalMessages++
		// Print average throughput every 1000 messages.
		if totalMessages%1000 == 0 {
			fmt.Println("MB/s -", float64(totalBytes)/(1024*1024)/time.Since(start).Seconds())
			fmt.Println("Messages per second -", float64(totalMessages)/time.Since(start).Seconds())
			fmt.Println("Total Messages -", totalMessages)
		}
	}
	return "", nil
}
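
For completeness, the Sarama consumer used an equivalent configuration, roughly as follows (illustrative sketch; the fetch and buffer sizes are examples, and the auth is the standard Event Hubs connection-string setup):

// Sketch of the equivalent Sarama consumer configuration.
config := sarama.NewConfig()
config.Version = sarama.V1_0_0_0
config.Net.TLS.Enable = true
config.Net.SASL.Enable = true
config.Net.SASL.Mechanism = sarama.SASLTypePlaintext
config.Net.SASL.User = "$ConnectionString"
config.Net.SASL.Password = connectionString
config.Consumer.Offsets.Initial = sarama.OffsetOldest
config.Consumer.Fetch.Min = 1
config.Consumer.Fetch.Default = 10 * 1024 * 1024 // bytes requested per fetch
config.Consumer.Fetch.Max = 50 * 1024 * 1024
config.Consumer.MaxWaitTime = 100 * time.Millisecond
config.ChannelBufferSize = 1024

group, err := sarama.NewConsumerGroup(kafkaBrokers, consumerGroup, config)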

cpbotha commented Feb 4, 2025

We are seeing similar behaviour.

With our Python implementation using the Azure Event Hubs SDK we can read 10 times faster from the exact same hub than with our stripped-down Spark + Kafka implementation. We have spent days testing different configurations for the Spark + Kafka setup, but to no avail. This is currently really blocking our work.

@hmlam any chance you or any of your colleagues could take a look? I will try to make some time to build a self-contained confluent-kafka Python demonstrator as well, but @kouknick already has a pretty simple demonstrator above.
