Confluent Kafka Producer Error - Request Timed out and Timed out MetadataRequest in flight #4039
This is most likely caused by Azure's load balancers silently closing idle connections. Try setting ConnectionsMaxIdleMs to a value below the load balancer's idle timeout. About a year ago that timeout was 4 minutes, but the logs here show the connection being closed almost exactly after 10 minutes, so maybe go with that?
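As a rough illustration of the suggestion above, a producer config with a non-zero idle limit might look like the sketch below. The 180000 ms value is only an assumption (chosen to sit under the 4-minute figure mentioned above), not a verified Azure setting, and the bootstrap server and password strings are placeholders. Note that the config posted in the question sets ConnectionsMaxIdleMs = 0, which in librdkafka disables idle-connection closing altogether.

```csharp
using Confluent.Kafka;

// Minimal sketch, not a verified fix: only the settings relevant to the
// idle-connection suggestion are shown; the remaining properties from the
// original config can stay as they are.
var config = new ProducerConfig
{
    BootstrapServers = "<EH-NAMESPACE>.servicebus.windows.net:9093", // placeholder
    SecurityProtocol = SecurityProtocol.SaslSsl,
    SaslMechanism = SaslMechanism.Plain,
    SaslUsername = "$ConnectionString",
    SaslPassword = "<EVENT-HUBS-CONNECTION-STRING>", // placeholder

    // Assumed value: close idle broker connections after 3 minutes so the
    // client drops them before the load balancer does. 0 disables this.
    ConnectionsMaxIdleMs = 180000,
};

// string/string is an arbitrary choice of key and value types for the sketch.
using var producer = new ProducerBuilder<string, string>(config).Build();
```

With this in place the client should reconnect on demand instead of sending a request over a connection the load balancer has already dropped. The right threshold depends on the actual idle timeout in the AKS network path.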
Thanks for the quick reply, @edenhill. I will definitely try it out and see the difference, and will update you on the result.
For some reason, the Kafka producer throws the error below while trying to produce a message.
I am using the Confluent.Kafka package, version 1.8.2, to connect to Event Hubs.
Below is the producer config for reference.
BootstrapServers = [EH-NAME-SPACE].servicebus.windows.net:9093,
SaslUsername = "$ConnectionString",
SaslPassword = <EH-SHARED_ACCESS-POLICY-PRIMARY-CONNECTION-STRING>,
SaslMechanism = SaslMechanism.Plain,
SecurityProtocol = SecurityProtocol.SaslSsl,
MessageMaxBytes = 20971520,
MessageCopyMaxBytes = 20971520,
LogConnectionClose = false,
SocketKeepaliveEnable = false,
BrokerAddressFamily = BrokerAddressFamily.V4,
ConnectionsMaxIdleMs = 0,
Acks = Acks.Leader,
BatchSize = 200000,
LingerMs = 0,
CompressionType = CompressionType.None,
SocketNagleDisable = true,
Below are the logs for reference:
%5|1666936636.307|REQTMOUT|rdkafka#producer-1| [thrd:sasl_ssl://<PRODUCER_EH_NAMESPACE>.servicebus.windows.net:9093/bo]: sasl_ssl://<PRODUCER_EH_NAMESPACE>.servicebus.windows.net:9093/0: Timed out MetadataRequest in flight (after 60039ms, timeout #0)
%4|1666936636.307|REQTMOUT|rdkafka#producer-1| [thrd:sasl_ssl://<PRODUCER_EH_NAMESPACE>.servicebus.windows.net:9093/bo]: sasl_ssl://<PRODUCER_EH_NAMESPACE>.servicebus.windows.net:9093/0: Timed out 1 in-flight, 0 retry-queued, 0 out-queue, 0 partially-sent requests
%3|1666936636.307|FAIL|rdkafka#producer-1| [thrd:sasl_ssl://<PRODUCER_EH_NAMESPACE>.servicebus.windows.net:9093/bo]: sasl_ssl://<PRODUCER_EH_NAMESPACE>.servicebus.windows.net:9093/0: 1 request(s) timed out: disconnect (after 355567ms in state UP)
info: Logger[0]
Producer Error Handler: sasl_ssl://<PRODUCER_EH_NAMESPACE>.servicebus.windows.net:9093/0: 1 request(s) timed out: disconnect (after 355567ms in state UP)
%5|1666936636.314|REQTMOUT|rdkafka#producer-2| [thrd:sasl_ssl://<PRODUCER_EH_NAMESPACE>.servicebus.windows.net:9093/bo]: sasl_ssl://<PRODUCER_EH_NAMESPACE>.servicebus.windows.net:9093/0: Timed out MetadataRequest in flight (after 60029ms, timeout #0)
%4|1666936636.314|REQTMOUT|rdkafka#producer-2| [thrd:sasl_ssl://<PRODUCER_EH_NAMESPACE>.servicebus.windows.net:9093/bo]: sasl_ssl://<PRODUCER_EH_NAMESPACE>.servicebus.windows.net:9093/0: Timed out 1 in-flight, 0 retry-queued, 0 out-queue, 0 partially-sent requests
%3|1666936636.314|FAIL|rdkafka#producer-2| [thrd:sasl_ssl://<PRODUCER_EH_NAMESPACE>.servicebus.windows.net:9093/bo]: sasl_ssl://<PRODUCER_EH_NAMESPACE>.servicebus.windows.net:9093/0: 1 request(s) timed out: disconnect (after 354737ms in state UP)
info: Logger[0]
Producer Error Handler: sasl_ssl://<PRODUCER_EH_NAMESPACE>.servicebus.windows.net:9093/0: 1 request(s) timed out: disconnect (after 354737ms in state UP)
%5|1666937236.305|REQTMOUT|rdkafka#producer-1| [thrd:sasl_ssl://<PRODUCER_EH_NAMESPACE>.servicebus.windows.net:9093/bo]: sasl_ssl://<PRODUCER_EH_NAMESPACE>.servicebus.windows.net:9093/0: Timed out MetadataRequest in flight (after 60037ms, timeout #0)
%4|1666937236.305|REQTMOUT|rdkafka#producer-1| [thrd:sasl_ssl://<PRODUCER_EH_NAMESPACE>.servicebus.windows.net:9093/bo]: sasl_ssl://<PRODUCER_EH_NAMESPACE>.servicebus.windows.net:9093/0: Timed out 1 in-flight, 0 retry-queued, 0 out-queue, 0 partially-sent requests
%3|1666937236.305|FAIL|rdkafka#producer-1| [thrd:sasl_ssl://<PRODUCER_EH_NAMESPACE>.servicebus.windows.net:9093/bo]: sasl_ssl://<PRODUCER_EH_NAMESPACE>.servicebus.windows.net:9093/0: 1 request(s) timed out: disconnect (after 598972ms in state UP, 1 identical error(s) suppressed)
info: Logger[0]
Producer Error Handler: sasl_ssl://<PRODUCER_EH_NAMESPACE>.servicebus.windows.net:9093/0: 1 request(s) timed out: disconnect (after 598972ms in state UP, 1 identical error(s) suppressed)
%5|1666937236.318|REQTMOUT|rdkafka#producer-2| [thrd:sasl_ssl://<PRODUCER_EH_NAMESPACE>.servicebus.windows.net:9093/10]: sasl_ssl://<PRODUCER_EH_NAMESPACE>.servicebus.windows.net:9093/10: Timed out MetadataRequest in flight (after 60033ms, timeout #0)
%4|1666937236.318|REQTMOUT|rdkafka#producer-2| [thrd:sasl_ssl://<PRODUCER_EH_NAMESPACE>.servicebus.windows.net:9093/10]: sasl_ssl://<PRODUCER_EH_NAMESPACE>.servicebus.windows.net:9093/10: Timed out 1 in-flight, 0 retry-queued, 0 out-queue, 0 partially-sent requests
%3|1666937236.318|FAIL|rdkafka#producer-2| [thrd:sasl_ssl://<PRODUCER_EH_NAMESPACE>.servicebus.windows.net:9093/10]: sasl_ssl://<PRODUCER_EH_NAMESPACE>.servicebus.windows.net:9093/10: 1 request(s) timed out: disconnect (after 598996ms in state UP)
info: Logger[0]
Producer Error Handler: sasl_ssl://<PRODUCER_EH_NAMESPACE>.servicebus.windows.net:9093/10: 1 request(s) timed out: disconnect (after 598996ms in state UP)
%5|1666937470.520|REQTMOUT|rdkafka#consumer-3| [thrd:sasl_ssl://<CONSUMER_EH_NAMESPACE>.servicebus.windows.net:9093/bootst]: sasl_ssl://<CONSUMER_EH_NAMESPACE>.servicebus.windows.net:9093/0: Timed out FetchRequest in flight (after 60882ms, timeout #0)
%4|1666937470.520|REQTMOUT|rdkafka#consumer-3| [thrd:sasl_ssl://<CONSUMER_EH_NAMESPACE>.servicebus.windows.net:9093/bootst]: sasl_ssl://<CONSUMER_EH_NAMESPACE>.servicebus.windows.net:9093/0: Timed out 1 in-flight, 0 retry-queued, 0 out-queue, 0 partially-sent requests
%3|1666937470.520|FAIL|rdkafka#consumer-3| [thrd:sasl_ssl://<CONSUMER_EH_NAMESPACE>.servicebus.windows.net:9093/bootst]: sasl_ssl://<CONSUMER_EH_NAMESPACE>.servicebus.windows.net:9093/0: 1 request(s) timed out: disconnect (after 1193914ms in state UP)
The above error occurs when running the producer in AKS. It does not occur when running the same executable locally on my machine.
Any help is much appreciated.