Agent Environment
$ sudo datadog-agent version
Agent 7.54.1 - Commit: 44d1992 - Serialization version: v5.0.114 - Go version: go1.21.9
Describe what happened:
After upgrading to 7.54.0, the Kafka consumer lag check started to fail.
Describe what you expected:
Expected the Datadog Agent to continue retrieving Kafka consumer offsets from the Kafka cluster.
Steps to reproduce the issue:
- Upgrade to v7.54.0 or v7.54.1
- Configure Datadog to check Kafka consumer offsets
$ sudo cat /etc/datadog-agent/conf.d/kafka_consumer.d/conf.yaml
init_config:
instances:
  - kafka_connect_str:
      - <redacted>
    security_protocol: SASL_SSL
    sasl_mechanism: PLAIN
    sasl_plain_username: <redacted>
    sasl_plain_password: <redacted>
    kafka_consumer_offsets: true
    monitor_unlisted_consumer_groups: true
- Perform a check
$ sudo datadog-agent check kafka_consumer
Running Checks
==============
kafka_consumer (4.3.0)
----------------------
Instance ID: kafka_consumer:24b8757764ea1a30 [ERROR]
Configuration Source: file:/etc/datadog-agent/conf.d/kafka_consumer.d/conf.yaml
Total Runs: 1
Metric Samples: Last Run: 0, Total: 0
Events: Last Run: 0, Total: 0
Service Checks: Last Run: 0, Total: 0
Average Execution Time : 5.099s
Last Execution Date : 2024-06-24 09:11:07 WEST / 2024-06-24 08:11:07 UTC (1719216667000)
Last Successful Execution Date : Never
Error: Unable to connect to the AdminClient. This is likely due to an error in the configuration.
Traceback (most recent call last):
File "/opt/datadog-agent/embedded/lib/python3.11/site-packages/datadog_checks/kafka_consumer/kafka_consumer.py", line 34, in check
self.client.request_metadata_update()
File "/opt/datadog-agent/embedded/lib/python3.11/site-packages/datadog_checks/kafka_consumer/client.py", line 180, in request_metadata_update
self.kafka_client.list_topics(None, timeout=self.config._request_timeout)
File "/opt/datadog-agent/embedded/lib/python3.11/site-packages/confluent_kafka/admin/__init__.py", line 603, in list_topics
return super(AdminClient, self).list_topics(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
cimpl.KafkaException: KafkaError{code=_TRANSPORT,val=-195,str="Failed to get metadata: Local: Broker transport failure"}
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/opt/datadog-agent/embedded/lib/python3.11/site-packages/datadog_checks/base/checks/base.py", line 1224, in run
self.check(instance)
File "/opt/datadog-agent/embedded/lib/python3.11/site-packages/datadog_checks/kafka_consumer/kafka_consumer.py", line 36, in check
raise Exception(
Exception: Unable to connect to the AdminClient. This is likely due to an error in the configuration.
Metadata
========
config.hash: kafka_consumer:24b8757764ea1a30
config.provider: file
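To help tell a problem in the check apart from a problem in the bundled client library, the same metadata request can be issued outside the check. The following is only a minimal sketch, not the check's actual code: the helper names `build_admin_config` and `list_topics_once` are hypothetical, the 5-second timeout is an assumption, and the mapping from the instance options above to librdkafka property names is my own reading of the configuration.

```python
def build_admin_config(bootstrap_servers, username, password):
    """Map the instance settings shown above onto librdkafka property names.

    Hypothetical helper for illustration; the check builds its own config.
    """
    return {
        "bootstrap.servers": ",".join(bootstrap_servers),
        "security.protocol": "SASL_SSL",
        "sasl.mechanism": "PLAIN",
        "sasl.username": username,
        "sasl.password": password,
    }


def list_topics_once(conf, timeout=5.0):
    """Issue the metadata request that fails in the traceback above.

    The timeout value is an assumption, not the check's configured value.
    """
    # Imported lazily so the config helper works without the library installed.
    from confluent_kafka.admin import AdminClient

    client = AdminClient(conf)
    # Raises KafkaException (for example _TRANSPORT) if no broker answers in time.
    metadata = client.list_topics(timeout=timeout)
    return sorted(metadata.topics)
```

Calling `list_topics_once(build_admin_config([...], user, password))` with the interpreter under /opt/datadog-agent/embedded should exercise the same confluent_kafka build as the Agent. If it fails with the same `_TRANSPORT` error, the problem is probably below the check itself.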
Additional environment details (Operating System, Cloud provider, etc):