Output of the info page
(Posting only relevant version + check information, happy to share more details over DM)
===============
Agent (v7.54.0)
===============
Status date: 2024-06-05 12:38:51.71 UTC (1717591131710)
Agent start: 2024-06-05 12:38:46.925 UTC (1717591126925)
Pid: 1
Go Version: go1.21.9
Python Version: 3.11.8
Build arch: amd64
Agent flavor: agent
Log Level: INFO
Running Checks
==============
amazon_msk (4.7.0)
------------------
Instance ID: amazon_msk:c28c17180d3df175 [ERROR]
Configuration Source: kube_services:kube_service://datadog-cluster-checks/[REDACTED]
Total Runs: 36
Metric Samples: Last Run: 0, Total: 0
Events: Last Run: 0, Total: 0
Service Checks: Last Run: 1, Total: 36
Average Execution Time : 293ms
Last Execution Date : 2024-06-05 12:48:45 UTC (1717591725000)
Last Successful Execution Date : Never
Error: 'BrokerNodeInfo'
Traceback (most recent call last):
File "/opt/datadog-agent/embedded/lib/python3.11/site-packages/datadog_checks/base/checks/base.py", line 1224, in run
self.check(instance)
File "/opt/datadog-agent/embedded/lib/python3.11/site-packages/datadog_checks/amazon_msk/amazon_msk.py", line 115, in check
broker_info = node_info['BrokerNodeInfo']
~~~~~~~~~^^^^^^^^^^^^^^^^^^
KeyError: 'BrokerNodeInfo'
Amazon MSK has recently launched support for KRaft clusters, which adds the Controller Nodes to the output of the ListNodes API call. These node entries do not have a BrokerNodeInfo entry, which causes the Agent integration to crash with the KeyError shown in the info page output above. Ideally, the integration should also scrape the Controller Nodes (which may also expose Prometheus metrics), but it would at least be great to still support scraping the Broker Nodes when KRaft is in use.
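As a rough sketch of the kind of guard that would avoid the crash, the snippet below lists a cluster's nodes and keeps only the entries that carry BrokerNodeInfo. This is not the integration's actual code: the function name, the region default, and the direct use of boto3 are illustrative assumptions only.

```python
import boto3

# Illustrative sketch only -- not the actual datadog_checks/amazon_msk code.
# It lists an MSK cluster's nodes and keeps only the entries that carry
# BrokerNodeInfo, skipping the controller entries that KRaft clusters return.
# (Pagination via NextToken is omitted for brevity.)
def broker_endpoints(cluster_arn, region_name="eu-west-1"):
    kafka = boto3.client("kafka", region_name=region_name)
    response = kafka.list_nodes(ClusterArn=cluster_arn)
    endpoints = []
    for node_info in response.get("NodeInfoList", []):
        broker_info = node_info.get("BrokerNodeInfo")
        if broker_info is None:
            # KRaft controller nodes have no BrokerNodeInfo; skip them rather
            # than raising KeyError, so broker metrics are still collected.
            continue
        endpoints.extend(broker_info.get("Endpoints", []))
    return endpoints
```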
Additional environment details (Operating System, Cloud provider, etc):
Steps to reproduce the issue:
Describe the results you received:
The check fails with the exception above and no metrics are published to Datadog.
Describe the results you expected:
Ideally: The metrics for both the Controller and the Broker nodes are published to Datadog.
At a minimum: The metrics for the Broker nodes are published to Datadog.
Additional information you deem important (e.g. issue happens only occasionally):
Returned data for the ListNodes call in our KRaft-enabled cluster (redacted URLs and Account/Subnet IDs):
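The actual redacted response is not reproduced here. Purely to illustrate the two entry shapes described in this report (field names and endpoint values below are assumptions based on the public ListNodes API, not the reporter's data), a KRaft cluster's NodeInfoList can mix entries like these:

```python
# Hypothetical illustration of a KRaft cluster's NodeInfoList -- not the
# reporter's redacted output. Broker entries carry BrokerNodeInfo, while
# controller entries do not, which is what trips the check.
node_info_list = [
    {
        "NodeType": "BROKER",
        "BrokerNodeInfo": {
            "BrokerId": 1.0,
            "Endpoints": ["b-1.example.kafka.eu-west-1.amazonaws.com"],
        },
    },
    {
        # Controller entry: no BrokerNodeInfo key at all.
        "ControllerNodeInfo": {
            "Endpoints": ["c-1.example.kafka.eu-west-1.amazonaws.com"],
        },
    },
]
```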