Enabling the creation of networkPolicy in the helm charts causes networking (DNS) issues for datadog-clusterchecks #1606

Vassilis0 · 2024-11-12T16:01:05Z

Describe what happened:
Enabling the creation of networkPolicy in the helm charts causes networking (DNS) issues for the datadog-clusterchecks pods because the policy created does not allow egress traffic on port 53.

Spec:
  PodSelector:     app=datadog-clusterchecks
  Allowing ingress traffic:
    <none> (Selected pods are isolated for ingress connectivity)
  Allowing egress traffic:
    To Port: 443/TCP
    To: <any> (traffic not restricted by destination)
    ------------
    To Port: 5005/TCP
    To:
        PodSelector: app=datadog-cluster-agent
    ------------
    To Port: <any> (traffic allowed to all ports)
    To: <any> (traffic not restricted by destination)
Policy Types: Ingress, Egress

More specifically, these are the errors in datadog-clusterchecks for the endpoints:

• https://api.datadoghq.com/api/v1/check_run
• https://api.datadoghq.com/api/v2/series

2024-11-04 16:21:09 UTC | CORE | ERROR | (comp/forwarder/defaultforwarder/worker.go:187 in process) | Too many errors for endpoint 'https://7-58-0-app.agent.datadoghq.com/api/v1/check_run': retrying later

2024-11-04 16:21:09 UTC | CORE | ERROR | (comp/forwarder/defaultforwarder/worker.go:187 in process) | Too many errors for endpoint 'https://7-58-0-app.agent.datadoghq.com/api/v2/series': retrying later

2024-11-08 14:00:46 UTC | CORE | ERROR | (comp/forwarder/defaultforwarder/worker.go:191 in process) | Error while processing transaction: error while sending transaction, rescheduling it: Post "https://7-58-0-app.agent.datadoghq.com/intake/": context deadline exceeded (Client.Timeout exceeded while awaiting headers)

2024-11-06 15:28:47 UTC | CORE | DEBUG | (comp/forwarder/defaultforwarder/transaction/transaction.go:87 in 1) | DNS Lookup failure: lookup 7-58-0-app.agent.datadoghq.com: i/o timeout

Steps to reproduce the issue:
Enable the creation of networkPolicy in the Helm chart for a Datadog deployment that runs on AWS EKS with network policy enabled, then inspect the resulting policy:

kubectl get networkpolicy 
kubectl describe networkpolicy <networkpolicy-name>

Check name resolution from within the datadog-clusterchecks pod.
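
One possible way to run that check, as a rough sketch only: the helper below wraps `getent hosts`, which is assumed (not confirmed anywhere in this issue) to be available in the agent image. `check_dns` is a hypothetical name, and the namespace and pod name in the commented `kubectl exec` line are placeholders.

```shell
# Hypothetical helper: exits 0 if the name resolves, non-zero otherwise.
# Assumes getent is present wherever this runs.
check_dns() {
  getent hosts "$1" >/dev/null 2>&1
}

# Inside the pod, the same lookup could be run with (placeholders in <>):
#   kubectl exec -n <namespace> <datadog-clusterchecks-pod> -- \
#     getent hosts 7-58-0-app.agent.datadoghq.com
```

For example, `check_dns 7-58-0-app.agent.datadoghq.com && echo resolved || echo "lookup failed"` prints one of the two messages depending on whether the lookup succeeds.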

Additional environment details (Operating System, Cloud provider, etc):
AWS EKS 1.30
Datadog Helm chart: 3.77

celenechang · 2024-11-27T16:42:57Z

Hi @Vassilis0, thanks for opening the issue and for providing those details. Did adding an egress rule on port 53 help resolve the DNS error?

Do you mind opening a support ticket so that our team can investigate further? Thank you in advance.

Vassilis0 · 2024-12-17T14:11:20Z

Thank you for your reply, @celenechang.
Yes, adding an egress rule on port 53 helped resolve the DNS error, but we ended up disabling the creation of the network policy.

Support ticket: #1970318
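
The exact rule that was added is not shown in this thread. As an illustration only, an extra egress rule for port 53 might look like the sketch below. It assumes a separate policy is acceptable (NetworkPolicies are additive, so it would combine with the chart-generated one); the helper name `dns_egress_policy`, the policy name, and the `<namespace>` placeholder are hypothetical, while the `app=datadog-clusterchecks` selector comes from the spec quoted in the issue.

```shell
# Hypothetical helper: prints an additional NetworkPolicy that allows DNS egress
# (UDP and TCP port 53) for pods labelled app=datadog-clusterchecks.
dns_egress_policy() {
  cat <<'EOF'
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: datadog-clusterchecks-allow-dns  # hypothetical name
spec:
  podSelector:
    matchLabels:
      app: datadog-clusterchecks
  policyTypes:
    - Egress
  egress:
    - ports:
        - protocol: UDP
          port: 53
        - protocol: TCP
          port: 53
EOF
}

# Review the output, then apply it in the namespace where the agent runs:
#   dns_egress_policy | kubectl apply -n <namespace> -f -
```

This sketch leaves the destination unrestricted; narrowing it to the cluster DNS pods would need a selector that depends on the cluster setup.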

Vassilis0 changed the title ~~Enabling the creation of networkPolicy in the helm charts causes networking (DNS) issues for datadog-clusterchecks.~~ Enabling the creation of networkPolicy in the helm charts causes networking (DNS) issues for datadog-clusterchecks Nov 13, 2024
