DNS resolution failure over TCP for ClickHouse in restricted UDP environment #1561

Open
mahesh-kore opened this issue Nov 15, 2024 · 7 comments

Comments

@mahesh-kore

mahesh-kore commented Nov 15, 2024

Description:

In our environment, DNS resolution over UDP is blocked, so we've configured pods to use TCP for DNS resolution instead. Testing with ping confirms that DNS resolution over TCP works, as the service name resolves successfully. However, ClickHouse is unable to resolve the service name over TCP and returns an error.

Steps to Reproduce:

  1. Block UDP DNS resolution in the environment.
  2. Configure pods to use TCP for DNS resolution.
        dnsConfig:
          options:
          - name: use-vc
  3. Run ping to verify TCP DNS resolution, which works as expected.
    ping chi-test-test-1-2.default.svc.cluster.local
  4. Attempt to start or use ClickHouse with the above DNS configuration.
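
One way to confirm that step 2 took effect is to look at the resolver options inside the pod: Kubernetes renders `dnsConfig.options` into the pod's `/etc/resolv.conf`, so `use-vc` should show up on an `options` line. A minimal sketch, assuming Python is available in the pod; the helper name `resolver_options` is made up for illustration.

```python
from pathlib import Path


def resolver_options(resolv_conf_text: str) -> set[str]:
    """Collect every token that appears on an `options` line of resolv.conf."""
    options: set[str] = set()
    for line in resolv_conf_text.splitlines():
        stripped = line.strip()
        # Skip blank lines and comment lines (resolv.conf uses '#' or ';').
        if not stripped or stripped[0] in "#;":
            continue
        fields = stripped.split()
        if fields[0] == "options":
            options.update(fields[1:])
    return options


if __name__ == "__main__":
    path = Path("/etc/resolv.conf")
    if path.exists():
        print("use-vc set:", "use-vc" in resolver_options(path.read_text()))
```

If `use-vc` is missing here, the pod spec was not applied as intended and the failure is unrelated to how ClickHouse resolves names.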

Observed Behavior:

ClickHouse fails to resolve the service name over TCP, generating the following error:

2024.11.14 18:20:17.660787 [ 48 ] {c1e33f52-b6b1-45e6-b1e0-c24514136aa9} <Error> DNSResolver: Cannot resolve host (chi-test-test-1-2.default.svc.cluster.local), error 0: Host not found

However, running ping within the pod resolves the service name as expected:

PING chi-test-test-1-2.default.svc.cluster.local (10.42.0.123) 56(84) bytes of data.
64 bytes from chi-test-test-1-2-0.chi-test-test-1-2.default.svc.cluster.local (10.42.0.123): icmp_seq=1 ttl=64 time=0.038 ms
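
A quick cross-check is to call libc's `getaddrinfo` directly from inside the pod, which is the call that Python's `socket.getaddrinfo` wraps. This is only a diagnostic sketch, assuming Python is available in the pod; `resolve` is an illustrative helper name, not something from ClickHouse.

```python
import socket


def resolve(host: str) -> list[str]:
    """Return the distinct addresses getaddrinfo reports for `host`, sorted."""
    try:
        infos = socket.getaddrinfo(host, None)
    except socket.gaierror as exc:
        # getaddrinfo could not resolve the name.
        raise LookupError(f"cannot resolve {host}: {exc}") from exc
    # Each entry is (family, type, proto, canonname, sockaddr); sockaddr[0] is the address.
    return sorted({info[4][0] for info in infos})
```

Calling `resolve("chi-test-test-1-2.default.svc.cluster.local")` in the affected pod shows whether a plain `getaddrinfo` lookup succeeds under the `use-vc` configuration.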

Expected Behavior:

ClickHouse should be able to resolve service names over TCP in environments where UDP DNS is blocked, similar to the successful resolution observed with ping.

Additional Context:

Are there any known limitations with ClickHouse’s DNS resolver over TCP? Any recommendations or configurations to resolve this issue would be helpful.

@Slach
Collaborator

Slach commented Nov 15, 2024

This issue is not related to clickhouse-operator, but I'm not sure whether the standard Go library that we use in clickhouse-operator will also honor use-vc and use DNS over TCP by default.

The typical use case for DNS over TCP is responses that are too big for UDP.

Why did you restrict the standard DNS approach?

@mahesh-kore
Author

mahesh-kore commented Nov 15, 2024

You're correct, but we are working in an environment within an enterprise bank where custom DNS servers (coreDNS/kube-dns) or hosts are not permitted.

This is part of a proof of concept (POC) where we aim to demonstrate our application, which utilizes ClickHouse.

@mahesh-kore
Author

@Slach Do you have any suggestions for a potential workaround?

@Slach
Collaborator

Slach commented Nov 15, 2024

@arthurpassos could you suggest something about DNS over TCP in the clickhouse-server DNSResolver?

@mahesh-kore
Author

@arthurpassos Can you suggest any possible workarounds?

@arthurpassos

A setting that controls the protocol could be introduced, something like dns_resolution_protocol=[any|udp|tcp].

ClickHouse uses the poco lib to perform DNS resolutions. Poco, under the hood, uses libc getaddrinfo.

The getaddrinfo function takes an addrinfo structure that has the option to set the protocol: any, udp or tcp afaik. The thing is that Poco does not have an abstraction that allows addrinfo to be set manually.

Options available:

  1. Stop using poco and call getaddrinfo manually.
  2. Submit a PR to the poco lib introducing such an API, and then update our poco fork.
  3. Update our poco fork only.
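
To make the addrinfo hint concrete, here is a minimal sketch that passes a protocol hint to `getaddrinfo` through Python's `socket.getaddrinfo` wrapper. The helper name `lookup_with_hints` is hypothetical. Note the hedge: the `ai_protocol` hint is documented as selecting which kind of socket the returned addresses are for, and nothing in this thread confirms that it changes the transport used for the DNS query itself; in glibc that transport is governed by resolver options such as `use-vc`.

```python
import socket


def lookup_with_hints(host: str, proto: int = 0) -> list[tuple[int, int, str]]:
    """Call getaddrinfo with a protocol hint; return (socktype, protocol, address) rows.

    proto=0 means "any"; socket.IPPROTO_TCP or socket.IPPROTO_UDP narrow the results.
    """
    infos = socket.getaddrinfo(host, None, family=socket.AF_UNSPEC, proto=proto)
    return [
        (int(socktype), protocol, sockaddr[0])
        for _family, socktype, protocol, _canonname, sockaddr in infos
    ]
```

Comparing `lookup_with_hints(name)` with `lookup_with_hints(name, socket.IPPROTO_TCP)` in the affected pod would show whether the hint makes any difference to resolution there.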

@arthurpassos

I looked at the code again: poco lives in base/poco, so there is no need to submit a PR to poco or update our fork. It is bundled together, which is easier.

Editing Poco...DNS::hostByName to accept a protocol parameter is easy, tho it won't work on systems that do not have getaddrinfo.

After that, one needs to make sure all DNS function calls specify the protocol based on the setting. Not very scalable, but it is the same thing with proxy support.
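
The plumbing described above could look roughly like the sketch below: the setting is parsed once and bound into a single lookup function, so call sites do not each have to pass the protocol. This is an illustration in Python, not ClickHouse or Poco code; `dns_resolution_protocol` is the setting name proposed in this thread, while `make_resolver`, `host_by_name`, and the mapping table are hypothetical. Whether the hint actually changes the DNS transport is also an open assumption.

```python
import socket

# Hypothetical mapping from the proposed setting values to getaddrinfo protocol hints.
_PROTOCOL_HINTS = {"any": 0, "udp": socket.IPPROTO_UDP, "tcp": socket.IPPROTO_TCP}


def make_resolver(dns_resolution_protocol: str = "any"):
    """Build a host_by_name function bound to the configured protocol hint."""
    key = dns_resolution_protocol.lower()
    if key not in _PROTOCOL_HINTS:
        raise ValueError(f"unknown dns_resolution_protocol: {dns_resolution_protocol!r}")
    proto = _PROTOCOL_HINTS[key]

    def host_by_name(host: str) -> list[str]:
        infos = socket.getaddrinfo(host, None, proto=proto)
        # Keep the order getaddrinfo returned while dropping duplicate addresses.
        return list(dict.fromkeys(info[4][0] for info in infos))

    return host_by_name
```

With this shape, `host_by_name = make_resolver("tcp")` is created once from the configuration and shared, which keeps the protocol choice out of individual call sites.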
