Broken DNS behind VPN on windows #8156

Open
Nino-K opened this issue Jan 29, 2025 · 10 comments
Member

Nino-K commented Jan 29, 2025

As previously reported (#8088, #8055, and #8058), the DNS lookup behind the VPN was broken in version 1.17. The root cause was an upgrade to gvisor-tap-vsock v0.8.1. As part of the fix, we have downgraded it to v0.7.5, which should resolve the issue in version 1.17.1.

Additionally, we should investigate the fix in the upstream gvisor-tap-vsock repository and consider contributing a pull request to address the issue.

@Nino-K Nino-K added area/networking kind/bug Something isn't working labels Jan 29, 2025
@Nino-K Nino-K added this to the 1.18 milestone Jan 29, 2025
@Nino-K Nino-K self-assigned this Jan 29, 2025
Member Author

Nino-K commented Jan 29, 2025

I submitted the following issue upstream, and they agreed to revert the PR that caused the problem here: containers/gvisor-tap-vsock#467


jankap commented Feb 6, 2025

Not sure if this is DNS related, but I can't pull any images when connected to the corporate VPN.

Commands done in WSL Ubuntu (not rancher-desktop WSL)

docker pull hello-world
Using default tag: latest
Error response from daemon: Get "https://registry-1.docker.io/v2/": dial tcp 44.208.254.194:443: i/o timeout

curl on the same machine works

curl https://registry-1.docker.io/v2/
{"errors":[{"code":"UNAUTHORIZED","message":"authentication required","detail":null}]}

curl -v shows that the correct proxy, set via http(s)_proxy, is used.

Since the domain is translated into the IP by rancher, this is not a DNS error, correct?

I'm on WSL2 (v2.4.10.0), Win 11, Rancher Desktop 1.17.1, using the new mirrored network mode.

EDIT: Interesting. After restarting WSL and Rancher Desktop and then connecting to the VPN, I now get:

docker pull hello-world
Using default tag: latest
Error response from daemon: Get "https://registry-1.docker.io/v2/": EOF

Maybe it's DNS related.

Logging into rancher-desktop distribution:

wsl -d rancher-desktop

ping www.google.de
ping: bad address 'www.google.de'

ping 8.8.8.8
PING 8.8.8.8 (8.8.8.8): 56 data bytes
64 bytes from 8.8.8.8: seq=0 ttl=115 time=22.046 ms
64 bytes from 8.8.8.8: seq=1 ttl=115 time=29.356 ms
64 bytes from 8.8.8.8: seq=2 ttl=115 time=38.626 ms

cat /etc/resolv.conf
nameserver 192.168.127.1

ping 192.168.127.1
PING 192.168.127.1 (192.168.127.1): 56 data bytes
<timeout>

So what is 192.168.127.1?


Qenupve commented Feb 7, 2025

> So what is 192.168.127.1?

That looks to be the gateway IP hard coded in gvisor-tap-vsock

@jankap have you checked the Rancher Desktop WSL Proxy settings? Someone noted that it was causing them issues in 1.17.1 #8055 (comment)
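
The resolver check from the earlier comment can be scripted. This is only a minimal sketch: it assumes a resolv.conf-style file and the 192.168.127.1 gateway address mentioned in this thread. The function names `list_nameservers` and `uses_gvisor_gateway` are made up for illustration and are not part of Rancher Desktop or gvisor-tap-vsock.

```shell
#!/usr/bin/env bash
# Hypothetical helpers, not part of any Rancher Desktop tooling.

# Address reported in this thread as the gvisor-tap-vsock gateway.
GVISOR_GATEWAY="192.168.127.1"

# list_nameservers [FILE]: print every nameserver address found in a
# resolv.conf-style file (defaults to /etc/resolv.conf).
list_nameservers() {
  awk '$1 == "nameserver" { print $2 }' "${1:-/etc/resolv.conf}"
}

# uses_gvisor_gateway [FILE]: succeed if any listed nameserver is the gateway.
uses_gvisor_gateway() {
  list_nameservers "$1" | grep -qxF "$GVISOR_GATEWAY"
}
```

If `uses_gvisor_gateway` succeeds, name resolution in that environment depends on the virtual gateway answering on port 53. A ping timeout against the gateway does not by itself prove the resolver is down, since a virtual gateway is not guaranteed to answer ICMP; an actual DNS query is the more direct test.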

Member Author

Nino-K commented Feb 11, 2025

@jankap are you able to try out the build from our main branch, which can be found here? (You need to be logged in to GitHub.) Thanks.


jankap commented Feb 11, 2025

I will. Please give me a few days, I'm sick at home, just wanted to comment, thanks :)


JWood48 commented Feb 14, 2025

I'm having a similar issue; I don't know if it's related or something new. I'm behind a corporate VPN, and when pulling images the Docker host freezes and loses its network connection after some time.

For example, when I start pulling an image, many layers are pulled, but then it suddenly hangs:

Image

If the process is cancelled and rerun, I get this error:

$ docker pull mcr.microsoft.com/dotnet/sdk:8.0
Error response from daemon: Get "https://mcr.microsoft.com/v2/": dial tcp: lookup mcr.microsoft.com on 192.168.127.1:53: dial udp 192.168.127.1:53: connect: network is unreachable

This is the same for any repository I pull from:

$ docker pull confluentinc/cp-kafka:7.0.0
Error response from daemon: Get "https://registry-1.docker.io/v2/": dial tcp: lookup registry-1.docker.io on 192.168.127.1:53: dial udp 192.168.127.1:53: connect: network is unreachable

I need to restart Rancher Desktop to get it working again, and after many tries I might get lucky and pull all layers before it freezes.

(I'm on Windows 11 and Rancher Desktop 1.17.1)
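
The error messages quoted so far seem to point at different layers: some name a failed lookup against the resolver on port 53, while others already contain a resolved IP address and fail later. As a rough aid for sorting reports, here is a sketch that classifies a `docker pull` error string by pattern. The function name `classify_pull_error` and the category labels are invented for this example, and the patterns are based only on the messages in this thread, so other failures will fall through to `unknown`.

```shell
#!/usr/bin/env bash
# Hypothetical helper: guess which stage failed from a docker pull error text.
# The patterns cover only the messages quoted in this thread.
classify_pull_error() {
  local msg="$1"
  case "$msg" in
    # The daemon could not reach the resolver at all.
    *"lookup "*" on "*":53"*)    echo "dns" ;;
    # The name resolved to an IP, but the TCP connection timed out.
    *"dial tcp "*"i/o timeout"*) echo "tcp-timeout" ;;
    # The HTTP client gave up while waiting for a connection or headers.
    *"Client.Timeout exceeded"*) echo "http-timeout" ;;
    # The connection was closed before a response arrived.
    *": EOF")                    echo "connection-closed" ;;
    *)                           echo "unknown" ;;
  esac
}
```

A `dns` result suggests checking the resolver first. The other categories mean the failure was reported after name resolution, which hints at routing, proxy, or VPN handling instead, although it does not rule out DNS playing a part.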


jankap commented Feb 18, 2025

> @jankap are you able to try out the build from our main branch which can be found here (you would need to be logged into the github). Thanks.

@Nino-K

I tried Version: 1.17.1-369-gc3ccaba62. The error changed.

 docker pull curlimages/curl
Using default tag: latest
Error response from daemon: Get "https://registry-1.docker.io/v2/": net/http: request canceled while waiting for connection (Client.Timeout exceeded while awaiting headers)

When disconnecting VPN, everything works.

> @jankap have you checked the Rancher Desktop WSL Proxy settings? Someone noted that it was causing them issues in 1.17.1 #8055 (comment)

Yes, the proxy integration is currently disabled.

Is there anything I can do to help track down the issue? Docker Desktop works, by the way.

Member Author

Nino-K commented Feb 18, 2025

@jankap, thanks for testing it out! One thing I suspect might be causing issues is the mirrored network mode in WSL, which we don't officially support. Are you able to disable it, or is it a must-have for your setup? Also, could you try pulling from the Windows CMD terminal on the host to see if you get a different result?


jankap commented Feb 19, 2025

> @jankap, thanks for testing it out! One thing I suspect might be causing issues is the mirrored network mode in WSL, which we don't officially support. Are you able to disable it, or is it a must-have for your setup? Also, could you try pulling from the Windows CMD terminal on the host to see if you get a different result?

@Nino-K the mirrored network mode is awesome because it gets rid of all kinds of issues with DNS, proxies, etc in WSL - no manual settings needed anymore, no messing around with resolv.conf etc. I think it's going to become the standard mode in the future.

But yes, I tried to disable the mode, used NAT again, and restarted WSL and Rancher. No change.

From inside WSL:

docker pull hello-world
Using default tag: latest
Error response from daemon: Get "https://registry-1.docker.io/v2/": net/http: request canceled while waiting for connection (Client.Timeout exceeded while awaiting headers)

From host powershell:

docker pull hello-world
Using default tag: latest
Error response from daemon: Get "https://registry-1.docker.io/v2/": net/http: request canceled while waiting for connection (Client.Timeout exceeded while awaiting headers)

Is there any way to debug from inside rancher-desktop or rancher-desktop-data? How can I find out whether it's DNS or proxy related? I have not set any proxies, because I'm not sure how to do that; see #4289

Member Author

Nino-K commented Feb 19, 2025

To figure out whether it's a DNS issue, you can run rdctl shell nslookup registry-1.docker.io to see if it resolves.
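
That check can be wrapped so it is easy to repeat with the VPN connected and disconnected. This is a small sketch, not an official diagnostic: `dns_status` and the `LOOKUP_CMD` override are names invented here, and it assumes the `nslookup` inside the VM exits non-zero when the lookup fails.

```shell
#!/usr/bin/env bash
# Hypothetical wrapper around the lookup suggested above.
# LOOKUP_CMD can be overridden, for example with plain "nslookup",
# to run the same check directly on the host.
dns_status() {
  local host="$1"
  # Intentionally unquoted so the default expands to separate words.
  if ${LOOKUP_CMD:-rdctl shell nslookup} "$host" >/dev/null 2>&1; then
    echo "resolves"
  else
    echo "fails"
  fi
}
```

For example, `dns_status registry-1.docker.io` run once with the VPN connected and once without shows whether resolution inside the VM changes with the VPN state.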

@jandubois jandubois modified the milestones: 1.18, 1.19 Feb 19, 2025