From 6995f4a5ecf4a740f70d878fc3423c4fba16955d Mon Sep 17 00:00:00 2001
From: Anurag Guda
Date: Wed, 2 Oct 2024 10:39:38 -0500
Subject: [PATCH] Troubleshooting Steps

Signed-off-by: Anurag Guda
---
 README.md                                 |    3 +
 troubleshooting/README.md                 |    6 +
 troubleshooting/dns.md                    |   62 +
 troubleshooting/driver-container-logs.png | 2352 +++++++++++++++++++++
 4 files changed, 2423 insertions(+)
 create mode 100644 troubleshooting/README.md
 create mode 100644 troubleshooting/driver-container-logs.png

diff --git a/README.md b/README.md
index ac1b2e2..f863064 100755
--- a/README.md
+++ b/README.md
@@ -114,6 +114,9 @@ For more Information about customize the values, please refer [Installation](htt

 `NOTE:` (Cloud Native Stack does not allow the deployment of several control plane nodes)

+# Troubleshooting
+
+[Troubleshoot CNS installation issues](https://github.com/NVIDIA/cloud-native-stack/blob/master/troubleshooting/README.md)

 # Getting help or Providing feedback

diff --git a/troubleshooting/README.md b/troubleshooting/README.md
new file mode 100644
index 0000000..bd84406
--- /dev/null
+++ b/troubleshooting/README.md
@@ -0,0 +1,6 @@
+# CNS Troubleshooting
+
+A CNS deployment can fail for a variety of reasons.
+The topics below provide guidance for identifying the root cause.
+
+[DNS Issues](https://github.com/NVIDIA/cloud-native-stack/blob/master/troubleshooting/dns.md)
\ No newline at end of file
diff --git a/troubleshooting/dns.md b/troubleshooting/dns.md
index e69de29..b674e79 100644
--- a/troubleshooting/dns.md
+++ b/troubleshooting/dns.md
@@ -0,0 +1,62 @@
+# DNS troubleshooting
+
+## DNS resolution for pods
+
+### Driver Container failed to access archive.ubuntu.com
+
+#### Issue:
+
+Driver Container logs display the following error messages:
+![driver container logs](https://github.com/NVIDIA/cloud-native-stack/blob/master/troubleshooting/driver-container-logs.png)
+
+
+#### Troubleshooting:
+
+Follow the steps at https://kubernetes.io/docs/tasks/administer-cluster/dns-debugging-resolution/ to troubleshoot DNS resolution for pods.
+
+To install the dnsutils pod, run the following command:
+```
+kubectl apply -f https://k8s.io/examples/admin/dns/dnsutils.yaml
+```
+
+In a working CNS deployment, the output should look similar to the following:
+
+```
+nvidia@ipp1-1394:~$ kubectl exec -i -t dnsutils -- nslookup archive.ubuntu.com
+Server: 10.96.0.10
+Address: 10.96.0.10#53
+
+Non-authoritative answer:
+Name: archive.ubuntu.com
+Address: 91.189.91.82
+Name: archive.ubuntu.com
+Address: 185.125.190.82
+Name: archive.ubuntu.com
+Address: 185.125.190.83
+Name: archive.ubuntu.com
+Address: 185.125.190.81
+Name: archive.ubuntu.com
+Address: 91.189.91.81
+Name: archive.ubuntu.com
+Address: 91.189.91.83
+Name: archive.ubuntu.com
+Address: 2620:2d:4002:1::103
+Name: archive.ubuntu.com
+Address: 2620:2d:4000:1::101
+Name: archive.ubuntu.com
+Address: 2620:2d:4002:1::102
+Name: archive.ubuntu.com
+Address: 2620:2d:4002:1::101
+Name: archive.ubuntu.com
+Address: 2620:2d:4000:1::103
+Name: archive.ubuntu.com
+Address: 2620:2d:4000:1::102
+```
+
+Note that Name must be exactly 'archive.ubuntu.com':
+
+***Name: archive.ubuntu.com***
+
+
+If the output differs, fix the root cause: check with the team in charge of the DNS server. They may have created an entry for archive.ubuntu.com; if so, they must remove it.
+
diff --git a/troubleshooting/driver-container-logs.png b/troubleshooting/driver-container-logs.png
new file mode 100644
index 0000000..94b34b4
--- /dev/null
+++ b/troubleshooting/driver-container-logs.png
@@ -0,0 +1,2352 @@