[BUG] Yurthub keeps crashing: "lb failed to cache req yurthub watch services" #2230

Open
Gillefranc opened this issue Dec 15, 2024 · 2 comments

Gillefranc commented Dec 15, 2024

What happened:
After adding one node with the yurtadm join command, yurthub keeps crashing and restarting every few minutes, which causes kube-flannel, kube-proxy and raven-agent to restart as well.

What you expected to happen:
For yurthub to keep running normally.

How to reproduce it (as minimally and precisely as possible):

  • Installed a Kubernetes master node on a server with kubeadm (version 1.28.5).
  • Installed flannel with CIDR 10.245.0.0/16.
  • Installed the OpenYurt components using helm, following this guide (https://openyurt.io/docs/installation/manually-setup).
  • Installed yurtadm on an edge node (Raspberry Pi running Ubuntu), and executed the following command:
    yurtadm join 10.0.20.230:6443 --token=xxx --node-type=edge --discovery-token-unsafe-skip-ca-verification --cri-socket=unix:///run/containerd/containerd.sock --v=5
  • All components are installed, but the edge components crash after a few minutes and then restart, in a loop.

Anything else we need to know?:

Environment:

  • OpenYurt version: v1.5.0
  • Kubernetes version (use kubectl version):
    Client Version: v1.31.3
    Kustomize Version: v5.4.2
    Server Version: v1.28.15
  • OS (e.g: cat /etc/os-release):
    NAME="Ubuntu"
    VERSION_ID="24.04"
    VERSION="24.04.1 LTS (Noble Numbat)"
  • Kernel (e.g. uname -a):
    Linux thesis-gilleslefranc-2 6.8.0-49-generic #49-Ubuntu SMP PREEMPT_DYNAMIC Mon Nov 4 02:06:24 UTC 2024 x86_64 x86_64 x86_64 GNU/Linux
  • Install tools:
  • Others:

YurtHub logs (last 10 lines):

2024-12-15T11:48:15.291964103Z stderr F I1215 11:48:15.291262       1 cache_manager.go:386] yurthub watch nodepools: /apis/apps.openyurt.io/v1beta1/nodepools get 0 objects(add:0/update:0/del:0)
2024-12-15T11:48:15.292577203Z stderr F E1215 11:48:15.291396       1 loadbalancer.go:391] lb failed to cache req yurthub watch nodepools: https://10.0.20.230:6443/apis/apps.openyurt.io/v1beta1/nodepools?allowWatchBookmarks=true&resourceVersion=5434&timeoutSeconds=309&watch=true in local cache, EOF
2024-12-15T11:48:15.292674924Z stderr F I1215 11:48:15.291069       1 util.go:248] yurthub watch nodepools: /apis/apps.openyurt.io/v1beta1/nodepools?allowWatchBookmarks=true&resourceVersion=5434&timeoutSeconds=309&watch=true with status code 200, spent 1m15.187555851s
2024-12-15T11:48:15.292730497Z stderr F I1215 11:48:15.291838       1 util.go:248] yurthub watch services: /api/v1/services?allowWatchBookmarks=true&resourceVersion=5432&timeout=5m7s&timeoutSeconds=307&watch=true with status code 200, spent 1m15.083990561s
2024-12-15T11:48:15.292761737Z stderr F I1215 11:48:15.292107       1 secure_serving.go:311] Stopped listening on 169.254.2.1:10261
2024-12-15T11:48:15.292789329Z stderr F I1215 11:48:15.287226       1 dynamic_cafile_content.go:170] "Shutting down controller" name="client-ca-bundle::/var/lib/yurthub/pki/ca.crt"
2024-12-15T11:48:15.292816385Z stderr F E1215 11:48:15.292258       1 cache_manager.go:392] yurthub /api/v1/namespaces/kube-system/configmaps watch decode ended with: EOF
2024-12-15T11:48:15.292843088Z stderr F I1215 11:48:15.287318       1 tlsconfig.go:255] "Shutting down DynamicServingCertificateController"
2024-12-15T11:48:15.292869106Z stderr F I1215 11:48:15.292326       1 cache_manager.go:386] yurthub watch configmaps: /api/v1/namespaces/kube-system/configmaps get 0 objects(add:0/update:0/del:0)
2024-12-15T11:48:15.293191137Z stderr F I1215 11:48:15.287407       1 start.go:207] hub agent exited

/kind bug

@Gillefranc Gillefranc added the kind/bug kind/bug label Dec 15, 2024
rambohe-ch (Member) commented

@Gillefranc It seems that the Yurthub container received a SIGTERM signal from the OS, which means the Yurthub process was terminated from outside rather than exiting on its own.
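
One way to check this against the logs above: the container runtime timestamps (first column) are nearly identical, but the klog timestamps (fifth column) differ, and ordering by them shows which messages were emitted first. The sketch below is only an illustration; the function name `sort_by_klog_time` and the file name `yurthub.log` are hypothetical, and it assumes every line uses the `<runtime-time> stderr F <klog-header> <klog-time> ...` layout shown in the issue.

```shell
#!/bin/sh
# Reorder CRI-format log lines by the klog time-of-day in field 5.
# Assumption: all lines come from the same day, because only the
# time-of-day is compared (logs spanning midnight would mis-sort).
sort_by_klog_time() {
    # -s keeps the input order for lines with equal timestamps,
    # -b ignores leading blanks in the key, LC_ALL=C compares bytewise.
    LC_ALL=C sort -s -b -k5,5
}

# Usage (yurthub.log is a hypothetical file holding the saved log lines):
#   sort_by_klog_time < yurthub.log
```

Applied to the ten lines in the issue, the "Shutting down controller", "Shutting down DynamicServingCertificateController" and "hub agent exited" messages (11:48:15.287xxx) sort ahead of the "lb failed to cache req" and "watch decode ended with: EOF" errors (11:48:15.291xxx and later). That ordering suggests the EOF errors are a side effect of the shutdown rather than its cause, which is consistent with an external termination, though it does not by itself show which signal was sent.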

rambohe-ch (Member) commented

@Gillefranc You can use the following command to verify which signal causes yurthub to exit:

strace -p {PID} -e trace=signal

Replace {PID} with the PID of the yurthub process.
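
A minimal sketch of how the PID lookup and the strace call could be combined on the edge node. It assumes the process name is exactly `yurthub` (check with `ps` if unsure), that `pgrep` and `strace` are installed on the host, and that the commands run as root; the helper names `find_yurthub_pid` and `trace_signals` are hypothetical. The `-f` flag is an addition to the command above so that all threads of the process are attached.

```shell
#!/bin/sh
# Print the PID of the oldest process whose name is exactly "$1"
# (default: yurthub, an assumed process name). Returns non-zero if none.
find_yurthub_pid() {
    pgrep -o -x "${1:-yurthub}"
}

# Attach strace to the given PID and print only signal-related events.
# -f attaches every thread of a multi-threaded process.
trace_signals() {
    strace -f -p "$1" -e trace=signal
}

# Usage, run on the node while yurthub is up, then wait for the next exit:
#   pid=$(find_yurthub_pid) && trace_signals "$pid"
```

When a signal arrives, strace prints a line such as `--- SIGTERM {si_signo=SIGTERM, si_code=SI_USER, si_pid=..., si_uid=...} ---`; the `si_pid` field identifies the sending process, which helps to find out what is stopping yurthub.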
