dfinit: restart-container-runtime restart loop #294
Comments
@kakkoyun I will fix it. Thanks!
Thank you 🙏
This is my patch to make it work, but it's not production-worthy. Restarting the container runtime from inside a container is NOT a good idea, but I'm not sure there's any other way to do this. Let's make sure there isn't any loop so that Kubernetes can schedule the pods eventually.
- op: remove
  path: /spec/template/spec/initContainers/3
- op: add
  path: /spec/template/spec/initContainers/-
  value:
    name: restart-container-runtime
    image: docker.io/busybox:latest
    command:
      - /bin/sh
      - -cx
      - |-
        if [ -f /var/lib/dragonfly/container-runtime-restarted ]; then
          echo "container runtime already restarted once"
          exit 0
        fi
        echo "restarting container runtime..."
        touch /var/lib/dragonfly/container-runtime-restarted
        nsenter -t 1 -m -- systemctl try-reload-or-restart containerd.service
        echo "restart container"
    securityContext:
      privileged: true
    volumeMounts:
      - name: storage
        mountPath: /var/lib/dragonfly
One other issue is about the registry mirrors in the containerd config:
# A Kubernetes DaemonSet patch to add initContainers to the dragonfly-client DaemonSet.
- op: add
  path: /spec/template/spec/initContainers/0
  value:
    name: update-containerd-remove-registry-mirrors
    image: python:3.12-slim
    securityContext:
      privileged: true
    volumeMounts:
      - name: containerd-config-dir
        mountPath: /etc/containerd
    # The command below removes the registry mirrors from the containerd config.toml file.
    # When config_path is defined, 'mirrors' cannot be specified for the registry entry.
    command:
      - /bin/sh
      - -cxe
      - |-
        apt-get update && apt-get install -y jq
        pip install yq
        if tomlq -e '.plugins."io.containerd.grpc.v1.cri".registry.mirrors' /etc/containerd/config.toml > /dev/null; then
          tomlq -i -t 'del(.plugins."io.containerd.grpc.v1.cri".registry.mirrors)' /etc/containerd/config.toml
          nsenter -t 1 -m -- systemctl try-reload-or-restart containerd.service
          echo "containerd config updated"
        else
          echo "Entry does not exist, no changes made"
        fi
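The patch above installs jq and yq on every container start just to delete the mirrors tables. As a rough alternative, the same cleanup could be approximated with plain awk. This is only a sketch: `strip_registry_mirrors` is a hypothetical helper name, and the filter is line-based rather than a real TOML parser, so it assumes every table header sits on its own line and that no value line starts with `[`.

```shell
#!/bin/sh
# Sketch only: a line-based approximation, not a TOML parser.
# strip_registry_mirrors is a hypothetical helper, not part of dfinit.

strip_registry_mirrors() {
  # $1 = path to a containerd config.toml.
  # Prints the config to stdout without any table whose header contains
  # ".registry.mirrors", including the key/value lines below that header.
  awk '
    # A line starting with "[" opens a new table; decide whether to skip it.
    /^[ \t]*\[/ { skip = ($0 ~ /\.registry\.mirrors/) }
    # Print every line that is not inside a skipped table.
    !skip { print }
  ' "$1"
}
```

A caller would write the result to a temporary file and move it into place, for example `strip_registry_mirrors /etc/containerd/config.toml > /tmp/config.toml && mv /tmp/config.toml /etc/containerd/config.toml`, and restart containerd only if the file actually changed.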
@kakkoyun Is it because containerd's configuration file has been changed incorrectly, causing containerd to fail to restart? If so, please provide me with the default configuration for GKE's containerd.
Hey @gaius-qi, sorry for the delayed response. I was on PTO and away from the keyboard. I can create another issue for the GKE-specific error if that would be clearer. Let me know. But briefly, GKE injects a config_path setting:
[plugins."io.containerd.grpc.v1.cri".registry]
  config_path = "/etc/containerd/certs.d"
So, with the quick and dirty solution that I proposed in #294 (comment), I made it work. However, let me know if you need further explanation.
@kakkoyun
Example A:
[plugins."io.containerd.grpc.v1.cri".registry]
  [plugins."io.containerd.grpc.v1.cri".registry.mirrors]
    [plugins."io.containerd.grpc.v1.cri".registry.mirrors."docker.io"]
      endpoint = ["https://registry-1.docker.io"]
    [plugins."io.containerd.grpc.v1.cri".registry.mirrors."gcr.io"]
      endpoint = ["https://gcr.io"]
Example B:
[plugins."io.containerd.cri.v1.images".registry]
  [plugins."io.containerd.cri.v1.images".registry.mirrors]
    [plugins."io.containerd.cri.v1.images".registry.mirrors."docker.io"]
      endpoint = ["https://registry-1.docker.io"]
    [plugins."io.containerd.cri.v1.images".registry.mirrors."gcr.io"]
      endpoint = ["https://gcr.io"]
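To tell which of the two layouts a given node uses, a quick check on the config file may help. The sketch below only looks for the two table names shown in Example A and Example B; `detect_registry_plugin` is a hypothetical helper name, and the match is a plain text search, not a TOML parse.

```shell
#!/bin/sh
# Sketch only: reports which registry table name appears in a containerd config.
# detect_registry_plugin is a hypothetical helper, not part of dfinit.

detect_registry_plugin() {
  # $1 = path to a containerd config.toml.
  # Prints the plugin name that owns the registry table, or "none".
  if grep -q 'plugins\."io\.containerd\.cri\.v1\.images"\.registry' "$1"; then
    echo 'io.containerd.cri.v1.images'   # layout of Example B
  elif grep -q 'plugins\."io\.containerd\.grpc\.v1\.cri"\.registry' "$1"; then
    echo 'io.containerd.grpc.v1.cri'     # layout of Example A
  else
    echo 'none'
    return 1
  fi
}
```

For example, `detect_registry_plugin /etc/containerd/config.toml` on the node (or in a privileged init container with the config directory mounted) prints the name to use in any later query or patch.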
@gaius-qi Yes, quite similar. Here is exactly how it looks:
[plugins."io.containerd.grpc.v1.cri".registry.mirrors."docker.io"]
  endpoint = ["https://mirror.gcr.io","https://registry-1.docker.io"]
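Since `mirrors` and `config_path` cannot be combined, a mirror entry like this one would have to be expressed as a `hosts.toml` file under the `config_path` directory instead. The sketch below writes such a file in containerd's hosts.toml layout. `write_hosts_toml` is a hypothetical helper name, the capabilities list is an assumption, and this is not what dfinit itself generates.

```shell
#!/bin/sh
# Sketch only: writes <certs.d>/<registry>/hosts.toml for a set of mirrors.
# write_hosts_toml is a hypothetical helper, not part of dfinit.

write_hosts_toml() {
  # $1 = certs.d directory (the value of config_path)
  # $2 = registry host name, e.g. docker.io
  # $3 = upstream server URL used as the final fallback
  # remaining args = mirror URLs, tried in the order given
  dir="$1/$2"
  server=$3
  shift 3
  mkdir -p "$dir"
  {
    printf 'server = "%s"\n' "$server"
    for mirror in "$@"; do
      printf '\n[host."%s"]\n' "$mirror"
      # Assumption: mirrors are used for pulling and tag resolution only.
      printf '  capabilities = ["pull", "resolve"]\n'
    done
  } > "$dir/hosts.toml"
}
```

The GKE entry above would then roughly correspond to `write_hosts_toml /etc/containerd/certs.d docker.io https://registry-1.docker.io https://mirror.gcr.io`.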
@kakkoyun If you don't know how to get the entire dfinit config and the dfinit version, you can give me the Helm chart config.
@kakkoyun Can you provide your entire containerd config before installing, the entire dfinit config, and the dfinit version? I want to fix the bug. Thanks!
@gaius-qi I'll do it as soon as I have some free cycles.
Bug report:
The restart-container-runtime init container is configured to restart the container runtime without any conditions. As a result, the pod remains in an unready state (NotReady) perpetually. This happens because the container runtime is continuously being restarted, preventing the pod from reaching a stable, ready state.
The restart should happen only once, and only if the configuration has changed, so that the pod can be marked as ready on the next iteration.
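One way to get this behavior is to record a checksum of the config at the time of the last restart and to skip the restart when the checksum still matches. The sketch below assumes `sha256sum` is available in the image and that the state file lives on a volume that survives container restarts. All function names and the state file path are hypothetical, not part of dfinit.

```shell
#!/bin/sh
# Sketch only: restart the runtime when the config content has changed,
# and at most once per distinct config. All names here are hypothetical.

config_checksum() {
  # Print the SHA-256 digest of the file given as $1.
  sha256sum "$1" | cut -d' ' -f1
}

needs_restart() {
  # $1 = config file, $2 = state file holding the checksum from the last restart.
  # Succeeds (returns 0) when no checksum is recorded or the config differs.
  if [ ! -f "$2" ]; then
    return 0
  fi
  [ "$(config_checksum "$1")" != "$(cat "$2")" ]
}

restart_if_changed() {
  # $1 = config file, $2 = state file, remaining args = restart command.
  config=$1
  state=$2
  shift 2
  if needs_restart "$config" "$state"; then
    # Record the checksum first: the restart may kill this container
    # before anything after it gets a chance to run.
    config_checksum "$config" > "$state"
    "$@"
  else
    echo "config unchanged, skipping restart"
  fi
}
```

In the init container this could be called as `restart_if_changed /etc/containerd/config.toml /var/lib/dragonfly/containerd-config.sha256 nsenter -t 1 -m -- systemctl try-reload-or-restart containerd.service`. Writing the checksum before the restart trades retry-on-failure for loop safety: a failed restart is not retried until the config changes again.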
Expected behavior:
The DaemonSet should start normally.
How to reproduce it:
values.yaml
Environment:
uname -a: Linux jack-oneill 6.9.3-arch1-1 #1 SMP PREEMPT_DYNAMIC Fri, 31 May 2024 15:14:45 +0000 x86_64 GNU/Linux
Logs:
kubectl describe pod