upgrading to runc 1.1.6 / 1.1.7 breaks #3223
xref: kubernetes/k8s.io#5276 (we'll need to make sure we continue to cover v1 elsewhere for some time)
I ran an image built for k8s 1.27.1 with #3221 on a GKE 1.26 / cgroupv2 node pool and saw no issues so far, while checking into kubernetes/k8s.io#5276. Eventually a lot of these headaches will go away when v1 goes away, but not just yet; probably another 1-2 years.
In the CI nested environment we have:
with docker. However, there is also the host CI node-level containerd/runc. I don't think I have direct access to the k8s infra CI nodes, so it's not quite as easy to confirm the versions there.
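For reference, these are the kind of commands one would use to confirm the runtime versions at each layer, assuming shell access to the node or CI pod (a generic sketch, not the exact commands used here):

    # nested docker daemon inside the CI pod
    docker version --format '{{.Server.Version}}'
    # host-level containerd and runc
    containerd --version
    runc --version
    # container runtime reported per node by the kubelet
    kubectl get nodes -o wide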
I think we can just push older k8s versions with a base image from ... to ...
There are still some less frequent issues with the misc controller in Kubernetes CI, ref: #3250 (comment). The host runtime is not aware of misc, and probably won't be for a while.
The trick from @kolyshkin to unmount the misc controller doesn't appear to work, even if we add some logic to consider misc unsupported when on cgroup v1 + Kubernetes without the kubelet runc update. Tentatively, systemd discovers that misc is available and enabled on the host kernel via ... We have a bug currently where we'd mount it back as well, but even after fixing this and confirming it's not mounted before ... After inspecting systemd's logic for this, considering bind mounting a modified ...

Someday we will only need to support hosts with cgroups v2 and we can phase out most of the nonsense kind employs currently. At least we're always using cgroupns starting with the next release (#3241).
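For context, a rough sketch of the checks and the unmount trick being discussed, assuming a cgroup v1 host with shell access (not the exact steps from the linked threads):

    # does the kernel report the misc controller as available/enabled?
    grep misc /proc/cgroups
    # is it currently mounted as a v1 hierarchy?
    mount | grep misc
    # the unmount trick; in our case systemd ends up mounting it back
    sudo umount /sys/fs/cgroup/misc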
We also should get around to fixing the horrible dind setup that the main Kubernetes CI is running (which itself runs in Kubernetes pods), but similarly considering whether we can get that switched to cgroups v2 first (kubernetes/k8s.io#5276) and just test v1 in GitHub Actions without dind for the remaining users that haven't switched yet.
It seems like minikube is already using runc 1.1.7 and we have not faced any issues yet (or have not discovered them yet). @BenTheElder do you know a specific OS we could try on to see if it would fail for minikube? The oldest Ubuntu on free GitHub Actions is Ubuntu 20.04 (and minikube's GitHub Actions tests run on that), which seems to be cgroup v1.
Yes.
You won't see cluster bring-up fail, at least with kind, but once pods have been running for a while things will start to fail (e.g. when running e2e tests, container execs will break). I'm currently developing with a GCE VM on Ubuntu 23.04 but doing:
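The command block did not survive here; a plausible reconstruction (an assumption on my part, not the original commands) is forcing systemd back onto the legacy cgroup v1 hierarchy via the kernel command line and rebooting:

    # assumed steps: add systemd.unified_cgroup_hierarchy=0 to the kernel cmdline on Ubuntu
    sudo sed -i 's/^GRUB_CMDLINE_LINUX="/GRUB_CMDLINE_LINUX="systemd.unified_cgroup_hierarchy=0 /' /etc/default/grub
    sudo update-grub
    sudo reboot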
This ensures a new enough kernel to have the misc controller. I was planning to follow up with minikube when we had a solution; there have been some other recent patches for always using cgroupns=private, but they're not quite fully baked yet. For now I'd recommend moving back to 1.1.5; the bug fixes since 1.1.5 are mostly pretty minor currently.
If you use Kubernetes without the recent patches to update to runc 1.1.6 (only available in 1.24+ on the latest patch versions), the problems are worse. opencontainers/runc#3849
#3255 should resolve this. The change itself is a bit messy, so I've outlined the core necessary parts and the key approach in the PR body / comments.
An update on the minikube side: we could reproduce this bug for minikube, even though we have the latest runc version, ...
However, it is worth noting that after doing the above, the mount grep was still showing cgroupv2, so maybe we failed to make it cgroup v1.
To reproduce you also need a new enough kernel to have the misc controller; I used 23.04. Also make sure to set unified under ...
I'll try with an Ubuntu 23.04 machine; previously what I tried was: ...
If cgroup2 is on, Docker etc. should still use v1 then; systemd calls this "hybrid" mode (https://systemd.io/CGROUP_DELEGATION/). That's expected. You don't need pure v1 mode. You do need ...
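A quick, generic way to tell which mode a host is in (not specific to kind or this issue):

    # prints "cgroup2fs" on a pure v2 (unified) host, "tmpfs" on v1 or hybrid
    stat -fc %T /sys/fs/cgroup/
    # on a hybrid host the v2 hierarchy is additionally mounted, typically at /sys/fs/cgroup/unified
    mount | grep cgroup2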
FWIW: I'm not currently reproducing the issue; what I'm looking for is misc in use, since I already settled on just disabling misc in v1 (see discussion in #3255). But when I was, the reproducer in the runc issue was sufficient.
I would also recommend considering #3241 while working on the cgroups support. It has the downside of raising the minimum docker version to 20.10.0 (2.5 years old), but it makes the whole containers-in-containers thing a lot cleaner. We get this by default from all major runtimes with the transition to cgroups v2, but as long as users are on v1, v1 with cgroupns on is a lot better. For kind at least that required some additional fixups; for minikube it might be as simple as adding ...
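The tail of that sentence was cut off; as a hedged illustration of what enabling a private cgroup namespace looks like at the docker level (not necessarily the exact minikube change), with a placeholder image name:

    # requires docker 20.10.0+; gives the node container its own cgroup namespace
    docker run -d --privileged --cgroupns=private <node-image>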
kind is on runc 1.1.7 now
I see all the changes to fix this happened at the kind node base image. I'm using image ...
My guess is that newer kind k8s node images haven't been updated/rebuilt with the new base image? Is there a workaround at the OS level I can use to make newer k8s versions run without this issue on cgroup v1? I'm using ... therefore OS version ...
Thanks for the report. To avoid image changes like this, please use the digests as instructed in the release notes (i.e. @sha256...).
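For illustration, digest pinning with kind looks roughly like this; the digest below is a placeholder, the real values are listed in each release's notes:

    kind create cluster --image kindest/node:v1.27.1@sha256:<digest-from-release-notes>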
To fully debug your environment we'll need a full bug template report with the information requested there. I suspect this is due to cgroup v1 being used for the cluster nodes without cgroupns=private. In kind v0.20.0 we will always force cgroupns=private, but the images were expected to continue to work with cgroupns=host.
1.27.2 has been ..., so probably the opposite issue? In the short term, if you use the digest pinning you will be able to use a version predating these base image changes. Or you could try the latest kind code at HEAD and see if the cgroupns=private change solves it.
I'm using "CgroupnsMode": "private" (https://docs.docker.com/engine/api/v1.43/#tag/Container/operation/ContainerCreate). Thanks for the quick response and the help. It is all working fine now and I can do ...
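For anyone else verifying this, one generic way to confirm the setting on a running container (not specific to this setup):

    # prints "private" when the container has its own cgroup namespace
    docker inspect --format '{{.HostConfig.CgroupnsMode}}' <container-name>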
See: #3220, https://kubernetes.slack.com/archives/CEKK1KTN2/p1683851267796889
#3221 and #3222 have test results.
The failure mode is like: ...
This is limited to cgroups v1. It happens in our CI environment, which is made even worse by the nesting: host node containerd => dockerd (in a CI cluster pod) => kind (running against that nested dockerd).