Missing runtime metrics from cAdvisor #3776

Closed
stevehipwell opened this issue Feb 14, 2024 · 5 comments · Fixed by #3804
Labels: area/kubernetes (K8s including EKS, EKS-A, and including VMW), type/bug (Something isn't working), type/enhancement (New feature or request)

Comments

@stevehipwell

Platform I'm building on:

EKS

What I expected to happen:

I'd expect to see the cAdvisor runtime values as part of the kubelet metrics.

What actually happened:

The kubelet cAdvisor metrics are missing the runtime values, which means we can't see the node usage.

How to reproduce the problem:

Run the following command:

kubectl get --raw "/api/v1/nodes/${NODE_NAME}/proxy/stats/summary" | jq -r '.node.systemContainers[] | .name'

You'll see the following response, with no runtime entry:

kubelet
pods

This is the same as awslabs/amazon-eks-ami#1667; the solution I added there also seems relevant in the context of Bottlerocket.
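
For anyone who wants to script the reproduction check, here is a minimal Python sketch. It assumes the stats summary JSON has already been fetched (for example with the kubectl command in this report) and that a healthy node lists kubelet, runtime, and pods under node.systemContainers. The function name and the sample payload are hypothetical and only for illustration.

```python
import json

# System container names expected in the kubelet stats summary.
# "runtime" is the entry this issue reports as missing.
EXPECTED = ("kubelet", "runtime", "pods")


def missing_system_containers(summary, expected=EXPECTED):
    """Return the expected system container names absent from a stats summary."""
    containers = summary.get("node", {}).get("systemContainers") or []
    present = {c.get("name") for c in containers}
    return [name for name in expected if name not in present]


# Hypothetical, trimmed-down payload mirroring the response reported above.
reported = json.loads(
    '{"node": {"systemContainers": [{"name": "kubelet"}, {"name": "pods"}]}}'
)
print(missing_system_containers(reported))  # ['runtime']
```

In practice the dictionary could come from json.load(sys.stdin) with the kubectl output piped in; an empty list would then indicate that the runtime entry is being reported.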

stevehipwell added the status/needs-triage and type/bug labels Feb 14, 2024
@arnaldo2792
Contributor

Hi @stevehipwell, thanks for letting us know! Let me give your solution a try and confirm. Just to keep a record: at some point we moved both containerd.service and kubelet.service under the runtime.slice cgroup. It looks like cAdvisor may need to be told which cgroups it should track in order to provide the metrics:

https://github.com/kubernetes/kubernetes/blob/ad6477e342c8ce0f9b1997d5345322c930f6911d/cmd/kubelet/app/server.go#L722
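
To make the cgroup point concrete, here is a rough Python sketch of how the runtime's cgroup could be read and turned into the kubelet's --runtime-cgroups flag. It assumes a cgroup v2 (unified) hierarchy, where /proc/<pid>/cgroup holds a single "0::<path>" line. The path /runtime.slice/containerd.service is an assumption based on the slice mentioned in this comment, the helper names are hypothetical, and the thread does not confirm that this is exactly what the eventual fix passes.

```python
def unified_cgroup_path(proc_cgroup_text):
    """Return the cgroup v2 path from the contents of /proc/<pid>/cgroup, or None."""
    for line in proc_cgroup_text.splitlines():
        # Each line is "hierarchy-ID:controller-list:path";
        # the cgroup v2 entry has ID 0 and an empty controller list.
        parts = line.split(":", 2)
        if len(parts) == 3 and parts[0] == "0" and parts[1] == "":
            return parts[2]
    return None


def runtime_cgroups_flag(cgroup_path):
    """Format the kubelet flag that names the container runtime's cgroup."""
    return f"--runtime-cgroups={cgroup_path}"


# Assumed contents of /proc/<containerd pid>/cgroup on a cgroup v2 host.
assumed_proc_cgroup = "0::/runtime.slice/containerd.service\n"
print(runtime_cgroups_flag(unified_cgroup_path(assumed_proc_cgroup)))
# --runtime-cgroups=/runtime.slice/containerd.service
```

On a real node the text would come from reading /proc/<pid>/cgroup for the containerd process instead of the hard-coded string.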

@stevehipwell
Author

@arnaldo2792 thanks. I've not had a chance to check out the actual code in detail but I'm surprised this doesn't just work. The kubelet should have all of the information required to automate this.

@ginglis13
Contributor

ginglis13 commented Mar 1, 2024

Hi @stevehipwell, thanks for the issue. I had a chance this afternoon to repro as well as attempt your suggested fix, and it worked :D I'm going to test some more, but I'll get a PR up when ready.

ginglis13 self-assigned this Mar 1, 2024
ginglis13 added the type/enhancement and area/kubernetes labels and removed the status/needs-triage label Mar 1, 2024
@webern
Contributor

webern commented Mar 18, 2024

@arnaldo2792 and @ginglis13, is this related somehow to #2743?

@arnaldo2792
Contributor

Yes, it is.

ginglis13 added a commit that referenced this issue Mar 22, 2024
kubernetes: provide runtime cgroup to kubelet