
Relying on undefined behavior - parsing kubernetes resource names #716

Open · thockin opened this issue Mar 10, 2023 · 10 comments
Labels: kind/bug Something isn't working

thockin commented Mar 10, 2023

https://github.com/gocrane/crane/blob/main/pkg/utils/expression_prom_deafult_test.go#L18

The random suffix of a pod or other resource is not guaranteed to be 5 characters, nor is the character set defined. Someone looked into the implementation details and took a dependency on them. This is brittle and could change at any time.

You should be using labels to extract metadata, not parsing the name. That's literally why they exist.
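For illustration, a minimal sketch of the label-based approach with client-go; the namespace, pod name, and the "app" label key are placeholders, while "pod-template-hash" is a label the Deployment controller already stamps onto its pods:

```go
package main

import (
	"context"
	"fmt"

	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/client-go/kubernetes"
	"k8s.io/client-go/rest"
)

func main() {
	cfg, err := rest.InClusterConfig()
	if err != nil {
		panic(err)
	}
	client := kubernetes.NewForConfigOrDie(cfg)

	// Read workload identity from labels set by the user or the workload
	// controller, instead of regexp-matching the generated pod name.
	pod, err := client.CoreV1().Pods("default").Get(context.TODO(), "foobar-deadbeef93-xyz76", metav1.GetOptions{})
	if err != nil {
		panic(err)
	}
	fmt.Println("app:", pod.Labels["app"])                             // user-defined label (placeholder key)
	fmt.Println("pod-template-hash:", pod.Labels["pod-template-hash"]) // set by the Deployment controller
}
```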

qmhu (Member) commented Mar 13, 2023

Thank you for bringing this to our attention. We agree that relying on the random suffix of a pod or other resource is brittle and not a reliable way to extract metadata. Labels are the intended mechanism for this, and we will explore moving to a label-based approach.

saikey0379 (Contributor) commented:

@thockin
Hello, I would like to ask a question.
We want to run predictions based on the resource/pods/external metric types. However, kubelet_ip:10250/metrics/resource only exposes namespace, pod, and container labels, while the prediction needs metrics from every ReplicaSet revision as input. Is there a better way for us to accurately match resources such as a Deployment or DaemonSet by name?

The same applies to the pods and external types. The ownerReferences of a Deployment's pods point to a ReplicaSet, and I think it is unreasonable for the process inside a pod to query the apiserver for the associated Deployment name. Currently we set the Deployment name manually as a custom label in the container via environment variables, which is not ideal. Can you offer any suggestions?

thockin commented Mar 14, 2023

This is a good question - I am not an expert in the metrics but it seems like what you have is a feature request for more labels.

Are you trying to get this information from inside the pod itself (its own parent ref), or just somewhere in the monitoring/collection path?

Are you trying to extract the replicaset or the deployment, or both?

What if a pod is not run under a deployment or replicaset? E.g. a StatefulSet or Job or something custom?

saikey0379 (Contributor) commented:

> This is a good question - I am not an expert in the metrics but it seems like what you have is a feature request for more labels.
>
> Are you trying to get this information from inside the pod itself (its own parent ref), or just somewhere in the monitoring/collection path?
>
> Are you trying to extract the replicaset or the deployment, or both?
>
> What if a pod is not run under a deployment or replicaset? E.g. a StatefulSet or Job or something custom?

Yes - this project operates on those three metric types, and the label concerns apply to both the resource and pods cases.

For resource type metrics, we would like the kubelet path to at least include the ownerReference when reporting pod status, and preferably the top-level owner such as the Deployment.

For pods type metrics, we would like the process inside the pod to obtain this information easily (natively, rather than via manually set environment variables) so it can populate custom metric labels.

For pods that are not under a Deployment or ReplicaSet, such as a StatefulSet or DaemonSet, the pod's parent reference can still be queried. For standalone pods, the name is specified at creation and exact matching is straightforward; I think it's acceptable for the reference to be null in that case.

thockin commented Mar 15, 2023

Sorry, I'm still not clear on what you are already doing (perhaps in ugly ways) and what you are asking for as net-new features.

> For resource type metrics, we would like the kubelet path to at least include the ownerReference when reporting pod status, and preferably the top-level owner such as the Deployment.

Is there a feature request here? ownerReferences is in the API, but to climb to the "root" you need GET access on all the intermediate resources, which you may not have.
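For illustration, a minimal sketch of that climb with client-go; the topOwner helper is made up here, it only handles the common Deployment/ReplicaSet chain, and it assumes GET access on ReplicaSets:

```go
package owners

import (
	"context"

	corev1 "k8s.io/api/core/v1"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/client-go/kubernetes"
)

// topOwner climbs Pod -> ReplicaSet -> Deployment via ownerReferences.
// Other (or third-party) workload kinds are returned as-is, and the climb
// stops wherever the caller lacks read access to an intermediate resource.
func topOwner(ctx context.Context, c kubernetes.Interface, pod *corev1.Pod) (kind, name string) {
	for _, ref := range pod.OwnerReferences {
		if ref.Kind != "ReplicaSet" {
			return ref.Kind, ref.Name // StatefulSet, DaemonSet, Job, custom, ...
		}
		rs, err := c.AppsV1().ReplicaSets(pod.Namespace).Get(ctx, ref.Name, metav1.GetOptions{})
		if err != nil {
			return ref.Kind, ref.Name // no GET access to the intermediate ReplicaSet
		}
		for _, rsRef := range rs.OwnerReferences {
			return rsRef.Kind, rsRef.Name // typically the owning Deployment
		}
		return ref.Kind, ref.Name // orphaned ReplicaSet
	}
	return "", "" // standalone pod with no owner
}
```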

> For pods type metrics, we would like the process inside the pod to obtain this information easily (natively, rather than via manually set environment variables) so it can populate custom metric labels.

As far as I can tell, ownerReferences is not available via env valueFrom or the downward API. Both seem reasonable to me, if someone wants to write a small KEP (to debate the finer points of formatting and edge cases). It should be fairly straightforward.
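For contrast, a sketch of what env valueFrom can already project today - a pod label rather than an ownerReference; the "workload-name" key is a made-up example that the pod template would have to carry already:

```go
package main

import (
	"fmt"

	corev1 "k8s.io/api/core/v1"
)

func main() {
	// The downward API can expose a pod *label* to the container as an env
	// var; exposing ownerReferences this way would be the new feature.
	env := corev1.EnvVar{
		Name: "WORKLOAD_NAME",
		ValueFrom: &corev1.EnvVarSource{
			FieldRef: &corev1.ObjectFieldSelector{
				FieldPath: "metadata.labels['workload-name']",
			},
		},
	}
	fmt.Printf("%+v\n", env)
}
```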

thockin commented Mar 15, 2023

saikey0379 (Contributor) commented:

If a pod's random suffix is fragile and could change at any time, then we would have two new requirements:

The first requirement is to add owner or topowner labels to kubelet metrics such as container_cpu_usage_seconds_total.

The second requirement is to enable processes within a pod to easily obtain owner or topowner information, which can be used to fill in monitoring metric labels.

If the provided owner is a ReplicaSet instead of a Deployment, then querying historical data with a Prometheus expression requires something like {ownerReferences=~"test-111111|test-222222|test-3333333..."}, which becomes very ugly when the Deployment rolls out new revisions frequently. An expression like {topOrKind="deployment",topOrName="test"} would be much simpler.
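Roughly, the difference between the two, using the placeholder names from above:

```go
package main

import (
	"fmt"
	"strings"
)

func main() {
	// With only ReplicaSet-level owners, the selector has to enumerate every
	// ReplicaSet revision the Deployment has ever produced.
	replicaSets := []string{"test-111111", "test-222222", "test-333333"}
	perRevision := fmt.Sprintf(`container_cpu_usage_seconds_total{ownerReferences=~"%s"}`,
		strings.Join(replicaSets, "|"))

	// With a top-level owner label, one equality matcher is enough
	// (topOrKind/topOrName are the hypothetical label names from above).
	topOwner := `container_cpu_usage_seconds_total{topOrKind="deployment",topOrName="test"}`

	fmt.Println(perRevision)
	fmt.Println(topOwner)
}
```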

thockin commented Mar 19, 2023

I'm afraid this is not quite so easy. There are a bunch of issues.

Deployments and ReplicaSets can orphan and adopt pods. Any additional information we provide can change over time. Is that acceptable? On Monday the pod "foobar-deadbeef93-xyz76" is owned by replicaset "foobar-deadbeef93", which is owned by deployment "foobar", but on Tuesday it is owned by replicaset "hamburger-aabbccddee" and deployment "hamburger". The pod's name only means something at the instant it is created. And even THEN, a long name can be truncated to add the random suffix, such that it doesn't actually match .*-.{10}-.{5}.

We could add well-known labels like workload.kubernetes.io/{deployment,replicaset,statefulset,job,...} - this works for core k8s, but not for 3rd party workload controllers (which do exist!!). We'd have to verify that all of these APIs' object names are valid label values (hint: they are not). We could hash the deployment name if it is too long or use the UID. All that tells you is that 2 pods are in the same deployment, not what the deployment name actually is.
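A rough sketch of that hashing fallback - purely illustrative, no such convention exists in Kubernetes, and it only addresses length, not the full label-value character rules:

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
)

// ownerLabelValue returns the owner name itself when it fits in a label value
// (63 characters max) and a hash of the name otherwise, so two pods can still
// be correlated even though the original name is no longer recoverable.
func ownerLabelValue(name string) string {
	const maxLabelValue = 63
	if len(name) <= maxLabelValue {
		return name
	}
	sum := sha256.Sum256([]byte(name))
	return hex.EncodeToString(sum[:])[:maxLabelValue]
}

func main() {
	fmt.Println(ownerLabelValue("foobar"))
	fmt.Println(ownerLabelValue("a-very-long-third-party-workload-name-that-exceeds-the-sixty-three-character-label-value-limit"))
}
```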

We could use annotations instead, but those are not selectable.

We could do a combination of labels AND annotations, I guess. Given the adoption issue, we would have to update those on the fly, meaning EVERY workload controller needs to handle it, and they might need more RBAC access. We can't just hash the selector, because an adoption could happen with a different-but-compatible selector.

Oh, but env vars are not updatable, so the downward API for this could only be through projected volumes.
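For reference, a sketch of the volume flavor; the field projected here is labels, since ownerReferences is not an available fieldPath today, and unlike env vars the projected files are refreshed when the labels change:

```go
package main

import (
	"fmt"

	corev1 "k8s.io/api/core/v1"
)

func main() {
	// A downwardAPI volume re-projects its files when pod metadata changes,
	// which env vars cannot do; this one exposes the pod's labels.
	vol := corev1.Volume{
		Name: "podinfo",
		VolumeSource: corev1.VolumeSource{
			DownwardAPI: &corev1.DownwardAPIVolumeSource{
				Items: []corev1.DownwardAPIVolumeFile{{
					Path:     "labels",
					FieldRef: &corev1.ObjectFieldSelector{FieldPath: "metadata.labels"},
				}},
			},
		},
	}
	fmt.Printf("%+v\n", vol)
}
```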

thockin commented Jan 14, 2024

Almost a year later this bug still exists. If kubernetes changes this behavior you will break.
