Introduce a mechanism to guard against unbounded cardinality #970
-
A direct use case for this kind of functionality would be this Kubernetes PR: kubernetes/kubernetes#104484. That PR adds a metric about probe duration, which is very valuable for detecting probe timeouts. However, identifying a probe requires the pod name and its namespace, so for this metric to be useful it needs at least these two dimensions, both of which are unbounded and user-controlled. From a security point of view that is a big concern, but from an observability one we really want this information, which we could then use to create alerts. So far, Kubernetes has taken the stance that we shouldn't introduce any metrics with unbounded cardinality, and that prevents a lot of observability improvements. That is even stated in our developer guidelines: https://github.com/kubernetes/community/blob/master/contributors/devel/sig-instrumentation/instrumentation.md#dimensionality--cardinality. With protection against cardinality explosions on the client side, we would be able to think less about security issues and focus more on making actual observability improvements.
-
@dgrisonnet Thanks for bringing this up and for the proposal. I like the idea.
As you already brought up, this is rather a big con. To prevent any major behaviour change, I would be happy to proceed if we enable this through an opt-in mechanism. In the documentation we can explain all the necessary trade-offs, and users could choose whether or not they want to use it.
-
I like the idea! Having more ways of protecting against overloading a Prometheus deployment makes sense to me.
What is still unclear to me is how an exporter like this can expose these limitations to Prometheus and its users. If an exporter imposes limitations like this, some operators (e.g. avg()) no longer give accurate results.
-
Summary
Cardinality explosion is a recurring security concern when working with Prometheus metrics, and although over time a lot of functionality has been added to the Prometheus server to prevent explosions from causing major incidents, no improvements have been made on the client side to control them. As such, when writing applications that intentionally expose metrics with unbounded cardinality, it is impossible to provide any guarantees to the users and consumers of said metrics.
Motivations
In a large open-source project such as Kubernetes, which has a big community and where we are continuously trying to improve monitoring, contributors often want to introduce metrics that make sense from a monitoring perspective but have unbounded cardinality. Even though these metrics would be really useful, this kind of change is very hard for maintainers to approve because of the security concerns that cardinality explosions involve. If the Prometheus client had a functionality for bounding the dimensions of these metrics, maintainers could merge these improvements with the guarantee that, even if users' Prometheus backends aren't set up to guard against cardinality explosions, the metrics exposed by Kubernetes would not explode anyway.
Goals
Non-goals
Proposal
Today, the only way to bound the cardinality of a metric is to go through each of its labels and identify the unbounded dimensions. Once that's done, the developer needs to allow-list the values that each label may take, in order to fix the cardinality of the metric and prevent it from exploding (see the sketch below). While this is simple for labels such as the HTTP status code, it is impossible when the label is, for example, a Kubernetes resource such as a pod or a namespace, since those are user-defined. At the same time, for a lot of Kubernetes-related metrics, it is hard to get rid of these labels since they identify the application that the metric refers to.
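For illustration, here is a minimal sketch of that allow-listing approach with client_golang. The `http_request_duration_seconds` metric, the `allowedCodes` set, and the `observe` helper are hypothetical names used only for this example:

```go
package example

import (
	"strconv"

	"github.com/prometheus/client_golang/prometheus"
)

// requestDuration has a single "code" label whose values we control.
var requestDuration = prometheus.NewHistogramVec(
	prometheus.HistogramOpts{
		Name: "http_request_duration_seconds",
		Help: "Duration of HTTP requests.",
	},
	[]string{"code"},
)

// allowedCodes is the fixed set of label values; anything else is
// collapsed into "other" so the cardinality stays bounded.
var allowedCodes = map[string]bool{"200": true, "404": true, "500": true}

func observe(statusCode int, seconds float64) {
	code := strconv.Itoa(statusCode)
	if !allowedCodes[code] {
		code = "other"
	}
	requestDuration.WithLabelValues(code).Observe(seconds)
}
```

This works because the set of plausible status codes is small and known in advance, which is exactly what is missing for user-defined values such as pod or namespace names.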
A hotbed of such unbounded-cardinality metrics is kube-state-metrics, and although there are mechanisms there to disable metrics, I don't think it is reasonable to ask users to give up all the information they have on their pods just because malicious users could exploit it to cause cardinality explosions.
My proposal for improving this from the client side is to focus on the very source of cardinality explosions: the unbounded dimensions. By introducing a hard limit on the number of values that a single label can take, we could bound dimensions that were previously unbounded. The idea is that if a label takes more values than the defined hard limit, the client blanks the value to avoid exposing exploding metrics, and then informs users/cluster administrators that a cardinality explosion was prevented at the cost of some granularity on the metric. Informing users that action needs to be taken can easily be done via a combination of metrics and alerts, and it is then up to them to either increase the cardinality limit or find the cause of the explosion and get rid of it.
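To make the idea more concrete, here is a rough sketch of what such a hard limit could look like if it were enforced at instrumentation time. The `boundedLabel` helper and the `probe_duration_seconds` metric are assumptions made for this example, not an existing client_golang API:

```go
package example

import (
	"sync"

	"github.com/prometheus/client_golang/prometheus"
)

// boundedLabel tracks the distinct values seen for one label and
// enforces a hard limit on how many are allowed.
type boundedLabel struct {
	mu    sync.Mutex
	seen  map[string]struct{}
	limit int
}

// value returns v unchanged while the limit is not exceeded, and an
// empty (blanked) string once more than `limit` distinct values were seen.
func (b *boundedLabel) value(v string) string {
	b.mu.Lock()
	defer b.mu.Unlock()
	if _, ok := b.seen[v]; ok {
		return v
	}
	if len(b.seen) >= b.limit {
		return "" // blanked: cardinality explosion prevented
	}
	b.seen[v] = struct{}{}
	return v
}

var probeDuration = prometheus.NewHistogramVec(
	prometheus.HistogramOpts{
		Name: "probe_duration_seconds",
		Help: "Duration of probes.",
	},
	[]string{"pod", "namespace"},
)

var podLabel = &boundedLabel{seen: map[string]struct{}{}, limit: 1000}

func observeProbe(pod, namespace string, seconds float64) {
	probeDuration.WithLabelValues(podLabel.value(pod), namespace).Observe(seconds)
}
```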
Pros:
Cons:
Design Details
My first idea about a potential design was focused on the UX that a maintainer would have when enforcing these hard limits. The easiest way would be to have them at the Registry level. In Kubernetes we have thousands of metrics but only a few Registries, so it would be much easier to enforce these kinds of restrictions in the few Registries that are responsible for gathering metrics. As such, I was thinking about either adding a new limit field to the `Registry` type of client_golang or creating a separate Registry that would add the cardinality-protection functionality.

As for enforcing the limit on the number of values that a dimension can take, this can be done at the end of the `Gather` call, since that's when all the series are transformed and when we can really know the cardinality of the metrics that will be served. The issue is that this change seems very expensive in terms of performance, since we have to go through all the timeseries a second time, so it might not be the best option.

The approach at the Registry level seemed very convenient for my use case, but it would mean that all the metrics in a Registry share the same limit. That is reasonable to me, but some use cases might want to define limits per metric.
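As a very rough sketch of the `Gather`-level idea, a wrapper around `prometheus.Gatherer` could count the distinct values per label name and blank the overflowing ones. The `limitedGatherer` type and its semantics below are purely hypothetical, not an existing client_golang API:

```go
package example

import (
	"github.com/prometheus/client_golang/prometheus"
	dto "github.com/prometheus/client_model/go"
)

// limitedGatherer wraps any prometheus.Gatherer and, during Gather,
// blanks label values once a label exceeds the configured hard limit.
type limitedGatherer struct {
	prometheus.Gatherer
	limit int // max distinct values allowed per label name, per metric family
}

func (g limitedGatherer) Gather() ([]*dto.MetricFamily, error) {
	mfs, err := g.Gatherer.Gather()
	if err != nil {
		return mfs, err
	}
	for _, mf := range mfs {
		seen := map[string]map[string]struct{}{} // label name -> distinct values
		for _, m := range mf.GetMetric() {
			for _, lp := range m.GetLabel() {
				vals, ok := seen[lp.GetName()]
				if !ok {
					vals = map[string]struct{}{}
					seen[lp.GetName()] = vals
				}
				if _, ok := vals[lp.GetValue()]; !ok && len(vals) >= g.limit {
					blank := ""
					lp.Value = &blank // blank the overflowing value
					continue
				}
				vals[lp.GetValue()] = struct{}{}
			}
		}
	}
	return mfs, nil
}

// Usage: wrap the gatherer that backs the /metrics handler.
var gatherer prometheus.Gatherer = limitedGatherer{Gatherer: prometheus.DefaultGatherer, limit: 1000}
```

Note that this sketch still walks every time series a second time, which is exactly the performance concern mentioned above, and naively blanking values could also produce duplicate series that would have to be merged or dropped before exposition.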
I started working on a POC but haven't finished it yet, and after discussing with @bwplotka we figured it might be better to first start a discussion here and gather some feedback.