No option to delete value related to specific set of tags in ETS table #46

Open
FelonEkonom opened this issue Apr 6, 2022 · 6 comments

Comments

FelonEkonom commented Apr 6, 2022

There is no way to delete an existing entry in the ETS table. For example, if I have a sum metric with some tags, there is no way to remove the value associated with a specific set of tags. As a result, the reports generated during scrapes can only grow, and values that are no longer needed can never be removed from them.

FelonEkonom changed the title from "No option to delete exisiting entry in ETS table" to "No option to delete value related to specific set of tags in ETS table" on Apr 6, 2022

@bryannaegele
Collaborator

That is expected behavior for Prometheus. If you're running into size issues, that is an indication that your tags have too much cardinality.

@FelonEkonom
Author

Let's assume I have a system with many jobs running inside it. Every job has its own lifetime, and I want a tool that helps me aggregate some metrics about these jobs. In this case, the job id would be the tag I group metrics by. In a system like this, you don't want metrics about obsolete, finished jobs to appear in the reports generated during scrapes. That is why I think an option to delete the metrics related to a job when it ends would be a great idea. Also, in this case the cardinality of the tags does not come from bad system design; it grows naturally over the lifetime of the whole system as jobs start and end.

@bryannaegele
Collaborator

Prometheus is simply not the right tool for the requirements you're describing. Prometheus creates a time series for every combination of metric * attributes * attribute values, and those are stored in the Prometheus server for whatever retention period the storage is configured with.
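As a rough illustration of that multiplication, here is a minimal sketch in Python. The tag names and counts are hypothetical and chosen only to show how a per-job tag changes the upper bound on the number of time series for a single metric.

```python
from math import prod


def series_count(tag_cardinalities):
    """Upper bound on time series for one metric: the product of the
    number of distinct values each tag can take."""
    return prod(tag_cardinalities.values())


# Hypothetical tag sets: names and counts are assumptions for illustration.
bounded = {"status": 3, "queue": 5}
with_job_id = {"status": 3, "queue": 5, "job_id": 10_000}

print(series_count(bounded))      # 15
print(series_count(with_job_id))  # 150000
```

With a tag such as a job id, the last factor has no fixed upper limit, so the bound keeps rising for as long as new jobs are created.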

I think for the use case you're describing, you would be better served by tracing, where cardinality in attributes is not a concern and you can get insights on multiple operations by a common attribute and value, in your case a job id.

https://github.com/open-telemetry/opentelemetry-erlang combined with Lightstep, Honeycomb, Zipkin, Grafana, etc. would better fit your requirements. If you want more help or opinions, you can get a lot of help in the #opentelemetry channel in the Elixir Slack.

@hairyhum

It's true that Prometheus stores everything, but it still has a retention time in the server configuration, so by default old time series are removed after 15 days.
This reporter implementation, however, has no such retention time and keeps reporting old time series on every scrape. This means that old time series which Prometheus could already have removed keep getting updated unnecessarily.
Some sort of cleanup on the reporter side would be helpful, whether it's a delete function or a retention time.
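
To make the two cleanup options concrete, here is a minimal, language-neutral sketch in Python. It is not the library's code or API: `SumStore`, its methods, and the `retention_seconds` parameter are all hypothetical names, and a plain dictionary stands in for the ETS table. It only shows the idea of an explicit delete plus expiry of entries that have not been updated within a retention window.

```python
import time


class SumStore:
    """Toy stand-in for a reporter's aggregation table (hypothetical)."""

    def __init__(self, retention_seconds, clock=time.monotonic):
        self.retention_seconds = retention_seconds
        self.clock = clock
        # (metric name, sorted tag pairs) -> (value, time of last update)
        self.entries = {}

    @staticmethod
    def _key(name, tags):
        return (name, tuple(sorted(tags.items())))

    def add(self, name, tags, amount):
        """Add to the sum for this tag set and record the update time."""
        key = self._key(name, tags)
        value, _ = self.entries.get(key, (0, None))
        self.entries[key] = (value + amount, self.clock())

    def delete(self, name, tags):
        """Explicit removal; returns True if an entry was removed."""
        return self.entries.pop(self._key(name, tags), None) is not None

    def scrape(self):
        """Drop entries older than the retention window, report the rest."""
        now = self.clock()
        self.entries = {
            key: (value, updated)
            for key, (value, updated) in self.entries.items()
            if now - updated <= self.retention_seconds
        }
        return {key: value for key, (value, _) in self.entries.items()}


# Usage with a fake clock so the example does not need to sleep.
now = [0.0]
store = SumStore(retention_seconds=60, clock=lambda: now[0])
store.add("job.bytes", {"job_id": "a"}, 10)
now[0] = 50.0
store.add("job.bytes", {"job_id": "b"}, 5)
now[0] = 100.0
print(len(store.scrape()))  # 1: job "a" expired, job "b" is still reported
store.delete("job.bytes", {"job_id": "b"})
print(len(store.scrape()))  # 0
```

Expiring on scrape keeps the write path unchanged; a periodic sweep would be an equally reasonable choice under the same assumptions.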

Rados13 commented Feb 5, 2024

Hi @bryannaegele, what do you think about the suggestion from @hairyhum?

@bryannaegele
Collaborator

I'm fine with that if someone wants to submit a PR for an expiration setting, but I am not personally adding features to this library at this time.
