I have a use case where my node process (using prom-client) is listening to events broadcast by a home IoT device. When the node process receives a broadcast, it updates a bunch of Gauges with the latest values.
I recently ran into a problem: the IoT device was in a failure state and had stopped broadcasting updates. Because my node process kept reporting each gauge's value from the last update (hours old), I didn't realize that the ingestion pipeline had failed. In this situation it would have been better for the metric to stop being reported. My fix is here and essentially consists of hacking the concept of a TTL on top of the prom-client Gauge objects.
I know that it doesn't make sense to have a TTL on all metrics, but I do think there are cases where it would make sense to support a TTL on gauges. Every update to a specific metric / label combination would refresh its TTL. If no update arrives before the TTL expires, that metric would be removed automatically. If it receives an update after being removed, it would be reinstated (until the TTL removes it again). This could be an opt-in feature: the default would be an unset or forever TTL, which would maintain backwards compatibility.
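To make the proposed semantics concrete, here is a minimal sketch of what I have in mind. `TtlGauge`, `keyFor`, and `sweep` are hypothetical names of my own, not part of prom-client. The wrapper only assumes the wrapped object exposes `set(labels, value)` and `remove(labels)`, which a prom-client `Gauge` does. It also assumes a labeled gauge and leaves expiry to an explicit `sweep()` call.

```javascript
// Hypothetical helper, not part of prom-client. It wraps any object that
// exposes set(labels, value) and remove(labels), e.g. a prom-client Gauge.
class TtlGauge {
  constructor(gauge, ttlMs) {
    this.gauge = gauge;
    this.ttlMs = ttlMs;
    // Maps a label-set key to { labels, at }, where `at` is the last update time in ms.
    this.lastSeen = new Map();
  }

  // Sort label names so { a, b } and { b, a } produce the same key.
  static keyFor(labels) {
    return JSON.stringify(
      Object.keys(labels).sort().map((name) => [name, labels[name]])
    );
  }

  // Forward the value to the gauge and refresh the TTL for this label set.
  // Setting a label set that was swept earlier simply reinstates it.
  set(labels, value, now = Date.now()) {
    this.gauge.set(labels, value);
    this.lastSeen.set(TtlGauge.keyFor(labels), { labels, at: now });
  }

  // Remove every label set that has not been updated within ttlMs.
  sweep(now = Date.now()) {
    for (const [key, entry] of this.lastSeen) {
      if (now - entry.at >= this.ttlMs) {
        this.gauge.remove(entry.labels);
        this.lastSeen.delete(key);
      }
    }
  }
}
```

In a real process this would wrap something like `new client.Gauge({ name: 'device_temperature', help: '...', labelNames: ['device'] })` (the metric name is made up), with `setInterval(() => ttlGauge.sweep(), 10000).unref()` driving expiry. Built-in support could do the same bookkeeping inside the library, so callers would only pass a TTL option.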
There is some prior art for a feature like this in other libraries. Rust's metrics library (see example usage here) can set a TTL for metrics globally, at the registry level.
This request seems related to #492, but it is a different request.