In a real-life deployment, scraping hardware-exporter may take more than 10s, e.g.
time curl localhost:10000/metrics
...
real 0m25.534s
user 0m0.008s
sys 0m0.000s
By default, grafana-agent's scrape_timeout is 10s. If a scrape takes longer than that, it fails silently: no alert fires for the missing metrics, and any alerts that were already firing get resolved, because grafana-agent pushes nothing and Prometheus therefore has no metrics to evaluate.
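A quick way to check whether a given exporter is affected is to time one scrape and compare it against the timeout. The snippet below is only a sketch: the helper names `scrape_seconds` and `exceeds_timeout` are made up for illustration, and the URL and the 10s threshold are taken from the example above.

```shell
#!/bin/sh
# Hypothetical helper: print how long one scrape of the given URL takes,
# in seconds, using curl's built-in time_total write-out variable.
scrape_seconds() {
    curl -s -o /dev/null -w '%{time_total}' "$1"
}

# Hypothetical helper: exit 0 when the duration ($1, seconds) is greater
# than the timeout ($2, seconds), exit 1 otherwise.
exceeds_timeout() {
    awk -v d="$1" -v t="$2" 'BEGIN { exit !(d > t) }'
}

# Example usage against the exporter from this report (left commented out
# so that sourcing this file does not trigger a slow request):
#   d=$(scrape_seconds http://localhost:10000/metrics)
#   if exceeds_timeout "$d" 10; then
#       echo "scrape took ${d}s, longer than the 10s scrape_timeout"
#   fi
```

With the timing shown earlier (about 25.5s), `exceeds_timeout 25.534 10` succeeds, which is the situation in which the scrape is dropped.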
As a reliability engineer, I didn't know this was happening until I ran queries in Prometheus to check whether the metrics were there.
The workaround is to raise global_scrape_timeout in the grafana-agent juju config to something like 60s, so that the scrape jobs have enough time to complete.
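For reference, the workaround could look like the following. This assumes the application is deployed under the name `grafana-agent`; adjust the name if your model uses a different one.

```shell
# Raise the scrape timeout (application name "grafana-agent" is an assumption).
juju config grafana-agent global_scrape_timeout=60s

# Read the option back to confirm the new value.
juju config grafana-agent global_scrape_timeout
```

60s is just the value suggested above; any value comfortably above the observed scrape duration should serve the same purpose.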