Describe the bug
Following the discussion around very slow target registration in #1834, PR #3941 was crafted by @zac-nixon, adding a metric about the latency (`podReadinessFlipSeconds`) of the readiness gate. This cool new feature was merged by @wweiwei-li and @shraddhabang and then released with https://github.com/kubernetes-sigs/aws-load-balancer-controller/releases/tag/v2.10.1.

Unfortunately, the buckets used for the histogram are unsuitable for the latency observed (and realistic) with AWS NLB target registration. As it stands, the now improved registration time is about 60 to 70 seconds, while the default buckets are `{.005, .01, .025, .05, .1, .25, .5, 1, 2.5, 5, 10}` (see https://pkg.go.dev/github.com/prometheus/client_golang/prometheus#pkg-variables). This causes all readiness flips to end up in the catch-all bucket, e.g. `awslbc_readiness_gate_ready_seconds_bucket{le="+Inf"}`.
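For illustration only, here is a minimal standalone sketch of the problem (the metric name, help text, and registration details are assumptions, not the controller's actual code): with `prometheus.DefBuckets`, a ~65 s observation leaves every finite bucket at 0 and only shows up in the implicit `+Inf` bucket.

```go
package main

import (
	"fmt"

	"github.com/prometheus/client_golang/prometheus"
	dto "github.com/prometheus/client_model/go"
)

func main() {
	// Histogram using client_golang's default buckets (prometheus.DefBuckets):
	// {.005, .01, .025, .05, .1, .25, .5, 1, 2.5, 5, 10} seconds.
	// Metric name and help text are illustrative only.
	h := prometheus.NewHistogram(prometheus.HistogramOpts{
		Name:    "readiness_gate_ready_seconds",
		Help:    "Time until the readiness gate flips to ready.",
		Buckets: prometheus.DefBuckets,
	})

	// A realistic NLB target registration latency of ~65 seconds.
	h.Observe(65)

	// Every finite bucket stays at 0; only the implicit +Inf bucket
	// (equal to the sample count) reflects the observation.
	m := &dto.Metric{}
	_ = h.Write(m)
	for _, b := range m.GetHistogram().GetBucket() {
		fmt.Printf("le=%g count=%d\n", b.GetUpperBound(), b.GetCumulativeCount())
	}
	fmt.Printf("le=+Inf count=%d\n", m.GetHistogram().GetSampleCount())
}
```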
Linear buckets (https://pkg.go.dev/github.com/prometheus/client_golang/prometheus#LinearBuckets) covering a range of, say, 30s to 5m, which is the latency that can be expected from the API and the processes behind the health check and readiness gate mechanism, would likely make more sense.
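A rough sketch of what that could look like with client_golang; the exact start/width/count values below are only an assumption for illustration, not a concrete proposal for the controller:

```go
import "github.com/prometheus/client_golang/prometheus"

// LinearBuckets(start, width, count): 19 buckets of 15s width, covering 30s to 5m.
// The concrete values are illustrative; the upstream change may pick different ones.
var readinessFlipBuckets = prometheus.LinearBuckets(30, 15, 19) // 30, 45, ..., 300 seconds

var podReadinessFlipSeconds = prometheus.NewHistogram(prometheus.HistogramOpts{
	Name:    "awslbc_readiness_gate_ready_seconds",
	Help:    "Time from pod creation until the readiness gate flips to ready.", // assumed help text
	Buckets: readinessFlipBuckets,
})
```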
Steps to reproduce
Expected outcome
Readiness flip latencies land in histogram buckets that cover realistic AWS NLB target registration times (tens of seconds up to a few minutes), rather than all observations ending up in the +Inf catch-all bucket.
Environment
Additional Context: