native histogram spec: document extrapolation below zero heuristics #2682

Open
wants to merge 1 commit into
base: main
23 changes: 18 additions & 5 deletions docs/specs/native_histograms.md
@@ -1877,11 +1877,24 @@ In this case, the 1st sample is not included in the calculation, so an
incompatible bucket layout between the 1st sample and the other samples is
simply ignored silently.

-TODO: Preventing [extrapolation below
-zero](https://github.com/prometheus/prometheus/blob/034d2b24bcae90fce3ac337b4ddd399bd2ff4bc4/promql/functions.go#L153-L159)
-is currently not yet implemented (and might actually not make sense) for native
-histograms. This may lead to slightly different results when comparing classic
-histograms with equivalent NHCBs.
+To prevent extrapolation below zero, the same heuristic is applied as for
+[float counters](https://github.com/prometheus/prometheus/blob/034d2b24bcae90fce3ac337b4ddd399bd2ff4bc4/promql/functions.go#L153-L159),
+but solely based on the count of observations. Therefore, individual buckets
+might still be extrapolated below zero in some cases. An alternative could have
+been to find the smallest extrapolation where neither the count nor any bucket
+would be extrapolated below zero. However, this does not necessarily lead to a
+better heuristic while inflicting a significant cost in complexity. In the
+common and important case where the first sample in the range is a synthetic
+zero sample derived from the created-at timestamp, the limited extrapolation
+is actually exact, because the count and all buckets are
+zero at precisely the timestamp of the synthetic sample, which is also the
+point in time to which the extrapolation is limited. Note that classic
+histograms apply the heuristic independently to each bucket and the count and
+the sum (as they are all separate series). This is known to lead to
+inconsistencies. NHCBs do not reproduce this problem and work in the same way
+as other native histograms, which means that the result of `rate()` and
+`increase()` may be slightly different when comparing classic histograms and
+equivalent NHCBs.
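
Illustration (not part of the diff): the count-based limit described above can be sketched in Go roughly as follows. This is a simplified sketch, not the Prometheus implementation. The function names `limitDurationToStart` and `extrapolateBack` and all numbers are hypothetical, and the sketch assumes a linear slope between the first and last sample in the range.

```go
package main

import "fmt"

// limitDurationToStart (hypothetical helper) caps how far back the
// extrapolation may reach, based only on the observation count.
// firstCount is the count of the first sample in the range,
// countIncrease the increase of the count over sampledInterval seconds,
// and durationToStart the extrapolation distance that would otherwise apply.
func limitDurationToStart(firstCount, countIncrease, sampledInterval, durationToStart float64) float64 {
	if countIncrease <= 0 || firstCount < 0 {
		// No positive slope: a zero point cannot be derived.
		return durationToStart
	}
	// Time before the first sample at which the count would reach zero.
	durationToZero := sampledInterval * (firstCount / countIncrease)
	if durationToZero < durationToStart {
		return durationToZero
	}
	return durationToStart
}

// extrapolateBack (hypothetical helper) returns the linearly extrapolated
// value `duration` seconds before the first sample.
func extrapolateBack(first, increase, sampledInterval, duration float64) float64 {
	return first - increase*duration/sampledInterval
}

func main() {
	// Count: 8 at the first sample, increasing by 16 over 60s.
	limit := limitDurationToStart(8, 16, 60, 45)
	fmt.Println(limit) // 30

	// A bucket with 1 at the first sample, increasing by 12 over 60s,
	// still ends up below zero at the count-based limit.
	fmt.Println(extrapolateBack(1, 12, 60, limit)) // -5

	// Synthetic zero sample: count and buckets are all zero,
	// so the limit is the sample's own timestamp.
	fmt.Println(limitDurationToStart(0, 16, 60, 45)) // 0
}
```

The second call shows the trade-off the text accepts: the limit is derived from the count alone, so a bucket that grows faster relative to its starting value than the count does can still be extrapolated below zero.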

`avg_over_time()` and `sum_over_time()` work with native histograms in a way
that corresponds to the respective aggregation operators. In particular, if a