Description
Part of #1912. Prerequisite for OM2 suffix-related PRs.
TL;DR
Today, metric names are modified at both creation time
(reserved suffix rejection) and scrape time (suffix
appending/stripping). This design moves all suffix handling
to scrape time. Going forward, the SDK does not modify
metric names — only the OM1 legacy format writer appends
suffixes.
Reference: Metric Names: A Guide for Client Library
Maintainers
Dev summit consensus (2024-09-13):
> Suffixes like `_total` or `_info` are only ever enforced
> in OM v1 and some existing SDK implementations. They MUST
> NOT be enforced in any other situation.
>
> We intend to get rid of automatically enforced suffixes
> with OM 2.0.
Principles
- Don't break existing apps. Existing names exposed
  via OM1 (pull) and OTel (push) must not change without
  explicit opt-in.
- The SDK does not modify metric names. The OM2 spec's
  SHOULDs about unit suffixes and `_total` are guidance
  for users naming their metrics, not for the SDK to
  enforce. OM2 and OTel (with `otel_preserve_names`) output
  the name exactly as provided.
- OM1 is legacy. It continues to append `_total`, unit
  suffixes, etc. at scrape time. No changes to OM1 output.
- OTel legacy behavior behind a flag. The existing OTel
  name stripping (`_total`, unit suffix) is kept as default
  for backward compat, flipped in v2 (#1943, "OTel exporter:
  preserve_names=true as default").
- Fail early on collisions. Registry detects name
  collisions across all formats at registration time.
Key table
| User provides | OM1 | OM2 | OTel | OTel + `otel_preserve_names` |
|---|---|---|---|---|
| `Counter("events")` | `events_total` | `events` | `events` | `events` |
| `Counter("events_total")` | `events_total` | `events_total` | `events` | `events_total` |
| `Counter("req_bytes").unit(BYTES)` | `req_bytes_total` | `req_bytes` | name `req`, unit `By` | name `req_bytes`, unit `By` |
| `Counter("req").unit(BYTES)` | `req_bytes_total` | `req` | name `req`, unit `By` | name `req`, unit `By` |
| `Gauge("connections")` | `connections` | `connections` | `connections` | `connections` |
| `Gauge("events_total")` | `events_total` | `events_total` | `events_total` | `events_total` |
| `Histogram("dur_seconds").unit(SECONDS)` | `dur_seconds_*` | `dur_seconds` | name `dur`, unit `s` | name `dur_seconds`, unit `s` |
| `Histogram("dur").unit(SECONDS)` | `dur_seconds_*` | `dur` | name `dur`, unit `s` | name `dur`, unit `s` |
- Rows 1+2 cannot coexist (OM1 collision detected at
  registration time)
- OM1 (legacy): appends `_total`, unit suffix; no changes
- OM2: name as user wrote it. No flag needed, because the SDK
  does not modify names
- OTel: legacy path strips `_total` + unit suffix;
  `otel_preserve_names` stops stripping
- OTel preserve = OM2 = name as provided
Collision detection
Registry computes names a metric would produce across all
formats. Rejects if any collide with existing metrics.
- `Gauge("foo_total")` + `Histogram("foo")` -> no
  collision. Fixes #1321 (issue when using `@Timed` annotation
  with prometheus-metrics-instrumentation-dropwizard).
- `Gauge("foo_total")` + `Counter("foo")` -> collision
  in OM1. Correctly rejected.
Strictly better than today's blanket rejection — only
genuine collisions are caught.
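
The check described above could look roughly like the following. This is only a sketch under stated assumptions: `CollisionSketch`, its `Type` and `Format` enums, and `exposedNames` are hypothetical names, not the actual `PrometheusRegistry` API, and units and Info metrics are left out for brevity. It assumes a collision only matters when two metrics produce the same name within the same format.

```java
import java.util.ArrayList;
import java.util.EnumMap;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch; not the real PrometheusRegistry.
final class CollisionSketch {
    enum Type { COUNTER, GAUGE, HISTOGRAM }
    enum Format { OM1, OM2, OTEL_LEGACY, OTEL_PRESERVE }

    // format -> (exposed name -> user-provided name of the metric that claimed it)
    private final Map<Format, Map<String, String>> claimed = new EnumMap<>(Format.class);

    // Names a metric would produce in one format (units ignored in this sketch).
    static List<String> exposedNames(String name, Type type, Format format) {
        List<String> names = new ArrayList<>();
        String stripped = name.endsWith("_total")
                ? name.substring(0, name.length() - "_total".length())
                : name;
        switch (format) {
            case OM1:
                if (type == Type.COUNTER) {
                    names.add(stripped + "_total"); // smart-append
                } else if (type == Type.HISTOGRAM) {
                    names.add(name + "_bucket");
                    names.add(name + "_count");
                    names.add(name + "_sum");
                } else {
                    names.add(name);
                }
                break;
            case OTEL_LEGACY:
                names.add(type == Type.COUNTER ? stripped : name);
                break;
            default: // OM2 and OTEL_PRESERVE: name as provided
                names.add(name);
        }
        return names;
    }

    // Rejects the metric if any of its names is already taken in any format.
    void register(String name, Type type) {
        for (Format f : Format.values()) {
            Map<String, String> taken = claimed.computeIfAbsent(f, k -> new HashMap<>());
            for (String exposed : exposedNames(name, type, f)) {
                String owner = taken.get(exposed);
                if (owner != null) {
                    throw new IllegalArgumentException("'" + name + "' collides with '"
                            + owner + "' on '" + exposed + "' in " + f);
                }
            }
        }
        // Only claim names once every format has passed the check.
        for (Format f : Format.values()) {
            for (String exposed : exposedNames(name, type, f)) {
                claimed.get(f).put(exposed, name);
            }
        }
    }
}
```

With this sketch, registering a gauge `foo_total` and then a histogram `foo` succeeds, while registering a gauge `foo_total` and then a counter `foo` throws because both would expose `foo_total` in OM1.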
Responsibility matrix
| Component | Today | After |
|---|---|---|
| `Counter.Builder` | strips `_total` | no-op |
| `Info.Builder` | strips `_info` | no-op |
| `PrometheusNaming` | rejects/strips reserved suffixes | no reserved suffixes |
| `PrometheusRegistry` | rejects duplicate names | + cross-format collisions |
| OM1 writer | always appends `_total` | smart-appends (skip if present) |
| OM2 writer | always appends `_total` | no-op (name as provided) |
| OTel exporter | strips unit only | legacy: + `_total` strip; preserve: no-op |
| Protobuf writer | always appends `_total` | smart-appends (skip if present) |
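
To illustrate "smart-appends (skip if present)" for the OM1 and protobuf writers, here is a minimal sketch that reproduces the OM1 column of the key table. `Om1NameSketch`, `om1Name`, and the plain-string `unitSuffix` parameter are hypothetical and not the library's API. How a name that already ends in `_total` combines with a unit is not specified in this issue; the sketch assumes the unit is placed before `_total`.

```java
// Hypothetical sketch; not the real OM1 writer.
final class Om1NameSketch {
    enum Type { COUNTER, GAUGE, HISTOGRAM }

    /**
     * Returns the OM1 base name for a metric. unitSuffix is e.g. "bytes" or
     * "seconds", or null when no unit was set. Histogram sample suffixes
     * (_bucket, _count, _sum) are left to the writer.
     */
    static String om1Name(String name, Type type, String unitSuffix) {
        final String total = "_total";
        boolean counter = type == Type.COUNTER;
        String base = name;
        // Assumption: temporarily drop a user-provided _total so the unit lands before it.
        if (counter && base.endsWith(total)) {
            base = base.substring(0, base.length() - total.length());
        }
        // Smart-append the unit: skip if the name already ends with it.
        if (unitSuffix != null && !unitSuffix.isEmpty() && !base.endsWith("_" + unitSuffix)) {
            base = base + "_" + unitSuffix;
        }
        // Counters always end in _total in OM1; other types are left alone.
        return counter ? base + total : base;
    }
}
```

Under these assumptions, `om1Name("req", Type.COUNTER, "bytes")` and `om1Name("req_bytes", Type.COUNTER, "bytes")` both yield `req_bytes_total`, matching rows 3 and 4 of the key table.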
otel_preserve_names flag
Controls OTel export behavior:
- `false` (default today): legacy behavior. Strip
  `_total` from Counters, strip unit suffix from name, set
  OTel unit metadata.
- `true`: name exactly as the user wrote it. Unit
  metadata set from `.unit()` if provided.
Default becomes true in next major release (#1943).
No equivalent flag for OM2 — the OM2 writer always outputs
the name as provided. The OM2 spec's SHOULDs about unit
suffixes and _total are naming guidance for users, not
behavior the SDK enforces.
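
The two settings of the flag can be sketched as below. Everything here is hypothetical and for illustration only: `OtelNameSketch`, its `Unit` enum (standing in for the library's unit type), and the `Exported` holder are not the exporter's real API. The order of stripping in the legacy path (`_total` first, then the unit suffix) is an assumption.

```java
// Hypothetical sketch; not the real OTel exporter.
final class OtelNameSketch {
    // Stand-in for the library's unit type: Prometheus name suffix + OTel unit code.
    enum Unit {
        BYTES("bytes", "By"),
        SECONDS("seconds", "s");

        final String suffix;
        final String otelCode;

        Unit(String suffix, String otelCode) {
            this.suffix = suffix;
            this.otelCode = otelCode;
        }
    }

    static final class Exported {
        final String name;
        final String unit;

        Exported(String name, String unit) {
            this.name = name;
            this.unit = unit;
        }
    }

    static Exported export(String name, boolean isCounter, Unit unit, boolean preserveNames) {
        // Unit metadata is set from the unit in both modes.
        String otelUnit = unit == null ? "" : unit.otelCode;
        if (preserveNames) {
            return new Exported(name, otelUnit); // name exactly as provided
        }
        String result = name;
        // Legacy: strip _total from Counters only.
        if (isCounter && result.endsWith("_total")) {
            result = result.substring(0, result.length() - "_total".length());
        }
        // Legacy: strip the unit suffix from the name.
        if (unit != null && result.endsWith("_" + unit.suffix)) {
            result = result.substring(0, result.length() - unit.suffix.length() - 1);
        }
        return new Exported(result, otelUnit);
    }
}
```

For example, `export("req_bytes", true, Unit.BYTES, false)` gives name `req` with unit `By`, while the same call with `preserveNames = true` keeps `req_bytes`, as in the key table.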
disableSuffixAppending not needed
Suffix behavior is per-format, not a standalone flag. OM2
config (#1939) should focus on contentNegotiation,
compositeValues, exemplarCompliance,
nativeHistograms. Suffix handling follows the format
automatically.
OM2 spec references
From the OM2 spec:
- Unit: "If non-empty, it SHOULD be a suffix of the
  MetricFamily name separated by an underscore."
- Counter: "The MetricFamily name for Counters SHOULD end
  in `_total`."
- Info: "The MetricFamily name for Info metrics MUST end
  in `_info`."
These SHOULDs are guidance for users naming their metrics.
The SDK's job is to pass names through, not enforce naming
conventions. The one MUST (`_info` for Info metrics) is
enforced by the OM2 writer.
v2 defaults
In the next major release:
- `otel_preserve_names`: default `true`. OTel users get
  names exactly as they wrote them. Users who depend on
  `_total`/unit stripping set `false` explicitly.
v2 key table
| User provides | OM1 | OM2 | OTel (v2) |
|---|---|---|---|
| `Counter("events")` | `events_total` | `events` | `events` |
| `Counter("events_total")` | `events_total` | `events_total` | `events_total` |
| `Counter("req").unit(BYTES)` | `req_bytes_total` | `req` | name `req`, unit `By` |
| `Histogram("dur").unit(SECONDS)` | `dur_seconds_*` | `dur` | name `dur`, unit `s` |
OM2 and OTel v2 both output name as provided. The only
difference is OTel sets unit metadata from .unit().
Child issues
- #1941 (Move suffix handling from creation time to scrape time):
  `_total`/`_info` suffix validation + OTel
  `otel_preserve_names` flag (current release)
- #1943 (OTel exporter: preserve_names=true as default): OTel
  `otel_preserve_names` default flip (next
  major release)
Implementation order
- Collision detection in `PrometheusRegistry`
- Smart-append in OM1/protobuf writers
- Remove reserved suffixes + store original name
- OTel: `_total` stripping (legacy) + preserve flag
- OM2 writer: no-op (name as provided)
- Move unit suffix appending to scrape time (OM1 only)