fix: escape colons in metric names when escaping=underscores #1180

Open
johnsaurabh wants to merge 3 commits into prometheus:master from johnsaurabh:fix/openmetrics-colon-escaping-underscores
@johnsaurabh

Fixes #1177

When a metric name contains a colon (e.g. sglang:token_usage) and the client negotiates escaping=underscores, the # HELP and # TYPE lines were emitting the raw name while the sample line replaced the colon with an underscore:

# HELP sglang:token_usage Total token usage.
# TYPE sglang:token_usage gauge
sglang_token_usage 42.0

This violates the OpenMetrics spec and causes strict parsers to treat the metadata and the sample as two separate metrics.
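
To illustrate why the mismatch matters, here is a minimal sketch of how a consumer that keys metrics by name would see the output above. The helper family_names is hypothetical and is not part of prometheus_client or any real parser; it only handles the simple line shapes shown in this PR.

```python
def family_names(exposition):
    """Collect every distinct metric name seen in metadata and sample lines."""
    names = set()
    for line in exposition.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith("# HELP ") or line.startswith("# TYPE "):
            # Metadata lines look like "# HELP <name> <text>" or "# TYPE <name> <type>".
            names.add(line.split(" ", 3)[2])
        elif not line.startswith("#"):
            # Sample lines look like "<name>[{labels}] <value>".
            names.add(line.split("{", 1)[0].split(" ", 1)[0])
    return names


buggy = """# HELP sglang:token_usage Total token usage.
# TYPE sglang:token_usage gauge
sglang_token_usage 42.0
"""
# What the output looks like once metadata and sample agree on the escaped name.
fixed = buggy.replace("sglang:token_usage", "sglang_token_usage")

print(sorted(family_names(buggy)))  # two distinct names: metadata and sample disagree
print(sorted(family_names(fixed)))  # one name shared by all three lines
```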

The bug is in escape_metric_name in prometheus_client/openmetrics/exposition.py. In UNDERSCORES mode, the function short-circuits for any name that matches the legacy Prometheus metric name regex. That regex allows colons, so names like sglang:token_usage were returned unchanged. The sample line uses _is_legacy_labelname_rune, which does not allow colons, so it replaced them with underscores as expected.

The fix adds a colon check to the short-circuit condition and switches the fallback _escape call to use _is_legacy_labelname_rune, matching what the sample line already does.
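
The change can be sketched roughly as follows. This is a simplified, self-contained approximation and not the actual prometheus_client code: the regex, the two rune predicates, and the function names escape_underscores_old and escape_underscores_new are assumptions made for illustration, based only on the description above.

```python
import re

# Assumed shape of the legacy Prometheus metric name pattern (colons allowed).
LEGACY_METRIC_NAME = re.compile(r"[a-zA-Z_:][a-zA-Z0-9_:]*")

LETTERS = "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ"
DIGITS = "0123456789"


def is_metric_rune(ch, index):
    # Hypothetical stand-in: metric-name characters, colon included.
    return ch in LETTERS or ch in "_:" or (ch in DIGITS and index > 0)


def is_labelname_rune(ch, index):
    # Hypothetical stand-in for _is_legacy_labelname_rune: no colon allowed.
    return ch in LETTERS or ch == "_" or (ch in DIGITS and index > 0)


def replace_invalid(name, is_valid_rune):
    # Replace every character the predicate rejects with an underscore.
    return "".join(ch if is_valid_rune(ch, i) else "_" for i, ch in enumerate(name))


def escape_underscores_old(name):
    # Old behavior: any legacy-valid name is returned as-is, colons and all.
    if LEGACY_METRIC_NAME.fullmatch(name):
        return name
    return replace_invalid(name, is_metric_rune)


def escape_underscores_new(name):
    # New behavior: take the no-op path only when the name is also colon-free,
    # and fall back to the stricter label-name predicate.
    if LEGACY_METRIC_NAME.fullmatch(name) and ":" not in name:
        return name
    return replace_invalid(name, is_labelname_rune)


for name in ("sglang:token_usage", "no:escaping_required", "http.status:sum"):
    print(name, "->", escape_underscores_old(name), "|", escape_underscores_new(name))
```

With the stricter predicate on both paths, the metadata lines and the sample line derive the same escaped name from the same input.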

Two existing test expectations that documented the old behavior are updated. A new end-to-end test reproduces the exact scenario from the issue.

@csmarchbanks

In UNDERSCORES escaping mode, escape_metric_name was short-circuiting
for names that matched the legacy Prometheus metric name regex, which
allows colons. This caused # HELP and # TYPE lines to emit the raw name
(e.g. sglang:token_usage) while sample lines, which use
_is_legacy_labelname_rune, correctly replaced the colon with an
underscore (e.g. sglang_token_usage). The mismatch violates the
OpenMetrics standard and breaks strict parsers.

Fix by requiring the name to also be colon-free before taking the
no-op path, and switching the fallback _escape call to use
_is_legacy_labelname_rune so colons are replaced consistently.

Fixes prometheus#1177

Signed-off-by: John Saurabh <itsjohnsaurabh@gmail.com>

Add test_gauge_colon_in_name_escaped_underscores to verify that a gauge
named sglang:token_usage produces consistent # HELP, # TYPE, and sample
lines when escaping=underscores is in effect.

Update two parametrized scenario expectations that documented the old
buggy behaviour (colon preserved in UNDERSCORES mode):
- "legacy valid metric name": no:escaping_required -> no_escaping_required
- "metric name with dots and colon": http_status:sum -> http_status_sum

Signed-off-by: John Saurabh <itsjohnsaurabh@gmail.com>

Mirror the same expectation corrections made to
tests/openmetrics/test_exposition.py: under UNDERSCORES mode, colons in
metric names are now replaced with underscores to match sample line
output.

Signed-off-by: John Saurabh <itsjohnsaurabh@gmail.com>