
[FFL-2449] Add server-side flag evaluation metrics documentation#37257

Open
vjfridge wants to merge 12 commits into master from vickie/FFL-2449-server-flag-eval-metrics-docs

Conversation

@vjfridge
Contributor

@vjfridge vjfridge commented Jun 4, 2026

What does this PR do? What is the motivation?

Fixes FFL-2449

Adds public documentation for setting up server-side flag evaluation metrics, which were previously undocumented beyond a one-liner env var mention. The setup requires enabling the Datadog Agent OTLP receiver and pointing the application at it — neither of which existed in any public docs.

Changes

  • New guide page feature_flags/guide/server_flag_evaluation_metrics — step-by-step setup for Agent OTLP receiver, application env vars, metric verification, Historical Metrics retention, and a dashboard query reference. Includes a minimum tracer version table per SDK and marks feature_flag.evaluations as experimental.
  • New concepts page feature_flags/concepts/flag_graphs — describes the graphs on the flags list page and flag details page (targeting rule distribution, server evaluations, client evaluations, errors/latency, export to dashboard) for both client and server SDKs.
  • Updated feature_flags/server/_index.md — existing DD_METRICS_OTEL_ENABLED alert updated to note the metric is experimental and link to the new guide.
  • Updated getting_started/feature_flags/_index.md — Step 5 now references the new metrics guide for server-side apps.
  • Updated feature_flags/guide/_index.md and feature_flags/concepts/_index.md — new pages added to navigation indexes.
  • All 6 server SDK pages (dotnet, go, java, nodejs, python, ruby) — added experimental warning alert next to the DD_METRICS_OTEL_ENABLED env var.

Merge instructions

Merge readiness:

  • Ready for merge

vjfridge and others added 5 commits June 4, 2026 09:47
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
… metrics guide

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…rver SDK pages

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@vjfridge vjfridge requested a review from a team as a code owner June 4, 2026 14:02
@vjfridge
Contributor Author

vjfridge commented Jun 4, 2026

@codex review

@github-actions github-actions Bot added the Guide Content impacting a guide label Jun 4, 2026
@github-actions
Contributor

github-actions Bot commented Jun 4, 2026

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 7b42c572b5

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment thread content/en/feature_flags/concepts/flag_graphs.md Outdated
@@ -0,0 +1,146 @@
---
title: Set Up Server-Side Flag Evaluation Metrics
Contributor Author

@vjfridge vjfridge Jun 4, 2026

@@ -0,0 +1,72 @@
---
title: Feature Flag Graphs
Contributor Author

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@github-actions github-actions Bot added the Images Images are added/removed with this PR label Jun 4, 2026
@@ -8,6 +8,9 @@ further_reading:
- link: "/remote_configuration/"
tag: "Documentation"
Contributor Author

vjfridge and others added 2 commits June 4, 2026 12:55
…ng links

Remove DD_METRICS_OTEL_ENABLED from all server SDK pages and replace with
comments pointing to the setup guide, matching the _index.md pattern. Add
further_reading links to server_flag_evaluation_metrics and flag_graphs on
all SDK pages.

Generated with Claude Code

Co-Authored-By: Claude <noreply@anthropic.com>
Generated with Claude Code

Co-Authored-By: Claude <noreply@anthropic.com>
@@ -8,6 +8,12 @@ further_reading:
- link: "/tracing/trace_collection/dd_libraries/dotnet-core/"
tag: "Documentation"
text: ".NET Tracing"
Contributor Author

@@ -8,6 +8,12 @@ further_reading:
- link: "/tracing/trace_collection/dd_libraries/go/"
tag: "Documentation"
Contributor Author

@@ -8,10 +8,18 @@ further_reading:
- link: "/tracing/trace_collection/automatic_instrumentation/dd_libraries/java/"
tag: "Documentation"
text: "Java APM and Distributed Tracing"
Contributor Author

@@ -11,6 +11,12 @@ further_reading:
- link: "/tracing/"
tag: "Documentation"
text: "Learn about Application Performance Monitoring (APM)"
Contributor Author

@@ -8,6 +8,12 @@ further_reading:
- link: "/tracing/trace_collection/dd_libraries/python/"
tag: "Documentation"
text: "Python Tracing"
Contributor Author

@@ -11,6 +11,12 @@ further_reading:
- link: "/tracing/"
tag: "Documentation"
text: "Learn about Application Performance Monitoring (APM)"
Contributor Author

@OliviaShoup OliviaShoup added the editorial review Waiting on a more in-depth review label Jun 4, 2026
Contributor

@sameerank sameerank left a comment

Thanks for taking this on! I did some cross-referencing and verifying to tighten up the information. Let me know if anything is unclear.

Comment on lines +47 to +53
{{< code-block lang="bash" >}}
# gRPC endpoint (port 4317)
DD_OTLP_CONFIG_RECEIVER_PROTOCOLS_GRPC_ENDPOINT=0.0.0.0:4317

# HTTP endpoint (port 4318)
DD_OTLP_CONFIG_RECEIVER_PROTOCOLS_HTTP_ENDPOINT=0.0.0.0:4318
{{< /code-block >}}
Contributor

A lot of this is already covered in otlp_ingest_in_the_agent.md

2. For the Datadog Agent container, set the following endpoint environment variables and expose the corresponding port:
- For gRPC: Set `DD_OTLP_CONFIG_RECEIVER_PROTOCOLS_GRPC_ENDPOINT` to `0.0.0.0:4317` and expose port `4317`.

So it might be better to just link to that doc, which I assume is the canonical one, instead of duplicating it here.


You only need to enable the protocol your application uses. Both gRPC and HTTP are shown for reference.

<div class="alert alert-info">If you are running Agent v7.61.0 or later in Docker, set <code>HOST_PROC=/proc</code> on the Agent container to work around a known issue with the OTLP pipeline.</div>
Contributor

Also covered here

1. Set the environment variable <code>HOST_PROC</code> to <code>/proc</code> in your Agent Docker container.<br>

OTEL_EXPORTER_OTLP_METRICS_ENDPOINT=http://<AGENT_HOST>:4318/v1/metrics

# Or use gRPC (no path suffix):
# OTEL_EXPORTER_OTLP_METRICS_ENDPOINT=http://<AGENT_HOST>:4317
Contributor

OTEL_EXPORTER_OTLP_METRICS_ENDPOINT is valid, but the canonical docs seem to prefer OTEL_EXPORTER_OTLP_ENDPOINT, which I believe accomplishes the same thing:

Set the `OTEL_EXPORTER_OTLP_ENDPOINT` environment variable in your application's environment:
For gRPC:
```shell
export OTEL_EXPORTER_OTLP_ENDPOINT="http://localhost:4317"
```
For HTTP:
```shell
export OTEL_EXPORTER_OTLP_ENDPOINT="http://localhost:4318"
```

I'm actually not sure which one to use. OTEL_EXPORTER_OTLP_METRICS_ENDPOINT is more granular and only applies to OTLP for metrics, while OTEL_EXPORTER_OTLP_ENDPOINT covers all OTLP
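
For reference, the practical difference between the two variables for OTLP/HTTP can be sketched as below. Per the OpenTelemetry exporter specification, a signal-specific endpoint is used as-is, while the generic endpoint gets the per-signal path (`v1/metrics`) appended. `resolve_metrics_url` is a hypothetical helper written for illustration, and whether each Datadog tracer follows this resolution exactly is an assumption to verify per SDK.

```shell
#!/usr/bin/env bash
# Hypothetical helper: shows how an OTLP/HTTP exporter would resolve the
# metrics URL under the OpenTelemetry exporter specification.
resolve_metrics_url() {
  if [ -n "${OTEL_EXPORTER_OTLP_METRICS_ENDPOINT:-}" ]; then
    # Signal-specific endpoint: used unchanged, so it must already
    # include the /v1/metrics path.
    printf '%s\n' "$OTEL_EXPORTER_OTLP_METRICS_ENDPOINT"
  elif [ -n "${OTEL_EXPORTER_OTLP_ENDPOINT:-}" ]; then
    # Generic endpoint: strip one trailing slash, then append the
    # per-signal path.
    printf '%s/v1/metrics\n' "${OTEL_EXPORTER_OTLP_ENDPOINT%/}"
  else
    # Specification default for OTLP/HTTP.
    printf '%s\n' "http://localhost:4318/v1/metrics"
  fi
}
```

Under this reading, `OTEL_EXPORTER_OTLP_ENDPOINT=http://<AGENT_HOST>:4318` and `OTEL_EXPORTER_OTLP_METRICS_ENDPOINT=http://<AGENT_HOST>:4318/v1/metrics` resolve to the same URL for metrics; the generic variable also affects traces and logs if those are exported over OTLP.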

Contributor

Also the default OTLP protocol is http/protobuf; pointing at :4317 without OTEL_EXPORTER_OTLP_PROTOCOL=grpc sends HTTP to the gRPC port and fails. So we also need to mention the protocol var for the gRPC option
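
A quick way to catch this mismatch before deploying is a small pre-flight check. This is a minimal sketch, assuming the specification default protocol of `http/protobuf` and the conventional OTLP ports (`4317` for gRPC, `4318` for HTTP); `check_otlp_config` is a hypothetical helper, not part of any tracer or the Agent, and an SDK with a different default protocol would need the fallback adjusted.

```shell
#!/usr/bin/env bash
# Hypothetical pre-flight check: flags an endpoint port that does not match
# the configured OTLP protocol.
check_otlp_config() {
  # Fallbacks mirror the OpenTelemetry specification defaults (assumed here).
  local endpoint="${OTEL_EXPORTER_OTLP_ENDPOINT:-http://localhost:4318}"
  local protocol="${OTEL_EXPORTER_OTLP_PROTOCOL:-http/protobuf}"

  case "$endpoint" in
    *:4317|*:4317/*)
      if [ "$protocol" != "grpc" ]; then
        echo "mismatch: port 4317 expects grpc, but protocol is $protocol" >&2
        return 1
      fi
      ;;
    *:4318|*:4318/*)
      if [ "$protocol" = "grpc" ]; then
        echo "mismatch: port 4318 expects HTTP, but protocol is grpc" >&2
        return 1
      fi
      ;;
  esac
  echo "ok: $protocol -> $endpoint"
}
```

With `OTEL_EXPORTER_OTLP_ENDPOINT=http://<AGENT_HOST>:4317` and no protocol variable set, the check returns a non-zero status; adding `OTEL_EXPORTER_OTLP_PROTOCOL=grpc` makes it pass.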

Comment on lines +59 to +75
## Step 2: Configure your application

Set the following environment variables on your application in addition to the standard [server-side feature flag configuration][1]:

{{< code-block lang="bash" >}}
# Enable flag evaluation metrics
DD_METRICS_OTEL_ENABLED=true

# Point OTLP metrics at the Datadog Agent
# HTTP endpoint (note the /v1/metrics path suffix):
OTEL_EXPORTER_OTLP_METRICS_ENDPOINT=http://<AGENT_HOST>:4318/v1/metrics

# Or use gRPC (no path suffix):
# OTEL_EXPORTER_OTLP_METRICS_ENDPOINT=http://<AGENT_HOST>:4317
{{< /code-block >}}

Replace `<AGENT_HOST>` with the hostname or IP address of your Datadog Agent. In a Docker Compose setup, this is typically the Agent container's service name.
Contributor

In most cases, the endpoints don't need to be set. These variables were used in ffe-dogfooding mainly to control if the metrics were being sent to the agent vs. a special container for reporting the counts in http://localhost:8080/dashboard

Suggested change
## Step 2: Configure your application
Set the following environment variables on your application in addition to the standard [server-side feature flag configuration][1]:
{{< code-block lang="bash" >}}
# Enable flag evaluation metrics
DD_METRICS_OTEL_ENABLED=true
# Point OTLP metrics at the Datadog Agent
# HTTP endpoint (note the /v1/metrics path suffix):
OTEL_EXPORTER_OTLP_METRICS_ENDPOINT=http://<AGENT_HOST>:4318/v1/metrics
# Or use gRPC (no path suffix):
# OTEL_EXPORTER_OTLP_METRICS_ENDPOINT=http://<AGENT_HOST>:4317
{{< /code-block >}}
Replace `<AGENT_HOST>` with the hostname or IP address of your Datadog Agent. In a Docker Compose setup, this is typically the Agent container's service name.
## Step 2: Configure your application
Set the following environment variable on your application, in addition to the
standard [server-side feature flag configuration][1]:
{{< code-block lang="bash" >}}
# Enable flag evaluation metrics
DD_METRICS_OTEL_ENABLED=true
{{< /code-block >}}
By default, most tracers send OTLP metrics to the Agent at `DD_AGENT_HOST` on port
`4318`. If your application already sets `DD_AGENT_HOST` to reach the Agent, no
endpoint configuration is required.
Set an OTLP endpoint explicitly in either of these cases:
- The Agent is not reachable at `DD_AGENT_HOST` on the default OTLP port (for example,
a remote Agent or a non-default port).
- You use the **Java** tracer. The Java tracer does not derive the endpoint from
`DD_AGENT_HOST`; it defaults to `localhost:4318`. Set the endpoint whenever the
Agent is not on `localhost`.
To set the endpoint, use the standard OpenTelemetry variable:
{{< code-block lang="bash" >}}
# Point OTLP data at the Datadog Agent (HTTP, port 4318)
OTEL_EXPORTER_OTLP_ENDPOINT=http://<AGENT_HOST>:4318
# Or use gRPC (port 4317). The default protocol is http/protobuf, so you must also
# set the protocol to grpc when using the gRPC port:
# OTEL_EXPORTER_OTLP_ENDPOINT=http://<AGENT_HOST>:4317
# OTEL_EXPORTER_OTLP_PROTOCOL=grpc
{{< /code-block >}}
Replace `<AGENT_HOST>` with the hostname or IP address of your Datadog Agent. In a
Docker Compose setup, this is typically the Agent container's service name. To set the
metrics endpoint independently of other OTLP signals, use
`OTEL_EXPORTER_OTLP_METRICS_ENDPOINT` instead, and append the `/v1/metrics` path for HTTP.

Before setting up flag evaluation metrics:

- Server-side feature flags are already configured. See [Server-Side Feature Flags][1].
- Datadog Agent 7.55 or later is running.
Contributor

The docs say

Since versions 6.32.0 and 7.32.0

OTLP Ingest in the Agent is a way to send telemetry data directly from applications instrumented with [OpenTelemetry SDKs][1] to Datadog Agent. Since versions 6.32.0 and 7.32.0, the Datadog Agent can ingest OTLP traces and [OTLP metrics][2] through gRPC or HTTP. Since versions 6.48.0 and 7.48.0, the Datadog Agent can ingest OTLP logs through gRPC or HTTP.

I don't see this number in ffe-dogfooding either, so this might be over-restrictive unless there's some other source for this

Comment on lines +34 to +41
| Language | Minimum tracer version |
| -------- | ---------------------- |
| .NET | 3.44.0 |
| Go | 2.8.0 |
| Java | 1.62.0 |
| Node.js | 5.99.0 |
| Python | 4.7.0 |
| Ruby | 2.32.0 |
Contributor

We can add PHP soon (DataDog/dd-trace-php#3911).

According to DataDog/system-tests#7033, we expect it in the not-yet-released v1.21.1.

1. Go to [Metrics Explorer][2] and search for `feature_flag.evaluations`.
2. If the metric does not appear within a few minutes of your application evaluating flags, check:
- The Agent OTLP receiver is enabled and the correct port is exposed.
- `OTEL_EXPORTER_OTLP_METRICS_ENDPOINT` points to the Agent, not a separate collector.
Contributor

Might also update this depending on what you update in step 2 re: OTEL_EXPORTER_OTLP_ENDPOINT vs OTEL_EXPORTER_OTLP_METRICS_ENDPOINT

Comment on lines +83 to +85
- DD_OTLP_CONFIG_RECEIVER_PROTOCOLS_GRPC_ENDPOINT=0.0.0.0:4317
- DD_OTLP_CONFIG_RECEIVER_PROTOCOLS_HTTP_ENDPOINT=0.0.0.0:4318
- HOST_PROC=/proc # Required for Agent v7.61.0+ running in Docker
Contributor

You only need one or the other, but this reads as if both are required. We needed both in ffe-dogfooding because Python defaults to gRPC and the rest of the SDKs use HTTP.

2. For the Datadog Agent container, set the following endpoint environment variables and expose the corresponding port:
- For gRPC: Set `DD_OTLP_CONFIG_RECEIVER_PROTOCOLS_GRPC_ENDPOINT` to `0.0.0.0:4317` and expose port `4317`.
- For HTTP: Set `DD_OTLP_CONFIG_RECEIVER_PROTOCOLS_HTTP_ENDPOINT` to `0.0.0.0:4318` and expose port `4318`.

"# Required for Agent v7.61.0+ running in Docker" is a bit of an overstatement because it's one of a few workarounds for a known issue. I'd rephrase to "If running Agent v7.61.0+ in Docker"

Contributor

@sameerank sameerank Jun 5, 2026

And actually, that distinction between Python (gRPC) and the rest (HTTP) is probably worth mentioning somewhere.

Maybe an additional bullet in the Step 2 rewrite?

  - The **Python** tracer defaults to the gRPC protocol (Agent OTLP port `4317`), whereas the other tracers default to HTTP (port `4318`). Make sure the Agent receiver port you enabled in Step 1 matches, or set `OTEL_EXPORTER_OTLP_PROTOCOL` and the endpoint explicitly.
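
To make the port matching concrete, here is a minimal sketch that emits only the Agent receiver variable for the protocol a given tracer uses. `agent_otlp_env` and the `agent.env` file name are hypothetical; the variable names and ports are the ones from the OTLP ingest docs quoted in this thread.

```shell
#!/usr/bin/env bash
# Hypothetical helper: prints the single Agent receiver setting needed for
# the chosen OTLP protocol, so only one receiver is enabled.
agent_otlp_env() {
  case "$1" in
    grpc)
      echo "DD_OTLP_CONFIG_RECEIVER_PROTOCOLS_GRPC_ENDPOINT=0.0.0.0:4317"
      ;;
    http)
      echo "DD_OTLP_CONFIG_RECEIVER_PROTOCOLS_HTTP_ENDPOINT=0.0.0.0:4318"
      ;;
    *)
      echo "usage: agent_otlp_env grpc|http" >&2
      return 2
      ;;
  esac
}

# Example (not run here): write the setting to an env file and pass it to the
# Agent container, exposing only the matching port.
#   agent_otlp_env http > agent.env
#   docker run --env-file agent.env -p 4318:4318 <agent image and other options>
```

A deployment that mixes a gRPC-default tracer with HTTP-default tracers would call the helper once per protocol and expose both ports, which matches the ffe-dogfooding setup described above.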

Comment on lines +52 to +53


Contributor

Assuming 2 blank lines was unintentional

Suggested change

| `feature_flag.key` | The flag key being evaluated |
| `feature_flag.result.variant` | The variant returned by the evaluation |
| `feature_flag.result.reason` | The reason for the evaluation result |
| `feature_flag.result.allocation_key` | The targeting rule id |
Contributor

All six SDKs also attach `error.type` on error evaluations. This is an OTel standard key (see https://opentelemetry.io/docs/specs/semconv/registry/attributes/error/). It's also worth noting that `allocation_key` is emitted only when present, so it's conditional.

Also, technically, "targeting rule id" is not the right description for an `allocation_key` because the relationship is one-to-many: an allocation can contain multiple targeting rules. I don't think we officially have a definition anywhere, and I don't think this table is the right place to get into it in any meaningful depth, so maybe we can go with "The identifier for the evaluated allocation"?


Labels

editorial review Waiting on a more in-depth review Guide Content impacting a guide Images Images are added/removed with this PR


3 participants