Skip to content

Conversation

@constantinius
Copy link
Contributor

@constantinius constantinius requested a review from a team as a code owner January 7, 2026 11:19
@linear
Copy link

linear bot commented Jan 7, 2026

Copy link
Member

@Dav1dde Dav1dde left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should definitely add tests for this, unit and integration tests.

…rocessing

- Refactor event duration calculation to use a unified approach.
- Update span processing to include AI measurements mapping.
- Enhance test cases for AI span detection with new data structures.
@constantinius constantinius requested a review from Dav1dde January 8, 2026 09:28
Copy link
Member

@Dav1dde Dav1dde left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you should still add at least 1 integration test to actually tests this end to end, with AI data in the transaction trace context as well as child spans.

Also enrich_ai_event_data doesn't have a single unittest, but that is the public interface which is actually called and where the bug is/was hidden.

…snapshot assertions

- Refactor span data initialization to use JSON for better clarity.
- Update assertions to utilize snapshot testing for AI operation types.
- Simplify span data handling in tests for improved readability.
…urements

- Update `enrich_ai_span_data` and `enrich_ai_event_data` functions to accept and utilize measurements.
- Remove legacy measurement mapping from span data processing for clarity.
- Ensure consistent handling of AI measurements across span and event data enrichment.
…enrichment

- Introduce new test cases for `enrich_ai_event_data` to validate behavior with various trace contexts and nested spans.
- Ensure accurate handling of AI operation types and measurements in the enriched event data.
- Utilize JSON input and snapshot assertions for improved test clarity and reliability.
@constantinius constantinius requested a review from Dav1dde January 8, 2026 13:12
@Dav1dde
Copy link
Member

Dav1dde commented Jan 8, 2026

As discussed on Slack, we'll also want to modify the existing AI integration test in test_ai.py to also include a trace context.

…vent-normalization/gen_ai-normalize-trace-context
- Added support for the `invoke_agent` operation type in the AI span tests.
- Included additional attributes for `gen_ai.invoke_agent`, such as operation name and token usage metrics.
- Improved clarity and consistency in transaction operations across multiple test cases.
.or(span.op.value())
.or(span_op.value())
.and_then(|op| operation_type_map.get_operation_type(op));

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Bug: The enrich_ai_span_data function is incorrectly passed event.measurements for the trace context, which can misattribute transaction-level metrics as AI data on the root span.
Severity: CRITICAL

🔍 Detailed Analysis

The function enrich_ai_span_data is called with &event.measurements to enrich the trace_context. This is incorrect because the TraceContext struct does not have a measurements field, and event.measurements contains transaction-level data like web vitals, not span-specific AI metrics. If a transaction-level measurement's name happens to match a legacy AI measurement name (e.g., ai_prompt_tokens_used), it will be incorrectly mapped to the root span's AI data fields. This can lead to corrupted AI token counts and incorrect cost calculations.

💡 Suggested Fix

The call to enrich_ai_span_data for the trace_context should not use event.measurements. Since TraceContext does not have its own measurements, an empty Measurements map should be passed to prevent incorrect data mapping from the event level. For example, pass &Default::default().

🤖 Prompt for AI Agent
Review the code at the location below. A potential bug has been identified by an AI
agent.
Verify if this is a real issue. If it is, propose a fix; if not, explain why it's not
valid.

Location: relay-event-normalization/src/normalize/span/ai.rs#L252

Potential issue: The function `enrich_ai_span_data` is called with `&event.measurements`
to enrich the `trace_context`. This is incorrect because the `TraceContext` struct does
not have a `measurements` field, and `event.measurements` contains transaction-level
data like web vitals, not span-specific AI metrics. If a transaction-level measurement's
name happens to match a legacy AI measurement name (e.g., `ai_prompt_tokens_used`), it
will be incorrectly mapped to the root span's AI data fields. This can lead to corrupted
AI token counts and incorrect cost calculations.

Did we get this right? 👍 / 👎 to inform future reviews.
Reference ID: 8402706

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants