Skip to content

Flatten 1:1 shape GET child spans into Plug_shape_get root-span attributes#4561

Merged
alco merged 2 commits into
mainfrom
flatten-shape-get-spans
Jun 11, 2026
Merged

Flatten 1:1 shape GET child spans into Plug_shape_get root-span attributes#4561
alco merged 2 commits into
mainfrom
flatten-shape-get-spans

Conversation

@erik-the-implementer

Copy link
Copy Markdown
Contributor

Motivation

We are over our Honeycomb event budget, and two spans are an outsized contributor: shape_get.api.load_shape_info and shape_get.plug.serve_shape_log together produce ~147M events/week — roughly 33% of the org's event budget.

Both spans are strictly 1:1 with the Plug_shape_get root span (one load_shape_info per request; one serve_shape_log per request on the log-serving path), and neither carries information that can't live on the root span. load_shape_info is a leaf span; serve_shape_log's only children are the shape_get.plug.stream_chunk spans, which are untouched by this PR and now parent directly to the root.

What changed

  • New helper Electric.Telemetry.OpenTelemetry.with_flattened_span/2: runs a function and records its duration and process-memory footprint as attributes on the current span instead of emitting a child span.
  • Electric.Shapes.Api.do_load_shape_info/1 and serve_shape_log/1 now use it instead of with_span/4. No other spans were changed (shape_get.plug.serve_subset_response and shape_get.plug.stream_chunk are left as-is).
  • Changeset: patch bump for @core/sync-service.

What's preserved, and where

All numeric data the child spans used to carry now lands on the Plug_shape_get root span:

Was (child span) Now (root-span attribute)
shape_get.api.load_shape_info duration load_shape_info.duration_ms (float ms, sub-ms precision)
its memory.start.{process,binary}_bytes / memory.end.{process,binary}_bytes load_shape_info.memory.start.* / load_shape_info.memory.end.*
shape_get.plug.serve_shape_log duration serve_shape_log.duration_ms
its memory.start.* / memory.end.* serve_shape_log.memory.start.* / serve_shape_log.memory.end.*

The memory attributes are prefixed with the flattened span's short name so they don't collide with the root span's own memory.start.* / memory.end.* attributes.

Error semantics are unchanged: attributes (including duration) are recorded in an after block even when the wrapped code raises, and the exception propagates up to ServeShapePlug's error handler, which records the exception event and sets error status on the root span exactly as before (the child spans never carried error status themselves — :otel_tracer.with_span/4 only ends the span on raise).

Trade-offs

  • Trace waterfalls lose the visual split between "load shape info" and "serve shape log" inside a request — but no numeric data is lost; the same timings are queryable as root-span attributes.
  • The [:electric, :shape_get, :api, :load_shape_info, :start|:stop|:exception] and [:electric, :shape_get, :plug, :serve_shape_log, ...] :telemetry.span events are no longer emitted. A repo-wide search found no handlers attached to them.
  • In embedded (elixir-client) usage, Api.serve_shape_log/1 previously created a root span of its own; with no enclosing span the helper is now a no-op (it just runs the function), so embedded calls produce no span — consistent with the unsampled-request path.

Verification

  • mix test test/electric/telemetry/open_telemetry_test.exs test/electric/shapes/api_test.exs test/electric/plug/serve_shape_plug_test.exs test/electric/plug/serve_shape_plug_logging_test.exs — 121 tests, 0 failures.
  • New unit tests for with_flattened_span/2 cover attribute recording, the raise path, and the no-span-context no-op.

🤖 Generated with Claude Code

…butes

shape_get.api.load_shape_info and shape_get.plug.serve_shape_log are
emitted once per shape GET request, strictly 1:1 with the Plug_shape_get
root span (~147M Honeycomb events/week, ~33% of the event budget) while
carrying no information that can't live on the root span.

Replace them with OpenTelemetry.with_flattened_span/2, which records
<name>.duration_ms and <name>.memory.{start,end}.* attributes on the
current span instead of emitting a child span. Attributes are recorded
even when the wrapped function raises; exceptions still propagate to the
plug's error handler, which records them on the root span as before.

shape_get.plug.stream_chunk and shape_get.plug.serve_subset_response are
left untouched.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@codecov

codecov Bot commented Jun 11, 2026

Copy link
Copy Markdown

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 58.07%. Comparing base (683cfae) to head (50413b4).
⚠️ Report is 1 commits behind head on main.
✅ All tests successful. No failed tests found.

Additional details and impacted files
@@           Coverage Diff           @@
##             main    #4561   +/-   ##
=======================================
  Coverage   58.06%   58.07%           
=======================================
  Files         369      369           
  Lines       40459    40459           
  Branches    11468    11469    +1     
=======================================
+ Hits        23494    23498    +4     
+ Misses      16890    16887    -3     
+ Partials       75       74    -1     
Flag Coverage Δ
packages/agents 71.37% <ø> (ø)
packages/agents-mcp 77.54% <ø> (ø)
packages/agents-mobile 75.49% <ø> (ø)
packages/agents-runtime 82.13% <ø> (ø)
packages/agents-server 74.81% <ø> (+0.04%) ⬆️
packages/agents-server-ui 6.25% <ø> (ø)
packages/electric-ax 46.42% <ø> (ø)
packages/experimental 87.73% <ø> (ø)
packages/react-hooks 86.48% <ø> (ø)
packages/start 82.83% <ø> (ø)
packages/typescript-client 91.83% <ø> (+0.11%) ⬆️
packages/y-electric 56.05% <ø> (ø)
typescript 58.07% <ø> (+<0.01%) ⬆️
unit-tests 58.07% <ø> (+<0.01%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Harness.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

@claude

claude Bot commented Jun 11, 2026

Copy link
Copy Markdown

Claude Code Review

Summary

This PR replaces two OTel child spans (shape_get.api.load_shape_info and shape_get.plug.serve_shape_log) — both strictly 1:1 with the Plug_shape_get root span — with a new OpenTelemetry.with_flattened_span/2 helper that records the same timing/memory data as prefixed attributes on the current span, to cut ~33% of the org's Honeycomb event budget. The change remains small, correct, well-tested, and exceptionally well-documented. I found no correctness issues, and the two open suggestions from my previous review have now been addressed.

What's Working Well

  • The motivation is sound and the mechanism is correct. Honeycomb bills per span/event, not per attribute, so moving 1:1 child-span data onto existing root-span attributes genuinely removes events without losing queryable data.
  • The in_span_context?() gate is exactly right. Because Electric samples at span-creation time (Sampler), when the root span isn't sampled there's no current span context, so with_flattened_span no-ops and avoids the two Process.info/2 calls — matching the existing with_span behavior.
  • Attribute-collision reasoning checks out. prefix_attrs/2 namespaces the memory keys under load_shape_info.* / serve_shape_log.*, which are disjoint from each other and from the root span's own unprefixed memory.start.*/memory.end.*, so nothing is clobbered.
  • Error semantics are preserved. The after block records duration/memory on raise and re-raises; the root span's existing handler (ServeShapePlug.handle_errorsrecord_exception/4) still owns error status, matching the old child-span behavior.
  • Tests cover the three meaningful paths (attribute recording, raise path, no-span no-op), using the project's Repatch convention with async: true. Codecov reports all modified lines covered.
  • Changeset present and now thorough (@core/sync-service, patch).

Issues Found

Critical (Must Fix)

None.

Important (Should Fix)

None.

Suggestions (Nice to Have)

  1. duration_ms as a float vs. the existing *_µs integer convention (carried over from iteration 1, minor). Other timings in this module (timed_fun/3) record integer microseconds via :timer.tc; with_flattened_span records float milliseconds. Both are fine and the unit is documented in the @doc and changeset — flagging only in case dashboards expect a uniform unit/type across span attributes. No action required.

Issue Conformance

No linked issue (per the prompt context). The PR description and changeset together are unusually thorough — they document the preserved-data mapping, error semantics, the exclude_spans implication, and embedded-mode behavior. Per project convention a PR should still reference an issue; consider linking the Honeycomb-budget tracking issue if one exists.

Previous Review Status

This is an incremental review (iteration 2). Since iteration 1, commit 50413b4d ("Document exclude_spans and embedded-mode implications in changeset") addresses both open suggestions:

  • Suggestion 1 (:exclude_spans no longer applies) — the changeset now includes an explicit operator note that listing these names in ELECTRIC_EXCLUDE_SPANS has no effect since they're no longer spans.
  • Suggestion 2 (embedded-mode observability) — the changeset documents the embedded-mode behavior, and is actually more precise than my original framing: rather than a pure no-op, the helper now records attributes onto the caller's current span if one exists (e.g. a Phoenix request span in OTel-instrumented apps) and emits nothing otherwise.

The remaining suggestion (3, float vs integer units) is a documented stylistic choice and needs no change. No new issues introduced; the implementation is unchanged from iteration 1 and remains correct.


Review iteration: 2 | 2026-06-11

@alco alco self-assigned this Jun 11, 2026

@robacourt robacourt left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice!

@alco alco merged commit 0947230 into main Jun 11, 2026
76 of 77 checks passed
@alco alco deleted the flatten-shape-get-spans branch June 11, 2026 13:10
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants