[elixir] Performance powered by OTel #538

Open
1 task
Tracked by #4
sl0thentr0py opened this issue Mar 10, 2023 · 16 comments

@sl0thentr0py
Member

sl0thentr0py commented Mar 10, 2023

@krainboltgreene so there are several things that would need to be done to get performance tracing working with sentry-elixir. So far this SDK has been maintained by the community, so it is not on par with what we call our Unified API for making common abstractions across all Sentry SDKs.

I will make 2 lists below, one of the bare minimum that would lead to quick opentelemetry based performance and the other more 'ideal' list where we make the SDK feature compatible in terms of performance with other SDKs.

Required

Optional

  • The Elixir SDK is also missing the Hub (for concurrency) and Scope abstractions, but technically they can be ignored if a quick path to OpenTelemetry support is desired.

I will make a new issue out of this to track it in sentry-elixir. Unfortunately, this is a non-trivial amount of development work, so I can't give you very clear cut instructions on how you can contribute, but feel free to try stuff out and make a PR if you're interested and we can collaborate.

If there is sufficient interest from the community, we can also potentially prioritize me working on this as well next quarter.

Originally posted by @sl0thentr0py in getsentry/sentry#40712 (reply in thread)

Other notes

@tsloughter

What does "Hub" for concurrency in Elixir look like? It looks to be a context store? Like how in OpenTelemetry we use the process dictionary?

Does this mean Sentry does not intend to adopt the OpenTelemetry API?

Also, what does "opentelemetry based performance" mean here? Based on what I read of the SpanProcessor model, it doesn't read like you'd be relying on the Otel SDK for span operations, so if you found performance better with OpenTelemetry, I wouldn't be sure that's the case when combined with the SpanProcessor. But I don't yet fully understand what it is doing, so I may be wrong there :)

@sl0thentr0py
Member Author

What does "Hub" for concurrency in Elixir look like? It looks to be a context store? Like how in OpenTelemetry we use the process dictionary?

something like that yea, basically the Hub needs to be cloned per concurrency unit, so we'll need to spec out what that looks like for elixir.
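As a rough illustration of "cloned per concurrency unit", here is a minimal sketch that keeps a scope in the process dictionary and copies it into a spawned task. `ScopeSketch` and its functions are hypothetical names for illustration only; they are not part of sentry-elixir, and the real Hub/Scope design was still to be specified at this point in the thread.

```elixir
defmodule ScopeSketch do
  # Hypothetical module, not part of sentry-elixir. Each BEAM process has its
  # own process dictionary, so a scope stored there is isolated per process.
  @key :scope_sketch

  # Read the calling process's scope (an empty map if none was set).
  def get, do: Process.get(@key, %{})

  # Merge new tags into the calling process's scope.
  def put_tags(tags) when is_map(tags) do
    Process.put(@key, Map.merge(get(), tags))
    :ok
  end

  # "Clone" the scope into a new process: capture it in the parent,
  # then install the copy in the child before running the function.
  def async(fun) when is_function(fun, 0) do
    parent_scope = get()

    Task.async(fn ->
      Process.put(@key, parent_scope)
      fun.()
    end)
  end
end

ScopeSketch.put_tags(%{request_id: "abc"})

task =
  ScopeSketch.async(fn ->
    # Changes made in the child do not leak back into the parent.
    ScopeSketch.put_tags(%{job: "mailer"})
    ScopeSketch.get()
  end)

child_scope = Task.await(task)
parent_scope = ScopeSketch.get()
```

The child starts from a copy of the parent's scope, and because BEAM processes share no memory, later changes on either side stay local to that process.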

Does this mean Sentry does not intend to adopt the OpenTelemetry API?

not directly, we have our own Tracing model and Ingestion so we will only support OpenTelemetry indirectly via the SpanProcessor pathway.

it doesn't read like you'd be relying on the Otel SDK for span operations

we'd rely on the Otel SDK for instrumenting and recording spans but we need to convert them to the Sentry model to be able to ingest and store them on our side, this is what the SpanProcessor would do.

cc @smeubank for high-level product design discussion ^

@tsloughter

Now I see, so the SpanProcessor updates a global store of SpanId->SentrySpan and then OnEnd will update that based on the finished OpenTelemetry Span.
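The parallel tracking described here could be sketched roughly as follows. This is an assumption-laden illustration: spans are plain maps rather than real OpenTelemetry span records, `SpanStoreSketch` and its field names are hypothetical, and the functions only mirror the start/end hooks of a span processor rather than implementing the actual behaviour.

```elixir
defmodule SpanStoreSketch do
  # Hypothetical sketch: spans are plain maps here, not real OpenTelemetry
  # span records, and the function names only mirror the processor hooks.
  @table :span_store_sketch

  # Create the global table mapping span_id => Sentry-style span.
  def init do
    :ets.new(@table, [:named_table, :public, :set])
    :ok
  end

  # Called when a span starts: record a Sentry-style span under its id.
  def on_start(%{span_id: id, trace_id: trace_id, name: name, start_time: start}) do
    sentry_span = %{
      span_id: id,
      trace_id: trace_id,
      op: name,
      start_timestamp: start,
      timestamp: nil,
      status: nil
    }

    :ets.insert(@table, {id, sentry_span})
    :ok
  end

  # Called when a span ends: fill in the finished fields and remove the entry.
  def on_end(%{span_id: id, end_time: end_time, status: status}) do
    case :ets.take(@table, id) do
      [{^id, sentry_span}] -> {:ok, %{sentry_span | timestamp: end_time, status: status}}
      [] -> :dropped
    end
  end
end

SpanStoreSketch.init()
SpanStoreSketch.on_start(%{span_id: 1, trace_id: 42, name: "GET /users", start_time: 100})
{:ok, finished} = SpanStoreSketch.on_end(%{span_id: 1, end_time: 250, status: :ok})
```

Taking the entry out of the table on end keeps the store from growing without bound; a span that ends without a matching start is simply reported as dropped.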

@thbar

thbar commented Jun 19, 2023

If there is sufficient interest from the community, we can also potentially prioritize me working on this as well next quarter.

I'm not sure how to provide feedback on that, but we (at https://transport.data.gouv.fr) would be very interested to see performance monitoring supported for Elixir.

@josevalim
Contributor

For what it's worth, the Ruby one seems to be done via OpenTelemetry: https://docs.sentry.io/platforms/ruby/performance/instrumentation/opentelemetry/ ?

@tsloughter

@josevalim I think it has both. And the OpenTelemetry option is awkward to implement -- uses a span processor to basically do a parallel tracking of spans. I don't know that there will be another option other than the processor though as long as Sentry requires the implementation to create Transactions.

@hkrutzer

There is also a JS implementation using OpenTelemetry, indeed using a span processor.

@sl0thentr0py
Member Author

as long as Sentry requires the implementation to create Transactions.

@tsloughter @josevalim we have an ongoing project to move away from our Transaction model gradually on the ingestion side, will keep this thread updated when we ship something production ready.

@jwaldrip

Any update on this?

@whatyouhide
Collaborator

@jwaldrip no, and we'll post updates if there are any, no worries!

@sl0thentr0py
Member Author

I will actually start writing a spec for it this week!

@sl0thentr0py
Member Author

oki, current status of sentry ingestion of otlp traces follows!

Spec

Business concerns

  • this will be shipped experimentally as an alpha feature for starters, we will stabilize the feature and pricing around the end of Q2
  • elixir and node will be used as testing grounds for this new ingestion capability

Elixir SDK implications

  • elixir will not ship the old Transaction model like other SDKs at all, it will directly leverage OTLP and OpenTelemetry instrumentation
  • we will still want the Sentry SDK to be installed to set up the DSN / ingestion endpoint / trace exporter
  • the Sentry SDK setup will also take care of configuring sane defaults for the OpenTelemetry SDK for an 'out of the box' experience
  • also other things like sampling / trace propagation TBD - can be ignored for an MVP
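For a sense of what "configuring sane defaults for the OpenTelemetry SDK" might involve, here is a hedged configuration sketch using the existing `opentelemetry` and `opentelemetry_exporter` application settings. The endpoint URL and header below are placeholders: the thread does not state the Sentry OTLP ingestion URL or how it authenticates, and nothing here is confirmed as what the Sentry SDK will actually set.

```elixir
# config/runtime.exs
import Config

# Batch finished spans and hand them to the OTLP exporter.
config :opentelemetry,
  span_processor: :batch,
  traces_exporter: :otlp

# Assumption: the endpoint and header are placeholders, not real Sentry
# values. Replace them with whatever the ingestion endpoint requires.
config :opentelemetry_exporter,
  otlp_protocol: :http_protobuf,
  otlp_endpoint: "https://otlp-ingest.example.invalid",
  otlp_headers: [{"x-placeholder-auth", "replace-me"}]
```

In the design outlined above, users would not write this by hand; the Sentry SDK would derive the equivalent settings from the DSN.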

I will start playing around with OpenTelemetry SDKs and exporters this week and update once ingestion works end-to-end.

whatyouhide added a commit that referenced this issue Feb 3, 2024
This was not released yet. See #538 (comment).
whatyouhide added a commit that referenced this issue Feb 3, 2024
This was not released yet. See #538 (comment).
@sl0thentr0py
Member Author

sl0thentr0py commented Feb 12, 2024

update:
I have now added protobuf ingestion support because the elixir/erlang opentelemetry exporters only had http/protobuf support and not json.
getsentry/relay#3044

I will now test ingestion since this is in production.

@whatyouhide
Collaborator

@sl0thentr0py lol this is fantastic news, I had already started working on the JSON export support 😄

@smeubank smeubank changed the title Investigate quickest path to otel + performance tracing [elixir] Performance powered by OTel Feb 20, 2024
@whatyouhide whatyouhide changed the title [elixir] Performance powered by OTel Performance powered by OTel Feb 24, 2024
@smeubank smeubank changed the title Performance powered by OTel [elixir] Performance powered by OTel Mar 6, 2024
@sl0thentr0py sl0thentr0py assigned solnic and unassigned sl0thentr0py Sep 2, 2024
@jwaldrip

jwaldrip commented Dec 4, 2024

Any progress on this? We would love to trace things back to our API. :-)

@solnic
Collaborator

solnic commented Dec 5, 2024

@jwaldrip hey Jason, yes! We're working on wrapping up #784 where traces are working via an OTel span processor, we're close! 😄
