Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix interference between extension and Pyroscope SDK #13

Merged
merged 3 commits into from
Jan 2, 2025

Conversation

aleks-p
Copy link
Contributor

@aleks-p aleks-p commented Dec 27, 2024

Provides a solution for #1, #10 and #12. This is just a proposal, there are perhaps other alternatives that could work as well.

Background

When building the extension jar, we bundle a relocated Pyroscope SDK and async profiler binaries. We do this so that profiling itself and the tracing-profiling correlation can be achieved with no user-level code modifications. A user simply adds the open telemetry java agent and registers the pyroscope-otel jar as an extension:

OTEL_JAVAAGENT_EXTENSIONS=./pyroscope-otel.jar java -javaagent:opentelemetry-javaagent.jar ...

Limitations

If an application already depends on the Pyroscope SDK (e.g., to add custom labels), and we want to correlate traces and profiles, we are faced with a problem. We effectively have 2 self-contained Pyroscope jars in the system, one from the SDK itself and one for the otel extension. Furthermore, the otel extension is loaded via an isolated class loader, which means we can't easily detect whether the Pyroscope SDK is present or not.

In some cases this will result in an exception like in #1 and #10. in other cases (#12, depending on used versions) we will end up with 2 profilers running and an undefined profiler output.

Proposed solution

We introduce a ProfilerApi in the Pyroscope SDK, exposing basic profiling functionalities needed by this otel extension. When the extension runs, it will try to load an implementation of this API using the system class loader. If it succeeds, we use it for all future profiler interactions. If it fails for any reason, we do everything as we do today.

Depends on grafana/pyroscope-java#176.

Caveats

The extension runs early in the boot process. For it to be able to "see" the ProfilerApi implementation, we need to run Pyroscope in agent mode and the agent needs to be loaded before the open telemetry one:

OTEL_JAVAAGENT_EXTENSIONS=./pyroscope-otel.jar java -javaagent:pyroscope.jar -javaagent:opentelemetry-javaagent.jar ...

Users that don't explicitly need the Pyroscope SDK in their applications can omit this and run the extension like they do today:

OTEL_JAVAAGENT_EXTENSIONS=./pyroscope-otel.jar java -javaagent:opentelemetry-javaagent.jar 

Copy link
Collaborator

@korniltsev korniltsev left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm

@Override
public void close() {
try {
Method closeMethod = ctx.getClass().getDeclaredMethod("close");
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: I'm a bit worried about performance of using reflection here. If this becomes a problem we can try generate a java class at runtime

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed, I'll keep an eye. Also, we need to add some test automation to this repo, I'll create an issue for this.

@aleks-p aleks-p force-pushed the fix/extension-sdk-interference branch from e4622a1 to 827fbaf Compare January 2, 2025 16:14
@aleks-p aleks-p merged commit f778008 into main Jan 2, 2025
1 check passed
@aleks-p aleks-p deleted the fix/extension-sdk-interference branch January 2, 2025 16:14
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants