Detect trace conflicts in Kprobe context propagation #557
This fix detects trace conflicts for runtimes that use single-threaded queued dispatch, e.g. NodeJS. In these runtimes there is one worker thread, which picks up work from a queue and does all of the communication. Our current kprobes support for matching server -> client spans uses the thread id as the identifier, which ends up being wrong in this scenario, because several in-flight requests share the same thread id.
The fix checks whether the same thread already has an active server span that hasn't finished and, if so, marks it as invalid. Client spans pull the server span info; if the span is invalid, they ignore it and generate a new trace id. This effectively breaks the server -> client chain, but at least it doesn't make the last server span the parent of all client spans.
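To illustrate the idea, here is a rough user-space model of the logic in Python. It is only a sketch and not the eBPF code in this PR: the names `ServerSpan`, `active_server_spans`, `on_server_request_start`, `on_server_request_end` and `on_client_request` are hypothetical, the dictionary stands in for a kernel-side map keyed by thread id, and removing the entry when a server request ends is an assumption made to keep the model small.

```python
from dataclasses import dataclass
from itertools import count
from typing import Dict

# Hypothetical trace id generator; the real ids are produced elsewhere.
_next_id = count(1)


def new_trace_id() -> str:
    return f"{next(_next_id):032x}"


@dataclass
class ServerSpan:
    trace_id: str
    valid: bool = True  # cleared once a second request shows up on the same thread


# Stand-in for a kernel-side map keyed by thread id (assumption).
active_server_spans: Dict[int, ServerSpan] = {}


def on_server_request_start(tid: int) -> ServerSpan:
    """Record a server span for this thread, or flag a conflict."""
    existing = active_server_spans.get(tid)
    if existing is not None:
        # Another server span is still active on this thread, so the
        # thread id no longer identifies a single request.
        existing.valid = False
        return existing
    span = ServerSpan(trace_id=new_trace_id())
    active_server_spans[tid] = span
    return span


def on_server_request_end(tid: int) -> None:
    """Drop the entry when the server request finishes (assumed cleanup)."""
    active_server_spans.pop(tid, None)


def on_client_request(tid: int) -> str:
    """Return the trace id a client span on this thread should use."""
    span = active_server_spans.get(tid)
    if span is not None and span.valid:
        return span.trace_id  # safe to chain server -> client
    return new_trace_id()  # conflict or no server span: start a new trace
```

With one request per thread, `on_client_request` inherits the server trace id. Once a second request starts on the same thread before the first finishes, the entry is marked invalid and every client call gets a fresh trace id instead of being attached to whichever server span was seen last.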
I think we'll be able to fully implement support for NodeJS by finding an appropriate uprobe we can attach in the Node runtime, which would perhaps let us find a different identifier for the server span than the thread id. For now, NodeJS will not be able to propagate context.