fix: inconsistent interceptor span emission when workflow is in replay [WIP] #2060
What was changed
This PR addresses an issue that causes `RunWorkflow:*` spans to be emitted when workflows are viewed in the UI (i.e., in replay). To make the behavior consistent, `startNonReplaySpan` was extracted from `tracingWorkflowOutboundInterceptor` and simplified to check the workflow context and return a no-op span when the workflow is in replay mode. `startNonReplaySpan` is then reused in all places where a workflow context is in scope.
Why?
Discern uses the `RunWorkflow:*` OTel span to alert on complete workflow failures (i.e., a workflow that has died indefinitely). When viewing the Temporal UI in the cloud console, our monitoring is re-triggered because the query to the worker replays the workflow history and re-emits the `RunWorkflow` span.
Historical context in Slack
How was this tested:
I am still in the process of testing this using Go workspaces by aliasing my branch of `temporalio/sdk-go` in our Go environment.
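For reviewers, the replay guard described under "What was changed" can be sketched roughly as follows. This is a self-contained illustration, not the code in this PR: `workflowContext`, `recordingTracer`, `recordedSpan`, `noopSpan`, and the `Span` interface are hypothetical stand-ins for the SDK's types, and the signature of `startNonReplaySpan` here is assumed for the example. In the SDK, the replay check would presumably go through `workflow.IsReplaying(ctx)` rather than a struct field.

```go
package main

import "fmt"

// Span is a simplified stand-in for the SDK's tracer span type (hypothetical).
type Span interface {
	Finish()
}

// noopSpan is returned during replay; finishing it emits nothing.
type noopSpan struct{}

func (noopSpan) Finish() {}

// recordingTracer is a hypothetical tracer that records the names of finished spans.
type recordingTracer struct {
	emitted []string
}

// recordedSpan appends its name to the tracer when finished.
type recordedSpan struct {
	tracer *recordingTracer
	name   string
}

func (s recordedSpan) Finish() {
	s.tracer.emitted = append(s.tracer.emitted, s.name)
}

// workflowContext is a stand-in for workflow.Context; only the replay flag is modeled.
type workflowContext struct {
	replaying bool
}

// startNonReplaySpan returns a no-op span while the workflow is replaying,
// and a real (recording) span otherwise. Signature assumed for illustration.
func startNonReplaySpan(ctx workflowContext, t *recordingTracer, operation, name string) Span {
	if ctx.replaying {
		return noopSpan{}
	}
	return recordedSpan{tracer: t, name: operation + ":" + name}
}

func main() {
	t := &recordingTracer{}

	// During replay (e.g. a UI query replaying history), nothing is emitted.
	startNonReplaySpan(workflowContext{replaying: true}, t, "RunWorkflow", "MyWorkflow").Finish()

	// During normal execution, the span is emitted once.
	startNonReplaySpan(workflowContext{replaying: false}, t, "RunWorkflow", "MyWorkflow").Finish()

	fmt.Println(t.emitted) // prints [RunWorkflow:MyWorkflow]
}
```

Because the guard lives in one helper, every call site that has a workflow context in scope gets the same replay behavior, which is the consistency this PR is aiming for.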