Fix and detect race condition in clienttrace.go #5700
instrumentation/net/http/httptrace/otelhttptrace/constants_test.go
instrumentation/net/http/httptrace/otelhttptrace/clienttrace_test.go
if !ct.useSpans {
	if err != nil {
		attrs = append(attrs, attribute.String(hook+".error", err.Error()))
	}
	if ct.root == nil {
Can the conditions of this be tested?
I'm not very familiar with this codebase; I just tried to fix a race condition that I could reproduce locally.
What should I test instead of checking whether ct.root is nil?
In my latest revision, I moved the ct.root nil check to the top of the function, so it now exits early if ct.root is nil.
I'm not sure that's right. I'm operating under the assumption that most functions that record events on ct.root cannot do useful work until ct.root is set and should just move on.
instrumentation/net/http/httptrace/otelhttptrace/clienttrace.go
for i := 1; i < workers; i++ {
	wg.Add(1)
	go func() {
		resp, err := client.Get("https://example.com")
This should use httptest, not make an external network call.
Done
This still won't work without an active network / if example.com is unavailable.
52345f4 to eb1dc63
Friendly ping.
ct.root can be updated without holding the mutex on ct. Also, addEvent can be called on a nil ct.root. Fix both of these issues by moving the mutex acquisition logic so that ct.root is only touched while the mutex is held.
4994805 to 647b6f3
// TestNewClientParallelismWithoutSubspans tests running many Gets on a client simultaneously,
// which would trigger a race condition if root were not protected by a mutex.
func TestNewClientParallelismWithoutSubspans(t *testing.T) {
This test passes when run against main:
$ go test -race -count=20 .
ok go.opentelemetry.io/contrib/instrumentation/net/http/httptrace/otelhttptrace 5.443s
It does not seem to be testing the fix being applied here.
var wg sync.WaitGroup

for i := 1; i < 10000; i++ {
This is a lot of goroutines being spawned here. What's the reasoning behind this? It is making this test quite resource intensive.
if ct.root == nil {
	return
}
It looks like this is what is breaking opentelemetry-go-contrib/instrumentation/net/http/httptrace/otelhttptrace/test/clienttrace_test.go (lines 239 to 251 in b0dce52):
func TestEndBeforeStartCreatesSpan(t *testing.T) {
	sr := tracetest.NewSpanRecorder()
	tp := trace.NewTracerProvider(trace.WithSpanProcessor(sr))
	otel.SetTracerProvider(tp)
	ct := otelhttptrace.NewClientTrace(context.Background())
	ct.DNSDone(httptrace.DNSDoneInfo{})
	ct.DNSStart(httptrace.DNSStartInfo{Host: "example.com"})
	name := "http.dns"
	spans := getSpansFromRecorder(sr, name)
	require.Len(t, spans, 1)
}