In #25, I accidentally commented out the `evwait(sawcommevfd)` call in the `testComm` test. This was unintentional and is unrelated to the fixes introduced in that PR.
```go
// TODO(acln): investigate the legitimacy of the following crutch.
//
// Wait for the parent to see that we changed our name, then exit.
//
// If we do not wait here, there is a terrible race condition waiting
// to happen: If we PR_SET_NAME in the child, then immediately exit,
// the other side may not see POLLIN on the comm record: it may see
// POLLHUP directly, even though a comm record was actually written
// to the ring in the meantime. Why we get POLLHUP directly, and not
// POLLIN before it, is unclear. The machinery to deal with this
// eventuality in the poller does not exist yet, and at the time
// when this comment was written, I have found no good solutions to
// this conundrum.
//
// So we live with it, but still try to make our test pass.
// evwait(sawcommevfd)
_ = sawcommevfd
```
Current state: with the patch from #26 applied onto master at 4d8e4e5, `sudo ./perf.test -test.count=1000 -test.run=Record/Comm` passes with no failures for me.
However, if I uncomment the `evwait` above, the tests fail. In that case, `ReadRecord` doesn't return, even when the context deadline is set many seconds into the future. What I see is that the child changes its COMM, then waits for the `sawcommevfd` signal and never exits.
So the mystery is why `ReadRecord` doesn't return when this wait is present. It feels as though the kernel isn't respecting our wakeup events = 1. When the wait is commented out (erroneously, by me), the process exits and we receive the event.
This issue is here as a reminder to revisit this.
The snippet quoted above is from perf/record_test.go, lines 528 to 544 in 4d8e4e5.