-
-
Notifications
You must be signed in to change notification settings - Fork 2.9k
fix fork cache bug in MemoryAccessor.zig #24376
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Conversation
lib/std/debug/MemoryAccessor.zig
Outdated
if (newpid != pid) { | ||
@atomicStore(posix.pid_t, &cached_pid, newpid, .monotonic); | ||
continue :linux -1; | ||
} else unreachable; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It would be cleaner to write this as
assert(newpid != pid);
@atomicStore(...
lib/std/debug/MemoryAccessor.zig
Outdated
.INVAL, .SRCH => unreachable, // own pid is always valid | ||
.INVAL => unreachable, // EINVAL is returned when flags to process_vm_readv is non-zero | ||
.SRCH => { | ||
// process with pid does not exist, but it should be our pid, so we'll try again. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could we please explain an example of a case in which this can happen (post-fork()) in the comment?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Wasn't sure how much detail to add, do you think this is enough?
What if the parent process has |
I agree, I considered both 1) deleting cached_pid completely and 2) exposing it as pub to give callers a work around. I reject both because 1) seem to be more a more controversial change that I wasn't prepared to defend, and 2) it seemed strictly worse than both this change and deleting cached_pid, it also felt like leaving in a foot gun that didn't need to exist, or be someone else's problem. (I also considered creating a function to invalidate/reset the cache that I would also call from fork(), but enshrining this as desirable felt like a mistake I'd regret later when rambling about tech debt, lol) |
I feel like this would be the right move. This is stack trace code; getting correct and useful results is more important than getting them slightly faster, IMO. But cc @jacobly0 since IIRC he wrote this code. |
Why can't you just move the global to a field and initialize it to -1 every time the struct is initialized? |
I guess if I were to be super pedantic, it's possible for a signal to be delivered while we're doing a stack trace, and for the signal handler to But this is obviously a fairly ridiculous edge case. |
|
If you fork in a signal handler... I feel like that's an implicit agreement that you enjoy fighting dragons, and then this behavior would be "expected" :P
I was assuming there was a reason it was a file |
The goal is to delete this file once the actual bugs it is working around are fixed anyway. |
This is also true if you call |
lib/std/debug/MemoryAccessor.zig
Outdated
@@ -38,8 +38,8 @@ pub fn deinit(ma: *MemoryAccessor) void { | |||
|
|||
fn read(ma: *MemoryAccessor, address: usize, buf: []u8) bool { | |||
switch (native_os) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You could also try deleting this switch entirely and seeing if CI still passes. I just did a preliminary test which suggests that the issue this was working around is no longer present.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@jacobly0 I think is what you had in mind?
I didn't scan through the commit history to understand if this would be the simplest revert, and also because I don't have the context around the struct, I'm very much depending on CI + review here to validate this change. Let me know if I missed something obvious?
b4b2871
to
59b3ebd
Compare
wrote this quickly via the web interface, but didn't validate it yet. Submitting the PR for comments and I can fix this up if the approach seems reasonable.