std.debug: fix some corner cases #23927

Draft: wants to merge 5 commits into master

rootbeer
Contributor

Add infinite loop detection to the std.debug backtraces. Make the backtrace and stacktrace code more robust on corner-case architectures.
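
For illustration only (this is not the code in the PR): one common way to detect a backtrace that is stuck in a loop is to require every caller frame to sit strictly above its callee and to cap the number of frames. The sketch below walks a simulated frame chain; `Frame`, `walkFrames`, and `max_frames` are hypothetical names, and index 0 is assumed to act as the chain terminator.

```zig
const std = @import("std");

/// Hypothetical stand-in for a saved frame record: the caller's frame
/// pointer (an index into a simulated stack) and a return address.
const Frame = struct {
    next_fp: usize,
    ret_addr: usize,
};

const WalkError = error{ LoopDetected, TooManyFrames };

/// Walks the simulated chain from `start_fp` and returns the number of
/// frames visited. A `next_fp` of 0 (or an out-of-range index) ends the walk.
fn walkFrames(frames: []const Frame, start_fp: usize, max_frames: usize) WalkError!usize {
    var fp = start_fp;
    var count: usize = 0;
    while (fp != 0 and fp < frames.len) {
        if (count == max_frames) return error.TooManyFrames;
        count += 1;
        const next = frames[fp].next_fp;
        // On a downward-growing stack a caller's frame lives at a higher
        // address than its callee's. Anything else is a cycle or corruption.
        if (next != 0 and next <= fp) return error.LoopDetected;
        fp = next;
    }
    return count;
}

test "a frame that points at itself is reported as a loop" {
    const looping = [_]Frame{
        .{ .next_fp = 0, .ret_addr = 0 }, // unused terminator slot
        .{ .next_fp = 2, .ret_addr = 0x1000 },
        .{ .next_fp = 2, .ret_addr = 0x2000 }, // points back at itself
    };
    try std.testing.expectError(error.LoopDetected, walkFrames(&looping, 1, 16));
}
```

The frame cap is a backstop for chains that keep increasing but never terminate; the monotonicity check catches the "stuck above main()" style of loop discussed later in this thread.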

Expand the "unwind.zig" test case to exercise std.debug.dumpCurrentStackTrace(). And trigger a signal handler so the test can exercise std.debug.dumpStackTraceFromBase() and std.debug.StackIterator.initWithContext() using a kernel-constructed context.

This is preparation for moving std.debug away from getContext() (#23801).

@@ -959,8 +985,8 @@ pub fn writeCurrentStackTrace(
     start_addr: ?usize,
 ) !void {
     if (native_os == .windows) {
-        var context: ThreadContext = undefined;
-        assert(getContext(&context));
+        var context: windows.CONTEXT = std.mem.zeroes(windows.CONTEXT);
Member

Why is the zeroing necessary?

Contributor Author

I'm not sure. I'm manually inlining std.debug.getContext here, and it already does the zeroing. I can't find any documentation that says whether the zeroing is necessary, though.
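
For illustration only: the practical difference between `undefined` and `std.mem.zeroes` shows up when a capture routine fills in only part of the structure. The sketch below uses a hypothetical `FakeContext` and `capturePartial`; it makes no claim about which fields the real Windows capture call writes.

```zig
const std = @import("std");

/// Hypothetical register context; the real `windows.CONTEXT` has many more fields.
const FakeContext = extern struct {
    flags: u32,
    pc: u64,
    sp: u64,
};

/// Hypothetical capture routine that writes only some of the fields.
fn capturePartial(ctx: *FakeContext) void {
    ctx.pc = 0x1000;
    ctx.sp = 0x7ff0;
}

test "zeroed context has defined contents in untouched fields" {
    var ctx: FakeContext = std.mem.zeroes(FakeContext);
    capturePartial(&ctx);
    // `flags` was never written by the capture routine, but it is still
    // well-defined because the whole struct started out zeroed.
    try std.testing.expectEqual(@as(u32, 0), ctx.flags);
    try std.testing.expectEqual(@as(u64, 0x1000), ctx.pc);
}
```

If the capture call is known to overwrite every field that is read afterwards, starting from `undefined` would behave the same; zeroing only matters for fields left untouched.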

alexrp self-assigned this on May 19, 2025
// Getting the backtrace inside the signal handler (with the ucontext_t)
// gets stuck in a loop on some systems:
const expect_signal_frame_overflow =
(native_arch == .arm and link_libc); // loops above main()
Member

alexrp, May 19, 2025

glibc vs musl? armeb, thumb, thumbeb?

Contributor Author

I haven't investigated the failure cases too closely yet. I'm just trying to get the test to compile and not blow up on every architecture. And I'm trying to avoid watering the test down too much on the platforms where it works reliably. That said, the failure I see for this one is when statically linking musl to the test case.

I haven't been building the other ARM variants, so I'll try and mix those in too.
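
For illustration only: if the other 32-bit ARM variants turn out to behave the same way (untested, per the reply above), the expectation could be written as a switch over the architecture tag rather than a single comparison. `isArm32` is a hypothetical helper; `builtin.cpu.arch` and `builtin.link_libc` are the values that `native_arch` and `link_libc` in the test presumably alias.

```zig
const std = @import("std");
const builtin = @import("builtin");

/// Hypothetical helper: true for every 32-bit ARM flavor, not just `.arm`.
fn isArm32(arch: std.Target.Cpu.Arch) bool {
    return switch (arch) {
        .arm, .armeb, .thumb, .thumbeb => true,
        else => false,
    };
}

// Evaluated at compile time for the target being built.
// Assumption: all four variants loop above main() when libc is linked.
const expect_signal_frame_overflow = isArm32(builtin.cpu.arch) and builtin.link_libc;

test "big-endian and Thumb variants are covered" {
    try std.testing.expect(isArm32(.armeb));
    try std.testing.expect(isArm32(.thumbeb));
    try std.testing.expect(!isArm32(.x86_64));
}
```

A switch keeps the list of affected targets in one place, which makes it easier to narrow again once each variant has actually been tested.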

Comment on lines 39 to 44
native_arch == .mips or
native_arch == .mipsel or
native_arch == .mips64 or
native_arch == .mips64el or
native_arch == .powerpc64 or
native_arch == .powerpc64le;
Member

What's wrong with these?

Contributor Author

Generally they don't seem to generate traces at all (either through dumpCurrentStackTrace or with StackIterator). Zig doesn't have a ucontext_t on MIPS. I'm not sure what's up with the PowerPC ones.

On a related note, Zig CI failed on aarch64-linux (no libc) because the stack trace gets stuck in a loop above main (see https://github.com/ziglang/zig/actions/runs/15104670128/job/42451228605?pr=23927). So I've added .aarch64 to this list of ignorable failures for now. It's flaky, though. The test sometimes builds traces without looping for me locally, and sometimes not. (From what I can tell, a specific build of the test is deterministic, but across multiple builds it's not.)

rootbeer added 2 commits May 19, 2025 14:36
This test creates three nested stack frames and then tests stack trace
creation.  Add some additional tests of stack traces by invoking
"dumpCurrentStackTrace()" and by using a signal handler's "context"
parameter to feed backtrace construction.

Make the test case at least runnable on a wide variety of systems
(including Windows and WASI).  Because `ucontext_t` and `getcontext` are
not evenly supported everywhere, some systems are expected to get
through only parts of the test.