Commit 729d00d

jameysharp authored and sunfishcode committed
wasmtime: Check stack limits only on exit from wasm
Currently, we do an explicit check for stack overflow on entry to every WebAssembly function. But that costs some time, and is a significant performance hit for very short functions.

This commit instead switches Wasmtime to relying on guard pages at the end of the stack to catch stack overflow, so the MMU does this check for "free". This means we may allow deeper recursion in guest code than we did before.

To make this work, we need Wasmtime's signal handlers to recognize when a guest memory fault is in a stack guard page and report the appropriate stack-overflow trap code.

Note that we can't turn host-code signals into guest traps, so the signal handlers have to verify that the signal occurred in guest code.

When the guest calls host code (explicitly due to calling an imported host function, or implicitly due to a libcall inserted by Wasmtime or Cranelift), we also need to ensure that there is enough stack space available for the host code to not hit the guard pages. We do that by checking the stack limit that the embedder provided in the trampolines where we exit wasm.
1 parent b81d20d commit 729d00d

4 files changed: +42 -58 lines changed


crates/cranelift/src/compiler.rs

Lines changed: 29 additions & 54 deletions
@@ -209,60 +209,6 @@ impl wasmtime_environ::Compiler for Compiler {
 
         let mut func_env = FuncEnvironment::new(self, translation, types, wasm_func_ty);
 
-        // The `stack_limit` global value below is the implementation of stack
-        // overflow checks in Wasmtime.
-        //
-        // The Wasm spec defines that stack overflows will raise a trap, and
-        // there's also an added constraint where as an embedder you frequently
-        // are running host-provided code called from wasm. WebAssembly and
-        // native code currently share the same call stack, so Wasmtime needs to
-        // make sure that host-provided code will have enough call-stack
-        // available to it.
-        //
-        // The way that stack overflow is handled here is by adding a prologue
-        // check to all functions for how much native stack is remaining. The
-        // `VMContext` pointer is the first argument to all functions, and the
-        // first field of this structure is `*const VMRuntimeLimits` and the
-        // first field of that is the stack limit. Note that the stack limit in
-        // this case means "if the stack pointer goes below this, trap". Each
-        // function which consumes stack space or isn't a leaf function starts
-        // off by loading the stack limit, checking it against the stack
-        // pointer, and optionally traps.
-        //
-        // This manual check allows the embedder to give wasm a relatively
-        // precise amount of stack allocation. Using this scheme we reserve a
-        // chunk of stack for wasm code relative from where wasm code was
-        // called. This ensures that native code called by wasm should have
-        // native stack space to run, and the numbers of stack spaces here
-        // should all be configurable for various embeddings.
-        //
-        // Note that this check is independent of each thread's stack guard page
-        // here. If the stack guard page is reached that's still considered an
-        // abort for the whole program since the runtime limits configured by
-        // the embedder should cause wasm to trap before it reaches that
-        // (ensuring the host has enough space as well for its functionality).
-        if !isa.triple().is_pulley() {
-            let vmctx = context
-                .func
-                .create_global_value(ir::GlobalValueData::VMContext);
-            let interrupts_ptr = context.func.create_global_value(ir::GlobalValueData::Load {
-                base: vmctx,
-                offset: i32::from(func_env.offsets.ptr.vmctx_runtime_limits()).into(),
-                global_type: isa.pointer_type(),
-                flags: MemFlags::trusted().with_readonly(),
-            });
-            let stack_limit = context.func.create_global_value(ir::GlobalValueData::Load {
-                base: interrupts_ptr,
-                offset: i32::from(func_env.offsets.ptr.vmruntime_limits_stack_limit()).into(),
-                global_type: isa.pointer_type(),
-                flags: MemFlags::trusted(),
-            });
-            if self.tunables.signals_based_traps {
-                context.func.stack_limit = Some(stack_limit);
-            } else {
-                func_env.stack_limit_at_function_entry = Some(stack_limit);
-            }
-        }
         let FunctionBodyData { validator, body } = input;
         let mut validator =
             validator.into_validator(mem::take(&mut compiler.cx.validator_allocations));
@@ -1162,6 +1108,35 @@ fn save_last_wasm_exit_fp_and_pc(
     ptr: &impl PtrSize,
     limits: Value,
 ) {
+    // The Wasm spec defines that stack overflows will raise a trap, and
+    // there's also an added constraint where as an embedder you frequently are
+    // running host-provided code called from wasm. WebAssembly and native code
+    // currently share the same call stack, so Wasmtime needs to make sure that
+    // host-provided code will have enough call-stack available to it.
+    //
+    // The first field of `VMRuntimeLimits` is the stack limit. If the stack
+    // pointer is below this limit when we're about to call out of guest code,
+    // trap. But we don't check this limit as long as we stay within guest or
+    // trampoline code. Instead, we rely on the guest hitting a guard page,
+    // which the OS will tell our signal handler about. The following explicit
+    // check on guest exit ensures that native code called by wasm should have
+    // enough stack space to run without hitting a guard page.
+    let trampoline_sp = builder.ins().get_stack_pointer(pointer_type);
+    let stack_limit = builder.ins().load(
+        pointer_type,
+        MemFlags::trusted(),
+        limits,
+        ptr.vmruntime_limits_stack_limit(),
+    );
+    let is_overflow = builder.ins().icmp(
+        ir::condcodes::IntCC::UnsignedLessThan,
+        trampoline_sp,
+        stack_limit,
+    );
+    builder
+        .ins()
+        .trapnz(is_overflow, ir::TrapCode::StackOverflow);
+
     // Save the exit Wasm FP to the limits. We dereference the current FP to get
     // the previous FP because the current FP is the trampoline's FP, and we
     // want the Wasm function's FP, which is the caller of this trampoline.
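
The check added to the exit trampoline is an unsigned comparison of the stack pointer against the embedder's limit, followed by a conditional trap. As a rough illustration only, the same decision can be modelled in plain Rust. The names `ExitCheck` and `check_on_wasm_exit` are hypothetical and do not appear in Wasmtime; the real code emits Cranelift IR (`icmp` with `UnsignedLessThan`, then `trapnz`) rather than calling a function like this.

```rust
/// Outcome of the check performed when guest code is about to call into
/// the host. Hypothetical type, used only for this sketch.
#[derive(Debug, PartialEq, Eq)]
pub enum ExitCheck {
    /// The stack pointer is at or above the limit, so the host call goes ahead.
    Proceed,
    /// The stack pointer is already below the embedder's limit.
    StackOverflow,
}

/// Plain-Rust model of the emitted `icmp UnsignedLessThan` + `trapnz` pair.
/// Both values are treated as unsigned addresses, as in the diff.
pub fn check_on_wasm_exit(trampoline_sp: usize, stack_limit: usize) -> ExitCheck {
    // The stack grows toward lower addresses, so "below the limit" means a
    // numerically smaller stack pointer.
    if trampoline_sp < stack_limit {
        ExitCheck::StackOverflow
    } else {
        ExitCheck::Proceed
    }
}
```

Under this model, a stack pointer exactly equal to the limit still proceeds, because the comparison is strict.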

crates/wasmtime/src/runtime/trap.rs

Lines changed: 4 additions & 2 deletions
@@ -104,8 +104,10 @@ pub(crate) fn from_runtime_box(
             // then simultaneously assert that it's within a known linear memory
             // and additionally translate it to a wasm-local address to be added
             // as context to the error.
-            if let Some(fault) = faulting_addr.and_then(|addr| store.wasm_fault(pc, addr)) {
-                err = err.context(fault);
+            if trap != Trap::StackOverflow {
+                if let Some(fault) = faulting_addr.and_then(|addr| store.wasm_fault(pc, addr)) {
+                    err = err.context(fault);
+                }
             }
             (err, Some(pc))
         }
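
The change in this file skips the linear-memory fault lookup when the trap is a stack overflow, presumably because a guard-page address is not inside any linear memory and the lookup asserts that it is. A minimal sketch of that control flow follows. The `Trap` enum here is a local stand-in, and `describe` with its `linear_memory` range parameter is a hypothetical helper, not Wasmtime's `from_runtime_box` or `wasm_fault`.

```rust
use std::ops::Range;

/// Local stand-in for the trap codes involved; not Wasmtime's own type.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Trap {
    StackOverflow,
    MemoryOutOfBounds,
}

/// Hypothetical helper: build an error message, attaching a linear-memory
/// offset as context only when the trap is not a stack overflow.
pub fn describe(trap: Trap, faulting_addr: Option<usize>, linear_memory: Range<usize>) -> String {
    let mut msg = format!("{trap:?}");
    // Stack-overflow faults land in a stack guard page, so translating the
    // address into a linear-memory offset would be meaningless.
    if trap != Trap::StackOverflow {
        let offset = faulting_addr
            .filter(|addr| linear_memory.contains(addr))
            .map(|addr| addr - linear_memory.start);
        if let Some(offset) = offset {
            msg.push_str(&format!(" (linear memory offset {offset:#x})"));
        }
    }
    msg
}
```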

crates/wasmtime/src/runtime/vm/traphandlers.rs

Lines changed: 8 additions & 1 deletion
@@ -671,11 +671,18 @@ impl CallThreadState {
             return TrapTest::NotWasm;
         };
 
+        let stack_overflow_trap = if let Some(faulting_address) = faulting_addr {
+            (regs.sp - 128 <= faulting_address && faulting_address <= regs.sp + 4096)
+                .then_some(wasmtime_environ::Trap::StackOverflow)
+        } else {
+            None
+        };
+
         // If the fault was at a location that was not marked as potentially
         // trapping, then that's a bug in Cranelift/Winch/etc. Don't try to
         // catch the trap and pretend this isn't wasm so the program likely
         // aborts.
-        let Some(trap) = code.lookup_trap_code(text_offset) else {
+        let Some(trap) = stack_overflow_trap.or_else(|| code.lookup_trap_code(text_offset)) else {
             return TrapTest::NotWasm;
         };
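
The signal handler treats a fault as a stack overflow when the faulting address falls in a window from 128 bytes below the stack pointer to 4096 bytes above it, and only otherwise falls back to the trap code recorded for the faulting instruction. The sketch below models that classification in standalone Rust. The constant names, `looks_like_stack_overflow`, `classify`, and the local `Trap` enum are all hypothetical, and the use of saturating arithmetic is an assumption made here to keep the sketch free of overflow panics; the diff itself uses plain `-` and `+`.

```rust
/// Local stand-in for the trap codes involved; not Wasmtime's own type.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Trap {
    StackOverflow,
    MemoryOutOfBounds,
}

/// Window bounds copied from the literals in the diff; the names are
/// invented for this sketch.
const BYTES_BELOW_SP: usize = 128;
const BYTES_ABOVE_SP: usize = 4096;

/// True when the faulting address lies within the window around `sp`.
/// Saturating arithmetic is an assumption of this sketch, not the commit.
pub fn looks_like_stack_overflow(sp: usize, faulting_addr: Option<usize>) -> bool {
    match faulting_addr {
        Some(addr) => {
            let lo = sp.saturating_sub(BYTES_BELOW_SP);
            let hi = sp.saturating_add(BYTES_ABOVE_SP);
            lo <= addr && addr <= hi
        }
        None => false,
    }
}

/// Prefer the stack-overflow classification, then fall back to whatever
/// trap code was recorded for the faulting instruction (if any).
pub fn classify(sp: usize, faulting_addr: Option<usize>, recorded: Option<Trap>) -> Option<Trap> {
    looks_like_stack_overflow(sp, faulting_addr)
        .then_some(Trap::StackOverflow)
        .or(recorded)
}
```

When neither the window test nor the recorded trap code matches, `classify` returns `None`, which corresponds to the `TrapTest::NotWasm` path in the diff.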
