wasm2c: update memory/table operations to use u64 + harmonize checks #2506
  return _addcarry_u64(0, a, b, resptr);
}
#endif

#define RANGE_CHECK(mem, offset, len) \
While defaulting to the 64-bit RANGE_CHECK is fine for memcpy, tables, etc. (it's unlikely to affect performance there), my concern is that the 64-bit RANGE_CHECK will slow down accesses to 32-bit linear memories for bounds-checked wasm2c. Firefox uses the bounds-checked wasm2c for Wasm on 32-bit devices, so it is perf-sensitive to this.
I don't know if this is addressed in a future PR, but this particular PR would be a perf problem for the Firefox use case.
- If you believe the future PRs you are landing will give us the property "bounds checks on 32-bit memories are not slowed down", then I don't have any concerns. (I'd prefer landing this PR and the PR that fixes it in quick succession, though.) I'll look through the other PRs next to see if this is resolved by them.
- If you believe this is not addressed in future PRs, we may need to specialize the bounds checks added depending on the type of memory, which may need specializing i32_load etc. on the type of memory.
- An alternate approach would be to make the current PR about changing the RANGE_CHECK on the memory_fill-style operations only, but leaving the RANGE_CHECKs on memory ops as is, i.e., the check depends on SUPPORT_MEMORY64.
Edit: I see that this might possibly be addressed in the next PR. If yes, please disregard the concern
Thanks for these thoughtful (and well-taken) comments. I believe #2507 will nail this for you (by preserving the current RANGE_CHECK on 32-bit, default-page-size memories), so how about we wait to get alignment on both #2506 and #2507 and then land them at the same time?
I should say that even the current RANGE_CHECK uses 64-bit arithmetic:
#define RANGE_CHECK(mem, offset, len) \
if (UNLIKELY(offset + (uint64_t)len > mem->size)) \
TRAP(OOB);
... but the difference is that RANGE_CHECK64 does an explicit check for 64-bit overflow. I wish I had the benchmarking infrastructure to promise you it won't affect performance on 32-bit x86 but... safer to wait for #2507 which lets you keep the same code.
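To make the distinction concrete, here is a minimal sketch of the two styles of check. The function names are hypothetical and are not the macros from the PR; they return a bool instead of trapping so the behavior is easy to observe. The assumption illustrated is that once offset and len are both full 64-bit values, the sum itself can wrap, which a plain comparison does not catch.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: plain comparison in 64-bit arithmetic.
 * Safe when offset and len both originate from 32-bit values (their
 * sum cannot exceed 64 bits), but the sum can wrap around when both
 * are genuinely 64-bit. */
static bool in_bounds_plain(uint64_t offset, uint64_t len, uint64_t mem_size) {
  return offset + len <= mem_size;
}

/* Hypothetical helper: same comparison, plus an explicit wraparound
 * check on the addition. */
static bool in_bounds_overflow_checked(uint64_t offset, uint64_t len,
                                       uint64_t mem_size) {
  uint64_t end = offset + len;
  if (end < offset) {
    return false; /* the addition wrapped past UINT64_MAX */
  }
  return end <= mem_size;
}
```

With a 65536-byte memory, an offset of UINT64_MAX and a length of 2 wraps to an end of 1, so the plain version accepts the access while the overflow-checked version rejects it.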
Is the concern that the test suite may not run on small machines? One thought I had: it may be possible to run at least simple tests thanks to lazy memory allocation on Linux-like OSes. If tests are of the form such as below:
this should end up allocating only two physical pages for the heap. So apart from the large virtual memory footprint, the test should run fine even on small machines?
src/template/wasm2c.declarations.c
#elif defined(_MSC_VER)
static inline bool add_overflow(uint64_t a, uint64_t b, uint64_t* resptr) {
  return _addcarry_u64(0, a, b, resptr);
}
Maybe use #define here for consistency with above? Or use a static inline function above?
Done.
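For illustration, one way the two branches could share the same static inline shape is sketched below. This is an assumption about what a consistent version might look like, not the code that landed: the name add_overflow_u64 is hypothetical, the GCC/Clang branch relies on the __builtin_add_overflow builtin, and the fallback branch is a portable wraparound test standing in for the compiler-specific intrinsic.

```c
#include <stdbool.h>
#include <stdint.h>

#if defined(__GNUC__) || defined(__clang__)
/* GCC/Clang: the builtin stores the wrapped sum in *resptr and
 * returns true if the addition overflowed. */
static inline bool add_overflow_u64(uint64_t a, uint64_t b,
                                    uint64_t* resptr) {
  return __builtin_add_overflow(a, b, resptr);
}
#else
/* Portable fallback (an assumption for this sketch): unsigned
 * addition wraps, so overflow occurred iff the sum is less than
 * an operand. */
static inline bool add_overflow_u64(uint64_t a, uint64_t b,
                                    uint64_t* resptr) {
  *resptr = a + b;
  return *resptr < a;
}
#endif
```

Using a function in every branch gives each variant the same type checking on its arguments, which a macro wrapper would not.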
Yeah, and also that a test could be really slow to run (e.g. memory.fill with an "n" greater than UINT32_MAX).
The PR updates the bulk memory operations (memory.fill, memory.copy, table.fill, etc.) to support 64-bit addresses and counts, and standardizes on a 64-bit version of RANGE_CHECK everywhere.
Previously we were only taking u32's for these arguments, even with memory64 enabled. (I don't think the memory64 tests check the ability to use memory.copy or the other operations beyond the first 4 GiB of a memory -- I wonder if there would be a way to add this as an "intensive" test if people don't mind having to allocate >4 GiB to run the test.)
This is a stepping-stone to being able to mix software-bounds-checked i64 memories and "guard-page-checked" i32 memories in the same module (#2507) and supporting custom-page-sizes (#2508).
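As a rough sketch of what taking 64-bit addresses and counts means for a fill-style operation, the following shows a bounds-checked fill whose destination and length are both uint64_t. The struct and function names are hypothetical and do not match the wasm2c runtime; the function returns false where generated code would trap.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for a linear memory: a byte buffer plus its
 * current size in bytes. */
typedef struct {
  uint8_t* data;
  uint64_t size;
} sketch_memory;

/* Fill n bytes starting at dest with value. Both dest and n are
 * 64-bit, so the range check must also reject a wrapped sum.
 * Returns false where real generated code would trap. */
static bool sketch_memory_fill(sketch_memory* mem, uint64_t dest,
                               uint8_t value, uint64_t n) {
  uint64_t end = dest + n;
  if (end < dest || end > mem->size) {
    return false; /* out of bounds, or dest + n overflowed */
  }
  /* n <= mem->size here, so it fits in size_t for any memory that
   * is actually addressable on the host. */
  memset(mem->data + dest, value, (size_t)n);
  return true;
}
```

A zero-length fill at dest equal to the memory size passes the check, while any dest beyond the size fails even when n is zero.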