-
Notifications
You must be signed in to change notification settings - Fork 180
perf: Relax atomic allocated ops in Table
and shrink page size
#766
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
@davidbarsky here is your reminder for this for your loom work |
✅ Deploy Preview for salsa-rs canceled.
|
CodSpeed Performance ReportMerging #766 will not alter performanceComparing Summary
|
eeb899a
to
0b2a614
Compare
0b2a614
to
45eaab3
Compare
I don't think shuttle actually models weak atomics. We could run Miri with |
Ah I see, a bummer. According to miri docs the default is already at a high, slow 64. Unsure if increasing it is worth it? Is my reasoning for relaxing the loads here correct either way? I think it is but I'd rather get a second opinion of course. |
I think that's the default if you pass the flag without a value. A plan |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure this is correct if something like slots_of
is used concurrently? I know that's only exposed under salsa_unstable
, but the code is currently written such that slots can be accessed across threads (the functions return a slice, not just the specific ID). If we wanted to relax the loads I think we should be clearer about only supporting point-access for previously synchronized slots.
The allocation lock already enforces a happens-before relationship for the initialized length
Hmm, is that a problem? The You mean the |
We do not use more than 10 bits for counts per page anyways, this shrinks the page overhead by 1 word
45eaab3
to
f6cdb3d
Compare
Table
Table
and shrink page size
|
fair enough |
These are already paired with a lock-access on write. We also never store more than 10 bits of entries per page, so we can swap the usize with a u16 to shrink the page overhead.