Replace ingredient cache with faster ingredient map #921
Conversation
CodSpeed Performance Report: Merging #921 will degrade performance by 10.4%.
Hmm, looks like this may not be good enough... I guess we could keep the
This is great, and I prefer it over the other PR, not just because it's much faster but also because it doesn't require using raw-api.
I do think it makes sense to keep the secondary. A 10% regression for the most common case seems like a lot.
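For illustration, here is a minimal sketch of what keeping such a secondary, per-callsite cache in front of the shared map could look like. The packed nonce-plus-index representation, the `IngredientCache`/`IngredientIndex` names, and the sentinel value are all assumptions made for this sketch, not necessarily how salsa's existing cache works:

```rust
use std::sync::atomic::{AtomicU64, Ordering};

#[derive(Copy, Clone, Debug, PartialEq, Eq)]
struct IngredientIndex(u32);

/// Per-callsite cache: packs a database nonce and the ingredient index into
/// one atomic word so the hot path is a single relaxed load.
struct IngredientCache {
    packed: AtomicU64,
}

// Sentinel for "not yet cached"; a real implementation would reserve a nonce
// value so this cannot collide with a valid (nonce, index) pair.
const EMPTY: u64 = u64::MAX;

impl IngredientCache {
    const fn new() -> Self {
        Self { packed: AtomicU64::new(EMPTY) }
    }

    /// Fast path: hit only if the cached nonce matches the current database.
    fn get(&self, db_nonce: u32) -> Option<IngredientIndex> {
        let packed = self.packed.load(Ordering::Relaxed);
        if packed == EMPTY {
            return None;
        }
        let (nonce, index) = ((packed >> 32) as u32, packed as u32);
        (nonce == db_nonce).then_some(IngredientIndex(index))
    }

    /// Slow path: after falling back to the shared map, remember the result.
    fn set(&self, db_nonce: u32, index: IngredientIndex) {
        let packed = ((db_nonce as u64) << 32) | index.0 as u64;
        self.packed.store(packed, Ordering::Relaxed);
    }
}

fn main() {
    let cache = IngredientCache::new();
    assert_eq!(cache.get(7), None); // cold: would fall back to the shared map
    cache.set(7, IngredientIndex(3)); // refill after the slow path
    assert_eq!(cache.get(7), Some(IngredientIndex(3))); // hot path for database 7
    assert_eq!(cache.get(8), None); // different database: miss again
}
```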
Main

single
ty_walltime     fastest  │ slowest  │ median   │ mean     │ samples │ iters
╰─ small                 │          │          │          │         │
   ╰─ pydantic  296.6 ms │ 305.5 ms │ 299.5 ms │ 300.5 ms │ 3       │ 3

multi
ty_walltime     fastest  │ slowest  │ median   │ mean     │ samples │ iters
╰─ small                 │          │          │          │         │
   ╰─ pydantic  69.42 ms │ 612.2 ms │ 605.8 ms │ 429.1 ms │ 3       │ 3

This PR

single
ty_walltime     fastest  │ slowest  │ median   │ mean     │ samples │ iters
╰─ small                 │          │          │          │         │
   ╰─ pydantic  303.8 ms │ 322.4 ms │ 305.9 ms │ 310.7 ms │ 3       │ 3

multi
ty_walltime     fastest  │ slowest  │ median   │ mean     │ samples │ iters
╰─ small                 │          │          │          │         │
   ╰─ pydantic  55.83 ms │ 79.95 ms │ 57.73 ms │ 64.5 ms  │ 3       │ 3
Overall: the multi-threading regression is now about the same (~10%) as the single-threaded regression when using multiple databases.
Hmm, it does seem that shuttle got stuck somewhere...
The other PR doesn't actually need the raw-api; that just made it easier to write the shuttle shim. It looks like we'll need a shuttle shim for this one as well. I've now realized the shim can just return a copy of the value instead of having to mimic the guard API.
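To make the "return a copy instead of a guard" idea concrete, here is a hedged sketch of what such a shim could look like. The `shuttle` Cargo feature name, the `IngredientMap` type, and its methods are assumptions for illustration, not salsa's actual wiring; the point is that because the index is `Copy`, both variants can expose the same by-value `get` signature:

```rust
use std::any::TypeId;

#[derive(Copy, Clone, Debug, PartialEq, Eq)]
struct IngredientIndex(u32);

// Normal build: lock-free papaya map; returning a copy keeps papaya's pin
// guard out of the public signature.
#[cfg(not(feature = "shuttle"))]
struct IngredientMap {
    map: papaya::HashMap<TypeId, IngredientIndex>,
}

#[cfg(not(feature = "shuttle"))]
impl IngredientMap {
    fn new() -> Self {
        Self { map: papaya::HashMap::new() }
    }

    fn get(&self, type_id: TypeId) -> Option<IngredientIndex> {
        self.map.pin().get(&type_id).copied()
    }
}

// Shuttle build: a plain locked map so the model checker can explore
// interleavings; correctness matters here, not speed.
#[cfg(feature = "shuttle")]
struct IngredientMap {
    map: shuttle::sync::Mutex<std::collections::HashMap<TypeId, IngredientIndex>>,
}

#[cfg(feature = "shuttle")]
impl IngredientMap {
    fn new() -> Self {
        Self { map: shuttle::sync::Mutex::new(std::collections::HashMap::new()) }
    }

    fn get(&self, type_id: TypeId) -> Option<IngredientIndex> {
        self.map.lock().unwrap().get(&type_id).copied()
    }
}
```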
It might also be worth adding specific benchmarks to compare running with a second database, because right now (for ty_walltime) we are assuming that the fastest run is the one with the first database. At least on my machine it wasn't clear that this was the case; the gap was a lot closer and ran into noise.
Did you change the sample count to 1?
An alternative to #919, this removes the ingredient cache entirely, closing the gap between single- and multi-database use cases. A `TypeId` is already a hash internally, so the ingredient lookup can become a very fast lookup in a lock-free map like papaya. The one annoying part is that we now have to run `zalsa_register_downcaster` every time a tracked function is called, because we don't have a good way of caching that call independently of the database. I'm interested in the benchmark results here.
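For reference, a minimal sketch of the lookup described above, assuming a hypothetical `IngredientMap` wrapper and `IngredientIndex` type (the names and the exact papaya usage are illustrative, not the PR's actual code):

```rust
use std::any::TypeId;

use papaya::HashMap;

#[derive(Copy, Clone, Debug, PartialEq, Eq)]
struct IngredientIndex(u32);

/// Process-wide, lock-free map from an ingredient's `TypeId` to its index,
/// replacing the per-callsite ingredient cache.
struct IngredientMap {
    map: HashMap<TypeId, IngredientIndex>,
}

impl IngredientMap {
    fn new() -> Self {
        Self { map: HashMap::new() }
    }

    /// Fast path: a single read of the lock-free map. `TypeId` is already a
    /// hash internally, so hashing the key is cheap.
    fn get(&self, type_id: TypeId) -> Option<IngredientIndex> {
        self.map.pin().get(&type_id).copied()
    }

    /// Slow path: record the ingredient the first time it is seen.
    fn insert(&self, type_id: TypeId, index: IngredientIndex) {
        self.map.pin().insert(type_id, index);
    }
}

fn main() {
    let ingredients = IngredientMap::new();
    ingredients.insert(TypeId::of::<String>(), IngredientIndex(0));
    assert_eq!(ingredients.get(TypeId::of::<String>()), Some(IngredientIndex(0)));
}
```

Since the `TypeId` is effectively a precomputed hash, each lookup is essentially one map probe, which is what closes the gap between the single- and multi-database cases.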