
Shard jar map #919

Open · ibraheemdev wants to merge 2 commits into master from ibraheem/ingredient-cache

Conversation

@ibraheemdev (Contributor) commented Jun 18, 2025

Replaces the jar map with a DashMap. This depends on ibraheemdev/boxcar#35, which allows us to claim multiple sequential ingredient indices without holding a global lock.

This should also help with #918, because the slow path for accessing an uncached ingredient now goes through a sharded DashMap instead of a single Mutex, but I don't think we have benchmarks that use multiple databases yet (I'll add some). The ideal fix for that issue might also be to extend the IngredientCache, but this should at least help. We could also look into using papaya here (or more extreme, ArcSwap?), because the jar map is almost purely read-heavy after the table is initially filled.
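
For illustration, here is a rough sketch of the sharded lookup described above; it is not the PR's actual code, and the types are hypothetical stand-ins for salsa's internals:

```rust
// Hedged sketch of the sharded slow path described above (hypothetical
// types, not salsa's actual code). Lookups for already-registered jars only
// lock the DashMap shard that owns the TypeId, whereas a single global
// Mutex<HashMap<...>> would serialize every database's slow path.
use std::any::TypeId;

use dashmap::DashMap;

#[derive(Clone, Copy)]
struct IngredientIndex(u32); // placeholder for salsa's ingredient index type

#[derive(Default)]
struct JarMap {
    map: DashMap<TypeId, IngredientIndex>,
}

impl JarMap {
    fn get_or_create(
        &self,
        jar: TypeId,
        create: impl FnOnce() -> IngredientIndex,
    ) -> IngredientIndex {
        // Common path: a read lock on one shard only.
        if let Some(index) = self.map.get(&jar) {
            return *index;
        }
        // Slow path: take the shard's write lock and insert if still absent.
        *self.map.entry(jar).or_insert_with(create)
    }
}
```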

netlify bot commented Jun 18, 2025

Deploy Preview for salsa-rs canceled.

🔨 Latest commit: 6ff8f84
🔍 Latest deploy log: https://app.netlify.com/projects/salsa-rs/deploys/6854d3a80f405200089cbccb

@ibraheemdev force-pushed the ibraheem/ingredient-cache branch 2 times, most recently from 2de993f to 5d6cafe on June 18, 2025 at 22:32

codspeed-hq bot commented Jun 18, 2025

CodSpeed Performance Report

Merging #919 will degrade performance by 4.37%

Comparing ibraheemdev:ibraheem/ingredient-cache (6ff8f84) with master (87a730f)

Summary

❌ 1 regressions
✅ 11 untouched benchmarks

⚠️ Please fix the performance issues or acknowledge them on CodSpeed.

Benchmarks breakdown

Benchmark           BASE    HEAD    Change
new[InternedInput]  5.3 µs  5.6 µs  -4.37%

@ibraheemdev (Contributor, Author)

The benchmarks are probably misleading here because they're single threaded.

@ibraheemdev force-pushed the ibraheem/ingredient-cache branch from 5d6cafe to dd31360 on June 18, 2025 at 23:38
@ibraheemdev force-pushed the ibraheem/ingredient-cache branch from dd31360 to b21f69f on June 18, 2025 at 23:38
@MichaReiser (Contributor) commented Jun 19, 2025

> This should also help with #918, because the slow path for accessing an uncached ingredient now goes through a sharded DashMap instead of a single Mutex, but I don't think we have benchmarks that use multiple databases yet (I'll add some). The ideal fix for that issue might also be to extend the IngredientCache, but this should at least help. We could also look into using papaya here (or more extreme, ArcSwap?), because the jar map is almost purely read-heavy after the table is initially filled.

Did you try changing the ty benchmark by removing the thread pool override in the setup here:

https://github.com/astral-sh/ruff/blob/e352a50b74329268589cdc18eafab123832559ac/crates/ruff_benchmark/benches/ty_walltime.rs#L240-L250

This should be a very good benchmark to show the impact of this change.
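
For context, this is the kind of override being referred to, as a hedged sketch only (illustrative names; the real setup lives in the linked ruff source and may differ):

```rust
// Hedged sketch of a typical thread-pool override in a benchmark setup
// (illustrative only; the actual ty benchmark differs, see the linked ruff
// source). Removing the `num_threads(1)` call lets the benchmark exercise
// the multi-threaded path this PR is aimed at.
fn build_benchmark_pool() -> rayon::ThreadPool {
    rayon::ThreadPoolBuilder::new()
        .num_threads(1) // the override: pin the benchmark to a single thread
        .build()
        .expect("failed to build rayon thread pool")
}
```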

Edit 1

I made the change real quick and here are the results:

main

single threaded

ty_walltime     fastest       │ slowest       │ median        │ mean          │ samples │ iters
╰─ small                      │               │               │               │         │
   ╰─ pydantic  304.6 ms      │ 327 ms        │ 309.6 ms      │ 313.7 ms      │ 3       │ 6

multi threaded

ty_walltime     fastest       │ slowest       │ median        │ mean          │ samples │ iters
╰─ small                      │               │               │               │         │
   ╰─ pydantic  342 ms        │ 592.5 ms      │ 584.8 ms      │ 506.4 ms      │ 3       │ 6

PR

single threaded

ty_walltime     fastest       │ slowest       │ median        │ mean          │ samples │ iters
╰─ small                      │               │               │               │         │
   ╰─ pydantic  298.9 ms      │ 315.7 ms      │ 303.1 ms      │ 305.9 ms      │ 3       │ 6

multi threaded

ty_walltime     fastest       │ slowest       │ median        │ mean          │ samples │ iters
╰─ small                      │               │               │               │         │
   ╰─ pydantic  140 ms        │ 181.8 ms      │ 176.1 ms      │ 166 ms        │ 3       │ 6

The first run is still significantly faster (which is expected), but it's at least no longer the case that multithreaded is slower than single-threaded! I'm not quite sure why I can't observe the 10x slowdown anymore on main.

Edit 2

Wait, it now makes sense. I need to reduce the sample size to 1, or divan takes the median from two runs.
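
A hedged sketch of what that looks like with divan's bench attribute (the benchmark body is a hypothetical placeholder; only the sample_size/sample_count parameters are the point here):

```rust
// Hedged sketch of pinning divan's sample size (the benchmark body is a
// hypothetical placeholder, not the actual ty benchmark). With
// `sample_size = 1`, every sample is a single run, so one fast warm run can
// no longer dominate the time reported for its sample.
#[divan::bench(sample_count = 3, sample_size = 1)]
fn pydantic(bencher: divan::Bencher) {
    bencher.bench(|| {
        // ... run the workload under test ...
    });
}

fn main() {
    divan::main();
}
```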

Main

single

ty_walltime     fastest       │ slowest       │ median        │ mean          │ samples │ iters
╰─ small                      │               │               │               │         │
   ╰─ pydantic  295.9 ms      │ 341.8 ms      │ 304.1 ms      │ 313.9 ms      │ 3       │ 3

multi

ty_walltime     fastest       │ slowest       │ median        │ mean          │ samples │ iters
╰─ small                      │               │               │               │         │
   ╰─ pydantic  92.33 ms      │ 582 ms        │ 581.3 ms      │ 418.5 ms      │ 3       │ 3

This PR

single

ty_walltime     fastest       │ slowest       │ median        │ mean          │ samples │ iters
╰─ small                      │               │               │               │         │
   ╰─ pydantic  325.7 ms      │ 342.1 ms      │ 326.9 ms      │ 331.6 ms      │ 3       │ 3

multi

ty_walltime     fastest       │ slowest       │ median        │ mean          │ samples │ iters
╰─ small                      │               │               │               │         │
   ╰─ pydantic  96.58 ms      │ 173.9 ms      │ 161.7 ms      │ 144.1 ms      │ 3       │ 3

So this is pretty good: it reduces the 5-6x slowdown to a 1.5-2x slowdown. I'm still not sure whether that's good enough for us to enable multithreading in the benchmarks, because it probably adds too much overhead, but this is a huge improvement in production when running with more than one db.

@MichaReiser (Contributor) left a comment

Excellent! It would be nice to reduce this even further (e.g., could we use a lock-free map that's optimized for reads?), but this is a great first step.
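
As a hedged illustration of that direction (not part of this PR), a read-optimized lock-free map such as papaya could back the jar map; the types below are hypothetical stand-ins, not salsa's actual definitions:

```rust
// Hedged sketch of the read-optimized follow-up suggested above, using
// papaya, a lock-free hash map whose reads never block (hypothetical types,
// not salsa's actual code).
use std::any::TypeId;

#[derive(Clone, Copy)]
struct IngredientIndex(u32); // placeholder for salsa's ingredient index type

struct JarMap {
    map: papaya::HashMap<TypeId, IngredientIndex>,
}

impl JarMap {
    fn new() -> Self {
        Self { map: papaya::HashMap::new() }
    }

    // Reads take a lightweight guard and never contend with other readers,
    // which fits a map that is almost purely read-heavy once filled.
    fn get(&self, jar: TypeId) -> Option<IngredientIndex> {
        self.map.pin().get(&jar).copied()
    }

    fn insert(&self, jar: TypeId, index: IngredientIndex) {
        self.map.pin().insert(jar, index);
    }
}
```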
