
Simple LRU garbage collection for interned values #839


Open · ibraheemdev wants to merge 6 commits into master from ibraheemdev:interned-gc

Conversation

@ibraheemdev (Contributor) commented Apr 30, 2025

Basic implementation of #597.

The garbage collection strategy used here is very simple.

  • Slots that have not been accessed in the last 3 active revisions are reused, and nothing else is collected. Active revisions are those in which interned values were actually read, which avoids counting empty revisions.
  • Slots are never swept proactively, though this would be simple to add as an option.
  • We also cannot collect low durability values, since that would require &mut db. This could be supported alongside the proactive sweeping option mentioned above.

This is by no means a perfect strategy, but we can always tune it as needed once the infrastructure is in place. We may also want to expose the N active revisions constant as a configuration option in the #[interned] macro (it seems somewhat related to durability).
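To make the reuse rule concrete, here is a minimal sketch of the eviction check, using hypothetical Slot and Revision types rather than salsa's actual internals (the exact off-by-one boundary in the PR may differ):

```rust
// Hypothetical stand-ins for illustration only; not salsa's real types.
#[derive(Clone, Copy)]
struct Revision(u64);

struct Slot {
    /// The last *active* revision in which this interned value was read.
    last_read_in: Revision,
}

/// Number of active revisions a slot may go unread before it becomes reusable.
const STALE_REVISIONS: u64 = 3;

impl Slot {
    /// A slot is eligible for reuse only if it has not been read in the
    /// last `STALE_REVISIONS` active revisions.
    fn is_stale(&self, current: Revision) -> bool {
        current.0.saturating_sub(self.last_read_in.0) >= STALE_REVISIONS
    }
}

fn main() {
    let slot = Slot { last_read_in: Revision(10) };
    assert!(!slot.is_stale(Revision(12))); // read recently: keep
    assert!(slot.is_stale(Revision(13)));  // unread for the last 3 active revisions: reusable
}
```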

The LRU list is sharded by hash, very similar to dashmap. There seems to be no performance difference from the previous implementation in multi-threaded ty benchmarks (though there was a significant regression without sharding). There is also no significant performance difference in single-threaded benchmarks thanks to #864, which lets us avoid additional read dependencies.
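As a rough illustration of the sharding idea (the shard count, locking, and per-shard representation here are stand-ins, not the PR's actual implementation), the value's hash picks a shard so that unrelated interning operations do not contend on a single lock:

```rust
use std::collections::hash_map::RandomState;
use std::hash::{BuildHasher, Hash};
use std::sync::Mutex;

/// Illustrative only: each shard guards its own LRU state, so threads that
/// touch values hashing to different shards do not block each other.
struct ShardedLru<T> {
    hasher: RandomState,
    shards: Vec<Mutex<Vec<T>>>, // stand-in for a per-shard intrusive LRU list
}

impl<T: Hash> ShardedLru<T> {
    fn new(shard_count: usize) -> Self {
        Self {
            hasher: RandomState::new(),
            shards: (0..shard_count).map(|_| Mutex::new(Vec::new())).collect(),
        }
    }

    /// Pick a shard from the value's hash; only that shard's lock is taken.
    fn shard_for(&self, value: &T) -> &Mutex<Vec<T>> {
        let hash = self.hasher.hash_one(value);
        &self.shards[(hash as usize) % self.shards.len()]
    }

    fn record_access(&self, value: T) {
        self.shard_for(&value).lock().unwrap().push(value);
    }
}

fn main() {
    let lru = ShardedLru::new(16);
    lru.record_access("hello");
    lru.record_access("world");
}
```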

My main concern is that this change adds 3 word-sized fields to interned values: an Id and two pointers for the intrusive LRU list. This seems unavoidable if we want garbage collection, but it is still something to note.
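For context, the per-value overhead described above looks roughly like the sketch below (illustrative only; a later comment in this thread notes that the extra Id field was subsequently removed, leaving just the two list links):

```rust
use std::ptr::NonNull;

/// Illustrative layout of the extra bookkeeping; the real PR uses salsa's
/// `Id` type and an intrusive doubly linked list.
struct LruLinks {
    prev: Option<NonNull<InternedSlot>>,
    next: Option<NonNull<InternedSlot>>,
}

struct InternedSlot {
    /// Stand-in for salsa's `Id` (one word).
    id: u64,
    /// Two more words: the intrusive LRU list links.
    links: LruLinks,
    // ...the interned fields themselves follow...
}

fn main() {
    // On a 64-bit target, `Option<NonNull<T>>` is pointer-sized, so the
    // bookkeeping adds three words per interned value.
    assert_eq!(
        std::mem::size_of::<u64>() + std::mem::size_of::<LruLinks>(),
        3 * std::mem::size_of::<usize>()
    );
}
```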


codspeed-hq bot commented Apr 30, 2025

CodSpeed Performance Report

Merging #839 will improve performance by 4.01%

Comparing ibraheemdev:interned-gc (aa71466) with master (f9a9bb2)

Summary

⚡ 1 improvements
✅ 11 untouched benchmarks

Benchmarks breakdown

Benchmark            BASE     HEAD     Change
new[InternedInput]   5.4 µs   5.2 µs   +4.01%

@MichaReiser (Contributor)

> I'm seeing a lot of panics in red-knot, mostly "Data was not interned in the latest revision for its durability". I'm not sure exactly why this is happening, but I suspect it's related to durabilities.

Interesting. I suspect this is new, but I'm investigating a similar issue related to interned values in cycles.

@MichaReiser (Contributor) commented May 1, 2025

That makes me wonder if your PR might help to come up with simpler repros for the fixpoint issues that we're seeing. I should give this a try tomorrow.

On the perf side: you could try using the knot_benchmark script in the ruff repository.

@MichaReiser (Contributor) commented May 1, 2025

I quickly tried the branch in Red Knot but I also got new panics that we didn't see before. That makes me suspect that some revision tracking might be off in this PR?

What I like to do to debug these issues is to compare the tracing output and see where things diverge.

@ibraheemdev (Contributor, Author)

I think the issue is unrelated to durabilities and is instead about interned query arguments. If a query takes an interned struct as an argument, then an older memo may be incorrectly verified after the slot has been reused, because the IDs still match. I think we have to check that the memo and first_interned_at revisions match to avoid this, but I'm not exactly sure where that check would go.
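A hypothetical sketch of the check being proposed, with stand-in Memo and InternedSlot types (not salsa's actual API): a memo keyed on an interned id should only be trusted if it was verified after the slot's current occupant was interned.

```rust
// Hypothetical types illustrating the bug described above; not salsa's API.
#[derive(Clone, Copy, PartialEq, PartialOrd)]
struct Revision(u64);

struct Memo {
    /// Revision in which this memo was last created/verified.
    verified_at: Revision,
}

struct InternedSlot {
    /// Revision in which the *current* occupant of this slot was interned.
    first_interned_at: Revision,
}

/// If the slot was reused after the memo was verified, the id now refers to
/// a different value and the memo must be discarded rather than verified.
fn memo_is_for_current_occupant(memo: &Memo, slot: &InternedSlot) -> bool {
    memo.verified_at >= slot.first_interned_at
}

fn main() {
    let slot = InternedSlot { first_interned_at: Revision(7) };
    assert!(memo_is_for_current_occupant(&Memo { verified_at: Revision(9) }, &slot));
    assert!(!memo_is_for_current_occupant(&Memo { verified_at: Revision(5) }, &slot));
}
```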

@MichaReiser (Contributor)

I think I understand. What you're saying is that query(interned) may be checked for changes in a later revision where interned has been collected.

I think we will need something similar to what we have in tracked structs, where salsa informs other ingredients that the memo was collected so that they can free resources as well if necessary:

https://github.com/MichaReiser/salsa/blob/953dc34ce5cfd7fca6ad040241c3cbebc11830e7/src/tracked_struct.rs#L597-L655

The second part in tracked structs that protects against this is that we separately track created_at and updated_at. maybe_changed_after can then compare created_at to see if this struct was recreated since rev_before.
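A small sketch of that tracked-struct scheme, with hypothetical types (field names follow the comment above, not necessarily the real code):

```rust
// Hypothetical sketch of the created_at / updated_at scheme described above.
#[derive(Clone, Copy, PartialEq, PartialOrd)]
struct Revision(u64);

struct TrackedStructSlot {
    /// Revision in which the current occupant of this slot was (re)created.
    created_at: Revision,
    /// Revision in which its fields last changed.
    updated_at: Revision,
}

impl TrackedStructSlot {
    /// If the struct was recreated or updated after `rev_before`, a memo
    /// from `rev_before` or earlier cannot be shallow-verified against it.
    fn maybe_changed_after(&self, rev_before: Revision) -> bool {
        self.created_at > rev_before || self.updated_at > rev_before
    }
}

fn main() {
    let slot = TrackedStructSlot { created_at: Revision(4), updated_at: Revision(6) };
    assert!(slot.maybe_changed_after(Revision(5)));  // changed since revision 5
    assert!(!slot.maybe_changed_after(Revision(6))); // nothing changed since revision 6
}
```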

@ibraheemdev force-pushed the interned-gc branch 3 times, most recently from 073a988 to 7f89c28 on May 2, 2025 21:19
@ibraheemdev force-pushed the interned-gc branch 2 times, most recently from 3ac564d to a232751 on May 9, 2025 22:06
@ibraheemdev (Contributor, Author)

I updated the PR to add read dependencies and clear memos on reuse (I believe we will need to defer reclamation of the memos), but there are still some ty tests failing that I'm unsure about.

@ibraheemdev force-pushed the interned-gc branch 2 times, most recently from 39c219b to bcb023c on May 15, 2025 05:13
@ibraheemdev force-pushed the interned-gc branch 2 times, most recently from e366c0e to 5e693d9 on May 22, 2025 14:18
@ibraheemdev (Contributor, Author) commented May 22, 2025

Hmm, I'm not sure we can remove the read dependency on creation with generational IDs; we still need last_interned_at to be updated for any dependent queries. Maybe it would be cheaper to update that field directly when an interned value is read?

@MichaReiser (Contributor)

> Hmm, I'm not sure we can remove the read dependency on creation with generational IDs; we still need last_interned_at to be updated for any dependent queries. Maybe it would be cheaper to update that field directly when an interned value is read?

No, I don't think we can remove it from the writing query. We need the dependency to ensure the interned value gets re-created when it was claimed previously.
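For illustration, the "update last_interned_at directly on read" idea floated above could look something like this sketch, using a hypothetical slot type rather than salsa's internals:

```rust
use std::sync::atomic::{AtomicU64, Ordering};

/// Hypothetical sketch only; salsa's real revision bookkeeping is more involved.
struct InternedSlot {
    /// Last revision in which this slot was read or re-interned; the LRU
    /// uses it to decide whether the slot is stale enough to reuse.
    last_interned_at: AtomicU64,
}

impl InternedSlot {
    fn on_read(&self, current_revision: u64) {
        // Bump the slot's revision so it stays out of the reuse candidates
        // for the duration of this revision.
        self.last_interned_at.fetch_max(current_revision, Ordering::Relaxed);
    }
}

fn main() {
    let slot = InternedSlot { last_interned_at: AtomicU64::new(3) };
    slot.on_read(5);
    assert_eq!(slot.last_interned_at.load(Ordering::Relaxed), 5);
}
```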

@ibraheemdev force-pushed the interned-gc branch 3 times, most recently from 8fc9557 to 54b6bba on May 27, 2025 22:13
@ibraheemdev marked this pull request as ready for review on May 27, 2025 22:42
@ibraheemdev (Contributor, Author)

I updated the PR description with the new changes; this should be ready for review.

@MichaReiser (Contributor) commented May 28, 2025

> We also cannot collect low durability values, since that would require &mut db. This could be supported alongside the proactive sweeping option mentioned above.

I believe you mean high durability here? It would be helpful to add some details as to why.

> Slots that have not been accessed in the last 3 active revisions are reused, and nothing else is collected. Active revisions are those in which interned values were actually read, which avoids counting empty revisions.

I believe we originally discussed a GC where interned values would move from green, to yellow, to red generations. I assume what you implemented now isn't that. Can you give some context as to why?

> My main concern is that this change adds 3 word-sized fields to interned values: an Id and two pointers for the intrusive LRU list. This seems unavoidable if we want garbage collection, but it is still something to note.

Is this after removing the created_at field or in addition to it?

@MichaReiser (Contributor) left a comment

I left a few questions, but this overall looks great!

src/interned.rs Outdated
Comment on lines 330 to 332
if let Some((_, stamp)) = zalsa_local.active_query() {
    value_shared.durability = std::cmp::max(value_shared.durability, stamp.durability);
}
@MichaReiser (Contributor)

Given that we can't LRU interned values with a durability other than LOW, do you see a way to avoid doing the entire LRU caching for MEDIUM and HIGH durability interned values?

@ibraheemdev (Contributor, Author)

This is possible, but I was imagining that when we add a way to sweep higher durability values, we would still use the LRU to only look at the relevant stale slots. That said, traversing a linked list is not very efficient, so maybe the LRU should only store low durability values.

Comment on lines +375 to 378
if value_shared.durability != Durability::LOW {
    cursor.move_prev();
    continue;
}
@MichaReiser (Contributor)

Should we avoid recording medium or high durability values in the LRU? Or is there no risk that, after a while, all high durability values end up at the end of the LRU, requiring us to skip many values here?

src/interned.rs Outdated
let value_shared = unsafe { &mut *value.shared.get() };

// The slot was reused.
if value_shared.first_interned_at > revision {
@MichaReiser (Contributor)

Do we still need this, now that we have generational IDs?

@ibraheemdev (Contributor, Author)

I think we still need it to record the dependency on creation, which uses first_interned_at. Do you mean this if statement specifically?

@ibraheemdev (Contributor, Author)

I think you're right; we don't actually need this anymore.

@ibraheemdev (Contributor, Author)

> I believe we originally discussed a GC where interned values would move from green, to yellow, to red generations. I assume what you implemented now isn't that. Can you give some context as to why?

The reason is that it doesn't really make sense to have a specific capacity for the LRU. Instead, we care more about reusing values that are unlikely to be accessed again. Bounding the LRU means we might end up evicting values that were read in a recent revision. It also makes it a lot more expensive to grow the LRU and shuffle around the values based on their color, especially for a cold run.

The LRU implemented in this PR does have some notion of coloring based on revision: values that have already been read in the current revision are not moved in the LRU, which, as I understand it, is the main benefit of the coloring idea.
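A minimal sketch of that "don't move values already read in the current revision" behaviour, with a plain Vec standing in for the intrusive list (illustrative only):

```rust
// Hypothetical sketch of the optimization described above.
#[derive(Clone, Copy, PartialEq)]
struct Revision(u64);

struct Slot {
    last_read_in: Revision,
}

struct Lru {
    // Stand-in for the intrusive list: most recently used at the back.
    order: Vec<usize>,
}

impl Lru {
    /// Only pay for the list shuffle the first time a slot is read in a
    /// given revision; repeat reads within the same revision are free.
    fn record_read(&mut self, slot_index: usize, slot: &mut Slot, current: Revision) {
        if slot.last_read_in == current {
            return; // already touched this revision: leave its position alone
        }
        slot.last_read_in = current;
        self.order.retain(|&i| i != slot_index);
        self.order.push(slot_index);
    }
}

fn main() {
    let mut lru = Lru { order: vec![0, 1, 2] };
    let mut slot = Slot { last_read_in: Revision(1) };
    lru.record_read(0, &mut slot, Revision(2));
    assert_eq!(lru.order, vec![1, 2, 0]);
    // A second read in the same revision does not move anything.
    lru.record_read(0, &mut slot, Revision(2));
    assert_eq!(lru.order, vec![1, 2, 0]);
}
```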

> I believe you mean high durability here? It would be helpful to add some details as to why.

Yes, sorry. I believe the issue was that we need to make sure that maybe_changed_after (and the update to last_interned_at) is not short-circuited for higher durability values that are reused, which would require a synthetic write to invalidate the revision for the given durability.

@ibraheemdev (Contributor, Author)

I removed the first_interned_at field, so the memory usage impact is only two words per value for the LRU list.
