Reduce notebook memory footprint #21319

MichaReiser · 2025-11-07T16:26:20Z

Summary

Reduce the memory footprint of NotebookIndex from O(lines) to O(cells).

The NotebookIndex stores a mapping from rows in the concatenated notebook document
to the relative row numbers within each cell and from absolute line number to the corresponding cell.

Today this is implemented as a vector with one entry per absolute row number,
where each value is the index of the cell. This allows O(1) lookups, but it requires an entry for every single line.
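
For illustration only, the per-row layout described above could look roughly like the following. This is a minimal sketch, not the code in `crates/ruff_notebook/src/index.rs`: the `DenseIndex` type, its methods, and the use of 0-based rows are all assumptions made for the example.

```rust
/// Hypothetical dense index: one entry per row of the concatenated document.
/// Names and 0-based rows are assumptions for this sketch.
pub struct DenseIndex {
    /// `row_to_cell[r]` is the 0-based index of the cell containing absolute row `r`.
    row_to_cell: Vec<u32>,
}

impl DenseIndex {
    /// Builds the index from the number of lines in each cell.
    pub fn from_cell_line_counts(line_counts: &[u32]) -> Self {
        let mut row_to_cell = Vec::new();
        for (cell, &count) in line_counts.iter().enumerate() {
            // Every row of the cell gets its own entry, so memory grows with rows.
            row_to_cell.extend(std::iter::repeat(cell as u32).take(count as usize));
        }
        Self { row_to_cell }
    }

    /// O(1) lookup of the cell that contains an absolute row.
    pub fn cell_for_row(&self, row: usize) -> Option<u32> {
        self.row_to_cell.get(row).copied()
    }

    /// Number of stored entries (equal to the total number of rows).
    pub fn entry_count(&self) -> usize {
        self.row_to_cell.len()
    }
}
```

With cells of 3 and 2 lines, this sketch stores 5 entries, one per row.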

This PR rewrites the representation (using one Vec instead of two) to store only the
start line number of each cell and to use a binary search to find the cell a line number belongs to.
This makes lookups slightly slower (from O(1) to O(log(n)), where n is the number of cells) but reduces memory usage
from O(rows) to O(cells).
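
As a rough sketch of the idea (not the actual implementation in this PR), a per-cell index with binary search could look like this. The `SparseIndex` type and its methods are hypothetical, rows are 0-based here for simplicity, and every cell is assumed to contain at least one line.

```rust
/// Hypothetical per-cell index: one entry per cell instead of one per row.
/// Names and 0-based rows are assumptions for this sketch.
pub struct SparseIndex {
    /// Absolute row at which each cell starts, in ascending order.
    cell_starts: Vec<u32>,
}

impl SparseIndex {
    /// Builds the index from the number of lines in each cell.
    /// Assumes every cell has at least one line.
    pub fn from_cell_line_counts(line_counts: &[u32]) -> Self {
        let mut cell_starts = Vec::with_capacity(line_counts.len());
        let mut next_start = 0u32;
        for &count in line_counts {
            cell_starts.push(next_start);
            next_start += count;
        }
        Self { cell_starts }
    }

    /// Returns `(cell index, row relative to the cell start)` for an absolute row.
    /// Rows past the end of the last cell are attributed to the last cell,
    /// because this sketch does not store the total row count.
    pub fn locate(&self, row: u32) -> Option<(usize, u32)> {
        // Binary search: count the cells that start at or before `row`.
        let cells_at_or_before = self.cell_starts.partition_point(|&start| start <= row);
        // The containing cell is the last of those; `None` if the index is empty.
        let cell = cells_at_or_before.checked_sub(1)?;
        Some((cell, row - self.cell_starts[cell]))
    }

    /// Number of stored entries (equal to the number of cells).
    pub fn entry_count(&self) -> usize {
        self.cell_starts.len()
    }
}
```

With cells of 3 and 2 lines, this sketch stores 2 entries, and the relative row falls out of the same lookup by subtracting the cell's start row.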

I noticed this improvement when working on ty's notebook support but decided to extract it into its own PR to make reviewing easier.

Test Plan

Existing tests

github-actions · 2025-11-07T16:42:51Z

`ruff-ecosystem` results

Linter (stable)

✅ ecosystem check detected no linter changes.

Linter (preview)

✅ ecosystem check detected no linter changes.

dhruvmanila · 2025-11-11T09:34:58Z

crates/ruff_notebook/src/index.rs

-            .map(|(row, cell)| (OneIndexed::from_zero_indexed(row), *cell))
+    /// Returns an iterator over the starting rows of each cell (1-based).
+    ///
+    /// This yields one entry per Python cell (skipping over Makrdown cell).


Suggested change

/// This yields one entry per Python cell (skipping over Makrdown cell).

/// This yields one entry per Python cell (skipping over Markdown cell).

MichaReiser · Nov 11, 2025

Thanks. I pushed the fix to #21175

MichaReiser added the internal (An internal refactor or improvement) label Nov 7, 2025

Reduce memory footprint of notebooks

fcd67f2

MichaReiser force-pushed the micha/refactor-notebook-index branch from c7340bd to fcd67f2 Compare November 7, 2025 16:31

MichaReiser changed the title ~~Reduce memory footprint of notebooks~~ Reduce notebook memory footprint Nov 7, 2025

MichaReiser marked this pull request as ready for review November 7, 2025 16:31

MichaReiser requested review from carljm, dcreager, dhruvmanila and sharkdp as code owners November 7, 2025 16:31

MichaReiser removed request for carljm, dcreager and sharkdp November 7, 2025 16:31

ntBre approved these changes Nov 10, 2025

MichaReiser merged commit 36cce34 into main Nov 11, 2025
38 checks passed

MichaReiser deleted the micha/refactor-notebook-index branch November 11, 2025 09:43

dhruvmanila approved these changes Nov 11, 2025
