compute: page out correction v2 chunks via column_pager #36946

Merged: antiguru merged 1 commit into MaterializeInc:main from antiguru:correction-v2-pager on Jun 10, 2026

antiguru (Member) commented Jun 9, 2026

Motivation

CorrectionV2 (#36577) stores its chunks as columnar Columns but keeps them resident.
This wires those chunks through mz_timely_util::column_pager so cold chunks can spill out-of-core, the same seam the columnar merge batcher already uses.
Part of CLU-65 (pager) / swap-cooperative data shapes.

Description

Each Chunk is born paged (a PagedColumn from global_pager()) and materializes lazily through a OnceLock on first read, so cursors keep handing out borrows unchanged.
A chunk is released as the cursor advances past it, so only the chunks under an active merge front are resident — a merge streams an out-of-core working set instead of materializing it whole.
Cached first_time/last_time let the chain invariant and split_at_time route a resting chunk by its boundary times without paging it in.
When the global pager is disabled (the default), chunks stay resident and behavior is unchanged; benchmarking shows no measurable overhead when nothing spills (nospill = +0% in the study referenced under Verification).
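
A minimal sketch of the two ideas above, lazy one-way materialization behind a `OnceLock` and routing by cached boundary times. It is not the code in this PR: `PagedHandle` is a hypothetical stand-in for `PagedColumn`, the chunk payload is simplified to `(time, diff)` pairs, and the method names `is_resident`, `entirely_before`, and `entirely_at_or_after` are invented for illustration.

```rust
use std::sync::{Mutex, OnceLock};

/// Hypothetical stand-in for the pager's handle to spilled data.
/// The real `PagedColumn` lives in `mz_timely_util::column_pager`.
struct PagedHandle {
    spilled: Vec<(u64, i64)>, // (time, diff) pairs, pretending to be out-of-core
}

impl PagedHandle {
    /// Pretends to read the chunk back from the backend, consuming the handle.
    fn load(self) -> Vec<(u64, i64)> {
        self.spilled
    }
}

/// A chunk that is born paged and materializes on first read.
struct Chunk {
    paged: Mutex<Option<PagedHandle>>,
    resident: OnceLock<Vec<(u64, i64)>>,
    first_time: u64,
    last_time: u64,
}

impl Chunk {
    /// Builds a paged chunk from non-empty, time-sorted updates and caches
    /// its boundary times before the data is handed to the pager stand-in.
    fn new(updates: Vec<(u64, i64)>) -> Self {
        let first_time = updates.first().expect("non-empty chunk").0;
        let last_time = updates.last().expect("non-empty chunk").0;
        Chunk {
            paged: Mutex::new(Some(PagedHandle { spilled: updates })),
            resident: OnceLock::new(),
            first_time,
            last_time,
        }
    }

    /// One-way materialization: takes the handle and caches the data.
    /// Later calls return the cached data without touching the handle.
    fn column(&self) -> &[(u64, i64)] {
        self.resident.get_or_init(|| {
            let handle = self.paged.lock().unwrap().take().expect("still paged");
            handle.load()
        })
    }

    /// Reports whether the chunk has been materialized.
    fn is_resident(&self) -> bool {
        self.resident.get().is_some()
    }

    /// Routing by cached boundary times only; never pages the chunk in.
    fn entirely_before(&self, time: u64) -> bool {
        self.last_time < time
    }

    /// Routing by cached boundary times only; never pages the chunk in.
    fn entirely_at_or_after(&self, time: u64) -> bool {
        self.first_time <= time.max(self.first_time) && self.first_time >= time
    }
}
```

In this sketch a chunk that lies wholly on one side of a split time can be routed with `entirely_before` or `entirely_at_or_after` while `is_resident()` stays `false`; only a chunk that straddles the split time would need `column()`.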

Relationship to #36898

This is rebased on top of #36898 (bucketed CorrectionV2), now merged.
The two are structurally orthogonal — this works at the chunk level (chunk storage), #36898 at the chain/bucket level — and OnceLock+Mutex keep Chunk Sync, so #36898's slice-proportional read path (emitted.iter()) is untouched.

Verification

Covered by the existing in-module equivalence tests (V1 vs V2) plus added swap-backend round-trip tests; the full swap/file/lz4 benchmark study lives on correction-v2-pager-eval (doc/developer/design/20260609_pager_swap_findings.md).

🤖 Generated with Claude Code

PR MaterializeInc#36577 moved correction v2 chunk storage from columnation to columnar
(Column<(D, Timestamp, Diff)> with an Align(Vec<u64>) spill form) but did
not wire the pager. Route chunks through mz_timely_util::column_pager so
cold chunks spill out-of-core, sharing the process-wide TieredPolicy
budget and swap/file backend already configured in compute_state.

Chunks are born paged (RefCell<Option<PagedColumn>>) and materialize
lazily through a OnceCell<Column> on first read, so index() still hands
out Ref<'_> borrows. Cursors release their Rc<Chunk>s as they advance, so
only the chunks under active merge fronts stay resident -- a merge streams
an out-of-core working set rather than materializing it whole. ChainBuilder
re-pages every minted chunk, keeping resting chains paged; try_unwrap only
reuses never-materialized (still-paged) chunks. Cached len/first_time/
last_time keep bookkeeping and boundary time checks from paging chunks in.

updates_before stays a non-destructive read (its updates remain in the
buffer for persist feedback to cancel) but now collects the before-upper
prefix into an owned Vec and returns it: Chunk is !Sync, so a borrowing
iterator can no longer be Send across the writer's await.
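
A rough illustration of the `updates_before` change described in this commit message, under simplifying assumptions: `Buffer` is a hypothetical stand-in for the correction buffer, its `RefCell` field plays the role of the `!Sync` chunk storage, and updates are `(data, time, diff)` triples sorted by time. Returning an owned `Vec` keeps the read non-destructive while producing a value that is `Send`.

```rust
use std::cell::RefCell;

/// Hypothetical stand-in for a correction buffer. The `RefCell` makes it
/// `!Sync`, so `&Buffer` (and any iterator borrowing it) is not `Send`.
struct Buffer {
    updates: RefCell<Vec<(String, u64, i64)>>, // (data, time, diff), sorted by time
}

impl Buffer {
    fn new(updates: Vec<(String, u64, i64)>) -> Self {
        Buffer { updates: RefCell::new(updates) }
    }

    /// Number of updates still held in the buffer.
    fn len(&self) -> usize {
        self.updates.borrow().len()
    }

    /// Non-destructive read: clones the prefix with `time < upper` into an
    /// owned Vec. The buffer keeps its updates for later cancellation.
    fn updates_before(&self, upper: u64) -> Vec<(String, u64, i64)> {
        self.updates
            .borrow()
            .iter()
            .take_while(|(_, time, _)| *time < upper)
            .cloned()
            .collect()
    }
}

/// Compile-time check that a value may be held across an `.await` in a
/// future that must be `Send`.
fn assert_send<T: Send>(_: &T) {}
```

Calling `assert_send` on the returned `Vec` compiles, whereas calling it on a `&Buffer` would not. The cost is one clone of the before-upper prefix per call.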

When the global pager is disabled (the default) chunks stay resident and
behavior is unchanged.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
antiguru force-pushed the correction-v2-pager branch from f3da77d to a5556e6 on June 10, 2026 19:15
antiguru marked this pull request as ready for review on June 10, 2026 19:23
antiguru requested a review from a team as a code owner on June 10, 2026 19:23
antiguru requested a review from DAlperin on June 10, 2026 19:23

antiguru (Member, Author) commented

Reviewer note on Chunk::column() residency semantics — a known, bounded limitation.

column() is one-way: it take()s the pager handle (reclaiming the paged storage) and caches the Column in the OnceLock, which is never cleared.
So once a chunk is read it stays resident until the Chunk is dropped; it never re-pages.
It is also whole-chunk granularity: any single access (index, a find_time_greater_than probe, a can_accept tie-break) materializes the entire ~2 MiB chunk.

Why this is bounded rather than a leak:

  • Merge/cursor chunks are materialized at the front, iterated, then dropped as the cursor advances (Rc → 0), freeing memory.
  • A resting chain is either untouched — nothing calls column(), so it stays paged — or consumed by consolidate_before/split_at_time, which re-emit through ChainBuilder and re-page on mint.
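
The first bullet can be sketched as follows. This is an illustration only: `Chunk` and `Cursor` here are simplified hypothetical types, and `next_update` is an invented name. The cursor holds `Rc<Chunk>`s and pops the front one once it is exhausted, so the chunk's memory is freed as soon as no other owner remains.

```rust
use std::collections::VecDeque;
use std::rc::Rc;

/// Simplified chunk: already-materialized (time, diff) pairs.
struct Chunk {
    data: Vec<(u64, i64)>,
}

/// Hypothetical cursor over a chain of shared chunks.
struct Cursor {
    chunks: VecDeque<Rc<Chunk>>,
    offset: usize, // position within the front chunk
}

impl Cursor {
    fn new(chunks: Vec<Rc<Chunk>>) -> Self {
        Cursor { chunks: chunks.into(), offset: 0 }
    }

    /// Returns the next update. When the front chunk is exhausted, its `Rc`
    /// is dropped before moving on, releasing the chunk if this was the
    /// last reference.
    fn next_update(&mut self) -> Option<(u64, i64)> {
        loop {
            let front = self.chunks.front()?;
            if let Some(update) = front.data.get(self.offset).copied() {
                self.offset += 1;
                return Some(update);
            }
            // Front chunk fully consumed: release it and start the next one.
            self.chunks.pop_front();
            self.offset = 0;
        }
    }
}
```

In this sketch the exhausted chunk is released on the call that moves past it, so at most the chunk under the cursor front is kept alive by the cursor.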

Where it does cost resident memory:

  • The emitted chain is fully materialized by updates_before (emitted.iter()) and stays resident until the next consolidate_before rebuilds it — the just-written batch is resident for one cycle (it is hot, about to be cancelled by persist feedback).
  • can_accept tie-breaks (equal boundary times across a chunk boundary) leave one materialized chunk in the built chain.
  • Probe-only accesses over-pull (whole chunk for a few elements); the cached first_time/last_time already eliminate the whole-chunk routing probes, leaving only the straddling chunk in split_at_time and the find_time_greater_than binary search.

Subtle accounting gap: a chunk paged in via take() of a Paged/Compressed variant gets no ResidentTicket, so the policy budget does not track it.
Total RSS is therefore policy-resident (the budget) plus the transiently-materialized set (untracked).

Empirically this stays bounded: the swap/file/lz4 study (on correction-v2-pager-eval) showed RSS flat at ~budget across 2M–16M ts on the file backend, and nospill = +0%.

Possible follow-ups if a profile ever shows it matters:

  • Use the pager's read_at range reads for find_time_greater_than / single-element access instead of take-ing the whole chunk (kills the probe over-pull; more syscalls on fully-iterated chunks, so probe-only).
  • Charge paged-in bytes back to the policy on take so the budget sees the materialized set and can evict under pressure.
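
One possible shape for the second follow-up, sketched with entirely hypothetical types: `Budget` and `Ticket` below are not the pager's `TieredPolicy` or `ResidentTicket`, and their methods are invented for illustration. The idea is that paging a chunk in charges its bytes to the budget and returns an RAII ticket, so the materialized set becomes visible to the policy and is credited back when the chunk is dropped.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::Arc;

/// Hypothetical resident-memory budget (not the real policy type).
struct Budget {
    limit: usize,
    used: AtomicUsize,
}

/// Hypothetical RAII ticket: holds `bytes` against the budget until dropped.
struct Ticket {
    budget: Arc<Budget>,
    bytes: usize,
}

impl Budget {
    fn new(limit: usize) -> Arc<Budget> {
        Arc::new(Budget { limit, used: AtomicUsize::new(0) })
    }

    /// Charges `bytes` unconditionally, so paged-in data is at least tracked
    /// even when it pushes usage over the limit.
    fn charge(budget: &Arc<Budget>, bytes: usize) -> Ticket {
        budget.used.fetch_add(bytes, Ordering::Relaxed);
        Ticket { budget: Arc::clone(budget), bytes }
    }

    fn used(&self) -> usize {
        self.used.load(Ordering::Relaxed)
    }

    /// Signal a policy could use to start evicting under pressure.
    fn over_limit(&self) -> bool {
        self.used() > self.limit
    }
}

impl Drop for Ticket {
    /// Credits the bytes back when the materialized chunk goes away.
    fn drop(&mut self) {
        self.budget.used.fetch_sub(self.bytes, Ordering::Relaxed);
    }
}
```

A chunk would store the `Ticket` next to its materialized column, so dropping the chunk returns the bytes automatically. Whether charging should block, fail, or only report when the limit is exceeded is a design choice this sketch leaves open.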

DAlperin (Member) left a comment

LGTM, thanks

antiguru merged commit c96d4ef into MaterializeInc:main on Jun 10, 2026
117 checks passed
antiguru deleted the correction-v2-pager branch on June 10, 2026 20:57