
feat(pageserver): use vectored_get in collect_keyspace #10546

Open · wants to merge 1 commit into base: main

Conversation

@skyzh skyzh (Member) commented Jan 28, 2025

Problem

This is the first patch for optimizing L0 compaction. As we know from the previous incident, the most time-consuming part of the L0 compaction code is the repartition step before the compaction actually starts. This patch is a very safe one that doesn't change any behavior other than using vectored get. We refactor the collect_keyspace code to use vectored get so that retrieving all the data is faster even when hundreds of L0 layers pile up.

Next patch: repartition should read from the boundary of L0 and L1 instead of the latest data to further accelerate this.

Summary of changes

Use vectored get in collect_keyspace.
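The shape of the change can be sketched outside the pageserver: instead of issuing one get() per key inside the loop, collect the keys first and resolve them all with a single vectored read. The Store type, u64 keys, and method bodies below are stand-ins for illustration, not the pageserver API:

```rust
use std::collections::HashMap;

// Hypothetical stand-in for the page store; in the real pageserver this
// role is played by Timeline's get / vectored-get read paths.
struct Store {
    pages: HashMap<u64, Vec<u8>>,
}

impl Store {
    // One lookup per key: N separate trips through the read path.
    fn get(&self, key: u64) -> Option<&Vec<u8>> {
        self.pages.get(&key)
    }

    // Vectored form: a single pass through the read path serves all keys,
    // which is what this PR switches collect_keyspace over to.
    fn get_vectored(&self, keys: &[u64]) -> Vec<(u64, Option<&Vec<u8>>)> {
        keys.iter().map(|&k| (k, self.pages.get(&k))).collect()
    }
}

fn main() {
    let mut pages = HashMap::new();
    pages.insert(1, vec![0xAA]);
    pages.insert(2, vec![0xBB]);
    let store = Store { pages };

    // Before: one get() per key inside the loop.
    let single: Vec<_> = [1u64, 2, 3].iter().map(|&k| store.get(k)).collect();

    // After: accumulate the keys first, then issue one vectored read.
    let batched = store.get_vectored(&[1, 2, 3]);

    assert_eq!(single.len(), batched.len());
    assert!(batched[2].1.is_none()); // key 3 is absent on both paths
    println!("batched {} keys in one read", batched.len());
}
```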

Signed-off-by: Alex Chi Z <[email protected]>
@skyzh skyzh requested a review from erikgrinaker January 28, 2025 19:33
@skyzh skyzh requested a review from a team as a code owner January 28, 2025 19:33
@skyzh skyzh marked this pull request as draft January 28, 2025 19:33

7414 tests run: 7062 passed, 0 failed, 352 skipped (full report)


Flaky tests (7): Postgres 17, Postgres 14

Code coverage* (full report)

  • functions: 33.5% (8506 of 25423 functions)
  • lines: 49.2% (71544 of 145450 lines)

* collected from Rust tests only


The comment gets automatically updated with the latest test results
3fb6e25 at 2025-01-28T21:18:10.839Z :recycle:

@skyzh skyzh marked this pull request as ready for review January 28, 2025 22:21
@problame problame (Contributor) left a comment


The IoConcurrency is too-short-lived, better use IoConcurrency::sequential() for now.

Also, this PR only touches the innermost loops. That's probably decent bang for buck, but given the number of times we call this function in production, we could put in a liiitle more effort.

At a minimum, what about list_rels? Can we process all dbs in parallel?

I'm thinking a bigger rewrite of this function that makes it a pipeline could be a good compromise, like so: in all the places where we currently call get(), we'd instead submit the key into a utils::sync::spsc_fold that folds the KeySpaceRandomAccum.
There's one other task / concurrent future that reads from the spsc_fold and executes it using get_vectored.

Somehow we'd need to provide results for not-inner-most-loop reads. Maybe using oneshot channels registered in a HashMap or so?
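A rough std-only analogue of that pipeline shape (utils::sync::spsc_fold, KeySpaceRandomAccum, and get_vectored are Neon-internal, so this sketch substitutes std::sync::mpsc, a plain thread, and a HashMap; run_pipeline and MAX_BATCH are hypothetical names):

```rust
use std::collections::HashMap;
use std::sync::mpsc;
use std::thread;

const MAX_BATCH: usize = 4; // stand-in for a vectored-read batch limit

// Producer/consumer pipeline: callers submit keys into a channel instead of
// calling get() directly; one reader thread folds them into batches and
// serves each full batch with a single (simulated) vectored read.
fn run_pipeline(keys: Vec<u64>, store: HashMap<u64, u64>) -> Vec<u64> {
    let (key_tx, key_rx) = mpsc::channel::<u64>();

    let reader = thread::spawn(move || {
        let mut results = Vec::new();
        let mut batch: Vec<u64> = Vec::new();
        for key in key_rx {
            batch.push(key);
            if batch.len() == MAX_BATCH {
                // one "get_vectored" call per full batch
                results.extend(batch.drain(..).filter_map(|k| store.get(&k).copied()));
            }
        }
        // flush the final partial batch once the sender hangs up
        results.extend(batch.drain(..).filter_map(|k| store.get(&k).copied()));
        results
    });

    // The places that currently call get() would instead submit here.
    for key in keys {
        key_tx.send(key).unwrap();
    }
    drop(key_tx); // close the channel so the reader drains and exits

    reader.join().unwrap()
}

fn main() {
    let store: HashMap<u64, u64> = (0..10).map(|k| (k, k * 100)).collect();
    let values = run_pipeline((0..10).collect(), store);
    assert_eq!(values.len(), 10);
    println!("pipeline resolved {} keys", values.len());
}
```

As the review notes, reads outside the innermost loop would additionally need a way to get their individual result back, e.g. a per-key oneshot channel registered in a map.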

Comment on lines +1091 to +1097
// Skip the vectored-read max key check by using `get_vectored_impl`.
let io_concurrency = IoConcurrency::spawn_from_conf(
    self.conf,
    self.gate
        .enter()
        .map_err(|_| CollectKeySpaceError::Cancelled)?,
);
Contributor

I think IoConcurrency::sequential() mode is more appropriate here, for now.
This io_concurrency that you're creating here (= tokio task you'd be spawning here if concurrent IO is enabled) is too short-lived.

let mut buf = self.get(relsize_key, lsn, ctx).await?;
relsize_keys_to_collect.add_key(relsize_key);
}
// Skip the vectored-read max key check by using `get_vectored_impl`.
Contributor

We have that check for a reason: to limit memory consumption.

Contributor

+1. Consider adding a Timeline helper that takes an IntoIterator<Key>, chunks it according to MAX_GET_VECTORED_KEYS, and returns a key/value iterator.
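Such a helper might look roughly like the sketch below (chunk_keys is a hypothetical name, the constant's value is made up for illustration, and the real MAX_GET_VECTORED_KEYS lives in the pageserver):

```rust
// Stand-in value; the pageserver defines the real constant of this name.
const MAX_GET_VECTORED_KEYS: usize = 32;

// Split an arbitrary key iterator into chunks no larger than the
// vectored-read limit, so each chunk can be served by one get_vectored
// call while keeping per-call memory bounded.
fn chunk_keys<I: IntoIterator<Item = u64>>(keys: I) -> Vec<Vec<u64>> {
    let mut chunks: Vec<Vec<u64>> = Vec::new();
    for key in keys {
        // Start a new chunk when there is none yet or the last one is full.
        let need_new = chunks
            .last()
            .map_or(true, |c| c.len() >= MAX_GET_VECTORED_KEYS);
        if need_new {
            chunks.push(Vec::with_capacity(MAX_GET_VECTORED_KEYS));
        }
        chunks.last_mut().unwrap().push(key);
    }
    chunks
}

fn main() {
    let chunks = chunk_keys(0..70u64);
    assert_eq!(chunks.len(), 3); // 32 + 32 + 6
    assert!(chunks.iter().all(|c| c.len() <= MAX_GET_VECTORED_KEYS));
    println!("{} chunks", chunks.len());
}
```

The returned key/value iterator the reviewer suggests would then drive one vectored read per chunk and flatten the results.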

@erikgrinaker erikgrinaker (Contributor) left a comment

Change LGTM overall.

3 participants