
pageserver: improve L0 compaction performance #10694

Closed · 12 of 14 tasks · Tracked by #10160

erikgrinaker opened this issue Feb 6, 2025 · 2 comments

Labels: a/performance (Area: relates to performance of the system), c/storage/pageserver (Component: storage: pageserver)
erikgrinaker commented Feb 6, 2025

L0 compaction is currently struggling to keep up with ingest workloads, causing high read amplification. This blocks removing L0 flush upload backpressure (`l0_flush_wait_upload`), and enabling L0 compaction backpressure (#5415) and parallel S3 uploads (#10096).

erikgrinaker added the a/performance and c/storage/pageserver labels on Feb 6, 2025
github-merge-queue bot pushed a commit that referenced this issue Feb 7, 2025
## Problem

L0 compaction can get starved by other background tasks. It needs to be
responsive to avoid read amp blowing up during heavy write workloads.

Touches #10694.

## Summary of changes

Add a separate semaphore for compaction, configurable via
`use_compaction_semaphore` (disabled by default). This is primarily for
testing in staging; it needs further work (in particular to split
image/L0 compaction jobs) before it can be enabled.
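
A minimal sketch of the dedicated-semaphore idea, assuming `tokio::sync::Semaphore`; the helper names, permit counts, and the `use_compaction_semaphore` wiring shown here are illustrative, not the pageserver's actual API.

```rust
use std::sync::OnceLock;
use tokio::sync::{Semaphore, SemaphorePermit};

// Hypothetical permit count; in the pageserver it would be derived from
// the background-task concurrency configuration.
const COMPACTION_PERMITS: usize = 4;

fn compaction_semaphore() -> &'static Semaphore {
    static SEM: OnceLock<Semaphore> = OnceLock::new();
    SEM.get_or_init(|| Semaphore::new(COMPACTION_PERMITS))
}

fn background_semaphore() -> &'static Semaphore {
    static SEM: OnceLock<Semaphore> = OnceLock::new();
    SEM.get_or_init(|| Semaphore::new(COMPACTION_PERMITS))
}

/// Hold the returned permit for the duration of a compaction iteration.
/// With the flag enabled, compaction no longer competes with every other
/// background loop for the shared background permits.
async fn acquire_compaction_permit(
    use_compaction_semaphore: bool,
) -> SemaphorePermit<'static> {
    let semaphore = if use_compaction_semaphore {
        compaction_semaphore()
    } else {
        background_semaphore()
    };
    semaphore.acquire().await.expect("semaphore closed")
}
```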
github-merge-queue bot pushed a commit that referenced this issue Feb 10, 2025
## Problem

The compaction loop currently runs periodically, so by default it can
wait up to 20 seconds before starting L0 compaction.

Also, when we later separate the semaphores for L0 compaction and image
compaction, we want to give up waiting for the image compaction
semaphore if L0 compaction is needed on any timeline.

Touches #10694.

## Summary of changes

Notify the compaction loop when an L0 flush (on any timeline) exceeds
`compaction_threshold`.

Also do some opportunistic cleanups in the area.
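
A rough sketch of the wake-up mechanism using `tokio::sync::Notify`; the threshold constant, function names, and the 20-second period are stand-ins for the real tenant configuration (`compaction_threshold` and the compaction period).

```rust
use std::sync::Arc;
use std::time::Duration;
use tokio::sync::Notify;

// Hypothetical threshold; the real value comes from the tenant's
// `compaction_threshold` setting.
const COMPACTION_THRESHOLD: usize = 10;

/// Called from the flush path after an L0 layer is flushed on any timeline.
fn maybe_wake_compaction(l0_count: usize, wake: &Notify) {
    if l0_count >= COMPACTION_THRESHOLD {
        // Wake the compaction loop immediately instead of letting it
        // sleep out the rest of its period.
        wake.notify_one();
    }
}

async fn compaction_loop(wake: Arc<Notify>) {
    loop {
        // Wait for either the regular period or an explicit wake-up,
        // whichever comes first.
        tokio::select! {
            _ = tokio::time::sleep(Duration::from_secs(20)) => {}
            _ = wake.notified() => {}
        }
        // ... run a compaction iteration ...
    }
}
```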
github-merge-queue bot pushed a commit that referenced this issue Feb 11, 2025
## Problem

Image compaction can starve out L0 compaction if a tenant has several
timelines with L0 debt.

Touches #10694.
Requires #10740.

## Summary of changes

* Add an initial L0 compaction pass, in order of L0 count.
* Add a tenant option `compaction_l0_first` to control the L0 pass
(disabled by default).
* Add `CompactFlags::OnlyL0Compaction` to run an L0-only compaction
pass.
* Clean up the compaction iteration logic.

A later PR will use separate semaphores for the L0 and image compaction
passes to avoid cross-tenant L0 starvation. That PR will also make image
compaction yield if _any_ of the tenant's timelines have pending L0
compaction to further avoid starvation.
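
A small sketch of how the initial L0 pass could order timelines, assuming "in order of L0 count" means most-indebted first; the `Timeline` struct and function name here are hypothetical.

```rust
// Hypothetical timeline handle; the real type carries much more state.
struct Timeline {
    name: String,
    l0_count: usize,
}

/// Order timelines for the initial L0-only pass: the timeline with the
/// most L0 layers (worst read amplification) is compacted first, and
/// timelines with no L0 debt are skipped entirely.
fn l0_pass_order(mut timelines: Vec<Timeline>) -> Vec<Timeline> {
    timelines.retain(|t| t.l0_count > 0);
    timelines.sort_by(|a, b| b.l0_count.cmp(&a.l0_count));
    timelines
}
```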
github-merge-queue bot pushed a commit that referenced this issue Feb 11, 2025
## Problem

When image compaction yields for L0 compaction, it may not immediately
schedule L0 compaction, because it just goes on to compact the next
pending timeline.

Touches #10694.
Requires #10744.

## Summary of changes

Extend `CompactionOutcome` with `YieldForL0` and `Skipped` variants, and
immediately schedule an L0 compaction pass in the `YieldForL0` case.
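
A sketch of the extended outcome handling; the `CompactionOutcome` variant names come from this PR, while `schedule_l0_pass` and the surrounding control flow are illustrative.

```rust
/// Sketch of the extended outcome type; variant names follow the PR,
/// everything around them is illustrative.
enum CompactionOutcome {
    Done,
    Pending,
    /// Compaction yielded because some timeline needs L0 compaction.
    YieldForL0,
    /// The timeline was skipped (e.g. it is shutting down).
    Skipped,
}

// Hypothetical scheduler hook.
fn schedule_l0_pass() {
    // ... enqueue an L0-only compaction pass ...
}

fn handle_outcome(outcome: CompactionOutcome) {
    match outcome {
        // Run the L0-only pass immediately; moving on to the next pending
        // timeline would defeat the purpose of yielding.
        CompactionOutcome::YieldForL0 => schedule_l0_pass(),
        CompactionOutcome::Done
        | CompactionOutcome::Pending
        | CompactionOutcome::Skipped => {}
    }
}
```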
github-merge-queue bot pushed a commit that referenced this issue Feb 12, 2025
## Problem

L0 compaction frequently gets starved out by other background tasks and
image/GC compaction. L0 compaction must be responsive to keep read
amplification under control.

Touches #10694.
Resolves #10689.

## Summary of changes

Use a separate semaphore for the L0-only compaction pass.

* Add a `CONCURRENT_L0_COMPACTION_TASKS` semaphore and
`BackgroundLoopKind::L0Compaction`.
* Add a setting `compaction_l0_semaphore` (default off via
`compaction_l0_first`).
* Use the L0 semaphore when doing an `OnlyL0Compaction` pass.
* Use the background semaphore when doing a regular compaction pass
(which includes an initial L0 pass).
* While waiting for the background semaphore, yield for L0 compaction if
triggered.
* Add `CompactFlags::NoYield` to disable L0 yielding, and set it for the
HTTP API route.
* Remove the old `use_compaction_semaphore` setting and
compaction-scoped semaphore.
* Remove the warning when waiting for a semaphore; it's noisy and we
have metrics.
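
A sketch of how these pieces could fit together, assuming tokio primitives; `acquire_for_pass`, the permit counts, and the `Notify`-based trigger are hypothetical stand-ins for the real `CONCURRENT_L0_COMPACTION_TASKS` semaphore and scheduling logic.

```rust
use std::sync::OnceLock;
use tokio::sync::{Notify, Semaphore, SemaphorePermit};

fn l0_semaphore() -> &'static Semaphore {
    // Hypothetical permit count standing in for CONCURRENT_L0_COMPACTION_TASKS.
    static SEM: OnceLock<Semaphore> = OnceLock::new();
    SEM.get_or_init(|| Semaphore::new(4))
}

fn background_semaphore() -> &'static Semaphore {
    static SEM: OnceLock<Semaphore> = OnceLock::new();
    SEM.get_or_init(|| Semaphore::new(4))
}

enum Acquired {
    Permit(SemaphorePermit<'static>),
    /// Abandon the wait so an L0-only pass can run first.
    YieldForL0,
}

async fn acquire_for_pass(l0_only: bool, no_yield: bool, l0_wake: &Notify) -> Acquired {
    if l0_only {
        // The L0-only pass has its own semaphore, so image/GC compaction
        // elsewhere cannot starve it.
        let permit = l0_semaphore().acquire().await.expect("semaphore closed");
        return Acquired::Permit(permit);
    }
    if no_yield {
        // e.g. compaction triggered via the HTTP API.
        let permit = background_semaphore().acquire().await.expect("semaphore closed");
        return Acquired::Permit(permit);
    }
    // While queued for the shared background semaphore, give up and yield
    // if L0 compaction is triggered in the meantime.
    tokio::select! {
        permit = background_semaphore().acquire() => {
            Acquired::Permit(permit.expect("semaphore closed"))
        }
        _ = l0_wake.notified() => Acquired::YieldForL0,
    }
}
```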
erikgrinaker commented Feb 12, 2025

We've implemented all the planned improvements, but haven't enabled them by default yet. I'll keep this open to verify in staging and roll out to production.

erikgrinaker commented
The planned work here is mostly complete; the production rollout is tracked in https://github.com/neondatabase/cloud/issues/24664.
