Skip to content

Conversation

@discord9
Copy link
Contributor

@discord9 discord9 commented Sep 17, 2025

I hereby agree to the terms of the GreptimeDB CLA.

Refer to a related PR or issue link (optional)

What's changed and what's your intention?

gc worker metasrv trigger, which trigger gc worker on datanode depending on file_removal_rate and sst_num

TODO: still need to change the way datanode update removed_files to reduce the number of delete operation it send to object store done, but maybe also need to persistent actually delete files in RegionEdit
TODO:

  • more unit tests&integration tests

PR Checklist

Please convert it to a draft if some of the following conditions are not met.

  • I have written the necessary rustdoc comments.
  • I have added the necessary unit tests and integration tests.
  • This PR requires documentation updates.
  • API changes are backward compatible.
  • Schema or data changes are backward compatible.

@github-actions github-actions bot added size/M docs-not-required This change does not impact docs. labels Sep 17, 2025
@discord9 discord9 force-pushed the gc_metasrv_trigger branch 3 times, most recently from 9f76fb4 to 6bfa276 Compare September 18, 2025 11:04
@github-actions github-actions bot added size/L and removed size/M labels Sep 18, 2025
@discord9 discord9 changed the title feat: gc worker metasrv trigger feat: gc worker metasrv scheduler Sep 19, 2025
@discord9 discord9 force-pushed the gc_metasrv_trigger branch 4 times, most recently from 4ac8dda to 3fa76b6 Compare October 15, 2025 03:09
@discord9 discord9 force-pushed the gc_metasrv_trigger branch 3 times, most recently from 92dc253 to ed00d04 Compare October 22, 2025 09:01
@github-actions github-actions bot added size/XL and removed size/L labels Oct 22, 2025
@discord9 discord9 force-pushed the gc_metasrv_trigger branch 3 times, most recently from 162f2ed to 6a80190 Compare October 29, 2025 03:54
@github-actions github-actions bot added size/L and removed size/XL labels Oct 29, 2025
@discord9 discord9 marked this pull request as ready for review October 30, 2025 12:25
Copy link
Collaborator

@MichaelScofield MichaelScofield left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Lack of integration test?

Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

This PR implements a comprehensive garbage collection (GC) system for managing file lifecycle in the mito2 storage engine. The changes introduce metasrv-coordinated GC scheduling, file removal rate tracking, and improved manifest management for tracking deleted files.

Key changes:

  • Added GC scheduler on metasrv to coordinate garbage collection across datanodes
  • Implemented file removal rate tracking to monitor GC pressure and prioritize regions
  • Refactored manifest management to use reference-based statistics and removed hardcoded file retention policies
  • Changed lingering_time from required to optional, allowing immediate file deletion when set to None

Reviewed Changes

Copilot reviewed 44 out of 45 changed files in this pull request and generated 3 comments.

Show a summary per file
File Description
src/store-api/src/region_engine.rs Added file_removal_rate field to RegionManifestInfo and derived Hash for RegionRole
src/mito2/src/region.rs Added file_removal_rate to ManifestStats and exposed it in region statistics
src/mito2/src/manifest/manager.rs Refactored to use ManifestStats reference instead of individual atomic fields
src/mito2/src/manifest/action.rs Added file removal rate calculation and removed TTL-based file eviction logic
src/mito2/src/gc.rs Refactored GC worker to accept region references and made lingering_time optional
src/mito2/src/sst/file_purger.rs Added gc_enabled parameter to determine purger type selection
src/meta-srv/src/gc/*.rs New GC scheduling infrastructure including scheduler, tracker, candidate selection, and mailbox communication
src/datanode/src/heartbeat/handler/gc_worker.rs Updated to handle GC instructions with improved validation
tests-integration/tests/http.rs Reduced default lingering times for faster testing
src/mito2/src/gc/worker_test.rs New integration tests for GC worker functionality

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment on lines 310 to 323
fn file_removed_cnt_after(&self, t_ms: i64) -> (u64, Option<i64>) {
let mut cnt = 0;
let mut min_ts_after: Option<i64> = None;
for record in &self.removed_files {
if record.removed_at >= t_ms {
cnt += record.file_ids.len();
}
min_ts_after = match min_ts_after {
Some(ts) => Some(ts.min(record.removed_at)),
None => Some(record.removed_at),
};
}
(cnt as u64, min_ts_after)
}
Copy link

Copilot AI Nov 5, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The min_ts_after calculation is incorrect. It computes the minimum timestamp across ALL removed files, not just those after t_ms. This should only track the minimum timestamp of files matching the condition removed_at >= t_ms. The current logic will return timestamps from before the threshold, leading to incorrect file removal rate calculations.

Copilot uses AI. Check for mistakes.
Comment on lines 308 to 309
/// Count the number of files removed after the given timestamp. Also return the minimum
/// timestamp of all removed files.
Copy link

Copilot AI Nov 5, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The documentation says 'minimum timestamp of all removed files' but based on the function name and first part of the comment, it should say 'minimum timestamp of removed files after the given timestamp' to match the intended behavior.

Copilot uses AI. Check for mistakes.
Comment on lines +129 to +131
// expect long running queries to be finished(or at least be able to notify it's using a deleted file) within a reasonable time
lingering_time: Some(Duration::from_secs(60)),
Copy link

Copilot AI Nov 5, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Corrected spelling of 'it's' to 'they're' for grammatical correctness. Should be 'they're using a deleted file' since 'queries' is plural.

Copilot uses AI. Check for mistakes.
file_removal_rate, ..
} => *file_removal_rate as f64 * self.config.file_removal_rate_weight,
// Metric engine doesn't have file_removal_rate, also this should be unreachable since metrics engine doesn't support gc
RegionManifestInfo::Metric { .. } => 0.0,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Better to panic?

}

impl GcScheduler {
/// Calculate GC priority score for a region based on various metrics.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's better to doc the alogorithm.

/// The mailbox to send messages.
pub(crate) mailbox: MailboxRef,
/// The server address.
pub(crate) server_addr: String,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Who's server address?

let dn_stats = self.meta_peer_client.get_all_dn_stat_kvs().await?;
let mut table_to_region_stats: HashMap<TableId, Vec<RegionStat>> = HashMap::new();
for (_dn_id, stats) in dn_stats {
let mut stats = stats.stats;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Aren’t the stats already sorted by timestamp? I’m not sure, but it seems like they should be.

use crate::service::mailbox::{Channel, MailboxRef};

#[async_trait::async_trait]
pub(crate) trait SchedulerCtx: Send + Sync {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: love to doc the trait and functions

fn default() -> Self {
Self {
enable: false,
max_concurrent_tables: 10,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Concurrency seems high; let's be conservative in the initial implementation.

impl GcSchedulerOptions {
/// Validates the configuration options.
pub fn validate(&self) -> Result<()> {
if self.max_concurrent_tables == 0 {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Replacing all these validations with ensure! looks better.

while let Some(event) = self.receiver.recv().await {
match event {
Event::Tick => {
info!("Received gc tick");
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Annoying log

Suggested change
info!("Received gc tick");

// 6 hours, for unknown expel time, which is when this file get removed from manifest, it should rarely happen, can keep it longer
unknown_file_lingering_time: Duration::from_secs(60 * 60 * 6),
// expect long running queries to be finished(or at least be able to notify it's using a deleted file) within a reasonable time
lingering_time: Some(Duration::from_secs(60)),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why change lingering time smaller?

"Successfully deleted {} unused files for region {}",
unused_len, region_id
);
// TODO(discord9): update region manifest about deleted files
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The todo is outdated.

@killme2008
Copy link
Contributor

I’m wondering whether running the GC worker concurrently with migration or repartition tasks could lead to any concurrency issues. This is something I’m particularly concerned about. @waynexia @WenyXu @fengjiachun

I’ve noticed that the current logic relies on certain assumptions — for example, that a region resides on the local datanode. However, these assumptions might not always hold true. It’s possible that the conditions are valid at the time of verification but become invalid during execution, leading to constraint violations.

The core question is: if these assumptions or checks are invalidated during execution due to ongoing migration or repartition operations, what would be the resulting behavior?

If such consistency cannot be guaranteed, then it would be safer to execute these processes sequentially rather than in parallel.

@discord9 discord9 marked this pull request as draft November 7, 2025 06:32
@github-actions github-actions bot added size/XXL and removed size/XL labels Nov 10, 2025
Signed-off-by: discord9 <[email protected]>
Signed-off-by: discord9 <[email protected]>
Signed-off-by: discord9 <[email protected]>
Signed-off-by: discord9 <[email protected]>
Signed-off-by: discord9 <[email protected]>
Signed-off-by: discord9 <[email protected]>

feat: gc scheduler

wip: gc trigger

Signed-off-by: discord9 <[email protected]>

feat: dn file removal rate

Signed-off-by: discord9 <[email protected]>

feat: trigger gc with stats(WIP)

Signed-off-by: discord9 <[email protected]>

chore

Signed-off-by: discord9 <[email protected]>

also move files ref manifest to store-api

Signed-off-by: discord9 <[email protected]>

feat: basic gc trigger impl

Signed-off-by: discord9 <[email protected]>

wip: handle file ref change

Signed-off-by: discord9 <[email protected]>

refactor: use region ids

Signed-off-by: discord9 <[email protected]>

fix: retry using related regions

Signed-off-by: discord9 <[email protected]>

chore: rm unused

Signed-off-by: discord9 <[email protected]>

fix: update file reference type in GC worker

Signed-off-by: discord9 <[email protected]>

feat: dn gc limiter

Signed-off-by: discord9 <[email protected]>

rename

Signed-off-by: discord9 <[email protected]>

feat: gc scheduler retry with outdated regions

Signed-off-by: discord9 <[email protected]>

feat: use real object store purger

Signed-off-by: discord9 <[email protected]>

wip: add to metasrv

Signed-off-by: discord9 <[email protected]>

feat: add to metasrv

Signed-off-by: discord9 <[email protected]>

feat: datanode gc worker handler

Signed-off-by: discord9 <[email protected]>

fix: no partition col fix

Signed-off-by: discord9 <[email protected]>

fix: RegionId json deser workaround

Signed-off-by: discord9 <[email protected]>

fix: find access layer

Signed-off-by: discord9 <[email protected]>

fix: on host dn

Signed-off-by: discord9 <[email protected]>

fix: stat dedup

Signed-off-by: discord9 <[email protected]>

refactor: rm load-based

Signed-off-by: discord9 <[email protected]>

chore: aft rebase fix

Signed-off-by: discord9 <[email protected]>

feat: not full scan

Signed-off-by: discord9 <[email protected]>

chore: after rebase fix

Signed-off-by: discord9 <[email protected]>

feat: clean tracker

Signed-off-by: discord9 <[email protected]>

after rebase fix

Signed-off-by: discord9 <[email protected]>

clippy

Signed-off-by: discord9 <[email protected]>

refactor: split gc scheduler

Signed-off-by: discord9 <[email protected]>

feat: smaller linger time

Signed-off-by: discord9 <[email protected]>

feat: parallel region gc instr

Signed-off-by: discord9 <[email protected]>

chore: rename

Signed-off-by: discord9 <[email protected]>

chore: rename

Signed-off-by: discord9 <[email protected]>

enable is false

Signed-off-by: discord9 <[email protected]>

feat: update removed files precisely

Signed-off-by: discord9 <[email protected]>

all default to false&use local file purger

Signed-off-by: discord9 <[email protected]>

feat: not evict if gc enabled

Signed-off-by: discord9 <[email protected]>

per review

Signed-off-by: discord9 <[email protected]>

fix: pass gc config in mito&test: after truncate gc

Signed-off-by: discord9 <[email protected]>

WIP: one more test

Signed-off-by: discord9 <[email protected]>

test: basic compact

Signed-off-by: discord9 <[email protected]>

test: compact with ref

Signed-off-by: discord9 <[email protected]>

refactor: for easier mock

Signed-off-by: discord9 <[email protected]>

docs: explain race condition

Signed-off-by: discord9 <[email protected]>

feat: gc region procedure

Signed-off-by: discord9 <[email protected]>

refactor: ctx send gc/ref instr with procedure

Signed-off-by: discord9 <[email protected]>

fix: config deser to default

Signed-off-by: discord9 <[email protected]>

refactor: gc report

Signed-off-by: discord9 <[email protected]>

wip: async index file rm

Signed-off-by: discord9 <[email protected]>

fixme?

Signed-off-by: discord9 <[email protected]>

typo

Signed-off-by: discord9 <[email protected]>

more ut

Signed-off-by: discord9 <[email protected]>

test: more mock test

Signed-off-by: discord9 <[email protected]>

more

Signed-off-by: discord9 <[email protected]>

refactor: split mock test

Signed-off-by: discord9 <[email protected]>

clippy

Signed-off-by: discord9 <[email protected]>

refactor: rm stuff

Signed-off-by: discord9 <[email protected]>

test: mock add gc report per region

Signed-off-by: discord9 <[email protected]>

fix: stricter table failure condition

Signed-off-by: discord9 <[email protected]>

sutff

Signed-off-by: discord9 <[email protected]>

feat: can do different table gc same time&more todos

Signed-off-by: discord9 <[email protected]>

after rebase check

Signed-off-by: discord9 <[email protected]>
Signed-off-by: discord9 <[email protected]>
Signed-off-by: discord9 <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

docs-not-required This change does not impact docs. documentation size/XXL

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants