@waynr waynr commented Jan 26, 2026

This feature is cherry-picked from Enterprise. It is optional (enabled by setting the `--checkpoint-duration <duration>` flag) and is intended to reduce overall server startup time under extremely high load combined with files being regularly deleted via retention periods.

@waynr waynr force-pushed the cherry-pick/main/parquet-snapshot-checkpointing branch from 5d7d610 to 5c46713 Compare January 26, 2026 22:29
@waynr waynr force-pushed the cherry-pick/main/parquet-snapshot-checkpointing branch from 5c46713 to cade4d6 Compare January 26, 2026 22:38
hiltontj commented Jan 27, 2026

@hiltontj hiltontj left a comment

There are some changes that are slightly different from those in the original PR, but I don't see any issue with them.

I made a note on a couple, but I do want to double-check that the changes to the CircleCI config were intentional.

There are some discrepancies in the diff for this file compared to https://github.com/influxdata/influxdb_pro/pull/2271, but from what I can see they are in code that was added elsewhere in Enterprise.

@waynr waynr force-pushed the cherry-pick/main/parquet-snapshot-checkpointing branch from cade4d6 to ac2b97c Compare January 27, 2026 17:13
waynr added 2 commits January 27, 2026 10:30
* chore: add trace logs to help troubleshoot slow snapshot loading
* test: add benchmarks for PersistedFiles.get_files
* test: add benchmarks for persisted_files::add_persisted_snapshot
* chore: use swap_remove instead of remove during snapshot merge
…#2030)

* chore: pass compacted data snapshot markers to PersistedFiles constructor
* chore: initialize CompactedData for ingester-only mode (drop CompactedData after obtaining snapshot markers in ingest-only mode)
* chore: enable setting build profile for manual PR trigger
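
For context on the `swap_remove` change listed above, here is a minimal sketch of the trade-off using only the standard `Vec` API. It is not the PR's merge code: the helper names `remove_ordered` and `remove_unordered` are hypothetical and exist only for illustration. `Vec::remove` shifts every later element left, so each removal is O(n), while `Vec::swap_remove` moves the last element into the vacated slot in O(1) at the cost of element order.

```rust
/// Hypothetical helper: drop every element matching `pred`, keeping order.
/// Each `Vec::remove` shifts the tail left, so one removal costs O(n).
fn remove_ordered<T>(items: &mut Vec<T>, pred: impl Fn(&T) -> bool) {
    let mut i = 0;
    while i < items.len() {
        if pred(&items[i]) {
            items.remove(i);
        } else {
            i += 1;
        }
    }
}

/// Hypothetical helper: same filtering, but with `Vec::swap_remove`.
/// Each removal is O(1); the order of the survivors is not preserved.
fn remove_unordered<T>(items: &mut Vec<T>, pred: impl Fn(&T) -> bool) {
    let mut i = 0;
    while i < items.len() {
        if pred(&items[i]) {
            // The last element now sits at index `i`, so do not advance:
            // it still has to be checked on the next iteration.
            items.swap_remove(i);
        } else {
            i += 1;
        }
    }
}
```

Both helpers leave the same set of elements behind, so `swap_remove` is only a safe substitute where the caller does not depend on the order of the remaining items.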
@waynr waynr force-pushed the cherry-pick/main/parquet-snapshot-checkpointing branch from ac2b97c to 08772d6 Compare January 27, 2026 17:30
* feat: introduce PersistedSnapshotCheckpoint to parquet engine
* chore: introduce PersistedSnapshotCheckpoint, SnapshotCheckpointPath, SnapshotCheckpointSequenceNumber
* chore: introduce Persister methods for PersistedSnapshotCheckpoint
* chore: add checkpoint_interval param to Persister::new
* chore: add --checkpoint-interval flag to serve subcommand (disabled by default)
* feat: implement periodic snapshot checkpoint creation as part of snapshot creation
* chore: make use of checkpoints during WriteBufferImpl initialization
* chore: retain previous checkpoint in-memory for subsequent snapshot updates
* test: improve checkpoint-related Persister coverage
* chore: store FileIndex with cached snapshot checkpoint in Persister
* fix: use insta redaction to avoid cross-test contamination with id values
* chore: warn if checkpoints cannot be loaded during startup
* test: add WriteBufferImpl startup integration tests with checkpoint edge cases
* chore: register checkpoint-interval flag as non-sensitive
* chore: build and persist checkpoints covering lookback period if they don't exist
* chore: pass Vec<PersistedSnapshot> around as Arc<Vec<PersistedSnapshot>> to reduce peak memory usage
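
As a rough illustration of the last item in the list above, the sketch below shows why handing out an `Arc<Vec<_>>` can lower peak memory compared with cloning the vector for each consumer. The `Snapshot` and `Consumer` types and the `share` function are hypothetical stand-ins, not the actual `PersistedSnapshot` type or any API from this PR; only `std::sync::Arc` is real.

```rust
use std::sync::Arc;

/// Hypothetical stand-in for a persisted snapshot record.
#[derive(Debug, Clone, PartialEq)]
struct Snapshot {
    sequence: u64,
}

/// Hypothetical component that needs read access to the snapshot list.
struct Consumer {
    snapshots: Arc<Vec<Snapshot>>,
}

/// Hand a consumer a shared handle to the list.
/// `Arc::clone` only bumps a reference count; the vector's heap
/// allocation is not copied, unlike `Vec::clone`.
fn share(snapshots: &Arc<Vec<Snapshot>>) -> Consumer {
    Consumer {
        snapshots: Arc::clone(snapshots),
    }
}
```

With this shape, every consumer points at the same allocation, and the vector is freed once the last `Arc` handle is dropped. The trade-off is that the shared list is read-only unless a consumer makes its own copy.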
@waynr waynr force-pushed the cherry-pick/main/parquet-snapshot-checkpointing branch from 08772d6 to 6467f9a Compare January 27, 2026 17:38
waynr commented Jan 27, 2026

Note: this PR now also includes snapshot cleanup behaviors added in https://github.com/influxdata/influxdb_pro/pull/2174

@waynr waynr merged commit 26d26f5 into main Jan 27, 2026
13 checks passed
@waynr waynr deleted the cherry-pick/main/parquet-snapshot-checkpointing branch January 27, 2026 18:30
waynr added a commit that referenced this pull request Jan 27, 2026
This feature is cherry-picked from Enterprise. It is an optional feature (enabled by setting the `--checkpoint-duration <duration>` flag) that is intended to reduce overall server startup time under extremely high load combined with files regularly deleted via retention periods.

Cherry-picked PRs:

* influxdata/influxdb_pro#2018
* influxdata/influxdb_pro#2030
* influxdata/influxdb_pro#2173
* influxdata/influxdb_pro#2174

Based on influxdata/influxdb_pro#2173