feat: introduce snapshot checkpoints #27153
5d7d610 to 5c46713
This feature is cherry-picked from Enterprise. It is optional (enabled by setting the `--checkpoint-duration <duration>` flag) and is intended to reduce overall server startup time when extremely high load is combined with files being regularly deleted via retention periods.
5c46713 to cade4d6
Cherry-pick of: |
hiltontj left a comment
There are some changes that differ slightly from the original PR, but I don't see any issue with them.
I made a note on a couple, but I do want to double-check that the changes to the CircleCI config were intentional.
There are some discrepancies in the diff for this file compared to https://github.com/influxdata/influxdb_pro/pull/2271, but from what I can see they are for code that was added elsewhere in Enterprise.
cade4d6 to ac2b97c
* chore: add trace logs to help troubleshoot slow snapshot loading
* test: add benchmarks for PersistedFiles.get_files
* test: add benchmarks for persisted_files::add_persisted_snapshot
* chore: use swap_remove instead of remove during snapshot merge
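For context on the `swap_remove` change listed above: `Vec::remove` shifts every later element left, so it costs O(n) per call, while `Vec::swap_remove` moves the last element into the vacated slot in O(1) at the cost of element order. The following is only a minimal sketch of that trade-off. The helper name `remove_matching_unordered` is hypothetical and is not taken from this PR, and the actual merge code may look quite different.

```rust
/// Hypothetical helper (not from the PR): drains every element for which
/// `is_stale` returns true, using `Vec::swap_remove` so each removal is O(1).
/// The relative order of the remaining elements is NOT preserved.
fn remove_matching_unordered<T>(
    items: &mut Vec<T>,
    mut is_stale: impl FnMut(&T) -> bool,
) -> Vec<T> {
    let mut removed = Vec::new();
    let mut i = 0;
    while i < items.len() {
        if is_stale(&items[i]) {
            // The last element is moved into slot `i`, so `i` is not advanced:
            // the element that just arrived there still has to be checked.
            removed.push(items.swap_remove(i));
        } else {
            i += 1;
        }
    }
    removed
}
```

This only pays off when the caller does not depend on the order of the vector, or re-sorts it afterwards.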
…#2030)
* chore: pass compacted data snapshot markers to PersistedFiles constructor
* chore: initialize CompactedData for ingester-only mode (drop CompactedData after obtaining snapshot markers in ingest-only mode)
* chore: enable setting build profile for manual PR trigger
ac2b97c to 08772d6
* feat: introduce PersistedSnapshotCheckpoint to parquet engine
* chore: introduce PersistedSnapshotCheckpoint, SnapshotCheckpointPath, SnapshotCheckpointSequenceNumber
* chore: introduce Persister methods for PersistedSnapshotCheckpoint
* chore: add checkpoint_interval param to Persister::new
* chore: add --checkpoint-interval flag to serve subcommand (disable by default)
* feat: implement periodic snapshot checkpoint creation as part of snapshot creation
* chore: make use of checkpoints during WriteBufferImpl initialization
* chore: retain previous checkpoint in-memory for subsequent snapshot updates
* test: improve checkpoint-related Persister coverage
* chore: store FileIndex with cached snapshot checkpoint in Persister
* fix: use insta redaction to avoid cross-test contamination with id values
* chore: warn if checkpoints cannot be loaded during startup
* test: add WriteBufferImpl startup integration tests with checkpoint edge cases
* chore: register checkpoint-interval flag as non-sensitive
* chore: build and persist checkpoints covering lookback period if they don't exist
* chore: pass Vec<PersistedSnapshot> around as Arc<Vec<PersistedSnapshot>> to reduce peak memory usage
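The last item above, passing `Arc<Vec<PersistedSnapshot>>` instead of `Vec<PersistedSnapshot>`, can be illustrated with a small sketch. Cloning an `Arc` only bumps a reference count, so several consumers can read the same snapshot list without each holding its own deep copy. The `PersistedSnapshot` struct below is a hypothetical one-field stand-in, and `max_sequence` and `count_snapshots` are made-up consumers. None of them are taken from the PR.

```rust
use std::sync::Arc;

/// Hypothetical stand-in for the real snapshot type, which carries far more data.
#[derive(Debug, Clone, PartialEq)]
struct PersistedSnapshot {
    snapshot_sequence_number: u64,
}

/// Made-up consumer: takes shared ownership of the list and reads from it.
fn max_sequence(snapshots: Arc<Vec<PersistedSnapshot>>) -> Option<u64> {
    snapshots.iter().map(|s| s.snapshot_sequence_number).max()
}

/// Made-up consumer: a second reader of the same allocation.
fn count_snapshots(snapshots: Arc<Vec<PersistedSnapshot>>) -> usize {
    snapshots.len()
}
```

A caller would wrap the list once with `Arc::new(snapshots)` and hand `Arc::clone(&shared)` to each consumer. `Arc::ptr_eq` confirms that the clones point at the same allocation, so the vector contents are never duplicated.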
08772d6 to 6467f9a
Note: this PR now also includes snapshot cleanup behaviors added in https://github.com/influxdata/influxdb_pro/pull/2174
This feature is cherry-picked from Enterprise. It is optional (enabled by setting the `--checkpoint-duration <duration>` flag) and is intended to reduce overall server startup time when extremely high load is combined with files being regularly deleted via retention periods.

Cherry-picked PRs:
* influxdata/influxdb_pro#2018
* influxdata/influxdb_pro#2030
* influxdata/influxdb_pro#2173
* influxdata/influxdb_pro#2174

Based on influxdata/influxdb_pro#2173