-
Notifications
You must be signed in to change notification settings - Fork 111
Add support for time-based snapshot intervals #3918
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
Test Results 5 files - 2 5 suites - 2 1m 37s ⏱️ - 1m 3s Results for commit a15f9db. ± Comparison against base commit 1265be4. This pull request removes 47 and adds 34 tests. Note that renamed tests count towards both.♻️ This comment has been updated with latest results. |
3492f52 to
ff15f35
Compare
ff15f35 to
00de3b0
Compare
00de3b0 to
a15f9db
Compare
tillrohrmann
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for creating this PR @pcholakov. The changes look good to me. +1 for merging. Do we need to update our docs to also mention the snapshot_interval?
| } | ||
|
|
||
| /// Retrieve the latest known LSN to be archived to the snapshot repository. | ||
| /// Response of `Ok(Lsn::INVALID)` indicates no existing snapshot for the partition. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe update the comment.
| match self { | ||
| ArchivedLsn::None => Ok(Duration::MAX), | ||
| ArchivedLsn::Snapshot { created_at, .. } => { | ||
| SystemTime::now().duration_since(*created_at) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is it important that we distinguish between created_at > now and created_at == now? If not, then we might get rid fo the error case.
This change adds a basic time interval configuration option for triggering automated snapshots. When both time and record-based intervals are set, the logic for the trigger condition is an "and" between the two, so the record interval becomes a minimum requirement for time-based snapshotting.
Time interval-based snapshotting will get a lot more useful with incremental snapshot support, but this is already a big improvement as it allows operators to configure snapshots in slightly more intuitive terms of elapsed time.
The time interval is measured from the created-at timestamp of the published snapshot, which is based on the wall clock time of the producing node. Clock drift error is assumed to be relatively negligible compared to typical snapshotting intervals. A small amount of jitter is added to each partition's snapshot age check to spread snapshot uploads out over time.
Configuration examples
Periodic + minimum number of records
Translates into: "snapshot every 30 minutes, as long as the applied LSN has advanced by least 10,000 records since the last snapshot".
Periodic (unconditional)
Translates into: "snapshot once every 24 hours even if zero new changes are committed".