Skip to content

Commit

Permalink
feat(restore): rework progress aggregation (#4225)
Browse files Browse the repository at this point in the history
* refactor(schema): remove unused restore_run_progress skipped column

This column might be interesting for backup which performs
deduplication, but it does not make any sense for restore.
It wasn't even a part of restore progress API, so it's safe
to remove it.

* feat(schema): add restore_run_progress restored column

This column wasn't needed with the sync load&stream Scylla API
(restored is equal to either 0 or downloaded), but it is cleaner
to have it and it can be useful for native restore Scylla API.

* feat(restore): adjust Rclone and l&s restore progress update

Followup to the e112747.
In the past we treated RestoredStartedAt/RestoreCompletedAt
as the time frame for l&s, but we should look at it more
inclusively, so as the time frame of restoring bytes,
which for Rclone API + l&s approach starts with the download.

* refactor(restore): separate progress aggregation from DB query

Previously, the code aggregating restore progress and the code
querying the run progress for aggregation were tightly tangled.
This resulted in difficulties in writing unit tests and a more
confusing code overall.
This commit changes progress aggregation to use iterator over
run progresses, which can be easily mocked in unit tests.

* feat(restore): fill HostProgress RestoredBytes/Duration

Followup to the e112747.

* feat(restore): allow for specifying now in progress aggregation

It is important to specify now when testing.

* feat(restore): add unit tests for progress aggregation

Thanks to the previous commit, it is now possible
to add unit tests for progress aggregation!
  • Loading branch information
Michal-Leszczynski authored Jan 31, 2025
1 parent 84fbef9 commit 42f72f8
Show file tree
Hide file tree
Showing 16 changed files with 743 additions and 192 deletions.
2 changes: 1 addition & 1 deletion pkg/schema/table/table.go

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

9 changes: 5 additions & 4 deletions pkg/service/restore/index.go
Original file line number Diff line number Diff line change
Expand Up @@ -122,13 +122,14 @@ func (w *tablesWorker) filterPreviouslyRestoredSStables(ctx context.Context, raw
w.logger.Info(ctx, "Filter out previously restored sstables")

remoteSSTableDirToRestoredIDs := make(map[string][]string)
err := forEachProgress(w.session, w.run.ClusterID, w.run.TaskID, w.run.ID, func(pr *RunProgress) {
seq := newRunProgressSeq()
for pr := range seq.All(w.run.ClusterID, w.run.TaskID, w.run.ID, w.session) {
if validateTimeIsSet(pr.RestoreCompletedAt) {
remoteSSTableDirToRestoredIDs[pr.RemoteSSTableDir] = append(remoteSSTableDirToRestoredIDs[pr.RemoteSSTableDir], pr.SSTableID...)
}
})
if err != nil {
return nil, errors.Wrap(err, "iterate over prev run progress")
}
if seq.err != nil {
return nil, errors.Wrap(seq.err, "iterate over prev run progress")
}
if len(remoteSSTableDirToRestoredIDs) == 0 {
return rawWorkload, nil
Expand Down
6 changes: 5 additions & 1 deletion pkg/service/restore/model.go
Original file line number Diff line number Diff line change
Expand Up @@ -194,13 +194,15 @@ type RunProgress struct {
ShardCnt int64 // Host shard count used for bandwidth per shard calculation.
AgentJobID int64

// DownloadStartedAt and DownloadCompletedAt are within the
// RestoreStartedAt and RestoreCompletedAt time frame.
DownloadStartedAt *time.Time
DownloadCompletedAt *time.Time
RestoreStartedAt *time.Time
RestoreCompletedAt *time.Time
Error string
Downloaded int64
Skipped int64
Restored int64
Failed int64
VersionedProgress int64
}
Expand Down Expand Up @@ -252,6 +254,8 @@ type KeyspaceProgress struct {
type HostProgress struct {
Host string `json:"host"`
ShardCnt int64 `json:"shard_cnt"`
RestoredBytes int64 `json:"restored_bytes"`
RestoreDuration int64 `json:"restore_duration"`
DownloadedBytes int64 `json:"downloaded_bytes"`
DownloadDuration int64 `json:"download_duration"`
StreamedBytes int64 `json:"streamed_bytes"`
Expand Down
Loading

0 comments on commit 42f72f8

Please sign in to comment.