-
Notifications
You must be signed in to change notification settings - Fork 35
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat(1-1-restore): adds 1-1-restore cli and service stub #4221
base: 1-1-restore-feature-branch
Are you sure you want to change the base?
Conversation
f07e746
to
94d7777
Compare
pkg/service/one2onerestore/model.go
Outdated
SourceClusterID string `json:"source_cluster_id"` | ||
SnapshotTag string `json:"snapshot_tag"` | ||
NodesMapping []nodeMapping `json:"nodes_mapping"` | ||
BatchSize int `json:"batch_size,omitempty"` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think batch size is needed, it's ok to call RClone to copy all SSTables of the given keyspace/table of a given node at once.
RClone tracks the progress on its own, SM just calls for the progress using jobID.
pkg/service/one2onerestore/model.go
Outdated
Parallel int `json:"parallel,omitempty"` | ||
Transfers int `json:"transfers"` | ||
RateLimit []DCLimit `json:"rate_limit,omitempty"` | ||
AllowCompaction bool `json:"allow_compaction,omitempty"` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We don;t need this parameter.
Compaction should be disabled to not waste CPU cycles and disk IO.
But it doesn't have an impact on the correctness of the restore.
This adds a new task type for fast vnode restore. Fixes: #4200
This introduces new cli command for fast vnode restore procedure. Fixes: #4200
This adds support for the fast restore task type to the task scheduler
This is the result of running `make docs`
This renames fastrestore to 1-1 restore and also makes it a subcommand of restore cmd.
This describes what 'same topology' means. Also adds references to new subcommands to the 'restore' cli page.
…s uuid This removes 1-1-restore options that are not needed at the moment (we can added them later if nedded). Also changed source-cluster-id type to uuid.UUID
37409c0
to
7e9f877
Compare
@karol-kokoszka @Michal-Leszczynski |
This dowgrades some of the deps that was updated by accident, so now it looks more like the master branch.
pkg/command/one2onerestore/res.yaml
Outdated
The `<dc>` parameter is optional. It allows you to specify the datacenter whose nodes will be used to restore the data | ||
from this location in a multi-dc setting, it must match Scylla nodes datacenter. | ||
By default, all live nodes are used to restore data from specified locations. | ||
|
||
Note that specifying datacenters closest to backup locations might reduce download time of restored data. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think that it applies to the 1-1-restore.
Since we have node mapping, we shouldn't allow to specify the location DC.
pkg/service/one2onerestore/model.go
Outdated
if _, err := SnapshotTagTime(t.SnapshotTag); err != nil { | ||
return err | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: you can use IsSnapshotTag
instead.
pkg/service/one2onerestore/model.go
Outdated
if _, err := SnapshotTagTime(t.SnapshotTag); err != nil { | ||
return err | ||
} | ||
if t.SourceClusterID.String() == "" { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: you can check if it's unset by t.SourceClusterID == uuid.Nil
"github.com/scylladb/scylla-manager/v3/pkg/util/uuid" | ||
) | ||
|
||
// Service for the 1-2-1 restore. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: let's keep the same naming convention (1-1-restore or one2onerestore).
I guess that we prefer the 1-1-restore and use it when dashes are not a problem.
// One2OneRestore creates and initializes worker performing 1-2-1 vnode restore with given properties. | ||
func (s *Service) One2OneRestore(ctx context.Context, clusterID, taskID, runID uuid.UUID, properties json.RawMessage) error { | ||
s.logger.Info(ctx, "1-1 restore", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: align 1-1-restore naming convention (also in the log).
@@ -27,6 +27,7 @@ const ( | |||
UnknownTask TaskType = "unknown" | |||
BackupTask TaskType = "backup" | |||
RestoreTask TaskType = "restore" | |||
One2OneRestoreTask TaskType = "1_1_restore" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: yet another version of writing the 1-1-restore.
This adds usage of IsSnapshotTag and uuid.Nil. Also aligns 1-1-restore service naming convention across the code, comment and docs.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks !
I have no major concerns.
healthSvc *healthcheck.Service | ||
backupSvc *backup.Service | ||
restoreSvc *restore.Service | ||
one2OneRestoreSvc *one2onerestore.Service |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you define the interface for the one2OneRestoreSvc.Service (like Servicer) and reference it here instead of the struct ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
sure, I can easily do this, but I'd like to clarify one detail in advance. I would like to define interface in scylla-manger/server.go
package, but I see that in the SM code base it's often other way around (e.g ConfigCacher defined in configcache
pkg). i think it's better to define interfaces on the consumer side https://go.dev/wiki/CodeReviewComments#interfaces
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I understand that the practice in go is to define the interface in the package that consumes the service, but here, the interface defines the API for the whole service which definition and implementation stays in the "one2onerestore" package.
So, for the clarity of what this package is expected to do, I prefer to have this interface that defines the API.
Returning interface from the constructor aligns with the intention to make the service’s API explicit and maintain a clear package boundary, even though it deviates a bit from the “accept interfaces, return concrete types” idiom.
Btw, packages like net/http
http.Handler (defined in the same package).
Another example of the same approach (breaking the idiom), is in go-kit https://github.com/go-kit/examples/blob/master/addsvc/pkg/addservice/service.go
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I understand that the practice in go is to define the interface in the package that consumes the service, but here, the interface defines the API for the whole service which definition and implementation stays in the "one2onerestore" package.
So, for the clarity of what this package is expected to do, I prefer to have this interface that defines the API.
That make sense and it is a good intention, but why public methods of a Service struct doesn't define API and boundaries of a "one2onerestore" package?
In provided examples, they all have different pattern - interface is public, but implementation is private. In std lib it make sense - when you promise backward compatibility you don't want to expose some helper structs that can be changed in near future, e.g. http.FileServer. (and they use this pattern only for some helper functions).
In go-kit
they use public interface and private struct for the services (business logic), because they have a lot of wrappers around Service, to hide details about various transports (e.g. http, grpc) that can be used, see https://github.com/go-kit/examples/blob/master/addsvc/pkg/addendpoint/set.go#L34
P.S. I'm not trying to argue for the sake of arguing or to resist making the change, I'm modestly trying to understand and learn why we want to do that.
This adds 1-1 restore cli and stubs.
For now cli just schedules the 1-1 restore task type and when task is executed, service logs "Not yet implemented".
1-1-restore is added as a subcommand to the restore cli, so it looks like below:
Note: I've temporarily used
v3/pkg/managerclient
from theva/fast-restore-part-1
branch as requried changed not in the master yetP.S. Feel free to suggest another name for this cli and service 😄
Please make sure that: