Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

br: a more straight forward operator interface #5710

Open
wants to merge 19 commits into
base: master
Choose a base branch
from

Conversation

RidRisR
Copy link

@RidRisR RidRisR commented Aug 15, 2024

What problem does this PR solve?

Closes #5699

  1. Use a more straight forward operator interface to replace the old one.
  2. support pause and resume pitr task

What is changed and how does it work?

Now, the crd to control log backup would loops like

  backupMode: log
  subcommand: log-start/log-stop/log-pause

(If no subcommand is provided, the default value is log-start)

If the task is not created before:
log-start: would start the task
log-pause/log-stop: would not run until start finished

If the task is running:
log-start: would skip and do nothing
log-pause: would pause the task
log-stop:would stop the task

If the task is paused:
log-start: would resume the task
log-pause: would skip and do nothing
log-stop: would stop the task

If the task is stopped:
log-start/log-pause/log-stop: would do nothing

截屏2024-09-02 17 50 46

Code changes

  • Has Go code change
  • Has CI related scripts change

Tests

  • Unit test
  • E2E test
  • Manual test
  • No code

Side effects

  • Breaking backward compatibility
  • Other side effects:

Related changes

  • Need to cherry-pick to the release branch
  • Need to update the documentation

Release Notes

Please refer to Release Notes Language Style Guide before writing the release note.

1. Use a more straight forward operator interface to replace the old one.
Now, the crd to control log backup would loops like
`spec:
  backupMode: log
  subcommand: `log-start`/`log-stop`/`log-pause`
`
(If no subcommand is provided, the default value is `log-start`)

2. Support pause and resume pitr task
Now, users could pause and resume pitr task by using `log-pause` and `log-start`

Copy link
Contributor

ti-chi-bot bot commented Aug 15, 2024

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:
Once this PR has been reviewed and has the lgtm label, please assign charleszheng44 for approval. For more information see the Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@sre-bot
Copy link
Contributor

sre-bot commented Aug 15, 2024

CLA assistant check
All committers have signed the CLA.

@ti-chi-bot ti-chi-bot bot added the size/XXL label Aug 15, 2024
@codecov-commenter
Copy link

Codecov Report

Attention: Patch coverage is 20.00000% with 8 lines in your changes missing coverage. Please review.

Project coverage is 61.47%. Comparing base (9ef26f8) to head (9419c07).
Report is 16 commits behind head on master.

Additional details and impacted files
@@           Coverage Diff            @@
##           master    #5710    +/-   ##
========================================
  Coverage   61.47%   61.47%            
========================================
  Files         235      235            
  Lines       30653    30825   +172     
========================================
+ Hits        18843    18951   +108     
- Misses       9920     9963    +43     
- Partials     1890     1911    +21     
Flag Coverage Δ
unittest 61.47% <20.00%> (+<0.01%) ⬆️

@RidRisR RidRisR marked this pull request as ready for review August 29, 2024 09:35
@ti-chi-bot ti-chi-bot bot requested a review from shonge August 29, 2024 09:35
@ti-chi-bot ti-chi-bot bot removed the needs-rebase label Aug 30, 2024
@RidRisR RidRisR changed the title Subcommand br: a more straight forward operator interface Aug 30, 2024
@BornChanger BornChanger requested review from csuzhangxc and WangLe1321 and removed request for howardlau1999 and shonge August 30, 2024 10:24
Comment on lines -324 to -327
if backup.Spec.LogStop {
switch backup.Spec.LogSubcommand {
case "log-start":
if IsLogBackupAlreadyPaused(backup) {
return LogResumeCommand
}
return LogStartCommand
case "log-stop":
return LogStopCommand
}
if backup.Spec.LogTruncateUntil != "" {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it needed to compatible with the old fields?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, user shouldn't mix up these two interface to avoid mistakes

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So the old existing Backup user MUST update their pipeline to use these new fields, right?

Copy link
Author

@RidRisR RidRisR Sep 3, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, we don't want handle corner case like:

Spec:
    logSubcommand: log-start
    logStop: true
```

BTW, I am fresh to operator. Would there be any problem?
I tested updating operator online while a pitr task is running.
Seems there wouldn't be any problem.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We need to keep compatible with the old fields, because tidbcloud depends on current implementation. I think we can make old fields have higher priority. For example:

if backup.Spec.LogStop
    return LogStopCommand
if backup.Spec.LogTruncateUntil != "" && backup.Spec.LogTruncateUntil != backup.Status.LogSuccessTruncateUntil
    return LogTruncateCommand

switch backup.Spec.LogSubcommand
...

// default command is log start
return LogStartCommand

pkg/apis/pingcap/v1alpha1/types.go Outdated Show resolved Hide resolved
@@ -145,6 +145,7 @@ func (c *Controller) sync(key string) error {
return err
}

klog.Infof("Syncing Backup %s/%s", ns, name)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This log should be debug level.

"resume",
fmt.Sprintf("--task-name=%s", backup.Name),
}
// if bo.CommitTS != "" && bo.CommitTS != "0" {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

suggest to remove these comment codes. They seem useless.

Comment on lines -324 to -327
if backup.Spec.LogStop {
switch backup.Spec.LogSubcommand {
case "log-start":
if IsLogBackupAlreadyPaused(backup) {
return LogResumeCommand
}
return LogStartCommand
case "log-stop":
return LogStopCommand
}
if backup.Spec.LogTruncateUntil != "" {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We need to keep compatible with the old fields, because tidbcloud depends on current implementation. I think we can make old fields have higher priority. For example:

if backup.Spec.LogStop
    return LogStopCommand
if backup.Spec.LogTruncateUntil != "" && backup.Spec.LogTruncateUntil != backup.Status.LogSuccessTruncateUntil
    return LogTruncateCommand

switch backup.Spec.LogSubcommand
...

// default command is log start
return LogStartCommand

pkg/apis/pingcap/v1alpha1/backup.go Outdated Show resolved Hide resolved

func IsBackupPaused(backup *Backup) bool {
if backup.Spec.Mode == BackupModeLog {
return IsLogBackupSubCommandOntheCondition(backup, BackupPaused)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems log backup sub command condition has no BackupPaused status

@@ -230,6 +230,10 @@ func (c *Controller) updateBackup(cur interface{}) {
return
}

if newBackup.Spec.Mode == v1alpha1.BackupModeLog && newBackup.Spec.LogSubcommand == string(v1alpha1.LogStartCommand) && v1alpha1.IsBackupPaused(newBackup) {
newBackup.Spec.LogSubcommand = string(v1alpha1.LogResumeCommand)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't know why add this hack logic. We shouldn't modify the spec fields even the user inputs wrong value.

return IsLogBackupSubCommandOntheCondition(backup, BackupPaused)
}
_, condition := GetBackupCondition(&backup.Status, BackupPaused)
return condition != nil && condition.Status == corev1.ConditionTrue
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This only means the log backup is ever paused, not means it is paused. I guess you want to check if the log backup is paused. Only backup.Status.Phase represents the current status.

@@ -392,3 +416,13 @@ func IsLogBackupAlreadyTruncate(backup *Backup) bool {
func IsLogBackupAlreadyStop(backup *Backup) bool {
return backup.Spec.Mode == BackupModeLog && backup.Status.Phase == BackupStopped
}

// IsLogBackupAlreadyStop return whether log backup has already paused.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
// IsLogBackupAlreadyStop return whether log backup has already paused.
// IsLogBackupAlreadyPaused return whether log backup has already paused.

return backup.Spec.Mode == BackupModeLog && backup.Status.Phase == BackupPaused
}

// IsLogBackupAlreadyRestart return whether log backup has already resumed.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
// IsLogBackupAlreadyRestart return whether log backup has already resumed.
// IsLogBackupAlreadyRunning return whether log backup has already resumed.

@BornChanger
Copy link
Contributor

BornChanger commented Sep 17, 2024

For the state machine figure, please also add the scenarios where the state change due to gc-ttl. @RidRisR

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

pitr: add support to log backup subcommand of status/pause/resume
6 participants