Conversation

Contributor

@pooknull pooknull commented Nov 11, 2025

K8SPXC-1214

https://perconadev.atlassian.net/browse/K8SPXC-1214

DESCRIPTION

This PR adds a new field: .spec.backup.ttlSecondsAfterFinished.
The value will be applied to both backup and restore jobs.

To prevent breaking backup or restore processes when a small ttlSecondsAfterFinished value is used, a new finalizer, internal.percona.com/keep-job, has been introduced.
This finalizer will be added to every backup and restore job and will be removed once the corresponding pxc-backup or pxc-restore resource reaches its final Succeeded or Failed state.
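
For illustration only, here is a minimal sketch (not the PR's actual code) of how the TTL and the keep-job finalizer might be applied to a Job. The function name applyTTLAndKeepJob, the standalone main, and the example Job name are assumptions; only the finalizer string internal.percona.com/keep-job comes from the PR description.

// Sketch only: an assumed helper, not the operator's real function.
package main

import (
	"fmt"

	batchv1 "k8s.io/api/batch/v1"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
)

const finalizerKeepJob = "internal.percona.com/keep-job"

// applyTTLAndKeepJob copies the CR's ttlSecondsAfterFinished onto the Job spec
// and attaches the keep-job finalizer so the Job cannot be garbage-collected
// before the pxc-backup or pxc-restore reaches Succeeded or Failed.
func applyTTLAndKeepJob(job *batchv1.Job, ttl *int32) {
	if ttl != nil {
		job.Spec.TTLSecondsAfterFinished = ttl
	}
	for _, f := range job.Finalizers {
		if f == finalizerKeepJob {
			return // finalizer already present
		}
	}
	job.Finalizers = append(job.Finalizers, finalizerKeepJob)
}

func main() {
	ttl := int32(3600)
	job := &batchv1.Job{ObjectMeta: metav1.ObjectMeta{Name: "xb-backup1"}} // hypothetical Job
	applyTTLAndKeepJob(job, &ttl)
	fmt.Println(*job.Spec.TTLSecondsAfterFinished, job.Finalizers)
}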

CHECKLIST

Jira

  • Is the Jira ticket created and referenced properly?
  • Does the Jira ticket have the proper statuses for documentation (Needs Doc) and QA (Needs QA)?
  • Does the Jira ticket link to the proper milestone (Fix Version field)?

Tests

  • Is an E2E test/test case added for the new feature/change?
  • Are unit tests added where appropriate?
  • Are OpenShift compare files changed for E2E tests (compare/*-oc.yml)?

Config/Logging/Testability

  • Are all needed new/changed options added to default YAML files?
  • Are all needed new/changed options added to the Helm Chart?
  • Did we add proper logging messages for operator actions?
  • Did we ensure compatibility with the previous version or cluster upgrade process?
  • Does the change support oldest and newest supported PXC version?
  • Does the change support oldest and newest supported Kubernetes version?

@pull-request-size pull-request-size bot added the size/L 100-499 lines label Nov 14, 2025
@pooknull pooknull marked this pull request as ready for review November 20, 2025 11:29
deploy/cr.yaml Outdated
backup:
  # allowParallel: true
  image: perconalab/percona-xtradb-cluster-operator:main-pxc8.0-backup
  # ttlSecondsAfterFinished: 60
Contributor

People might understand this as a recommendation, and 60 seconds is too short, especially for failed backups. Let's make it something like 3600.

Contributor

We have the same problem for restores, for example: we run the job and could delete it before the cluster becomes ready.

Contributor Author

Comment on lines 114 to 134
for _, jobName := range []string{
	naming.RestoreJobName(cr, false),
	naming.RestoreJobName(cr, true),
} {
	if err := k8sretry.RetryOnConflict(k8sretry.DefaultRetry, func() error {
		job := new(batchv1.Job)
		if err := r.client.Get(ctx, types.NamespacedName{
			Name:      jobName,
			Namespace: cr.Namespace,
		}, job); err != nil {
			if k8serrors.IsNotFound(err) {
				return nil
			}
			return errors.Wrap(err, "failed to get job")
		}
		job.Finalizers = slices.DeleteFunc(job.Finalizers, func(s string) bool { return s == naming.FinalizerKeepJob })
		return r.client.Update(ctx, job)
	}); err != nil {
		return reconcile.Result{}, errors.Wrap(err, "failed to remove keep-job finalizer")
	}
}
Contributor

Can we move this to a separate function like we do in the backup controller?
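
For illustration only (not part of this PR), such an extraction could look roughly like the sketch below. It reuses the identifiers from the snippet above; the receiver type, method name, and error messages are assumptions.

// Rough sketch of the requested extraction; receiver and signature are assumed.
func (r *ReconcilePerconaXtraDBClusterRestore) removeKeepJobFinalizers(ctx context.Context, cr *api.PerconaXtraDBClusterRestore) error {
	for _, jobName := range []string{
		naming.RestoreJobName(cr, false),
		naming.RestoreJobName(cr, true),
	} {
		err := k8sretry.RetryOnConflict(k8sretry.DefaultRetry, func() error {
			job := new(batchv1.Job)
			if err := r.client.Get(ctx, types.NamespacedName{Name: jobName, Namespace: cr.Namespace}, job); err != nil {
				if k8serrors.IsNotFound(err) {
					return nil
				}
				return errors.Wrap(err, "failed to get job")
			}
			job.Finalizers = slices.DeleteFunc(job.Finalizers, func(s string) bool { return s == naming.FinalizerKeepJob })
			return r.client.Update(ctx, job)
		})
		if err != nil {
			return errors.Wrapf(err, "remove keep-job finalizer from job %s", jobName)
		}
	}
	return nil
}

The reconcile loop would then call it with something like:

	if err := r.removeKeepJobFinalizers(ctx, cr); err != nil {
		return reconcile.Result{}, err
	}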

Contributor Author

@pooknull pooknull requested a review from egegunes November 24, 2025 11:52
@JNKPercona
Collaborator

Test Name Result Time
auto-tuning-8-0 passed 00:21:48
backup-storage-tls-8-0 passed 00:21:58
cross-site-8-0 passed 00:34:32
custom-users-8-0 passed 00:13:06
demand-backup-cloud-8-0 passed 00:57:32
demand-backup-encrypted-with-tls-8-0 passed 00:55:39
demand-backup-8-0 passed 00:41:40
demand-backup-flow-control-8-0 passed 00:10:28
demand-backup-parallel-8-0 passed 00:09:12
demand-backup-without-passwords-8-0 passed 00:15:52
haproxy-5-7 passed 00:14:22
haproxy-8-0 passed 00:14:08
init-deploy-5-7 passed 00:18:41
init-deploy-8-0 passed 00:17:01
limits-8-0 passed 00:12:08
monitoring-2-0-8-0 passed 00:22:28
monitoring-pmm3-8-0 passed 00:18:20
one-pod-5-7 passed 00:14:16
one-pod-8-0 passed 00:13:55
pitr-8-0 passed 00:44:40
pitr-gap-errors-8-0 passed 00:56:04
proxy-protocol-8-0 passed 00:09:44
proxysql-sidecar-res-limits-8-0 passed 00:08:45
proxysql-scheduler-8-0 passed 00:16:05
pvc-resize-5-7 passed 00:14:19
pvc-resize-8-0 passed 00:15:57
recreate-8-0 passed 00:17:14
restore-to-encrypted-cluster-8-0 passed 00:26:09
scaling-proxysql-8-0 passed 00:08:46
scaling-8-0 passed 00:10:32
scheduled-backup-5-7 passed 01:09:37
scheduled-backup-8-0 passed 01:02:46
security-context-8-0 passed 00:26:00
smart-update1-8-0 passed 00:36:57
smart-update2-8-0 passed 00:38:14
storage-8-0 passed 00:11:07
tls-issue-cert-manager-ref-8-0 passed 00:08:43
tls-issue-cert-manager-8-0 passed 00:11:31
tls-issue-self-8-0 passed 00:13:24
upgrade-consistency-8-0 passed 00:11:25
upgrade-haproxy-5-7 passed 00:24:55
upgrade-haproxy-8-0 passed 00:26:17
upgrade-proxysql-5-7 passed 00:13:54
upgrade-proxysql-8-0 passed 00:16:01
users-5-7 passed 00:26:40
users-8-0 passed 00:26:34
validation-hook-8-0 passed 00:01:48
Summary Value
Tests Run 47/47
Job Duration 02:35:56
Total Test Time 18:01:37

commit: ff96f03
image: perconalab/percona-xtradb-cluster-operator:PR-2234-ff96f032

backup:
  # allowParallel: true
  image: perconalab/percona-xtradb-cluster-operator:main-pxc8.0-backup
  # ttlSecondsAfterFinished: 3600
Collaborator

@pooknull do we have a helm chart PR?
