[OSDOCS-11124]: Add automated backup/restore with OADP docs #94958


Open · wants to merge 1 commit into main from osdocs-11124-hcp-automate-capture
Conversation

@lahinson lahinson commented Jun 18, 2025

@lahinson lahinson added this to the Continuous Release milestone Jun 18, 2025
@openshift-ci openshift-ci bot added the size/L Denotes a PR that changes 100-499 lines, ignoring generated files. label Jun 18, 2025
@lahinson lahinson force-pushed the osdocs-11124-hcp-automate-capture branch 7 times, most recently from aad960c to b430ae0 Compare June 18, 2025 18:31
.Procedure

* If you use a bare metal platform, you can create a DPA by creating a manifest file similar to the following example:
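The manifest itself is collapsed in this view. For context, a minimal sketch of what such a DPA manifest can look like, assuming OADP's `openshift-adp` namespace, a `cloud-credentials` secret, and a MinIO endpoint (all names, URLs, and bucket values below are placeholders, not values from this PR):

```yaml
# Hedged sketch of a DataProtectionApplication (DPA) for bare metal with an
# S3-compatible store such as MinIO. All values are placeholders.
apiVersion: oadp.openshift.io/v1alpha1
kind: DataProtectionApplication
metadata:
  name: dpa-sample
  namespace: openshift-adp
spec:
  backupLocations:
  - velero:
      provider: aws              # MinIO speaks the S3 API, so the aws provider works
      default: true
      credential:
        key: cloud
        name: cloud-credentials  # secret that holds the S3 credentials
      objectStorage:
        bucket: oadp-backup
        prefix: hcp
      config:
        region: minio
        s3ForcePathStyle: "true" # typically required for S3-compatible providers
        s3Url: "http://minio.example.com:9000"
  configuration:
    velero:
      defaultPlugins:
      - openshift
      - aws
```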

Contributor Author

@jparrill If we need to call out anything specific about the parameters in the manifest, let me know.


Yes, it's perfect. This is suitable as long as the storage provider is compatible with the S3 API, like MinIO. If the storage provider is not compatible with the S3 API, the DPA manifest would be different. Maybe we can call this out.

@jparrill jparrill left a comment

/lgtm


@openshift-ci openshift-ci bot added the lgtm Indicates that a PR is ready to be merged. label Jun 19, 2025
@lahinson lahinson force-pushed the osdocs-11124-hcp-automate-capture branch from b430ae0 to 30fccd3 Compare June 20, 2025 14:39
@openshift-ci openshift-ci bot removed the lgtm Indicates that a PR is ready to be merged. label Jun 20, 2025

openshift-ci bot commented Jun 20, 2025

New changes are detected. LGTM label has been removed.

@lahinson lahinson force-pushed the osdocs-11124-hcp-automate-capture branch from 30fccd3 to c50093e Compare June 20, 2025 14:52
@lahinson

@LiangquanLi930 When you're available, please provide QE review. Thanks!

@lahinson lahinson added the peer-review-needed Signifies that the peer review team needs to review this PR label Jun 20, 2025
@mburke5678 mburke5678 added the peer-review-in-progress Signifies that the peer review team is reviewing this PR label Jun 20, 2025
[id="prepare-aws-oadp-auto_{context}"]
== Preparing {aws-short} to use {oadp-short}

To perform disaster recovery for a hosted cluster, you can use {oadp-first} on {aws-first} S3 compatible storage.
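Preparing {aws-short} typically starts with storing the S3 credentials in a secret that the DPA can reference. A minimal sketch, assuming the standard AWS credentials file format; the file name and values are conventional placeholders, not mandated by this PR:

```ini
; credentials-velero: AWS-style credentials file (placeholder values)
[default]
aws_access_key_id=<access_key_id>
aws_secret_access_key=<secret_access_key>
```

The secret is then typically created with `oc create secret generic cloud-credentials -n openshift-adp --from-file cloud=credentials-velero`, matching the credential reference in the DPA manifest.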
Contributor

How do I use {oadp-first} on {aws-first} S3-compatible storage? Will a typical user know how to do this? Is there somewhere you can link to?

Contributor Author

Good question. The intended audience for this content will know how to use AWS S3 compatible storage. Actually, now that I look at this text again, I don't think the sentence on line 34 is necessary. The info that users need is on line 35. I'll remove line 34.

Comment on lines +45 to +44
* Backing up the data plane workload
* Backing up the control plane workload
Contributor

Should these be links?

[id="prepare-bm-dr-oadp-auto_{context}"]
== Preparing bare metal to use {oadp-short}

To perform disaster recovery for a hosted cluster, you can use {oadp-first} on bare metal.
Contributor

Is this correct?

Suggested change
To perform disaster recovery for a hosted cluster, you can use {oadp-first} on bare metal.
To perform disaster recovery for a bare-metal hosted cluster, you can use {oadp-first} on bare metal.

or,

Suggested change
To perform disaster recovery for a hosted cluster, you can use {oadp-first} on bare metal.
To perform disaster recovery for a bare-metal cluster, you can use {oadp-first} on bare metal.

Contributor Author

I think this line can go away completely.


.Next steps

* Restoring a hosted cluster by using {oadp-short}
Contributor

Should this be a link?

@lahinson lahinson force-pushed the osdocs-11124-hcp-automate-capture branch from c50093e to f96f100 Compare June 20, 2025 16:36
defaultVolumesToFsBackup: true <7>
----
====
<1> Replace `backup_resource_name` with the name of your `Backup` resource.
Contributor

"with the name of" makes it sound like the `Backup` resource already exists, to me. Is this accurate:

Suggested change
<1> Replace `backup_resource_name` with the name of your `Backup` resource.
<1> Replace `backup_resource_name` with a name for your `Backup` resource.

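Only the tail of the `Backup` manifest survives in this excerpt (callout `<7>`). For orientation, a hedged minimal sketch of a Velero `Backup` resource; the namespaces and field choices below are placeholders, not the exact manifest from this PR:

```yaml
# Hedged sketch of a Velero Backup resource; all values are placeholders.
apiVersion: velero.io/v1
kind: Backup
metadata:
  name: <backup_resource_name>   # see callout <1> above
  namespace: openshift-adp
spec:
  includedNamespaces:            # placeholder: the hosted cluster namespaces
  - clusters
  - clusters-hosted
  storageLocation: dpa-sample-1  # backup storage location created by the DPA
  ttl: 720h0m0s
  defaultVolumesToFsBackup: true # callout <7> in the excerpt above
```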

.Verification

* Verify if the value of the `status.phase` is `Completed` by running the following command:
Contributor

I think "verify that" is used when confirming something. "Verify if" is for checking possibility (verify if I can go to Raleigh).

Suggested change
* Verify if the value of the `status.phase` is `Completed` by running the following command:
* Verify that the value of the `status.phase` is `Completed` by running the following command:

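The command itself is collapsed in this view. A typical check, assuming the `Backup` resource lives in OADP's `openshift-adp` namespace (an assumption, not a value from this PR), might be:

```shell
# Print only the backup phase; "Completed" indicates the backup finished.
# <backup_resource_name> is a placeholder.
oc get backup <backup_resource_name> -n openshift-adp \
  -o jsonpath='{.status.phase}'
```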

.Procedure

* If you use a bare metal platform, you can create a DPA by creating a manifest file similar to the following example:
Contributor

Suggested change
* If you use a bare metal platform, you can create a DPA by creating a manifest file similar to the following example:
* If you use a bare-metal platform, you can create a DPA by creating a manifest file similar to the following example:

<1> Specify the provider for Velero. If you are using bare metal and MinIO, you can use `aws` as the provider.
<2> Specify the bucket name; for example, `oadp-backup`.
<3> Specify the bucket prefix; for example, `hcp`.
<4> The bucket region in this example is `minio`, which is a storage provider that is compatible with the S3 API.
Contributor

@mburke5678 mburke5678 Jun 20, 2025

How do I know if my storage provider is compatible?

----
<1> Specify the bucket name; for example, `oadp-backup`.
<2> Specify the bucket prefix; for example, `hcp`.
<3> The bucket region in this example is `minio`, which is a storage provider that is compatible with the S3 API.
Contributor

How do I know if my storage provider is compatible?

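For reference, a provider is generally S3-compatible when it implements the AWS S3 HTTP API; MinIO and Ceph RADOS Gateway are common examples, and the provider's own documentation is the place to confirm S3 API support. That compatibility surfaces in the `config` block of the backup location. A hedged fragment, with placeholder values:

```yaml
# Fragment of a DPA backupLocation config for an S3-compatible provider.
# All values are placeholders.
config:
  region: minio              # MinIO accepts an arbitrary region string
  s3ForcePathStyle: "true"   # path-style addressing, usually required outside AWS
  s3Url: "http://minio.example.com:9000"  # the provider's S3 endpoint
```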
Comment on lines +125 to +126
* Backing up the data plane workload
* Backing up the control plane workload
Contributor

Do we want links here, where the second item is not contiguous?

Comment on lines +11 to +12
* If you are using an _in-place_ update, InfraEnv does not need spare nodes. You need to re-provision the worker nodes from the new management cluster.
* If you are using a _replace_ update, you need some spare nodes for InfraEnv to deploy the worker nodes.
Contributor

Not sure you need the italics here. You are not defining the term, and in other places where there were different paths for different objects, you didn't italicize. Will typical users know which update they are doing?
Also, do we need to state what InfraEnv is?

Suggested change
* If you are using an _in-place_ update, InfraEnv does not need spare nodes. You need to re-provision the worker nodes from the new management cluster.
* If you are using a _replace_ update, you need some spare nodes for InfraEnv to deploy the worker nodes.
* If you are using an in-place update, InfraEnv does not need spare nodes. You need to re-provision the worker nodes from the new management cluster.
* If you are using a replace update, you need some spare nodes for InfraEnv to deploy the worker nodes.

Comment on lines +21 to +22
* You completed the steps in link:https://docs.redhat.com/en/documentation/red_hat_advanced_cluster_management_for_kubernetes/2.13/html/clusters/cluster_mce_overview#remove-a-cluster-by-using-the-console[Removing a cluster by using the console] to delete your hosted cluster.
* You completed the steps in link:https://docs.redhat.com/en/documentation/red_hat_advanced_cluster_management_for_kubernetes/2.13/html/clusters/cluster_mce_overview#removing-a-cluster-from-management-in-special-cases[Removing remaining resources after removing a cluster].
Contributor

Per doc style guide, use the <document_heading_name> (<document_source>) format for external links.

Suggested change
* You completed the steps in link:https://docs.redhat.com/en/documentation/red_hat_advanced_cluster_management_for_kubernetes/2.13/html/clusters/cluster_mce_overview#remove-a-cluster-by-using-the-console[Removing a cluster by using the console] to delete your hosted cluster.
* You completed the steps in link:https://docs.redhat.com/en/documentation/red_hat_advanced_cluster_management_for_kubernetes/2.13/html/clusters/cluster_mce_overview#removing-a-cluster-from-management-in-special-cases[Removing remaining resources after removing a cluster].
* You completed the steps in link:https://docs.redhat.com/en/documentation/red_hat_advanced_cluster_management_for_kubernetes/2.13/html/clusters/cluster_mce_overview#remove-a-cluster-by-using-the-console[Removing a cluster by using the console] ({rh-rhacm} documentation) to delete your hosted cluster.
* You completed the steps in link:https://docs.redhat.com/en/documentation/red_hat_advanced_cluster_management_for_kubernetes/2.13/html/clusters/cluster_mce_overview#removing-a-cluster-from-management-in-special-cases[Removing remaining resources after removing a cluster] ({rh-rhacm} documentation).

Comment on lines +63 to +64
<1> Replace `<restore_resource_name>` with the name of your `Restore` resource.
<2> Replace `<backup_resource_name>` with the name of your `Backup` resource.
Contributor

Suggested change
<1> Replace `<restore_resource_name>` with the name of your `Restore` resource.
<2> Replace `<backup_resource_name>` with the name of your `Backup` resource.
<1> Replace `<restore_resource_name>` with a name for your `Restore` resource.
<2> Replace `<backup_resource_name>` with a name for your `Backup` resource.

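For orientation, a hedged minimal sketch of a Velero `Restore` resource that uses these two placeholders; the extra fields are illustrative, not taken from this PR:

```yaml
# Hedged sketch of a Velero Restore resource; all values are placeholders.
apiVersion: velero.io/v1
kind: Restore
metadata:
  name: <restore_resource_name>
  namespace: openshift-adp
spec:
  backupName: <backup_resource_name>  # the Backup to restore from
  restorePVs: true                    # also restore persistent volumes
  existingResourcePolicy: update
```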
Comment on lines +48 to +47
[id="prepare-bm-dr-oadp-auto_{context}"]
== Preparing bare metal to use {oadp-short}
Contributor

This and Preparing AWS to use OADP should probably be individual modules, as they are distinct use cases. In theory, it would be nice to have a unified heading for both, and the modules would be sub-modules.

Preparing your cluster to use OADP
Preparing AWS to use OADP
Preparing bare metal to use OADP

Contributor Author

Good suggestion. I based the structure here on the existing structure in the other disaster recovery docs for HCP. For this PR, I'll leave the structure as-is, but I will take note of this suggestion to apply it to this content and the other, non-automated, DR docs in the future.

[id="hcp-dr-oadp-dpa_{context}"]
= Automating the backup and restore process by using a DPA

You can automate parts of the backup and restore process by using a Data Protection Application (DPA). The DPA defines information including backup locations and Velero pod configurations.
Contributor

automate parts of the backup and restore process

Which parts can be automated? And, what do I do after creating the DPA object?

.Next steps

* Restoring a hosted cluster by using OADP
Contributor

Need a link here?

Suggested change
* Restoring a hosted cluster by using OADP
* Restoring a hosted cluster by using {oadp-short}

@mburke5678

mburke5678 commented Jun 20, 2025

@lahinson I added some comments. Let me know if you need explanation or further info. Otherwise LGTM. Sorry it took so long!

@mburke5678 mburke5678 added peer-review-done Signifies that the peer review team has reviewed this PR and removed peer-review-in-progress Signifies that the peer review team is reviewing this PR peer-review-needed Signifies that the peer review team needs to review this PR labels Jun 20, 2025
@lahinson lahinson force-pushed the osdocs-11124-hcp-automate-capture branch from f96f100 to f2f368e Compare June 23, 2025 15:16

openshift-ci bot commented Jun 23, 2025

@lahinson: all tests passed!


Labels: branch/enterprise-4.19, branch/enterprise-4.20, peer-review-done, size/L

4 participants