Restore configuration tips

Esta Nagy edited this page Feb 21, 2024 · 8 revisions

Restoring a backup has plenty of options, just like creating one. Please read this article to get a good overview of the available options and the rationale behind using them.

Restore options

| Option | Type | Meaning |
| --- | --- | --- |
| --at-epoch-seconds | integer | The date and time, expressed in UTC epoch seconds, at which the content should be restored. |
| --backup-source | path | Defines which directory contains the backup files. All relevant .cargo files must be directly in the specified folder (including the manifest, index and archive chunks). |
| --delete-missing | boolean | Allows deleting files from the target directory that match the backup source patterns but are not present in the backup increment being restored. Default: false (off). |
| --dry-run | boolean | Only simulates file operations if provided. Default: false (off). This is particularly useful when we first want to see what would happen to our file system if the backup were restored, before actually performing the restore, for example when we are not sure about our restore configuration. |
| --include-path | archive path | Path of the file or directory which should be restored from the backup. Optional. If not provided, all files are restored. The path defined here must use the original path where the files were at the time of the backup. |
| --key-alias | string | The alias of the key inside the P12 store containing the key decryption key. Default: default |
| --key-store | path | Defines where the P12 key store containing the key decryption key can be found. Required only if the backup was encrypted. |
| --permission-comparison | enum | The strategy we should use when comparing permissions. Possible values are STRICT, PERMISSION_ONLY, RELAXED, IGNORE. Optional. Default: STRICT |
| --prefix | string | Defines the prefix of the backup files inside the backup directory. |
| --target-mapping | from_dir=to_dir (mapping) | Defines where the files should be restored to, using mappings that specify which from_dir of the backup source should be restored to which to_dir (as if the two were equivalent). Optional. The from_dir must use the path according to the backup's format, while the to_dir must be a folder on the current file system, written as a valid path on the restore file system. The option can be repeated to define multiple mappings. |
| --threads | integer | Sets the number of threads to use. Default: 1 |
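Since --at-epoch-seconds expects UTC epoch seconds, it can help to compute the value from a human-readable date first. The following is a minimal Python sketch of that conversion. The helper name to_epoch_seconds is hypothetical and is not part of the tool.

```python
from datetime import datetime, timezone


def to_epoch_seconds(year, month, day, hour=0, minute=0, second=0):
    """Return whole UTC epoch seconds for the given UTC date and time.

    Hypothetical helper for illustration only; it is not part of the tool.
    """
    moment = datetime(year, month, day, hour, minute, second, tzinfo=timezone.utc)
    return int(moment.timestamp())


# 2024-02-21 12:00:00 UTC
print(to_epoch_seconds(2024, 2, 21, 12, 0, 0))  # 1708516800
```

The printed number could then be passed on the command line as --at-epoch-seconds 1708516800.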

Restore considerations

This section covers a few considerations worth keeping in mind when preparing for a restore.

Full vs. partial restore

Partial restore is useful when you don't need the full content of the backup, but would like to access a single file or the contents of a single directory. In these cases you just need to define the relevant file or directory using the --include-path option. For example:

java -jar file-barj-job.jar --restore --backup-source /home/user/Backup/ \
     --prefix gzip-backup \
     --include-path /home/user/Pictures/

The above example restores the backed-up content of the /home/user/Pictures/ directory to its original location.

Tip

It is worth mentioning that restoring the full backup content (somewhere, not necessarily to the original location) can be beneficial when we need access to most of the files.

Restoring to the original place or an alternative directory

A backup is not only useful when we have lost a drive or computer due to a hardware failure or theft. It can be used as a save point as well, letting us go back in time to access earlier versions of certain files. Restoring the full backup content to the original location makes the most sense when the contents of the disk or computer were lost. An alternative restore destination should be used when we simply want to access previous versions of our files, or when we are restoring on another machine (or even on a different OS). For example, when we are restoring a UNIX backup on Windows, the original location cannot even exist.

When defining the mappings between the backup files (from) and the desired restore locations (to), we need to make sure we are using the right path formats. On the from side, we need to use the paths as they are found in the backup, while the to side must define paths according to the rules of the destination file system.

The following example shows the mappings for a backup that was made on UNIX and is restored to directories on Windows.

java -jar file-barj-job.jar --restore \
     --backup-source D:/Backup/ \
     --prefix gzip-backup \
     --target-mapping /home/user/Pictures/=C:/FromUnix/Pictures \
     --target-mapping /home/user/Music/=C:/FromUnix/Music
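To make the from/to relationship more concrete, here is a small Python sketch of how a path stored in the backup could be translated with such mappings. It is a simplified model and not the tool's actual implementation. The function name map_restore_path is hypothetical, and leaving unmapped paths at their original location is an assumption based on the behaviour described for restores without mappings.

```python
def map_restore_path(backup_path, mappings):
    """Translate a path stored in the backup to a path on the restore side.

    mappings is a list of (from_dir, to_dir) pairs in the same spirit as
    --target-mapping from_dir=to_dir. Hypothetical model, not the tool's code.
    """
    for from_dir, to_dir in mappings:
        prefix = from_dir.rstrip("/")
        # A path matches when it is the mapped directory itself or lies below it.
        if backup_path == prefix or backup_path.startswith(prefix + "/"):
            return to_dir.rstrip("/") + backup_path[len(prefix):]
    # Assumption: paths without a mapping keep their original location.
    return backup_path


mappings = [
    ("/home/user/Pictures/", "C:/FromUnix/Pictures"),
    ("/home/user/Music/", "C:/FromUnix/Music"),
]
print(map_restore_path("/home/user/Pictures/2023/cat.jpg", mappings))
# C:/FromUnix/Pictures/2023/cat.jpg
```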

Important

The pairs we define above must NOT overlap with each other in any way, in order to ensure that the files will be restored as desired.
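A quick way to sanity-check a set of mappings before running the restore is to look for directories that are equal to or nested inside each other. The Python sketch below does this for UNIX-style paths. It assumes that "overlap" means one directory being the same as, or an ancestor of, another one; the tool may apply a stricter or different rule. The function names are hypothetical.

```python
from pathlib import PurePosixPath


def overlaps(first, second):
    """Return True when the two directories are equal or one contains the other."""
    a, b = PurePosixPath(first), PurePosixPath(second)
    return a == b or a in b.parents or b in a.parents


def find_overlaps(dirs):
    """Return every pair of directories from dirs that overlap each other."""
    return [
        (x, y)
        for i, x in enumerate(dirs)
        for y in dirs[i + 1:]
        if overlaps(x, y)
    ]


# The from_dir side of the example above does not overlap.
print(find_overlaps(["/home/user/Pictures/", "/home/user/Music/"]))  # []
# A parent directory together with one of its children does.
print(find_overlaps(["/home/user/", "/home/user/Music/"]))
# [('/home/user/', '/home/user/Music/')]
```

The same check can be repeated for the to_dir side, using the path rules of the destination file system.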

Should you allow deletions?

In order to answer this question, we should first discuss what this deletion would remove.

What is deleted?

Perhaps the easiest way to understand this is through an example. Suppose we have a directory named dir with the following files:

  • /dir/a.txt
  • /dir/b.txt

We create a backup and then modify the content of each file. Then at some point we rename a.txt to c.txt and a new d.txt is created. If we decided that we wanted to restore the earlier backup, because we need the content of the original files, then we would end up with the following contents after the restore:

  • /dir/a.txt - with the content from the backup
  • /dir/b.txt - with the content from the backup
  • /dir/c.txt - with the updated content of the renamed a.txt
  • /dir/d.txt - with the new content

As you can see, even without allowing deletions, we ended up losing the new content of b.txt when the restore operation overwrote it. At the same time, c.txt and d.txt remained as leftover files. The --delete-missing option is intended to delete such leftover files, allowing the restore operation to fully restore the original state of the /dir/ backup directory.

Note

Deletions are only performed for the files that are in scope for the restore. For example, when --include-path /dir/a.txt is specified, the restore scope is reduced to only one file and nothing is deleted.

Important

Using --include-path /dir/ (or restoring all files by not specifying the include-path option) broadens the scope to the whole /dir/ directory and all files in it, therefore allowing the deletion of c.txt and d.txt. This is worth keeping in mind to avoid unintended deletions.
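The effect of the scope on deletions can be modelled in a few lines of Python. This is a simplified sketch under stated assumptions and not the tool's implementation: the scope is treated as a plain path prefix (the real tool works with the backup source patterns), only file names are tracked, and the function names are hypothetical.

```python
def in_scope(path, scope):
    """Return True when path is the scoped file itself or lies under the scoped directory."""
    root = scope.rstrip("/")
    return path == root or path.startswith(root + "/")


def files_after_restore(backup_files, current_files, scope, delete_missing=False):
    """Return the sorted file names present after restoring the given scope."""
    restored = {p for p in backup_files if in_scope(p, scope)}
    result = set(current_files) | restored
    if delete_missing:
        # Only in-scope files which are absent from the backup are removed.
        leftovers = {
            p for p in current_files
            if in_scope(p, scope) and p not in backup_files
        }
        result -= leftovers
    return sorted(result)


backup = {"/dir/a.txt", "/dir/b.txt"}
current = {"/dir/b.txt", "/dir/c.txt", "/dir/d.txt"}

print(files_after_restore(backup, current, "/dir/"))
# ['/dir/a.txt', '/dir/b.txt', '/dir/c.txt', '/dir/d.txt']
print(files_after_restore(backup, current, "/dir/", delete_missing=True))
# ['/dir/a.txt', '/dir/b.txt']
print(files_after_restore(backup, current, "/dir/a.txt", delete_missing=True))
# ['/dir/a.txt', '/dir/b.txt', '/dir/c.txt', '/dir/d.txt']
```

In the last call b.txt keeps its modified content, because it is outside the restore scope and is neither restored nor deleted.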

When should I allow/prevent deletions?

Preventing deletions can make the most sense when we want to make sure newly created files are not removed or in case we are merging the contents of a backup archive with some already existing other files.

At the same time, allowing deletions can be very useful when we want to have exactly the same content as when the backup was made, for example when we want to restore the full contents of a directory to an earlier state.

Finding the optimal thread count

Contrary to popular belief, more threads won't necessarily make something faster. For example, when we are restoring a backup that used neither compression nor encryption, the whole restore process will be bound by I/O instead of the CPU. Throwing more threads at the problem will not make the process faster. In fact, it might even make it slower when the restore threads start to compete for the I/O bandwidth. With that said, let's look at a few examples below.

High thread count use-cases

If we are using compression and/or encryption on a fast disk, we can benefit from using more threads as it will:

  • Let us make use of more cores, as we are decrypting and unpacking multiple files at the same time.
  • Speed up change detection checks, since we will be able to scan more files in parallel.

Low thread count use-cases

Limiting the thread count to a lower number (or even 1) is useful when we suspect that the disk we are using will become a bottleneck, for example when we are restoring from a network attached volume.
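The two cases above can be condensed into a rough rule of thumb. The Python sketch below is only a starting point for experiments and is not something the tool does automatically. The function name suggest_thread_count is hypothetical, and using the number of CPU cores as the upper bound is an assumption of this sketch; measuring on the actual hardware remains the reliable way to pick a value for --threads.

```python
import os


def suggest_thread_count(compressed, encrypted, slow_storage, cpu_count=None):
    """Return a starting value for --threads based on the rules of thumb above.

    Hypothetical heuristic: I/O bound restores get one thread, CPU bound
    restores get one thread per core (an assumption, not a measured optimum).
    """
    if cpu_count is None:
        cpu_count = os.cpu_count() or 1
    # Without compression and encryption, or on slow storage, I/O is the limit.
    if slow_storage or not (compressed or encrypted):
        return 1
    return max(1, cpu_count)


print(suggest_thread_count(True, False, False, cpu_count=8))   # 8
print(suggest_thread_count(False, False, False, cpu_count=8))  # 1
print(suggest_thread_count(True, True, True, cpu_count=8))     # 1
```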

When do you need permission comparison strategies?

The permission comparison strategies are in general not useful when you are restoring your files on the same system you used for the backup. These strategies were created to help reduce the unwarranted errors and effort generated by certain edge cases. In short, when the permissions are compared, the following components can be compared:

  • the POSIX permission string
  • the name of the owner
  • the name of the group

Please find what each strategy does below.

STRICT

Compares all 3 components. This is the recommended strategy when you are restoring on the same (or a very similar) system. It is important that the target system uses a similar file system (supporting the same permission features), and that a user and a group with the same names exist on it.

PERMISSION_ONLY

Compares only the POSIX permission strings, ignoring the owner's and the group's name. This is the next best option when the file system is similar on the machine where we are restoring, but the user or the group does not exist.

RELAXED

Compares only the first 3 characters of the POSIX permission strings, ignoring the owner's and the group's name. It can be used in case you are restoring a Windows backup on a UNIX computer.

IGNORE

Ignores all 3 components, returning that they are matching regardless of the input. This is the recommended setting for restoring UNIX backups on Windows, as the Windows file system implementation cannot match the POSIX permissions.
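The four strategies can be summarised with a short Python sketch that follows the descriptions above. It is an illustration only and not the tool's code. The Permissions type and the permissions_match function are hypothetical names, and the POSIX permission string is assumed to be a 9-character string such as rwxr-xr-x.

```python
from enum import Enum
from typing import NamedTuple


class Permissions(NamedTuple):
    posix: str  # assumed format, for example "rwxr-xr-x"
    owner: str
    group: str


class Strategy(Enum):
    STRICT = "STRICT"
    PERMISSION_ONLY = "PERMISSION_ONLY"
    RELAXED = "RELAXED"
    IGNORE = "IGNORE"


def permissions_match(strategy, backup, current):
    """Compare the permissions from the backup with the ones found on disk."""
    if strategy is Strategy.STRICT:
        # Permission string, owner name and group name must all be equal.
        return backup == current
    if strategy is Strategy.PERMISSION_ONLY:
        return backup.posix == current.posix
    if strategy is Strategy.RELAXED:
        # Only the first 3 characters of the permission string are compared.
        return backup.posix[:3] == current.posix[:3]
    return True  # IGNORE: always treated as matching


backup = Permissions("rwxr-xr-x", "alice", "staff")
current = Permissions("rwxrwxrwx", "bob", "users")
print([permissions_match(s, backup, current) for s in Strategy])
# [False, False, True, True]
```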