Skip to content

Commit

Permalink
Initial commit
Browse files Browse the repository at this point in the history
  • Loading branch information
Jason Rust committed Jan 11, 2012
0 parents commit a2fb14e
Show file tree
Hide file tree
Showing 51 changed files with 1,990 additions and 0 deletions.
57 changes: 57 additions & 0 deletions CHANGES
Original file line number Diff line number Diff line change
@@ -0,0 +1,57 @@
2.3
===
* The example tar script now does not throw an error the first time the script runs.
* Now works correctly when only backing directories up once a day (limit => 1).
* Backup directories and exclude directories can now contain spaces.

2.2
===
* A per-host shell command can now be run for both a successful and
unsuccessful backup (Nathan).
* A different style of incremental backups is now available for each
the hard link method is not used and only changed files are put into
the backup directories. Useful for backup setups with lots of files
(Nathan).
* The ssh port is now configurable (Phil Schultz <[email protected]>)
* Include patterns are now supported (Dan Allen <[email protected]>)
* If ribs is backing up its host machine as the current user ssh is not used
and no password is needed (Dan Allen <[email protected]>)

2.1
===
* Added example test configuration and directory for easy testing of RIBS.
* Made default config options more sensible.
* Fixed bug where logfile wasn't created if it wasn't already present.
* Code cleanup.
* Moved reinit to its own function and made it possible to reinit
configurations that aren't turned on (Phil Schultz [email protected]).
* Backups will try and finish all specified backups even if an error
is hit during one of them (Phil Schultz [email protected]).

2.0.1
=====
* Fixed Console_Getopt bugs. Must be using version 1.0 of that library now.
* Cleaned up the README

2.0
===
* Added --test option which performs a dry run.
* Added --reinit option which reinitializes the backup from scratch.
* Added --debug option which outputs debug information.
* Improved error handling so that hitting CTRL-C during execution
shouldn't cause program to hang.
* Fixed bug where multiple excludes weren't working (Thanks to Brian
Johnson blj AT thermoanalytics DOT com).
* Added examples in the README on how to use the excludes feature
* Added example in the README on how to extract an incremental backup.
* Code cleanup.

1.1
===
* Now using PEAR's Getopt utility to make the obtaining of command line
args work with different versions of PHP
* Fixed up some the documentation to be more accurate.

1.0
===
* Initial release
340 changes: 340 additions & 0 deletions LICENSE

Large diffs are not rendered by default.

126 changes: 126 additions & 0 deletions README
Original file line number Diff line number Diff line change
@@ -0,0 +1,126 @@
RIBS (Rsync Incremental Backup System) by Jason Rust <[email protected]>
You can find the latest version of this script over at:
http://www.ribs-backup.org/

Description:
RIBS is an incremental backup system written in PHP which utilizes some
common *nix programs (specifically rsync, ssh and cp). Incremental
backups mean frequent backups can be done (i.e. hourly) with only around
2x the space of the full backup. Using rsync means that RIBS can act as
both a backup script on a local machine, or as a script to backup
several network hosts. It is designed to be highly configurable and
highly informative to the system administrator. There is a high amount
of error checking, and logging/email capabilities.

Requirements:
* rsync - http://samba.anu.edu.au/rsync/
* cp & rm - http://www.gnu.org/software/fileutils/fileutils.html
* PHP - http://www.php.net/
* basic PEAR libraries (as of version 1.1) - http://pear.php.net/
* PEAR's Console_Getopt-1.0 package. This comes with PEAR,
but many people have version 0.11 which won't work. Get it at:
http://pear.php.net/package-info.php?pacid=67

Quick Usage Explanation:
For those in a hurry or just wanting to test out the script, the below
commands should get you up and going:
* Download the latest version of RIBS
* tar -xzvf ribs-x.x.tar.gz
* cd ribs-x.x
* ./ribs.php example hourly

After that the test example backup should be run using the test directory
that comes with RIBS. From there you can customize the options and start
running backups on real data.

Detailed Usage Explanation:
Install rsync. Set it up to run over ssh (you will need to install ssh
keys on the servers you will be backing up (man ssh-keygen). If you set
this up right you should be able to ssh from the backup machine to the
remote host as the backup user without it asking you for a password
Next, go through the user settings of this script to set up the hosts
you want to backup and the different configuration options (such as
email and logging settings). Last you need to set the script up to run
in crontab for your different hosts.

An example crontab entry might look something like the following:
0 0-23/3 * * * /usr/local/bin/ribs my_host hourly # run my_host every three hours
59 1 * * * /usr/local/bin/ribs my_host,big_host daily # run these two hosts daily
58 1 * * 0 /usr/local/bin/ribs small_host weekly # run small_host once a week
57 1 1 * * /usr/local/bin/ribs ALL monthly # use the keyword ALL to run all hosts monthly

Notice that we schedule the daily, weekly, and monthly to occur at a
different hour than the hourly ones.

You can run this script from the command line and may want to do so a
few times before installing it in crontab to make sure you have worked
out the kinks. Also it is important to schedule the cron jobs such that
they will not overlap with each other. In other words, if the daily
backup runs at the same time as the hourly backup you will have
problems. Generally, scheduling the backups 15 minutes apart will work.

Exclude Patterns:
Much of the following explanation of exclude patterns comes from the
rsync man page. The patterns can take several forms. The rules are:

* If the pattern starts with a / then it is matched against the start of the
filename, otherwise it is matched against the end of the filename. Thus
"/foo" would match a file called "foo" at the base of the tree. On the
other hand, "foo" would match any file called "foo" anywhere in the tree
because the algorithm is applied recursively from top down; it behaves as
if each path component gets a turn at being the end of the file name.

* If the pattern ends with a / then it will only match a directory, not a file,
link or device.

* If the pattern contains a wildcard character from the set *?[ then expression
matching is applied using the shell filename matching rules. Otherwise a simple
string match is used.

* If the pattern includes a double asterisk "**" then all wildcards in the
pattern will match slashes, otherwise they will stop at slashes.

* If the pattern begins with a + then the file will be included. However, the
include rule must come before the exclude rule in order to override it.

* Examples:
'directories' => '/etc/rc.d'
'excludes' => '/init.d' // exclude the top level init.d directory in rd.d/
'excludes' => '*.sh' // exclude all shell files
'excludes' => '+foo.sh *.sh rc*.d/' // exclude all shell files, except foo.sh, and all rcX.d directories

Backup Types:
The default backup type is incremental using hard links. This means
that every directory will look like a full backup, but it will only take
the space of the backup plus the changed files. However, for backups
with lots of files (>1000) this can become slow. Thus, the other option
is to use set the 'use_hard_links' option to false for the backup
configuration. This will keep a full backup in the most recent
directory, but only archive changed files in the other directories.
So, roughly the same amount of space will be used, but not every
directory will look like a full backup, and it will be faster for
backups with lots of files.

Extracting Incremental Backups From a hard linked backup:
If an hourly backup is done and you would like to extract all changed files
from that backup the following command will achieve that:
find /backups/backup_name/hourly.0 -type f -links 1 | sed 's, ,\\,g' | xargs tar -czf /tmp/foo.tar.gz

Note on ssh and port forwarding:
If port forwarding with ssh means nothing to you, then you can ignore below.

When connecting to the same 'host' twice, but the second connection is to a port
forward to another host (i.e. behind a firewall), StrictHostKeyChecking (in the
ssh config file on the host running ribs) will need to be disabled because the
host key of the first port will conflict with the host key of the second port.

An example: Machine A is x.x.x.x Machine B is y.y.y.y Machine A has ssh
running on port 22. Machine B also has ssh listening on port 22, but Machine B
is not accessible from the outside. So a port forward is setup on Machine A to
get traffic to Machine B (e.g x.x.x.x:999 -> y.y.y.y:22)

Credits:
Thanks to Mike Rubel for his excellent paper and sample code...
http://www.mikerubel.org/computers/rsync_snapshots/
Thanks to Greg Lawler (http://zinkwazi.com) for the first BASH version
Thanks to Shai (http://shaibn.com/) for maintaining ribs-backup.org
Loading

0 comments on commit a2fb14e

Please sign in to comment.