rsync is a command line utility that is used to synchronize files between two computers over a network to synchronize files between two filesystems. It was written as a replacement for rcp but with many new features. For example it uses an algorithm that will only transfer files that have been modified. SSH will be used to authenticate between the machines and to encrypt the network traffic.
The situation: We have four machines named: server, machine1, machine2, and machine3. The server has a tape drive that is used to do nightly backups. machine1 is used as a development box and has files that need to be backed up in /src and in /home. machine2 is used for mail and needs /home and /mail to be backed up. machine3 is a web server and needs /home, /var/www, and /etc/httpd backed up.
Create a shell script for each machine. Simplify your maintenance by placing the scripts in a central location. I like to use /root/scripts. Decide on where you want to log your output. I like /root/logs but another common option is to have the script mail you the output.
Add entries to your crontab to call the scripts. Make sure you leave enough time before your normal backups of the server that the rsync jobs complete.
Each night the following will occur:
rsync machine1 -> Server
rsync machine2 -> Server
rsync machine3 -> Server
backup server to tape
Let's take a look at the flags used for rsync in the examples:
Next generate a public private key pair with ssh. Place the public key in the ~/.ssh/authorized_keys file in an account on machine1, machine2, and machine3 that has read access to the directories that need to be backed up. It is best not to use the root account on the remote machines, but you should evaluate the risk in your environment. Test that you can login to these accounts using ssh without using a password.
Test each one of the rsync scripts. The first time you run rsync will take the longest as it will need to copy all the files from the remote machines and not just the files that have changed.
Add the /machine1, /machine2, and /machine3 (or whatever you have named them) directories to the servers backup script.
While this process does not backup the entire remote machine, it will ensure that you will not lose irreplaceable data.
Starting with the example scripts included in this tutorial there are many changes that can be made to fit your specific circumstances.
The frequency of the rsyncs can be modified to occur more often or at different times. Simply by adding additional crontab lines the backup from the remote machines could be done everyday at lunch, multiple times a day or even hourly.
The scripts could also be changed to rotate between multiple backups on the server or could be changed to do some sort of processing on the files before they are backed up. For example if the home directories you are backing up contain web browser caches, they could be removed after the rsync but before the system backup.
Using this article as a starting point you should create a backup plan that fit your needs.