Administration: Backup part 1

I am in the process of doing a tidy up of my home servers and software. Several of my current machines were installed a few years ago and have got a little messy over the years. I decided that the first priority (before any reinstalls) was to get backups sorted out.

The first stage of this is to back up data that changes regularly, such as my home directory, web pages and /etc files on machines. I've decided to use rsnapshot to do this.

Introduction

The idea behind rsnapshot is that it uses rsync to connect to a remote machine and make a local copy of the areas being backed up. It then keeps multiple generations of copies but saves disk space by hard-linking files that have not changed between generations. So 10 generations of a file system should use only a little more space than one generation.
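The hard-link trick can be seen with a few shell commands. This is only a sketch of the idea, not what rsnapshot runs internally: the directory names gen.0 and gen.1 are made up for illustration, and a temporary directory stands in for the snapshot root.

```shell
# Temporary stand-in for a snapshot root (hypothetical layout).
work=$(mktemp -d)
mkdir "$work/gen.0" "$work/gen.1"

# The first generation holds the real data.
echo "unchanged content" > "$work/gen.0/a.txt"

# Second generation: the file has not changed, so hard-link it instead of copying.
ln "$work/gen.0/a.txt" "$work/gen.1/a.txt"

# Both names point at the same inode, so the data is stored only once.
inode0=$(ls -i "$work/gen.0/a.txt" | awk '{ print $1 }')
inode1=$(ls -i "$work/gen.1/a.txt" | awk '{ print $1 }')

if [ "$inode0" = "$inode1" ]; then
    shared=yes
else
    shared=no
fi
echo "same inode: $shared"

rm -rf "$work"
```

A file that does change between runs gets a fresh copy in the newer generation, which is where the extra disk usage per generation comes from.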

I installed rsnapshot on my archive machine using apt-get to install the default Ubuntu package. I then decided on where I was going to keep the backups and changed snapshot_root in /etc/rsnapshot.conf to point to it.

I then followed Sections 4.3.7, 4.3.8 and 4.4 of the Howto document so that I was backing up /etc/, /home and other areas of the archive host.

Using ssh keys

To back up other machines I decided to use ssh keys with a wrapper. First I created a key on the archive machine:

$ ssh-keygen -t dsa -f back_key
Then I copied the public key (back_key.pub) over to the target machine (the one to be backed up) and added it to root's authorized_keys file:

# cd /root/.ssh
# cat back_key.pub >> authorized_keys
I then edited the authorized_keys file and prefixed the key's line with a forced command that logs whatever command rsnapshot asks the target to run:

command="echo $SSH_ORIGINAL_COMMAND >> /tmp/cmd.log" ssh-dss AAAAB3Nz......
To test I manually used the key to connect from the archive to the target machine:

ssh -i back_key root@green.darkmere.gen.nz ls /etc
This fails, because the forced command runs instead of ls, but it appends the "ls /etc" line to the cmd.log file on the target. I then updated the rsnapshot.conf file on the archive machine to use the ssh key for the connection and to attempt to back up the target:

ssh_args        -i /path/to/back_key
#
backup  root@target:/etc/       target/
A test (using "rsnapshot -v daily") will fail, but the cmd.log on the target should now contain something like:

rsync --server --sender -logDtprR --numeric-ids . /etc/
The next step is to allow only this command to be run. On the target machine, create the file /usr/local/bin/validate-cmd with the following content:

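#!/bin/sh
# Forced-command wrapper: only allow the rsync sender command used by rsnapshot.
case "$SSH_ORIGINAL_COMMAND" in
        # Reject anything containing these shell metacharacters.
        *\&*|*\(*|*\{*|*\;*|*\<*|*\`*)
                echo "Rejected"
                ;;
        # Allow the rsync invocation seen in cmd.log, for any absolute path.
        rsync\ --server\ --sender\ -logDtprR\ --numeric-ids\ .\ /*)
                # Deliberately unquoted so the command is split into words.
                $SSH_ORIGINAL_COMMAND
                ;;
        # Everything else is refused.
        *)
                echo "Rejected"
                ;;
esac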
This wrapper checks which command is being run and restricts it as much as possible. The wrapper needs to be made executable and the authorized_keys file updated to use it:

command="/usr/local/bin/validate-cmd" ssh-dss AAAAB3NzaC1kc.........
Running "rsnapshot -v daily" should now back up the target machine. Other directories on the target and other target machines can be added fairly easily; in most cases you should just have to copy the authorized_keys and validate-cmd files to the new machine.
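Before trusting the wrapper it is worth checking its pattern matching without going through ssh at all. The sketch below copies the same case patterns into a function that only prints a verdict and never executes anything; classify_cmd is a name made up for this example, and the third test string is an invented attack attempt.

```shell
# Dry-run copy of the wrapper's checks: prints a verdict instead of running anything.
# classify_cmd is a hypothetical helper name, not part of rsnapshot or ssh.
classify_cmd() {
    case "$1" in
        # Same metacharacter checks as validate-cmd.
        *\&*|*\(*|*\{*|*\;*|*\<*|*\`*)
            echo "Rejected"
            ;;
        # Same allowed rsync command as validate-cmd.
        rsync\ --server\ --sender\ -logDtprR\ --numeric-ids\ .\ /*)
            echo "Allowed"
            ;;
        *)
            echo "Rejected"
            ;;
    esac
}

r1=$(classify_cmd 'rsync --server --sender -logDtprR --numeric-ids . /etc/')
r2=$(classify_cmd 'ls /etc')
r3=$(classify_cmd 'rsync --server --sender -logDtprR --numeric-ids . /etc/; rm -rf /')
echo "$r1 $r2 $r3"
```

The installed wrapper can be checked in a similar way on the target by setting the variable by hand, for example SSH_ORIGINAL_COMMAND='ls /etc' /usr/local/bin/validate-cmd, which should print Rejected.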

Problems

While I was setting up I came across a few problems:

  1. The exclude option to prevent areas being backed up appears to be config-wide. There is no simple way to exclude a file or directory on one machine but not on all of them.
  2. Configuration lines must use tabs between items; spaces result in errors.
  3. The -t option is very good for checking configs and -v for doing test runs.
But overall I got everything working and backing up 6 machines in just a couple of hours.
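The tab problem in particular is easy to miss by eye. As a rough sketch, the function below lists config lines that contain no tab at all; find_untabbed_lines is a made-up helper name, the sample config is invented for the demonstration, and the check will not catch a line that mixes tabs and spaces.

```shell
# Hypothetical helper: print (with line numbers) every non-comment,
# non-blank line of a config file that contains no tab character.
find_untabbed_lines() {
    tab=$(printf '\t')
    # First grep drops comments and blank lines, second keeps lines without a tab.
    grep -n -v -e '^[[:space:]]*#' -e '^[[:space:]]*$' "$1" | grep -v "$tab" || true
}

# Small sample config: line 1 uses a tab, line 2 wrongly uses spaces.
conf=$(mktemp)
printf 'snapshot_root\t/Backups/rsnapshot/\n' > "$conf"
printf 'backup  root@target:/etc/ target/\n' >> "$conf"

bad=$(find_untabbed_lines "$conf")
echo "$bad"

rm -f "$conf"
```

This only complements rsnapshot's own checking; running "rsnapshot configtest" against the real file is still the authoritative test.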

Result

After running the program for 3 days I am now using:

# rsnapshot du
20G     /Backups/rsnapshot/hourly.0/
946M    /Backups/rsnapshot/hourly.1/
834M    /Backups/rsnapshot/hourly.2/
870M    /Backups/rsnapshot/hourly.3/
1.1G    /Backups/rsnapshot/daily.0/
823M    /Backups/rsnapshot/daily.1/
1009M   /Backups/rsnapshot/daily.2/
26G     total
Each additional copy adds only about 5% more disk space (most of which is due to some large, frequently changing mail, log and Firefox cache files).
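The 5% figure can be checked roughly from the du output above. This sketch assumes du's M and G suffixes are binary units (so 20G is 20480M and 1.1G is about 1126.4M), and since du rounds its figures the result is only approximate.

```shell
# Sizes of the six later snapshots in MiB, taken from the du output above
# (1.1G converted to 1126.4M under the binary-unit assumption).
# base is hourly.0 in MiB (20G * 1024).
avg_pct=$(printf '%s\n' 946 834 870 1126.4 823 1009 |
    LC_ALL=C awk -v base=20480 '{ sum += $1 } END { printf "%.1f", (sum / NR) / base * 100 }')

echo "average extra snapshot: ${avg_pct}% of hourly.0"
```

On these numbers the average extra snapshot comes out at a little under 5% of the first full copy.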

Links