Obviously lots of people have solved this problem before, but my situation was unique enough that many options were off the table. I had a few requirements:
- My home directory (primaries) is encrypted with Ubuntu's TrueCrypt setup and the documents I most want to back up are financial in nature, so I wanted my backups always encrypted on disk.
- I wanted snapshotted backups so that I was resilient against hardware failures, but also against rm -rf stupidity.
- However, I did not want to simply store a series of diffs, as that would make recovery more complex and I wanted recovery to be simple.
- Still, I wanted efficiency and speed so I wasn't choking my internal network at various points in the day.
Most importantly though, I wanted to understand exactly how my backup system works and what it's doing. Rather than trust some other code that I didn't understand and couldn't tweak, I wanted to roll my own. Most folks do this with shell scripts, cron, and rsync. I wanted to do something similar, but since my shell-fu is abysmal, I decided on Python.
If this is useful to anyone else, I've shared my code. The script has two modes controlled by arguments: backup and snapshot.
Backup:
- Optionally tries to mount a path which should be set up in /etc/fstab. In my case, this is NFS.
- Mounts an encrypted filesystem at /mnt/.../current/
- Rsyncs a series of files and paths to /mnt/.../current/
- Optionally unmounts the encrypted filesystem.
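The backup steps above can be sketched roughly as follows. This is a minimal Python 3 illustration, not the actual script: the function names build_backup_steps and run_backup are made up for this example, and it assumes the encrypted filesystem is encfs (as the config further down suggests), mounted with its -S option so the password is read from stdin, and unmounted with fusermount -u.

```python
import subprocess


def build_backup_steps(encrypted_dir, decrypted_dir, rsync_flags, paths,
                       pre_mount=None, unmount=True):
    """Return the backup sequence as a list of argv lists, in run order.

    All names here are hypothetical; they only mirror the steps in the list.
    """
    steps = []
    if pre_mount:
        # Relies on an /etc/fstab entry for this path (NFS in my case).
        steps.append(["mount", pre_mount])
    # encfs -S reads the password from stdin instead of prompting.
    steps.append(["encfs", "-S", encrypted_dir, decrypted_dir])
    # rsync writes into the decrypted view; encfs encrypts on the way to disk.
    steps.append(["rsync"] + rsync_flags.split() + list(paths) + [decrypted_dir])
    if unmount:
        steps.append(["fusermount", "-u", decrypted_dir])
    return steps


def run_backup(steps, password):
    """Run each step in order; check=True stops at the first failure."""
    for argv in steps:
        stdin_data = password + "\n" if argv[0] == "encfs" else None
        subprocess.run(argv, input=stdin_data, text=True, check=True)
```

Keeping the command construction separate from the execution makes it easy to print the steps for a dry run before letting the script touch anything.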
Snapshot:
Makes up to N periodic snapshots of the encrypted files at one of several frequencies. For example, it might be configured to keep 24 hourly snapshots, 7 daily snapshots, 4 weekly snapshots, and 3 monthly snapshots. Any number of snapshots can be kept at any frequency.
The snapshots are taken of the encrypted files, not from the decrypted filesystem. As a result, you can run the snapshots directly on the remote backup system; I run it on my NAS. It works just fine if you run it locally as well.
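To make the "up to N snapshots per frequency" idea concrete, here is a small sketch of just the retention bookkeeping. It does not show how a snapshot is actually created, and the function names and the snapshot naming scheme are assumptions for illustration only. The spec string follows the hourly=12,daily=7 style used in the config.

```python
def parse_snapshot_spec(spec):
    """Turn 'hourly=12,daily=7' into {'hourly': 12, 'daily': 7}."""
    limits = {}
    for item in spec.split(","):
        name, _, count = item.partition("=")
        limits[name.strip()] = int(count)
    return limits


def snapshots_to_prune(existing, limits):
    """Return the snapshot names that exceed the configured limits.

    existing maps a frequency name to its snapshot names, oldest first.
    Frequencies with no configured limit are left untouched.
    """
    doomed = []
    for freq, names in existing.items():
        if freq not in limits:
            continue
        excess = len(names) - limits[freq]
        if excess > 0:
            # Oldest snapshots sit at the front of the list.
            doomed.extend(names[:excess])
    return doomed
```

For example, with a limit of hourly=2 and three hourly snapshots on disk, only the oldest one would be returned for deletion.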
Both modes are managed using a .backuprc file in the user's home directory. For example, mine looks something like this:
    # Optional, log all events
    LOG_FILE /home/greg/logs/backup.log

    # Optional, we try to mount this path first. Failures halt execution.
    PRE_MOUNT /mnt/backup/

    # Required, password and mount point for encrypted/decrypted file
    # systems. The password can be in plaintext since this file is stored
    # on an encrypted filesystem anyway. We aren't going for paranoid.
    ENCFS_PASSWORD AddYourOwnPasswordHere
    ENCRYPTED_MOUNTPOINT /mnt/backup/desktop/

    # This is where we will write files unencrypted. Must be empty, must
    # not be mounted already.
    DECRYPTED_MOUNTPOINT /mnt/encryptedbackup/

    # Required, rsync flags.
    RSYNC_FLAGS -CRa --delete

    # Number of snapshots. Format: [type=,...] e.g. hourly=12,daily=7
    SNAPSHOTS hourly=12,daily=7,weekly=2,monthly=1

    # List of file paths to rsync. Any line that doesn't contain a space is
    # a file path. Paths can be filenames or directories. This is simply
    # the argument passed to rsync. As a result, you can use rsync features
    # like adding a "./" directory to tell rsync which components of the
    # path to sync over.
    /home/greg/./.heartbeat
    /home/greg/./src/
    /home/greg/./financial/
    /home/greg/./picasa/

The Python source, an example .backuprc, and an example crontab are all found over here on GitHub.
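The .backuprc format is simple enough to read in a few lines: comment lines start with #, a line containing whitespace is a KEY VALUE setting, and any other line is a path for rsync. The following is only a sketch of that rule; parse_backuprc is a hypothetical name and the real script may well parse the file differently.

```python
def parse_backuprc(text):
    """Split .backuprc contents into (settings, paths).

    Lines starting with '#' and blank lines are skipped. A line with
    whitespace in it is treated as 'KEY VALUE'; anything else is a path.
    """
    settings = {}
    paths = []
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        parts = line.split(None, 1)  # split on the first run of whitespace
        if len(parts) == 2:
            key, value = parts
            settings[key] = value
        else:
            paths.append(line)
    return settings, paths
```

One consequence of this rule, which also follows from the comment in the config itself, is that a path containing a space cannot be listed, since it would be read as a setting.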
Some other helpful resources I came across while putting this together:
- ReadyNas Root Access Add On
- Install Python2.6 on a ReadyNas NV
- RBackup - Diff based backups with python
- The ultimate guide to rsync backups
- How to set up encfs for use with rsync