thefoggiest.dev

Automatic backups for sometimes-on (Ubuntu) machines

Backing up data on a regular basis is useful, even necessary, but setting it up is a pain. That’s mainly because you have to make some smart choices to get it working, and the existing solutions are all unsatisfactory. Simple Backup Solution creates a single big tarball for every backup and doesn’t tell me when it is making one. Flyback can only write to locally mounted ext3 file systems. TimeVault works constantly in the background. And all of them only back up at specific times, or manually.

Big tarballs are risky: if one gets damaged or only partially copied because, say, I’m shutting down my computer, I lose the entire backup. In such a scenario I could easily go weeks without a usable backup. The data I want to back up is on computers that are not always on, so I can’t use scheduled backups. Also, most of the files are in compressed formats already, so compression won’t do much good anyway. I don’t want to have to remember to make a backup, so manual backups are out too. The disk I want to store the backups on is not locally mounted. And making a backup takes time and resources, so I don’t want it to get in my way while I’m doing something else. All this is why I have put it off for too long, and I imagine many people have done the same.

I have one computer that is in fact always on: my home server. For that one I have already set up a simple cron-scheduled, rsync-based backup system, which works fine. I also have a desktop box and two laptops. Those three are the ones that need a tailored backup mechanism.

  1. I only want to back up my home directories at this point. I don’t have enough tweaks to warrant backing up /etc, and I use the home server for server purposes, so backing up /var on the client machines isn’t needed either.
  2. I don’t want to use tarballs.
  3. Only changes should be stored to save time and avoid unnecessary bandwidth usage.
  4. I don’t want to back up cache files, thumbnails and other unneeded stuff.
  5. The drive on which the backups will be stored is mounted on a machine that is always on.
  6. I only need a single backup, which should contain the most recent version of each file. I only need to go back in time for very specific files (source code and such) and I already use Subversion for that.
  7. I have several MySQL databases that I want to back up as database dumps.

These demands lead me to the following solution:

  1. Backups will take place as part of shutting down the machine. Every time. This way it happens automatically, and when I’m not using the computer. No user action is required.
  2. I use rsync for copying individual files. rsync can do incremental transfers, which means that only changes are backed up. It can use ssh for network transport, and it can easily be configured to exclude specific files and directories. ssh can be configured for password-less authentication, so no user action is needed there either.
  3. Before syncing, I will make a dump of all MySQL databases, so that they are included in the backup.
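Put together, the shutdown hook boils down to a script along these lines. This is a sketch only: the server name, user name, password, and paths are placeholders rather than values from the post, and here the script is merely written to /tmp instead of being installed as a shutdown hook.

```shell
#!/bin/sh
# Write an example shutdown-backup script to /tmp (a real setup would
# install it so it runs at shutdown; the how-to covers that part).
set -e
cat > /tmp/backup-on-shutdown.sh <<'EOF'
#!/bin/sh
# Dump all MySQL databases first so the dump is swept up by the sync.
# (In practice, put the credentials in ~/.my.cnf rather than on the
# command line; "backup"/"secret" are placeholder credentials.)
mysqldump --all-databases --user=backup --password=secret \
    > "$HOME/mysql-dump.sql"
# Incremental copy over ssh to the always-on server; key-based ssh
# authentication means no password prompt interrupts the shutdown.
rsync -a --delete --exclude-from="$HOME/.backup-excludes" \
    "$HOME/" backup@server:/srv/backups/$(hostname)/
EOF
chmod +x /tmp/backup-on-shutdown.sh
```

Using `$(hostname)` in the destination keeps the three client machines from overwriting each other’s backups on the server.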

I have written up how I achieved this as a list of instructions, so you can follow along if you like. It starts on the next page. You can direct feedback to this blog post.
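The password-less ssh mentioned above is usually set up with a passphrase-less key plus ssh-copy-id. A sketch, with the key path, user, and host as examples (the demo only generates the key under /tmp; the copy step is shown as a comment because it needs the actual server):

```shell
#!/bin/sh
# Create a passphrase-less key dedicated to backups (demo path in /tmp).
set -e
rm -f /tmp/backup_key /tmp/backup_key.pub
ssh-keygen -q -t ed25519 -N "" -f /tmp/backup_key
# In real use, install the public key on the backup server, e.g.:
#   ssh-copy-id -i /tmp/backup_key.pub backup@server
# and point rsync at it with: rsync -e "ssh -i ~/.ssh/backup_key" ...
```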



Categorised as: howto, linux


Comments

  1. Christian says:

    If you want snapshotting, have a look at rsnapshot (or dirvish). They make hardlinks and only rsync the deltas. Works quite well.
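    The hardlink trick this comment refers to can be sketched in plain shell; it is the technique rsnapshot automates, shown here with demo paths under /tmp:

    ```shell
    #!/bin/sh
    # Hardlink-copy the current backup into a dated snapshot. Hardlinked
    # files share an inode, so unchanged files take no extra space; a
    # subsequent rsync replaces only the changed files in "current",
    # leaving the snapshot's copies intact.
    set -e
    rm -rf /tmp/snapdemo
    mkdir -p /tmp/snapdemo/current
    echo "v1" > /tmp/snapdemo/current/file.txt
    cp -al /tmp/snapdemo/current "/tmp/snapdemo/snap-$(date +%F)"
    ```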
