For several years, I’ve kept a copy of my important files on a secondary hard drive. I’d been using a crude “purge and re-copy” script to do this, and because it copied everything from scratch on every run, it was extremely slow for anything but a small number of files. So I kept my backups small. I wasn’t really happy with this, since I would have preferred to keep a synchronized copy of all of my files.
I dabbled with writing a utility to do the copying, something that would only copy the changed files, but the complexity kept stopping me. Then I discovered that such a utility already exists.
Rsync is a command-line utility that synchronizes sets of directories and files between file systems. It was written primarily for remote file copying, but it works really well for local file copies too.
Here’s an example, showing how I use it for backups:
rsync -lrt --delete /home/jimc/Documents /media/HD2/fullsync
Running this command leaves a synchronized copy of the Documents folder on HD2. Here /home/jimc/Documents is the source (my working copy) and /media/HD2/fullsync is the target, so the copy ends up at /media/HD2/fullsync/Documents.
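The target path follows from how rsync treats the source argument: without a trailing slash, the directory itself is created inside the target; with a trailing slash, only its contents are copied there. Here is a minimal sketch of that rule in Python. The helper name `rsync_target_path` is made up for illustration, and it only predicts the path; it doesn't call rsync.

```python
import posixpath


def rsync_target_path(source: str, dest: str) -> str:
    """Predict where the files from `source` land under `dest`.

    Hypothetical helper for illustration only; it mirrors rsync's
    trailing-slash rule for a directory source.
    """
    if source.endswith("/"):
        # Trailing slash: the *contents* of source go directly into dest.
        return dest.rstrip("/") or "/"
    # No trailing slash: the directory itself is created inside dest.
    return posixpath.join(dest, posixpath.basename(source))


print(rsync_target_path("/home/jimc/Documents", "/media/HD2/fullsync"))
# -> /media/HD2/fullsync/Documents
print(rsync_target_path("/home/jimc/Documents/", "/media/HD2/fullsync"))
# -> /media/HD2/fullsync
```

So if you would rather have the files land directly in /media/HD2/fullsync, add a trailing slash to the source path.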
The command line arguments are as follows:
- -l copy symlinks as symlinks
- -r recurse into directories
- -t preserve modification times
- --delete delete extraneous files from destination directories (this ensures that when you delete a file in your source directory, it doesn’t hang around in your target directory)
Rsync has plenty of additional command line arguments, but these are the ones I need.
I’ve also created a Python script to simplify the backup process. You can download a copy here. (Look for “FullSync”.)
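For a rough idea of what a wrapper like that might involve, here is a minimal sketch. It is not the FullSync script: the function names `build_rsync_command` and `full_sync` are invented for illustration, and the optional `--dry-run` flag (an rsync option that reports what would change without changing anything) is an extra that isn't part of the command shown earlier.

```python
import subprocess


def build_rsync_command(source, dest, dry_run=False):
    """Assemble the rsync argument list used in this post.

    Hypothetical helper: -l, -r, -t and --delete are the flags described
    above; --dry-run is an optional extra for previewing changes.
    """
    cmd = ["rsync", "-lrt", "--delete"]
    if dry_run:
        cmd.append("--dry-run")
    cmd += [source, dest]
    return cmd


def full_sync(source, dest, dry_run=False):
    """Run rsync and raise CalledProcessError if it exits with an error."""
    return subprocess.run(build_rsync_command(source, dest, dry_run), check=True)


if __name__ == "__main__":
    # Print the command instead of running it, so nothing is touched here.
    print(" ".join(build_rsync_command("/home/jimc/Documents",
                                       "/media/HD2/fullsync",
                                       dry_run=True)))
    # -> rsync -lrt --delete --dry-run /home/jimc/Documents /media/HD2/fullsync
```

Passing the arguments as a list, rather than as one shell string, avoids quoting problems with paths that contain spaces. Since --delete removes files from the target, a dry run is a cheap way to check the source and target paths before the first real sync.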
I use rsync on Linux, but there are various implementations available for Windows. You can find a list in the rsync Wikipedia entry here. If you want to use rsync on Windows, I’d personally recommend installing Cygwin, which gives you rsync along with a lot of other really useful utilities.