Backup

When I am asked about my hobbies, photography is usually among my top three answers. However, besides being a rather expensive pastime, it produces an ever-increasing number of digital images (currently around 280 GB). Storing these images securely matters, because nobody wants to lose such digital treasures, especially since they are often tied to precious memories.

Although my setup may not be safe against catastrophes such as fires or nuclear blasts, I claim that I am safe against most remaining data loss scenarios. After taking pictures with my SLT A58, the first step is to copy the data to my workstation. Once this is finished, all files are renamed based on the recording timestamp via ExifTool:

exiftool '-filename<CreateDate' -d %Y%m%d_%H%M%S%%-c.%%e -r .

This is necessary because the default file names produced by my camera are neither descriptive nor unique (the image counter wraps around at 10000). After the rename, the real work begins, as I want my files to be sorted into folders that describe the occasion on which the pictures were taken. While this is a very basic approach to file management, it fits my needs fairly well because it does not depend on any additional software for metadata management. Tagging the photos, e.g. with digiKam, is still possible.
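
Because a recursive rename is hard to undo, it can be worth previewing the result first. The following is a minimal sketch, assuming the installed ExifTool version supports the TestName tag (its dry-run counterpart to FileName); the function name preview_rename is made up for illustration.

```shell
# Preview the timestamp-based rename without touching any files.
# Writing to TestName instead of FileName makes ExifTool print each
# current name next to the proposed one instead of renaming.
preview_rename() {
    local dir=${1:-.}   # directory to scan, defaults to the current one
    # %%-c appends a copy number (-1, -2, ...) if two files share a
    # timestamp; %%e keeps the original file extension.
    exiftool '-testname<CreateDate' -d '%Y%m%d_%H%M%S%%-c.%%e' -r "$dir"
}
```

Calling preview_rename without arguments checks the current directory; once the proposed names look right, the actual rename is the command shown above.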

The storage device that holds the first copy of the files is my HP Microserver running FreeNAS. The system is configured so that all data written to the ZFS pool is mirrored between the two enterprise-grade hard disks, similar to RAID 1. Snapshots of the photo dataset are taken automatically every day. Mirroring and snapshots are fine, but both live on the same machine, so another level of redundancy is still needed. That is why I sync my ZFS datasets to the workstation via zrep.

The initial setup requires some commands to be executed on the workstation (WS) and on the NAS:

zfs create nasbackup/photos # WS: create the dataset 'photos' in the pool 'nasbackup'
zrep changeconfig -f -d nasbackup/photos <NAS-IP> pool1/photos # WS: set the backup properties
zfs set zrep:savecount=2000 nasbackup/photos # WS: set the number of snapshots to keep
zrep changeconfig -f pool1/photos <WORKSTATION-IP> nasbackup/photos # NAS: set the backup properties
zfs set zrep:savecount=2000 pool1/photos # NAS: set the number of snapshots to keep
zfs snap pool1/photos@zrep_000001 # NAS: create the initial snapshot
ssh <NAS-IP> zfs send pool1/photos@zrep_000001 | pv | zfs recv -F nasbackup/photos # WS: sync the initial snapshot to the local disk
zrep sentsync pool1/photos@zrep_000001 # NAS: announce that the initial sync was successful once it's completed
zfs rollback nasbackup/photos@zrep_000001 # WS: rollback the dataset to the snapshot
zfs set readonly=on nasbackup/photos # WS: set the dataset to read-only
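
Before relying on the replica, a quick sanity check on the workstation can confirm that the dataset is read-only and that the initial snapshot has arrived. This is only a sketch: check_replica is a hypothetical helper, not part of zrep, and it uses nothing beyond zfs get and zfs list.

```shell
# Verify that a replicated dataset is read-only, then list its snapshots.
check_replica() {
    local ds=$1 ro
    # -H drops the header, -o value prints only the property value.
    ro=$(zfs get -H -o value readonly "$ds") || return 1
    if [ "$ro" != "on" ]; then
        echo "$ds is writable (readonly=$ro)" >&2
        return 1
    fi
    # Print the snapshots that exist on the local copy so far.
    zfs list -H -r -t snapshot -o name "$ds"
}
```

Run as check_replica nasbackup/photos, the output should include nasbackup/photos@zrep_000001 after the steps above.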

After these initial steps, all further backups boil down to one simple command:

zrep refresh nasbackup/photos

This creates a fresh snapshot on the NAS and syncs all new snapshots (i.e. also the ones created automatically by FreeNAS) to the local dataset. In order to get a progress indicator for zrep, I've set the ZREP_INFILTER variable to pv.
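
One way to set the filter only for the backup run, rather than globally, is sketched below. refresh_with_progress is a hypothetical wrapper name; it assumes pv may be missing on some machines and falls back to a plain refresh in that case.

```shell
# Run 'zrep refresh' with pv as progress indicator when pv is installed.
refresh_with_progress() {
    local ds=$1
    if command -v pv >/dev/null 2>&1; then
        # The assignment applies to this single zrep invocation only.
        ZREP_INFILTER=pv zrep refresh "$ds"
    else
        zrep refresh "$ds"
    fi
}
```

Setting the variable per invocation keeps the rest of the shell session unaffected; exporting it once in the shell profile works just as well if the progress display is always wanted.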

As I have multiple datasets on my systems, I wrapped the zrep command in a small Bash script called nasbackup:

#!/bin/bash
NASIP=192...
SYNCFOLDERS=(photos video mobile ...)

if ! ping -c1 "$NASIP"; then
        echo "NAS not available"
        exit 1
fi

if [[ $# -gt 0 ]]; then
        for arg; do
                if ! printf '%s\n' "${SYNCFOLDERS[@]}" | grep -qxF -- "$arg"; then # exact match, one dataset per line
                        echo "$arg is not a valid dataset!"
                        continue
                fi
                echo "===== Backing up nasbackup/$arg ====="
                time zrep refresh "nasbackup/$arg"
                echo "===== END ====="
        done
else
        for arg in "${SYNCFOLDERS[@]}"; do
                echo "===== Backing up nasbackup/$arg ====="
                time zrep refresh "nasbackup/$arg"
                echo "===== END ====="
        done
fi
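
Validating an argument against the dataset list is easy to get subtly wrong, because a substring match would accept photo for photos. A minimal sketch of an exact-match helper in plain shell follows; is_valid_dataset is a hypothetical name and not part of the script above.

```shell
# Succeed only if the first argument equals one of the remaining ones.
is_valid_dataset() {
    local wanted=$1 name
    shift
    for name in "$@"; do
        # Compare whole strings, so "photo" does not match "photos".
        [ "$name" = "$wanted" ] && return 0
    done
    return 1
}
```

Inside the script it could be called as is_valid_dataset "$arg" "${SYNCFOLDERS[@]}" in place of the grep test.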

I could easily create a cron job for the backup script, but as explained above, adding new images always involves manual work, so I run the script manually.

A list of snapshots for a specific dataset can be obtained via:

zfs list -r -t snapshot -o name,creation,used,refer nasbackup/photos

Any snapshot from this list can be mounted with the standard mount command; for example, mount -t zfs nasbackup/photos@zrep_000001 /mnt/backup mounts the initial snapshot created above at /mnt/backup. This is especially helpful for restoring specific files from a snapshot. For this task, it is sometimes handy to know what changed between two snapshots; a simple zfs diff <snapshot1> <snapshot2> gives the answer.

I always hope I won't need the backup, but every time I run the backup script I'm pleased to see how smoothly and quickly it works.
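
Restoring a single file from a mounted snapshot, as described above, could be scripted roughly like this. It is a sketch under assumptions: restore_file is a hypothetical helper, it needs the privileges required for mount, and the path in the usage example is invented.

```shell
# Copy one file out of a snapshot via a temporary mount point.
restore_file() {
    local snapshot=$1 relpath=$2 dest=$3
    local mnt status
    mnt=$(mktemp -d) || return 1
    # Snapshots are mounted read-only; clean up if mounting fails.
    mount -t zfs "$snapshot" "$mnt" || { rmdir "$mnt"; return 1; }
    cp -a -- "$mnt/$relpath" "$dest"
    status=$?
    # Always unmount and remove the temporary directory again.
    umount "$mnt" && rmdir "$mnt"
    return "$status"
}
```

For example, restore_file nasbackup/photos@zrep_000001 some/folder/picture.jpg ~/restored/ would copy that one file and leave the snapshot unmounted afterwards.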