Backups


Backups are extremely important. In Linux, with a little effort and hard drive space, one can easily come up with a fully automated backup solution to suit any needs. Here, I'd like to outline my setup. Feel free to take it and adapt it to your needs.

I'll go through what's required, how and why I do it the way I do, as well as the shortcomings of how I'm doing it.

Requirements

My main box runs on one 500G hard drive. So far, this has suited me well even with my extensive movies and music collection. I decided I wanted to have a daily backup and a monthly backup and only one copy of each, so I went out and got a 1TB hard drive, split it, and now use that for both.

All you need is space, so whether you use an internal drive like me, an external USB drive, or some off-site scp/rsync situation is up to you; you'll just have to modify my scripts below to suit your setup.

How I do it

I use two scripts daily and have a third that I hope I'll never have to use.

The first is a backup script that runs via cron daily and monthly. Here are the cron entries:

  30 3 * * * /home/patrick/.bin/backup -d # every day at 3:30 AM
  45 4 1 * * /home/patrick/.bin/backup -m # first of every month at 4:45 AM

And here is a commented version of that backup script:

#!/bin/bash

message()  { echo 'usage: backup [ -d | -m ]'; exit 1; }
# $log is only set further down, so fall back to stderr until then
errorout() { echo "error: $*" >> "${log:-/dev/stderr}"; exit 1; }

# here is a reusable function to back up any directory. it handles $HOME in a
# special way; you could choose to back up all of /home and forget that bit,
# but I only back up my own user, so I do it this way
backemup() {
  echo "Backing up directories..." >> "$log"
  for folder in "$@"; do
    echo "  $folder" >> "$log"
    if [ "$folder" = "/home/patrick" ]; then
      rsync -a --delete \
               --delete-excluded \
               --exclude-from="$excludes" \
               "$folder" "$dir/" >> "$log" 2>&1
    else
      rsync -a --delete "$folder" "$dir/" >> "$log" 2>&1
    fi
  done
}

# check that we're root
[[ $(id -u) -ne 0 ]] && errorout "you must be root" 

# we need at least one option
[[ $# -lt 1 ]] && message

# allow use as a daily or monthly backup; here we set the target based on
# whether we pass -d or -m
case $1 in
  -d) dir="/mnt/data/backup_daily"   ;;
  -m) dir="/mnt/data/backup_monthly" ;;
  *)  message                        ;;
esac

# this file contains "patterns" of files to be excluded from the backup. man 
# rsync for the format of this excludes file. I use this to avoid backing up my 
# DVD Rips and Torrent Downloads as they're large and not really all that 
# important to me
excludes="/home/patrick/.backups/backup_exclude.lst"

# easily define/change where to send all output
log="/dev/null"
#log="/dev/stdout"
#log="/home/patrick/.logs/backup.log"

# and go!
echo "start: backup initiated to $dir at $(date +%m.%d.%y-%H:%M)" >> "$log"

# check that all is well; simply touch /mnt/whatever/.lock once so that it can 
# be used to check if a drive is mounted (grepping /proc/mounts could work too)
[[ -f "$excludes"  ]] || errorout "excludes file missing"
[[ -f "$dir/.lock" ]] || errorout "drive not mounted"

# back up the directories I'm interested in; add whatever else you'd like
backemup '/home/patrick' '/etc' '/usr' '/var' '/boot'

# generate a list of all currently installed pacman packages
echo "Listing installed packages..." >> "$log"

echo "  pacman" >> "$log"
pacman -Qqe | grep -Fvx "$(pacman -Qqm)" > "$dir/paclog" || errorout "trouble creating paclisting"

# generate a list of all currently installed AUR packages
echo "  aur" >> "$log"
pacman -Qqm > "$dir/aurlog" || errorout "trouble creating aur listing"

echo "success: backup complete at $(date +%m.%d.%y-%H:%M)" >> "$log"
echo >> "$log"

# and, we're done
exit 0

So you can see it's pretty bare bones. Using rsync makes it incredibly fast after the first run, as it will only copy over the changes.
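
The script refuses to run without its excludes file, so here is a rough sketch of what one could look like. The directory names and the /tmp path are made up for illustration; they are not my actual list. One thing worth knowing: because the script hands /home/patrick to rsync without a trailing slash, patterns with a leading / are anchored at the top of the transfer and therefore start with /patrick/.

```shell
#!/bin/bash
# Hypothetical example: write a small excludes file for rsync's --exclude-from.
# The path and every directory name below are assumptions for illustration.
excludes="${excludes:-/tmp/backup_exclude.lst}"

cat > "$excludes" << 'EOF'
# lines starting with # are comments
# a leading / anchors the pattern at the top of the transfer; the script
# passes /home/patrick without a trailing slash, so paths start at /patrick/
/patrick/Movies/rips/
/patrick/Downloads/torrents/
# no leading / means the pattern can match at any depth
.cache/
*.iso
EOF
```

Keep in mind that the backup script also passes --delete-excluded, so anything matching a pattern here is removed from the backup on the next run.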

Another script which I use daily is retrieve. This one is a little more custom, as it expects files to be laid out the way the backup script above leaves them. It allows me to pull one file or directory out of my daily or monthly backup and restore it onto my live drive. This is great if you've seriously screwed up your bashrc or xorg.conf and you want to just roll back to what you had yesterday (or last month).

Here's a commented version:

#!/bin/bash

message()  { echo 'usage: retrieve [ -m ] <file>'; exit 1; }
errorout() { echo "error: $*" >&2                ; exit 1; }

# this handy function will accept any path and return an absolute one
rel2abs() {
  local file="$(basename "$1")"
  local dir="$(dirname "$1")"

  pushd "${dir:-./}" &>/dev/null || exit 1
  dir="$PWD"; popd &>/dev/null

  echo "$dir/$file"
}

# need at least one argument
[[ -z $1 ]] && message 

# define the default source for our retrievals
SRC="/mnt/data/backup_daily"

# check for -h or -m and reset the source if needed
case $1 in
  -h) message                               ;;
  -m) SRC="/mnt/data/backup_monthly"; shift ;;
esac

# first, make an absolute path
dst="$(rel2abs "$1")"

# then, strip the leading / or /home/; this is needed because I back up
# /home/patrick and not /home, so depending on how you've changed my backup
# script this may or may not be needed
case "$dst" in
  /home/*) src="$SRC/${dst#/home/}" ;;
  *)       src="$SRC/${dst#/}"      ;;
esac

# do we have a backed up file?
[[ ! -e "$src" ]] && errorout "$src: file not found"

# don't clobber the real one without first prompting
if [[ -e "$dst" ]]; then
  echo -n "$dst exists, overwrite? [Y/n] " && read -r A

  # only Enter or an answer starting with y/Y continues
  [[ "${A:-y}" =~ ^[Yy] ]] || exit 1

  # this is needed if we're replacing an existing dir
  [[ -d "$src" ]] && dst="$(dirname "$dst")"
fi

# and copy it over...
cp -av "$src" "$dst" || errorout "retrieval failed"
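
The path mapping in retrieve boils down to stripping a prefix. Here is a minimal standalone sketch of just that step; backup_path is a hypothetical helper name used for illustration and is not part of the script itself.

```shell
#!/bin/bash
# Sketch of the path mapping used by retrieve: given an absolute path on the
# live system, print where the backup script would have put it.
# backup_path is a hypothetical helper name, not part of the real script.
backup_path() {
  local root="$1" dst="$2"

  case "$dst" in
    /home/*) echo "$root/${dst#/home/}" ;; # /home/patrick/x -> patrick/x
    *)       echo "$root/${dst#/}"      ;; # /etc/x          -> etc/x
  esac
}

backup_path /mnt/data/backup_daily /home/patrick/.bashrc
backup_path /mnt/data/backup_daily /etc/X11/xorg.conf
```

The two calls print /mnt/data/backup_daily/patrick/.bashrc and /mnt/data/backup_daily/etc/X11/xorg.conf, which is exactly where the backup script leaves those files.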

The last script that I have, I haven't had to use --knocks on wood--. This restore script is intended to be used after a crash and clean reinstall to restore your system from the directories made by my backup script. Remember, if you've changed the way /home is handled you'll need to be careful to adjust this script as well.

Here's a commented version:

#!/bin/bash

# restore from daily or monthly?
dir="/mnt/data/backup_daily"
#dir="/mnt/data/backup_monthly"

# keep a log of all actions
log="$dir/restore.log"
touch "$log"

# this function uses tee to handle that
logger() {
  local now="$(date '+%Y %m %d - %H:%M')"
  echo "$now :: $*" | tee -a "$log"
}

# handle errors
errorout() {
  # logger already tees to the log, so there is no need to pipe it again
  logger "--ERROR-- $*"
  exit 1
}

# print a HUGE warning as this script will
# destroy your current install
warning() {
  cat << EOF
                   
            . : :: BIG fat warning :: : .
            . : :: big FAT warning :: : .
            . : :: big fat WARNING :: : .

  you are about to run 'restore' and its goals are as 
  follows:

  you should have been running 'backup' via cron on a
  regular basis. this companion script should have been
  saving all the folders of interest and packages lists 
  needed by 'restore'.

  in the event of catastrophic system failure, you are to
  wipe / replace the drive and reinstall a vanilla
  Arch Linux install which will subsequently be destroyed
  for all intents and purposes by this script.

  once installed, running this script will:

    replace /home/patrick /etc /var and /usr with the 
    backed up versions

    install repo packages from a list of what was
    installed prior to the failure

    install foreign packages, if existing in
    \$HOME/Packages, also based on a list made prior to
    the failure

    manipulate fstab to match customizations made prior
    to the failure

  if this is not the situation you are in, STOP NOW.

  otherwise, i hope it works...

EOF

  echo -n "  so, are you sure? [y/n] " && read A
  [ "$A" != "y" ] && exit 1

}

# big fat warning
warning

# check that we're root
[ $(id -u) -ne 0 ] && errorout "You must be root"

### The actual restore
logger "BEGINNING RESTORATION"

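# apply custom fstab entries to new fstab
# TODO handle errors here
logger "Manipulating fstab..."
cp -a /etc/fstab "$dir/fstab.clean" && logger "  New fstab saved"
cp -a "$dir/etc/fstab" "$dir/fstab.bak" && logger "  Old fstab saved"
# start from the fresh install's fstab, since $dir/etc is copied over /etc below
cp -a "$dir/fstab.clean" "$dir/etc/fstab"
echo >> "$dir/etc/fstab"
# the old copy was saved as $dir/fstab.bak, not $dir/etc/fstab.bak
grep -A 999 CUSTOM "$dir/fstab.bak" >> "$dir/etc/fstab" && logger "  Changes applied to new fstab"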

# restore some directories
logger "Restoring /home and /var..."
cp -a $dir/patrick /home/ || errorout "Could not restore directory: /home"
cp -a $dir/var / || errorout "Could not restore directory: /var"
logger "  /home and /var restored"

# install the pacman packages
logger "Installing pacman packages..."
cat $dir/paclog | xargs pacman -S --noconfirm --needed || errorout "Could not install packages"
logger "Pacman packages installed"

# Install the aur packages
logger "Installing aur packages..."
cat $dir/aurlog | while read aur; do
  # quote the pattern so the shell does not expand the * before find sees it
  PACK="$(find /home/patrick/Packages -name "$aur*")"
  if [ "$PACK" = "" ]; then
    logger "  $aur NOT installed"
  else
    pacman -U $PACK && logger "  $aur installed"
  fi
done

# restore remaining directories
logger "Restoring /etc and /usr..."
cp -a $dir/etc $dir/usr / || errorout "Could not restore directories: /etc /usr"
logger "  /etc and /usr restored"

# create any custom directories out of fstab
logger "making excluded directories..."
grep -A 50 CUSTOM /etc/fstab | grep \/ | awk '{print $2}' | while read dir; do
  if [ -d $dir ]; then
    logger "  $dir exists"
  else
    mkdir -p $dir && logger "  $dir restored" || logger "  $dir NOT restored"
  fi
done

# hope all went well
logger "RESTORE FINISHED, SUCCESS?"

exit 0

So that's it; three scripts and some hard drive space, and you never have to worry about data loss again. Well, not quite. My setup is lacking in many areas, but hey, it works for me™. Do what you wish; there are plenty of Google results for more intense backup strategies.

Why mine sucks

Not off-site, or even out-of-box.

If my apartment burns down, my backups are useless. To mitigate this, I've started taking manual copies of my monthly backup and storing them on a separate drive in a fireproof box.
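
If you have a machine somewhere else that you can ssh into, pushing the monthly backup there is one way around this. What follows is only a sketch of the idea, not something I run: offsite is a hypothetical helper, and the user, host and remote path in the example are placeholders.

```shell
#!/bin/bash
# Sketch only: push a local backup directory to a remote machine over ssh.
# offsite is a hypothetical helper; the remote in the example is a placeholder.
offsite() {
  local src="$1" remote="$2"

  # set DRY_RUN to any value to print the rsync command instead of running it
  ${DRY_RUN:+echo} rsync -az --delete -e ssh "$src/" "$remote/"
}

# e.g. offsite /mnt/data/backup_monthly user@example.com:backups/monthly
```

The trailing slashes make rsync sync the contents of the local directory into the remote one, rather than nesting a second backup_monthly directory inside it.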

Backups are not rolling

This isn't so bad for the dailies, but my monthly backup occurs every month on the first; this means if you have an issue that's more than two days old, and you happen to notice it on the 2nd, you don't have a backup old enough to fix it.
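
One common way to get rolling backups out of rsync is its --link-dest option, which hard-links unchanged files against a previous snapshot so each dated copy only costs the space of what changed. The sketch below is an assumption about how that could look, not part of my setup; snapshot, prune and the snapshot root are hypothetical names.

```shell
#!/bin/bash
# Sketch only: dated snapshots with hard links. The function names and the
# snapshot root (an absolute path such as /mnt/data/snapshots) are
# hypothetical; my real backup script does not do this.

# snapshot <source> <root>: copy <source> into <root>/YYYY-MM-DD, hard-linking
# files that are unchanged since the most recent earlier snapshot
snapshot() {
  local src="$1" root="$2" today latest
  today="$(date +%Y-%m-%d)"

  # newest existing snapshot other than today's, if any (names sort by date)
  latest="$(find "$root" -mindepth 1 -maxdepth 1 -type d ! -name "$today" | sort | tail -n 1)"

  if [[ -n "$latest" ]]; then
    rsync -a --delete --link-dest="$latest" "$src" "$root/$today/"
  else
    rsync -a --delete "$src" "$root/$today/"
  fi
}

# prune <root> <keep>: remove all but the <keep> newest snapshots
prune() {
  local root="$1" keep="$2" count=0 snap

  # newest first; keep the first <keep> entries and delete the rest
  find "$root" -mindepth 1 -maxdepth 1 -type d | sort -r |
    while read -r snap; do
      count=$((count + 1))
      if (( count > keep )); then
        rm -rf "$snap"
      fi
    done
}

# e.g. snapshot /etc /mnt/data/snapshots && prune /mnt/data/snapshots 30
```

With something like this run daily and pruned to 30, a problem noticed on the 2nd could still be rolled back to any day of the previous month.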

Untested

I've never had to use restore, though I do use retrieve all the time. Anyone will tell you, an untested backup solution is no solution at all. Guess I'm just too lazy to hose my install to test it. Worse comes to worst, I know the backed up data is good; if my restore script fails I can always manually copy everything over. I pretty much did this last time I installed a new Arch box; as I tend to reuse configs, just grabbing them off of my main box's backups really sped up the process.
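
Short of hosing the install, a cheap sanity check is to compare a live directory against its backed-up copy. This does not test the restore script itself, only that the data it would copy is what I expect. verify is a hypothetical helper for illustration, not one of my three scripts.

```shell
#!/bin/bash
# Sketch: compare a live directory against its backed-up copy without touching
# either. verify is a hypothetical helper, not one of my three scripts.
verify() {
  local live="$1" copy="$2"

  # -r recurses, -q only reports which files differ; the exit status is 0
  # when the two trees match, so this works in an if or with ||
  diff -rq "$live" "$copy"
}

# e.g. verify /etc /mnt/data/backup_daily/etc || echo "backup differs from live"
```

Anything changed since the last run, or left out via the excludes file, will show up as a difference, so expect some noise when pointing this at $HOME.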

Comments

on Sun, 23 May 2010 21:37:42 -0400, mjheagle8 wrote:

i really like this, i've implemented a bunch of these ideas in my own backup script.





pbrisbin dot com 2010