Tuesday, February 18, 2014

Using pg_dump and pg_dumpall for Database Backups

The pg_dump and pg_dumpall utilities are great for moving data between databases and for making quick backups. While log shipping may have you back in business after a database failure, it doesn't protect you against a user error where pg_dump and pg_dumpall will. I actually like to set up both nightly pg_dump runs and log shipping on production databases.

There really isn't a need to keep pg_dump files around for a long time and so it is helpful to implement some kind of automatic backup rotation. Otherwise you will constantly have to monitor space on the filesystem where you keep your backups. I like to implement the following strategy for my database dumps:
  1. Perform a nightly backup and save them for a week
  2. Keep a copy of each Monday backup for a month
  3. Keep a copy of each first day of the month backup for a year
I figure that if you don't need your backup in a year's time, there is no need to keep a copy of lying around. As I use Linux for all of my production servers, I use cron to schedule all of my backups. My crontab has the following 3 entries for the postgres user:

15 1 * * * <PATH>/nightly_backup.sh >> <PATH>/backup.log 2>&1
15 2 * * 1 <PATH>/weeksave.sh >> <PATH>/backup.log 2>&1
15 3 1 * * <PATH>/monthsave.sh >> <PATH>/backup.log 2>&1

This runs the nightly backup script at 1:15 every day, so it is more of a morning backup instead of a nightly one. You can modify the hour of the backup to meet your needs. I also allow an hour for the backup to complete before I try saving a copy for the weekly backup. You may need to adjust your timing accordingly.

I then have three simple scripts: nightly_backup.sh, weeksave.sh, and monthsave.sh. Here are the contents of each:

nightly_backup.sh
#!/bin/sh
/usr/bin/pg_dumpall > <PATH>/db_backup_$(date +%A).sql

weeksave.sh
#!/bin/sh
let WEEK=($(date +%d)-1)/7+1
/bin/cp <PATH>/db_backup_$(date +%A).sql 
        <PATH>/db_backup_week$WEEK.sql
# the previous 2 lines should all be on 1 line

monthsave.sh
#!/bin/sh
/bin/cp <PATH>/db_backup_$(date +%A).sql
        <PATH>/db_backup_$(date +%B).sql
# the previous 2 lines should all be on 1 line
# Change %B to %B%Y if you want to keep monthly backups
#   forever

You will want to replace all <PATH> references to the paths used in your environment.

The key to all of the scripts is to use dates as part of the file names for each of the backup files. The nightly backup includes the day of the week so that when Monday rolls around again, it overwrites the previous Monday's backup. The weekly backup uses Monday's backup and just copies the file so that it has week1, week2, week3, week4, or week5 in the name.

Notice the comment in the monthsave.sh script that shows how to keep your monthly backups forever. It just adds the year to the month used in the file name.

No comments:

Post a Comment