Monday, February 10, 2014

Backing up Your PostgreSQL Database

PostgreSQL is one of the most reliable database management systems available. As with anything, it is not 100% foolproof though. There are a number of things that can happen to cause data loss and so it is important to back it up. PostgreSQL has three primary methods of saving copies of your data should something go wrong. They all have their strengths and so you may use more than one or even all three, depending on how secure you need your data to be. These are not the only backup options with PostgreSQL but are used a lot and show the different options available.

The pg_dump and pg_dumpall Utilities

PostgreSQL has two similar utilities to help with backups: pg_dump and pg_dumpall. The first is used to create a backup of a single database while the later is used to backup all of the databases on the server. Both utilities create files that you can move to other PostgreSQL database servers and load data into them. You have the option of creating a compressed file if space is a concern or you can create SQL files that you view and edit before running the files to load data into the new database.

Tutorial for pg_dump and pg_dumpall

Filesystem Backup

PostgreSQL stores data in regular operating system files and so you can use regular backup tools that you may already have running in your current server environment. Unfortunately the database will need to be shut down first unless you trust your filesystem to take consistent snapshots. For some environments, this isn't a problem. For others, it makes this backup method unusable.

Tutorial for Filesystem Backup

Streaming Replication

PostgreSQL has a backup method called streaming replication or log-shipping standby servers that performs the same SQL commands executed on a master database to any number of slaves. The slave databases can then be used  as read-only copies for reporting purposes. Should something go wrong with your primary database server, it is simply a matter of flipping a software switch and one of the slave database servers can take over as the primary and business can continue. Other backup options require lengthy restores that may not be possible for time-sensitive database applications.

Tutorial for Streaming Replication

Which One To Use

Streaming replication seems to be the most promising of these three backup strategies. So why would you use any of the either two? The short answer is: operator error. What happens when the database administrator does something like drop an important table, which can't be undone with a ROLLBACK? It is time to go back to that pg_dump file and fix the mistake. In order to set up streaming replication, you need to leverage a filesystem backup. Ultimately they all have their place and should be used appropriately.

Recovering your Database

Having a database backup is kind of like climbing Mt. Everest: Getting to the top is only half of the trip as you still have to climb back down. Sure you may have a backup of your data but what is the best way to restore it. There are a number of considerations that should be decided long before you need to use your backup. Understanding them and having practice with your backups will help when you really need to use them.

No comments:

Post a Comment