Taking physical backups with pgbackrest
Besides logical backup, with PostgreSQL, we can take physical backups. Physical backups are copies of the PostgreSQL cluster files. There are three ways to get physical backups that are consistent:
- Stop PostgreSQL in the server and copy all data files.
- Use built in tool pg_basebackup.
- Use third party tool that orchestrate pg_basebackup and archiving.
Avoid reinventing the wheel and use a third party tool to take PostgreSQL backups. There are many tools, both open-source and commercial, to manage backups in PostgreSQL. Today we will talk about one of the most popular solutions, pgbackrest. This open source tool is maintained by Crunchy Data. For a complete explanation of installing pgbackrest in your system, please refer to their web page. In this blog, we are going to do a brief description of how to set it up. After installing pgbackrest, you should tweak the config file to set up what is called a stanza. Here is an example of the config file:
[pgvirtuoso] pg1-path=/var/lib/postgresql/12/demo [global] repo1-path=/var/lib/pgbackrest repo1-retention-full=2 repo1-retention-diff=1 repo2-path=/demo-repo repo2-retention-full=10 repo2-s3-region=us-east-1 repo2-s3-bucket=pgvirtuoso-bucket repo2-s3-endpoint=s3.us-east-1.amazonaws.com repo2-s3-key=awsaccesskey repo2-s3-key-secret=awssecretkey repo2-type=s3 [global:archive-push] compress-level=0
In the first section, we named the “stanza” and set the location of the cluster files. In the second section, we set two repositories. With pgbackrest, we can take backups to two places. Also, archiving can be done to the two places simultaneously. In the next blog, we will discuss archiving more in-depth. The third section is for settings specific for archive-push. We can turn off compression only for archiving while still having full compression on backups. After having the configuration file set up, need to create the stanza by executing the following command:
sudo -u postgres pgbackrest --stanza=pgvirtuoso stanza-create
After creating the stanza, you need to set up archiving to take backups. Here is a sample:
archive_command = 'pgbackrest --stanza=pgvirtuoso archive-push %p' archive_mode = on listen_addresses = '*' max_wal_senders = 3 wal_level = replica #wal_level should be at least replica.
After setting archiving we are ready to start taking backups. The following command will take a backup.
sudo -u postgres pgbackrest --stanza=pgvirtuoso backup
The application pgbackrest allows you to take full, differential and incremental backups. The ability to backup and restore differential and incremental backups, reduce the need to replay WAL files, expediting the restore process. By default, if not specified, pgbackrest takes incremental backups. But, in the case where there is no previous full backup, pgbackrest will upgrade the backup type to full. You can use cronjobs to automate backups like the following example:
30 03 * * 0 pgbackrest --type=full --stanza=pgvirtuoso backup 30 03 * * 1-6 pgbackrest --type=diff --stanza=pgvirtuoso backup
One of the cool features of pgbackrest is that we can set up retention on each repository we create. So, in on command, we take backups and also clean up old backups. With pgbackrest we can set up encryption, parallel workers to expedite the backup/restore process even more, among other features. No wonder pgbackrest is perhaps the most popular backup tool for PostgreSQL in the industry.
In the last article on this disaster recovery series, we will talk about archiving.