Crunchy Operator (pgBackRest)
PGO, the Postgres Operator from Crunchy Data, uses pgBackRest to manage backups.
pgBackRest is a backup and restore solution for PostgreSQL that offers parallel backup and restore, compression, full, differential, and incremental backups, backup rotation and archive expiration, and backup integrity checks. It supports multiple repositories, which can be local, remote (over TLS or SSH), or hosted in cloud object storage such as S3, GCS, or Azure.
Backup configuration
Backup configuration is done through the spec.backups.pgbackrest parameter. See the example below.
spec:
  backups:
    pgbackrest:
      repos:
      - name: repo1 # repo
        schedules:
          full: "0 1 * * 0" # Full backup once a week on Sunday at 1 AM
          incremental: "0 1 * * 1-6" # incremental - from Monday to Saturday at 1 AM
        gcs:
          bucket: "<BUCKET_NAME>" # GCS bucket name
      configuration:
      - secret:
          name: pgo-gcs-creds # GCS credentials
      - configMap:
          name: pgbackrest-config # pgbackrest config
      global:
        repo1-path: /backup/aidboxdb # Backup path in bucket
        repo1-retention-full-type: time # Retention policy
        repo1-retention-full: "30" # Delete backups after 30 days
      manual:
        repoName: repo1
        options: # Manual backup configuration
        - '--type=full'
        - '--compress-level=6'
        - '--start-fast=y'
        - '--process-max=20'
        - '--log-level-console=info'
Then create the Secret and ConfigMap referenced in the configuration section:
---
apiVersion: v1
kind: Secret
metadata:
  name: pgo-gcs-creds
  namespace: aidboxdb-db
stringData:
  gcs.conf: |-
    [global]
    repo1-gcs-key=/etc/pgbackrest/conf.d/gcs-key.json
  gcs-key.json: |-
    <GCP SA JSON access file>
---
apiVersion: v1
kind: ConfigMap
metadata:
  name: pgbackrest-config
  namespace: aidboxdb-db
data:
  db.conf: |-
    [global]
    compress-level=6
    start-fast=y
    process-max=20
Repositories
repos: defines a pgBackRest repository, that is, where and how your backups and WAL archive are stored. You can keep backups in up to four (4) different locations.
Four repository types are supported (see the full Backup Configuration instructions):
- azure: for use with Azure Blob Storage.
- gcs: for use with Google Cloud Storage (GCS).
- s3: for use with Amazon S3 or any S3-compatible storage system such as MinIO.
- volume: for use with a Kubernetes Persistent Volume.
GCS configuration example:
1. Specify the GCS bucket and the secret with credentials
spec:
  backups:
    pgbackrest:
      repos:
      - name: repo1
        gcs:
          bucket: "<BUCKET_NAME>"
      configuration:
      - secret:
          name: pgo-gcs-creds
2. Create the secret with GCS connection credentials
apiVersion: v1
kind: Secret
metadata:
  name: pgo-gcs-creds
  namespace: aidboxdb-db
stringData:
  gcs.conf: |-
    [global]
    repo1-gcs-key=/etc/pgbackrest/conf.d/gcs-key.json
  gcs-key.json: |-
    <GCP SA JSON access file>
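As an alternative to writing the Secret manifest by hand, the same Secret can be built from files with kubectl. The sketch below is only an illustration: the temporary directory, the `create_gcs_secret` helper name, and the key file path are assumptions, and the Secret name and namespace are taken from the manifest above.

```shell
NS="${NS:-aidboxdb-db}"
workdir="$(mktemp -d)"

# pgBackRest settings file pointing at the mounted key (same content as gcs.conf above).
cat > "$workdir/gcs.conf" <<'EOF'
[global]
repo1-gcs-key=/etc/pgbackrest/conf.d/gcs-key.json
EOF

# Copy your service account key next to it (the source path is a placeholder):
#   cp /path/to/service-account.json "$workdir/gcs-key.json"

# Hypothetical helper: create the Secret from both files.
# Each --from-file entry becomes a Secret key named after the file.
create_gcs_secret() {
  kubectl create secret generic pgo-gcs-creds -n "$NS" \
    --from-file="$workdir/gcs.conf" \
    --from-file="$workdir/gcs-key.json"
}
```

Call `create_gcs_secret` once the key file is in place; keeping the key out of a YAML file avoids committing it to version control by accident.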
Schedule
In this spec, we schedule an incremental backup from Monday to Saturday and one full backup every Sunday, all at 1 AM:
spec:
  backups:
    pgbackrest:
      repos:
      - name: repo1
        schedules:
          full: "0 1 * * 0" # Full backup once a week on Sunday at 1 AM
          incremental: "0 1 * * 1-6" # Incremental backup from Monday to Saturday at 1 AM
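The schedule strings use the standard five-field cron format. As a reading aid, the following sketch splits an expression into its named fields; `describe_cron` is a hypothetical helper written for this page, not part of PGO or pgBackRest.

```shell
# Split a five-field cron expression into named fields.
describe_cron() {
  set -f            # disable globbing so "*" stays literal
  set -- $1         # word-split the expression into $1..$5
  set +f
  printf 'minute=%s hour=%s day-of-month=%s month=%s day-of-week=%s\n' \
    "$1" "$2" "$3" "$4" "$5"
}

describe_cron "0 1 * * 0"     # full backup schedule
describe_cron "0 1 * * 1-6"   # incremental backup schedule
```

The first call prints `minute=0 hour=1 day-of-month=* month=* day-of-week=0`, where day-of-week 0 is Sunday. Note that the hour is usually interpreted in the cluster's time zone, which is typically UTC unless configured otherwise.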
Backup retention
Define the backup retention policy. In this spec, backups are kept for 30 days and become eligible for deletion after that period:
spec:
  backups:
    pgbackrest:
      global:
        repo1-path: /backup/aidboxdb # Backup path in bucket
        repo1-retention-full-type: time # Retention policy
        repo1-retention-full: "30" # Delete backups after 30 days
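With time-based retention the value is a number of days, so the cutoff is simple arithmetic. The sketch below computes it; `retention_cutoff_epoch` is a hypothetical helper, and the optional argument exists only so the calculation can be checked with a fixed "now". Full backups older than the cutoff are candidates for expiration, although pgBackRest keeps an older full backup while it is still needed to cover the whole retention window.

```shell
RETENTION_DAYS=30   # matches repo1-retention-full: "30"

# Print the epoch second before which full backups count as older than the
# retention period. Accepts an optional "now" epoch for reproducible checks.
retention_cutoff_epoch() {
  now="${1:-$(date +%s)}"
  echo $(( now - RETENTION_DAYS * 86400 ))
}

retention_cutoff_epoch
```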
Create backup
Sometimes you need a one-off backup, for example before making significant changes or updating an application. To take one, first configure the spec.backups.pgbackrest.manual section, which specifies the type of backup and any additional pgBackRest options:
spec:
  backups:
    pgbackrest:
      manual:
        repoName: repo1
        options: # Manual backup configuration
        - '--type=full' # Take a full backup
        - '--compress-level=6' # Compression level (gzip is the default compression type)
        - '--start-fast=y' # Do not wait for the next scheduled checkpoint
        - '--process-max=20' # Max processes to use for compression and transfer
To start a manual backup, annotate the postgrescluster resource with the postgres-operator.crunchydata.com/pgbackrest-backup annotation:
$ kubectl annotate -n aidboxdb-db postgrescluster aidboxdb --overwrite \
postgres-operator.crunchydata.com/pgbackrest-backup="$(date)"
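The output of a plain `date` contains spaces and depends on the locale. As a minimal sketch, assuming that any new annotation value is enough to request another backup, the command can be wrapped so that it always uses a sortable UTC timestamp. The function names are hypothetical, and the namespace and cluster name are the ones used above.

```shell
NS="${NS:-aidboxdb-db}"
CLUSTER="${CLUSTER:-aidboxdb}"

# Build a unique, sortable annotation value (UTC, ISO 8601).
backup_trigger_value() {
  date -u +"%Y-%m-%dT%H:%M:%SZ"
}

# Annotate the cluster to request a manual backup.
trigger_manual_backup() {
  kubectl annotate -n "$NS" postgrescluster "$CLUSTER" --overwrite \
    "postgres-operator.crunchydata.com/pgbackrest-backup=$(backup_trigger_value)"
}

# Run against a live cluster with:
#   trigger_manual_backup
```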
Recovery
Sometimes you need to recover your database or clone your production database to a staging environment. There are two types of recovery: cloning the existing cluster to another environment, and point-in-time recovery (PITR), which restores the database to its state at a specific moment.
Clone
To create a clone of an existing PG cluster, specify the dataSource parameter for the new cluster. In the sample below we create the stage cluster as a copy of the aidboxdb cluster in the aidboxdb-db namespace.
apiVersion: postgres-operator.crunchydata.com/v1beta1
kind: PostgresCluster
metadata:
  name: stage
  namespace: stage
spec:
  dataSource:
    postgresCluster:
      clusterName: aidboxdb
      repoName: repo1
      clusterNamespace: aidboxdb-db
  image: healthsamurai/aidboxdb:15.2.0-crunchy
  postgresVersion: 15
  instances:
  - dataVolumeClaimSpec:
      accessModes:
      - "ReadWriteOnce"
      resources:
        requests:
          storage: 1Gi
  backups:
    pgbackrest:
      repos:
      - name: repo1
        volume:
          volumeClaimSpec:
            accessModes:
            - "ReadWriteOnce"
            resources:
              requests:
                storage: 1Gi
PITR
To recover to a specific point in time, add recovery options to the dataSource section of the new cluster configuration.
apiVersion: postgres-operator.crunchydata.com/v1beta1
kind: PostgresCluster
metadata:
  name: stage-pitr
  namespace: stage-pitr
spec:
  dataSource:
    postgresCluster:
      clusterName: aidboxdb
      repoName: repo1
      clusterNamespace: aidboxdb-db
      options:
      - --type=time
      - --target="2023-04-09 10:00:00-04"
  image: healthsamurai/aidboxdb:15.2.0-crunchy
  postgresVersion: 15
  instances:
  - dataVolumeClaimSpec:
      accessModes:
      - "ReadWriteOnce"
      resources:
        requests:
          storage: 1Gi
  backups:
    pgbackrest:
      repos:
      - name: repo1
        volume:
          volumeClaimSpec:
            accessModes:
            - "ReadWriteOnce"
            resources:
              requests:
                storage: 1Gi
Look at the dataSource section. This is where you specify the type of recovery and the recovery target.
spec:
  dataSource:
    postgresCluster:
      clusterName: aidboxdb
      clusterNamespace: aidboxdb-db
      repoName: repo1
      options:
      - --type=time
      - --target="2023-04-09 10:00:00-04"
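The target is a timestamp with a time zone offset, so it is easy to get wrong by a few hours. The following sketch converts an epoch second into the same format in UTC. The `pitr_target` helper is hypothetical, and it assumes a `date` implementation that supports `-d "@<epoch>"`, such as GNU date.

```shell
# Format an epoch second as a "YYYY-MM-DD HH:MM:SS+00" target time in UTC.
# Assumes GNU date (the -d "@<epoch>" form is not available everywhere).
pitr_target() {
  date -u -d "@$1" +"%Y-%m-%d %H:%M:%S+00"
}

pitr_target 1681048800
```

This prints `2023-04-09 14:00:00+00`, which is the same instant as the `2023-04-09 10:00:00-04` target used in the spec above.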
Inspect backup
You can list backups by running the pgbackrest info command directly in the database pod:
$ export NS=aidboxdb-db
$ kubectl exec -n $NS \
$(kubectl get pod -n $NS -l "postgres-operator.crunchydata.com/data=postgres" -o jsonpath='{.items[0].metadata.name}') \
-- bash -c 'pgbackrest info'
To verify existing backups, run the pgbackrest verify command:
$ export NS=aidboxdb-db
$ kubectl exec -n $NS \
$(kubectl get pod -n $NS -l "postgres-operator.crunchydata.com/data=postgres" -o jsonpath='{.items[0].metadata.name}') \
-- bash -c 'pgbackrest --stanza=db --log-level-console=info verify'
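Both commands repeat the same pod lookup. As a small convenience sketch, the lookup and the exec call can be wrapped in two functions. The names `postgres_pod` and `pgbackrest_exec` are hypothetical, while the label selector and namespace are the ones used above.

```shell
NS="${NS:-aidboxdb-db}"

# Name of the first pod carrying the data=postgres label (same selector as above).
postgres_pod() {
  kubectl get pod -n "$NS" \
    -l "postgres-operator.crunchydata.com/data=postgres" \
    -o jsonpath='{.items[0].metadata.name}'
}

# Run a pgbackrest command line inside that pod.
pgbackrest_exec() {
  kubectl exec -n "$NS" "$(postgres_pod)" -- pgbackrest "$@"
}

# Examples (run against a live cluster):
#   pgbackrest_exec info
#   pgbackrest_exec --stanza=db --log-level-console=info verify
```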