Database Backup & Disaster Recovery: A Zero-Excuse Guide for Operators

The Uncomfortable Truth About Backups

Every company says they do backups. Almost none of them test restores regularly. A backup that has never been restored is not a backup – it is a file.

Our 3-2-1 Backup Strategy

- 3 copies of data
- 2 different storage media/types
- 1 offsite (different region or provider)
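
The rule above is easy to state and easy to violate quietly, so it can help to check it mechanically against an inventory of where your copies live. The following is a minimal sketch, not a definitive implementation: `BackupCopy`, `satisfies_3_2_1`, and the example names and regions are all hypothetical, and it assumes the production data counts as one of the three copies.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BackupCopy:
    name: str      # hypothetical label for the copy
    media: str     # storage type, e.g. "local-disk" or "object-storage"
    location: str  # region or provider where the copy lives


def satisfies_3_2_1(copies: list[BackupCopy], primary_location: str) -> bool:
    """Return True if the inventory meets all three parts of the 3-2-1 rule."""
    enough_copies = len(copies) >= 3
    enough_media = len({c.media for c in copies}) >= 2
    has_offsite = any(c.location != primary_location for c in copies)
    return enough_copies and enough_media and has_offsite


# Example inventory (assumed values, for illustration only).
copies = [
    BackupCopy("production-db", "local-disk", "us-east-1"),
    BackupCopy("base-backup", "local-disk", "us-east-1"),
    BackupCopy("s3-archive", "object-storage", "eu-west-1"),
]
print(satisfies_3_2_1(copies, primary_location="us-east-1"))  # True
```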

For PostgreSQL:

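# Continuous WAL archiving to S3 (postgresql.conf); archive_command is ignored unless archive_mode is on
wal_level = replica
archive_mode = on
archive_command = 'aws s3 cp %p s3://backup-bucket/wal/%f'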

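# Daily base backup at 02:00 (crontab). pg_basebackup needs a new or empty target
# directory on every run, and cron requires % to be escaped as \%.
0 2 * * * pg_basebackup -D /backup/base/$(date +\%F) -Ft -z -Xs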
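
A WAL archive is only useful for recovery if it is continuous: one missing segment stops replay at that point. As a rough sketch of how to check for holes, the function below takes a list of archived file names (obtained however you list your bucket) and reports adjacent segments that are not consecutive. `find_gaps` and `segment_number` are hypothetical helpers, and the arithmetic assumes the default 16 MB `wal_segment_size`, which gives 256 segments per log file number.

```python
import re

SEGMENTS_PER_LOG = 256  # assumption: default 16 MB wal_segment_size
WAL_NAME = re.compile(r"^[0-9A-F]{24}$")  # skips .history, .backup, .partial files


def segment_number(name: str) -> tuple[int, int]:
    """Split a WAL file name into (timeline, linear segment number)."""
    timeline = int(name[0:8], 16)
    log = int(name[8:16], 16)
    seg = int(name[16:24], 16)
    return timeline, log * SEGMENTS_PER_LOG + seg


def find_gaps(names: list[str]) -> list[tuple[str, str]]:
    """Return pairs of adjacent archived segments with files missing between them."""
    # Fixed-width uppercase hex sorts the same way as the numbers it encodes.
    segments = sorted({n for n in names if WAL_NAME.match(n)})
    gaps = []
    for prev, cur in zip(segments, segments[1:]):
        prev_tl, prev_no = segment_number(prev)
        cur_tl, cur_no = segment_number(cur)
        # Only compare within one timeline; a timeline switch is not a gap.
        if prev_tl == cur_tl and cur_no != prev_no + 1:
            gaps.append((prev, cur))
    return gaps


archived = [
    "0000000100000000000000FE",
    "0000000100000000000000FF",
    "000000010000000100000000",
    "000000010000000100000002",  # ...0001 is missing before this one
]
print(find_gaps(archived))
# [('000000010000000100000000', '000000010000000100000002')]
```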

Point-in-Time Recovery (PITR)

PITR lets you restore to any moment between the end of your oldest retained base backup and the last archived WAL segment. This is the gold standard for databases with continuous write activity:

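# Restore to a specific point in time (postgresql.conf, PostgreSQL 12+)
# Requires an empty recovery.signal file in the restored data directory
restore_command = 'aws s3 cp s3://backup-bucket/wal/%f %p'
recovery_target_time = '2025-12-10 14:30:00'  # server time zone unless an offset is given
recovery_target_action = 'promote'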
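
Before replaying WAL you have to choose which base backup to restore, and the recovery target must be later than the moment that backup finished. The sketch below picks the most recent eligible backup from a list of completion times. `pick_base_backup` and the example timestamps are hypothetical; in practice the completion times would come from your own backup catalog or logs.

```python
from datetime import datetime, timezone
from typing import Optional


def pick_base_backup(backup_end_times: list[datetime], target: datetime) -> Optional[datetime]:
    """Return the latest base backup that finished before the target, or None."""
    candidates = [t for t in backup_end_times if t < target]
    return max(candidates) if candidates else None


utc = timezone.utc
# Assumed completion times of three nightly base backups.
backups = [
    datetime(2025, 12, 9, 2, 20, tzinfo=utc),
    datetime(2025, 12, 10, 2, 20, tzinfo=utc),
    datetime(2025, 12, 11, 2, 20, tzinfo=utc),
]
target = datetime(2025, 12, 10, 14, 30, tzinfo=utc)
print(pick_base_backup(backups, target))  # 2025-12-10 02:20:00+00:00
```

A result of `None` means the target predates every retained backup, so that moment cannot be recovered from this archive.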

Recovery Time Objective (RTO) Targets

Scenario	Target RTO	Method
Full database failure	< 30 min	Standby promotion
Accidental data deletion	< 1 hour	PITR
Region failure	< 4 hours	Cross-region replica
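
Targets like these only mean something if you compare them with measured recovery times. As a small illustration, the sketch below encodes the targets from the table above and reports by how much a measured recovery missed its target. `rto_breaches` and the measured durations are hypothetical.

```python
from datetime import timedelta

# Targets copied from the table above.
RTO_TARGETS = {
    "Full database failure": timedelta(minutes=30),
    "Accidental data deletion": timedelta(hours=1),
    "Region failure": timedelta(hours=4),
}


def rto_breaches(measured: dict[str, timedelta]) -> dict[str, timedelta]:
    """Return the overshoot for every scenario that failed to beat its target."""
    return {
        scenario: duration - RTO_TARGETS[scenario]
        for scenario, duration in measured.items()
        # The targets are strict ("<"), so hitting the limit exactly is a breach.
        if duration >= RTO_TARGETS[scenario]
    }


# Assumed timings from a recovery exercise.
measured = {
    "Full database failure": timedelta(minutes=22),
    "Accidental data deletion": timedelta(minutes=75),
}
for scenario, overshoot in rto_breaches(measured).items():
    print(f"{scenario}: over target by {overshoot}")
# Accidental data deletion: over target by 0:15:00
```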

The Monthly Restore Drill

Put a recurring monthly event on the calendar: restore the latest production backup to a staging environment and run integration tests against it. This catches backup drift, such as missing WAL segments or expired storage credentials, before it becomes a crisis.
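
A drill is easier to repeat when its steps are scripted and timed. One possible shape for that script is sketched below: it runs named steps in order, records how long each took, and stops at the first failure. `run_drill` and both step functions are hypothetical placeholders; the real steps would call your own restore and test tooling.

```python
import time
from typing import Callable

Step = tuple[str, Callable[[], None]]


def run_drill(steps: list[Step]) -> list[tuple[str, bool, float]]:
    """Run drill steps in order and stop at the first failure.

    Returns (step name, succeeded, seconds elapsed) for each step attempted.
    """
    results = []
    for name, step in steps:
        started = time.monotonic()
        try:
            step()
            ok = True
        except Exception:
            ok = False
        results.append((name, ok, time.monotonic() - started))
        if not ok:
            break
    return results


def restore_to_staging() -> None:
    """Placeholder: restore the latest backup into staging."""


def run_integration_tests() -> None:
    """Placeholder: simulate a failing test run."""
    raise RuntimeError("integration tests failed")


for name, ok, seconds in run_drill([
    ("restore", restore_to_staging),
    ("integration tests", run_integration_tests),
]):
    print(f"{name}: {'ok' if ok else 'FAILED'} in {seconds:.1f}s")
```

Keeping the per-step timings from each drill gives you measured numbers to hold against your RTO targets.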

Conclusion

Disaster recovery is not an IT concern – it is a business continuity concern. Treat it accordingly.