
Cloud SQL availability and recovery is one of those Professional Data Engineer topics where the exam loves to test whether you can pick the right protection mechanism for a specific failure scenario. The wrong choice on the exam usually looks reasonable on the surface, so the only defense is knowing exactly what each feature does and what it cannot do. I want to walk through the four mechanisms that come up most often: Point-in-Time Recovery, backups, high availability with a failover replica, and read replicas.
These are not interchangeable. PITR protects you against bad writes. Backups protect you against longer-horizon data loss. HA protects you against zonal outages. Read replicas protect the primary from being overwhelmed by reads. If you confuse any two of these on the Professional Data Engineer exam, you will lose points on otherwise straightforward questions.
Point-in-Time Recovery, or PITR, lets you restore a Cloud SQL database to any past state inside the retention window. The mechanism depends on the engine. Cloud SQL for MySQL uses binary logging. Cloud SQL for PostgreSQL uses write-ahead logging, often called WAL.
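The two mechanisms differ in when they write the log relative to the change: MySQL writes a transaction's changes to the binary log when the transaction commits, while PostgreSQL records each change in the WAL before applying it to the data files.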
Both give you a reliable, ordered trail of changes. That is what makes PITR possible. If a developer runs a destructive UPDATE without a WHERE clause at 14:03, you can restore to 14:02 and undo it without touching anything that happened earlier. This is the scenario you want to reach for when an exam question describes accidental writes, schema corruption, or a bad migration that needs to be unwound to a specific timestamp.
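To make the "ordered trail of changes" idea concrete, here is a toy sketch of log replay up to a target timestamp. It does not use any Cloud SQL API. The `LOG` entries, the `restore_to` helper, and the dates are all hypothetical, and real binary logs and WAL records carry far more than a key and a value.

```python
from datetime import datetime

# Hypothetical, simplified change log: (commit time, key, new value).
# Real binary logs and WAL are far richer; this only models ordered replay.
LOG = [
    (datetime(2024, 5, 1, 13, 50), "alice", 100),
    (datetime(2024, 5, 1, 14, 1), "bob", 250),
    (datetime(2024, 5, 1, 14, 3), "alice", 0),  # the UPDATE without a WHERE...
    (datetime(2024, 5, 1, 14, 3), "bob", 0),    # ...which hits every row
    (datetime(2024, 5, 1, 14, 10), "carol", 75),
]


def restore_to(base_backup, log, target):
    """Rebuild state by replaying log entries committed at or before target."""
    state = dict(base_backup)
    for committed_at, key, value in log:
        if committed_at > target:
            break  # the log is ordered, so nothing later can qualify
        state[key] = value
    return state


restored = restore_to({}, LOG, datetime(2024, 5, 1, 14, 2))
print(restored)  # {'alice': 100, 'bob': 250}
```

Note that the legitimate 14:10 write for `carol` is also absent from the restored state. Restoring to 14:02 gives you the database as of 14:02, so anything written after that point has to be reapplied separately.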
PITR is not a replacement for backups, because the transaction logs it replays (binary logs for MySQL, WAL for PostgreSQL) are kept alongside the live instance. If you lose the instance, you lose PITR. You need both.
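Cloud SQL gives you three backup options, and the Professional Data Engineer exam expects you to know when each one applies: automatic backups, which run on a daily schedule and are retained for 7 days by default; on-demand backups, which you trigger yourself, for example before a risky migration; and exports to Cloud Storage, which write the data out as files that outlive the instance.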
The default 7-day window trips people up. If a question describes a 30-day or 7-year retention requirement, automatic backups at their default setting cannot satisfy it. Automatic backup retention can be raised, but only up to 365 backups, so for multi-year requirements the correct answer routes through Cloud Storage, either with scheduled exports or a manual export step. Storage classes like Nearline, Coldline, and Archive then keep the cost of long-term storage down.
One pattern I see on the exam: a regulated workload with a multi-year retention requirement. Automatic backups handle the operational recovery window. Scheduled exports to Cloud Storage handle the compliance window. Both run, and both have a job.
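The split between the operational window and the compliance window can be sketched as a small decision helper. This is an illustration under stated assumptions, not a Cloud SQL or Cloud Storage API: `retention_plan` and `coldest_class_for` are hypothetical names, and picking the coldest class whose minimum storage duration fits the retention period is a simplifying heuristic that ignores how often the exports are read.

```python
# Minimum storage durations, in days, for each Cloud Storage class.
MIN_DURATION_DAYS = {"Standard": 0, "Nearline": 30, "Coldline": 90, "Archive": 365}


def retention_plan(required_days, backup_retention_days=7):
    """Return the mechanisms needed to cover a retention requirement.

    backup_retention_days defaults to the 7-day window discussed above.
    """
    plan = ["automatic backups"]
    if required_days > backup_retention_days:
        plan.append("exports to Cloud Storage")
    return plan


def coldest_class_for(retention_days):
    """Pick the coldest class whose minimum storage duration fits (heuristic)."""
    eligible = [c for c, d in MIN_DURATION_DAYS.items() if d <= retention_days]
    return max(eligible, key=MIN_DURATION_DAYS.get)


print(retention_plan(7))             # ['automatic backups']
print(retention_plan(7 * 365))       # ['automatic backups', 'exports to Cloud Storage']
print(coldest_class_for(7 * 365))    # Archive
```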
HA in Cloud SQL is built on a failover replica. You provision a primary instance in one zone, and Cloud SQL creates a standby in a different zone. Replication between them is synchronous, which means every commit on the primary is also persisted to the standby before it is acknowledged.
When the primary becomes unavailable, Cloud SQL fails over automatically. The application reconnects to the same connection name and resumes against the new primary. There is no manual promotion step.
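On the application side, "reconnects and resumes" usually means the client retries until the new primary is accepting connections. A minimal sketch of that retry loop follows, assuming a hypothetical `connect` callable that raises `ConnectionError` while failover is in progress. The attempt count and delays are illustrative, not Cloud SQL recommendations.

```python
import time


def connect_with_retry(connect, attempts=5, base_delay=0.5, sleep=time.sleep):
    """Call connect() until it succeeds, backing off exponentially between tries."""
    for attempt in range(attempts):
        try:
            return connect()
        except ConnectionError:
            if attempt == attempts - 1:
                raise  # out of attempts: surface the error to the caller
            sleep(base_delay * 2 ** attempt)


# Hypothetical stand-in for a database driver: fails twice, then succeeds,
# as if the standby finished taking over between the second and third try.
calls = {"n": 0}


def flaky_connect():
    calls["n"] += 1
    if calls["n"] < 3:
        raise ConnectionError("primary unavailable, failover in progress")
    return "connected"


delays = []
result = connect_with_retry(flaky_connect, sleep=delays.append)
print(result, delays)  # connected [0.5, 1.0]
```

Passing `sleep` as a parameter keeps the example fast to run. In real code you would leave the default `time.sleep` in place.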
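A few things to lock in for the exam: the standby sits in a different zone of the same region, replication to it is synchronous, and failover is automatic.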
If the scenario says "zonal failure" or "automatic failover with minimal data loss," you are looking at HA.
Read replicas are a different tool for a different problem. They are copies of the primary that handle read traffic so the primary is not overwhelmed. Replication is asynchronous, which means there can be a slight lag between a write landing on the primary and the same row being readable on the replica.
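The effect of asynchronous replication is easiest to see in a toy model. The `Primary` and `ReadReplica` classes below are hypothetical and exist only to show why a read that immediately follows a write can miss it on a replica. Real replication streams log records continuously instead of in explicit batches.

```python
class Primary:
    def __init__(self):
        self.data = {}
        self.pending = []  # changes committed here but not yet on the replica

    def write(self, key, value):
        self.data[key] = value
        self.pending.append((key, value))  # shipped later, not at commit time


class ReadReplica:
    def __init__(self):
        self.data = {}

    def apply_from(self, primary):
        """Catch up by applying everything the primary has queued."""
        for key, value in primary.pending:
            self.data[key] = value
        primary.pending.clear()


primary, replica = Primary(), ReadReplica()
primary.write("order:1", "paid")

stale_read = replica.data.get("order:1")  # None: the replica has not caught up
replica.apply_from(primary)
fresh_read = replica.data.get("order:1")  # "paid" once the change is applied
print(stale_read, fresh_read)  # None paid
```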
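Two limitations matter on the Professional Data Engineer exam: a read replica is not promoted automatically if the primary fails, so promotion is a manual step, and because replication is asynchronous, a replica can serve slightly stale data.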
The exam mental model: read replicas scale read throughput and increase availability for read paths. They do not give you automatic failover. They do not give you synchronous protection. They do not replace backups or PITR.
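When you see a Cloud SQL recovery or availability scenario on the Professional Data Engineer exam, map the requirement to the mechanism. Accidental writes or a bad migration that must be unwound to a timestamp: PITR. Operational recovery inside the short retention window: automatic backups. Multi-year or compliance retention: exports to Cloud Storage. A zonal failure with automatic failover and minimal data loss: HA. Read traffic overwhelming the primary: read replicas.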
If you can answer those five prompts in one sentence each, the Cloud SQL availability questions on the exam stop being tricky.
My Professional Data Engineer course covers Cloud SQL availability and recovery in the depth the exam expects, including the engine-specific PITR mechanics, the backup retention tradeoffs, and the HA versus read replica distinctions that show up in scenario questions.