Cloud SQL Availability and Recovery for the PDE Exam

May 10, 2026

Cloud SQL availability and recovery is one of those Professional Data Engineer topics where the exam loves to test whether you can pick the right protection mechanism for a specific failure scenario. The wrong choice on the exam usually looks reasonable on the surface, so the only defense is knowing exactly what each feature does and what it cannot do. I want to walk through the four mechanisms that come up most often: Point-in-Time Recovery, backups, high availability with a failover replica, and read replicas.

These are not interchangeable. PITR protects you against bad writes. Backups protect you against longer-horizon data loss. HA protects you against zonal outages. Read replicas protect the primary from being overwhelmed by reads. If you confuse any two of these on the Professional Data Engineer exam, you will lose points on otherwise straightforward questions.

Point-in-Time Recovery

Point-in-Time Recovery, or PITR, lets you restore a Cloud SQL database to any past state inside the retention window. The mechanism depends on the engine. Cloud SQL for MySQL uses binary logging. Cloud SQL for PostgreSQL uses write-ahead logging, often called WAL.

The two mechanisms differ in when they write the log relative to the change:

Binary logging (MySQL): records every change after the transaction is committed. The log captures each modification in binary format so you can replay forward to any point in time.
Write-ahead logging (PostgreSQL): records every change before it is applied to the database. Nothing hits the data files unless it can be written to the log first, which protects integrity.

Both give you a reliable, ordered trail of changes. That is what makes PITR possible. If a developer runs a destructive UPDATE without a WHERE clause at 14:03, you can restore to 14:02 and undo it without touching anything that happened earlier. This is the scenario you want to reach for when an exam question describes accidental writes, schema corruption, or a bad migration that needs to be unwound to a specific timestamp.
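To make the replay idea concrete, here is a minimal sketch of restoring to a timestamp from an ordered change log. It is a toy model, not how Cloud SQL stores or replays binary logs or WAL: the `LOG` format, the `restore_to` helper, and the sample rows are all assumptions made up for illustration.

```python
from datetime import datetime

# Hypothetical change log: (commit time, key, new value). Not a real log format.
LOG = [
    (datetime(2026, 5, 10, 13, 55), "alice", 100),
    (datetime(2026, 5, 10, 14, 1), "bob", 250),
    # The destructive UPDATE without a WHERE clause lands at 14:03.
    (datetime(2026, 5, 10, 14, 3), "alice", 0),
    (datetime(2026, 5, 10, 14, 3), "bob", 0),
]


def restore_to(base_snapshot, log, target):
    """Replay ordered log entries onto a copy of the snapshot, stopping at target."""
    state = dict(base_snapshot)
    for committed_at, key, value in log:
        if committed_at > target:
            break  # everything committed after the target time is left out
        state[key] = value
    return state


base = {"alice": 90, "bob": 200}  # state captured by the last backup
restored = restore_to(base, LOG, datetime(2026, 5, 10, 14, 2))
print(restored)  # {'alice': 100, 'bob': 250}
```

Restoring to 14:02 keeps the two legitimate changes and drops the 14:03 damage, which is the behavior the exam scenario is describing.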

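PITR is not a replacement for backups. The transaction logs (binary logs on MySQL, WAL on PostgreSQL) are tied to the live instance, so if you lose the instance, you lose PITR. PITR also depends on backups directly: it requires automated backups to be enabled, because a recovery starts from a backup and replays the logs forward. You need both.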

Backups in Cloud SQL

Cloud SQL gives you three backup options, and the Professional Data Engineer exam expects you to know when each one applies.

Automatic backups: daily backups inside a 4-hour window you configure. By default Cloud SQL retains the 7 most recent, which works out to about 7 days of coverage. You can raise that setting, up to 365 backups, but anything longer means exporting the data to Cloud Storage.
Manual backups: on-demand snapshots you create at any time. You decide how long to keep them.
Scheduled exports: exports written to Cloud Storage for long-term retention or compliance needs.

The default 7-day window trips people up. If a question describes a 30-day retention requirement, the default configuration does not satisfy it; the retention setting has to be raised. If it describes a 7-year requirement, automatic backups alone cannot satisfy it at any setting. The correct answer routes through Cloud Storage, either with scheduled exports or a manual export step. Storage classes like Nearline, Coldline, and Archive then handle the long-tail cost story.

One pattern I see on the exam: a regulated workload with a multi-year retention requirement. Automatic backups handle the operational recovery window. Scheduled exports to Cloud Storage handle the compliance window. Both run, and both have a job.
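The two-window pattern can be sketched as a small decision helper. This is an illustration only: `plan_retention` is a hypothetical function, and `backup_retention_days` is a parameter because the window on a real instance is a setting, with 7 used here as the default figure from this article.

```python
def plan_retention(required_days, backup_retention_days=7):
    """Return which mechanisms are needed to cover a retention requirement.

    backup_retention_days is an assumption about how the instance is configured.
    """
    if required_days <= backup_retention_days:
        return ["automatic backups"]
    # Automatic backups still serve the operational window; exports cover the rest.
    return ["automatic backups", "exports to Cloud Storage"]


print(plan_retention(3))        # ['automatic backups']
print(plan_retention(7 * 365))  # ['automatic backups', 'exports to Cloud Storage']
```

The regulated workload from the paragraph above is the second call: both mechanisms run, and each covers its own window.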

High availability and failover replicas

HA in Cloud SQL is built on a failover replica. You provision a primary instance in one zone, and Cloud SQL creates a standby in a different zone. Replication between them is synchronous, which means every commit on the primary is also persisted to the standby before it is acknowledged.

When the primary becomes unavailable, Cloud SQL fails over automatically. The application reconnects to the same connection name and resumes against the new primary. There is no manual promotion step.

A few things to lock in for the exam:

The standby lives in a different zone within the same region. HA does not span regions. For cross-region disaster recovery you add a cross-region read replica and promote it if the whole region is lost.
Synchronous replication is the reason HA gives you a near-zero data loss profile. That is the whole point. Asynchronous mechanisms cannot make the same promise.
The standby does not serve reads. It exists to take over. If a question asks how to offload read traffic, HA is the wrong answer.

If the scenario says "zonal failure" or "automatic failover with minimal data loss," you are looking at HA.
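The difference between synchronous and asynchronous protection can be shown with a toy model of what survives when the primary disappears. This is a sketch under stated assumptions, not Cloud SQL's implementation: `surviving_writes` is a hypothetical helper, and `replica_lag` is a made-up number of writes the asynchronous copy trails by.

```python
def surviving_writes(writes, mode, replica_lag=2):
    """Return the acknowledged writes still present after the primary is lost.

    mode="sync":  each write reaches the standby before it is acknowledged.
    mode="async": the copy trails the primary by `replica_lag` writes.
    """
    if mode == "sync":
        return list(writes)
    if mode == "async":
        return list(writes[: max(0, len(writes) - replica_lag)])
    raise ValueError(f"unknown mode: {mode}")


acked = ["order-1", "order-2", "order-3"]
print(surviving_writes(acked, "sync"))   # ['order-1', 'order-2', 'order-3']
print(surviving_writes(acked, "async"))  # ['order-1']
```

In the synchronous case every acknowledged write is already on the standby, which is why HA is the answer whenever a question pairs automatic failover with minimal data loss.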

Read replicas

Read replicas are a different tool for a different problem. They are copies of the primary that handle read traffic so the primary is not overwhelmed. Replication is asynchronous, which means there can be a slight lag between a write landing on the primary and the same row being readable on the replica.

Two properties matter on the Professional Data Engineer exam:

Cloud SQL read replicas can live in the same region as the primary or in a different one. A cross-region replica brings reads closer to distant users and doubles as a disaster recovery target, but promoting it is a deliberate step, not the automatic failover that HA provides.
Replication is asynchronous, so reads can be slightly stale. Workloads that demand strong consistency on every read need to go to the primary.

The exam mental model: read replicas scale read throughput and increase availability for read paths. They do not give you automatic failover. They do not give you synchronous protection. They do not replace backups or PITR.
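A stale read is easier to remember once you have seen one. The sketch below is a toy model of asynchronous replication: the `Primary` and `ReadReplica` classes and the `apply_from` method are hypothetical names, and real replication runs continuously rather than on demand.

```python
from collections import deque


class Primary:
    def __init__(self):
        self.rows = {}
        self.pending = deque()  # changes not yet shipped to the replica

    def write(self, key, value):
        self.rows[key] = value
        self.pending.append((key, value))  # acknowledged without waiting


class ReadReplica:
    def __init__(self):
        self.rows = {}

    def apply_from(self, primary):
        # Drain whatever the primary has queued so far.
        while primary.pending:
            key, value = primary.pending.popleft()
            self.rows[key] = value


primary, replica = Primary(), ReadReplica()
primary.write("balance", 100)
replica.apply_from(primary)
primary.write("balance", 40)          # not replicated yet

stale = replica.rows["balance"]       # 100: the replica is behind
fresh = primary.rows["balance"]       # 40: the primary is always current
replica.apply_from(primary)
caught_up = replica.rows["balance"]   # 40: the lag has closed
```

The window between the second write and the second `apply_from` call is the replication lag. A read that must see the latest value has to go to the primary during that window.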

Picking the right tool

When you see a Cloud SQL recovery or availability scenario on the Professional Data Engineer exam, map the requirement to the mechanism:

Undo a bad write at a specific timestamp: PITR.
Recover from data loss inside the backup retention window (7 days by default): automatic backups.
Long-term retention or compliance: export data to Cloud Storage.
Survive a zonal outage with automatic failover: HA with a failover replica.
Scale read traffic without overwhelming the primary: read replicas, accepting eventual consistency.

If you can answer those five prompts in one sentence each, the Cloud SQL availability questions on the exam stop being tricky.
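As a study aid, the mapping can be written down as a keyword lookup. This is only a sketch: the `RULES` table and `pick_mechanism` function are hypothetical, the keywords are my own guesses at typical scenario wording, and real exam questions need to be read more carefully than a substring match allows.

```python
# Each rule pairs trigger phrases with the mechanism from the list above.
# Order matters: more specific signals are checked before generic ones.
RULES = [
    (("timestamp", "bad write", "accidental"), "PITR"),
    (("compliance", "long-term", "multi-year"), "export to Cloud Storage"),
    (("zonal", "failover"), "HA with a failover replica"),
    (("read traffic", "offload reads", "read scaling"), "read replicas"),
    (("data loss", "restore"), "automatic backups"),
]


def pick_mechanism(scenario):
    """Return the first mechanism whose keywords appear in the scenario text."""
    text = scenario.lower()
    for keywords, mechanism in RULES:
        if any(keyword in text for keyword in keywords):
            return mechanism
    return "unclear: reread the scenario"


print(pick_mechanism("Undo a bad write at a specific timestamp"))        # PITR
print(pick_mechanism("Survive a zonal outage with automatic failover"))  # HA with a failover replica
```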

My Professional Data Engineer course covers Cloud SQL availability and recovery in the depth the exam expects, including the engine-specific PITR mechanics, the backup retention tradeoffs, and the HA versus read replica distinctions that show up in scenario questions.
