
Cloud SQL high availability, or HA, is a configuration that provides automatic failover when a zone fails. It is worth being precise about what that covers, because the Professional Cloud Database Engineer exam tends to test the edges of it. HA does not protect against a full regional outage, it does not add read capacity, and it roughly doubles the cost of an instance by running a second one alongside the first. Several of the wrong answers in a typical scenario are services that overlap with HA in some way but solve a different problem, so the questions usually come down to telling them apart.
When you enable HA on a Cloud SQL instance, Google Cloud provisions a standby instance in a second zone within the same region and replicates your data to it synchronously. If the active instance's zone has an outage, or the instance itself fails, Cloud SQL shifts traffic over to the standby automatically, and the application continues without manual intervention. Because the standby sits in another zone of the same region, HA is a zonal failover mechanism. It protects against a single zone going offline, not against the loss of an entire region.
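As a rough illustration of where this setting lives, the sketch below builds an instance body shaped like the Cloud SQL Admin API resource, in which HA corresponds to `settings.availabilityType` set to `REGIONAL` and a single-zone instance uses `ZONAL`. The helper function, instance name, region, and tier are hypothetical placeholders, only a few fields are shown, and the body is constructed here rather than sent anywhere.

```python
def build_instance_body(name: str, region: str, tier: str, high_availability: bool) -> dict:
    """Return a partial instance body shaped like the Cloud SQL Admin API resource.

    The name, region, and tier passed in are placeholder values for illustration.
    """
    return {
        "name": name,
        "region": region,
        "settings": {
            "tier": tier,
            # REGIONAL = HA with a standby in a second zone of the same region.
            # ZONAL = a single instance with no standby.
            "availabilityType": "REGIONAL" if high_availability else "ZONAL",
        },
    }


body = build_instance_body("orders-db", "us-central1", "db-custom-2-7680", high_availability=True)
print(body["settings"]["availabilityType"])  # REGIONAL
```

Note that the value is named `REGIONAL` even though the protection it buys is zonal: the standby is placed elsewhere in the same region, not in a different one.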
HA can be enabled on read replicas as well as on the primary instance, and it is configured on each one independently. If a read replica is serving a reporting dashboard that needs to stay available through a zonal outage, enabling HA on that replica is enough, and the primary does not need to change. That is the setup a scenario is describing when it asks for a low-cost way to keep a reporting replica available during a zonal failure.
Because HA runs a second instance that carries no traffic under normal conditions, it roughly doubles the cost of the deployment. That puts the recovery time objective at the center of the decision. When a workload can tolerate a longer recovery window, automated backups on their own are often sufficient, and the standby becomes an expense you can reasonably avoid. An internal reporting database that is refreshed once a week and can be restored within 24 hours is a common example, since there is little reason to run a redundant instance for it. When a scenario emphasizes keeping costs down and gives a recovery window measured in hours rather than minutes, we would generally lean toward backups rather than HA. If the data also carries a residency requirement, setting a custom backup location in the required region, rather than the default multi-region location, keeps the backups within that region.
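The trade-off above can be sketched as a small decision helper. This is a simplification under stated assumptions: the `Workload` fields, the restore-time estimate, and the dollar figures are all hypothetical, HA is modeled as exactly double the instance cost, and backup storage cost is ignored.

```python
from dataclasses import dataclass


@dataclass
class Workload:
    name: str
    rto_minutes: float            # longest outage the business will tolerate
    restore_minutes: float        # assumed time to restore from a backup
    monthly_instance_cost: float  # hypothetical cost of one instance


def recommend(workload: Workload) -> tuple[str, float]:
    """Pick backups alone when a restore fits inside the RTO; otherwise pick HA."""
    if workload.restore_minutes <= workload.rto_minutes:
        return "backups only", workload.monthly_instance_cost
    # The standby is a second instance, so cost is modeled as roughly double.
    return "HA", 2 * workload.monthly_instance_cost


reporting = Workload("weekly-reporting", rto_minutes=24 * 60, restore_minutes=120,
                     monthly_instance_cost=300.0)
checkout = Workload("checkout", rto_minutes=5, restore_minutes=120,
                    monthly_instance_cost=300.0)

print(recommend(reporting))  # ('backups only', 300.0)
print(recommend(checkout))   # ('HA', 600.0)
```

The reporting database clears its 24-hour window with a two-hour restore, so the standby buys nothing. The checkout database cannot wait for a restore, so the doubled cost is the price of meeting its recovery target.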
A frequent point of confusion is treating HA as a form of disaster recovery. The two address different failures. HA covers zonal failures, and it does nothing for a full regional outage, because the standby lives in the same region as the primary. If that region goes down, both copies go down with it.
Regional disaster recovery is a separate mechanism. You create a cross-region read replica, and if the primary region fails, you promote that replica to a standalone primary. Promotion breaks the replication link and converts the read-only copy into an independent database that can accept writes. Unlike an HA failover, promotion is a manual or scripted step. Cross-region replication is also asynchronous, so it is standard practice to check the replication lag beforehand to understand how much recent data might be lost.
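A scripted promotion might include a pre-check along the lines of the sketch below. It assumes the replication lag has already been read from monitoring and is passed in as a number. The function name and the `rpo_seconds` tolerance (a recovery point objective, a term this article does not otherwise use) are hypothetical. The check only reports how much data is at risk and leaves the decision to promote with the operator.

```python
def assess_promotion(lag_seconds: float, rpo_seconds: float) -> dict:
    """Estimate the data at risk if an asynchronous replica is promoted now.

    With asynchronous replication, writes from roughly the last `lag_seconds`
    may not have reached the replica and would be lost on promotion.
    """
    if lag_seconds < 0 or rpo_seconds < 0:
        raise ValueError("lag and tolerance must be non-negative")
    return {
        "estimated_loss_seconds": lag_seconds,
        "within_rpo": lag_seconds <= rpo_seconds,
    }


print(assess_promotion(lag_seconds=12.0, rpo_seconds=60.0))
print(assess_promotion(lag_seconds=300.0, rpo_seconds=60.0))
```

In a real regional outage the lag cannot improve, because the primary is gone. The value of the check is knowing the size of the gap before the replica starts accepting writes.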
In practice the two operate as layers. HA handles zonal failures automatically, and a cross-region read replica handles the regional case through promotion. This is also why a question about validating a regional failover procedure points to promoting the cross-region replica. An HA failover only moves the workload to another zone within the same region, so it does not exercise the regional recovery path.
HA is also sometimes taken for a way to scale read performance, which it is not. The standby does not serve read traffic, so it provides no relief when the primary is under heavy read load. Read scaling is handled separately. You add read replicas and direct read queries to them, or you place a read pool in front of several replicas so that the application connects to a single endpoint while Cloud SQL distributes the requests across them. If the constraint is write throughput rather than reads, replicas do not help either, because writes have to occur on the primary to preserve consistency. In that case you scale the primary vertically with additional vCPUs and memory.
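The split between reads and writes can be pictured with a toy router. This is only an illustration of the idea: the endpoint names are hypothetical, and round-robin is an assumption chosen for simplicity, not a description of how a Cloud SQL read pool actually balances requests.

```python
from itertools import cycle


class ReadWriteRouter:
    """Send writes to the primary and spread reads across replicas in turn."""

    def __init__(self, primary: str, replicas: list[str]):
        self.primary = primary
        # With no replicas configured, reads fall back to the primary.
        self._readers = cycle(replicas or [primary])

    def route(self, is_write: bool) -> str:
        # Writes always go to the primary; reads rotate through the replicas.
        return self.primary if is_write else next(self._readers)


router = ReadWriteRouter("primary", ["replica-1", "replica-2"])
targets = [router.route(is_write=False) for _ in range(3)] + [router.route(is_write=True)]
print(targets)  # ['replica-1', 'replica-2', 'replica-1', 'primary']
```

The HA standby never appears in the list of targets, which is the point: it takes no traffic until a failover, so it cannot absorb read or write load.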
Much of what this topic tests comes down to keeping those three roles distinct. HA is for surviving a zone failure, read replicas and read pools are for scaling reads, and vertical scaling is for additional write throughput. Working out which of the three a given scenario is describing is usually enough to narrow the answer choices.
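One way to practice that sorting step is to write it down as a lookup. The sketch below adds the regional case from the earlier discussion to the three roles. The requirement labels are hypothetical phrasing chosen for the example, not exam terminology.

```python
# Hypothetical requirement labels mapped to the mechanism that addresses each.
MECHANISM_BY_NEED = {
    "survive zone failure": "enable HA (standby in a second zone)",
    "survive region failure": "promote a cross-region read replica",
    "scale reads": "add read replicas or a read pool",
    "scale writes": "scale the primary vertically",
}


def mechanism_for(need: str) -> str:
    """Return the mechanism for a requirement, or flag it as unclassified."""
    return MECHANISM_BY_NEED.get(need.strip().lower(), "unclassified: reread the scenario")


print(mechanism_for("Scale reads"))             # add read replicas or a read pool
print(mechanism_for("survive region failure"))  # promote a cross-region read replica
```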
HA places a synchronous standby in a second zone of the same region and fails over automatically. It roughly doubles cost, so leaving it off is reasonable when a slower, backup-based recovery is acceptable. It addresses zonal failures rather than regional ones, and it is not a tool for scaling reads. Regional recovery comes from promoting a cross-region read replica, read scaling from replicas or a read pool, and write scaling from a larger primary. On the Professional Cloud Database Engineer exam, most Cloud SQL availability questions come down to placing the scenario into one of those categories.
Our Professional Cloud Database Engineer course covers Cloud SQL high availability alongside read replicas, cross-region disaster recovery, and the rest of the Cloud SQL surface area the exam tests, with practice questions that drill these distinctions.