Cloud Spanner for the PDE Exam: When to Choose It

May 10, 2026

Cloud Spanner is one of those services that the Professional Data Engineer exam loves to test because it sits in a very specific niche that no other Google Cloud database fills. If you can identify that niche in a question, you can usually rule out three of the four answer choices before reading the rest of the scenario. In this post I want to walk through how I frame Spanner for the Professional Data Engineer exam, what makes it unusual, and the exam tells that should push you toward it versus Cloud SQL or BigQuery.

What Cloud Spanner actually is

Spanner is a fully managed relational database that was built inside Google to solve a problem much of the industry had treated as unsolvable. The CAP theorem says a distributed data system can guarantee at most two of consistency, availability, and partition tolerance. Most relational databases gave up horizontal scale to keep consistency. Most NoSQL stores gave up consistency to get scale and availability. Spanner comes as close as anything to refusing that trade: strictly speaking it chooses consistency when a partition occurs, but Google's private network makes partitions rare enough that it still delivers very high availability in practice.

It is a relational database designed for three things at once:

Global availability, so data is reachable from anywhere your application runs
Global scalability, so the database grows horizontally as load increases without re-sharding
Global consistency, so every read sees a single coherent view of the data across regions

That last point is the one that matters for the Professional Data Engineer exam. Spanner offers external consistency, the strictest form of strong consistency, at planet scale. It does this with TrueTime, a Google-built API that exposes time as an interval with bounded uncertainty. By keeping that uncertainty small with atomic clocks and GPS receivers in every data center, Spanner can assign globally meaningful timestamps and order transactions across regions without routing them all through a single central coordinator.
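To make the interval idea concrete, here is a minimal Python sketch of the commit-wait rule described in the published Spanner design: a transaction takes its commit timestamp from the upper bound of the current uncertainty interval, then waits until that timestamp is certainly in the past before making the write visible. The `TTInterval` class, the `tt_now` and `commit` functions, and the 7 ms uncertainty bound are all hypothetical stand-ins of mine. The real TrueTime API is internal to Google and is not something you call from application code.

```python
import time
from dataclasses import dataclass

# Assumed clock uncertainty bound in seconds (illustrative value only).
EPSILON = 0.007


@dataclass(frozen=True)
class TTInterval:
    earliest: float  # true time is guaranteed to be >= earliest
    latest: float    # true time is guaranteed to be <= latest


def tt_now(clock=time.monotonic) -> TTInterval:
    """Return the current time as an interval rather than a single instant."""
    t = clock()
    return TTInterval(earliest=t - EPSILON, latest=t + EPSILON)


def commit(clock=time.monotonic, sleep=time.sleep) -> float:
    """Pick a commit timestamp, then wait out the uncertainty before returning."""
    commit_ts = tt_now(clock).latest
    # Commit wait: block until the whole interval has moved past commit_ts.
    while tt_now(clock).earliest <= commit_ts:
        sleep(0.001)
    return commit_ts


first = commit()
second = commit()
# A transaction that starts after another one finishes gets a later timestamp.
print(first < second)
```

The wait lasts about twice the uncertainty bound, which is why keeping that bound small matters so much: the tighter the clocks, the cheaper the ordering guarantee.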

Why TrueTime matters on the exam

You will not need to implement TrueTime, but you will need to recognize what it buys you. When a question mentions any of the following, Spanner is almost always the right answer:

Strongly consistent reads across multiple regions
Externally consistent transactions in a globally distributed application
Horizontal scaling of a relational workload past what a single instance can handle
Five-nines (99.999%) availability for a transactional system

That last bullet is worth memorizing. The multi-region Spanner SLA is 99.999% (regional configurations are 99.99%). Cloud SQL tops out below that: 99.95% for high-availability configurations on the Enterprise edition and 99.99% on Enterprise Plus. If a scenario calls out 99.999% specifically, the exam is signaling Spanner.
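The gap between those percentages is easier to remember as downtime. The short Python calculation below converts an availability percentage into allowed downtime per year, assuming a 365-day year. The function name is mine, and 99.99% is included only for comparison. Real SLAs are measured per billing month with their own exclusions, so treat this as back-of-the-envelope arithmetic.

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600, assuming a non-leap year


def allowed_downtime_minutes(availability_pct: float) -> float:
    """Minutes per year a service may be down and still meet the given percentage."""
    return MINUTES_PER_YEAR * (1 - availability_pct / 100)


for pct in (99.95, 99.99, 99.999):
    print(f"{pct}% -> {allowed_downtime_minutes(pct):.1f} minutes per year")
```

Under these assumptions, 99.95% allows roughly 263 minutes of downtime a year, while 99.999% allows a little over 5 minutes.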

Spanner versus Cloud SQL

Both are relational. Both speak SQL. Both support transactions. So how do you tell them apart in an exam scenario?

Cloud SQL is a managed version of MySQL, PostgreSQL, or SQL Server. It is vertically scaled. You pick a machine size, and that machine has limits. When you outgrow a single primary, your options are read replicas for read scale or sharding at the application layer, which is painful. Cloud SQL is regional, with cross-region replicas as a disaster recovery story rather than an active-active deployment.

Spanner is horizontally scaled by design. You add nodes and read and write capacity grows roughly linearly. You can deploy a multi-region configuration with read-write replicas in more than one region, and the database itself handles the replication and coordination. There is no single primary instance that you outgrow.

The exam tells that point to Spanner over Cloud SQL:

Workload exceeds what a single Cloud SQL instance can handle, and sharding is called out as undesirable
Application is globally distributed and needs low-latency reads in multiple regions with strong consistency
The question mentions 99.999% availability
Schema is relational and the team needs ACID transactions, but the scale story is enormous

If none of those signals are present and the workload is a normal regional transactional database, Cloud SQL is almost always the cheaper and correct answer. Spanner is expensive. The Professional Data Engineer exam will not reward you for choosing it when Cloud SQL fits.

Spanner versus BigQuery

This one is about workload type, not scale. Spanner is OLTP. BigQuery is OLAP. They are not interchangeable, and the exam tests whether you know the difference.

OLTP, online transaction processing, means many small reads and writes against current data. Order placement, inventory updates, account balance changes. Each transaction touches a few rows. Latency matters per request. Spanner is built for this.

OLAP, online analytical processing, means scanning huge volumes of historical data to answer aggregate questions. Quarterly revenue by region, model training feature extraction, ad performance dashboards. Each query touches millions or billions of rows. Throughput matters more than per-row latency. BigQuery is built for this.

If the scenario describes per-user transactional writes, point lookups by key, or anything that sounds like an application database, then between these two Spanner is the right answer. If the scenario describes analytical queries, batch loads, or warehousing patterns, BigQuery is the answer and Spanner would be wildly wrong.
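The two access patterns look different even in a toy example. The sketch below uses Python's built-in sqlite3 module purely as a stand-in (it is neither Spanner nor BigQuery) with a hypothetical `orders` table. The first part is an OLTP-style transaction that touches one row by key, and the second is an OLAP-style aggregate that has to read every row.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (order_id INTEGER PRIMARY KEY, region TEXT, amount REAL, status TEXT)"
)
# 1,000 illustrative orders, alternating between two regions.
rows = [(i, "us" if i % 2 else "eu", 10.0, "placed") for i in range(1, 1001)]
conn.executemany("INSERT INTO orders VALUES (?, ?, ?, ?)", rows)
conn.commit()

# OLTP shape: a short transaction that updates and reads a single row by key.
with conn:
    conn.execute("UPDATE orders SET status = 'shipped' WHERE order_id = ?", (42,))
status = conn.execute("SELECT status FROM orders WHERE order_id = ?", (42,)).fetchone()[0]

# OLAP shape: an aggregate over the whole table, grouped by a dimension.
revenue_by_region = dict(
    conn.execute("SELECT region, SUM(amount) FROM orders GROUP BY region").fetchall()
)

print(status)
print(revenue_by_region)
```

When an exam scenario reads like the first query, think application database. When it reads like the second, think warehouse.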

How I rank the relational options

When a Professional Data Engineer exam question presents a relational workload, I walk through three checks:

Is the workload analytical? If yes, it is BigQuery, not a relational database in the traditional sense.
Does the workload need global scale, multi-region strong consistency, or 99.999% availability? If yes, it is Spanner.
Otherwise it is Cloud SQL.

That decision tree resolves most relational questions on the exam without needing to read every answer choice in detail.
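The three checks can be written down as a small Python function. This is only a study aid under my own naming: the `Workload` fields and the `pick_database` function are hypothetical, and a real exam scenario will signal these properties in prose rather than as booleans.

```python
from dataclasses import dataclass


@dataclass
class Workload:
    analytical: bool = False                       # warehouse-style scans and aggregates
    global_scale: bool = False                     # outgrows one instance, sharding undesirable
    multi_region_strong_consistency: bool = False  # consistent reads and writes across regions
    needs_five_nines: bool = False                 # scenario calls out 99.999% availability


def pick_database(w: Workload) -> str:
    """Apply the three checks in order and return the first service that matches."""
    if w.analytical:
        return "BigQuery"
    if w.global_scale or w.multi_region_strong_consistency or w.needs_five_nines:
        return "Cloud Spanner"
    return "Cloud SQL"


print(pick_database(Workload(analytical=True)))
print(pick_database(Workload(needs_five_nines=True)))
print(pick_database(Workload()))
```

The order of the checks matters: the analytical test comes first, so a globally distributed reporting workload still lands on BigQuery rather than Spanner.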

My Professional Data Engineer course covers Spanner alongside Cloud SQL, BigQuery, Bigtable, and Firestore so you can map any storage scenario on the exam to the right service quickly. The relational decision tree above is the kind of pattern I want you to be able to run in seconds when the timer is ticking.