
If you sit down to take the Google Cloud Professional Data Engineer exam without a clean mental model for relational versus non-relational databases, you are going to flip a coin on a lot of scenario questions. The exam loves to drop a paragraph about a workload and ask you to pick a storage service. The fastest way to answer those is to first decide which family the workload belongs to, and then narrow down to a specific Google Cloud product.
I want to walk through how I think about that decision when I am studying, and how I teach it inside my Professional Data Engineer prep.
Databases on the exam split into two big buckets. Relational databases, sometimes called SQL databases, store data in tables with rows and columns under a predefined schema. Non-relational databases, often called NoSQL, form a broader category that covers anything that does not fit neatly into that tabular shape.
On the Google Cloud side, the relational services you should know are Cloud SQL, Cloud Spanner, and BigQuery. The non-relational services are Cloud Bigtable, Firestore, and Memorystore. Outside Google Cloud, the exam expects you to recognize names like MySQL, PostgreSQL, Oracle, and SQL Server on the relational side, and MongoDB, Cassandra, Redis, and DynamoDB on the non-relational side.
Relational databases are highly structured and standardized. The schema is defined up front. Every column has a type. Every row has to conform. That rigidity is a feature, not a bug. It is what lets relational engines guarantee data integrity and run rich joins across tables without surprises.
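To make that concrete, here is a minimal sketch of schema enforcement and a join, using Python's built-in `sqlite3` module as a stand-in. SQLite is not one of the Google Cloud services above, and it is looser about column types than most relational engines, so the sketch leans on `NOT NULL` and foreign-key constraints. The `customers` and `orders` tables and the `try_insert` helper are hypothetical names I made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves foreign-key checks off by default

# The schema is declared up front; every row has to conform to it.
conn.execute("""
    CREATE TABLE customers (
        id   INTEGER PRIMARY KEY,
        name TEXT NOT NULL
    )
""")
conn.execute("""
    CREATE TABLE orders (
        id          INTEGER PRIMARY KEY,
        customer_id INTEGER NOT NULL REFERENCES customers(id),
        amount_usd  REAL NOT NULL
    )
""")

conn.execute("INSERT INTO customers (id, name) VALUES (1, 'Ada')")
conn.execute("INSERT INTO orders (id, customer_id, amount_usd) VALUES (10, 1, 25.0)")
conn.execute("INSERT INTO orders (id, customer_id, amount_usd) VALUES (11, 1, 17.5)")


def try_insert(sql):
    """Return True if the row is accepted, False if a constraint rejects it."""
    try:
        conn.execute(sql)
        return True
    except sqlite3.IntegrityError:
        return False


# An order pointing at a customer that does not exist is rejected.
orphan_ok = try_insert(
    "INSERT INTO orders (id, customer_id, amount_usd) VALUES (12, 99, 5.0)"
)
# An order with no amount violates NOT NULL and is rejected too.
missing_ok = try_insert("INSERT INTO orders (id, customer_id) VALUES (13, 1)")

# Because both tables share a known structure, a join across them is predictable.
totals = conn.execute("""
    SELECT c.name, COUNT(o.id), SUM(o.amount_usd)
    FROM customers AS c
    JOIN orders AS o ON o.customer_id = c.id
    GROUP BY c.id
""").fetchall()

print(orphan_ok, missing_ok, totals)
```

Both bad inserts fail at write time, which is the integrity guarantee in action, and the join only has to deal with rows that already passed those checks.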
Three things matter on the exam:
The other classic relational weakness is scale. Traditional relational engines scale vertically: writes go through a single primary machine, and that machine starts to strain as data volume and traffic grow. Cloud Spanner is the exception worth memorizing because it gives you relational semantics with horizontal scale, which is why it shows up so often in global, strongly consistent scenarios on the exam.
Non-relational databases are built to store and manage large volumes of data that do not fit a tabular structure. Inside that category, there are four sub-models you should be able to recognize on sight.
The advantages line up the same way every time: non-relational databases scale horizontally, handle flexible or evolving data, and tend to deliver higher read and write throughput at scale. The usual trade-offs are weaker or more narrowly scoped ACID guarantees, no universal query language, and more friction for analytics workloads.
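A toy sketch can make those advantages and trade-offs easier to picture. The code below is not how Bigtable, Firestore, or any real service is implemented. It is a plain-Python illustration, under my own assumptions, of three ideas: records with different shapes living side by side, keys spread across several shards, and the cost of asking a question that is not a key lookup. `NUM_SHARDS`, `shard_for`, `put`, and `get` are hypothetical names, and real services choose their own partitioning schemes.

```python
import hashlib

NUM_SHARDS = 4  # hypothetical cluster size; adding shards is the "scale out" lever
shards = [dict() for _ in range(NUM_SHARDS)]


def shard_for(key):
    """Map a key to a shard with a stable hash (hash partitioning is an assumption here)."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % NUM_SHARDS


def put(key, document):
    """Store a document under its key; no schema is checked."""
    shards[shard_for(key)][key] = document


def get(key):
    """A key lookup touches exactly one shard."""
    return shards[shard_for(key)].get(key)


# Two records with different fields coexist without any schema change.
put("user:1", {"name": "Ada", "email": "ada@example.com"})
put("user:2", {"name": "Lin", "devices": ["phone", "tablet"], "plan": {"tier": "pro"}})

# A question that is not a key lookup has to scan every shard, and the
# filtering logic lives in application code instead of a shared query language.
pro_users = sorted(
    key
    for shard in shards
    for key, doc in shard.items()
    if doc.get("plan", {}).get("tier") == "pro"
)

print(get("user:1"), pro_users)
```

Key-based reads and writes stay cheap as shards are added, while the ad hoc filter at the end is the kind of access pattern that gets awkward, which is the analytics friction mentioned above.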
When I read a Professional Data Engineer scenario, I run through a short checklist before I even look at the answer choices.
That little flow handles a surprising share of database questions on the exam. The rest usually come down to a tighter detail, like a region requirement or a latency target, that pushes you between two services in the same family.
The trap is assuming non-relational is always the right call because it scales better. Plenty of exam questions describe a workload that could run on Bigtable or Firestore, but the right answer is still Spanner or BigQuery because the scenario also mentions joins, transactions, or SQL analytics. Read the requirements twice before you commit.
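One way to drill that habit is to write the precedence down as code. The sketch below is my own simplification for practice, not an official decision procedure and not a complete checklist. The signal names are hypothetical labels I chose for requirements you might spot in a scenario. The only rule it encodes is the one from the paragraph above: joins, transactions, or SQL analytics win over scale-driven signals.

```python
# Hypothetical requirement labels, chosen for illustration only.
RELATIONAL_SIGNALS = {"joins", "transactions", "sql_analytics"}
NON_RELATIONAL_SIGNALS = {"flexible_schema", "horizontal_scale", "high_throughput"}


def pick_family(requirements):
    """Return a database family for a set of requirement labels.

    Relational signals are checked first, so a scenario that mentions both
    scale and joins still lands in the relational family.
    """
    if requirements & RELATIONAL_SIGNALS:
        return "relational"
    if requirements & NON_RELATIONAL_SIGNALS:
        return "non-relational"
    return "undecided"


print(pick_family({"horizontal_scale", "high_throughput"}))
print(pick_family({"horizontal_scale", "joins"}))
print(pick_family({"low_latency"}))
```

The second call is the trap case: the scenario scales like a non-relational workload, but the mention of joins sends it back to the relational family, where a service such as Spanner or BigQuery becomes the candidate.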
My Professional Data Engineer course covers each of these database categories in depth, including the specific Google Cloud products in each family and the scenario patterns that map to them on the exam.