Spanner Primary Key Design for the Database Engineer Exam

May 29, 2026

In Cloud Spanner, the primary key does more than identify a row. Because Spanner is a distributed database, the primary key also determines where each row physically lives and how write traffic spreads across the system. That makes key design one of the topics the Professional Cloud Database Engineer exam tends to test, usually by asking you to recognize a key choice that will create a performance problem at scale and to pick the strategy that avoids it.

A table in Spanner can have one or more columns defined as the primary key, and that key uniquely identifies each row so every record is distinct and reachable. Primary keys are indexed by default, which is what lets Spanner find a specific row without scanning the whole table. There is also a constraint worth remembering precisely: a table declared with no primary key columns can contain only one row. So a primary key is not optional in any practical design, and the column or columns you pick for it carry real consequences.

Why the key choice affects performance

The guidance that drives most of the exam questions is that primary keys should distribute writes uniformly across the key space to prevent hotspots. A hotspot is what happens when a large share of incoming writes lands in the same range of the key space at the same time, which means those writes land on the same server. Spanner splits data by key range, so rows with keys that are close together tend to sit together, and writes to those keys go to the same place.

This is why monotonically increasing values should be avoided as primary keys, or as the first part of a composite key. With a simple sequence, a timestamp, or any other value that only goes up, every new row gets a key just above the previous one. All of those new rows fall into the same range, so all of the write traffic concentrates on one server instead of being shared. The fix is to choose a key generation strategy that scatters new rows across the key space rather than appending them to one end of it.
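
As an illustration of the pattern to avoid, here is a minimal sketch of a table keyed on an insertion timestamp. The table and column names are hypothetical and are not taken from any real schema.

```sql
-- Anti-pattern sketch (hypothetical table): the key only ever increases,
-- so each new row sorts after every existing row and inserts
-- concentrate at the end of the key space.
CREATE TABLE EventLog (
  EventTime TIMESTAMP NOT NULL,
  Message   STRING(MAX)
) PRIMARY KEY (EventTime);
```

An exam scenario that reports slow writes on a table shaped like this is usually pointing at the key, not at instance size.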

Universally unique identifiers

The first strategy is to use universally unique identifiers, or UUIDs. These are long, effectively random values, so consecutive inserts produce keys that fall in unrelated parts of the key space. That spreads data and write traffic across many storage nodes at once instead of piling onto one.

For a UUID key, the column can use either the native UUID type or a STRING with a length of 36 characters, since the textual form of a UUID is 32 hexadecimal digits plus four hyphens. To produce the values, Spanner provides built-in functions: NEW_UUID() for the native UUID type, and GENERATE_UUID() when the column is a string. This approach is the recommended default for new applications and large tables, because the random distribution gives the best write performance for high-scale workloads in a distributed environment.
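
A minimal sketch of the string variant follows, assuming a hypothetical Customers table. It uses GENERATE_UUID() as a column default so the application does not have to supply the key. The schema statement and the insert are submitted separately, since one is DDL and the other is DML.

```sql
-- Hypothetical table: the key defaults to a random 36-character UUID string.
CREATE TABLE Customers (
  CustomerId STRING(36) DEFAULT (GENERATE_UUID()),
  Name       STRING(MAX)
) PRIMARY KEY (CustomerId);

-- Omitting CustomerId lets the default generate the key for this row.
INSERT INTO Customers (Name) VALUES ("Example Customer");
```

Putting the function in the column default keeps key generation inside the database, so every writer gets the same scattering behavior without extra client code.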

Bit-reversed sequences

The second strategy is the bit-reversed sequence, and it exists mainly for cases where you need an integer key rather than a string. It takes a standard increasing number and reverses the order of its bits, so the stored value is non-sequential even though the underlying counter is sequential. The low-order bits, which change on every increment, become the high-order bits of the stored value. Two consecutive input numbers can therefore map to values that are far apart in the key space, which is what spreads the writes out.
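
To see the effect of reversing bits on a few consecutive integers, a query along these lines can help. It is a sketch that assumes the BIT_REVERSE(value, preserve_sign) function in Spanner's GoogleSQL dialect; confirm the exact signature against the current function reference before relying on it. The query shows the transformation itself, not the internals of a sequence object.

```sql
-- Sketch: reverse the bits of a few consecutive INT64 values.
-- The second argument is assumed to control whether the sign bit is
-- kept in place (TRUE) or reversed along with the rest (FALSE).
SELECT
  n,
  BIT_REVERSE(n, TRUE) AS reversed
FROM UNNEST([1, 2, 3, 4]) AS n;
```

Neighboring inputs should come back as widely separated values, which is the property that keeps consecutive inserts out of the same key range.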

For this method the column must use the INT64 numeric data type. You implement it by creating a sequence object with the bit-reversed positive option, and that object lives in the database and generates the reversed values automatically as rows are inserted. A minimal definition looks like this:

CREATE SEQUENCE my_sequence OPTIONS (
  sequence_kind = "bit_reversed_positive"
);

The recommended use case for bit-reversed sequences is migrations that require integer keys instead of strings. It lets you keep numeric INT64 keys, which a legacy schema or application may depend on, while still avoiding the hotspot that a plain sequential integer would cause.
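
The sequence defined earlier in this section only produces values when something reads from it. A minimal sketch of wiring it into a key column is shown here, assuming a hypothetical Orders table and the GET_NEXT_SEQUENCE_VALUE function as the column default.

```sql
-- Hypothetical table that draws its INT64 key from my_sequence.
CREATE TABLE Orders (
  OrderId INT64 DEFAULT (GET_NEXT_SEQUENCE_VALUE(SEQUENCE my_sequence)),
  Notes   STRING(MAX)
) PRIMARY KEY (OrderId);

-- Omitting OrderId lets the default pull the next bit-reversed value.
INSERT INTO Orders (Notes) VALUES ("first order");
```

The application still sees an ordinary integer key, which is the point of this strategy in a migration that cannot switch to strings.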

How this shows up on the exam

For the Professional Cloud Database Engineer exam, the pattern to recognize is the link between key shape and write distribution. If a scenario describes a Spanner table keyed on something that only ever increases, such as an auto-incrementing ID or an insertion timestamp, and reports a write bottleneck, the cause is a hotspot from a monotonically increasing key. The remedy is to switch to a strategy that distributes writes, and the right one depends on the column type the situation calls for. Reach for UUIDs generated by NEW_UUID() or GENERATE_UUID() when you are free to choose, and for a bit-reversed sequence on an INT64 column when the requirement is an integer key.

Our Professional Cloud Database Engineer course covers Spanner primary key design alongside interleaved tables and schema layout for distributed reads, with practice questions that drill these distinctions.
