
Bigtable scales horizontally by spreading data across nodes, but that scaling only works if your row keys cooperate. When they don't, you get hotspotting, and a single node ends up doing the work that should be split across the cluster. This is one of the topics the Professional Cloud Architect exam keeps coming back to, because the answer to "why is my Bigtable cluster slow" is almost always row key design.
Hotspotting is a performance bottleneck that happens when a disproportionate share of requests lands on a small subset of nodes in the Bigtable cluster. The cluster has plenty of capacity in aggregate, but the row key design did not distribute data evenly, so most reads and writes pile onto one or two nodes while the rest sit idle.
The ideal looks like a client sending requests that fan out across every node in the cluster. Hotspotting looks like the same client sending requests that all funnel into one node. The cluster is technically healthy, but you are not getting the distributed performance you are paying for.
The usual culprits are sequential numbers and timestamps used as row keys. Both produce keys that sort next to each other lexicographically, so Bigtable stores them next to each other, and traffic targeting the most recent data keeps landing on the same tablet on the same node.
The fix starts at design time. A few patterns reliably produce well-distributed row keys.
Reverse domain names. Instead of www.mywebsite.com, use com.mywebsite.www. Domain names that are not reversed cluster together because so many of them start with www. Reversing them puts the top-level domain first, which spreads entries across com, org, net, and so on.
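To make that concrete, here is a minimal sketch of the reversal in Python. The function name is mine, not anything from a Google library; the transformation itself is just splitting on dots and joining in reverse order.

```python
def reverse_domain(domain: str) -> str:
    """Reverse a domain's labels so the top-level domain comes first."""
    return ".".join(reversed(domain.split(".")))

print(reverse_domain("www.mywebsite.com"))  # com.mywebsite.www
```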
Timestamps at the end, or reversed. A row key like sensor8102#20231027T200000Z puts the sensor identifier first and the timestamp at the end. If you put the timestamp first, every write for the current minute lands on the same tablet. Putting it at the end (or inverting the timestamp so newer values sort earlier) breaks that pattern.
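A quick sketch of both key shapes, assuming the timestamp arrives as a Python datetime. Using sys.maxsize as the inversion constant and zero-padding the result are my choices here; any fixed upper bound and fixed width would do.

```python
import sys
from datetime import datetime, timezone

def reading_key(sensor_id: str, ts: datetime) -> str:
    """Sensor first, timestamp last: consecutive writes for different
    sensors land on different parts of the keyspace."""
    return f"{sensor_id}#{ts.strftime('%Y%m%dT%H%M%SZ')}"

def reading_key_reversed(sensor_id: str, ts: datetime) -> str:
    """Inverted-timestamp variant: newer readings sort first for a sensor."""
    inverted = sys.maxsize - int(ts.timestamp())
    return f"{sensor_id}#{inverted:020d}"  # pad so decimal strings sort correctly

now = datetime(2023, 10, 27, 20, 0, 0, tzinfo=timezone.utc)
print(reading_key("sensor8102", now))           # sensor8102#20231027T200000Z
print(reading_key_reversed("sensor8102", now))
```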
String identifiers. Stock tickers like AAPL, sensor IDs, and other non-sequential strings distribute naturally because they do not follow a monotonic order.
The patterns to avoid mirror this list. Non-reversed domain names, sequential numbers like 1001, 1002, 1003, and row keys that need frequent updates (something like user123_balance_1500, where the balance changes constantly) all create hotspots. The balance should be a column value, not part of the key.
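For the balance case, a write with the google-cloud-bigtable Python client looks roughly like this. The project, instance, table, and column family names are placeholders, not anything prescribed by the library; the point is only that the key stays stable while the balance lives in a cell.

```python
from google.cloud import bigtable

# Placeholder project / instance / table names.
client = bigtable.Client(project="my-project")
table = client.instance("my-instance").table("accounts")

# Stable row key; the balance is a cell value, not part of the key.
row = table.direct_row(b"user123")
row.set_cell("account", b"balance", b"1500")
row.commit()
```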
Sometimes you cannot redesign the underlying data, and you still need to break up sequential keys. Salting is the technique for that.
Salting prepends a random prefix to the row key. Sequential keys like user001, user002, user003 become something like 3_user001, 7_user002, 2_user003. The prefix is usually a small random number or a hash of the original key. Once the prefix is there, the keys no longer sort sequentially, so writes targeting consecutive users get distributed across different tablets and different nodes.
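Here is one way to implement the hash-based variant. Deriving the salt from the key itself (rather than picking it randomly) is an assumption on my part, but it has the advantage that reads can recompute the prefix without storing a mapping.

```python
import hashlib

NUM_SALT_BUCKETS = 8  # small, fixed number of prefixes

def salted_key(original_key: str) -> str:
    """Derive a deterministic salt from the key so reads can recompute it,
    then prepend it to break up sequential ordering."""
    digest = hashlib.md5(original_key.encode("utf-8")).digest()
    bucket = digest[0] % NUM_SALT_BUCKETS
    return f"{bucket}_{original_key}"

for key in ("user001", "user002", "user003"):
    print(salted_key(key))
```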
Salting works, but it has costs. Range queries become harder, because if you want all users from user001 to user100, those rows are now scattered across the keyspace under different prefixes. You either issue parallel queries against each prefix bucket and merge the results, or you accept that simple range scans no longer work the way they did. Salting also adds ingest overhead, because you have to compute or assign the prefix on write and account for it on read.
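The fan-out-and-merge approach looks roughly like this with the Python client, assuming the deterministic eight-bucket salt from the previous sketch. All resource names are placeholders, and note that end_key is exclusive in this scan.

```python
from concurrent.futures import ThreadPoolExecutor

from google.cloud import bigtable

NUM_SALT_BUCKETS = 8

client = bigtable.Client(project="my-project")
table = client.instance("my-instance").table("users")

def scan_bucket(bucket: int, start: str, end: str):
    """Scan one salt bucket's slice of the logical range [start, end)."""
    rows = table.read_rows(
        start_key=f"{bucket}_{start}".encode("utf-8"),
        end_key=f"{bucket}_{end}".encode("utf-8"),
    )
    return list(rows)

# One scan per prefix bucket, issued in parallel.
with ThreadPoolExecutor(max_workers=NUM_SALT_BUCKETS) as pool:
    partials = list(pool.map(lambda b: scan_bucket(b, "user001", "user100"),
                             range(NUM_SALT_BUCKETS)))

# Merge and re-sort by the original (unsalted) key so the caller sees
# a single ordered result.
merged = sorted(
    (row for partial in partials for row in partial),
    key=lambda row: row.row_key.split(b"_", 1)[1],
)
```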
The tradeoff is straightforward. If your workload is point lookups and you have a hotspotting problem, salting is a good fix. If your workload depends heavily on range scans, salting is going to fight you.
Field promotion is the other lever. The idea is to take a column that you frequently filter or query on and promote it into the row key itself.
Take a sensor reading row with key sensor123 and a column family containing timestamp, temperature, and humidity. If your queries usually look like "give me readings from sensor123 between these two times," you are filtering on a column value that is not indexed, which means scans. Promote the timestamp into the row key, and now the key looks like sensor123#20240918T120000Z. Queries can hit the row key directly, which is indexed, and you avoid full table scans.
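As a sketch, the promoted key turns the time-window query into a bounded row-key range scan. The "readings" table, "metrics" column family, and "temperature" qualifier are placeholders I chose for illustration.

```python
from datetime import datetime, timezone

from google.cloud import bigtable

def promoted_key(sensor_id: str, ts: datetime) -> bytes:
    """Timestamp promoted into the row key, so time-window queries
    become key-range scans instead of full-table filters."""
    return f"{sensor_id}#{ts.strftime('%Y%m%dT%H%M%SZ')}".encode("utf-8")

client = bigtable.Client(project="my-project")
table = client.instance("my-instance").table("readings")

start = promoted_key("sensor123", datetime(2024, 9, 18, 12, 0, tzinfo=timezone.utc))
end = promoted_key("sensor123", datetime(2024, 9, 18, 13, 0, tzinfo=timezone.utc))

# Bounded scan over one sensor's hour of readings; other sensors' rows
# fall outside the key range and are never touched.
for row in table.read_rows(start_key=start, end_key=end):
    cell = row.cells["metrics"][b"temperature"][0]
    print(row.row_key, cell.value)
```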
The downside is that field promotion changes the shape of your writes. Every reading now creates a new row instead of updating columns on an existing row. That can be exactly what you want for time-series data, but it can also balloon your row count if you promote the wrong field. Field promotion needs to be planned around your actual access patterns, not applied reflexively.
Once you have a Bigtable instance running, the Key Visualizer is how you actually see whether your row key design is working. It is a heatmap available in the Cloud Console.
The vertical axis shows row key prefixes, organized into a hierarchy so you can drill down. The horizontal axis is time. The color shows access intensity. Purple regions have few or no operations. Red and yellow regions are where heavy read or write activity is happening.
A healthy cluster looks like a generally even spread of color across the row keyspace. A hotspotting problem looks like a bright red band concentrated in one prefix range while the rest of the heatmap stays purple. That visual makes it immediately obvious which part of your keyspace is overloaded.
For the Professional Cloud Architect exam, the Key Visualizer shows up in scenario questions where the symptoms point to performance issues with no obvious cause. The right answer is usually to use the Key Visualizer to find the hotspot, then apply salting or redesign the row key to fix it. Documentation lives at cloud.google.com/bigtable/docs/keyvis-overview.
Bigtable hotspotting questions tend to follow a few patterns. A workload is described with sequential or timestamp-based row keys and the cluster is underperforming. The fix is to redesign the keys, salt them, or both. A scenario describes uneven node utilization and asks how to diagnose it. The answer is the Key Visualizer. A scenario describes slow queries that filter on a non-key column. The answer is field promotion.
The underlying logic is always the same. Bigtable is only as distributed as your row keys make it. If you give it sequential keys, you get sequential traffic patterns and you waste the cluster. If you give it well-distributed keys with the right fields promoted, you get the horizontal scaling Bigtable was designed for.
My Professional Cloud Architect course covers Bigtable hotspotting alongside the rest of the databases material.