Hotspotting in Cloud Storage for the PCA Exam

GCP Study Hub
Ben Makansi
January 12, 2026

Hotspotting in Cloud Storage is one of those topics that is both practically useful and exam-relevant. It comes up often enough on the Professional Cloud Architect exam that I want to give it its own treatment.

What hotspotting actually is

Hotspotting occurs when many reads or writes target similarly named objects in the same bucket, overloading specific storage nodes inside Cloud Storage. The pattern that triggers it is sequential prefixes on object names. If your application writes thousands of objects per second and they all start with the current timestamp, those objects land on the same backend storage range, and that range becomes a bottleneck while the rest of the system sits idle.

The naming pattern that causes the problem

Here is the kind of object naming that creates a bottleneck:

bucket-name/2024-12-19-18-00-00/doc01.pdf
bucket-name/2024-12-19-18-00-00/doc02.pdf
bucket-name/2024-12-19-18-00-01/doc03.pdf

Every object starts with the same date and a tightly clustered timestamp. Cloud Storage organizes objects by their full name, so sequential names mean sequential placement, which means concentrated load. When a workload writes a high volume of these in a short window, the backend nodes responsible for that prefix range get hammered. The result is throttling, elevated latency, and sometimes outright errors on writes.

The fix

The fix is to break the lexicographic ordering at the front of the object name by prepending a short random string. The same set of files looks like this with a random hex prefix:

bucket-name/a9f3c1-2024-12-19-18-00-00/doc01.pdf
bucket-name/c7e8b2-2024-12-19-18-00-00/doc02.pdf
bucket-name/f4d5a8-2024-12-19-18-00-01/doc03.pdf

Now the object names spread across the entire keyspace. Cloud Storage distributes the load across many backend ranges instead of one, and the bucket can absorb high-throughput writes without a hot range forming. A short prefix is enough: six hex characters give you roughly sixteen million possible prefixes, which is plenty for any realistic write rate.
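As an illustration, here is one way to generate names with a short random hex prefix in Python. The function name and exact formatting are my own, not from any Google library; the point is only that the random characters come first, ahead of the timestamp.

```python
import secrets
from datetime import datetime, timezone

def hotspot_safe_name(filename: str) -> str:
    """Build an object name whose leading characters are random,
    so concurrent writes spread across Cloud Storage's keyspace
    instead of clustering under one timestamp prefix."""
    prefix = secrets.token_hex(3)  # 6 hex chars, ~16.7M possible prefixes
    stamp = datetime.now(timezone.utc).strftime("%Y-%m-%d-%H-%M-%S")
    return f"{prefix}-{stamp}/{filename}"

print(hotspot_safe_name("doc01.pdf"))
# e.g. a9f3c1-2024-12-19-18-00-00/doc01.pdf (prefix varies per call)
```

The timestamp is still in the name, so humans can read it and lifecycle rules keyed on creation time still work; it just no longer controls placement.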

What to remember for the exam

The Professional Cloud Architect exam will sometimes describe a workload that ingests sensor data, log files, or transaction records into Cloud Storage at high rates and ask why the writes are slowing down or how to design the naming scheme to avoid throttling. The answer is almost always to avoid sequential prefixes and add randomness at the front of the object name.

A few related points are worth keeping in mind. Putting the timestamp later in the path is fine; the problem is having it first. Hashing the original key and using the first few characters of the hash as a prefix works just as well as a random string, with the benefit of being deterministic: the same logical record always maps to the same object name. Finally, this is a write-side concern. Reads do not cause the same kind of hotspot unless you are reading the same object repeatedly, in which case Cloud CDN or a caching layer is the right tool.
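A minimal sketch of the deterministic variant, assuming the record's timestamp serves as its logical key (the function name and path layout are illustrative, not a prescribed convention):

```python
import hashlib

def deterministic_prefix_name(record_key: str, filename: str) -> str:
    """Derive a stable 6-character hex prefix from the record's
    logical key. Unlike a random prefix, the same key always yields
    the same object name, so writers and readers can both compute it."""
    digest = hashlib.md5(record_key.encode("utf-8")).hexdigest()
    return f"{digest[:6]}-{record_key}/{filename}"

print(deterministic_prefix_name("2024-12-19-18-00-00", "doc01.pdf"))
```

Because a cryptographic hash scatters even adjacent timestamps across the hex space, two records written one second apart still land in different backend ranges.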

Hotspotting is a small topic on its own but it shows up in design questions where the right answer hinges on understanding how Cloud Storage distributes load across object names.

My Professional Cloud Architect course covers hotspotting in Cloud Storage alongside the rest of the storage and analytics material.
