
Cloud Run shows up on the Professional Cloud Architect exam as the default answer for a specific shape of workload, and the trick is recognizing that shape quickly. I want to walk through what Cloud Run is, the use cases it fits, how its autoscaling behavior creates the cold start problem, and what stateless actually means in this context.
Cloud Run is a fully managed, no-ops, serverless platform for running stateless containers that are invoked over HTTP. You hand Google a container image, and Google handles provisioning, scaling, and maintaining the infrastructure that runs it. There is no server for you to patch and no cluster for you to size.
That sentence is doing a lot of work, so let me break out the parts that matter for the Professional Cloud Architect exam:

- Fully managed and no-ops: Google provisions, scales, and patches the infrastructure. Your team's only deliverable is the container image.
- Serverless: capacity adjusts to traffic on its own, with no cluster to size and no servers to manage.
- Stateless containers: the workload keeps nothing in memory between requests, which is what lets the platform add and remove instances freely.
- Invoked over HTTP: the service is request-driven, running in response to traffic rather than continuously.
If a question describes a team that wants to ship a containerized service without managing infrastructure, and the workload is request-driven, Cloud Run is usually the right pick.
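To make "stateless container invoked over HTTP" concrete, here is a minimal sketch of the kind of service Cloud Run expects. The one real contract detail used here is that Cloud Run passes the listening port via the `PORT` environment variable (defaulting to 8080); everything else is illustrative.

```python
import os
from http.server import BaseHTTPRequestHandler, HTTPServer

class Handler(BaseHTTPRequestHandler):
    """Stateless handler: every request is served from scratch,
    with no per-user memory carried between calls."""

    def do_GET(self):
        body = b"hello from a stateless container\n"
        self.send_response(200)
        self.send_header("Content-Type", "text/plain")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, fmt, *args):
        pass  # keep the example quiet

if __name__ == "__main__":
    # Cloud Run injects the port to listen on via $PORT (default 8080).
    port = int(os.environ.get("PORT", "8080"))
    HTTPServer(("", port), Handler).serve_forever()
```

Packaged into a container image, this is the whole deliverable: no cluster config, no instance sizing, just a process that answers HTTP on the port it is told to use.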
The exam-relevant use cases cluster around four patterns:

- Websites and web applications served from a container
- REST APIs and backends for mobile or web clients
- Lightweight, request-triggered data processing
- Webhooks and automation tasks that fire on demand
The unifying theme is rapid scaling with minimal infrastructure management. When a question hands you a workload with bursty traffic and a team that does not want to run a cluster, Cloud Run is the answer most of the time.
Cloud Run scales horizontally. It adds and removes instances in response to incoming traffic rather than resizing the compute on a single instance. Two instances might be running during normal traffic, six instances during a spike, and zero instances when no requests are coming in.
The scale-to-zero behavior is one of the things that makes Cloud Run cheap. If your service receives no traffic, Cloud Run runs no instances, and you pay nothing for compute during that period. For workloads that are quiet most of the day, this is a meaningful cost difference compared to running a VM or a GKE node continuously.
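The scaling behavior described above can be sketched as a toy model. This is not Cloud Run's actual algorithm; it just shows the shape of request-driven horizontal scaling, assuming each instance handles a fixed number of concurrent requests.

```python
import math

def instances_needed(concurrent_requests: int,
                     concurrency_per_instance: int = 80,
                     min_instances: int = 0,
                     max_instances: int = 100) -> int:
    """Toy model of request-driven horizontal scaling: run enough
    instances to cover current concurrency, clamped to the configured
    limits. Illustrative only, not Cloud Run's real autoscaler."""
    needed = math.ceil(concurrent_requests / concurrency_per_instance)
    return max(min_instances, min(needed, max_instances))

print(instances_needed(0))    # quiet period: 0 instances, no compute cost
print(instances_needed(150))  # normal traffic: 2 instances
print(instances_needed(480))  # spike: 6 instances
```

Note that with `min_instances=0`, zero traffic means zero instances, which is exactly the scale-to-zero cost behavior and also the origin of the cold start problem below.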
Scale to zero has a tradeoff: when a request arrives at a service that has scaled down to zero, Cloud Run has to start a new instance from scratch before it can serve the request. That startup time is a cold start, and it shows up to the user as added latency on that first request.
Two mitigations come up on the Professional Cloud Architect exam:

- Set a minimum number of instances, so at least one warm instance is always ready to serve the first request after an idle period. You trade some idle cost for consistent latency.
- Keep the container lean, so that cold starts hurt less when they do happen: a small image and fast startup shrink the latency window users can notice.
The exam framing to watch for: a question describes a Cloud Run service with latency-sensitive first requests after idle periods. The fix is minimum instances. If a question describes a team paying too much for idle capacity on a request-driven service, the answer often goes the other direction, and you let it scale to zero.
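The minimum-instances knob is configured per service. A sketch of both directions of the tradeoff, using a hypothetical service name and region:

```shell
# Latency-sensitive direction: keep one warm instance ready so the
# first request after an idle period does not pay a cold start.
gcloud run services update my-service \
  --region us-central1 \
  --min-instances 1

# Cost-sensitive direction: allow the service to scale back to zero
# and accept the cold start on the first request after idle.
gcloud run services update my-service \
  --region us-central1 \
  --min-instances 0
```

The flag maps directly onto the exam framing: minimum instances buys latency with idle cost, and zero buys cost savings with first-request latency.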
Stateless means the container does not retain any data or session information between requests. Each request is handled independently. The container does not care which instance previously served a user, and the platform does not preserve in-memory state when an instance shuts down.
This is the property that makes Cloud Run's scaling model work. Because instances do not hold onto state, Cloud Run can start them, stop them, and route traffic between them freely without breaking the application. If your workload needs to remember things between requests, that state has to live somewhere else, like Memorystore, Firestore, or Cloud SQL.
The opposite is stateful. A stateful workload retains data or session information across requests, so instances cannot be swapped or terminated without coordinating that state. Stateful workloads are not a good fit for Cloud Run, and the exam will usually steer you toward GKE or Compute Engine for those.
The pattern-matching shortcut: if a question explicitly mentions stateless containers, strongly consider Cloud Run. That phrasing is essentially the platform's tagline, and the exam uses it as a signal.
For Professional Cloud Architect questions about Cloud Run, the decision flow looks like this:

- Containerized, stateless, invoked over HTTP, and the team does not want to manage infrastructure? Pick Cloud Run.
- Stateful, or instances cannot be swapped and terminated freely? Steer toward GKE or Compute Engine instead.
- Latency-sensitive first requests after idle periods? Stay on Cloud Run and set minimum instances.
- Paying too much for idle capacity on a request-driven service? Stay on Cloud Run and let it scale to zero.
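That pattern matching can be condensed into a toy function. The labels are mine, not an official Google decision tree, but the branches mirror the exam signals.

```python
def pick_platform(containerized: bool, stateless: bool,
                  request_driven: bool, wants_no_ops: bool) -> str:
    """Toy pattern-matcher for the exam decision flow. Illustrative
    labels, not an official decision tree."""
    if not stateless:
        # Stateful workloads need coordinated instances, so the exam
        # steers toward platforms where you control the nodes.
        return "GKE or Compute Engine"
    if containerized and request_driven and wants_no_ops:
        return "Cloud Run"
    return "look closer at the requirements"

assert pick_platform(True, True, True, True) == "Cloud Run"
assert pick_platform(True, False, True, True) == "GKE or Compute Engine"
```

The first branch being statefulness is deliberate: it is the one property that disqualifies Cloud Run outright, regardless of how well everything else fits.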
That covers the Cloud Run material the Professional Cloud Architect exam will test you on at this level: what it is, when to pick it, why it scales the way it does, and how to handle the cold start tradeoff that comes with scale to zero.
My Professional Cloud Architect course covers Cloud Run alongside the rest of the containers and serverless material.