
Cloud Run gives you a serverless way to run containers, but the moment you need to put a load balancer in front of it or stretch the workload across regions, you run into a structural problem. Load balancers expect to talk to backend instances, and Cloud Run does not expose any. Network Endpoint Groups are how Google Cloud bridges that gap, and they are the foundation of any global, high-availability Cloud Run architecture you will see on the Professional Cloud Architect exam.
When you stand up a load balancer in Google Cloud, the standard mental model is that traffic flows from the load balancer into a backend service, and the backend service points at a managed instance group of Compute Engine VMs or a node pool in GKE. Both of those backends give the load balancer something concrete to route to: real instances with real IPs.
Cloud Run is different. It is a fully managed serverless platform, which means there are no VMs you can register as backends. The infrastructure is abstracted away. So if you want to use Cloud Load Balancing in front of a Cloud Run service, the load balancer needs a different kind of target.
That target is a Network Endpoint Group, or NEG. A serverless NEG is a logical wrapper that points at a Cloud Run service (or Cloud Functions, or App Engine). The load balancer attaches to the NEG, and the NEG knows how to forward requests into the serverless service. You are not registering individual endpoints by IP. You are registering a reference to the service itself.
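To make "registering a reference to the service itself" concrete, here is a minimal sketch of creating a serverless NEG. The service name `hello`, the NEG name `hello-neg`, and the region are all illustrative placeholders, and the assumption is that the Cloud Run service is already deployed in that region:

```shell
# Create a serverless NEG that wraps an existing Cloud Run service.
# There are no IPs or instances here, just a pointer to the service;
# "hello-neg", "hello", and the region are placeholders.
gcloud compute network-endpoint-groups create hello-neg \
  --region=us-central1 \
  --network-endpoint-type=serverless \
  --cloud-run-service=hello
```

Note that the NEG is regional, because Cloud Run services are regional. That detail is what drives the multi-region pattern below.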
This is almost always paired with a global external HTTPS Load Balancer, and that is the path you should expect to see in PCA exam questions. Passthrough load balancers, such as the internal TCP/UDP load balancer and the older Network Load Balancer, operate at layer 4 and cannot use serverless NEGs.
Once you understand that NEGs are the connector, the global architecture becomes a fill-in-the-blanks exercise. The pattern looks like this: deploy the same Cloud Run service independently to each region you want to serve from, create a serverless NEG in each of those regions pointing at the local service, and attach every NEG as a backend of a single backend service behind one global external HTTPS Load Balancer.
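You do not need this syntax for the exam, but seeing the wiring once makes the shape of the pattern concrete. A sketch, with the service name `api`, all resource names, and the certificate as placeholder assumptions:

```shell
# One serverless NEG per region, each wrapping the regional Cloud Run
# service (all names and regions here are illustrative).
for region in us-central1 europe-west1 australia-southeast1; do
  gcloud compute network-endpoint-groups create "api-neg-${region}" \
    --region="${region}" \
    --network-endpoint-type=serverless \
    --cloud-run-service=api
done

# A single global backend service collects all three regional NEGs.
gcloud compute backend-services create api-backend \
  --global --load-balancing-scheme=EXTERNAL_MANAGED

for region in us-central1 europe-west1 australia-southeast1; do
  gcloud compute backend-services add-backend api-backend \
    --global \
    --network-endpoint-group="api-neg-${region}" \
    --network-endpoint-group-region="${region}"
done

# URL map, HTTPS proxy, and global forwarding rule complete the front end.
gcloud compute url-maps create api-lb --default-service=api-backend
gcloud compute target-https-proxies create api-proxy \
  --url-map=api-lb --ssl-certificates=api-cert
gcloud compute forwarding-rules create api-rule \
  --global --load-balancing-scheme=EXTERNAL_MANAGED \
  --target-https-proxy=api-proxy --ports=443
```

The important structural point is that the NEGs are regional while the backend service, URL map, proxy, and forwarding rule are all global. That split is exactly what lets one anycast IP fan out to the nearest regional Cloud Run deployment.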
Now when a user in Sydney makes a request, Google's global front end routes them to the closest healthy backend, which is the australia-southeast1 NEG, which forwards to the Cloud Run service running in that region. A user in Frankfurt hits europe-west1. A user in Chicago hits us-central1. Latency stays low because traffic terminates near the user, and availability stays high because if one region goes down the load balancer routes around it.
The microservices in this pattern are typically separate RESTful services. Each region runs an independent copy. There is no cross-region state coordination at the Cloud Run layer itself. Anything that needs shared state lives behind the services in something like Spanner or Cloud SQL with cross-region read replicas.
If you were doing the same thing with GKE or Compute Engine, you would not need NEGs in this serverless sense. The load balancer would attach directly to managed instance groups or to a GKE Ingress backed by node pools. Those backends already have a concrete shape the load balancer understands.
Cloud Run wins when you do not want to manage that infrastructure. You give up some of the direct control, you gain scale-to-zero billing and zero VM operations, and you accept that NEGs are the mandatory plumbing to connect to global load balancing. That tradeoff is the architectural decision the Professional Cloud Architect exam is testing.
If a PCA question describes a Cloud Run service that needs custom domains, SSL termination, Cloud Armor, Cloud CDN, or multi-region failover, the answer involves a serverless NEG behind an external HTTPS Load Balancer. If the question is about how to make a Cloud Run workload globally highly available, the answer is multiple regional Cloud Run deployments, each fronted by a NEG, all attached to one global HTTPS Load Balancer.
You do not need to memorize the gcloud syntax for creating a serverless NEG to pass the exam. You do need to recognize the pattern when it shows up in a scenario, and you need to know that NEGs exist specifically because serverless products do not expose backends the way VMs and node pools do.
My Professional Cloud Architect course covers Cloud Run NEGs and global high-availability architectures alongside the rest of the containers and serverless material.