Cloud CDN for the PCA Exam: Architecture and Common Use Cases

GCP Study Hub
Ben Makansi
December 24, 2025

Cloud CDN is one of those services that looks simple on the surface but has enough nuance to show up in scenario questions on the Professional Cloud Architect exam. When I work through the content delivery section of the curriculum, I focus on three things: what Cloud CDN actually does, how it integrates with the rest of the GCP networking stack, and the specific situations where reaching for a CDN is the right architectural choice. This article walks through all three.

What Cloud CDN Is

Cloud CDN is a content delivery network that caches content at Google's global edge locations. Instead of every user request traveling all the way back to an origin server, frequently accessed content sits at edge points of presence around the world, much closer to the users requesting it. The result is lower latency, less load on origin systems, and lower egress costs because traffic is served locally rather than pulled repeatedly from the origin.

The integration story is what makes Cloud CDN useful in practice. You enable it on the backends of an external HTTP(S) load balancer: a backend bucket pointing at Cloud Storage, or a backend service pointing at Compute Engine instances. That means a frontend serving static assets out of a Cloud Storage bucket can layer caching on top without any application changes, and a backend running on Compute Engine instances behind a load balancer can cache the parts of its responses that do not need to be regenerated on every request.

How a Request Flows Through Cloud CDN

The mental model I rely on for the exam is straightforward. A user request is directed to the nearest cache endpoint. If the requested file is already cached at that endpoint, Cloud CDN returns it immediately. If it is not, Cloud CDN retrieves the file from the origin (often a multi-region Cloud Storage bucket), caches it at the edge endpoint, and serves it to the user. Future requests for the same file from users in that region are then served straight from the cache.

That flow has a few implications worth holding onto. The first request from a region will always be a cache miss and will pay the cost of fetching from origin. After that, subsequent requests benefit. The cache is regional in the sense that an endpoint in one location does not automatically have the same content as an endpoint somewhere else. Each region warms up independently as users in that region make requests.
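The flow above can be sketched as a toy model: each region gets its own edge cache, and the origin is only contacted on a miss. The class and variable names here are illustrative assumptions, not Cloud CDN's actual implementation.

```python
class Origin:
    """Stands in for a multi-region Cloud Storage bucket."""
    def __init__(self, objects):
        self.objects = objects
        self.fetch_count = 0  # how many times the origin was actually hit

    def fetch(self, key):
        self.fetch_count += 1
        return self.objects[key]

class EdgeCache:
    """One cache endpoint; each region warms up independently."""
    def __init__(self, origin):
        self.origin = origin
        self.cache = {}

    def get(self, key):
        if key in self.cache:              # cache hit: served at the edge
            return self.cache[key], "HIT"
        content = self.origin.fetch(key)   # cache miss: pay the origin round trip
        self.cache[key] = content          # fill the cache for later requests
        return content, "MISS"

origin = Origin({"/logo.png": b"...image bytes..."})
us, eu = EdgeCache(origin), EdgeCache(origin)

print(us.get("/logo.png")[1])  # MISS: first request from this region
print(us.get("/logo.png")[1])  # HIT: served from the regional cache
print(eu.get("/logo.png")[1])  # MISS: the EU edge warms up on its own
print(origin.fetch_count)      # 2 origin fetches for 3 user requests
```

Note how the second region's first request is still a miss even though the object is already cached elsewhere; that is the regional warm-up behavior described above.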

When Cloud CDN Is the Right Choice

The use cases the curriculum highlights map directly to the exam patterns I have seen. Cloud CDN is a strong fit when:

  • The application has a global user base and you want consistent low latency regardless of where a user is located.
  • Traffic volumes are high enough that repeatedly hitting the origin is wasteful and expensive.
  • Low latency is an explicit requirement, for example a media site or an application serving large static assets like e-books, images, or video.
  • You want to reduce egress costs by serving content from cache instead of repeatedly pulling from a multi-region bucket.

The classic pairing is Cloud CDN with a multi-region Cloud Storage bucket holding the actual content. The bucket gives you durability and availability across regions, and Cloud CDN distributes the hot subset of that content to edge locations close to your users. If a question describes static content stored in Cloud Storage that needs to be delivered to a global audience with low latency, that combination is the answer the exam is looking for.

The Common Architecture: Cloud CDN with Load Balancing and Compute

The architecture I keep coming back to in PCA scenario questions combines Cloud CDN, Cloud Load Balancing, and a backend made up of Cloud Storage for static content and Compute Engine for dynamic logic. The flow looks like this.

Global users send requests that are routed to Google's nearest edge location. Because Cloud CDN is enabled on the load balancer, the edge checks its cache before any backend is involved. If the requested content is cached, Cloud CDN serves it immediately from that edge location and the request never touches the origin. If the content is not cached, Cloud CDN fetches it from Cloud Storage, caches it for future requests, and returns it to the user.

For requests that involve backend logic or dynamic content generation, the load balancer routes traffic to Compute Engine instances. Those instances handle application logic, hit databases, or generate content that cannot be served from cache, and return the result to the load balancer, which returns it to the user.

The architectural value of this pattern is the separation of concerns. Frequently accessed static content is served efficiently from cache without consuming backend compute. The backend is left to focus on the work that genuinely requires real-time computation. The result is a system that scales to high traffic, keeps latency low for global users, and controls cost by minimizing both compute load and egress from the origin.

What to Watch For on the Exam

When a Professional Cloud Architect question describes a workload with global users and static or semi-static content, Cloud CDN should be on your shortlist. Pay attention to the language around latency, traffic volume, and egress cost. Those are the cues that point to a caching answer rather than scaling the origin further. If the scenario also mentions backend logic or dynamic responses, the answer is usually the combined architecture: Cloud CDN in front, load balancer routing between CDN and Compute Engine, and Cloud Storage holding the durable content.

The other thing to remember is what Cloud CDN does not do. It is not a database accelerator and it is not a substitute for designing a global database strategy. It caches HTTP(S) responses at the edge. When the response is appropriate to cache, it is excellent. When the response is unique to the user or changes constantly, the cache will not help and the question is probably pointing somewhere else.
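A simplified sketch of that cacheability decision: a response is a cache candidate only when its Cache-Control header allows shared caching. This mirrors general HTTP caching rules rather than Cloud CDN's exact policy, which also depends on the configured cache mode, response size, and other headers.

```python
def is_cacheable(cache_control: str) -> bool:
    """Rough shared-cache eligibility check based on Cache-Control directives."""
    directives = {d.strip().lower() for d in cache_control.split(",")}
    # Responses unique to a user or explicitly uncacheable must not be cached.
    if {"private", "no-store", "no-cache"} & directives:
        return False
    # Otherwise require an explicit opt-in: public, max-age, or s-maxage.
    return any(d == "public" or d.startswith(("max-age=", "s-maxage="))
               for d in directives)

print(is_cacheable("public, max-age=3600"))  # True: safe to serve from the edge
print(is_cacheable("private, max-age=600"))  # False: unique to the user
print(is_cacheable("no-store"))              # False: must never be cached
```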

My Professional Cloud Architect course covers Cloud CDN alongside the rest of the networking material.
