Canary, Blue/Green, and Rolling Deployments for the PCA Exam

GCP Study Hub
Ben Makansi
December 23, 2025

When you push a new version of an application to production, the way you roll it out matters as much as the code itself. The Professional Cloud Architect exam expects you to know the trade-offs between three main deployment strategies: canary, blue/green, and rolling. Each one balances risk, cost, and rollback speed differently, and the right choice depends on the workload, the blast radius of a bad release, and how much spare capacity you can afford to run.

I want to walk through each strategy, what it actually does on Google Cloud, and the situations where the exam tends to favor one over the others.

Rolling Deployments

A rolling deployment replaces instances of the old version with the new version a few at a time until the entire fleet is updated. If you have ten instances behind a load balancer, a rolling update might swap two at a time, wait for them to pass health checks, then swap the next two, and so on.

This is the default behavior for Managed Instance Groups when you update a template. You set maxSurge to control how many extra instances can exist during the rollout and maxUnavailable to control how many instances can be down at once. Kubernetes Deployments work the same way, with the same parameter names on the rolling update strategy.
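As a sketch of what that looks like in practice (the group, template, and zone names here are placeholders), a rolling MIG update might be started like this:

```shell
# Hypothetical names: my-mig, v2-template, us-central1-a.
# Roll out the new template with at most 2 extra instances running at a
# time, and never take an instance down before its replacement is healthy.
gcloud compute instance-groups managed rolling-action start-update my-mig \
  --zone=us-central1-a \
  --version=template=v2-template \
  --max-surge=2 \
  --max-unavailable=0
```

Setting maxUnavailable to zero trades rollout speed for capacity: the group never dips below its target size, but every swap waits on a surge instance passing health checks.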

Rolling deployments are cheap because you do not need to double your capacity. They are also slow, and during the rollout your fleet is running a mix of old and new versions at the same time. That last part matters. If the new version changes a database schema or an API contract, a mixed fleet can produce inconsistent behavior for users hitting different instances. Rollback is also slow because you have to roll the old version back the same way you rolled it forward.

Blue/Green Deployments

A blue/green deployment runs two complete environments side by side. Blue is the current production environment serving all traffic. Green is a full copy running the new version, sized to handle the same load. Once green passes its checks, you flip the load balancer to send all traffic to green. Blue stays online for a while in case you need to roll back, which is just another flip of the load balancer.

On Google Cloud you typically implement this with two Managed Instance Groups behind a single global external HTTPS load balancer, or with two separate Cloud Run revisions and a traffic split. For Kubernetes you can run two Deployments in the same cluster and switch a Service selector, or run them in separate clusters and switch at the load balancer level.
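The in-cluster Kubernetes variant is worth seeing concretely. Assuming two Deployments labeled version=blue and version=green, both matching a Service named web (all hypothetical names), the cutover is a one-line selector patch:

```shell
# Flip all traffic from blue to green by repointing the Service selector.
# No pods restart; only routing changes.
kubectl patch service web \
  -p '{"spec":{"selector":{"app":"web","version":"green"}}}'

# Rollback is the same operation pointed back at blue.
kubectl patch service web \
  -p '{"spec":{"selector":{"app":"web","version":"blue"}}}'
```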

The main advantage is rollback speed. If something is wrong on green, you flip back to blue in seconds. There is no mixed-version state because at any moment all traffic is on one environment or the other. The cost is that you have to pay for double the infrastructure during the cutover window, and possibly longer if you keep blue warm as a safety net.

Canary Deployments

A canary deployment routes a small slice of production traffic to the new version while the bulk of users stay on the old version. You might start at one percent, watch error rates and latency, then ramp to five percent, then twenty-five, then one hundred. If anything goes wrong at any stage, you cut the canary back to zero and the old version is still serving everyone else.

This is the strategy you reach for when the new version is risky and you want to limit the blast radius to a small fraction of users before committing. It is also the slowest of the three to fully roll out because you are deliberately gating the ramp on real user signals.

On Google Cloud, Cloud Run makes canaries easy because traffic splitting is a first-class feature on revisions. You deploy the new revision with zero traffic, then assign it a percentage. GKE supports canaries through Anthos Service Mesh or any service mesh that does weighted routing. For Managed Instance Groups, a canary update lets you apply the new template to a subset of the group, specified as a fixed number or percentage of instances, so part of your fleet runs the new version and the load balancer naturally distributes traffic across both.
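A Cloud Run canary ramp, sketched with a hypothetical service and image name, looks like this:

```shell
# Deploy the new revision without sending it any traffic, tagged so it
# gets its own test URL.
gcloud run deploy my-service --image=gcr.io/my-project/app:v2 \
  --region=us-central1 --no-traffic --tag=canary

# Send 5% of traffic to the tagged revision; the previous revision keeps
# serving the other 95%.
gcloud run services update-traffic my-service \
  --region=us-central1 --to-tags=canary=5

# Once error rates and latency look healthy, promote fully.
gcloud run services update-traffic my-service \
  --region=us-central1 --to-latest
```

Cutting the canary back to zero is just another update-traffic call, which is why rollback at any stage of the ramp is nearly instant.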

Choosing on the Exam

The exam will usually give you a scenario and ask which strategy fits. A few patterns to watch for:

If the question emphasizes minimizing cost or there is no spare capacity to double the environment, rolling is the answer. If the question emphasizes fast rollback or the new version changes behavior in ways a mixed fleet cannot tolerate, blue/green is the answer. If the question emphasizes limiting exposure of a risky change to a small percentage of users before full rollout, canary is the answer.

Watch for words like phased rollout, which is another name for canary. Watch for traffic split or weighted routing, which usually points to canary on Cloud Run or a service mesh on GKE. Watch for instant rollback or zero-downtime cutover, which point to blue/green.

One more thing the exam likes to test. These strategies are not mutually exclusive. You can do a canary on top of a blue/green by sending one percent of traffic to green before flipping the rest. You can do a rolling deployment within a single environment of a blue/green pair. The strategies compose, and a Professional Cloud Architect should be comfortable mixing them when the situation calls for it.

Where Each One Lives in Google Cloud

Cloud Run is the easiest place to run any of these because revisions and traffic splits are built in. You deploy a new revision, hold it at zero percent, then ramp it however you want.

Managed Instance Groups support both rolling and canary updates natively. You set the update policy, optionally pin the new template to a canary subset by giving it a target size (a count or a percentage), and the group handles the rest.
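The canary form of a MIG update is the same rolling-action command with a second version added. Names here are placeholders:

```shell
# Keep the stable template on most of the group, and run the new template
# on 10% of instances. The load balancer spreads traffic across both.
gcloud compute instance-groups managed rolling-action start-update my-mig \
  --zone=us-central1-a \
  --version=template=stable-template \
  --canary-version=template=v2-template,target-size=10%
```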

GKE supports rolling updates out of the box on every Deployment. Canary and blue/green require either separate Deployments with Service selector tricks or a service mesh layered on top.
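One common no-mesh trick is a replica-ratio canary: two Deployments (here hypothetically web-stable and web-canary) share the label the Service selects on, so traffic splits roughly in proportion to pod count rather than by an exact percentage.

```shell
# 9 stable pods + 1 canary pod behind the same Service is roughly a
# 10% canary, since the Service balances across all matching pods.
kubectl scale deployment web-stable --replicas=9
kubectl scale deployment web-canary --replicas=1

# Ramp by shifting the ratio; roll back by scaling the canary to zero.
kubectl scale deployment web-canary --replicas=0
```

The split is approximate and coarse-grained, which is exactly why the exam points you at a service mesh when a question demands precise weighted routing on GKE.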

App Engine supports traffic splitting across versions, which is essentially canary by another name. You can split by IP, cookie, or random percentage, and migrate traffic gradually.
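Assuming two deployed versions named v1 and v2 on the default service, an App Engine split looks like this:

```shell
# Send 5% of traffic to v2, chosen at random per request; use
# --split-by=cookie or --split-by=ip for sticky assignment instead.
gcloud app services set-traffic default \
  --splits=v1=0.95,v2=0.05 \
  --split-by=random
```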

Across all of these, the load balancer is the actual mechanism that makes any non-rolling strategy work. Whatever the platform, the question is always how you route traffic between an old version and a new version, and the answer is some flavor of weighted backend or traffic split.

My Professional Cloud Architect course covers deployment strategies alongside the rest of the architecture and compliance material.
