
Knative Serving is the piece of the GKE story that turns a Kubernetes Deployment into something that behaves like a serverless application. For the Professional Cloud Architect exam, the specific capability worth knowing is traffic splitting between Revisions, because that is what powers canary deployments on GKE without bolting on a separate service mesh.
Knative Serving abstracts the operational overhead of running an application on Kubernetes. Instead of managing Deployments, Services, HPAs, and Ingress objects yourself, you describe the application and Knative handles scaling and traffic routing on top of GKE. The unit of deployment is a Revision, which is an immutable snapshot of your application at a point in time. Every time you ship a new version, Knative produces a new Revision rather than mutating the old one.
That immutability is the foundation for everything that follows. Because Revision 1 still exists after you deploy Revision 2, you can keep both running and decide how much traffic each one gets.
The Knative traffic splitter sits between your users and your Revisions. You configure percentages, and the splitter distributes incoming requests accordingly. The canonical example is a 90/10 split: 90 percent of traffic to Revision 1 (the stable version in production) and 10 percent to Revision 2 (the new version under test).
Concretely, the flow looks like this: a request arrives at the Knative Service's single endpoint, the traffic splitter selects a Revision according to the configured percentages, and the request is served by a Pod belonging to that Revision. Users see one URL; the mix of Revisions behind it is entirely your configuration.
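As a sketch of what that configuration might look like, here is a Knative Service spec with a 90/10 traffic block. The service name, image, and Revision names are hypothetical:

```yaml
apiVersion: serving.knative.dev/v1
kind: Service
metadata:
  name: hello                             # hypothetical service name
spec:
  template:
    spec:
      containers:
        - image: gcr.io/my-project/hello:v2   # hypothetical image for Revision 2
  traffic:
    - revisionName: hello-00001           # Revision 1: stable, gets 90% of requests
      percent: 90
    - revisionName: hello-00002           # Revision 2: canary, gets 10% of requests
      percent: 10
```

Knative generates Revision names automatically (the `-00001` suffix pattern here follows the default), and the percentages in the traffic block must sum to 100.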
If Revision 2 starts throwing errors or its latency degrades, you flip the split back to 100 percent on Revision 1. The rollback is a configuration change, not a redeploy, because Revision 1 was never torn down.
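Assuming a hypothetical Service named `hello` whose stable Revision is `hello-00001`, that rollback is nothing more than rewriting the traffic block and re-applying the Service:

```yaml
  traffic:
    - revisionName: hello-00001           # all traffic back on the known-good Revision
      percent: 100
```

Revision 2 continues to exist; it simply receives no requests, and with scale-to-zero its Pods drain away on their own.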
Canary deployment is the pattern of releasing a new version to a small slice of real production traffic before sending it the full load. The goal is to surface bugs, performance regressions, or capacity problems while the blast radius is still 10 percent of your users instead of 100 percent.
Knative gives you canary deployments natively on GKE. You do not need to write custom routing rules or stand up Istio yourself. You declare the split on the Knative Service, Knative reconciles it, and the traffic splitter enforces it. Once you are confident in the new Revision, you shift the split to 100 percent on Revision 2 and Revision 1 becomes the rollback target.
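One way to sketch that promotion, again with hypothetical Revision names, is to give the old Revision a tag so it stays individually addressable while carrying no mainline traffic:

```yaml
  traffic:
    - revisionName: hello-00002           # promoted canary now takes all traffic
      percent: 100
    - revisionName: hello-00001           # 0% of mainline traffic, kept as rollback target
      percent: 0
      tag: previous                       # exposes Revision 1 at its own tagged URL
```

The tag gives Revision 1 a dedicated URL, so you can still smoke-test it directly even though it serves none of the production split.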
For the Professional Cloud Architect exam, the recognition pattern is straightforward. When a scenario describes a team running containers on GKE and asking how to release a new version to a small percentage of users with the option to roll back quickly, Knative Serving with traffic splitting between Revisions is the answer. The exam will not ask you to write the YAML, but it does expect you to know that Revisions are immutable snapshots, that a Knative Service splits traffic between Revisions by percentage, and that rollback is a configuration change rather than a redeploy.
The Cloud Run vs Knative-on-GKE distinction is worth holding onto. Cloud Run gives you the same Knative-style traffic splitting as a fully managed service, with no GKE cluster to operate. Knative on GKE gives you the same capability when you need to stay on a Kubernetes cluster you control. Both come up on the Professional Cloud Architect exam, and both rest on the same Revision-plus-traffic-split model.
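On Cloud Run, the same split is a `gcloud` command rather than a YAML edit. A sketch, with hypothetical service and Revision names (the region is also an assumption):

```shell
# Send 10% of traffic to the new Revision, keep 90% on the stable one.
gcloud run services update-traffic hello \
  --region=us-central1 \
  --to-revisions=hello-00001=90,hello-00002=10

# Rollback: everything back to the stable Revision.
gcloud run services update-traffic hello \
  --region=us-central1 \
  --to-revisions=hello-00001=100
```

The mental model carries over unchanged: Revisions are immutable, and the split is declarative configuration you can flip at any time.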
My Professional Cloud Architect course covers Knative traffic splitting on GKE alongside the rest of the containers and serverless material.