
Vertex AI Pipelines is one of those services that is much easier to remember once you understand what it replaces. On the Professional Cloud Architect exam, the question is rarely about pipeline syntax. It is about which layers of an ML pipeline stack are managed for you and which layers you still own. The cleanest way to keep this straight is to look at what the world looked like before Vertex AI Pipelines existed.
An ML pipeline stack has three distinct layers: an authoring layer (the SDK you use to define the pipeline), an orchestration layer (the runner that executes the DAG), and an infrastructure layer (the compute the runner sits on). Every framework slots into one or more of them.
Every Vertex AI Pipelines question on the Professional Cloud Architect exam is really asking which of these layers Google manages versus which ones you are responsible for. Hold on to that framing.
Before Vertex AI, the dominant pattern for running ML pipelines on Kubernetes was Kubeflow Pipelines together with the Kubeflow Pipelines SDK.
The Kubeflow Pipelines SDK is the authoring layer. You write Python that describes each step and how those steps wire together. Once defined, the SDK compiles your pipeline into a specification that Kubeflow Pipelines can execute.
Kubeflow Pipelines itself is the orchestration layer. It picks up that compiled spec and runs the DAG using containerized workloads. The catch is that Kubeflow Pipelines runs on Kubernetes, which is where the Kube in Kubeflow comes from. That meant you had to stand up and maintain a Kubernetes cluster yourself: cluster lifecycle, networking, compute scaling, availability.
It was a meaningful step forward at the time, but the operational burden was real. You needed Kubernetes expertise just to keep the orchestration layer healthy, on top of the ML work itself. That cost is what Vertex AI Pipelines later removes.
TensorFlow Extended, usually shortened to TFX, is the other authoring framework worth knowing for the exam.
TFX is a library of standardized components designed for production ML workflows, with first-class support for TensorFlow models. Like the Kubeflow Pipelines SDK, TFX handles the Python authoring layer, but it is more opinionated. It ships with built-in components for common tasks such as data ingestion, transformation, model training, and evaluation, so you are composing standard pieces rather than writing every step from scratch.
One important distinction: whereas Kubeflow Pipelines gives you both an SDK and a runner, TFX is purely the authoring layer. When people say "TFX" or "TensorFlow Extended" in a pipeline context, they mean the TFX SDK; the terms are interchangeable.
Because TFX is only an SDK, it needs a separate runner to orchestrate the pipeline. The most common runner used with TFX is actually Kubeflow Pipelines. TFX also supports Apache Beam and Airflow as runners. Whichever runner you pick, that runner still needs an execution environment underneath: Kubernetes, virtual machines, or a local machine, depending on the runner. So a TFX setup before Vertex AI looked very similar to a Kubeflow Pipelines setup in the layers you had to operate.
Now to the part the Professional Cloud Architect exam actually tests. Vertex AI Pipelines is a managed runner. It replaces both the orchestration layer and the infrastructure layer that Kubeflow Pipelines and the underlying Kubernetes cluster used to provide.
The authoring layer does not change. You still write your pipeline using either the Kubeflow Pipelines SDK or the TFX SDK. What changes is everything below that. Instead of standing up Kubeflow Pipelines on a Kubernetes cluster you maintain, you hand the compiled pipeline spec to Vertex AI Pipelines and Google runs it.
That means no Kubernetes cluster setup, no manual scaling of orchestration nodes, no separate orchestration engine to operate. The orchestration layer and the infrastructure layer are both fully managed. You focus on pipeline logic. Vertex takes care of executing it.
One detail worth holding on to: regardless of whether you author with the Kubeflow Pipelines SDK or the TFX SDK, both compile down to the same standard pipeline specification, which is essentially the Kubeflow Pipelines spec. Vertex AI Pipelines executes that spec in either case. So Vertex AI Pipelines is, under the hood, a managed Kubeflow runner that happens to also accept TFX-authored pipelines through the same compiled format.
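Handing a compiled spec to the managed runner is a few lines with the `google-cloud-aiplatform` client library. This is a sketch assuming a real GCP project; the project, region, bucket, and file names are illustrative, so it is not runnable without your own values and credentials.

```python
from google.cloud import aiplatform

# Illustrative project and region; substitute your own.
aiplatform.init(project="my-project", location="us-central1")

# The compiled spec (authored with either the KFP SDK or the
# TFX SDK) is handed to the managed runner. There is no cluster
# and no orchestration engine for you to operate.
job = aiplatform.PipelineJob(
    display_name="demo-pipeline",
    template_path="demo_pipeline.yaml",
    pipeline_root="gs://my-bucket/pipeline-root",
)
job.run()  # submits the run and blocks until it completes
```

Note that the authoring code is untouched: only the submission target changed from a self-managed Kubeflow Pipelines endpoint to the managed service.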
When a question on the Professional Cloud Architect exam describes a team that is currently running ML pipelines on Kubeflow Pipelines on GKE and wants to reduce operational overhead, the answer is almost always Vertex AI Pipelines. The framing usually highlights Kubernetes maintenance, cluster scaling, or orchestration upgrades as a pain point. Vertex AI Pipelines removes all of those concerns because it manages the orchestration and infrastructure layers itself.
If a question asks whether existing pipeline code has to be rewritten, the answer is no. Kubeflow Pipelines SDK code and TFX SDK code both run on Vertex AI Pipelines without changing the authoring layer. That is a frequent distractor because the obvious-looking answer is "rewrite for the managed service," and it is wrong.
If a question contrasts TFX with Kubeflow Pipelines and asks what TFX provides, remember that TFX is only the SDK. It does not provide its own orchestration layer, so it always needs a runner, which is typically Kubeflow Pipelines or, in the managed world, Vertex AI Pipelines.
The shortest version of all of this: Vertex AI Pipelines is the managed runner that replaces the Kubeflow orchestration layer and the Kubernetes layer underneath it, while leaving the SDK choice up to you.
My Professional Cloud Architect course covers Vertex AI Pipelines alongside the rest of the ML and AI material.