Vertex AI Is Replaced by Gemini Enterprise Agent Platform

GCP Study Hub
Ben Makansi
May 2, 2026

Vertex AI was Google's end-to-end MLOps platform, and until now Google's approach was to integrate its generative AI tools and products into Vertex AI. Now it's the other way around.

Instead, Vertex AI's features, including Model Garden, Custom Training, AutoML, Model Registry, Endpoints, Pipelines, and everything else, are now rebranded and subsumed under a new service in GCP called "Agent Platform." The full name is actually "Gemini Enterprise Agent Platform."

So literally, when you go to Google Cloud Console now, you cannot go to Vertex AI. There is no Vertex AI to go to.

Even if you search for Vertex AI, GCP redirects you to Agent Platform.

And within Agent Platform, most of the previous Vertex AI features now live under the "Models" menu.

Meanwhile, Google has an entirely separate "Agents" menu within Agent Platform that apparently contains most of what was previously under "Vertex AI Agent Builder," as well as some new features.

Here's what the Agents section of Agent Platform seems to include:

Under the "Build" sub-menu we have Agent Garden, ADK, MCP Servers, RAG Engine, Vector Search, and Search.

Agent Garden is like Model Garden but for agents. It has prebuilt agents and templates you can use.

The ADK refers to the Google ADK, which is Google's open source framework for building agents programmatically, including with non-Google models.
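For context, a minimal ADK agent is just a model, an instruction, and a list of plain-Python tool functions. This is a hedged sketch following the shape of the google-adk quickstart; the tool, agent name, and model string are illustrative, and the SDK import is guarded so the sketch still runs where `google-adk` isn't installed.

```python
import random


def roll_die(sides: int = 6) -> int:
    """Tool: roll a die with the given number of sides."""
    # ADK infers the tool's schema from the signature and docstring.
    return random.randint(1, sides)


try:
    # Requires `pip install google-adk`.
    from google.adk.agents import Agent

    root_agent = Agent(
        name="dice_agent",
        model="gemini-2.0-flash",  # illustrative model name
        instruction="Roll dice for the user when asked.",
        tools=[roll_die],
    )
except ImportError:
    root_agent = None  # SDK not installed in this environment
```

The point is that a tool is just a typed Python function; the framework handles exposing it to the model.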

MCP Servers is a registry of Model Context Protocol servers, managed by the Cloud API Registry. You can use prebuilt MCP servers, like those for BigQuery or Google Maps, or transform existing APIs into custom MCP servers.
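Conceptually, an MCP server just exposes a named set of tools with typed schemas over a standard protocol, so an agent can discover and call an existing API. Here's a toy sketch of that idea (not the actual MCP wire format, which is JSON-RPC; the tool and schema are made up):

```python
# Toy illustration of the MCP concept: wrap an existing "API" as a
# registry of named tools an agent can list and invoke.
def query_table(table: str, limit: int = 10) -> str:
    """Build a BigQuery-style query (pretend API call)."""
    return f"SELECT * FROM {table} LIMIT {limit}"

TOOLS = {
    "query_table": {
        "description": query_table.__doc__,
        "input_schema": {"table": "string", "limit": "integer"},
        "call": query_table,
    },
}

def list_tools():
    """What an agent sees when it connects to the server."""
    return [{"name": n, "description": t["description"]} for n, t in TOOLS.items()]

def call_tool(name: str, **kwargs):
    """Dispatch a tool call by name."""
    return TOOLS[name]["call"](**kwargs)
```

The managed MCP Servers feature presumably does this wrapping for you, which is why it sits on top of the Cloud API Registry.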

RAG Engine, Vector Search, and Search are the different solutions for grounding / RAG. These were available before, but now Vertex AI Search is just called Search.

Under the "Scale" sub-menu we have Deployments, Memory Bank, and Sessions.

Deployments is what was formerly called Agent Engine. It's a scalable, managed runtime for deploying and managing agents.

Memory Bank is a managed service that stores long-term memory to allow more context-aware agent interactions across multiple sessions with your agent.

Sessions manages stateful data and context within a single agent interaction. The session holds the back-and-forth and sequence of actions for one conversation, and Memory Bank uses those stored sessions as the source for generating cross-session memories.
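The session/memory split is easier to see in a toy sketch. These are illustrative data structures, not the actual API: a session holds one conversation's turns, and the memory bank distills closed sessions into durable, cross-session facts.

```python
from dataclasses import dataclass, field


@dataclass
class Session:
    """One conversation: the ordered back-and-forth with the agent."""
    user_id: str
    turns: list = field(default_factory=list)

    def add_turn(self, role: str, text: str):
        self.turns.append((role, text))


@dataclass
class MemoryBank:
    """Cross-session store: distilled facts keyed by user."""
    memories: dict = field(default_factory=dict)

    def ingest(self, session: Session):
        # The real service would use a model to extract salient facts;
        # here we just keep the user's utterances.
        facts = [text for role, text in session.turns if role == "user"]
        self.memories.setdefault(session.user_id, []).extend(facts)

    def recall(self, user_id: str):
        """What a *later* session can retrieve about this user."""
        return self.memories.get(user_id, [])
```

So if a user says "I prefer aisle seats" in one session, `ingest` stores it and a future session can `recall` it, which is exactly the session-to-memory flow described above.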

Under the "Govern" sub-menu we have Agent Registry, Policies, Gateways, and Security.

Agent Registry is like Model Registry, but for agents. It's your catalog for tracking and managing agents.

Policies lets you mitigate risks like data leakage and ensure compliance.

Gateways provide secure, unified connectivity between agents and tools across any environment, while enforcing security policies and Model Armor protections. They're like the air traffic controllers for your agent ecosystem.

And then the "Security" feature is basically Agent Identity plus Model Armor plus threat scanning. Agent Identity gives every agent a unique cryptographic ID, creating a clear auditable trail for every action it takes, and then you have real-time threat detection and vulnerability scanning specific to agentic systems.

Finally, we have the "Optimize" sub-menu, which contains Topology and Evaluation.

Topology is a visual map of how agents, tools, and infrastructure connect to each other. Topology is pretty new for agents, although it's built on App Hub, which has been around for some time.

And we have Evaluation, which lets you test agents against simulations and synthetic user interactions in a controlled environment. You can automatically score your agent on task success and safety across multi-step conversations. Agent Evaluation also continuously scores agents against live traffic with "Online monitors" and allows you to debug reasoning paths with a trace viewer.

Finally, there's also Agent Studio, in a separate menu. I assume they keep it separate because it's the most accessible way to build agents, including for people who are nontechnical and don't code, so they want it up front.

If you're opening up Agent Platform for the first time, especially in a new project, you'll be prompted to enable a bunch of APIs to allow everything in the platform to work. And the list of APIs that you have to enable is actually pretty revealing about how Agent Platform works.

  • aiplatform.googleapis.com is the main Agent Platform API and covers Agent Engine, RAG Engine, Vector Search, Memory Bank, Sessions, and Model Garden. It's kind of funny because AI Platform, the current name of this API, was the old name for Google's end-to-end ML development service before it was renamed to Vertex AI. And now it's been renamed again, to Agent Platform, but still the API is aiplatform.googleapis.com. I'm not saying this is a bad thing but I do find it amusing.
  • compute.googleapis.com, iam.googleapis.com, and storage-component.googleapis.com are foundational dependencies that other services rely on: the Compute Engine / networking API, the Cloud IAM API, and one of the Cloud Storage APIs.
  • notebooks.googleapis.com is necessary for Colab Enterprise and Workbench - managed Jupyter notebooks.
  • dataform.googleapis.com provides managed data transformations, often used in agent data pipelines. It's the API for Dataform.
  • agentregistry.googleapis.com runs the Agent Registry, the centralized catalog for agents, tools, and MCP servers.
  • cloudapiregistry.googleapis.com runs the managed MCP servers. As we discussed above, the MCP Servers feature uses Cloud API Registry under the hood.
  • modelarmor.googleapis.com provides prompt-injection and data-leakage protection at the gateway layer.
  • networksecurity.googleapis.com and networkservices.googleapis.com give you Agent Gateway routing and policy enforcement.
  • cloudtrace.googleapis.com, logging.googleapis.com, and monitoring.googleapis.com make up the standard Cloud Observability Suite which will give you the metrics about your agents as well as the trace viewer in Evaluation.
  • observability.googleapis.com powers Application Monitoring (app-level dashboards and topology), extended with GenAI metrics like token usage.
  • telemetry.googleapis.com is the OTLP ingestion endpoint for Cloud Trace, used by ADK agents since ADK emits OpenTelemetry natively.
  • apphub.googleapis.com and apptopology.googleapis.com generate the Topology view, reusing Google Cloud's application-graph and dependency-mapping services to map relationships between agents, tools, and infrastructure.
  • texttospeech.googleapis.com is the Text-to-Speech API, which lets you do voice output for voice agents.
  • discoveryengine.googleapis.com does not seem to be enabled by default as part of the initial list, but you would probably need to enable it to use Search for RAG use cases.
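If you'd rather enable these up front than click through the prompts, the standard gcloud command works. A partial sketch (the project ID is a placeholder, and the remaining APIs from the list above follow the same pattern):

```shell
# Enable a few of the core Agent Platform dependencies in one shot.
# Replace my-project with your actual project ID.
gcloud services enable \
  aiplatform.googleapis.com \
  agentregistry.googleapis.com \
  cloudapiregistry.googleapis.com \
  modelarmor.googleapis.com \
  apphub.googleapis.com \
  --project=my-project
```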

My thoughts on what this means

It's honestly a pretty bold move to make everything related to AI development on GCP agent-first. The "Models" part, where you would train your own AI models, whether gen AI, classical ML, or other deep learning models, is a sub-section of Agent Platform, not the other way around.

Conceptually it feels a little backwards to me. Agents are a type of AI, and a supervised model that you might build on Agent Platform, like an XGBoost model or an image classifier, is not a type of agent. So I'm not sure how it makes sense for Google to put everything under a platform called Agent Platform, rather than have agents be part of a larger AI platform that also does other AI things.

It also seems to be a bit risky, because if agents specifically do not become ubiquitous, and another type of AI does, then Google will have to backtrack or awkwardly maintain a platform name/focus that is incongruous with the direction of the AI industry at that future time.

I can only infer from this that Google truly does believe that agents are the future. I think that's a risky bet, because "agents" are just LLMs + tools + a loop. We have not fully discovered their limitations and the extent of their application. But Google is betting on agents by making their entire AI platform focused on agents, at least for now.
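That "LLMs + tools + a loop" framing is worth making concrete. Stripped of everything else, an agent is roughly the sketch below. The model is stubbed out here so it runs standalone; real systems add planning, memory, and guardrails on top of this skeleton.

```python
def run_agent(model, tools, task, max_steps=5):
    """Minimal agent loop: ask the model, execute any tool it picks,
    feed the observation back, and stop when it gives a final answer."""
    history = [("task", task)]
    for _ in range(max_steps):
        action = model(history)          # LLM decides: tool call or final answer
        if action["type"] == "final":
            return action["text"]
        result = tools[action["tool"]](**action["args"])  # execute the tool
        history.append(("observation", result))
    return "gave up after max_steps"


# Stub "LLM" that calls a calculator tool once, then answers.
def stub_model(history):
    if history[-1][0] == "task":
        return {"type": "tool", "tool": "add", "args": {"a": 2, "b": 3}}
    return {"type": "final", "text": f"The answer is {history[-1][1]}"}


tools = {"add": lambda a, b: a + b}
```

Here `run_agent(stub_model, tools, "What is 2+3?")` returns "The answer is 5". Everything an agent platform sells (runtime, memory, gateways, evaluation) is scaffolding around this loop.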

The four types of stacks for building and deploying agents, and where Agent Platform fits in

There are a lot of tools and platforms out there, and it can be difficult to know where to start or which opportunities you're setting yourself up for if you learn one or another.

I think this framework is a useful way to think about the different types of stacks and approaches to building AI agents.

Type 1: The hyperscaler stack

You may have heard the term "hyperscaler" thrown around before - it refers to one of the very large cloud providers that operates data centers at a massive scale globally. It's typically shorthand for AWS, Microsoft Azure, and Google Cloud, sometimes extended to include Oracle and Alibaba.

Accordingly, the "hyperscaler stack" in the context of agent platforms refers to the agent platform offered by one of these companies as part of their broader cloud services, rather than as a standalone product.

Examples include GCP Agent Platform, AWS Bedrock with AgentCore, and Microsoft Foundry.

So GCP Agent Platform is a hyperscaler stack because it's a feature of Google Cloud, not a separate company's product.

The shape of the workflow for each hyperscaler stack is pretty consistent. You build the agent in the cloud's respective SDK, deploy it to the cloud's managed runtime, and rely on that same cloud for memory, agent tools, governance, and observability. Data integration is native to the cloud's data services, assuming your data already lives on that platform.

This stack optimizes for production scale, enterprise governance, deep data integration, and compliance. What you sacrifice is some multi-cloud flexibility and speed of prototyping especially for simple use cases.

The tradeoff is that this approach is probably slower to adopt new frontier features and creates high friction for small teams who don't need the governance.

The likely buyer is enterprise IT running many agents that touch real data and need audit trails.

Type 2: The model-vendor stack

The model-vendor stack is the full agent platform offered by a foundation model company, where the agent lifecycle is bundled with the vendor's model and runs on the vendor's own infrastructure.

Examples include Anthropic Claude Managed Agents and OpenAI's Agents SDK + AgentKit.

You build the agent in the model vendor's SDK, deploy it to the vendor's managed runtime, and rely on the vendor for memory, tools, and tracing. Data integration is whatever you wire up yourself via MCP or API calls.

This stack optimizes for speed to ship, a clean developer experience, and minimal infrastructure overhead. It gives up model flexibility, since you're locked to one vendor, and it gives up enterprise integration depth, governance, and features like identity propagation which you would get with one of the hyperscalers.

A major trade-off is vendor lock-in. You're a tenant in the vendor's environment, and switching is effortful if you don't like where they take things or the costs no longer work for you.

The likely customer for this is product teams shipping a single agent inside a SaaS product, solo developers, and small teams without internal or industry compliance requirements yet.

Type 3: The open-source stack

This is the do-it-yourself approach to building an agent platform, where you assemble the pieces yourself from open-source frameworks and infrastructure of your choosing.

Examples include LangChain or LangGraph paired with Cloud Run or Kubernetes, Postgres, and LangSmith. CrewAI and AutoGen are also common starting points, and you can mix and match any combination of open-source frameworks with DIY infrastructure.

You pick a framework, then wire up your own runtime, memory, tools, observability, and governance from open-source or self-hosted parts. You deploy wherever you want.

This stack optimizes for maximum flexibility, multi-cloud portability, and no vendor lock-in. It gives up everything that comes managed in the other stacks. There's no managed runtime, no managed memory, no managed observability. You have to assemble that yourself if you want it.

So obviously the tradeoff is the high engineering effort. The platform you build is only as good as the team maintaining it.

The likely customer here is engineering-heavy teams that want full control.

Type 4: The workflow automation stack

The workflow automation stack includes the various SaaS integration tools with AI agent capabilities layered on top, where the agent is one node in a visual workflow rather than a standalone application. Examples include n8n, Make.com, Zapier, Workato, and Power Automate.

This is where a lot of the tutorials online have blown up, because these tools are so accessible, but it's important to understand how they differ from the hyperscaler and other approaches described above.

The workflow is visual and low-code. You drag and drop a visual workflow, drop in an AI Agent node, configure tools through pre-built SaaS integrations, and deploy by activating the workflow.

This stack optimizes for speed on SaaS-driven use cases and non-coders. It gives up the identity, retrieval, evaluation, and governance capabilities needed to run agents in production.

The risk is brittleness when the logic requires reasoning over structure rather than chaining steps. Agents that need to plan, branch, or reason across many tool calls can easily hit the ceiling of the drag and drop workflow automation stack.

The people who would use this are probably solo operators, small businesses, and ops and marketing teams automating SaaS workflows. The use case is more "trigger fires, do these N steps" rather than "deploy a long-running agent over enterprise data."

How the four types differ

The clearest way to think about this is that every stack type basically optimizes for two of three properties (Speed, Control, Governance) and gives up the third.

  • Hyperscaler stacks pick governance and control, and give up some speed.
  • Model-vendor picks speed and control (at least in terms of dev experience), and they give up governance.
  • Open-source picks control and speed, gives up governance.
  • Workflow automation picks speed for non-developers, gives up both governance and control at scale.

The hyperscaler stack is the only one currently capable of identity propagation. That's a much bigger deal than it sounds, because it's the difference between "the agent can read everything" and "the agent can read what you can read."
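A toy sketch makes the identity-propagation point concrete. With propagation, the agent's effective access is the intersection of its own grants and the calling user's; without it, every caller inherits the agent's full reach. The permission sets here are illustrative, not any actual IAM API.

```python
# What the agent itself has been granted access to.
AGENT_CAN_READ = {"sales_db", "hr_db", "finance_db"}

# What individual users are allowed to read.
USER_CAN_READ = {"alice": {"sales_db"}, "bob": {"sales_db", "hr_db"}}


def readable(user: str, propagate_identity: bool) -> set:
    """Return what the agent may read on behalf of `user`."""
    if not propagate_identity:
        # Agent acts as itself: "the agent can read everything."
        return AGENT_CAN_READ
    # Identity propagation: the user's permissions travel with the call,
    # so "the agent can read what you can read."
    return AGENT_CAN_READ & USER_CAN_READ.get(user, set())
```

Without propagation, Alice can use the agent to reach HR and finance data she was never granted; with it, her query is scoped to sales_db.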

Who the customers are for each

It's important to note that the four types aren't really competing for the same customers.

The hyperscaler stack is for enterprises whose data already lives, or will live, in the cloud and who want complete governance and scalability.

The model-vendor stack is for product teams shipping fast or solo operators.

The open-source stack is for engineers who want full control.

The workflow automation stack is for non-developers automating SaaS work or automation freelancers.

So a question worth asking yourself is, where do you want to be?

I think a good bet is the hyperscaler stack, because that's where the production agent workloads of the next decade are going to live, and the people who understand how to build, govern, and operate agents inside that environment are the ones enterprises will actually pay a lot of money to hire.
