On-Host Maintenance on Compute Engine for the PCA Exam

GCP Study Hub
Ben Makansi
January 6, 2026

On-host maintenance is one of those Compute Engine settings that looks trivial on the configuration page but quietly determines whether your VMs stay up during Google's physical hardware maintenance windows. For the Professional Cloud Architect exam, you need to know what the two options do, when each one is appropriate, and why the default exists.

What on-host maintenance actually controls

Google runs physical infrastructure underneath every Compute Engine VM, and that hardware needs maintenance. Hosts get patched, firmware gets updated, and machines occasionally have to come down so a hypervisor can be upgraded or a faulty component can be swapped out. The on-host maintenance setting on each VM tells Compute Engine what to do with your instance when the underlying physical host is about to undergo that kind of work.

You set this on the VM at creation time, and you can update it later. There are two values it can take, and the choice between them is one of the cleanest availability tradeoffs in the platform.
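As a rough illustration, the setting can be pictured as one field in the VM's scheduling configuration. The sketch below is plain Python with no Google client library. It assumes the field names `onHostMaintenance` and `automaticRestart` and the values `MIGRATE` and `TERMINATE`, which is how the Compute Engine API names the two console options as far as I know. The helper `scheduling_block` is hypothetical and exists only for this example.

```python
from typing import Dict, Union

# Assumed API spellings of "Migrate VM instance" and "Terminate VM instance".
VALID_POLICIES = ("MIGRATE", "TERMINATE")


def scheduling_block(
    on_host_maintenance: str = "MIGRATE",
    automatic_restart: bool = True,
) -> Dict[str, Union[str, bool]]:
    """Build a dict shaped like the scheduling section of an instance definition.

    Hypothetical helper: it only validates the value and returns a dict;
    it does not call any Google Cloud API.
    """
    if on_host_maintenance not in VALID_POLICIES:
        raise ValueError(
            f"on_host_maintenance must be one of {VALID_POLICIES}, "
            f"got {on_host_maintenance!r}"
        )
    return {
        "onHostMaintenance": on_host_maintenance,
        "automaticRestart": automatic_restart,
    }


if __name__ == "__main__":
    print(scheduling_block())             # the default: migrate
    print(scheduling_block("TERMINATE"))  # the alternative
```

Because the setting can take only these two values, validating it up front catches typos before an instance definition is ever submitted.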

Migrate VM instance

The default and recommended setting is Migrate VM instance. When Google needs to perform maintenance on the host, Compute Engine live migrates your VM to a different physical machine before the maintenance event begins. The instance keeps running. Memory state, network connections, and running processes all move with it. From inside the guest OS, the migration is largely invisible aside from a brief performance dip during the transfer.

For production workloads this is almost always what you want. You don't have to schedule downtime around Google's maintenance calendar, and you don't have to architect every workload to tolerate sudden host loss just to handle planned maintenance. Live migration handles it for you.

Terminate VM instance

The other option is Terminate VM instance. When the host needs maintenance, Compute Engine stops your VM. If you've configured automatic restart, the instance comes back up on a healthy host once maintenance is done. If you haven't, it stays stopped until you start it manually.
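The behaviors described so far can be summarized in a small toy model. This is a sketch for study purposes, not how Compute Engine is implemented. The names `Policy`, `Outcome`, and `after_maintenance` are hypothetical, and the state labels are plain strings chosen for readability.

```python
from enum import Enum
from typing import NamedTuple


class Policy(Enum):
    MIGRATE = "migrate"
    TERMINATE = "terminate"


class Outcome(NamedTuple):
    final_state: str   # "RUNNING" or "STOPPED" (labels for this sketch only)
    interrupted: bool  # True if the guest was stopped at any point


def after_maintenance(policy: Policy, automatic_restart: bool) -> Outcome:
    """Model what a planned host maintenance event does to one VM."""
    if policy is Policy.MIGRATE:
        # Live migration: the VM moves to another host and keeps running.
        return Outcome("RUNNING", interrupted=False)
    if automatic_restart:
        # Terminate + automatic restart: stopped, then started again.
        return Outcome("RUNNING", interrupted=True)
    # Terminate without automatic restart: stays down until started manually.
    return Outcome("STOPPED", interrupted=True)


if __name__ == "__main__":
    for policy in Policy:
        for restart in (True, False):
            print(policy.name, restart, after_maintenance(policy, restart))
```

The point the model makes is that automatic restart changes where a terminated VM ends up, but not whether it was interrupted. Only the migrate setting avoids the interruption.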

This setting is appropriate in a few specific cases. Spot and preemptible VMs use terminate behavior because they're already designed to be stopped at any moment. Workloads that pin to specific hardware features (GPUs in some configurations, certain machine types) cannot be live migrated and must terminate. And some non-critical batch or development environments are cheap enough to stop and restart that the operational simplicity is worth it.

For anything resembling a production system, though, terminate is the wrong choice. You gain nothing in return and guarantee that every maintenance event on the host causes an outage.

What to remember for the PCA exam

The Professional Cloud Architect exam expects you to default to Migrate VM instance for production workloads and to recognize the narrow situations where Terminate VM instance is required or acceptable. If a question describes a critical application that needs to remain available during Google's planned maintenance, the answer involves the migrate setting. If a question involves a workload type that fundamentally cannot be live migrated, like certain GPU configurations or spot instances, terminate is the only valid option.
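One way to rehearse that decision rule is to write it down as a function. The sketch below is a study aid under simplifying assumptions: the `Workload` fields and the `recommended_policy` helper are hypothetical, and `cannot_live_migrate` stands in for whatever hardware constraint (such as certain GPU configurations) a question describes.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass
class Workload:
    is_spot: bool = False              # Spot or preemptible VM
    cannot_live_migrate: bool = False  # e.g. certain GPU configurations
    is_production: bool = True


def recommended_policy(w: Workload) -> Tuple[str, str]:
    """Return (policy, reason) following the exam-oriented rule of thumb."""
    if w.is_spot:
        return ("TERMINATE", "Spot/preemptible VMs use terminate behavior")
    if w.cannot_live_migrate:
        return ("TERMINATE", "workload cannot be live migrated")
    if w.is_production:
        return ("MIGRATE", "stay available during planned maintenance")
    # Non-critical workloads may accept terminate, but migrate stays the default.
    return ("MIGRATE", "default; terminate is acceptable but optional")


if __name__ == "__main__":
    print(recommended_policy(Workload()))
    print(recommended_policy(Workload(is_spot=True)))
    print(recommended_policy(Workload(cannot_live_migrate=True)))
```

The hard constraints are checked first because they override everything else: a workload that cannot be live migrated has no choice, however critical it is.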

It also helps to internalize that this setting is about planned host maintenance, not unplanned hardware failure. Live migration protects you from scheduled events. Surviving a host crash still requires multi-zone or multi-region redundancy at a higher level of the architecture.

My Professional Cloud Architect course covers on-host maintenance alongside the rest of the compute material.
