Managed Instance Groups for the PCA Exam

November 21, 2025

Managed Instance Groups (MIGs) are one of the most heavily tested compute topics on the Professional Cloud Architect (PCA) exam. They turn Compute Engine from a service where you babysit individual VMs into one that scales, heals, and updates itself. If you understand four pieces of MIGs well, you can answer most of the compute questions the exam throws at you: instance templates, autoscaling, autohealing, and automatic restart.

What a MIG actually does for you

Without a Managed Instance Group, every VM in Google Compute Engine is its own snowflake. You create it, configure it, patch it, and replace it by hand. That works for one VM. It does not work for a fleet.

A MIG layers automation on top of Compute Engine. It rolls out updates while keeping the service available, scales the number of instances up or down to match demand, makes sure every VM in the group is identical, distributes VMs across zones for high availability when you make the MIG regional, and integrates with load balancers so traffic gets spread across the fleet. The exam tests all of that, but the questions almost always come back to one of the four pieces I just mentioned.

Instance templates are the blueprint

An instance template is a reusable configuration object. You define machine type, boot disk, image, network settings, and startup script once, and the template captures all of it. The MIG then uses the template to spin up identical VMs whenever it needs more capacity or needs to replace one.

A few details worth knowing for the Professional Cloud Architect exam:

You can build a template from a custom image, which is how you bake application code, agents, and OS hardening into the template instead of relying on a startup script.
Instance templates are not exclusive to MIGs. You can create a standalone VM from a template too, which is useful when you want consistency without needing the rest of the MIG features.
If you need to change the configuration of a fleet, you do not edit the template in place. Instance templates are immutable, so you create a new template and roll the MIG over to it.

If a question describes a fleet of VMs that all need to be configured the same way, the answer is almost always an instance template feeding a MIG.
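To make the "new template, then roll over" rule concrete, here is a minimal Python sketch. It is a toy model, not the Compute Engine API: the InstanceTemplate and ManagedInstanceGroup classes, the roll_to method, and the template and image names are all hypothetical, and a real MIG would replace VMs gradually instead of swapping a pointer.

```python
from dataclasses import dataclass, replace, FrozenInstanceError


@dataclass(frozen=True)
class InstanceTemplate:
    """Hypothetical stand-in for an instance template (cannot change once created)."""
    name: str
    machine_type: str
    image: str


@dataclass
class ManagedInstanceGroup:
    """Hypothetical stand-in for a MIG that points at one template."""
    name: str
    template: InstanceTemplate

    def roll_to(self, new_template: InstanceTemplate) -> None:
        # Toy behavior: just point the group at the new template.
        self.template = new_template


v1 = InstanceTemplate(name="web-v1", machine_type="e2-medium", image="web-image-1")
mig = ManagedInstanceGroup(name="web-mig", template=v1)

# Editing in place is rejected, mirroring the "no in-place edits" rule.
try:
    v1.machine_type = "e2-standard-4"
    edited_in_place = True
except FrozenInstanceError:
    edited_in_place = False

# The supported path: derive a new template, then roll the group over to it.
v2 = replace(v1, name="web-v2", machine_type="e2-standard-4")
mig.roll_to(v2)
```

The old template is left untouched, which is also what makes rolling back simple: point the group at web-v1 again.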

Autoscaling: min, max, and the metric

Autoscaling adjusts the number of instances in the group based on demand. You give the MIG a minimum and a maximum number of replicas, and the autoscaler keeps the group sized somewhere in that range based on a signal you choose.

The default signal is CPU utilization. You set a target CPU percentage and the autoscaler adds or removes VMs to keep the group around that target. CPU is the right answer for a lot of generic workloads, but the autoscaler also supports:

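HTTP(S) load balancing serving capacity, which scales based on how loaded the backend is from the load balancer's perspective.
Cloud Pub/Sub backlog, which is the right call when you have a worker fleet pulling messages off a subscription. In practice this is a Cloud Monitoring metric, such as the number of undelivered messages on the subscription.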
Custom metrics from Cloud Monitoring, which lets you scale on request latency, queue depth in your own system, or anything else you can publish to Monitoring.
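The min, max, and target idea can be sketched in a few lines of Python. This is a simplified model under my own assumptions, not the autoscaler's actual algorithm: the function name recommended_size is made up, and the real autoscaler also applies initialization and stabilization periods that this ignores.

```python
def recommended_size(cpu_percent_per_vm, target_percent, min_replicas, max_replicas):
    """Toy sizing rule: enough VMs that average CPU would sit at the target."""
    if target_percent <= 0:
        raise ValueError("target_percent must be positive")
    total_load = sum(cpu_percent_per_vm)
    # Integer ceiling division avoids floating-point rounding surprises.
    wanted = -(-total_load // target_percent)
    # The group is always kept inside the configured min/max range.
    return max(min_replicas, min(max_replicas, wanted))


# Four VMs running hot at 90% with a 60% target: scale out to 6.
busy = recommended_size([90, 90, 90, 90], target_percent=60, min_replicas=2, max_replicas=10)

# The same four VMs idling at 20%: scale in, but never below the minimum.
quiet = recommended_size([20, 20, 20, 20], target_percent=60, min_replicas=2, max_replicas=10)

# Eight VMs at 95% would want 13 replicas, but the maximum caps it at 10.
capped = recommended_size([95] * 8, target_percent=60, min_replicas=2, max_replicas=10)
```

The same shape applies to the other signals: swap CPU percent for whatever metric you scale on and the target for the per-VM value you want to hold.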

The Professional Cloud Architect exam likes to test the use cases where MIG autoscaling is the obvious answer. Watch for variable workloads: scaling test environments up and down, gaming workloads with spiky traffic, and web apps that get hammered during business hours and go quiet at night. If a question describes any of those patterns, MIG with autoscaling is on the short list.

Memory-based scaling has a trap

If you scale a MIG on memory utilization instead of CPU, you have to be careful about which memory states you include in the metric. Linux reports memory in several categories, and the exam can lean on this.

The four states that matter:

Used: memory actively held by running applications.
Buffered: memory used to buffer temporary I/O operations.
Cached: memory holding frequently accessed file data.
Slab: kernel-allocated memory for OS structures.

The mistake is filtering on Used memory only. That number underreports total memory pressure because it ignores Buffered, Cached, and Slab. The autoscaler then sees low memory usage even when the system is actually loaded, and it under-scales. To get accurate memory-based autoscaling, the metric has to sum across all of those states.
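A quick worked example shows how far apart the two numbers can be. The byte counts below are invented for illustration, and memory_percent is a hypothetical helper, not a Cloud Monitoring call.

```python
GIB = 1024 ** 3

# Invented sample for a 16 GiB VM, broken down by memory state.
sample = {
    "used": 4 * GIB,
    "buffered": 1 * GIB,
    "cached": 6 * GIB,
    "slab": 1 * GIB,
    "free": 4 * GIB,
}


def memory_percent(bytes_by_state, states):
    """Percent of total memory held by the given states."""
    total = sum(bytes_by_state.values())
    counted = sum(bytes_by_state[state] for state in states)
    return 100 * counted / total


# Filtering on Used only makes the VM look mostly idle.
used_only = memory_percent(sample, ["used"])

# Summing Used, Buffered, Cached, and Slab shows the fuller picture.
all_states = memory_percent(sample, ["used", "buffered", "cached", "slab"])
```

With these numbers the Used-only metric reads 25% while the summed metric reads 75%, so a 60% target would trigger scaling only in the second case.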

Health check initial delay matters

This is one of the most quietly important details in the entire MIG feature set, and it shows up on the Professional Cloud Architect exam in over-provisioning questions.

When you turn on autoscaling and autohealing, the MIG runs health checks against every VM. If a VM fails its health check, the MIG marks it unhealthy and creates a replacement. That is the desired behavior for a VM that genuinely crashed. It is the wrong behavior for a VM that is still booting.

If failed probes start counting against a VM before the application has finished initializing, the MIG thinks the VM is unhealthy and creates another one to compensate. The replacement boots just as slowly and fails the same way. Now you have over-provisioning: more instances than you need, all because the health check was impatient.

The fix is the initial delay setting, which lives in the MIG's autohealing policy alongside the health check. Set the initial delay to exceed the time your VMs typically take to become fully operational. The MIG ignores failed health checks until the VM has had a fair chance to come up, and it stops getting bad signals during boot.

If a Professional Cloud Architect exam question describes a MIG that keeps creating extra VMs even though traffic is normal, the initial delay is almost always the answer.
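Here is a small Python simulation of why the initial delay matters. It is a toy model under stated assumptions: probes only count once the initial delay has elapsed, every probe before the app is ready fails, and the VM is recreated after a fixed number of consecutive failures. The function name and the default interval and threshold are mine, not Compute Engine defaults.

```python
def recreated_during_boot(boot_seconds, initial_delay, check_interval=10, unhealthy_threshold=3):
    """Return True if a healthy-but-slow VM would be recreated before it finishes booting."""
    if check_interval <= 0:
        raise ValueError("check_interval must be positive")
    consecutive_failures = 0
    probe_time = initial_delay  # first probe that counts
    while probe_time < boot_seconds:
        # The app is not up yet, so this probe fails.
        consecutive_failures += 1
        if consecutive_failures >= unhealthy_threshold:
            return True
        probe_time += check_interval
    # The app came up before enough failures accumulated.
    return False


# App needs 90 seconds to start; probes count from second 0: recreated mid-boot.
impatient = recreated_during_boot(boot_seconds=90, initial_delay=0)

# Same app, initial delay longer than boot time: left alone.
patient = recreated_during_boot(boot_seconds=90, initial_delay=120)
```

Note that a delay slightly shorter than boot time can still be safe in this model, because one or two failed probes stay under the threshold. Setting the delay above the typical boot time removes the guesswork.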

Autohealing replaces broken VMs

Autohealing is what gives a MIG its self-healing behavior. A health check service monitors every VM in the group. When a VM fails its health check, autohealing deletes the unhealthy VM and creates a new one from the instance template to take its place.

A few things to keep in mind:

Autohealing requires a health check. Without one, the MIG has no way to know a VM is broken.
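The replacement VM is created from an instance template. By default that is the template the original VM was created from, which is the MIG's current template unless a rollout is still in progress. That is why the template matters so much: every healed VM is a fresh copy of whatever the template says.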
Autohealing recreates the VM. It does not try to fix the existing one.

The combination of an instance template and autohealing is what makes a MIG behave like a self-maintaining fleet. You declare the desired state once and the MIG keeps the group matching it.
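The "declare once, keep matching" idea can be sketched as a single reconciliation pass. This is a toy model with hypothetical names (VM, autoheal, the generation counter), not how the MIG is implemented; it assumes every VM in the group was created from the same template.

```python
from dataclasses import dataclass


@dataclass
class VM:
    """Hypothetical VM record; generation counts how many times it was recreated."""
    name: str
    template: str
    healthy: bool = True
    generation: int = 0


def autoheal(vms, template):
    """One pass: keep healthy VMs, replace unhealthy ones with fresh copies."""
    healed = []
    for vm in vms:
        if vm.healthy:
            healed.append(vm)
        else:
            # Recreate from the template instead of trying to fix the broken VM.
            healed.append(VM(name=vm.name, template=template, generation=vm.generation + 1))
    return healed


group = [
    VM(name="web-1", template="web-v1"),
    VM(name="web-2", template="web-v1", healthy=False),
    VM(name="web-3", template="web-v1"),
]
after = autoheal(group, template="web-v1")
```

After the pass the group is the same size, every VM is healthy, and only web-2 has been rebuilt.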

Automatic restart is the lighter-weight cousin

Automatic restart is a separate, per-VM setting. If a VM crashes or is stopped by something other than a user action, such as a hardware failure, Compute Engine brings it back up, and the VM is operational again without anyone touching it.

Two things to remember about automatic restart:

It works on standalone VMs and on VMs inside a MIG. You do not need a MIG to use it.
It restarts the existing VM. It does not delete and recreate it the way autohealing does.

The mental model that helps on the exam: automatic restart is for transient failures where a reboot fixes the problem. Autohealing is for harder failures where you want a fresh VM from the template. Many MIGs use both.
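The difference between the two is easiest to see on a single VM. The sketch below is a toy model with hypothetical names, and it assumes a plain, non-stateful boot disk, so anything changed by hand on the VM is lost when the VM is recreated.

```python
from dataclasses import dataclass, field


@dataclass
class VM:
    """Hypothetical VM record; local_changes stands for drift on its disk."""
    name: str
    template: str
    running: bool = True
    local_changes: list = field(default_factory=list)


def automatic_restart(vm):
    """Reboot the same VM: whatever was on its disk is still there."""
    vm.running = True
    return vm


def autoheal_recreate(vm, template):
    """Discard the VM and build a fresh one from the template."""
    return VM(name=vm.name, template=template)


crashed = VM(
    name="web-1",
    template="web-v1",
    running=False,
    local_changes=["hand-edited config file"],
)

restarted = automatic_restart(crashed)            # same VM, drift preserved
recreated = autoheal_recreate(crashed, "web-v1")  # new VM, back to the template
```

If the drift is what broke the VM, a reboot brings the problem straight back, which is the case where recreating from the template is the better answer.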

How this fits together for the exam

If you can recognize the four building blocks in a question, MIG questions get fast. Templates define the configuration. Autoscaling sizes the group based on demand. Autohealing replaces unhealthy VMs. Automatic restart reboots VMs that crash. The traps to watch for are using the wrong scaling metric, filtering memory metrics on Used only, and setting the health check initial delay too short.

My Professional Cloud Architect course covers Managed Instance Groups alongside the rest of the compute material.
