Running a single Compute Engine VM is simple enough. Running a fleet of them reliably - with automatic scaling, health monitoring, and the ability to update software without downtime - requires more infrastructure logic. That is what Managed Instance Groups, or MIGs, provide. A MIG is a group of VM instances that are managed as a single entity, with Google Cloud handling the operational tasks that would otherwise require manual intervention or custom automation. For the Associate Cloud Engineer exam, MIGs are an important topic, and you need to understand instance templates, the core automated behaviors, and how updates work.
Every MIG is built from an instance template. An instance template is a reusable configuration blueprint that defines everything about the VMs the MIG will create: machine type, operating system image, disk size and type, network settings, service account, and any startup scripts. When the MIG needs to create a new VM - whether for scaling or replacing an unhealthy one - it uses the instance template to produce a VM with an identical configuration to every other VM in the group.
This consistency is one of the main benefits of a MIG. Without instance templates, the VMs in a fleet tend to drift apart over time as people make manual changes. With a MIG, every VM is created from the same instance template, so the fleet starts from a known, uniform configuration, which makes troubleshooting much simpler and deployments more predictable.
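To make the blueprint idea concrete, here is a minimal Python sketch. It is a toy model, not the Compute Engine API: the `InstanceTemplate` class, the `create_vm` helper, and the example values (`web-v1`, `e2-medium`, `debian-12`) are all assumptions chosen for illustration. The dataclass is frozen to mirror the fact that a real instance template cannot be edited after it is created.

```python
from dataclasses import dataclass, FrozenInstanceError


@dataclass(frozen=True)
class InstanceTemplate:
    """Illustrative stand-in for an instance template (hypothetical, not the real API)."""
    name: str
    machine_type: str
    image: str
    disk_size_gb: int
    startup_script: str = ""


def create_vm(template: InstanceTemplate, vm_name: str) -> dict:
    """Build a VM description whose configuration comes entirely from the template."""
    return {
        "name": vm_name,
        "template": template.name,
        "machine_type": template.machine_type,
        "image": template.image,
        "disk_size_gb": template.disk_size_gb,
        "startup_script": template.startup_script,
    }


template_v1 = InstanceTemplate("web-v1", "e2-medium", "debian-12", 20)
fleet = [create_vm(template_v1, f"web-{i}") for i in range(3)]

# Apart from its name, every VM carries exactly the same configuration.
configs = [{k: v for k, v in vm.items() if k != "name"} for vm in fleet]
assert all(cfg == configs[0] for cfg in configs)

# Editing the template in place is rejected; a change means a new template.
try:
    template_v1.machine_type = "e2-standard-4"
except FrozenInstanceError:
    print("template is immutable; create a new one instead")
```

The only thing that differs between the VMs is the name, which is the property that makes a replacement VM interchangeable with the one it replaces.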
Autoscaling is the ability to add or remove VM instances from the group in response to changing demand. You configure a minimum and maximum number of instances, and the autoscaler adjusts the actual count within that range based on a scaling signal.
The default scaling signal is CPU utilization. You set a target CPU utilization - say 60 percent - and the autoscaler adds instances when average CPU across the group is above that target and removes instances when it drops below. This keeps your application from being either under-provisioned during traffic spikes or over-provisioned during quiet periods.
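The arithmetic behind target-based scaling can be sketched in a few lines of Python. This is a simplified model, not the autoscaler's actual algorithm (the real one also smooths the signal over time and scales in cautiously). The function name `recommended_size` and the choice to express utilization as whole percentages are assumptions made for the example.

```python
def recommended_size(current_size: int, avg_cpu_pct: int, target_cpu_pct: int,
                     min_size: int, max_size: int) -> int:
    """Estimate how many VMs bring average CPU back to the target.

    Simplified model: total load is current_size * avg_cpu_pct, and we want
    that load spread so each VM sits at or below target_cpu_pct.
    """
    if current_size < 1 or target_cpu_pct < 1:
        raise ValueError("current_size and target_cpu_pct must be positive")
    total_load = current_size * avg_cpu_pct
    # Integer ceiling division avoids floating-point rounding surprises.
    desired = (total_load + target_cpu_pct - 1) // target_cpu_pct
    # The autoscaler never leaves the configured min/max range.
    return max(min_size, min(max_size, desired))


# 4 VMs at 90% CPU with a 60% target: scale out to 6.
print(recommended_size(4, 90, 60, min_size=2, max_size=10))
# 4 VMs at 30% CPU with a 60% target: scale in to 2.
print(recommended_size(4, 30, 60, min_size=2, max_size=10))
```

Note how the minimum and maximum act as hard bounds: a traffic spike cannot push the group past `max_size`, and a quiet period cannot shrink it below `min_size`.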
Beyond CPU utilization, you can configure autoscaling based on metrics from Cloud Monitoring. Request latency, Pub/Sub queue depth, or any custom metric your application exports can be used as a scaling signal. For many workloads this is a better fit than CPU - scaling on queue depth, for example, keeps a batch processing fleet sized to the amount of work waiting rather than to the current CPU usage of the VMs.
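As a rough illustration of queue-based scaling, the sketch below sizes a fleet from the backlog alone. It is a simplified model under stated assumptions: `size_for_backlog` is a hypothetical helper, and `messages_per_vm` stands for whatever per-instance backlog you decide one VM can reasonably work through.

```python
def size_for_backlog(backlog: int, messages_per_vm: int,
                     min_size: int, max_size: int) -> int:
    """Pick a fleet size so each VM owns at most messages_per_vm queued messages."""
    if messages_per_vm < 1:
        raise ValueError("messages_per_vm must be positive")
    # Ceiling division: a partial batch of work still needs a whole VM.
    desired = (backlog + messages_per_vm - 1) // messages_per_vm
    return max(min_size, min(max_size, desired))


for backlog in (0, 1_000, 10_000):
    print(backlog, "->", size_for_backlog(backlog, 150, min_size=1, max_size=20))
```

CPU never appears in this calculation, which is the point: workers that spend their time waiting on I/O can show low CPU while the queue keeps growing.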
Autohealing is how a MIG detects and replaces unhealthy VMs. You define a health check - an HTTP, HTTPS, or TCP probe that the MIG uses to determine whether each VM is responding normally. If a VM fails a configurable number of consecutive probes, the MIG considers it unhealthy and recreates it from the instance template.
This means the MIG self-repairs without human intervention. If a VM's application process crashes and the health check starts failing, the MIG handles the replacement automatically. Combined with load balancer integration, traffic is automatically routed away from the unhealthy instance while the replacement boots up.
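The detection rule can be sketched as a toy model in Python. It assumes that a VM is flagged only after a run of consecutive failed probes; the function names and the sample probe histories are invented for the example and are not part of any Google Cloud library.

```python
def is_unhealthy(probe_results: list, unhealthy_threshold: int) -> bool:
    """Return True once the most recent `unhealthy_threshold` probes all failed."""
    if unhealthy_threshold < 1:
        raise ValueError("unhealthy_threshold must be at least 1")
    recent = probe_results[-unhealthy_threshold:]
    # Not enough history yet means no verdict; any success resets the streak.
    return len(recent) == unhealthy_threshold and not any(recent)


def vms_to_recreate(probe_history: dict, unhealthy_threshold: int) -> list:
    """List the VMs that this model would hand to autohealing."""
    return [name for name, probes in probe_history.items()
            if is_unhealthy(probes, unhealthy_threshold)]


# True = probe succeeded, False = probe failed (oldest result first).
history = {
    "web-0": [True, True, True],
    "web-1": [True, False, False, False],  # three failures in a row
    "web-2": [False, True, False],         # flaky, but no streak
}
print(vms_to_recreate(history, unhealthy_threshold=3))
```

Requiring a streak rather than a single failure keeps one slow response from triggering an unnecessary replacement.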
When you need to deploy a new version of your application to a MIG, you create a new instance template (templates cannot be edited once created), point the MIG at it, and trigger a rolling update. The MIG replaces VMs one at a time or in small batches, depending on configuration. When surge instances are allowed, each new VM is created from the new template before an old one is deleted. This ensures that a portion of the fleet keeps running the current version throughout the update, maintaining availability.
With a surge of one instance and no instances allowed to be unavailable, the rollout follows a defined sequence. The MIG creates one new VM from the new template - a surge instance - while keeping all existing VMs running. Once the new VM passes health checks and is considered healthy, the MIG deletes one of the old VMs. It continues this pattern, adding new VMs and removing old ones in sequence, until all VMs are running the new version.
This approach limits the blast radius if the new version has a problem. If the first new VM immediately fails health checks, the rolling update can be paused or rolled back before the entire fleet is affected.
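The create-then-delete sequence described above can be traced with a small simulation. This is a sketch under stated assumptions, not the MIG's actual updater: it models a surge of one with no unavailable instances, represents each VM by its version string, and uses a hypothetical `is_healthy` callback in place of a real health check.

```python
def rolling_update(fleet, new_version, is_healthy=lambda version: True):
    """Simulate a surge-one rolling update.

    Returns (final_fleet, history), where history records the fleet after
    every create or delete step. Stops early if a new VM is unhealthy.
    """
    fleet = list(fleet)
    history = [list(fleet)]
    while any(version != new_version for version in fleet):
        # Step 1: create the surge VM while every existing VM keeps serving.
        fleet.append(new_version)
        history.append(list(fleet))
        # Step 2: wait for health; a failing new VM halts the rollout here.
        if not is_healthy(new_version):
            break
        # Step 3: only now is one old VM deleted.
        old_index = next(i for i, v in enumerate(fleet) if v != new_version)
        fleet.pop(old_index)
        history.append(list(fleet))
    return fleet, history


final, steps = rolling_update(["v1", "v1", "v1"], "v2")
for step in steps:
    print(step)

# A bad release: the first surge VM never becomes healthy.
stalled, _ = rolling_update(["v1", "v1", "v1"], "v2", is_healthy=lambda v: False)
print(stalled)
```

In the healthy run the fleet size alternates between three and four and never drops below three. In the failing run all three old VMs are still serving when the rollout stops, which is the limited blast radius in practice.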
A zonal MIG deploys all instances in a single zone. A regional MIG distributes instances across multiple zones within a region. Regional MIGs provide better availability because a zone-level outage does not take down the entire fleet. For production workloads where availability matters, regional MIGs are the right default.
The tradeoff is that regional MIGs are slightly more complex to configure, and cross-zone traffic can add latency and cost if your application makes many internal calls between instances. For most web-facing applications where high availability outweighs these concerns, regional MIGs are appropriate.
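The availability difference is easy to see with a little arithmetic. The Python sketch below assumes an even spread of instances across zones and uses made-up zone names; `distribute` and `surviving_after_outage` are hypothetical helpers, not part of any Google Cloud library.

```python
def distribute(total_instances: int, zones: list) -> dict:
    """Spread instances as evenly as possible across the given zones."""
    if not zones:
        raise ValueError("at least one zone is required")
    base, extra = divmod(total_instances, len(zones))
    # The first `extra` zones each take one additional instance.
    return {zone: base + (1 if i < extra else 0) for i, zone in enumerate(zones)}


def surviving_after_outage(placement: dict, failed_zone: str) -> int:
    """Count the instances left running if one zone goes down."""
    return sum(count for zone, count in placement.items() if zone != failed_zone)


zonal = distribute(7, ["us-central1-a"])
regional = distribute(7, ["us-central1-a", "us-central1-b", "us-central1-c"])

print(zonal, "->", surviving_after_outage(zonal, "us-central1-a"))
print(regional, "->", surviving_after_outage(regional, "us-central1-a"))
```

With seven instances, losing the zone takes the zonal group to zero, while the regional group keeps four running even when its most heavily loaded zone fails.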
The Associate Cloud Engineer exam presents MIG questions in scenarios about scaling, reliability, and deployment. A scenario describing a web application that needs to handle variable traffic automatically maps to MIG autoscaling. A scenario about an application that needs to recover automatically from VM failures maps to MIG autohealing with a health check. A scenario about deploying a software update without downtime maps to a MIG rolling update.
The exam also tests instance templates as a concept separate from MIGs. Understanding that instance templates define what gets created is important for deployment questions. So are two related facts: a template cannot be edited after creation, so a new configuration means a new template, and VMs that are already running keep their old configuration until an update replaces them.
My Associate Cloud Engineer course covers Managed Instance Groups in detail in the Compute Engine section, including autoscaling, autohealing, and the deployment strategies that appear on the Associate Cloud Engineer exam.