Preemptible (Spot) Instances on Compute Engine for the PCA Exam

December 12, 2025

Spot instances (long called "preemptible" instances, and the exam still uses both terms) are one of the most teachable cost-optimization tools on Compute Engine. They are also a near-guaranteed appearance on the Professional Cloud Architect exam, usually as a "pick the cheapest option that still meets the workload's requirements" question. The trick is recognizing when the workload tolerates interruption and when it does not.

What a Spot instance actually is

When you create a VM on Compute Engine, one of the choices you make under the availability policy is the provisioning model. You pick between a Standard instance and a Spot (preemptible) instance. A Standard instance is not subject to preemption and is billed at the normal on-demand price. A Spot instance is the same hardware at a steep discount (Google advertises 60 to 91 percent off), with one important catch: Google can reclaim that VM at any time to give the capacity to a higher-priority workload. When that happens, you get roughly a 30-second warning on a best-effort basis, and then the instance is stopped or deleted, depending on the termination action you configured.

That is the entire trade-off. Cheaper compute in exchange for the right of Google to take the machine back. Everything you need to know about when to use them flows from that one fact.
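
To make the 30-second warning concrete, here is a minimal sketch of how code running on a VM might check whether it has been flagged for preemption. It assumes the Compute Engine metadata server's instance/preempted key, which returns TRUE or FALSE. The function names and the injectable fetch parameter are hypothetical, added so the logic can be exercised off-VM.

```python
import urllib.request

# Compute Engine metadata server key that reports preemption for this VM.
PREEMPTED_URL = (
    "http://metadata.google.internal/computeMetadata/v1/instance/preempted"
)


def fetch_preempted_flag(url=PREEMPTED_URL, timeout=2.0):
    """Return the raw metadata value ("TRUE" or "FALSE") as a string.

    Only works when run on a Compute Engine VM; the Metadata-Flavor
    header is required by the metadata server.
    """
    request = urllib.request.Request(url, headers={"Metadata-Flavor": "Google"})
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read().decode("utf-8").strip()


def should_shut_down(fetch=fetch_preempted_flag):
    """True once the metadata server reports that this VM is being preempted.

    `fetch` is injectable (a hypothetical convenience) so the check can be
    tested without a real metadata server.
    """
    return fetch().strip().upper() == "TRUE"
```

A worker loop could call should_shut_down() between units of work and flush its state to durable storage as soon as it returns True. A shutdown script is another common way to react to the same event.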

Workloads that fit Spot instances well

Spot is a strong fit when the work is compute-heavy, can be checkpointed or restarted, and does not depend on persistent state living on the instance itself. A few categories the exam tends to draw from:

Rendering and media encoding. An animation studio rendering 3D frames can checkpoint progress per frame. If a Spot VM gets reclaimed mid-job, the next instance picks up where the last one left off. The dollar savings on a long render are substantial.
Data analysis, batch predictions, and simulations. These workloads usually already write intermediate results somewhere durable, so an interruption costs you a partial chunk of work and not the whole job.
Hadoop and Spark clusters. Distributed compute frameworks are designed to handle node failure. Losing a worker means the framework reschedules its tasks. You get more compute power for the same budget, and the occasional reclamation is just another node failure.
CI/CD pipelines. Build and test runners are stateless and re-runnable. If a build is interrupted, the pipeline retries it. The cost savings on burst capacity for a busy engineering team add up quickly.

The common thread is interruption tolerance plus the ability to resume from somewhere durable. If the answer to "what happens if Google reclaims this VM in the middle of the job" is "we lose the work since the last checkpoint and try again," Spot is a good fit.
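
The checkpoint-and-resume pattern behind these workloads can be sketched in a few lines. This is an illustration, not a real renderer: render_job, the JSON checkpoint format, and the interrupted callback are all hypothetical names, and checkpoint_path is assumed to point at durable storage (a mounted bucket or a persistent disk, for example), not the VM's local disk.

```python
import json
import os


def load_next_frame(checkpoint_path):
    """Return the index of the first frame that still needs rendering."""
    if not os.path.exists(checkpoint_path):
        return 0
    with open(checkpoint_path) as f:
        return json.load(f)["next_frame"]


def save_next_frame(checkpoint_path, next_frame):
    """Write progress to a temp file, then rename it over the checkpoint.

    The rename keeps a reclaim in the middle of a write from leaving a
    half-written checkpoint behind.
    """
    tmp_path = checkpoint_path + ".tmp"
    with open(tmp_path, "w") as f:
        json.dump({"next_frame": next_frame}, f)
    os.replace(tmp_path, checkpoint_path)


def render_job(total_frames, checkpoint_path, render_frame,
               interrupted=lambda: False):
    """Render the remaining frames; return True if the whole job finished."""
    start = load_next_frame(checkpoint_path)
    for frame in range(start, total_frames):
        if interrupted():
            # Stop cleanly; a replacement VM resumes from the checkpoint.
            return False
        render_frame(frame)
        save_next_frame(checkpoint_path, frame + 1)
    return True
```

If the VM is reclaimed, the next instance calls render_job with the same checkpoint path and skips every frame that was already recorded as done. At most one frame of work is repeated.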

Workloads that should never run on Spot

The flip side is anything that needs continuous, real-time uptime, where an interruption is not just inconvenient but actively breaks the product:

Real-time multiplayer games. Players cannot tolerate a 30-second eviction notice mid-match. The game session ends, and the player experience is ruined.
High-frequency trading. Latency requirements are measured in microseconds and the workload demands guaranteed availability. A reclamation event during market hours is unacceptable.
Live video streaming. A live sports broadcast or concert stream that drops because the VM was reclaimed is a public failure that damages the service.
Critical healthcare systems. Patient monitoring and emergency response cannot pause for a 30-second termination warning. The safety risk outweighs any cost savings.

The pattern here is the inverse of the good-fit list. The work demands continuous uptime, the state lives on the running instance, and an interruption causes user-visible damage. For these, you pay for Standard instances.

How this shows up on the PCA exam

Spot questions on the Professional Cloud Architect exam almost always look the same. You are given a workload description and asked for the most cost-effective option. The decision tree is short:

Does the workload tolerate being terminated with only about 30 seconds of warning? If no, Standard.
Can the workload checkpoint or be retried from somewhere other than the VM's local disk? If no, Standard.
Is the workload compute-bound batch work, distributed analytics, rendering, or CI? If yes, Spot is almost certainly the right answer.
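
The three questions above can be written down as a small function. This is only a study aid: the Workload fields and the function name are hypothetical, and the final "Unclear" fallback is an assumption, since the decision tree does not say what to pick when a workload passes the first two checks but is not batch-style.

```python
from dataclasses import dataclass


@dataclass
class Workload:
    # Can it be terminated with only about 30 seconds of warning?
    tolerates_short_notice_termination: bool
    # Can it checkpoint or retry from somewhere other than the VM's local disk?
    resumable_from_durable_storage: bool
    # Compute-bound batch work, distributed analytics, rendering, or CI?
    is_batch_style: bool


def recommend_provisioning_model(w: Workload) -> str:
    """Walk the three exam questions in order and return a recommendation."""
    if not w.tolerates_short_notice_termination:
        return "Standard"
    if not w.resumable_from_durable_storage:
        return "Standard"
    if w.is_batch_style:
        return "Spot"
    # Assumption: not covered by the decision tree, so flag it for a closer read.
    return "Unclear"
```

For example, a nightly render farm that checkpoints per frame would be Workload(True, True, True), which comes back as "Spot", while a live game server fails the first question and comes back as "Standard".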

The exam will sometimes try to trip you up by mixing signals. A "data processing pipeline" sounds like Spot at first glance, but if the question stresses real-time SLAs or per-event latency requirements, that pipeline needs Standard. Read for the words "batch," "checkpoint," "fault-tolerant," "interruption-tolerant," and "cost-optimize" as Spot signals. Read for "real-time," "uninterrupted," "high availability," "low latency," and "mission-critical" as Standard signals.
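
As a practice drill, the signal words can be turned into a rough keyword scan. This is a sketch only: plain substring matching is crude, the function names are hypothetical, and letting Standard signals win over Spot signals is an assumption drawn from the mixed-signals example above.

```python
SPOT_SIGNALS = (
    "batch",
    "checkpoint",
    "fault-tolerant",
    "interruption-tolerant",
    "cost-optimize",
)
STANDARD_SIGNALS = (
    "real-time",
    "uninterrupted",
    "high availability",
    "low latency",
    "mission-critical",
)


def find_signals(question):
    """Return the signal words found in a question, grouped by direction."""
    text = question.lower()
    return {
        "spot": [s for s in SPOT_SIGNALS if s in text],
        "standard": [s for s in STANDARD_SIGNALS if s in text],
    }


def lean(question):
    """Suggest which way a question leans based on its signal words."""
    found = find_signals(question)
    if found["standard"]:
        # Assumption: any Standard signal overrides Spot signals.
        return "Standard"
    if found["spot"]:
        return "Spot"
    return "Unclear"
```

Running lean() on a question stem that mentions both "cost-optimized" and "real-time SLAs" returns "Standard", which mirrors the pipeline example. It is no substitute for reading the full scenario.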

One naming detail worth knowing

Google introduced Spot VMs a few years back as the successor to preemptible VMs, but the older "preemptible" terminology still appears in documentation and on the exam. Treat them as the same thing for exam purposes. The pricing model and the 30-second termination warning are identical. The one behavioral difference worth remembering is that legacy preemptible VMs are always stopped after 24 hours, while Spot VMs have no maximum runtime. If you see either word on a Professional Cloud Architect question, you are looking at essentially the same product.

My Professional Cloud Architect course covers Spot instances alongside the rest of the compute material.