Cloud Monitoring is the observability service for Google Cloud, and one of its most useful features is its alerting system. You define a condition - a metric, a threshold, and a duration - and Cloud Monitoring fires a notification when that condition is met. For the Associate Cloud Engineer exam, you need to understand how alerting policies work, what kinds of metrics they can watch, and how notification channels connect the alert to the people who need to act on it.
An alerting policy is a rule that defines when Cloud Monitoring should fire a notification. Every policy has three main components: a condition, a notification channel, and optionally some documentation that gets included in the alert message.
The condition is where you define what you are watching. You pick a metric - CPU utilization on a Compute Engine instance, HTTP response latency on a Cloud Run service, error rate on an App Engine application - and you specify a threshold. When the metric crosses that threshold for a defined period, the condition is considered to be met and the alert fires.
The notification channel is how you get notified. Cloud Monitoring supports email, SMS, Slack, PagerDuty, and other destinations. You can attach multiple notification channels to a single policy so that different people or systems are notified at the same time.
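To make that structure concrete, here is a rough sketch of a policy in the JSON form the Cloud Monitoring API uses. The display names, threshold, and channel ID are placeholders, and a real policy definition carries more fields than this:

{
  "displayName": "High CPU on web tier",
  "documentation": {
    "content": "CPU has been high for several minutes. Check recent deployments or resize the instance.",
    "mimeType": "text/markdown"
  },
  "conditions": [
    {
      "displayName": "CPU utilization above threshold",
      "conditionThreshold": {
        "filter": "metric.type=\"compute.googleapis.com/instance/cpu/utilization\" AND resource.type=\"gce_instance\"",
        "comparison": "COMPARISON_GT",
        "thresholdValue": 0.8,
        "duration": "300s"
      }
    }
  ],
  "combiner": "OR",
  "notificationChannels": [
    "projects/PROJECT_ID/notificationChannels/CHANNEL_ID"
  ]
}

The combiner field only matters when a policy has more than one condition; with a single condition it has no practical effect.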
The slides in my Associate Cloud Engineer course list several typical alerting scenarios. CPU utilization exceeding a threshold like 80 or 90 percent is the most common example. This is straightforward: if a VM is consistently using nearly all of its CPU, something is wrong or the instance needs to be resized.
Response time thresholds are used for latency-sensitive applications. If the 99th percentile latency on an API crosses 500ms, you want to know. Cloud Monitoring lets you alert on percentile metrics, not just averages, which is important because averages hide tail latency issues.
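Under the hood this is an aggregation choice on the condition: instead of averaging the latency metric, the per-series aligner picks a percentile. A sketch of the relevant fragment - the exact metric type depends on the service, but latency metrics on Google Cloud are generally distributions, which is what makes the percentile aligners available:

"aggregations": [
  {
    "alignmentPeriod": "60s",
    "perSeriesAligner": "ALIGN_PERCENTILE_99"
  }
]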
Error rate spikes are another common pattern. If your application logs or Cloud Run metrics show that the error rate is climbing above a normal baseline, an alert fires before users start reporting problems.
You can also alert on downward trends. A sudden drop in active users or requests per second can indicate that something is broken even if no errors are being logged - the traffic stopped reaching the service entirely.
Notification channels are configured separately from alerting policies. You create a channel once - for example, an email address or a Slack webhook - and then reference that channel in one or more policies. This means you can reuse the same channel across multiple alerts without reconfiguring it each time.
For the Associate Cloud Engineer exam, the key notification channels to know are email, SMS, and third-party integrations like Slack and PagerDuty. Email is the default and requires the least setup. PagerDuty is typically used in on-call rotation scenarios where someone needs to be paged immediately regardless of the time. Slack is common in teams where developers monitor a shared alert channel during working hours.
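Channels can be managed from the gcloud CLI as well as the console. A sketch of creating and listing an email channel - the display name and address are placeholders, and these commands sit in the beta component at the time of writing:

gcloud beta monitoring channels create \
    --display-name="Ops team email" \
    --type=email \
    --channel-labels=email_address=ops-team@example.com

gcloud beta monitoring channels list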
One detail that the exam occasionally tests is the concept of alert duration, also called the alerting window or condition duration. When you configure a condition, you can specify that the threshold must be exceeded for a certain period before the alert fires. This prevents noisy alerts from brief transient spikes.
For example, you might set a CPU utilization alert to fire only if CPU stays above 90 percent for 5 minutes. A single-second spike would not trigger it, but a sustained high-CPU condition would. This is the right default behavior for most infrastructure alerts because short spikes are often harmless and self-correcting.
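In the JSON form of the policy, that five-minute requirement is the duration field on the condition. A sketch of the fragment, with the same CPU utilization filter as in the earlier example omitted for brevity and the metric aligned over one-minute windows:

"conditionThreshold": {
  "comparison": "COMPARISON_GT",
  "thresholdValue": 0.9,
  "duration": "300s",
  "aggregations": [
    { "alignmentPeriod": "60s", "perSeriesAligner": "ALIGN_MEAN" }
  ]
}

Note that thresholdValue is 0.9 rather than 90: the CPU utilization metric reports a fraction between 0 and 1, even though the console displays it as a percentage.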
You can create alerting policies from the Cloud Console, but the gcloud CLI also supports alert management:
gcloud alpha monitoring policies list
Most teams use the Cloud Console or Terraform for managing alerting policies because the policy definition format is complex JSON. The gcloud command is more useful for listing, describing, and deleting existing policies than for creating new ones.
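That said, if you already have a policy definition in a file, the same alpha command group can create, inspect, and remove policies. A sketch, assuming a policy.json in the format shown earlier and placeholder project and policy IDs:

gcloud alpha monitoring policies create --policy-from-file=policy.json
gcloud alpha monitoring policies describe projects/PROJECT_ID/alertPolicies/POLICY_ID
gcloud alpha monitoring policies delete projects/PROJECT_ID/alertPolicies/POLICY_ID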
On the Associate Cloud Engineer exam, alerting questions almost always involve a scenario where a team needs visibility into a problem before users report it. The question describes a situation - a VM that is running hot, an API that is getting slow, a service that is throwing errors - and asks how to get proactive notification.
The answer pattern is consistent: create an alerting policy in Cloud Monitoring, define a condition based on the relevant metric with an appropriate threshold, and attach a notification channel that reaches the right people. If the scenario involves a compliance team that needs email notifications, email is the answer. If it involves an operations team with a 24/7 on-call rotation, PagerDuty is typically the answer.
One distinction the exam tests: Cloud Monitoring alerts are for operational metrics, not for log-based events. If a question describes alerting on a specific log message pattern, the answer involves a log-based metric in Cloud Logging, which can then be used as a condition in a Cloud Monitoring alerting policy. The two services work together but serve different purposes.
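The Cloud Logging side of that pattern is a user-defined log-based metric that counts entries matching a filter. The resulting metric shows up in Cloud Monitoring under the logging.googleapis.com/user/ prefix and can back an alert condition like any other metric. A sketch with a hypothetical metric name and filter:

gcloud logging metrics create payment_failures \
    --description="Count of payment failure log entries" \
    --log-filter='severity>=ERROR AND textPayload:"payment failed"'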
Cloud Monitoring supports custom metrics beyond the built-in infrastructure and service metrics. Your application can emit custom metrics through the Cloud Monitoring API or through the Ops Agent, and you can create alerting policies that watch those metrics just like any built-in metric.
Common custom metrics include business-level signals like orders processed per minute, queue depth, or number of active sessions. These are useful for alerting on application health in a way that infrastructure metrics cannot capture. A VM might look healthy by CPU and memory measures while the application it runs is silently failing to process requests.
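In an alerting policy, a custom metric is referenced through its custom.googleapis.com/ prefix, so the condition looks the same as for a built-in metric. A sketch using a hypothetical orders-per-minute metric, alerting on the downward case described earlier:

"conditionThreshold": {
  "filter": "metric.type=\"custom.googleapis.com/orders_processed\" AND resource.type=\"global\"",
  "comparison": "COMPARISON_LT",
  "thresholdValue": 10,
  "duration": "600s"
}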
My Associate Cloud Engineer course covers Cloud Monitoring and alerting in the context of the full observability section, including how alerts interact with log-based metrics and uptime checks.