
Memorystore for Redis questions on the Professional Data Engineer exam often shift away from "how do I provision a cache?" and toward "how do I know my cache is healthy, and how do I know when it's time to scale?" That second question is where Cloud Monitoring comes in, and it's the part most candidates underprepare for. In this article I want to walk through the Redis metrics that matter on Memorystore, the alerting policies you should be able to describe on exam day, and the scaling decision that almost always comes up when memory pressure rises.
Memorystore for Redis is a fully managed cache, but "managed" does not mean "hands off." The Professional Data Engineer exam expects you to treat the cache like any other production data system. That means you need observability into how Redis is performing, you need alerting policies wired into Cloud Monitoring, and you need a clear rule for when the instance has outgrown its current tier or size. The exam tends to test this through scenario questions where an application's tail latency is climbing or eviction rates are spiking, and the right answer almost always involves a specific metric, not a vibe.
Memorystore publishes Redis metrics directly to Cloud Monitoring under the redis.googleapis.com namespace, so you do not have to install any agent or sidecar. The metrics are there as soon as the instance is provisioned. Your job is to know which ones to watch.
There are a handful of metrics I expect every Professional Data Engineer candidate to recognize. These are the ones I would memorize before exam day.
The list is short: System Memory Usage Ratio, eviction count, CPU utilization, and connection count, plus the instance's maxmemory-policy setting, which governs what Redis evicts under pressure. Of these, System Memory Usage Ratio is the headline metric: Google's guidance is to alert at 80% and treat that as the trigger to upgrade. The single most testable point on this topic is the alerting policy on System Memory Usage Ratio. The pattern is straightforward. You create an alerting policy in Cloud Monitoring, point it at the Memorystore Redis metric, and configure the condition to fire when the ratio is greater than or equal to 0.8 for a sustained window. From there you wire it to a notification channel so your on-call rotation actually hears about it.
If you are setting this up via the gcloud CLI, the shape of the command looks like this. The alpha monitoring surface does change, so verify the exact flag names against gcloud alpha monitoring policies create --help before relying on it.
gcloud alpha monitoring policies create \
--notification-channels=projects/PROJECT_ID/notificationChannels/CHANNEL_ID \
--display-name="Redis Memory Usage 80%" \
--condition-display-name="System Memory Usage Ratio >= 0.8" \
--condition-filter='metric.type="redis.googleapis.com/stats/memory/system_memory_usage_ratio" resource.type="redis_instance"' \
--if='>= 0.8' \
--duration=300s

The five minute duration matters. You do not want a single momentary spike to wake up your on-call. You want a sustained signal that pressure is building.
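To make the "sustained window" idea concrete, here is a minimal Python sketch of the kind of condition the alert evaluates over one-minute samples. The function name and shape are mine, not part of any Google library; real Cloud Monitoring evaluates this server side.

```python
def sustained_breach(samples, threshold=0.8, window=5):
    """True if the threshold is met or exceeded for `window` consecutive
    samples, e.g. five one-minute readings ~ a 300s alert duration."""
    run = 0
    for value in samples:
        # Count consecutive samples at or above the threshold;
        # any dip below resets the streak.
        run = run + 1 if value >= threshold else 0
        if run >= window:
            return True
    return False

# A lone spike to 0.92 does not fire; five sustained readings at 0.8+ do.
print(sustained_breach([0.55, 0.92, 0.60, 0.58, 0.62, 0.57]))  # False
print(sustained_breach([0.78, 0.81, 0.83, 0.85, 0.86, 0.88]))  # True
```

The reset-on-dip behavior is exactly why the duration protects your on-call rotation: only pressure that holds for the full window fires the policy.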
This is where the Professional Data Engineer exam tries to trip you up. When System Memory Usage Ratio crosses 80%, Google's recommendation is to upgrade. That can mean either of two things: scaling the instance to a larger memory size, or moving to a higher tier, from Basic to Standard, which also buys you replication and automatic failover.
The wrong answers on exam questions tend to be "flush the cache," "lower the maxmemory setting," or "add more application instances." None of those address the underlying capacity problem. The right answer is almost always to grow the instance.
Memory is the primary scaling trigger, but it is not the only one. If eviction count climbs while memory ratio sits well below 80%, that usually points to an over-aggressive maxmemory-policy or misconfigured TTLs rather than a sizing problem. If CPU saturates before memory does, you may benefit from a higher tier with more vCPU per shard rather than just more memory. And if connection count is hitting the per-tier ceiling, the fix is application-side connection pooling, not a bigger Redis box.
For exam scenarios, map the symptom to the metric, and map the metric to the action. Memory ratio high means upgrade size or tier. Eviction high with memory low means tune the policy. Latency high with CPU pinned means scale tier. Connection count near ceiling means pool on the client.
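That mapping is mechanical enough to write down. A sketch in Python, where the function name, boolean inputs, and return strings are my own shorthand rather than exam answer text:

```python
def memorystore_action(memory_ratio, evictions_high=False,
                       cpu_pinned=False, conns_near_ceiling=False):
    """Map the symptom to the expected action. Checks are ordered:
    memory pressure wins, since it is the primary scaling trigger."""
    if memory_ratio >= 0.8:
        return "upgrade instance size or tier"
    if evictions_high:
        return "tune maxmemory-policy and TTLs"
    if cpu_pinned:
        return "scale to a higher tier"
    if conns_near_ceiling:
        return "pool connections on the client"
    return "no action needed"

print(memorystore_action(0.85))                       # upgrade instance size or tier
print(memorystore_action(0.45, evictions_high=True))  # tune maxmemory-policy and TTLs
```

Notice that eviction tuning only wins when memory is genuinely low; high evictions at high memory ratio is still a capacity problem.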
If you remember three things from this article, make them these. First, the System Memory Usage Ratio metric in Cloud Monitoring is the canonical signal for Memorystore Redis health. Second, the 80% threshold with a sustained alert is the textbook configuration Google expects you to know. Third, the response to that alert is to upgrade the instance to a larger size or higher tier, not to patch around it.
My Professional Data Engineer course covers Memorystore monitoring alongside the rest of the operational and observability topics on the exam, with worked examples for the alerting policies you are most likely to see in scenario questions.