Cloud Logging for the PDE Exam: Log Types, Retention, Sinks

GCP Study Hub
August 14, 2025

Cloud Logging is one of those services that quietly underpins a lot of what shows up on the Google Cloud Professional Data Engineer exam. It rarely gets its own dedicated question block, but it threads through scenarios about pipeline observability, audit and compliance, security event handling, and cost management. If you understand what Cloud Logging actually captures, how long it keeps the data, and where you can route it, a surprising number of exam questions become straightforward.

In this post I want to walk through the parts of Cloud Logging I make sure every Professional Data Engineer candidate has nailed down before sitting the exam. That means the three core log types, the default retention behavior, log sinks and their typical destinations, long-term audit log archiving, and SIEM integration.

What Cloud Logging actually does

Cloud Logging lets you store, search, and analyze log data and events generated by your Google Cloud resources and the applications running on them. If you have been around GCP for a few years you probably remember it as Stackdriver Logging. The functionality is the same, just under a different name.

One feature worth keeping in mind for exam scenarios is its tight integration with Cloud Monitoring. You can build analytics and automated alerts on top of log data, which means an unusual error rate or a suspicious admin action can trigger a notification rather than waiting for someone to notice it manually.

The three core log types

Cloud Logging ingests three broad categories of logs, and the exam will expect you to know the difference.

  • Platform Logs originate from GCP services themselves: Compute Engine instances, Cloud SQL databases, GKE clusters, Dataflow jobs, and anything else you run on Google Cloud emits them. They tell you about the health and behavior of your cloud resources.
  • Application Logs come from your own code running on GCP. They capture application-specific events and errors and are usually what you reach for when debugging a pipeline or backend service.
  • Audit Logs record user and system activities that matter for security and compliance. Think data access, administrative changes, and API calls. When a question mentions "who did what and when," that is audit log territory.

A good mental shortcut: if the question is about service behavior, lean Platform. If it is about your own code, lean Application. If it is about security, compliance, or accountability, lean Audit.
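That shortcut can even be expressed as a rough heuristic over a log entry's `logName`. The `cloudaudit.googleapis.com` prefix for audit logs is real; treating other `*.googleapis.com` log IDs as Platform and everything else as Application is a simplification for study purposes, not an official rule.

```python
def classify_log(log_name: str) -> str:
    """Rough Platform / Application / Audit split by logName."""
    # logName looks like "projects/PROJECT/logs/LOG_ID"
    log_id = log_name.rsplit("/", 1)[-1]
    if log_id.startswith("cloudaudit.googleapis.com"):
        return "Audit"                # who did what and when
    if ".googleapis.com" in log_id:
        return "Platform"             # emitted by a GCP service itself
    return "Application"              # your own code's logs

print(classify_log("projects/p/logs/cloudaudit.googleapis.com%2Factivity"))   # Audit
print(classify_log("projects/p/logs/compute.googleapis.com%2Factivity_log"))  # Platform
print(classify_log("projects/p/logs/my-pipeline-worker"))                     # Application
```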

Retention and why the default matters

By default, Cloud Logging retains logs in the _Default bucket for 30 days. (Admin Activity audit logs live in the separate _Required bucket, which keeps them for 400 days and is not configurable.) The 30-day figure is worth memorizing because it shows up directly in exam questions and indirectly in design ones.

You can configure a custom retention period inside Cloud Logging, or you can export logs to another storage destination for longer-term retention. The reason this matters on the exam is twofold. Compliance regimes often require retention windows measured in years, not weeks, so the default is rarely enough on its own. At the same time, storing every log forever inside Cloud Logging gets expensive, so the right answer usually involves keeping the hot path short and exporting older or archival data elsewhere.
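The arithmetic behind "the default is rarely enough" is simple, and worth internalizing. A minimal sketch (the function name is mine, not an API; 30 days is the documented default for the _Default bucket):

```python
DEFAULT_RETENTION_DAYS = 30  # _Default bucket; custom retention can extend this

def needs_longer_retention(required_years: float) -> bool:
    """True when the default window cannot satisfy a compliance requirement."""
    return required_years * 365 > DEFAULT_RETENTION_DAYS

print(needs_longer_retention(7))     # True: a seven-year compliance window
print(needs_longer_retention(0.05))  # False: ~18 days fits the default
```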

Log sinks

A log sink is a routing rule: a filter that selects the log entries you care about, plus a destination to send them to. Cloud Logging evaluates every incoming entry against each sink's filter and streams matching entries out automatically.

Two configurations come up over and over in Professional Data Engineer scenarios:

  • Routing DAG execution logs from a Cloud Composer environment into BigQuery so you can query pipeline behavior with SQL.
  • Routing error logs from Compute Engine into a Cloud Storage bucket for long-term retention.

The destinations you should know cold are Cloud Storage (for archival and compliance), BigQuery (for analysis), and Pub/Sub (for streaming logs into downstream systems in real time). Pub/Sub is the one a lot of candidates skip and then regret, because it is the answer to most "real-time" log routing questions.
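The two sink configurations named above can be sketched as filter-plus-destination pairs using Cloud Logging's query language and sink destination formats. The sink names, project, dataset, and bucket names here are hypothetical placeholders; the exact filters you would use depend on your environment.

```python
# Sink 1: Cloud Composer DAG execution logs into BigQuery for SQL analysis.
composer_to_bq = {
    "name": "composer-dags-to-bq",
    "destination": "bigquery.googleapis.com/projects/my-project/datasets/composer_logs",
    "filter": 'resource.type="cloud_composer_environment" AND log_id("airflow-worker")',
}

# Sink 2: Compute Engine error logs into Cloud Storage for long-term retention.
gce_errors_to_gcs = {
    "name": "gce-errors-to-gcs",
    "destination": "storage.googleapis.com/my-error-log-bucket",
    "filter": 'resource.type="gce_instance" AND severity>=ERROR',
}

for sink in (composer_to_bq, gce_errors_to_gcs):
    print(f'{sink["name"]}: {sink["filter"]} -> {sink["destination"]}')
```

Each of these maps directly onto a `gcloud logging sinks create NAME DESTINATION --log-filter=FILTER` invocation.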

Long-term retention of audit logs

Audit log retention is a favorite scenario because it ties together a few concepts. The pattern looks like this: set up an export sink to a Cloud Storage bucket and pick a storage class that matches how often you actually need to read the data.

  • Coldline is the right call when you need the logs available for compliance but rarely touch them.
  • Archive works when access frequency drops to once a year or less, and it is the cheapest option for genuinely cold compliance data.
  • If you have multiple projects, send the audit logs into a single Cloud Storage bucket rather than a per-project setup. Centralizing makes governance and retrieval much simpler.

If you see a question asking about retaining audit logs across an organization for seven years, the answer almost always involves a sink, a central bucket, and Coldline or Archive class storage.
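The storage-class decision above boils down to access frequency. A minimal sketch (the function name is mine; the roughly-once-a-year threshold for Archive follows Google's published access-frequency guidance):

```python
def storage_class_for_audit_logs(reads_per_year: float) -> str:
    """Pick a Cloud Storage class for archived audit logs by access frequency."""
    if reads_per_year < 1:
        return "ARCHIVE"    # cheapest at rest; 365-day minimum storage duration
    return "COLDLINE"       # compliance reads a few times a year; 90-day minimum

print(storage_class_for_audit_logs(0.5))  # ARCHIVE
print(storage_class_for_audit_logs(3))    # COLDLINE
```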

SIEM integration and real-time security

The last piece worth memorizing is how Cloud Logging plugs into a Security Information and Event Management (SIEM) system. A SIEM ingests security events and logs from across an infrastructure, correlates them, and acts on potential threats.

The key word in any SIEM exam scenario is real-time. You cannot batch security events into a SIEM once an hour and expect to catch intrusions in flight. The standard pattern on GCP is to configure a log sink that routes the relevant logs, often from Security Command Center and Cloud IAM, into Pub/Sub, and then have the SIEM consume from Pub/Sub continuously. Cloud Storage and BigQuery are the wrong answers here because they are batch-oriented destinations, not real-time pipes.
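On the consuming side, a sink-to-Pub/Sub pipeline delivers each LogEntry as base64-encoded JSON in the Pub/Sub message's `data` field. Here is a sketch of what a SIEM-side handler does with one; the sample entry is hand-made for illustration, though the `protoPayload.authenticationInfo.principalEmail` and `methodName` fields are real audit log structure.

```python
import base64
import json

def handle_pubsub_message(message: dict) -> str:
    """Decode a Pub/Sub-delivered audit log entry and summarize the action."""
    entry = json.loads(base64.b64decode(message["data"]))
    actor = entry["protoPayload"]["authenticationInfo"]["principalEmail"]
    method = entry["protoPayload"]["methodName"]
    return f"{actor} called {method}"

# Hand-made sample entry standing in for what the sink would deliver.
sample_entry = {
    "protoPayload": {
        "authenticationInfo": {"principalEmail": "alice@example.com"},
        "methodName": "SetIamPolicy",
    }
}
msg = {"data": base64.b64encode(json.dumps(sample_entry).encode())}
print(handle_pubsub_message(msg))  # alice@example.com called SetIamPolicy
```

In production this function body would sit inside the SIEM's Pub/Sub subscriber callback, consuming continuously rather than in batches.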

How this shows up on the exam

When a Professional Data Engineer question mentions logging, slow down and identify three things: what kind of log is being discussed (Platform, Application, or Audit), how long it needs to be kept, and where it ultimately needs to land. Almost every logging question on the exam can be solved by picking the right log type, the right retention strategy, and the right sink destination.

My Professional Data Engineer course covers Cloud Logging end to end, including sink filters, retention strategy, and how logging fits into the broader observability and security story on GCP.
