Pub/Sub Message Lifecycle for the PDE Exam: Retention, Snapshots, Seek

GCP Study Hub
August 30, 2025

Pub/Sub questions on the Professional Data Engineer exam tend to cluster around the same few mechanics, and the message lifecycle is the spine that ties them together. If you can describe what happens to a message from publication through deletion, and where retention, snapshots, and seek fit into that flow, most of the trickier scenario questions become a lot easier to answer.

I want to walk through the lifecycle the way I think about it on the exam, then layer in the three replay and recovery features that the Professional Data Engineer blueprint loves to test.

The five-step Pub/Sub message lifecycle

Every Pub/Sub message moves through the same sequence. Memorizing this order makes it much easier to spot wrong answers on scenario questions.

  • Topic creation. A topic is named and made available inside Pub/Sub. Think of it as setting up the channel that will hold messages.
  • Publication. A publisher sends a message to that topic. The publisher could be an application, a microservice, or a fleet of sensors. Pub/Sub holds the message in a queue, waiting to serve it through a subscription.
  • Receipt. A subscriber either pulls the message from a subscription attached to the topic or has the message pushed to it by Pub/Sub. Both delivery modes feed into the same lifecycle.
  • Acknowledgment. The subscriber acknowledges to Pub/Sub that it received the message. This ack is what tells Pub/Sub the work is done.
  • Deletion. Once the ack lands, Pub/Sub deletes the message from the queue. The default behavior is that an acknowledged message is gone.

The exam likes to attack this last step. Candidates assume an acknowledged message is unrecoverable, but the retention and seek features can change that assumption. That is exactly where the next three concepts come in.
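To make the five steps concrete, here is a minimal in-memory sketch of the default flow. It is a toy model for studying the order of operations, not the Pub/Sub client library, and every name in it (ToyTopic, ToySubscription, pull, ack) is hypothetical.

```python
class ToySubscription:
    """Stand-in for one subscription's backlog of unacked messages."""

    def __init__(self):
        self.backlog = {}  # message_id -> payload


class ToyTopic:
    """Stand-in for a topic that fans messages out to its subscriptions."""

    def __init__(self, name):
        self.name = name
        self.subscriptions = []
        self.next_id = 0

    def publish(self, payload):
        # Step 2: publication. Each attached subscription gets the message.
        message_id = self.next_id
        self.next_id += 1
        for sub in self.subscriptions:
            sub.backlog[message_id] = payload
        return message_id


def pull(sub):
    # Step 3: receipt. Hand back the oldest outstanding message, if any.
    if not sub.backlog:
        return None
    message_id = min(sub.backlog)
    return message_id, sub.backlog[message_id]


def ack(sub, message_id):
    # Steps 4 and 5: acknowledgment, then deletion (the default behavior).
    sub.backlog.pop(message_id, None)


topic = ToyTopic("orders")        # Step 1: topic creation
sub = ToySubscription()
topic.subscriptions.append(sub)

topic.publish("order-1001")
received = pull(sub)
ack(sub, received[0])
remaining = len(sub.backlog)
```

In this sketch the ack is the only thing that removes a message, which is the default assumption the retention and seek features change.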

Message retention: topic vs subscription

Message retention duration is the time period for which Pub/Sub keeps messages before deleting them. The whole point of retention is to make messages available for replay later, whether for reprocessing, auditing, or recovering from a bad downstream change.

There are two flavors of retention, and the Professional Data Engineer exam expects you to pick the right one for the scenario.

  • Topic retention. Messages are retained on the topic even after every subscriber has acknowledged them. This is what lets you reprocess history long after the original consumers moved on. Topic retention is off by default. When you enable it, you can set it anywhere from 10 minutes up to 31 days.
  • Subscription retention. By default, only unacknowledged messages are retained, and only for that particular subscription. This is the safety net for a subscriber that goes down and needs to catch up once it recovers. The subscription default is 7 days, with a max of 31 days. A subscription can also be configured to retain acknowledged messages for the same duration.

The defaults matter. If a question describes a team that wants to replay messages that were already acknowledged and finds nothing to replay, the trap is usually that topic retention was never enabled (and the subscription was not set to retain acknowledged messages), so acked messages were deleted as usual. If a question describes a subscriber that was down for longer than the subscription retention duration, the unacked messages that aged out are gone as well. Topic retention holds acked messages on the topic. Subscription retention holds unacked messages for a subscription. They solve different problems.
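One way I keep the two flavors straight is to write the rule down as a small function. The sketch below is a simplified model under stated assumptions: the function name and parameters are hypothetical, durations are passed in explicitly rather than read from any real configuration, and it ignores the option of retaining acknowledged messages on the subscription itself.

```python
from datetime import timedelta
from typing import Optional


def still_retained(age: timedelta, acked: bool,
                   subscription_retention: timedelta,
                   topic_retention: Optional[timedelta]) -> bool:
    """Toy rule: is a message of this age still held for delivery or replay?"""
    # Topic retention keeps a message regardless of its ack state.
    if topic_retention is not None and age <= topic_retention:
        return True
    # In this simplified model, subscription retention covers unacked messages only.
    return (not acked) and age <= subscription_retention


week = timedelta(days=7)

# Acked three days ago, no topic retention configured: nothing left to replay.
acked_without_topic = still_retained(timedelta(days=3), True, week, None)

# Same message, but the topic retains for 14 days: still available.
acked_with_topic = still_retained(timedelta(days=3), True, week, timedelta(days=14))

# Unacked, but older than the subscription retention: aged out.
unacked_too_old = still_retained(timedelta(days=10), False, week, None)

# Unacked and inside the subscription retention: waiting for the subscriber.
unacked_recent = still_retained(timedelta(days=3), False, week, None)
```

The four calls line up with the two exam traps: the first and second differ only in whether topic retention is set, and the third and fourth differ only in how long the subscriber was away.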

Snapshots: a recovery point for a subscription

A snapshot captures a specific state of a subscription as a recovery point for potential future use. The mental model I use is a photo of the subscription at a moment in time, where Pub/Sub remembers exactly which messages were acked and which were not.

The classic use case is creating a known good state to revert to before a major change. If I am about to roll out a big update to my processing logic, I take a snapshot of the subscription first. If the new code mishandles messages, I can return the subscription to that earlier acknowledgment state and reprocess everything that was unacknowledged at the time the snapshot was taken, along with anything published since.

For the Professional Data Engineer exam, the trigger phrase is usually something like "before deploying a new version of the pipeline" or "in case the new processing logic has a bug." When you see that framing, snapshots are almost always part of the answer.
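The photo analogy can be sketched in a few lines. This is a toy model, not the real snapshot API: ToySnapshot and take_snapshot are hypothetical names, and message IDs are plain integers assumed to increase in publish order.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ToySnapshot:
    unacked_ids: frozenset    # messages outstanding when the snapshot was taken
    last_published_id: int    # anything published later also counts as unacked


def take_snapshot(unacked_ids, last_published_id):
    # Copy the ack state so later acks cannot change the recovery point.
    return ToySnapshot(frozenset(unacked_ids), last_published_id)


# Before the deploy: messages 0..4 were published, 3 and 4 are not yet acked.
unacked = {3, 4}
snap = take_snapshot(unacked, last_published_id=4)

# After the deploy: the new code acks 3 and 4, then 5 and 6 as they arrive.
unacked.clear()
last_published = 6

# What would need to be redelivered if the subscription went back to the snapshot.
published_since = set(range(snap.last_published_id + 1, last_published + 1))
to_replay = sorted(snap.unacked_ids | published_since)
```

The frozen copy is the point of the sketch: whatever the buggy deploy acknowledges afterward, the recovery point still remembers that messages 3 and 4 were outstanding.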

Seek: replaying acknowledged messages in bulk

The Seek feature is what lets you actually use a snapshot or a retention window. Seek lets you change the acknowledgment state of messages, including already-acknowledged messages, in bulk. That is the line that surprises people. Acked does not have to mean gone.

There are two ways to seek.

  • Seek to a snapshot. The subscription is returned to the acknowledgment state captured in that snapshot. Combined with the snapshot you took before the bad deploy, this is your rollback button.
  • Seek to a time. Every message Pub/Sub received (that is, every message published) before that time is marked acknowledged. Every message Pub/Sub received after that time is marked unacknowledged and becomes eligible for redelivery. This is the play when there is no snapshot but you know roughly when things went wrong.

Seek only works within the retention window. If a message is no longer being retained, there is nothing to replay. That coupling between retention duration and the seek feature is the piece the exam loves to test. Retention determines how far back you can go. Seek determines how you navigate within that window.
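The coupling between retention and seek is easier to remember once you see it as code. The following is a minimal sketch with hypothetical names (ToyMessage, seek_to_time), not the real seek call. It assumes a single retention duration and treats a message published exactly at the target time as unacknowledged, which is a choice made for this toy model.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass
class ToyMessage:
    message_id: int
    publish_time: datetime
    acked: bool = True  # assume everything was already acknowledged


def seek_to_time(messages, target, now, retention):
    """Toy seek: reset ack state around `target` for retained messages only."""
    oldest_replayable = now - retention
    redelivered = []
    for msg in messages:
        if msg.publish_time < oldest_replayable:
            continue  # outside the retention window, so nothing to replay
        # Published before the target: acked. At or after the target: unacked.
        msg.acked = msg.publish_time < target
        if not msg.acked:
            redelivered.append(msg.message_id)
    return redelivered


now = datetime(2025, 8, 30, 12, 0, tzinfo=timezone.utc)
messages = [
    ToyMessage(0, now - timedelta(days=10)),
    ToyMessage(1, now - timedelta(days=3)),
    ToyMessage(2, now - timedelta(hours=1)),
]
retention = timedelta(days=7)

# Seek to one day ago: only the message from the last hour is replayed.
recent_only = seek_to_time(messages, now - timedelta(days=1), now, retention)

# Seek to 30 days ago: message 0 is past retention, so it still cannot come back.
as_far_as_possible = seek_to_time(messages, now - timedelta(days=30), now, retention)
```

The second call is the exam point in miniature: asking for 30 days of history only returns what the 7 day window still holds.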

How the three features fit together for the exam

When I read a Pub/Sub scenario on the Professional Data Engineer exam, I run through the same checklist:

  • Is the question about preserving messages after they have been acknowledged? That is topic retention.
  • Is the question about a subscriber going down and needing to catch up on missed messages? That is subscription retention.
  • Is the question about rolling back to a known good processing state before a planned change? That is a snapshot, and then a seek to that snapshot when something goes wrong.
  • Is the question about replaying messages from some point in time without a snapshot? That is seek to a time, constrained by your retention window.

If you can hold those four mappings in your head, you can usually narrow most Pub/Sub questions down to two answer choices before reading the options. The remaining work is just confirming the defaults: topic retention is off by default with a 31 day max, subscription retention is 7 days by default with a 31 day max, and seek can target either a snapshot or a timestamp inside the retention window.

My Professional Data Engineer course covers the full Pub/Sub message lifecycle, including retention, snapshots, and seek, alongside the rest of the streaming ingestion topics on the exam.
