Publishers, Subscribers, Topics, and Subscriptions in Pub/Sub for the PDE Exam

GCP Study Hub
August 25, 2025

Pub/Sub questions on the Professional Data Engineer exam almost always hinge on four words: publisher, subscriber, topic, and subscription. If you can keep these straight under time pressure, you can answer most streaming ingestion questions on the exam without overthinking them. I want to walk through what each term means, how they fit together, and the specific patterns the exam likes to test.

The four core terms

Pub/Sub is short for publisher/subscriber. The whole service is built around that pairing, and the name itself tells you the model. A publisher is any entity that creates and sends messages into Pub/Sub. That could be an application, a microservice, an IoT device, or a Cloud Function reacting to some event. The publisher does not need to know who, if anyone, will read the message. It just hands the message off and moves on.

A subscriber is the entity on the other side that receives messages from Pub/Sub. When a subscriber gets a message, it processes the message and acknowledges receipt so Pub/Sub knows not to redeliver it. Like the publisher, the subscriber does not need to know anything about who produced the data. This is the decoupling that makes Pub/Sub useful in the first place. Publishers and subscribers can be added, removed, scaled up, or scaled down without coordinating with each other.

A topic is the named resource that publishers send messages to. Think of a topic as a channel or a category. Publishers do not address messages to specific subscribers. They publish to a topic, and the topic becomes the central handoff point. Topics are how you organize streams of related data, like one topic for clickstream events, another for IoT temperature readings, another for transaction logs.

A subscription is the named resource that represents the stream of messages flowing from a topic to a subscriber. Creating a subscription is like signing up to receive every message posted to a particular topic. Subscribers pull from or get pushed messages through subscriptions, not directly from topics. The subscription is the glue between the topic and the consumer.
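The relationships among the four resources map directly onto gcloud commands. The resource names below (`clickstream-events`, `analytics-sub`) are placeholders for illustration, not anything the exam or the service requires:

```shell
# Create a topic: the named channel publishers send to.
gcloud pubsub topics create clickstream-events

# Attach a subscription to the topic; subscribers consume
# through this resource, never directly from the topic.
gcloud pubsub subscriptions create analytics-sub --topic=clickstream-events

# A publisher addresses the topic, not any particular subscriber.
gcloud pubsub topics publish clickstream-events --message='{"page": "/home"}'

# A subscriber pulls from its subscription.
gcloud pubsub subscriptions pull analytics-sub --auto-ack --limit=1
```

Note the asymmetry: publishing references only the topic, while pulling references only the subscription. That asymmetry is the decoupling described above.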

How they fit together

The mental picture I use is a single topic in the middle with publishers on one side and subscriptions on the other. Multiple publishers can send to the same topic, and multiple subscriptions can attach to that same topic. Each subscription then feeds one or more subscribers.

The detail the Professional Data Engineer exam loves to probe is what happens when you have multiple subscriptions on a single topic. The answer is simple but easy to get wrong if you have not thought about it: each subscription gets its own independent copy of every message. If a topic has four subscriptions attached, every message published to that topic is delivered four times, once per subscription. It does not matter how many subscribers there are. The fan-out happens at the subscription layer.
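The fan-out rule is easy to verify with a toy model. The sketch below is plain Python, not the Pub/Sub client library; it only illustrates the delivery semantics, with each subscription modeled as its own queue:

```python
from collections import deque


class Topic:
    """Minimal in-memory model of Pub/Sub fan-out (not the real API)."""

    def __init__(self):
        self.subscriptions = []

    def create_subscription(self):
        # Each subscription buffers its own independent copy of messages.
        sub = deque()
        self.subscriptions.append(sub)
        return sub

    def publish(self, message):
        # Fan-out happens at the subscription layer: one publish call
        # produces one delivery per attached subscription.
        for sub in self.subscriptions:
            sub.append(message)


topic = Topic()
analytics = topic.create_subscription()
archive = topic.create_subscription()
alerts = topic.create_subscription()

topic.publish("order_placed")

# All three subscriptions now hold their own copy of the single message.
print([list(s) for s in (analytics, archive, alerts)])
# → [['order_placed'], ['order_placed'], ['order_placed']]
```

One message in, three deliveries out, because three subscriptions are attached. Adding a fourth subscription would make it four deliveries without touching the publisher.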

This is a powerful pattern. You can have one team consuming the topic for real-time analytics through one subscription, another team archiving the same messages to BigQuery through a second subscription, and a third subscription feeding an alerting pipeline. None of them interfere with each other, and none of them affect the publishers.

Why this decoupling matters on the exam

The Professional Data Engineer exam frequently presents scenarios where you need to add a new consumer to an existing data pipeline without disrupting current consumers, or where you need to scale producers and consumers independently. The right answer almost always involves a topic with multiple subscriptions, not multiple topics or some kind of shared queue.

A few patterns worth memorizing:

  • Adding a new consumer without changing publishers: create a new subscription on the existing topic. The publishers do not need to know it exists.
  • Multiple independent processing pipelines on the same data: one topic, one subscription per pipeline. Each pipeline gets a full copy of the stream.
  • Load-balancing work across many workers in one pipeline: one topic, one subscription, and multiple subscriber clients pulling from that same subscription. Pub/Sub distributes messages across the workers.
  • Scaling publishers independently of subscribers: this is automatic. Publishers do not block on subscribers, and a slow subscriber does not slow down a publisher.

Notice the difference between the second and third bullet. Two pipelines that each need every message means two subscriptions. One pipeline with parallel workers sharing the load means one subscription with multiple subscribers attached. Mixing these up is a common exam trap.
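The contrast between those two bullets can be sketched in plain Python. This is a simplification, not the client library: the round-robin loop stands in for Pub/Sub's actual distribution across subscriber clients, which makes no ordering or balance guarantees.

```python
import itertools
from collections import deque

messages = ["m1", "m2", "m3", "m4"]

# Pattern 1: two subscriptions on one topic.
# Each pipeline receives a full, independent copy of the stream.
pipeline_a = deque(messages)
pipeline_b = deque(messages)

# Pattern 2: one subscription shared by three workers.
# The messages are divided among the workers, not duplicated.
shared_sub = deque(messages)
workers = {"w1": [], "w2": [], "w3": []}
for name in itertools.cycle(workers):  # stand-in for Pub/Sub's distribution
    if not shared_sub:
        break
    workers[name].append(shared_sub.popleft())

print(len(pipeline_a) + len(pipeline_b))      # 8 deliveries: 4 messages x 2 subscriptions
print(sum(len(v) for v in workers.values()))  # 4 deliveries: workers split one copy
```

Same four messages, but the subscription count determines the total number of deliveries; the subscriber count within one subscription only determines how that single copy is divided.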

Things publishers and subscribers do not have to worry about

Because the topic sits in the middle, publishers do not need to track who is reading, how many readers there are, whether readers are online, or how fast readers process messages. Pub/Sub buffers messages and retains them until they are acknowledged, up to the message retention duration configured on the subscription (seven days by default).

Subscribers, on the other hand, do not need to know who produced the data or how many producers exist. They just consume from their subscription. If a producer goes offline, the subscriber keeps working through whatever is in the queue. If a new producer comes online, the subscriber starts seeing new messages automatically.

This is the architectural property the exam is testing when it asks about decoupling, scalability, or independent scaling. The four terms are the vocabulary you need to give the right answer.

Quick recap before the exam

Publisher sends. Subscriber receives. Topic is where publishers send to. Subscription is the named pipe from a topic to a subscriber. Each subscription gets its own copy of every message. Multiple publishers can write to one topic. Multiple subscriptions can read from one topic. Subscribers attached to the same subscription share the load.

If you have those sentences locked in, the Pub/Sub vocabulary questions on the Professional Data Engineer exam should be quick wins.

My Professional Data Engineer course covers Pub/Sub publishers, subscribers, topics, and subscriptions in depth, along with the streaming patterns the exam tests.
