Dataflow Windowing for the PCA Exam: Tumbling, Hopping, Session

GCP Study Hub
Ben Makansi
December 8, 2025

Streaming data never stops arriving. If you want to compute an average, a sum, or a count over a stream, you need a way to chop the unbounded flow into finite chunks. That is what windowing does in Dataflow, and it is one of the topics I get the most questions about from people studying for the Professional Cloud Architect exam.

Dataflow gives you three window types: tumbling (also called fixed), hopping (also called sliding), and session-based. Each one chunks the stream differently, and each one is the right answer for a different kind of question. I will walk through all three, then point out the distinctions the PCA exam tends to test.

Why windowing exists

A streaming pipeline has no natural end. Events keep coming in, and any aggregation you want to perform has to be scoped to some interval. You cannot wait for the stream to finish before you compute an average, because it never finishes. Windowing lets you say "give me the average over this slice of time, then start a new slice."

The slice itself can be defined three different ways. It can be a fixed clock interval. It can be a fixed interval that overlaps with the slices around it. Or it can be defined dynamically by gaps in the data itself. Those three definitions correspond to the three window types.

Tumbling windows

Tumbling windows have three properties: fixed duration, non-overlapping, and sequential. You set a window length, say 30 minutes, and Dataflow divides the stream into back-to-back 30-minute buckets. The first window covers 12:00 to 12:30, the next covers 12:30 to 1:00, the next covers 1:00 to 1:30, and so on.

Every event lands in exactly one window. There is no overlap, and there are no gaps. This is the window type you want when you need clean, mutually exclusive aggregates. Hourly request counts, daily revenue totals, and five-minute error rates on a dashboard are all tumbling-window computations.
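
The bucketing rule can be sketched in a few lines of plain Python. This is a minimal illustration, not Dataflow code: the function name `tumbling_window` and the choice to align buckets to the Unix epoch are assumptions made for the example. In the Apache Beam SDK that Dataflow runs, the corresponding window function is `FixedWindows`.

```python
from datetime import datetime, timedelta

EPOCH = datetime(1970, 1, 1)  # assumed alignment point for window boundaries


def tumbling_window(event_time, size):
    """Return the half-open [start, end) tumbling window containing event_time."""
    # Distance from the event back to the most recent multiple of `size`.
    offset = (event_time - EPOCH) % size
    start = event_time - offset
    return start, start + size


start, end = tumbling_window(datetime(2025, 12, 8, 12, 47), timedelta(minutes=30))
print(start.time(), end.time())  # 12:30:00 13:00:00
```

The window is half-open in this sketch, so an event stamped exactly 12:30 belongs to the 12:30 to 1:00 bucket and not to the one before it. That is what keeps every event in exactly one window.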

Hopping windows

Hopping windows also have a fixed duration, but they overlap. You set both the window size and a hop interval, which controls how often a new window starts. If the window is 30 minutes and the hop is 5 minutes, then a new 30-minute window opens every 5 minutes. The window opening at 12:00 covers 12:00 to 12:30. The window opening at 12:05 covers 12:05 to 12:35. The window opening at 12:10 covers 12:10 to 12:40.

Each event ends up in multiple windows because the windows overlap. With a 30-minute window and a 5-minute hop, every event belongs to six windows. That is the point. Hopping windows are useful when you want to compute a metric frequently, but each computation should look back over a longer period.

A common example is a moving average on a stock price. You might want to update the displayed average every minute, but the average itself should reflect the last 10 minutes of trades. That is a hopping window with a 10-minute size and a 1-minute hop. The fine-grained refresh rate gives you up-to-date numbers, while the larger window smooths out short-term noise.
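
To make the overlap concrete, here is a small plain-Python sketch that lists every hopping window a single event falls into. It is an illustration under assumptions, not Dataflow code: the name `hopping_windows` and the epoch alignment are made up for the example. The matching window function in the Apache Beam SDK is `SlidingWindows`, which takes a size and a period.

```python
from datetime import datetime, timedelta

EPOCH = datetime(1970, 1, 1)  # assumed alignment point for window starts


def hopping_windows(event_time, size, hop):
    """Return every half-open [start, end) window containing event_time, earliest first."""
    # Latest window start at or before the event, aligned to multiples of `hop`.
    start = event_time - (event_time - EPOCH) % hop
    windows = []
    # Step back one hop at a time while the window still covers the event.
    while start + size > event_time:
        windows.append((start, start + size))
        start -= hop
    return list(reversed(windows))


# 30-minute windows that open every 5 minutes: an event at 12:32 is in 6 of them.
wide = hopping_windows(datetime(2025, 12, 8, 12, 32),
                       timedelta(minutes=30), timedelta(minutes=5))
print(len(wide))  # 6

# 10-minute windows that open every minute: each trade feeds 10 averages.
trade = hopping_windows(datetime(2025, 12, 8, 9, 30, 20),
                        timedelta(minutes=10), timedelta(minutes=1))
print(len(trade))  # 10
```

When the size is a whole multiple of the hop, the number of windows per event is simply size divided by hop. It is worth keeping that ratio in mind, because it is also how many times each event is counted across the overlapping aggregates.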

Session-based windows

Session-based windows do not have a fixed duration at all. They are defined by a gap duration. You tell Dataflow "start a new window whenever there is a gap of at least N minutes with no events," and the window length becomes whatever it ends up being based on the data.

If you set the gap duration to 5 minutes and events keep arriving every 4 minutes and 59 seconds, Dataflow treats them all as one session. As soon as 5 full minutes pass with no event, the current session closes, and the next event starts a new one.

This is the model Google Analytics uses to define a user session on a website. The default Google Analytics gap is 30 minutes. If a user is active, then idle for 25 minutes, then active again, that is one session. If they go idle for 35 minutes and then come back, that is two sessions.

Session windows are dynamic, user-centric, and built around natural groupings in the data. They are the right choice whenever the meaningful unit of analysis is "a burst of activity" rather than "a fixed clock interval."
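
The gap rule can be sketched in plain Python as well. This is a simplified batch illustration that assumes all of one user's timestamps are already in hand; the function name `sessionize` is hypothetical, and a real streaming pipeline would build sessions incrementally per key. In the Apache Beam SDK the corresponding window function is `Sessions`, which takes the gap size.

```python
from datetime import datetime, timedelta


def sessionize(event_times, gap):
    """Group timestamps into sessions, splitting wherever the idle time is >= gap."""
    sessions = []
    for t in sorted(event_times):
        # Extend the current session only if this event arrives within the gap.
        if sessions and t - sessions[-1][-1] < gap:
            sessions[-1].append(t)
        else:
            sessions.append([t])
    return sessions


day = datetime(2025, 12, 8)
clicks = [
    day.replace(hour=9, minute=0),
    day.replace(hour=9, minute=10),
    day.replace(hour=9, minute=35),   # 25 minutes idle: same session
    day.replace(hour=10, minute=10),  # 35 minutes idle: new session
    day.replace(hour=10, minute=15),
]
result = sessionize(clicks, timedelta(minutes=30))
print([len(s) for s in result])  # [3, 2]
```

Notice that neither session has a length set in advance. The first runs 35 minutes and the second runs 5, purely because of where the gaps in the data fell.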

How to pick the right window for the PCA exam

The exam likes to test whether you can match a window type to a business requirement. The signals I look for in the question wording:

If the requirement says non-overlapping reporting periods, hourly buckets, or any phrasing that implies clean, back-to-back intervals, the answer is tumbling. "Compute the total error count for each 5-minute interval" is a tumbling window.

If the requirement involves a moving average, a rolling metric, or a metric that updates more frequently than the lookback period, the answer is hopping. "Update the displayed average every minute over the last 10 minutes of data" is a hopping window with size 10 minutes and hop 1 minute.

If the requirement involves user activity, sessions, bursts of events, or any aggregation defined by gaps in the stream, the answer is session-based. "Group all events from a user into a session that ends after 30 minutes of inactivity" is a session window.

One distinction worth memorizing: tumbling and hopping are both fixed-duration. Hopping is the one with overlap. Session is the only one without a fixed duration at all. If a question gives you a fixed window length, it is not session-based; a session question gives you an inactivity gap instead. If a question describes overlap, it is hopping. If the windows are fixed-length, sequential, and do not overlap, it is tumbling.

Where this fits on the Professional Cloud Architect exam

Windowing shows up in messaging-and-pipelines questions, especially anything involving Pub/Sub feeding Dataflow, real-time analytics, or streaming aggregation use cases. The exam will not usually ask you to write the windowing code, but it will give you a scenario and ask which window type to use, or it will describe a windowing strategy and ask whether the chosen type is appropriate.

Knowing the three definitions cold and being able to spot the keywords in the question is enough to answer most of these correctly. The harder questions combine windowing with watermarks and triggers, but those build on this foundation.

My Professional Cloud Architect course covers Dataflow windowing alongside the rest of the messaging and pipelines material.
