
Networking questions on the Professional Data Engineer exam tend to sneak up on people. You think you signed up to learn BigQuery and Dataflow, and then a scenario hits you with a subnet sizing constraint or a regional placement decision. VPC subnets show up in those scenarios more than any other networking primitive, so I want to walk through what I think you actually need to know going into the test.
Here is the single most important fact to lock in: a subnet in Google Cloud lives at the region level. It spans every zone in that region, which means a VM in us-central1-a and a VM in us-central1-b can sit in the same subnet and talk to each other over private IPs without any extra routing setup.
This matters for the Professional Data Engineer exam because data pipelines almost always involve multiple workers spread across zones for availability. If you're running a Dataflow job, a Dataproc cluster, or a GKE workload that pulls from Pub/Sub and writes to BigQuery, the workers in different zones share a subnet as long as they're in the same region. You do not create one subnet per zone. You create one subnet per region.
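To make that concrete, here's a minimal gcloud sketch with hypothetical names (it assumes a custom-mode VPC called data-vpc already exists; the command that creates it appears in the mode section below):

```bash
# One subnet for the whole us-central1 region.
gcloud compute networks subnets create pipeline-subnet \
    --network=data-vpc \
    --region=us-central1 \
    --range=10.10.0.0/20

# Two workers in different zones, both landing in that same regional subnet.
gcloud compute instances create worker-a \
    --zone=us-central1-a \
    --subnet=pipeline-subnet

gcloud compute instances create worker-b \
    --zone=us-central1-b \
    --subnet=pipeline-subnet
```

Both VMs draw private IPs from 10.10.0.0/20 and can reach each other directly, despite sitting in different zones.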
The exam will sometimes test the inverse. If a scenario forces you into a second region for disaster recovery or for data residency, you need a second subnet in that region. A subnet cannot stretch across regions, full stop.
When you create a subnet, you assign it a primary IP range in CIDR notation. There are two rules to remember: the range has to come from valid private address space (RFC 1918 is the safe default), and it cannot overlap with the range of any other subnet in the same VPC network.
One nice operational detail is that if you run out of IPs in a subnet, you can expand the range in place. You do not have to delete and recreate the subnet. That comes up in scaling scenarios where a data pipeline starts small and balloons into a much larger cluster footprint.
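The expansion is one command. A minimal sketch, continuing the hypothetical names from above; note that the new prefix length has to be shorter than the old one, because a range can only grow, never shrink:

```bash
# Widen pipeline-subnet from /20 to /18 in place, no delete-and-recreate.
gcloud compute networks subnets expand-ip-range pipeline-subnet \
    --region=us-central1 \
    --prefix-length=18
```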
Also worth noting for the exam: 0.0.0.0/0 is not a valid range for a private subnet. That CIDR represents the default route for all IPv4 addresses, which is the kind of thing you use in a route definition, not a subnet boundary.
Subnets can carry a primary range plus one or more secondary ranges. Secondary ranges exist to support alias IPs, which is the mechanism that lets a single VM hand out additional private IPs to workloads running on it. The classic case is GKE.
When you run a GKE cluster in VPC-native mode, pods and services do not consume IPs from the primary subnet range. They pull from secondary ranges. You typically configure two secondary ranges on the subnet, one for pods and one for services, then point the cluster at those.
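Here's a hedged sketch of that wiring, with placeholder names and ranges. The /14 pods range is deliberately generous: GKE hands each node a /24 for pods by default, so a /14 yields 1,024 of those per-node blocks:

```bash
# Subnet with a primary range for nodes plus two named secondary ranges.
gcloud compute networks subnets create gke-subnet \
    --network=data-vpc \
    --region=us-central1 \
    --range=10.20.0.0/20 \
    --secondary-range=pods=10.64.0.0/14,services=10.80.0.0/20

# VPC-native cluster pointed at those named secondary ranges.
gcloud container clusters create data-cluster \
    --region=us-central1 \
    --network=data-vpc \
    --subnetwork=gke-subnet \
    --enable-ip-alias \
    --cluster-secondary-range-name=pods \
    --services-secondary-range-name=services
```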
For Professional Data Engineer scenarios, this comes up when a question describes a data platform running on GKE, something like Spark on Kubernetes, or a custom Beam runner, or a real-time inference service that consumes from Pub/Sub. If the scenario mentions IP exhaustion in a GKE cluster, the answer almost always involves sizing the pod secondary range correctly up front. You cannot grow it as casually as the primary range, so plan for the pod count you expect at peak.
When you create a VPC, you pick between auto mode and custom mode.
Auto mode creates a subnet for you in every Google Cloud region automatically, using a predefined 10.128.0.0/9 address space carved into /20s. It's convenient for getting started, but it gives you no control over CIDR layout, no control over which regions get subnets, and the default ranges tend to collide with on-premises networks during VPN or Interconnect setup.
Custom mode creates an empty VPC with no subnets. You decide which regions need a subnet, what CIDR each one gets, and whether to add secondary ranges. This is the recommended mode for any production data workload, and it is what I'd expect to see in any Professional Data Engineer scenario involving hybrid connectivity, multi-region pipelines, or carefully planned address space.
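In gcloud terms, the fork between the two modes looks like this (names are placeholders):

```bash
# Auto mode: Google creates a /20 subnet in every region from 10.128.0.0/9.
gcloud compute networks create quickstart-vpc --subnet-mode=auto

# Custom mode: an empty network; subnets go only where the pipeline runs.
gcloud compute networks create data-vpc --subnet-mode=custom

# A second region for DR means a second, separately planned subnet.
gcloud compute networks subnets create pipeline-subnet-dr \
    --network=data-vpc \
    --region=europe-west1 \
    --range=10.10.128.0/20
```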
The exam framing to watch for: if a scenario mentions on-premises integration via Cloud VPN or Cloud Interconnect, or mentions a network team that needs to coordinate IP allocation, the right answer is custom mode. Auto mode shows up as the wrong answer in those cases because you cannot guarantee non-overlapping ranges.
If you can hold these points in your head (subnets are regional, ranges can grow but never overlap, secondary ranges exist for GKE, custom mode for production), you'll handle most subnet questions on the Professional Data Engineer exam without trouble.
My Professional Data Engineer course covers VPC design alongside the rest of the networking surface that shows up on the exam, including firewall rules, Private Google Access, Shared VPC, and how all of it interacts with Dataflow, Dataproc, and BigQuery workloads.