
When candidates picture the Professional Data Engineer exam, they usually picture BigQuery, Dataflow, and Pub/Sub. What surprises them is how many questions hinge on networking. A Dataflow job that cannot reach a private Cloud SQL instance, a Dataproc cluster that fails to pull from a customer VPC, a Composer environment stuck on a NAT misconfiguration: these are the scenarios the exam loves to throw at you. To answer them confidently, you need a clean mental model of Virtual Private Cloud (VPC) networking and private IP addressing on Google Cloud. That is what I want to walk through here.
A VPC is a logically isolated virtual network inside Google Cloud. The easiest way to think about it is as a virtualized version of a physical network. In a traditional data center, you stitch together routers, switches, servers, and cables to let machines talk to each other. In a VPC, all of that hardware is abstracted into software. Firewalls become rules. Switches become routes. Segmentation becomes subnets. You get the same primitives, but you configure them with API calls instead of cable runs.
The detail that trips people up is that a VPC on Google Cloud is global. Unlike most other clouds, where a VPC is regional, a single Google Cloud VPC stretches across every region you operate in. You do not need to peer regional networks together to let a service in us-central1 talk to a service in us-east1. They are already in the same VPC and can communicate over Google's backbone using private IP addresses. That is a big deal for data pipelines that span regions.
Inside a VPC, you create subnets, and subnets are scoped to a single region. A subnet is just a slice of the IP address space carved off for resources in one region. If you build a VPC with three subnets, say one in us-central1, another in us-east1, and a third in europe-west1, every Compute Engine VM, Dataproc node, or Cloud SQL private instance you create inherits an IP from the subnet of the region it lives in.
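To make the regional-subnet model concrete, here is a small sketch using Python's standard ipaddress module. The CIDR ranges and the region-to-subnet assignments are illustrative, not Google Cloud defaults; the point is that each region's subnet is a distinct, non-overlapping slice of private address space, and a resource's IP tells you which regional subnet it came from.

```python
import ipaddress

# Illustrative regional subnets for one VPC.
# These CIDRs are made up for the example, not Google Cloud defaults.
subnets = {
    "us-central1":  ipaddress.ip_network("10.0.0.0/20"),
    "us-east1":     ipaddress.ip_network("10.0.16.0/20"),
    "europe-west1": ipaddress.ip_network("10.0.32.0/20"),
}

# Subnets in the same VPC must not overlap.
nets = list(subnets.values())
for i, a in enumerate(nets):
    for b in nets[i + 1:]:
        assert not a.overlaps(b)

# A VM inherits an address from the subnet of the region it lives in,
# so the address alone identifies the region.
vm_ip = ipaddress.ip_address("10.0.16.5")
print([region for region, net in subnets.items() if vm_ip in net])  # ['us-east1']
```

Because the VPC is global, a VM at 10.0.16.5 can reach a VM in the europe-west1 subnet directly over these private addresses, with no peering in between.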
For the Professional Data Engineer exam, the takeaway is this: the VPC itself is global, but every subnet inside it is regional. When a question asks where a resource gets its private IP, the answer is always the subnet in its region.
An IP address is the unique identifier a device uses to send and receive data on a network. IPv4 addresses look like 192.168.1.1. IPv6 addresses look like 2001:0db8:85a3:0000:0000:8a2e:0370:7334. IPv4 is still what you will see in almost every PDE scenario, so that is what I focus on.
There are two flavors of IPv4 address you need to keep straight. A public IP is routable on the open internet. A private IP only routes inside a private network, like your home Wi-Fi or a VPC. When a Dataflow worker talks to a Cloud SQL instance over a private connection, both sides are using private IPs, and the traffic never leaves Google's network. When that same worker pulls a public package from PyPI, it is reaching out through a public IP, often via Cloud NAT.
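Python's standard ipaddress module can classify the two flavors directly, which makes a handy sanity check when reading exam scenarios. The addresses below are just examples of each kind:

```python
import ipaddress

# A Dataflow worker's private address vs. a publicly routable address.
worker_ip = ipaddress.ip_address("10.128.0.5")   # private: never leaves the VPC
public_ip = ipaddress.ip_address("34.135.10.5")  # public: routable on the open internet

print(worker_ip.is_private, worker_ip.is_global)  # True False
print(public_ip.is_private, public_ip.is_global)  # False True
```

When the worker talks to Cloud SQL over a private connection, both endpoints look like the first address; when it reaches PyPI through Cloud NAT, the traffic egresses from an address like the second.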
Private IP addressing matters for the Professional Data Engineer exam for three concrete reasons:

- Managed services such as Cloud SQL, Dataflow, and Dataproc can talk to each other over private IPs, so pipeline traffic never leaves Google's network.
- The exam expects you to recognize private versus public addresses on sight.
- The failure scenarios the exam loves, an unreachable private Cloud SQL instance or a NAT misconfiguration, almost always come down to private connectivity.
There are three IPv4 ranges reserved for private networks by RFC 1918. You should be able to recognize all three on sight:

- 10.0.0.0/8 (10.0.0.0 to 10.255.255.255)
- 172.16.0.0/12 (172.16.0.0 to 172.31.255.255)
- 192.168.0.0/16 (192.168.0.0 to 192.168.255.255)
If you see an IP starting with 10., 172.16. through 172.31., or 192.168., it is a private address. The Professional Data Engineer exam will not ask you to subnet a /28 by hand, but it absolutely expects you to know that 10.128.0.5 is private and 34.135.10.5 is public.
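The recognition rule can be written down as a short check against the three RFC 1918 networks. This is a sketch using Python's standard ipaddress module, with the exam-style addresses from above as examples:

```python
import ipaddress

# The three RFC 1918 private ranges.
RFC1918 = [
    ipaddress.ip_network("10.0.0.0/8"),
    ipaddress.ip_network("172.16.0.0/12"),
    ipaddress.ip_network("192.168.0.0/16"),
]

def is_rfc1918(ip: str) -> bool:
    """Return True if the address falls in any RFC 1918 range."""
    addr = ipaddress.ip_address(ip)
    return any(addr in net for net in RFC1918)

print(is_rfc1918("10.128.0.5"))   # True  -- private
print(is_rfc1918("172.20.1.9"))   # True  -- private
print(is_rfc1918("34.135.10.5"))  # False -- public
```

Note the edge of the middle range: 172.31.255.255 is private, but 172.32.0.1 is not, because 172.16.0.0/12 stops at 172.31.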
Networking decisions show up everywhere in a data pipeline:

- Dataflow workers launch into a subnet you choose, and reaching a private Cloud SQL instance means private connectivity, not a public IP.
- Dataproc clusters inherit their addresses from the VPC subnet they run in, which is exactly what breaks when a cluster cannot pull from a customer VPC.
- Composer environments with private nodes need Cloud NAT for outbound internet access, the classic NAT misconfiguration scenario.
If you understand the three RFC 1918 ranges, the global VPC plus regional subnet model, and the difference between public and private IPs, you will handle the networking questions on exam day without breaking stride.
My Professional Data Engineer course covers VPCs, subnets, private connectivity, and every networking pattern the exam tests, alongside the BigQuery, Dataflow, and Pub/Sub content you would expect.