Connecting VPCs for the PDE Exam: Shared VPC, Peering, VPN, Interconnect

GCP Study Hub
May 8, 2026

Networking is not the headline topic on the Professional Data Engineer exam, but it shows up in the question style I find trickiest: you are given a data pipeline architecture and asked which connectivity option fits. The trap is that several answers will sound plausible, and the right pick depends on a single attribute in the scenario (same org or not, on-prem or another cloud, latency budget, compliance posture). In this article I walk through the four options that come up most often on the Professional Data Engineer exam: Shared VPC, VPC Peering, Cloud VPN, and Cloud Interconnect.

Shared VPC: one network, many projects, one org

Shared VPC is the answer when a single organization has many projects that all need to live on the same network. You designate one project as the host project, define the VPC network and subnets there, and then attach other projects as service projects. Workloads run in the service projects, but they consume subnets owned by the host project.

The intent is centralized network management. One team controls IP ranges, firewall rules, and routes, while application teams keep ownership of their own resources. You also get unified security policies and cost efficiency from sharing things like load balancers and NAT.

The exam-relevant detail that trips people up is IAM. A Dataflow job running in a service project does not automatically have permission to deploy into a Shared VPC subnet. The service account needs the compute.networkUser role on the specific subnetwork in the host project. If a Professional Data Engineer scenario describes a Dataflow pipeline that fails to launch in a Shared VPC environment, that missing role is almost always the answer.

Two more constraints worth memorizing:

  • Shared VPC works inside a single organization. You cannot share a VPC across organizations.
  • The host project should be dedicated to network resources. Do not run VMs or pipelines in the host project itself.
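The compute.networkUser grant is a one-line fix once you know where it goes: it is applied in the host project, on the specific subnet, to the service account that runs the pipeline. A hedged sketch, with placeholder project IDs, subnet name, region, and service account:

```shell
# Grant a Dataflow worker service account permission to use a Shared VPC
# subnet. Run this against the HOST project, which owns the subnet.
# All names below are placeholders.
gcloud compute networks subnets add-iam-policy-binding shared-subnet-us \
    --project=host-project-id \
    --region=us-central1 \
    --member="serviceAccount:dataflow-worker@service-project-id.iam.gserviceaccount.com" \
    --role="roles/compute.networkUser"
```

Note that the role is granted on the subnetwork resource, not on the host project as a whole; project-level grants work too, but subnet-level grants are the least-privilege option the exam tends to favor.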

VPC Peering: connect two VPCs without merging them

VPC Peering is what you reach for when you have two independent VPC networks and you want them to talk over private IPs. The networks stay under separate ownership, separate IAM, separate firewall policies. The peering connection just lets traffic flow privately between them.

This is the option that handles cross-organization connectivity. Shared VPC cannot cross an org boundary, but VPC Peering can. A company mid-acquisition, a partner that needs to read from your published BigQuery dataset over private endpoints, a SaaS vendor exposing a service into your network: these are all VPC Peering scenarios.

Two gotchas the Professional Data Engineer exam likes to test:

  • No IAM unification. Each VPC keeps its own permissions model. Peering is purely a network-layer connection.
  • No transitive peering. If A peers with B and B peers with C, A cannot reach C through B. You would need a direct A-to-C peering. This catches people building hub-and-spoke designs.

If a question describes traffic that needs to hop through an intermediate VPC, peering is wrong. You are looking at Shared VPC, a Network Connectivity Center hub, or some kind of proxy in the middle.
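One detail worth seeing in command form: a peering is not established by one side alone. Each network creates its half, and the connection only becomes ACTIVE once both halves exist. A sketch with placeholder network and project names:

```shell
# Run in project-a: create this side of the peering.
gcloud compute networks peerings create a-to-b \
    --network=vpc-a \
    --peer-project=project-b \
    --peer-network=vpc-b

# Run in project-b: create the matching half. Until this exists,
# the first peering sits in an INACTIVE state and no traffic flows.
gcloud compute networks peerings create b-to-a \
    --network=vpc-b \
    --peer-project=project-a \
    --peer-network=vpc-a
```

This two-sided handshake is also why cross-org peering works without any shared IAM: each admin only ever runs commands in their own project.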

Cloud VPN: encrypted tunnel over the public internet

Cloud VPN is the bridge to anything outside Google Cloud. On-prem data centers, another cloud provider, a remote office network. It builds an IPsec tunnel between your on-prem (or other-cloud) VPN gateway and a Cloud VPN gateway attached to your VPC.

The key trade-off is that traffic is encrypted but still travels over the public internet. That means VPN inherits internet latency and jitter. For most data engineering use cases (nightly batch loads, ad-hoc transfers, replicating reference tables) this is fine. For continuous high-volume replication or anything with a strict latency SLA, it is not enough.

HA VPN is the production-grade flavor with a 99.99% SLA when both sides are configured for high availability. If a Professional Data Engineer question describes a hybrid setup that needs to be resilient but does not call out latency or bandwidth as the constraint, HA VPN is usually the answer.
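To make the HA VPN requirements concrete, here is a minimal provisioning skeleton (placeholders throughout). An HA VPN gateway automatically gets two interfaces; the 99.99% SLA requires tunnels on both interfaces, dynamic routing via a Cloud Router, and a matching HA configuration on the peer side:

```shell
# HA VPN gateway: created with two interfaces by default.
gcloud compute vpn-gateways create ha-vpn-gw \
    --network=my-vpc --region=us-central1

# HA VPN requires dynamic (BGP) routing, which means a Cloud Router.
gcloud compute routers create vpn-router \
    --network=my-vpc --region=us-central1 --asn=65001

# One tunnel per gateway interface; the second tunnel (interface 1)
# is omitted here for brevity but is required for the 99.99% SLA.
gcloud compute vpn-tunnels create tunnel-0 \
    --region=us-central1 \
    --vpn-gateway=ha-vpn-gw \
    --peer-external-gateway=on-prem-gw \
    --peer-external-gateway-interface=0 \
    --interface=0 \
    --router=vpn-router \
    --ike-version=2 \
    --shared-secret=REPLACE_ME
```

If an exam option mentions HA VPN with static routes, treat it with suspicion: HA VPN only supports dynamic routing.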

Cloud Interconnect: private fiber to Google

Cloud Interconnect is a separate service that gives you a private, high-bandwidth, low-latency physical connection from your environment into Google Cloud. It bypasses the public internet entirely, which means consistent performance and a stronger compliance story.

This is the option for:

  • Large-scale data replication into BigQuery or Cloud Storage where bandwidth matters.
  • Datastream and other change-data-capture pipelines that need predictable latency.
  • Disaster recovery designs where RPO and RTO depend on steady throughput.
  • Workloads where public internet exposure is not acceptable for compliance reasons.

There are three flavors to recognize:

  • Dedicated Interconnect. A direct physical connection between your data center and Google's network. Highest bandwidth, lowest latency, requires you to be near a Google Cloud edge location.
  • Partner Interconnect. You connect through a supported service provider. Better fit when bandwidth needs are lower or your data center is not near an edge location. Geographic flexibility is the headline feature.
  • Cross-Cloud Interconnect. A direct private link from another cloud provider's network into Google Cloud. Use when you want multi-cloud data movement without touching the public internet.
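Partner Interconnect is the only flavor of the three you can stand up without a physical presence at a colocation facility, which is why it shows up in gcloud form more often. A hedged sketch of the attachment step, with placeholder names; the physical circuit itself comes from the service provider:

```shell
# Partner Interconnect: create a VLAN attachment, then hand the pairing
# key it returns to your service provider, who completes the connection.
# Region, router, and attachment names are placeholders.
gcloud compute interconnects attachments partner create my-attachment \
    --region=us-central1 \
    --router=ic-router \
    --edge-availability-domain=availability-domain-1
```

Dedicated Interconnect skips the provider but adds a physical step: ordering a cross-connect in a colocation facility, which no gcloud command can do for you.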

How I pick the answer on exam day

When a Professional Data Engineer question describes a connectivity choice, I run through this short checklist in order:

  • Are both networks inside the same Google Cloud organization and do you want shared admin? Shared VPC.
  • Are they two independent VPCs (same or different orgs) that just need private IP reachability? VPC Peering.
  • Are you connecting to on-prem or another cloud, and is encrypted-over-internet acceptable? Cloud VPN (HA VPN).
  • Are you connecting to on-prem or another cloud and the scenario calls out bandwidth, latency, or no-public-internet? Cloud Interconnect (Dedicated, Partner, or Cross-Cloud depending on the details).
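The checklist above is mechanical enough to encode. Here is a small, hypothetical shell helper (my own naming, not a Google tool) that walks the same decision order; the scope and constraint keywords are assumptions chosen to mirror the bullets:

```shell
# Encode the exam checklist, in order. Purely illustrative.
# Usage: pick_connectivity <scope> <constraint>
#   scope:      same-org | cross-vpc | hybrid
#   constraint: shared-admin | private-ip | internet-ok | high-bandwidth
pick_connectivity() {
  scope="$1"; constraint="$2"
  case "$scope:$constraint" in
    same-org:shared-admin)  echo "Shared VPC" ;;
    cross-vpc:private-ip)   echo "VPC Peering" ;;
    hybrid:internet-ok)     echo "Cloud VPN (HA VPN)" ;;
    hybrid:high-bandwidth)  echo "Cloud Interconnect" ;;
    *)                      echo "re-read the scenario" ;;
  esac
}

pick_connectivity hybrid high-bandwidth   # prints "Cloud Interconnect"
```

The fall-through case is deliberate: if a scenario does not cleanly match one branch, the fix is to re-read the question for the attribute you missed, not to guess.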

That order matches how the exam tends to phrase the scenarios, and it has saved me from second-guessing on more than one question.

My Professional Data Engineer course covers each of these connectivity options with the exam-style framing above, plus the IAM detail on Shared VPC that decides a surprising number of Dataflow questions.
