
Cloud SQL connectivity questions show up consistently on the Professional Data Engineer exam, and the most common trap is a scenario that pits the Cloud SQL Auth Proxy against the Authorized Networks setting. Both control who can talk to your database, but they work in very different ways, and the exam wants you to know which one is the safer default and which one is acceptable only under tight constraints. I want to walk through both options the way I think about them when I sit down for a Professional Data Engineer practice question.
The Cloud SQL Auth Proxy is a small client that runs alongside your application and brokers the connection to your Cloud SQL instance. Your application connects to a local socket or local port, and the proxy handles everything beyond that. It authenticates the caller using IAM, opens an encrypted TLS tunnel to Cloud SQL, and forwards the database traffic over that tunnel.
Three properties matter for the exam:
1. IAM authentication. The identity running the proxy needs the cloudsql.instances.connect permission on the instance. No permission, no connection. This is how you control which workloads can reach the database.
2. Encryption by default. The proxy wraps every connection in a TLS tunnel, so you get encrypted traffic without managing certificates yourself.
3. No network exposure. Because the proxy brokers the path to the instance, you never have to open the instance's IP to any external range.

On the application side, the integration is simple. You point your database driver at 127.0.0.1 on whatever port the proxy is listening on, and the rest of your code looks like a normal local Postgres or MySQL connection.
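A minimal sketch of what that looks like from the application's point of view. The host and port are assumptions that match the proxy command shown in this article (127.0.0.1:5432); the user, password, and database names are made up for illustration:

```python
# The proxy listens locally; the application never sees the Cloud SQL
# instance IP. These values assume the proxy command shown below.
PROXY_HOST = "127.0.0.1"
PROXY_PORT = 5432

def build_dsn(user: str, password: str, database: str) -> str:
    """Build a standard Postgres DSN that points at the local proxy.

    The driver connects to the proxy, which authenticates via IAM and
    forwards the traffic to Cloud SQL over TLS.
    """
    return (
        f"host={PROXY_HOST} port={PROXY_PORT} "
        f"user={user} password={password} dbname={database}"
    )

dsn = build_dsn("app_user", "app_password", "orders")
print(dsn)
```

With a real driver this DSN would go straight into something like psycopg2.connect(dsn); any standard Postgres client works unchanged, because the proxy is transparent to the application.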
    cloud-sql-proxy my-project:us-central1:my-instance --port 5432

That is the entire setup on the workload side. The proxy can run as a sidecar in a Kubernetes pod, as a process on a Compute Engine VM, or inside a container on Cloud Run. The pattern is the same in every case.
Authorized Networks is the other lever. It is a setting on the Cloud SQL instance that takes a list of IP ranges in CIDR form. Only traffic originating from those ranges is allowed to reach the public IP of the instance. Everything else is dropped at the network boundary.
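Mechanically, the allowlist is just CIDR containment checks. Here is a small sketch of the decision made at the network boundary, using Python's standard ipaddress module; the ranges are made-up documentation addresses, not a real configuration:

```python
from ipaddress import ip_address, ip_network

# Hypothetical Authorized Networks entries. Real entries are configured
# on the Cloud SQL instance itself, not in application code.
AUTHORIZED_NETWORKS = [
    ip_network("203.0.113.0/28"),   # e.g. an office egress range
    ip_network("198.51.100.7/32"),  # e.g. a partner system's static IP
]

def is_allowed(source_ip: str) -> bool:
    """Return True if the source IP falls inside any allowlisted range."""
    addr = ip_address(source_ip)
    return any(addr in net for net in AUTHORIZED_NETWORKS)

print(is_allowed("203.0.113.5"))   # inside the /28 range
print(is_allowed("198.51.100.7"))  # exact /32 match
print(is_allowed("192.0.2.44"))    # not allowlisted, dropped at the boundary
```

Note what is absent: there is no identity anywhere in this check. The decision is made purely on where the packet came from.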
That sounds clean, but there are two important consequences:

1. Access is tied to a network location, not an identity. Anyone who can send traffic from an allowed range can attempt to log in, and IAM plays no part in the decision.
2. The allowlist says nothing about encryption. Permitting a range does not encrypt the traffic; if you want TLS, you have to enforce SSL on the instance separately.
The one rule the Professional Data Engineer exam loves to test here is the open allowlist. Never set Authorized Networks to 0.0.0.0/0. That exposes the database public IP to the entire internet, and the only thing standing between the database and an attacker is the database password. If you see that option in a multiple-choice answer, it is almost always wrong.
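You can see why with the same containment check: every IPv4 address on the internet falls inside 0.0.0.0/0, so the allowlist stops filtering entirely.

```python
from ipaddress import ip_address, ip_network

# The open allowlist has a zero-bit mask, so containment is trivially
# true for any IPv4 address.
EVERYTHING = ip_network("0.0.0.0/0")

samples = ["8.8.8.8", "192.0.2.1", "255.255.255.255"]
print(all(ip_address(ip) in EVERYTHING for ip in samples))

# num_addresses makes the scope concrete: the entire IPv4 space.
print(EVERYTHING.num_addresses)
```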
The default answer on the exam, and in practice, is the Cloud SQL Auth Proxy. Pick it when:
- the workload runs on Google Cloud (GKE, Compute Engine, Cloud Run) or anywhere else you can run the proxy binary, and
- you want access governed by IAM, because you can grant cloudsql.instances.connect to a service account and revoke it without touching the database.

Authorized Networks is appropriate in a narrower set of cases, usually when you have a workload outside Google Cloud with a fixed egress IP and you cannot run the Auth Proxy there. A managed BI tool with a documented static IP, a partner system, or an on-prem batch job that hits Cloud SQL once a night can fit this pattern. Even then, you scope the allowlist to that exact IP range and you still enforce strong database-level credentials.
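If you automate your instance configuration, that scoping rule is easy to enforce with a lint check. This is a hypothetical helper, not a Cloud SQL API, and the minimum prefix length is an assumed policy value you would choose for your own organization:

```python
from ipaddress import ip_network

# Assumed policy: reject the open internet and anything broader than /24.
MIN_PREFIX_LEN = 24  # hypothetical threshold, not a Cloud SQL setting

def validate_entry(cidr: str) -> bool:
    """Return True only for narrowly scoped Authorized Networks entries."""
    net = ip_network(cidr)
    if net == ip_network("0.0.0.0/0"):
        return False  # never expose the instance to the whole internet
    return net.prefixlen >= MIN_PREFIX_LEN

print(validate_entry("198.51.100.7/32"))  # single partner IP: acceptable
print(validate_entry("0.0.0.0/0"))        # open allowlist: rejected
print(validate_entry("10.0.0.0/8"))       # far too broad: rejected
```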
One detail worth memorizing for the Professional Data Engineer exam: when you use the Cloud SQL Auth Proxy, you leave the Authorized Networks section blank. The proxy does not need an entry there. Adding one is unnecessary and just widens the network exposure.
The trap usually looks like a scenario where an application running on GKE needs to connect to Cloud SQL, the team wants IAM to control access, and the answer choices include some mix of: add the GKE node IPs to Authorized Networks, add 0.0.0.0/0 to Authorized Networks, use the Cloud SQL Auth Proxy with a service account that has cloudsql.instances.connect, or enable SSL on the instance. The right answer is the Auth Proxy with the service account, because that is the one option that uses IAM, encrypts traffic, and avoids exposing the instance.
If you remember that the Auth Proxy gives you IAM auth, TLS, and a hidden instance IP in one package, and that Authorized Networks is a network-only control that should be used sparingly and never opened to the world, you will get these questions right.
My Professional Data Engineer course covers Cloud SQL connectivity, IAM patterns for data services, and the rest of the data platform topics that show up on the exam, with the same level of detail I used here.