Predefined vs Custom IAM Roles for the PDE Exam

GCP Study Hub
August 1, 2025

IAM questions on the Professional Data Engineer exam love to test one specific judgment call. The scenario gives you a team, a job to do, and a security posture, then asks which kind of role you should grant. Basic, predefined, or custom. Most candidates can rule out basic roles quickly. The harder split is predefined versus custom, and that is the one I want to walk through.

The framing I keep in my head is simple. Predefined roles are the answer until something forces you to reach for a custom role. Custom roles cost more to maintain, and Google's own design intent is that you use predefined roles wherever they fit. The exam reflects that bias.

What predefined roles actually are

A predefined role is a bundle of permissions that Google has curated for a specific service and a specific job function. The role name almost always follows the pattern service.function. Compute Engine Admin. Cloud Functions Developer. BigQuery Data Editor. BigQuery Job User. Storage Object Viewer. Cloud Run Developer. Each one corresponds to a typical way someone uses that service.

The important thing is what is inside the bundle. Take the Composer User role. It is not a single permission. It is a collection that includes composer.dags.execute, composer.dags.get, composer.environments.list, composer.userWorkloadsSecrets.get, and a handful of supporting permissions like serviceusage.services.list that the role needs to function end to end. Together they enable someone to operate inside a Cloud Composer environment without giving them anything outside that scope.

The Storage Object Creator role works the same way. It groups storage.objects.create, storage.folders.create, storage.managedFolders.create, storage.multipartUploads.create, plus the supporting resourcemanager.projects.get and resourcemanager.projects.list. Each permission is small. The role is the assembly.
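The bundle idea is easy to make concrete. Here is a minimal Python sketch that models each predefined role as a named set of permissions. The permission strings come from the two roles just described; the variable and function names are mine, and the structure is illustrative, not any Google API.

```python
# Predefined roles modeled as frozen permission sets (illustrative only).
COMPOSER_USER = frozenset({
    "composer.dags.execute",
    "composer.dags.get",
    "composer.environments.list",
    "composer.userWorkloadsSecrets.get",
    "serviceusage.services.list",  # supporting permission
})

STORAGE_OBJECT_CREATOR = frozenset({
    "storage.objects.create",
    "storage.folders.create",
    "storage.managedFolders.create",
    "storage.multipartUploads.create",
    "resourcemanager.projects.get",   # supporting permission
    "resourcemanager.projects.list",  # supporting permission
})

def allows(role: frozenset, permission: str) -> bool:
    """A grant of `role` allows `permission` iff the permission is in the bundle."""
    return permission in role

print(allows(STORAGE_OBJECT_CREATOR, "storage.objects.create"))  # True
print(allows(STORAGE_OBJECT_CREATOR, "storage.objects.get"))     # False: creator, not viewer
```

The last line is the whole point of curated bundles. Object Creator can write objects but cannot read them back, which is exactly the drop-box pattern the role was designed for.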

You do not need to memorize every predefined role for the Professional Data Engineer exam. You need to recognize the pattern. If a scenario says a team needs to run Dataflow jobs, a Dataflow predefined role exists. If they need to read from a BigQuery dataset, a BigQuery predefined role exists. The exam is rarely asking you to recall the exact role name. It is asking whether the right answer is to find a predefined role, build a custom role, or grant basic Editor.

What custom roles are for

Custom roles are roles you define and manage yourself. You pick the exact list of permissions. Google does not curate them, Google does not update them when a service adds new permissions, and you are responsible for the lifecycle.

This matters when the work spans multiple services and no single predefined role covers all of it, or when a predefined role grants more access than the security posture allows. The example I keep returning to is a regulated environment. Imagine a financial services company called FinSecure where the data analysts need to view and query BigQuery datasets, run Dataflow jobs, read Cloud Storage objects, and access specific Compute Engine logs for troubleshooting. No predefined role does all four. You could grant four predefined roles, and that works, but if you want a single tightly scoped role you would build something like a FinSecure Data Analyst custom role with exactly these permissions:

  • bigquery.datasets.get, bigquery.tables.get, bigquery.tables.list, bigquery.jobs.create, bigquery.jobs.get
  • dataflow.jobs.create, dataflow.jobs.get, dataflow.jobs.list, dataflow.jobs.updateContents
  • storage.objects.get, storage.objects.list
  • logging.logEntries.list, logging.logServices.list, logging.logMetrics.list

Notice what that role does not include. No storage.objects.create. No bigquery.datasets.create. No write access to logs. The analyst can read what they need across four services and do nothing else. That is the value of a custom role. Just enough access, nothing more.

How the exam phrases the choice

The signals I look for on a Professional Data Engineer scenario question are these.

  • If the task lines up with a typical service role, like running queries in BigQuery or deploying to Cloud Run, the answer is a predefined role.
  • If the scenario specifically says least privilege or strict control or regulated environment and the work spans multiple services in an unusual combination, the answer is a custom role.
  • If the scenario gives someone a few projects' worth of resources without scoping by service, that is pointing at basic roles, and basic roles are almost never the right answer on this exam outside of test environments.
  • If a predefined role gets you 95 percent of the way there and the question asks how to handle the remaining 5 percent, the answer is usually to add a second predefined role rather than build a custom one. Stacking predefined roles is cheap. Maintaining a custom role is not.

One more habit worth picking up. When you see a permission name in an answer choice, read the prefix. bigquery.jobs.create tells you everything. It is a BigQuery permission about creating jobs. The exam uses these permission names as hints about which service and which action the role is granting. Predefined role names follow the same logic.
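That reading habit is mechanical enough to write down. Permission IDs follow a service.resource.action shape, so a throwaway parser (names mine, purely illustrative) makes the exam hint explicit:

```python
def read_permission(perm: str) -> dict:
    """Split an IAM permission ID into its parts: first segment is the
    service, last is the action, the middle names the resource type."""
    parts = perm.split(".")
    return {
        "service": parts[0],
        "resource": ".".join(parts[1:-1]),
        "action": parts[-1],
    }

print(read_permission("bigquery.jobs.create"))
# {'service': 'bigquery', 'resource': 'jobs', 'action': 'create'}
```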

The split between predefined and custom is not really about which is better. It is about whether Google's curated bundle fits your scenario or whether you need to assemble your own. Most of the time it fits. When it does not, custom roles exist for that reason.

My Professional Data Engineer course covers IAM roles, permission design, and the specific predefined roles that show up most often on the exam, with worked examples for BigQuery, Dataflow, Cloud Storage, and Composer.
