
IAM shows up on the Professional Data Engineer exam in ways that are easy to underestimate. The questions rarely test whether you know what IAM stands for. They test whether you can pick the right role for a data engineering task, whether you can tell a basic role from a predefined role, and whether you can read a technical role name like bigquery.dataEditor and know what it grants. I want to walk through the three concepts that make all of that click: permissions, roles, and the two formats that role names show up in.
Every action a principal can take in Google Cloud is gated by a permission. A principal is just the thing doing the action, which is usually a user account or a service account. Permissions are the granular pieces. Reading a file in Cloud Storage is one permission. Writing to a BigQuery dataset is another. Creating a Dataflow job is another. Cancelling a Dataflow job is yet another.
There are thousands of these permissions across Google Cloud, and a single data engineer working on a real pipeline will need many of them. If you are wiring up an ingestion job that reads from Cloud Storage, lands data in BigQuery, and runs through Dataflow, you are probably looking at dozens of distinct permissions before that pipeline can run end to end.
You will almost never assign permissions one at a time. That would be unmanageable. Instead, permissions get bundled into roles, and roles get granted to principals. The exam wants you to understand both layers because the wrong choice at the role layer is what causes most of the IAM mistakes in the real world.
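The two layers can be sketched in a few lines of Python. This is only a toy model, not how IAM is implemented: the permission sets are abbreviated subsets of what the real roles contain, and the service account address is a made-up example.

```python
# Toy model of the two IAM layers: permissions bundled into roles,
# roles granted to principals. Permission sets are abbreviated for
# illustration; the real roles contain many more permissions.
ROLE_PERMISSIONS = {
    "roles/storage.objectViewer": {"storage.objects.get", "storage.objects.list"},
    "roles/bigquery.dataEditor": {"bigquery.tables.getData", "bigquery.tables.updateData"},
    "roles/dataflow.developer": {"dataflow.jobs.create", "dataflow.jobs.cancel"},
}

# Hypothetical principal and the roles granted to it.
INGEST_SA = "serviceAccount:ingest@example-project.iam.gserviceaccount.com"
GRANTS = {
    INGEST_SA: {"roles/storage.objectViewer", "roles/bigquery.dataEditor"},
}


def effective_permissions(principal: str) -> set:
    """Union of the permissions in every role granted to the principal."""
    permissions = set()
    for role in GRANTS.get(principal, set()):
        permissions |= ROLE_PERMISSIONS[role]
    return permissions


def can(principal: str, permission: str) -> bool:
    """True if any granted role contains the permission."""
    return permission in effective_permissions(principal)
```

In this sketch the ingestion service account can read objects and write table data, but it cannot create a Dataflow job until a role containing dataflow.jobs.create is granted to it.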
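Roles in Google Cloud fall into three categories, and the differences matter for the Professional Data Engineer exam. In order from broad to specific, they are basic, predefined, and custom. Basic roles are Owner, Editor, and Viewer, and they apply across nearly every service in a project. Predefined roles are maintained by Google and scoped to a particular service and job function, like BigQuery Data Editor. Custom roles are ones you build yourself from an exact set of permissions.
The ordering matters. If you see a question where someone needs to write data to a specific BigQuery dataset and nothing else, the answer is almost never the basic Editor role. It is usually a predefined role like BigQuery Data Editor, scoped to the right resource. If the question constrains things further, to a specific set of permissions that no predefined role matches, then you are in custom role territory. Restricting read access to one table but not another is a different kind of constraint: that is handled by granting a predefined role on the table itself rather than on the whole dataset or project.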
One of the small details that trips people up is that every role has two names. They refer to the exact same thing, but they show up in different places.
The descriptive role name is the readable version. Something like Data Catalog Entry Viewer. It is what you see in documentation, in conversation, and often in the console UI when you are picking a role from a dropdown. It describes the role's purpose in plain language.
The technical role name is the fully qualified version. The same role written as datacatalog.entryViewer. This is what you use when you are assigning roles through the gcloud CLI, in Terraform, or anywhere else that takes a string identifier, where it carries a roles/ prefix, as in roles/datacatalog.entryViewer. For predefined roles the format is service.roleName. The service part is the Google Cloud service, like bigquery, storage, dataflow, or pubsub. The roleName part is camelCased and describes the specific role within that service. Basic roles are the exception: they have no service prefix and are written roles/owner, roles/editor, and roles/viewer.
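As a minimal sketch of that format, the Python below splits a technical role name into its service and role parts and builds a binding in the role/members shape that IAM policies use. The function names and the example member are my own, chosen for illustration; nothing here calls a Google Cloud API.

```python
ROLE_PREFIX = "roles/"


def strip_prefix(role_id: str) -> str:
    """Drop a leading 'roles/' if present."""
    return role_id[len(ROLE_PREFIX):] if role_id.startswith(ROLE_PREFIX) else role_id


def parse_role(role_id: str):
    """Split a role identifier into (service, role_name).

    Basic roles such as roles/editor have no service part, so the
    service comes back as None for them.
    """
    name = strip_prefix(role_id)
    service, dot, role_name = name.partition(".")
    if not dot:
        return None, name
    return service, role_name


def make_binding(technical_name: str, member: str) -> dict:
    """Build one policy binding: a role identifier plus its members."""
    return {"role": ROLE_PREFIX + strip_prefix(technical_name), "members": [member]}
```

Calling parse_role on roles/datacatalog.entryViewer yields the service datacatalog and the role entryViewer, which is the same mental split you make when reading an answer choice.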
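A few examples to make this concrete: BigQuery Data Editor is bigquery.dataEditor, BigQuery Data Viewer is bigquery.dataViewer, Storage Object Viewer is storage.objectViewer, Dataflow Worker is dataflow.worker, and Pub/Sub Publisher is pubsub.publisher.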
When you see questions that drop a technical role name into the answer choices, you need to be able to map it back to what it does without hesitating. The service prefix tells you which product the role applies to, and the role name itself usually telegraphs the level of access. Anything ending in admin grants the most permissions for that scope. Anything ending in viewer is read-only. Editor and worker variants sit in the middle.
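That reading habit can be written down as a rough heuristic. The sketch below only encodes the suffix rule of thumb from the paragraph above; the labels it returns are my own wording, and plenty of real roles (publisher or jobUser, for example) do not fit the pattern, so treat it as a study aid rather than a source of truth.

```python
def access_hint(technical_name: str) -> str:
    """Guess the level of access from the suffix of a role name.

    Heuristic only: it looks at how the role name ends and says
    nothing about the actual permissions inside the role.
    """
    # Keep the part after the last dot, e.g. 'dataViewer', and ignore case.
    role_name = technical_name.rsplit(".", 1)[-1].lower()
    if role_name.endswith("admin"):
        return "broadest"
    if role_name.endswith("viewer"):
        return "read-only"
    if role_name.endswith(("editor", "worker")):
        return "intermediate"
    return "unclear"
```

When the hint comes back as unclear, the service prefix is still useful: it at least tells you which product's documentation to check.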
The Professional Data Engineer exam will not ask you to recite the difference between basic and predefined roles in the abstract. It will hand you a scenario where a team needs to do something specific, like let an analyst run queries in BigQuery without giving them the ability to modify datasets, and ask which role to grant. The right answer almost always involves picking the narrowest predefined role that still satisfies the requirement.
The other place IAM shows up is in troubleshooting. A pipeline fails because a service account is missing one permission. You need to look at the permission named in the error message, something like bigquery.tables.getData, recognize which service it belongs to, and figure out which predefined role would add that permission. That is why the two formats matter. The console mostly shows you descriptive names, while error logs, the CLI, and IAM policies use technical identifiers, and you have to move between them comfortably.
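The troubleshooting step can be sketched as a lookup from a missing permission to candidate roles. The catalog below is a small, abbreviated stand-in I wrote for illustration; the real roles contain more permissions, and in practice you would check the IAM documentation or the console rather than a hard-coded table.

```python
# Abbreviated, illustrative catalog: each role maps to a subset of the
# permissions the real predefined role contains.
CATALOG = {
    "roles/bigquery.dataViewer": {
        "bigquery.tables.get",
        "bigquery.tables.getData",
        "bigquery.tables.list",
    },
    "roles/bigquery.dataEditor": {
        "bigquery.tables.get",
        "bigquery.tables.getData",
        "bigquery.tables.list",
        "bigquery.tables.create",
        "bigquery.tables.updateData",
    },
    "roles/bigquery.jobUser": {"bigquery.jobs.create"},
}


def service_of(permission: str) -> str:
    """The service is the first dotted segment of a permission name."""
    return permission.split(".", 1)[0]


def candidate_roles(permission: str) -> list:
    """Roles in the catalog that grant the permission, narrowest first.

    'Narrowest' here just means fewest permissions in this toy catalog.
    """
    matches = [role for role, perms in CATALOG.items() if permission in perms]
    return sorted(matches, key=lambda role: (len(CATALOG[role]), role))
```

For a missing bigquery.tables.getData, the sketch lists the viewer role ahead of the editor role, which mirrors the exam habit of picking the narrowest predefined role that still satisfies the requirement.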
My Professional Data Engineer course covers IAM permissions, the three role categories, and how to read technical role names alongside the rest of the security and governance topics that show up on the exam.