
Dataproc questions on the Professional Data Engineer exam love to test whether you know which IAM role to assign to which type of user, and at what scope. The five built-in roles look similar on the surface, but each one has a specific job. I want to walk through them the way I think about them when I see a scenario question, because the wrong answer is almost always the role that is too broad.
Google ships five predefined Dataproc roles. Each role grants a different slice of permissions, and most of them can be applied at either the cluster level or the project level. Knowing the scope matters as much as knowing the permissions.
When a question describes a user persona, I match the persona to the smallest role that does the job. Admin is reserved for people who manage the Dataproc environment itself, including templates and operational settings. If a question is about a platform owner who needs to manage everything in Dataproc across the project, that is the Admin role, and it is the only role you cannot scope down to a single cluster.
Editor sits one notch below Admin in breadth. It covers create, delete, and edit operations on clusters, jobs, and workflows, but not the broader operational and template controls that Admin gets. When a question describes a developer or data engineer who needs to spin up clusters and run workflows, Editor is the right answer. The fact that you can grant it at the cluster level matters here, because it lets you give an engineer full control over a specific cluster without handing them the rest of the project.
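To make the cluster-level grant concrete, here is a minimal sketch of the policy shape you could pass to `gcloud dataproc clusters set-iam-policy` to give one engineer Editor on a single cluster. The `roles/dataproc.editor` ID is the real one; the user email, cluster name, and region in the comments are hypothetical examples.

```python
import json

# Sketch of a cluster-level IAM policy granting Dataproc Editor to one user.
# You would apply it with something like:
#   gcloud dataproc clusters set-iam-policy my-cluster policy.json --region=us-central1
# (cluster name, region, and email are hypothetical examples)
policy = {
    "bindings": [
        {
            "role": "roles/dataproc.editor",
            "members": ["user:data-engineer@example.com"],
        }
    ]
}

# Serialize in the JSON shape the gcloud command expects as a policy file.
print(json.dumps(policy, indent=2))
```

The point of the sketch is the scope: because this policy is set on the cluster resource rather than the project, the engineer's Editor permissions stop at that cluster's boundary.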
The exam likes to test the difference between Job User and Editor, because both can do work on a cluster. The distinction is whether the user needs to create clusters. If the scenario is an analyst or a workload that consumes a long-lived cluster, you want Job User. They can submit jobs against existing clusters, but they cannot create or delete clusters, which is usually exactly what you want for a shared cluster managed by a platform team. If you see a question where a team submits Spark jobs to an existing cluster managed by someone else, Job User is the answer, not Editor.
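The Job User vs Editor distinction boils down to one question: does the persona need to create clusters? A toy decision helper, using the article's role names rather than IAM role identifiers, might look like:

```python
# Toy least-privilege matcher for the personas discussed above.
# Role names mirror the article's labels, not IAM role IDs.
def pick_dataproc_role(creates_clusters: bool, submits_jobs: bool) -> str:
    if creates_clusters:
        return "Editor"      # needs full cluster lifecycle control
    if submits_jobs:
        return "Job User"    # runs work on clusters someone else manages
    return "Viewer"          # look, but do not touch

# An analyst submitting Spark jobs to a shared, platform-managed cluster:
print(pick_dataproc_role(creates_clusters=False, submits_jobs=True))  # Job User
```

Run the scenario from the question through this check: the team submits jobs but never creates clusters, so the helper lands on Job User, not Editor.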
Viewer is the easy one. It is for audit, oversight, or anyone who needs to look at clusters and jobs but should not change anything. If the persona is a compliance reviewer or a manager who wants visibility without risk, Viewer fits.
The Dataproc Worker role is the one people forget about until they hit it on the exam. It is the role you assign to the service account that the cluster VMs run as, not to a human user. Worker grants the cluster the permissions it needs to read and write to Cloud Storage and write to Cloud Logging, which is what makes the actual data processing work. If a question describes a cluster that cannot read from a Cloud Storage bucket or cannot ship logs, the fix is to make sure the cluster's service account has the Dataproc Worker role and the right Cloud Storage permissions. Worker is also scoped at either the cluster level or the project level, so you can lock a service account to a specific cluster if you want.
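To make the Worker scenario concrete, here is a sketch of the binding that fixes a cluster that cannot reach Cloud Storage or Logging. The `roles/dataproc.worker` ID is the real one; the service account email and project name are hypothetical. Note the `serviceAccount:` member prefix, because this role goes on the VM identity, not a human.

```python
# Sketch: grant Dataproc Worker to the service account the cluster VMs run as.
# The service account email and project name are hypothetical examples.
worker_binding = {
    "role": "roles/dataproc.worker",
    # Service accounts use the "serviceAccount:" prefix, not "user:"
    "members": ["serviceAccount:dataproc-vm@my-project.iam.gserviceaccount.com"],
}

# The member prefix tells IAM what kind of principal this binding targets.
principal_type = worker_binding["members"][0].split(":", 1)[0]
print(principal_type)  # serviceAccount
```

Remember that Worker covers what the cluster itself needs to operate; bucket-level data access on top of that still has to come from the appropriate Cloud Storage permissions on the same service account.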
For every role except Admin, the Professional Data Engineer exam can ask you to pick the right scope. The pattern I follow is to grant at the cluster level whenever the persona's responsibility is bounded to one cluster, and at the project level when the responsibility spans clusters. A data engineer who owns one production cluster gets Editor at the cluster level. A platform engineer who owns all Dataproc clusters in the project gets Editor or Admin at the project level. Picking the tighter scope is almost always the better answer on the exam, because Google rewards least privilege.
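The scope rule above can be sketched as a small helper. It encodes the article's two constraints: Admin is project-only, and every other role defaults to the tightest scope that covers the persona's responsibility.

```python
# Sketch of the scope-selection rule described above.
def pick_scope(role: str, spans_all_clusters: bool) -> str:
    if role == "Admin":
        return "project"  # per the discussion above, Admin cannot be cluster-scoped
    return "project" if spans_all_clusters else "cluster"

print(pick_scope("Editor", spans_all_clusters=False))  # cluster: owns one prod cluster
print(pick_scope("Editor", spans_all_clusters=True))   # project: platform engineer
```

Mapping the two examples from above through the helper: the data engineer who owns one production cluster gets Editor at cluster scope, while the platform engineer who owns every cluster in the project gets project scope.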
The way to remember the set is to walk down the ladder: Admin owns the platform, Editor manages clusters and workflows, Job User runs jobs on existing clusters, Viewer watches, and Worker is the service account role that makes the cluster itself functional. If you can match a persona to one of those five rungs and pick cluster vs project scope, you will get the Dataproc IAM questions right on the Professional Data Engineer exam.
My Professional Data Engineer course covers Dataproc IAM roles and the rest of the access control surface area you need for the exam.