
Cloud Functions shows up on the Professional Data Engineer exam as the lightweight glue between data services. You will not be asked to write a function or debug a deployment, but you will be asked to recognize when Cloud Functions is the right answer in a pipeline scenario and when something heavier like Dataflow or Cloud Composer is a better fit. I want to walk through how I frame Cloud Functions for the exam so the trigger questions feel obvious when they show up.
Cloud Functions is the most abstracted and lightweight compute option on Google Cloud. It is fully serverless, which means there is no instance to size, no cluster to scale, and no operating system to patch. You hand Google a small piece of code, you tell it what event should trigger that code, and Google runs it on demand. It autoscales from zero, and you pay only for the time your function actually runs, billed in sub-second increments.
That positioning matters for the exam because the Professional Data Engineer test loves to compare compute options. Compute Engine gives you full VM control. GKE gives you containers with orchestration. Cloud Run gives you serverless containers with no cluster to manage. Cloud Functions gives you a single piece of event-driven code with nothing else to manage. When a scenario calls for short, sporadic, event-driven logic written as plain code with minimal operations overhead, Cloud Functions is the answer.
If you have worked with AWS Lambda, Cloud Functions is the direct GCP equivalent. Same model, same tradeoffs around execution time limits and memory ceilings.
Every Cloud Functions question on the Professional Data Engineer exam comes down to a trigger. There are two families, and the exam wording usually telegraphs which one applies.
HTTP triggers turn the function into a tiny endpoint. You get a URL, and any HTTP request to that URL invokes the function. This is the pattern for webhooks, lightweight APIs, and anything that needs to be called synchronously from outside Google Cloud.
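To make the HTTP pattern concrete, here is a minimal sketch using the Python functions-framework library that Cloud Functions runs code under. The function name and the payload fields are hypothetical; the point is only the shape of the handler.

```python
# Minimal HTTP-triggered function sketch. Deployed with an HTTP trigger,
# this handler gets a URL and runs on every request to it.
import functions_framework


@functions_framework.http
def handle_webhook(request):
    # `request` is a Flask request object; read a JSON payload if present.
    payload = request.get_json(silent=True) or {}
    # Hypothetical logic: acknowledge the event name the caller sent.
    event_name = payload.get("event", "unknown")
    return {"status": "ok", "event": event_name}, 200
```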
Event triggers hook the function into another Google Cloud service. The two you must know cold for the exam are:

- Cloud Storage triggers, which fire when an object in a bucket is created, overwritten, or deleted. Object creation, meaning a file landing in a bucket, is the scenario the exam leans on hardest.
- Pub/Sub triggers, which fire when a message is published to a topic.
There are other event sources, including Firestore and Eventarc-routed audit logs, but for the Professional Data Engineer exam the Cloud Storage and Pub/Sub triggers are the ones that drive the scenario questions.
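For the Pub/Sub side, here is a minimal sketch of an event-triggered handler. In the gen2 model the event arrives as a CloudEvent with the Pub/Sub message body base64-encoded inside it; the topic wiring happens at deploy time, not in code.

```python
# Minimal Pub/Sub-triggered function sketch (gen2 CloudEvents format).
import base64

import functions_framework


@functions_framework.cloud_event
def on_message(cloud_event):
    # Pub/Sub delivers the message body base64-encoded inside the event data.
    message = cloud_event.data["message"]
    text = base64.b64decode(message["data"]).decode("utf-8")
    print(f"Received message: {text}")
```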
When I prep candidates for trigger questions, I anchor on three patterns that show up repeatedly.
File-arrival processing. A vendor drops a CSV into a Cloud Storage bucket on an irregular schedule. You need to validate, rename, or move the file as soon as it lands. A Cloud Storage trigger on object creation is the right answer. You do not need a running cluster, you do not need a polling job, and the function costs you nothing until a file actually arrives.
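A sketch of that file-arrival pattern might look like the following. The CSV check and the destination bucket name are hypothetical stand-ins for whatever validation and renaming the scenario describes; copying to a separate bucket avoids retriggering the same function on the moved object.

```python
# Sketch of a Cloud Storage object-creation trigger that validates and
# moves an arriving file. Bucket names and the CSV rule are hypothetical.
import functions_framework
from google.cloud import storage

client = storage.Client()


@functions_framework.cloud_event
def on_file_arrival(cloud_event):
    # Cloud Storage events carry the bucket and object name in the payload.
    bucket_name = cloud_event.data["bucket"]
    object_name = cloud_event.data["name"]

    # Hypothetical validation: only accept CSV files.
    if not object_name.endswith(".csv"):
        print(f"Skipping non-CSV object: {object_name}")
        return

    # Copy into a separate bucket so the move does not retrigger this
    # function, then delete the original.
    src_bucket = client.bucket(bucket_name)
    blob = src_bucket.blob(object_name)
    dest_bucket = client.bucket(f"{bucket_name}-processed")
    src_bucket.copy_blob(blob, dest_bucket, object_name)
    blob.delete()
```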
Light transformation before load. Sometimes a file lands and needs a quick reshape before it goes into BigQuery. If the work is small enough to finish inside the function execution limit and does not require heavy aggregation or joins, a Cloud Function can read the object, transform it, and stream it into BigQuery directly. The moment the transform gets heavy or stateful, the exam expects you to switch to Dataflow.
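Here is a rough sketch of that read-transform-stream pattern, assuming the file fits comfortably in memory. The dataset, table, and column names are hypothetical, and a real function would add schema checks and error handling.

```python
# Sketch: small reshape of an arriving CSV, streamed into BigQuery.
# Dataset, table, and the `email` column are hypothetical.
import csv
import io

import functions_framework
from google.cloud import bigquery, storage

storage_client = storage.Client()
bq_client = bigquery.Client()


@functions_framework.cloud_event
def transform_and_load(cloud_event):
    # Read the newly created object as text.
    data = cloud_event.data
    blob = storage_client.bucket(data["bucket"]).blob(data["name"])
    text = blob.download_as_text()

    # Hypothetical light transform: normalize one column.
    rows = []
    for record in csv.DictReader(io.StringIO(text)):
        record["email"] = record.get("email", "").lower()
        rows.append(record)

    # Stream the reshaped rows straight into BigQuery.
    errors = bq_client.insert_rows_json("my_dataset.landing_table", rows)
    if errors:
        raise RuntimeError(f"BigQuery insert failed: {errors}")
```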
Triggering an orchestrator. This is the one candidates miss most often. Cloud Functions is excellent for kicking off a Cloud Composer DAG in response to an event. The pattern looks like this: a file lands in Cloud Storage, a function fires, the function calls the Composer Airflow REST API to trigger a specific DAG run, and Composer takes over the actual orchestration. Cloud Functions is the trigger, not the workflow engine. If the question describes a multi-step pipeline with dependencies and retries, the orchestrator is Composer and Cloud Functions is just the starter.
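A sketch of the starter function, assuming a Composer 2 environment whose Airflow web server accepts IAM-authenticated calls to the Airflow 2 stable REST API. The web server URL and DAG ID are hypothetical placeholders.

```python
# Sketch: file lands in Cloud Storage, this function asks Composer's
# Airflow REST API to start a DAG run. URL and DAG ID are hypothetical.
import functions_framework
import google.auth
from google.auth.transport.requests import AuthorizedSession

AIRFLOW_WEB_SERVER = "https://example-dot-us-central1.composer.googleusercontent.com"
DAG_ID = "daily_ingest"


@functions_framework.cloud_event
def trigger_dag(cloud_event):
    # Authenticate as the function's service account.
    credentials, _ = google.auth.default(
        scopes=["https://www.googleapis.com/auth/cloud-platform"]
    )
    session = AuthorizedSession(credentials)

    # Ask Airflow to start a run, passing the arriving object name as conf.
    response = session.post(
        f"{AIRFLOW_WEB_SERVER}/api/v1/dags/{DAG_ID}/dagRuns",
        json={"conf": {"object": cloud_event.data["name"]}},
    )
    response.raise_for_status()
```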
The exam will tempt you with scenarios where Cloud Functions almost fits but does not quite. Watch for these signals to steer away:

- The work runs long. Anything that cannot finish inside the function execution limit belongs in Dataflow, Cloud Run, or a batch service.
- The transform is heavy or stateful. Aggregations, joins, and windowing put you in Dataflow territory.
- The scenario describes a multi-step workflow with dependencies and retries. That is orchestration, and orchestration means Cloud Composer.
One note on naming that can throw exam takers off. Google rebranded the second generation of Cloud Functions as Cloud Run functions. Gen2 functions run on the same underlying Cloud Run infrastructure, which lifts execution time limits, expands memory and CPU options, and broadens the supported event sources through Eventarc. The original Cloud Functions Gen1 product still exists, and exam scenarios may refer to either generation.
For the Professional Data Engineer exam, the conceptual model is the same regardless of generation. Event arrives, function runs, function exits. Knowing the rename exists is enough. You will not be drilled on which generation supports which exact feature.
When a Cloud Functions question appears, I work through three quick checks. First, is the trigger source named? If the scenario mentions a file landing in a bucket, lock in the Cloud Storage trigger. If it mentions a message on a topic, lock in Pub/Sub. If it mentions an external system calling in, lock in HTTP. Second, does the work fit inside a single short function? If yes, Cloud Functions is viable. If no, look at Dataflow or Cloud Run. Third, is there orchestration involved? If the scenario describes a multi-step workflow, the function is probably just kicking off Composer, not doing the work itself.
Those three checks resolve almost every Cloud Functions question I have seen on the Professional Data Engineer exam.
My Professional Data Engineer course covers Cloud Functions alongside the rest of the GCP compute spectrum, with the exam framing you need to pick the right service under time pressure on test day.