
Cloud Build shows up on the Professional Data Engineer exam in a specific shape. You will not get a question that asks you to design a CI/CD strategy from scratch. You will get a question that hands you a cloudbuild.yaml snippet, asks what a step does, what a substitution variable resolves to, or which trigger type fits a described workflow. The exam tests whether you can read the config and tell what will happen when it runs.
This post walks through the four pieces that come up: the YAML structure, the substitution variables, the trigger types, and how Cloud Build hands the resulting artifact off to a registry for downstream services to pull. If you can hold these four pieces in your head, the Cloud Build questions on the Professional Data Engineer exam stop being a trap.
Cloud Build is a fully managed, serverless CI/CD platform. You hand it a config file, it runs each step in a container, and it does this without you provisioning any infrastructure. It connects to GitHub, GitLab, Bitbucket, and Cloud Source Repositories. Each build step is a containerized tool, which means anything you can package into a container can run as a step.
The exam framing for this is almost always around automation. A question might describe a team that wants to rebuild a Dataflow template every time someone merges to main, or a team that wants to push a new container image to Artifact Registry on every tagged release. Cloud Build is the answer to both, and the differentiator between the two is the trigger configuration.
The config file is a list of steps, executed in order. Each step has three things: a name (the container image that runs the step), args (the command and its arguments), and an optional env (environment variables).
Here is a representative pipeline that installs dependencies, runs tests, builds a Docker image, pushes it, and deploys to App Engine:
steps:
# Install dependencies
- name: 'gcr.io/cloud-builders/npm'
  args: ['install']
# Run tests
- name: 'gcr.io/cloud-builders/npm'
  args: ['run', 'test']
# Build Docker image
- name: 'gcr.io/cloud-builders/docker'
  args: [
    'build',
    '-t', 'gcr.io/$PROJECT_ID/my-app:$COMMIT_SHA',
    '.'
  ]
# Push Docker image
- name: 'gcr.io/cloud-builders/docker'
  args: [
    'push',
    'gcr.io/$PROJECT_ID/my-app:$COMMIT_SHA'
  ]
# Deploy to App Engine
- name: 'gcr.io/cloud-builders/gcloud'
  args: ['app', 'deploy']

The pattern to internalize: each step is a one-shot container that runs a command and exits. Steps run sequentially by default. The gcr.io/cloud-builders/* images are Google-maintained containers for common tools (gcloud, docker, npm, mvn, gradle, gsutil, kubectl). You can also point name at any container image you have access to, which is how custom build steps work.
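Two of those details are worth seeing as config. A step's name can be any image you can pull, and the default sequential order can be overridden with the id and waitFor fields. A minimal sketch, where the custom linter image path is a made-up example:

steps:
# Custom step: any image you can pull can be a builder.
# This runs the image's default entrypoint.
- name: 'us-central1-docker.pkg.dev/$PROJECT_ID/tools/lint'
  id: 'lint'
# waitFor: ['-'] starts this step immediately, in parallel with 'lint'.
- name: 'gcr.io/cloud-builders/npm'
  id: 'unit-tests'
  waitFor: ['-']
  args: ['run', 'test']
# This step waits for both named steps to succeed before running.
- name: 'gcr.io/cloud-builders/docker'
  waitFor: ['lint', 'unit-tests']
  args: ['build', '-t', 'gcr.io/$PROJECT_ID/my-app:$COMMIT_SHA', '.']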
Substitution variables let you parameterize the YAML so the same file works across environments. They get injected at runtime based on the trigger conditions or values you pass in.
There are two kinds. The built-in ones come for free and are referenced with a bare $: $PROJECT_ID, $BUILD_ID, $COMMIT_SHA, $SHORT_SHA, $BRANCH_NAME, $TAG_NAME, $REPO_NAME. The user-defined ones must start with an underscore and use only uppercase letters and numbers, like $_DEPLOY_REGION or $_ENVIRONMENT. The underscore prefix is how Cloud Build distinguishes your variables from the reserved ones.
steps:
- name: 'gcr.io/cloud-builders/gcloud'
  args: ['run', 'deploy', 'my-app',
         '--image', 'gcr.io/${PROJECT_ID}/my-app:${COMMIT_SHA}',
         '--project', '${PROJECT_ID}',
         '--region', '${_DEPLOY_REGION}']
substitutions:
  _DEPLOY_REGION: 'us-central1'

A common exam scenario: the same cloudbuild.yaml deploys to a dev project on pushes to the dev branch and a prod project on pushes to main. The way you accomplish this is two triggers pointing at the same file, each setting different substitution values. The YAML stays unchanged. This is the answer pattern when a question asks how to handle multi-environment deployment without duplicating the config.
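One way to wire that up, sketched with gcloud; the repo owner, repo name, and region values are placeholders:

# Dev trigger: fires on pushes to the dev branch.
gcloud builds triggers create github \
  --name=deploy-dev \
  --repo-owner=my-org --repo-name=my-app \
  --branch-pattern='^dev$' \
  --build-config=cloudbuild.yaml \
  --substitutions=_DEPLOY_REGION=us-central1

# Prod trigger: same cloudbuild.yaml, different branch and values.
gcloud builds triggers create github \
  --name=deploy-prod \
  --repo-owner=my-org --repo-name=my-app \
  --branch-pattern='^main$' \
  --build-config=cloudbuild.yaml \
  --substitutions=_DEPLOY_REGION=europe-west1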
Triggers are what kick off a build. There are four flavors and the exam will test that you know which one fits a described scenario:
Branch triggers fire on pushes to a branch matching a pattern; main is the canonical example. Useful when a branch represents ready-to-deploy code.

Tag triggers fire on pushed git tags: v1.0, v2.3.1, anything matching the regex you configure. This is the answer for versioned release workflows.

Pull request triggers fire when a pull request is opened or updated against a target branch, which is how you run checks before code merges.

Manual triggers run only when you invoke them, from the console or with gcloud builds triggers run. Useful for hotfix deploys or any build that does not map cleanly to a code event.

When you read a Professional Data Engineer question that describes a workflow, match the words to the trigger type. "Whenever someone cuts a release" means tag trigger. "Validate changes before they merge" means pull request trigger. "Run after every commit to the development branch" means branch trigger on dev.
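The tag and manual flavors, sketched with gcloud; the trigger name, repo, and tag regex are placeholders:

# Tag trigger: fires on tags like v1.0 or v2.3.1.
gcloud builds triggers create github \
  --name=release \
  --repo-owner=my-org --repo-name=my-app \
  --tag-pattern='^v\d+\.\d+(\.\d+)?$' \
  --build-config=cloudbuild.yaml

# Manual run: kick off an existing trigger against a chosen ref.
gcloud builds triggers run release --tag=v1.0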
The example above pushes to gcr.io, which is Container Registry. The current Google-recommended target is Artifact Registry, addressed as ${REGION}-docker.pkg.dev/$PROJECT_ID/${REPO}/my-app:$COMMIT_SHA. The mechanics are identical; you just point the push step at the new host. Cloud Build authenticates to Artifact Registry automatically as the build service account, so you do not need to manage credentials in the YAML.
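The same build-and-push against Artifact Registry, as a minimal sketch; us-central1 and my-repo are placeholder region and repository names. The top-level images field is an alternative to an explicit docker push step: Cloud Build pushes anything listed there once the steps succeed, and records it in the build results.

steps:
- name: 'gcr.io/cloud-builders/docker'
  args: ['build', '-t',
         'us-central1-docker.pkg.dev/$PROJECT_ID/my-repo/my-app:$COMMIT_SHA', '.']
# Images listed here are pushed after all steps succeed.
images:
- 'us-central1-docker.pkg.dev/$PROJECT_ID/my-repo/my-app:$COMMIT_SHA'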
Once the image is in Artifact Registry, downstream services pull from there: Cloud Run, GKE, Compute Engine instance templates, Dataflow Flex templates. The handoff between Cloud Build and the runtime service is the registry, and the version pin is the tag, which is why using $COMMIT_SHA as the tag is the immutable, traceable pattern the exam favors over latest.
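For the data engineering case specifically, the same handoff shows up when a build step packages a Dataflow Flex Template; the GCS path, repository, and pipeline names here are placeholders:

# Package a Flex Template whose launcher image is pinned by commit SHA,
# so every template version traces back to an exact commit.
- name: 'gcr.io/cloud-builders/gcloud'
  args: ['dataflow', 'flex-template', 'build',
         'gs://my-bucket/templates/my-pipeline.json',
         '--image', 'us-central1-docker.pkg.dev/$PROJECT_ID/my-repo/my-pipeline:$COMMIT_SHA',
         '--sdk-language', 'PYTHON']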
Read every cloudbuild.yaml snippet top to bottom and identify the builder for each step. Recognize that $PROJECT_ID and $COMMIT_SHA are built-in while $_ANYTHING is user-defined. Match trigger types to workflow descriptions on instinct. And remember the registry is the handoff point. If you can do those four things, the Cloud Build questions on the Professional Data Engineer exam will go quickly.
My Professional Data Engineer course covers Cloud Build alongside the rest of the deployment and orchestration tooling the exam expects you to know, with the YAML patterns and trigger scenarios drilled the same way they appear on the test.