
BigQuery has a handful of query-management features that are easy to confuse on the Professional Data Engineer exam. Saved queries, scheduled queries, and User-Defined Functions all show up in scenario questions, often in the same answer set, and the wrong choice usually comes down to picking the feature that does too much or not enough. In this article I want to walk through all three the way I cover them in my course, including when each one fits and the cost behavior the exam expects you to know.
A saved query is exactly what it sounds like. You write a SQL statement in the BigQuery console, save it under a name, and pull it back up later to run on demand. You can also share saved queries with other users or across projects, which makes them handy for analyst teams that want a common library of starter queries.
The key thing to internalize for the exam is that saved queries are not a caching mechanism. Saving the query stores the SQL text, not the results. Every time you re-run it, BigQuery consumes computational resources and bills you for the bytes processed. If you see a scenario about reducing query cost by re-using prior results, saved queries are the wrong answer. That is what materialized views and the query cache are for.
Saved queries live under the Queries section of the BigQuery console side panel. You can scope them to your personal workspace or to a project so the rest of the team can see them. For PDE questions, treat them as a convenience feature for repetitive analysis, nothing more.
Scheduled queries take the same idea one step further. Instead of re-running a query manually, you tell BigQuery to run it on a recurring cadence such as daily, hourly, or on a custom schedule. The results can be written into a destination table so downstream dashboards and reports always have fresh data.
This is the feature you want for keeping reporting tables up to date without standing up Cloud Composer or Cloud Scheduler plus a function. If a PDE scenario describes a daily revenue report that needs to refresh every morning, or a weekly aggregate that powers a Looker Studio dashboard, scheduled queries are usually the lowest-effort answer.
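As a rough sketch of what the daily revenue case might look like, the query below aggregates the previous day's orders. The table dataset_name.orders and its columns order_ts and amount are hypothetical names used only for illustration; @run_date is the date parameter BigQuery makes available to scheduled queries. The destination table and write preference are set in the schedule's configuration rather than in the SQL itself.

```sql
-- Hypothetical source table: dataset_name.orders(order_ts TIMESTAMP, amount NUMERIC)
-- @run_date (DATE) is supplied by the scheduler each time the query runs.
SELECT
  DATE(order_ts) AS order_date,
  SUM(amount) AS daily_revenue
FROM dataset_name.orders
WHERE DATE(order_ts) = DATE_SUB(@run_date, INTERVAL 1 DAY)
GROUP BY order_date;
```

Filtering on the run date keeps each execution limited to one day of data instead of recomputing the whole history, which matters because every run is billed.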
The same cost caveat applies. Each execution of a scheduled query is billed for the bytes it processes, just as if you had run it by hand. There is no discount for running on a schedule. If you want to reduce that cost, you either narrow the query so it scans less data, or you switch to a different abstraction such as a materialized view that incrementally maintains itself.
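To make the materialized-view alternative concrete, here is a minimal sketch. The names dataset_name.orders, order_ts, amount, and daily_revenue_mv are all hypothetical. The point is only that the aggregate is declared once and BigQuery keeps it up to date as the base table changes, so there is no schedule to manage.

```sql
-- Hypothetical base table: dataset_name.orders(order_ts TIMESTAMP, amount NUMERIC)
CREATE MATERIALIZED VIEW dataset_name.daily_revenue_mv AS
SELECT
  DATE(order_ts) AS order_date,
  SUM(amount) AS daily_revenue
FROM dataset_name.orders
GROUP BY order_date;
```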
A few practical points that come up in exam-style questions:
User-Defined Functions, or UDFs, are a different category of feature. Instead of saving or scheduling a whole query, a UDF lets you define a custom function that you then call inside any SQL query. BigQuery supports UDFs written in either SQL or JavaScript.
You reach for a UDF when built-in SQL functions and regular expressions cannot cleanly express the logic you need. Common cases include custom string parsing, business-specific calculations that get reused across many queries, and lightweight data transformations that would otherwise need a CASE expression three screens long.
Here is the basic shape, lifted straight from the kind of example I use in the course:
```sql
CREATE OR REPLACE FUNCTION
  dataset_name.custom_greeting(name STRING)
RETURNS STRING
LANGUAGE js AS """
  return 'Hello, ' + name + '!';
""";

-- Example query using the UDF; a persistent UDF is referenced through its dataset
SELECT dataset_name.custom_greeting('Kahlil') AS greeting;
```
That returns Hello, Kahlil! in a column called greeting. The function is persistent because it was created with CREATE OR REPLACE FUNCTION and stored under a dataset. You can also write temporary UDFs that exist only for the duration of a single query by using CREATE TEMP FUNCTION.
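For comparison, here is a small sketch of a temporary UDF written in SQL rather than JavaScript. The function name clean_email and its logic are made up for illustration. Because it is declared with CREATE TEMP FUNCTION, it has no dataset prefix and disappears once the query finishes.

```sql
-- Temporary SQL UDF (hypothetical name and logic); it is not stored in any dataset
CREATE TEMP FUNCTION clean_email(email STRING)
RETURNS STRING
AS (LOWER(TRIM(email)));

-- The function definition and the query that uses it run together as one job
SELECT clean_email('  Kahlil@Example.COM ') AS email;
```

The result is kahlil@example.com in a column called email.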
For the exam, a couple of distinctions are worth keeping straight:
When a PDE question shows you a scenario, work the keywords. If someone wants to bookmark a query for re-use, that is a saved query. If they want it to run automatically on a cadence, that is a scheduled query. If they want to extend SQL with logic that the built-in functions cannot express, that is a UDF. None of the three reduce query cost on their own, so if cost reduction is the goal, look at materialized views, partitioning, clustering, or BI Engine instead.
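As one illustration of a real cost lever, the sketch below rebuilds a table with partitioning and clustering. Every name in it (dataset_name.orders, orders_partitioned, order_ts, customer_id, amount) is hypothetical, and the right partition and cluster columns depend on how the table is actually queried.

```sql
-- Hypothetical source: dataset_name.orders(order_ts TIMESTAMP, customer_id STRING, amount NUMERIC)
CREATE TABLE dataset_name.orders_partitioned
PARTITION BY DATE(order_ts)
CLUSTER BY customer_id
AS
SELECT * FROM dataset_name.orders;

-- A filter on the partitioning column lets BigQuery skip the other partitions
SELECT customer_id, SUM(amount) AS revenue
FROM dataset_name.orders_partitioned
WHERE order_ts >= TIMESTAMP('2024-01-15')
  AND order_ts < TIMESTAMP('2024-01-16')
GROUP BY customer_id;
```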
My Professional Data Engineer course covers BigQuery query management, including saved queries, scheduled queries, UDFs, and the materialized-view and partitioning strategies the exam asks about alongside them.