
Firestore shows up on the Professional Data Engineer exam in a way that catches people off guard. It is not just "the serverless document database." The exam wants you to reason about how documents are organized, how queries are served, and why an index strategy can either keep a workload fast and cheap or quietly drive write costs through the roof. I want to walk through the two pieces that matter most for Professional Data Engineer prep: the Firestore data model, and the exploding composite index problem.
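Firestore is a document database, but the hierarchy matters. The building blocks are simple: documents, which hold fields; collections, which hold documents; and subcollections, which hang off an individual document.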
The model is meant to mirror real-world relationships. Take an ecommerce app. At the top sits a users collection. Each user is a document with fields like name, email, and address. Underneath each user document sits an orders subcollection, where each order is its own document with fields like orderDate, totalAmount, and status. Each order document can then have an OrderItems subcollection, where each item document carries product ID, quantity, and price.
That gives you a clean path from a user, to their orders, to the items inside each order. The hierarchy is not just visual organization. Queries scoped to a subcollection only see documents inside that subcollection, which is how you keep a user's orders separate from every other user's orders without joining anything.
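The path from user to order to item can be sketched in plain Python, without the Firestore client. This is only an illustration of how alternating collection and document IDs form a path; the helper names (`doc_path`, `in_subcollection`) and the IDs (`alice`, `order-1001`, `item-1`) are hypothetical, not part of any Firestore API.

```python
def doc_path(*segments: str) -> str:
    """Join alternating collection/document IDs into a Firestore-style path."""
    if len(segments) % 2 != 0:
        raise ValueError("a document path needs an even number of segments")
    return "/".join(segments)


def in_subcollection(path: str, parent_doc: str, collection: str) -> bool:
    """True when path is a document directly inside parent_doc/collection."""
    prefix = f"{parent_doc}/{collection}/"
    return path.startswith(prefix) and "/" not in path[len(prefix):]


# Hypothetical IDs for the ecommerce example.
user = doc_path("users", "alice")
order = doc_path("users", "alice", "orders", "order-1001")
item = doc_path("users", "alice", "orders", "order-1001", "OrderItems", "item-1")
other_order = doc_path("users", "bob", "orders", "order-7")

print(item)
print(in_subcollection(order, user, "orders"))        # alice's own order
print(in_subcollection(other_order, user, "orders"))  # bob's order is out of scope
```

The scoping check mirrors the point above: a query against one user's orders subcollection never has to filter out another user's orders, because those documents live under a different path.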
For the Professional Data Engineer exam, the things to internalize are:
Every Firestore query is served by an index. There is no "scan the collection" option the way you might think about it in a relational database. If an index does not exist to satisfy the query, the query fails.
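Firestore uses two kinds of indexes: single-field indexes, which it creates automatically for each field, and composite indexes, which span multiple fields and which you define for queries that filter or sort on more than one field.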
The composite index is where the trouble starts.
Here is the rule that makes this dangerous: to support queries on documents, Firestore creates an index entry for every possible combination of values across the indexed fields. When you have multiple fields, especially array fields or fields with high cardinality, the number of index entries multiplies.
The classic example is a job listing document with five fields: industry (6 possible values), job type (4), experience level (4), city (300), and salary range (5).
If you build a composite index across all five fields, Firestore needs an index entry for every combination:
6 industries x 4 job types x 4 experience levels x 300 cities x 5 salary ranges = 144,000 combinations per document
That is 144,000 index entries for a single document. Multiply that across a few million job listings and you have an index that costs a fortune to write, slows every insert, and bloats storage.
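The arithmetic is easy to check, and it also shows how much a single high-cardinality field contributes. A minimal sketch, with hypothetical field names standing in for the five job listing fields:

```python
from math import prod

# Cardinalities from the job listing example; the key names are illustrative.
cardinalities = {
    "industry": 6,
    "job_type": 4,
    "experience_level": 4,
    "city": 300,
    "salary_range": 5,
}

all_five = prod(cardinalities.values())

# Same product with the 300-value city field left out of the composite.
without_city = prod(n for name, n in cardinalities.items() if name != "city")

print(all_five)      # 144000
print(without_city)  # 480
```

Dropping one field from the composite changes the count by a factor equal to that field's cardinality, which is why the widest fields are the first ones to question.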
The problem gets worse when fields are arrays. If a document has three array fields and you index a composite across them, Firestore generates an entry for the Cartesian product of the arrays. Three arrays of ten elements each is a thousand entries per document, per write.
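Taking that description at face value, the Cartesian product can be enumerated directly. This sketch only counts combinations in memory; it does not call Firestore, and the three array field names are made up for the example.

```python
from itertools import product

# Three hypothetical array fields, ten elements each.
skills = [f"skill-{i}" for i in range(10)]
benefits = [f"benefit-{i}" for i in range(10)]
locations = [f"location-{i}" for i in range(10)]

# One tuple per combination of (skill, benefit, location).
combinations = list(product(skills, benefits, locations))

print(len(combinations))  # 1000
print(combinations[0])    # ('skill-0', 'benefit-0', 'location-0')
```

Adding an eleventh element to any one array adds another hundred combinations, so the growth is multiplicative rather than additive.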
Firestore gives you two main levers, and the Professional Data Engineer exam expects you to know both.
The first lever is manual index configuration. Rather than letting Firestore auto-suggest composite indexes for every query you happen to run, you maintain an index configuration file that defines only the composites your application actually needs. If your job board never filters by all five fields at once, you do not build the five-field composite. You build the two or three composites that match real query patterns.
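As a rough sketch of what such a file might contain, the snippet below builds a `firestore.indexes.json`-style structure in Python and prints it as JSON. The `jobs` collection, the field names, and the two query patterns are assumptions made for illustration, and the `composite` helper is hypothetical; check the exact schema against the current Firebase CLI documentation before relying on it.

```python
import json


def composite(collection_group, *fields):
    """Build one composite index definition from (field, order) pairs."""
    return {
        "collectionGroup": collection_group,
        "queryScope": "COLLECTION",
        "fields": [{"fieldPath": name, "order": order} for name, order in fields],
    }


# Only the composites that match assumed, real query patterns;
# no five-field composite is defined.
index_config = {
    "indexes": [
        composite("jobs", ("industry", "ASCENDING"), ("city", "ASCENDING")),
        composite(
            "jobs",
            ("job_type", "ASCENDING"),
            ("experience_level", "ASCENDING"),
            ("salary_range", "DESCENDING"),
        ),
    ],
    "fieldOverrides": [],
}

config_json = json.dumps(index_config, indent=2)
print(config_json)
```

Keeping the file under version control makes each new composite a deliberate, reviewable change rather than a side effect of someone running a new query.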
The second lever is single-field index exemptions. By default Firestore indexes every field in every document for single-field queries. For array fields or large nested map fields that you never query on, you can exempt them from indexing entirely. This stops the array-contains and ordered indexes from being created on those fields, which alone can save a large chunk of write cost. You can also exempt subfields inside a map, or disable ascending or descending order independently.
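Exemptions live in the same index configuration file, under `fieldOverrides`. The sketch below assumes a `jobs` collection with a never-queried `skills` array and a `posted_at` field that is only ever sorted newest-first; both field names are hypothetical, and the structure reflects my understanding of the Firebase CLI format rather than a verified schema.

```python
import json

field_overrides = [
    # An empty "indexes" list turns off all single-field indexing for the field.
    {"collectionGroup": "jobs", "fieldPath": "skills", "indexes": []},
    # Listing only DESCENDING keeps that order and drops the ascending index.
    {
        "collectionGroup": "jobs",
        "fieldPath": "posted_at",
        "indexes": [{"queryScope": "COLLECTION", "order": "DESCENDING"}],
    },
]

overrides_json = json.dumps(
    {"indexes": [], "fieldOverrides": field_overrides}, indent=2
)
print(overrides_json)
```

An exempted field can no longer be used in a filter or sort, so the exemption only belongs on fields you are confident the application never queries.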
If you see a Firestore question on the Professional Data Engineer exam, anchor on three checks:
My Professional Data Engineer course covers Firestore data modeling, composite index design, the exploding index pattern, and the rest of the storage and database section of the exam, with every concept tied back to the question patterns Google actually uses.