
Data in a Bigtable table tends to accumulate over time, and not all of it stays equally useful. Records that were written months or years ago are often kept only for compliance or occasional long-term analysis, while the active queries that the application runs touch a much smaller and more recent slice of the table. Keeping everything in the cluster means paying cluster storage rates for data that is rarely read. Managing old data is the practice of deciding which records have aged out of active use and moving them somewhere cheaper. This comes up on the Professional Cloud Database Engineer exam because the right answer usually involves shifting historical records out of Bigtable rather than scaling the cluster up to hold them.
Bigtable bills for the storage its tables occupy in a cluster, and that storage is priced higher per gigabyte than object storage in a Cloud Storage bucket. When a table holds a large volume of records that are no longer queried regularly, you are paying database storage rates to retain data that is effectively cold. If nothing is ever removed, the footprint also grows without bound, and the storage bill grows with it. The goal is to keep the table holding the data that active queries actually need, and to retain the rest in a location that costs significantly less per gigabyte.
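To make the cost argument concrete, here is a minimal sketch of the arithmetic. The two rates and the hot/cold split are illustrative placeholders, not real Google Cloud prices, and the function name is hypothetical.

```python
# Illustrative placeholder rates in USD per GB per month.
# These are NOT real Google Cloud prices; check the current pricing pages.
BIGTABLE_RATE = 0.20
BUCKET_RATE = 0.02


def monthly_storage_cost(hot_gb, cold_gb, archive_cold):
    """Return the monthly storage bill for a table split into hot and cold data."""
    if archive_cold:
        # Hot data stays in Bigtable; cold data is billed at the bucket rate.
        return hot_gb * BIGTABLE_RATE + cold_gb * BUCKET_RATE
    # Everything stays in Bigtable and is billed at the database rate.
    return (hot_gb + cold_gb) * BIGTABLE_RATE


# Assumed split: 500 GB actively queried, 4500 GB cold.
keep_all = monthly_storage_cost(hot_gb=500, cold_gb=4500, archive_cold=False)
archived = monthly_storage_cost(hot_gb=500, cold_gb=4500, archive_cold=True)

print(f"everything in Bigtable: {keep_all:.2f}")
print(f"cold data in a bucket:  {archived:.2f}")
```

Whatever the actual rates are, the saving scales with the share of the table that is cold, which is why the archive approach pays off most on tables dominated by historical records.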
The pattern for handling aging data in Bigtable has two phases that run in order. The first is the migrate phase. Here you identify the older records in the table and copy them out of Bigtable into a Cloud Storage bucket. Object storage in a bucket is much cheaper per gigabyte than the storage used by a database cluster, so relocating these records is where the cost saving comes from. The data is not thrown away; it is preserved in the bucket, where it remains available for compliance needs or long-term analysis.
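The migrate phase can be sketched in plain Python, with a dictionary standing in for the table. This is an in-memory illustration, not the Bigtable or Cloud Storage client API: the row layout, the `written_at` field, the 365-day window, and both function names are assumptions made for the example.

```python
import json
from datetime import datetime, timedelta, timezone


def select_old_rows(table, cutoff):
    """Return the rows written strictly before the cutoff."""
    return {key: row for key, row in table.items() if row["written_at"] < cutoff}


def to_archive_blob(rows):
    """Serialize rows as newline-delimited JSON, sorted by row key."""
    lines = []
    for key, row in sorted(rows.items()):
        record = {
            "row_key": key,
            "written_at": row["written_at"].isoformat(),
            "value": row["value"],
        }
        lines.append(json.dumps(record))
    return "\n".join(lines).encode("utf-8")


# Hypothetical table contents keyed by row key.
table = {
    "sensor1#2022-01-15": {"written_at": datetime(2022, 1, 15, tzinfo=timezone.utc), "value": 21.5},
    "sensor1#2023-02-01": {"written_at": datetime(2023, 2, 1, tzinfo=timezone.utc), "value": 19.0},
    "sensor1#2024-05-20": {"written_at": datetime(2024, 5, 20, tzinfo=timezone.utc), "value": 22.1},
}

# Fixed "now" so the example is reproducible; the 365-day window is an assumption.
now = datetime(2024, 6, 1, tzinfo=timezone.utc)
cutoff = now - timedelta(days=365)

old_rows = select_old_rows(table, cutoff)
archive_blob = to_archive_blob(old_rows)  # bytes that would be uploaded to the bucket

print(sorted(old_rows))
print(archive_blob.decode("utf-8"))
```

Note that nothing is removed from `table` here. The migrate phase only copies, so the source rows stay in place until the archive has been confirmed.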
The second phase is the delete phase, and the ordering matters. Only after the older records are safely written to the Cloud Storage bucket do you remove those specific records from the original Bigtable table. Deleting before the data is confirmed in the bucket would risk losing records that may still be required, so the migrate step always completes first. Once the delete step runs, the table is left holding only the data needed for active queries, while the historical records continue to live in Cloud Storage.
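The ordering rule can be illustrated with a small in-memory sketch. `InMemoryBucket` is a hypothetical stand-in for a Cloud Storage bucket, and its `fail_on` parameter exists only to simulate an upload that does not land; none of these names come from a real client library.

```python
class InMemoryBucket:
    """Hypothetical stand-in for a Cloud Storage bucket."""

    def __init__(self, fail_on=()):
        self.objects = {}
        self.fail_on = set(fail_on)  # object names whose upload is simulated to fail

    def upload(self, name, data):
        if name in self.fail_on:
            return False
        self.objects[name] = data
        return True


def archive_then_delete(table, bucket, old_keys):
    """Copy old rows to the bucket, then delete only the rows confirmed there."""
    # Phase 1 (migrate): attempt to copy every old row to the bucket.
    for key in old_keys:
        bucket.upload(key, table[key])

    # Confirm: a row is eligible for deletion only if the bucket holds an identical copy.
    confirmed = [
        key for key in old_keys
        if key in bucket.objects and bucket.objects[key] == table[key]
    ]

    # Phase 2 (delete): remove only the confirmed rows from the table.
    for key in confirmed:
        del table[key]
    return confirmed


table = {"row-2021": "a", "row-2022": "b", "row-2024": "c"}
bucket = InMemoryBucket(fail_on={"row-2022"})  # simulate one failed upload

deleted = archive_then_delete(table, bucket, old_keys=["row-2021", "row-2022"])

print(deleted)
print(sorted(table))
```

In this run the row whose upload failed stays in the table, so a partial archive never turns into data loss. The failed rows can simply be retried on the next pass.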
After both phases run, you have a leaner Bigtable instance. The cluster carries less data, which keeps storage cost aligned with the working set the application reads, and the historical records remain accessible in the bucket whenever they are needed for compliance or analysis. For the exam, the distinction worth holding onto is that long-term retention of cold records belongs in Cloud Storage rather than in the Bigtable cluster, and that the safe sequence is to migrate the data to the bucket before deleting it from the table.
Our Professional Cloud Database Engineer course covers managing old data in Bigtable alongside Bigtable cluster sizing and long-term retention with Cloud Storage, with practice questions that drill these distinctions.