Hosting Static Websites on Cloud Storage for the PDE Exam

GCP Study Hub
November 23, 2025

One of the smaller but reliably tested topics on the Google Cloud Professional Data Engineer exam is using Cloud Storage to host a static website. It does not feel like a data engineering topic at first glance, which is exactly why it tends to catch people off guard on test day. The exam writers like it because it touches object storage, public access, and HTTP metadata in a single scenario, and those are all things a data engineer needs to know cold.

I want to walk through what a static website on Cloud Storage actually is, why Google offers it, and the one metadata detail that shows up on the Professional Data Engineer exam more than anything else in this corner of the blueprint.

What counts as a static website

A static website is just a set of files that do not change based on who is requesting them. HTML, CSS, JavaScript, images, audio, video. There is no server-side rendering, no database lookup, no session logic. The same bytes go out to every visitor.

That property is what makes Cloud Storage a fit. A bucket is already an HTTP-addressable store of immutable objects. If you mark the objects publicly readable and point a domain at the bucket, you have a website. No VM to patch, no container to scale, no load balancer to provision unless you want one in front for HTTPS or CDN purposes.
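To make "HTTP-addressable" concrete, here is a minimal Python sketch that builds the public URL for an object. It assumes the path-style `https://storage.googleapis.com/BUCKET/OBJECT` form that Cloud Storage uses for publicly readable objects. The function name, bucket name, and object name are hypothetical and exist only for illustration.

```python
from urllib.parse import quote


def public_object_url(bucket_name: str, object_name: str) -> str:
    """Build the path-style URL for an object in a publicly readable bucket."""
    # Percent-encode the object name, but keep "/" so folder-like names
    # remain ordinary URL path segments.
    encoded_name = quote(object_name, safe="/")
    return f"https://storage.googleapis.com/{bucket_name}/{encoded_name}"


# Hypothetical bucket and object, for illustration only.
print(public_object_url("example-docs-site", "guides/getting started.html"))
```

The URL only resolves for anonymous visitors once the objects are publicly readable. Without that grant, the same request is rejected.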

Why this shows up on the exam

The Professional Data Engineer exam tests whether you pick the right storage and serving primitive for a workload. Static hosting questions usually come dressed up as something else. A team wants to publish a documentation site generated from a build pipeline. A data science group wants to share interactive HTML reports built from notebooks. A media company wants to serve audio samples from a public catalog. In all three cases, the cheapest and most operationally simple answer is Cloud Storage.

The wrong answers in those scenarios usually involve spinning up Compute Engine or App Engine for content that never changes per request. If you see a workload where the response is identical for every user and the files are produced ahead of time, Cloud Storage is the answer.

The metadata detail you have to know

This is the part of the topic that I see asked most directly. Every object in Cloud Storage carries metadata, and one of those fields is Content-Type. When a browser requests an object, Cloud Storage returns that Content-Type value in the HTTP response header, and the browser uses it to decide what to do with the bytes.

If the Content-Type is wrong, or was never set and the object ended up with a generic type such as application/octet-stream, the browser treats the response as opaque bytes. That usually means prompting a download instead of rendering or playing the file inline. That is the failure mode the exam likes to describe.

The classic version of the question goes like this. A team uploaded MP3 audio files to a Cloud Storage bucket and linked to them from a static page. When users click the link, the browser downloads the file instead of playing it. What do you change? The answer is to set the Content-Type metadata on the objects to audio/mpeg so the browser knows the bytes are an audio stream it can play directly.

The same logic applies to other media types:

  • text/html for HTML pages so the browser renders instead of downloading the source
  • text/css for stylesheets so the page actually picks up styling
  • text/javascript for JS files (application/javascript is the older form and is still widely accepted)
  • image/png, image/jpeg, video/mp4, and so on for media
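A rough way to picture this is a lookup from file extension to Content-Type. The Python sketch below is illustrative rather than exhaustive. The mapping, the function name, and the fallback choice are my own assumptions for the example, not something Cloud Storage exposes.

```python
from pathlib import PurePosixPath

# Illustrative mapping for common static-site file types (not exhaustive).
# text/javascript is the currently registered type for JS;
# application/javascript is the older form and is still widely accepted.
CONTENT_TYPES = {
    ".html": "text/html",
    ".css": "text/css",
    ".js": "text/javascript",
    ".png": "image/png",
    ".jpg": "image/jpeg",
    ".jpeg": "image/jpeg",
    ".mp3": "audio/mpeg",
    ".mp4": "video/mp4",
}

# Generic type that browsers typically download rather than render.
FALLBACK_TYPE = "application/octet-stream"


def content_type_for(object_name: str) -> str:
    """Return the Content-Type to set for an object, based on its extension."""
    suffix = PurePosixPath(object_name).suffix.lower()
    return CONTENT_TYPES.get(suffix, FALLBACK_TYPE)


print(content_type_for("samples/track01.mp3"))
print(content_type_for("reports/summary.html"))
```

Anything that falls through to the fallback type is a candidate for the "browser downloads instead of playing" symptom.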

You can set Content-Type at upload time, and you can also patch it on existing objects with gsutil setmeta or with gcloud storage objects update and its --content-type flag. The exam does not usually ask for exact CLI syntax. It asks you to recognize that metadata is the lever.
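For the MP3 scenario, the fix can be sketched in Python as a small helper that assembles the `gcloud storage objects update` command. This is a sketch under assumptions: the helper name, bucket, and object path are hypothetical, and the block only builds and prints the command rather than running it.

```python
import shlex


def build_set_content_type_cmd(
    bucket_name: str, object_name: str, content_type: str
) -> list[str]:
    """Assemble the gcloud command that patches Content-Type on one object."""
    return [
        "gcloud", "storage", "objects", "update",
        f"gs://{bucket_name}/{object_name}",
        f"--content-type={content_type}",
    ]


# Hypothetical bucket and object, for illustration only.
cmd = build_set_content_type_cmd(
    "example-audio-catalog", "samples/track01.mp3", "audio/mpeg"
)
print(shlex.join(cmd))
```

On a machine where gcloud is installed and authenticated, the list can be passed to `subprocess.run(cmd, check=True)`. Only the object's metadata changes. The audio bytes are not re-uploaded.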

What else fits in this topic

A few other facts are worth keeping in your head for the Professional Data Engineer exam, even though they show up less often than the Content-Type question.

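  • A bucket mapped to a custom domain through a CNAME record serves over HTTP only. To serve over HTTPS on a custom domain, you put an external Application Load Balancer (previously called the global external HTTPS load balancer) in front of the bucket, with the bucket configured as a backend bucket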
  • Cloud CDN can sit on that load balancer to cache objects at the edge, which is the right call for a high-traffic static site
  • The bucket needs the right IAM grant for public reads. The standard pattern is granting allUsers the Storage Object Viewer role (roles/storage.objectViewer) on the bucket
  • Index and error pages are configured at the bucket level through the website configuration, so requests to the bucket root return the file you nominated as the index
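The public-read grant and the website configuration from the list above can be sketched the same way, as Python helpers that assemble the corresponding gcloud commands. The helper names, the bucket name, and the `index.html` / `404.html` defaults are assumptions for illustration. The block builds and prints the commands without executing them.

```python
import shlex


def build_public_read_cmd(bucket_name: str) -> list[str]:
    """Assemble the command granting allUsers read access to the bucket's objects."""
    return [
        "gcloud", "storage", "buckets", "add-iam-policy-binding",
        f"gs://{bucket_name}",
        "--member=allUsers",
        "--role=roles/storage.objectViewer",
    ]


def build_website_config_cmd(
    bucket_name: str,
    index_page: str = "index.html",  # assumed file name
    error_page: str = "404.html",    # assumed file name
) -> list[str]:
    """Assemble the command that sets the bucket's index and error pages."""
    return [
        "gcloud", "storage", "buckets", "update",
        f"gs://{bucket_name}",
        f"--web-main-page-suffix={index_page}",
        f"--web-error-page={error_page}",
    ]


# Hypothetical bucket name, for illustration only.
for cmd in (
    build_public_read_cmd("example-docs-site"),
    build_website_config_cmd("example-docs-site"),
):
    print(shlex.join(cmd))
```

Both settings live on the bucket rather than on individual objects, which is why neither command names an object path.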

How to answer these questions quickly

When a question mentions a workload of files that do not change per user, default to Cloud Storage. When a question mentions a browser doing the wrong thing with a file, default to Content-Type metadata. Those two reflexes will carry almost every static hosting question on the test.

My Professional Data Engineer course covers Cloud Storage as a serving layer alongside the rest of the storage decisions you have to make on the exam.
