App Engine Traffic Splitting, Dependencies, Memcache for PCA

Ben Makansi

February 16, 2026

Three App Engine topics tend to show up together on the Professional Cloud Architect exam: traffic splitting between versions, dependency management as a source of runtime errors, and Memcache for caching. They are not deeply connected as features, but the exam treats them as part of the same App Engine operational toolkit, so I cover them together. State management belongs in the same conversation because it is the constraint that forces you to use external storage instead of in-memory tricks.

Traffic splitting between versions

App Engine lets you split traffic between different versions of your application. The important word there is versions, not services. If you have two services in the same App Engine app, traffic splitting will not divide requests between them. It only divides requests across versions of one service.

The two main use cases are A/B testing and gradual rollouts. With A/B testing you expose a new version to a small percentage of users while the majority stay on the stable version. With gradual rollouts you start a new version at a small slice of traffic, watch for errors, and ramp it up over time. If something goes wrong, you can shift traffic back to the previous version without redeploying.

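You can configure splits in two ways. The first is the --splits flag on the gcloud app services set-traffic command, which sets the share of traffic going to each version of a service. The second is the App Engine console, where you set the same allocation through the UI. Both produce the same result. Note that gcloud app deploy sends all traffic to the new version by default, so for a gradual rollout you deploy with --no-promote first and then set the split.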
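As a rough illustration of the command-line route, here is a minimal Python sketch that assembles the arguments for gcloud app services set-traffic. The helper name build_set_traffic_command, the service name, and the version IDs are hypothetical; only the command and its --splits and --split-by flags come from the gcloud CLI. The sketch only builds the argument list and does not run anything.

```python
def build_set_traffic_command(service, splits, split_by="cookie"):
    """Return the argv list for a `gcloud app services set-traffic` call.

    `splits` maps version ID -> weight, e.g. {"v1": 0.9, "v2": 0.1}.
    Versions with a zero weight are left out of the flag entirely.
    (Hypothetical helper for illustration, not part of any Google library.)
    """
    active = {version: weight for version, weight in splits.items() if weight > 0}
    if not active:
        raise ValueError("at least one version needs a positive weight")
    split_arg = ",".join(f"{version}={weight}" for version, weight in active.items())
    return [
        "gcloud", "app", "services", "set-traffic", service,
        f"--splits={split_arg}",
        f"--split-by={split_by}",  # gcloud accepts cookie, ip, or random
    ]


# Hypothetical gradual rollout: start small, ramp up, finish at 100 percent.
ramp = [
    {"v1": 0.9, "v2": 0.1},
    {"v1": 0.5, "v2": 0.5},
    {"v1": 0, "v2": 1},
]
commands = [build_set_traffic_command("default", step) for step in ramp]
```

Rolling back is the same call with the weights reversed, which is why no redeploy is needed. To actually apply a step you could pass one of these lists to subprocess.run on a machine where gcloud is installed and authenticated.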

For the Professional Cloud Architect exam, the things to remember are: splits work between versions of one service, the typical pattern is to start small and ramp to 100 percent, and rollback is just shifting traffic back to the older version.

Dependency management and runtime errors

Missing or incompatible dependencies are a common source of runtime errors on App Engine. The classic example is a Java application throwing ClassNotFoundException because a JAR file is not available in the runtime environment. The same shape of problem shows up in other languages when a library is missing or pinned to an incompatible version.

App Engine Standard uses predefined runtimes. Those runtimes are optimized for scaling, but they can introduce compatibility issues with libraries that depend on features the runtime does not support. App Engine Flexible gives you more control over the runtime through a custom container, which is one of the reasons it is the answer when a question describes a library that does not work on Standard.

The resolution flow when a dependency error appears in production is straightforward:

Verify dependencies. Check that all external libraries are listed in your build files (pom.xml for Maven, build.gradle for Gradle, requirements.txt for Python) and that they target a runtime version App Engine actually supports.
Update dependencies. If a library is missing or pinned wrong, fix the version to one compatible with the App Engine runtime.
Package and test locally. Build the application and confirm it runs before deploying. This catches most issues without burning a deploy.
Redeploy. Push the corrected build to App Engine so the running version uses the fixed dependencies.
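The verify step above can be partly automated. The following is a minimal sketch for the Python case, assuming a requirements.txt that uses simple name==version pins; the function names are hypothetical, and anything more complex than an exact pin is ignored. It compares each pin against what is installed in the current environment, using the standard library's importlib.metadata.

```python
from importlib import metadata


def parse_pins(requirement_lines):
    """Map package name -> pinned version for simple `name==version` lines."""
    pins = {}
    for raw in requirement_lines:
        line = raw.split("#", 1)[0].strip()  # drop comments and whitespace
        if "==" not in line:
            continue  # this sketch skips blank lines and non-exact specifiers
        name, version = line.split("==", 1)
        pins[name.strip().lower()] = version.strip()
    return pins


def installed_version(name):
    """Return the installed version of a package, or None if it is missing."""
    try:
        return metadata.version(name)
    except metadata.PackageNotFoundError:
        return None


def find_dependency_problems(requirement_lines, lookup=installed_version):
    """List pins that are missing or installed at a different version."""
    problems = []
    for name, wanted in parse_pins(requirement_lines).items():
        found = lookup(name)
        if found is None:
            problems.append(f"{name}: not installed (wanted {wanted})")
        elif found != wanted:
            problems.append(f"{name}: installed {found}, pinned {wanted}")
    return problems
```

Running find_dependency_problems(open("requirements.txt")) in a clean virtual environment before deploying surfaces the same mismatch that would otherwise appear as an ImportError in production. The lookup parameter exists only so the check can be exercised against a fake set of installed packages.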

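On exam questions, the tell is usually a Java ClassNotFoundException or a Python ImportError after a deploy. The right answer involves verifying the dependency and redeploying, not switching runtimes or rebuilding the project from scratch. The exception is when the question states that the library cannot run on the Standard runtime at all, in which case Flexible is the answer.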

Memcache on App Engine

Memcache is App Engine's in-memory caching service. It sits between your application and your backend database to store frequently accessed data so you do not have to go to the database for every request. The result is lower latency and fewer database queries, which lets the application scale better under load.

There are two modes. Dedicated Memcache gives your application exclusive resources, with better performance and reliability under high load. Shared Memcache uses pooled resources across applications. It is cheaper, but it is more prone to eviction, meaning your cached data can be removed to make room for other applications' data when the pool is under pressure. If a question describes a workload that cannot tolerate eviction or needs predictable latency, the answer is dedicated Memcache.

The flow for a single request looks like this. The application receives a request and computes a cache key, usually a hash of the query. It checks Memcache for that key. If the data is there (a cache hit), the application returns the result immediately without touching the database. If the data is not there (a cache miss), the application forwards the query to the backend, gets the result, and typically writes it back into Memcache so the next request for the same data is faster.
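The request flow above is the cache-aside pattern, and it can be sketched in a few lines of Python. This is a minimal sketch, not the App Engine Memcache API: InMemoryCache is a hypothetical stand-in that exposes get and set, and run_query is a placeholder for whatever function talks to your database.

```python
import hashlib


class InMemoryCache:
    """Stand-in for a cache client with get/set (not the App Engine API)."""

    def __init__(self):
        self._data = {}

    def get(self, key):
        return self._data.get(key)  # None means the key is not cached

    def set(self, key, value):
        self._data[key] = value


def cache_key(query):
    """Derive a fixed-length cache key by hashing the query text."""
    return hashlib.sha256(query.encode("utf-8")).hexdigest()


def fetch(query, cache, run_query):
    """Return the result for `query`, consulting the cache first."""
    key = cache_key(query)
    cached = cache.get(key)
    if cached is not None:
        return cached  # cache hit: the database is never touched
    result = run_query(query)  # cache miss: go to the backend
    cache.set(key, result)  # write back so the next request is a hit
    return result
```

Because a miss is signalled by None, this sketch never caches a query whose result is None. A real cache would also set an expiry on each entry and tolerate the entry being evicted at any time, which is exactly the behavior that differs between shared and dedicated Memcache.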

The mental model the Professional Cloud Architect exam wants you to have is: Memcache reduces load on the database by serving repeat reads from memory. It is not a system of record, it is a performance layer. Cache misses still hit the database, and dedicated mode exists when shared eviction is unacceptable.

State management and stateless instances

App Engine instances should be treated as stateless. Any data you store in local memory on one instance, like a user_sessions dictionary or a Python global, is tied to that instance only. It does not reliably persist across requests, because instances are started and shut down as load changes, and it is not visible to any other instance handling the same user's traffic.

This causes inconsistencies as soon as traffic grows enough that the load balancer spreads requests across multiple instances. A user might log in on one instance, get routed to a different instance for the next request, and find that their session no longer exists. Storing state in local memory on App Engine is one of the most common architecture mistakes, and the exam tests it directly.

The fix is to put shared state in external storage that all instances can read from. Three services come up regularly:

Firestore for structured, scalable data like user profiles and preferences.
Memorystore for low-latency session caching, which behaves like Memcache but is a fully managed Redis or Memcached service available outside App Engine too.
Cloud SQL for relational data that needs strong consistency.

The pattern to internalize is: do not trust local memory on App Engine for anything that needs to persist or be shared. Push session data, user state, and shared coordination into Firestore, Memorystore, or Cloud SQL. The instance can fail or be replaced at any moment, and the application has to keep working when that happens.
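The difference between local and shared state is easy to see in a small simulation. This is a minimal sketch in plain Python: the Instance class is hypothetical, and the shared dictionary is only a stand-in for an external service such as Firestore, Memorystore, or Cloud SQL.

```python
class Instance:
    """Simulates one App Engine instance handling requests."""

    def __init__(self, shared_store):
        self.user_sessions = {}  # local memory: visible to this instance only
        self.shared_store = shared_store  # stand-in for external storage

    def login_local(self, user_id):
        self.user_sessions[user_id] = {"logged_in": True}

    def is_logged_in_local(self, user_id):
        return user_id in self.user_sessions

    def login_shared(self, user_id):
        self.shared_store[user_id] = {"logged_in": True}

    def is_logged_in_shared(self, user_id):
        return user_id in self.shared_store


shared = {}  # one store that every instance reads and writes
instance_a = Instance(shared)
instance_b = Instance(shared)

# The user logs in on instance A, then the next request lands on instance B.
instance_a.login_local("user-1")
lost_session = not instance_b.is_logged_in_local("user-1")

instance_a.login_shared("user-2")
kept_session = instance_b.is_logged_in_shared("user-2")
```

Here lost_session and kept_session both end up True: the session written to local memory is invisible to the second instance, while the one written to the shared store survives the hop. Replacing instance_a with a fresh Instance would lose the local session in the same way, which mirrors an instance being shut down and restarted.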

This is also why Memcache and Memorystore are not the same thing in exam answers. Memcache is App Engine's built-in cache. Memorystore is a separate Google Cloud service for managed Redis or Memcached that any workload, App Engine or otherwise, can use for shared state across instances.

How these four topics connect

Traffic splitting, dependency management, Memcache, and state management are four different operational concerns, but the exam groups them together under App Engine. Traffic splitting is how you ship safely. Dependency management is how you avoid breaking the runtime. Memcache is how you reduce database load. Statelessness is the rule that forces you to use external storage in the first place. If you can reason about all four, App Engine questions on the Professional Cloud Architect exam become much more predictable.

My Professional Cloud Architect course covers App Engine traffic splitting, dependency management, Memcache, and state management alongside the rest of the containers and serverless material.
