A Decent Cloud Web Architecture

What architecture is for

For an app with five users, “architecture” is a strong word — AWS, a ThinkPad, or a Raspberry Pi will all work. But if the target is a cloud web application, the goals are throughput and availability.

Geography

A service on a single machine has no architecture worth discussing. With multiple machines, location starts to matter. Between users and services sits a load balancer that fans incoming requests out to the service instances.
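The fan-out step can be pictured with a minimal round-robin sketch. This is only an illustration of the idea, not how any particular cloud balancer is implemented; the class name and the instance names are hypothetical.

```python
from itertools import cycle
from typing import Iterable, Iterator, List


class RoundRobinBalancer:
    """Hands incoming requests to service instances in a fixed rotation."""

    def __init__(self, instances: Iterable[str]) -> None:
        self._instances: List[str] = list(instances)
        if not self._instances:
            raise ValueError("at least one instance is required")
        # cycle() repeats the instance list forever, in order.
        self._rotation: Iterator[str] = cycle(self._instances)

    def pick(self) -> str:
        """Return the instance that should receive the next request."""
        return next(self._rotation)
```

Calling pick() once per incoming request walks through the instances in order and wraps around, so every instance receives the same share of traffic regardless of how busy it is.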

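Every cloud provider divides its infrastructure into regions, each made up of several “availability zones” (AZs). In simple cases you just pick the region closest to your primary user base. AWS’s ELB is a regional service: one balancer can span the AZs of a single region, but not several regions. Route 53 is global, but it is DNS, not a load balancer. That doesn’t mean AWS can’t serve worldwide traffic; it means running a balancer per region and steering users between them with DNS. Note also that the IP addresses behind a Classic or Application Load Balancer’s DNS name are dynamic and not under your control.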

GCP does offer a global load balancer that gives you a single IP serving everywhere (Global forwarding rules). For software with genuinely global users, GCP is simpler.

Comparison

GCP

Short version: GCP’s load-balancing story is broad. Three balancers:

HTTP/HTTPS
Proxy: supports TCP and SSL
Network: supports TCP and UDP; no global availability

Use HTTP or Proxy for user-facing entry points; use Network for inter-service traffic.

A few bonuses besides global availability:

The HTTP/HTTPS balancer supports WebSocket.
Smarter distribution algorithms. ELB gets little load information from the Auto Scaling Group behind it, so it spreads requests more or less evenly; GCP can balance on signals such as backend CPU utilization.
The HTTP balancer can route by URL across instances (and across zones).
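The last two points can be sketched together: choose a backend group by URL prefix, then choose the least busy instance in that group. This is a toy model under stated assumptions, not GCP's actual algorithm; the Instance type, the url_map shape, and every name below are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Instance:
    name: str
    cpu_utilization: float  # 0.0 to 1.0, assumed to come from monitoring


def pick_backend_group(path: str, url_map: Dict[str, str], default_group: str) -> str:
    """Return the group whose path prefix is the longest match for `path`."""
    best_prefix = ""
    best_group = default_group
    for prefix, group in url_map.items():
        if path.startswith(prefix) and len(prefix) > len(best_prefix):
            best_prefix, best_group = prefix, group
    return best_group


def pick_instance(instances: List[Instance]) -> Instance:
    """Return the instance with the lowest reported CPU utilization."""
    if not instances:
        raise ValueError("backend group has no instances")
    return min(instances, key=lambda inst: inst.cpu_utilization)


def route(path: str, url_map: Dict[str, str],
          groups: Dict[str, List[Instance]], default_group: str) -> Instance:
    """Route by URL first, then by load within the chosen group."""
    group = pick_backend_group(path, url_map, default_group)
    return pick_instance(groups[group])
```

With a map such as {"/api/": "backend"} and a default group "web", a request for /api/users lands on the least loaded backend instance, while /index.html falls through to the web group.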

Both Proxy and HTTP/HTTPS forward HTTP — what’s the difference?

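Both can sit in front of HTTP traffic. The difference is the layer they work at. The HTTP/HTTPS balancer understands HTTP, so it can route by URL and hand the user’s address to the backend in the X-Forwarded-For header. The Proxy balancers work at the TCP/SSL level: they terminate the user’s connection and open a new one to the backend, so by default the backend loses the user’s connection info (the PROXY protocol can be enabled to carry it).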
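When a balancer that speaks HTTP forwards a request, the backend sees the balancer's address on the TCP connection, and the user's address typically arrives in the X-Forwarded-For header instead. Below is a minimal sketch of reading it. It assumes each trusted proxy appends an address on the right, so the user's address sits just before the trailing proxy entries; the function name and the trusted_proxy_count parameter are hypothetical, and the exact header layout should be checked against the provider's documentation.

```python
from typing import List, Optional


def client_ip_from_xff(header_value: str, trusted_proxy_count: int = 1) -> Optional[str]:
    """Return the user's address from an X-Forwarded-For value.

    trusted_proxy_count is the number of trailing entries that are known
    to be proxy addresses (an assumption about the deployment). Entries
    further left may have been supplied by the client and are not trusted.
    """
    if trusted_proxy_count < 0:
        raise ValueError("trusted_proxy_count must not be negative")
    entries: List[str] = [part.strip() for part in header_value.split(",") if part.strip()]
    index = len(entries) - trusted_proxy_count - 1
    if index < 0:
        return None  # header is shorter than the proxy chain we expect
    return entries[index]
```

Reading from the right matters because anything to the left of the trusted entries can be forged by the caller.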

AWS ELB

First: no ELB type spans regions. That’s the premise. For a global-facing service, you shouldn’t (not “can’t”) rely on ELB alone. Other weak spots: distribution algorithm, cross-origin configuration, and a few more.

Switching balancers on an existing project isn’t easy, and GCP’s offerings launched significantly later than ELB — none of this is a broadside against ELB, just a list of actual gaps.

Physical layout

Load balancing assumes stateless services — we don’t care which box catches which request. That said, ELB will reuse a Keep-Alive TCP connection where it can, which saves a handshake per request.
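A small sketch of what “stateless” means in practice: the instances keep nothing between requests, and all shared data lives in an external store. The SharedStore class below is a hypothetical in-memory stand-in for something like Redis, used only so the example runs on its own.

```python
from typing import Dict


class SharedStore:
    """In-memory stand-in for an external store (hypothetical)."""

    def __init__(self) -> None:
        self._counters: Dict[str, int] = {}

    def increment(self, key: str) -> int:
        """Add one to the counter stored under `key` and return the new value."""
        self._counters[key] = self._counters.get(key, 0) + 1
        return self._counters[key]


class StatelessInstance:
    """A service instance that holds no per-user data of its own."""

    def __init__(self, name: str, store: SharedStore) -> None:
        self.name = name
        self._store = store

    def handle_visit(self, user_id: str) -> int:
        """Record a visit and return the user's total visit count."""
        return self._store.increment(f"visits:{user_id}")
```

Because the count lives in the store, it does not matter whether the balancer sends a user's first request to one instance and the second to another; the answer is the same either way.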

Example: a typical web app

Suppose:

Frontend: React + webpack, packed into bundle.js, index.html, style.css; Express does the SSR (ignoring isomorphism for now).
Backend: a Spring Boot service exposing stateless RESTful APIs. Data comes from a Redis instance and a MongoDB cluster.
A Scala ETL service runs a few times a day, loading data into MongoDB and pulling out anything not yet synced.

How it’s laid out:

Frontend and backend both expose user-facing endpoints. Host them on different subdomains under the same domain. Each points at its own load balancer.
Each balancer fronts an Auto Scaling Group: scales up on pressure, replaces dead instances.
Inside the VPC, the backend talks to Redis and MongoDB over private IPs.
The ETL service lives in the same subnet and exposes nothing externally — no public endpoint, no public IP.
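The layout above can be written down as data and checked mechanically. This is a sketch under stated assumptions, not a deployment tool: the Component type, the example.com hostnames, and the two rules in layout_problems are all hypothetical illustrations of the list above.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class Component:
    name: str
    internal_only: bool = False            # must never be reachable from outside
    public_hostname: Optional[str] = None  # hypothetical subdomain, if any
    behind_load_balancer: bool = False
    talks_to: Tuple[str, ...] = ()         # private-IP dependencies inside the VPC


LAYOUT: List[Component] = [
    Component("frontend", public_hostname="www.example.com", behind_load_balancer=True),
    Component("backend", public_hostname="api.example.com", behind_load_balancer=True,
              talks_to=("redis", "mongodb")),
    Component("redis", internal_only=True),
    Component("mongodb", internal_only=True),
    Component("etl", internal_only=True, talks_to=("mongodb",)),
]


def layout_problems(components: List[Component]) -> List[str]:
    """Return a description of every rule the layout breaks (empty if none)."""
    problems: List[str] = []
    for c in components:
        if c.internal_only and c.public_hostname is not None:
            problems.append(f"{c.name}: internal-only component has a public hostname")
        if c.public_hostname is not None and not c.behind_load_balancer:
            problems.append(f"{c.name}: public endpoint is not behind a load balancer")
    return problems
```

Running layout_problems(LAYOUT) on the description above returns an empty list; giving the ETL service a public hostname would be reported as a problem.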