The big-picture journey every scalable system takes. This is the mental model that frames every design decision in this course.
Every massive system started small. Twitter, Instagram, Uber — all began as a single server. Understanding the evolutionary path from Day 1 to 100 million users gives you a mental map for when to add which components — and why.
Everything runs on one machine: web server, application logic, and database. This is fine for a few thousand users. As traffic grows, CPU and memory are usually the first limits you hit.
Bottleneck: One machine does everything. It will fall over under load, and a single crash takes down the whole system, which makes it a single point of failure (SPOF).
Move the database to its own server. Now the web tier and the database can scale independently: for example, you can vertically scale the DB (more RAM for indexes) without touching the web server.
Bottleneck: Still one web server (SPOF), and the DB is still a SPOF.
Add a load balancer in front of multiple web servers, so you can add web servers to handle more traffic. Add read replicas alongside the primary DB: reads (often 80–90% of traffic in read-heavy apps) go to the replicas, and writes go to the primary.
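To make the routing concrete, here is a minimal sketch of the two decisions described above: round-robin selection across web servers, and a read/write split at the database. The class names, the server names, and the "starts with SELECT" test are illustrative assumptions, not a production routing policy.

```python
import itertools


class RoundRobinBalancer:
    """Cycles through a fixed list of backend names (hypothetical helper)."""

    def __init__(self, backends):
        if not backends:
            raise ValueError("at least one backend is required")
        self._cycle = itertools.cycle(list(backends))

    def pick(self):
        # Each call returns the next backend, wrapping around at the end.
        return next(self._cycle)


class DbRouter:
    """Sends writes to the primary and spreads reads across replicas."""

    def __init__(self, primary, replicas):
        self.primary = primary
        # Assumption: with no replicas configured, reads fall back to the primary.
        self._reads = RoundRobinBalancer(replicas or [primary])

    def route(self, sql):
        # Naive classification for illustration: only SELECT counts as a read.
        is_read = sql.lstrip().upper().startswith("SELECT")
        return self._reads.pick() if is_read else self.primary


if __name__ == "__main__":
    web = RoundRobinBalancer(["web-1", "web-2", "web-3"])
    db = DbRouter("db-primary", ["db-replica-1", "db-replica-2"])
    print(web.pick(), web.pick())
    print(db.route("SELECT * FROM users"))
    print(db.route("UPDATE users SET name = 'x'"))
```

In a real deployment the load balancer is usually a separate component rather than application code, and a read issued right after a write may need to go to the primary if replication lag matters.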
The web tier is stateless and easy to scale horizontally. Database scaling is hard, so you squeeze as much as you can from the web tier first, then tackle the DB.
Add a cache (Redis) between the web servers and the database. Frequently read data is served from the cache and never hits the DB, which can cut DB read load by 80% or more in read-heavy apps.
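One common way to wire a cache in is the cache-aside pattern: check the cache first, fall back to the database on a miss, then populate the cache. The sketch below uses an in-memory TTLCache as a stand-in for Redis and a plain dict as a stand-in for the database, so it runs without any server. All class and method names here are hypothetical.

```python
import time


class TTLCache:
    """In-memory stand-in for a cache such as Redis (single process only)."""

    def __init__(self, ttl_seconds=60.0, clock=time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock
        self._items = {}  # key -> (expires_at, value)

    def get(self, key):
        entry = self._items.get(key)
        if entry is None:
            return None
        expires_at, value = entry
        if self._clock() >= expires_at:
            # Expired entries are dropped lazily on read.
            del self._items[key]
            return None
        return value

    def set(self, key, value):
        self._items[key] = (self._clock() + self._ttl, value)

    def delete(self, key):
        self._items.pop(key, None)


class UserStore:
    """Cache-aside reads; writes go to the DB and invalidate the cache."""

    def __init__(self, db, cache):
        self.db = db          # dict standing in for the database
        self.cache = cache
        self.db_reads = 0     # counts how often a read reached the DB

    def get_user(self, user_id):
        key = f"user:{user_id}"
        cached = self.cache.get(key)
        if cached is not None:
            return cached                 # cache hit: the DB is not touched
        self.db_reads += 1                # cache miss: read from the DB
        row = self.db.get(user_id)
        if row is not None:
            self.cache.set(key, row)
        return row

    def update_user(self, user_id, row):
        self.db[user_id] = row
        # Invalidate so the next read reloads the fresh row.
        self.cache.delete(f"user:{user_id}")
```

The db_reads counter exists only to show the effect: repeated reads of the same user reach the database once until the entry expires or is invalidated.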
At massive scale, you shard your database horizontally (split data across multiple DB servers). Add a CDN to serve static assets from edge nodes close to users. Add message queues to decouple slow async work from the request path.
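As a rough illustration of horizontal sharding, the sketch below picks a shard by hashing the record key and taking it modulo the shard count. This is only one possible scheme, and both shard_for and ShardedStore are hypothetical names. Plain dicts stand in for the separate DB servers.

```python
import hashlib


def shard_for(key: str, shard_count: int) -> int:
    """Maps a key to a shard index in [0, shard_count)."""
    if shard_count <= 0:
        raise ValueError("shard_count must be positive")
    # A cryptographic digest gives a hash that is stable across processes,
    # unlike Python's built-in hash() for strings.
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % shard_count


class ShardedStore:
    """Each dict stands in for one database server."""

    def __init__(self, shard_count: int):
        self.shards = [{} for _ in range(shard_count)]

    def put(self, key: str, value) -> None:
        self.shards[shard_for(key, len(self.shards))][key] = value

    def get(self, key: str):
        return self.shards[shard_for(key, len(self.shards))].get(key)


if __name__ == "__main__":
    store = ShardedStore(4)
    for user_id in ("alice", "bob", "carol"):
        store.put(user_id, {"id": user_id})
    print([len(shard) for shard in store.shards])
```

Modulo hashing is simple, but changing the shard count remaps most keys. Schemes such as consistent hashing are often used to limit that movement.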
At this stage you also separate your monolith into services (not always microservices — "services" can be coarse-grained).
To scale the web tier horizontally, servers must be stateless — they must not store any user session data locally. Store sessions in a shared cache (Redis) instead. This way any request can go to any server.
Stateful (sessions stored on each server):
Server 1: stores User A's session
Server 2: stores User B's session
Problem: User A's next request MUST go to Server 1 (sticky sessions). Can't freely load balance.

Stateless (sessions stored in Redis):
Server 1: stateless, reads session from Redis
Server 2: stateless, reads session from Redis
User A's request can go to any server. Add servers freely. Remove servers freely.
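The stateless setup can be sketched in a few lines. The example below is a simplification: SessionStore is a dict-backed stand-in for a shared Redis instance, and WebServer is a hypothetical request handler. The point it shows is that the server objects hold no per-user data, so a session created through one server is readable from any other.

```python
import secrets


class SessionStore:
    """Shared session store; a dict stands in for Redis here."""

    def __init__(self):
        self._data = {}

    def create(self, user):
        session_id = secrets.token_hex(16)  # random, hard-to-guess ID
        self._data[session_id] = {"user": user}
        return session_id

    def load(self, session_id):
        return self._data.get(session_id)


class WebServer:
    """Keeps no per-user state; all session data lives in the shared store."""

    def __init__(self, name, sessions):
        self.name = name
        self.sessions = sessions

    def login(self, user):
        return self.sessions.create(user)

    def handle(self, session_id):
        session = self.sessions.load(session_id)
        if session is None:
            return f"{self.name}: 401 not logged in"
        return f"{self.name}: hello {session['user']}"


if __name__ == "__main__":
    shared = SessionStore()
    server_1 = WebServer("server-1", shared)
    server_2 = WebServer("server-2", shared)
    sid = server_1.login("User A")   # session created via server 1
    print(server_2.handle(sid))      # served by server 2 without stickiness
```

Because neither server keeps session data, either one can be removed or replaced without logging users out, provided the shared store stays available.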