Gaurav's explanation of consistent hashing and load balancing is one of the clearest available. Watch it alongside this lesson.
A load balancer sits in front of a pool of servers and distributes incoming traffic across them. It solves two problems simultaneously: scalability (spread load so no server is overwhelmed) and availability (if a server dies, route around it).
A Layer 4 (L4) load balancer routes based on IP address and port.
Cannot see HTTP content — no path routing.
Very fast — minimal processing per packet.
Use when:
- Raw TCP throughput matters
- Non-HTTP protocols (DB connections)
- Maximum performance is critical
Examples: AWS NLB, HAProxy (TCP mode)
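To make the "IP address and port only" point concrete, here is a minimal sketch of one way an L4 balancer can pick a backend: hash the connection's flow tuple so every packet of that connection lands on the same server. The backend addresses and the `pick_backend` helper are hypothetical, and real L4 balancers may keep a connection table instead of hashing on every packet.

```python
import hashlib

# Hypothetical pool of database backends (addresses are made up for illustration).
BACKENDS = ["10.0.0.1:5432", "10.0.0.2:5432", "10.0.0.3:5432"]


def pick_backend(src_ip, src_port, dst_ip, dst_port, proto="tcp"):
    """Choose a backend using only transport-level fields, never HTTP content."""
    key = f"{proto}|{src_ip}|{src_port}|{dst_ip}|{dst_port}".encode()
    # Use a stable digest: Python's built-in hash() is salted per process.
    digest = hashlib.sha256(key).digest()
    return BACKENDS[int.from_bytes(digest[:8], "big") % len(BACKENDS)]


chosen = pick_backend("203.0.113.7", 51234, "198.51.100.10", 5432)
```

Nothing in the decision depends on a URL or header, which is why this layer is cheap per packet but cannot do path routing.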
A Layer 7 (L7) load balancer routes based on URL path, headers, and cookies.
Can inspect content → smarter routing.
Supports SSL termination, rate limiting.
Use when:
- Path-based routing (/api vs /static)
- Sticky sessions via cookies
- Content-based decisions
Examples: NGINX, AWS ALB, Traefik
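The path-based decision can be sketched in a few lines. This is an illustration only: the pool names, the `ROUTES` table, and the longest-prefix rule are assumptions, and proxies such as NGINX express the same idea in configuration rather than application code.

```python
# Hypothetical routing table: URL path prefix -> backend pool.
ROUTES = {
    "/api": ["api-1:8080", "api-2:8080"],
    "/static": ["static-1:8080"],
}
DEFAULT_POOL = ["web-1:8080"]


def pick_pool(path: str) -> list[str]:
    """Return the pool whose prefix matches the path; the longest prefix wins."""
    # Match on whole path segments so "/apiary" does not match "/api".
    matches = [p for p in ROUTES if path == p or path.startswith(p + "/")]
    if not matches:
        return DEFAULT_POOL
    return ROUTES[max(matches, key=len)]
```

Once a pool is chosen, any of the balancing algorithms can pick a server inside it.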
Default to L7 in interviews unless you have a specific reason for L4. L7 gives you much more control. Say "I'll use an L7 load balancer since we need path-based routing between our API and static file servers."
| Algorithm | How It Works | Best For |
|---|---|---|
| Round Robin | Requests go to each server in sequence: A, B, C, A, B, C… | Servers with equal capacity and stateless requests |
| Weighted Round Robin | Like round robin, but server A might get 2× the traffic of server B based on capacity | Heterogeneous server fleet |
| Least Connections | Route to the server with the fewest active connections | Variable-duration requests (some take 1ms, some 10s) |
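| IP Hash | Hash the client IP → same client goes to the same server as long as the pool is unchanged | Sticky sessions without cookies (fragile: resizing the pool remaps clients, and clients behind one NAT all land on the same server) |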
| Consistent Hashing | Hash ring distributes load with minimal reshuffling when servers are added/removed | Distributed caches, database routing |
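The selection rules in the table are small enough to sketch directly. The code below is a simplified illustration, not how any particular load balancer implements them: the helper names, the SHA-256-based `stable_hash`, and the choice of 100 virtual nodes per server are all assumptions.

```python
import bisect
import hashlib
import itertools


def stable_hash(value: str) -> int:
    """Deterministic 64-bit hash (Python's built-in hash() is salted per process)."""
    return int.from_bytes(hashlib.sha256(value.encode()).digest()[:8], "big")


def round_robin(servers):
    """Iterator that yields servers in a repeating sequence: A, B, C, A, ..."""
    return itertools.cycle(servers)


def least_connections(active: dict) -> str:
    """Pick the server with the fewest active connections."""
    return min(active, key=active.get)


def ip_hash(client_ip: str, servers: list) -> str:
    """Same client IP maps to the same server while the list is unchanged."""
    return servers[stable_hash(client_ip) % len(servers)]


class HashRing:
    """Consistent hash ring; virtual nodes smooth out the key distribution."""

    def __init__(self, servers, vnodes: int = 100):
        self.vnodes = vnodes
        self._ring = []  # sorted list of (point_on_ring, server)
        for server in servers:
            self.add(server)

    def add(self, server: str) -> None:
        for i in range(self.vnodes):
            bisect.insort(self._ring, (stable_hash(f"{server}#{i}"), server))

    def remove(self, server: str) -> None:
        self._ring = [(p, s) for p, s in self._ring if s != server]

    def lookup(self, key: str) -> str:
        """Return the first server at or after the key's position, wrapping around."""
        i = bisect.bisect_right(self._ring, (stable_hash(key), ""))
        return self._ring[i % len(self._ring)][1]


ring = HashRing(["A", "B", "C"])
owner = ring.lookup("user-42")  # stays the same until the ring changes near this key
```

The property that matters for caches is visible in `add`: a new server only inserts its own points on the ring, so the only keys that change owner are the ones the new server takes over. With `ip_hash`, by contrast, changing `len(servers)` changes the modulus for every client.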
Load balancers periodically probe backend servers (usually every 5–30 seconds) to verify they're healthy. If a server fails N consecutive checks, it's removed from the rotation. Once it passes checks again, it's re-added.
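The "N consecutive failures" rule amounts to a small state machine per server. The sketch below assumes a hypothetical `HealthTracker` class with a failure threshold of 3; the separate recovery threshold (2 consecutive successes before re-adding) is an added assumption, not something the lesson specifies. The probe itself is left out: the caller reports each result via `record`.

```python
class HealthTracker:
    """Tracks consecutive probe results per server (thresholds are assumptions)."""

    def __init__(self, servers, fail_threshold=3, rise_threshold=2):
        self.fail_threshold = fail_threshold  # the "N" consecutive failures
        self.rise_threshold = rise_threshold  # assumed: successes needed to re-add
        self.healthy = {s: True for s in servers}
        # Count of consecutive results that contradict the current state.
        self._streak = {s: 0 for s in servers}

    def record(self, server, ok: bool) -> None:
        if ok == self.healthy[server]:
            self._streak[server] = 0  # result agrees with state: streak is broken
            return
        self._streak[server] += 1
        limit = self.rise_threshold if ok else self.fail_threshold
        if self._streak[server] >= limit:
            self.healthy[server] = ok
            self._streak[server] = 0

    def in_rotation(self):
        return [s for s, up in self.healthy.items() if up]
```

Requiring consecutive results in both directions keeps a single dropped probe from removing a server and keeps a flapping server from bouncing in and out of the rotation.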
Active health check — LB probes servers directly (HTTP GET /health → 200 OK).
Passive health check — LB monitors real traffic responses; too many 5xx errors → mark unhealthy.
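A passive check needs no probes, only bookkeeping on responses that already pass through the balancer. This sketch assumes a hypothetical `PassiveMonitor` that keeps a sliding window of recent status codes per server and flags a server once more than half of a full window is 5xx; the window size and ratio are illustrative values.

```python
from collections import deque


class PassiveMonitor:
    """Marks a server unhealthy when too many recent responses are 5xx."""

    def __init__(self, window=20, max_error_ratio=0.5):
        self.window = window
        self.max_error_ratio = max_error_ratio
        self._recent = {}  # server -> deque of bools (True means a 5xx response)

    def observe(self, server, status_code: int) -> None:
        history = self._recent.setdefault(server, deque(maxlen=self.window))
        history.append(500 <= status_code <= 599)

    def is_healthy(self, server) -> bool:
        history = self._recent.get(server)
        if not history or len(history) < self.window:
            return True  # not enough traffic yet to judge
        return sum(history) / len(history) <= self.max_error_ratio
```

Because it relies on real traffic, a passive check cannot notice that an idle server has died, which is one reason the two kinds of check are often combined.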
Sometimes a user's subsequent requests must go to the same server (e.g., the server stores session state locally). The LB uses a cookie to pin that user to a specific server. Avoid sticky sessions when possible: they complicate scaling and defeat the purpose of stateless servers. Keep session state in a shared cache (such as Redis) instead, so any server can handle any request.
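A small sketch of the shared-store alternative: two servers read and write sessions through the same store, so the balancer is free to send each request anywhere. The `AppServer` class is hypothetical, and a plain dict stands in for the shared cache; in a real deployment that would be a networked store such as Redis.

```python
import uuid

# Stand-in for a shared cache such as Redis (assumption: a plain dict here).
SESSION_STORE = {}


class AppServer:
    """Stateless server: all session data lives in the shared store."""

    def __init__(self, name, store):
        self.name = name
        self.store = store

    def login(self, user: str) -> str:
        session_id = uuid.uuid4().hex
        self.store[session_id] = {"user": user}
        return session_id  # would be sent back to the client as a cookie

    def whoami(self, session_id: str):
        session = self.store.get(session_id)
        return session["user"] if session else None


server_a = AppServer("a", SESSION_STORE)
server_b = AppServer("b", SESSION_STORE)
sid = server_a.login("alice")
user_seen_by_b = server_b.whoami(sid)  # "alice", although server_a created the session
```

With this layout, removing `server_a` from the rotation loses no sessions, which is the scaling benefit sticky sessions give up.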