Explores the mathematical algorithms behind limiters and how to build a high-performance distributed rate limiter using Redis.
A rate limiter controls the rate of traffic sent by a client or service. If a client exceeds the limit, the server rejects their request with an HTTP status code 429 Too Many Requests. This protects backends from:
Different use cases call for different algorithms. Here are the 5 classic algorithms:
| Algorithm | How it Works | Pros / Cons |
|---|---|---|
| Token Bucket Used by Amazon, Stripe |
A bucket holds tokens. Tokens are added at a constant rate. Each request takes a token. If empty, request is dropped. | ✓ Handles short bursts of traffic ✓ Simple to implement in Redis |
| Leaky Bucket Used by Shopify |
Requests enter a queue. The queue drains (processes requests) at a constant rate. If queue fills, requests drop. | ✓ Guarantees smooth, stable output rate ✗ Bursts are delayed rather than instant |
| Fixed Window Counter | Divides time into windows (e.g., 1 min). Tracks request counts per window. Resets count at start of next window. | ✗ Traffic spike at boundaries can double the allowed limit (e.g., 200 requests in 2 seconds) |
| Sliding Window Log | Keeps timestamps of all client requests in a sorted set. Removes expired timestamps. Count is size of set. | ✗ Memory-intensive: stores every single timestamp even if request is blocked |
| Sliding Window Counter | Combines fixed window and sliding window. Estimates rate by computing weight of current and previous window counts. | ✓ Low memory footprint, smooths out boundary spikes |
In a distributed system, a client's requests hit different API servers. Therefore, counters must be stored in a centralized, ultra-low-latency cache like Redis rather than local server memory.
Good APIs inform clients about their rate limit state using response headers:
X-RateLimit-Limit: The maximum number of allowed requests in the time window.X-RateLimit-Remaining: The remaining number of allowed requests in the current window.X-RateLimit-Reset: The Unix epoch time when the rate limit window resets.Retry-After: The number of seconds the client must wait before making another request (sent on 429 errors).