Presents a complete end-to-end case study on building a production-grade distributed rate limiter.
In the interview, establish the scope immediately:
429 Too Many Requests.Rate limiting rules are typically written in configuration files (YAML) and loaded into server memory or cached in a relational database for management.
# Example Rate Limiting Rules (YAML)
domain: auth_service
rules:
- key: user_id
endpoint: /login
rate_limit: 5/minute
- key: ip_address
endpoint: /signup
rate_limit: 2/minute
As covered in Lesson 16, race conditions happen when checking and updating the token count are split into separate operations. We solve this using **Redis Lua scripts** which run as a single atomic unit.
-- KEYS[1]: User/IP rate limit key
-- ARGV[1]: Max tokens capacity
-- ARGV[2]: Fill rate (tokens/sec)
-- ARGV[3]: Request cost (usually 1)
-- ARGV[4]: Current timestamp (seconds)
local key = KEYS[1]
local limit = tonumber(ARGV[1])
local rate = tonumber(ARGV[2])
local cost = tonumber(ARGV[3])
local now = tonumber(ARGV[4])
-- Load bucket state
local state = redis.call('HMGET', key, 'tokens', 'last_updated')
local tokens = tonumber(state[1])
local last_updated = tonumber(state[2])
if not tokens then
-- Init bucket
tokens = limit
last_updated = now
else
-- Add refilled tokens
local elapsed = now - last_updated
tokens = math.min(limit, tokens + elapsed * rate)
last_updated = now
end
-- Allow request?
if tokens >= cost then
tokens = tokens - cost
redis.call('HMSET', key, 'tokens', tokens, 'last_updated', last_updated)
redis.call('EXPIRE', key, 3600) -- Clean up after 1 hr
return 1 -- Allowed
else
return 0 -- Rejected
end
The script is uploaded to Redis once:
SCRIPT LOAD "..."
→ Returns sha1_hash
For every incoming API request:
EVALSHA sha1_hash 1 "rate:ip_1.2.3.4" 10 0.5 1 1719211000
Advantages:
1. One network round-trip.
2. Completely atomic — Redis is single-threaded,
so no other request can modify token counts
in the middle of execution.
What happens if the Redis cluster becomes unavailable or slow? Your rate limiter middleware must handle this exception gracefully.
When a client receives a 429 status, they should retry using Exponential Backoff with Jitter. Instead of retrying every 1 second (which creates a retry storm that crushes your database), clients double their wait time on each attempt and add a random delay (jitter) to distribute the load.