← Course Index

Design a Rate Limiter

~25 min · Case Studies · Alex Xu Vol 1, Ch 4

Ref
Primary Source
Alex Xu Vol 1, Chapter 4 — "Design a Rate Limiter" (Deep Dive)

Presents a complete end-to-end case study on building a production-grade distributed rate limiter.

Step 1: Clarify Requirements

In the interview, establish the scope immediately:

Step 2: Database Schema & Rules Storage

Rate limiting rules are typically written in configuration files (YAML) and loaded into server memory or cached in a relational database for management.

# Example Rate Limiting Rules (YAML)
domain: auth_service
rules:
  - key: user_id
    endpoint: /login
    rate_limit: 5/minute
  - key: ip_address
    endpoint: /signup
    rate_limit: 2/minute

Step 3: Atomic Checks with Redis Lua Scripting

As covered in Lesson 16, race conditions happen when checking and updating the token count are split into separate operations. We solve this using **Redis Lua scripts** which run as a single atomic unit.

Token Bucket Lua Script
-- KEYS[1]: User/IP rate limit key
-- ARGV[1]: Max tokens capacity
-- ARGV[2]: Fill rate (tokens/sec)
-- ARGV[3]: Request cost (usually 1)
-- ARGV[4]: Current timestamp (seconds)

local key = KEYS[1]
local limit = tonumber(ARGV[1])
local rate = tonumber(ARGV[2])
local cost = tonumber(ARGV[3])
local now = tonumber(ARGV[4])

-- Load bucket state
local state = redis.call('HMGET', key, 'tokens', 'last_updated')
local tokens = tonumber(state[1])
local last_updated = tonumber(state[2])

if not tokens then
    -- Init bucket
    tokens = limit
    last_updated = now
else
    -- Add refilled tokens
    local elapsed = now - last_updated
    tokens = math.min(limit, tokens + elapsed * rate)
    last_updated = now
end

-- Allow request?
if tokens >= cost then
    tokens = tokens - cost
    redis.call('HMSET', key, 'tokens', tokens, 'last_updated', last_updated)
    redis.call('EXPIRE', key, 3600) -- Clean up after 1 hr
    return 1 -- Allowed
else
    return 0 -- Rejected
end
How it runs in Redis
The script is uploaded to Redis once:
SCRIPT LOAD "..."
→ Returns sha1_hash

For every incoming API request:
EVALSHA sha1_hash 1 "rate:ip_1.2.3.4" 10 0.5 1 1719211000

Advantages:
1. One network round-trip.
2. Completely atomic — Redis is single-threaded, 
   so no other request can modify token counts 
   in the middle of execution.

Step 4: Handling Failures (Fail-Open vs Fail-Closed)

What happens if the Redis cluster becomes unavailable or slow? Your rate limiter middleware must handle this exception gracefully.

💡 Client Integration: Exponential Backoff & Jitter

When a client receives a 429 status, they should retry using Exponential Backoff with Jitter. Instead of retrying every 1 second (which creates a retry storm that crushes your database), clients double their wait time on each attempt and add a random delay (jitter) to distribute the load.

Check Your Understanding

1. In a production rate limiter, why is a "Fail-Open" strategy typically chosen when the Redis cache becomes unreachable?
2. How does using a Redis Lua script prevent race conditions?
3. Why should client SDKs include "jitter" (random noise) along with exponential backoff when retrying failed requests?