
Design a News Feed (Twitter / Instagram)

~25 min · Case Studies · Alex Xu Vol 1, Ch 11

Primary Source: Alex Xu Vol 1, Chapter 11, "Design a News Feed System"

A high-frequency system design question focusing on read/write fan-out scaling, caching strategies, and handling the "celebrity problem".

What is a News Feed?

A news feed is a constantly updating list of status updates, photos, and videos from entities a user follows. The system consists of two major pipelines:

  1. Feed Publishing: When a user publishes a post, the post is written to the database and propagated to all followers' feeds.
  2. Feed Generation / Retrieval: When a user refreshes their homepage, the system gathers, ranks, and returns their news feed list.

The Fan-Out Models

Fan-out is the process of delivering a post to all followers. The choice of fan-out model is the most important decision in this interview:

1. Fan-out-on-Write (Push Model)

When a user posts, the post ID is pushed immediately to every follower's precomputed feed list (stored in a fast Redis cache).

✓ Read path is extremely fast: effectively O(1) in the number of accounts followed, because the server just fetches the user's cached feed.
✗ Writes are slow for users with millions of followers (e.g. celebrities). Replicating one post to 50 million feeds creates massive write latency and bottlenecks.
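
The push model above can be sketched in a few lines. This is a minimal in-memory illustration, not a production design: the `PushFeedStore` class, its method names, and the `feed_limit` cap are hypothetical, and a plain `deque` stands in for the Redis list mentioned above.

```python
from collections import defaultdict, deque


class PushFeedStore:
    """In-memory stand-in for the follower graph and the per-user feed cache."""

    def __init__(self, feed_limit=500):
        # author_id -> set of follower ids
        self.followers = defaultdict(set)
        # user_id -> newest-first post IDs; a bounded deque drops the oldest
        # entry from the far end once feed_limit is reached (assumed cap)
        self.feeds = defaultdict(lambda: deque(maxlen=feed_limit))

    def follow(self, follower_id, author_id):
        self.followers[author_id].add(follower_id)

    def publish(self, author_id, post_id):
        """Write path: one cache write per follower. Returns the write count."""
        targets = self.followers[author_id]
        for follower_id in targets:
            self.feeds[follower_id].appendleft(post_id)
        return len(targets)

    def read_feed(self, user_id, limit=20):
        """Read path: slice the precomputed list, with no aggregation."""
        return list(self.feeds[user_id])[:limit]
```

The return value of `publish` makes the trade-off visible: the write cost grows with the follower count, while `read_feed` never looks at the follower graph at all.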

2. Fan-out-on-Read (Pull Model)

When a user loads their homepage, the server queries the graph DB for their following list, fetches all posts for those users, and merges and sorts them in memory.

✓ Write path is O(1): simply append to the author's post history.
✓ Post IDs are never duplicated across caches.
✗ Read path is slow. Aggregating and sorting posts for hundreds of users on every refresh is highly CPU- and DB-intensive.
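
The pull model can be sketched the same way. This is an illustrative in-memory version under stated assumptions: the `PullFeedStore` class and its methods are hypothetical, dictionaries stand in for the graph DB and the post store, and each author's timestamps are assumed to be non-decreasing so that every timeline is already sorted.

```python
import heapq
import itertools
from collections import defaultdict


class PullFeedStore:
    """In-memory stand-in for the graph DB and per-author post history."""

    def __init__(self):
        # user_id -> set of authors they follow
        self.following = defaultdict(set)
        # author_id -> [(timestamp, post_id)], oldest first
        self.timelines = defaultdict(list)

    def follow(self, follower_id, author_id):
        self.following[follower_id].add(author_id)

    def publish(self, author_id, post_id, timestamp):
        """Write path: a single append, independent of follower count."""
        # Assumes timestamps never decrease for a given author.
        self.timelines[author_id].append((timestamp, post_id))

    def read_feed(self, user_id, limit=20):
        """Read path: k-way merge over every followed author's timeline."""
        # reversed() turns each ascending timeline into a newest-first stream,
        # which is the order heapq.merge expects when reverse=True.
        streams = [reversed(self.timelines[a]) for a in self.following[user_id]]
        merged = heapq.merge(*streams, reverse=True)
        return [post_id for _, post_id in itertools.islice(merged, limit)]
```

Here the work has moved to `read_feed`: every refresh touches one timeline per followed account, which is the read-side cost listed above.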

The Hybrid Solution (Interview Standard)

Modern social media systems combine both approaches to handle standard users and celebrities efficiently:

[Diagram: Hybrid Fan-Out. Push path: Regular User Post → Fan-out Service (Workers & Queue) → Follower Feed Cache (stored in Redis Lists). Pull path: Celebrity Post → Celebrity Posts cache (fetched dynamically on read).]
Hybrid Fan-Out: Regular posts are pushed to follower caches; celebrity posts are stored separately and pulled on read.
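
A rough sketch of how the two paths might be combined is shown below. Everything here is an assumption made for illustration: the `HybridFeedStore` class, the follower-count rule in `is_celebrity`, and the `CELEBRITY_THRESHOLD` value are hypothetical, and real systems may classify accounts differently.

```python
import heapq
from collections import defaultdict

CELEBRITY_THRESHOLD = 10_000  # hypothetical cutoff, chosen only for illustration


class HybridFeedStore:
    """Push for regular authors, pull for celebrity authors (in-memory sketch)."""

    def __init__(self, celebrity_threshold=CELEBRITY_THRESHOLD):
        self.threshold = celebrity_threshold
        self.followers = defaultdict(set)         # author_id -> follower ids
        self.following = defaultdict(set)         # user_id -> followed authors
        self.feed_cache = defaultdict(list)       # user_id -> pushed (ts, post_id)
        self.celebrity_posts = defaultdict(list)  # author_id -> (ts, post_id)

    def follow(self, follower_id, author_id):
        self.followers[author_id].add(follower_id)
        self.following[follower_id].add(author_id)

    def is_celebrity(self, author_id):
        # Assumed rule: classify by current follower count.
        return len(self.followers[author_id]) >= self.threshold

    def publish(self, author_id, post_id, timestamp):
        """Returns which path handled the post: 'push' or 'pull'."""
        entry = (timestamp, post_id)
        if self.is_celebrity(author_id):
            # Pull path: one write, no per-follower fan-out.
            self.celebrity_posts[author_id].append(entry)
            return "pull"
        # Push path: copy the entry into every follower's feed cache.
        for follower_id in self.followers[author_id]:
            self.feed_cache[follower_id].append(entry)
        return "push"

    def read_feed(self, user_id, limit=20):
        """Merge the precomputed feed with followed celebrities' posts."""
        candidates = list(self.feed_cache[user_id])
        for author_id in self.following[user_id]:
            if author_id in self.celebrity_posts:
                candidates.extend(self.celebrity_posts[author_id])
        # nlargest returns the newest entries first, ordered by timestamp.
        return [post_id for _, post_id in heapq.nlargest(limit, candidates)]
```

On the read side, only the celebrity accounts a user follows add merge work, so most of the feed still comes straight from the precomputed cache.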

Cache Architecture

Caching is what makes news feeds scale. Standard caches include:

Check Your Understanding

1. In social architectures, what is the "Celebrity Problem" (hotkey issue)?
2. How does the hybrid fan-out model resolve the celebrity issue?
3. Why is it best practice to store only Post IDs in the News Feed Cache list, rather than complete post bodies?