
Design YouTube / Video Streaming

~25 min · Case Studies · Alex Xu Vol 1, Ch 14

Primary Source
Alex Xu Vol 1, Chapter 14 — "Design YouTube"

Covers resilient file upload paths, video transcoding pipelines, CDN edge caching, and adaptive streaming protocols.

Video Streaming Mechanics

A video streaming service like YouTube or Netflix splits the system into two distinct workflows:

  1. Upload & Transcoding Pipeline: Accepting large video files and processing them into multiple resolutions and formats.
  2. Streaming Delivery Pipeline: Delivering chunks of video to clients dynamically based on network speeds.

1. Transcoding Pipeline (DAG)

A raw upload can be massive (e.g., a 10 GB raw 4K recording). A file that large cannot be streamed directly to mobile users on 4G networks, so we must transcode it into smaller resolutions and formats.
We build a **Directed Acyclic Graph (DAG)** pipeline to run processing tasks in parallel:

Diagram: Raw Video → Splitter → [Video Encoding | Audio Extract | Watermark] → Merger → Object Store
Transcoding Pipeline: Splitter divides files into chunks, which are processed in parallel and re-assembled by the Merger
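
The scheduling idea behind the DAG can be sketched in a few lines of Python. This is a minimal illustration, not the pipeline from the book: the task names and the dependency edges in `PIPELINE` are assumptions read off the diagram above, and `execution_waves` is a hypothetical helper. It uses the standard-library `graphlib.TopologicalSorter` (Python 3.9+) to group tasks into waves, where every task in a wave could run in parallel.

```python
from graphlib import TopologicalSorter

# Hypothetical task graph: each task maps to the set of tasks it depends on.
# The edges are an assumption based on the diagram, not a confirmed design.
PIPELINE = {
    "split": set(),
    "encode_video": {"split"},
    "extract_audio": {"split"},
    "watermark": {"split"},
    "merge": {"encode_video", "extract_audio", "watermark"},
    "store": {"merge"},
}

def execution_waves(graph):
    """Group tasks into waves; tasks in the same wave do not depend on each other."""
    sorter = TopologicalSorter(graph)
    sorter.prepare()  # raises graphlib.CycleError if the graph contains a cycle
    waves = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())  # all tasks whose dependencies are done
        waves.append(ready)
        sorter.done(*ready)  # mark the whole wave finished to unlock the next one
    return waves

waves = execution_waves(PIPELINE)
for number, tasks in enumerate(waves, start=1):
    print(f"wave {number}: {', '.join(tasks)}")
```

In a real system each wave would be handed to a worker pool or a message queue instead of being printed. The point of the sketch is that the graph, not hand-written ordering logic, decides what may run concurrently.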

2. Streaming Protocol & Delivery

The client does not download the entire video file at once. Instead, it streams small, sequential chunks (typically 2-10 seconds long). Two popular streaming protocols are **HLS** (HTTP Live Streaming, developed by Apple) and **DASH** (Dynamic Adaptive Streaming over HTTP).
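
To make the chunking concrete, here is a small sketch of how a video's timeline might be divided into sequential chunks. The function name `segment_ranges` and the 5-second default are assumptions chosen for illustration. Real encoders usually cut on keyframe boundaries, which this sketch ignores.

```python
import math

def segment_ranges(duration_s, segment_s=5.0):
    """Return (index, start_s, length_s) for each sequential chunk of a video."""
    if duration_s <= 0 or segment_s <= 0:
        raise ValueError("duration_s and segment_s must be positive")
    count = math.ceil(duration_s / segment_s)
    ranges = []
    for i in range(count):
        start = i * segment_s
        length = min(segment_s, duration_s - start)  # the last chunk may be shorter
        ranges.append((i, start, length))
    return ranges

# A 23-second clip cut into 5-second chunks: four full chunks plus one 3-second tail.
segments = segment_ranges(23.0, 5.0)
```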

Adaptive Bitrate Streaming:

Adaptive streaming ensures the player switches video quality dynamically based on the user's current internet connection.
• The transcoder generates multiple streams (e.g., 240p, 480p, 1080p) along with an index Manifest file listing the URLs for all chunks.
• The player downloads the manifest file first, then continuously measures network throughput during playback.
• If network speed drops, the player requests the next 5-second chunk from a lower-resolution pool (e.g., 480p or 240p). If speed increases, it requests the next chunk from the 1080p pool.
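
The player-side decision described in the bullets above can be sketched as follows. This is a simplified, throughput-only heuristic, not the algorithm any particular player uses. The bitrate values in `LADDER`, the `safety` margin, and the URL layout in `chunk_url` are all assumptions made for illustration; in practice the chunk URLs come from the manifest.

```python
# Hypothetical bitrate ladder: (rendition label, required bandwidth in kbit/s),
# ordered from lowest to highest quality. The numbers are illustrative only.
LADDER = [("240p", 400), ("480p", 1_000), ("1080p", 5_000)]

def pick_rendition(measured_kbps, ladder=LADDER, safety=0.8):
    """Pick the highest rendition whose bitrate fits within a safety margin
    of the measured throughput; fall back to the lowest rendition."""
    budget = measured_kbps * safety
    best = ladder[0][0]
    for label, required_kbps in ladder:
        if required_kbps <= budget:
            best = label
    return best

def chunk_url(base_url, rendition, index):
    """Build a chunk URL under an assumed layout: <base>/<rendition>/chunk_<n>.ts"""
    return f"{base_url}/{rendition}/chunk_{index:05d}.ts"

# The player re-evaluates before each request, so quality can change per chunk.
measurements_kbps = [8_000, 1_500, 300]
choices = [pick_rendition(kbps) for kbps in measurements_kbps]
```

The safety margin keeps the player from choosing a rendition that would consume the entire measured bandwidth, which reduces the chance of a stall when throughput fluctuates.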

💡 CDN Strategy for Video

Videos are highly bandwidth-intensive. We must use a Content Delivery Network (CDN) to serve video files from edge caches located close to users. However, caching all videos at the edge is too expensive. We cache **hot videos** (new uploads and viral videos) on CDNs, while **cold videos** are fetched directly from our origin Object Store nodes.
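
One way to picture the hot/cold split is a router that sends playback requests to the CDN only once a video crosses a popularity threshold. The sketch below is an assumption-laden simplification: the `VideoRouter` class, the view-count threshold, and both hostnames are hypothetical, and it classifies videos by view count alone, whereas the strategy above also treats new uploads as hot.

```python
from collections import Counter

class VideoRouter:
    """Route playback to the CDN for hot videos and to the origin for cold ones."""

    def __init__(self, hot_threshold=1000,
                 cdn_base="https://cdn.example.com",        # hypothetical host
                 origin_base="https://origin.example.com"):  # hypothetical host
        self.hot_threshold = hot_threshold
        self.cdn_base = cdn_base
        self.origin_base = origin_base
        self.views = Counter()

    def record_view(self, video_id, count=1):
        self.views[video_id] += count

    def is_hot(self, video_id):
        # Counter returns 0 for unseen videos, so unknown IDs count as cold.
        return self.views[video_id] >= self.hot_threshold

    def base_url(self, video_id):
        return self.cdn_base if self.is_hot(video_id) else self.origin_base

router = VideoRouter(hot_threshold=3)
router.record_view("viral_clip", 5)
router.record_view("old_lecture", 1)
```

A production system would more likely use a sliding time window or regional popularity than a lifetime counter, but the cost trade-off is the same: pay for edge capacity only where the traffic justifies it.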

Check Your Understanding

1. Why does YouTube build a transcoder pipeline based on a DAG (Directed Acyclic Graph) architecture?
2. How does Adaptive Bitrate Streaming work on the client side?
3. How do we balance CDN caching costs for video files?