← Course Index

CAP Theorem & Consistency Models

~25 min · Foundations · DDIA §5, §9 · Alex Xu Vol 1, Ch 6

Ref
Primary Source
DDIA (Kleppmann) — Chapters 5 and 9

The most rigorous treatment of consistency models available. Chapter 9 (Consistency and Consensus) is essential reading once you have the basics. For now, focus on Chapter 5.

Why This Lesson Comes Before Databases

Every database choice is a consistency choice. When you pick MySQL over Cassandra, or when you add a read replica, you are making a trade-off between consistency, availability, and partition tolerance. Without understanding CAP, your database choices will be guesswork.

The CAP Theorem

In a distributed system, you can guarantee at most two of these three properties:

C Consistency Every read returns the most recent write A Availability Every request receives a (non-error) response P Partition Tolerance System works despite network partitions CP HBase, ZooKeeper AP Cassandra, DynamoDB CA Traditional SQL (single node)
The CAP triangle — in a distributed system, pick two. Since P is unavoidable, the real choice is CP vs AP.
The key insight

Network partitions (nodes losing communication) are inevitable in distributed systems. So you always need P. The real choice is: when a partition happens, do you sacrifice C (return potentially stale data but stay available) or A (refuse to answer to avoid returning stale data)?

CP Systems — Consistency over Availability

During a partition, a CP system refuses to answer rather than risk returning stale data. You get an error instead of a potentially wrong answer.

Use when: Data accuracy is critical. Financial transactions, inventory management, distributed locks, leader election.

Examples: ZooKeeper, HBase, Spanner, Redis (single-leader)

Real example

Your bank balance must be consistent. If the primary DB is partitioned from its replica, you'd rather get an error ("service temporarily unavailable") than see the wrong balance.

AP Systems — Availability over Consistency

During a partition, an AP system keeps serving requests but may return stale data. The system "eventually" becomes consistent once the partition heals.

Use when: Availability matters more than perfect accuracy. Social media feeds, shopping cart, DNS, content delivery.

Examples: Cassandra, DynamoDB, CouchDB, most DNS systems

Real example

Your Twitter follower count being 1 second out of date is acceptable. The feed staying available during a datacenter partition is critical.

Consistency Models (Beyond CAP)

CAP is a binary framing. In practice, consistency exists on a spectrum:

ModelGuaranteeTrade-off
Strong ConsistencyEvery read sees the most recent writeHigher latency, lower availability (must wait for all replicas)
LinearizabilityOperations appear to happen at one point in time, in real-time orderVery expensive — used for consensus systems (Raft, Paxos)
Sequential ConsistencyOperations from each client appear in the order they were issued, but clocks aren't synchronized across clientsModerate cost
Eventual ConsistencyGiven no new writes, all replicas will converge to the same value eventuallyHigh availability, low latency — most NoSQL DBs
Causal ConsistencyOperations that are causally related are seen in order; concurrent ops may be seen differentlyGood middle ground (MongoDB default)
💡 Interview use

When you pick a database in an interview, always qualify your consistency requirement. "For the user profile service, I'll use PostgreSQL with strong consistency since we can't have users see stale profile data. For the activity feed, I'll use Cassandra with eventual consistency — slightly stale is fine and we need the write throughput."

ACID vs BASE

ACID — Relational DBs
Atomicity:   all or nothing
Consistency: DB always in valid state
Isolation:   concurrent txns don't interfere
Durability:  committed = permanent on disk

Use: financial records, inventory,
     anything where partial writes are catastrophic
BASE — NoSQL DBs
Basically Available:  system stays up
Soft state:           state can change over time
Eventually consistent: converges eventually

Use: social feeds, shopping carts,
     anything where scale > strict accuracy

Check Your Understanding

1. You're designing a system to manage hotel room bookings. Two people must not be able to book the same room. Which CAP trade-off do you prioritize?
2. Cassandra is described as an AP system. What does this mean in a partition scenario?
3. What does "eventual consistency" guarantee?
4. Which property does NOT belong in ACID?

🎓 CAP is deceptively simple. Ask me about linearizability vs sequential consistency, or why "CA" isn't really a choice in distributed systems, or how Google Spanner achieves CP without sacrificing too much availability.