Explains decentralized masterless database design (like DynamoDB or Cassandra) focusing on consistency, replication, and node coordination.
In standard replication systems (like MySQL replication), a single leader node processes writes, which replicate to follower nodes. In a **masterless** key-value store (modeled after Amazon's Dynamo paper), any node can accept read or write requests. This structure maximizes write availability and scalability at the cost of consistency model complexity.
A distributed key-value store is built on a stack of specific patterns:
N - 1 physical nodes going clockwise on the hash ring.In masterless stores, we tune consistency using three numbers:
N = The Replication Factor (number of replicas storing the data).W = The Write Quorum (number of replica nodes that must acknowledge a write before it is considered successful).R = The Read Quorum (number of replica nodes that must respond to a read query).| Quorum Configuration | Consistency Level | Use Case |
|---|---|---|
| R + W > N e.g. N=3, W=2, R=2 |
Strong Consistency | Default setting. Guaranteed that the read quorum overlaps with the write quorum, so reads always return the latest value. |
| W = 1, R = 1 N=3 |
Eventual Consistency | Highly optimized for write speed and read speed. Replicas synchronize in the background. Risk of reading stale values. |
| W = N, R = 1 | Strong Consistency (Read optimized) | Fast reads, but writes fail if even a single replica node goes offline. |
When multiple nodes accept concurrent writes for the same key, conflict resolution is needed. In databases like Cassandra, **Last-Write-Wins (LWW)** is used, which resolves conflicts by checking timestamps (susceptible to NTP clock drift).
Dynamo uses **Vector Clocks**: a list of [server, counter] pairs attached to every version of a data object.
Client writes V1. Server Sx processes it:
State: V1, Clock: [Sx, 1]
Client edits V1 to V2. Server Sy processes:
State: V2, Clock: [Sx, 1], [Sy, 1]
If two clients read V2 and make concurrent,
conflicting edits:
Client A writes to Sx → [Sx, 2], [Sy, 1]
Client B writes to Sz → [Sx, 1], [Sy, 1], [Sz, 1]
When the next reader fetches the key:
Server detects the clocks are conflicting
(neither is a direct ancestor of the other).
Server returns both versions (siblings)
to the client application.
The client app must merge the values
(e.g., merging shopping cart items) and
write the merged version back to the store.