CAP Theorem

The CAP Theorem is one of the most important ideas in distributed systems. Eric Brewer proposed it as a conjecture in 2000, and Seth Gilbert and Nancy Lynch formally proved it in 2002.

TL;DR

In a distributed system, you can guarantee at most 2 of these 3 properties at the same time:

  • Consistency
  • Availability
  • Partition Tolerance

What Do the 3 Properties Mean?

🔒 Consistency (C)

Every read receives the most recent write — or an error. Every node in the system sees the same data at the same time.

Think of it like a bank balance. If you deposit money and immediately check your balance, you should always see the updated amount — regardless of which server handled each request.

✅ Availability (A)

Every request to a working node receives a non-error response, but that response is not guaranteed to reflect the most recent write. Availability is about always answering, not about nodes never failing.

Think of it like a social media "like" count. If you're in a terrible network zone, you might see a slightly stale count — but the page still loads.

🌐 Partition Tolerance (P)

The system continues operating even when network packets are dropped or delayed between nodes (a "network partition").

This is almost always a requirement in real distributed systems. Networks fail. You can't assume perfect communication.


The Trade-off: You MUST Choose P in Practice

In real distributed systems, network partitions will happen. You can't avoid them. So in practice, the real choice is:

| Trade-off | Pick When... | Examples |
|---|---|---|
| CP (Consistency + Partition Tolerance) | Data accuracy is critical | HBase, Zookeeper, MongoDB (strict mode) |
| AP (Availability + Partition Tolerance) | Uptime is critical, stale data is okay | Cassandra, CouchDB, DynamoDB |

The "CA" Myth

CA systems (Consistency + Availability) require a perfectly reliable network, which doesn't exist in distributed systems. A traditional RDBMS can be CA in a single-node deployment, but not in a truly distributed one.


Real-World Example: Two ATMs

Imagine two ATMs connected to the same bank, but their connection drops (network partition).

Scenario: You have $100. You withdraw $100 at ATM-1. Then you walk to ATM-2 and try to withdraw $100 again.

  • CP System → ATM-2 refuses the transaction (returns error) until it syncs with ATM-1. (Consistent, but unavailable during partition)
  • AP System → ATM-2 allows the transaction, temporarily letting you withdraw $100 twice. (Available, but inconsistent — overdraft!)

Banks usually prefer CP here. Stale data means real money loss.
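
The two outcomes above can be sketched in a few lines of Python. This is a toy model, not how real ATM networks are built: the `Atm` class, the `is_primary` flag (used so that one side of the partition can keep serving in CP mode), and the `run_scenario` helper are all hypothetical names introduced for illustration. Each ATM keeps its own local copy of the $100 balance, and the link between them is down for the whole scenario.

```python
from dataclasses import dataclass


@dataclass
class Atm:
    """One ATM with a local replica of the account balance (toy model)."""
    name: str
    local_balance: int
    is_primary: bool = False  # assumption: in CP mode only the primary serves during a partition

    def withdraw(self, amount: int, mode: str, partitioned: bool) -> str:
        # CP: a non-primary ATM cannot confirm the balance, so it refuses.
        if partitioned and mode == "CP" and not self.is_primary:
            return "refused"
        if amount > self.local_balance:
            return "insufficient funds"
        # AP (or primary, or no partition): answer from the local replica.
        self.local_balance -= amount
        return "dispensed"


def run_scenario(mode: str) -> tuple[list[str], int]:
    """Withdraw $100 at ATM-1, then $100 at ATM-2, while the link is down."""
    atm1 = Atm("ATM-1", 100, is_primary=True)
    atm2 = Atm("ATM-2", 100)
    results = [atm.withdraw(100, mode, partitioned=True) for atm in (atm1, atm2)]
    total_dispensed = 100 * results.count("dispensed")
    return results, total_dispensed


if __name__ == "__main__":
    print("CP:", run_scenario("CP"))
    print("AP:", run_scenario("AP"))
```

In this model, `run_scenario("CP")` returns `(['dispensed', 'refused'], 100)`, while `run_scenario("AP")` returns `(['dispensed', 'dispensed'], 200)`: the AP system stayed available but paid out $200 from a $100 account.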


Beyond CAP: PACELC

CAP only describes behavior during a partition. PACELC extends it:

Even when the system is running normally (no partition), there's a trade-off between Latency and Consistency.

| | During Partition | No Partition |
|---|---|---|
| CAP | C vs A | (not addressed) |
| PACELC | C vs A | Latency vs Consistency |
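
To make the "no partition" half of PACELC concrete, here is a rough sketch of how write latency grows with the number of replica acknowledgements a write waits for. It assumes a simplified model in which replicas are contacted in parallel and the only costs are a local write time plus replica round-trip times. The function name `write_latency_ms` and all the numbers are hypothetical.

```python
def write_latency_ms(local_ms: float, replica_rtts_ms: list[float], acks_required: int) -> float:
    """Time until a write is acknowledged to the client (simplified model).

    acks_required is the number of replica acknowledgements to wait for:
    0 means fully asynchronous replication, len(replica_rtts_ms) means fully synchronous.
    """
    if not 0 <= acks_required <= len(replica_rtts_ms):
        raise ValueError("acks_required must be between 0 and the number of replicas")
    if acks_required == 0:
        # Fastest, but replicas may serve stale reads until they catch up.
        return local_ms
    # Replicas are contacted in parallel, so we wait for the k-th fastest ack.
    kth_fastest = sorted(replica_rtts_ms)[acks_required - 1]
    return local_ms + kth_fastest


if __name__ == "__main__":
    rtts = [2.0, 10.0, 80.0]  # hypothetical: same rack, same region, another continent
    for acks in range(len(rtts) + 1):
        print(acks, "acks ->", write_latency_ms(1.0, rtts, acks), "ms")
```

With these made-up numbers, waiting for 0, 2, or 3 acknowledgements costs 1 ms, 11 ms, or 81 ms respectively. More acknowledgements mean fewer chances of a stale read, but every write pays for the slowest replica it waits on.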

Key Takeaways

  • ✅ You cannot have all 3 — choose based on your use case.
  • ✅ Network partitions are unavoidable — design for CP or AP.
  • ✅ Many modern databases let you tune this, often per request (e.g., Cassandra consistency levels).
  • ✅ Real systems often mix strategies at different layers.
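
As a sketch of what "tuning" can look like, quorum-replicated stores commonly rely on the rule that a read is guaranteed to see the latest acknowledged write when the read and write replica sets overlap, i.e. R + W > N for N replicas. The helpers below are hypothetical and are not a Cassandra API. They only map three well-known consistency level names (ONE, QUORUM, ALL) to replica counts and check the overlap rule.

```python
def replicas_for(level: str, n: int) -> int:
    """Number of replicas that must respond for a given level (illustrative mapping)."""
    counts = {"ONE": 1, "QUORUM": n // 2 + 1, "ALL": n}
    if level not in counts:
        raise ValueError(f"unknown consistency level: {level}")
    return counts[level]


def reads_see_latest_write(n: int, read_level: str, write_level: str) -> bool:
    """True when every read set must overlap every write set (R + W > N)."""
    r = replicas_for(read_level, n)
    w = replicas_for(write_level, n)
    return r + w > n


if __name__ == "__main__":
    n = 3  # hypothetical replication factor
    for read_level, write_level in [("ONE", "ONE"), ("QUORUM", "QUORUM"), ("ALL", "ONE")]:
        ok = reads_see_latest_write(n, read_level, write_level)
        print(f"read={read_level:<6} write={write_level:<6} overlap={ok}")
```

With 3 replicas, ONE/ONE does not overlap (1 + 1 is not greater than 3), so it leans toward availability and low latency. QUORUM/QUORUM (2 + 2) and ALL reads with ONE writes (3 + 1) do overlap, at the cost of needing more replicas to be reachable.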