Cassandra consistency levels explained: QUORUM, ONE, LOCAL_QUORUM, and EACH_QUORUM

Consistency level (CL) balances latency, availability, and correctness. It does not control how many replicas store data; replication factor (RF) does. CL controls how many replicas must acknowledge a read or write before the coordinator returns to the client. The wrong choice produces UnavailableException during rolling restarts that should be safe, or leaves replicas inconsistent for hours after a write is acknowledged at CL=ONE. The four CLs that define most production topologies are ONE, QUORUM, LOCAL_QUORUM, and EACH_QUORUM.

What it is and why it matters

Every write arrives at a coordinator node, which forwards it to every replica that owns the partition key. Reads follow the same path. The consistency level only determines how many replicas must respond before the coordinator replies to the client. A write at CL=ONE still reaches all replicas eventually, but the coordinator returns success after the first acknowledgement. Remaining replicas receive the mutation asynchronously. If replicas are down, the coordinator stores hints locally and replays them when they recover, but only within max_hint_window_in_ms.

Because CL and RF are independent, the same CL behaves differently at different replication factors. At RF=3, QUORUM requires two replicas. At RF=5, it requires three. This interaction defines tolerance for node failures, exposure to stale reads, and whether a rolling restart can proceed without client errors. It determines whether a single node restart turns into an outage or whether a cross-DC network blip kills write latency.

How it works

The coordinator sends requests to all replicas, then waits for acknowledgements according to the requested CL.

For QUORUM, the coordinator waits for a majority of replicas across the entire cluster, calculated as floor(RF/2) + 1. At RF=3, this is two replicas. If two of three nodes are healthy, the request succeeds. If two are down, the coordinator fails immediately with UnavailableException before attempting the operation.

For ONE, the coordinator waits for a single acknowledgement from any replica. This minimizes latency but provides no guarantee that other replicas hold the same value. At CL=ONE, the coordinator does not compare digests across replicas, so read repair is not triggered for that point read. If a replica is down and the write uses CL=ONE, the write still succeeds because one replica acknowledged, but the coordinator stores a hint for the missing replica. When the replica recovers, the hint is replayed. If the replica is down longer than max_hint_window_in_ms, hints stop being stored and that replica requires full anti-entropy repair to catch up.

For LOCAL_QUORUM, the coordinator waits for a majority of replicas in its own local datacenter only. Remote datacenters are ignored for the purpose of satisfying the CL. In a multi-DC deployment using NetworkTopologyStrategy, this keeps read and write latency bounded by local network RTT regardless of what happens in other datacenters.

For EACH_QUORUM, the coordinator waits for a quorum of replicas in every datacenter individually. With two datacenters and RF=3 in each, the coordinator must receive two acknowledgements from DC1 and two from DC2 before returning. This ensures every datacenter has independently acknowledged the operation, but it adds cross-DC latency to every request and fails if any single datacenter cannot achieve its own quorum.

If you write at CL=W and read at CL=R, then W + R > RF guarantees the read quorum and write quorum overlap on at least one replica. At RF=3, QUORUM write (2) plus QUORUM read (2) satisfies 2 + 2 > 3. That overlapping replica ensures you read the latest acknowledged write.

flowchart TD
  A[Client request arrives at coordinator] --> B{Consistency level}
  B -->|ONE| C[Return after first ack from any replica]
  B -->|LOCAL_QUORUM| D[Return after majority acks from local DC only]
  B -->|QUORUM| E[Return after majority acks from all replicas]
  B -->|EACH_QUORUM| F[Return after majority acks from every datacenter]
  C --> G[Coordinator responds to client]
  D --> G
  E --> G
  F --> G

Where it shows up in production

Rolling restarts with RF=3. With RF=3 and CL=QUORUM, a single node down does not produce UnavailableException. Restart exactly one node at a time. If two nodes are down simultaneously, whether from a cascading failure or a bad rolling procedure, quorum is lost and every QUORUM request fails immediately. Watch DownEndpointCount during maintenance. At RF=3 and CL=QUORUM, the safe threshold is one node down.

Multi-DC latency amplification. LOCAL_QUORUM is the default for most multi-DC workloads because it avoids cross-DC round trips. If your client is in DC1, a LOCAL_QUORUM read only waits for DC1 replicas. EACH_QUORUM is used when the application requires every datacenter to independently acknowledge a write. The tradeoff is write latency bounded by the slowest cross-DC link plus the slowest replica in each DC. A network blip in one DC can stall all EACH_QUORUM writes globally.

system_auth keyspace alignment. The system_auth keyspace defaults to RF=1. If you set driver consistency to LOCAL_QUORUM or QUORUM for authentication queries against a cluster where system_auth is not replicated, those queries fail because one replica cannot satisfy a quorum. Verify RF on system_auth matches your CL before raising it.

Read repair gaps. CL=ONE reads contact a single replica, so they do not compare digests or trigger read repair on that request. Replica divergence that accumulates while a node is down will not be corrected by CL=ONE reads. It persists until asynchronous anti-entropy repair runs or a higher-CL read hits the same partition. Without repair monitoring, divergence grows silently.

Distinguishing Unavailable from Timeout. UnavailableException is thrown before the coordinator attempts the operation when it knows insufficient replicas are alive. It is a fast fail. WriteTimeoutException or ReadTimeoutException is thrown after the coordinator has sent the request and timed out waiting for enough acknowledgements. If you see unavailables, check node liveness. If you see timeouts, check GC pause duration, pending compactions, and disk I/O on the replicas that did respond.

Tradeoffs and when to use it

CLAcks required at RF=3Node failure toleranceBest used whenProduction risk
ONE12 nodes downCache population, low-latency reads, temporary fallbackStale reads, no digest comparison, no read repair on the request
LOCAL_QUORUMfloor(local_RF/2)+11 node down in local DCMost multi-DC production workloadsRemote DC may lag behind local DC
QUORUMfloor(total_RF/2)+11 node down at total RF=3Single-DC deployments needing strong consistencyCross-DC latency if replicas span datacenters
EACH_QUORUMfloor(RF/2)+1 per DC1 node down per DCWrites requiring independent per-DC durabilityHighest write latency, sensitive to slowest DC

Use ONE when you can tolerate stale data and need the lowest possible latency. Use it for cache population, analytics dashboards that do not need perfect freshness, or as an explicit fallback when higher CL requests time out. Do not use it for financial or inventory operations where stale data creates business risk.

Use LOCAL_QUORUM as the default in multi-DC deployments. It provides strong consistency within the local datacenter while isolating you from remote network events. A partition or latency spike between datacenters will not impact LOCAL_QUORUM operations in the healthy DC.

Use QUORUM in single-DC clusters or when your application treats the entire cluster as one failure domain. It provides cluster-wide majority agreement without the cross-DC latency cost of EACH_QUORUM. At RF=3, it tolerates exactly one node failure.

Use EACH_QUORUM only when your compliance or durability model requires every datacenter to independently acknowledge the write. This is common when each DC is a separate failure domain. Do not use it for reads unless you have the same requirement, because it adds cross-DC latency to both paths.

A frequent misuse is mixing strong write CL with weak read CL. If you write at QUORUM but read at ONE, you break the W + R > RF guarantee. A read at ONE can return stale data from a replica that missed the latest write. Match your read and write CL so that their sum exceeds RF, or accept eventual consistency intentionally.

Signals to watch in production

SignalWhy it mattersWarning sign
Client unavailable exceptions (Read/Write)The coordinator sees too few live replicas to satisfy the CL. This is a hard failure that does not self-resolve without node recovery or a CL downgrade.Sustained spike correlated with DownEndpointCount. If unavailables exceed 1% of request rate, verify node status immediately.
Client timeout exceptions (Read/Write)Replicas are alive but cannot respond within the server-side timeout window. Unlike unavailables, timeouts may resolve if the overload is temporary.Sustained for > 60 seconds or > 1% of request rate. Indicates GC pressure, I/O saturation, or large partition reads on replicas.
Hint accumulationNodes are down but writes at lower CL still succeed. Hints cover the gap for max_hint_window_in_ms (default 3 hours).Hints directory growing on coordinators. If hints exceed the window, data divergence is permanent without repair.
Read repair rateElevated repairs indicate replica inconsistency.Sudden spike after node recovery suggests missed writes during the outage.
Node liveness (Gossip)DOWN nodes shrink the pool of replicas available to satisfy CL.DownEndpointCount > 0 sustained for > 5 minutes.

How Netdata helps

Netdata surfaces the signals to distinguish quorum loss from transient slowdown.

  • Correlate ClientRequest Unavailables with FailureDetector DownEndpointCount. If unavailables spike while multiple nodes are DOWN, you have a quorum loss. If unavailables are zero but timeouts are high, replicas are alive but slow.
  • Track PendingCompactions and SSTable count after node recovery. Hint replay and read repair storms can drive compaction debt and trigger GC pressure on a node that just returned to service.
  • Monitor DroppedMessage rate by scope (MUTATION, READ) alongside timeout rates. Dropped mutations mean replicas are shedding load; this is a leading indicator that timeouts will escalate to unavailability.
  • Alert on SchemaVersions disagreement after topology changes. Schema disagreement can block DDL and complicate RF changes that affect CL calculations.
  • Compare coordinator read latency against local read latency per node. If coordinator latency is high but local latency is normal, the bottleneck is network or a remote replica, not the local node.