Cassandra WriteTimeoutException: coordinator write timeouts and writeType
A WriteTimeoutException means the coordinator did not receive enough replica acknowledgments before write_request_timeout_in_ms expired. It does not mean the write failed; one or more replicas may have already persisted the mutation, so the outcome is ambiguous. The writeType field determines whether a client-side retry is safe.
Handling it well takes three things: distinguishing a slow replica from an unavailable one, understanding the idempotency contract for the write type, and correlating coordinator timeouts with replica-side saturation. The default write_request_timeout_in_ms is 2000 ms. If replicas cannot append to the commitlog and update the memtable within that window, the coordinator throws this exception.
What this means
The coordinator forwards the write to the partition replicas and waits for blockFor acknowledgments. If it receives fewer than blockFor before the timeout fires, it throws WriteTimeoutException. The exception includes the coordinator address, consistency level, acknowledgments received, and writeType.
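As a rough illustration of how blockFor follows from the consistency level and replication factor, the sketch below computes it for a few common levels. The function name block_for is hypothetical. It also simplifies: it ignores pending replicas during topology changes and takes a single replication factor (the total RF for QUORUM and ALL, the local-datacenter RF for LOCAL_QUORUM).

```shell
# Hypothetical helper: number of acknowledgments the coordinator waits for.
# $1 = consistency level, $2 = replication factor that level applies to.
block_for() (
  cl=$1
  rf=$2
  case "$cl" in
    ONE|LOCAL_ONE)       echo 1 ;;
    TWO)                 echo 2 ;;
    THREE)               echo 3 ;;
    QUORUM|LOCAL_QUORUM) echo $(( rf / 2 + 1 )) ;;   # majority of replicas
    ALL)                 echo "$rf" ;;
    *) echo "unsupported consistency level: $cl" >&2; exit 1 ;;
  esac
)

block_for QUORUM 3   # prints 2
block_for QUORUM 5   # prints 3
block_for ALL 3      # prints 3
```

With RF 3 and QUORUM, a single slow replica is tolerated, but two slow replicas leave the coordinator one acknowledgment short and produce the timeout.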
writeType values:
- SIMPLE: a standard single-partition write.
- BATCH: a logged atomic batch.
- COUNTER: a counter increment.
- BATCH_LOG: an internal write to the distributed batch log.
- CAS: a conditional lightweight transaction (Paxos).
The driver is conservative: if a statement is not marked idempotent, the driver does not invoke the retry policy on a write timeout. The DefaultRetryPolicy only retries BATCH_LOG automatically. For SIMPLE, retry only if the statement is declared idempotent. Never blindly retry COUNTER or CAS. Counter increments are read-modify-write operations and are not idempotent; replaying one that already reached a replica can apply it twice. CAS timeouts leave the Paxos round in an unknown state, and retrying without reading first produces indeterminate results.
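The retry rules above can be written down as a small decision function. This is a minimal sketch, not driver code: the name retry_decision and its output strings are hypothetical, and any writeType not discussed above (including BATCH) falls through to the conservative "do-not-retry" branch as an assumption.

```shell
# Hypothetical helper: decide what to do after a WriteTimeoutException.
# $1 = writeType from the exception
# $2 = "true" if the application declared the statement idempotent
retry_decision() (
  write_type=$1
  idempotent=${2:-false}
  case "$write_type" in
    BATCH_LOG)
      echo "retry" ;;                 # the case the default policy retries
    SIMPLE)
      if [ "$idempotent" = "true" ]; then
        echo "retry"
      else
        echo "do-not-retry"
      fi ;;
    COUNTER|CAS)
      echo "read-then-decide" ;;      # outcome unknown; never replay blindly
    *)
      echo "do-not-retry" ;;          # assumption: stay conservative
  esac
)

retry_decision SIMPLE true    # prints retry
retry_decision COUNTER true   # prints read-then-decide
```

Note that the idempotent flag is deliberately ignored for COUNTER and CAS: marking such a statement idempotent does not make replaying it safe.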
flowchart TD
A[Client write arrives at coordinator] --> B{Received >= blockFor before timeout?}
B -->|Yes| C[Return success]
B -->|No| D[Throw WriteTimeoutException]
D --> E[Check writeType]
E --> F{BATCH_LOG?}
F -->|Yes| G[Driver may retry safely]
F -->|No| H{Idempotent SIMPLE?}
H -->|Yes| I[Retry is safe]
H -->|No| J[Do not retry]
E --> K[COUNTER / CAS]
K --> L[Never blindly retry]
D --> M[Correlate with replica-side signals]
M --> N[Dropped MUTATION]
M --> O[CommitLog PendingTasks]
M --> P[MutationStage PendingTasks]
Common causes
| Cause | What it looks like | First thing to check |
|---|---|---|
| Commitlog disk saturation | Write latency spikes; CommitLog PendingTasks > 0; commitlog shares a volume with data | iostat -x on the commitlog device; the CommitLog,name=PendingTasks metric for commitlog pending tasks |
| MutationStage saturation | High write throughput; mutation pending tasks sustained > 0; CPU busy | nodetool tpstats mutation stage pending and blocked counts |
| GC pause on replica | Node gossip flaps; long pauses in gc.log; timeouts correlate with GC events | grep -i "pause" /var/log/cassandra/gc.log; nodetool info heap usage |
| Compaction backlog blocking flushes | Pending compactions growing; memtable flush writers backed up; commitlog segments accumulating | nodetool compactionstats; check MemtableFlushWriter pending in nodetool tpstats |
| Cross-DC or internode latency | Timeouts without local saturation; nodes are UP but slow; EACH_QUORUM in use | nodetool status from multiple nodes; compare internode latency |
| Counter or CAS workload | Timeouts isolated to counter or LWT writes; normal latency for standard writes | Application query patterns; check ClientRequest scopes for CASWrite |
Quick checks
Confirm the failure mode without changing cluster state:
# Check for dropped mutations and thread pool saturation
nodetool tpstats
# Check write latency percentiles at the coordinator
nodetool proxyhistograms
# Verify node liveness and identify DOWN replicas
nodetool status
# Check compaction backlog that may be blocking flushes
nodetool compactionstats
# Check heap usage (commitlog pressure is exposed as the CommitLog,name=PendingTasks metric, not in nodetool info)
nodetool info | grep "Heap Memory"
# Check disk I/O latency on commitlog and data devices
iostat -x 1
# Search logs for recent WriteTimeoutException or GC pauses (GCInspector reports long collections in system.log)
grep -iE "writetimeout|pause|GCInspector" /var/log/cassandra/system.log
How to diagnose it
- Confirm the exception type. WriteTimeoutException means replicas are alive but slow. UnavailableException means the coordinator could not find enough live replicas. Check nodetool status and ClientRequest,scope=Write,name=Unavailables to rule out quorum loss.
- Read writeType. If it is COUNTER or CAS, treat the outcome as unknown and do not retry without an application-level read. If it is BATCH_LOG, the driver may retry safely. If it is SIMPLE, retry only if the statement is idempotent.
- Check the timeout rate. Correlate ClientRequest,scope=Write,name=Timeouts with baseline write throughput. A sustained rate above zero is abnormal.
- Inspect replica health. Look for high write latency or elevated DroppedMessage,scope=MUTATION on individual nodes. One slow replica can drag down a QUORUM write.
- Check the mutation stage. In nodetool tpstats, sustained pending tasks (MutationStage in 3.x, MUTATION in 4.x) mean the node cannot process local writes fast enough. Blocked tasks mean the queue is full.
- Check commitlog pressure. Non-zero CommitLog PendingTasks means the commitlog device cannot absorb the append rate. This is common when commitlog and data directories share a disk.
- Review GC logs. Stop-the-world pauses longer than a few hundred milliseconds block replica-side write acknowledgment. If pauses approach write_request_timeout_in_ms (default 2000 ms), the coordinator will time out.
- Look for compaction blocking flushes. Pending tasks in MemtableFlushWriter and growing pending compactions mean memtables cannot flush. Unflushed memtables prevent commitlog segment recycling, which backpressures the write path.
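Several of the steps above read the same nodetool tpstats output, so it can help to pull the write-path numbers out in one pass. The sketch below is an assumption-laden example: write_pressure is a hypothetical name, and it assumes the usual column layout (Pool Name, Active, Pending, Completed, Blocked, All time blocked, followed by a Message type / Dropped section). Check the layout on your version before relying on it.

```shell
# Hypothetical helper: summarize write-path pressure from `nodetool tpstats`
# output read on stdin. Column positions are an assumption about the layout:
#   pool rows:    name active pending completed blocked all-time-blocked
#   message rows: type dropped ...
write_pressure() {
  awk '
    $1 == "MutationStage"       { printf "mutation_pending=%s mutation_blocked=%s\n", $3, $5 }
    $1 == "MemtableFlushWriter" { printf "flush_pending=%s\n", $3 }
    $1 ~ /^MUTATION/            { printf "dropped_%s=%s\n", $1, $2 }
  '
}

# Usage on a node:
#   nodetool tpstats | write_pressure
```

Running it on each replica of an affected partition makes it easier to spot the single node whose pending or dropped counts stand out from its peers.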
Metrics and signals to monitor
| Signal | Why it matters | Warning sign |
|---|---|---|
| ClientRequest,scope=Write,name=Timeouts | Direct measure of coordinator write timeouts | Sustained rate > 0 |
| DroppedMessage,scope=MUTATION,name=Dropped | Replica is shedding writes it cannot process | Non-zero rate indicates overload or GC stalls |
| CommitLog,name=PendingTasks | Commitlog I/O cannot keep up with write rate | Sustained value > 0 |
| Mutation stage pending tasks | Write execution is queuing on the replica | Pending tasks sustained > 0 for > 60 seconds |
| GC pause duration | STW pauses block replica acks | Pauses > 500 ms; pauses approaching timeout threshold |
| Disk await on commitlog device | Physical I/O latency on the durability path | await > 10 ms sustained on SSD |
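Several warning signs in the table depend on a value being sustained rather than momentarily non-zero. One way to express that check, sketched here under assumptions, is to feed periodic samples to a function that tracks the longest unbroken run above zero. The name sustained_above_zero, the one-sample-per-line input, and the choice to count each non-zero sample as one full interval are all illustrative.

```shell
# Hypothetical helper: read one numeric sample per line on stdin and report
# whether the value stayed above zero for longer than a threshold.
# $1 = seconds between samples, $2 = threshold in seconds.
sustained_above_zero() {
  awk -v step="$1" -v limit="$2" '
    $1 > 0 { run += step; if (run > limit) hit = 1; next }
           { run = 0 }                 # a zero sample breaks the run
    END    { print (hit ? "sustained" : "ok") }
  '
}

# Seven consecutive non-zero samples at 10 s spacing exceed a 60 s threshold.
printf '%s\n' 4 9 12 7 3 8 5 | sustained_above_zero 10 60   # prints sustained
# Brief spikes separated by zeros do not.
printf '%s\n' 4 0 9 0 12 0 | sustained_above_zero 10 60     # prints ok
```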
Fixes
Commitlog I/O saturation
Move the commitlog directory to a dedicated volume separate from the data directories. Shared devices cause sequential commitlog writes to contend with random reads and compaction I/O. If separation is not possible immediately, postpone repairs and streaming until peak write load subsides.
MutationStage saturation
Reduce the application write rate or break large batches into smaller ones. Stop background repair and streaming if they overlap with peak traffic. If a hot partition causes a thundering herd, rate-limit that key at the application layer.
GC pauses on replicas
Treat this as a GC death spiral warning. See Cassandra GC death spiral: long pauses, gossip flapping, and recovery.
Warning: Disabling native transport stops all client traffic to the node. Only do this if the node is actively degrading the cluster.
Disable native transport to stop new load, then investigate heap usage, large partition reads, and row cache misconfiguration.
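For reference, the nodetool subcommands involved are disablebinary, statusbinary, and enablebinary, all acting on the local node. The wrapper functions below are hypothetical names added for illustration; they only group those commands so the disable and restore steps stay paired.

```shell
# Sketch: stop and later restore client (native transport) traffic on the
# local node. The function names are illustrative, not part of nodetool.
pause_client_traffic() {
  nodetool disablebinary || return 1
  nodetool statusbinary      # should now report that it is not running
}

resume_client_traffic() {
  nodetool enablebinary || return 1
  nodetool statusbinary      # should report running again
}

# Usage on the affected node:
#   pause_client_traffic
#   ... investigate heap usage and large partitions ...
#   resume_client_traffic
```

The node keeps participating in gossip and still receives replica writes from other coordinators while native transport is disabled, so this only removes the coordinator load that clients place on it directly.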
Compaction backlog
Use nodetool setcompactionthroughput to temporarily raise the compaction throttle if CPU and disk headroom exist. This change takes effect immediately. If pending compactions have been growing for hours, the node is in a compaction debt spiral. Stop non-critical writes and see Cassandra compaction death spiral: when writes outrun compaction throughput.
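A minimal sketch of the throttle change follows, recording the value before and after so it can be restored later. The function name raise_compaction_throughput is hypothetical, and the value 64 in the usage comment is only an example, not a recommendation.

```shell
# Sketch: raise the compaction throttle on the local node and show the
# setting before and after. $1 = new throughput (0 removes the throttle).
raise_compaction_throughput() {
  new_value=$1
  echo "before: $(nodetool getcompactionthroughput)"
  nodetool setcompactionthroughput "$new_value" || return 1
  echo "after:  $(nodetool getcompactionthroughput)"
}

# Usage on the affected node:
#   raise_compaction_throughput 64
```

A value set this way applies to the running process only; after a restart the node goes back to the throughput configured in cassandra.yaml, so note the "before" line if you intend to revert manually.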
Counter and CAS timeouts
Do not retry these blindly. For counters, design the application to tolerate incomplete increments or use an idempotent alternative. For CAS, read the current state before deciding whether to re-attempt the conditional write.
Prevention
- Keep commitlog and data directories on separate physical devices.
- Monitor CommitLog PendingTasks and mutation stage pending tasks as leading indicators. A value sustained above zero precedes timeouts by seconds or minutes.
- Monitor the trend of pending compactions, not just the absolute value. A rising trend over hours predicts flush backpressure.
- Ensure GC pause duration stays well below write_request_timeout_in_ms. Parse GC logs to track old-generation pause times.
- Mark writes idempotent in the driver when the data model allows it. This lets the driver retry SIMPLE timeouts safely.
- Monitor ClientRequest,scope=Write,name=Timeouts and alert on any sustained non-zero rate.
How Netdata helps
Netdata correlates ClientRequest Write Timeouts with DroppedMessage MUTATION and CommitLog PendingTasks on the same time axis to separate replica overload from commitlog I/O issues. It tracks ThreadPools MutationStage PendingTasks as a leading indicator and overlays JVM GC pause duration with write timeout spikes to flag stop-the-world events. Per-device disk await distinguishes commitlog saturation from data disk contention. P99 write latency trends catch slowdowns before they breach the timeout threshold.
Related guides
- Cassandra compaction strategies: STCS vs LCS vs TWCS vs UCS
- Cassandra compaction death spiral: when writes outrun compaction throughput
- Cassandra consistency levels explained: QUORUM, ONE, LOCAL_QUORUM, and EACH_QUORUM
- Cassandra zombie data resurrection: gc_grace_seconds and unrepaired tombstones
- Cassandra disk space exhaustion: emergency recovery when the data volume fills
- Cassandra GC death spiral: long pauses, gossip flapping, and recovery
- Cassandra GC pauses too long: diagnosing G1 stop-the-world pauses
- Cassandra heap pressure: sizing the JVM heap and tuning G1GC
- Cassandra monitoring checklist: the signals every production cluster needs
- Cassandra monitoring maturity model: from survival to expert
- Cassandra Not enough space for compaction: STCS space amplification and recovery
- Cassandra java.lang.OutOfMemoryError: Java heap space - causes and recovery







