Cassandra thread pool pending and blocked tasks: SEDA backpressure
You run nodetool tpstats and see non-zero values in the Pending or Blocked columns. On a healthy Cassandra node, request-stage pools like MUTATION and READ should show zero pending tasks in steady state. When pending climbs and stays above zero, the SEDA pipeline is backing up. If Blocked also rises, the node has moved from queuing to rejecting work.
This is not a transient spike you can ignore. Sustained pending tasks on the MUTATION or READ stages add latency to every client request. Blocked tasks mean the queue is full and the node is actively shedding load. In the GOSSIP stage, even a small pending backlog is an emergency: it means gossip is falling behind, which leads to false DOWN marking across the cluster.
The root cause is almost always a downstream bottleneck. The thread pool is not the problem; it is the symptom. This guide shows how to read nodetool tpstats, find the bottleneck, and relieve the pressure before dropped mutations or gossip flapping trigger a wider outage.
What this means
Cassandra uses a staged event-driven architecture (SEDA). Writes, reads, gossip, compaction, and flushes each run in dedicated thread pools. When a write arrives, it is handed to the MUTATION pool. If all threads are busy, the task enters the Pending queue. If the queue also fills, the task is counted as Blocked and is rejected or dropped.
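The hand-off described above can be sketched as a toy model. This is a minimal illustration of the active, pending, and blocked states only, not Cassandra's executor implementation; the `Stage` class, its thread count, and its queue capacity are hypothetical values chosen for the example.

```python
from dataclasses import dataclass


@dataclass
class Stage:
    """Toy model of one SEDA stage: a fixed thread count plus a bounded queue."""
    threads: int
    queue_capacity: int
    active: int = 0
    pending: int = 0
    blocked_total: int = 0

    def submit(self) -> str:
        """Offer one task to the stage and report where it landed."""
        if self.active < self.threads:
            self.active += 1
            return "active"
        if self.pending < self.queue_capacity:
            self.pending += 1
            return "pending"
        self.blocked_total += 1
        return "blocked"

    def complete(self) -> None:
        """Finish one running task; a queued task, if any, takes the freed thread."""
        if self.active == 0:
            return
        if self.pending > 0:
            self.pending -= 1  # the queued task starts at once, so active is unchanged
        else:
            self.active -= 1


# Hypothetical sizing: 2 threads and room for 1 queued task.
stage = Stage(threads=2, queue_capacity=1)
outcomes = [stage.submit() for _ in range(4)]
print(outcomes)
```

With two threads and one queue slot, the third task waits and the fourth is counted as blocked. Completing a task drains the queue first, which is why Pending falls before Active does when a stage recovers.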
nodetool tpstats shows five columns per pool:
| Column | What it tracks |
|---|---|
| Active | Threads currently executing tasks |
| Pending | Tasks queued waiting for a thread |
| Completed | Tasks finished since JVM startup |
| Blocked | Tasks that could not enter the queue because it is full |
| All time blocked | Cumulative blocked tasks since JVM startup |
A brief Pending spike during a traffic burst is normal and resolves within seconds. Sustained Pending greater than zero means the pool cannot dequeue work as fast as it arrives. Blocked greater than zero means the queue overflowed and Cassandra rejected the task. For request pools (MUTATION, READ, Native-Transport-Requests), any blocked count is an emergency because client requests are being rejected. For the GOSSIP pool, pending alone is dangerous: backed-up gossip prevents the failure detector from updating peer state and can cause peers to falsely mark the node DOWN.
In Cassandra 3.x, nodetool tpstats displays CamelCase pool names such as MutationStage and ReadStage. In 4.x and later, the same pools appear as uppercase MUTATION and READ. The underlying JMX metric structure is identical.
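To work with these columns programmatically, the pool table can be parsed into records. The sketch below assumes the whitespace-aligned layout shown in `SAMPLE_TPSTATS`, which is a hypothetical excerpt rather than captured output; `parse_tpstats` and `PoolStats` are illustrative names, and real output may differ between versions.

```python
from typing import Dict, NamedTuple


class PoolStats(NamedTuple):
    active: int
    pending: int
    completed: int
    blocked: int
    all_time_blocked: int


# Hypothetical excerpt; the column layout is an assumption for this example.
SAMPLE_TPSTATS = """\
Pool Name                    Active   Pending   Completed   Blocked  All time blocked
MutationStage                     0        12      104523         0                 3
ReadStage                         2         0       88211         0                 0
Native-Transport-Requests         1         0      250000         0                 0

Message type           Dropped
MUTATION                     7
"""


def parse_tpstats(text: str) -> Dict[str, PoolStats]:
    """Map each pool name to its five counters, skipping headers and other sections."""
    pools: Dict[str, PoolStats] = {}
    for line in text.splitlines():
        if line.startswith("Message type"):
            break  # the dropped-message section follows the pool table
        parts = line.rsplit(None, 5)  # pool name, then the five numeric columns
        if len(parts) != 6:
            continue
        name, *counts = parts
        try:
            pools[name.strip()] = PoolStats(*(int(c) for c in counts))
        except ValueError:
            continue  # header row or any line whose columns are not integers
    return pools


pools = parse_tpstats(SAMPLE_TPSTATS)
print(pools["MutationStage"].pending, pools["MutationStage"].all_time_blocked)
```

Splitting from the right keeps pool names intact even if they contain unusual characters, because only the last five fields are treated as counters.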
flowchart TD
A[Disk CPU or GC saturation] --> B[SEDA stage slows]
B --> C[Pending tasks > 0]
C --> D[Blocked tasks > 0]
D --> E[Dropped messages]
C --> F[Latency spikes]
B --> G[GOSSIP backlog]
G --> H[False DOWN marking]
Common causes
| Cause | What it looks like | First thing to check |
|---|---|---|
| CPU saturation | Pending climbing across multiple pools simultaneously; CPU usage near 100% | mpstat -P ALL 1 |
| Disk I/O contention | READ or MUTATION pending with elevated await on data or commitlog devices | iostat -x 1 on the relevant device |
| GC-induced stalls | Pending spikes that correlate with stop-the-world pauses; drops reset after the pause | grep "pause" /var/log/cassandra/gc.log |
| Compaction debt | CompactionExecutor pending > 50 and growing; SSTable count increasing | nodetool compactionstats |
| Slow replicas or network partitions | Coordinator latency high but local read latency low on the replica; some nodes show higher pending than peers | nodetool proxyhistograms and per-node nodetool tpstats |
| Gossip stage backlog | GOSSIP pool Pending > 0 sustained; nodes flapping UP/DOWN | nodetool status from multiple nodes |
Quick checks
# Check thread pool saturation per stage
nodetool tpstats
# Check heap pressure and commitlog backlog
nodetool info | grep -E "Heap Memory|Commit Log"
# Check disk I/O latency on data and commitlog devices
iostat -x 1
# Check for GC pauses longer than 200 ms (assumes unified JVM logging, where pause lines end in a duration such as 245.123ms)
grep -i "pause" /var/log/cassandra/gc.log | awk '$NF + 0 > 200'
# Check compaction backlog
nodetool compactionstats
# Check for node flapping or DOWN states
nodetool status
# Check coordinator latency percentiles
nodetool proxyhistograms
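The GC pause check can also be done with a small parser when the log needs closer inspection. This is a sketch that assumes the JDK 11+ unified logging style, where a pause line ends in a millisecond duration; `SAMPLE_GC_LOG` is a hypothetical excerpt, `long_pauses` is an illustrative name, and older JVM log formats would need a different pattern.

```python
import re
from typing import List

# Hypothetical excerpt in the unified logging style; real logs vary with JVM flags.
SAMPLE_GC_LOG = """\
[2024-05-01T10:00:01.120+0000][info][gc] GC(41) Pause Young (Normal) (G1 Evacuation Pause) 512M->128M(1024M) 35.210ms
[2024-05-01T10:00:07.480+0000][info][gc] GC(42) Pause Young (Normal) (G1 Evacuation Pause) 900M->410M(1024M) 245.123ms
[2024-05-01T10:00:09.002+0000][info][gc] GC(43) Concurrent Mark Cycle 88.400ms
"""

# Capture the trailing duration on lines that mention a pause.
PAUSE_RE = re.compile(r"Pause.*?(\d+(?:\.\d+)?)ms\s*$")


def long_pauses(log_text: str, threshold_ms: float = 200.0) -> List[float]:
    """Return pause durations in milliseconds that exceed the threshold."""
    durations = []
    for line in log_text.splitlines():
        match = PAUSE_RE.search(line)
        if match and float(match.group(1)) > threshold_ms:
            durations.append(float(match.group(1)))
    return durations


print(long_pauses(SAMPLE_GC_LOG))
```

Concurrent phases are ignored here because they do not stop application threads; only lines containing "Pause" are counted.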
How to diagnose it
- Confirm the symptom is sustained. Run nodetool tpstats twice, 30 seconds apart. If Pending on MUTATION or READ is greater than zero both times, the pool is saturated. Note the Blocked and All time blocked columns: if Blocked is greater than zero or All time blocked is climbing, the queue has overflowed.
- Check which pools are affected. Is it only MUTATION, only READ, or multiple pools including CompactionExecutor and MemtableFlushWriter? Wide impact suggests CPU, GC, or disk. Narrow impact suggests a stage-specific bottleneck. For example, MemtableFlushWriter pending greater than zero with MUTATION pending indicates the flush pipeline is the root cause.
- Check for GC pauses. Run nodetool gcstats or grep GC logs. If pauses exceed 500 ms, they are likely freezing all stages. If pauses exceed 2 seconds, gossip failure detection will trigger.
- Check disk I/O. Run iostat -x 1 on the data and commitlog devices. If %util is greater than 80% or await is elevated, disk saturation is the bottleneck.
- Check compaction status. Run nodetool compactionstats. If pending tasks are greater than 50 and growing, compaction is stealing I/O bandwidth and read amplification is increasing.
- Check for gossip-specific backlog. If the GOSSIP pool shows pending greater than zero, check nodetool status for flapping nodes. Gossip backlog does not self-correct and requires immediate investigation of network connectivity, disk I/O stalls, or GC pauses on the affected node.
- Check commitlog pressure. Look at Commit Log pending tasks in nodetool info. If greater than zero, the write path is blocked at the durability layer.
- Correlate with dropped messages. Look at the Dropped section in nodetool tpstats. Sustained drops confirm that work is timing out in queues. Dropped MUTATION means replica divergence that will require repair.
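The first two steps can be sketched as a comparison of two snapshots taken 30 seconds apart. The snapshot values below are invented for illustration, the function names are hypothetical, and treating "more than one saturated pool" as wide impact is an assumption of this sketch rather than a rule from Cassandra.

```python
from typing import Dict, List

# pool name -> counters read from one nodetool tpstats run
Snapshot = Dict[str, Dict[str, int]]


def sustained_pending(first: Snapshot, second: Snapshot) -> List[str]:
    """Pools whose Pending is above zero in both samples."""
    return sorted(
        pool for pool in first
        if pool in second
        and first[pool]["pending"] > 0
        and second[pool]["pending"] > 0
    )


def overflowed(first: Snapshot, second: Snapshot) -> List[str]:
    """Pools with Blocked above zero now, or All time blocked climbing between samples."""
    return sorted(
        pool for pool in second
        if second[pool]["blocked"] > 0
        or (pool in first
            and second[pool]["all_time_blocked"] > first[pool]["all_time_blocked"])
    )


def impact_scope(saturated: List[str]) -> str:
    """Assumed cutoff: one saturated pool is narrow, more than one is wide."""
    if not saturated:
        return "none"
    return "narrow" if len(saturated) == 1 else "wide"


# Invented numbers for two samples taken 30 seconds apart.
first = {
    "MUTATION": {"pending": 14, "blocked": 0, "all_time_blocked": 3},
    "READ": {"pending": 0, "blocked": 0, "all_time_blocked": 0},
    "MemtableFlushWriter": {"pending": 2, "blocked": 0, "all_time_blocked": 0},
}
second = {
    "MUTATION": {"pending": 22, "blocked": 0, "all_time_blocked": 5},
    "READ": {"pending": 1, "blocked": 0, "all_time_blocked": 0},
    "MemtableFlushWriter": {"pending": 3, "blocked": 0, "all_time_blocked": 0},
}

saturated = sustained_pending(first, second)
print(saturated, overflowed(first, second), impact_scope(saturated))
```

In this example READ is pending only in the second sample, so it is not reported as sustained, while MUTATION and MemtableFlushWriter together point at the flush pipeline.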
Metrics and signals to monitor
| Signal | Why it matters | Warning sign |
|---|---|---|
| MUTATION PendingTasks | Write path cannot keep up | > 0 sustained > 60 s |
| READ PendingTasks | Read requests piling up | > 0 sustained > 60 s |
| CurrentlyBlockedTasks | Queue overflow; work rejected | > 0 in any pool |
| All time blocked rate | Recurring overflow events | Increasing over 5 min |
| GOSSIP PendingTasks | False DOWN marking risk | > 0 sustained |
| Dropped MUTATION | Silent replica divergence | Non-zero rate |
| CompactionExecutor pending | Compaction falling behind | > 50 and growing |
| Native-Transport-Requests pending | Client request backlog | > 50% of pool maximum |
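The time-based warning signs in the table can be evaluated against scraped samples. The sketch below assumes each metric arrives as chronologically ordered (unix seconds, value) pairs from whatever collector is in use; the function names and sample values are hypothetical.

```python
from typing import List, Tuple

Sample = Tuple[float, int]  # (unix seconds, metric value), oldest first


def sustained_above_zero(samples: List[Sample], min_seconds: float = 60.0) -> bool:
    """True if the latest unbroken run of values above zero spans at least min_seconds."""
    run_start = None
    for ts, value in samples:
        if value > 0:
            if run_start is None:
                run_start = ts
        else:
            run_start = None  # the run was broken by a zero reading
    if run_start is None:
        return False
    return samples[-1][0] - run_start >= min_seconds


def increased_within(samples: List[Sample], window_seconds: float = 300.0) -> bool:
    """True if a cumulative counter grew inside the trailing window."""
    if not samples:
        return False
    end_ts, end_value = samples[-1]
    in_window = [value for ts, value in samples if end_ts - ts <= window_seconds]
    return end_value > in_window[0]


# Invented samples: pending tasks scraped every 15 s, all-time-blocked scraped less often.
mutation_pending = [(0, 0), (15, 4), (30, 9), (45, 7), (60, 3), (75, 5)]
all_time_blocked = [(0, 10), (120, 10), (240, 12), (300, 12)]

print(sustained_above_zero(mutation_pending), increased_within(all_time_blocked))
```

Because All time blocked resets when the JVM restarts, a restart inside the window makes the counter appear to shrink and this check return False; a production rule would need to handle that case explicitly.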
Fixes
CPU saturation
Reduce application write rate or add nodes. There is no Cassandra tuning knob that creates more CPU. If compaction is consuming the majority of cycles, temporarily lower compaction_throughput_mb_per_sec with nodetool setcompactionthroughput to free capacity for requests. This buys lower request latency in the short term at the cost of faster-growing compaction debt.
Disk I/O contention
Verify that commitlog and data directories are on separate devices. If repairs or streaming are active, pause them. If compaction was artificially throttled below the disk capacity, raise compaction_throughput_mb_per_sec. If the device is already at maximum throughput, add IOPS or nodes. For STCS, major compaction can transiently need up to 100% additional space. Running out of room prevents compaction from running, which increases SSTable count and amplifies the I/O problem.
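The STCS space requirement reduces to simple arithmetic. The sketch below takes the "up to 100% additional space" figure from the paragraph above as its default; the function name and the example sizes are hypothetical.

```python
def stcs_headroom_ok(data_bytes: int, free_bytes: int, extra_fraction: float = 1.0) -> bool:
    """True if free space covers the transient extra space a major compaction may need."""
    return free_bytes >= data_bytes * extra_fraction


GIB = 1024 ** 3

# Invented sizes: 600 GiB of data with 500 GiB free, then 400 GiB of data with 500 GiB free.
print(stcs_headroom_ok(600 * GIB, 500 * GIB))
print(stcs_headroom_ok(400 * GIB, 500 * GIB))
```

The first case fails because a full rewrite of 600 GiB could need 600 GiB of scratch space; the second passes with 100 GiB to spare.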
GC-induced stalls
This pattern is covered in detail in the Cassandra GC death spiral guide. Immediate actions: run nodetool disablebinary to stop new client load while keeping the node in the ring, then identify large partition reads or tombstone scans via nodetool toppartitions. Do not restart the node without identifying the heap consumer; the spiral will resume.
Commitlog backup
Ensure the commitlog device is independent from data directories. Check commitlog_sync mode: batch fsyncs every write batch and is slower than periodic. Do not change this during an incident without understanding the durability tradeoff. If commitlog segments are accumulating because memtable flushes are blocked, investigate MemtableFlushWriter pending tasks.
Gossip stage backlog
Treat this as a PAGE-level event. Check for network partitions by comparing nodetool status output from multiple nodes. Check for GC pauses longer than 2 seconds, which prevent gossip from progressing. Check disk I/O stalls that block the gossip thread. Gossip backlog does not self-correct.
Native transport overload
If the Native-Transport-Requests pool shows pending tasks, client requests are arriving faster than they can be parsed and routed. Check driver connection behavior and connectedNativeClients. Increasing native_transport_max_threads buys time but does not fix the underlying bottleneck.
Emergency load shedding
If the node is dropping mutations and approaching unavailability, run nodetool disablebinary to stop accepting new CQL connections without removing the node from the cluster. This prevents further client timeouts while you investigate disk, GC, or compaction issues.
Prevention
Monitor pending task