Cassandra thread pool pending and blocked tasks: SEDA backpressure

You run nodetool tpstats and see non-zero values in the Pending or Blocked columns. On a healthy Cassandra node, request-stage pools like MUTATION and READ should show zero pending tasks in steady state. When pending climbs and stays above zero, the SEDA pipeline is backing up. If Blocked also rises, the node has moved from queuing to rejecting work.

This is not a transient spike you can ignore. Sustained pending tasks on the MUTATION or READ stages add latency to every client request. Blocked tasks mean the queue is full and the node is actively shedding load. In the GOSSIP stage, even a small pending backlog is an emergency: it means gossip is falling behind, which leads to false DOWN marking across the cluster.

The root cause is almost always a downstream bottleneck. The thread pool is not the problem; it is the symptom. This guide shows how to read nodetool tpstats, find the bottleneck, and relieve the pressure before dropped mutations or gossip flapping trigger a wider outage.

What this means

Cassandra uses a staged event-driven architecture (SEDA). Writes, reads, gossip, compaction, and flushes each run in dedicated thread pools. When a write arrives, it is handed to the MUTATION pool. If all threads are busy, the task enters the Pending queue. If the queue also fills, the task is counted as Blocked and is rejected or dropped.
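
The hand-off described above can be sketched as a toy model. This is not Cassandra's executor code: the `ToyStage` class, its method names, and the pool and queue sizes are all hypothetical, chosen only to show how a task ends up Active, Pending, or Blocked.

```python
from collections import deque


class ToyStage:
    """Toy model of one stage: a fixed set of threads plus a bounded queue."""

    def __init__(self, threads: int, queue_capacity: int) -> None:
        self.threads = threads
        self.queue_capacity = queue_capacity
        self.active = 0          # tasks currently holding a thread
        self.pending = deque()   # tasks waiting for a thread
        self.blocked = 0         # tasks that found the queue full

    def submit(self, task: str) -> str:
        if self.active < self.threads:
            self.active += 1
            return "active"
        if len(self.pending) < self.queue_capacity:
            self.pending.append(task)
            return "pending"
        self.blocked += 1
        return "blocked"

    def complete_one(self) -> None:
        """One thread finishes its task and picks up queued work, if any."""
        if self.active == 0:
            return
        if self.pending:
            self.pending.popleft()  # the thread stays busy with the dequeued task
        else:
            self.active -= 1


stage = ToyStage(threads=2, queue_capacity=2)
outcomes = [stage.submit(f"write-{i}") for i in range(5)]
print(outcomes)  # ['active', 'active', 'pending', 'pending', 'blocked']
```

In this model the only way to drain Pending is for `complete_one` to run faster than `submit`, which is why a slow downstream resource shows up as a growing queue rather than as a thread pool fault.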

nodetool tpstats shows five columns per pool:

| Column | What it tracks |
| --- | --- |
| Active | Threads currently executing tasks |
| Pending | Tasks queued waiting for a thread |
| Completed | Tasks finished since JVM startup |
| Blocked | Tasks that could not enter the queue because it is full |
| All time blocked | Cumulative blocked tasks since JVM startup |

A brief Pending spike during a traffic burst is normal and resolves within seconds. Sustained Pending greater than zero means the pool cannot dequeue work as fast as it arrives. Blocked greater than zero means the queue overflowed and Cassandra rejected the task. For request pools (MUTATION, READ, Native-Transport-Requests), any blocked count is an emergency because client requests are being rejected. For the GOSSIP pool, pending alone is dangerous: backed-up gossip prevents the failure detector from updating peer state and can cause peers to falsely mark the node DOWN.
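
As a rough illustration of reading these columns programmatically, the sketch below parses the pool table from `nodetool tpstats` text output and labels each pool. It assumes the five-column layout listed above with the pool name first; the function names and the sample text are hypothetical, and the sample was written by hand rather than captured from a node.

```python
from typing import Dict

COLUMNS = ("active", "pending", "completed", "blocked", "all_time_blocked")


def parse_tpstats(text: str) -> Dict[str, Dict[str, int]]:
    """Parse the pool table of `nodetool tpstats` output into one dict per pool."""
    pools: Dict[str, Dict[str, int]] = {}
    for line in text.splitlines():
        if line.startswith("Message type"):
            break  # the dropped-message section follows the pool table
        parts = line.split()
        if len(parts) < 6:
            continue  # blank line or too few columns to be a pool row
        try:
            values = [int(v) for v in parts[-5:]]
        except ValueError:
            continue  # header row or any other non-numeric line
        pools[" ".join(parts[:-5])] = dict(zip(COLUMNS, values))
    return pools


def classify(stats: Dict[str, int]) -> str:
    """Label a single snapshot; it cannot tell a brief spike from a sustained backlog."""
    if stats["blocked"] > 0:
        return "blocked"
    if stats["pending"] > 0:
        return "pending"
    return "ok"


# Illustrative sample, not captured from a real node.
SAMPLE = """\
Pool Name                    Active   Pending   Completed   Blocked  All time blocked
MutationStage                    32        17     1204331         0                 0
ReadStage                         4         0      998112         0                 0
Native-Transport-Requests       128        40     5520910         3               211

Message type           Dropped
MUTATION                    12
"""

for name, stats in parse_tpstats(SAMPLE).items():
    print(name, classify(stats))
```

Because one snapshot only shows the state at that instant, a "pending" label here is a prompt to sample again, not proof of saturation.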

In Cassandra 3.x, nodetool tpstats displays CamelCase pool names such as MutationStage and ReadStage. In 4.x and later, the same pools appear as uppercase MUTATION and READ. The underlying JMX metric structure is identical.

flowchart TD
    A[Disk, CPU, or GC saturation] --> B[SEDA stage slows]
    B --> C[Pending tasks > 0]
    C --> D[Blocked tasks > 0]
    D --> E[Dropped messages]
    C --> F[Latency spikes]
    B --> G[GOSSIP backlog]
    G --> H[False DOWN marking]

Common causes

| Cause | What it looks like | First thing to check |
| --- | --- | --- |
| CPU saturation | Pending climbing across multiple pools simultaneously; CPU usage near 100% | `mpstat -P ALL 1` |
| Disk I/O contention | READ or MUTATION pending with elevated `await` on data or commitlog devices | `iostat -x 1` on the relevant device |
| GC-induced stalls | Pending spikes that correlate with stop-the-world pauses; drops reset after the pause | `grep "pause" /var/log/cassandra/gc.log` |
| Compaction debt | CompactionExecutor pending > 50 and growing; SSTable count increasing | `nodetool compactionstats` |
| Slow replicas or network partitions | Coordinator latency high but local read latency low on the replica; some nodes show higher pending than peers | `nodetool proxyhistograms` and per-node `nodetool tpstats` |
| Gossip stage backlog | GOSSIP pool Pending > 0 sustained; nodes flapping UP/DOWN | `nodetool status` from multiple nodes |

Quick checks

# Check thread pool saturation per stage
nodetool tpstats

# Check heap pressure (commitlog backlog is exposed as the JMX CommitLog PendingTasks metric)
nodetool info | grep "Heap Memory"

# Check disk I/O latency on data and commitlog devices
iostat -x 1

# Check for GC pauses longer than 200 ms
# (assumes each pause line ends in a duration such as 215.3ms; +0 forces a numeric comparison)
grep -i "pause" /var/log/cassandra/gc.log | awk '$NF+0 > 200'

# Check compaction backlog
nodetool compactionstats

# Check for node flapping or DOWN states
nodetool status

# Check coordinator latency percentiles
nodetool proxyhistograms

How to diagnose it

1. Confirm the symptom is sustained. Run nodetool tpstats twice, 30 seconds apart. If Pending on MUTATION or READ is greater than zero both times, the pool is saturated. Note the Blocked and All time blocked columns: if Blocked is greater than zero or All time blocked is climbing, the queue has overflowed.
2. Check which pools are affected. Is it only MUTATION, only READ, or multiple pools including CompactionExecutor and MemtableFlushWriter? Wide impact suggests CPU, GC, or disk. Narrow impact suggests a stage-specific bottleneck. For example, MemtableFlushWriter pending greater than zero with MUTATION pending indicates the flush pipeline is the root cause.
3. Check for GC pauses. Run nodetool gcstats or grep GC logs. If pauses exceed 500 ms, they are likely freezing all stages. If pauses exceed 2 seconds, gossip failure detection will trigger.
4. Check disk I/O. Run iostat -x 1 on the data and commitlog devices. If %util is greater than 80% or await is elevated, disk saturation is the bottleneck.
5. Check compaction status. Run nodetool compactionstats. If pending tasks are greater than 50 and growing, compaction is stealing I/O bandwidth and read amplification is increasing.
6. Check for gossip-specific backlog. If the GOSSIP pool shows pending greater than zero, check nodetool status for flapping nodes. Gossip backlog does not self-correct and requires immediate investigation of network connectivity, disk I/O stalls, or GC pauses on the affected node.
7. Check commitlog pressure. Look at the commit log pending tasks metric, exposed over JMX as org.apache.cassandra.metrics:type=CommitLog,name=PendingTasks. If it is greater than zero, the write path is blocked at the durability layer.
8. Correlate with dropped messages. Look at the Dropped section in nodetool tpstats. Sustained drops confirm that work is timing out in queues. Dropped MUTATION means replica divergence that will require repair.
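
The first check in this list, two samples taken 30 seconds apart, can be sketched as a comparison of two snapshots. This is a minimal sketch: the dictionary shape, the function names, and the sample values are assumptions made for illustration, and the pool names are examples only.

```python
from typing import Dict, List

# Assumed shape: {pool_name: {"pending": int, "blocked": int, "all_time_blocked": int}}
Snapshot = Dict[str, Dict[str, int]]


def sustained_pending(first: Snapshot, second: Snapshot) -> List[str]:
    """Pools with Pending > 0 in both samples."""
    return sorted(
        name
        for name, stats in second.items()
        if stats["pending"] > 0 and first.get(name, {}).get("pending", 0) > 0
    )


def queue_overflowed(first: Snapshot, second: Snapshot) -> List[str]:
    """Pools with Blocked > 0 now, or All time blocked higher than in the first sample."""
    hits = []
    for name, stats in second.items():
        before = first.get(name, {}).get("all_time_blocked", 0)
        if stats["blocked"] > 0 or stats["all_time_blocked"] > before:
            hits.append(name)
    return sorted(hits)


# Hypothetical samples taken 30 seconds apart.
t0 = {
    "MutationStage": {"pending": 12, "blocked": 0, "all_time_blocked": 5},
    "ReadStage": {"pending": 3, "blocked": 0, "all_time_blocked": 0},
    "MemtableFlushWriter": {"pending": 2, "blocked": 0, "all_time_blocked": 0},
}
t30 = {
    "MutationStage": {"pending": 20, "blocked": 0, "all_time_blocked": 9},
    "ReadStage": {"pending": 0, "blocked": 0, "all_time_blocked": 0},
    "MemtableFlushWriter": {"pending": 4, "blocked": 0, "all_time_blocked": 0},
}

print(sustained_pending(t0, t30))  # ['MemtableFlushWriter', 'MutationStage']
print(queue_overflowed(t0, t30))   # ['MutationStage']
```

In this made-up case the write and flush pools stay backed up while the read pool drains, which matches the flush-pipeline pattern described in the second step.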

Metrics and signals to monitor

| Signal | Why it matters | Warning sign |
| --- | --- | --- |
| MUTATION PendingTasks | Write path cannot keep up | > 0 sustained > 60 s |
| READ PendingTasks | Read requests piling up | > 0 sustained > 60 s |
| CurrentlyBlockedTasks | Queue overflow; work rejected | > 0 in any pool |
| All time blocked rate | Recurring overflow events | Increasing over 5 min |
| GOSSIP PendingTasks | False DOWN marking risk | > 0 sustained |
| Dropped MUTATION | Silent replica divergence | Non-zero rate |
| CompactionExecutor pending | Compaction falling behind | > 50 and growing |
| Native-Transport-Requests pending | Client request backlog | > 50% of pool maximum |
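
A "sustained" warning sign such as "> 0 sustained > 60 s" needs a duration check, not a single threshold comparison. The sketch below shows one way to evaluate it over timestamped samples. The function name, the sample format of (seconds, value) pairs, and the default values are assumptions; a monitoring system would normally express this rule in its own alert syntax.

```python
from typing import List, Tuple


def sustained_above(
    samples: List[Tuple[float, float]],
    threshold: float = 0,
    min_seconds: float = 60.0,
) -> bool:
    """True if the value has stayed above threshold for min_seconds up to the last sample.

    samples holds (timestamp_in_seconds, value) pairs in ascending time order.
    """
    breach_start = None
    for timestamp, value in samples:
        if value > threshold:
            if breach_start is None:
                breach_start = timestamp  # the current breach began here
        else:
            breach_start = None  # any recovery resets the clock
    if breach_start is None:
        return False
    return samples[-1][0] - breach_start >= min_seconds


# Hypothetical MUTATION PendingTasks readings.
steady_backlog = [(0, 0), (10, 3), (40, 5), (70, 2)]
brief_spike = [(0, 3), (30, 0), (60, 4), (90, 1)]

print(sustained_above(steady_backlog))  # True: above zero from t=10 to t=70
print(sustained_above(brief_spike))     # False: the drop at t=30 reset the clock
```

Resetting on any recovery keeps short bursts from alerting, at the cost of missing a backlog that briefly touches zero between samples.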

Fixes

CPU saturation

Reduce application write rate or add nodes. There is no Cassandra tuning knob that creates more CPU. If compaction is consuming the majority of cycles, temporarily lower compaction_throughput_mb_per_sec with nodetool setcompactionthroughput to free capacity for requests. This accepts more compaction debt in exchange for lower request latency, so restore the setting once the pressure passes.

Disk I/O contention

Verify that commitlog and data directories are on separate devices. If repairs or streaming are active, pause them. If compaction was artificially throttled below the disk capacity, raise compaction_throughput_mb_per_sec. If the device is already at maximum throughput, add IOPS or nodes. For STCS, major compaction can transiently need up to 100% additional space. Running out of room prevents compaction from running, which increases SSTable count and amplifies the I/O problem.
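
The space requirement above reduces to simple arithmetic: in the worst case, free space must be at least as large as the data being compacted. The helper below is a hypothetical sketch of that check. The function name and the byte figures are made up for illustration, and `extra_fraction=1.0` encodes the "up to 100% additional space" figure from the text.

```python
def stcs_headroom_ok(
    live_data_bytes: int,
    free_bytes: int,
    extra_fraction: float = 1.0,
) -> bool:
    """True if free space covers the worst-case transient need of a major compaction."""
    return free_bytes >= live_data_bytes * extra_fraction


GIB = 1024 ** 3

print(stcs_headroom_ok(600 * GIB, 400 * GIB))  # False: 400 GiB free cannot hold a 600 GiB rewrite
print(stcs_headroom_ok(400 * GIB, 600 * GIB))  # True
```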

GC-induced stalls

This pattern is covered in detail in the Cassandra GC death spiral guide. Immediate actions: run nodetool disablebinary to stop new client load while keeping the node in the ring, then identify large partition reads or tombstone scans via nodetool toppartitions. Do not restart the node without identifying the heap consumer; the spiral will resume.

Commitlog backup

Ensure the commitlog device is independent from data directories. Check the commitlog_sync mode: batch mode fsyncs each batch of writes before acknowledging it, which makes it slower than periodic mode. Do not change this during an incident without understanding the durability tradeoff. If commitlog segments are accumulating because memtable flushes are blocked, investigate MemtableFlushWriter pending tasks.

Gossip stage backlog

Treat this as a PAGE-level event. Check for network partitions by comparing nodetool status output from multiple nodes. Check for GC pauses longer than 2 seconds, which prevent gossip from progressing. Check disk I/O stalls that block the gossip thread. Gossip backlog does not self-correct.

Native transport overload

If the Native-Transport-Requests pool shows pending tasks, client requests are arriving faster than they can be parsed and routed. Check driver connection behavior and connectedNativeClients. Increasing native_transport_max_threads buys time but does not fix the underlying bottleneck.

Emergency load shedding

If the node is dropping mutations and approaching unavailability, run nodetool disablebinary to stop accepting new CQL connections without removing the node from the cluster. This prevents further client timeouts while you investigate disk, GC, or compaction issues.

Prevention

Monitor pending task


The Netdata solution

Cassandra monitoring with Netdata

Netdata monitors Apache Cassandra with per-second metrics and automatic dashboards. Correlate GC pauses, compaction backlog, tombstone rates, pending hints, and disk usage across nodes to catch a creeping cluster before it tips over.
