Cassandra commitlog pending tasks: write-path I/O pressure
Sustained non-zero CommitLog PendingTasks means a Cassandra node’s write path is backing up. Every write must be appended to the commitlog before the replica acknowledges it. In batch mode the replica also waits for the fsync; in periodic mode it waits only when the sync falls behind schedule. When the sync thread cannot keep up, mutations queue. This starts as elevated write latency; if the queue persists, it forces emergency memtable flushes, overwhelms the flush and compaction pipeline, and produces dropped mutations.
This is a durability bottleneck that affects every write on the replica. Because the commitlog sits at the start of the write path, a slowdown cascades predictably: delayed acknowledgments, segment allocation pressure, forced flushes, then load shedding. The root cause is almost always I/O saturation on the commitlog device, an undersized or shared disk, or a mismatch between commitlog_sync mode and hardware.
Operators usually notice only after client write timeouts or dropped mutation alerts fire. By then the node has been under pressure for minutes. Treat CommitLog PendingTasks as an early warning, not background noise.
What this means
Cassandra’s write path is append-only. A replica receives a mutation and appends it sequentially to the current commitlog segment. How long the write then waits for durability depends on the sync strategy. Under commitlog_sync: periodic (the default), the sync thread fsyncs every commitlog_sync_period_in_ms (default 10000 ms, or 10 seconds) and writes are acknowledged without waiting for it, unless the previous sync is running late, in which case writers block until it catches up. Under commitlog_sync: batch, the replica does not acknowledge a write until it has been physically synced. In both modes, a slow fsync ends up gating acknowledgment.
CommitLog PendingTasks tracks mutations waiting for sync or segment allocation. A transient spike during a burst is normal, but sustained > 0 means the sync thread is falling behind. Segments fill faster than they are recycled. Segments cannot be discarded until every memtable that references them is flushed. If the flush pipeline is busy, commitlog space pressure builds, triggering WaitingOnSegmentAllocation and WaitingOnCommit. At that point the node is actively stalling writes.
With segments retained longer, the commitlog directory grows. Cassandra forces memtable flushes to free segments, but those flushes compete with compaction for disk I/O. If the commitlog shares a spindle with data directories, contention worsens: flushes write to the same device struggling to fsync the commitlog. The flush pipeline saturates, memtables grow, and the mutation stage drops messages. A slow disk becomes a cluster-wide write reliability risk.
flowchart TD
    A[Slow commitlog fsync] --> B[PendingTasks increases]
    B --> C[Write ack delayed]
    A --> D[Slow segment recycle]
    D --> E[Commitlog space pressure]
    E --> F[Forced memtable flushes]
    F --> G[Flush pipeline saturated]
    G --> H[Compaction debt rises]
    H --> I[Dropped mutations]
Common causes
| Cause | What it looks like | First thing to check |
|---|---|---|
| Commitlog disk saturation (shared with data or slow storage) | PendingTasks > 0, high w_await or %util on the commitlog device, data disk may look normal | iostat -x 1 on the commitlog device |
| commitlog_sync: batch on undersized I/O | High PendingTasks even at moderate throughput; every batch waits for a dedicated fsync | commitlog_sync mode in cassandra.yaml |
| Memtable flush bottleneck blocking segment recycle | Growing commitlog directory (du -sh), MemtableFlushWriter pending > 0 in nodetool tpstats | nodetool tpstats and nodetool compactionstats |
| Commitlog total space pressure | WaitingOnSegmentAllocation > 0, commitlog size approaching commitlog_total_space_in_mb | du -sh on the commitlog path and df -h on the volume |
Quick checks
# Check commitlog directory size
du -sh /var/lib/cassandra/commitlog
# Check commitlog backlog (PendingTasks) via JMX or your metrics pipeline
# MBean example: org.apache.cassandra.metrics:type=CommitLog,name=PendingTasks (verify the name and case against your Cassandra version)
# Check thread pool saturation, especially MutationStage and MemtableFlushWriter
nodetool tpstats
# Check commitlog device I/O latency and utilization
iostat -x 1
# Check commitlog volume free space
df -h
# Check whether compaction and flush are keeping up
nodetool compactionstats
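To turn the directory size check into a number that can be compared against commitlog_total_space_in_mb, a small helper along these lines may be useful. It is only a sketch: the function name commitlog_usage_pct is hypothetical, and the limit is assumed to be read by hand from cassandra.yaml.

```shell
# Hypothetical helper: percent of the configured commitlog space in use.
# $1 = commitlog directory size in KB (for example from `du -sk`)
# $2 = configured limit in MB (assumed to come from commitlog_total_space_in_mb)
commitlog_usage_pct() {
    used_kb=$1
    limit_mb=$2
    # Refuse a zero or negative limit rather than dividing by it.
    if [ "$limit_mb" -le 0 ]; then
        echo "limit must be positive" >&2
        return 1
    fi
    # Convert KB to MB, then express as a truncated whole percentage.
    awk -v u="$used_kb" -v l="$limit_mb" \
        'BEGIN { printf "%d\n", (u / 1024) * 100 / l }'
}

# Example usage (the 8192 limit is a placeholder, not a recommendation):
# commitlog_usage_pct "$(du -sk /var/lib/cassandra/commitlog | awk '{print $1}')" 8192
```

A value that keeps climbing toward 100 suggests segments are being retained rather than recycled.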
How to diagnose it
- Confirm the symptom is sustained. Sample CommitLog PendingTasks at 10-second intervals via JMX or your metrics pipeline. A transient spike during a bulk load differs from a sustained plateau.
- Isolate the commitlog disk. Run iostat -x 1 on the commitlog device. Look for w_await > 10 ms on SSD or > 50 ms on HDD, or %util > 80%. If commitlog and data share the same device, I/O contention is likely.
- Inspect the flush pipeline. In nodetool tpstats, check MemtableFlushWriter for pending or blocked tasks. If flushes back up, segments cannot be recycled. Run nodetool compactionstats to see if compaction tasks are accumulating.
- Check segment allocation pressure. If your monitoring exposes JMX WaitingOnSegmentAllocation or WaitingOnCommit, any non-zero value indicates the commitlog cannot acquire or recycle segments. This is a stronger signal than PendingTasks alone.
- Correlate with write-path errors. Check the Dropped section of nodetool tpstats for MUTATION drops. Cross-reference with client write timeout metrics. If dropped mutations rise while commitlog pending stays high, the node is shedding load.
- Review sync mode and throughput. Check commitlog_sync in cassandra.yaml. If set to batch, every write group waits for its own fsync instead of sharing a periodic one. On spinning disks or variable-latency cloud storage, this often saturates the device even at moderate write rates.
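The disk isolation step can be scripted roughly as follows. This is a sketch under assumptions: check_commitlog_disk is a hypothetical name, the device name must be supplied by the operator, and the thresholds are the ones quoted in the steps above. Columns are located by header name because their positions differ between sysstat versions.

```shell
# Hypothetical helper: read `iostat -x` output on stdin and print a line for
# the named device whenever w_await or %util exceeds the given thresholds.
# $1 = device name, $2 = max w_await in ms, $3 = max %util
check_commitlog_disk() {
    awk -v dev="$1" -v max_await="$2" -v max_util="$3" '
        # Header row: remember which columns hold w_await and %util.
        /^Device/ {
            for (i = 1; i <= NF; i++) {
                if ($i == "w_await") aw = i
                if ($i == "%util")   ut = i
            }
            next
        }
        # Data row for the requested device, once both columns are known.
        $1 == dev && aw && ut {
            if ($aw + 0 > max_await + 0 || $ut + 0 > max_util + 0)
                printf "%s w_await=%s util=%s\n", $1, $aw, $ut
        }
    '
}

# Example usage (sdb is a placeholder device name):
# LC_ALL=C iostat -x 10 6 | check_commitlog_disk sdb 10 80
```

The first iostat report covers the period since boot, so only later reports reflect current load. Setting LC_ALL=C keeps the decimal separator as a dot so the numeric comparison behaves as expected.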
Metrics and signals to monitor
| Signal | Why it matters | Warning sign |
|---|---|---|
| CommitLog PendingTasks | Direct measure of commitlog sync backlog | > 0 sustained for > 60 s |
| WaitingOnSegmentAllocation | Segment allocation blocked; flushes cannot free space fast enough | Any non-zero value |
| WaitingOnCommit | Mutations queued behind fsync | Any non-zero sustained |
| CommitLog TotalCommitLogSize | Growth means segments are retained and not recycled | Growing beyond steady-state baseline |
| MemtableFlushWriter pending | Flush backlog prevents segment reuse | > 0 sustained |
| MutationStage pending | Writes queuing behind the commitlog stage | > 0 sustained |
| Dropped MUTATION | Active write loss from overload | Any sustained non-zero rate |
| Disk w_await (commitlog device) | fsync latency directly gates write acknowledgments | > 10 ms on SSD, > 50 ms on HDD |
| Client write latency P99 | End-user impact of commitlog delay | > 3x baseline or sustained > 100 ms |
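The "sustained" qualifier in the warning signs can be made concrete with a consecutive-sample check. The sketch below assumes one PendingTasks reading per line on stdin, collected by whatever JMX or metrics tooling is already in place; sustained_backlog is a hypothetical name.

```shell
# Hypothetical helper: report whether the sampled value stayed above zero
# for at least N consecutive samples.
# $1 = N; with 10-second sampling, N=6 approximates the 60-second threshold.
sustained_backlog() {
    awk -v need="$1" '
        # Non-zero sample: extend the current run and note if it is long enough.
        $1 + 0 > 0 { run++; if (run >= need + 0) hit = 1; next }
        # Zero (or blank) sample: the run is broken.
        { run = 0 }
        END { print (hit ? "SUSTAINED" : "OK") }
    '
}

# Example usage (samples.txt is a placeholder file of readings):
# sustained_backlog 6 < samples.txt
```

Counting consecutive samples rather than averaging keeps a single large burst from being reported as a sustained backlog.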
Fixes
Move commitlog to a dedicated disk
The commitlog workload is purely sequential write and fsync; data directories handle random reads, large sequential compaction writes, and flushes. When they share a device, head movement (on spinning disks) and queue depth contention degrade fsync latency. The move requires provisioning a separate volume, pointing commitlog_directory in cassandra.yaml at it, and restarting the node. Do not do this during heavy write load without a maintenance window.
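Before and after the move, it can help to confirm whether the commitlog and data directories resolve to the same filesystem source. The following is a minimal sketch using POSIX df -P; same_device is a hypothetical name, and the paths in the usage line are the common defaults rather than values taken from any particular install.

```shell
# Hypothetical helper: compare the filesystem source of two directories.
# Note: this compares mounted filesystems, not physical disks. Two partitions
# or logical volumes on one physical disk will still show as "separate".
same_device() {
    a=$(df -P "$1" | awk 'NR == 2 { print $1 }')
    b=$(df -P "$2" | awk 'NR == 2 { print $1 }')
    if [ "$a" = "$b" ]; then
        echo "shared: $a"
    else
        echo "separate: $a vs $b"
    fi
}

# Example usage:
# same_device /var/lib/cassandra/commitlog /var/lib/cassandra/data
```

A "separate" result is necessary but not sufficient; on LVM or partitioned disks the underlying device still needs to be checked by hand.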
Switch from batch to periodic sync
If the workload does not require per-write-group durability, change commitlog_sync from batch to periodic. With the default 10-second period, many writes share each fsync, which sharply reduces IOPS demand. The tradeoff is a larger window of acknowledged but unsynced data that the node can lose on power loss. For most workloads, periodic with a dedicated disk provides sufficient durability.
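To confirm which sync mode and related settings are actually active before and after the change, the uncommented commitlog keys can be listed from the config file. This is a sketch: commitlog_settings is a hypothetical name and the config path varies by install. Newer Cassandra releases may spell some of these keys without the unit suffix, so the pattern matches on the commitlog_ prefix only.

```shell
# Hypothetical helper: print the active (uncommented) top-level commitlog
# settings from a cassandra.yaml file.
# $1 = path to cassandra.yaml
commitlog_settings() {
    # Top-level keys start in column 0; commented lines start with '#'.
    grep -E '^commitlog_' "$1"
}

# Example usage (path is an assumption):
# commitlog_settings /etc/cassandra/cassandra.yaml
```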
Throttle write pressure
If a traffic spike or bulk load exceeds provisioned IOPS, reduce the incoming write rate at the application or coordinator level. Pause non-critical batch jobs, reduce unlogged batch sizes, or temporarily reroute traffic away from the affected replica. This buys time without restarting the node.
Increase flush concurrency
If flushes are too slow to recycle segments and the disk has unused IOPS headroom, increase memtable_flush_writers. This allows more concurrent flush threads. On a shared disk, additional flush writers increase contention rather than help.
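Whether the flush pool is actually backed up can be checked from nodetool tpstats before changing memtable_flush_writers. The sketch below assumes the usual column order of pool name, Active, Pending, Completed, Blocked, All time blocked; pool_pending is a hypothetical name.

```shell
# Hypothetical helper: read `nodetool tpstats` output on stdin and print the
# Pending count for one thread pool.
# $1 = pool name, for example MemtableFlushWriter or MutationStage
pool_pending() {
    # Column 3 is Pending under the assumed column order; stop at first match.
    awk -v pool="$1" '$1 == pool { print $3; exit }'
}

# Example usage:
# nodetool tpstats | pool_pending MemtableFlushWriter
```

A value that stays above zero across repeated runs, combined with spare disk headroom, is the situation in which raising flush concurrency is most likely to help.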
Scale commitlog IOPS
On cloud or virtualized infrastructure, upgrade the commitlog volume to a higher-IOPS tier or move to local NVMe. Remote storage with high fsync variance commonly causes commitlog backup. On-premise, verify the disk is not degraded and no other services share the spindle.
Prevention
- Dedicated commitlog disk. Provision a separate device at deployment time. Never share it with data, hints, or snapshots.
- Monitor commitlog pending as a first-class signal. Include CommitLog PendingTasks in your paging alerts. Page on sustained > 0 for more than 60 seconds.
- Size for peak fsync IOPS. Size the commitlog disk for peak write rate multiplied by fsync frequency, not average throughput.
- Validate sync mode against hardware. Only use batch sync if storage can sustain the fsync rate at peak write volume.
- Watch the flush pipeline. Monitor MemtableFlushWriter pending tasks and commitlog size trends. Segment recycling depends on healthy flushes.
How Netdata helps
- Correlate CommitLog PendingTasks with per-disk I/O latency and utilization to spot commitlog device saturation.
- Track dropped mutations, write latency percentiles, and commitlog size to visualize the cascade from fsync delay to load shedding.
- Alert on sustained commitlog backlog without manual JMX sampling.
- Surface memtable flush and compaction pressure alongside commitlog metrics to distinguish disk saturation from flush pipeline failure.
- Baseline write-path latency per node to distinguish normal spikes from sustained pressure before mutations drop.