Cassandra disk space exhaustion: emergency recovery when the data volume fills
A Cassandra node that runs out of disk space does not degrade gracefully. Compaction halts because it cannot allocate temporary space to merge SSTables. Old SSTables are never deleted. Writes append to the commitlog until segment allocation blocks. At that point the node rejects mutations and CommitLog.WaitingOnSegmentAllocation climbs. You may see No space left on device errors while the data volume still reports a few percent free, because Cassandra’s internal headroom requirements are stricter than the filesystem.
If you are responding to a PAGE for disk exhaustion, the node likely has less than 10% free space and write-path impact is confirmed. This runbook identifies what is consuming space, reclaims it safely without restarting the node, and gets compaction moving again.
What this means
Cassandra appends every write to the commitlog and inserts it into a memtable. When memtables flush, they become immutable SSTables. Compaction merges SSTables in the background to purge tombstones and reclaim space. That merge requires temporary disk space roughly equal to the size of the input SSTables.
When the filesystem fills, compaction cannot reserve temporary space. Without compaction, old SSTables remain on disk. New flushes and commitlog segments continue to consume space. Eventually the commitlog cannot allocate new segments. The node blocks writes, increments WaitingOnSegmentAllocation, and returns failures to clients.
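The headroom rule described above can be sketched as a small shell check. This is a minimal illustration, not Cassandra's internal logic: the function names `has_compaction_headroom` and `free_kib_for` are hypothetical, and comparing free space to input size one-to-one is an assumption taken from the "roughly equal" rule of thumb.

```shell
# Hypothetical helper: succeed (exit 0) if free space covers the compaction inputs.
# Both arguments are sizes in KiB:
#   $1 = free space on the data volume
#   $2 = total size of the SSTables the compaction would merge
has_compaction_headroom() {
  free_kib="$1"
  input_kib="$2"
  [ "$free_kib" -ge "$input_kib" ]
}

# Hypothetical helper: available space, in KiB, on the filesystem holding a path.
# Uses POSIX df output, where column 4 of the second line is "Available".
free_kib_for() {
  df -Pk "$1" | awk 'NR == 2 { print $4 }'
}

# Example (read-only), assuming 50 GiB of input SSTables:
# has_compaction_headroom "$(free_kib_for /var/lib/cassandra/data)" 52428800 \
#   && echo "enough headroom" || echo "not enough headroom"
```

The check is deliberately conservative; real compactions may need less space if many cells are purged, but planning for the full input size avoids surprises.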
flowchart TD
A[Disk usage exceeds compaction headroom] --> B[Compaction cannot reserve temp space]
B --> C[Compaction stalls]
C --> D[Old SSTables are never deleted]
D --> E[Disk usage rises from flushes and commitlog]
E --> F[Commitlog segment allocation blocks]
F --> G[Writes are rejected]
G --> H[Hints accumulate on coordinators]
Common causes
| Cause | What it looks like | First thing to check |
|---|---|---|
| Forgotten snapshots | df usage is high but nodetool info Load is lower. nodetool listsnapshots shows old snapshots. | nodetool listsnapshots and du -sh on snapshot directories. |
| Hint accumulation | Hints directory is large after a node was down. Hints consume space on coordinators. | du -sh /var/lib/cassandra/hints/ on all nodes. |
| STCS space amplification | Disk usage above 50% with SizeTieredCompactionStrategy. Major compaction needs temporary space equal to current data size. | Compaction strategy and nodetool tablestats SSTable count. |
| Commitlog and data on same device | Commitlog segments grow alongside data files, accelerating exhaustion. | df -h on both data and commitlog mounts. |
| Rapid data growth or bulk load | Write rate exceeds compaction throughput; SSTables accumulate faster than they merge. | nodetool compactionstats and write latency trends. |
Quick checks
Run these read-only checks to assess scope before making changes.
# Compare Cassandra data load to filesystem free space
nodetool info | grep "Load"
df -h /var/lib/cassandra/data
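# Check commitlog pressure (nodetool info does not report commitlog usage;
# adjust the path if commitlog_directory points elsewhere)
du -sh /var/lib/cassandra/commitlog/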
# List snapshots to identify hard-linked space consumers
nodetool listsnapshots
# Check hints size on this node
du -sh /var/lib/cassandra/hints/
# Find the largest keyspaces by disk usage
du -sh /var/lib/cassandra/data/*/
# Check compaction status and pending tasks
nodetool compactionstats
# Search logs for disk-full errors
grep -i "no space left\|insufficient disk space" /var/log/cassandra/system.log
How to diagnose it
- Confirm filesystem utilization. Use df -h on the data directory, commitlog directory, and hints directory. If any mount is above 90% full, treat it as critical.
- Compare Cassandra’s Load metric to filesystem usage. Run nodetool info | grep Load. If df shows far more consumption than Load, the difference is likely snapshots, hints, commitlog segments, or compaction temporary files.
- Identify snapshot bloat. Run nodetool listsnapshots. Snapshots share hard links with live SSTables, but they prevent old SSTables from being deleted after compaction. The True size field shows actual unique data; Size on disk shows the hard-link footprint. Even if df looks manageable, snapshots can tip a node over during a major compaction.
- Check the commitlog allocation state. Look for WaitingOnSegmentAllocation > 0 via JMX or your monitoring agent. This confirms the write path is blocked by disk space, not by GC pressure or network partition.
- Review compaction status. Run nodetool compactionstats. If pending tasks are high but active compactions are zero, the compaction executor is stalled waiting for temp space. This is the signature of disk exhaustion with STCS.
- Inspect hints. Run du -sh /var/lib/cassandra/hints/. If hints are large, a recent node outage caused coordinators to buffer writes locally. Hints replay will add more write pressure once the target returns.
- Check for write rejections. Look in system.log for No space left on device. Correlate with client write timeout and unavailable metrics to confirm user impact.
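The comparison between Load and filesystem usage can be sketched in shell. This is a rough illustration under stated assumptions: the helper names `load_to_bytes` and `unaccounted_bytes` are hypothetical, and the Load value is assumed to look like "1.5 GiB" (a number, a space, then a unit starting with K, M, G, or T). Check the exact format your Cassandra version prints before relying on it.

```shell
# Hypothetical helper: convert a size such as "1.5 GiB" to bytes.
# Only the first letter of the unit is inspected, so "GB" and "GiB" are both
# treated as powers of 1024; "bytes" falls through with a multiplier of 1.
load_to_bytes() {
  echo "$1" | awk '{
    unit = toupper(substr($2, 1, 1))
    mult = 1
    if (unit == "K") mult = 1024
    else if (unit == "M") mult = 1024 ^ 2
    else if (unit == "G") mult = 1024 ^ 3
    else if (unit == "T") mult = 1024 ^ 4
    printf "%.0f\n", $1 * mult
  }'
}

# Hypothetical helper: bytes used on the volume that Load does not account for
# (candidates: snapshots, hints, commitlog segments, compaction temp files).
# $1 = bytes used according to df, $2 = bytes reported as Load
unaccounted_bytes() {
  echo $(( $1 - $2 ))
}

# Example (read-only):
# load="$(nodetool info | awk -F': ' '/^Load/ { print $2 }')"
# used="$(df -Pk /var/lib/cassandra/data | awk 'NR == 2 { printf "%.0f", $3 * 1024 }')"
# unaccounted_bytes "$used" "$(load_to_bytes "$load")"
```

A large positive result points at the snapshot, hints, and commitlog checks in the list above; a result near zero suggests live data growth instead.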
Metrics and signals to monitor
| Signal | Why it matters | Warning sign |
|---|---|---|
| Disk space available | Compaction requires temporary space; zero free space halts the engine. | Less than 30% free on data volume; less than 50% free with STCS. |
| Commitlog WaitingOnSegmentAllocation | Direct indicator that the write path is blocked waiting for disk space. | Greater than 0 sustained for more than 60 seconds. |
| Pending compactions | Compaction debt. If pending grows while active compactions drop to zero, temp space is likely the blocker. | Trending upward over hours while disk usage climbs. |
| Client write timeouts/unavailables | Client-facing impact. Confirms the node is rejecting or failing writes. | Non-zero rate sustained for more than 60 seconds. |
| Storage exceptions | Uncaught storage-layer errors. May indicate filesystem corruption or full disk events. | Any non-zero rate sustained more than 30 seconds. |
| SSTable count per table | High counts mean read amplification and indicate compaction is not keeping up. | Greater than 50 for STCS or above expected LCS level bounds. |
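The disk-space thresholds in this runbook can be turned into a simple classifier for ad hoc checks or a cron job. The sketch below is illustrative only: `disk_status` is a hypothetical name, and the cut-offs (critical below 10% free, warning below 30% free, or below 50% free with STCS) are taken from this guide rather than from any Cassandra default.

```shell
# Hypothetical helper: classify free space on the data volume.
# $1 = percent free (integer)
# $2 = compaction strategy label, for example STCS, LCS, or TWCS
disk_status() {
  pct_free="$1"
  strategy="$2"
  warn_below=30
  # STCS needs more headroom, so warn earlier.
  [ "$strategy" = "STCS" ] && warn_below=50
  if [ "$pct_free" -lt 10 ]; then
    echo critical
  elif [ "$pct_free" -lt "$warn_below" ]; then
    echo warning
  else
    echo ok
  fi
}

# Example:
# disk_status 45 STCS   # 45% free on an STCS node
```

Keeping the strategy as an explicit argument makes the stricter STCS threshold visible at the call site instead of hiding it in configuration.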
Fixes
Clear snapshots
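Snapshots are the fastest way to reclaim space because they use hard links. Removing them deletes the hard links but does not touch live data. Space is freed only for snapshot files whose live SSTable has already been compacted away or dropped; files still shared with a live SSTable free nothing. Only clear snapshots after verifying that backups are complete and retained elsewhere.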
# Remove all snapshots. Destructive if you rely on these for backups.
nodetool clearsnapshot --all
Tradeoff: If snapshots are your only backup, this is irreversible. Ensure backups are stored off-node first.
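Before clearing anything, it can help to see which tables hold snapshot directories. The following is a minimal read-only sketch; `snapshot_dirs` is a hypothetical helper, and it assumes the usual on-disk layout of data_root/keyspace/table/snapshots/snapshot_name.

```shell
# Hypothetical helper: print every "snapshots" directory under a data root.
# -prune stops find from descending into the snapshots themselves.
snapshot_dirs() {
  find "$1" -type d -name snapshots -prune | sort
}

# Example (read-only): size of each snapshots directory in KiB, largest last.
# snapshot_dirs /var/lib/cassandra/data | while read -r d; do du -sk "$d"; done | sort -n
```

Note that du reports the hard-link footprint of each directory, so the totals are an upper bound on what clearing snapshots will actually free.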
Clear saved caches
Saved caches are non-critical and rebuilt automatically on restart.
# Safe to delete; Cassandra rebuilds these on restart.
rm -rf /var/lib/cassandra/saved_caches/*
Address hint accumulation
Hints consume disk space on coordinators. If the target node is back online, hints replay automatically and free space as they complete. If the target is permanently decommissioned, remove its hint files after confirming the node will not rejoin. Do not delete active hint files for a downed target; if you do, run a full repair once it returns to restore consistency.
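To decide which hint files matter, it may help to total hint bytes per target node. This sketch is an assumption-heavy illustration: `hints_by_target` is a hypothetical helper, and it assumes hint files are named host-id-timestamp-version.hints, where the host ID is a 36-character UUID. Inspect your own hints directory to confirm the naming before acting on the output.

```shell
# Hypothetical helper: total hint bytes per target host ID.
# $1 = hints directory. Prints "<host-id> <bytes>" lines sorted by host ID.
hints_by_target() {
  for f in "$1"/*.hints; do
    # Skip the literal pattern when the directory has no .hints files.
    [ -e "$f" ] || continue
    host_id="$(basename "$f" | cut -c1-36)"
    size="$(wc -c < "$f" | tr -d ' ')"
    echo "$host_id $size"
  done | awk '
    { total[$1] += $2 }
    END { for (h in total) printf "%s %.0f\n", h, total[h] }
  ' | sort
}

# Example (read-only):
# hints_by_target /var/lib/cassandra/hints
```

Match the host IDs against nodetool status to tell hints for a node that is coming back apart from hints for one that has been permanently removed.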
Add storage
If the volume can be expanded through LVM, cloud block storage, or a new mount, expand the filesystem. After space is added, compaction should resume automatically. Monitor nodetool compactionstats to confirm active compactions restart.
Reduce compaction pressure temporarily
If you are on STCS and cannot add storage immediately, do not force a major compaction because it requires temporary space equal to the current data size. Focus on removing snapshots and saved caches first. If you must change compaction strategy to reduce space amplification later, schedule that for a maintenance window after the crisis is resolved.
Prevention
- Monitor filesystem free space directly with df, not only the Load metric. Load excludes commitlog, hints, snapshots, and compaction temp files.
- Automate snapshot cleanup. Verify backups, then schedule nodetool clearsnapshot or automate via your backup tool.
- Monitor the hints directory size on all nodes. Hints should be near zero when all nodes are healthy.
- Separate commitlog and data directories onto different mounts. Sharing a device accelerates exhaustion and adds I/O contention.
- Size disks according to compaction strategy. Maintain greater than 50% free for STCS, greater than 30% for LCS or TWCS, plus a buffer for snapshots and hints.
- Track PendingCompactions and SSTable count as leading indicators. A rising trend predicts disk exhaustion days before it happens.
How Netdata helps
- Correlates disk space usage on the data mount with CommitLog.WaitingOnSegmentAllocation and Compaction.PendingTasks to surface write-path blockage before the disk hits 100%.
- Tracks per-table SSTable counts and compaction backlog alongside filesystem metrics, distinguishing a compaction stall from simple data growth.
- Alerts on commitlog pending tasks and storage exceptions using composite patterns, reducing noise from transient spikes.
- Disk I/O utilization per device confirms whether commitlog and data share a saturated disk.
Related guides
- Cassandra compaction strategies: STCS vs LCS vs TWCS vs UCS: /guides/cassandra/cassandra-choosing-compaction-strategy/
- Cassandra compaction death spiral: when writes outrun compaction throughput: /guides/cassandra/cassandra-compaction-death-spiral/
- Cassandra consistency levels explained: QUORUM, ONE, LOCAL_QUORUM, and EACH_QUORUM: /guides/cassandra/cassandra-consistency-levels-explained/
- Cassandra zombie data resurrection: gc_grace_seconds and unrepaired tombstones: /guides/cassandra/cassandra-data-resurrection-gc-grace/
- Cassandra GC death spiral: long pauses, gossip flapping, and recovery: /guides/cassandra/cassandra-gc-death-spiral/
- Cassandra GC pauses too long: diagnosing G1 stop-the-world pauses: /guides/cassandra/cassandra-gc-pauses-too-long/
- Cassandra heap pressure: sizing the JVM heap and tuning G1GC: /guides/cassandra/cassandra-heap-pressure-tuning/
- Cassandra monitoring checklist: the signals every production cluster needs: /guides/cassandra/cassandra-monitoring-checklist/
- Cassandra monitoring maturity model: from survival to expert: /guides/cassandra/cassandra-monitoring-maturity-model/
- Cassandra java.lang.OutOfMemoryError: Java heap space - causes and recovery: /guides/cassandra/cassandra-out-of-memory-error/
- Cassandra pending compactions growing: the compaction backlog runbook: /guides/cassandra/cassandra-pending-compactions-growing/
- Cassandra Scanned over N tombstones warning: finding the offending query: /guides/cassandra/cassandra-scanned-over-tombstones-warning/







