Cassandra compaction strategies: STCS vs LCS vs TWCS vs UCS

Compaction merges immutable SSTables, discards tombstones, and reclaims disk space. The strategy assigned to a table controls the tradeoff between write amplification and read amplification, and it determines how much temporary disk headroom you must preserve. A well-matched strategy keeps SSTable counts low and latency predictable. A mismatch creates compaction debt: creeping P99 read latency first, then disk space exhaustion, and finally write rejections when compaction cannot reclaim space fast enough to keep up with flushes.

Operators usually discover mismatches during an incident: a node runs out of disk during major compaction, read latency spikes because L0 has accumulated hundreds of SSTables, or a time-series table recompacts closed windows after a client backfilled historical data. When that happens, you need to know which strategy is in use, why it is behaving that way, and whether the fix is headroom or a strategy migration.

The strategy determines your disk provisioning math, your I/O baseline, and which signals you watch during an incident. SizeTieredCompactionStrategy (STCS), LeveledCompactionStrategy (LCS), TimeWindowCompactionStrategy (TWCS), and UnifiedCompactionStrategy (UCS) organize SSTables differently, produce different I/O patterns, and fail differently. Understanding the differences lets you distinguish normal background noise from a compaction death spiral before it saturates disks.

What compaction is and why the strategy matters

Cassandra is a log-structured merge-tree database. Writes append to the commit log and update a memtable; memtables flush to immutable SSTables. Over time, a partition’s data may be spread across many SSTables, and tombstones persist until compaction removes them. Compaction merges SSTables to reduce the number of files a read must touch and to evict tombstones.

Every table has a compaction strategy that decides which SSTables to merge and when. This controls three levers: read amplification (how many SSTables a read must consult), write amplification (how many times data is rewritten during compaction), and space amplification (temporary disk space used while merging files).

Compaction creates new SSTables before deleting the old ones, so every strategy consumes temporary disk space. The amount varies significantly between strategies. Provision disk based on raw data size alone and you will eventually lose a node to space exhaustion. The strategy also determines whether compaction I/O is bursty or steady, which affects repair scheduling and whether you can share the disk with other services.

How the strategies work

STCS

STCS is the default. It groups SSTables into buckets by size and compacts similarly-sized files into one larger SSTable when a bucket reaches a threshold. This produces exponential size tiers: many small files, fewer medium files, and rare large files. Compaction I/O is bursty because the system waits for a tier to fill. Reads may need to check every SSTable in a tier when bloom filters produce false positives, so read amplification grows with the number of uncompacted tiers.
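To make the bucketing step concrete, here is a minimal sketch of size-based grouping. It is an illustration, not Cassandra's implementation: the function names are hypothetical, and the values 0.5, 1.5, and 4 are assumed to mirror the documented defaults for the bucket_low, bucket_high, and min_threshold options. The real strategy also applies a minimum SSTable size and an upper threshold, which are omitted here.

```python
from typing import List


def bucket_by_size(sizes_mb: List[float],
                   bucket_low: float = 0.5,
                   bucket_high: float = 1.5) -> List[List[float]]:
    """Group SSTable sizes so each bucket holds files of similar size.

    A file joins the first bucket whose current average size is within
    [avg * bucket_low, avg * bucket_high]; otherwise it starts a new bucket.
    """
    buckets: List[List[float]] = []
    for size in sorted(sizes_mb):
        for bucket in buckets:
            avg = sum(bucket) / len(bucket)
            if avg * bucket_low <= size <= avg * bucket_high:
                bucket.append(size)
                break
        else:
            # No existing bucket is close enough in size.
            buckets.append([size])
    return buckets


def compaction_candidates(sizes_mb: List[float],
                          min_threshold: int = 4) -> List[List[float]]:
    """Return only the buckets that hold enough files to trigger a merge."""
    return [b for b in bucket_by_size(sizes_mb) if len(b) >= min_threshold]


if __name__ == "__main__":
    sizes = [10, 11, 9, 12, 100, 110, 1000]
    print(bucket_by_size(sizes))         # three tiers of very different sizes
    print(compaction_candidates(sizes))  # only the small tier is full
```

With these sample sizes, only the tier of four small files reaches the threshold. The two medium files and the single large file sit idle until more files of similar size arrive, which is where the bursty I/O comes from.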

LCS

LCS organizes SSTables into levels. Level 0 contains freshly flushed SSTables. Higher levels contain non-overlapping SSTables of roughly uniform size, each level an order of magnitude larger than the previous. Promoting an SSTable from L0 to L1 requires reading all overlapping L1 SSTables and rewriting them. For levels above L0, a read touching a given partition key will find at most one SSTable per level, but L0 may contain many overlapping files. This structure lowers read amplification at the cost of higher write amplification, as data is rewritten multiple times while climbing levels. Compaction I/O is a steady drumbeat rather than a spike.
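The level arithmetic can be sketched in a few lines. This is a rough model under stated assumptions: a 160 MB target SSTable size and a fanout of 10 are assumed to match the defaults, the function names are hypothetical, and the model ignores partially filled levels and in-flight compactions.

```python
from typing import List


def level_capacities_mb(levels: int,
                        sstable_size_mb: int = 160,
                        fanout: int = 10) -> List[int]:
    """Target capacity in MB of L1..L<levels>; each level is fanout times larger."""
    return [sstable_size_mb * fanout ** n for n in range(1, levels + 1)]


def levels_needed(data_mb: float,
                  sstable_size_mb: int = 160,
                  fanout: int = 10) -> int:
    """Number of levels above L0 required to hold data_mb when filled bottom-up."""
    level = 0
    remaining = data_mb
    while remaining > 0:
        level += 1
        remaining -= sstable_size_mb * fanout ** level
    return level


def worst_case_sstables_per_read(levels_in_use: int, l0_count: int) -> int:
    """At most one SSTable per level above L0, plus every overlapping L0 file."""
    return levels_in_use + l0_count


if __name__ == "__main__":
    print(level_capacities_mb(3))               # [1600, 16000, 160000]
    depth = levels_needed(1_000_000)            # roughly 1 TB of data
    print(depth)                                # 4
    print(worst_case_sstables_per_read(depth, l0_count=12))  # 16
```

The last line shows why L0 matters so much: in this model the levels above L0 contribute at most four SSTables to a read, while a backlog of twelve L0 files triples that number.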

TWCS

TWCS divides SSTables into time windows based on data timestamps. Within the active window, compaction behaves like STCS. When a window closes and all data inside it expires, Cassandra can drop the entire SSTable instead of processing individual tombstones. This avoids tombstone accumulation, but it depends on time-ordered ingestion. Out-of-order writes that land in old windows, or read repair that pulls historical data into a current window, force those windows to recompact and lose the whole-file deletion optimization.
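The effect of out-of-order writes is easier to see with a small model of window assignment. The sketch below is a simplification: it buckets individual write timestamps by integer division, whereas Cassandra assigns whole SSTables to windows. The function names and the one-day window are illustrative assumptions.

```python
from typing import Iterable, List

DAY_SECONDS = 86_400  # assumed window size of one day


def window_start(ts_seconds: int, window_seconds: int = DAY_SECONDS) -> int:
    """Return the start of the time window that contains ts_seconds."""
    return ts_seconds - (ts_seconds % window_seconds)


def windows_touched(write_timestamps: Iterable[int],
                    window_seconds: int = DAY_SECONDS) -> List[int]:
    """Distinct windows that a batch of writes lands in, oldest first."""
    return sorted({window_start(t, window_seconds) for t in write_timestamps})


def late_writes(write_timestamps: Iterable[int],
                now_seconds: int,
                window_seconds: int = DAY_SECONDS) -> List[int]:
    """Timestamps that fall into a window older than the currently active one."""
    active = window_start(now_seconds, window_seconds)
    return [t for t in write_timestamps
            if window_start(t, window_seconds) < active]


if __name__ == "__main__":
    # Small numbers keep the arithmetic obvious: windows are 100 seconds wide.
    writes = [250, 260, 90]
    print(windows_touched(writes, window_seconds=100))                  # [0, 200]
    print(late_writes(writes, now_seconds=270, window_seconds=100))     # [90]
```

A healthy ingest path produces an empty list from late_writes. Any non-empty result means data is landing in a window that should already be closed.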

UCS

UCS is available starting with Cassandra 5.0. It uses density-based triggers and sharded compaction instead of fixed size tiers or levels. Multiple compaction shards run in parallel, and tunable scaling parameters let you bias behavior toward write-heavy or read-heavy profiles without changing the strategy class. Pending tasks tend to be more evenly distributed, producing a smoother I/O pattern than the STCS sawtooth.

flowchart TD
    Flush[Memtable flush] --> SST[SSTable created]
    SST --> STCS[STCS bucket by size]
    SST --> LCS[LCS level zero buffer]
    SST --> TWCS[TWCS time window]
    SST --> UCS[UCS density shard]
    STCS --> Tier[Merge into larger tiers]
    LCS --> Level[Promote to fixed levels]
    TWCS --> Window[Compact then delete whole window]
    UCS --> Tune[Tunable read or write bias]

How each strategy behaves in production

STCS appears on tables where the default was never changed. It produces a sawtooth I/O pattern: long quiet periods followed by sudden spikes when a size tier reaches its threshold. During spikes, disk utilization can jump from 20 percent to 90 percent in minutes. If the disk is already above 50 percent full, the node risks entering a disk exhaustion pattern where compaction cannot allocate temporary space and flushes back up. STCS can transiently need up to 100 percent additional disk space during major compaction.

LCS appears on tables that serve latency-sensitive reads or range scans. The I/O pattern is continuous because the strategy must constantly promote files through levels. Compaction activity persists even during low write volume. The danger signal is L0 growth. If the per-level SSTable counts reported by nodetool tablestats show dozens of SSTables in L0 and the count is rising, the node cannot promote data fast enough, and read amplification climbs silently until latency degrades. As a practical rule, keep more than 30 percent of the disk free.

TWCS appears on metrics, logs, and sensor data. In healthy operation, old windows should show zero compaction activity. Compaction running on windows that closed days ago indicates out-of-order writes or read repair pulling historical data into current windows. This is the primary TWCS failure mode. Because expired windows can be dropped whole, TWCS typically needs only 20 percent disk headroom.

UCS appears on Cassandra 5.0 clusters or migrated tables. Pending compaction tasks are more evenly distributed, so dashboards look smoother than the STCS sawtooth. The main operational consideration is ensuring you are on Cassandra 5.0 or later; attempting to use UCS on earlier versions will fail.
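The headroom figures quoted above for STCS, LCS, and TWCS can be turned into a simple check. This is a sketch, not a monitoring rule to copy verbatim: the thresholds are the rules of thumb from this article, the function name is hypothetical, and UCS is deliberately left out because no headroom figure is given for it here.

```python
# Minimum fraction of the disk that should stay free, per strategy.
# Values are the rules of thumb quoted in this article.
MIN_FREE_FRACTION = {
    "STCS": 0.50,
    "LCS": 0.30,
    "TWCS": 0.20,
}


def headroom_ok(strategy: str, disk_total_gb: float, disk_used_gb: float) -> bool:
    """Return True if free space exceeds the strategy-specific headroom."""
    if strategy not in MIN_FREE_FRACTION:
        raise ValueError(f"no headroom rule of thumb for strategy {strategy!r}")
    free_fraction = (disk_total_gb - disk_used_gb) / disk_total_gb
    return free_fraction > MIN_FREE_FRACTION[strategy]


if __name__ == "__main__":
    # The same 60 percent full disk is fine for LCS but not for STCS.
    print(headroom_ok("STCS", disk_total_gb=1000, disk_used_gb=600))  # False
    print(headroom_ok("LCS", disk_total_gb=1000, disk_used_gb=600))   # True
```

Keeping the thresholds in one table makes it harder to apply an LCS-sized margin to an STCS table by accident.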

Tradeoffs and when to use each

| Strategy | Best for | Read amplification | Write amplification | Disk headroom | Primary risk |
|---|---|---|---|---|---|
| STCS | Write-heavy, bursty ingestion | High | Low | Greater than 50 percent free | Disk exhaustion during major compaction |
| LCS | Read-heavy, steady workload | Low (excluding L0) | High | Greater than 30 percent free | L0 backlog under write spikes |
| TWCS | Time-series with uniform TTL | Medium | Low after window close | Greater than 20 percent free | Out-of-order writes forcing window recompaction |
| UCS | General purpose, Cassandra 5.0+ | Tunable | Tunable | | Version-locked to 5.0 and later |

If you run Cassandra 5.0 or later and have no strong reason to use a legacy strategy, start with UCS and tune its scaling parameters. On earlier versions, use STCS for ingestion-heavy tables and LCS for latency-sensitive reads. Reserve TWCS for tables where every row has a TTL and writes are time-ordered.

Strategy migration is possible with an ALTER TABLE statement. The change applies to future compactions; existing SSTables are not rewritten immediately. However, the new strategy may still initiate large compaction jobs as it restructures existing data, which can saturate disk I/O for hours or days. Do not change strategies during an incident. If you must migrate from STCS to LCS, plan the operation during a maintenance window when you can tolerate the write amplification cost of the initial level buildup.
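As an illustration of what the migration statement looks like, the sketch below assembles the CQL text in Python. The keyspace and table names are hypothetical, the helper is not part of any driver, and the sstable_size_in_mb value is only an example option. The function does no identifier validation, so do not feed it untrusted input.

```python
def alter_compaction_cql(keyspace: str, table: str,
                         strategy_class: str, **options: object) -> str:
    """Build an ALTER TABLE statement that sets the compaction strategy.

    Compaction options form a CQL map, so every key and value is quoted.
    """
    opts = {"class": strategy_class, **options}
    body = ", ".join(f"'{key}': '{value}'" for key, value in opts.items())
    return f"ALTER TABLE {keyspace}.{table} WITH compaction = {{{body}}};"


if __name__ == "__main__":
    # Hypothetical table; review the statement before running it in cqlsh.
    print(alter_compaction_cql("metrics", "samples",
                               "LeveledCompactionStrategy",
                               sstable_size_in_mb=160))
```

Generating the statement as text first gives you something to review and attach to the change ticket before anything touches the cluster.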

Signals to watch in production

| Signal | Why it matters | Warning sign |
|---|---|---|
| Pending compactions trend | Leading indicator of compaction debt | Increasing over 4 or more hours; STCS can spike to hundreds transiently, but LCS should stay low |
| SSTable count per table | Direct measure of read amplification | STCS sustained high count; LCS growing L0 count or total count rising steadily |
| Disk space free | Compaction needs temporary space to write merged files | Below strategy-specific headroom: STCS below 50 percent, LCS below 30 percent, TWCS below 20 percent |
| Disk I/O utilization | Compaction competes with reads and flushes for the same devices | Above 80 percent sustained |
| Tombstone scan warnings | Indicates TWCS window corruption or repair gaps | Sustained log entries or aborted reads |

Watch the trend, not the absolute value. A pending compaction count that rises from 10 to 30 over six hours is more significant than a stable count of 50. For LCS, watch L0 SSTable count. For STCS, watch the size of the largest tier relative to free disk. In TWCS, check nodetool compactionstats for compaction activity on closed windows that should be idle.
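One way to express "watch the trend" is a least-squares slope over recent samples. The sketch below is illustrative only: the sample format, the function names, and the threshold of one additional pending task per hour are assumptions, while the four-hour minimum span comes from the warning sign described above.

```python
from typing import List, Tuple

Sample = Tuple[float, float]  # (hours since first sample, pending compactions)


def slope_per_hour(samples: List[Sample]) -> float:
    """Least-squares slope of pending compactions against time in hours."""
    n = len(samples)
    mean_t = sum(t for t, _ in samples) / n
    mean_p = sum(p for _, p in samples) / n
    num = sum((t - mean_t) * (p - mean_p) for t, p in samples)
    den = sum((t - mean_t) ** 2 for t, _ in samples)
    return num / den if den else 0.0


def is_rising(samples: List[Sample],
              min_hours: float = 4.0,
              min_slope: float = 1.0) -> bool:
    """True if the backlog has grown steadily over at least min_hours."""
    span = samples[-1][0] - samples[0][0]
    return span >= min_hours and slope_per_hour(samples) >= min_slope


if __name__ == "__main__":
    rising = [(0, 10), (2, 17), (4, 23), (6, 30)]
    stable = [(0, 50), (2, 50), (4, 50), (6, 50)]
    print(is_rising(rising))  # True: 10 to 30 over six hours
    print(is_rising(stable))  # False: a higher count, but flat
```

The two sample series mirror the example in the text: the backlog that climbs from 10 to 30 is flagged, and the flat backlog of 50 is not.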

How Netdata helps

  • Correlate pending compactions with per-device disk I/O utilization and P99 read latency to spot a compaction death spiral before it triggers client timeouts.
  • Apply strategy-specific thresholds to SSTable count and disk space alerts. An STCS node at 60 percent full is an emergency; an LCS node at 60 percent is not.
  • Track JVM heap usage and GC pause duration alongside compaction metrics. Compaction backlog increases heap pressure from SSTable metadata, and long GC pauses can masquerade as compaction stalls.
  • Use per-node comparison charts to detect asymmetric compaction backlog, where one node falls behind peers due to a hot partition or local disk degradation.
  • Annotate maintenance windows so post-restart compaction bursts and planned strategy migrations do not trigger false-positive alerts.