Cassandra repair not running: the silent gap that resurrects deleted data

The symptoms are deleted rows reappearing in application queries, or a last successful entry in system_distributed.repair_history that is older than your gc_grace_seconds window. In Cassandra, a repair that is not running, not completing, or not scheduled is an active data integrity risk.

Cassandra uses tombstones to track deletions and TTL expirations. Those tombstones must survive on every replica until anti-entropy repair propagates the delete. Once gc_grace_seconds passes, compaction is free to drop them. If repair has not finished for a table within that window, replicas that missed the original delete still hold the live data, while the replicas that received it may already have discarded the tombstone, so nothing remains to shadow the old value. The result is zombie data resurrection.

flowchart TD
    A[Repair not completed within gc_grace_seconds] --> B[Tombstones compacted away on replicas that received delete]
    B --> C[Unrepaired replicas retain original live data]
    C --> D[Deleted data reappears on reads]

What this means

Anti-entropy repair builds Merkle trees across token ranges and streams differences between replicas. It is the only mechanism that guarantees every replica has seen every deletion. gc_grace_seconds defaults to 864,000 seconds (10 days). Compaction will purge tombstones older than this threshold. Repair must complete for every table before that purge happens.

When repair lapses:

  • Tombstones compact away on replicas that received the delete.
  • Replicas that were down, partitioned, or slow during the delete never received the tombstone.
  • Those unrepaired replicas still serve the original data.
  • A subsequent read at a consistency level that includes the unrepaired replica returns the deleted row.

This is the deterministic outcome of a missing or incomplete repair cycle.
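
The failure condition above reduces to two timestamp comparisons. The following is a minimal shell sketch under stated assumptions: the function name resurrection_risk and its arguments are hypothetical, all times are epoch seconds, and it reasons about a single delete without looking at per-replica state, which a real check would need.

```shell
# Hypothetical helper: classify one delete against the repair timeline.
#   $1 deleted_at         epoch seconds when the delete was issued
#   $2 repair_started_at  start time of the last repair that completed
#                         successfully (0 = never repaired)
#   $3 now                epoch seconds
#   $4 gc_grace           gc_grace_seconds (defaults to 864000, i.e. 10 days)
resurrection_risk() {
    local deleted_at=$1
    local repair_started_at=$2
    local now=$3
    local gc_grace=${4:-864000}
    local purge_after=$(( deleted_at + gc_grace ))

    if [ "$repair_started_at" -ge "$deleted_at" ]; then
        echo "safe"       # a successful repair began after the delete
    elif [ "$now" -lt "$purge_after" ]; then
        echo "pending"    # tombstone still protected; repair must finish first
    else
        echo "at-risk"    # tombstone may already be compacted away
    fi
}
```

A delete in the "pending" state is the one to act on: it is still recoverable if repair completes before deleted_at plus gc_grace_seconds.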

Common causes

| Cause | What it looks like | First thing to check |
| --- | --- | --- |
| Repair never scheduled or automation broke | No recent entries in system_distributed.repair_history; no orchestration tool configured | Query system_distributed.repair_history for the latest repair per table |
| Repair starts but fails silently | Repair command returns while background sessions hang or abort; only some token ranges finish | nodetool repair_admin list (4.0+) for incomplete sessions |
| Repair too slow for the dataset | Last successful repair timestamp drifts closer to gc_grace_seconds each cycle | Compare last repair time to the table’s gc_grace_seconds |
| Pre-4.0 incremental repair bugs | SSTable proliferation or inconsistent state on Cassandra 3.x clusters | Cassandra version and whether incremental repair is enabled |
| Compaction backlog blocking repair | High pending compactions; anti-compaction cannot split SSTables fast enough | nodetool compactionstats and pending task trends |
| STCS anti-compaction overload on dense nodes | Repair stalls for hours or days; large SSTables rewritten repeatedly | SSTable sizes and compaction strategy configuration |

Quick checks

Run these read-only commands to assess the repair gap without mutating data.

# Check recent repair results per table (3.x+); the table column is named columnfamily_name
cqlsh -e "SELECT keyspace_name, columnfamily_name, status, finished_at FROM system_distributed.repair_history LIMIT 50;"

# Get gc_grace_seconds for a specific table
cqlsh -e "SELECT gc_grace_seconds FROM system_schema.tables WHERE keyspace_name='ks' AND table_name='tbl';"

# Check parent repair session results; compare successful_ranges to requested_ranges
cqlsh -e "SELECT keyspace_name, started_at, finished_at, requested_ranges, successful_ranges, exception_message FROM system_distributed.parent_repair_history LIMIT 20;"

# List active and historical repair sessions (4.0+)
nodetool repair_admin list

# Check repaired vs unrepaired data percentage (4.0+)
nodetool tablestats <keyspace>.<table>

# Check compaction backlog that may block repair progress
nodetool compactionstats

# Check active streaming sessions from running repair
nodetool netstats

# Search logs for repair errors or session timeouts
grep -i "repair" /var/log/cassandra/system.log
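
To turn the tablestats check into a number you can alert on, the "Percent repaired" line can be pulled out with awk. This is a small sketch: the function name percent_repaired is hypothetical, and it assumes the "Percent repaired: <value>" line format, so verify it against the output of your Cassandra version.

```shell
# Reads `nodetool tablestats` output on stdin and prints the value of each
# "Percent repaired" line (one line of output per table in the input).
percent_repaired() {
    awk -F': *' '/Percent repaired/ { print $2 }'
}

# Usage (requires a running node; ks.tbl is a placeholder):
#   nodetool tablestats ks.tbl | percent_repaired
```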

How to diagnose it

  1. Confirm the repair gap. Query system_distributed.repair_history to find the latest successful repair for each keyspace and table. On Cassandra 4.0+, nodetool repair_admin list shows the state of incremental repair sessions. If a table has no successful repair within gc_grace_seconds, the gap is confirmed.

  2. Check the deadline. Retrieve the table’s gc_grace_seconds value from system_schema.tables. Compute 80% of that value. If the last repair is older than that threshold, the table is in the danger zone. With the default 10 days, alert at 8 days.

  3. Identify silent failures. A repair command can return to the shell while background sessions continue, fail, or hang. Verify session status in repair_admin list, parent_repair_history, or logs. Look for sessions that started but never reached a successful completion.

  4. Inspect compaction interaction. Repair generates anti-compaction work. If nodetool compactionstats shows pending tasks trending upward while repair is active, compaction is the bottleneck. On Cassandra 5.0+, check whether you are using STCS with incremental repair. STCS anti-compaction can rewrite multi-hundred-gigabyte SSTables repeatedly, stalling repair indefinitely.

  5. Review version-specific bugs. On Cassandra 3.x, incremental repair is unreliable and can leave repaired state inconsistent (CASSANDRA-9143). If you are on 3.x and using incremental repair, switch to full subrange repair or upgrade to 4.0+. Check for FINALIZED repair sessions not cleaned up after range movements in 4.0+/5.0+. Also check for incremental repair failures on mixed IPv4/IPv6 clusters.

  6. Check node density and vNode count. High num_tokens increases Merkle tree comparisons per repair. Very large SSTables on dense nodes extend repair duration per subrange. Evaluate whether the chosen segment count for subrange repair is appropriate for your SSTable sizes.
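
Steps 1 and 2 can be sketched as a single age check. The helper below is illustrative only: the name repair_gap_status and its argument order are assumptions, and it expects the last successful repair time as epoch seconds, which you would need to derive from the finished_at value returned by the repair history query.

```shell
# Hypothetical helper: classify a table's repair age against gc_grace_seconds.
#   $1 last_finished  epoch seconds of the last successful repair
#   $2 now            epoch seconds (pass "$(date +%s)" in real use)
#   $3 gc_grace       the table's gc_grace_seconds (defaults to 864000)
repair_gap_status() {
    local last_finished=$1
    local now=$2
    local gc_grace=${3:-864000}
    local age=$(( now - last_finished ))
    local warn_at=$(( gc_grace * 80 / 100 ))   # the 80% danger-zone threshold

    if [ "$age" -ge "$gc_grace" ]; then
        echo "CRITICAL age=${age}s"
    elif [ "$age" -ge "$warn_at" ]; then
        echo "WARNING age=${age}s"
    else
        echo "OK age=${age}s"
    fi
}
```

With the default gc_grace_seconds, the warning threshold works out to 691,200 seconds, which is the 8-day mark mentioned in step 2.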

Metrics and signals to monitor

| Signal | Why it matters | Warning sign |
| --- | --- | --- |
| Time since last successful repair | Direct measure of the anti-entropy window | Exceeds 80% of gc_grace_seconds |
| Repair session status | Catches repairs that start but do not finish | Sessions that do not reach successful completion after extended runtime |
| Pending compactions | Anti-compaction and repair-generated SSTables must be processed | Trending upward over hours during a repair cycle |
| Unrepaired SSTable percentage (4.0+) | Shows incremental repair backlog | Percentage growing week over week |
| Tombstone scan warnings | Tombstones that cannot yet be purged accumulate and slow reads | Sustained log entries exceeding tombstone_warn_threshold |
| Read repair rate | Compensates for full repair gaps but adds latency | Spike not correlated with a node returning from downtime |
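
The pending compactions signal can be sampled and checked for a rising trend with two small helpers. This is a sketch, not a monitoring implementation: the function names are hypothetical, the parser assumes the "pending tasks: N" summary line printed by nodetool compactionstats, and "rising" is defined here simply as every sample exceeding the previous one.

```shell
# Reads `nodetool compactionstats` output on stdin and prints the total from
# the "pending tasks: N" summary line.
pending_compactions() {
    awk -F': *' '/^pending tasks/ { print $2; exit }'
}

# Prints "rising" if each sample is greater than the one before it,
# "not-rising" otherwise. Args: two or more samples, oldest first.
pending_trend() {
    local prev sample
    if [ "$#" -lt 2 ]; then
        echo "insufficient-data"
        return 0
    fi
    prev=$1
    shift
    for sample in "$@"; do
        if [ "$sample" -le "$prev" ]; then
            echo "not-rising"
            return 0
        fi
        prev=$sample
    done
    echo "rising"
}

# Usage (requires a running node):
#   nodetool compactionstats | pending_compactions
```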

Fixes

If repair has never been scheduled

Deploy Reaper or another orchestration tool. Manual cron jobs invoking nodetool repair tend to fail silently because they do not handle token range calculation, failure retries, or progress tracking. Reaper manages subrange scheduling and persists completion state.

If repair is running but too slow

Switch to subrange full repair with smaller segments to reduce per-pass memory and I/O overhead. If the network is the bottleneck, temporarily raise stream_throughput_outbound_megabits_per_sec (renamed stream_throughput_outbound in 4.1+; adjustable at runtime with nodetool setstreamthroughput), but monitor client latency. If the dataset is too large to repair within gc_grace_seconds, you must either increase subrange segment count or increase gc_grace_seconds after confirming repair can complete within the new window.
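
To illustrate what subrange repair looks like at the command level, the sketch below splits one token range into equal segments and prints a full repair command for each, using the -st and -et options of nodetool repair. It only prints commands and runs nothing. The function name and argument order are hypothetical, and it assumes start is less than end and that their difference fits in a signed 64-bit integer. An orchestration tool such as Reaper does this bookkeeping for you, including retries.

```shell
# Hypothetical helper: print one full subrange repair command per segment.
#   $1 start token  $2 end token  $3 segment count  $4 keyspace  $5 table
subrange_repair_commands() {
    local start=$1
    local end=$2
    local segments=$3
    local keyspace=$4
    local table=$5
    local step=$(( (end - start) / segments ))
    local i=0
    local seg_start=$start
    local seg_end

    while [ "$i" -lt "$segments" ]; do
        if [ "$i" -eq $(( segments - 1 )) ]; then
            seg_end=$end                    # last segment absorbs the remainder
        else
            seg_end=$(( seg_start + step ))
        fi
        echo "nodetool repair -full -st $seg_start -et $seg_end $keyspace $table"
        seg_start=$seg_end
        i=$(( i + 1 ))
    done
}

# Usage (tokens, ks, and tbl are placeholders):
#   subrange_repair_commands 0 1000000 8 ks tbl
```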

If compaction is blocking repair

On Cassandra 5.0+, migrate tables experiencing anti-compaction stalls to UnifiedCompactionStrategy (UCS). UCS handles incremental repair anti-compaction far more efficiently than SizeTieredCompactionStrategy (STCS).

WARNING: Changing compaction strategy requires rewriting all existing SSTables and is extremely disk and I/O intensive. Test on a non-production cluster first, and execute during a maintenance window.

If you must stay on STCS, schedule repair only during off-peak hours and ensure at least 50% free disk space for anti-compaction headroom.
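
The 50% headroom guideline can be checked before a repair starts. A minimal sketch, assuming POSIX df -P output with the used percentage in the fifth column; the function names are hypothetical, and the data directory path in the usage line is an assumption to be replaced with your data_file_directories setting.

```shell
# Reads `df -P <path>` output on stdin and prints the free-space percentage
# (100 minus the "Capacity" column of the first filesystem row).
disk_free_pct() {
    awk 'NR == 2 { gsub(/%/, "", $5); print 100 - $5 }'
}

# Returns 0 when free space meets the 50% headroom guideline, 1 otherwise.
#   $1 free-space percentage as an integer
has_anticompaction_headroom() {
    local free_pct=$1
    [ "$free_pct" -ge 50 ]
}

# Usage (path is an assumption):
#   has_anticompaction_headroom "$(df -P /var/lib/cassandra/data | disk_free_pct)" \
#       || echo "insufficient headroom for anti-compaction"
```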

If you are on Cassandra 3.x

Avoid incremental repair entirely. Use full repair only, and pass -full explicitly, because nodetool repair defaults to incremental on 3.x. The incremental repair rewrite in 4.0 made the feature production-ready; on 3.x it can corrupt repaired state and proliferate tiny SSTables.

If data resurrection has already occurred

Run a full repair immediately on affected tables to reconcile replicas. Then audit application data for inconsistencies. There is no automatic rollback for resurrected data; you must identify and re-delete invalid rows at the application layer after repair restores consistency.

Prevention

  • Alert on the gap, not the failure. Set an alert when last successful repair exceeds 80% of gc_grace_seconds. Do not wait for data resurrection.
  • Verify completion, not just invocation. Monitoring only that nodetool repair was invoked is insufficient. Confirm all token ranges finish successfully.
  • Schedule during off-peak windows. Repair generates heavy disk and network I/O. Running it during peak traffic creates the saturation that causes repair to stall.
  • Track compaction as a leading indicator. A node with rising pending compactions will soon fail to complete repair within its window. Treat compaction backlog as a repair risk.
  • Prefer automated orchestration. Cassandra 6.0 introduces automated repair orchestration, which removes external scheduling dependencies. Until then, use Reaper.

How Netdata helps

  • Correlate repair gaps with compaction backlog, disk I/O saturation, and streaming throughput to identify whether repair is stalled by resource contention.
  • Alert on system_distributed.repair_history age or nodetool repair_admin list state via custom data collection.
  • Visualize tombstone scan warnings alongside read latency percentiles to spot early signs of unrepaired tombstone accumulation.
  • Track per-node SSTable counts and pending compaction trends as leading indicators for repair capacity.