ClickHouse background pool saturation: when merges and mutations starve

Insert latency is climbing and DelayedInserts is ticking up. system.merges shows every slot occupied, yet system.parts keeps growing. When the background merge and mutation pool saturates, new merges queue instead of starting, parts accumulate, and the distance to insert rejections shrinks fast. Distinguish true thread starvation from I/O-bound stalls, identify when mutations are the culprit, and relieve pressure before inserts fail.

What this means

ClickHouse consolidates immutable parts using a dedicated thread pool tracked in system.metrics as BackgroundMergesAndMutationsPoolTask. Concurrency is controlled by background_merges_mutations_concurrency_ratio (default 2); the older background_pool_size is deprecated. Merges and mutations share this pool. Separate pools handle fetches, moves, and distributed sends.

Pool saturation means every thread is occupied. The effects are:

  • New merges queue.
  • Merge completion rate drops below part creation rate.
  • Part count per partition rises.
  • Inserts delay once parts_to_delay_insert is crossed, then reject at parts_to_throw_insert.

Causes include too much work (mutations, many small inserts), too little concurrency, or threads stuck on disk I/O. The symptoms look identical in pool metrics, but the fixes differ.

flowchart TD
    A[Part count growing] --> B{Pool saturated?}
    B -->|Yes| C{Mutations running?}
    B -->|No| D[Check insert batch size and partition key]
    C -->|Yes| E[Kill or wait for mutations]
    C -->|No| F{I/O await high?}
    F -->|Yes| G[Reduce competing I/O or add capacity]
    F -->|No| H[Increase concurrency ratio or throttle inserts]

Common causes

CauseWhat it looks likeFirst thing to check
Mutation backlogsystem.merges shows is_mutation = 1 occupying slots; system.mutations has is_done = 0 with slow or flat parts_to_doSELECT count(*) FROM system.mutations WHERE is_done = 0
Insert rate exceeds merge throughputSteady part-count growth, pool at 100%, no mutations, merges running but not fast enoughSELECT event, value FROM system.events WHERE event IN ('InsertQuery','InsertedRows')
I/O saturation masquerading as thread starvationPool threads are active but merge progress is frozen or crawling; disk await is highiostat -xz 1 5 and SELECT progress, elapsed FROM system.merges
Single large merge hogging slotsOne entry in system.merges with very large num_parts, elapsed over an hour, consuming most memorySELECT elapsed, num_parts, memory_usage FROM system.merges ORDER BY elapsed DESC
Many small inserts creating unmergeable overheadHigh InsertQuery count but low InsertedRows per query; part count spikes faster than usualRows per insert from system.events deltas

Quick checks

-- Check background pool utilization
SELECT metric, value
FROM system.metrics
WHERE metric LIKE 'Background%Pool%'
ORDER BY metric;
-- Check active merges and whether they are mutations
SELECT
    database,
    table,
    elapsed,
    progress,
    num_parts,
    is_mutation,
    formatReadableSize(memory_usage) AS mem
FROM system.merges
ORDER BY elapsed DESC;
-- Check for incomplete mutations
SELECT
    database,
    table,
    mutation_id,
    parts_to_do,
    is_done,
    latest_fail_reason
FROM system.mutations
WHERE is_done = 0
ORDER BY create_time;
-- Check active parts per partition (worst offenders)
SELECT
    database,
    table,
    partition_id,
    count(*) AS parts
FROM system.parts
WHERE active = 1
GROUP BY database, table, partition_id
ORDER BY parts DESC
LIMIT 10;
-- Check insert pressure and early warnings
SELECT event, value
FROM system.events
WHERE event IN ('DelayedInserts', 'RejectedInserts', 'InsertQuery', 'InsertedRows');
# Check disk I/O latency during the incident
iostat -xz 1 5

How to diagnose it

  1. Confirm pool saturation. Query system.metrics for BackgroundMergesAndMutationsPoolTask and BackgroundMergesAndMutationsPoolSize. A sustained ratio at or near 1.0 for more than 10 minutes during active inserts means saturation.
  2. See what occupies the slots. Query system.merges. If is_mutation = 1, mutations are competing directly with merges for threads.
  3. Check mutation backlog. Query system.mutations for is_done = 0. If parts_to_do is flat or failing to decrease, a mutation is stuck or very large.
  4. Correlate with part-count growth. Query system.parts grouped by partition_id. If the max per partition rises while the pool is full, merge throughput is the bottleneck.
  5. Distinguish I/O-bound from CPU/thread-bound stalls. In system.merges, sample progress twice over 60 seconds. If it barely moves and iostat shows high await, the threads are waiting on disk, not compute.
  6. Check for insert-pattern changes. Compare InsertQuery to InsertedRows in system.events. A sudden jump in query count with flat row count means smaller batches, which creates more parts per row and overwhelms the pool.

Metrics and signals to monitor

SignalWhy it mattersWarning sign
Background pool utilization (BackgroundMergesAndMutationsPoolTask / PoolSize)Headroom for merge and mutation workSustained > 0.9 for > 10 minutes
Active parts per partitionDistance to insert throttling and rejectionWithin 50% of configured parts_to_delay_insert
DelayedInsertsEarliest warning that merges are falling behindCounter increasing over successive samples
Merge progress velocityWhether active merges are completing or stuckprogress flat for > 5 minutes
Disk I/O awaitDistinguishes thread starvation from I/O wait> 20 ms on SSD, or trending sharply up
Mutation parts_to_doMutation backlog consuming pool capacityNot decreasing over a 30-minute window

Fixes

Kill or pause heavy mutations

If system.merges shows is_mutation = 1 and system.mutations has stalled work, cancel non-critical mutations to free pool capacity.

-- Identify the mutation
SELECT database, table, mutation_id
FROM system.mutations
WHERE is_done = 0
ORDER BY create_time
LIMIT 1;

-- Kill it. Destructive: partial work is discarded.
KILL MUTATION WHERE database = 'db' AND table = 'table' AND mutation_id = '0000000000';

Tradeoff: Killed mutations must be reissued later. On large tables, even valid mutations may take hours; schedule them during low-traffic windows.

Throttle or batch inserts

If the pool is saturated because inserts arrive faster than merges can consolidate, reduce the part creation rate. Increase batch size so each INSERT creates fewer parts, or temporarily slow the ingest pipeline.

Tradeoff: Upstream systems may need to buffer data or tolerate higher latency.

Increase merge concurrency

If the node has idle CPU and disk headroom, raise background_merges_mutations_concurrency_ratio above its default of 2. This increases the number of concurrent merge and mutation tasks. Configuration changes may not affect work already queued.

Tradeoff: More concurrent merges increase I/O and CPU consumption. Do not raise this if disk await is already elevated.

Address I/O saturation

If iostat shows high await and merge progress is frozen, the pool is not starved for threads; it is starved for disk. Reduce competing load:

  • Temporarily route heavy analytical queries away from the node.
  • Pause backups or large exports touching the same volumes.
  • If you use tiered storage, verify that hot-tier disk bandwidth is not the bottleneck.

Tradeoff: Reduced query capacity or added infrastructure cost.

Reclaim space to unblock merges

If disk space is tight, merges cannot allocate temporary output files and will stall. See ClickHouse disk space collapse: why merges need free space and how the spiral starts for emergency space recovery.

Prevention

  • Monitor the merge-to-insert ratio. Track part creation rate against merge completion rate. If the ratio stays above 1.0 for more than a few minutes, the pool is losing ground.
  • Avoid mutation storms. Do not issue ALTER UPDATE/DELETE during peak ingest. Mutations are full part rewrites, not row updates, and they monopolize pool threads. See ClickHouse ALTER UPDATE/DELETE overuse: why mutations are not row updates.
  • Batch inserts. Target thousands of rows per INSERT. Many small inserts create parts faster than any pool configuration can merge.
  • Size pools to I/O capacity, not just core count. A concurrency ratio of 2 is a starting point. If your storage cannot sustain random I/O from more concurrent merges, raising the ratio will only turn thread starvation into I/O saturation.

How Netdata helps

  • Correlate BackgroundMergesAndMutationsPoolTask with active part count and DelayedInserts on the same timeline to spot saturation.
  • Alert when pool utilization crosses 90% alongside positive part-count growth, catching the bottleneck before inserts delay.
  • Surface disk I/O latency next to merge activity to distinguish thread starvation from I/O waits without manual iostat.
  • Track mutation backlog and merge duration trends without polling system.mutations.
  • Visualize insert latency degradation as an early warning, often hours before RejectedInserts appears.