ClickHouse No space left on device: emergency recovery when the data disk fills
When ClickHouse returns No space left on device to client inserts or the server log fills with write errors, the situation is past a simple capacity alert. ClickHouse does not degrade gradually on a full disk. Background merges halt immediately because they require temporary free space to write combined parts before removing source files. Once merges stop, small insert parts accumulate, metadata overhead grows, and disk usage accelerates. TTL-based expiration also stops because TTL cleanup is executed by merges. ClickHouse system tables such as system.query_log and system.part_log are MergeTree tables that can grow unbounded and consume the remaining space.
Recovery requires manual intervention. The goal is to reclaim enough space for merges to resume, identify what consumed the disk, and fix the root cause before the next cycle. Do not restart the server as a first step. A restart with a full disk can prevent ClickHouse from starting if it needs to write system tables or temporary files during initialization.
flowchart TD
A[Disk > 85% full] --> B[Merges cannot allocate temp space]
B --> C[Merges halt]
C --> D[Small parts accumulate]
D --> E[Metadata and indexes grow]
E --> F[TTL cleanup stops]
F --> G[Disk usage accelerates]
G --> H[Inserts fail with No space left]
What this means
ClickHouse stores data in immutable parts. The merge engine continuously combines smaller parts into larger ones to maintain query performance and control file descriptor usage. A merge reads all source part files and writes a new merged part to the same disk before deleting the sources. A single merge temporarily requires space for both source and output parts.
When the disk crosses the threshold where the largest potential merge cannot be completed, ClickHouse stops scheduling new merges. Existing merges may stall or fail. Without merges, every insert creates a new part that persists indefinitely. The part count rises, increasing files, index entries, and memory structures. This metadata growth itself consumes additional disk space.
TTL relies on merges to remove expired rows, so old data stays on disk. If the volume also hosts ZooKeeper transaction logs, Keeper snapshots, or application logs, those can tip the system from “nearly full” to “hard stop.” Inserts fail, read queries may fail if they require temporary space, and the server can crash if it cannot write logs or system tables.
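The space requirement described above can be sketched as a small shell check. This is a simplified model, not ClickHouse's actual reservation logic: it assumes the merged part can be as large as the sum of its sources, and the function name merge_fits and the byte values are hypothetical.

```shell
#!/bin/sh
# merge_fits FREE_BYTES SOURCE_PART_BYTES...
# Simplified model (an assumption, not ClickHouse's real reservation code):
# the output part can be as large as the sum of its source parts, and the
# sources stay on disk until the merge finishes.
merge_fits() (
    free=$1
    shift
    needed=0
    for part in "$@"; do
        needed=$((needed + part))
    done
    if [ "$free" -ge "$needed" ]; then
        echo "fits: need=$needed free=$free"
    else
        echo "blocked: need=$needed free=$free"
    fi
)

# Two 20 GB source parts with 50 GB free, then with only 30 GB free.
merge_fits 50000000000 20000000000 20000000000
merge_fits 30000000000 20000000000 20000000000
```

The second call reports the merge as blocked even though 30 GB is still free, which is the situation that starts the spiral.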
Common causes
| Cause | What it looks like | First thing to check |
|---|---|---|
| Ingestion outpacing TTL or retention | Disk usage grows steadily day over day; old monthly partitions are still present and large. | system.parts for partition ages and sizes. |
| Merge death spiral from prior pressure | Disk is above 85%, active parts per partition are climbing, and system.merges is empty. | Merge activity and part count trends together. |
| Mutation backlog creating temporary copies | Disk usage is high despite flat ingestion; system.mutations shows long-running or stuck jobs. | Pending mutations with is_done = 0. |
| System tables growing unbounded | system.query_log or system.part_log consumes tens or hundreds of gigabytes. | bytes_on_disk grouped by database and table, including system tables. |
| Detached parts not cleaned up | Used disk space exceeds the sum of active parts; old detached directories linger. | system.detached_parts or the detached/ directories on disk. |
| Non-ClickHouse files on the data volume | df shows less free space than system.disks.free_space suggests. | OS-level disk usage outside of ClickHouse data paths. |
Quick checks
Run these read-only queries and commands in order.
# OS-level disk usage on the data mount
df -h /var/lib/clickhouse
-- ClickHouse's view of disk space and reservation
SELECT
name,
path,
formatReadableSize(free_space) AS free,
formatReadableSize(total_space) AS total,
round(100 * (1 - free_space / total_space), 1) AS used_pct,
formatReadableSize(unreserved_space) AS unreserved
FROM system.disks;
-- Largest partitions by on-disk size
SELECT
database,
table,
partition_id,
count() AS active_parts,
sum(rows) AS total_rows,
formatReadableSize(sum(bytes_on_disk)) AS size
FROM system.parts
WHERE active = 1
GROUP BY database, table, partition_id
ORDER BY sum(bytes_on_disk) DESC
LIMIT 20;
-- Are merges running, or is the pipeline frozen?
SELECT
count(*) AS active_merges,
countIf(is_mutation = 1) AS mutations,
countIf(is_mutation = 0) AS regular_merges
FROM system.merges;
-- Pending mutations that could be blocking merges and consuming space
SELECT
database,
table,
mutation_id,
command,
create_time,
is_done,
parts_to_do
FROM system.mutations
WHERE is_done = 0
ORDER BY create_time;
-- Space consumed by system tables themselves
SELECT
database,
table,
formatReadableSize(sum(bytes_on_disk)) AS disk_size,
count() AS parts
FROM system.parts
WHERE active = 1
AND database = 'system'
GROUP BY database, table
ORDER BY sum(bytes_on_disk) DESC
LIMIT 10;
-- Detached parts that still occupy disk but are invisible to queries
SELECT
database,
table,
name,
reason,
formatReadableSize(bytes_on_disk) AS size,
modification_time
FROM system.detached_parts
ORDER BY bytes_on_disk DESC
LIMIT 20;
-- Confirm whether inserts are being delayed or rejected
SELECT event, value
FROM system.events
WHERE event IN ('DelayedInserts', 'RejectedInserts');
How to diagnose it
1. Quantify the gap. Compare OS df output against system.disks. If the OS shows significantly less free space than ClickHouse reports, the difference is consumed by non-ClickHouse files, detached parts, or filesystem overhead. Check for core dumps, package caches, or oversized logs.
2. Find the largest consumers. Rank partitions by bytes_on_disk using the query above. Focus on the top three. In most emergencies, one or two large tables or old partitions dominate.
3. Determine if merges are blocked. If system.merges returns zero rows while active parts are high, the merge pipeline is stalled. Check system.mutations next. Pending mutations can hold merge threads and temporarily duplicate parts.
4. Check system table bloat. If system.query_log, system.part_log, or system.text_log rank high in the size query, the monitoring instrumentation itself is contributing to the emergency. This is common on high-QPS clusters where TTL was never configured for log tables.
5. Assess detached parts. system.detached_parts shows parts that were removed from active service but not deleted. They continue to consume space. Large counts here usually follow failed ALTER operations, replication fetch failures, or manual DETACH commands that were never cleaned up.
6. Verify TTL health. If TTL is configured but expired data remains, confirm merges are running. TTL only drops data during merges. A full disk blocks merges, which blocks TTL, which prevents space reclamation.
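For the first step, a quick way to see what occupies the volume at the OS level is to rank first-level entries by size. The sketch below is a minimal helper under stated assumptions: top_consumers is a hypothetical name, /var/lib/clickhouse is the mount point used in the quick checks, and the -d option of du is assumed to be available (it is in GNU and BSD du, but it is not POSIX).

```shell
#!/bin/sh
# top_consumers DIR [N]
# Print the N largest first-level files and directories under DIR, in KiB,
# largest first. -x keeps du on one filesystem; -a includes plain files such
# as core dumps. The total for DIR itself is normally the first line.
top_consumers() (
    dir=$1
    n=${2:-10}
    du -x -k -a -d 1 "$dir" 2>/dev/null | sort -rn | head -n "$n"
)

# Example: the data mount from the quick checks; adjust to your layout.
top_consumers /var/lib/clickhouse 10
```

Run it as a user that can read the data directory, otherwise unreadable subtrees are silently skipped and the ranking is incomplete.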
Metrics and signals to monitor
| Signal | Why it matters | Warning sign |
|---|---|---|
| Disk free space | Merges need temporary headroom roughly equal to the combined size of the parts being merged. | Free space below 20% or unreserved_space approaching zero. |
| Active part count per partition | Parts grow when merges halt, accelerating metadata overhead. | Count rising while merge activity is zero. |
| Merge activity | Confirms the background pipeline is actually making progress. | No entries in system.merges for more than 10 minutes during active inserts. |
| Mutation queue depth | Mutations block merges and temporarily double part sizes. | Any is_done = 0 mutation with parts_to_do not decreasing over time. |
| System table sizes | query_log and part_log are MergeTree tables that can fill the disk. | Any system table larger than 10% of total data size. |
| Insert delays and rejections | Early indicators that the part count or disk pressure threshold is crossed. | DelayedInserts or RejectedInserts counters increasing. |
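The disk free space signal can be checked from a cron job or an ad hoc shell with a few lines. This is a minimal sketch, not a replacement for a monitoring agent: disk_used_pct and check_disk are hypothetical helper names, and the percentage is computed as used / (used + available) from the POSIX df -Pk columns.

```shell
#!/bin/sh
# disk_used_pct PATH -> integer percent used, from POSIX `df -Pk` columns
# (column 3 = used KiB, column 4 = available KiB).
disk_used_pct() {
    df -Pk "$1" | awk 'NR == 2 { t = $3 + $4; if (t > 0) printf "%d\n", 100 * $3 / t; else print 0 }'
}

# check_disk PATH WARN_PCT -> prints OK or WARN with the measured value.
check_disk() (
    pct=$(disk_used_pct "$1")
    if [ "$pct" -ge "$2" ]; then
        echo "WARN $1 ${pct}% used (threshold $2%)"
    else
        echo "OK $1 ${pct}% used (threshold $2%)"
    fi
)

# Example against the root filesystem; point it at the ClickHouse data mount.
check_disk / 80
```

A threshold of 80 matches the "free space below 20%" warning sign in the table.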
Fixes
Immediate stabilization: stop the bleeding
Stop or throttle ingestors before deleting anything. Each insert creates a new part that adds metadata overhead. If replication is active, pause non-essential distributed sends and DDL operations. Do not restart ClickHouse yet; a restart with zero free space can fail during initialization.
Reclaim space from old partitions
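Detaching an old partition removes it from queries without destroying it, but it does not free space by itself: the parts are moved into the table's detached/ directory on the same disk. Space is reclaimed only after you copy those directories to another volume and delete them from the data disk. They can be reattached later by moving them back and running ATTACH PARTITION.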
-- Detach a specific old partition (safe, reversible)
ALTER TABLE db.table DETACH PARTITION ID '202401';
To permanently delete a partition and immediately reclaim space, use DROP PARTITION. This is destructive and irreversible.
-- Permanently drop a partition (destructive)
ALTER TABLE db.table DROP PARTITION ID '202401';
After dropping a partition, or after moving detached parts off the volume, run the system.disks check again to confirm that space was freed.
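When several old partitions must go, it can help to generate the statements for review instead of typing them one by one. The sketch below only prints SQL and executes nothing. It assumes numeric YYYYMM partition IDs like the 202401 example above; gen_drop_statements and the sample IDs are hypothetical, and the real ID list would come from the system.parts query in the quick checks.

```shell
#!/bin/sh
# gen_drop_statements DB TABLE CUTOFF
# Reads partition IDs (one per line) on stdin and prints a DROP PARTITION
# statement for each ID that is numerically older than CUTOFF.
# Assumes numeric YYYYMM partition IDs such as 202401; nothing is executed.
gen_drop_statements() {
    awk -v db="$1" -v tbl="$2" -v cutoff="$3" '
        /^[0-9]+$/ && $1 + 0 < cutoff + 0 {
            printf "ALTER TABLE %s.%s DROP PARTITION ID \047%s\047;\n", db, tbl, $1
        }'
}

# Hypothetical partition list; review the output before running any of it.
printf '%s\n' 202311 202312 202401 202402 | gen_drop_statements db table 202401
```

Non-numeric IDs are skipped rather than guessed at, so tables partitioned by something other than a month number need a different filter.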
Unblock system tables
If system.query_log, system.part_log, or similar tables are consuming significant space, truncate them for immediate relief. This destroys log history, so extract any needed forensic data first.
-- Truncate a system log table (destructive to history)
TRUNCATE TABLE system.query_log;
For long-term prevention, configure TTL on these tables in the server configuration so they do not grow unbounded. The change requires a server restart to apply.
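One way to set that TTL is a small override file in the server's config.d directory. The sketch below writes such a file into a scratch directory so it can be inspected first. Treat the details as assumptions to verify against your version: write_log_ttl is a hypothetical helper, the clickhouse root element applies to recent releases, the 30-day retention is only an example, and a ttl element generally cannot be combined with a custom engine element for the same log table.

```shell
#!/bin/sh
# write_log_ttl DIR TABLE DAYS
# Writes a config override that sets a TTL on one system log table.
# DIR stands in for the server's config.d directory (commonly
# /etc/clickhouse-server/config.d, an assumption about your install).
write_log_ttl() (
    dir=$1
    table=$2
    days=$3
    cat > "$dir/${table}_ttl.xml" <<EOF
<clickhouse>
    <$table>
        <database>system</database>
        <table>$table</table>
        <ttl>event_date + INTERVAL $days DAY DELETE</ttl>
    </$table>
</clickhouse>
EOF
)

# Write into a scratch directory first and inspect the result.
out_dir=$(mktemp -d)
write_log_ttl "$out_dir" query_log 30
cat "$out_dir/query_log_ttl.xml"
```

After the restart, confirm the TTL with SHOW CREATE TABLE system.query_log. Depending on the version, a changed definition may cause the existing table to be renamed rather than altered, so check the system database for leftover copies that still hold space.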
Clean up detached parts
If detached parts from past operations are consuming space and are no longer needed, remove them from the detached/ directory. This is destructive; the data cannot be recovered without restoring from backup. Paths vary by database engine and configuration; verify the exact location before running destructive commands.
# Remove detached parts for a specific table (destructive)
rm -rf /var/lib/clickhouse/data/db/table/detached/*
Prefer querying system.detached_parts first to document what will be removed.
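A more cautious variant of the removal step is to list only the detached directories older than a chosen age, with their sizes, and review that list before deleting anything. This is a sketch under assumptions: list_old_detached is a hypothetical helper, the path is the example path used above, and directory modification time is only a rough proxy for when a part was detached. ClickHouse also provides ALTER TABLE ... DROP DETACHED PART, gated by the allow_drop_detached setting, which avoids touching the filesystem directly.

```shell
#!/bin/sh
# list_old_detached DIR DAYS
# Print size (KiB) and path of each first-level directory under DIR whose
# modification time is more than DAYS days old. Nothing is deleted.
list_old_detached() (
    dir=$1
    days=$2
    for p in "$dir"/*; do
        [ -d "$p" ] || continue
        # -prune stops find from descending; -mtime +N matches older entries.
        if [ -n "$(find "$p" -prune -mtime +"$days")" ]; then
            du -sk "$p"
        fi
    done
)

# Example path from the command above; verify the real location first.
list_old_detached /var/lib/clickhouse/data/db/table/detached 7
```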
Cancel mutations that are doubling disk usage
If a long-running mutation is holding merge threads and creating temporary part copies, kill it. The mutation must be reissued later, but cancellation frees threads and stops temporary space consumption.
-- Kill a specific mutation
KILL MUTATION WHERE database = 'db' AND table = 'table' AND mutation_id = '0000000000';
Restore merge activity
Once free space is above the merge threshold, verify that merges resume:
-- Check for new merge activity
SELECT * FROM system.merges LIMIT 5;
Monitor system.parts over the next 10 to 30 minutes. Active part counts should begin trending downward. If parts do not decrease, investigate whether the background pool is saturated or whether I/O latency is preventing merge progress.
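The trend check can be reduced to comparing a handful of samples. In the sketch below, parts_trend is a hypothetical helper and the sample values are made up; in practice each sample could come from something like clickhouse-client --query "SELECT count() FROM system.parts WHERE active" run every few minutes.

```shell
#!/bin/sh
# parts_trend SAMPLE...
# Compares the first and last active-part counts from a series of samples
# and reports whether the count is falling, flat, or rising.
parts_trend() (
    first=$1
    for last in "$@"; do :; done
    if [ "$last" -lt "$first" ]; then
        echo "falling: $first -> $last"
    elif [ "$last" -gt "$first" ]; then
        echo "rising: $first -> $last"
    else
        echo "flat: $first -> $last"
    fi
)

# Hypothetical samples taken a few minutes apart.
parts_trend 4200 4150 3900 3600
```

Only the endpoints are compared, so a short dip followed by renewed growth still reads as rising, which is the conservative answer during recovery.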
Structural fixes
If disk expansion is the only option, add space to the underlying volume or move older tables to tiered storage. Do not rely on TTL alone to save you in the immediate emergency; TTL requires merges, and merges require free space.
Prevention
- Maintain merge headroom. Keep disk usage below 80% as an operational target. The safety margin must be at least 2x the size of the largest active partition to allow a full merge to complete.
- Configure TTL on system tables. Set retention policies for system.query_log, system.part_log, system.text_log, and system.trace_log. On high-traffic clusters, these can outgrow user data.
- Monitor the merge-to-insert ratio. Track part creation rate against merge completion rate. If parts are created faster than they are merged, part count and disk usage will grow until they trigger a crisis.
- Audit partitioning strategy. Overly granular partitioning (for example, by hour) multiplies part counts and metadata overhead. Daily or monthly partitioning is usually more space-efficient.
- Clean detached parts proactively. Include system.detached_parts in routine checks. Detached parts left after incidents or failed alters silently consume space indefinitely.
- Set disk utilization alerts below the danger zone. Page when disk usage exceeds 80%, not at 95%. The extra runway is needed for merge temp space.
How Netdata helps
- Real-time correlation of disk utilization, merge activity, and active part count exposes the merge death spiral before inserts fail.
- Per-partition part counts and growth rates predict disk pressure hours before the filesystem fills.
- DelayedInserts and RejectedInserts event counters alert on write-pipeline backpressure.
- System table size monitoring alongside user data catches cases where query_log or part_log becomes the primary disk consumer.
- Background pool utilization and mutation queue depth distinguish disk-full causes from merge thread starvation.
Related guides
- ClickHouse active part count growing: reading MaxPartCountForPartition before it pages
- ClickHouse ALTER UPDATE/DELETE overuse: why mutations are not row updates
- ClickHouse async inserts: when async_insert fixes too-many-parts and when it hides it
- ClickHouse DelayedInserts climbing: the warning before too-many-parts
- ClickHouse distributed DDL stuck: ON CLUSTER queries that never finish
- ClickHouse insert latency rising: the leading indicator of write-pipeline trouble
- ClickHouse cannot connect to ZooKeeper/Keeper: diagnosing the coordination layer
- ClickHouse Keeper latency high: the early warning before sessions expire
- ClickHouse Keeper saturation spiral: too many tables, DDL storms, and cluster freeze
- ClickHouse Memory limit (for query) exceeded: per-query limits and GROUP BY/JOIN blowups
- ClickHouse Memory limit (total) exceeded - server-wide memory pressure and fixes
- ClickHouse memory pressure death spiral: runaway queries, retries, and OOM







