ClickHouse too many open files: file descriptors, part count, and nofile limits
ClickHouse aborts queries with “Too many open files” or the server process dies. Logs show errors about failing to open column files or metadata. This is not a traditional leak; it is a capacity cliff. Every active MergeTree part keeps multiple files open, background merges temporarily spike that count, and the Linux nofile limit is usually the bottleneck. The default limit of 1024 is catastrophic for production ClickHouse.
A table with 100 columns and 200 active parts needs roughly 20,000 file descriptors for column files alone (100 × 200), plus metadata, sockets, and log files. A merge that combines several source parts into one opens all source and target column files simultaneously. When the process reaches its nofile limit (the kernel enforces the soft limit, which can be raised only as far as the hard limit), the next open() fails with EMFILE, which crashes the server or causes cascading query failures.
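The arithmetic above can be sketched as a small shell helper. This is a rough model only: it uses the article's approximation of one open file per column per active part. The function name and the optional per-part overhead argument are hypothetical, added for illustration.

```shell
#!/usr/bin/env bash
# Rough baseline FD estimate: one open file per column per active part.
# overhead_per_part is a hypothetical allowance for index, checksum,
# and metadata files; it defaults to 0 to match the article's figure.
estimate_baseline_fds() {
  local columns=$1 parts=$2 overhead_per_part=${3:-0}
  echo $(( parts * (columns + overhead_per_part) ))
}

estimate_baseline_fds 100 200      # prints 20000
estimate_baseline_fds 100 200 5    # prints 21000
```

Comparing the result against the Max open files value of the running process gives a first idea of how much headroom is left.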
What this means
Each MergeTree data part is a directory containing one file per column plus index, checksum, and metadata files. ClickHouse opens these files to read or write the part. More active parts means more open files. Wide tables amplify this because file count scales linearly with columns.
Background merges make usage spiky. When ClickHouse merges N source parts into one target part, it opens all source and all target column files concurrently. A single merge can spike file descriptor usage far above the steady-state level. If the nofile limit is close to typical usage, merges push you over.
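As a hedged illustration of the spike, the sketch below counts the column files a merge touches at once under the simplified model described here: every column file of each source part plus every column file of the target part. The function name and the concurrent-merge multiplier are assumptions made for the example, not ClickHouse settings.

```shell
#!/usr/bin/env bash
# Files held open by merges under the simplified model:
# columns * (source parts + 1 target part), per concurrent merge.
estimate_merge_spike() {
  local columns=$1 source_parts=$2 concurrent_merges=${3:-1}
  echo $(( concurrent_merges * columns * (source_parts + 1) ))
}

estimate_merge_spike 100 10      # prints 1100
estimate_merge_spike 100 10 4    # prints 4400
```

Treat the numbers as an order-of-magnitude guide: the point is that several concurrent merges on a wide table can add thousands of descriptors on top of the baseline.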
The failure mode is a hard cliff. Once the process hits the nofile limit, open() returns EMFILE. ClickHouse may crash, refuse new connections, or fail queries with errors about unreadable parts. There is no graceful degradation.
flowchart TD
A[Inserts create parts] --> B[Active part count grows]
B --> C[Column files + metadata kept open]
C --> D[Baseline FD usage rises]
D --> E[Merge opens all source and target files]
E --> F[FD usage spikes]
F --> G[Hit nofile hard limit]
G --> H[Open fails: server crash or query errors]
Common causes
| Cause | What it looks like | First thing to check |
|---|---|---|
| Part count explosion | FD count tracks active parts; DelayedInserts is climbing | system.parts active count per partition |
| Low nofile limit | FD usage is near the limit even under normal load | /proc/<pid>/limits for Max open files |
| Wide tables | FD count is disproportionately high relative to part count | Compare columns per table against FD baseline |
| Merge storms | Sudden FD spikes that correlate with heavy merge activity | system.merges for num_parts and elapsed time |
| Connection leaks | FD count is high but active part count is normal | system.metrics for TCPConnection and HTTPConnection |
Quick checks
Run these read-only checks from the server host.
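# Resolve the server PID once. Plain pgrep matches only the first 15 characters
# of the process name, so match the full command line with -f; -n keeps the
# newest match. Confirm the PID with ps if other commands mention clickhouse-server.
PID=$(pgrep -nf clickhouse-server)
# Current open FD count for the ClickHouse process
ls "/proc/$PID/fd" | wc -l
# Soft and hard nofile limits
grep "Max open files" "/proc/$PID/limits"
# What kinds of FDs are open (files, sockets, pipes), grouped by kind
ls -l "/proc/$PID/fd" | awk 'NR > 1 {print $NF}' | sed -E 's/:\[[0-9]+\]$//; s#^/.*#file#' | sort | uniq -c | sort -nr | head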
-- ClickHouse-tracked open file handles
SELECT metric, value
FROM system.metrics
WHERE metric IN ('OpenFileForRead', 'OpenFileForWrite');
-- TODO: verify metric names for your ClickHouse version
-- Active parts per partition (the primary FD driver)
SELECT
database,
table,
partition_id,
count() AS active_parts
FROM system.parts
WHERE active = 1
GROUP BY database, table, partition_id
ORDER BY active_parts DESC
LIMIT 20;
-- Active merges (temporary FD spikes)
SELECT
database,
table,
elapsed,
num_parts,
is_mutation
FROM system.merges
ORDER BY elapsed DESC;
-- Background pool saturation
SELECT metric, value
FROM system.metrics
WHERE metric LIKE 'Background%Pool%';
-- Client connection count
SELECT metric, value
FROM system.metrics
WHERE metric IN ('TCPConnection', 'HTTPConnection', 'InterserverConnection');
How to diagnose it
1. Confirm the limit is being hit. Check /proc/<pid>/limits for the Max open files line and compare it to ls /proc/<pid>/fd | wc -l. If the count is within 10-20% of the soft limit, you are in the danger zone. Grep the ClickHouse server logs for “Too many open files” or “EMFILE”.
2. Correlate FD usage with part count. Run the system.parts query. Each active part contributes roughly one FD per column plus a few for metadata. If your top partitions have hundreds of active parts and your tables have dozens or hundreds of columns, the math explains the FD count.
3. Check for merge-induced spikes. Query system.merges. If merges are combining many parts or have been running for a long time, they are the likely trigger that pushed FD usage over the limit even if the baseline seemed safe.
4. Identify wide-table hotspots. If one table dominates the FD count despite having fewer parts than others, it likely has many columns. Wide tables are FD-intensive because every part opens a file for every column.
5. Rule out connection leaks. If TCPConnection or HTTPConnection is unexpectedly high and stable while query load is low, clients may be leaking connections. Each connection consumes an FD independently of parts.
6. Verify systemd or container limits. Even if you raised /etc/security/limits.conf, a ClickHouse process started by systemd or inside a container inherits limits from the runtime, not from PAM. Check /proc/<pid>/limits to see the effective limit.
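The first check, comparing open descriptors against the soft limit, can be sketched as a small helper. It is a minimal sketch assuming a Linux /proc filesystem and enough privilege to read the target process's fd directory; fd_usage is a hypothetical name, not a ClickHouse or Netdata tool.

```shell
#!/usr/bin/env bash
# Print "<open_fds> <soft_limit> <percent_used>" for a PID (Linux only).
fd_usage() {
  local pid=$1 open soft
  open=$(ls "/proc/$pid/fd" | wc -l)
  # In /proc/<pid>/limits the "Max open files" row holds the soft limit
  # in field 4 and the hard limit in field 5.
  soft=$(awk '/^Max open files/ {print $4}' "/proc/$pid/limits")
  if [ "$soft" = "unlimited" ]; then
    echo "$open unlimited 0"
  else
    echo "$open $soft $(( open * 100 / soft ))"
  fi
}

# Example (run as root or as the ClickHouse user, with the server PID):
#   fd_usage <pid>
```

A percentage of 80 or more corresponds to the 10-20% danger zone described above.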
Metrics and signals to monitor
| Signal | Why it matters | Warning sign |
|---|---|---|
| Open FDs vs nofile limit | Proximity to the hard cliff | > 70% of soft limit sustained |
| Active parts per partition | Primary driver of baseline FD usage | > 50% of parts_to_delay_insert default (1000) |
| OpenFileForRead / OpenFileForWrite | ClickHouse internal tracking of open handles | Steady growth without corresponding insert rate increase |
| Background pool utilization | Saturated pool delays merges, letting parts accumulate | > 90% sustained for > 10 minutes |
| Merge activity | Merges temporarily spike FD usage | Long-running merges with high num_parts |
| Client connection count | Connections consume FDs independently of data parts | > 80% of max_connections |
Fixes
Raise the nofile limit
Set the limit to at least 100000 before ClickHouse starts; ClickHouse recommends 262144. The limit is fixed at process start time, so changing it requires restarting ClickHouse. Plan this during a maintenance window.
- For systemd: set LimitNOFILE=262144 in the service unit, run systemctl daemon-reload, then restart ClickHouse.
- For containers: configure the limit at the runtime or orchestrator level.
- For bare metal or init.d: set the limit in the environment that starts the process.
Do not rely on /etc/security/limits.conf alone for systemd or container deployments. Verify with /proc/<pid>/limits after restart.
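For the systemd case, one common way to set LimitNOFILE without editing the packaged unit file is a drop-in. The sketch below only prints the drop-in content; the unit name clickhouse-server and the drop-in file name nofile.conf are assumptions, so adjust them to your installation.

```shell
#!/usr/bin/env bash
# Print a systemd drop-in that sets the nofile limit for a service.
render_nofile_dropin() {
  local limit=${1:-262144}
  printf '[Service]\nLimitNOFILE=%s\n' "$limit"
}

render_nofile_dropin 262144

# To install (assumed unit name and path; needs root):
#   sudo mkdir -p /etc/systemd/system/clickhouse-server.service.d
#   render_nofile_dropin 262144 | sudo tee /etc/systemd/system/clickhouse-server.service.d/nofile.conf
#   sudo systemctl daemon-reload
#   sudo systemctl restart clickhouse-server
#   grep "Max open files" /proc/<pid>/limits
```

The final grep is the verification step: the value in /proc/<pid>/limits is what the running process actually received, regardless of what any configuration file says.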
Reduce part count pressure immediately
If you cannot restart, reduce active parts to lower open files:
- Throttle or pause inserts to stop new part creation.
- Kill blocking mutations with KILL MUTATION if system.mutations shows long-running mutations consuming the background pool. See ClickHouse ALTER UPDATE/DELETE overuse.
- Detach old partitions to remove their files from the active set. WARNING: This makes data unavailable until reattached.
Tune merge capacity
If the merge pool is saturated, merges cannot complete fast enough to close files and free FDs:
- Check system.merges and background pool metrics. If merges are stuck due to disk space, see ClickHouse disk space collapse.
- Ensure background_merges_mutations_concurrency_ratio is appropriate for your CPU and I/O capacity.
Fix connection leaks
If connection count is the primary FD consumer rather than parts:
- Identify leaking clients via system.processes and system.metrics.
- Set aggressive client-side connection timeouts and pool limits.
- Consider lowering max_connections temporarily to force clients to queue rather than open infinite sockets.
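To judge whether sockets or part files dominate the descriptor count, the open FDs of the process can be split into sockets and everything else. This is a minimal sketch assuming Linux /proc and read access to the process's fd directory; fd_breakdown is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# Split a process's open FDs into sockets and everything else (Linux only).
fd_breakdown() {
  local pid=$1
  # ls -l prints a "total" header first, so skip line 1; the last field
  # is the symlink target, e.g. "socket:[12345]" or a file path.
  ls -l "/proc/$pid/fd" | awk '
    NR > 1 { total++; if ($NF ~ /^socket:/) sockets++ }
    END { printf "total=%d sockets=%d other=%d\n", total, sockets, total - sockets }'
}

# Example (run as root or as the ClickHouse user, with the server PID):
#   fd_breakdown <pid>
```

If sockets make up most of the total while the active part count is normal, client connections are the more likely culprit than parts.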
Prevention
- Set nofile to 262144 at the OS and runtime level. The Linux default of 1024 is insufficient for production ClickHouse, and the limit is fixed at process start.
- Monitor part count at the partition level. The FD cost is per-partition and per-column, so a single hot partition can exhaust limits even when the table total looks safe.
- Batch inserts to 1000+ rows per INSERT. Many small inserts create parts faster than merges can close them, driving FD usage up permanently.
- Alert on the merge-to-insert ratio. When part creation chronically exceeds merge completion, FD usage trends upward until it hits the hard limit.
- Account for wide tables in capacity planning. A table with 500 columns opens an order of magnitude more files per part than a table with 10 columns.
- Verify container runtime limits independently. Orchestrators and container runtimes can impose their own nofile ceilings that override OS settings.
How Netdata helps
- Correlates open file descriptor usage with the process nofile limit, showing proximity to the hard cliff before crashes.
- Tracks active part count per partition and MaxPartCountForPartition to identify FD growth drivers.
- Monitors background pool utilization and merge activity to flag spikes in temporary FD consumption.
- Surfaces DelayedInserts and RejectedInserts, which often rise alongside FD pressure.
- Displays per-query memory and I/O alongside FD metrics to distinguish part-driven exhaustion from connection leaks or query storms.
Related guides
- ClickHouse active part count growing: reading MaxPartCountForPartition before it pages
- ClickHouse ALTER UPDATE/DELETE overuse: why mutations are not row updates
- ClickHouse async inserts: when async_insert fixes too-many-parts and when it hides it
- ClickHouse DelayedInserts climbing: the warning before too-many-parts
- ClickHouse disk space collapse: why merges need free space and how the spiral starts
- ClickHouse disk space monitoring: free_space, unreserved_space, and the 80% target
- ClickHouse distributed DDL stuck: ON CLUSTER queries that never finish
- ClickHouse insert latency rising: the leading indicator of write-pipeline trouble
- ClickHouse cannot connect to ZooKeeper/Keeper: diagnosing the coordination layer
- ClickHouse Keeper latency high: the early warning before sessions expire
- ClickHouse Keeper saturation spiral: too many tables, DDL storms, and cluster freeze
- ClickHouse Memory limit (for query) exceeded: per-query limits and GROUP BY/JOIN blowups







