ClickHouse system log tables eating disk: query_log, part_log, and TTL
You are investigating disk growth on a ClickHouse node. User tables look reasonable, but system.query_log or system.part_log are consuming tens or hundreds of gigabytes. These tables use MergeTree-family engines and are subject to the same part-count and merge pressure limits as production tables. Without a TTL rule, data accumulates forever. During incidents with high query error rates or retry storms, query_log can expand fast enough to threaten disk capacity and stall logging.
ClickHouse flushes telemetry in batches to system.query_log, system.part_log, system.trace_log, and system.text_log. Each flush creates a data part. Without TTL, old parts are never dropped. On high-QPS clusters, frequent flushes create many parts, and every failed query is logged, so error spikes accelerate growth. Eventually the system tables become the largest disk consumers. If part counts cross parts_to_delay_insert or parts_to_throw_insert, new log inserts are delayed or rejected.
What this means
System log tables behave exactly like MergeTree tables: inserts create active parts, background merges consolidate them, and part-count limits apply. No TTL means unbounded disk growth.
The risk is not just space. When system.query_log accumulates merge debt, it competes with production tables for background pool threads and disk I/O. If it crosses its part limit, new log entries are rejected. You lose observability exactly when you need it.
Common causes
| Cause | What it looks like | First thing to check |
|---|---|---|
| No TTL on system log tables | system.query_log or system.part_log rank among top disk consumers; oldest parts span months | SELECT ... FROM system.parts WHERE database = 'system' for size and part counts |
| High-error incident | query_log disk usage spikes after an outage; error rate correlates with growth | FailedQuery in system.events around the incident window |
| High merge or insert activity | part_log is large on write-heavy clusters with frequent merges or mutations | Merge throughput and part_log event volume |
| Merge starvation on system tables | system.query_log part count climbs toward parts_to_throw_insert | Active parts per partition for system tables |
Quick checks
-- System tables consuming disk
SELECT
database,
table,
formatReadableSize(sum(bytes_on_disk)) AS size,
count() AS active_parts
FROM system.parts
WHERE active = 1 AND database = 'system'
GROUP BY database, table
ORDER BY size DESC;
-- Per-partition part counts for system tables
SELECT
table,
partition_id,
count() AS parts_in_partition
FROM system.parts
WHERE active = 1 AND database = 'system'
GROUP BY table, partition_id
ORDER BY parts_in_partition DESC
LIMIT 10;
-- Query log volume over the last 24 hours
SELECT
toStartOfHour(event_time) AS hour,
count() AS queries,
countIf(type = 'ExceptionWhileProcessing') AS errors
FROM system.query_log
WHERE event_time > now() - INTERVAL 24 HOUR
GROUP BY hour
ORDER BY hour DESC;
-- Inspect engine definition for TTL and partitioning
SELECT engine_full FROM system.tables WHERE database = 'system' AND name = 'query_log';
-- Insert limits hit on any table
SELECT event, value
FROM system.events
WHERE event IN ('RejectedInserts', 'DelayedInserts');
-- Active merges on system tables
SELECT
database,
table,
elapsed,
progress
FROM system.merges
WHERE database = 'system';
How to diagnose it
- Measure system table footprint. Run the
system.partssize query. Ifsystem.query_log,system.part_log,trace_log, ortext_logexceed your retention budget, they are the primary consumers. - Check part density. High active part counts in
system.query_logindicate merge pressure. - Correlate with workload. A spike in
ExceptionWhileProcessingor overall query volume explains suddenquery_loggrowth. - Verify TTL.
SELECT engine_full FROM system.tablesorSHOW CREATE TABLE system.query_logreveals whether a TTL clause exists. Absence confirms unbounded growth. - Assess merge health. Query
system.mergesfor thesystemdatabase. If merges are scarce while parts grow, the table will hit limits regardless of TTL.
Metrics and signals to monitor
| Signal | Why it matters | Warning sign |
|---|---|---|
System table disk size (system.parts.bytes_on_disk) | Direct measure of unbounded growth | Any system table larger than your largest production table |
Active parts in system.query_log | Predicts too-many-parts on the log table itself | Part count trending toward your parts_to_delay_insert threshold |
| Failed query rate | Failed queries are still logged; errors accelerate query_log growth | Sudden spike in ExceptionWhileProcessing entries |
system.part_log size | High merge activity generates many part events | Steady growth on write-heavy clusters |
DelayedInserts / RejectedInserts | System tables can trigger these limits internally | Nonzero deltas with system tables near part limits |
Fixes
Configure TTL on system tables
Set a TTL expression on system.query_log, system.part_log, system.trace_log, and system.text_log. Default installations usually partition these tables by toYYYYMMDD(event_date), so a TTL on event_date drops entire partitions efficiently during merges.
For an existing table, apply TTL with ALTER TABLE:
ALTER TABLE system.query_log MODIFY TTL event_date + INTERVAL 7 DAY;
For tables created from server configuration, set the <ttl> element inside the log table’s configuration block:
<query_log>
<ttl>event_date + INTERVAL 30 DAY</ttl>
</query_log>
Configuration changes do not alter tables that already exist. Apply ALTER TABLE to existing system tables, or drop and recreate the table after changing its configuration.
Reclaim space from an overgrown system table
TTL runs during merges. If a system table already has hundreds of active parts and merges are starved, TTL alone may not reclaim space quickly.
For immediate relief, drop old partitions directly after verifying partition IDs in system.parts:
ALTER TABLE system.query_log DROP PARTITION ID '20240601';
Warning: this permanently deletes data for that partition. Verify the partition ID and backup any critical forensic data first.
If merges are proceeding but too slowly, reduce incoming query volume or error rates to slow part creation while TTL catches up.
Balance retention against forensics
Shorter TTL saves disk and reduces merge debt. Longer TTL aids incident investigation. query_log is usually the most valuable retention target. trace_log and text_log are often the largest relative to their utility, so consider shorter retention for verbose logs.
Ensure time-based partitioning
Default system log tables are partitioned by toYYYYMMDD(event_date). If the engine definition shows a non-time partition key, TTL drops become expensive because they must delete row ranges inside parts instead of dropping whole partitions. Keep time-based partitioning when customizing system log engine definitions.
Prevention
- Set TTL on all system MergeTree tables before taking production traffic.
- Include
system.query_logandsystem.part_login disk capacity planning. - Monitor the
systemdatabase size with the same urgency as production databases. - During incident response, expect
query_logto expand rapidly and verify it has disk and part-count headroom.
How Netdata helps
- Correlates ClickHouse disk usage with table-level part counts to surface when
system.query_logorsystem.part_logbecome top consumers. - Tracks
RejectedInsertsandDelayedInsertsto catch system tables approaching part limits. - Monitors query error rates to warn that
query_logis about to balloon. - Visualizes active part counts per table, including system tables, to spot merge debt before it blocks inserts.
- Alerts on disk space utilization with awareness of merge headroom requirements.
Related guides
- ClickHouse active part count growing: reading MaxPartCountForPartition before it pages
- ClickHouse ALTER UPDATE/DELETE overuse: why mutations are not row updates
- ClickHouse async inserts: when async_insert fixes too-many-parts and when it hides it
- ClickHouse background pool saturation: when merges and mutations starve
- ClickHouse mark cache and uncompressed cache: reading low hit rates
- ClickHouse client connections climbing: TCP 9000, HTTP 8123, and connection leaks
- ClickHouse checksum mismatch and broken parts: detecting data corruption
- ClickHouse DelayedInserts climbing: the warning before too-many-parts
- ClickHouse detached parts piling up: reading system.detached_parts and reclaiming space
- ClickHouse disk space collapse: why merges need free space and how the spiral starts
- ClickHouse disk space monitoring: free_space, unreserved_space, and the 80% target
- ClickHouse distributed DDL stuck: ON CLUSTER queries that never finish







