ClickHouse

Plugin: go.d.plugin Module: clickhouse

Overview

This collector retrieves performance data from ClickHouse for connections, queries, resources, replication, IO, and data operations (inserts, selects, merges) using HTTP requests and ClickHouse system tables. It monitors your ClickHouse server’s health and activity.

It sends HTTP requests to the ClickHouse HTTP interface, executing SELECT queries to retrieve data from various system tables. Specifically, it collects metrics from the following tables:

system.metrics
system.async_metrics
system.events
system.disks
system.parts
system.processes

This collector is supported on all platforms.

This collector supports collecting metrics from multiple instances of this integration, including remote instances.

Default Behavior

Auto-Detection

By default, it detects ClickHouse instances running on localhost that are listening on port 8123. On startup, it tries to collect metrics from:

http://127.0.0.1:8123

Limits

The default configuration for this integration does not impose any limits on data collection.

Performance Impact

The default configuration for this integration is not expected to impose a significant performance impact on the system.

Setup

Prerequisites

No action required.

Configuration

File

The configuration file name for this integration is go.d/clickhouse.conf.

You can edit the configuration file using the edit-config script from the Netdata config directory.

cd /etc/netdata 2>/dev/null || cd /opt/netdata/etc/netdata
sudo ./edit-config go.d/clickhouse.conf

Options

The following options can be defined globally: update_every, autodetection_retry.

Name	Description	Default	Required
update_every	Data collection frequency.	1	no
autodetection_retry	Recheck interval in seconds. Zero means no recheck will be scheduled.	0	no
url	Server URL.	http://127.0.0.1:8123	yes
timeout	HTTP request timeout.	1	no
username	Username for basic HTTP authentication.		no
password	Password for basic HTTP authentication.		no
proxy_url	Proxy URL.		no
proxy_username	Username for proxy basic HTTP authentication.		no
proxy_password	Password for proxy basic HTTP authentication.		no
method	HTTP request method.	GET	no
body	HTTP request body.		no
headers	HTTP request headers.		no
not_follow_redirects	Redirect handling policy. Controls whether the client follows redirects.	no	no
tls_skip_verify	Server certificate chain and hostname validation policy. Controls whether the client performs this check.	no	no
tls_ca	Certification authority that the client uses when verifying the server’s certificates.		no
tls_cert	Client TLS certificate.		no
tls_key	Client TLS key.		no

Examples

Basic

A basic example configuration.

jobs:
  - name: local
    url: http://127.0.0.1:8123

HTTP authentication

Basic HTTP authentication.

jobs:
  - name: local
    url: http://127.0.0.1:8123
    username: username
    password: password

HTTPS with self-signed certificate

ClickHouse with enabled HTTPS and self-signed certificate.

jobs:
  - name: local
    url: https://127.0.0.1:8123
    tls_skip_verify: yes

Multi-instance

Note: When you define multiple jobs, their names must be unique.

Collecting metrics from local and remote instances.

jobs:
  - name: local
    url: http://127.0.0.1:8123

  - name: remote
    url: http://192.0.2.1:8123

Metrics

Metrics grouped by scope.

The scope defines the instance that the metric belongs to. An instance is uniquely identified by a set of labels.

Per ClickHouse instance

These metrics refer to the entire monitored application.

This scope has no labels.

Metrics:

Metric	Dimensions	Unit
clickhouse.connections	tcp, http, mysql, postgresql, interserver	connections
clickhouse.slow_reads	slow	reads/s
clickhouse.read_backoff	read_backoff	events/s
clickhouse.memory_usage	used	bytes
clickhouse.running_queries	running	queries
clickhouse.queries_preempted	preempted	queries
clickhouse.queries	successful, failed	queries/s
clickhouse.select_queries	successful, failed	selects/s
clickhouse.insert_queries	successful, failed	inserts/s
clickhouse.queries_memory_limit_exceeded	mem_limit_exceeded	queries/s
clickhouse.longest_running_query_time	longest_query_time	seconds
clickhouse.queries_latency	queries_time	microseconds
clickhouse.select_queries_latency	selects_time	microseconds
clickhouse.insert_queries_latency	inserts_time	microseconds
clickhouse.io	reads, writes	bytes/s
clickhouse.iops	reads, writes	ops/s
clickhouse.io_errors	read, write	errors/s
clickhouse.io_seeks	lseek	ops/s
clickhouse.io_file_opens	file_open	ops/s
clickhouse.replicated_parts_current_activity	fetch, send, check	parts
clickhouse.replicas_max_absolute_dela	replication_delay	seconds
clickhouse.replicated_readonly_tables	read_only	tables
clickhouse.replicated_data_loss	data_loss	events
clickhouse.replicated_part_fetches	successful, failed	fetches/s
clickhouse.inserted_rows	inserted	rows/s
clickhouse.inserted_bytes	inserted	bytes/s
clickhouse.rejected_inserts	rejected	inserts/s
clickhouse.delayed_inserts	delayed	inserts/s
clickhouse.delayed_inserts_throttle_time	delayed_inserts_throttle_time	milliseconds
clickhouse.selected_bytes	selected	bytes/s
clickhouse.selected_rows	selected	rows/s
clickhouse.selected_parts	selected	parts/s
clickhouse.selected_ranges	selected	ranges/s
clickhouse.selected_marks	selected	marks/s
clickhouse.merges	merge	ops/s
clickhouse.merges_latency	merges_time	milliseconds
clickhouse.merged_uncompressed_bytes	merged_uncompressed	bytes/s
clickhouse.merged_rows	merged	rows/s
clickhouse.merge_tree_data_writer_inserted_rows	inserted	rows/s
clickhouse.merge_tree_data_writer_uncompressed_bytes	inserted	bytes/s
clickhouse.merge_tree_data_writer_compressed_bytes	written	bytes/s
clickhouse.uncompressed_cache_requests	hits, misses	requests/s
clickhouse.mark_cache_requests	hits, misses	requests/s
clickhouse.max_part_count_for_partition	max_parts_partition	parts
clickhouse.parts_count	temporary, pre_active, active, deleting, delete_on_destroy, outdated, wide, compact	parts
distributed_connections	active	connections
distributed_connections_attempts	connection	attempts/s
distributed_connections_fail_retries	connection_retry	fails/s
distributed_connections_fail_exhausted_retries	connection_retry_exhausted	fails/s
distributed_files_to_insert	pending_insertions	files
distributed_rejected_inserts	rejected	inserts/s
distributed_delayed_inserts	delayed	inserts/s
distributed_delayed_inserts_latency	delayed_time	milliseconds
distributed_sync_insertion_timeout_exceeded	sync_insertion	timeouts/s
distributed_async_insertions_failures	async_insertions	failures/s
clickhouse.uptime	uptime	seconds

Per disk

These metrics refer to the Disk.

Labels:

Label	Description
disk_name	Name of the disk as defined in the server configuration.

Metrics:

Metric	Dimensions	Unit
clickhouse.disk_space_usage	free, used	bytes

Per table

These metrics refer to the Database Table.

Labels:

Label	Description
database	Name of the database.
table	Name of the table.

Metrics:

Metric	Dimensions	Unit
clickhouse.database_table_size	size	bytes
clickhouse.database_table_parts	parts	parts
clickhouse.database_table_rows	rows	rows

Alerts

The following alerts are available:

Alert name	On metric	Description
clickhouse_restarted	clickhouse.uptime	ClickHouse has recently been restarted
clickhouse_queries_preempted	clickhouse.queries_preempted	ClickHouse has queries that are stopped and waiting due to priority setting
clickhouse_long_running_query	clickhouse.longest_running_query_time	ClickHouse has a long-running query exceeding the threshold
clickhouse_rejected_inserts	clickhouse.rejected_inserts	ClickHouse has INSERT queries that are rejected due to high number of active data parts for partition in a MergeTree
clickhouse_delayed_inserts	clickhouse.delayed_inserts	ClickHouse has INSERT queries that are throttled due to high number of active data parts for partition in a MergeTree
clickhouse_replication_lag	clickhouse.replicas_max_absolute_delay	ClickHouse is experiencing replication lag greater than 5 minutes
clickhouse_replicated_readonly_tables	clickhouse.replicated_readonly_tables	ClickHouse has replicated tables in readonly state due to ZooKeeper session loss/startup without ZooKeeper configured
clickhouse_max_part_count_for_partition	clickhouse.max_part_count_for_partition	ClickHouse high number of parts per partition
clickhouse_distributed_connections_failures	clickhouse.distributed_connections_fail_exhausted_retries	ClickHouse has failed distributed connections after exhausting all retry attempts
clickhouse_distributed_files_to_insert	clickhouse.distributed_files_to_insert	ClickHouse high number of pending files to process for asynchronous insertion into Distributed tables

Troubleshooting

Debug Mode

Important: Debug mode is not supported for data collection jobs created via the UI using the Dyncfg feature.

To troubleshoot issues with the clickhouse collector, run the go.d.plugin with the debug option enabled. The output should give you clues as to why the collector isn’t working.

Navigate to the plugins.d directory, usually at /usr/libexec/netdata/plugins.d/. If that’s not the case on your system, open netdata.conf and look for the plugins setting under [directories].
```
cd /usr/libexec/netdata/plugins.d/
```
Switch to the netdata user.
```
sudo -u netdata -s
```
Run the go.d.plugin to debug the collector:
```
./go.d.plugin -d -m clickhouse
```

Getting Logs

If you’re encountering problems with the clickhouse collector, follow these steps to retrieve logs and identify potential issues:

Run the command specific to your system (systemd, non-systemd, or Docker container).
Examine the output for any warnings or error messages that might indicate issues. These messages should provide clues about the root cause of the problem.

System with systemd

Use the following command to view logs generated since the last Netdata service restart:

journalctl _SYSTEMD_INVOCATION_ID="$(systemctl show --value --property=InvocationID netdata)" --namespace=netdata --grep clickhouse

System without systemd

Locate the collector log file, typically at /var/log/netdata/collector.log, and use grep to filter for collector’s name:

grep clickhouse /var/log/netdata/collector.log

Note: This method shows logs from all restarts. Focus on the latest entries for troubleshooting current issues.

Docker Container

If your Netdata runs in a Docker container named “netdata” (replace if different), use this command:

docker logs netdata 2>&1 | grep clickhouse

Industry

Technology

Use cases

ClickHouse

ClickHouse

Overview

Default Behavior

Auto-Detection

Limits

Performance Impact

Setup

Prerequisites

Configuration

File

Options

Examples

Basic

HTTP authentication

HTTPS with self-signed certificate

Multi-instance

Metrics

Per ClickHouse instance

Per disk

Per table

Alerts

Troubleshooting

Debug Mode

Getting Logs

System with systemd

System without systemd

Docker Container

The observability platform companies need to succeed