$ guides / elasticsearch / elasticsearch-too-many-segments ▌

Operations Guides

Elasticsearch too many segments per shard: search slowdown and force-merge

Searches that used to complete in tens of milliseconds are now taking hundreds. CPU on data nodes climbs without a traffic spike. Cluster health stays green, but query latency degrades and timeouts increase. The problem is invisible at the cluster level: shards have accumulated too many Lucene segments, and every query scans every segment in every relevant shard. When a shard holds more than roughly 100 segments, the background merge policy has fallen behind. Search slows, file descriptor usage rises, and merge I/O competes with indexing and queries. Since Elasticsearch 7.7, most segment metadata moved off-heap, so the node may not run out of JVM heap, but search performance degrades all the same.

What this means

A Lucene segment is an immutable chunk of an inverted index. During the write path, documents collect in an in-memory buffer; on refresh, Elasticsearch writes a new segment that is immediately searchable. On flush, segments become durable and the translog is truncated. Background merges combine small segments into larger ones, reclaim space from deleted documents, and improve search performance. Healthy shards typically hold 10 to 50 segments. When the count exceeds 100 per shard, merges are not keeping up with the creation rate.

Elasticsearch uses a scatter-gather model: the coordinating node broadcasts the query to one copy of every relevant shard. Each shard executes the query against its local segments and returns results. The slowest shard determines overall latency. When segment counts are high, every shard scan opens more files, iterates more data structures, and consumes more CPU. The node also holds more open file descriptors, which can approach system limits. Merge operations themselves consume disk I/O and CPU, so a merge backlog can saturate storage and slow indexing even further.

flowchart TD
    A[High indexing rate or frequent refresh] --> B[Many small segments created]
    C[Update-heavy workload] --> B
    B --> D[Merge backlog]
    D --> E[Segment count exceeds 100 per shard]
    E --> F[Search latency increases]
    E --> G[File descriptor pressure]
    D --> H[Disk I/O saturation]
    H --> I[Indexing slowdown]

Common causes

Cause	What it looks like	First thing to check
Refresh interval too aggressive	Segment count rising on actively indexed indices; high refresh throughput	`/_nodes/stats/indices/refresh` for rising total time
Merge I/O saturation	Segment count growing despite steady load; merge time increasing	`/_nodes/stats/indices/merges` for `current`, plus OS `iostat`
Indexing burst outpacing merge threads	Many small segments; merges persistently active	`/_nodes/stats/indices/merges` for `current` and `total`
Update-heavy workload with deleted docs	Merges are large and slow; space not reclaimed after deletes	`/_nodes/stats/indices/merges` for `total_size_in_bytes`
Force merge blocking background merges	Segment count flat or rising while a force merge is running	`/_nodes/stats/indices/merges` `current` metric

Quick checks

Read-only checks for segment density, merge health, and search impact.

# Segment counts per index, sorted by primary segment count descending
curl -s 'http://localhost:9200/_cat/indices?v&h=index,pri,rep,docs.count,store.size,pri.segments.count&s=pri.segments.count:desc' | head -20

# Node-level segment count and segment memory
curl -s 'http://localhost:9200/_cat/nodes?v&h=name,segments.count,segments.memory'

# Node-level merge concurrency and cumulative merges
curl -s 'http://localhost:9200/_nodes/stats/indices/merges?filter_path=nodes.*.indices.merges.current,nodes.*.indices.merges.total'

# Detailed merge stats: current merges, total time, and size
curl -s 'http://localhost:9200/_nodes/stats/indices/merges?filter_path=nodes.*.indices.merges'

# Search latency: query and fetch phase cumulative counters
curl -s 'http://localhost:9200/_nodes/stats/indices/search?filter_path=nodes.*.indices.search.query_total,nodes.*.indices.search.query_time_in_millis,nodes.*.indices.search.fetch_total,nodes.*.indices.search.fetch_time_in_millis'

# Refresh and flush performance
curl -s 'http://localhost:9200/_nodes/stats/indices/refresh,flush?filter_path=nodes.*.indices.refresh,nodes.*.indices.flush'

# File descriptor utilization per node
curl -s 'http://localhost:9200/_cat/nodes?v&h=name,file_desc.current,file_desc.max,file_desc.percent'

How to diagnose it

Confirm segment counts per shard. Use _cat/indices with pri.segments.count. Sort descending. Healthy shards show 10 to 50 segments. Anything above 100 suggests a merge backlog.
Check if merges are keeping up. Use _nodes/stats/indices/merges. If current is persistently non-zero and segment counts continue to rise, the node cannot consolidate segments faster than they are created.
Inspect merge throughput and cost. Use _nodes/stats/indices/merges to see total_time_in_millis and total_size_in_bytes. Rapid growth in merge time without a corresponding drop in segment count indicates I/O-bound merges.
Correlate with search latency. Sample _nodes/stats/indices/search twice over a 30-second interval. Compute delta(query_time) / delta(query_total). Elevated query latency with stable query complexity points to segment scan overhead.
Evaluate refresh behavior. Check _nodes/stats/indices/refresh for total_time_in_millis. If refresh time is high or refresh_interval is set very low, the node is creating segments faster than the merge policy can absorb them.
Check file descriptor pressure. Use _cat/nodes with file_desc.percent. Each segment consists of multiple files. Segment explosion drives file descriptor growth, which can approach the per-process limit.

Metrics and signals to monitor

Signal	Why it matters	Warning sign
`pri.segments.count` per index	Direct measure of per-shard segment density	>100 per shard
`segments.count` per node	Node-wide load and file descriptor pressure	Growing monotonically
`segments.memory` per node	Memory consumed by segment-related structures	Sustained upward trend
Search query latency	User-facing impact of segment scan overhead	>2x baseline sustained
`merges.current`	Merge concurrency saturation	Persistently non-zero while segment counts rise
Merge total time	Background I/O and CPU cost	Growing faster than segment count drops
`file_desc.percent`	Segments consume multiple files per shard	>80% of system limit

Fixes

Reduce refresh frequency on write-heavy indices

For indices still receiving writes, increase index.refresh_interval to 30s or higher during bulk loads. This reduces the creation rate of small segments and gives the TieredMergePolicy room to consolidate them. Restore 1s only after ingestion completes and segment counts normalize. Do not set refresh_interval: -1 on live production search indices unless you can tolerate delayed searchability.

Tune merge concurrency for storage type

If merges are falling behind on spinning disks, reduce index.merge.scheduler.max_thread_count to 1 to lower random I/O contention. On SSDs, the default scheduler concurrency is tuned for available cores. Raising the thread count above the default rarely helps and usually increases contention with indexing and search.

Force-merge read-only indices only

For time-series indices that are no longer written to, force merge to one segment per shard:

# WARNING: blocking and resource-intensive. Only for read-only indices.
POST /<index>/_forcemerge?max_num_segments=1

This can take hours on large shards and will spike CPU and disk I/O while it runs. Never run force merge on an index that is still receiving writes. On a live write index, force merge creates very large segments that the background merge policy will not efficiently recombine with future small segments. This wastes disk space and CPU and can trigger subsequent merge storms. After a force merge, new writes create new segments alongside the force-merged giant, leaving you with worse fragmentation than before.

Address over-sharding

If segment counts are high because shards are undersized, use the shrink API to reduce the primary shard count on read-only indices, or reindex into a new index with fewer, larger shards. Target shard sizes in the 10 to 50 GB range. Fewer shards means fewer total segments and lower fixed overhead.

Prevention

Set refresh_interval to -1 during large bulk imports and restore it afterward.
Use ILM to transition time-series indices to read-only and force-merge them automatically once writes stop.
Monitor pri.segments.count and segments.count per node as leading indicators, not just lagging symptoms.
Size disks with merge headroom in mind. A large merge can temporarily require disk space for both old and new segments.
Avoid dynamic index creation rates that outpace merge capacity. Consolidate time-series granularity where possible.

How Netdata helps

Netdata correlates Elasticsearch segment counts with per-node search latency and CPU to isolate segment pressure from query load. It tracks file descriptor utilization per node alongside segment growth, surfaces OS disk I/O wait alongside merge activity to distinguish merge saturation from generic disk pressure, and charts JVM heap and old GC behavior against segment memory trends. Alert on monotonically rising segment counts per node before query performance degrades.

The Netdata solution

Elasticsearch monitoring with Netdata

Netdata monitors Elasticsearch with per-second metrics and ML anomaly detection. Correlate JVM heap pressure, shard counts, disk watermarks, mapping growth, and merge activity with cluster and node health in one view.

See Elasticsearch monitoring → Start monitoring free

Elasticsearch too many segments per shard: search slowdown and force-merge

Elasticsearch too many segments per shard: search slowdown and force-merge

What this means

Common causes

Quick checks

How to diagnose it

Metrics and signals to monitor

Fixes

Reduce refresh frequency on write-heavy indices

Tune merge concurrency for storage type

Force-merge read-only indices only

Address over-sharding

Prevention

How Netdata helps

Related guides

Elasticsearch monitoring with Netdata