Elasticsearch long GC pauses: old-generation stop-the-world and node drops

In Elasticsearch 8.x, nodes can drop out of the cluster without logging errors. The master logs node-left, typically with reason followers check retry count exceeded (or disconnected if the connection was also closed), while the departed node shows no ERROR entries because its JVM was frozen in an old-generation stop-the-world GC pause. A single pause longer than 10 seconds fails a fault-detection check; roughly 30 seconds of total unresponsiveness triggers removal. Once the master reallocates shards, the remaining nodes face additional heap pressure and the cascade continues.

The node logs nothing during the pause because every thread, including logging and network I/O, is suspended. Without correlating GC metrics to node departures, the symptom looks like a network problem or sudden crash. In reality, it is almost always structural heap pressure.

Old-gen GC pauses are not a root cause. They are the final warning that the heap is full of long-lived objects the collector cannot reclaim quickly enough. This guide shows how to distinguish an isolated allocation spike from a rising trend that will bring down the cluster, and how to stop the cascade without masking it behind longer timeouts.

What this means

Elasticsearch 8.x uses G1GC as its default collector. Old-generation collections reclaim long-lived objects such as segment metadata, fielddata, and cluster state structures. Under normal conditions these pauses are brief. Under heap pressure, old GC cannot keep up, pauses lengthen into seconds or tens of seconds, and the JVM stops every thread, including those used for cluster coordination and transport.

The cluster coordination layer performs follower and leader checks with a default 10-second timeout and three retries before node removal. A GC pause longer than 10 seconds fails one check; roughly 30 seconds of total unresponsiveness causes the master to remove the node. A hard TCP disconnect triggers immediate removal. After removal, the master reallocates that node’s shards to remaining nodes, which generates additional indexing, search, and merge load on peers already near their limits. This is the heap pressure death spiral.
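The timeout arithmetic above can be sketched in a few lines. This is a deliberately simplified model: it assumes checks run back to back and ignores the interval between checks and any network latency, so treat the result as an approximation. The function names are illustrative and not part of any Elasticsearch API.

```python
def failed_checks(pause_seconds: float, timeout_seconds: float = 10.0) -> int:
    """Approximate number of consecutive checks that time out during one pause."""
    return int(pause_seconds // timeout_seconds)


def would_be_removed(pause_seconds: float,
                     timeout_seconds: float = 10.0,
                     retry_count: int = 3) -> bool:
    """True if the pause outlasts retry_count consecutive check timeouts."""
    return failed_checks(pause_seconds, timeout_seconds) >= retry_count


# Made-up pause lengths, in seconds, to show where the thresholds fall.
for pause in (4, 12, 25, 31):
    print(pause, failed_checks(pause), would_be_removed(pause))
```

Under this model a 12-second pause fails one check and the node stays in the cluster, while a 31-second pause exhausts all three retries.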

flowchart TD
    A[Heap pressure] --> B[Frequent old-gen GC]
    B --> C[Stop-the-world pause]
    C --> D{Pause > 10s?}
    D -->|Yes| E[Fault detection check times out]
    E --> F[Node removed after 3 failed checks]
    F --> G[Shard reallocation]
    G --> H[More heap pressure on peers]
    H --> B

Common causes

| Cause | What it looks like | First thing to check |
| --- | --- | --- |
| Too many shards / segment metadata bloat | `segments.memory` climbing; high shard count; heap floor rising | `GET /_cat/nodes?v&h=name,segments.memory,segments.count` |
| Fielddata loading on text fields | `fielddata.memory_size_in_bytes` high; slow logs show aggregations on `text` | `GET /_nodes/stats/indices/fielddata?fields=*&filter_path=nodes.*.indices.fielddata.fields` |
| Mapping explosion or bloated cluster state | Pending cluster tasks growing; master heap elevated | `GET /_cluster/pending_tasks` and `GET /_cluster/stats?filter_path=indices.mappings` |
| Expensive aggregations or oversized bulk requests | `parent` or `request` circuit breaker trips; large tasks in `/_tasks` | `GET /_tasks?detailed=true&actions=*search*` |
| Inadequate heap for sustained workload | `heap_used_percent` >85% with rising old GC count and time | `GET /_nodes/stats/jvm?filter_path=nodes.*.jvm.mem,nodes.*.jvm.gc` |

Quick checks

# Per-node heap and old-generation GC time
curl -s 'http://localhost:9200/_nodes/stats/jvm?filter_path=nodes.*.jvm.mem.heap_used_percent,nodes.*.jvm.gc.collectors.*.collection_time_in_millis'

# Cluster health and node count for recent departures
curl -s 'http://localhost:9200/_cluster/health?filter_path=status,unassigned_shards,number_of_nodes'

# Segment metadata pressure per node
curl -s 'http://localhost:9200/_cat/nodes?v&h=name,segments.memory,segments.count'

# Currently running search tasks
curl -s 'http://localhost:9200/_tasks?detailed=true&actions=*search*'

# Hot threads to see what is consuming CPU
curl -s 'http://localhost:9200/_nodes/hot_threads'
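
The GC counters in the first check are cumulative since node start, so a single reading says little. A minimal sketch, assuming two unfiltered `/_nodes/stats/jvm` responses taken some time apart and already parsed into dicts (the `filter_path` used above drops `name` and `collection_count`, which this code needs). The function names are illustrative.

```python
def old_gc_counters(stats: dict) -> dict:
    """Map node name -> (old GC count, old GC time in ms) from a /_nodes/stats/jvm response."""
    out = {}
    for node in stats.get("nodes", {}).values():
        old = node["jvm"]["gc"]["collectors"]["old"]
        out[node["name"]] = (old["collection_count"], old["collection_time_in_millis"])
    return out


def old_gc_delta(before: dict, after: dict) -> dict:
    """Per node: old collections, total ms, and average ms per collection between snapshots."""
    first, second = old_gc_counters(before), old_gc_counters(after)
    result = {}
    for name, (count2, millis2) in second.items():
        if name not in first:
            continue  # node joined between the two snapshots
        count1, millis1 = first[name]
        d_count, d_millis = count2 - count1, millis2 - millis1
        if d_count < 0 or d_millis < 0:
            continue  # counters went backwards: the node restarted
        avg = d_millis / d_count if d_count else 0.0
        result[name] = {"collections": d_count, "millis": d_millis, "avg_millis": avg}
    return result
```

An `avg_millis` value approaching 10000 on any node means individual old-generation collections are close to the fault-detection timeout.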

How to diagnose it

1. Correlate node departures with GC. On the removed node, compare the departure timestamp with old-generation collection time spikes under jvm.gc.collectors.
2. Decide whether this is an isolated pause or a rising trend. An isolated spike suggests a single oversized allocation or one-off query. A steady increase in collection count and collection time over hours means structural heap pressure that will not self-resolve.
3. Identify the dominant heap consumer. Query GET /_nodes/stats/indices/segments,fielddata,query_cache,request_cache,completion and compare memory sizes. If segment memory dominates, shard count or field count is the driver. If fielddata dominates, mappings are the driver. If request cache or query cache dominates, search patterns are the issue.
4. Inspect cluster state pressure. High pending tasks on the master combined with master-node heap spikes point to mapping explosion or excessive index churn. Check GET /_cluster/stats?filter_path=indices.mappings for runaway field counts and compare with the master’s jvm.mem.heap_used_percent.
5. Find expensive in-flight operations. Use GET /_tasks?detailed=true&actions=*search* to look for long-running aggregations, scrolls, or bulk operations that coincide with the GC window. Cancel them with POST /_tasks/{task_id}/_cancel if safe. Warning: Cancelling tasks aborts in-flight queries and can return errors to clients.
6. Verify fault-detection configuration. Query GET /_cluster/settings?include_defaults=true and inspect cluster.fault_detection.*. If follower_check.timeout or leader_check.timeout has been raised above the 10-second default, the cluster is likely masking chronic heap pressure instead of fixing it.
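
The "identify the dominant heap consumer" step can be sketched as a small ranking function. It assumes the response of the `/_nodes/stats/indices/segments,fielddata,query_cache,request_cache,completion` call has already been parsed into a dict; `rank_consumers` and `SIZE_FIELDS` are illustrative names, and missing sections are counted as zero.

```python
# Field holding the byte count for each section under nodes.<id>.indices
SIZE_FIELDS = {
    "segments": "memory_in_bytes",
    "fielddata": "memory_size_in_bytes",
    "query_cache": "memory_size_in_bytes",
    "request_cache": "memory_size_in_bytes",
    "completion": "size_in_bytes",
}


def rank_consumers(stats: dict) -> dict:
    """Map node name -> list of (section, bytes), largest consumer first."""
    ranked = {}
    for node in stats.get("nodes", {}).values():
        indices = node.get("indices", {})
        sizes = {
            section: indices.get(section, {}).get(field, 0)
            for section, field in SIZE_FIELDS.items()
        }
        ranked[node["name"]] = sorted(sizes.items(), key=lambda kv: kv[1], reverse=True)
    return ranked
```

The first entry for each node names the category to investigate first; these figures cover only the listed structures, not total heap.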

Metrics and signals to monitor

| Signal | Why it matters | Warning sign |
| --- | --- | --- |
| `jvm.mem.heap_used_percent` | Sustained elevation precedes old GC storms | >75% sustained |
| Old-generation GC collection time | Measures stop-the-world duration | Rate increasing; single pause >10 s |
| Old-generation GC collection count | Frequency of old collections | Rising trend over hours |
| Post-GC heap floor | Long-lived object accumulation | Minimum heap after GC creeps upward |
| `segments.memory` | Segment metadata lives in old generation | Growing linearly with shard count |
| `breakers.parent.estimated_size_in_bytes` vs `limit_size_in_bytes` | Proximity to OOM protection | Ratio consistently >70% |
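
The post-GC heap floor is not exposed as a single metric, so it has to be estimated. One rough approach, sketched here under the assumption that you already collect periodic heap-usage samples per node (bytes or percent): take the minimum of each fixed-size window as the floor and check whether it keeps climbing. The approximation only holds if each window contains at least one collection; the function names and window size are illustrative.

```python
def heap_floors(samples: list[int], window: int) -> list[int]:
    """Minimum heap usage in each full window of `window` consecutive samples."""
    if window <= 0:
        raise ValueError("window must be positive")
    full = len(samples) - len(samples) % window  # ignore a trailing partial window
    return [min(samples[i:i + window]) for i in range(0, full, window)]


def floor_is_rising(floors: list[int], min_windows: int = 3) -> bool:
    """True when there are enough windows and every floor exceeds the previous one."""
    if len(floors) < min_windows:
        return False
    return all(later > earlier for earlier, later in zip(floors, floors[1:]))
```

For example, heap-percent samples of 40, 70, 45, 80, 50, 85, 55, 88, 62, 90, 60, 92 with a window of 4 give floors of 40, 50, and 60, which this check reports as rising even though the peaks alone look unremarkable.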

Fixes

Immediate stabilization

If nodes are dropping in a cascade, stop the rebalancing storm before fixing the root cause. Set cluster.routing.allocation.enable: none to prevent the master from relocating shards onto already stressed peers. Warning: This stops all shard allocation and relocation, including replica recovery. Re-enable it after stabilization; the cluster will not self-heal while it is set.
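
A minimal sketch of the settings change, using only the Python standard library. It builds the `PUT /_cluster/settings` request without sending it, so the body can be reviewed first. Using the `persistent` scope and an unauthenticated `http://localhost:9200` endpoint are assumptions for illustration; `allocation_request` is a hypothetical helper name.

```python
import json
import urllib.request


def allocation_request(base_url: str, enabled: bool) -> urllib.request.Request:
    """Build (but do not send) a PUT /_cluster/settings request.

    enabled=False sets cluster.routing.allocation.enable to "none".
    enabled=True sets it to null, which resets it to the default.
    """
    value = None if enabled else "none"
    body = {"persistent": {"cluster.routing.allocation.enable": value}}
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/_cluster/settings",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="PUT",
    )


# To apply it (assumes no authentication or TLS on the cluster):
# with urllib.request.urlopen(allocation_request("http://localhost:9200", False)) as resp:
#     print(resp.read().decode())
```

Calling the same helper with `enabled=True` after stabilization produces the request that lets the cluster resume allocating and recovering shards.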

Cancel expensive in-flight operations identified via /_tasks. This is disruptive to the affected queries, but it can free heap immediately. Reducing indices.breaker.fielddata.limit can force earlier rejection of fielddata-heavy queries, trading query failures for heap headroom.

Structural fixes

When segment memory is high, reduce shard density. Close or delete old indices, implement ILM policies, or use the shrink API to reduce the shard count on heavy nodes. Force-merge read-only indices to one segment to cut segment metadata, but never force-merge a live index receiving writes because the I/O cost can worsen pressure.

For fielddata issues, change mappings to use keyword sub-fields for aggregations and sorting instead of loading fielddata on text fields. If the cluster state is bloated, enforce index.mapping.total_fields.limit and audit dynamic mapping on unstructured input. Each new field increases cluster state size, which is held in heap on every node.
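
As an illustration of the keyword sub-field pattern, the sketch below builds a mapping fragment and a terms aggregation as plain Python dicts and prints them as JSON request bodies. The field name `message`, the aggregation name `top_messages`, and the `ignore_above` value are hypothetical placeholders.

```python
import json

# Hypothetical field name "message"; substitute a real text field.
mapping = {
    "properties": {
        "message": {
            "type": "text",  # analyzed, used for full-text search
            "fields": {
                # keyword sub-field backed by doc values, so no fielddata is loaded
                "keyword": {"type": "keyword", "ignore_above": 256}
            },
        }
    }
}

# Aggregate on the keyword sub-field instead of the text field itself.
terms_agg = {
    "size": 0,
    "aggs": {"top_messages": {"terms": {"field": "message.keyword", "size": 10}}},
}

print(json.dumps(mapping, indent=2))
print(json.dumps(terms_agg, indent=2))
```

Documents indexed before a sub-field is added do not contain it, so existing data generally needs to be reindexed before aggregations on the sub-field return complete results.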

For query-driven pressure, reduce bulk batch sizes and limit aggregation cardinality at the application layer. Do not treat a larger heap as the fix: adding heap delays the problem without removing the consumer. If you do grow the heap, stay below the compressed-OOP threshold (roughly 32 GB, depending on the JVM), because above it the JVM switches to uncompressed object pointers and every reference takes more space. Set -Xms equal to -Xmx so the JVM does not resize the heap under pressure.
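
The two heap settings mentioned above can be checked from the nodes info API rather than by reading each node's JVM options. This is a sketch that assumes a parsed `GET /_nodes/jvm` response in which each node reports `jvm.mem.heap_init_in_bytes`, `jvm.mem.heap_max_in_bytes`, and `jvm.using_compressed_ordinary_object_pointers`; verify these field names against your version before relying on it. `heap_config_issues` is an illustrative name.

```python
def heap_config_issues(info: dict) -> dict:
    """Map node name -> list of heap configuration problems (empty nodes are omitted)."""
    issues = {}
    for node in info.get("nodes", {}).values():
        jvm = node.get("jvm", {})
        mem = jvm.get("mem", {})
        problems = []
        if mem.get("heap_init_in_bytes") != mem.get("heap_max_in_bytes"):
            problems.append("initial heap (-Xms) differs from max heap (-Xmx)")
        # The flag may be reported as a string, so normalize before comparing.
        compressed = str(jvm.get("using_compressed_ordinary_object_pointers", "")).lower()
        if compressed != "true":
            problems.append("compressed OOPs not reported as enabled")
        if problems:
            issues[node.get("name", "unknown")] = problems
    return issues
```

An empty result means every node reports equal initial and maximum heap with compressed object pointers enabled.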

Configuration pitfalls

Do not increase cluster.fault_detection.follower_check.timeout or leader_check.timeout to tolerate GC pauses. This hides symptoms, does not fix underlying heap pressure, and delays detection of genuine node failures, extending the window of cluster instability.
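
To audit a cluster for this pitfall, the settings response can be scanned for timeouts above the default. The sketch assumes the response of `GET /_cluster/settings?include_defaults=true&flat_settings=true` parsed into a dict (the `flat_settings=true` parameter is an addition here so that keys are not nested), and it handles only simple time values such as `10s` or `1m`. Helper names are illustrative.

```python
import re

DEFAULT_TIMEOUT_MS = 10_000.0
TIMEOUT_KEYS = (
    "cluster.fault_detection.follower_check.timeout",
    "cluster.fault_detection.leader_check.timeout",
)
_UNIT_MS = {"ms": 1.0, "s": 1_000.0, "m": 60_000.0, "h": 3_600_000.0}


def to_millis(value: str) -> float:
    """Convert a simple time value such as '10s' or '500ms' to milliseconds."""
    match = re.fullmatch(r"(\d+(?:\.\d+)?)(ms|s|m|h)", value.strip())
    if not match:
        raise ValueError(f"unsupported time value: {value!r}")
    return float(match.group(1)) * _UNIT_MS[match.group(2)]


def raised_timeouts(settings: dict) -> dict:
    """Return fault-detection timeouts above the 10-second default, keyed by scope."""
    found = {}
    for scope in ("defaults", "persistent", "transient"):
        for key in TIMEOUT_KEYS:
            value = settings.get(scope, {}).get(key)
            if value is not None and to_millis(value) > DEFAULT_TIMEOUT_MS:
                found[f"{scope}.{key}"] = value
    return found
```

Any entry in the result is a prompt to ask why the timeout was raised and whether the heap pressure behind it was ever addressed.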

Prevention

Monitor the post-GC heap floor trend, not just the peak percentage. A rising floor is the best leading indicator that the death spiral is approaching. Keep sustained heap usage below 75% and ensure old GC pauses stay well under the 10-second fault-detection timeout. Maintain shard count per node below 500-800 and monitor segment memory weekly for growth.
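
The shard-density guideline can be checked with a short script. This sketch assumes the rows of `GET /_cat/shards?format=json` parsed into a list of dicts and counts only shards in the `STARTED` state, so relocating and unassigned shards are ignored. The default limit of 500 is the lower bound of the range above, and the function names are illustrative.

```python
from collections import Counter


def shards_per_node(rows: list[dict]) -> Counter:
    """Count STARTED shards per node name from _cat/shards JSON rows."""
    return Counter(
        row["node"] for row in rows
        if row.get("state") == "STARTED" and row.get("node")
    )


def over_limit(rows: list[dict], limit: int = 500) -> dict:
    """Nodes whose STARTED shard count exceeds the limit."""
    return {node: count for node, count in shards_per_node(rows).items() if count > limit}
```

Running it on a schedule alongside the weekly segment memory review gives an early warning before shard density becomes a heap problem.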

Review slow logs regularly for queries that load fielddata or build large aggregation structures. After any restart, allow time for OS page cache warming before declaring latency anomalies, but watch for heap pressure that outlasts the warmup window. In container deployments, ensure the cgroup memory limit matches the JVM heap plus native and off-heap overhead so the Linux OOM killer does not terminate the process before GC can complete.

How Netdata helps

Correlates old-generation GC time and collection count rate with node reachability and cluster health on the same timeline, making the pause-to-removal pattern visible.
Tracks jvm.mem.heap_used_percent and estimates the post-GC floor per node without manual delta calculations.
Alerts on composite conditions such as sustained heap greater than 85% combined with rising old GC time, which reduces noise from transient spikes.
Surfaces per-node segment memory, shard counts, and circuit breaker utilization alongside GC metrics to reveal gradual heap consumers before they cause pauses.
Maps thread pool rejections and fault-detection events to the nodes experiencing GC pressure, clarifying whether a node drop is a cause or a symptom.
