Elasticsearch node OOM-killed: heap ceiling, page cache, and container limits

An Elasticsearch node leaves the cluster, restarts seconds later via systemd or a supervisor, and is killed again. Kernel logs show the OOM-killer terminated the Java process. heap.percent often looks reasonable right up until the kill.

The JVM heap is only one component of resident set size. Off-heap allocations, memory-mapped Lucene segments, and co-located processes all compete for the same memory budget. In containers, the cgroup limit is the hard boundary, not the host’s physical RAM.

Setting -Xmx caps the JVM heap, not the process RSS. Elasticsearch uses off-heap buffers for network I/O via Netty 4. The JVM allocates Metaspace, JIT code cache, and thread stacks outside the heap. Lucene accesses index segments via memory-mapped files, which consume OS page cache. The page cache drives search performance, but also contributes to RSS.

In a containerized deployment, the OOM-killer triggers when the cgroup’s total memory usage reaches the container limit. This can happen even when heap.percent is below 75% because the heap is not the only consumer.
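As a rough illustration of that budget, the sketch below subtracts the heap from the container limit. It assumes the heap is fully committed (-Xms equal to -Xmx); the function name and the 8 GiB / 6 GiB figures are made up for the example.

```shell
# Sketch: memory left in the container for everything that is not JVM heap.
# Both arguments are in MiB. Assumes the whole heap counts against the limit.
non_heap_headroom_mib() {
  local limit_mib=$1 heap_max_mib=$2
  echo $(( limit_mib - heap_max_mib ))
}

# Illustrative figures: an 8 GiB limit with a 6 GiB heap.
non_heap_headroom_mib 8192 6144   # prints 2048
```

Direct buffers, Metaspace, thread stacks, and any page cache charged to the cgroup all have to fit in that remainder.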

The parent circuit breaker defaults to 95% of JVM heap with real memory tracking. It rejects operations that would push heap usage too high, but it does not account for Lucene mmap regions, direct ByteBuffers, or memory used by other processes sharing the cgroup. Consequently, the breaker may never trip before the kernel kills the process.

flowchart TD
    A[Bulk indexing or aggregations] --> B[JVM heap fills]
    B --> C[Circuit breaker may trip]
    A --> D[Netty direct buffers grow]
    A --> E[Lucene mmap segments expand]
    D --> F[Container RSS hits memory limit]
    E --> F
    B --> F
    F --> G[Kernel OOM-killer sends SIGKILL]
    G --> H[Node exits 137]
    H --> I[Master removes node]
    I --> J[Shard reallocation starts]
    J --> K[Remaining nodes absorb load]
    K --> A

Common causes

| Cause | What it looks like | First thing to check |
| --- | --- | --- |
| Container memory limit too tight | Node restarts in a loop with exit code 137; dmesg shows oom-killer | Container memory limit vs `-Xmx` plus headroom |
| Heap sized above 50% of available RAM | Frequent OOM despite moderate heap percent; search latency high from cold page cache | `_cat/nodes` `heap.max` vs container or host total memory |
| Off-heap pressure from segments and buffers | RSS grows steadily while heap stays flat; many open file descriptors | `_cat/nodes` `segments.count` and `segments.memory` |
| Startup RSS spike | Node killed during bootstrap before handling traffic | Service logs for early exit, dmesg timestamp vs start time |
| Co-located services in pod or on host | ES process alone fits budget, but total RSS exceeds limit | Per-process RSS with `ps` or container sidecar metrics |

Quick checks

# Confirm kernel OOM-killer killed the Java process
dmesg | grep -i "killed process"

# Same check via journalctl if dmesg is empty or rotated
journalctl -k | grep -i "killed process"

# Check JVM heap max and current usage
curl -s 'http://localhost:9200/_cat/nodes?v&h=name,heap.max,heap.percent'

# Check segment count and off-heap segment memory per node
curl -s 'http://localhost:9200/_cat/nodes?v&h=name,segments.count,segments.memory'

# Inspect circuit breaker state
curl -s 'http://localhost:9200/_nodes/stats/breaker?filter_path=nodes.*.breakers'

# Check for restart loops in systemd logs
systemctl status elasticsearch --no-pager

# Show process RSS on the host (ps reports RSS in KiB)
ps -o pid,rss,comm -p "$(pgrep -d, -f org.elasticsearch.bootstrap.Elasticsearch)"

# Read container memory limit from inside the pod/container
cat /sys/fs/cgroup/memory/memory.limit_in_bytes 2>/dev/null || cat /sys/fs/cgroup/memory.max 2>/dev/null
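
The raw cgroup value is easier to compare with heap.max once it is converted to MiB. The following is a minimal sketch, assuming the value was read with the command above. `cgroup_limit_mib` is a hypothetical helper, and the large sentinel is the number cgroup v1 commonly reports when no limit is set.

```shell
# Convert a raw cgroup memory limit to MiB, or print "unlimited".
# cgroup v2 writes the literal string "max" when there is no limit;
# cgroup v1 typically writes a very large number (assumed sentinel below).
cgroup_limit_mib() {
  local raw=$1
  if [ "$raw" = "max" ] || [ "$raw" -ge 9223372036854771712 ] 2>/dev/null; then
    echo "unlimited"
  else
    echo $(( raw / 1024 / 1024 ))
  fi
}

cgroup_limit_mib 8589934592   # prints 8192
```

An "unlimited" result means the effective boundary is the host's memory, not the container's.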

How to diagnose it

1. Verify the OOM kill. Run `dmesg | grep -i "out of memory"` or `journalctl -k | grep -i "killed process"`. Look for lines that name the Java PID and report anon-rss, and note the timestamp. If the node runs in a container, check the host's kernel log, not the container's.
2. Confirm the restart pattern. Check `systemctl status elasticsearch` or the container runtime for exit code 137 (128 + 9, the signal number of SIGKILL). Rapid uptime resets in `_cat/nodes` indicate that the supervisor is respawning the process.
3. Compare heap to limit. Query `_cat/nodes?v&h=name,heap.max,heap.percent`. Convert heap.max to the same unit as the container limit or host RAM. If heap.max exceeds 50% of the limit, the configuration violates the headroom guideline.
4. Measure off-heap growth. Check `_cat/nodes?v&h=name,segments.count,segments.memory`. A high segment count increases mmap pressure and file descriptor usage. Take the committed heap from `_nodes/stats/jvm?filter_path=nodes.*.jvm.mem` and compare it with the process RSS reported by `ps` to see the gap between the two.
5. Check circuit breaker history. Query `_nodes/stats/breaker`. A parent breaker tripped count of zero suggests the OOM was caused by memory the breaker does not track. A count that rose repeatedly means heap pressure preceded the kill but was not the only factor. These counters reset when the node restarts, so rely on values recorded before the kill where monitoring history is available.
6. Identify co-located consumers. On Kubernetes, check the pod spec for sidecar containers. On bare metal or VMs, sum RSS across all processes. Non-ES consumers can push total usage over the limit even when ES itself is sized correctly.
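
The first two checks can be sketched as small shell helpers. Both function names are hypothetical, and the pattern assumes the kernel reports the process name as `java` and includes an `anon-rss:` field on the kill line, which may differ between kernel versions.

```shell
# Print "PID ANON_RSS_KB" for each kernel log line that reports a killed java process.
# Reads kernel log text on stdin, for example: dmesg | oom_killed_java
oom_killed_java() {
  sed -n 's/.*[Kk]illed process \([0-9]*\) (java).*anon-rss:\([0-9]*\)kB.*/\1 \2/p'
}

# Translate an exit status into the signal number that ended the process.
# Statuses above 128 mean "terminated by signal (status - 128)"; otherwise print 0.
exit_signal() {
  local status=$1
  if [ "$status" -gt 128 ]; then
    echo $(( status - 128 ))
  else
    echo 0
  fi
}

exit_signal 137   # prints 9, the number of SIGKILL
```

An anon-rss figure well above the configured heap points at off-heap or non-JVM memory as the larger part of the problem.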

Metrics and signals to monitor

| Signal | Why it matters | Warning sign |
| --- | --- | --- |
| `jvm.mem.heap_used_percent` | Largest controllable memory consumer | Sustained >75% |
| `breakers.parent.tripped` | Indicates heap pressure before OOM | Any delta > 0 |
| `segments.memory` | Segment metadata and mmap pressure | Growing without index growth |
| `process.open_file_descriptors` | Proxies for segment count and mmap regions | >80% of max |
| Container or host memory usage vs limit | Hard boundary for OOM-killer | Usage >80% of limit |
| Node uptime / restart frequency | Catches supervisor respawn loops | Unexpected restart within 10 minutes |
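
The "usage vs limit" signal reduces to a percentage check. The sketch below is one way to express it, with a hypothetical function name and the 80% threshold from the table as the default. How the two byte counts are collected is left to whatever monitoring is in place.

```shell
# Compare memory usage with its limit and flag it above a warning threshold.
# Arguments: used bytes, limit bytes, optional warning percent (default 80).
memory_usage_status() {
  local used_bytes=$1 limit_bytes=$2 warn_percent=${3:-80}
  local pct=$(( used_bytes * 100 / limit_bytes ))
  if [ "$pct" -gt "$warn_percent" ]; then
    echo "WARN ${pct}%"
  else
    echo "OK ${pct}%"
  fi
}

memory_usage_status 7000000000 8000000000   # prints WARN 87%
```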

Fixes

Raise the container memory limit

If the container limit is artificially low, increase it. Do not raise -Xmx to consume all the extra space. Keep -Xmx at no more than 50% of the container limit, capped at roughly 26-30 GB to keep compressed OOPs enabled.
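
The sizing rule can be written as a small calculation. This is a sketch under stated assumptions: `recommended_heap_mib` is a hypothetical helper, and the default cap of 26 GiB is simply the lower end of the range above, not a value measured on any particular JVM.

```shell
# Suggest a heap size in MiB: half the memory limit, never above the cap.
# Arguments: limit in MiB, optional cap in MiB (default 26624 = 26 GiB, an assumption).
recommended_heap_mib() {
  local limit_mib=$1 cap_mib=${2:-26624}
  local half=$(( limit_mib / 2 ))
  if [ "$half" -gt "$cap_mib" ]; then
    echo "$cap_mib"
  else
    echo "$half"
  fi
}

recommended_heap_mib 8192     # prints 4096
recommended_heap_mib 131072   # prints 26624
```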

Lower `-Xmx` to free headroom

If you cannot raise the limit, reduce -Xmx. This requires a rolling restart. A smaller heap gives more room to the OS page cache and off-heap allocations. Tradeoff: young GC frequency rises and heavy aggregation loads are more likely to trip the parent circuit breaker.
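
A minimal sketch of generating the heap flags follows. The helper name is hypothetical, and the 4096 MiB figure is only an example value.

```shell
# Print matching -Xms and -Xmx lines for a heap size given in MiB.
heap_options() {
  local heap_mib=$1
  printf '%s\n' "-Xms${heap_mib}m" "-Xmx${heap_mib}m"
}

heap_options 4096
```

The output can be redirected into a file in the `jvm.options.d` directory under the Elasticsearch config path, for example `heap_options 4096 > jvm.options.d/heap.options`. The file name is an arbitrary choice, but it needs the `.options` suffix to be picked up.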

Reduce segment and shard pressure

High segment counts increase off-heap memory and file descriptor usage. Force merge read-only indices to reduce segments. Delete old indices or close them. Warning: force merge is I/O-intensive and temporarily doubles disk usage for the segments involved.
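
As an example of the force merge step, the wrapper below calls the `_forcemerge` API with a target segment count. It is a sketch: `force_merge` and the `ES_URL` variable are hypothetical names, the index name in the comment is a placeholder, and the default endpoint is the one used in the quick checks.

```shell
# Force merge one index down to a target segment count (default 1).
# Intended for indices that no longer receive writes.
force_merge() {
  local index=$1 max_segments=${2:-1}
  curl -s -X POST "${ES_URL:-http://localhost:9200}/${index}/_forcemerge?max_num_segments=${max_segments}"
}

# Example (not executed here): force_merge my-old-index 1
```

Running it against one index at a time keeps the I/O and temporary disk overhead bounded.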

Isolate co-located workloads

Move monitoring agents, log shippers, and sidecars out of the Elasticsearch pod or off the host. If that is impossible, size their memory and subtract it from the available budget before setting -Xmx.

Correct CPU container detection

If running in a container with CPU limits, set -XX:ActiveProcessorCount to match the limit. Thread pools sized for too many cores allocate excessive thread stacks, adding to RSS. This also requires a rolling restart.
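
One way to derive the value is from the cgroup CPU quota and period. The sketch below assumes those two numbers have already been read (on cgroup v2 they appear together in `cpu.max`). The helper name is hypothetical, and rounding a fractional limit up is a choice made for this example.

```shell
# Derive a processor count from a CPU quota and period (both in microseconds).
# A quota of "max" (cgroup v2) or a non-positive number (cgroup v1) means no limit.
active_processor_count() {
  local quota=$1 period=$2
  if [ "$quota" = "max" ] || [ "$quota" -le 0 ] 2>/dev/null; then
    echo "unlimited"
    return
  fi
  # Round up so that a 1.5 CPU limit yields 2 rather than 1.
  echo $(( (quota + period - 1) / period ))
}

active_processor_count 200000 100000   # prints 2
```

With a result of 2, the JVM flag would be `-XX:ActiveProcessorCount=2`. On cgroup v2 the inputs can be loaded with `read -r quota period < /sys/fs/cgroup/cpu.max`.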

Prevention

Size heap to half the budget. Set -Xms and -Xmx to no more than 50% of the memory available to the node, with a ceiling of roughly 26-30 GB.
Leave headroom for page cache. Elasticsearch relies on the OS page cache for Lucene segment access. Starving the page cache increases search latency and does not prevent OOM.
Monitor total memory usage, not just heap. Heap percentage is a sawtooth that hides off-heap growth. Track process or container memory usage against the limit.
Account for startup spikes. Some versions briefly allocate extra memory during bootstrap. Size container limits to handle startup, not just steady state.
Watch for respawn loops. A supervisor restarting the process after exit 137 creates a flapping node that triggers unnecessary shard reallocation. Alert on unexpected node uptime resets.

How Netdata helps

Correlates elasticsearch.jvm_heap_used_percent with system RAM and cgroup memory usage, revealing when RSS diverges from heap.
Surfaces kernel OOM-killer events from system logs without manual dmesg searches.
Tracks elasticsearch.thread_pool_queued_operations and elasticsearch.breaker_tripped to identify memory pressure before the kernel intervenes.
Alerts on node uptime drops and process restarts, catching supervisor respawn loops that mask chronic OOM kills.
Monitors per-process RSS and open file descriptors to expose segment-related off-heap growth.
