Docker JVM memory tuning: heap, off-heap, and the cgroup mismatch
Your Java container is OOMKilled at 02:00. Heap usage is 60%. Docker reports exit code 137 and OOMKilled: true. The JVM never threw an OutOfMemoryError.
In a container, the kernel enforces memory limits through cgroups, but the JVM heap is only one component of process RSS. Off-heap memory, metaspace, thread stacks, direct byte buffers, and GC overhead all count against the same cgroup limit. When total RSS crosses that limit, the kernel kills the container without a JVM-level error. In some JDK and kernel combinations, the JVM fails to detect the cgroup limit entirely and sizes the heap against host RAM, which guarantees an OOM kill.
This guide shows how to determine whether an OOM kill was caused by off-heap pressure, cgroup detection failure, or oversizing, and how to set explicit limits that prevent recurrence.
What this means
Docker writes memory limits to cgroup v1 memory.limit_in_bytes or cgroup v2 memory.max. Since Java 10, the JVM enables -XX:+UseContainerSupport by default to read these limits and size the heap ergonomically. The default allocates 25% of the detected limit to max heap via -XX:MaxRAMPercentage.
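To see what the ergonomics actually computed for a given limit, print the final flags before deploying. An example check, using eclipse-temurin:21-jre as a stand-in for your own image:
# Ergonomic max heap for a 2 GB limit; expect roughly 512 MB (25%)
docker run --rm -m 2g eclipse-temurin:21-jre \
  java -XX:+PrintFlagsFinal -version | grep -E 'MaxHeapSize|MaxRAMPercentage'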
The kernel’s OOM killer evaluates total RSS: heap, metaspace, code cache, thread stacks, direct byte buffers allocated via NIO or Netty, JNI library allocations, and GC working memory. A container with a 2 GB limit and a 1.5 GB heap can still be killed if a Netty client allocates 600 MB of direct buffers during a traffic spike.
A second failure mode is cgroup detection failure. On Linux kernel 6.12 and later, changes to /proc/cgroups caused JDK 21.0.9 and earlier to misread the container environment as having no memory controller enabled. The JVM falls back to host memory, sets a massive heap, and the container is OOM-killed shortly after startup. Similar detection gaps occur when the container’s cgroup scope lacks an explicit limit and the JVM reads a parent systemd slice instead.
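Unified logging exposes the JVM's cgroup detection steps directly, which makes this failure mode quick to confirm. Both checks below are read-only and assume a JDK 11+ runtime; <your_image> is a placeholder for an image with java on its PATH:
# Trace cgroup detection with the same image and limit as production
docker run --rm -m 2g <your_image> java -Xlog:os+container=trace -version 2>&1 | head -40
# Print the OS metrics the JVM detected, including the memory limit
docker exec <container_id> java -XshowSettings:system -version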
Common causes
| Cause | What it looks like | First thing to check |
|---|---|---|
| Cgroup detection failure (JDK/kernel regression) | Max heap is 25% of host RAM, not the container limit; container dies within seconds of start | Effective -Xmx or MaxRAMPercentage output vs memory.max |
| Off-heap exhaustion (Netty, gRPC, NIO direct buffers) | Heap usage healthy; container RSS at limit; OOMKilled under I/O load | docker stats RSS vs expected heap; presence of Netty or gRPC clients |
| Heap sized equal to container limit | Stable for hours, then OOMKilled during GC or class loading | -Xmx value vs docker inspect memory limit |
| Missing container memory limit | Intermittent host-level OOM; container killed unpredictably | docker inspect --format '{{.HostConfig.Memory}}' |
| Ancestor cgroup limit confusion | JVM respects a limit, but it is the parent slice limit, not the container limit | memory.max in container scope vs parent scope |
Quick checks
Run these read-only checks in order. They confirm whether the kill was OOM, what limit the kernel enforced, and what the JVM thought it could use.
# Check OOMKilled status and exit code
docker inspect --format '{{.State.OOMKilled}} {{.State.ExitCode}}' <container_id>
# Live memory usage and limit for all containers
docker stats --no-stream --format "table {{.Name}}\t{{.MemUsage}}\t{{.MemPerc}}"
# Check configured container memory limit
docker inspect --format '{{.HostConfig.Memory}}' <container_id>
# Inspect cgroup memory limit directly (cgroup v2; run inside the container)
cat /sys/fs/cgroup/memory.max
# Inspect cgroup memory limit directly (cgroup v1; run inside the container)
cat /sys/fs/cgroup/memory/memory.limit_in_bytes
# Check kernel OOM kill log (run on the host)
dmesg | grep -i "oom\|killed process" | tail -20
# Stream recent OOM events from Docker
docker events --filter event=oom --since 1h
# Memory breakdown inside container (cgroup v2)
cat /sys/fs/cgroup/memory.stat
# Check configured JVM flags (works even on stopped containers)
docker inspect --format '{{.Config.Entrypoint}} {{.Config.Cmd}}' <container_id>
# Check effective JVM flags of the running process
docker exec <container_id> cat /proc/1/cmdline | tr '\0' ' '
How to diagnose it
- Confirm the kill was OOM. Check docker inspect for OOMKilled: true and exit code 137. If OOMKilled is false, the container received SIGKILL from an external source. If true, proceed to memory accounting.
- Verify the JVM saw the container limit. Look at the effective max heap. If it is sized to host RAM (for example, 25% of 64 GB on a node with a 2 GB container limit), the JVM failed cgroup detection. This is common on kernel 6.12+ with JDK 21.0.9 and earlier, or when the container scope has no explicit memory.max.
- Compare heap to limit. If -Xmx is set explicitly, ensure it is not equal to the container limit. Size -Xmx to no more than 75% of the container limit, leaving headroom for off-heap components.
- Account for off-heap memory. Check whether the application uses Netty, gRPC, or OpenTelemetry. These allocate direct byte buffers outside the heap. If docker stats shows RSS significantly higher than heap usage, off-heap pressure is the gap; see the Native Memory Tracking sketch after this list.
- Check for ancestor limit confusion. On systemd hosts, read memory.max from the container’s cgroup path. If it reads max, the JVM may be reading the parent slice limit. Ensure Docker or Kubernetes sets an explicit limit on the container scope.
- Correlate timing. If OOM kills align with traffic spikes, batch jobs, or JIT compilation warm-up, the container limit may be correctly sized for idle state but insufficient for peak RSS.
- Check restart loops. A container with a restart policy that hits its memory limit repeatedly will show a climbing RestartCount alongside exit code 137. This is a crash loop, not a one-time spike.
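To break the off-heap gap down by category, Native Memory Tracking attributes JVM native allocations to metaspace, threads, code cache, and direct buffers. A sketch, assuming the container was started with NMT enabled (it is off by default and adds measurable overhead):
# Enable tracking at startup (requires a restart), for example via:
#   JAVA_TOOL_OPTIONS="-XX:NativeMemoryTracking=summary"
# Then query the running JVM (PID 1 inside the container)
docker exec <container_id> jcmd 1 VM.native_memory summary
NMT covers JVM-managed native memory; allocations made directly by JNI libraries show up only as a residual difference between RSS and the NMT total.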
Metrics and signals to monitor
| Signal | Why it matters | Warning sign |
|---|---|---|
| Container memory usage % of limit | The kernel OOM killer evaluates this, not the JVM heap | Sustained usage >80% of limit |
| OOMKilled status | Binary confirmation that the cgroup limit was breached | Any true in production |
| Container RSS from docker stats | Includes heap, off-heap, and cache; this is the kernel’s view | RSS near limit while heap metrics are low |
| JVM heap usage | Growth inside the heap is invisible to cgroup metrics until allocated | Heap consistently >80% of -Xmx |
| Container restart count | OOM crash loops surface as rapid restarts | Restart count increasing with exit code 137 |
| memory.stat anon bytes | Non-reclaimable anonymous memory drives OOM risk | Steady anon growth without traffic increase |
| cgroup memory.max vs JVM max heap | Reveals detection failures where the JVM ignored the limit | Heap sized to host RAM instead of container limit |
Fixes
If the cause is cgroup detection failure
Upgrade the JDK. JDK 21.0.10+ includes the fix for the kernel 6.12+ /proc/cgroups regression that caused the JVM to ignore container limits. JDK 17 users should verify their build contains the backport.
If upgrading is not immediately possible, replace -XX:MaxRAMPercentage with explicit -Xmx and -Xms values that fit comfortably inside the container limit. Explicit flags bypass ergonomic detection entirely and are the safest workaround. Do not disable container support with -XX:-UseContainerSupport; this removes cgroup detection entirely and usually causes larger heap sizing problems.
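A minimal sketch of the workaround, with illustrative values (my-java-app is a placeholder; size the heap against your own container limit):
# Pin the heap explicitly so ergonomic detection is taken out of the picture
docker run -d --memory 2g \
  -e JAVA_TOOL_OPTIONS="-Xms1g -Xmx1g" \
  my-java-app:latest
JAVA_TOOL_OPTIONS is picked up by the JVM at launch, so no entrypoint change is needed.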
If the cause is off-heap exhaustion
Set explicit off-heap limits: -XX:MaxDirectMemorySize, -XX:MaxMetaspaceSize, and -XX:ReservedCodeCacheSize. Without these, native allocations can grow unbounded.
Pad the container memory limit. A practical starting formula is: container limit >= -Xmx + -XX:MaxDirectMemorySize + metaspace headroom + GC overhead. For a 2 GB heap with 700 MB direct memory, set the container limit to at least 3 GB.
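Putting the formula into a concrete run command (values are illustrative; measure your peak direct buffer usage first, and my-java-app is a placeholder):
# 2 GB heap + 700 MB direct + 256 MB metaspace, with headroom inside a 3 GB limit
docker run -d --memory 3g \
  -e JAVA_TOOL_OPTIONS="-Xmx2g -XX:MaxDirectMemorySize=700m -XX:MaxMetaspaceSize=256m" \
  my-java-app:latest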
Profile direct buffer usage if you use Netty or gRPC. Native memory leaks in direct buffers are invisible to standard JVM heap dumps and will keep pushing RSS toward the limit until the kernel kills the container.
If the cause is heap oversizing
Set -Xmx to no more than 75% of the container memory limit. The remaining 25% covers metaspace, thread stacks, code cache, direct buffers, and JVM native overhead.
Do not set both -Xmx and -XX:MaxRAMPercentage. If both are present, -Xmx silently wins and the percentage flag is ignored, which can mislead operators who expect percentage-based sizing to be active during an incident.
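If you want sizing that scales with the container limit, set only the percentage flag. A sketch, again with a placeholder image:
# 75% of the cgroup limit goes to heap; the rest is off-heap headroom
docker run -d --memory 2g \
  -e JAVA_TOOL_OPTIONS="-XX:MaxRAMPercentage=75.0" \
  my-java-app:latest
# Effective max heap here is roughly 1.5 GB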
If the cause is missing or incorrect ancestor limit
Ensure Docker or Kubernetes sets an explicit memory limit on the container’s cgroup scope. On cgroup v2, verify that memory.max in the container’s scope is a concrete number, not max. If the parent systemd slice enforces a lower effective limit than the orchestrator intended, adjust the workload configuration so the container scope receives the correct bound.
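To confirm which scope actually carries the limit, walk the hierarchy from the host. The paths below assume cgroup v2 with the systemd cgroup driver; adjust for your layout:
# Resolve the container's cgroup path
cat /proc/$(docker inspect --format '{{.State.Pid}}' <container_id>)/cgroup
# Container scope limit: should be a concrete byte count, not "max"
cat /sys/fs/cgroup/system.slice/docker-<container_id>.scope/memory.max
# Parent slice limit: if lower, it is the effective bound
cat /sys/fs/cgroup/system.slice/memory.max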
Prevention
- Test cgroup detection after every JDK or kernel upgrade. Start a test container with a known memory limit and verify that the ergonomically sized heap matches MaxRAMPercentage of that limit, not host RAM. A minimal smoke test is sketched after this list.
- Monitor RSS, not just heap. Application metrics from JMX show heap state; cgroup metrics show RSS. The gap between them is off-heap pressure.
- Configure explicit limits for every Java container. Set -Xmx, -XX:MaxDirectMemorySize, -XX:MaxMetaspaceSize, and a container memory.max. Implicit defaults hide sizing errors.
- Set container restart policies to surface loops early. A container that OOMs and restarts repeatedly is easier to spot than one that dies once and stays stopped.
- Document your JDK and kernel compatibility matrix. Note which JDK builds are verified against your host kernel and cgroup version to avoid deploying known-bad combinations into production.
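A minimal detection smoke test for the first item above, assuming eclipse-temurin:21-jre as the image under test:
#!/usr/bin/env bash
# Verify the ergonomic heap tracks the container limit, not host RAM
LIMIT_MB=512
HEAP_BYTES=$(docker run --rm -m "${LIMIT_MB}m" eclipse-temurin:21-jre \
  java -XX:+PrintFlagsFinal -version 2>/dev/null |
  awk '/MaxHeapSize/ {print $4; exit}')
HEAP_MB=$((HEAP_BYTES / 1024 / 1024))
echo "limit=${LIMIT_MB}MiB ergonomic_max_heap=${HEAP_MB}MiB"
# Default MaxRAMPercentage is 25, so anything near or above half the limit
# means the JVM sized against host RAM and detection is broken
if [ "${HEAP_MB}" -gt $((LIMIT_MB / 2)) ]; then
  echo "FAIL: cgroup detection broken"; exit 1
fi
echo "OK: heap respects the container limit"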
How Netdata helps
Netdata collects per-container cgroup memory (memory.current, memory.stat) and process RSS. Chart the gap between RSS and heap usage to isolate off-heap growth.
OOM kill alerts and restart count spikes expose crash loops before you inspect logs.
CPU throttling charts let you exclude CPU pressure as a confounding factor during memory incidents.
Related guides
- Docker container high memory usage: how to diagnose it
- Docker container memory leak: how to find one and prove it
- Docker container keeps restarting: causes, checks, and fixes
- Docker container exits immediately: how to diagnose it
- Docker CPU throttling: the hidden cause of container latency
- Docker exit code 1: application errors and how to find them