Docker memory limits: how to set them and what happens when containers hit them

A container without a memory limit can consume all available RAM, force the kernel to reclaim page cache, push the system into swap, and eventually trigger a host-level OOM kill that takes down the Docker daemon or other critical processes. Setting --memory is not enough: limits can be silently ignored, misread by runtimes, or masked by swap behavior that turns a clean failure into a slow crawl.

Exit code 137 can mean an OOM kill or an external SIGKILL. The difference matters because the fix for an undersized limit is not the same as the fix for a runtime sizing its heap from host memory instead of the cgroup limit.

This guide covers hard and soft limits, how to verify the kernel is enforcing them, and how to diagnose what happens when a container hits its ceiling. You will be able to distinguish a true OOM kill from a configuration ghost, and you will know which cgroup files to read when Docker’s own output is not enough.

What this means

Docker does not enforce memory limits itself. It writes the limit to the kernel’s cgroup memory controller. On cgroup v1, --memory maps to memory.limit_in_bytes; on cgroup v2, it maps to memory.max. The kernel tracks every page allocated by processes in that cgroup. If the total exceeds the limit, the kernel’s OOM killer terminates a process inside the cgroup with SIGKILL.
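
A quick way to see that mapping is to start a container with a limit and read back the value the kernel recorded. A minimal sketch assuming a cgroup v2 host with the systemd cgroup driver (the same path layout used in the Quick checks below) and a throwaway alpine container:

# Start a disposable container with a 256 MiB hard limit
docker run -d --rm --name memtest --memory=256m alpine sleep 300

# Read back what the kernel will enforce (expect 268435456)
CONTAINER_ID=$(docker inspect --format '{{.Id}}' memtest)
cat /sys/fs/cgroup/system.slice/docker-${CONTAINER_ID}.scope/memory.max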

The container’s main process exits with code 137 (128 + SIGKILL), and Docker sets "OOMKilled": true in the container state. If the killed process was not PID 1, the container may stay running in a degraded state. If it was PID 1, the container stops. A restart policy can then create an OOM crash loop.
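
To see the full sequence end to end, a deliberately undersized container reproduces it in seconds. A minimal sketch for a test host, assuming the alpine image is available; tail /dev/zero buffers its input indefinitely, so it blows past the limit almost immediately:

# Force an OOM kill against a 64 MiB limit (test hosts only)
docker run --name oom-demo --memory=64m --memory-swap=64m alpine tail /dev/zero
echo "exit: $?"                                                      # expect 137 (128 + SIGKILL)
docker inspect --format 'OOMKilled={{.State.OOMKilled}}' oom-demo    # expect true
docker rm oom-demo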

A soft limit set with --memory-reservation is advisory. It activates when the host is under memory contention, but the container can burst above it when headroom exists. Swap behavior is controlled separately with --memory-swap. If you only set --memory and leave --memory-swap unset, the default total memory-plus-swap allowed is twice the memory limit. If you set --memory-swap equal to --memory, the container gets no swap.
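
In practice the three flag combinations look like this; my-app is a placeholder image name and the sizes are arbitrary:

# Hard limit only: up to 512 MiB of RAM plus another 512 MiB of swap by default
docker run -d --memory=512m my-app

# Hard limit plus an advisory reservation the kernel reclaims toward under host pressure
docker run -d --memory=512m --memory-reservation=256m my-app

# Hard limit with swap disabled: memory-swap equal to memory
docker run -d --memory=512m --memory-swap=512m my-app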

On cgroup v2 hosts, including Ubuntu 24.04, the paths and semantics change. The v2 controller exposes memory.high for soft throttling, but Docker does not surface this in the CLI. Runtimes such as the JVM and Node.js may size their heap from host memory rather than the cgroup limit, causing OOM kills even while the runtime's own metrics report free heap.
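
One way to check whether a runtime respects the limit is to ask it what heap it would choose inside a constrained container. A sketch assuming the eclipse-temurin:17 image; modern container-aware JVMs derive the reported maximum heap from the 512 MiB limit rather than from host RAM:

# Print the heap ceiling the JVM derives inside a 512 MiB container
docker run --rm --memory=512m eclipse-temurin:17 \
  java -XX:+PrintFlagsFinal -version | grep -i maxheapsize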

Common causes

Cause | What it looks like | First thing to check
Hard limit too low for workload | Exit code 137, OOMKilled: true, sawtooth memory pattern before restart | docker inspect limit versus docker stats peak usage
Application memory leak | memory.stat anon grows monotonically until OOM loop | anon byte trend in cgroup over hours
Runtime assumes host memory for heap | JVM or Node.js container OOMs despite heap looking small | Runtime flags such as -XX:MaxRAMPercentage or --max-old-space-size
Limit silently ignored | docker inspect shows a limit but container uses host-wide memory | Cgroup memory.max or memory.limit_in_bytes directly
Swap exhaustion | OOM occurs while RAM usage appears below the limit | --memory-swap setting and host swap availability
Parent cgroup shadowing | Limits worked before host upgrade, now ignored inside LXC or nested containers | Parent cgroup memory.max overriding child limits

Quick checks

# Check live memory usage against the limit for all containers
docker stats --no-stream --format "table {{.Name}}\t{{.MemUsage}}\t{{.MemPerc}}"

# Inspect configured limit and OOM status for a specific container
docker inspect --format 'Limit={{.HostConfig.Memory}} OOMKilled={{.State.OOMKilled}} Exit={{.State.ExitCode}}' <container_id>

# Read the enforced limit directly from cgroup v2 (systemd driver)
CONTAINER_ID=$(docker inspect --format '{{.Id}}' <container_id>)
cat /sys/fs/cgroup/system.slice/docker-${CONTAINER_ID}.scope/memory.max

# Read current non-reclaimable usage from cgroup v2
grep -E '^(anon|slab)' /sys/fs/cgroup/system.slice/docker-${CONTAINER_ID}.scope/memory.stat

# Check cgroup v2 OOM kill counter
grep oom_kill /sys/fs/cgroup/system.slice/docker-${CONTAINER_ID}.scope/memory.events

# Check for kernel OOM kill messages
dmesg | grep -i "oom-kill\|killed process" | tail -20

# Stream OOM events from the Docker daemon
docker events --filter event=oom --since 1h

# Check cgroup v1 limit and OOM state (cgroupfs driver)
cat /sys/fs/cgroup/memory/docker/<container_id>/memory.limit_in_bytes
cat /sys/fs/cgroup/memory/docker/<container_id>/memory.oom_control

How to diagnose it

  1. Confirm the limit is actually set. Run docker inspect --format '{{.HostConfig.Memory}}' <id>. If the value is 0, there is no limit. The container can consume all host RAM.

  2. Verify the kernel sees the same limit. Read the cgroup file directly. On cgroup v2 with the systemd driver, check /sys/fs/cgroup/system.slice/docker-<id>.scope/memory.max. On cgroup v1 with the cgroupfs driver, check /sys/fs/cgroup/memory/docker/<id>/memory.limit_in_bytes. If the file does not exist or the value differs from docker inspect, the limit is not being enforced. This happens on kernels without memory cgroup support or in nested environments where a parent cgroup shadows the child.

  3. Determine if the container was OOM-killed. Check docker inspect for "OOMKilled": true and exit code 137. If OOMKilled is false, the SIGKILL came from an external source such as docker kill or an orchestrator. Do not tune memory limits for a non-OOM kill.

  4. Inspect the memory usage pattern. Look at memory.stat in the cgroup. Focus on anon (anonymous pages, heap and stack) and slab (kernel allocations). If anon climbs steadily and never plateaus, the application has a leak. If file (page cache) is high but anon is low, the pressure may be from cache, which is reclaimable and often harmless.

  5. Check the runtime behavior for JVM or Node.js. Runtimes may default to host memory for heap sizing unless explicitly constrained. If the runtime allocates heap based on host memory instead of the cgroup limit, it will OOM despite internal metrics showing free heap. For the JVM, -XX:MaxRAMPercentage sizes the heap as a percentage of detected RAM; only rely on it when the JVM correctly detects the cgroup limit. For Node.js, set --max-old-space-size in MB to roughly 70-75% of the container limit. A run sketch covering both runtimes follows this list.

  6. Review swap configuration. If --memory-swap is unset, the container can use swap up to the memory limit. If swap is disabled or exhausted, the kernel treats swap pressure like RAM pressure. Set --memory-swap equal to --memory if you want to prevent swap usage entirely.

  7. Check for multi-process container quirks. The kernel OOM killer targets the process with the highest badness score, which may be a child process rather than PID 1. If a child is killed but PID 1 survives, the container stays running in a degraded state and docker inspect may show OOMKilled: false even though a process in the cgroup was killed. A quick check for this case is also sketched after this list.
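
For step 5, the runtime flags can be injected through environment variables the JVM and Node.js both honor. A sketch with placeholder image names; the values assume roughly 75% of a 1 GiB limit:

# JVM: size the heap from the detected (cgroup-aware) memory limit
docker run -d --memory=1g \
  -e JAVA_TOOL_OPTIONS="-XX:MaxRAMPercentage=75.0" my-jvm-app

# Node.js: cap the old-generation heap in MB, leaving headroom for buffers and stack
docker run -d --memory=1g \
  -e NODE_OPTIONS="--max-old-space-size=768" my-node-app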
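
For step 7, the kernel log names the victim and docker top shows whether PID 1 survived. These commands assume nothing beyond a readable dmesg and a running container:

# Which process did the kernel pick, and in which cgroup?
dmesg | grep -iE "oom-kill|killed process" | tail -5

# If PID 1 is still listed, the container absorbed the kill and kept running degraded
docker top <container_id>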

Metrics and signals to monitor

Signal | Why it matters | Warning sign
Container memory usage vs limit | Hard limit breaches trigger kernel OOM kills | Sustained usage above 80% of limit
cgroup oom_kill count | Direct kernel count of OOM kills in the cgroup | Any nonzero count in production
Container restart count | Rising restarts indicate crash loops from OOM or other failures | Increasing faster than once per hour
memory.stat anon bytes | Anonymous pages are non-reclaimable; cache is excluded | Steady growth without traffic increase
Exit code 137 with OOMKilled: true | Distinguishes OOM kill from external SIGKILL | Any occurrence on production containers
Host MemAvailable | Containers without limits compete for host RAM | MemAvailable drops below 20% of MemTotal

Fixes

If the cause is an undersized hard limit

Increase the limit with docker update --memory <new-limit> <id>. This works live for increases; decreasing below current usage fails. Note that changes made with docker update do not persist for containers managed by Docker Compose. The tradeoff is that a higher limit protects the container but reduces headroom for other workloads on the same host. Size limits based on peak observed usage during load testing, not average usage.
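
A live increase looks like this; if the container was started with --memory-swap, update both values together, since Docker rejects a memory limit larger than the existing swap ceiling:

# Raise the hard limit in place and confirm the new value
docker update --memory 1g --memory-swap 1g <container_id>
docker inspect --format 'Limit={{.HostConfig.Memory}}' <container_id>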

If the cause is a memory leak

Raise the limit temporarily to reduce crash-loop noise, then profile the application. For JVM containers, cap the heap relative to the detected limit and set -XX:MaxMetaspaceSize. For Node.js, set --max-old-space-size in MB to roughly 70-75% of the container limit. The real fix is in the application code, but runtime flags prevent the runtime from assuming it owns the entire host.
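
To confirm the leak while the higher limit buys time, sample the cgroup's anon counter on an interval and watch whether it plateaus. A sketch assuming cgroup v2 with the systemd driver:

# Log non-reclaimable (anon) bytes once a minute; steady growth with no plateau suggests a leak
CONTAINER_ID=$(docker inspect --format '{{.Id}}' <container_id>)
STAT=/sys/fs/cgroup/system.slice/docker-${CONTAINER_ID}.scope/memory.stat
while sleep 60; do
  echo "$(date +%T) $(awk '$1 == "anon" {print $2}' "$STAT")"
done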

If the cause is silent limit ignore

If the cgroup file does not match docker inspect, the host kernel may lack memory cgroup support, or a parent cgroup may be overriding the limit. This has been observed on minimal cloud VMs and inside LXC containers after Ubuntu 24.04 upgrades to cgroup v2. Verify enforcement on non-production hosts before deploying to production, and inspect the parent cgroup hierarchy if running nested Docker.
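
To see whether a parent imposes a tighter limit than the container's own, walk up the hierarchy and print memory.max at each level. A sketch assuming the standard systemd driver paths; nested LXC or Kubernetes hosts will use different prefixes:

# Print the enforced limit at every level above the container
CONTAINER_ID=$(docker inspect --format '{{.Id}}' <container_id>)
d=/sys/fs/cgroup/system.slice/docker-${CONTAINER_ID}.scope
while [ "$d" != "/sys/fs/cgroup" ]; do
  printf '%s: %s\n' "$d" "$(cat "$d/memory.max" 2>/dev/null || echo 'not present')"
  d=$(dirname "$d")
done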

If the cause is swap pressure

Set --memory-swap equal to --memory to disable swap for the container, or ensure the host has adequate swap if you rely on it. Be aware that swap usage changes the meaning of memory pressure alerts. A container using heavy swap can OOM when swap runs out while its RAM usage looks healthy.

If the cause is page cache ambiguity

Alert on anon + slab rather than total cgroup memory. Page cache (file) is reclaimable and the kernel will drop it under pressure. On some kernel and cgroup versions, page cache accounting behaves inconsistently, so total memory usage alone can mislead you into thinking the container is near OOM when it is not.
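
A quick breakdown of what is reclaimable versus what is not can be read straight from the cgroup; this assumes cgroup v2 paths:

# Show anon (non-reclaimable), file (reclaimable cache), and slab side by side in MiB
CONTAINER_ID=$(docker inspect --format '{{.Id}}' <container_id>)
awk '$1 ~ /^(anon|file|slab)$/ {printf "%-5s %.1f MiB\n", $1, $2/1048576}' \
  /sys/fs/cgroup/system.slice/docker-${CONTAINER_ID}.scope/memory.stat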

Prevention

  • Set a hard --memory limit on every production container. A container without a limit is an unbounded host-level risk.
  • Avoid --oom-kill-disable. It prevents the kernel from terminating the container, allowing it to exhaust host memory and trigger system-wide kills.
  • Configure application runtimes to respect cgroup limits. Bake JVM heap limits and Node.js --max-old-space-size into images.
  • Monitor memory.events oom_kill proactively. Do not wait for restart loops or user complaints.
  • Verify limits after kernel upgrades or host migrations by spot-checking cgroup files directly, as in the loop sketched after this list.
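
The spot-check can be scripted across all running containers. A sketch assuming cgroup v2 with the systemd driver, comparing Docker's configured limit to the value the kernel enforces:

# Compare configured versus enforced limits for every running container
for id in $(docker ps -q); do
  full_id=$(docker inspect --format '{{.Id}}' "$id")
  configured=$(docker inspect --format '{{.HostConfig.Memory}}' "$id")
  enforced=$(cat /sys/fs/cgroup/system.slice/docker-${full_id}.scope/memory.max 2>/dev/null)
  echo "$id configured=${configured} enforced=${enforced:-missing}"
done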

How Netdata helps

  • Charts container memory usage against cgroup limits so you see proximity to OOM before the kernel acts.
  • Breaks down cgroup v2 memory.stat into anon, file, and slab without manual /proc parsing.
  • Alerts on OOM kill events and container restart spikes alongside host memory pressure charts.
  • Tracks host MemAvailable to catch unbounded containers exhausting system memory.