MongoDB OOM-killed by the kernel: RSS, cache sizing, and oom_score_adj
You find mongod gone. The replica set has no primary. Applications time out. MongoDB logs show no graceful shutdown. Instead, dmesg shows Out of memory: Killed process 12345 (mongod). The Linux OOM killer has reaped the process. MongoDB is a frequent target because its resident set size is usually the largest on the host.
An OOM kill is not a MongoDB bug. It is the kernel freeing RAM by terminating the highest-scoring process. mongod’s RSS is dominated by the WiredTiger cache, plus roughly 1 MB per connection, plus roughly 500 MB to 1 GB of internal overhead for indexes, session buffers, and stack. When that sum comes within 1 GB of total RAM, the node is in the danger zone. The kill is abrupt: no stepdown, no replica set coordination, and after restart the cache must warm again.
The fix is rarely just adding RAM. Size the WiredTiger cache so RSS fits safely inside the host or container limit, control connection count and churn, and use oom_score_adj as a protective signal without making the host unmanageable.
```mermaid
flowchart TD
    A[System RAM] --> B[mongod RSS]
    B --> C[WiredTiger cache]
    B --> D[~1 MB per connection]
    B --> E[Internal overhead ~500 MB-1 GB]
    C --> F[Cache fill >80%]
    D --> G[Connection storm]
    F --> H[RSS approaches RAM]
    G --> H
    H --> I[OOM killer selects mongod]
    I --> J[mongod terminated]
```

What this means
RSS is the physical memory mongod occupies. In a healthy node:
RSS ~= WiredTiger cache size + (connections current × ~1 MB) + ~500 MB-1 GB overhead
If your node has 16 GB RAM and the default cache formula applies, WiredTiger claims max(0.5 × (16 - 1), 0.25) = 7.5 GB. With 1,000 connections (~1 GB) and 1 GB overhead, RSS sits around 9.5 GB. That is comfortable. But if that host-derived value is used inside an 8 GB container, the cache still claims 7.5 GB, leaving almost no room for connections or overhead. RSS quickly reaches the container limit and the cgroup OOM killer terminates mongod.
Treat mem.resident within 1 GB of total RAM, or within 1 GB of the container memory limit, as a pre-OOM condition.
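
The arithmetic above can be sketched as a few small functions. This is a rough illustration, not a MongoDB API: the function names are hypothetical, and the 1 MB per connection and 1 GB overhead figures are the estimates used in this guide, not measured values.

```javascript
// Default WiredTiger cache per the formula above: max(50% of (RAM - 1 GB), 0.25 GB).
function defaultCacheGB(ramGB) {
  return Math.max(0.5 * (ramGB - 1), 0.25);
}

// RSS ~= cache + (connections x ~1 MB) + overhead, all expressed in GB.
// Assumption: 1,000 connections are counted as roughly 1 GB, as in the text.
function estimateRssGB(cacheGB, connections, overheadGB) {
  return cacheGB + connections / 1000 + overheadGB;
}

// Pre-OOM condition: estimated RSS within 1 GB of the memory ceiling (or already above it).
function isPreOom(rssGB, limitGB) {
  return limitGB - rssGB <= 1;
}

const hostCacheGB = defaultCacheGB(16);               // cache sized from a 16 GB host
const rssGB = estimateRssGB(hostCacheGB, 1000, 1);    // 1,000 connections, 1 GB overhead
console.log("cache GB:", hostCacheGB, "estimated RSS GB:", rssGB);
console.log("pre-OOM on 16 GB host:", isPreOom(rssGB, 16));
console.log("pre-OOM in 8 GB container:", isPreOom(rssGB, 8));
```

Feeding the same estimate into both ceilings makes the container case explicit: the numbers that are comfortable on the host already exceed the 8 GB limit.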
Common causes
| Cause | What it looks like | First thing to check |
|---|---|---|
| WiredTiger cache using host RAM formula inside a container | OOM kill shortly after startup; RSS hovers near the container limit even under light load | wiredTiger.cache.maximum bytes configured vs the container memory limit |
| Connection storm after failover, deploy, or DNS blip | totalCreated spikes; RSS tracks connections.current; latency rises from thread overhead | db.serverStatus().connections correlated with db.serverStatus().mem.resident |
| Default cache on a small VM | RSS reaches the 2 GB limit despite the 256 MB cache floor | mem.resident vs total RAM minus 1 GB safety margin |
| Long-running snapshots pinning cache | Cache fill and dirty ratio rise without workload increase; operations slow from application-thread eviction | db.currentOp() for old transactions and metrics.cursor.open.noTimeout |
| Memory leak or heap fragmentation | RSS grows steadily while cache utilization and connection count are flat | db.serverStatus().tcmalloc.generic for heap_size vs current_allocated_bytes |
Quick checks
All are read-only. Run in order.
```bash
# Confirm the OOM kill in the kernel log
sudo dmesg -T | grep -i "out of memory"
sudo grep "Killed process.*mongod" /var/log/kern.log

# Check mongod RSS in kilobytes
for pid in $(pgrep mongod); do grep VmRSS /proc/$pid/status; done

# Check current oom_score_adj
for pid in $(pgrep mongod); do cat /proc/$pid/oom_score_adj; done

# Check swap and swappiness
sysctl vm.swappiness
cat /proc/swaps
```
```javascript
// Check RSS, cache size, and connection count from a single serverStatus snapshot
var status = db.serverStatus();
var mem = status.mem;
var wt = status.wiredTiger.cache;
var conn = status.connections;
print("RSS MB: " + mem.resident);
print("Cache max MB: " + (wt["maximum bytes configured"] / 1024 / 1024).toFixed(0));
print("Cache used %: " + (100 * wt["bytes currently in the cache"] / wt["maximum bytes configured"]).toFixed(1));
print("Connections current: " + conn.current);
print("Connections totalCreated: " + conn.totalCreated);
```
How to diagnose it
- Confirm the kill was OOM. Look for "Out of memory: Killed process <pid> (mongod)" in dmesg or /var/log/kern.log. In containers, the runtime may also emit a cgroup-specific OOM message.
- Measure RSS after restart. Use db.serverStatus().mem.resident and compare it to total system RAM or the container limit. If it is already within 1 GB, the cache is oversized or connections are too high.
- Compare the configured cache to available memory. wiredTiger.cache.maximum bytes configured should leave room for connections, overhead, and the OS page cache.
- Check for connection churn. A high totalCreated delta with stable current means connections are being destroyed and recreated rapidly. Each creation allocates a thread stack and spikes RSS.
- Check for container misconfiguration. In containers, ensure storage.wiredTiger.engineConfig.cacheSizeGB is set explicitly and sized for the container limit, not the host RAM.
- Look for snapshot pinning. Long-running multi-document transactions and noCursorTimeout cursors hold old cache snapshots open, preventing eviction and inflating RSS.
- Suspect a leak only after ruling out cache and connections. Compare tcmalloc.generic.heap_size to current_allocated_bytes. Significant and growing divergence suggests fragmentation.
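
Several of these checks can be run against a single serverStatus document. The sketch below is one possible way to do that, not an official tool: the function name `triage`, the finding labels, the 2 GB cache-room threshold, and the 25% heap-slack threshold are all hypothetical choices. It assumes `mem.resident` is reported in MiB and that `limitBytes` is the host RAM or container limit you supply yourself.

```javascript
const GIB = 1024 ** 3;

// Triage one serverStatus snapshot against a memory ceiling in bytes.
function triage(status, limitBytes) {
  const findings = [];
  const rssBytes = status.mem.resident * 1024 * 1024; // mem.resident is in MiB
  const cacheMaxBytes = status.wiredTiger.cache["maximum bytes configured"];

  // RSS already within 1 GB of the ceiling: pre-OOM condition.
  if (limitBytes - rssBytes <= GIB) findings.push("rss-near-limit");

  // Hypothetical threshold: the cache alone leaves under 2 GB for everything else.
  if (limitBytes - cacheMaxBytes < 2 * GIB) findings.push("cache-oversized");

  // Share of the tcmalloc heap not backing live allocations.
  const tc = status.tcmalloc && status.tcmalloc.generic;
  if (tc && tc.heap_size > 0) {
    const slack = 1 - tc.current_allocated_bytes / tc.heap_size;
    if (slack > 0.25) findings.push("heap-slack"); // hypothetical threshold
  }
  return findings;
}

// Mock snapshot resembling a cache sized for a 16 GB host inside an 8 GB container.
const sample = {
  mem: { resident: 7500 },
  wiredTiger: { cache: { "maximum bytes configured": 7.5 * GIB } },
  tcmalloc: { generic: { heap_size: 8.0e9, current_allocated_bytes: 7.8e9 } },
};
console.log(triage(sample, 8 * GIB));
```

In mongosh, the same function could be called as `triage(db.serverStatus(), 8 * 1024 ** 3)`. A single snapshot cannot show growth, so the heap check only flags a candidate; compare several snapshots before concluding there is a leak.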
Metrics and signals to monitor
| Signal | Why it matters | Warning sign |
|---|---|---|
| mem.resident | Tracks mongod RSS, the memory the OOM killer scores | Within 1 GB of total RAM or cgroup limit |
| wiredTiger.cache.maximum bytes configured | Shows whether the cache is sized to the host instead of the container limit | Leaves no room for connections and overhead |
| wiredTiger.cache.bytes currently in the cache | Cache fill directly adds to RSS | Sustained >80% with rising eviction |
| wiredTiger.cache.tracked dirty bytes in the cache | Dirty ratio predicts cache pressure and checkpoint stall | >20% of maximum bytes configured |
| wiredTiger.cache.pages evicted by application threads | Indicates cache pressure forcing user threads to do eviction work | Any sustained nonzero rate |
| connections.current | Each connection adds ~1 MB of RSS and scheduling overhead | Sustained >80% of maxIncomingConnections |
| connections.totalCreated delta | Churn allocates and destroys thread stacks repeatedly | Sharp spike while current is stable |
| metrics.cursor.open.noTimeout | Each cursor can pin a cache snapshot indefinitely | Growing or unexpectedly high |
| extra_info.page_faults rate | Sustained high rate after warmup indicates the working set exceeds memory | Sustained high rate after warmup |
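
Several rows in the table are ratios or rates rather than raw counters, so they need two serverStatus samples taken a known number of seconds apart. The following sketch shows one way to derive them; the function names and the mock numbers are hypothetical, and a single interval does not establish a "sustained" condition, so treat the output as a per-interval reading to be tracked over time.

```javascript
const APP_EVICT = "pages evicted by application threads";

// Derive ratios and per-second rates from two serverStatus samples.
function sampleSignals(prev, curr, seconds) {
  const cache = curr.wiredTiger.cache;
  const max = cache["maximum bytes configured"];
  return {
    cacheFillPct: (100 * cache["bytes currently in the cache"]) / max,
    dirtyPct: (100 * cache["tracked dirty bytes in the cache"]) / max,
    appEvictionsPerSec: (cache[APP_EVICT] - prev.wiredTiger.cache[APP_EVICT]) / seconds,
    connectionsCreatedPerSec:
      (curr.connections.totalCreated - prev.connections.totalCreated) / seconds,
    connectionsCurrent: curr.connections.current,
  };
}

// Apply the warning thresholds from the table to one interval.
function warnings(signals, maxIncomingConnections) {
  const out = [];
  if (signals.cacheFillPct > 80) out.push("cache fill above 80%");
  if (signals.dirtyPct > 20) out.push("dirty ratio above 20%");
  if (signals.appEvictionsPerSec > 0) out.push("application threads are evicting");
  if (signals.connectionsCurrent > 0.8 * maxIncomingConnections) {
    out.push("connections above 80% of maxIncomingConnections");
  }
  return out;
}

// Mock samples taken 60 seconds apart (illustrative numbers only).
const prev = {
  wiredTiger: { cache: { [APP_EVICT]: 100 } },
  connections: { totalCreated: 1000 },
};
const curr = {
  wiredTiger: { cache: {
    "maximum bytes configured": 1000,
    "bytes currently in the cache": 900,
    "tracked dirty bytes in the cache": 250,
    [APP_EVICT]: 160,
  } },
  connections: { current: 500, totalCreated: 1600 },
};
const signals = sampleSignals(prev, curr, 60);
console.log(signals, warnings(signals, 1000));
```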
Fixes
Resize the WiredTiger cache
The safe cache size is not the default. It is the largest value that keeps total RSS below the danger zone.
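For a container or VM with memory limit L, set cacheSizeGB so expected RSS stays at least 1 GB below L. With the approximation RSS ~= cache + connections + overhead, that means cacheSizeGB <= L - 1 GB - connections - overhead. Treat the result as a ceiling and choose a lower value if you want more headroom.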
Example: an 8 GB container with 1,000 connections (~1 GB) and 1 GB overhead should set cacheSizeGB to roughly 3 GB, not the host-derived 7.5 GB or the container-aware default of 3.5 GB. That yields an expected RSS near 5 GB, keeping 3 GB of headroom.
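
The sizing arithmetic in this example can be written down as a short calculation. This is a sketch under the guide's own estimates (about 1 MB per connection, a fixed overhead figure, a 1 GB safety margin); the function names are hypothetical.

```javascript
// Largest cache (GB) that keeps estimated RSS at least marginGB below the limit.
// Assumption: 1,000 connections are counted as roughly 1 GB.
function maxCacheGB(limitGB, connections, overheadGB, marginGB = 1) {
  return limitGB - marginGB - connections / 1000 - overheadGB;
}

// Headroom (GB) left under the limit for a chosen cache size.
function headroomGB(limitGB, cacheGB, connections, overheadGB) {
  return limitGB - (cacheGB + connections / 1000 + overheadGB);
}

console.log("cache ceiling GB:", maxCacheGB(8, 1000, 1));
console.log("headroom at 3 GB cache:", headroomGB(8, 3, 1000, 1));
console.log("headroom at 3.5 GB cache:", headroomGB(8, 3.5, 1000, 1));
```

The ceiling is the point where RSS would sit exactly 1 GB under the limit. Choosing 3 GB rather than the ceiling trades some cache for extra room to absorb a connection spike.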
Update mongod.conf:
```yaml
storage:
  wiredTiger:
    engineConfig:
      cacheSizeGB: 3
```
Restart mongod to apply. You can change the cache at runtime with setParameter and wiredTigerEngineRuntimeConfig, but the change does not survive restart. Always update mongod.conf as the source of truth.
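
For reference, the runtime change mentioned above is a setParameter command carrying a WiredTiger configuration string. The helper below only builds that command document; the function name is hypothetical, and expressing the size in megabytes with the `M` suffix is a choice made here so fractional gigabyte values work.

```javascript
// Build the setParameter command that resizes the WiredTiger cache at runtime.
// The change is lost on restart, so mongod.conf must still be updated.
function cacheResizeCommand(cacheGB) {
  if (!(cacheGB > 0)) {
    throw new RangeError("cacheGB must be a positive number");
  }
  const megabytes = Math.round(cacheGB * 1024);
  return {
    setParameter: 1,
    wiredTigerEngineRuntimeConfig: "cache_size=" + megabytes + "M",
  };
}

console.log(JSON.stringify(cacheResizeCommand(3)));
// In mongosh, the command would be sent with:
//   db.adminCommand(cacheResizeCommand(3))
```

Shrinking the cache on a busy node forces eviction, so expect a period of extra eviction work after the command is applied.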
Reduce connection pressure
If a connection storm triggered the OOM kill, reduce the driver pool size, eliminate connection leaks, and route read traffic to secondaries. If the server is actively flooded, temporarily lowering net.maxIncomingConnections rejects new connections cleanly instead of accepting them and dying from RSS growth.
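
One way to reason about driver pool size is to divide a connection budget across application instances. The sketch below assumes every instance opens at most `maxPoolSize` connections and that you want the fleet to stay under a chosen share of maxIncomingConnections; the function names, the 50% default share, and the host name are hypothetical. It ignores monitoring connections and other clients, so treat the result as an upper bound.

```javascript
// Largest per-instance pool size that keeps the whole fleet under the budget.
function maxPoolSizePerInstance(maxIncomingConnections, appInstances, targetShare = 0.5) {
  return Math.floor((maxIncomingConnections * targetShare) / appInstances);
}

// Connection string carrying the pool limit via the maxPoolSize URI option.
function connectionUri(host, poolSize) {
  return "mongodb://" + host + "/?maxPoolSize=" + poolSize;
}

const poolSize = maxPoolSizePerInstance(2000, 20);
console.log(poolSize, connectionUri("mongo.example.internal:27017", poolSize));
```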
Protect mongod with oom_score_adj
Set oom_score_adj to a negative value so the kernel is less likely to select mongod first. Do not set it to -1000; full immunity can cause the kernel to kill sshd, systemd, or other critical processes instead, potentially locking you out of the host. A practical protective value is often around -900.
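```bash
# Run as root. The value lasts only until mongod restarts. For permanence, set
# OOMScoreAdjust=-900 in the mongod systemd unit or write the value from an init script.
# -x matches the exact process name so tools such as mongodump are not touched.
for pid in $(pgrep -x mongod); do echo -900 > "/proc/$pid/oom_score_adj"; done
```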
Release pinned snapshots
Warning: killOp terminates operations immediately. Use only if you have identified the specific transaction or cursor causing pressure.
Kill long-running transactions and noCursorTimeout cursors that pin cache snapshots:
```javascript
// Find and kill transactions that have been open for more than 60 seconds
db.currentOp({ "transaction": { "$exists": true } }).inprog.forEach(function(op) {
  if (op.transaction.timeOpenMicros > 60000000) {
    print("Killing " + op.opid + " open for " + op.transaction.timeOpenMicros / 1000000 + "s");
    db.killOp(op.opid);
  }
});
```
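
Given the warning above, it can help to list candidates before killing anything. The sketch below is a dry-run filter over the `inprog` array returned by db.currentOp(); the function name and the mock data are hypothetical, and it only reads the `opid` and `transaction.timeOpenMicros` fields already used in the snippet above.

```javascript
// Return transactions open longer than thresholdSeconds, longest first, without killing them.
function longOpenTransactions(inprog, thresholdSeconds) {
  return inprog
    .filter((op) => op.transaction && op.transaction.timeOpenMicros > thresholdSeconds * 1e6)
    .map((op) => ({ opid: op.opid, openSeconds: op.transaction.timeOpenMicros / 1e6 }))
    .sort((a, b) => b.openSeconds - a.openSeconds);
}

// Mock inprog entries (illustrative only).
const inprog = [
  { opid: 11, transaction: { timeOpenMicros: 5e6 } },
  { opid: 12, transaction: { timeOpenMicros: 90e6 } },
  { opid: 13 },
  { opid: 14, transaction: { timeOpenMicros: 600e6 } },
];
console.log(longOpenTransactions(inprog, 60));
```

In mongosh, the input would be `db.currentOp({ "transaction": { "$exists": true } }).inprog`. Reviewing the list first lets you kill only the specific operation you have tied to the cache pressure.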
Prevention
- Size for headroom. Keep mem.resident below roughly 80% of total RAM or the container limit, and never within 1 GB of the ceiling.
- Plot RSS weekly. Track mem.resident against cache size and connection count. If RSS grows without cache growth, investigate fragmentation or leaks.
- Cap connections. Operate below 50% of maxIncomingConnections to leave room for reconnection storms.
- Set vm.swappiness=1. MongoDB should not swap. A value of 1 lets the kernel swap only under extreme pressure without evicting hot WiredTiger pages eagerly.
- Disable Transparent Huge Pages. THP causes latency spikes and fragmentation for database workloads. Set it to never or madvise. This applies to MongoDB 7.0 and earlier; MongoDB 8.0 recommends leaving THP enabled, so check the production notes for your version.
- Monitor dirty ratio and application-thread evictions. Rising dirty ratio and application-thread evictions are leading indicators that cache pressure is building before RSS explodes.
How Netdata helps
- Correlate mem.resident, wiredTiger.cache utilization, and system memory usage on the same timeline to see when RSS approaches the host or cgroup limit.
- Alert on WiredTiger cache dirty ratio climbing above safe thresholds before the pressure cascades into RSS growth.
- Track connections.current and connections.totalCreated deltas alongside RSS to distinguish a connection storm from cache-driven memory growth.
- Show container memory limits next to MongoDB metrics to expose container misconfiguration immediately.
- Map kernel OOM events to MongoDB process state changes to confirm whether a restart was caused by the OOM killer.
Related guides
- How MongoDB actually works in production: a mental model for operators
- MongoDB pages evicted by application threads: when eviction becomes user latency
- MongoDB balancer stuck and jumbo chunks: permanent imbalance and how to fix it
- MongoDB WiredTiger cache dirty ratio high: the leading indicator nobody watches
- MongoDB WiredTiger cache pressure cascade: eviction stalls and latency spikes
- MongoDB cache too small: sizing the WiredTiger cache for your working set
- MongoDB checkpoint duration climbing: diagnosing slow WiredTiger checkpoints
- MongoDB checkpoint stall write freeze: when all writes stop with no error
- MongoDB connection churn: high totalCreated rate and thread creation overhead
- MongoDB connection refused at maxIncomingConnections: hitting the connection ceiling
- MongoDB connection storm spiral: reconnection floods after an election or deploy
- MongoDB disk full: emergency recovery when mongod can’t write the journal







