MongoDB swapping: why mongod must never swap and how to tune the OS

Application timeouts climb. MongoDB latency jumps from milliseconds to seconds or minutes, yet mongod is still running and accepting connections. CPU is low, disk I/O is not saturated, and the MongoDB log shows no errors. The process has not crashed; it has entered swap death. MongoDB relies on the WiredTiger cache and the OS page cache staying resident in RAM. Once the Linux kernel evicts mongod pages to swap, the database keeps functioning, but orders of magnitude slower than normal.

What this means

Swapping turns a memory access into a disk read. For mongod, this is catastrophic: the storage engine and OS cache are designed for RAM-speed access. WiredTiger maintains an in-memory cache separate from the OS page cache. The WiredTiger cache is anonymous memory, so under pressure the kernel can push it to swap, while page cache pages are simply dropped and must be re-read from the data files. Either way, accesses that used to hit RAM now wait on disk. The process stays alive, heartbeats continue, and replica set elections may not trigger because the node is technically responsive. Operations queue indefinitely. The degradation is self-reinforcing: slow operations hold tickets and connections longer, increasing memory pressure and causing more swapping.

flowchart TD
    A[Memory pressure] --> B[OS swaps mongod pages]
    B --> C[Cache misses hit disk]
    C --> D[Latency spikes 10-100x]
    D --> E[Connections pile up]
    E --> F[More memory pressure]
    F --> A

Common causes

Cause	What it looks like	First thing to check
vm.swappiness at default (60)	Host swaps under moderate pressure even when buffers could be dropped	`cat /proc/sys/vm/swappiness`
Working set exceeds RAM	Page fault rate climbs after warmup; RSS sits near physical memory limit	`db.serverStatus().extra_info.page_faults` and `ps` RSS
NUMA imbalance on multi-socket	Uneven memory allocation across sockets; some nodes saturated while others are free	`numastat` and `/proc/<pid>/numa_maps`
Transparent Huge Pages enabled	Latency spikes and fragmentation under load, especially on older MongoDB versions	`cat /sys/kernel/mm/transparent_hugepage/enabled`
Container memory limit too small	OOM kills or swap pressure inside the container despite free host RAM	WiredTiger cache max bytes vs container limit

Quick checks

# pgrep -x matches the exact process name, so helpers such as exporters are not picked up.
# If several mongod instances are running, set MONGOD_PID by hand instead.
MONGOD_PID=$(pgrep -x mongod)

cat /proc/sys/vm/swappiness

free -h
cat /proc/swaps

grep VmSwap /proc/$MONGOD_PID/status

ps -o rss,vsz,comm -p $MONGOD_PID

mongosh --quiet --eval 'db.serverStatus().wiredTiger.cache["maximum bytes configured"]'

mongosh --quiet --eval 'db.serverStatus().extra_info.page_faults'

cat /sys/kernel/mm/transparent_hugepage/enabled

numastat -p $MONGOD_PID

cat /proc/$MONGOD_PID/oom_score_adj
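
The VmSwap check is the one most worth scripting, because any nonzero value matters. The following is a minimal sketch, not a definitive tool: `vmswap_kb` and `swap_verdict` are hypothetical helper names, and the parsing assumes the usual `VmSwap: <n> kB` layout of `/proc/<pid>/status`.

```shell
# Print the VmSwap value in kB from a /proc/<pid>/status-style file.
# Prints 0 when the field is absent (for example, for kernel threads).
vmswap_kb() {
    awk '/^VmSwap:/ { print $2; found = 1 } END { if (!found) print 0 }' "$1"
}

# Classify a VmSwap reading: any nonzero value is treated as abnormal.
swap_verdict() {
    if [ "$1" -gt 0 ]; then
        echo "SWAPPED"
    else
        echo "OK"
    fi
}

# Live usage (assumes exactly one mongod process on the host):
#   swap_verdict "$(vmswap_kb "/proc/$(pgrep -x mongod)/status")"
```

Keeping the parser separate from the verdict makes it easy to feed the same reading into an alerting pipeline.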

How to diagnose it

Confirm mongod is swapped. Any nonzero VmSwap in /proc/<pid>/status is abnormal.
Correlate with latency. Read and write spikes in db.serverStatus().opLatencies, together with rising page fault rates, confirm memory pressure. extra_info.page_faults is cumulative; calculate the delta over an interval.
Find the memory consumer. Compare mongod RSS to wiredTiger.cache["maximum bytes configured"]. Budget roughly 1MB per connection plus 1-2GB of internal overhead. If the expected footprint is below physical RAM but swapping still occurs, another process may be consuming memory, or the kernel is over-aggressive due to swappiness.
Inspect cache sizing. The WiredTiger default is the larger of 50% of (RAM minus 1GB) and 256MB. If co-hosted software or container limits reduce available memory below what this default assumes, the cache pressures the OS.
Verify kernel tuning. Check vm.swappiness, NUMA policy, and THP. Misconfiguration is the most common root cause after insufficient RAM.
Check for application-thread evictions. In db.serverStatus().wiredTiger.cache, a steadily growing "pages evicted by application threads" counter means the cache itself is under pressure, which often precedes OS-level swapping.
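
The page fault delta and the memory budget from the steps above are simple arithmetic. The sketch below shows one way to compute them; `fault_rate` and `expected_footprint_mb` are hypothetical helper names, and the 2048 MB default overhead is an assumption taken from the upper end of the 1-2GB guideline.

```shell
# Page faults per second between two cumulative samples.
# Args: previous_sample current_sample interval_seconds
fault_rate() {
    echo $(( ($2 - $1) / $3 ))
}

# Rough expected mongod footprint in MB:
# WiredTiger cache + about 1 MB per connection + internal overhead.
# Args: cache_mb connections [overhead_mb, default 2048]
expected_footprint_mb() {
    cache_mb=$1
    connections=$2
    overhead_mb=${3:-2048}
    echo $(( cache_mb + connections + overhead_mb ))
}

# Live sampling (assumes mongosh prints the counter as a plain integer):
#   a=$(mongosh --quiet --eval 'db.serverStatus().extra_info.page_faults')
#   sleep 60
#   b=$(mongosh --quiet --eval 'db.serverStatus().extra_info.page_faults')
#   fault_rate "$a" "$b" 60
```

If the expected footprint sits comfortably below physical RAM and VmSwap is still nonzero, the cause is more likely another process or kernel tuning than mongod itself.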

Metrics and signals to monitor

Signal	Why it matters	Warning sign
System swap usage	Any swap consumed by mongod signals swap death risk	`VmSwap` > 0 for the mongod process
Page fault rate	Hard page faults mean data is not resident	Rate increasing after the warmup period
WiredTiger cache fill ratio	Pressure here precedes OS swapping	Sustained > 80%
WiredTiger cache dirty ratio	Dirty data accumulation strains flush capacity and increases memory pressure	Sustained > 10%
Memory RSS vs system memory	Approaching the limit triggers swap or OOM	RSS > 90% of system RAM
Connection count	Each connection adds ~1MB of thread stack memory	Growth correlating with an RSS spike
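
The ratio-based warning signs in the table can be evaluated with a few lines of shell. This is a sketch under stated assumptions: `pct` and `check_memory_signals` are hypothetical names, and the cache inputs are assumed to come from the `bytes currently in the cache`, `maximum bytes configured`, and `tracked dirty bytes in the cache` fields of db.serverStatus().wiredTiger.cache.

```shell
# Integer percentage of part relative to whole.
pct() {
    echo $(( $1 * 100 / $2 ))
}

# Print one line per breached warning sign; print nothing when all are healthy.
# Args: cache_used_bytes cache_max_bytes cache_dirty_bytes rss_kb mem_total_kb
check_memory_signals() {
    fill=$(pct "$1" "$2")
    dirty=$(pct "$3" "$2")
    rss=$(pct "$4" "$5")
    [ "$fill" -gt 80 ] && echo "cache fill ${fill}% > 80%"
    [ "$dirty" -gt 10 ] && echo "cache dirty ${dirty}% > 10%"
    [ "$rss" -gt 90 ] && echo "RSS ${rss}% of RAM > 90%"
    return 0
}
```

The thresholds only matter when sustained, so a single breach from this check is a prompt to look at the trend, not an alert on its own.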

Fixes

Reduce memory pressure immediately

If mongod is actively swapping, do not restart it as a first response. A restart triggers cache warmup, potential election churn, and connection storms. Kill unnecessary long-running operations with db.currentOp() and db.killOp() to free tickets and memory. If the working set exceeds RAM, reduce the WiredTiger cache size temporarily or move the node to a larger instance.

Set vm.swappiness to 1

A value of 1 tells the kernel to avoid swapping unless absolutely necessary. Do not set it to 0. A value of 0 disables proactive swap and increases the risk that the kernel kills mongod under sudden memory pressure rather than paging out cleanly.

Set it immediately:

sudo sysctl vm.swappiness=1

Persist it in /etc/sysctl.conf:

echo 'vm.swappiness = 1' | sudo tee -a /etc/sysctl.conf
sudo sysctl -p
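
Appending with `tee -a` adds a new line on every run, so repeated provisioning leaves duplicate or conflicting entries. A minimal sketch of an idempotent alternative follows; `persist_sysctl` is a hypothetical helper, and it assumes the file uses plain `key = value` lines.

```shell
# Set "key = value" in a sysctl-style file, replacing any existing
# assignment to the same key instead of appending a duplicate.
# Args: key value file
persist_sysctl() {
    key=$1
    value=$2
    file=$3
    tmp=$(mktemp) || return 1
    # Keep every line that is not an assignment to this key.
    # grep exits nonzero when nothing is kept, which is fine here.
    grep -v -E "^[[:space:]]*${key}[[:space:]]*=" "$file" > "$tmp" 2>/dev/null || true
    echo "${key} = ${value}" >> "$tmp"
    # Write through the original path so ownership and mode are preserved.
    cat "$tmp" > "$file" && rm -f "$tmp"
}

# Live usage (as root):
#   persist_sysctl vm.swappiness 1 /etc/sysctl.conf && sysctl -p
```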

Tradeoff: A low swappiness value keeps mongod resident, but it leaves the kernel less room to relieve pressure by paging, so OOM kills become more likely and may hit other processes first. If no swap is configured and swappiness is 1, the system has no emergency relief valve other than OOM kills.

Disable Transparent Huge Pages

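On MongoDB 7.0 and earlier, THP causes latency spikes and memory fragmentation. MongoDB 8.0 changed its memory allocator, and the MongoDB documentation for that release recommends leaving THP enabled, so confirm the guidance for your version before changing it. Check the current state: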

cat /sys/kernel/mm/transparent_hugepage/enabled

If the output includes [always] or [madvise], write never to the sysfs control file and persist the setting through your host’s init framework.
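
The sysfs file marks the active mode with square brackets, which makes the check easy to script. The sketch below assumes that format; `thp_active_mode` and `thp_needs_disable` are hypothetical helper names, and the write shown in the comments lasts only until the next reboot.

```shell
# Extract the active mode (the bracketed word) from THP sysfs output,
# for example "always madvise [never]" yields "never".
thp_active_mode() {
    echo "$1" | sed -n 's/.*\[\([a-z]*\)].*/\1/p'
}

# Succeeds when the active mode is anything other than "never".
thp_needs_disable() {
    [ "$(thp_active_mode "$1")" != "never" ]
}

# Live usage (as root; not persistent across reboots):
#   if thp_needs_disable "$(cat /sys/kernel/mm/transparent_hugepage/enabled)"; then
#       echo never > /sys/kernel/mm/transparent_hugepage/enabled
#   fi
```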

Tradeoff: Disabling THP slightly increases TLB pressure for workloads that would benefit from huge pages. For MongoDB, the latency stability gain outweighs this cost.

Configure NUMA interleaving

On multi-socket servers, run mongod with memory interleaved across all NUMA nodes to prevent one socket from saturating while others remain free:

numactl --interleave=all mongod ...

Also ensure the numad daemon is not running, because its dynamic placement conflicts with static interleaving.

Tradeoff: Interleaving adds minor cross-socket memory latency for localized access patterns, but prevents catastrophic imbalance.

Protect mongod from the OOM killer

Because mongod has a high RSS, the Linux OOM killer often selects it first. Set the OOM score adjustment to -1000 to exclude mongod from OOM killing:

echo -1000 | sudo tee /proc/$(pgrep -x mongod)/oom_score_adj

Tradeoff: Protecting mongod means another process will be killed instead. Ensure your host is not running other critical unprotected services that could cause a cascading failure if OOM-killed.
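
A value written to /proc applies only to the running process and is lost when mongod restarts. On systemd hosts, the `OOMScoreAdjust=` service directive can carry the setting across restarts. The following is a sketch under assumptions: `write_oom_dropin` is a hypothetical helper, the drop-in file name `oom.conf` is arbitrary, and the unit is assumed to be named `mongod.service`.

```shell
# Write a systemd drop-in that sets OOMScoreAdjust for a service.
# Args: dropin_directory [score, default -1000]
write_oom_dropin() {
    dir=$1
    score=${2:--1000}
    mkdir -p "$dir" || return 1
    printf '[Service]\nOOMScoreAdjust=%s\n' "$score" > "$dir/oom.conf"
}

# Live usage (as root; assumes the unit is mongod.service):
#   write_oom_dropin /etc/systemd/system/mongod.service.d
#   systemctl daemon-reload
# The new score takes effect the next time the service starts.
```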

Container-specific tuning

When mongod runs inside a container, do not rely on the WiredTiger cache default: depending on the MongoDB version and how the memory limit is exposed to the process, it may size itself against host RAM rather than the container limit. Explicitly set storage.wiredTiger.engineConfig.cacheSizeGB to roughly 50% of (container memory limit minus 1GB). Set vm.swappiness=1 on the host kernel. If your orchestrator uses cgroup-level swap controls, ensure they do not override the host setting.
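
The cache size for a given container limit can be worked out ahead of time. This sketch applies the rule of 50% of (limit minus 1GB) with the 256MB floor that WiredTiger uses for its default; `wt_cache_mb` and `mb_to_gb` are hypothetical helper names, and the limit is assumed to be known in MB.

```shell
# Cache size in MB for a memory limit in MB:
# 50% of (limit - 1024 MB), never below 256 MB.
wt_cache_mb() {
    size=$(( ($1 - 1024) / 2 ))
    [ "$size" -lt 256 ] && size=256
    echo "$size"
}

# Convert MB to a cacheSizeGB value with two decimals.
mb_to_gb() {
    awk -v mb="$1" 'BEGIN { printf "%.2f\n", mb / 1024 }'
}

# Example: an 8 GB container limit.
#   mb_to_gb "$(wt_cache_mb 8192)"
```

The resulting number is a starting point for storage.wiredTiger.engineConfig.cacheSizeGB; lower it further if other processes share the container.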

Prevention

Set vm.swappiness=1 before production.
Size the WiredTiger cache for the deployment. Reduce the limit if you co-host other software or run inside a container.
Monitor swap usage continuously. Any swap consumed by mongod is an emergency, not a warning.
Disable THP and configure NUMA at provision time. Treat these as standard host image hardening.
Right-size instances before data growth exceeds RAM. Track page fault rates and cache fill trends weekly to forecast runway.
Limit application connection pool sizes. Unbounded growth increases mongod RSS directly.

How Netdata helps

Track per-process swap usage for mongod. Any nonzero value is a critical signal.
Correlate page fault rates with MongoDB opLatencies to distinguish swapping from slow queries.
Alert on memory utilization approaching limits alongside connection count growth.
Monitor kernel settings such as vm.swappiness and THP status to detect configuration drift after host updates.

The Netdata solution

MongoDB monitoring with Netdata

Netdata monitors MongoDB with per-second metrics and automatic dashboards. Watch WiredTiger cache pressure, oplog window, connection counts, checkpoint stalls, and replication health in one place, correlated with the underlying host.
