MongoDB page faults high: working set exceeding memory after warmup

Hard page faults long after startup mean the active data set exceeds resident memory. On Linux, extra_info.page_faults counts major faults: the OS read data from disk because the page was missing from both the WiredTiger cache and the OS page cache. A brief spike after restart is normal during warmup, but sustained faults mean the working set does not fit. On EBS gp3, 50 faults per second can degrade latency. On NVMe, hundreds per second may be tolerable, but neither is free. Confirm the cause, distinguish warmup from pressure, and reduce the fault rate without guessing.

What this means

MongoDB uses a two-tier memory hierarchy. WiredTiger maintains its own uncompressed cache, which defaults to the larger of 50% of (RAM minus 1 GB) and 256 MB. When a document is not in the WiredTiger cache, WiredTiger may still find the compressed on-disk page in the OS page cache. A major page fault occurs only when neither layer holds the data, forcing a physical disk read. Sustained faults after warmup mean the active data set exceeds the combined memory of both tiers. This is worse than a WiredTiger cache miss served by the OS page cache: it is an OS-level signal that the node is memory-bound, and every fault adds disk I/O latency directly to the operation.

flowchart TD
    A[Query requests page] --> B{In WiredTiger cache?}
    B -->|Yes| C[Serve from WT cache]
    B -->|No| D{In OS page cache?}
    D -->|Yes| E[Read into WT cache]
    D -->|No| F["Major page fault<br/>disk I/O required"]
    E --> C
    F --> C
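
The default cache sizing mentioned above is easier to reason about with concrete numbers. The sketch below assumes the commonly documented default, the larger of 50% of (RAM minus 1 GiB) and 256 MiB; the function name `defaultWiredTigerCacheBytes` is illustrative and not a MongoDB API.

```javascript
const GiB = 1024 ** 3;
const MiB = 1024 ** 2;

// Approximate default WiredTiger cache size for a given amount of RAM.
// Assumption: default = max(50% of (RAM - 1 GiB), 256 MiB).
function defaultWiredTigerCacheBytes(ramBytes) {
  return Math.max(0.5 * (ramBytes - GiB), 256 * MiB);
}

// Example: on a 16 GiB host the cache defaults to about 7.5 GiB. The rest is
// shared by the OS page cache, connection threads, and any other processes.
const cacheGiB = defaultWiredTigerCacheBytes(16 * GiB) / GiB;
console.log(cacheGiB.toFixed(1) + " GiB"); // 7.5 GiB
```

Whatever the cache does not claim is what the OS page cache has to work with, which is why both tiers matter when judging whether the working set fits.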

Common causes

Cause	What it looks like	First thing to check
Working set growth or unindexed queries	Faults rise with disk read IOPS; `docsExamined` far exceeds `nreturned` in slow queries.	WiredTiger cache fill ratio and `db.currentOp()` for collection scans.
WiredTiger cache undersized or container limit ignored	Faults are high despite a modest active set; cache is capped far below available RAM.	`wiredTiger.cache.maximum bytes configured` against host or container memory limit.
Long-running snapshots pinning old versions	Cache fill is high but dirty ratio is low; faults persist with few new writes.	`db.currentOp()` for open transactions and `metrics.cursor.open.noTimeout` count.
External memory pressure or swap	Faults spike alongside system-level memory exhaustion; mongod RSS is stable but available memory is low.	`free -m` and `vmstat 1` for swap activity and system reclaim.
Inadequate storage for unavoidable faults	Fault rate is acceptable for NVMe but painful on EBS gp3; latency spikes correlate with fault spikes.	Storage device type and `iostat -x 1` for `await` and utilization.

Quick checks

Run these read-only commands to baseline the current state.

# Check system memory and swap pressure
free -m && vmstat 1 3

# Major page faults per mongod process
pgrep mongod | while read pid; do
  awk '{print "pid "$1" majflt:", $12}' /proc/$pid/stat
done

// Check WiredTiger cache fill, dirty ratio, and configured size
var c = db.serverStatus().wiredTiger.cache;
var max = c["maximum bytes configured"];
var used = c["bytes currently in the cache"];
var dirty = c["tracked dirty bytes in the cache"];
print("Cache used: " + (100 * used / max).toFixed(1) + "%");
print("Cache dirty: " + (100 * dirty / max).toFixed(1) + "%");
print("Max configured: " + (max / 1024 / 1024 / 1024).toFixed(1) + " GB");

// Check cumulative page faults (compute delta over 60s for a rate)
db.serverStatus().extra_info.page_faults

// Check for long-running operations and open transactions
db.currentOp({ "active": true, "secs_running": { "$gt": 60 } }).inprog.forEach(function(op) {
  print(op.opid + " | " + op.op + " | " + op.secs_running + "s | " + op.ns);
});

// Check for cursors that never time out and can pin snapshots
printjson(db.serverStatus().metrics.cursor)

# Check disk I/O latency and utilization
iostat -x 1 5

// Check resident memory vs expected baseline
var status = db.serverStatus();  // one call, so both values come from the same snapshot
print("RSS MB: " + status.mem.resident);
print("Connections: " + status.connections.current);
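
Because `extra_info.page_faults` is cumulative, the number that matters is the delta between two samples. A minimal sketch of that arithmetic follows; the helper name `faultsPerSecond` is hypothetical, and the restart handling is an assumption about how you might want to treat a counter that goes backwards.

```javascript
// Convert two cumulative extra_info.page_faults samples into faults per second.
// Returns null when the interval is not positive or the counter went backwards
// (for example, mongod restarted between samples), since no rate can be trusted.
function faultsPerSecond(firstCount, secondCount, elapsedSeconds) {
  if (elapsedSeconds <= 0 || secondCount < firstCount) return null;
  return (secondCount - firstCount) / elapsedSeconds;
}

// In mongosh, the samples could be taken like this (convert to a plain number
// first if your shell returns a wrapped integer type):
//   var a = db.serverStatus().extra_info.page_faults;
//   sleep(60000);
//   var b = db.serverStatus().extra_info.page_faults;
console.log(faultsPerSecond(120000, 123600, 60)); // 60
```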

How to diagnose it

Confirm the fault rate is abnormal. Sample extra_info.page_faults twice over 60 seconds and compute the delta. If the node recently restarted, high faults are expected while the cache warms. Wait until the working set should have loaded before treating faults as abnormal.
Check the two-tier memory state. Inspect WiredTiger cache fill ratio. If it is below 70% and faults are high, the working set likely exceeds the OS page cache because other processes are consuming RAM or the OS is reclaiming cache aggressively. If cache fill is above 80%, WiredTiger itself is under pressure.
Identify snapshot retention. Run db.currentOp() filtered for transactions and aggregations running longer than 60 seconds. Check metrics.cursor.open.noTimeout. If either is elevated, old snapshots are preventing WiredTiger from evicting historical versions, reducing the effective cache available for the working set.
Correlate with query efficiency. Scan the slow query log for COLLSCAN or queries where docsExamined vastly exceeds nreturned. A new unindexed query can pull far more data into memory than necessary, displacing the real working set and causing faults on subsequent accesses.
Validate the cache sizing. Compare maximum bytes configured to the host's physical RAM. In containers, set the cache size explicitly based on the container limit, because the default formula may use host RAM rather than the container limit. A container with a 4 GB limit on a 64 GB host can experience OOM kills if the cache is sized to host RAM, or suffer cache pressure if capped too low.
Check storage backend latency. Run iostat -x 1. If await is high during fault spikes, the disk subsystem is the bottleneck. On EBS gp2, check burst balance. On gp3, verify provisioned IOPS and throughput are not saturated.
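
The cache fill reasoning in the second step can be sketched as a small triage function. The 70% and 80% thresholds come from the steps above; the function name, the return labels, and the "inconclusive" middle band are illustrative assumptions, not MongoDB terminology.

```javascript
// Rough triage of the two-tier memory state while faults are elevated.
// fillPct is the WiredTiger cache fill ratio as a percentage (0-100).
function classifyMemoryState(fillPct, faultsHigh) {
  if (!faultsHigh) return "no-fault-pressure";
  if (fillPct > 80) return "wiredtiger-cache-pressure"; // cache itself is full
  if (fillPct < 70) return "os-page-cache-pressure";    // look outside mongod
  return "inconclusive";                                // 70-80%: check other signals
}

console.log(classifyMemoryState(55, true)); // os-page-cache-pressure
console.log(classifyMemoryState(88, true)); // wiredtiger-cache-pressure
```

A result in the middle band is a prompt to continue with the snapshot, query efficiency, and storage checks rather than a diagnosis on its own.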

Metrics and signals to monitor

Signal	Why it matters	Warning sign
`extra_info.page_faults` rate	Hard faults mean disk I/O on every miss.	Sustained rate above 50/s on EBS gp3, or trending upward after warmup.
WiredTiger cache fill ratio	Shows if the working set exceeds the internal cache.	Above 80% sustained, especially with rising eviction rates.
WiredTiger cache dirty ratio	Dirty data accumulation can displace clean pages and worsen faults.	Above 10% sustained; above 20% risks checkpoint stalls.
`metrics.cursor.open.noTimeout`	Each cursor can hold a snapshot, pinning old versions.	Above zero is a risk; above 10 strongly indicates cache pressure from snapshots.
`currentOp` max age	One runaway query can flood the cache with irrelevant pages.	Any non-background operation above 300 seconds.
System available memory / swap	External memory pressure steals page cache from MongoDB.	Available memory near zero or any swap activity.
Disk read `await` (`iostat`)	Confirms whether faults are actually causing queueing.	`await` above 20 ms sustained during fault spikes.
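
As a rough illustration, the warning signs in the table can be checked together against one sample of metrics. The threshold values are taken from the table (the 50/s fault rate applies to EBS gp3); the field names such as `faultsPerSec` and the shape of the `sample` object are hypothetical and would need to match whatever your collector produces.

```javascript
// Lower warning thresholds from the table above. Field names are illustrative.
const THRESHOLDS = {
  faultsPerSec: 50,     // sustained, EBS gp3
  cacheFillPct: 80,
  cacheDirtyPct: 10,
  noTimeoutCursors: 0,
  maxOpAgeSecs: 300,
  readAwaitMs: 20,
};

// Return the names of every signal in the sample that exceeds its threshold.
// Signals missing from the sample are skipped rather than treated as healthy.
function warningSigns(sample) {
  return Object.keys(THRESHOLDS).filter(
    (key) => typeof sample[key] === "number" && sample[key] > THRESHOLDS[key]
  );
}

const sample = {
  faultsPerSec: 72, cacheFillPct: 83, cacheDirtyPct: 4,
  noTimeoutCursors: 0, maxOpAgeSecs: 45, readAwaitMs: 31,
};
console.log(warningSigns(sample)); // [ 'faultsPerSec', 'cacheFillPct', 'readAwaitMs' ]
```

A single sample only shows a moment in time; the table's "sustained" qualifiers mean these checks are more useful when applied over a window.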

Fixes

Reduce the working set or improve locality

Add missing indexes or optimize queries so MongoDB touches fewer pages. Use db.collection.aggregate([{ $indexStats: {} }]) to verify indexes are being used. A single new collection scan can displace a previously stable working set. Tradeoff: write amplification from additional indexes and the I/O cost of background builds.
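
To make the `$indexStats` check concrete, the sketch below filters its output for indexes with no recorded accesses. It assumes each result document carries `name` and `accesses.ops`, as `$indexStats` returns; the helper name `unusedIndexes` is hypothetical, and the sample data is invented for illustration.

```javascript
// List indexes with zero recorded accesses, skipping the mandatory _id index.
// Number() is applied in case the shell returns ops as a wrapped integer type.
function unusedIndexes(indexStats) {
  return indexStats
    .filter((s) => s.name !== "_id_" && Number(s.accesses.ops) === 0)
    .map((s) => s.name);
}

// In mongosh the input could come from:
//   db.orders.aggregate([{ $indexStats: {} }]).toArray()
// ("orders" is a placeholder collection name.)
const exampleStats = [
  { name: "_id_", accesses: { ops: 0 } },
  { name: "customer_1", accesses: { ops: 18234 } },
  { name: "legacy_status_1", accesses: { ops: 0 } },
];
console.log(unusedIndexes(exampleStats)); // [ 'legacy_status_1' ]
```

Access counters start over when mongod restarts, so a zero count shortly after a restart is weak evidence; check the `accesses.since` timestamp before dropping anything.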

Right-size the WiredTiger cache

If the cache is too small for the working set, increase it with --wiredTigerCacheSizeGB or storage.wiredTiger.engineConfig.cacheSizeGB in the configuration file. Do not exceed roughly 80% of available RAM; the OS page cache and connection thread stacks also need space. In containers, set this explicitly based on the container's memory limit, not the host's RAM. Tradeoff: less RAM for the OS page cache, which can paradoxically increase faults if overdone.
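
A quick sanity check on a proposed cache size, using the rough 80% ceiling from the paragraph above, might look like the following. The helper name `checkCacheSize` is hypothetical; `memoryLimitGB` is assumed to be the container limit when running in a container, or host RAM otherwise.

```javascript
// Compare a proposed cacheSizeGB against a ceiling expressed as a fraction of
// the memory actually available to mongod (container limit or host RAM).
function checkCacheSize(proposedGB, memoryLimitGB, ceilingFraction = 0.8) {
  const ceilingGB = memoryLimitGB * ceilingFraction;
  return { ceilingGB: ceilingGB, withinCeiling: proposedGB <= ceilingGB };
}

// A 4 GB container on a 64 GB host: the ceiling is derived from 4, not 64.
console.log(checkCacheSize(3.5, 4).withinCeiling); // false
console.log(checkCacheSize(2, 4).withinCeiling);   // true
```

Passing this check does not mean the value is optimal. It only flags sizes that leave too little room for the OS page cache and connection overhead.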

Free pinned snapshots

Kill unnecessarily long-running operations via db.killOp(). Identify applications leaving noCursorTimeout cursors open and close them. This immediately increases the pool of evictable pages. Warning: killing operations is disruptive to clients and can interrupt in-flight transactions or ETL jobs.
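
Before calling `db.killOp()`, it helps to review the candidates in order of age. The sketch below only selects and sorts; it kills nothing. It assumes the `inprog` array shape returned by `db.currentOp()` (`opid`, `op`, `ns`, `active`, `secs_running`); the helper name `killCandidates` and the sample operations are hypothetical.

```javascript
// Select active operations older than minSecs, oldest first, for manual review.
function killCandidates(inprog, minSecs) {
  return inprog
    .filter((op) => op.active && op.secs_running > minSecs)
    .sort((a, b) => b.secs_running - a.secs_running)
    .map((op) => ({ opid: op.opid, op: op.op, ns: op.ns, secs: op.secs_running }));
}

// In mongosh: killCandidates(db.currentOp().inprog, 300)
const exampleOps = [
  { opid: 101, op: "query", ns: "shop.orders", active: true, secs_running: 12 },
  { opid: 102, op: "command", ns: "shop.events", active: true, secs_running: 1840 },
  { opid: 103, op: "getmore", ns: "shop.audit", active: true, secs_running: 610 },
  { opid: 104, op: "query", ns: "shop.orders", active: false, secs_running: 900 },
];
console.log(killCandidates(exampleOps, 300).map((c) => c.opid)); // [ 102, 103 ]
```

Keeping selection separate from the kill step leaves room to exclude known ETL jobs or in-flight transactions that should be allowed to finish.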

Reduce memory competition

Shrink application connection pool sizes to reduce thread stack overhead, or move non-MongoDB workloads off the node. Ensure vm.swappiness is set to 1 so the OS prefers reclaiming page cache over swapping. If swap is active, faults become far more expensive.

Scale out or archive cold data

If the working set exceeds what can fit in memory economically, shard the collection to spread the working set across nodes, or archive cold data to reduce the active set. Tradeoff: operational complexity.

Upgrade storage if faults are unavoidable

If the working set cannot be reduced and memory cannot be increased, ensure the storage layer can absorb the fault rate. Moving from EBS gp3 to NVMe-backed instances turns a latency crisis into manageable background noise.

Prevention

Trend cache fill and dirty ratio over weeks. A steady climb from 60% to 75% gives early warning that the working set is approaching limits.
Audit index usage monthly. Unused indexes consume cache and write bandwidth. Missing indexes cause scans that bloat the effective working set.
Monitor connection churn, not just connection count. High totalCreated rates increase memory fragmentation and RSS pressure.
Gate alerts on uptime. Suppress page fault alerts during the first 30 minutes after restart to avoid false positives during warmup.
Track currentOp max age continuously. Catching a runaway query at 60 seconds prevents it from flooding cache and causing a fault storm.
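
The uptime gate from the list above can be expressed in a few lines. This is a sketch, assuming a 30-minute warmup window and the 50 faults per second threshold used elsewhere in this guide; the function name `shouldAlertOnFaults` is hypothetical, and both values should be tuned to your storage and dataset.

```javascript
const WARMUP_SECONDS = 30 * 60; // suppress fault alerts for 30 minutes after restart

// Decide whether a page fault rate should raise an alert.
// uptimeSeconds would typically come from db.serverStatus().uptime.
function shouldAlertOnFaults(uptimeSeconds, faultsPerSec, threshold = 50) {
  if (uptimeSeconds < WARMUP_SECONDS) return false; // still warming the cache
  return faultsPerSec > threshold;
}

console.log(shouldAlertOnFaults(600, 400));  // false: restarted 10 minutes ago
console.log(shouldAlertOnFaults(7200, 400)); // true: 2 hours up, still faulting
```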

How Netdata helps

Netdata correlates extra_info.page_faults with WiredTiger cache fill, dirty ratio, and eviction rates. OS-level disk latency and mongod RSS on the same dashboard distinguish external memory pressure from internal cache saturation. Historical tracking of long-running operation age and cursor counts shows which query or noTimeout cursor preceded a fault spike. Connection churn is shown as a rate, surfacing thread-creation overhead that competes with the page cache. Second-granularity collection catches brief fault bursts that slower tools average away.
