MongoDB cache too small: sizing the WiredTiger cache for your working set
When MongoDB latency doubles and disk read IOPS climb, operators usually check indexes and the query planner first. If db.currentOp() shows no runaway query and the slow query log is quiet, the culprit is often the WiredTiger cache.
WiredTiger maintains its own in-memory cache of uncompressed B-tree pages, separate from the OS page cache. MongoDB defaults the cache to max(0.5 * (RAM - 1 GB), 256 MB). That default works for a single mongod on a dedicated host, but it breaks down in containers, multi-tenant deployments, and during organic data growth. Once the working set exceeds the cache, reads fault to disk, pages are decompressed, and eviction threads compete with application threads for CPU.
The sections below cover sizing mechanics, common misconfigurations, and the production signals that precede inline eviction stalls.
What it is and why it matters
WiredTiger stores documents and indexes as uncompressed B-tree pages in its own cache. The OS page cache still holds the compressed on-disk representation. On a cache miss, WiredTiger reads from disk, decompresses the page, and loads it into the cache. If the cache is full, pages are evicted.
Because a page evicted from WiredTiger may still reside in the OS page cache, re-access avoids a physical disk read but still pays the decompression cost. When both caches are too small to cover the working set, accesses that miss both fall through to a physical disk read.
The default cache size is max(0.5 * (RAM - 1 GB), 256 MB). A 16 GB instance yields 7.5 GB; a 2 GB instance yields 512 MB, and an instance with 1.5 GB of RAM or less hits the 256 MB floor. You can override the default with --wiredTigerCacheSizeGB, though the change requires a process restart.
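The arithmetic can be checked with a few lines of Python. This is a sketch of the formula as stated above, not mongod's own implementation; the function name `default_cache_gb` is made up for illustration.

```python
def default_cache_gb(ram_gb: float) -> float:
    """Default WiredTiger cache size in GB, per the formula in the text:
    max(0.5 * (RAM - 1 GB), 256 MB)."""
    floor_gb = 0.25  # 256 MB expressed in GB
    return max(0.5 * (ram_gb - 1.0), floor_gb)


if __name__ == "__main__":
    # Example RAM sizes chosen for illustration only.
    for ram in (1, 16, 64):
        print(f"{ram} GB RAM -> {default_cache_gb(ram):.2f} GB cache")
```

The floor only matters on very small instances; on anything larger, the result is simply half of what remains after setting 1 GB aside.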
The working set is the subset of data and indexes your application touches in a typical window. It is rarely the full database size. A 1 TB database with a 10 GB working set runs comfortably in a 12 GB cache. A 100 GB database with a 90 GB working set suffocates in the same cache. When the working set exceeds the cache, the database enters constant churn: pages read in and evicted out, page faults rise, and latency couples to disk performance.
How it works
WiredTiger uses an LRU-like eviction system with multiple thresholds. Background eviction threads begin cleaning pages when the cache reaches roughly 80 percent fill. If fill climbs toward 95 percent, eviction becomes aggressive. If background threads cannot keep pace, application threads perform eviction inline before resuming their own work. That is the moment user-visible latency spikes.
WiredTiger distinguishes between clean and dirty eviction. Clean pages, which have not been modified since the last checkpoint, can be dropped quickly. Dirty pages must be reconciled and written to disk before their memory is reclaimed. A write-heavy workload therefore stresses both eviction and checkpoint subsystems simultaneously. If the cache is oversized relative to disk throughput, dirty data accumulates faster than the storage layer can flush it, turning a cache-size problem into a disk-queuing problem.
Dirty pages also drive checkpoint behavior. WiredTiger checkpoints every 60 seconds by default. The dirty ratio is the percentage of cache holding modified data not yet flushed to disk. When dirty data climbs above 20 percent of the cache, aggressive eviction triggers. If checkpoints cannot flush fast enough, write stalls become likely as threads wait for clean pages.
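The thresholds described so far can be condensed into a small classifier. This is a minimal sketch that assumes the approximate figures quoted above (80 and 95 percent fill, 20 percent dirty); the function name and the three labels are hypothetical, and real WiredTiger behavior depends on the eviction settings of the running server.

```python
def eviction_state(fill_pct: float, dirty_pct: float) -> str:
    """Map cache fill and dirty percentages to the eviction behavior
    described in the text. Thresholds are the approximate figures quoted
    there, not values read from a live server."""
    if fill_pct >= 95 or dirty_pct > 20:
        # Aggressive eviction; application threads may have to evict inline.
        return "aggressive"
    if fill_pct >= 80:
        # Background eviction threads are working; no stall expected yet.
        return "background"
    # Below the level where background eviction starts.
    return "idle"
```

For instance, `eviction_state(85, 3)` falls in the background range, while a cache that is only 70 percent full but 25 percent dirty is already in the aggressive range.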
The cache retains old document versions for MVCC. Long-running transactions, cursors with noCursorTimeout, or large aggregation pipelines can pin old snapshots, preventing eviction and inflating the fill ratio even when the working set itself is stable.
flowchart TD
A[Host RAM] --> B[OS Page Cache]
A --> C[WiredTiger Cache]
C --> D[Uncompressed B-tree Pages]
D --> E{Working Set Fit}
E -->|Fits| F[Low latency queries]
E -->|Exceeds| G[Cache miss]
G --> H[Disk read into OS cache]
H --> I[Decompress to WiredTiger]
I --> J[Eviction pressure]
J --> K[Application threads evict]
K --> L[Latency spikes]
Where it shows up in production
Containers are the most common place cache sizing goes wrong. Modern mongod versions attempt to detect cgroup memory limits, but detection is not guaranteed across all runtimes, versions, and cgroup configurations. If detection fails, mongod sizes the cache against host RAM. On a 64 GB host, an unconfigured mongod in a 4 GB pod could size its cache at roughly 31.5 GB and be OOM-killed as soon as the cache grows past the pod limit. Explicitly set --wiredTigerCacheSizeGB to roughly 50 percent of (container memory limit minus 1 GB), and always leave headroom for the runtime and sidecars.
On bare metal or VMs, the default formula assumes a single mongod instance owns the machine. Running multiple mongod processes, or collocating mongod with application servers, without reducing each cache proportionally causes aggregate memory oversubscription. The OS may swap, or the OOM killer may terminate processes.
Consider setting vm.swappiness to 1 on dedicated database hosts. Avoid disabling swap entirely unless you have confirmed OOM killer behavior is acceptable, because a system without swap can kill processes rather than degrade gracefully. Changing this requires root and should be validated in a non-production environment first.
Organic data growth is the silent cause. A cache that was generous at launch becomes tight six months later as hot indexes and documents expand. Because changing the cache size requires a rolling restart, operators often tolerate gradual degradation until a traffic spike converts a 5 ms p99 into a 500 ms p99.
Sizing rules and tradeoffs
Use the default only when mongod is the sole memory consumer on a stable, dedicated host. In containers, calculate --wiredTigerCacheSizeGB as roughly 50 percent of (container memory limit minus 1 GB), respecting the 256 MB floor. Leave headroom for connection buffers, thread stacks, the OS page cache, and internal overhead of roughly 500 MB to 1 GB.
For example, a pod with a 16 GB memory limit should set the cache to roughly 7 GB, leaving the remaining 9 GB for the OS page cache, connection overhead, and the container runtime. If the pod also runs a monitoring sidecar or Istio proxy, subtract those allocations before calculating the cache.
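The container rule can be sketched the same way. This assumes the rule is read as 0.5 × (usable memory − 1 GB), the same shape as the default formula; `container_cache_gb` and its `reserved_gb` parameter are hypothetical names introduced here for the sidecar subtraction.

```python
def container_cache_gb(limit_gb: float, reserved_gb: float = 0.0) -> float:
    """Suggested --wiredTigerCacheSizeGB value for a container.

    limit_gb:    container memory limit in GB.
    reserved_gb: memory set aside for sidecars or proxies (hypothetical
                 parameter; it is subtracted before applying the formula).
    """
    usable_gb = limit_gb - reserved_gb
    cache_gb = max(0.5 * (usable_gb - 1.0), 0.25)  # respect the 256 MB floor
    return round(cache_gb, 2)
```

A 16 GB limit with nothing reserved gives 7.5 GB, which an operator might round down to the roughly 7 GB used in the example; reserving 1 GB for a sidecar gives 7.0 GB directly.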
Do not size the cache to your total data set. Size it to your working set plus headroom for writes. If you do not know your working set size, use page fault rate as a proxy: after warmup, a sustained high rate of faults indicates the working set exceeds memory.
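Page fault counters are cumulative, so the rate has to be derived from two samples. The sketch below assumes the counter comes from somewhere like `extra_info.page_faults` in `db.serverStatus()` on Linux; the function name is illustrative, and what counts as "high" depends on your own baseline after warmup.

```python
def page_fault_rate(prev_faults: int, curr_faults: int, interval_s: float) -> float:
    """Faults per second between two samples of a cumulative counter."""
    if interval_s <= 0:
        raise ValueError("interval_s must be positive")
    delta = curr_faults - prev_faults
    if delta < 0:
        # The counter went backwards, which suggests a process restart;
        # treat the current value as the number of faults since then.
        delta = curr_faults
    return delta / interval_s
```

Sampling once a minute and comparing the result against a post-warmup baseline is usually enough to see whether the rate is sustained or just a brief burst.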
Write-heavy workloads need extra capacity for dirty pages. Read-heavy workloads benefit from a larger cache, but starving the OS page cache reduces the efficiency of compressed disk reads because compressed pages must be re-read from disk more frequently.
A cache fill of 75 to 80 percent with a low dirty ratio and no application-thread eviction is healthy. WiredTiger is designed to run with the cache close to its eviction target rather than mostly empty. The dangerous state is 95 percent fill combined with rising application-thread evictions and a climbing dirty ratio.
If you must choose between slightly too small and slightly too large, err slightly large, provided the total still fits within the host or container memory limit. The cost of a slightly oversized cache is mild memory pressure. The cost of an undersized cache is a latency cascade that can exhaust tickets and stall the instance.
Signals to watch in production
| Signal | What it tells you | Threshold to act |
|---|---|---|
| Cache fill ratio | Occupied percentage of the configured cache. | Sustained >80% with rising eviction rates. |
| Dirty ratio | Write pressure versus checkpoint flush speed. | >15% is warning; >20% is critical and triggers aggressive eviction. |
| Application-thread evictions | Query threads forced to do eviction work. | Any sustained nonzero count adds latency to user operations. |
| Page faults | Working set is leaving memory. | Sustained high rate after warmup means disk is the hot path. |
| Checkpoint duration | Time to flush dirty data. | >30 seconds is concerning; >60 seconds risks write stalls. |
| Available read/write tickets | Storage engine concurrency headroom. | <25% of total available signals queuing and contention. |
Pull these metrics from db.serverStatus().wiredTiger.cache. Key fields include "bytes currently in the cache", "maximum bytes configured", and "tracked dirty bytes in the cache". Compare current bytes to maximum bytes to derive the fill ratio. Compare dirty bytes to maximum bytes to derive the dirty ratio, so that it lines up with the 20 percent of cache threshold. If the count of pages evicted by application threads grows steadily while background eviction metrics plateau, the eviction system is saturated.
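A minimal sketch of the ratio calculation, assuming the cache sub-document has already been fetched as a Python dict. The helper name `cache_ratios` is made up, and the sample values are invented for illustration. The dirty percentage is taken against the configured maximum so it can be compared directly with the 20 percent of cache threshold.

```python
def cache_ratios(cache: dict) -> dict:
    """Derive fill and dirty percentages from the wiredTiger.cache
    sub-document of db.serverStatus()."""
    max_bytes = cache["maximum bytes configured"]
    used_bytes = cache["bytes currently in the cache"]
    dirty_bytes = cache["tracked dirty bytes in the cache"]
    return {
        "fill_pct": 100.0 * used_bytes / max_bytes,
        # Dirty share of the configured cache, to match the 20 percent threshold.
        "dirty_pct": 100.0 * dirty_bytes / max_bytes,
    }


# Hypothetical sample values, not output from a real server.
sample = {
    "maximum bytes configured": 8 * 1024**3,
    "bytes currently in the cache": 6 * 1024**3,
    "tracked dirty bytes in the cache": 1 * 1024**3,
}
print(cache_ratios(sample))  # {'fill_pct': 75.0, 'dirty_pct': 12.5}
```

With PyMongo, the input dict would typically come from `client.admin.command("serverStatus")["wiredTiger"]["cache"]`; exact field names can vary between server versions, so check them against your own output.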
How Netdata helps
Netdata collects wiredTiger.cache metrics from db.serverStatus(), exposing fill ratio, dirty ratio, and eviction counters without manual polling.
Correlating cache utilization with system.pgpgin and disk IOPS distinguishes a cache miss storm from a slow disk subsystem. If cache fill and pgpgin rise together while disk latency stays flat, the working set has exceeded memory. If disk latency rises in parallel, the storage layer itself is the bottleneck.
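That rule of thumb can be written as a small decision function. This is only a sketch: the three boolean flags are assumed to come from whatever trend detection you already run on these charts, and the function name and return labels are hypothetical.

```python
def classify_read_pressure(cache_fill_rising: bool,
                           pgpgin_rising: bool,
                           disk_latency_rising: bool) -> str:
    """Apply the correlation rule from the text to three trend flags."""
    if not (cache_fill_rising and pgpgin_rising):
        # The rule only applies when cache fill and page-ins rise together.
        return "inconclusive"
    if disk_latency_rising:
        # Disk latency is rising in parallel: the storage layer is the bottleneck.
        return "storage bottleneck"
    # Disk latency is flat: the working set has outgrown memory.
    return "working set exceeds memory"
```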
Composite alerts on cache fill combined with application-thread evictions catch the cascade before ticket exhaustion and request queue depth spikes.
Per-second resolution on opLatencies and cache metrics shows whether a latency spike coincides with eviction pressure or an independent query regression.







