MongoDB RSS growing without cache growth: leaks, threads, and tcmalloc fragmentation
db.serverStatus().mem.resident climbs while WiredTiger cache utilization stays flat and the host is not swapping. Virtual memory is larger than RSS by design and is not an alert target. Only RSS reflects physical memory pressure. When RSS grows without cache growth, the problem lives outside the storage engine.
This pattern points to one of three areas: tcmalloc heap retention and fragmentation, per-connection thread stack accumulation, or unbounded internal allocations from cursors, plan caches, or aggregation pipelines. Each connection reserves roughly 1 MB of stack space, so a connection storm can add gigabytes of RSS in minutes. TCMalloc caches freed memory in per-thread or per-CPU arenas, which inflates RSS independently of the WiredTiger cache.
Version-specific allocator changes complicate the picture. MongoDB 8.0 switched to a per-CPU tcmalloc implementation that changes THP behavior. MongoDB 7.0 introduced a confirmed memory leak in the Slot-Based Execution plan cache (SERVER-96924). Container deployments add another wrinkle: MongoDB may detect host RAM instead of the container limit, leaving the cache unbounded relative to the cgroup.
Use the read-only checks below to isolate the source before restarting. A restart drops RSS and erases the diagnostic state you need to prevent recurrence.
What this means
WiredTiger cache is a managed buffer pool with its own memory budget (cacheSizeGB). When cache utilization is flat but RSS rises, the additional memory comes from the C++ heap (tcmalloc), thread stacks, or internal data structures. TCMalloc retains deallocated blocks in per-thread or per-CPU caches to reduce lock contention. This cached memory counts toward RSS but is invisible to WiredTiger.
MongoDB uses one thread per connection. Each backend thread reserves up to 1 MB of virtual address space for its stack, with typical usage in the tens to hundreds of kilobytes. At thousands of connections, thread stacks alone can consume multiple gigabytes of RSS.
Internal structures can also balloon. Aggregation pipeline stages allocate memory outside the WiredTiger cache, capped at 100 MB per stage by default. Cursors with noCursorTimeout hold snapshots open, pinning memory until they close. The query plan cache, particularly in MongoDB 7.0, has exhibited unbounded growth under specific query patterns.
flowchart TD
A[RSS growing] --> B{Cache flat?}
B -->|Yes| C[Non-cache growth]
C --> D[tcmalloc retention]
C --> E[Connection threads]
C --> F[Cursor leaks]
C --> G[Plan cache leak]
C --> H[Aggregation memory]
B -->|No| I[See cache pressure guides]
Common causes
| Cause | What it looks like | First thing to check |
|---|---|---|
| TCMalloc heap retention and fragmentation | RSS exceeds active allocations; pageheap_free_bytes plus total_free_bytes is high | db.serverStatus().tcmalloc |
| Connection thread stack accumulation | RSS spikes correlate with connection count spikes; current in the thousands | db.serverStatus().connections |
| Cursor or aggregation memory leak | open.noTimeout or open.total growing; long-running aggregations | db.serverStatus().metrics.cursor and db.currentOp() |
| Plan cache leak (MongoDB 7.0 SBE) | Unbounded RSS growth on 7.0 with complex $in arrays; plan cache size climbing | db.serverStatus().metrics.query.planCacheTotalSizeEstimateBytes |
| Container cache misconfiguration | RSS approaches container memory limit while cacheSizeGB is sized for host RAM | db.serverStatus().wiredTiger.cache maximum bytes vs cgroup limit |
Quick checks
# Compare RSS to WiredTiger cache limit
mongosh --quiet --eval 'var s=db.serverStatus(); print("RSS MB: " + s.mem.resident); print("Cache max GB: " + (s.wiredTiger.cache["maximum bytes configured"]/1024/1024/1024).toFixed(1));'
// Check tcmalloc retained memory
var tc = db.serverStatus().tcmalloc;
// These counters sit in the nested "tcmalloc" sub-document, next to "generic"
print("Retained bytes: " + (tc.tcmalloc.pageheap_free_bytes + tc.tcmalloc.total_free_bytes));
print("Heap size to allocated ratio: " + (tc.generic.heap_size / tc.generic.current_allocated_bytes).toFixed(2));
// Check connection count and churn
var c = db.serverStatus().connections;
printjson({current: c.current, available: c.available, totalCreated: c.totalCreated});
// Check cursor state
printjson(db.serverStatus().metrics.cursor);
// Check plan cache size
print("Plan cache bytes: " + db.serverStatus().metrics.query.planCacheTotalSizeEstimateBytes);
// Find long-running operations
db.currentOp({ "active": true, "secs_running": { "$gt": 60 } }).inprog.forEach(function(op) {
print(op.opid + " | " + op.op + " | " + op.secs_running + "s | " + op.ns);
});
# Check cgroup memory limit if containerized
cat /sys/fs/cgroup/memory/memory.limit_in_bytes 2>/dev/null || cat /sys/fs/cgroup/memory.max 2>/dev/null
How to diagnose it
- Confirm the cache is flat. Sample db.serverStatus().wiredTiger.cache fill ratio and dirty ratio over time. Stable values below 80 percent rule out cache-driven growth.
- Compare RSS to the expected baseline. Expected RSS is approximately cacheSizeGB plus 500 MB to 1 GB of internal overhead plus connections.current multiplied by roughly 1 MB. If actual RSS exceeds this by more than 20 percent, continue.
- Check tcmalloc stats. Sum pageheap_free_bytes and total_free_bytes. If this sum represents a large portion of the RSS gap, the cause is fragmentation or allocator caching rather than a leak.
- Check connection count. If current is high (thousands) and correlates with the RSS timeline, thread stacks are the likely source. A high totalCreated delta indicates churn.
- Check cursor state. Elevated open.noTimeout or growing open.total without corresponding workload means cursors are leaking. Each one may hold a snapshot and memory.
- Check plan cache size on MongoDB 7.0. Continuous growth of planCacheTotalSizeEstimateBytes with complex $in arrays suggests the SBE plan cache leak (SERVER-96924).
- Check for aggregation memory pressure. Review db.currentOp() for long-running aggregations.
- Verify container memory limits. Ensure cacheSizeGB is sized for the container limit, not host RAM.
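The baseline comparison in the second step can be sketched as a small helper. This is a rough illustration rather than an official formula: the function names, the 1 GB overhead default, and the sample document are assumptions, and the input only mimics the fields of db.serverStatus() that the steps reference.

```javascript
// Sketch: estimate expected RSS from a serverStatus-like document.
// overheadMB and stackMBPerConn are assumed defaults based on the rule of thumb above.
function expectedRssMB(status, overheadMB = 1024, stackMBPerConn = 1) {
  const cacheMB = status.wiredTiger.cache["maximum bytes configured"] / (1024 * 1024);
  return cacheMB + overheadMB + status.connections.current * stackMBPerConn;
}

// Compare actual RSS to the estimate and flag gaps above the tolerance (20 percent).
function rssGap(status, tolerance = 0.20) {
  const expectedMB = expectedRssMB(status);
  const actualMB = status.mem.resident; // mem.resident is reported in MiB
  const excessRatio = (actualMB - expectedMB) / expectedMB;
  return { expectedMB, actualMB, excessRatio, investigate: excessRatio > tolerance };
}

// Hypothetical sample: 8 GiB cache, 2000 connections, 16 GiB resident.
const sample = {
  mem: { resident: 16384 },
  connections: { current: 2000 },
  wiredTiger: { cache: { "maximum bytes configured": 8 * 1024 ** 3 } },
};
console.log(rssGap(sample));
```

In mongosh, passing db.serverStatus() in place of the sample object should work, provided the deployment reports the same fields.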
Metrics and signals to monitor
| Signal | Why it matters | Warning sign |
|---|---|---|
| mem.resident | Physical memory consumed by mongod | Exceeds (cacheSizeGB + 2 GB + connections x 1 MB) by more than 20 percent |
| tcmalloc retained bytes | Allocator-cached memory inflates RSS independently of cache | pageheap_free_bytes + total_free_bytes grows steadily or exceeds 20 percent of RSS |
| connections.current | Each connection adds roughly 1 MB stack RSS | Sustained count greater than 1000 or rapid spikes |
| metrics.cursor.open.noTimeout | Leaked cursors pin snapshots and memory | Count greater than 10 or growing steadily |
| metrics.query.planCacheTotalSizeEstimateBytes | Plan cache leak indicator on affected versions | Monotonic growth without workload change |
| wiredTiger.cache fill ratio | Rules out cache-driven growth | Flat while RSS climbs |
| opcounters.getmore | High cursor iteration rates can indicate leaked or large result sets | Spike without corresponding query increase |
| currentOp max duration | Long-running aggregations allocate outside cache | Operations exceeding 300 seconds |
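
Several warning signs in the table depend on a value that "grows steadily" or shows "monotonic growth". One possible way to test a series of periodic samples for that is sketched below. The function name and the 10 percent minimum growth threshold are assumptions, not values from MongoDB or Netdata.

```javascript
// Sketch: decide whether a series of samples (oldest first) grows steadily.
// Returns true when no sample is lower than the one before it and the total
// increase exceeds minGrowthRatio of the first sample.
function growsSteadily(samples, minGrowthRatio = 0.10) {
  if (samples.length < 3) return false; // too few points to call a trend
  for (let i = 1; i < samples.length; i++) {
    if (samples[i] < samples[i - 1]) return false; // a dip breaks monotonic growth
  }
  const first = samples[0];
  const last = samples[samples.length - 1];
  if (first === 0) return last > 0;
  return (last - first) / first > minGrowthRatio;
}

// Hypothetical planCacheTotalSizeEstimateBytes readings taken at a fixed interval.
console.log(growsSteadily([100e6, 120e6, 150e6, 190e6]));
```

The same check applies to metrics.cursor.open.noTimeout or to the tcmalloc retained bytes sum. Comparing against a workload signal such as opcounters helps separate a leak from genuine load growth.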
Fixes
TCMalloc fragmentation and retention
High pageheap_free_bytes plus total_free_bytes with stable current_allocated_bytes indicates fragmentation, not a leak.
- MongoDB 8.0: Verify THP is enabled. Ensure Restartable Sequences (rseq) are available. If glibc registered rseq first and tcmalloc fell back to per-thread caches, set GLIBC_TUNABLES=glibc.pthread.rseq=0 before starting mongod.
- Prior to 8.0: Disable THP to reduce latency spikes and fragmentation.
- If retained memory threatens OOM: Schedule a rolling restart during a maintenance window. Disruptive but effective.
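
The fragmentation-versus-leak distinction at the top of this section can be sketched by comparing two samples of db.serverStatus().tcmalloc. The sketch assumes the nested layout in which current_allocated_bytes sits under generic and the free-byte counters sit under a tcmalloc sub-document; field availability may differ between versions. The function names, labels, and the 5 percent stability tolerance are hypothetical.

```javascript
// Sum of allocator-held free memory, as used in the checks above.
function retainedBytes(tc) {
  return tc.tcmalloc.pageheap_free_bytes + tc.tcmalloc.total_free_bytes;
}

// Sketch: classify heap growth between two samples (older first).
// "retention"   : live allocations stable, allocator-held free bytes rising
// "live-growth" : live allocations rising (possible leak or real workload growth)
function classifyHeapGrowth(before, after, stableTolerance = 0.05) {
  const allocBefore = before.generic.current_allocated_bytes;
  const allocAfter = after.generic.current_allocated_bytes;
  const allocChange = (allocAfter - allocBefore) / allocBefore;
  const retainedDelta = retainedBytes(after) - retainedBytes(before);
  if (Math.abs(allocChange) <= stableTolerance && retainedDelta > 0) return "retention";
  if (allocChange > stableTolerance) return "live-growth";
  return "inconclusive";
}

// Hypothetical samples taken an hour apart.
const earlier = { generic: { current_allocated_bytes: 10e9 },
                  tcmalloc: { pageheap_free_bytes: 1e9, total_free_bytes: 1e9 } };
const later = { generic: { current_allocated_bytes: 10.1e9 },
                tcmalloc: { pageheap_free_bytes: 3e9, total_free_bytes: 2e9 } };
console.log(classifyHeapGrowth(earlier, later));
```

A "retention" result points at the allocator fixes in this section, while "live-growth" suggests checking the cursor, aggregation, and plan cache causes first.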
Connection and thread stack overhead
Reduce connection count to shrink the aggregate stack footprint.
- Review driver pool sizes. Lower maxIncomingConnections if the server accepts more than the workload needs.
- Fix connection churn. A high totalCreated delta means pools are destroying and recreating connections. Check for network blips, DNS issues, or election storms causing mass reconnects.
- Tradeoff: Lowering limits may cause connection refused errors during spikes, but prevents memory exhaustion.
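
A minimal sketch of the totalCreated delta follows, assuming two readings of db.serverStatus().connections.totalCreated taken some seconds apart. The function name and the atMs timestamp field are hypothetical additions for illustration.

```javascript
// Sketch: new connections per second between two samples.
// Each sample is { atMs: <epoch milliseconds>, totalCreated: <counter value> }.
function connectionChurnPerSec(prev, curr) {
  const elapsedSec = (curr.atMs - prev.atMs) / 1000;
  if (elapsedSec <= 0) throw new Error("samples must be in chronological order");
  const created = curr.totalCreated - prev.totalCreated;
  if (created < 0) return null; // counter went backwards, e.g. mongod restarted
  return created / elapsedSec;
}

// Hypothetical readings one minute apart.
const t0 = { atMs: 0, totalCreated: 1000 };
const t1 = { atMs: 60000, totalCreated: 1600 };
console.log(connectionChurnPerSec(t0, t1) + " new connections/sec");
```

A rate that stays high while connections.current is flat suggests pools are cycling connections rather than growing, which is the churn pattern described above.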
Cursor and aggregation leaks
- Kill leaked cursors. Identify long-running noTimeout cursors in db.currentOp() and terminate them with db.killOp() if safe.
- Fix application code to close cursors explicitly and avoid noCursorTimeout unless necessary.
- For aggregations that risk exceeding memory limits, enable allowDiskUse: true. This spills intermediate data to disk and avoids OOM, though it increases latency. In MongoDB 6.0 and later, allowDiskUseByDefault controls the global default.
- Tradeoff: Disk-based aggregation increases I/O load and slows pipeline execution.
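
As an illustration of the allowDiskUse option, the snippet below passes it to an aggregation in mongosh. The orders collection and the customerId and amount fields are hypothetical. The guard on db only exists so the snippet can also be loaded outside mongosh without error.

```javascript
// Hypothetical pipeline: a large $group followed by a $sort, the kind of
// stages that can hit the per-stage memory limit.
const pipeline = [
  { $group: { _id: "$customerId", total: { $sum: "$amount" } } },
  { $sort: { total: -1 } },
];

// Let memory-hungry stages spill to temporary files on disk.
const options = { allowDiskUse: true };

// `db` is defined only inside mongosh.
if (typeof db !== "undefined") {
  db.orders.aggregate(pipeline, options).forEach(printjson);
}
```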
Plan cache leak (MongoDB 7.0)
- Upgrade to MongoDB 8.0, which resolves SERVER-96924.
- If upgrading is not viable, disable the Slot-Based Execution engine.
- As an interim measure, schedule weekly rolling restarts to truncate the plan cache.
- Tradeoff: Disabling SBE may change query plans and performance characteristics. Test before applying.
Container memory limits
- Explicitly set cacheSizeGB in mongod.conf based on the container memory limit, not host RAM.
- Leave headroom for connection stacks and heap overhead (typically cacheSizeGB plus 2 to 3 GB).
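
The sizing rule above can be turned into a quick calculation: take the cgroup limit, subtract the headroom, and use the remainder as cacheSizeGB. This is a sketch under stated assumptions. The function name, the 2 GB default headroom, and the rounding to 0.25 GB steps are illustrative choices, and 0.25 is used as the floor because it is the smallest cacheSizeGB that mongod accepts.

```javascript
// Sketch: derive a cacheSizeGB value from a container memory limit in bytes.
// headroomGB covers connection stacks and heap overhead (2 to 3 GB per the text).
function cacheSizeGBForContainer(limitBytes, headroomGB = 2) {
  const limitGB = limitBytes / 1024 ** 3;
  const cacheGB = limitGB - headroomGB;
  if (cacheGB < 0.25) throw new Error("container limit too small for this headroom");
  return Math.floor(cacheGB * 4) / 4; // round down to 0.25 GB steps
}

// Hypothetical 16 GiB container, read from the cgroup files shown in the quick checks.
console.log(cacheSizeGBForContainer(16 * 1024 ** 3, 3)); // 13
```

The result would go into storage.wiredTiger.engineConfig.cacheSizeGB in mongod.conf. Deployments with thousands of connections may need more headroom than the default used here.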
Prevention
- Trend RSS, cache fill, and tcmalloc retained bytes together. A widening gap between RSS and cache used predicts allocator pressure before it becomes critical.
- Monitor connection count and totalCreated delta. Alert on connection churn, not just max connections.
- Avoid noCursorTimeout cursors in application code. Close cursors explicitly and use standard timeouts.
- Size cacheSizeGB explicitly in containers and standalone deployments so MongoDB does not default to host RAM.
- For MongoDB 7.0 deployments using complex aggregations with large $in arrays, plan an upgrade path to 8.0.
How Netdata helps
- Charts mem.resident alongside wiredTiger.cache fill ratio, exposing divergence between RSS and cache.
- Tracks connection count and churn.
- Collects tcmalloc memory statistics where exposed, distinguishing allocator-retained bytes from active allocations.
- High-resolution operation latency and queue depth metrics help identify cursor leaks and aggregation pressure.
- Container-aware memory charts reveal cgroup limit pressure even when the process sees host RAM.
Related guides
- How MongoDB actually works in production: a mental model for operators
- MongoDB pages evicted by application threads: when eviction becomes user latency
- MongoDB balancer stuck and jumbo chunks: permanent imbalance and how to fix it
- MongoDB WiredTiger cache dirty ratio high: the leading indicator nobody watches
- MongoDB WiredTiger cache pressure cascade: eviction stalls and latency spikes
- MongoDB cache too small: sizing the WiredTiger cache for your working set
- MongoDB checkpoint duration climbing: diagnosing slow WiredTiger checkpoints
- MongoDB checkpoint stall write freeze: when all writes stop with no error
- MongoDB chunk migration storms: moveChunk I/O pressure and range locks
- MongoDB connection churn: high totalCreated rate and thread creation overhead
- MongoDB connection refused at maxIncomingConnections: hitting the connection ceiling
- MongoDB connection storm spiral: reconnection floods after an election or deploy







