MongoDB and Transparent Huge Pages: why THP must be disabled
If you see a mongod startup warning about Transparent Huge Pages, or you are chasing tail latency spikes and memory fragmentation that do not correlate with cache pressure, ticket exhaustion, or slow queries, the cause may be THP. The standard advice has always been to disable it, but that is now version-dependent. MongoDB 8.0 upgraded TCMalloc to use per-CPU caches, which changes the recommendation for most x86_64 Linux deployments. For MongoDB 7.0 and earlier, disabling THP is still correct. For 8.0 on x86_64 Linux, THP should be enabled. Running the wrong configuration silently degrades throughput and increases latency.
What THP is and why MongoDB cares
Transparent Huge Pages (THP) is a Linux kernel feature that aggregates standard 4 KB pages into 2 MB huge pages. Some workloads benefit from reduced TLB pressure. Database engines with sparse, non-contiguous memory access patterns typically do not. The kernel’s background compaction to form huge pages introduces latency spikes and memory fragmentation that show up as unpredictable tail latency in WiredTiger.
MongoDB officially recommended disabling THP for years. The most visible sign of a misconfiguration is the mongod startup warning, which fires when /sys/kernel/mm/transparent_hugepage/enabled is set to always. On versions prior to 4.2, the warning also checked the defrag sub-setting independently. From 4.2 onward, only the enabled setting triggers the warning on pre-8.0 releases.
How the recommendation changed in MongoDB 8.0
MongoDB 8.0 ships with an upgraded TCMalloc allocator that replaces per-thread caches with per-CPU caches. This design benefits from THP being enabled. The official guidance reversed for standard x86_64 Linux deployments.
Recommended settings for x86_64 Linux running MongoDB 8.0:
- /sys/kernel/mm/transparent_hugepage/enabled = always
- /sys/kernel/mm/transparent_hugepage/defrag = defer+madvise
- /sys/kernel/mm/transparent_hugepage/khugepaged/max_ptes_none = 0
- /proc/sys/vm/overcommit_memory = 1
Exceptions: The legacy TCMalloc remains in use on the PPC64LE and s390x architectures under RHEL 8, Oracle 8, RHEL 9, CentOS 9, and Oracle 9. Windows also uses legacy TCMalloc. On these platforms, THP must remain disabled even under MongoDB 8.0.
The per-CPU cache implementation depends on the kernel’s rseq (Restartable Sequences) feature, available from Linux 4.18. If another library registers an rseq structure before TCMalloc initializes, TCMalloc falls back to per-thread caches. In that situation, setting GLIBC_TUNABLES=glibc.pthread.rseq=0 before starting mongod is the documented workaround.
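One way to apply that workaround under systemd is a drop-in that sets the variable for the service. The sketch below assumes the unit is named mongod.service and that drop-ins live under /etc/systemd/system; both are assumptions about the host, and the function name write_rseq_dropin is illustrative only.

```shell
# Write a systemd drop-in that sets GLIBC_TUNABLES for mongod.
# Assumption: the service unit is named mongod.service.
# $1 optionally overrides the drop-in directory.
write_rseq_dropin() {
    dropin_dir="${1:-/etc/systemd/system/mongod.service.d}"
    mkdir -p "$dropin_dir" || return 1
    cat > "$dropin_dir/rseq.conf" <<'EOF'
[Service]
Environment="GLIBC_TUNABLES=glibc.pthread.rseq=0"
EOF
}
```

After writing the file, systemctl daemon-reload followed by a restart of the service would be needed for the variable to take effect. Note that this replaces any GLIBC_TUNABLES value the unit already sets.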
flowchart TD
A[Check MongoDB version] --> B{8.0 or later?}
B -->|Yes| C{Platform}
C -->|x86_64 Linux| D[Enable THP]
C -->|PPC64LE s390x Windows| E[Disable THP]
B -->|No| E
D --> F[Verify usingPerCPUCaches true]
E --> G[Verify kernel shows never]
D --> H[Configure persistence]
E --> H
Verifying the current state
Before changing anything, confirm the architecture and MongoDB version:
uname -m
mongod --version | head -1
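Given those two values, the decision described in this guide can be sketched as a small shell function. The function name thp_recommendation is hypothetical, and architectures this guide does not discuss are reported as unknown rather than guessed.

```shell
# Print "enable", "disable", or "unknown" for a MongoDB version and architecture.
# $1 = version such as 8.0.4 or v7.0.12, $2 = output of `uname -m`.
thp_recommendation() {
    version="${1#v}"          # tolerate a leading "v"
    major="${version%%.*}"    # 8.0.4 -> 8
    arch="$2"

    case "$major" in
        ''|*[!0-9]*) echo "unknown"; return 1 ;;
    esac

    if [ "$major" -lt 8 ]; then
        echo "disable"        # 7.0 and earlier: THP off
        return 0
    fi

    case "$arch" in
        x86_64)        echo "enable" ;;
        ppc64le|s390x) echo "disable" ;;   # legacy TCMalloc platforms
        *)             echo "unknown" ;;   # not covered by this guide
    esac
}
```

For example, thp_recommendation 8.0.4 x86_64 prints enable, while thp_recommendation 7.0.12 x86_64 prints disable. During a rolling upgrade, run it per host rather than once per cluster.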
Then check the runtime kernel setting.
For MongoDB 7.0 and earlier, or for any version on PPC64LE, s390x, or Windows:
cat /sys/kernel/mm/transparent_hugepage/enabled
cat /sys/kernel/mm/transparent_hugepage/defrag
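Expected output: never is the active value in both files. The kernel prints every option and marks the active one with brackets, for example always madvise [never]. If the brackets surround always instead, the system is using the configuration MongoDB warns against.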
For MongoDB 8.0 on x86_64 Linux:
cat /sys/kernel/mm/transparent_hugepage/enabled
cat /sys/kernel/mm/transparent_hugepage/defrag
cat /sys/kernel/mm/transparent_hugepage/khugepaged/max_ptes_none
cat /proc/sys/vm/overcommit_memory
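Expected values, in order: always, defer+madvise, 0, and 1. The first two files print every option and mark the active one with brackets, for example [always] madvise never.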
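Because the enabled and defrag files list every option and bracket the active one, a script has to extract the bracketed value rather than compare the whole line. A minimal sketch, with thp_active_value as an illustrative name:

```shell
# Print the active (bracketed) value from a THP control file,
# e.g. "always madvise [never]" -> "never". The file path is $1.
thp_active_value() {
    sed -n 's/.*\[\([^]]*\)].*/\1/p' "$1"
}
```

Calling thp_active_value /sys/kernel/mm/transparent_hugepage/enabled prints only the active setting. The max_ptes_none and overcommit_memory files hold plain numbers and can be read directly.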
Some RHEL variants expose the control at /sys/kernel/mm/redhat_transparent_hugepage/enabled instead of the canonical path. Init scripts and service files should detect which path exists and write to the correct one.
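The path detection can be sketched as follows. The function name thp_sysfs_dir and the optional root argument are assumptions added here so the logic can be exercised without touching the real sysfs.

```shell
# Print the THP sysfs directory that exists on this host.
# $1 optionally overrides the mm root (default /sys/kernel/mm).
thp_sysfs_dir() {
    mm_root="${1:-/sys/kernel/mm}"
    for name in transparent_hugepage redhat_transparent_hugepage; do
        if [ -e "$mm_root/$name/enabled" ]; then
            echo "$mm_root/$name"
            return 0
        fi
    done
    echo "no THP control found under $mm_root" >&2
    return 1
}
```

The canonical path is tried first, so a host that exposes both is handled the same way as a host with only the canonical one.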
On MongoDB 8.0, verify that the upgraded allocator is active:
db.serverStatus().tcmalloc.usingPerCPUCaches
This should return true on supported platforms. If it returns false despite correct THP settings, check whether the process fell back to per-thread caches due to an rseq conflict.
Making the setting persistent
Writing to sysfs directly is lost on reboot. Apply the setting before mongod starts.
tmpfiles.d. Create /etc/tmpfiles.d/transparent_hugepage.conf with a line such as w /sys/kernel/mm/transparent_hugepage/enabled - - - - never and run systemd-tmpfiles --create to apply it immediately. This applies the setting at boot and coexists cleanly with systemd unit files.
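The example line above covers only the disabled case. A hedged sketch that prints tmpfiles.d lines for either mode is shown below; the function name thp_tmpfiles_lines is hypothetical, and the values are the ones listed earlier in this guide.

```shell
# Print tmpfiles.d lines for the requested THP mode.
# $1 = "enable" (8.0 on x86_64 Linux) or "disable" (7.0 and earlier, legacy platforms).
# $2 optionally overrides the THP sysfs directory.
thp_tmpfiles_lines() {
    dir="${2:-/sys/kernel/mm/transparent_hugepage}"
    case "$1" in
        enable)
            printf 'w %s/enabled - - - - always\n' "$dir"
            printf 'w %s/defrag - - - - defer+madvise\n' "$dir"
            printf 'w %s/khugepaged/max_ptes_none - - - - 0\n' "$dir"
            ;;
        disable)
            printf 'w %s/enabled - - - - never\n' "$dir"
            printf 'w %s/defrag - - - - never\n' "$dir"
            ;;
        *)
            echo "usage: thp_tmpfiles_lines enable|disable [dir]" >&2
            return 1
            ;;
    esac
}
```

Redirecting the output to /etc/tmpfiles.d/transparent_hugepage.conf and then running systemd-tmpfiles --create applies it. On hosts that use the Red Hat path variant, pass that directory as the second argument.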
sysctl. vm.overcommit_memory is not exposed under sysfs. Persist it by creating /etc/sysctl.d/99-mongodb.conf containing vm.overcommit_memory = 1, then run sysctl --system.
GRUB boot parameter. In /etc/default/grub, add transparent_hugepage=never to the kernel command line for pre-8.0 releases and legacy platforms, or remove it for 8.0 on x86_64. Regenerate the bootloader configuration with the appropriate command for your distribution, such as update-grub on Debian or Ubuntu, or grub2-mkconfig on RHEL-family systems. This survives kernel upgrades and is more robust than runtime scripts alone on immutable or cloud-provisioned instances.
tuned profiles. On RHEL and CentOS, the tuned daemon can override THP settings at runtime. Even with a correct systemd unit, switching to a tuned profile that sets transparent_hugepages=always will re-enable THP. You must create a dedicated tuned profile matching the desired THP state and activate it with tuned-adm profile <profile-name>. Failure to account for tuned is the most common reason THP silently re-enables after a reboot on RHEL-family systems.
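A dedicated profile can be as small as a single tuned.conf with a [vm] section. The sketch below assumes custom profiles live under /etc/tuned (the location may differ between tuned versions), and the profile name passed in is up to you; write_tuned_profile is an illustrative helper, not part of tuned.

```shell
# Create a tuned profile directory containing a tuned.conf that pins THP.
# $1 = profile name (hypothetical, e.g. mongodb-thp), $2 = never|always,
# $3 optionally overrides the profile root (assumed to be /etc/tuned).
write_tuned_profile() {
    name="$1"
    value="$2"
    root="${3:-/etc/tuned}"
    case "$value" in
        never|always) ;;
        *) echo "value must be never or always" >&2; return 1 ;;
    esac
    mkdir -p "$root/$name" || return 1
    printf '[main]\nsummary=Pin THP for mongod\n\n[vm]\ntransparent_hugepages=%s\n' \
        "$value" > "$root/$name/tuned.conf"
}
```

After write_tuned_profile mongodb-thp never, activate it with tuned-adm profile mongodb-thp. This minimal profile does not inherit from the profile that was previously active; adding an include= line under [main] is one way to keep that tuning, if it matters on your hosts.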
Platform edge cases and gotchas
- Mixed-version clusters. A replica set or sharded cluster may run MongoDB 7.0 on some nodes and 8.0 on others during a rolling upgrade. Do not apply a single cluster-wide kernel tuning. The THP setting must match the mongod version running on each specific host.
- Containers. If mongod runs inside a container, THP is a host-level setting. An unprivileged container cannot change the host sysfs knobs. Tune the host kernel, not the container image.
- tuned overrides everything. A systemd one-shot unit that disables THP is useless if tuned later switches to a profile that enables it. Always verify with tuned-adm active after provisioning changes.
- Red Hat path variant. Scripts that hardcode /sys/kernel/mm/transparent_hugepage/enabled will fail on systems that expose /sys/kernel/mm/redhat_transparent_hugepage/enabled. Always test for path existence.
- Windows. Windows does not use THP, but the legacy TCMalloc guidance applies. Use Windows-specific documentation rather than Linux init scripts.
Signals to watch in production
| Signal | Why it matters | Warning sign |
|---|---|---|
| mongod startup warning about THP | MongoDB detects an enabled THP configuration on a version or platform that requires it disabled | Log contains transparent hugepage warning |
| tcmalloc.usingPerCPUCaches | On 8.0+ x86_64, confirms the upgraded allocator is active and paired with correct THP settings | Returns false on a supported platform after startup |
| OS page fault rate | THP compaction stalls or fragmentation can elevate page faults independent of working set growth | extra_info.page_faults rate increasing without cache pressure |
| Operation latency tails | THP-related stalls appear as sporadic latency spikes not tied to specific queries | opLatencies histogram shows unexplained bucket growth |
| Memory RSS | Fragmentation from misconfigured THP bloats RSS beyond expected bounds | RSS exceeds WiredTiger cache + 2 GB + connections by more than 20% |
| khugepaged CPU time | Excessive compaction churn consumes host CPU without benefit | khugepaged visible in top or perf output during latency spikes |
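For a quick manual check of THP and compaction activity, the kernel exposes cumulative counters in /proc/vmstat. The sketch below simply filters that file; the function name thp_vmstat_counters is illustrative, and which counters are present depends on the kernel build.

```shell
# Print THP and compaction-stall counters from a vmstat-format file.
# $1 optionally overrides the path (default /proc/vmstat).
thp_vmstat_counters() {
    awk '$1 ~ /^thp_/ || $1 == "compact_stall" { print $1 "=" $2 }' "${1:-/proc/vmstat}"
}
```

The counters are totals since boot, so a single reading says little. Sampling twice during a latency spike and comparing the two readings shows whether compaction or huge page collapse was active in that window.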
How Netdata helps
Netdata correlates MongoDB opLatencies with system-level CPU and memory metrics. THP stalls often appear as sporadic latency spikes without corresponding WiredTiger cache or ticket pressure.
- Monitor per-process page fault rates to detect memory management pressure that differs from ordinary cache miss behavior.
- Parse MongoDB logs for startup warnings to catch provisioning drift immediately after deploy or restart.
- Track memory RSS against expected baselines to identify fragmentation bloat before it triggers OOM risk.
- Alert on khugepaged CPU usage to detect THP compaction churn on hosts where THP should be disabled.
Related guides
- How MongoDB actually works in production: a mental model for operators
- MongoDB pages evicted by application threads: when eviction becomes user latency
- MongoDB balancer stuck and jumbo chunks: permanent imbalance and how to fix it
- MongoDB WiredTiger cache dirty ratio high: the leading indicator nobody watches
- MongoDB WiredTiger cache pressure cascade: eviction stalls and latency spikes
- MongoDB cache too small: sizing the WiredTiger cache for your working set
- MongoDB checkpoint duration climbing: diagnosing slow WiredTiger checkpoints
- MongoDB checkpoint stall write freeze: when all writes stop with no error
- MongoDB chunk migration storms: moveChunk I/O pressure and range locks
- MongoDB connection churn: high totalCreated rate and thread creation overhead
- MongoDB connection refused at maxIncomingConnections: hitting the connection ceiling
- MongoDB connection storm spiral: reconnection floods after an election or deploy







