MongoDB and Transparent Huge Pages: when THP must be disabled

If you see a mongod startup warning about Transparent Huge Pages, or you are chasing tail latency spikes and memory fragmentation that do not correlate with cache pressure, ticket exhaustion, or slow queries, the cause may be THP. The standard advice has always been to disable it, but that is now version-dependent. MongoDB 8.0 upgraded TCMalloc to use per-CPU caches, which changes the recommendation for most x86_64 Linux deployments. For MongoDB 7.0 and earlier, disabling THP is still correct. For 8.0 on x86_64 Linux, THP should be enabled. Running the wrong configuration silently degrades throughput and increases latency.

What THP is and why MongoDB cares

Transparent Huge Pages (THP) is a Linux kernel feature that aggregates standard 4 KB pages into 2 MB huge pages. Some workloads benefit from reduced TLB pressure. Database engines with sparse, non-contiguous memory access patterns typically do not. The kernel’s background compaction to form huge pages introduces latency spikes and memory fragmentation that show up as unpredictable tail latency in WiredTiger.

MongoDB officially recommended disabling THP for years. The most visible sign of a misconfiguration is the mongod startup warning, which fires when /sys/kernel/mm/transparent_hugepage/enabled is set to always. On versions prior to 4.2, the warning also checked the defrag sub-setting independently. From 4.2 onward, only the enabled setting triggers the warning on pre-8.0 releases.

How the recommendation changed in MongoDB 8.0

MongoDB 8.0 ships with an upgraded TCMalloc allocator that replaces per-thread caches with per-CPU caches. This design benefits from THP being enabled. The official guidance reversed for standard x86_64 Linux deployments.

Recommended settings for x86_64 Linux running MongoDB 8.0:

  • /sys/kernel/mm/transparent_hugepage/enabled = always
  • /sys/kernel/mm/transparent_hugepage/defrag = defer+madvise
  • /sys/kernel/mm/transparent_hugepage/khugepaged/max_ptes_none = 0
  • /proc/sys/vm/overcommit_memory = 1

Exceptions: The legacy TCMalloc remains in use on the PPC64LE and s390x architectures (on RHEL 8, Oracle 8, RHEL 9, CentOS 9, and Oracle 9). Windows also uses legacy TCMalloc. On these platforms, THP must remain disabled even under MongoDB 8.0.

The per-CPU cache implementation depends on the kernel’s rseq (Restartable Sequences) feature, available from Linux 4.18. If another library registers an rseq structure before TCMalloc initializes, TCMalloc falls back to per-thread caches. In that situation, setting GLIBC_TUNABLES=glibc.pthread.rseq=0 before starting mongod is the documented workaround.
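
Because GLIBC_TUNABLES holds a colon-separated list, overwriting it can discard tunables that are already set. The sketch below shows one way to add the rseq setting while keeping the rest. The function name with_rseq_disabled and the config path in the usage comment are assumptions for illustration, not part of any MongoDB tooling.

```shell
# Append glibc.pthread.rseq=0 to an existing GLIBC_TUNABLES value.
# Entries are colon-separated; any earlier rseq entry is dropped first.
with_rseq_disabled() {
    local current="${1:-}" kept="" item
    local IFS=':'
    for item in $current; do
        case "$item" in
            ''|glibc.pthread.rseq=*) ;;            # skip empties and old rseq entries
            *) kept="${kept:+$kept:}$item" ;;      # keep every other tunable
        esac
    done
    printf '%s\n' "${kept:+$kept:}glibc.pthread.rseq=0"
}

# Usage (hypothetical config path):
#   GLIBC_TUNABLES="$(with_rseq_disabled "${GLIBC_TUNABLES:-}")" mongod --config /etc/mongod.conf
```

For a mongod managed by systemd, the same value would need to be set in the unit's environment rather than in an interactive shell.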

flowchart TD
    A[Check MongoDB version] --> B{8.0 or later?}
    B -->|Yes| C{Platform}
    C -->|x86_64 Linux| D[Enable THP]
    C -->|PPC64LE s390x Windows| E[Disable THP]
    B -->|No| E
    D --> F[Verify usingPerCPUCaches true]
    E --> G[Verify kernel shows never]
    D --> H[Configure persistence]
    E --> H
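
The decision in the flowchart can be sketched as a small shell function. This is a minimal illustration: the function name thp_recommendation is hypothetical, and any platform the text above does not cover (for example aarch64) is reported as unknown rather than guessed.

```shell
# Print "enable", "disable", or "unknown" for a mongod version,
# a CPU architecture (as printed by uname -m), and an OS name.
thp_recommendation() {
    local version="${1#v}" arch="$2" os="${3:-Linux}"
    local major="${version%%.*}"
    case "$major" in
        ''|*[!0-9]*) echo unknown; return 1 ;;     # not a parseable version
    esac
    if [ "$major" -lt 8 ]; then
        echo disable                               # 7.0 and earlier
        return 0
    fi
    case "$os:$arch" in
        Linux:x86_64)                        echo enable ;;
        Linux:ppc64le|Linux:s390x|Windows:*) echo disable ;;   # legacy TCMalloc
        *)                                   echo unknown ;;   # not covered by the guidance
    esac
}
```

A call such as thp_recommendation v8.0.4 x86_64 prints enable, while thp_recommendation 7.0.12 x86_64 prints disable.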

Verifying the current state

Before changing anything, confirm the architecture and MongoDB version:

uname -m
mongod --version | head -1

Then check the runtime kernel setting.

For MongoDB 7.0 and earlier, or for any version on PPC64LE, s390x, or Windows:

cat /sys/kernel/mm/transparent_hugepage/enabled
cat /sys/kernel/mm/transparent_hugepage/defrag

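Expected output: the kernel lists every allowed value and marks the active one with square brackets, for example always madvise [never]. The bracketed value should be never in both files. If the brackets surround always instead, the system is using the configuration MongoDB warns against.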
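
Scripts that compare these files against a plain string tend to fail because the kernel prints every option and brackets the active one. A minimal sketch of extracting the active value follows; the function name thp_active_value is an assumption.

```shell
# Print the bracketed (active) value from a THP sysfs line.
# Returns non-zero if the line contains no bracketed value.
thp_active_value() {
    local line="$1" value
    case "$line" in
        *\[*\]*) ;;                  # has a [value] somewhere
        *) return 1 ;;
    esac
    value="${line#*\[}"              # drop everything up to the first "["
    value="${value%%\]*}"            # drop the "]" and everything after it
    printf '%s\n' "$value"
}

# Usage:
#   thp_active_value "$(cat /sys/kernel/mm/transparent_hugepage/enabled)"
```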

For MongoDB 8.0 on x86_64 Linux:

cat /sys/kernel/mm/transparent_hugepage/enabled
cat /sys/kernel/mm/transparent_hugepage/defrag
cat /sys/kernel/mm/transparent_hugepage/khugepaged/max_ptes_none
cat /proc/sys/vm/overcommit_memory

Expected output, in order: always and defer+madvise as the bracketed (active) values of the first two files, then 0 and 1 as plain numbers.
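
The four reads above can be wrapped in a single pass/fail check. The sketch below assumes the canonical paths and the 8.0 values listed earlier. The function name check_thp_80 and its optional root-prefix argument (useful for testing against a fake directory tree) are illustrative choices.

```shell
# Compare the four kernel settings against the MongoDB 8.0 x86_64 values.
# $1 is an optional root prefix; leave it empty on a real host.
check_thp_80() {
    local root="${1:-}" status=0 path want line got
    while read -r path want; do
        if ! line="$(cat "$root$path" 2>/dev/null)"; then
            echo "MISSING $path"; status=1; continue
        fi
        case "$line" in
            *\[*\]*) got="${line#*\[}"; got="${got%%\]*}" ;;   # bracketed sysfs value
            *)       got="$line" ;;                            # plain number
        esac
        if [ "$got" = "$want" ]; then
            echo "OK      $path = $got"
        else
            echo "WRONG   $path = $got (want $want)"; status=1
        fi
    done <<'EOF'
/sys/kernel/mm/transparent_hugepage/enabled always
/sys/kernel/mm/transparent_hugepage/defrag defer+madvise
/sys/kernel/mm/transparent_hugepage/khugepaged/max_ptes_none 0
/proc/sys/vm/overcommit_memory 1
EOF
    return "$status"
}
```

The function returns non-zero if any value differs or a file is missing, so it could gate a provisioning step.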

Some RHEL variants expose the control at /sys/kernel/mm/redhat_transparent_hugepage/enabled instead of the canonical path. Init scripts and service files should detect which path exists and write to the correct one.
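
One way to do that detection is to probe both locations and use the first that exists. In this sketch the function name thp_dir and the optional root prefix are assumptions; the two directory paths come from the text above.

```shell
# Print the THP control directory that exists on this system.
# $1 is an optional root prefix for testing; leave it empty on a real host.
thp_dir() {
    local root="${1:-}" dir
    for dir in /sys/kernel/mm/transparent_hugepage \
               /sys/kernel/mm/redhat_transparent_hugepage; do
        if [ -e "$root$dir/enabled" ]; then
            printf '%s\n' "$root$dir"
            return 0
        fi
    done
    return 1                         # neither path exists
}

# Usage (as root):
#   echo never > "$(thp_dir)/enabled"
```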

On MongoDB 8.0, verify that the upgraded allocator is active:

db.serverStatus().tcmalloc.usingPerCPUCaches

This should return true on supported platforms. If it returns false despite correct THP settings, check whether the process fell back to per-thread caches due to an rseq conflict.
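
The same check can be run non-interactively. The sketch below assumes mongosh is on the PATH and can reach the instance with the arguments passed through (no arguments means the default localhost connection). Both function names are hypothetical.

```shell
# Ask a running mongod whether per-CPU caches are in use.
# Extra arguments (for example a connection string) go to mongosh.
per_cpu_caches_state() {
    mongosh --quiet --eval 'db.serverStatus().tcmalloc.usingPerCPUCaches' "$@"
}

# Translate the answer into a next step.
explain_per_cpu_caches() {
    case "$1" in
        true)  echo "per-CPU caches active" ;;
        false) echo "per-thread fallback: check THP settings and rseq registration" ;;
        *)     echo "unexpected answer: $1"; return 1 ;;
    esac
}

# Usage:
#   explain_per_cpu_caches "$(per_cpu_caches_state)"
```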

Making the setting persistent

Writing to sysfs directly is lost on reboot. Apply the setting before mongod starts.

tmpfiles.d. Create /etc/tmpfiles.d/transparent_hugepage.conf with a line such as w /sys/kernel/mm/transparent_hugepage/enabled - - - - never (on MongoDB 8.0 x86_64, write always instead and add lines for the defrag and max_ptes_none knobs), then run systemd-tmpfiles --create to apply it immediately. The file is re-applied at every boot and coexists cleanly with systemd unit files.
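
A small generator can produce the tmpfiles.d lines for either configuration. This is a sketch only: the function name thp_tmpfiles_conf and the mode names legacy and percpu are invented for illustration, and the values mirror the settings listed earlier in this article.

```shell
# Print tmpfiles.d lines for the chosen THP configuration.
#   legacy: MongoDB 7.0 and earlier, or legacy-TCMalloc platforms
#   percpu: MongoDB 8.0 on x86_64 Linux
thp_tmpfiles_conf() {
    local enabled defrag
    case "$1" in
        legacy) enabled=never;  defrag=never ;;
        percpu) enabled=always; defrag=defer+madvise ;;
        *) echo "usage: thp_tmpfiles_conf legacy|percpu" >&2; return 2 ;;
    esac
    printf 'w /sys/kernel/mm/transparent_hugepage/enabled - - - - %s\n' "$enabled"
    printf 'w /sys/kernel/mm/transparent_hugepage/defrag - - - - %s\n' "$defrag"
    if [ "$1" = percpu ]; then
        printf 'w /sys/kernel/mm/transparent_hugepage/khugepaged/max_ptes_none - - - - 0\n'
    fi
}

# Usage:
#   thp_tmpfiles_conf legacy | sudo tee /etc/tmpfiles.d/transparent_hugepage.conf
#   sudo systemd-tmpfiles --create
```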

sysctl. vm.overcommit_memory is not exposed under sysfs. Persist it by creating /etc/sysctl.d/99-mongodb.conf containing vm.overcommit_memory = 1, then run sysctl --system.

GRUB boot parameter. On pre-8.0 releases and legacy platforms, add transparent_hugepage=never to the kernel command line in /etc/default/grub. On 8.0+ x86_64, remove that parameter if it is present. Regenerate the bootloader configuration with the appropriate command for your distribution, such as update-grub on Debian or Ubuntu, or grub2-mkconfig on RHEL-family systems. This survives kernel upgrades and is more robust than runtime scripts alone on immutable or cloud-provisioned instances.
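
After a reboot, /proc/cmdline shows which parameters the running kernel was actually booted with, which helps confirm that the GRUB change took effect. The helper below is a sketch with a hypothetical name, cmdline_thp. It prints the value of the last transparent_hugepage= parameter it finds, or nothing if the parameter is absent.

```shell
# Print the transparent_hugepage= value from a kernel command line string.
# Prints nothing when the parameter is not present.
cmdline_thp() {
    printf '%s\n' "$1" \
        | tr ' ' '\n' \
        | sed -n 's/^transparent_hugepage=//p' \
        | tail -n 1
}

# Usage:
#   cmdline_thp "$(cat /proc/cmdline)"
```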

tuned profiles. On RHEL and CentOS, the tuned daemon can override THP settings at runtime. Even with a correct systemd unit, switching to a tuned profile that sets transparent_hugepages=always will re-enable THP. You must create a dedicated tuned profile matching the desired THP state and activate it with tuned-adm profile <profile-name>. Failure to account for tuned is the most common reason THP silently re-enables after a reboot on RHEL-family systems.
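
A dedicated profile is a directory containing a tuned.conf file. The sketch below generates a minimal one that inherits from a parent profile and pins the THP state. The function name tuned_thp_conf, the profile name mongodb-thp, the parent profile virtual-guest, and the /etc/tuned/ location in the usage comment are all assumptions to adapt to the host.

```shell
# Print a minimal tuned.conf that inherits from a parent profile
# and pins THP to the given state (never or always).
tuned_thp_conf() {
    local parent="$1" state="$2"
    case "$state" in
        never|always) ;;
        *) echo "state must be never or always" >&2; return 2 ;;
    esac
    printf '[main]\ninclude=%s\n\n[vm]\ntransparent_hugepages=%s\n' "$parent" "$state"
}

# Usage (hypothetical profile name and parent):
#   sudo mkdir -p /etc/tuned/mongodb-thp
#   tuned_thp_conf virtual-guest never | sudo tee /etc/tuned/mongodb-thp/tuned.conf
#   sudo tuned-adm profile mongodb-thp
#   tuned-adm active
```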

Platform edge cases and gotchas

  • Mixed-version clusters. A replica set or sharded cluster may run MongoDB 7.0 on some nodes and 8.0 on others during a rolling upgrade. Do not apply a single cluster-wide kernel tuning. The THP setting must match the mongod version running on each specific host.
  • Containers. If mongod runs inside a container, THP is a host-level setting. An unprivileged container cannot change the host sysfs knobs. Tune the host kernel, not the container image.
  • tuned overrides everything. A systemd one-shot unit that disables THP is useless if tuned later switches to a profile that enables it. Always verify with tuned-adm active after provisioning changes.
  • Red Hat path variant. Scripts that hardcode /sys/kernel/mm/transparent_hugepage/enabled will fail on systems that expose /sys/kernel/mm/redhat_transparent_hugepage/enabled. Always test for path existence.
  • Windows. Windows does not use THP, but the legacy TCMalloc guidance applies. Use Windows-specific documentation rather than Linux init scripts.

Signals to watch in production

| Signal | Why it matters | Warning sign |
| --- | --- | --- |
| mongod startup warning about THP | MongoDB detects an enabled THP configuration on a version or platform that requires it disabled | Log contains transparent hugepage warning |
| tcmalloc.usingPerCPUCaches | On 8.0+ x86_64, confirms the upgraded allocator is active and paired with correct THP settings | Returns false on a supported platform after startup |
| OS page fault rate | THP compaction stalls or fragmentation can elevate page faults independent of working set growth | extra_info.page_faults rate increasing without cache pressure |
| Operation latency tails | THP-related stalls appear as sporadic latency spikes not tied to specific queries | opLatencies histogram shows unexplained bucket growth |
| Memory RSS | Fragmentation from misconfigured THP bloats RSS beyond expected bounds | RSS exceeds WiredTiger cache + 2 GB + connections by more than 20% |
| khugepaged CPU time | Excessive compaction churn consumes host CPU without benefit | khugepaged visible in top or perf output during latency spikes |
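
The Memory RSS threshold is simple arithmetic and can be sketched as a check. The function name rss_exceeds_budget is hypothetical, all inputs are in MB, and the connection memory figure has to be estimated by the operator. Integer math is used so that plain shell arithmetic is enough.

```shell
# Succeed (exit 0) when RSS is more than 20% above
# WiredTiger cache + 2 GB + connection memory. All arguments are in MB.
rss_exceeds_budget() {
    local rss_mb="$1" cache_mb="$2" conn_mb="$3"
    local budget_mb=$(( cache_mb + 2048 + conn_mb ))
    # rss > budget * 1.2, rewritten without fractions
    [ $(( rss_mb * 100 )) -gt $(( budget_mb * 120 )) ]
}

# Usage:
#   rss_exceeds_budget 13000 8192 512 && echo "RSS above expected bounds"
```

With an 8192 MB cache and 512 MB of connection memory, the budget is 10752 MB and the warning threshold is about 12902 MB, so an RSS of 13000 MB trips the check and 12000 MB does not.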

How Netdata helps

Netdata correlates MongoDB opLatencies with system-level CPU and memory metrics. THP stalls often appear as sporadic latency spikes without corresponding WiredTiger cache or ticket pressure.

  • Monitor per-process page fault rates to detect memory management pressure that differs from ordinary cache miss behavior.
  • Parse MongoDB logs for startup warnings to catch provisioning drift immediately after deploy or restart.
  • Track memory RSS against expected baselines to identify fragmentation bloat before it triggers OOM risk.
  • Alert on khugepaged CPU usage to detect THP compaction churn on hosts where THP should be disabled.