Redis and Transparent Huge Pages: why THP must be disabled

Redis latency spikes during background saves, AOF rewrites, or replica full resyncs often trace to a frozen main thread. Clients time out. Replicas disconnect. If the write rate is high, the next reconnection triggers another fork, and the cascade repeats. One common root cause is Transparent Huge Pages (THP), enabled by default on most Linux distributions. Redis detects THP at startup and logs a warning, but provisioning automation often buries it. The impact is severe: THP can increase fork latency by 10 to 100 times by amplifying copy-on-write memory traffic.

What this means

THP reduces TLB pressure by collapsing contiguous 4KB memory pages into 2MB huge pages. This benefits many workloads, but it is catastrophic for Redis because of how fork and copy-on-write (COW) interact.

Redis uses fork to create child processes for BGSAVE, BGREWRITEAOF, and full replication syncs. The child inherits a copy of the parent’s page tables. The kernel marks the shared pages read-only, so parent and child initially reference the same physical pages. When either process writes to a shared page, the kernel copies that page before the write completes. This is COW.

With standard 4KB pages, a single write copies one 4KB page. With THP enabled, the kernel backs Redis memory with 2MB huge pages. The same single-byte write copies the entire 2MB page. That is 512 times more memory traffic per write. Because the parent keeps serving client writes while the child runs, each write to a huge page that has not yet been copied triggers a 2MB copy. RSS spikes higher, and the main thread stalls longer while the kernel handles the oversized copies.
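The arithmetic behind the 512x figure can be sketched in a few lines of shell. This is an illustrative upper bound, not a measurement: the helper name cow_bytes and the figure of 10,000 writes are assumptions made for the example, and it assumes every write lands on a different, not-yet-copied page.

```shell
#!/bin/sh
# Worst-case bytes copied by COW when each write touches a distinct page.
# Usage: cow_bytes <number_of_writes> <page_size_bytes>
cow_bytes() {
  echo $(( $1 * $2 ))
}

SMALL_PAGE=4096                    # 4KB standard page
HUGE_PAGE=$(( 2 * 1024 * 1024 ))   # 2MB transparent huge page

# Amplification per write: 2MB / 4KB
echo "amplification: $(( HUGE_PAGE / SMALL_PAGE ))x"

# Hypothetical workload: 10,000 scattered writes during one fork window
echo "4KB pages: $(cow_bytes 10000 "$SMALL_PAGE") bytes copied"
echo "2MB pages: $(cow_bytes 10000 "$HUGE_PAGE") bytes copied"
```

In practice the total is capped by the size of the process, because each page is copied at most once per fork. The sketch only shows how quickly that cap is reached when the copy unit is 2MB.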

At startup, Redis checks /sys/kernel/mm/transparent_hugepage/enabled. If the file contains [always], Redis prints a warning similar to the following:

WARNING: You have Transparent Huge Pages (THP) support enabled in your kernel.
This will create latency and memory usage issues with Redis.
To fix this issue run the command 'echo madvise > /sys/kernel/mm/transparent_hugepage/enabled' as root,
and add it to your /etc/rc.local in order to retain the setting after a reboot.
Redis must be restarted after THP is disabled (set to 'madvise' or 'never').

The command in the warning sets the mode to madvise, which is enough to suppress the warning. Many production operators prefer the value never, which eliminates the risk entirely. Both values satisfy the startup check.
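The sysfs file lists every option on one line and marks the active one with square brackets, for example "always madvise [never]". A minimal sketch of a check along the lines described above might look like this. The function names thp_active_mode and thp_ok_for_redis are hypothetical, and the second function only mirrors the behavior described in this guide, not the actual Redis source.

```shell
#!/bin/sh
# Print the active value (the one in square brackets) from a THP sysfs file.
# The "+" in the pattern covers defrag values such as "defer+madvise".
thp_active_mode() {
  sed -n 's/.*\[\([a-z+]*\)\].*/\1/p' "$1"
}

# Return 0 when the mode would not trigger the warning (anything but "always").
thp_ok_for_redis() {
  [ "$(thp_active_mode "$1")" != "always" ]
}

# Example (commented out so the block has no side effects):
# thp_active_mode /sys/kernel/mm/transparent_hugepage/enabled
```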

The penalty is not limited to fork events. The khugepaged background daemon scans memory to collapse small pages into huge pages, and the /sys/kernel/mm/transparent_hugepage/defrag knob controls how aggressively the kernel compacts memory to satisfy huge page allocations. Both consume CPU and can introduce latency spikes independent of persistence operations. For Redis hosts, both the enabled and defrag knobs should be set to never.

flowchart TD
  A[Redis forks for BGSAVE] --> B[Client issues write]
  B --> C{THP enabled}
  C -->|Yes| D[2MB huge page]
  C -->|No| E[4KB page]
  D --> F[Copy entire 2MB]
  E --> G[Copy 4KB]
  F --> H[512x COW amplification]
  G --> I[Normal COW]
  H --> J[Fork latency 10-100x]
  I --> K[Baseline latency]

Common causes

| Cause | What it looks like | First thing to check |
| --- | --- | --- |
| THP set to [always] | Redis startup warning; fork latency 10-100x above baseline; latest_fork_usec spikes | cat /sys/kernel/mm/transparent_hugepage/enabled |
| THP defrag (khugepaged) active | Latency spikes even when fork is not running; CPU usage from background defragmentation | cat /sys/kernel/mm/transparent_hugepage/defrag |
| THP disabled at runtime but not persistent | Warning returns after reboot; sporadic fork latency after host restart | Init scripts or systemd units for THP disable |
| THP disabled on host but Redis not restarted | Warning persists after sysfs change; process still using huge pages | Redis start time versus THP disable time |

Quick checks

Run these read-only checks to confirm THP status and correlate it with fork behavior.

# Check THP enabled mode
cat /sys/kernel/mm/transparent_hugepage/enabled

# Check THP defrag mode
cat /sys/kernel/mm/transparent_hugepage/defrag

# Check latest fork duration in microseconds
redis-cli INFO stats | grep latest_fork_usec

# Check if a background save or rewrite is active
redis-cli INFO persistence | grep -E "rdb_bgsave_in_progress|aof_rewrite_in_progress"

# Check Redis uptime to verify restart timing
redis-cli INFO server | grep uptime_in_seconds

If /sys/kernel/mm/transparent_hugepage/enabled returns [always], THP is active. If latest_fork_usec exceeds 500,000 (500ms), or roughly 20ms per GB of dataset, the host is likely suffering from THP-related fork latency.
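The threshold test above can be scripted. The following is a sketch under stated assumptions: info_field and fork_is_slow are hypothetical helper names, and used_memory stands in for "dataset size", which is an approximation chosen for the example.

```shell
#!/bin/sh
# Extract one field from "redis-cli INFO" output read on stdin.
# INFO lines end in CRLF, so the carriage returns are stripped first.
info_field() {
  tr -d '\r' | awk -F: -v key="$1" '$1 == key { print $2; exit }'
}

# Return 0 (suspicious) if the fork took longer than 500ms
# or longer than 20ms per GB of dataset.
# Usage: fork_is_slow <latest_fork_usec> <dataset_bytes>
fork_is_slow() {
  awk -v usec="$1" -v bytes="$2" 'BEGIN {
    gb = bytes / (1024 * 1024 * 1024)
    exit !(usec > 500000 || usec > 20000 * gb)
  }'
}

# Example (commented out because it needs a reachable Redis instance):
# usec=$(redis-cli INFO stats | info_field latest_fork_usec)
# used=$(redis-cli INFO memory | info_field used_memory)
# fork_is_slow "$usec" "$used" && echo "fork latency above rule of thumb"
```

For small datasets the per-GB rule is strict, so treat a single hit as a prompt to check the THP state rather than as proof.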

How to diagnose it

  1. Verify THP state. Read /sys/kernel/mm/transparent_hugepage/enabled and /sys/kernel/mm/transparent_hugepage/defrag. If either shows [always], the host is misconfigured for Redis.
  2. Check Redis logs for the startup warning. If THP was enabled when Redis started, the warning was emitted to stdout and the configured log destination.
  3. Correlate fork duration with dataset size. Run redis-cli INFO stats | grep latest_fork_usec. Compare the value to the rule of thumb of roughly 10-20ms per GB of RSS on modern hardware with THP disabled. A value 10-100x higher indicates page-table or THP interference.
  4. Check COW size after persistence. Run redis-cli INFO persistence | grep -E "rdb_last_cow_size|aof_last_cow_size". If either exceeds 50% of used_memory, write amplification during fork is unusually high.
  5. Confirm the cascade. Check redis-cli INFO stats | grep sync_full and redis-cli INFO replication | grep connected_slaves. If sync_full increments after fork events and connected_slaves fluctuates, replicas are timing out during slow forks and re-triggering resyncs.
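Step 4 can be sketched as a small filter over INFO output. The function name cow_report is hypothetical. It assumes the input contains used_memory, rdb_last_cow_size, and aof_last_cow_size, which a plain redis-cli INFO call normally returns among its default sections.

```shell
#!/bin/sh
# Read "redis-cli INFO" output on stdin and print the last COW sizes
# as a percentage of used_memory.
cow_report() {
  tr -d '\r' | awk -F: '
    $1 == "used_memory"       { used = $2 }
    $1 == "rdb_last_cow_size" { rdb = $2 }
    $1 == "aof_last_cow_size" { aof = $2 }
    END {
      if (used <= 0) { print "used_memory not found"; exit 1 }
      printf "rdb_cow_pct=%d aof_cow_pct=%d\n", rdb * 100 / used, aof * 100 / used
    }'
}

# Example (commented out because it needs a reachable Redis instance):
# redis-cli INFO | cow_report
```

A value above 50 for either percentage corresponds to the warning level in step 4.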

Metrics and signals to monitor

| Signal | Why it matters | Warning sign |
| --- | --- | --- |
| latest_fork_usec | Measures how long the main thread is frozen during fork | > 500,000 (500ms), or > 20ms per GB of dataset |
| rdb_last_cow_size | Memory copied by COW during last RDB save | > 50% of used_memory |
| aof_last_cow_size | Memory copied by COW during last AOF rewrite | > 50% of used_memory |
| used_memory_rss during fork | Physical memory pressure from oversized page copies | Spike approaching host or container limit while rdb_bgsave_in_progress = 1 |
| THP sysfs state | Direct indicator of OS misconfiguration | [always] in /sys/kernel/mm/transparent_hugepage/enabled |
| sync_full rate | Full resyncs triggered when replicas timeout during slow forks | Counter increments coinciding with fork events |

Fixes

Disable THP immediately at runtime

Run the following as root. These commands are safe and non-destructive.

echo never > /sys/kernel/mm/transparent_hugepage/enabled
echo never > /sys/kernel/mm/transparent_hugepage/defrag

Disabling THP at runtime prevents new huge page allocations. Existing huge pages mapped by running processes are not split and remain in place until unmapped.

Restart Redis

Disabling THP at runtime does not change the memory already mapped by a running Redis process, so you must restart Redis after disabling THP. If automation disables THP but Redis is not restarted, the process continues to operate on its existing huge pages.
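One way to see whether a running process still holds huge pages is to sum the AnonHugePages fields in its smaps data under /proc. The sketch below uses a hypothetical helper, anon_huge_kb. It assumes a kernel that exposes /proc/PID/smaps_rollup (or the longer /proc/PID/smaps) and that you have permission to read it.

```shell
#!/bin/sh
# Sum the AnonHugePages fields (in kB) from a smaps or smaps_rollup file.
# A non-zero total means the process still has THP-backed anonymous memory.
anon_huge_kb() {
  awk '$1 == "AnonHugePages:" { total += $2 } END { print total + 0 }' "$1"
}

# Example (commented out; assumes the process is named redis-server):
# anon_huge_kb "/proc/$(pgrep -o redis-server)/smaps_rollup"
```

A total of 0 after the restart suggests the new process is no longer backed by transparent huge pages.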

Make the setting persistent across reboots

Use a systemd oneshot unit or an /etc/rc.local entry to apply the settings before Redis starts. If the disable runs after Redis starts, the process will not benefit from the change until its next restart.
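A minimal sketch of the systemd approach follows. The helper write_thp_unit and the unit name disable-thp.service are hypothetical. The Before= line assumes Redis runs as redis-server.service or redis.service, so adjust it to match the unit name on your distribution.

```shell
#!/bin/sh
# Write a oneshot unit that sets both THP knobs to never during boot.
# Usage: write_thp_unit <destination_path>
write_thp_unit() {
  cat > "$1" <<'EOF'
[Unit]
Description=Disable Transparent Huge Pages (THP)
After=sysinit.target local-fs.target
# Assumed Redis unit names; ordering only, missing units are ignored
Before=redis-server.service redis.service

[Service]
Type=oneshot
RemainAfterExit=yes
ExecStart=/bin/sh -c 'echo never > /sys/kernel/mm/transparent_hugepage/enabled'
ExecStart=/bin/sh -c 'echo never > /sys/kernel/mm/transparent_hugepage/defrag'

[Install]
WantedBy=multi-user.target
EOF
}

# Example (commented out; run as root on the target host):
# write_thp_unit /etc/systemd/system/disable-thp.service
# systemctl daemon-reload
# systemctl enable --now disable-thp.service
```

The Before= ordering is what ensures the knobs are set before Redis starts. Without it, the unit may run too late to affect the Redis process.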

Boot-time kernel parameter

You can also disable THP at boot by adding transparent_hugepage=never to the kernel command line. If you use this approach, note that the defrag sysfs knob may still show a non-never value after boot. This is expected behavior on some kernels and does not mean defrag is active. However, the safest production posture is to set both the enabled and defrag knobs to never at runtime regardless of boot parameters.
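To confirm which boot parameter, if any, the running kernel received, you can inspect /proc/cmdline. The helper name thp_boot_param below is hypothetical. It prints the value of the last transparent_hugepage= token it finds, or "unset" if there is none.

```shell
#!/bin/sh
# Print the transparent_hugepage= value from a kernel command line file,
# or "unset" if the parameter is absent.
# Usage: thp_boot_param [cmdline_file]   (defaults to /proc/cmdline)
thp_boot_param() {
  val=$(tr ' ' '\n' < "${1:-/proc/cmdline}" \
    | sed -n 's/^transparent_hugepage=//p' | tail -n 1)
  echo "${val:-unset}"
}

# Example (commented out):
# thp_boot_param
```

Comparing this value with the contents of the enabled and defrag sysfs files shows whether something changed the setting after boot.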

Containers and managed Kubernetes

Some cloud environments and Kubernetes distributions block direct sysfs writes from application containers. In these environments, use an init container or cloud-init script that sets both THP knobs before the Redis container starts. If your platform prevents even init containers from writing to sysfs, you may need to disable THP at the node level through the cloud provider’s node configuration or a DaemonSet.

Prevention

  • Bake THP disable into base VM and container images. Every host provisioning a Redis workload should set both knobs before Redis starts.
  • Use configuration management to enforce the setting and alert if [always] reappears after patches or reboots.
  • Include THP status checks in pre-deployment health validation.
  • Do not rely on application-level THP avoidance as a substitute for the OS-level disable.

How Netdata helps

  • Correlates latest_fork_usec with rdb_bgsave_in_progress and aof_rewrite_in_progress to confirm that fork is the source of latency spikes.
  • Tracks used_memory_rss during persistence operations to quantify COW overhead and warn when RSS approaches host limits.
  • Surfaces fork duration thresholds tied to dataset size, helping distinguish normal fork cost from THP-induced pathology.
  • Monitors connected_slaves drops and sync_full increments to detect replica disconnect cascades triggered by slow forks.
  • Combines Redis persistence signals with OS-level memory and CPU metrics to rule out disk saturation or NUMA issues.

Related guides

  • How Redis actually works in production: a mental model for operators: /guides/redis/how-redis-works-in-production/
  • Redis aof_last_write_status:err: AOF write failures and recovery: /guides/redis/redis-aof-last-write-status-err/
  • Redis appendfsync always latency: durability vs throughput trade-offs: /guides/redis/redis-appendfsync-always-latency/
  • Redis big keys: finding the giant key that blocks the event loop: /guides/redis/redis-big-keys-latency/
  • Redis blocked_clients growing: dead consumers vs healthy queues: /guides/redis/redis-blocked-clients-growing/
  • Redis BUSY Redis is busy running a script: blocking Lua and how to recover: /guides/redis/redis-busy-running-script/
  • Redis Can’t save in background: fork: Cannot allocate memory - diagnosis and fix: /guides/redis/redis-cant-save-in-background-fork/
  • Redis client output buffer overflow: slow consumers and client-output-buffer-limit: /guides/redis/redis-client-output-buffer-limit/
  • Redis cluster_slots_pfail > 0: impending node failure in a cluster: /guides/redis/redis-cluster-slots-pfail/
  • Redis CLUSTERDOWN / cluster_state:fail: slot coverage and recovery: /guides/redis/redis-cluster-state-fail/
  • Redis connected_clients climbing: connection leak detection: /guides/redis/redis-connected-clients-climbing/
  • Redis connected_slaves dropped: detecting replica disconnects on the primary: /guides/redis/redis-connected-slaves-dropped/