Redis and Transparent Huge Pages: why THP must be disabled
Redis latency spikes during background saves, AOF rewrites, or replica full resyncs often trace to a frozen main thread. Clients time out. Replicas disconnect. If the write rate is high, the next reconnection triggers another fork, and the cascade repeats. One common root cause is Transparent Huge Pages (THP), enabled by default on most Linux distributions. Redis detects THP at startup and logs a warning, but provisioning automation often buries it. The impact is severe: THP can increase fork latency by 10 to 100 times by amplifying copy-on-write memory traffic.
What this means
THP reduces TLB pressure by collapsing contiguous 4KB memory pages into 2MB huge pages. This benefits many workloads, but it is catastrophic for Redis because of how fork and copy-on-write (COW) interact.
Redis uses fork to create child processes for BGSAVE, BGREWRITEAOF, and full replication syncs. The child inherits the parent’s page tables. The kernel marks the shared pages read-only, so parent and child reference the same physical pages. When either process writes to a shared page, the kernel copies that page before the write completes. This is COW.
With standard 4KB pages, a single write copies one 4KB page. With THP enabled, the kernel backs Redis memory with 2MB huge pages. The same single-byte write copies the entire 2MB page. That is 512 times more memory traffic per write. While the child is alive, client writes handled by the parent dirty huge pages quickly, RSS spikes higher, and the main thread stalls longer on each write while the kernel handles the oversized copies.
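The arithmetic behind the 512x figure can be sketched in a few lines of shell. This is an illustrative upper bound only: the 10,000-write workload is a hypothetical number, and it assumes every write lands on a different page. Real COW traffic is capped by the size of the dataset.

```shell
# Worst-case bytes copied by copy-on-write when each of N distinct pages
# receives at least one write while the fork child is alive.
cow_bytes() {
  # $1 = number of distinct pages written, $2 = page size in bytes
  echo $(( $1 * $2 ))
}

SMALL_PAGE=$(( 4 * 1024 ))          # 4KB standard page
HUGE_PAGE=$(( 2 * 1024 * 1024 ))    # 2MB transparent huge page

# Amplification for a single write: 2MB / 4KB
echo "amplification: $(( HUGE_PAGE / SMALL_PAGE ))x"

# Hypothetical workload: 10,000 scattered writes, each on a different page
WRITES=10000
echo "4KB pages: $(( $(cow_bytes "$WRITES" "$SMALL_PAGE") / 1024 / 1024 )) MiB copied"
echo "2MB pages: $(( $(cow_bytes "$WRITES" "$HUGE_PAGE") / 1024 / 1024 )) MiB copied"
```

Under these assumptions the same write pattern copies about 39 MiB with 4KB pages and 20,000 MiB with 2MB pages.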
At startup, Redis checks /sys/kernel/mm/transparent_hugepage/enabled. If the file contains [always], Redis prints a warning similar to the following:
WARNING: You have Transparent Huge Pages (THP) support enabled in your kernel.
This will create latency and memory usage issues with Redis.
To fix this issue run the command 'echo madvise > /sys/kernel/mm/transparent_hugepage/enabled' as root,
and add it to your /etc/rc.local in order to retain the setting after a reboot.
Redis must be restarted after THP is disabled (set to 'madvise' or 'never').
The command in the warning suggests the value madvise, which is enough to suppress the warning. Most production operators prefer the value never, which eliminates the risk entirely. Both values satisfy the startup check.
The penalty is not limited to fork events. The /sys/kernel/mm/transparent_hugepage/defrag knob controls how aggressively the kernel reclaims and compacts memory to satisfy huge page allocations, and the khugepaged background daemon scans memory to collapse small pages into huge ones. Both consume CPU and can introduce latency spikes independent of persistence operations. For Redis hosts, both the enabled and defrag knobs should be set to never.
flowchart TD
A[Redis forks for BGSAVE] --> B[Client issues write]
B --> C{THP enabled}
C -->|Yes| D[2MB huge page]
C -->|No| E[4KB page]
D --> F[Copy entire 2MB]
E --> G[Copy 4KB]
F --> H[512x COW amplification]
G --> I[Normal COW]
H --> J[Fork latency 10-100x]
I --> K[Baseline latency]
Common causes
| Cause | What it looks like | First thing to check |
|---|---|---|
| THP set to [always] | Redis startup warning; fork latency 10-100x above baseline; latest_fork_usec spikes | cat /sys/kernel/mm/transparent_hugepage/enabled |
| THP defrag (khugepaged) active | Latency spikes even when fork is not running; CPU usage from background defragmentation | cat /sys/kernel/mm/transparent_hugepage/defrag |
| THP disabled at runtime but not persistent | Warning returns after reboot; sporadic fork latency after host restart | Init scripts or systemd units for THP disable |
| THP disabled on host but Redis not restarted | Warning persists after sysfs change; process still using huge pages | Redis start time versus THP disable time |
Quick checks
Run these read-only checks to confirm THP status and correlate it with fork behavior.
# Check THP enabled mode
cat /sys/kernel/mm/transparent_hugepage/enabled
# Check THP defrag mode
cat /sys/kernel/mm/transparent_hugepage/defrag
# Check latest fork duration in microseconds
redis-cli INFO stats | grep latest_fork_usec
# Check if a background save or rewrite is active
redis-cli INFO persistence | grep -E "rdb_bgsave_in_progress|aof_rewrite_in_progress"
# Check Redis uptime to verify restart timing
redis-cli INFO server | grep uptime_in_seconds
If /sys/kernel/mm/transparent_hugepage/enabled shows [always] as the bracketed value, THP is active. If latest_fork_usec exceeds 500,000 (500ms), or roughly 20ms per GB of dataset, the host is likely suffering from THP-related fork latency.
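The sysfs files list every allowed value and mark the active one with square brackets, for example "always madvise [never]". A minimal sketch for extracting the active value follows. The function names thp_active_mode and thp_report are illustrative and not part of Redis or the kernel.

```shell
# Print the active value (the one in square brackets) from a THP sysfs file.
thp_active_mode() {
  # $1 = path to a file such as .../transparent_hugepage/enabled
  sed -n 's/.*\[\([^]]*\)\].*/\1/p' "$1"
}

# Report both knobs and flag the value Redis warns about.
thp_report() {
  # $1 = directory holding the knobs; defaults to the real sysfs location
  dir="${1:-/sys/kernel/mm/transparent_hugepage}"
  for knob in enabled defrag; do
    if [ ! -r "$dir/$knob" ]; then
      echo "$knob: not readable"
      continue
    fi
    mode=$(thp_active_mode "$dir/$knob")
    case "$mode" in
      always) echo "$knob: always (misconfigured for Redis)" ;;
      *)      echo "$knob: $mode" ;;
    esac
  done
}

thp_report
```

The optional directory argument exists only so the report can be pointed at a scratch copy of the files. On a real host, call it with no arguments.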
How to diagnose it
- Verify THP state. Read /sys/kernel/mm/transparent_hugepage/enabled and /sys/kernel/mm/transparent_hugepage/defrag. If either shows [always], the host is misconfigured for Redis.
- Check Redis logs for the startup warning. If THP was enabled when Redis started, the warning was emitted to stdout and the configured log destination.
- Correlate fork duration with dataset size. Run redis-cli INFO stats | grep latest_fork_usec. Compare the value to the rule of thumb of roughly 10-20ms per GB of RSS on modern hardware with THP disabled. A value 10-100x higher indicates page-table or THP interference.
- Check COW size after persistence. Run redis-cli INFO persistence | grep -E "rdb_last_cow_size|aof_last_cow_size". If either exceeds 50% of used_memory, write amplification during fork is unusually high.
- Confirm the cascade. Check redis-cli INFO stats | grep sync_full and redis-cli INFO replication | grep connected_slaves. If sync_full increments after fork events and connected_slaves fluctuates, replicas are timing out during slow forks and re-triggering resyncs.
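The fork and COW comparisons in the steps above can be turned into numbers with a few helpers. This is a sketch, not a monitoring tool: the helper names are hypothetical, and the thresholds are the rules of thumb from this guide (20ms per GB is 20,000 microseconds per GB).

```shell
# Extract one field from INFO output on stdin. INFO lines end in CRLF,
# so the carriage return is stripped first.
info_field() {
  # $1 = field name, e.g. latest_fork_usec
  tr -d '\r' | awk -F: -v key="$1" '$1 == key { print $2; exit }'
}

# Fork cost in microseconds per GB of memory (integer, truncated).
fork_usec_per_gb() {
  # $1 = latest_fork_usec, $2 = memory size in bytes
  awk -v usec="$1" -v bytes="$2" 'BEGIN {
    if (bytes <= 0) { print 0; exit }
    printf "%d\n", usec * 1073741824 / bytes
  }'
}

# COW size as an integer percentage of used_memory.
cow_percent() {
  # $1 = rdb_last_cow_size or aof_last_cow_size, $2 = used_memory
  awk -v cow="$1" -v used="$2" 'BEGIN {
    if (used <= 0) { print 0; exit }
    printf "%d\n", cow * 100 / used
  }'
}

# Usage on a live host (requires a reachable Redis server):
#   fork=$(redis-cli INFO stats | info_field latest_fork_usec)
#   rss=$(redis-cli INFO memory | info_field used_memory_rss)
#   fork_usec_per_gb "$fork" "$rss"    # compare against 20000
```

A result well above 20,000 from fork_usec_per_gb, or above 50 from cow_percent, matches the warning signs described above.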
Metrics and signals to monitor
| Signal | Why it matters | Warning sign |
|---|---|---|
| latest_fork_usec | Measures how long the main thread is frozen during fork | > 500,000 (500ms), or > 20ms per GB of dataset |
| rdb_last_cow_size | Memory copied by COW during last RDB save | > 50% of used_memory |
| aof_last_cow_size | Memory copied by COW during last AOF rewrite | > 50% of used_memory |
| used_memory_rss during fork | Physical memory pressure from oversized page copies | Spike approaching host or container limit while rdb_bgsave_in_progress = 1 |
| THP sysfs state | Direct indicator of OS misconfiguration | [always] in /sys/kernel/mm/transparent_hugepage/enabled |
| sync_full rate | Full resyncs triggered when replicas timeout during slow forks | Counter increments coinciding with fork events |
Fixes
Disable THP immediately at runtime
Run the following as root. These commands are safe and non-destructive.
echo never > /sys/kernel/mm/transparent_hugepage/enabled
echo never > /sys/kernel/mm/transparent_hugepage/defrag
Disabling THP at runtime prevents new huge page allocations. Existing huge pages mapped by running processes are not split and remain in place until unmapped.
Restart Redis
Disabling THP at runtime does not change memory that an already-running Redis process has mapped. Restart Redis after disabling THP. If automation changes the sysfs setting while Redis keeps running, the startup warning stays in the log and the process continues to operate on its existing huge pages.
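One way to see whether a running process is still backed by huge pages is the AnonHugePages field in /proc/<pid>/smaps, which the kernel reports in kB per mapping. The helper below is a sketch; its name is hypothetical, and the usage lines assume a single process named redis-server.

```shell
# Sum the AnonHugePages lines (reported in kB) from a smaps-format file.
anon_huge_kb() {
  # $1 = path such as /proc/<pid>/smaps
  awk '/^AnonHugePages:/ { total += $2 } END { print total + 0 }' "$1"
}

# Usage on a live host (run as root or as the Redis user):
#   pid=$(pgrep -o -x redis-server)
#   anon_huge_kb "/proc/$pid/smaps"
```

A non-zero total after THP has been set to never suggests the process predates the change and still needs a restart.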
Make the setting persistent across reboots
Use a systemd oneshot unit or an /etc/rc.local entry to apply the settings before Redis starts. If the disable runs after Redis starts, the process will not benefit from the change until its next restart.
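A minimal sketch of the systemd approach follows, written as a shell function that generates the unit file. The unit name disable-thp.service and the Redis unit names in Before= are assumptions. Redis packages name their service differently across distributions, so adjust them to match the host.

```shell
# Write a oneshot unit that sets both THP knobs to never before Redis starts.
write_thp_unit() {
  # $1 = destination path for the unit file
  cat > "$1" <<'EOF'
[Unit]
Description=Disable Transparent Huge Pages before Redis starts
# Assumed Redis unit names; adjust to the service name on your host.
Before=redis.service redis-server.service

[Service]
Type=oneshot
RemainAfterExit=yes
ExecStart=/bin/sh -c 'echo never > /sys/kernel/mm/transparent_hugepage/enabled && echo never > /sys/kernel/mm/transparent_hugepage/defrag'

[Install]
WantedBy=multi-user.target
EOF
}

# Usage as root (hypothetical unit name):
#   write_thp_unit /etc/systemd/system/disable-thp.service
#   systemctl daemon-reload
#   systemctl enable --now disable-thp.service
```

The Before= ordering is what ensures the knobs are set before the Redis process allocates memory.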
Boot-time kernel parameter
You can also disable THP at boot by adding transparent_hugepage=never to the kernel command line. If you use this approach, note that the defrag sysfs knob may still show a non-never value after boot. This is expected behavior on some kernels and does not mean defrag is active. However, the safest production posture is to set both the enabled and defrag knobs to never at runtime regardless of boot parameters.
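To confirm whether the boot parameter is actually in effect, the running kernel's command line can be inspected through /proc/cmdline. The helper below is a sketch with a hypothetical name. If the parameter appears more than once it prints the last occurrence, which is an assumption about which one the kernel honors.

```shell
# Print the value of transparent_hugepage= from a kernel command line string,
# or nothing if the parameter is absent.
thp_cmdline_value() {
  # $1 = command line text, e.g. the contents of /proc/cmdline
  printf '%s\n' "$1" | tr ' ' '\n' | sed -n 's/^transparent_hugepage=//p' | tail -n 1
}

# Report the boot-time setting on the current host, if /proc is available.
if [ -r /proc/cmdline ]; then
  value=$(thp_cmdline_value "$(cat /proc/cmdline)")
  echo "transparent_hugepage boot parameter: ${value:-not set}"
fi
```

Because the defrag knob may show a different value after boot, this only confirms the boot parameter. The sysfs files remain the source of truth for the live state.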
Containers and managed Kubernetes
Some cloud environments and Kubernetes distributions block direct sysfs writes from application containers. In these environments, inject an init container or cloud-init script that sets both THP knobs before the Redis container starts. If your platform prevents even init container access to sysfs, you may need to disable THP at the node level through the cloud provider’s node configuration or daemon set.
Prevention
- Bake THP disable into base VM and container images. Every host provisioning a Redis workload should set both knobs before Redis starts.
- Use configuration management to enforce the setting and alert if [always] reappears after patches or reboots.
- Include THP status checks in pre-deployment health validation.
- Do not rely on application-level THP avoidance as a substitute for the OS-level disable.
How Netdata helps
- Correlates latest_fork_usec with rdb_bgsave_in_progress and aof_rewrite_in_progress to confirm that fork is the source of latency spikes.
- Tracks used_memory_rss during persistence operations to quantify COW overhead and warn when RSS approaches host limits.
- Surfaces fork duration thresholds tied to dataset size, helping distinguish normal fork cost from THP-induced pathology.
- Monitors connected_slaves drops and sync_full increments to detect replica disconnect cascades triggered by slow forks.
- Combines Redis persistence signals with OS-level memory and CPU metrics to rule out disk saturation or NUMA issues.
Related guides
- How Redis actually works in production: a mental model for operators: /guides/redis/how-redis-works-in-production/
- Redis aof_last_write_status:err: AOF write failures and recovery: /guides/redis/redis-aof-last-write-status-err/
- Redis appendfsync always latency: durability vs throughput trade-offs: /guides/redis/redis-appendfsync-always-latency/
- Redis big keys: finding the giant key that blocks the event loop: /guides/redis/redis-big-keys-latency/
- Redis blocked_clients growing: dead consumers vs healthy queues: /guides/redis/redis-blocked-clients-growing/
- Redis BUSY Redis is busy running a script: blocking Lua and how to recover: /guides/redis/redis-busy-running-script/
- Redis Can’t save in background: fork: Cannot allocate memory - diagnosis and fix: /guides/redis/redis-cant-save-in-background-fork/
- Redis client output buffer overflow: slow consumers and client-output-buffer-limit: /guides/redis/redis-client-output-buffer-limit/
- Redis cluster_slots_pfail > 0: impending node failure in a cluster: /guides/redis/redis-cluster-slots-pfail/
- Redis CLUSTERDOWN / cluster_state:fail: slot coverage and recovery: /guides/redis/redis-cluster-state-fail/
- Redis connected_clients climbing: connection leak detection: /guides/redis/redis-connected-clients-climbing/
- Redis connected_slaves dropped: detecting replica disconnects on the primary: /guides/redis/redis-connected-slaves-dropped/







