Redis latest_fork_usec too high: THP, NUMA, and fork latency
INFO stats shows latest_fork_usec in the hundreds of milliseconds. Every fork() blocks the single event loop, so during that window no commands are processed. Clients time out, replicas disconnect, and a full resync can trigger another fork, creating a loop of latency and reconnection storms. A normal fork costs roughly 10-20ms per gigabyte of resident memory with Transparent Huge Pages disabled. If you are seeing 10-100x that, the culprit is usually THP, NUMA, or memory overcommit policy.
What this means
latest_fork_usec measures wall-clock time of the fork(2) syscall Redis uses for background RDB saves, AOF rewrites, and full replication resyncs. It measures only the time the main thread is frozen, not the total BGSAVE duration. Redis is single-threaded for command execution, so the entire event loop stops during this window. For latency-sensitive workloads, the impact is indistinguishable from an outage.
With THP disabled, expect roughly 10-20ms per GB of used_memory_rss. Values above 200ms on a reasonably sized instance point to THP interference, NUMA remote-node memory access, or an overcommitted hypervisor. Sustained spikes above 500ms cause client-side timeouts. Above one second, cascading replica disconnects and full resyncs become likely. THP also amplifies copy-on-write cost: after fork, a single byte write to a 2MB huge page forces the kernel to copy the entire page, inflating RSS and latency together.
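The 10-20ms per GB rule of thumb can be turned into an expected range for a given instance. The sketch below is illustrative only: the function name `expected_fork_ms` is hypothetical, and it assumes "GB" means GiB (2^30 bytes) and that THP is already disabled.

```shell
#!/usr/bin/env bash
# Expected fork time range (low and high, in ms) for a given RSS in bytes,
# using the 10-20 ms per GB guideline. Assumes GB = GiB (2^30 bytes).
expected_fork_ms() {
  local rss_bytes=$1
  awk -v rss="$rss_bytes" 'BEGIN {
    gb = rss / (1024 * 1024 * 1024)
    printf "%.0f %.0f\n", gb * 10, gb * 20
  }'
}

# Example: an instance with 8 GiB of used_memory_rss
expected_fork_ms 8589934592   # prints "80 160"
```

On that 8 GiB instance, a latest_fork_usec reading of 1,500,000 (1.5 seconds) would be roughly ten times the upper bound, which is the range where THP or NUMA effects become the likely explanation.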
flowchart TD
A[Fork for BGSAVE or full resync] --> B{THP enabled or NUMA remote?}
B -->|Yes| C[Fork latency spikes >500ms]
B -->|No| D[Normal fork ~20ms/GB]
C --> E[Main thread frozen]
E --> F[Clients timeout]
F --> G[Replicas disconnect]
G --> H[Full resync on reconnect]
H --> A
Common causes
| Cause | What it looks like | First thing to check |
|---|---|---|
| Transparent Huge Pages enabled | latest_fork_usec is 10-100x baseline, often over 1s on multi-GB instances | cat /sys/kernel/mm/transparent_hugepage/enabled |
| NUMA misconfiguration | Elevated fork latency on large bare-metal or multi-socket VMs; memory allocated on a remote node | numactl --hardware and numastat |
| vm.overcommit_memory not set to 1 | Fork fails or stalls under memory pressure; ENOMEM may appear in logs | sysctl vm.overcommit_memory |
| Dataset too large for the host | Fork latency grows linearly with RSS; approaching physical memory limits | used_memory_rss versus total system RAM |
| Overcommitted VM or page table fragmentation | Slower than expected forks even with THP disabled; virtualization overhead | Guest steal time, host hypervisor memory statistics, and RSS trends |
Quick checks
Run these safe, read-only commands to establish baseline state.
# Check the most recent fork duration in microseconds
redis-cli INFO stats | grep latest_fork_usec
# Check whether THP is active
cat /sys/kernel/mm/transparent_hugepage/enabled
# Check if a background save or rewrite is currently running
redis-cli INFO persistence | grep -E "rdb_bgsave_in_progress|aof_rewrite_in_progress"
# Check RSS to compute the expected fork baseline
redis-cli INFO memory | grep used_memory_rss
# Verify memory overcommit policy
sysctl vm.overcommit_memory
# Check replica count; full resyncs trigger additional forks
redis-cli INFO replication | grep connected_slaves
# Check for recent full resyncs
redis-cli INFO stats | grep -E "sync_full|sync_partial_err"
# Check NUMA layout
numactl --hardware
# Check per-node memory distribution for the Redis process (assumes one instance)
numastat -p $(pgrep -n redis-server)
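When these checks feed a script rather than a terminal, it helps to pull a single value out of INFO. The helper below is a minimal sketch: `info_field` is a hypothetical name, and it assumes INFO lines have the form name:value and typically end in CRLF when piped from redis-cli, so the carriage return is stripped first.

```shell
#!/usr/bin/env bash
# Extract one field from INFO output read on stdin.
# Lines look like "name:value"; trailing \r characters are removed.
# Only suitable for simple values that contain no further colons.
info_field() {
  local name=$1
  tr -d '\r' | awk -F: -v key="$name" '$1 == key { print $2; exit }'
}

# Against a live server (assumes redis-cli can reach it):
#   redis-cli INFO stats | info_field latest_fork_usec
# Self-contained demonstration with canned input:
printf 'sync_full:3\r\nlatest_fork_usec:412000\r\n' | info_field latest_fork_usec
```

The demonstration prints 412000, a clean number that can be passed to arithmetic or comparisons without the stray carriage return that a plain grep would leave in place.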
How to diagnose it
1. Establish the per-GB ratio. Convert latest_fork_usec to milliseconds and divide by gigabytes of used_memory_rss. If the result is much higher than 20ms/GB, continue. If it is under 20ms/GB, the fork is normal and the issue is likely dataset size or client timeout tuning.
2. Check THP status. Run cat /sys/kernel/mm/transparent_hugepage/enabled. If the value is [always], THP is active and is the most likely cause. [madvise] is usually safe for Redis because it allocates standard pages by default, but set it to [never] to eliminate the variable.
3. Verify vm.overcommit_memory. Run sysctl vm.overcommit_memory. Redis requires this to be 1. Without it, the kernel performs conservative allocation checks that can cause fork() to fail with ENOMEM or stall, surfacing as MISCONF Redis is configured to save RDB snapshots.
4. Map the fork to a trigger. Check rdb_bgsave_in_progress and aof_rewrite_in_progress. If neither is active but latest_fork_usec updated, the fork was likely triggered by a replica full resync. Check sync_full and sync_partial_err in INFO stats to confirm.
5. Evaluate memory headroom. Compare used_memory_rss to total physical RAM. On persistent instances, maintain at least 50% headroom for COW. If RSS is over 50% of RAM, the kernel is under pressure and fork behavior becomes unpredictable even with correct settings.
6. Inspect NUMA topology. On multi-socket hosts or large VMs, run numactl --hardware to list nodes, then numastat -p $(pgrep -n redis-server) to check memory distribution. If Other_Node is high or the process memory is not on the same node as its CPU, remote memory access is slowing page-table walks during fork. Co-locate CPU and memory, or use interleaved allocation.
7. Enable latency monitoring if it is off. If LATENCY LATEST returns empty, the monitor may be disabled; CONFIG GET latency-monitor-threshold returning 0 confirms it. Run CONFIG SET latency-monitor-threshold 100, then check LATENCY HISTORY fork after the next persistence event to confirm the event is captured.
8. Check for hypervisor overhead. If Redis runs on a virtualized host, run vmstat 1 or top and watch st (steal time). Consistent steal time alongside ballooning or host overcommit inflates fork times without any change in guest configuration.
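
The THP check in the steps above depends on reading which value is in square brackets. A small sketch of that parsing follows; `thp_mode` is a hypothetical helper name, and it assumes the sysfs file lists all choices on one line with exactly one of them bracketed.

```shell
#!/usr/bin/env bash
# Print the active THP mode from the sysfs "enabled" string.
# The active choice is the bracketed one, e.g. "always madvise [never]".
thp_mode() {
  printf '%s\n' "$1" | sed -n 's/.*\[\([a-z]*\)].*/\1/p'
}

thp_mode "always madvise [never]"   # prints "never"
thp_mode "[always] madvise never"   # prints "always"

# On a real host:
#   thp_mode "$(cat /sys/kernel/mm/transparent_hugepage/enabled)"
```

Anything other than never is worth noting during triage, with always being the value most likely to explain a fork time far above the baseline.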
Metrics and signals to monitor
| Signal | Why it matters | Warning sign |
|---|---|---|
| latest_fork_usec | Direct measure of main-thread freeze during fork | >500ms, or >20ms per GB of RSS |
| rdb_bgsave_in_progress / aof_rewrite_in_progress | Identifies when forks are triggered by persistence | Correlates with latency spikes |
| used_memory_rss | Page table size and COW footprint drive fork cost | RSS approaching total RAM |
| sync_full rate | Full resyncs force additional forks on the primary | Increasing counter means replicas are cycling |
| mem_fragmentation_ratio | High fragmentation reduces available memory for COW | Sustained >1.5 |
| LATENCY LATEST fork events | Built-in latency tracking confirms internal impact | Any fork event over your threshold |
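
The warning sign for latest_fork_usec in the table combines an absolute limit with a per-GB limit. One way to express that as a check is sketched below. The function name `fork_breaches_threshold` is hypothetical, GB is taken to mean GiB, and the two inputs are assumed to come from INFO stats and INFO memory.

```shell
#!/usr/bin/env bash
# Exit status 0 when a fork reading breaches either warning sign:
# more than 500 ms in total, or more than 20 ms per GiB of RSS.
fork_breaches_threshold() {
  local fork_usec=$1 rss_bytes=$2
  awk -v us="$fork_usec" -v rss="$rss_bytes" 'BEGIN {
    ms = us / 1000
    gb = rss / (1024 * 1024 * 1024)
    exit !(ms > 500 || (gb > 0 && ms / gb > 20))
  }'
}

# 300 ms on a 4 GiB instance is 75 ms per GiB, so this prints "warn"
if fork_breaches_threshold 300000 4294967296; then
  echo "warn"
else
  echo "ok"
fi
```

Using the exit status keeps the check easy to wire into a cron job or an alerting hook without parsing any output.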
Fixes
Disable Transparent Huge Pages
This is the most common fix and the first change to make.
# Disable THP immediately (run as root)
echo never > /sys/kernel/mm/transparent_hugepage/enabled
The tradeoff is a slightly higher TLB miss rate for generic workloads, but Redis explicitly recommends disabling THP. The improvement is usually immediate. Persist the change across reboots via your distribution’s kernel boot parameters or sysfs utilities.
Set vm.overcommit_memory to 1
Without this, the kernel may refuse the fork or behave conservatively under load.
# Set immediately (run as root)
sysctl -w vm.overcommit_memory=1
The tradeoff is that you rely on the OOM killer rather than allocation-time failure, but Redis requires this for reliable fork behavior. Set it permanently in /etc/sysctl.conf or a drop-in file.
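A drop-in file is often the tidier of the two options. The sketch below writes one to a directory passed as a parameter, so it can be tried against a scratch directory first. The function name `write_overcommit_dropin` and the file name 99-redis.conf are hypothetical choices, not names the system requires.

```shell
#!/usr/bin/env bash
# Write a sysctl drop-in that sets vm.overcommit_memory to 1.
# The target directory is a parameter; 99-redis.conf is an illustrative name.
write_overcommit_dropin() {
  local dir=$1
  printf 'vm.overcommit_memory = 1\n' > "$dir/99-redis.conf"
}

# Dry run against a temporary directory
scratch=$(mktemp -d)
write_overcommit_dropin "$scratch"
cat "$scratch/99-redis.conf"

# On a real host, as root:
#   write_overcommit_dropin /etc/sysctl.d && sysctl --system
```

Running sysctl vm.overcommit_memory afterwards, and again after the next reboot, confirms that the setting was both applied and persisted.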
Fix NUMA placement
If the host has multiple NUMA nodes, bind the Redis process to cores and memory on the same node, or use interleaved allocation across nodes so no single fork pays remote-memory latency. Binding requires a process restart. The tradeoff is that pinning to one node restricts CPU scheduling, while interleaving removes locality benefits for other workloads.
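The two placement strategies map onto different numactl invocations. The sketch below only prints the command line rather than running it, so it can be reviewed before a restart. The function name `numa_launch_cmd`, the node number, and the config path are placeholders for illustration.

```shell
#!/usr/bin/env bash
# Print (not run) a numactl invocation for one of the two strategies.
# "bind" keeps CPU and memory on one node; "interleave" spreads memory
# across all nodes. Node and config path are illustrative placeholders.
numa_launch_cmd() {
  local strategy=$1 node=$2 conf=$3
  case "$strategy" in
    bind)       echo "numactl --cpunodebind=$node --membind=$node redis-server $conf" ;;
    interleave) echo "numactl --interleave=all redis-server $conf" ;;
    *)          echo "unknown strategy: $strategy" >&2; return 1 ;;
  esac
}

numa_launch_cmd bind 0 /etc/redis/redis.conf
numa_launch_cmd interleave 0 /etc/redis/redis.conf
```

How the chosen command is wired into the service manager depends on the host, so treat the printed line as a starting point and not as a finished unit file.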
Reduce fork frequency
Increase repl-backlog-size to prevent full resyncs on brief replica disconnections. The default 1MB is almost always too small for production.
redis-cli CONFIG SET repl-backlog-size 104857600
Also review your save directives. Frequent automatic BGSAVE on a large instance multiplies the fork penalty. The tradeoff is that a larger backlog consumes more memory, and less frequent RDB snapshots widen your recovery point objective.
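A fixed 100MB is a reasonable starting point, but a common sizing heuristic (an assumption here, not something this guide prescribes) is to multiply the replication write rate by the longest disconnect you want to survive without a full resync. The function name `backlog_bytes` and the example figures are hypothetical.

```shell
#!/usr/bin/env bash
# Estimate a repl-backlog-size in bytes as:
#   write rate (bytes per second) x tolerated disconnect (seconds)
# Both inputs are assumptions to measure on your own workload.
backlog_bytes() {
  local write_bytes_per_sec=$1 disconnect_secs=$2
  echo $(( write_bytes_per_sec * disconnect_secs ))
}

# 1 MiB/s of writes and a 100 second tolerance
backlog_bytes 1048576 100   # prints 104857600

# Apply to a live server (assumes redis-cli can reach it):
#   redis-cli CONFIG SET repl-backlog-size "$(backlog_bytes 1048576 100)"
```

With those example inputs the result matches the 104857600 bytes used in the command above, which shows what kind of workload a 100MB backlog actually covers.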
Right-size or shard the instance
If fork latency is still unacceptable after THP is disabled and NUMA is corrected, the dataset may be too large for a single process. Shard the keyspace across multiple Redis instances or enable clustering. The tradeoff is operational complexity, but it removes the single-core and single-fork bottleneck.
Prevention
- Bake THP disable into base images. Ensure every Redis host boots with THP set to never before the server starts.
- Set vm.overcommit_memory=1 at boot. This avoids fork failures during traffic spikes.
- Monitor latest_fork_usec after every fork. Alert when it exceeds 20ms per GB of RSS.
- Size repl-backlog-size to 100MB or more. This prevents replica reconnections from triggering expensive full resyncs.
- Keep RSS below 50% of physical RAM on persistent instances. This leaves headroom for COW pages during the fork window.
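
The 50% headroom guideline is simple to check mechanically. The sketch below assumes both values are supplied in bytes; `rss_percent_of_ram` is a hypothetical helper name, and the figures in the example are made up.

```shell
#!/usr/bin/env bash
# Percentage of physical RAM taken by Redis RSS (integer, rounded down).
# rss_bytes would come from used_memory_rss; total_bytes from the host,
# for example MemTotal in /proc/meminfo (reported in kB) times 1024.
rss_percent_of_ram() {
  local rss_bytes=$1 total_bytes=$2
  echo $(( rss_bytes * 100 / total_bytes ))
}

# 12 GiB of RSS on a 16 GiB host
pct=$(rss_percent_of_ram 12884901888 17179869184)
if [ "$pct" -gt 50 ]; then
  echo "RSS is ${pct}% of RAM: above the 50% guideline"
fi
```

In this example the instance sits at 75% of RAM, so a write-heavy fork window could exhaust the remaining memory well before the snapshot finishes.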
How Netdata helps
- Correlates latest_fork_usec with persistence flags to tie fork spikes to BGSAVE, AOF rewrite, or replica sync events.
- Tracks used_memory_rss, mem_fragmentation_ratio, and system memory to flag COW pressure.
- Surfaces replication metrics including sync_full and sync_partial_err to catch backlog overflow loops.
- Captures instantaneous_ops_per_sec drops that coincide with fork events, confirming client impact.
- Monitors system-level THP state and vm.overcommit_memory alongside Redis metrics.
Related guides
- How Redis actually works in production: a mental model for operators
- Redis eviction policy tuning: allkeys-lru vs volatile-ttl vs noeviction
- Redis maxmemory not set: why every production instance needs a memory limit
- MISCONF Redis is configured to save RDB snapshots — what it means and how to fix it
- Redis monitoring checklist: the signals every production instance needs
- Redis monitoring maturity model: from survival to expert
- Redis OOM command not allowed when used memory > ‘maxmemory’ - causes and fixes
- Redis OOM-killed by the kernel: RSS, overcommit, and recovery







