Redis mem_fragmentation_ratio high: jemalloc fragmentation and active defrag

A mem_fragmentation_ratio sustained above 1.5 on a production instance means Redis holds significantly more physical memory (RSS) than its logical dataset size (used_memory), wasting RAM that could hold data or absorb spikes. This is not a memory leak. Redis uses jemalloc by default, which does not return freed pages to the OS eagerly. Deleted or resized keys leave holes in allocator arenas, inflating RSS while used_memory stays flat or drops.

The immediate risk is an OOM kill. The Linux OOM killer targets RSS. An instance with 20 GB of logical data and a fragmentation ratio of 2.0 occupies about 40 GB of RAM. Copy-on-write during a background fork for an RDB save or AOF rewrite can then push total memory use over a 48 GB host or cgroup limit and kill the process, even though the dataset appears to fit.
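The arithmetic in this example can be reproduced with a small helper. This is a minimal sketch: rss_headroom is a hypothetical function name, and it treats RSS as simply used_memory multiplied by the ratio, which ignores the extra copy-on-write memory a fork adds on top.

```shell
# Hypothetical helper: projected RSS and remaining headroom, all values in GB.
# $1 = used_memory, $2 = mem_fragmentation_ratio, $3 = host or cgroup limit
rss_headroom() {
  awk -v used="$1" -v ratio="$2" -v limit="$3" 'BEGIN {
    rss = used * ratio
    printf "rss_gb=%.1f headroom_gb=%.1f\n", rss, limit - rss
  }'
}

rss_headroom 20 2.0 48   # prints: rss_gb=40.0 headroom_gb=8.0
```

The 8 GB of headroom is what remains for copy-on-write pages during a fork, which is why a write-heavy instance at this ratio is at risk.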

Ignore the ratio on instances with less than 50 MB of used_memory. jemalloc’s minimum allocation granularity dominates at small scale. On larger instances, sustained elevation above 1.5 is a capacity incident.

What this means

mem_fragmentation_ratio is used_memory_rss divided by used_memory. A value of 1.0 to 1.1 is optimal; 1.1 to 1.5 is common for active workloads. Sustained values above 1.5 indicate significant waste. Values below 1.0 on instances larger than 100 MB suggest swap, which is catastrophic for latency.
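As a rough illustration of these bands, the sketch below recomputes the ratio from INFO memory output and labels it. frag_band is a hypothetical helper name, and the band labels are just the ranges quoted above, not anything Redis itself reports.

```shell
# Hypothetical helper: read INFO memory text on stdin, print ratio and band.
# INFO lines end in CRLF, so carriage returns are stripped first.
frag_band() {
  tr -d '\r' | awk -F: '
    $1 == "used_memory"     { used = $2 }
    $1 == "used_memory_rss" { rss = $2 }
    END {
      if (used <= 0) { print "used_memory not found"; exit 1 }
      ratio = rss / used
      if (ratio < 1.0)       band = "below 1.0, check for swap"
      else if (ratio <= 1.1) band = "optimal"
      else if (ratio <= 1.5) band = "common for active workloads"
      else                   band = "significant waste if sustained"
      printf "%.2f %s\n", ratio, band
    }'
}
```

Usage, assuming a reachable instance: redis-cli INFO memory | frag_band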

This ratio is coarse. It includes process overhead such as code segments, shared libraries, and stack space, not just allocator fragmentation. For precision, Redis 4.0 and later expose allocator_frag_ratio (allocator_active / allocator_allocated), which isolates true jemalloc external fragmentation. An allocator_frag_ratio above 4.0 warrants attention regardless of the top-level ratio.
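The allocator-level ratio can be recomputed from its two inputs, which is useful when a monitoring agent exports the raw byte counters but not the derived ratio. A minimal sketch, with allocator_frag as a hypothetical helper name:

```shell
# Hypothetical helper: allocator_active / allocator_allocated from INFO memory.
allocator_frag() {
  tr -d '\r' | awk -F: '
    $1 == "allocator_active"    { active = $2 }
    $1 == "allocator_allocated" { allocated = $2 }
    END {
      if (allocated <= 0) { print "allocator fields not found"; exit 1 }
      printf "%.2f\n", active / allocated
    }'
}
```

Usage, assuming a reachable instance: redis-cli INFO memory | allocator_frag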

The ratio is also unreliable after peak memory events. The allocator holds freed pages for reuse rather than releasing them to the OS. If Redis briefly filled memory with a bulk import and then deleted the keys, used_memory drops but RSS remains at the peak. This produces an artificially high ratio that may not reflect active fragmentation. In this scenario, mem_fragmentation_bytes (the absolute difference between RSS and used_memory) is often more actionable than the ratio itself.
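To make the post-peak case concrete, the sketch below prints the absolute gap between RSS and used_memory in MB, next to how far the recorded peak sits above current usage. post_peak_check is a hypothetical helper name, and the gap is computed exactly as described above (RSS minus used_memory), which may differ slightly from the mem_fragmentation_bytes value your Redis version reports.

```shell
# Hypothetical helper: absolute fragmentation and peak-to-current ratio.
post_peak_check() {
  tr -d '\r' | awk -F: '
    $1 == "used_memory"      { used = $2 }
    $1 == "used_memory_rss"  { rss = $2 }
    $1 == "used_memory_peak" { peak = $2 }
    END {
      if (used <= 0) { print "used_memory not found"; exit 1 }
      printf "frag_mb=%.0f peak_over_current=%.1f\n", (rss - used) / 1048576, peak / used
    }'
}
```

Usage, assuming a reachable instance: redis-cli INFO memory | post_peak_check. A large frag_mb together with a high peak_over_current points at retained pages from an earlier peak rather than ongoing fragmentation.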

flowchart TD
    A[mem_fragmentation_ratio above 1.5] --> B{used_memory below 50MB?}
    B -->|Yes| C[Noise: ignore]
    B -->|No| D{allocator_frag_ratio above 4.0?}
    D -->|Yes| E[True jemalloc fragmentation]
    D -->|No| F[RSS padding or process overhead]
    E --> G{active_defrag_running?}
    G -->|Yes| H[Check hits and misses ratio]
    G -->|No| I[Enable activedefrag or run MEMORY PURGE]
    F --> J[Check used_memory_peak vs used_memory]
    H --> K{Hits high?}
    K -->|Yes| L[Defrag working: wait or tune CPU]
    K -->|No| M[Workload skips large fields or defrag ineffective]

Common causes

Cause	What it looks like	First thing to check
Post-peak deallocation	Ratio jumps after bulk deletion or eviction; `used_memory` drops sharply while RSS stays flat	`used_memory_peak` against current `used_memory`
Heavy key churn with variable-size values	Ratio climbs steadily under write load as jemalloc arenas fragment	`allocator_frag_ratio` in `INFO memory`
Active defrag disabled or ineffective	Sustained high ratio on Redis 4.0+ with heavy mutation; defrag metrics show misses dominating hits	`active_defrag_hits` vs `active_defrag_misses`
Large hash or sorted set fields skipping defrag	`active_defrag_running` is positive but ratio does not decrease	`active_defrag_key_misses` relative to `active_defrag_key_hits`
Tiny instance noise	Ratio of 2.0 to 10.0 on instances with less than 50 MB of data	Absolute `used_memory` value

Quick checks

Run these read-only commands to characterize the situation before making changes.

# Check top-level fragmentation and memory totals
redis-cli INFO memory | grep -E "used_memory:|used_memory_rss:|mem_fragmentation_ratio:|used_memory_peak:"

# Check allocator-level fragmentation for precision (Redis 4.0+)
redis-cli INFO memory | grep -E "allocator_frag_ratio:|allocator_rss_ratio:"

# Check active defrag status and effectiveness
redis-cli INFO memory | grep active_defrag_running
redis-cli INFO stats | grep -E "active_defrag_hits|active_defrag_misses"
redis-cli CONFIG GET activedefrag

# Run built-in diagnostics (Redis 4.0+)
redis-cli MEMORY DOCTOR

# Check for THP interference
cat /sys/kernel/mm/transparent_hugepage/enabled

How to diagnose it

Filter out noise. If used_memory is below 50 MB, ignore the ratio.
Check for post-peak artifact. Compare used_memory_peak to current used_memory. If the peak is many times larger than the current value, the high ratio is likely residual RSS from prior bulk allocations.
Isolate true allocator fragmentation. On Redis 4.0+, inspect allocator_frag_ratio. If this is elevated above 4.0, you have genuine jemalloc external fragmentation. If it is low but mem_fragmentation_ratio is high, the gap is process overhead or RSS padding.
Assess active defrag state. If activedefrag is disabled and you are on Redis 4.0 or later, the instance is not attempting to compact live objects. If it is enabled but active_defrag_misses is high relative to active_defrag_hits, defrag is working hard without reducing fragmentation. This can happen when large hash or sorted set fields exceed active-defrag-max-scan-fields (default 1000) and are skipped per scan cycle.
Check for THP interference. If Transparent Huge Pages are not set to never, jemalloc’s page management is impaired, which can amplify fragmentation and worsen fork latency. The enabled file should read [never].
Correlate with persistence events. used_memory_rss spikes temporarily during RDB or AOF rewrite because of copy-on-write. If the ratio is elevated only during rdb_bgsave_in_progress or aof_rewrite_in_progress, the condition is transient.
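The steps above can be sketched as a single triage pass over INFO output. This is one possible ordering under stated assumptions, not a definitive procedure: triage is a hypothetical helper name, the 50 MB and 4.0 cut-offs are the ones used in this guide, and the 2x peak threshold is borrowed from the monitoring table below. The THP check is left out because it reads a kernel file rather than INFO.

```shell
# Hypothetical helper: read full INFO output on stdin, print one verdict.
triage() {
  tr -d '\r' | awk -F: '
    NF >= 2 { v[$1] = $2 }
    END {
      mb = 1048576
      if (v["used_memory"] + 0 < 50 * mb) {
        print "noise: used_memory below 50 MB"
      } else if (v["rdb_bgsave_in_progress"] + 0 == 1 || v["aof_rewrite_in_progress"] + 0 == 1) {
        print "transient: persistence fork in progress, re-check afterwards"
      } else if (v["allocator_frag_ratio"] + 0 > 4.0) {
        if (v["active_defrag_running"] + 0 > 0)
          print "allocator fragmentation: defrag running, check hits vs misses"
        else
          print "allocator fragmentation: defrag not running"
      } else if (v["used_memory_peak"] + 0 >= 2 * v["used_memory"]) {
        print "post-peak artifact: RSS retained from an earlier peak"
      } else {
        print "process overhead or RSS padding"
      }
    }'
}
```

Usage, assuming a reachable instance: redis-cli INFO | triage. Plain INFO is used instead of INFO memory so that the persistence flags are included.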

Metrics and signals to monitor

Signal	Why it matters	Warning sign
`mem_fragmentation_ratio`	Top-level waste indicator; OOM killer uses RSS	Sustained > 1.5 on instances > 50 MB
`allocator_frag_ratio`	True jemalloc external fragmentation, excluding process overhead	> 4.0
`used_memory_peak` vs `used_memory`	Identifies artificial inflation from prior bulk allocations	Peak is 2x or more above current
`active_defrag_running`	Indicates whether the defragmenter is active	Non-zero while misses exceed hits
`active_defrag_hits` / (`hits` + `misses`)	Measures whether defrag is successfully moving objects	< 0.5 while running
`used_memory_rss`	Physical memory footprint; determines OOM proximity	Approaching host or cgroup limit
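The defrag hit ratio in the table is not reported directly, so it has to be derived from the two counters. A minimal sketch, with defrag_hit_ratio as a hypothetical helper name; both counters are cumulative since startup, so comparing two samples taken some minutes apart gives a more current picture than a single reading.

```shell
# Hypothetical helper: active_defrag_hits / (hits + misses) from INFO stats.
defrag_hit_ratio() {
  tr -d '\r' | awk -F: '
    $1 == "active_defrag_hits"   { hits = $2 }
    $1 == "active_defrag_misses" { misses = $2 }
    END {
      total = hits + misses
      if (total == 0) { print "no defrag activity recorded"; exit }
      printf "%.2f\n", hits / total
    }'
}
```

Usage, assuming a reachable instance: redis-cli INFO stats | defrag_hit_ratio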

Fixes

Post-peak RSS retention: run MEMORY PURGE

If the fragmentation is residual from a prior peak, MEMORY PURGE asks jemalloc to purge dirty pages so they can be reclaimed. This is a jemalloc-specific operation; it is a NOOP when using libc or tcmalloc. It reduces used_memory_rss without changing used_memory. On large heaps this command can be slow. Run it during a low-traffic window.

# Ask jemalloc to release dirty pages to the OS
redis-cli MEMORY PURGE

Sustained fragmentation: enable active defrag

For Redis 4.0 and later, enabling active defragmentation moves live objects to fresh memory and releases fragmented pages. This consumes CPU and can add latency if misconfigured. Enable it live to test:

redis-cli CONFIG SET activedefrag yes

The default thresholds are:

active-defrag-ignore-bytes 100mb
active-defrag-threshold-lower 10
active-defrag-threshold-upper 100
active-defrag-cycle-min 1
active-defrag-cycle-max 25
active-defrag-max-scan-fields 1000

Do not lower active-defrag-ignore-bytes without reason. The threshold exists to prevent the CPU cost of defrag on small absolute fragmentation. If mem_fragmentation_bytes is below 100 MB, defrag will not trigger even if the percentage threshold is met.
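The interaction of the two start conditions can be sketched as follows. defrag_would_start is a hypothetical helper, and it assumes both conditions must hold at once, as described above. Which internal byte and percentage figures Redis compares against these settings is not spelled out here, so treat the inputs as whatever fragmentation measure you are tracking.

```shell
# Hypothetical helper: would the default thresholds let defrag start?
# $1 = fragmentation bytes, $2 = fragmentation percent (ratio 1.25 => 25)
# $3 = active-defrag-ignore-bytes (default 100 MB)
# $4 = active-defrag-threshold-lower (default 10)
defrag_would_start() {
  awk -v bytes="$1" -v pct="$2" -v ignore="${3:-104857600}" -v lower="${4:-10}" 'BEGIN {
    if (bytes < ignore)   print "no: below active-defrag-ignore-bytes"
    else if (pct < lower) print "no: below active-defrag-threshold-lower"
    else                  print "yes: both thresholds met"
  }'
}
```

For example, defrag_would_start 52428800 40 reports that 50 MB of fragmentation stays under the byte floor even at 40 percent, which is the behavior the byte threshold is meant to produce.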

If defrag is enabled but the ratio does not drop, check whether your workload uses large hash or sorted set fields. Objects with fields above the active-defrag-max-scan-fields threshold are skipped during each cycle, which limits effectiveness on those key types. You can raise the scan limit, but doing so increases the CPU cost per cycle.

Version-specific active defrag bugs

Before applying active defrag as a long-term fix, check your Redis version. On Redis 7.2.5 and later, a known bug can cause RSS to grow unbounded even when active defrag is enabled. If you observe this behavior, the workaround is a planned restart. On Redis 8.0.0 and 8.0.1, enabling active defrag causes cron-based timers to run twice as fast due to a scheduling interaction. Upgrade to 8.0.2 or later if you use active defrag.
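A version check along these lines can be scripted before enabling defrag across a fleet. This is a sketch only: defrag_version_notes is a hypothetical helper, the version ranges are exactly the ones named above, and it assumes a sort that supports -V (GNU coreutils and recent BSD variants).

```shell
# Hypothetical helper: print caveats for a given redis_version string.
defrag_version_notes() {
  case "$1" in
    8.0.0|8.0.1) echo "timer bug with active defrag: upgrade to 8.0.2 or later" ;;
  esac
  # sort -V orders version strings; if 7.2.5 sorts first, then $1 >= 7.2.5.
  if [ "$(printf '%s\n' 7.2.5 "$1" | sort -V | head -n 1)" = "7.2.5" ]; then
    echo "7.2.5 or later: watch for RSS growth while active defrag is enabled"
  fi
}
```

The version string comes from the redis_version field of redis-cli INFO server, for example defrag_version_notes 8.0.1. The function prints nothing for versions outside both ranges.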

Disruptive fallback: restart

Restarting Redis resets RSS and eliminates fragmentation immediately. This is effective but causes downtime, cache warmup, and replication delay if the instance is a primary. Use it only when MEMORY PURGE and active defrag have failed, or when you hit version-specific bugs such as unbounded RSS growth despite active defrag. Plan the restart during a maintenance window.

System-level: disable Transparent Huge Pages

If THP is enabled, disable it. The following commands take effect immediately but do not survive reboot:

echo never | sudo tee /sys/kernel/mm/transparent_hugepage/enabled
echo never | sudo tee /sys/kernel/mm/transparent_hugepage/defrag

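Persist the change across reboots if required, for example with the transparent_hugepage=never kernel boot parameter in your GRUB configuration, or with a boot-time systemd unit that runs the commands above. THP is controlled through sysfs and is not a sysctl setting.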
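To confirm the setting after a change or a reboot, the active mode can be extracted from the sysfs file, where the kernel marks it with square brackets. A minimal sketch, with thp_mode as a hypothetical helper name:

```shell
# Hypothetical helper: print the bracketed (active) THP mode from stdin,
# given content such as "always madvise [never]".
thp_mode() {
  sed -n 's/.*\[\([a-z]*\)\].*/\1/p'
}
```

Usage: thp_mode < /sys/kernel/mm/transparent_hugepage/enabled. An output of never means the change is in effect.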

Prevention

Monitor allocator_frag_ratio on Redis 4.0+ alongside mem_fragmentation_ratio to catch true fragmentation early. On large instances, track mem_fragmentation_bytes for an absolute view of wasted memory.
Review MEMORY MALLOC-STATS periodically to understand arena-level fragmentation. Sustained growth in dirty or muzzy pages indicates the allocator is struggling to reuse memory efficiently.
Size memory limits with fragmentation headroom. Persistent instances should keep used_memory below roughly 50% of physical RAM or maxmemory, whichever is tighter, to leave space for RSS overhead and COW during fork.
Avoid mass deletion patterns that leave jemalloc arenas sparse. When possible, use expiration with jitter rather than bulk deletes.
Evaluate active defrag effectiveness through active_defrag_hits and active_defrag_misses rather than enabling it and forgetting it.
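Expiration with jitter, mentioned above, can be sketched as a TTL helper that adds a random offset so keys written together do not all expire in the same second. jittered_ttl is a hypothetical function name, and the key name in the usage line is illustrative only.

```shell
# Hypothetical helper: base TTL plus a random 0..$2 seconds of jitter.
# $1 = base TTL in seconds, $2 = maximum extra seconds
jittered_ttl() {
  # Two random bytes from the kernel, read as an unsigned integer (0..65535).
  r=$(od -An -N2 -tu2 /dev/urandom | tr -d ' \n')
  echo $(( $1 + r % ($2 + 1) ))
}
```

Usage, assuming a reachable instance: redis-cli EXPIRE session:123 "$(jittered_ttl 3600 300)". This spreads expirations over a five-minute window after the one-hour base.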

How Netdata helps

Netdata collects mem_fragmentation_ratio, used_memory, and used_memory_rss natively, so you can correlate RSS growth with logical memory changes.
On Redis 4.0+, Netdata also surfaces allocator_frag_ratio, helping you distinguish allocator waste from process overhead.
Netdata’s alert templates filter out tiny instances, suppressing noise on instances below 50 MB.
You can correlate active_defrag_running with CPU utilization and command latency to spot when defrag itself is becoming a performance cost.
RSS and used_memory are plotted on the same charts, exposing post-peak deallocation patterns.

Related guides

How Redis actually works in production: a mental model for operators: /guides/redis/how-redis-works-in-production/
Redis aof_last_write_status:err: AOF write failures and recovery: /guides/redis/redis-aof-last-write-status-err/
Redis appendfsync always latency: durability vs throughput trade-offs: /guides/redis/redis-appendfsync-always-latency/
Redis big keys: finding the giant key that blocks the event loop: /guides/redis/redis-big-keys-latency/
Redis blocked_clients growing: dead consumers vs healthy queues: /guides/redis/redis-blocked-clients-growing/
Redis BUSY Redis is busy running a script: blocking Lua and how to recover: /guides/redis/redis-busy-running-script/
Redis Can’t save in background: fork: Cannot allocate memory - diagnosis and fix: /guides/redis/redis-cant-save-in-background-fork/
Redis client output buffer overflow: slow consumers and client-output-buffer-limit: /guides/redis/redis-client-output-buffer-limit/
Redis cluster_slots_pfail > 0: impending node failure in a cluster: /guides/redis/redis-cluster-slots-pfail/
Redis CLUSTERDOWN / cluster_state:fail: slot coverage and recovery: /guides/redis/redis-cluster-state-fail/
Redis connected_clients climbing: connection leak detection: /guides/redis/redis-connected-clients-climbing/
Redis connected_slaves dropped: detecting replica disconnects on the primary: /guides/redis/redis-connected-slaves-dropped/

The Netdata solution

Redis monitoring with Netdata

Netdata monitors Redis with per-second metrics and ML anomaly detection. Track memory usage and fragmentation, fork/COW latency, replication backlog, evictions, and connection pressure to spot the failure modes in these runbooks early.
