Redis eviction policy tuning: allkeys-lru vs volatile-ttl vs noeviction

When Redis reaches maxmemory, it must either reject new writes or delete existing keys. The maxmemory-policy directive decides which path it takes, yet many production instances run with a policy that mismatches the workload. A cache running noeviction returns OOM errors to clients. A database running allkeys-lru silently deletes committed data. A session store running volatile-ttl starts rejecting writes once memory fills and an application bug has left it with no TTL-bearing keys to evict.

What it is and why it matters

Redis stores data in memory. The maxmemory directive sets the ceiling; when a command arrives while used_memory is above it, Redis applies maxmemory-policy before executing the command. Eviction runs synchronously in the command path, consuming main-thread CPU and adding latency to the triggering command.

The policy is a workload classification decision, not a tuning detail. allkeys-lru treats Redis as a recoverable cache, evicting the least recently used keys across the entire keyspace. volatile-ttl treats it as a mixed store, evicting only keys with an expiration and choosing the shortest remaining TTL. noeviction treats it as a database, rejecting any write that would exceed the limit. The wrong policy causes silent data loss or unexpected write failures that look like application bugs.

How it works

Before executing a command, Redis checks whether used_memory exceeds maxmemory. If it does, Redis enters the eviction path before responding to the client, so memory usage repeatedly crosses the limit and is then pulled back under it.

Under allkeys-lru, Redis samples keys and removes the least recently used candidate. The eviction is silent. The command then proceeds. If the working set is larger than memory, this happens on nearly every write, turning the instance into a CPU-bound deletion engine that provides minimal caching value.

Under volatile-ttl, Redis evicts only keys that carry an expiration, choosing the one with the shortest remaining TTL. If no keys have a TTL, the policy has no candidates and behaves exactly like noeviction, returning an OOM error.

Under noeviction, Redis skips eviction and returns an error to the client. Reads, and commands that free memory such as DEL, continue to work, so the instance can appear healthy to read-only checks while rejecting every write that needs more memory.

Eviction consumes main-thread CPU. Under aggressive memory pressure, choosing and deleting keys adds latency to every write. Replicas ignore maxmemory by default; the primary handles eviction and replicates the resulting DEL commands downstream. After a failover, a promoted replica enforces its own configured policy, which can suddenly change behavior.

flowchart TD
    A[Write command arrives] --> B{used_memory > maxmemory?}
    B -->|No| C[Execute write]
    B -->|Yes| D{maxmemory-policy}
    D -->|allkeys-lru| E[Evict least recently used key]
    D -->|volatile-ttl| F{Any keys have TTL?}
    F -->|Yes| G[Evict shortest TTL key]
    F -->|No| H[Return OOM error]
    D -->|noeviction| H
    E --> C
    G --> C
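
The decision path in the flowchart can be sketched as a toy model. This is a simplification for illustration, not how Redis is implemented: the function name handle_write and its arguments are hypothetical, it picks one exact victim instead of sampling, and it ignores the fact that Redis may evict several keys before the write proceeds.

```python
from typing import Dict, Optional, Tuple


def handle_write(
    used_memory: int,
    maxmemory: int,
    policy: str,
    idle_seconds: Dict[str, float],
    ttl_seconds: Dict[str, float],
) -> Tuple[str, Optional[str]]:
    """Return (outcome, evicted_key) for one write in this toy model.

    idle_seconds maps every key to the time since its last access.
    ttl_seconds maps only keys that carry an expiration to their remaining TTL.
    """
    # Usage at or below the limit needs no eviction.
    if used_memory <= maxmemory:
        return ("executed", None)

    if policy == "allkeys-lru" and idle_seconds:
        # Exact LRU stand-in: the key that has been idle the longest.
        victim = max(idle_seconds, key=idle_seconds.get)
        return ("executed", victim)

    if policy == "volatile-ttl" and ttl_seconds:
        # Only keys with a TTL are candidates; pick the shortest remaining TTL.
        victim = min(ttl_seconds, key=ttl_seconds.get)
        return ("executed", victim)

    # noeviction, or an eviction policy that has no candidates left.
    return ("oom_error", None)
```

Note that volatile-ttl and noeviction share the final return: with no TTL-bearing keys, the two policies are indistinguishable to the client.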

Where it shows up in production

Cache workloads with allkeys-lru

Pure cache deployments use allkeys-lru to keep hot data in memory. This works until the working set grows or traffic shifts. When the working set exceeds memory, Redis enters a memory pressure spiral: it evicts a key, the application misses and repopulates, the repopulation triggers another eviction, and CPU climbs while cache effectiveness collapses. The application sees elevated latency and backend load, but Redis returns no errors because the policy is doing exactly what it was configured to do.
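
Because Redis returns no errors during a spiral, detection has to come from counters. The following is a minimal sketch, assuming two snapshots of the INFO stats fields evicted_keys, keyspace_hits, and keyspace_misses taken interval_s seconds apart and already parsed into integers. The function names and both thresholds are hypothetical and would need tuning against your own baseline.

```python
from typing import Mapping, Tuple


def window_stats(
    prev: Mapping[str, int], curr: Mapping[str, int], interval_s: float
) -> Tuple[float, float]:
    """Return (evictions per second, hit ratio) for the window between snapshots."""
    evicted = curr["evicted_keys"] - prev["evicted_keys"]
    hits = curr["keyspace_hits"] - prev["keyspace_hits"]
    misses = curr["keyspace_misses"] - prev["keyspace_misses"]
    lookups = hits + misses
    # With no lookups in the window there is nothing to judge; treat as healthy.
    hit_ratio = hits / lookups if lookups > 0 else 1.0
    return evicted / interval_s, hit_ratio


def looks_like_spiral(
    prev: Mapping[str, int],
    curr: Mapping[str, int],
    interval_s: float,
    min_evictions_per_s: float = 100.0,  # assumed threshold, not a Redis default
    max_hit_ratio: float = 0.5,          # assumed threshold, not a Redis default
) -> bool:
    """Flag windows where heavy eviction coincides with a collapsed hit ratio."""
    evictions_per_s, hit_ratio = window_stats(prev, curr, interval_s)
    return evictions_per_s >= min_evictions_per_s and hit_ratio <= max_hit_ratio
```

Requiring both conditions avoids flagging a cache that evicts steadily but still serves most reads from memory.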

Mixed workloads with volatile-ttl

Operators choose volatile-ttl when some data is permanent and some is transient. The policy works only if every evictable key carries an explicit TTL. If an application bug writes keys without an expiration, volatile-ttl has no candidates once memory fills, and writes fail with OOM errors even though the instance is configured to evict. The failure mode depends on application TTL discipline, not just Redis configuration.
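
TTL discipline can be audited from the replies of the TTL command, which returns -1 for a key that exists without an expiration and -2 for a key that does not exist. The sketch below assumes you have already collected those replies into a dict (for example while iterating with SCAN). The function name and the persistent_prefixes convention are hypothetical; substitute whatever naming scheme marks your intentionally permanent keys.

```python
from typing import Dict, Iterable, List

NO_EXPIRY = -1  # TTL reply: key exists but has no expiration
MISSING = -2    # TTL reply: key does not exist


def keys_without_ttl(
    ttl_replies: Dict[str, int],
    persistent_prefixes: Iterable[str] = ("config:",),
) -> List[str]:
    """Return keys that lack a TTL and are not declared persistent."""
    allowed = tuple(persistent_prefixes)
    offenders = []
    for key, ttl in sorted(ttl_replies.items()):
        if ttl != NO_EXPIRY:
            continue  # has a TTL, or no longer exists
        if key.startswith(allowed):
            continue  # intentionally permanent under the assumed convention
        offenders.append(key)
    return offenders
```

A non-empty result means some transient keys can never be evicted under volatile-ttl and will accumulate until writes fail.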

Database and store workloads with noeviction

When Redis acts as a primary data store, noeviction prevents silent data loss. The tradeoff is that write-heavy bursts or unexpected growth cause immediate write rejections. If the client does not check the return value of SET or HSET, those failures are silent in the application layer. The instance continues serving reads, so the problem may not surface until downstream systems detect missing data.
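
One way to keep rejected writes from going unnoticed is to classify the error at the call site. The sketch below assumes the client raises an exception whose text is the raw server error, which for this case begins with the OOM error code ("OOM command not allowed when used memory > 'maxmemory'."). Many client libraries instead map this reply to a dedicated exception type, so treat WriteRejected, is_oom_error, and checked_write as hypothetical names to adapt.

```python
from typing import Any, Callable


class WriteRejected(Exception):
    """Raised when the server refuses a write because it is out of memory."""


def is_oom_error(error_text: str) -> bool:
    # Error replies start with an error code; the out-of-memory rejection
    # uses the code "OOM". A leading "-" is the wire-protocol error marker.
    return error_text.lstrip("-").startswith("OOM ")


def checked_write(write: Callable[..., Any], *args: Any) -> Any:
    """Run a write and surface OOM rejections as WriteRejected."""
    try:
        return write(*args)
    except Exception as exc:
        if is_oom_error(str(exc)):
            raise WriteRejected(str(exc)) from exc
        raise  # unrelated failure: leave it untouched
```

Raising a distinct exception lets the caller decide between retrying, shedding load, and alerting, instead of treating the write as successful.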

Failover edge cases

Because replicas ignore maxmemory by default, they hold whatever dataset the primary sends. After promotion, the new primary enforces its own maxmemory-policy. A replica configured with noeviction that is promoted to replace an allkeys-lru primary will suddenly reject writes once memory is full. A replica with allkeys-lru that is promoted may begin evicting immediately if the dataset is at the limit. Always align replica and primary policies, or at least treat the replica’s policy as the future primary’s policy.
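
Policy alignment is easy to check mechanically. The sketch below assumes you have collected the maxmemory-policy value from each node (for example with CONFIG GET) into plain dicts; the function name and the node labels are hypothetical.

```python
from typing import Dict, List


def policy_mismatches(
    primary: Dict[str, str], replicas: Dict[str, Dict[str, str]]
) -> List[str]:
    """Describe every replica whose eviction policy differs from the primary's."""
    expected = primary.get("maxmemory-policy")
    findings = []
    for name in sorted(replicas):
        actual = replicas[name].get("maxmemory-policy")
        if actual != expected:
            findings.append(
                f"{name}: maxmemory-policy={actual!r}, primary has {expected!r}"
            )
    return findings
```

An empty list means a failover will not change eviction behavior; any entry names a node that would behave differently once promoted.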

Tradeoffs and when to use it

| Policy | Correct workload | Failure mode when wrong |
| --- | --- | --- |
| allkeys-lru | Pure cache where every key is recoverable from an authoritative source | Silent deletion of data the application treats as persistent; memory pressure spiral when working set exceeds RAM |
| volatile-ttl | Mixed store where transient keys always have TTLs and persistent keys have none | Write failures if TTLs are missing; unpredictable eviction if TTLs are set arbitrarily |
| noeviction | Database or primary store where no automatic deletion is acceptable | Immediate OOM write rejections under memory pressure; silent application data loss if clients ignore errors |

allkeys-lru

Use this only when Redis is strictly a cache and every key is reconstructable from an authoritative source. Do not use it if the application assumes a key survives until explicitly deleted. The critical risk is the memory pressure spiral: sustained eviction drives CPU saturation and renders the cache useless. If your access pattern is uniformly random or scan-like, LRU provides little value and you will evict keys that are immediately re-read.
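
The scan-like case is easy to demonstrate with a toy cache. The sketch below uses an exact LRU built on OrderedDict, which is a simplification: Redis approximates LRU by sampling, so real numbers will differ, but the shape of the result is the same. The function name and the access patterns are made up for illustration.

```python
from collections import OrderedDict
from typing import Hashable, Iterable


def lru_hit_ratio(accesses: Iterable[Hashable], capacity: int) -> float:
    """Replay an access sequence against an exact LRU cache of fixed capacity."""
    cache: "OrderedDict[Hashable, None]" = OrderedDict()
    hits = total = 0
    for key in accesses:
        total += 1
        if key in cache:
            hits += 1
            cache.move_to_end(key)  # mark as most recently used
        else:
            if len(cache) >= capacity:
                cache.popitem(last=False)  # evict the least recently used key
            cache[key] = None
    return hits / total if total else 0.0


# A loop over 101 keys against room for 100: each key is evicted
# just before it is read again, so every access misses.
scan_pattern = [i % 101 for i in range(10_100)]

# A loop over 50 keys fits entirely, so only the first pass misses.
fitting_pattern = [i % 50 for i in range(10_100)]
```

Here lru_hit_ratio(scan_pattern, 100) is 0.0 while lru_hit_ratio(fitting_pattern, 100) is above 0.99: being one key short of the working set is enough to defeat LRU under a cyclic scan.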

volatile-ttl

Use this only when your application reliably sets TTLs on every key you are willing to lose. Verify this at the client layer, not just in the Redis configuration. Keys written by a code path that omits EXPIRE can never be evicted; as they accumulate, volatile-ttl runs out of candidates and behaves like noeviction, with a surprise OOM failure mode. This policy fits session stores and temporary queues where TTL discipline is enforced by the application framework.
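
Client-layer enforcement can be as small as a guard around the function that writes transient keys. This is a sketch under assumptions: require_ttl is a hypothetical helper, and set_with_expiry stands in for whatever your client exposes for writing a value together with an expiration. Persistent keys would go through a separate, unguarded path.

```python
from typing import Callable, Optional


def require_ttl(
    set_with_expiry: Callable[[str, str, int], object]
) -> Callable[..., object]:
    """Wrap a write function so transient keys cannot be stored without a TTL."""

    def guarded_set(key: str, value: str, ttl_seconds: Optional[int] = None) -> object:
        # Fail loudly in the application instead of creating an unevictable key.
        if ttl_seconds is None or ttl_seconds <= 0:
            raise ValueError(f"refusing to write {key!r} without a positive TTL")
        return set_with_expiry(key, value, ttl_seconds)

    return guarded_set
```

Failing at write time moves the bug from a production OOM incident to a test failure in the code path that forgot the TTL.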

noeviction

Use this when Redis is the system of record or when any automatic key removal would violate data integrity. The instance rejects writes rather than deleting data. Size memory so that the dataset plus overhead stays below maxmemory under normal and burst conditions. Ensure your application handles OOM errors gracefully, because the server returns them regularly if you undersize the instance.
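
The sizing guidance can be framed as simple arithmetic. The sketch below is one possible framing, not a Redis formula: overhead_pct and burst_pct are hypothetical inputs that you would measure for your own workload, and the 0.9 default mirrors the warning threshold this article uses for the used_memory / maxmemory ratio.

```python
def required_maxmemory(dataset_bytes: int, overhead_pct: int, burst_pct: int) -> int:
    """Bytes needed so dataset plus overhead still fits during a burst.

    Percentages are integers (20 means 20%) to keep the arithmetic exact.
    """
    scaled = dataset_bytes * (100 + overhead_pct) * (100 + burst_pct)
    return -(-scaled // 10_000)  # ceiling division


def is_undersized(used_memory: int, maxmemory: int, warn_ratio: float = 0.9) -> bool:
    """True when usage is already past the warning ratio of the limit."""
    return used_memory / maxmemory > warn_ratio
```

For example, a 10 GiB dataset with an assumed 20% overhead and 25% burst allowance gives required_maxmemory(10 * 2**30, 20, 25) == 16106127360, which is 15 GiB.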

Signals to watch in production

| Signal | Why it matters | Warning sign |
| --- | --- | --- |
| evicted_keys rate | Active eviction volume. Non-zero means the instance is at maxmemory. | Sustained rate > 10x baseline for cache workloads; any non-zero rate for database or persistent workloads |
| used_memory / maxmemory ratio | Proximity to the limit. | > 0.9 |
| keyspace_misses rate | Evicted or expired keys being re-requested. | Rising alongside evicted_keys, indicating a memory pressure spiral |
| total_error_replies rate | Catches write rejections under noeviction or volatile-* when no TTL keys exist. | Sustained increase; check errorstat_OOM:count in INFO errorstats on Redis 6.2+ |
| instantaneous_ops_per_sec with rising latency | Eviction runs synchronously in the command path. | Throughput plateaus or drops while latency climbs, even though user commands are not individually slow |

evicted_keys is a cumulative counter; the raw value says little on its own, so compute the rate of change. If maxmemory-policy is noeviction, evicted_keys stays at zero while writes fail, so total_error_replies is the main server-side counter that catches the failure. On Redis 6.2+, check INFO errorstats for the errorstat_OOM entry, whose value is reported as count=N, for precise diagnosis.
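
The distinction between evicting and rejecting can be read from two raw INFO outputs. The sketch below assumes the text format in which each field is a name:value line, section headers start with #, and the errorstats section reports entries such as errorstat_OOM:count=12. The function names and the diagnosis labels are hypothetical.

```python
from typing import Dict


def parse_info(text: str) -> Dict[str, str]:
    """Parse raw INFO output into a flat dict of field name to string value."""
    fields: Dict[str, str] = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or ":" not in line:
            continue  # skip blanks and section headers
        name, _, value = line.partition(":")
        fields[name] = value
    return fields


def oom_error_count(fields: Dict[str, str]) -> int:
    """Extract N from an 'errorstat_OOM' value of the form 'count=N'."""
    raw = fields.get("errorstat_OOM", "count=0")
    return int(raw.partition("count=")[2] or 0)


def diagnose(prev_text: str, curr_text: str) -> str:
    """Compare two INFO snapshots and label the memory-pressure behavior."""
    prev, curr = parse_info(prev_text), parse_info(curr_text)
    evicted = int(curr.get("evicted_keys", 0)) - int(prev.get("evicted_keys", 0))
    oom = oom_error_count(curr) - oom_error_count(prev)
    if oom > 0 and evicted == 0:
        return "rejecting writes without evicting"
    if oom > 0:
        return "evicting and still rejecting writes"
    if evicted > 0:
        return "evicting"
    return "no memory pressure signal"
```

The first label is the case that an evicted_keys alert alone would miss: noeviction, or a volatile policy with no TTL-bearing keys left.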

How Netdata helps

  • Correlate evicted_keys with keyspace_misses and instantaneous_ops_per_sec to detect a memory pressure spiral before the main thread saturates.
  • Alert on used_memory ratio and total_error_replies to catch noeviction OOM events that are invisible on read-heavy workloads.
  • Track main-thread latency and CPU to identify when eviction overhead in the command path causes latency spikes.
  • Monitor replica memory independently, since replicas ignore maxmemory until promotion and may suddenly enforce a different policy upon failover.