How Redis actually works in production: a mental model for operators

Production incidents do not come from the Redis API. They come from invisible internals: a fork that doubles memory usage, a replication backlog that wraps around, a single slow command that freezes every client. You cannot debug a cascading failure at 3 a.m. without knowing which abstractions compete for which resources.

What it is and why it matters

Redis is a single-threaded event loop around an in-memory dataset. Every incident traces back to resource competition inside one process: memory consumed by the dataset, client buffers, replication backlogs, and allocator fragmentation; CPU consumed by command execution, active expiry, and defragmentation; disk I/O consumed by AOF fsync and RDB snapshots; and network bandwidth consumed by replication and Pub/Sub fan-out.

Without this model, you will misidentify a memory fragmentation spike as a leak, a replication backlog overflow as network instability, or a fork latency spike as generic CPU saturation.

How it works

flowchart TD
    CLIENTS[Client sockets] --> EVENT["Event loop<br/>main thread"]
    EVENT --> COMMANDS[Command execution]
    COMMANDS --> KEYSPACE[Keyspace hash table]
    KEYSPACE --> JEMALLOC[jemalloc allocator]
    EVENT --> BUFFERS[Client buffers]
    BUFFERS --> JEMALLOC
    KEYSPACE --> EXPIRY[Lazy + active expiry]
    JEMALLOC --> FORK["Fork for RDB/AOF<br/>COW duplication"]
    EVENT --> BACKLOG["Replication backlog<br/>PSYNC2"]
    KEYSPACE --> CLUSTER["16384 hash slots<br/>CRC16 modulo"]
    BACKLOG --> REPLICAS[Replica sync]

The single-threaded event loop. Redis multiplexes client sockets on one main thread using epoll, kqueue, evport, or select, depending on the platform. Since Redis 6.0, I/O threads can handle network reads and writes in parallel, but command execution remains strictly single-threaded. Any command that takes too long blocks every other client. Check SLOWLOG GET for offenders, but remember that the slowlog records only execution time; the time a command spends waiting in the queue is invisible there.

The keyspace. Keys are stored in a hash table that resizes in powers of two. Rehashing is incremental, so the server keeps serving reads and writes during a resize, but it adds CPU overhead and latency jitter; with activerehashing yes, the server also spends about 1 ms in every 100 ms of CPU time migrating buckets in the background. Expired keys are removed lazily on access and actively by sampling random keys on every server cron cycle. The cycle frequency is set by hz (default 10). If keys expire faster than the active cycle deletes them, memory grows and CPU spikes.

Memory allocation. Redis compiles against jemalloc on Linux by default. It does not return memory to the OS eagerly. The gap between used_memory and used_memory_rss reflects allocator free lists, dirty pages, and internal fragmentation. Active defragmentation is available only with jemalloc; switching to libc malloc disables it.

Client buffers. Every connection carries a query buffer governed by client-query-buffer-limit (default 1 GB) and an output buffer controlled by client-output-buffer-limit per class: normal, replica, and pubsub. Normal clients default to unlimited output. During a spike, run CLIENT LIST and inspect the omem field per connection. A forgotten MONITOR client, a slow Pub/Sub subscriber, or a replica that cannot drain its stream allocates from the main heap and can cause sudden RSS growth.
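To make the omem inspection concrete, here is a minimal sketch that ranks connections by output-buffer memory. It assumes you have already captured the raw text of CLIENT LIST (one client per line, space-separated key=value fields); the function names are illustrative, not part of any Redis client library.

```python
from typing import Dict, List, Tuple


def parse_client_list(raw: str) -> List[Dict[str, str]]:
    """Parse CLIENT LIST text into one dict of fields per connection."""
    clients = []
    for line in raw.splitlines():
        line = line.strip()
        if not line:
            continue
        fields = {}
        for token in line.split(" "):
            key, sep, value = token.partition("=")
            if sep:  # skip anything that is not key=value
                fields[key] = value
        clients.append(fields)
    return clients


def top_output_buffers(raw: str, limit: int = 5) -> List[Tuple[str, str, int]]:
    """Return (addr, cmd, omem_bytes) for the clients holding the most output memory."""
    rows = [
        (c.get("addr", "?"), c.get("cmd", "?"), int(c.get("omem", "0")))
        for c in parse_client_list(raw)
    ]
    rows.sort(key=lambda row: row[2], reverse=True)
    return rows[:limit]
```

Feeding it the output of redis-cli CLIENT LIST during a spike should put a stuck subscriber or MONITOR session at the top of the list.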

Persistence forks. For RDB snapshots or AOF rewrites, Redis calls fork() to create a child process that serializes the dataset while the parent continues serving commands. Copy-on-write duplicates pages as they are modified. Under a write-heavy workload, RSS can temporarily approach double the dataset size. Disable Transparent Huge Pages before starting Redis; THP multiplies fork latency. During the fork itself, the main thread freezes; latency is proportional to dataset size.
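A small pre-flight check for the THP setting can be sketched as follows. It assumes the standard Linux sysfs file /sys/kernel/mm/transparent_hugepage/enabled, whose content looks like "always madvise [never]" with the active mode in brackets. The function names are illustrative, and treating only "always" as a problem is an assumption of this sketch; adjust it if your policy requires "never".

```python
import re

# Standard Linux location; read it yourself and pass the text in.
THP_PATH = "/sys/kernel/mm/transparent_hugepage/enabled"


def active_thp_mode(sysfs_text: str) -> str:
    """Return the bracketed mode from text such as 'always madvise [never]'."""
    match = re.search(r"\[(\w+)\]", sysfs_text)
    if match is None:
        raise ValueError("unrecognised THP setting: %r" % sysfs_text)
    return match.group(1)


def thp_always_on(sysfs_text: str) -> bool:
    """True when THP is forced on for all memory (assumed to be the risky mode)."""
    return active_thp_mode(sysfs_text) == "always"
```

On a host, the input would come from reading THP_PATH before Redis starts, for example in a deployment health check.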

Replication backlog. The primary maintains a fixed-size circular backlog controlled by repl-backlog-size (default 1 MB). Replicas identify themselves by a replication ID and an offset. If the offset falls within the backlog window, a partial resync resumes without a full RDB transfer. Since Redis 4.0, PSYNC2 retains the old master’s replication ID and a second replication offset on promoted replicas, enabling partial resync with downstream replicas after failover. If the backlog wraps before a replica reconnects, a full resync is required, triggering another fork.
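The window check can be sketched in a few lines. This is a simplified model, not the server's actual code path: it ignores the second replication ID that PSYNC2 keeps after a failover, and the function name is illustrative. The inputs correspond to the master_replid, repl_backlog_first_byte_offset, and repl_backlog_histlen fields of INFO replication.

```python
def can_partial_resync(
    replica_replid: str,
    replica_offset: int,
    primary_replid: str,
    backlog_first_byte_offset: int,
    backlog_histlen: int,
) -> bool:
    """Simplified check: is the next byte the replica needs still in the backlog?"""
    if replica_replid != primary_replid:
        # Different history; this sketch does not model the secondary replication ID.
        return False
    next_byte = replica_offset + 1
    window_end = backlog_first_byte_offset + backlog_histlen
    return backlog_first_byte_offset <= next_byte <= window_end
```

With a backlog holding bytes 1001 through 2000, a replica that stopped at offset 1500 can resume, while one that stopped at 900 has fallen out of the window and needs a full resync.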

Cluster slots. In cluster mode, data is sharded across 16384 hash slots. Keys map via CRC16(key) & 16383. Hash tags using curly braces pin related keys to the same slot, enabling multi-key commands. Gossip uses the client port plus 10000 and must be reachable between all nodes. Partial slot coverage produces a partial outage.
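The slot mapping is easy to reproduce offline, which helps when predicting where a key or hash tag will land. The sketch below implements the CRC16 variant used by Redis Cluster (XMODEM, polynomial 0x1021) and the hash tag rule: if the key contains a "{" followed later by a "}" with at least one character in between, only that substring is hashed. The function names are illustrative; CLUSTER KEYSLOT on a live node is the authoritative cross-check.

```python
def crc16_xmodem(data: bytes) -> int:
    """CRC16 with polynomial 0x1021 and initial value 0, computed bit by bit."""
    crc = 0
    for byte in data:
        crc ^= byte << 8
        for _ in range(8):
            if crc & 0x8000:
                crc = ((crc << 1) ^ 0x1021) & 0xFFFF
            else:
                crc = (crc << 1) & 0xFFFF
    return crc


def key_slot(key: bytes) -> int:
    """Map a key to one of the 16384 slots, honouring hash tags."""
    start = key.find(b"{")
    if start != -1:
        end = key.find(b"}", start + 1)
        # An empty tag "{}" is ignored and the whole key is hashed.
        if end != -1 and end != start + 1:
            key = key[start + 1:end]
    return crc16_xmodem(key) & 16383


if __name__ == "__main__":
    for name in (b"foo", b"{user1000}.following", b"{user1000}.followers"):
        print(name.decode(), key_slot(name))
```

Both {user1000} keys print the same slot, which is exactly what makes multi-key commands on them legal and what concentrates their load on one node.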

Where it shows up in production

Event loop blocking shows up as latency that rises uniformly across all commands. A single KEYS *, a large SMEMBERS, or an unoptimized Lua script freezes the entire server. If SLOWLOG GET shows repeated offenders while overall ops per second drops, the event loop is wedged. Use redis-cli --latency-history to measure round-trip delay, which includes queueing; if the latency it reports exceeds the slowlog execution times, the loop is saturated.

Memory pressure shows up in the gap between used_memory and used_memory_rss. The OS OOM killer uses RSS, not Redis’s logical allocator count. In containers, compare RSS to the cgroup limit, not just maxmemory. Fragmentation ratios above 1.5 waste physical RAM and reduce headroom for COW during fork. On instances over 100 MB, a ratio below 1.0 means swapping, which destroys latency. Replicas do not expire keys independently; since Redis 3.2 they mark keys logically expired on read, but the key remains in memory until the primary propagates a DEL. A high-TTL workload can inflate replica memory well beyond the primary’s live dataset size.
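The fragmentation rules of thumb above can be written down as a small classifier. It takes used_memory and used_memory_rss from INFO memory and, optionally, the container's cgroup memory limit. The function names and the returned labels are illustrative; the 1.5, 1.0, and 100 MB thresholds are the ones stated in this section.

```python
from typing import Optional

MIN_BYTES = 100 * 1024 * 1024  # below this, the ratio is too noisy to judge


def classify_fragmentation(used_memory: int, used_memory_rss: int) -> str:
    """Label the RSS-to-logical-memory ratio using this section's thresholds."""
    if used_memory <= 0 or used_memory < MIN_BYTES:
        return "too-small-to-judge"
    ratio = used_memory_rss / used_memory
    if ratio > 1.5:
        return "fragmented"
    if ratio < 1.0:
        return "possibly-swapping"
    return "ok"


def rss_headroom_bytes(used_memory_rss: int, cgroup_limit: Optional[int]) -> Optional[int]:
    """Bytes left before the container limit, or None when no limit is known."""
    if cgroup_limit is None:
        return None
    return cgroup_limit - used_memory_rss
```

The headroom figure is the one to compare against expected COW growth during a fork, since the OOM killer acts on RSS rather than on maxmemory.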

Client buffer bloat shows up as sudden RSS spikes even when the keyspace is stable. The default client-output-buffer-limit normal is 0 0 0 (unlimited). A slow Pub/Sub subscriber, a replica that cannot drain its replication stream, or a MONITOR session left running consumes memory from the main heap until either the limit disconnects the client or Redis is OOM killed.

Fork COW pressure shows up during scheduled BGSAVE, AOF rewrite, or an unplanned full resync. In containers with tight memory limits, the RSS spike from dirty page duplication triggers the OOM killer. The fork latency itself, tracked in latest_fork_usec, freezes the main thread. Replicas may time out during long forks, disconnect, and reconnect. If the replication backlog wrapped during the disconnect, another full resync begins.
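A rough budget check before scheduling a BGSAVE in a container might look like the sketch below. The model is deliberately crude and its parameters are assumptions: dirty_fraction is your own estimate of how much of the dataset is rewritten while the child runs, and the function names are illustrative. latest_fork_usec and used_memory come from INFO.

```python
def worst_case_fork_rss(rss_bytes: int, dirty_fraction: float) -> int:
    """Estimate peak RSS if dirty_fraction of the pages are copied during the fork."""
    if not 0.0 <= dirty_fraction <= 1.0:
        raise ValueError("dirty_fraction must be between 0 and 1")
    return int(rss_bytes * (1.0 + dirty_fraction))


def fork_would_exceed_limit(rss_bytes: int, dirty_fraction: float, limit_bytes: int) -> bool:
    """True when the estimated COW peak does not fit under the memory limit."""
    return worst_case_fork_rss(rss_bytes, dirty_fraction) > limit_bytes


def fork_ms_per_gb(latest_fork_usec: int, used_memory_bytes: int) -> float:
    """Normalise the last fork duration to milliseconds per GiB of dataset."""
    gib = used_memory_bytes / float(1024 ** 3)
    if gib <= 0:
        raise ValueError("used_memory_bytes must be positive")
    return (latest_fork_usec / 1000.0) / gib
```

The per-GB figure lines up with the 20 ms per GB warning level listed in the signals table, which makes fork cost comparable across instances of different sizes.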

Replication backlog overflow shows up in INFO stats as sync_full increments and sync_partial_err grows. A brief network blip accumulates more writes than the 1 MB default backlog can hold. The replica reconnects, partial resync fails, and a full resync begins. That resync forks the primary, causing latency. If multiple replicas fall behind simultaneously, the primary enters a fork-latency-resync loop. Increase repl-backlog-size before planned failovers to reduce the chance of full resyncs with downstream replicas.
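Because sync_full and sync_partial_err are cumulative counters, what matters is their change between two samples. A minimal sketch, assuming you collect the raw INFO stats text twice; the function names are illustrative.

```python
from typing import Dict

RESYNC_COUNTERS = ("sync_full", "sync_partial_ok", "sync_partial_err")


def parse_info(raw: str) -> Dict[str, str]:
    """Parse INFO text: 'key:value' lines, with '# Section' headers skipped."""
    fields = {}
    for line in raw.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, sep, value = line.partition(":")
        if sep:
            fields[key] = value
    return fields


def resync_deltas(before_raw: str, after_raw: str) -> Dict[str, int]:
    """Return how much each resync counter grew between two INFO samples."""
    before, after = parse_info(before_raw), parse_info(after_raw)
    return {
        name: int(after.get(name, "0")) - int(before.get(name, "0"))
        for name in RESYNC_COUNTERS
    }
```

A positive sync_full delta right after a network blip is the signature of a backlog that was too small for the outage.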

Cluster slot issues show up as CLUSTERDOWN errors or MOVED and ASK redirects to clients. A node holding a disproportionate share of slots becomes a hot spot. Gossip on port plus 10000 must be reachable between all nodes; missed firewall rules here are the most common cause of cluster_slots_pfail growing.

Tradeoffs and common misuses

Single-threaded command execution buys simplicity and atomicity at the price of a hard throughput ceiling. You cannot add cores to scale command execution. Once main-thread CPU approaches 100% of one core, latency climbs steeply as commands queue behind one another.

jemalloc trades potential fragmentation for allocation performance. Switching to libc malloc eliminates active defragmentation entirely. If your workload has high churn, you need jemalloc and you must monitor fragmentation.

Fork-based persistence buys durability at the cost of memory headroom. RDB gives point-in-time snapshots with minimal runtime overhead but a large fork cost. AOF gives finer granularity but grows unbounded without rewrite, and the rewrite itself requires a fork. Redis 7.0+ uses multi-part AOF with a base file plus incremental deltas to reduce rewrite overhead, but the fork remains.

Replication backlog size trades memory against resync cost. A small backlog saves RAM but guarantees expensive full resyncs after any brief interruption. Production workloads should use at least 100 MB, with write-heavy systems using 512 MB or more.
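One way to arrive at a number is to size the backlog from the write rate and the longest disconnect you want to survive. The sketch below is a back-of-the-envelope calculation, not an official formula: safety_factor is an assumed margin, the function name is illustrative, and the 100 MB floor is the minimum suggested above. The write rate can be estimated from the growth of master_repl_offset between two INFO replication samples.

```python
import math

FLOOR_BYTES = 100 * 1024 * 1024  # the production minimum suggested in this section


def recommended_backlog_bytes(
    write_bytes_per_sec: float,
    outage_seconds: float,
    safety_factor: float = 2.0,
) -> int:
    """Bytes of backlog needed to absorb an outage, never below the floor."""
    needed = math.ceil(write_bytes_per_sec * outage_seconds * safety_factor)
    return max(FLOOR_BYTES, needed)
```

For example, a primary replicating about 5 MiB/s that should tolerate a 60-second disconnect needs roughly 600 MiB with the default margin, which matches the guidance for write-heavy systems.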

Hash tags in cluster mode enable multi-key transactions but create slot-level hot spots. A single hash tag receiving heavy traffic pins all load to one node, negating the benefit of sharding.

Common misuses include running KEYS or FLUSHDB without understanding they block the event loop. KEYS scans the entire keyspace. FLUSHDB without the ASYNC flag deletes every key synchronously. Both appear to work in development and destroy latency in production. Using appendfsync always for maximum durability is another misuse in throughput-sensitive environments. Every write waits for fsync, turning disk latency into command latency. The default everysec is the pragmatic compromise, but it requires monitoring aof_delayed_fsync.
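The usual replacement for KEYS is cursor-based iteration with SCAN, which does a bounded amount of work per call and lets other clients run in between. The sketch below assumes a redis-py style client whose scan(cursor=..., match=..., count=...) method returns a (next_cursor, keys) pair; iter_keys is an illustrative helper, and redis-py also ships its own scan_iter that serves the same purpose.

```python
from typing import Any, Iterator


def iter_keys(client: Any, pattern: str = "*", count: int = 500) -> Iterator[Any]:
    """Yield keys matching pattern in batches instead of one blocking KEYS call.

    SCAN may return the same key more than once, so callers that need
    uniqueness should deduplicate.
    """
    cursor = 0
    while True:
        # count is only a hint for how much work each call does.
        cursor, keys = client.scan(cursor=cursor, match=pattern, count=count)
        for key in keys:
            yield key
        if cursor == 0:  # a zero cursor means the iteration is complete
            break
```

Each SCAN call still runs on the main thread, so keep count modest on a busy instance; the total work is the same as KEYS, but it is spread out so no single call stalls every client.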

Signals to watch in production

| Signal | Why it matters | Warning sign |
| --- | --- | --- |
| used_memory / maxmemory ratio | Proximity to memory limit; eviction or write rejection follows | Ratio > 0.8 trending toward 0.9 |
| mem_fragmentation_ratio | Allocator efficiency; high values waste RAM, low values indicate swap | Sustained > 1.5 or < 0.8 on instances > 100 MB |
| latest_fork_usec | Main thread freeze during RDB/AOF or full resync | > 500 ms, or > 20 ms per GB of dataset |
| evicted_keys rate | Dataset exceeds memory; cache churn costs CPU | Sustained rate above baseline, especially with rising misses |
| sync_full and sync_partial_err | Backlog insufficient; full resyncs cost fork latency | Any increase in sync_full or non-zero sync_partial_err |
| rejected_connections | Hard limit hit; clients are failing immediately | Any rate > 0 |
| Slowlog growth rate | Specific commands blocking the event loop | Repeated entries for KEYS, large SORT, or Lua scripts |
| aof_delayed_fsync | Disk I/O cannot keep up with appendfsync everysec | Rate increasing, indicating growing durability window |
| cluster_state and cluster_slots_fail | Slot coverage determines availability | cluster_state:fail or non-zero cluster_slots_fail |
| connected_clients / maxclients | Connection exhaustion approaching | Ratio > 0.8 |

How Netdata helps

  • Charts used_memory and used_memory_rss together to expose fragmentation or COW spikes.
  • Tracks latest_fork_usec alongside rdb_bgsave_in_progress and aof_rewrite_in_progress to isolate fork freeze from slow commands.
  • Surfaces replication offset lag, sync_full, and master_link_status in one view to spot backlog overflow before it forces full resyncs.
  • Correlates slowlog rate and command latency with CPU saturation to distinguish event loop blocking from core exhaustion.
  • Alerts on aof_delayed_fsync and aof_last_write_status to flag disk I/O pressure before write rejection.