Redis client output buffer overflow: slow consumers and client-output-buffer-limit

Redis memory climbs faster than the dataset justifies. used_memory approaches maxmemory, keys evict, or the process is OOM-killed, yet the keyspace has not grown. Logs show “scheduled to be closed ASAP for overcoming of output buffer limits,” or clients vanish and reconnect. The culprit is usually a slow consumer that cannot drain its output buffer as fast as Redis fills it. A forgotten MONITOR session or an application that left a socket open but stopped reading are the textbook cases. Redis allocates client output buffers from the main heap; unread response data counts against maxmemory. The default client-output-buffer-limit normal 0 0 0 leaves normal clients unbounded, turning one slow reader into a memory leak that can kill the instance.

What this means

Redis maintains an output buffer per client. When the server writes faster than the client reads, the buffer grows. Three independent limit classes control this: normal, pubsub, and replica. Each class takes a hard limit, a soft limit, and a soft-limit duration. Crossing the hard limit disconnects the client immediately. Exceeding the soft limit continuously for the duration also disconnects.
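The hard and soft rules above can be sketched as a small decision function. This is a simplified model, not Redis source code: the name `cobl_verdict` and its argument order are hypothetical, and it assumes a limit of 0 means "disabled", as in the Redis configuration.

```shell
# Simplified model of one limit class (hypothetical helper, not part of Redis).
# Usage: cobl_verdict OMEM HARD SOFT SOFT_SECONDS SECONDS_OVER_SOFT
# All sizes use the same unit; a value of 0 disables that limit.
cobl_verdict() {
    omem=$1 hard=$2 soft=$3 soft_secs=$4 over_secs=$5
    if [ "$hard" -gt 0 ] && [ "$omem" -ge "$hard" ]; then
        # Hard limit: disconnect as soon as it is reached
        echo "disconnect: hard limit"
    elif [ "$soft" -gt 0 ] && [ "$omem" -ge "$soft" ] && [ "$over_secs" -gt "$soft_secs" ]; then
        # Soft limit: disconnect only after staying over it past the duration
        echo "disconnect: soft limit"
    else
        echo "keep"
    fi
}
```

With every limit set to 0, the function always prints `keep`, which is the unbounded behavior of the default normal class.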

The normal class defaults to 0 0 0: no hard limit, no soft limit, no timeout. The Pub/Sub and replica classes ship with non-zero defaults (pubsub 32mb 8mb 60 and replica 256mb 64mb 60), but normal clients can grow without bound. Because the limit is checked only as the buffer grows, a client can accumulate gigabytes of buffered replies before Redis acts. Until a limit is crossed, or indefinitely when none is set, the buffered memory counts against the Redis heap and drives eviction pressure or the OOM killer. This is especially dangerous with MONITOR, which echoes every command, or with a slow replica that cannot keep up.
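As a sketch of bounding the normal class, the helper below builds the value string that `CONFIG SET client-output-buffer-limit` expects, with sizes in bytes. The function name `cobl_value` is hypothetical, and the 256 MB / 64 MB / 60 s figures are illustrative assumptions, not a recommendation from this runbook; size them to your own reply sizes and memory headroom.

```shell
# Build "<class> <hard-bytes> <soft-bytes> <soft-seconds>" from megabyte inputs.
# Usage: cobl_value CLASS HARD_MB SOFT_MB SOFT_SECONDS   (hypothetical helper)
cobl_value() {
    printf '%s %d %d %d\n' "$1" "$(( $2 * 1024 * 1024 ))" "$(( $3 * 1024 * 1024 ))" "$4"
}

# Example, requires a reachable Redis instance (values are assumptions):
#   redis-cli CONFIG SET client-output-buffer-limit "$(cobl_value normal 256 64 60)"
#   redis-cli CONFIG GET client-output-buffer-limit
```

`CONFIG SET` changes the running instance only; the setting is lost on restart unless it is also written to redis.conf or persisted with `CONFIG REWRITE`.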

flowchart TD
    A[Memory spike or client disconnect] --> B{Check CLIENT LIST omem}
    B -->|Normal client| C[Check for MONITOR or slow app]
    B -->|Pub/Sub| D[Check subscriber read loop]
    B -->|Replica| E[Check replica lag and backlog]
    C --> F[Set normal buffer limits]
    D --> G[Fix consumer or shard channels]
    E --> H[Fix replica or adjust replica limit]

Common causes

Cause	What it looks like	First thing to check
`MONITOR` left running	One normal client with `omem` tracking command throughput exactly	`CLIENT LIST | grep cmd=monitor`
Slow Pub/Sub subscriber	One subscriber in `CLIENT LIST TYPE pubsub` with large `omem` while channel counts are stable	Sort `CLIENT LIST TYPE pubsub` by `omem`
Replica falling behind	Primary memory rises; replica shows intermittent `master_link_status:down` and `sync_full` increments	`CLIENT LIST TYPE replica` on the primary, sort by `omem`
Slow application consumer	One or more normal clients with large `omem`; often tied to large reads or a stalled socket drain	`CLIENT LIST TYPE normal`, sort by `omem`
Unbounded normal limit	No client disconnected for buffer growth; memory climbs until eviction or OOM	`CONFIG GET client-output-buffer-limit` returns `normal 0 0 0`

Quick checks

# Check aggregate client memory pressure
redis-cli INFO clients | grep -E "connected_clients|client_recent_max_output_buffer"

# Find the largest normal client output buffers
redis-cli CLIENT LIST TYPE normal | tr ' ' '\n' | grep "^omem=" | sort -t= -k2 -nr | head -10

# Detect an active MONITOR session
redis-cli CLIENT LIST | grep "cmd=monitor"

# Inspect Pub/Sub subscriber buffers
redis-cli CLIENT LIST TYPE pubsub | tr ' ' '\n' | grep "^omem=" | sort -t= -k2 -nr | head -10

# Inspect replica buffers on the primary
redis-cli CLIENT LIST TYPE replica | tr ' ' '\n' | grep "^omem=" | sort -t= -k2 -nr | head -10

# View current output buffer limit policy
redis-cli CONFIG GET client-output-buffer-limit

# Check if memory pressure is already causing evictions
redis-cli INFO stats | grep evicted_keys

# Compare dataset size to overhead to confirm non-data bloat
redis-cli INFO memory | grep -E "used_memory_dataset|used_memory_overhead"
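
The `omem` pipelines above print buffer sizes only, without saying which client owns them. A possible variant keeps the identifying fields next to each size. This is a sketch: `top_omem` is a hypothetical helper name, and it assumes the usual space-separated `key=value` layout of `CLIENT LIST` output.

```shell
# Rank clients by output buffer memory, keeping identifying fields.
# Reads CLIENT LIST output on stdin and prints "omem id addr name cmd",
# largest buffer first. Optional argument: number of rows (default 10).
top_omem() {
    awk 'NF == 0 { next }
    {
        omem = 0; id = "-"; addr = "-"; name = "-"; cmd = "-"
        for (i = 1; i <= NF; i++) {
            eq = index($i, "=")
            if (eq == 0) continue
            key = substr($i, 1, eq - 1); val = substr($i, eq + 1)
            if (key == "omem") omem = val
            else if (key == "id") id = val
            else if (key == "addr") addr = val
            else if (key == "name" && val != "") name = val   # empty name stays "-"
            else if (key == "cmd") cmd = val
        }
        print omem, id, addr, name, cmd
    }' | sort -k1,1nr | head -n "${1:-10}"
}
```

Typical use would be `redis-cli CLIENT LIST TYPE normal | top_omem 5`. A `-` in the name column marks a client that never called CLIENT SETNAME.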

How to diagnose it

Confirm buffers are the source. Compare used_memory to used_memory_dataset; a widening gap points to buffers or fragmentation. If used_memory_overhead grows while key count is flat, client buffers or replication backlogs are the likely cause. Check client_recent_max_output_buffer in INFO clients; a large or rising value means at least one client is accumulating unread replies.
Classify the client type. Run CLIENT LIST TYPE normal, TYPE pubsub, and TYPE replica. Look for the largest omem in each class.
Identify the offender. Note addr, name, and cmd. One client with omem in the hundreds of megabytes is usually the target. If name is empty, adopt CLIENT SETNAME in your applications so future incidents map faster to services.
Determine if limits are unbounded. Run CONFIG GET client-output-buffer-limit. If the normal class is 0 0 0, there is no safety rail.
Correlate with workload. If cmd=monitor, the session echoes every command. If the client is a replica, compare master_repl_offset on the primary to slave_repl_offset on the replica; a widening gap that coincides with omem growth confirms the replica is the bottleneck. If it is a subscriber, check whether the application read loop is stalled.
Assess collateral damage. If used_memory is near maxmemory, check evicted_keys and total_error_replies. Buffer bloat can silently evict data or trigger write rejection before the slow client is disconnected.
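
The first step above, comparing dataset memory to overhead, can be reduced to one line of output. The sketch below parses `INFO memory` text from stdin; `memory_breakdown` is a hypothetical helper name, and the output format is an arbitrary choice.

```shell
# Summarize dataset vs. overhead from INFO memory output on stdin.
# Prints the raw byte counts and overhead as an integer percent of used_memory.
memory_breakdown() {
    # redis-cli emits CRLF line endings for INFO; strip the carriage returns
    tr -d '\r' | awk -F: '
        $1 == "used_memory"          { used = $2 }
        $1 == "used_memory_dataset"  { dataset = $2 }
        $1 == "used_memory_overhead" { overhead = $2 }
        END {
            if (used > 0)
                printf "used=%s dataset=%s overhead=%s overhead_pct=%d\n",
                       used, dataset, overhead, int(overhead * 100 / used)
        }'
}
```

Typical use would be `redis-cli INFO memory | memory_breakdown`, sampled a few minutes apart. A rising overhead_pct with a flat key count supports the buffer hypothesis, though overhead also includes other non-data allocations such as replication backlogs.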

Metrics and signals to monitor

Signal	Why it matters	Warning sign
`CLIENT LIST` `omem`	Direct per-client output buffer size	Any single client `omem` exceeds the hard limit, or grows without bound while the normal limit is zero
`client_recent_max_output_buffer`	Recent peak output buffer across all clients	Sustained growth over minutes
`used_memory` vs `maxmemory`	Buffers compete with the dataset for memory	Ratio approaching 0.9 while client counts are stable
`used_memory_overhead`	Non-data memory including buffers and backlogs	Growth while keyspace size is flat
`evicted_keys` rate	Bloat forces premature eviction	Spike correlated with traffic but not keyspace growth
`master_link_status`	Replicas disconnected by limits appear as replication failures	Intermittent `down` on a replica
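
The `used_memory` vs `maxmemory` row can be turned into a simple threshold check. This is a minimal sketch, assuming the 0.9 ratio from the table as the default; `memory_pressure` and its output labels are hypothetical, and a real alert would also look at the trend over time.

```shell
# Flag memory pressure from INFO memory output on stdin.
# Optional argument: warning threshold in percent (default 90).
memory_pressure() {
    threshold=${1:-90}
    tr -d '\r' | awk -F: -v limit="$threshold" '
        $1 == "used_memory" { used = $2 }
        $1 == "maxmemory"   { max = $2 }
        END {
            # maxmemory 0 means no configured cap, so no ratio exists
            if (max + 0 == 0) { print "UNBOUNDED maxmemory=0"; exit }
            pct = int(used * 100 / max)
            status = (pct >= limit + 0) ? "WARN" : "OK"
            printf "%s used_pct=%d\n", status, pct
        }'
}
```

Typical use would be `redis-cli INFO memory | memory_pressure 90`, run from a cron job or a monitoring agent.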

The Netdata solution

Redis monitoring with Netdata

Netdata monitors Redis with per-second metrics and ML anomaly detection. Track memory usage and fragmentation, fork/COW latency, replication backlog, evictions, and connection pressure to spot the failure modes in these runbooks early.
