Redis stale replica promotion: silent data loss at failover
After a Redis failover, the new primary accepts writes immediately and clients reconnect without errors. Writes that were acknowledged by the old primary but not yet replicated are gone. This is stale replica promotion.
Redis replication is asynchronous by default. The primary applies a write to its in-memory dataset, replies OK to the client, then streams the change to replicas. If the primary fails before a replica receives the outstanding writes, that replica never sees them. Sentinel or Redis Cluster then promotes the best available replica. If that replica is lagging, the writes in the gap are permanently lost. The client receives no error, and nothing in the logs flags the missing writes.
What this means
In a standard primary/replica deployment, Sentinel selects a new primary by evaluating replica priority, replication offset, lag, and run ID. The best candidate is promoted even if it is not fully caught up.
Because the primary acknowledges writes before they reach replicas, there is always a window of unsynced data. Under normal conditions this window is small. Under network congestion, replica saturation, or repeated full resyncs caused by a small replication backlog, lag can grow to megabytes of unreplicated data or several seconds of writes. If the primary fails then, the promoted replica starts from an older state. Acknowledged but un-replicated writes are discarded. There is no rollback mechanism. The application may only notice through data inconsistencies or missing records.
flowchart TD
A[Primary receives write] --> B[Returns OK to client]
B --> C[Replicates to replica async]
C --> D[Replica lags behind]
D --> E[Primary fails]
E --> F[Sentinel promotes lagging replica]
F --> G[Unsynced writes are permanently lost]

Common causes
| Cause | What it looks like | First thing to check |
|---|---|---|
| Async replication without a safety fence | Primary accepts writes while replica is disconnected or lagging | CONFIG GET min-replicas-to-write |
| Replication backlog too small | A brief network blip forces a full resync, extending the lag window | CONFIG GET repl-backlog-size |
| Replica resource saturation | Replica CPU or network cannot keep up with the primary write rate | INFO replication offset delta |
| All replicas lag simultaneously | The least-lagging replica is still behind the old primary at failover | Compare replica offsets to the old primary’s last offset |
Quick checks
These read-only commands assess current exposure. Run them against the primary and each replica.
# Check whether the primary rejects writes when replicas are unavailable.
# A value of 0 means there is no safety fence.
redis-cli CONFIG GET min-replicas-to-write
redis-cli CONFIG GET min-replicas-max-lag
# On the primary: compare master_repl_offset to each replica's offset.
# The difference is the current loss window in bytes.
redis-cli INFO replication | grep -E "master_repl_offset|slave[0-9]"
# On the replica: confirm the link is up and view its offset.
# Note: on a replica, master_repl_offset is the replica's own offset.
redis-cli INFO replication | grep -E "master_repl_offset|master_link_status"
# Check backlog size. If it is too small for your write rate,
# brief disconnects force full resyncs.
redis-cli CONFIG GET repl-backlog-size
# Check for replication instability. Rising sync_full or sync_partial_err
# indicates the replica is struggling to stay in sync.
redis-cli INFO stats | grep -E "sync_full|sync_partial_err"
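The offset comparison above can also be scripted. The following is a minimal sketch, not part of the original checks: `replica_lag` is a hypothetical helper name, and it assumes the `slaveN:ip=...,port=...,state=...,offset=...,lag=...` line format that `INFO replication` prints on a primary. It reads captured output from stdin, so it can be tried on saved text without a live server.

```shell
# replica_lag (hypothetical helper): reads INFO replication output captured
# on the primary from stdin and prints "<ip>:<port> <bytes_behind>" per replica.
replica_lag() {
  # The INFO payload uses CRLF line endings, so strip carriage returns first.
  tr -d '\r' | awk '
    # Offset of the primary, e.g. "master_repl_offset:1048576".
    /^master_repl_offset:/ { master = substr($0, index($0, ":") + 1) }
    # Replica lines, e.g. "slave0:ip=10.0.0.2,port=6379,state=online,offset=1040000,lag=1".
    # They are printed before master_repl_offset, so buffer them until END.
    /^slave[0-9]+:/ { lines[++n] = substr($0, index($0, ":") + 1) }
    END {
      for (i = 1; i <= n; i++) {
        ip = ""; port = ""; off = 0
        m = split(lines[i], kv, ",")
        for (j = 1; j <= m; j++) {
          split(kv[j], p, "=")
          if (p[1] == "ip") ip = p[2]
          else if (p[1] == "port") port = p[2]
          else if (p[1] == "offset") off = p[2]
        }
        # Byte gap between the primary and this replica.
        printf "%s:%s %.0f\n", ip, port, master - off
      }
    }'
}
```

A possible invocation, with a placeholder host name, is `redis-cli -h primary.example INFO replication | replica_lag`. Each output line is one replica and the number of bytes it is currently behind.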
How to diagnose it
Because the data loss is silent, diagnosis is usually either a post-mortem or a preventive assessment.
- Identify the failover timestamp from Sentinel logs or monitoring events.
- Retrieve the old primary’s last known `master_repl_offset` from metrics, an RDB header, or `INFO replication` captured before the failure. If the node is unreachable, check whether your monitoring system recorded the final offset.
- Retrieve the promoted replica’s offset at promotion time from metrics or logs. The byte difference is the data loss window.
- Check whether `min-replicas-to-write` was configured on the old primary. A value of 0 means no write fence was active, so the primary continued accepting writes while replicas were disconnected or lagging.
- Review replication lag history for the 5-15 minutes before failover. Sustained or growing lag confirms the replica was falling behind.
- Examine `sync_partial_err` and `sync_full` counters before the incident. Rising values indicate the replica was repeatedly failing partial syncs, which extends lag during recovery.
- Check application logs for data inconsistencies that correlate with the failover window, such as duplicate keys, missing records, or foreign-key violations in downstream systems.
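The loss-window arithmetic from the steps above is a single subtraction. A minimal sketch, assuming both offsets come from the same replication history (the function name `loss_window_bytes` and the sample offsets are illustrative, not taken from a real incident):

```shell
# loss_window_bytes (hypothetical helper):
#   $1 = last known master_repl_offset of the old primary
#   $2 = offset of the promoted replica at promotion time
# Prints the number of acknowledged bytes the promoted replica never received.
loss_window_bytes() {
  old_primary_offset=$1
  promoted_offset=$2
  if [ "$promoted_offset" -ge "$old_primary_offset" ]; then
    # The replica had caught up with the last recorded offset: no measurable gap.
    echo 0
  else
    echo $(( old_primary_offset - promoted_offset ))
  fi
}
```

For example, `loss_window_bytes 1048576 1040000` reports a gap of 8576 bytes. The result is only a lower bound if the last recorded primary offset predates the actual failure.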
Metrics and signals to monitor
| Signal | Why it matters | Warning sign |
|---|---|---|
| Replication offset lag | Byte gap between primary and replica is the maximum data loss window | Sustained lag larger than your replication backlog, or monotonically growing lag |
| `min-replicas-to-write` / `min-replicas-max-lag` | Safety fence that rejects writes when replicas do not acknowledge | Value is 0 or missing in configuration |
| `master_link_status` | A down link means the replica is not receiving writes | down for more than 60 seconds, or flapping |
| `sync_partial_err` | Failed partial resyncs force expensive full resyncs that increase lag | Non-zero or increasing rate |
| `connected_slaves` | Fewer replicas than expected reduces redundancy and safety | Below expected count for your topology |
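As an illustration of the `master_link_status` warning sign in the table, the sketch below checks a replica’s `INFO replication` output against a threshold. It assumes the `master_link_down_since_seconds` field that Redis reports while the link is down; `link_alert`, its message strings, and the default of 60 seconds are hypothetical choices for this example.

```shell
# link_alert (hypothetical helper): reads a replica's INFO replication output
# from stdin. $1 is the tolerated downtime in seconds (default 60).
# Returns 1 and prints an ALERT line when the link has been down longer than that.
link_alert() {
  threshold=${1:-60}
  tr -d '\r' | awk -F: -v max="$threshold" '
    $1 == "master_link_status" { status = $2 }
    # Only present while the link is down.
    $1 == "master_link_down_since_seconds" { down = $2 }
    END {
      if (status == "down" && down + 0 > max + 0) {
        print "ALERT link down for " down "s"
        exit 1
      }
      if (status == "down") {
        print "WARN link down for " down "s"
        exit 0
      }
      print "OK link " status
    }'
}
```

Run on each replica, for example `redis-cli INFO replication | link_alert 60`, the non-zero exit status can feed a cron job or health check. It does not detect flapping, which needs history rather than a single sample.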
Prevention
Configure a replication safety fence. Set `min-replicas-to-write` to at least 1 and `min-replicas-max-lag` to a value that bounds your acceptable loss window (for example, 10 seconds). The primary will then reject writes when fewer than that many replicas are connected with a lag within the threshold. This trades availability for consistency: if all replicas disconnect, the primary rejects writes with an error until a replica is back within the threshold. Tune the value to your network RTT and replica capacity. A `min-replicas-to-write` value of 0, the default, means no fence. Apply the change with `CONFIG SET` and persist it in `redis.conf` so it survives restarts.

Size the replication backlog. Increase `repl-backlog-size` from the 1 MB default to at least 100 MB, or enough to cover your typical disconnect duration multiplied by peak write throughput. For example, a 10 MB/s write rate and a 10 second blip require 100 MB. An undersized backlog overflows quickly, forcing full resyncs that leave the replica exposed for extended periods.

Use WAIT for critical writes. The `WAIT numreplicas timeout` command blocks until at least `numreplicas` acknowledge the write. For example, `WAIT 1 100` waits up to 100 ms for one replica. If the timeout expires, the command returns the number of replicas that synced, but the write remains on the primary, so the caller must check the returned count. It does not make Redis a CP system.

Monitor offset lag, not just link status. A replication link can report `up` while the replica is megabytes behind. Alert on the byte delta between `master_repl_offset` and the replica’s reported offset. Treat monotonically growing lag, or lag that exceeds your replication backlog, as an incident requiring investigation.

Right-size replica resources. A replica that saturates its CPU, memory bandwidth, or network interface cannot apply writes as fast as the primary. This creates structural lag. Safety fences and backpressure cannot compensate for an under-provisioned replica. Profile the replica’s `used_cpu_sys` and network throughput during peak primary load.
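The backlog sizing rule and the fence settings in this section can be combined into a small worked example. This is a sketch under stated assumptions: `backlog_mb` and `fence_commands` are hypothetical helper names, the fence values (1 replica, 10 seconds) are the examples used in this section rather than recommendations, and the commands are only printed for review, not executed. `CONFIG REWRITE` is used here as one way to persist the change, and it only works when the server was started with a configuration file.

```shell
# backlog_mb (hypothetical helper): suggested repl-backlog-size in MB.
#   $1 = peak write throughput in MB/s
#   $2 = longest disconnect, in seconds, that should still allow a partial resync
backlog_mb() {
  echo $(( $1 * $2 ))
}

# fence_commands (hypothetical helper): prints, without running them, the
# redis-cli commands that apply the example fence and the computed backlog size.
fence_commands() {
  size_mb=$(backlog_mb "$1" "$2")
  cat <<EOF
redis-cli CONFIG SET min-replicas-to-write 1
redis-cli CONFIG SET min-replicas-max-lag 10
redis-cli CONFIG SET repl-backlog-size ${size_mb}mb
redis-cli CONFIG REWRITE
EOF
}
```

With the figures from the sizing example, `fence_commands 10 10` prints a `repl-backlog-size` of `100mb`. Printing instead of executing keeps the change reviewable, since enabling the fence on a primary whose replicas are already lagging causes writes to be rejected immediately.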
How Netdata helps
- Tracks replication offset lag continuously, so you can verify whether the promoted replica was behind at failover time.
- Alerts on `master_link_status:down` and drops in `connected_slaves`.
- Correlates `sync_full` and `sync_partial_err` spikes with system events to surface replication instability.
- Surfaces CPU, memory, and network saturation on replica nodes to explain why lag is growing.
- Retains historical metrics around failover events, enabling post-mortem offset comparison without relying solely on logs.
Related guides
- How Redis actually works in production: a mental model for operators
- Redis aof_last_write_status:err: AOF write failures and recovery
- Redis appendfsync always latency: durability vs throughput trade-offs
- Redis blocked_clients growing: dead consumers vs healthy queues
- Redis BUSY Redis is busy running a script: blocking Lua and how to recover
- Redis Can’t save in background: fork: Cannot allocate memory - diagnosis and fix
- Redis client output buffer overflow: slow consumers and client-output-buffer-limit
- Redis connected_clients climbing: connection leak detection
- Redis connection exhaustion: leaks, pools, and the retry storm
- Redis event loop blocked: when one slow command freezes everything
- Redis eviction policy tuning: allkeys-lru vs volatile-ttl vs noeviction
- Redis fork/COW memory storm: why persistence doubles RSS and OOM-kills the box







