MISCONF Redis is configured to save RDB snapshots - what it means and how to fix it
Every write attempt fails with MISCONF Redis is configured to save RDB snapshots, but it is currently not able to persist on disk. Reads still work, but SET, HSET, LPUSH, and all other mutating commands are rejected.
Redis makes itself read-only when the last background save failed and stop-writes-on-bgsave-error is yes (the default). The error persists until a subsequent BGSAVE succeeds. Retrying writes will not help. The instance is protecting you from accepting writes that can never be persisted. To recover, fix the underlying persistence failure and clear the error state with a successful save.
What this means
Redis persists data to disk by forking a child process that writes the dataset to an RDB file. This happens automatically based on the save directives in redis.conf, or manually via BGSAVE. If the child fails, rdb_last_bgsave_status flips to err and stays there. It is sticky: it remains err across client reconnections until a later save succeeds. A restart resets the in-memory status, but the next background save fails again unless the underlying cause is fixed.
With stop-writes-on-bgsave-error yes (the default), a failed background save stops all writes. If the server cannot snapshot its data, continuing to accept writes creates a silent durability gap. Every write accepted after the failure would be lost in a crash.
The MISCONF error is a safety mechanism, not a configuration mistake. Do not simply disable the setting unless you are running a pure cache that can repopulate from an authoritative source. Resolve why BGSAVE cannot complete.
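The rule described above can be sketched as a small shell function. This is only an illustration of the documented behavior, not Redis's internal logic, and the function name writes_blocked is hypothetical. It assumes you pass in the text of INFO persistence and the current value of stop-writes-on-bgsave-error.

```sh
# Hypothetical helper: exits 0 when the rule described above says writes are blocked.
# $1 = output of `redis-cli INFO persistence`
# $2 = value of stop-writes-on-bgsave-error ("yes" or "no")
writes_blocked() (
    # INFO lines end in \r\n, so strip carriage returns before matching.
    status=$(printf '%s\n' "$1" | tr -d '\r' | sed -n 's/^rdb_last_bgsave_status://p')
    [ "$status" = "err" ] && [ "$2" = "yes" ]
)
```

A call such as writes_blocked "$(redis-cli INFO persistence)" yes would then tell a health check whether clients should expect MISCONF.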
flowchart TD
A[Client sees MISCONF on write] --> B{Check INFO persistence}
B -->|rdb_last_bgsave_status:err| C{Check disk space}
C -->|Full or quota| D[Free disk space]
C -->|Adequate space| E{Check logs and fork}
E -->|fork failed| F[Fix memory or overcommit]
E -->|Permission denied| G[Fix dir or file ownership]
E -->|Child OOM killed| H[Add memory headroom for COW]
D --> I[Trigger manual BGSAVE]
F --> I
G --> I
H --> I
I -->|BGSAVE ok| J[Writes resume]
I -->|BGSAVE fails| K[Re-check root cause]
Common causes
| Cause | What it looks like | First thing to check |
|---|---|---|
| Disk full or I/O errors | rdb_last_bgsave_status:err, disk utilization at 100%, or filesystem remounted read-only. | df -h on the volume holding the Redis dir. |
| Fork failure from memory pressure | Log contains Can't save in background: fork: Cannot allocate memory. Common on large instances or containers with tight memory limits. | vm.overcommit_memory and the gap between used_memory_rss and total RAM. |
| Permission denied on RDB path | The child process cannot open or rename the RDB file in the configured directory. Often after permission changes or SELinux policy updates. | Ownership and mode of the dir and dbfilename path. |
| Child process killed by OOM killer | The forked child exceeds container or host memory limits during copy-on-write and is killed mid-save. | dmesg or kernel logs for OOM killer events targeting the Redis child. |
| Conflicting RDB paths in containers | Multiple Redis instances share a volume and fight over the same dump.rdb file. Common in Docker Compose or misconfigured StatefulSets. | Whether CONFIG GET dir and dbfilename point to unique paths per instance. |
Quick checks
Run these read-only checks to confirm state and narrow the cause.
# Confirm the error state and see when the last save succeeded
redis-cli INFO persistence | grep -E "rdb_last_bgsave_status|rdb_last_save_time|rdb_bgsave_in_progress"
# Check if write errors are accumulating
redis-cli INFO stats | grep total_error_replies
# Check per-error counters on Redis 6.2+
redis-cli INFO errorstats
# Check free disk space on the persistence volume
df -h "$(redis-cli CONFIG GET dir | tail -1)"
# Check ownership and mode of the persistence directory and the RDB file
ls -ld "$(redis-cli CONFIG GET dir | tail -1)" "$(redis-cli CONFIG GET dir | tail -1)/$(redis-cli CONFIG GET dbfilename | tail -1)"
# Check if the kernel allows fork overcommit
cat /proc/sys/vm/overcommit_memory
# Check for recent OOM killer or fork failures in kernel logs
dmesg | grep -iE "oom|fork|redis"
How to diagnose it
1. Verify the sticky error. Run redis-cli INFO persistence and look for rdb_last_bgsave_status:err. If it is ok but writes are still failing, the issue is likely AOF-related or the error is coming from a replica with a different condition. If it is err, note rdb_last_save_time to see how stale the last successful snapshot is.
2. Check disk space. Redis needs enough free space to write a temporary RDB file before atomically renaming it to the final file. Run df -h on the persistence volume. If the volume is full, that is the root cause. Also check inode usage with df -i, especially on small filesystems or ones holding many small files.
3. Inspect logs for fork failures. If disk space is adequate, check the Redis server log and dmesg for messages like fork: Cannot allocate memory or OOM killer events. The Redis child process duplicates page tables during fork(). On a memory-heavy instance, this can fail if vm.overcommit_memory is 0 or if a container memory limit leaves no room for page table overhead.
4. Validate permissions. Ensure the user running redis-server has write access to the directory returned by CONFIG GET dir and can create and rename files there. Do not assume the directory is correct after a migration or container restart.
5. Check for path collisions. If you are running multiple Redis instances on the same host or shared volume, confirm each instance has a unique dir and dbfilename. Two instances writing the same dump.rdb simultaneously will corrupt it or cause one to fail.
6. Assess whether AOF is active. If aof_enabled is 1 and aof_last_write_status is ok, you still have append-only durability while RDB is failing. This does not fix the MISCONF block, but it changes the urgency: you are not fully without persistence. If AOF is also failing, treat this as a critical data-loss risk.
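The log inspection steps above can be partly automated. The following is a minimal sketch that maps a single log line to one of the causes discussed here by matching standard operating system error strings. The function name classify_save_failure and the category labels are hypothetical, and exact log wording can differ between Redis and kernel versions.

```sh
# Hypothetical triage helper: map one Redis or kernel log line to a likely cause.
# It matches standard OS error strings, not version-specific Redis prefixes.
classify_save_failure() {
    case "$1" in
        *"No space left on device"*|*"Read-only file system"*) echo "disk" ;;
        *"Cannot allocate memory"*)                            echo "fork" ;;
        *"Permission denied"*)                                 echo "permissions" ;;
        *"Killed process"*|*"oom-kill"*)                       echo "oom-killer" ;;
        *)                                                     echo "unknown" ;;
    esac
}
```

One way to use it is to feed the function recent lines from the Redis log and dmesg and count the categories with sort and uniq -c. The location of the Redis log depends on your logfile setting.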
Metrics and signals to monitor
| Signal | Why it matters | Warning sign |
|---|---|---|
| rdb_last_bgsave_status | Binary health of the last background save. | err persisting beyond one save interval. |
| Age of rdb_last_save_time | How long since the last successful snapshot. | Older than 2x the configured save interval. |
| Disk free on persistence volume | RDB and AOF need sufficient free space for temp files. | Less than 3x the current dataset size. |
| latest_fork_usec | The main thread is blocked during fork(). | Greater than 500ms, or trending upward with dataset growth. |
| total_error_replies rate | Direct evidence that clients are seeing failures. | Increasing while rdb_last_bgsave_status is err. |
| used_memory_rss vs host memory | Predicts fork failure and COW-driven OOM kills. | RSS exceeding 80% of available RAM. |
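
Several of these thresholds can be evaluated from a single INFO dump. The sketch below is one possible way to do that, not a Netdata or Redis feature. The function names info_field and persistence_warnings are hypothetical, and the save interval and available RAM are passed in by the caller because INFO does not report them. It assumes all four fields are present in the input.

```sh
# Hypothetical helper: print the value of one INFO field.
# $1 = INFO text, $2 = field name
info_field() {
    printf '%s\n' "$1" | tr -d '\r' | sed -n "s/^$2://p" | head -n 1
}

# Print one line per threshold from the table above that is crossed.
# $1 = INFO text, $2 = current Unix time,
# $3 = configured save interval in seconds, $4 = available RAM in bytes
persistence_warnings() {
    pw_age=$(( $2 - $(info_field "$1" rdb_last_save_time) ))
    pw_rss_pct=$(( $(info_field "$1" used_memory_rss) * 100 / $4 ))
    if [ "$(info_field "$1" rdb_last_bgsave_status)" = "err" ]; then
        echo "last background save failed"
    fi
    if [ "$pw_age" -gt $(( 2 * $3 )) ]; then
        echo "last snapshot is ${pw_age}s old"
    fi
    # latest_fork_usec is reported in microseconds; 500 ms = 500000 us.
    if [ "$(info_field "$1" latest_fork_usec)" -gt 500000 ]; then
        echo "last fork took more than 500 ms"
    fi
    if [ "$pw_rss_pct" -gt 80 ]; then
        echo "RSS is ${pw_rss_pct}% of available RAM"
    fi
}
```

For example, persistence_warnings "$(redis-cli INFO)" "$(date +%s)" 300 17179869184 would check an instance with a 300-second save interval on a host with 16 GiB of RAM. Both numbers are placeholders.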
Fixes
Disk full or I/O errors
Free space on the persistence volume. Delete old logs, oversized backups, or orphaned temp files left by interrupted saves. After cleanup, trigger a manual save:
redis-cli BGSAVE
Watch rdb_bgsave_in_progress until it returns to 0, then confirm that rdb_last_bgsave_status reports ok. Do not simply restart Redis; a restart with a full disk will fail to save during shutdown and may leave you with no recent RDB on startup.
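Triggering the save and watching the two fields can be wrapped in a short polling loop. This is a minimal sketch rather than an official procedure. The function name bgsave_and_wait and the REDIS_CLI variable are assumptions introduced here so that you can add host, port, or authentication options.

```sh
# REDIS_CLI is an assumption: override it to add -h, -p, or auth options.
REDIS_CLI=${REDIS_CLI:-redis-cli}

# Start a background save, wait for it to finish, and exit 0 only on success.
bgsave_and_wait() {
    $REDIS_CLI BGSAVE >/dev/null
    # Poll once per second while the child process is still writing the RDB file.
    while $REDIS_CLI INFO persistence | tr -d '\r' | grep -q '^rdb_bgsave_in_progress:1$'; do
        sleep 1
    done
    # The exit status of this pipeline becomes the function's exit status.
    $REDIS_CLI INFO persistence | tr -d '\r' | grep -q '^rdb_last_bgsave_status:ok$'
}
```

Running bgsave_and_wait && echo "writes should resume" gives a quick confirmation after cleanup. If the function returns non-zero, the save failed again and the root cause is still present.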
Fork failure or memory pressure
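If the fork fails with a memory error and the host is not actually out of RAM, the most common cause is vm.overcommit_memory=0. In that mode the kernel's heuristic can refuse the fork because the child nominally needs as much memory as the parent, even though copy-on-write means the two share most pages. Set it to 1 so the kernel allows the fork: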
sysctl -w vm.overcommit_memory=1
Make it permanent in /etc/sysctl.conf. This is a system-wide setting. If Redis runs inside a container, increase the container memory limit to account for copy-on-write overhead. Plan for at least 50% headroom above used_memory_rss on persistent instances; heavy writes during a fork can temporarily double RSS.
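The 50% headroom guideline above is simple arithmetic: the memory limit should be at least 1.5 times used_memory_rss. A minimal sketch of that check follows. The function name has_cow_headroom is hypothetical, and both values are assumed to be in bytes.

```sh
# Hypothetical helper: exits 0 when the limit leaves at least 50% headroom above RSS.
# $1 = used_memory_rss in bytes, $2 = container or host memory limit in bytes
has_cow_headroom() {
    [ "$2" -ge $(( $1 + $1 / 2 )) ]
}
```

By this rule, an instance with 4 GiB of RSS needs a limit of at least 6 GiB. The check only tests the guideline; it does not measure actual copy-on-write usage during a save.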
If freeing memory or adding capacity is not immediately possible and you need writes to resume, see the temporary workarounds below. Only use them if you understand the durability tradeoff.
Permission or ownership issues
Fix ownership so the Redis process can write to its configured directory:
chown redis:redis /var/lib/redis
If you use SELinux or AppArmor, check for denials in the audit log and adjust the profile. After fixing permissions, run BGSAVE and verify rdb_last_bgsave_status returns ok.
Path collisions in containers
If multiple instances target the same dump.rdb, reconfigure each instance with a unique dir or unique dbfilename, then restart the affected instances. Restarting is acceptable here because the root cause is configuration, not resource exhaustion.
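Before reconfiguring, it can help to list which instances actually collide. The sketch below assumes you have already collected one line per instance in the form "name dir dbfilename", for example from CONFIG GET dir and CONFIG GET dbfilename on each instance. The function name find_rdb_collisions and the input format are hypothetical, and paths containing spaces are not handled.

```sh
# Hypothetical helper: read "instance dir dbfilename" lines on stdin and
# print every RDB path that more than one instance would write to.
find_rdb_collisions() {
    awk '{ path = $2 "/" $3; count[path]++ }
         END { for (p in count) if (count[p] > 1) print p }' | sort
}
```

An empty result means every instance in the list has its own RDB path.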
Temporary workarounds and their tradeoffs
If you must unblock writes immediately and you are running a pure cache workload where data can be reconstructed from another source, you can disable the write block:
redis-cli CONFIG SET stop-writes-on-bgsave-error no
This makes the instance accept writes again even though RDB saves are failing. Your data loss window widens to the time since the last successful save. If you do this, also call CONFIG REWRITE to persist the change, or it will revert on restart.
Alternatively, if you rely on AOF for durability and do not need RDB snapshots at all, you can disable automatic RDB saves:
redis-cli CONFIG SET save ""
Again, follow with CONFIG REWRITE if you want the change to survive restart.
Do not use these workarounds on primary databases where RDB is the only persistence mechanism. The correct path is to fix the underlying failure and then run a successful BGSAVE.
Prevention
- Monitor disk space proactively. Maintain at least 3x the dataset size as free space on the persistence volume to accommodate temp files, AOF rewrites, and RDB snapshots simultaneously.
- Set vm.overcommit_memory=1 on every Redis host. Fork failures on healthy machines are almost always caused by this setting.
- Size memory for copy-on-write. If you use RDB or AOF rewrite, keep used_memory_rss below roughly 50% of physical RAM on persistent instances. On cache-only instances, 75% is a safer ceiling.
- Audit persistence completion. Do not assume that because save directives exist, saves are succeeding. Alert on rdb_last_bgsave_status:err and on stale rdb_last_save_time.
- Validate container and orchestration configs. Ensure each Redis pod or container has a distinct persistent volume and does not share a dump.rdb path with another instance.
How Netdata helps
- Netdata collects rdb_last_bgsave_status and rdb_last_save_time from INFO persistence, so you can alert on a sticky err before application write failures spike.
- Disk space and memory metrics on the same node are correlated with Redis persistence health, making it faster to distinguish a disk-full incident from a fork-failure incident.
- total_error_replies and throughput charts let you confirm the exact moment writes started failing and whether the failure correlates with a scheduled BGSAVE.
- RSS and used_memory are plotted together, which helps you forecast when copy-on-write overhead during a fork will exceed available RAM.
- Fork duration (latest_fork_usec) is tracked over time, revealing when Transparent Huge Pages or memory pressure are slowing the background save process before it fails entirely.







