MISCONF Redis is configured to save RDB snapshots - what it means and how to fix it

On every write, applications see the error "MISCONF Redis is configured to save RDB snapshots, but it is currently not able to persist on disk". Reads still work, but SET, HSET, LPUSH, and all other mutating commands are rejected.

Redis makes itself read-only when the last background save failed and stop-writes-on-bgsave-error is yes (the default). The error persists until a subsequent BGSAVE succeeds. Retrying writes will not help. The instance is protecting you from accepting writes that can never be persisted. To recover, fix the underlying persistence failure and clear the error state with a successful save.

What this means

Redis persists data to disk by forking a child process that writes the dataset to an RDB file. This happens automatically based on the save directives in redis.conf, or manually via BGSAVE. If the child fails, rdb_last_bgsave_status flips to err and stays there. It is sticky: it remains err across client reconnections and write retries until a later background save succeeds.

With stop-writes-on-bgsave-error yes (the default), a failed background save stops all writes. If the server cannot snapshot its data, continuing to accept writes creates a silent durability gap. Every write accepted after the failure would be lost in a crash.

The MISCONF error is a safety mechanism, not a configuration mistake. Do not simply disable the setting unless you are running a pure cache that can repopulate from an authoritative source. Resolve why BGSAVE cannot complete.
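The rule described above can be sketched as a small shell check. This is a simplified model built only from the two inputs the text names; the function names info_field and writes_blocked are hypothetical helpers, not part of redis-cli, and the real server may consider additional conditions.

```shell
#!/bin/sh
# Hypothetical helper: read INFO output on stdin and print one field's value.
# INFO lines end in CRLF, so the carriage return is stripped before matching.
info_field() {
    tr -d '\r' | awk -F: -v key="$1" '$1 == key { print $2; exit }'
}

# Hypothetical helper: succeed when writes would be rejected with MISCONF.
# $1 = rdb_last_bgsave_status (ok or err)
# $2 = stop-writes-on-bgsave-error (yes or no)
writes_blocked() {
    [ "$1" = "err" ] && [ "$2" = "yes" ]
}
```

Against a live instance, the status could be obtained with redis-cli INFO persistence | info_field rdb_last_bgsave_status and the setting with redis-cli CONFIG GET stop-writes-on-bgsave-error | tail -1.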

flowchart TD
    A[Client sees MISCONF on write] --> B{Check INFO persistence}
    B -->|rdb_last_bgsave_status:err| C{Check disk space}
    C -->|Full or quota| D[Free disk space]
    C -->|Adequate space| E{Check logs and fork}
    E -->|fork failed| F[Fix memory or overcommit]
    E -->|Permission denied| G[Fix dir or file ownership]
    E -->|Child OOM killed| H[Add memory headroom for COW]
    D --> I[Trigger manual BGSAVE]
    F --> I
    G --> I
    H --> I
    I -->|BGSAVE ok| J[Writes resume]
    I -->|BGSAVE fails| K[Re-check root cause]

Common causes

Cause	What it looks like	First thing to check
Disk full or I/O errors	`rdb_last_bgsave_status:err`, disk utilization at 100%, or filesystem remounted read-only.	`df -h` on the volume holding the Redis `dir`.
Fork failure from memory pressure	Log contains `Can't save in background: fork: Cannot allocate memory`. Common on large instances or containers with tight memory limits.	`vm.overcommit_memory` and the gap between `used_memory_rss` and total RAM.
Permission denied on RDB path	The child process cannot open or rename the RDB file in the configured directory. Often after permission changes or SELinux policy updates.	Ownership and mode of the `dir` and `dbfilename` path.
Child process killed by OOM killer	The forked child exceeds container or host memory limits during copy-on-write and is killed mid-save.	`dmesg` or kernel logs for OOM killer events targeting the Redis child.
Conflicting RDB paths in containers	Multiple Redis instances share a volume and fight over the same `dump.rdb` file. Common in Docker Compose or misconfigured StatefulSets.	Whether `CONFIG GET dir` and `dbfilename` point to unique paths per instance.

Quick checks

Run these read-only checks to confirm state and narrow the cause.

# Confirm the error state and see when the last save succeeded
redis-cli INFO persistence | grep -E "rdb_last_bgsave_status|rdb_last_save_time|rdb_bgsave_in_progress"

# Check if write errors are accumulating
redis-cli INFO stats | grep total_error_replies

# Check per-error counters on Redis 6.2+
redis-cli INFO errorstats

# Check free disk space on the persistence volume
df -h "$(redis-cli CONFIG GET dir | tail -1)"

# Check RDB file and directory ownership
ls -ld "$(redis-cli CONFIG GET dir | tail -1)"
ls -l "$(redis-cli CONFIG GET dir | tail -1)/$(redis-cli CONFIG GET dbfilename | tail -1)"

# Check if the kernel allows fork overcommit
cat /proc/sys/vm/overcommit_memory

# Check for recent OOM killer or fork failures in kernel logs
dmesg | grep -iE "oom|fork|redis"

How to diagnose it

Verify the sticky error. Run redis-cli INFO persistence and look for rdb_last_bgsave_status:err. If it is ok but writes are still failing, the issue is likely AOF-related or the error is coming from a replica with a different condition. If it is err, note rdb_last_save_time to see how stale the last successful snapshot is.
Check disk space. Redis needs enough free space to write a temporary RDB file before atomically renaming it to the final file. Run df -h on the persistence volume. If the volume is full, that is almost certainly the root cause. Also check inode usage with df -i, since a filesystem can run out of inodes while still showing free blocks.
Inspect logs for fork failures. If disk space is adequate, check the Redis server log and dmesg for messages like fork: Cannot allocate memory or OOM killer events. During fork(), the kernel copies the parent's page tables for the child. On a memory-heavy instance, this can fail if vm.overcommit_memory is 0 or if a container memory limit leaves no room for page table overhead.
Validate permissions. Ensure the user running redis-server has write access to the directory returned by CONFIG GET dir and can create and rename files there. Do not assume the directory is correct after a migration or container restart.
Check for path collisions. If you are running multiple Redis instances on the same host or shared volume, confirm each instance has a unique dir and dbfilename. Two instances writing the same dump.rdb simultaneously will corrupt it or cause one to fail.
Assess whether AOF is active. If aof_enabled is 1 and aof_last_write_status is ok, you still have append-only durability while RDB is failing. This does not fix the MISCONF block, but it changes the urgency: you are not fully without persistence. If AOF is also failing, treat this as a critical data-loss risk.
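The log-inspection steps above can be sketched as a classifier that maps one log line to a likely cause. The function name classify_save_failure and the category labels are hypothetical. The matched substrings are the standard Linux error strings and typical kernel OOM wording, which is an assumption about how these failures appear in your logs; only the fork message is quoted from this guide.

```shell
#!/bin/sh
# Hypothetical helper: map a Redis or kernel log line to a likely root cause.
# The patterns are assumptions based on common Linux error text.
classify_save_failure() {
    case "$1" in
        *"No space left on device"*)          echo "disk-full" ;;
        *"Read-only file system"*)            echo "disk-readonly" ;;
        *"Cannot allocate memory"*)           echo "fork-memory" ;;
        *"Permission denied"*)                echo "permissions" ;;
        *"Out of memory: Kill"*|*"oom-kill"*) echo "oom-killed" ;;
        *)                                    echo "unknown" ;;
    esac
}
```

A line such as the fork failure quoted earlier would be classified as fork-memory, which points at the overcommit and memory-limit checks rather than at disk space.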

Metrics and signals to monitor

Signal	Why it matters	Warning sign
`rdb_last_bgsave_status`	Binary health of the last background save.	`err` persisting beyond one save interval.
Age of `rdb_last_save_time`	How long since the last successful snapshot.	Older than 2x the configured `save` interval.
Disk free on persistence volume	RDB and AOF need sufficient free space for temp files.	Less than 3x the current dataset size.
`latest_fork_usec`	The main thread is blocked during `fork()`.	Greater than 500 ms (a value above 500000, since the field is in microseconds), or trending upward with dataset growth.
`total_error_replies` rate	Direct evidence that clients are seeing failures.	Increasing while `rdb_last_bgsave_status` is `err`.
`used_memory_rss` vs host memory	Predicts fork failure and COW-driven OOM kills.	RSS exceeding 80% of available RAM.
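Two of the thresholds in the table are simple arithmetic and can be sketched as shell predicates. The function names are hypothetical, and the save interval is something you supply from your own save directives; nothing here reads it from Redis.

```shell
#!/bin/sh
# Hypothetical helper: succeed when the last snapshot is older than 2x the save interval.
# $1 = rdb_last_save_time (Unix seconds)
# $2 = current time (Unix seconds)
# $3 = configured save interval in seconds
snapshot_is_stale() {
    age=$(( $2 - $1 ))
    [ "$age" -gt $(( 2 * $3 )) ]
}

# Hypothetical helper: succeed when the last fork took longer than 500 ms.
# $1 = latest_fork_usec (microseconds)
fork_is_slow() {
    [ "$1" -gt 500000 ]
}
```

For example, snapshot_is_stale "$last_save" "$(date +%s)" 300 would flag a snapshot older than ten minutes on an instance expected to save every five.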

Fixes

Disk full or I/O errors

Free space on the persistence volume. Delete old logs, oversized backups, or orphaned temp files left by interrupted saves. After cleanup, trigger a manual save:

redis-cli BGSAVE

Watch INFO persistence until rdb_bgsave_in_progress returns to 0 and rdb_last_bgsave_status reads ok. Do not simply restart Redis; a restart with a full disk will fail to save during shutdown and may leave you with no recent RDB on startup.
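Waiting for the save to finish can be scripted as a polling loop. The sketch below assumes redis-cli is on the PATH and reaches the right instance without extra flags; persistence_info, persistence_field, and wait_for_bgsave are hypothetical names, and the defaults of 60 polls at one-second intervals are arbitrary.

```shell
#!/bin/sh
# Thin wrapper so the polling logic does not hard-code how INFO is fetched.
persistence_info() {
    redis-cli INFO persistence
}

# Print the value of one INFO persistence field (CRLF line endings stripped).
persistence_field() {
    persistence_info | tr -d '\r' | awk -F: -v key="$1" '$1 == key { print $2; exit }'
}

# Poll until no background save is running, then report the result.
# $1 = maximum number of polls (default 60), $2 = seconds between polls (default 1)
# Returns 0 if the last save succeeded, 1 if it failed, 2 if still running at timeout.
wait_for_bgsave() {
    tries=${1:-60}
    delay=${2:-1}
    while [ "$tries" -gt 0 ]; do
        if [ "$(persistence_field rdb_bgsave_in_progress)" = "0" ]; then
            if [ "$(persistence_field rdb_last_bgsave_status)" = "ok" ]; then
                return 0
            fi
            return 1
        fi
        tries=$(( tries - 1 ))
        sleep "$delay"
    done
    return 2
}
```

A typical sequence would be redis-cli BGSAVE followed by wait_for_bgsave, treating a return code of 1 as a signal to go back to root-cause analysis.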

Fork failure or memory pressure

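If the fork fails with a memory error and the host is not actually out of RAM, the most common cause is vm.overcommit_memory=0. In that mode the kernel's heuristic can refuse the fork because the child could, in the worst case, need as much memory as the parent. Set it to 1 so the kernel allows the fork and relies on copy-on-write to keep actual usage low: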

sysctl -w vm.overcommit_memory=1

Make it permanent by adding vm.overcommit_memory = 1 to /etc/sysctl.conf. This is a system-wide setting, so it affects every process on the host. If Redis runs inside a container, increase the container memory limit to account for copy-on-write overhead. Plan for at least 50% headroom above used_memory_rss on persistent instances; heavy writes during a fork can temporarily double RSS.
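The 50% headroom guideline is easy to check numerically. The sketch below uses a hypothetical function name, has_cow_headroom, and expects both values in bytes; where the memory limit comes from (host RAM or a container limit) is left to you.

```shell
#!/bin/sh
# Hypothetical helper: succeed when the memory limit leaves at least 50% headroom over RSS.
# $1 = used_memory_rss in bytes
# $2 = memory limit in bytes (host RAM or container limit)
has_cow_headroom() {
    needed=$(( $1 + $1 / 2 ))
    [ "$needed" -le "$2" ]
}
```

For instance, an instance with 4 GiB of RSS would need a limit of at least 6 GiB to pass. Because heavy writes can temporarily double RSS, passing this check lowers the risk of an OOM kill during a save but does not eliminate it.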

If freeing memory or adding capacity is not immediately possible and you need writes to resume, see the temporary workarounds below. Only use them if you understand the durability tradeoff.

Permission or ownership issues

Fix ownership so the Redis process can write to its configured directory:

chown redis:redis /var/lib/redis

If you use SELinux or AppArmor, check for denials in the audit log and adjust the profile. After fixing permissions, run BGSAVE and verify rdb_last_bgsave_status returns ok.

Path collisions in containers

If multiple instances target the same dump.rdb, reconfigure each instance with a unique dir or unique dbfilename, then restart the affected instances. Restarting is acceptable here because the root cause is configuration, not resource exhaustion.
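Before restarting, it may help to confirm which paths actually collide. The sketch below assumes you have already collected one line per instance in the form "name dir dbfilename" (for example from CONFIG GET on each instance); find_rdb_collisions is a hypothetical helper, and it does not handle paths containing spaces.

```shell
#!/bin/sh
# Hypothetical helper: read "name dir dbfilename" lines on stdin and print
# every RDB path that is claimed by more than one instance.
find_rdb_collisions() {
    awk '{ path = $2 "/" $3; count[path]++ }
         END { for (p in count) if (count[p] > 1) print p }' | sort
}
```

Empty output means every instance has its own RDB path; any printed path needs a distinct dir or dbfilename on all but one of the instances that share it.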

Temporary workarounds and their tradeoffs

If you must unblock writes immediately and you are running a pure cache workload where data can be reconstructed from another source, you can disable the write block:

redis-cli CONFIG SET stop-writes-on-bgsave-error no

This makes the instance accept writes again even though RDB saves are failing. Everything written since the last successful save is now at risk, and that window keeps growing until saves succeed again. If you do this, also call CONFIG REWRITE to persist the change, or it will revert on restart.

Alternatively, if you rely on AOF for durability and do not need RDB snapshots at all, you can disable automatic RDB saves:

redis-cli CONFIG SET save ""

Again, follow with CONFIG REWRITE if you want the change to survive restart.

Do not use these workarounds on primary databases where RDB is the only persistence mechanism. The correct path is to fix the underlying failure and then run a successful BGSAVE.

Prevention

Monitor disk space proactively. Maintain at least 3x the dataset size as free space on the persistence volume to accommodate temp files, AOF rewrites, and RDB snapshots simultaneously.
Set vm.overcommit_memory=1 on every Redis host. Fork failures on healthy machines are almost always caused by this setting.
Size memory for copy-on-write. If you use RDB or AOF rewrite, keep used_memory_rss below roughly 50% of physical RAM on persistent instances. On cache-only instances, a higher ceiling of about 75% is reasonable.
Audit persistence completion. Do not assume that because save directives exist, saves are succeeding. Alert on rdb_last_bgsave_status:err and on stale rdb_last_save_time.
Validate container and orchestration configs. Ensure each Redis pod or container has a distinct persistent volume and does not share a dump.rdb path with another instance.
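The disk-space guideline in this list can be sketched as a check suitable for a cron job or health probe. The names free_kb and disk_headroom_ok are hypothetical, and "dataset size" is taken here to be whatever figure you use for it (for example the current RDB file size or used_memory), which is an assumption rather than something this guide defines.

```shell
#!/bin/sh
# Hypothetical helper: print available space in KiB on the volume holding a path.
# Uses POSIX df output, where the fourth column of the second line is "Available".
free_kb() {
    df -Pk "$1" | awk 'NR == 2 { print $4 }'
}

# Hypothetical helper: succeed when free space is at least 3x the dataset size.
# $1 = free space, $2 = dataset size (same unit for both)
disk_headroom_ok() {
    [ "$1" -ge $(( 3 * $2 )) ]
}
```

A call such as disk_headroom_ok "$(free_kb /var/lib/redis)" "$dataset_kb" would fail once free space drops below three times the dataset, which is the point at which the guide suggests raising a warning.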

How Netdata helps

Netdata collects rdb_last_bgsave_status and rdb_last_save_time from INFO persistence, so you can alert on a sticky err before application write failures spike.
Disk space and memory metrics on the same node are correlated with Redis persistence health, making it faster to distinguish a disk-full incident from a fork-failure incident.
total_error_replies and throughput charts let you confirm the exact moment writes started failing and whether the failure correlates with a scheduled BGSAVE.
RSS and used_memory are plotted together, which helps you forecast when copy-on-write overhead during a fork will exceed available RAM.
Fork duration (latest_fork_usec) is tracked over time, revealing when Transparent Huge Pages or memory pressure are slowing the background save process before it fails entirely.
