Redis LOADING Redis is loading the dataset in memory - why and how long
What this means
When INFO persistence returns loading:1, Redis is reading an RDB dump or replaying an AOF log into memory. This happens after any restart where persistence files are present. While loading, Redis accepts connections but rejects most commands with a -LOADING error, including PING and data commands such as GET, SET, and HGETALL; INFO is still served, which is why the progress fields remain readable. The loading_loaded_perc, loading_loaded_bytes, and loading_eta_seconds fields expose real-time progress. While loading is active, latency, hit rate, and eviction metrics are meaningless. Load duration ranges from seconds for small RDB snapshots to tens of minutes, or even hours, for large AOF files on slow disks.
If the instance is a replica, the LOADING error can also appear when the replica reconnects to a master that is itself in the loading phase. In that case, the replica is healthy; the master is not ready for replication connections. Other symptoms, such as 100% cache misses or replica disconnects, are side effects of the same root cause. The loading state is expected after restart, but if it persists far beyond your baseline, it indicates slow disk I/O, an unexpectedly large persistence file, or corruption causing Redis to hang or abort.
flowchart TD
A[Client receives -LOADING error] --> B[Check INFO persistence]
B --> C{loading:1?}
C -->|Yes| D{Is uptime low?}
D -->|Yes| E[Expected startup restore]
D -->|No| F[Unexpected reload or hung process]
C -->|No| G[Replica reading from a loading master]
E --> H{Progress advancing?}
H -->|Yes| I[Monitor disk I/O and ETA]
H -->|No| J[Run redis-check-aof and redis-check-rdb]

Common causes
| Cause | What it looks like | First thing to check |
|---|---|---|
| Normal RDB restore after restart | loading:1 after an uptime reset; loading_loaded_perc climbs steadily | INFO persistence progress fields |
| Normal AOF restore after restart | Load duration is much longer than RDB baseline; progress advances slowly | AOF file size in your Redis data directory |
| Corrupt AOF preventing startup | Redis aborts with a bad file format error, or load hangs indefinitely | redis-check-aof output on the AOF file |
| Corrupt RDB preventing startup | Startup fails after partial RDB load | redis-check-rdb on the dump file |
| Slow disk I/O | loading_eta_seconds grows or progress stalls; disk latency spikes | OS-level disk metrics |
| Replica connecting to loading master | Replica logs show LOADING errors but its own loading is 0 | Master’s INFO persistence state |
Quick checks
# Check loading state and progress
redis-cli INFO persistence | grep -E "loading:|loading_loaded_perc|loading_eta_seconds|loading_loaded_bytes"
# Confirm recent restart
redis-cli INFO server | grep uptime_in_seconds
# Determine which persistence method is active
redis-cli INFO persistence | grep -E "aof_enabled|rdb_last_save_time"
# Check persistence file sizes
ls -lh <your-redis-data-directory>
# Verify AOF file integrity
redis-check-aof <path-to-aof>
# Verify RDB file integrity
redis-check-rdb <path-to-rdb>
Replace <your-redis-data-directory> and file paths with the locations configured for your deployment.
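The progress fields from the first check can be folded into a one-line summary. The following is a minimal sketch rather than an official tool: the helper names info_field and loading_summary are hypothetical, and it assumes INFO output arrives as CRLF-terminated name:value lines.

```shell
#!/usr/bin/env bash
# info_field NAME: read INFO-style "name:value" lines on stdin and print
# the value of NAME. INFO lines end in CRLF, so "\r" is stripped first.
info_field() {
  tr -d '\r' | awk -F: -v key="$1" '$1 == key { print $2; exit }'
}

# loading_summary: print a one-line progress summary built from
# INFO persistence output supplied on stdin.
loading_summary() {
  local info
  info=$(cat)
  if [ "$(printf '%s\n' "$info" | info_field loading)" = "1" ]; then
    printf 'loading %s%% eta %ss\n' \
      "$(printf '%s\n' "$info" | info_field loading_loaded_perc)" \
      "$(printf '%s\n' "$info" | info_field loading_eta_seconds)"
  else
    printf 'not loading\n'
  fi
}

# Usage (assumes redis-cli reaches the instance with default settings):
#   redis-cli INFO persistence | loading_summary
```

Reading from stdin keeps the parsing separate from the connection details, so the same helper works with whatever host, port, or authentication flags your deployment needs.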
How to diagnose it
- Verify loading:1 in INFO persistence. If it is 0 and a replica reports LOADING, the master is still restoring. Diagnose the master.
- Check uptime_in_seconds in INFO server. A low value confirms a recent restart. A high value with loading:1 suggests an unexpected internal reload or a hung process.
- Inspect progress fields. loading_loaded_perc should increase and loading_eta_seconds should trend toward zero. If either is stuck, the load is not progressing.
- Identify the persistence source. If aof_enabled is 1 and the AOF file exists, Redis is replaying AOF. If only RDB is configured, it is loading the RDB snapshot. AOF replay is sequential and typically much slower than loading an RDB snapshot because it replays every write operation.
- Compare duration to your baseline. RDB load time is roughly proportional to dataset size divided by disk read throughput. AOF load time depends on the number of write operations logged, so a high-write workload with a long AOF history takes significantly longer.
- If progress is stalled or Redis failed to start, run redis-check-aof on the AOF file or redis-check-rdb on the dump file. Look for corruption, truncation, or bad format errors.
- Check OS-level disk I/O metrics. If disk latency or queue depth is elevated, the storage subsystem is the bottleneck. In virtualized environments, also check for hypervisor-level I/O throttling.
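The first steps of this procedure can be sketched as small shell helpers. All three function names are hypothetical, the 600-second "recent restart" threshold is an arbitrary assumption, and the load-time figure is only a rough lower bound based on the size-over-throughput rule of thumb above.

```shell
#!/usr/bin/env bash
# classify_loading LOADING UPTIME_SECONDS [RECENT_RESTART_SECONDS]
# Checks the loading flag first, then uptime.
classify_loading() {
  local loading=$1 uptime=$2 recent=${3:-600}  # 600 s is an assumed threshold
  if [ "$loading" != "1" ]; then
    echo "not-loading: if clients still see LOADING, check the master"
  elif [ "$uptime" -lt "$recent" ]; then
    echo "startup-restore: expected, verify progress is advancing"
  else
    echo "unexpected: high uptime with loading:1, suspect reload or hang"
  fi
}

# progress_advancing PREV_BYTES CURR_BYTES
# Succeeds when loading_loaded_bytes grew between two samples.
progress_advancing() {
  [ "$2" -gt "$1" ]
}

# estimate_rdb_load_seconds SIZE_BYTES READ_BYTES_PER_SECOND
# Rough lower bound: file size divided by disk read throughput, rounded up.
estimate_rdb_load_seconds() {
  echo $(( ($1 + $2 - 1) / $2 ))
}
```

For example, a 10 GiB dump read at 200 MiB/s gives `estimate_rdb_load_seconds 10737418240 209715200`, which prints 52. Treat that as a floor, since parsing and allocating the keys adds time on top of raw disk reads.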
Metrics and signals to monitor
| Signal | Why it matters | Warning sign |
|---|---|---|
| loading | Binary state of startup restoration | Remains 1 for longer than your baseline |
| loading_loaded_perc | Percentage of file loaded | Stuck or advancing slower than expected |
| loading_eta_seconds | Estimated time to completion | Growing instead of shrinking |
| loading_loaded_bytes | Bytes loaded so far | Not increasing between checks |
| uptime_in_seconds | Time since restart | Low value confirms cold start |
| Disk I/O latency | Loading is disk-bound | Sustained high latency or queue depth |
Fixes
Normal restore: wait
If loading_loaded_perc is advancing and disk I/O is healthy, wait. Do not restart Redis; a restart resets progress to zero and begins the load again. If a load balancer or orchestrator sits in front of Redis, ensure its health check treats loading:1 as not-ready. Redis accepts TCP connections during loading and answers PING with a -LOADING error reply rather than a timeout, so a naive probe that only checks for an open port, or for any reply at all, will falsely report the instance as healthy. A proper readiness check should attempt a lightweight data command and treat -LOADING as not-ready, or parse INFO persistence directly.
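A readiness check that parses INFO persistence could look like the sketch below. The function name is_ready and the REDIS_HOST and REDIS_PORT variables are placeholders, not part of any standard tooling.

```shell
#!/usr/bin/env bash
# is_ready: read INFO persistence output on stdin and succeed only when a
# "loading:0" line is present. Empty input (for example, a failed
# connection) counts as not ready.
is_ready() {
  tr -d '\r' | grep -qx 'loading:0'
}

# Probe usage (REDIS_HOST and REDIS_PORT are placeholders):
#   redis-cli -h "$REDIS_HOST" -p "$REDIS_PORT" INFO persistence | is_ready
```

Matching the whole line with `grep -x` prevents other fields whose names contain "loading" from being mistaken for the state flag.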
Slow AOF restore: reduce startup time
If AOF replay is too slow for your startup time requirements, consider switching to RDB snapshots or the RDB-AOF hybrid mode for faster restarts. The tradeoff is coarser durability: RDB snapshots can lose more data on crash than AOF. You can also trigger an AOF rewrite before planned restarts to compact the AOF file and reduce replay time.
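One way to compact the AOF before a planned restart is sketched below. It assumes redis-cli reaches the instance with default host, port, and no authentication; the function names are hypothetical, and the one-second polling interval is arbitrary.

```shell
#!/usr/bin/env bash
# rewrite_pending: read INFO persistence on stdin and succeed while an AOF
# rewrite is running or has been scheduled behind another background job.
rewrite_pending() {
  tr -d '\r' | grep -Eqx 'aof_rewrite_(in_progress|scheduled):1'
}

# compact_aof_before_restart: trigger a rewrite, wait for it to finish,
# then print the status line so the operator can confirm it succeeded.
compact_aof_before_restart() {
  redis-cli BGREWRITEAOF
  while redis-cli INFO persistence | rewrite_pending; do
    sleep 1
  done
  redis-cli INFO persistence | tr -d '\r' | grep -E '^aof_last_bgrewrite_status:'
}

# Hybrid mode is controlled by the aof-use-rdb-preamble directive, e.g.:
#   redis-cli CONFIG SET aof-use-rdb-preamble yes
```

Only restart once the status line reports ok; a failed rewrite leaves the old, longer AOF in place.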
Corrupt AOF: repair and truncate
If redis-check-aof reports corruption, run redis-check-aof --fix <file>. This is destructive: it discards everything from the corruption point to the end of the file. After running it, restart Redis. If aof-load-truncated is enabled (the default since Redis 3.0), Redis already handles cleanly truncated AOF files automatically. Byte-level corruption in the middle of the file, however, requires manual repair and results in data loss from the corruption point forward.
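Because the repair is destructive, it can be worth copying the file aside first. The helper below is a sketch under two assumptions: backup_file is a hypothetical name, and the AOF is a single regular file (a multi-file AOF directory would need a recursive copy instead).

```shell
#!/usr/bin/env bash
# backup_file PATH: copy PATH to a timestamped sibling, preserving mode and
# timestamps, and print the backup path on success.
backup_file() {
  local src=$1 dest
  dest="${src}.bak.$(date +%Y%m%d%H%M%S)"
  cp -p -- "$src" "$dest" && printf '%s\n' "$dest"
}

# Usage with the placeholder path from the quick checks, Redis stopped:
#   backup_file <path-to-aof> && redis-check-aof --fix <path-to-aof>
```

If the repaired file turns out to have lost more than expected, the backup still holds the original bytes for a second attempt or for manual inspection.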
Corrupt RDB: diagnose and restore
Use redis-check-rdb to confirm RDB corruption. If AOF is enabled and the AOF file is healthy, Redis loads the AOF at startup and ignores the RDB file, which bypasses the corrupt dump. If only RDB exists and it is corrupt, restore from a backup RDB file or start with an empty dataset, accepting data loss. To start empty, move or delete the corrupt RDB file before restarting Redis.
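Moving the dump aside, rather than deleting it, keeps the option of later analysis. The sketch below uses a hypothetical function name and assumes that redis-check-rdb exits non-zero when it finds a problem; the checker is a parameter so the logic can be exercised without Redis installed.

```shell
#!/usr/bin/env bash
# quarantine_if_corrupt RDB_PATH [CHECK_CMD]
# Runs CHECK_CMD (default: redis-check-rdb) on the file. If the check
# fails, the file is renamed with a timestamped .corrupt suffix.
quarantine_if_corrupt() {
  local rdb=$1 check=${2:-redis-check-rdb} dest
  # Refuse to act if the checker is missing, so a missing binary is never
  # mistaken for a corrupt file.
  command -v "$check" >/dev/null 2>&1 || {
    echo "checker not found: $check" >&2
    return 2
  }
  if "$check" "$rdb" >/dev/null 2>&1; then
    echo "ok: $rdb passed the check, left in place"
    return 0
  fi
  dest="${rdb}.corrupt.$(date +%Y%m%d%H%M%S)"
  mv -- "$rdb" "$dest" && echo "moved: $dest"
}

# Usage with the placeholder path from the quick checks, Redis stopped:
#   quarantine_if_corrupt <path-to-rdb>
```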
Disk bottleneck: upgrade or relocate
If disk latency is the constraint, move Redis persistence to faster storage. For containerized deployments, ensure the persistence volume is not sharing IOPS with noisy neighbors. In cloud environments, check for volume throughput caps.
Prevention
- Configure readiness probes that poll INFO persistence and wait for loading:0 before marking the instance ready.
- Establish a baseline for loading duration after each restart. Alert when the loading phase exceeds that baseline by a significant margin.
- If startup time matters, prefer RDB snapshots over AOF alone, or ensure AOF rewrites run frequently enough to keep the AOF file compact.
- Keep valid backup RDB snapshots for recovery from corruption.
- Monitor disk I/O latency as part of your standard Redis health checks so you can forecast when storage will become a bottleneck.
- For replicas, reconnecting to a master in the loading phase propagates the LOADING error downstream.
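The baseline alert can be reduced to a single comparison. In this sketch the function name and the 50% default margin are assumptions, and the start time is expected to come from the loading_start_time field that INFO persistence reports while a load is in progress.

```shell
#!/usr/bin/env bash
# loading_overdue START_EPOCH NOW_EPOCH BASELINE_SECONDS [MARGIN_PERCENT]
# Succeeds when the elapsed loading time exceeds the baseline plus margin.
loading_overdue() {
  local start=$1 now=$2 baseline=$3 margin=${4:-50}  # 50% is an assumed margin
  local limit=$(( baseline + baseline * margin / 100 ))
  [ $(( now - start )) -gt "$limit" ]
}

# Usage (120 s stands in for your measured baseline):
#   start=$(redis-cli INFO persistence | tr -d '\r' |
#           awk -F: '$1 == "loading_start_time" { print $2 }')
#   loading_overdue "$start" "$(date +%s)" 120 && echo "loading exceeds baseline"
```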
How Netdata helps
- Correlates the loading state with uptime_in_seconds to immediately distinguish a startup restore from an unexpected failure.
- Tracks loading_loaded_perc and loading_eta_seconds so you can visualize restore progress and share time estimates during incidents.
- Alerts when loading remains active beyond your configured baseline, flagging disk stalls or corruption before operators manually check.
- Correlates the loading phase with disk I/O saturation metrics to identify storage bottlenecks.
- Uses the loading state to suppress false alerts for eviction, hit rate, and latency while the dataset is being restored.







