MongoDB checkpoint stall write freeze: when all writes stop with no error
Writes time out or hang while mongod is running, TCP port 27017 is open, and reads still return results from cache. The MongoDB logs are quiet, but db.serverStatus().opcounters shows write counts frozen. This is a WiredTiger checkpoint stall: the checkpoint process fell behind, dirty pages accumulated, and new writes blocked. The freeze lasts until the current checkpoint completes. If the I/O bottleneck remains, queued writes flood through and the next checkpoint stalls again.
This failure mode looks like a network or application issue, so operators often restart application pods or fail over load balancers while the real problem is storage I/O that cannot keep pace with dirty page flush demand. Do not kill the checkpoint or restart the node; that forces journal replay and makes the outage longer. Learn the signal pattern so you can diagnose it in seconds.
What this means
WiredTiger keeps data in an in-memory cache and flushes dirty pages to disk via a checkpoint. By default, a checkpoint runs every 60 seconds. If storage cannot complete the checkpoint within that interval, dirty data accumulates in the cache and journal files cannot be reclaimed until the checkpoint finishes. Once cache or journal pressure crosses WiredTiger safety limits, new writes stall or queue indefinitely while reads from cache may continue. When the checkpoint finally completes, blocked writes flush through together. If the I/O bottleneck remains, the next checkpoint also stalls, repeating the cycle.
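The timing argument above can be sketched as back-of-envelope arithmetic. The function name `estimateCheckpoint` and its inputs are hypothetical. The model assumes constant flush throughput and no new dirty data arriving while the checkpoint runs, so treat it as an illustration of the interval math rather than a description of how WiredTiger schedules work.

```javascript
// Back-of-envelope model. Assumptions: flush throughput is constant and
// no new dirty data arrives while the checkpoint is running.
function estimateCheckpoint(dirtyBytes, flushBytesPerSec, intervalSecs = 60) {
  const durationSecs = dirtyBytes / flushBytesPerSec;
  return {
    durationSecs,
    // Negative headroom means the checkpoint overruns the interval.
    headroomSecs: intervalSecs - durationSecs,
    fitsInInterval: durationSecs <= intervalSecs,
  };
}

const MiB = 1024 * 1024;
// Illustrative numbers: 6 GiB of dirty data, 50 MiB/s of sustained flush throughput.
const estimate = estimateCheckpoint(6 * 1024 * MiB, 50 * MiB);
console.log(estimate);
```

With these illustrative numbers the estimated checkpoint runs for roughly two minutes, which is about twice the default interval.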
flowchart TD
A[Checkpoint exceeds 60 second interval] --> B[Dirty data accumulates in cache]
B --> C[Journal files cannot be reclaimed]
C --> D[WiredTiger blocks new writes]
D --> E[Applications queue writes indefinitely]
E --> F[Connection count rises]
F --> G[Checkpoint eventually completes]
G --> H[Queued writes execute at once]
H --> I[Next checkpoint stalls if I/O bottleneck persists]
Common causes
| Cause | What it looks like | First thing to check |
|---|---|---|
| Storage device saturation or failure | iostat shows %util near 100% and await above 50 ms | iostat -x 1 5 on the data and journal volumes |
| Cloud storage burst credit exhaustion | Journal sync latency spikes 10-100x without any code change | Cloud volume burst balance metrics |
| RAID rebuild or degraded array | Checkpoint duration jumps after a disk replacement or failure event | RAID controller status and rebuild progress |
| Massive write burst | Dirty ratio climbs rapidly during bulk imports or migrations | opcounters write rate versus baseline |
Quick checks
Run these on the affected node in mongosh and at the OS level.
// Latest checkpoint duration in milliseconds
var txn = db.serverStatus().wiredTiger.transaction;
print("Checkpoint duration (ms): " + txn["transaction checkpoint most recent time (msecs)"]);
// WiredTiger cache dirty ratio
var c = db.serverStatus().wiredTiger.cache;
var max = c["maximum bytes configured"];
var dirty = c["tracked dirty bytes in the cache"];
print("Dirty ratio: " + (100 * dirty / max).toFixed(1) + "%");
// Average journal sync latency in microseconds (cumulative since startup)
var wt = db.serverStatus().wiredTiger.log;
var syncTime = wt["log sync time duration (usecs)"];
var syncOps = wt["log sync operations"];
// Guard against division by zero when no syncs have been recorded yet
print("Avg journal sync (us): " + (syncOps > 0 ? (syncTime / syncOps).toFixed(0) : "n/a"));
// Write ops and latency to confirm a freeze
var lat = db.serverStatus().opLatencies;
print("Write ops: " + lat.writes.ops + ", total write latency (us): " + lat.writes.latency);
// Connection growth as writes pile up
var conn = db.serverStatus().connections;
print("Current connections: " + conn.current + ", available: " + conn.available);
# OS-level storage latency and utilization
iostat -x 1 5
// Active operations running longer than 10 seconds
db.currentOp({ "active": true, "secs_running": { "$gt": 10 } }).inprog.length
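The serverStatus counters are cumulative since startup, so a single reading cannot show whether writes are frozen right now. A minimal sketch that diffs two snapshots follows. The helper name `writeDelta` and the sample documents are hypothetical; the field names are the ones used in the checks above plus the `insert`, `update`, and `delete` fields of opcounters.

```javascript
// Compare two db.serverStatus() documents taken a few seconds apart.
// Number(...) coerces the values in case the shell returns wrapped 64-bit integers.
function writeDelta(before, after) {
  const writes = (s) =>
    Number(s.opcounters.insert) + Number(s.opcounters.update) + Number(s.opcounters.delete);
  const log = (s) => s.wiredTiger.log;
  const syncOps =
    Number(log(after)["log sync operations"]) - Number(log(before)["log sync operations"]);
  const syncUsecs =
    Number(log(after)["log sync time duration (usecs)"]) -
    Number(log(before)["log sync time duration (usecs)"]);
  return {
    writesCompleted: writes(after) - writes(before),
    // null when no journal sync finished inside the sampling window
    avgJournalSyncUsecs: syncOps > 0 ? syncUsecs / syncOps : null,
  };
}

// Hypothetical snapshots: write counters are flat and journal syncs are slow.
const snapBefore = {
  opcounters: { insert: 1000, update: 500, delete: 20 },
  wiredTiger: { log: { "log sync operations": 4000, "log sync time duration (usecs)": 8000000 } },
};
const snapAfter = {
  opcounters: { insert: 1000, update: 500, delete: 20 },
  wiredTiger: { log: { "log sync operations": 4002, "log sync time duration (usecs)": 8500000 } },
};
const delta = writeDelta(snapBefore, snapAfter);
console.log(delta);
```

In mongosh, the two inputs would come from calling db.serverStatus() twice with a short pause in between. Zero completed writes under active load, combined with a high windowed sync average, matches the freeze pattern.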
How to diagnose it
- Confirm the freeze pattern. Verify that writes are not completing while reads still work. Check db.serverStatus().opLatencies. If write latency is climbing or the write ops counter is flat under active load, writes are blocked.
- Check checkpoint duration. Inspect db.serverStatus().wiredTiger.transaction["transaction checkpoint most recent time (msecs)"]. A value above 60,000 ms confirms the checkpoint is taking longer than the default interval.
- Check the dirty ratio. Compute (tracked dirty bytes / maximum bytes configured) from db.serverStatus().wiredTiger.cache. A value above 20% means dirty data is accumulating faster than the checkpoint can flush it.
- Check journal sync latency. Calculate the average from db.serverStatus().wiredTiger.log using log sync time duration (usecs) divided by log sync operations. Sustained values above 100,000 microseconds (100 ms) indicate the storage layer is struggling.
- Inspect current operations. Run db.currentOp({ "active": true, "secs_running": { "$gt": 10 } }). Many write operations with high secs_running and no progress means they are queued behind the stall.
- Check storage health at the OS level. Run iostat -x 1 5. %util near 100% with high await and w_await confirms a storage bottleneck.
- Correlate with application metrics. Look for write timeouts and connection pool exhaustion in application logs; these side effects often appear before the database stall is noticed.
- Determine if the issue is transient or persistent. A one-time spike during a backup may resolve itself. Sustained elevation means the storage layer is undersized or failing.
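The threshold checks in the steps above can be folded into one helper, sketched here under stated assumptions. The function name `checkStallSignals` and the sample document are hypothetical; the thresholds (60,000 ms, 20%, 100,000 microseconds) are the ones given in this guide, and the statistic names are the ones used in the quick checks.

```javascript
// Evaluate a db.serverStatus() document against this guide's thresholds.
function checkStallSignals(status) {
  const wt = status.wiredTiger;
  const checkpointMs = Number(wt.transaction["transaction checkpoint most recent time (msecs)"]);
  const dirtyPct =
    (100 * Number(wt.cache["tracked dirty bytes in the cache"])) /
    Number(wt.cache["maximum bytes configured"]);
  const syncOps = Number(wt.log["log sync operations"]);
  const avgSyncUsecs =
    syncOps > 0 ? Number(wt.log["log sync time duration (usecs)"]) / syncOps : 0;

  const findings = [];
  if (checkpointMs > 60000) findings.push("checkpoint longer than the 60 s interval");
  if (dirtyPct > 20) findings.push("dirty ratio above 20%");
  if (avgSyncUsecs > 100000) findings.push("journal sync average above 100 ms");
  return { checkpointMs, dirtyPct, avgSyncUsecs, findings };
}

// Hypothetical snapshot of a stalled node.
const stalled = {
  wiredTiger: {
    transaction: { "transaction checkpoint most recent time (msecs)": 95000 },
    cache: { "maximum bytes configured": 8589934592, "tracked dirty bytes in the cache": 2147483648 },
    log: { "log sync time duration (usecs)": 300000000, "log sync operations": 2000 },
  },
};
const report = checkStallSignals(stalled);
console.log(report.findings);
```

In mongosh the call would be checkStallSignals(db.serverStatus()). An empty findings list does not rule out a stall; the OS-level and currentOp checks still apply.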
Metrics and signals to monitor
| Signal | Why it matters | Warning sign |
|---|---|---|
| Checkpoint duration | Measures how long dirty data takes to reach disk | Sustained values above 60 seconds |
| WiredTiger cache dirty ratio | Indicates dirty data accumulation ahead of flush capacity | Sustained values above 20% |
| Journal sync latency | Reveals storage health before application latency spikes | Average above 100,000 microseconds (100 ms) |
| Write operation latency | Direct user-facing impact of the stall | Write p99 climbing while read latency stays flat |
| Connection count | Writes pile up and hold connections | Rapid growth coinciding with a write throughput drop |
| Current operation age | Shows queued operations waiting for the stall to clear | Many active writes with secs_running above 10 seconds |
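Several warning signs in the table depend on values being sustained rather than momentary. One simple way to express that, as a sketch: treat a signal as sustained when the most recent samples all exceed the threshold. The helper name `isSustained` and the default of three consecutive samples are assumptions to tune against your own sampling interval.

```javascript
// True when the last `minConsecutive` samples are all above `threshold`.
function isSustained(samples, threshold, minConsecutive = 3) {
  if (samples.length < minConsecutive) return false;
  return samples.slice(-minConsecutive).every((value) => value > threshold);
}

// Hypothetical dirty-ratio samples (percent), oldest first.
const dirtyRatioSamples = [12, 25, 18, 22, 24, 27];
const dirtySustained = isSustained(dirtyRatioSamples, 20);
console.log("Dirty ratio sustained above 20%:", dirtySustained);
```

A single spike, such as the second sample here, does not trip the check on its own; only an unbroken run at the end of the series does.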
Fixes
Storage device saturation or failure
Do not restart mongod and do not attempt to kill the running checkpoint. Let the checkpoint complete. Interrupting it forces journal replay on recovery, which makes the outage longer. If the primary is on degraded storage and a healthy secondary exists, fail over to shift writes away from the bottleneck.
Warning: Only run rs.stepDown() on the primary. Ensure a healthy secondary exists and the replica set can elect a new primary with a majority.
// Step down the primary to force failover to a healthier secondary
rs.stepDown()
This triggers an election and brief write unavailability, but it moves write load off the failing device. After the stepdown, replace failing disks, move to faster volumes, or resolve the hypervisor I/O scheduling issue.
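Before stepping down, the health conditions in the warning above can be checked against the output of rs.status(). The sketch below is a simplified pre-check: the helper name `canFailOver` is hypothetical, it assumes every member holds one vote, it ignores arbiters, and it does not look at replication lag, so it is a conservative first filter rather than a full election model.

```javascript
// Simplified failover pre-check on an rs.status()-shaped document.
// Assumptions: one vote per member, arbiters not counted, lag not checked.
function canFailOver(rsStatus) {
  const members = rsStatus.members;
  const healthyData = members.filter(
    (m) => m.health === 1 && (m.stateStr === "PRIMARY" || m.stateStr === "SECONDARY")
  );
  const healthySecondaries = healthyData.filter((m) => m.stateStr === "SECONDARY");
  const majority = Math.floor(members.length / 2) + 1;
  return {
    healthySecondaries: healthySecondaries.map((m) => m.name),
    majority,
    ok: healthySecondaries.length >= 1 && healthyData.length >= majority,
  };
}

// Hypothetical three-member set with one unreachable node.
const rsSample = {
  members: [
    { name: "db1:27017", health: 1, stateStr: "PRIMARY" },
    { name: "db2:27017", health: 1, stateStr: "SECONDARY" },
    { name: "db3:27017", health: 0, stateStr: "(not reachable/healthy)" },
  ],
};
const failover = canFailOver(rsSample);
console.log(failover);
```

In mongosh the call would be canFailOver(rs.status()). A result of ok: false suggests that stepping down would leave the set without a writable primary.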
Cloud storage burst credit exhaustion
If burst credits are depleted, baseline IOPS may be too low for your write workload. Increase the volume size or provisioned IOPS to raise the floor, then let the current checkpoint finish. Do not scale the MongoDB process vertically until the storage layer can sustain the checkpoint flush rate.
RAID rebuild or degraded array
If the stall correlates with a RAID rebuild, options are limited until the rebuild completes. Reduce write pressure by pausing batch jobs, throttling ingestion, or disabling non-critical writes. If the cluster cannot tolerate the latency, fail over to a secondary that is not undergoing rebuild.
Massive write burst
Throttle the bulk load or migration at the application layer. The checkpoint mechanism is designed for steady-state traffic, not sustained write floods that exceed disk sequential write capacity. Spread bulk loads across time or redirect them to a dedicated secondary.
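Application-layer throttling can be as simple as pausing between batches so the loader stays under a target rate. The sketch below computes that pause. The helper name `pauseMsForRate` and the example numbers are hypothetical, and the right target rate depends on what your storage sustains.

```javascript
// How long to pause after a batch so the loader stays at or below
// `targetDocsPerSec`, given how long the batch itself took.
function pauseMsForRate(batchSize, targetDocsPerSec, batchElapsedMs) {
  const minBatchMs = (batchSize / targetDocsPerSec) * 1000;
  return Math.max(0, minBatchMs - batchElapsedMs);
}

// 1,000-document batches capped at 2,000 documents per second:
// each batch should occupy at least 500 ms of wall-clock time.
const fastBatchPause = pauseMsForRate(1000, 2000, 120);
const slowBatchPause = pauseMsForRate(1000, 2000, 700);
console.log(fastBatchPause, slowBatchPause);
```

A loader would sleep for the returned number of milliseconds after each batch insert. When the database slows down and a batch takes longer than its time budget, the pause drops to zero, so the throttle never adds delay on top of an already slow write path.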
Prevention
- Monitor checkpoint duration and dirty ratio as leading indicators. A checkpoint duration creeping from 5 seconds toward 30 seconds signals shrinking headroom.
- Size storage to sustain peak write rate plus periodic checkpoint flush. Avoid relying solely on burst-credit storage for write-heavy primaries.
- Watch journal sync latency. It typically degrades 30 to 60 seconds before application-visible latency spikes.
- Keep a healthy secondary on independent storage to provide a clean failover target if the primary’s storage degrades.
- Avoid running bulk imports or large index builds during peak traffic. Schedule them when checkpoint duration is low and journal sync latency is stable.
- Set application driver timeouts to fail fast during a stall rather than holding connections open indefinitely. This limits connection pile-up and reduces recovery time.
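As an illustration of the fail-fast timeout advice in the list above, the sketch below builds a connection string with client-side limits. The option names are standard MongoDB connection string options; the values, the host name, and the helper `withOptions` are hypothetical and should be tuned to your workload and driver version.

```javascript
// Illustrative fail-fast settings; values are assumptions, not recommendations.
const failFastOptions = {
  serverSelectionTimeoutMS: 5000, // give up finding a usable server after 5 s
  socketTimeoutMS: 10000,         // abandon a socket that stays silent for 10 s
  waitQueueTimeoutMS: 2000,       // fail quickly when the connection pool is exhausted
  maxPoolSize: 50,                // cap how many connections a stall can pin
};

// Append the options to a base URI as a query string.
function withOptions(baseUri, options) {
  const query = new URLSearchParams(options).toString();
  return `${baseUri}?${query}`;
}

// Hypothetical host and database name.
const uri = withOptions("mongodb://db.example.internal:27017/app", failFastOptions);
console.log(uri);
```

These limits act on the client side only. They bound how long the application waits and how many connections it holds, but they do not shorten the stall on the server.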
How Netdata helps
- Correlates MongoDB checkpoint duration with OS disk I/O utilization and latency to show whether the stall is a database or storage issue.
- Alerts on WiredTiger cache dirty ratio thresholds before writes freeze.
- Tracks journal sync latency as an early storage health signal, often warning before application timeouts trigger.
- Visualizes WiredTiger ticket utilization and queue depths to help distinguish a checkpoint stall from a single bad query or a broader cache pressure cascade.
- Monitors connection growth alongside write latency to surface the pile-up pattern that follows a stall.
- Surfaces currentOp metrics to confirm operations are stuck waiting for storage rather than a specific lock.
Related guides
- How MongoDB actually works in production: a mental model for operators
- MongoDB pages evicted by application threads: when eviction becomes user latency
- MongoDB WiredTiger cache dirty ratio high: the leading indicator nobody watches
- MongoDB WiredTiger cache pressure cascade: eviction stalls and latency spikes
- MongoDB cache too small: sizing the WiredTiger cache for your working set
- MongoDB monitoring checklist: the signals every production cluster needs
- MongoDB monitoring maturity model: from survival to expert







