MongoDB chunk migration storms: moveChunk I/O pressure and range locks
Write latency spikes across multiple shards simultaneously. Queries time out while mongos logs show no election events. Donor shard primaries report growing queue depths, and config.changelog shows a wall of moveChunk.error entries interleaved with retries. The balancer is hammering the cluster instead of helping.
A single moveChunk operation is expensive. It copies documents to the recipient, enters a critical section with a shared range lock on the donor, then deletes orphaned documents. When the balancer triggers many migrations in quick succession, or individual migrations fail and retry in a tight loop, overlapping I/O pressure and lock contention compound into a cluster-wide latency event.
flowchart TD
A[Hot shard or jumbo chunk] --> B[Chunk imbalance]
B --> C[Balancer triggers moveChunk]
C --> D[Clone phase: heavy I/O on recipient]
C --> E[Critical section: range lock on donor]
D --> F[Recipient cache dirty ratio rises]
E --> G[Donor write latency spikes]
F --> H{Catch-up fails or config server slow}
G --> H
H --> I[Migration fails or stalls]
I --> J[Balancer retries]
J --> K[Migration storm]
What this means
A chunk migration proceeds in three phases. Clone: the donor copies documents matching the chunk bounds to the recipient via a cursor. Recipient insertions flow through WiredTiger and can pressure cache if the working set is cold. Critical section: the donor holds a shared range lock over the chunk’s shard key interval and queues writes targeting that range. Cleanup: after the metadata commit, the donor deletes orphaned documents.
Under normal conditions the balancer runs migrations sequentially and impact is brief. A storm develops when:
- The balancer retries a failed migration repeatedly.
- Config server latency delays migration commits, extending the critical section.
- A hot shard or jumbo chunks force the balancer to run continuously.
- Recipient I/O saturation slows the clone, causing the donor to hold the range lock longer.
The result is latency spikes on both the donor (range lock queuing) and the recipient (clone-phase write I/O), with the config server gating the critical section.
Common causes
| Cause | What it looks like | First thing to check |
|---|---|---|
| Jumbo chunks blocking migration | Balancer is active but chunk counts never equalize; repeated moveChunk.error entries | db.getSiblingDB("config").chunks.find({jumbo: true}) |
| Config server latency gating commits | Write latency spikes outlast the clone phase; critical sections hang | db.serverStatus().opLatencies on the config primary |
| Hot shard forcing aggressive rebalancing | One shard holds >20% more chunks than the average; donor and recipient I/O both saturated | Per-shard chunk aggregation from config.chunks |
| Recipient cache saturation during clone | Recipient WiredTiger dirty ratio climbs and application threads start evicting | WiredTiger cache stats on the recipient primary |
| Sustained write load extending catch-up | moveChunk.start entries without matching commits; donor currentQueue.writers grows | db.serverStatus().globalLock.currentQueue on the donor |
Quick checks
Run these read-only checks to confirm a migration storm is in progress.
// Check if the balancer is running
sh.isBalancerRunning()
// Review recent migration history and outcomes
db.getSiblingDB("config").changelog.find(
{ what: /moveChunk/ },
{ time: 1, what: 1, details: 1 }
).sort({ time: -1 }).limit(20)
// Count migration failures (countDocuments replaces the deprecated cursor count())
db.getSiblingDB("config").changelog.countDocuments({ what: /moveChunk\.error/ })
// Check for unmigratable jumbo chunks
db.getSiblingDB("config").chunks.countDocuments({ jumbo: true })
// See chunk distribution across shards
db.getSiblingDB("config").chunks.aggregate([
{ $group: { _id: "$shard", count: { $sum: 1 } } },
{ $sort: { count: -1 } }
])
// Check donor write queue depth
db.serverStatus().globalLock.currentQueue
# Check recipient disk health and I/O saturation (run from the recipient host shell, not mongosh)
iostat -x 1 3
// Check config server primary latency
db.serverStatus().opLatencies
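The chunk distribution aggregation above returns one count per shard, and turning that into a single skew number makes it easier to compare against the 20% guideline used in this guide. The sketch below assumes skew is defined as (max - min) / average chunk count. `chunkSkew` is a hypothetical helper, not a MongoDB API, and the sample counts are made up.

```javascript
// Hypothetical helper: computes skew from documents shaped like the output of
// the $group aggregation on config.chunks ({ _id: <shard>, count: <n> }).
function chunkSkew(counts) {
  if (counts.length === 0) return null;
  const values = counts.map((c) => Number(c.count));
  const max = Math.max(...values);
  const min = Math.min(...values);
  const avg = values.reduce((a, b) => a + b, 0) / values.length;
  // Assumed definition: spread between fullest and emptiest shard,
  // relative to the per-shard average.
  return { max, min, avg, skew: avg === 0 ? 0 : (max - min) / avg };
}

// Made-up sample shaped like the aggregation output.
const sampleCounts = [
  { _id: "shard0", count: 130 },
  { _id: "shard1", count: 100 },
  { _id: "shard2", count: 70 },
];

const skewResult = chunkSkew(sampleCounts);
console.log(skewResult.skew > 0.2 ? "skewed" : "balanced", skewResult);
```

In mongosh, the same function could be fed with the aggregation result converted by `.toArray()`.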
How to diagnose it
1. Confirm the storm from config.changelog. Inspect the last 30 minutes. If moveChunk.error entries outnumber successful commits, the balancer is retrying faster than migrations complete. Note details.from and details.to to identify the affected shard pair, and details.errmsg for the failure reason.
2. Correlate donor latency with lock queuing. On the donor primary, compare db.serverStatus().opLatencies.writes against baseline. If write latency spikes while globalLock.currentQueue.writers is elevated, the critical section is serializing writes.
3. Check recipient cache pressure from the clone phase. On the recipient primary, inspect db.serverStatus().wiredTiger.cache. A rising dirty ratio or nonzero pages evicted by application threads means the clone write stream is overwhelming WiredTiger’s flush capacity. This slows catch-up and extends the donor lock.
4. Measure config server command latency. On the config server primary, check opLatencies.commands. Elevated latency delays the metadata commit that ends the critical section. Even fast clone phases turn into long lock holds if the config server is slow.
5. Identify the imbalance trigger. Run the chunk distribution aggregation. If skew exceeds 20% and jumbo chunks exist, the balancer is stuck attempting to move unmigratable chunks.
6. Map the timeline. Overlay changelog timestamps with per-shard opcounters and disk I/O metrics. If migration start times correlate with I/O saturation and latency spikes, the causal chain is confirmed.
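The changelog comparison in the first step can be scripted. The sketch below groups moveChunk.error and moveChunk.commit entries from the last 30 minutes by shard pair and flags pairs where errors outnumber commits. `summarizeMigrations` is a hypothetical helper, the sample entries are made up, and the field names (`what`, `time`, `details.from`, `details.to`, `details.errmsg`) follow the text above, so verify them against your own changelog documents.

```javascript
// Hypothetical helper: summarizes changelog entries per donor -> recipient pair.
function summarizeMigrations(entries, now = new Date(), windowMinutes = 30) {
  const cutoff = now.getTime() - windowMinutes * 60 * 1000;
  const pairs = new Map();
  for (const e of entries) {
    const t = new Date(e.time).getTime();
    if (t < cutoff) continue; // outside the inspection window
    if (e.what !== "moveChunk.error" && e.what !== "moveChunk.commit") continue;
    const d = e.details || {};
    const key = `${d.from ?? "?"} -> ${d.to ?? "?"}`;
    const s = pairs.get(key) ||
      { pair: key, errors: 0, commits: 0, lastError: null, lastErrorAt: 0 };
    if (e.what === "moveChunk.error") {
      s.errors += 1;
      // Keep the most recent failure reason regardless of input order.
      if (t >= s.lastErrorAt) { s.lastErrorAt = t; s.lastError = d.errmsg ?? null; }
    } else {
      s.commits += 1;
    }
    pairs.set(key, s);
  }
  return [...pairs.values()].map(({ lastErrorAt, ...s }) =>
    ({ ...s, storming: s.errors > s.commits }));
}

// Made-up entries; in mongosh these would come from
// db.getSiblingDB("config").changelog.find({ what: /moveChunk/ }).toArray()
const sampleNow = new Date("2024-01-01T12:00:00Z");
const sampleEntries = [
  { time: "2024-01-01T11:55:00Z", what: "moveChunk.error",
    details: { from: "shardA", to: "shardB", errmsg: "example failure 2" } },
  { time: "2024-01-01T11:50:00Z", what: "moveChunk.error",
    details: { from: "shardA", to: "shardB", errmsg: "example failure 1" } },
  { time: "2024-01-01T11:45:00Z", what: "moveChunk.commit",
    details: { from: "shardA", to: "shardB" } },
  { time: "2024-01-01T11:40:00Z", what: "moveChunk.commit",
    details: { from: "shardC", to: "shardD" } },
  { time: "2024-01-01T11:00:00Z", what: "moveChunk.error",
    details: { from: "shardC", to: "shardD", errmsg: "too old to count" } },
];

console.log(summarizeMigrations(sampleEntries, sampleNow));
```

A pair marked `storming` is a reasonable first candidate for the donor and recipient checks in the following steps.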
Metrics and signals to monitor
| Signal | Why it matters | Warning sign |
|---|---|---|
| config.changelog moveChunk error rate | Failed migrations waste I/O and retry in tight loops | Errors exceed successes over any 10-minute window |
| Donor shard write opLatencies | The critical section range lock blocks concurrent writes to the chunk | p99 write latency spikes correlating with changelog entries |
| Recipient WiredTiger cache dirty ratio | Clone-phase inserts dirty the recipient cache; if the disk cannot absorb the writes, dirty pages accumulate | Dirty ratio >15% or application-thread evictions incrementing |
| Config server primary command latency | Metadata commits gate how long the donor holds the range lock | Average command latency >2x baseline during balancing windows |
| Chunk distribution skew | Persistent imbalance forces the balancer to run continuously | Max-min chunk count >20% of the per-shard average |
| Jumbo chunk count | Unmigratable chunks create permanent hot spots and balancer retry loops | Any jumbo chunks on actively growing collections |
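The recipient cache thresholds in the table can be checked from two samples of `db.serverStatus().wiredTiger.cache`. The sketch below computes the dirty ratio and the increase in application-thread evictions between samples. `cachePressure` is a hypothetical helper, the sample values are made up, and the statistic names are the ones WiredTiger commonly reports, so confirm they exist in your version's output.

```javascript
// Hypothetical helper: compares two wiredTiger.cache samples from the recipient.
function cachePressure(prevCache, currCache) {
  const evictKey = "pages evicted by application threads";
  // Number() also handles Long values that mongosh may return for counters.
  const dirty = Number(currCache["tracked dirty bytes in the cache"]);
  const max = Number(currCache["maximum bytes configured"]);
  const appEvictions = Number(currCache[evictKey]) - Number(prevCache[evictKey]);
  const dirtyRatio = max > 0 ? dirty / max : 0;
  // Thresholds mirror the warning signs in the table: >15% dirty or any
  // new application-thread evictions since the previous sample.
  return { dirtyRatio, appEvictions, warn: dirtyRatio > 0.15 || appEvictions > 0 };
}

// Made-up samples taken some seconds apart.
const MiB = 1024 * 1024;
const prevSample = {
  "tracked dirty bytes in the cache": 100 * MiB,
  "maximum bytes configured": 1024 * MiB,
  "pages evicted by application threads": 10,
};
const currSample = {
  "tracked dirty bytes in the cache": 256 * MiB,
  "maximum bytes configured": 1024 * MiB,
  "pages evicted by application threads": 25,
};

console.log(cachePressure(prevSample, currSample));
```

Sampling twice matters because the eviction counter is cumulative since startup; only its growth during the migration window is meaningful.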
Fixes
Stop the balancer immediately
Run sh.stopBalancer() to halt new migrations. This breaks the retry loop and stops new range locks and clone I/O.
Warning: This pauses rebalancing. Chunk imbalance will persist until you re-enable the balancer, but it gives immediate relief.
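After stopping the balancer, it can help to confirm that no balancing round is still in flight. The sketch below interprets a document shaped like the output of the `balancerStatus` admin command, assuming it carries `mode` and `inBalancerRound` fields. `describeBalancer` is a hypothetical helper and the sample documents are made up.

```javascript
// Hypothetical helper: turns a balancerStatus-shaped document into a short state.
function describeBalancer(status) {
  if (status.mode === "off" && !status.inBalancerRound) return "stopped";
  if (status.mode === "off") return "stopping";
  return "enabled";
}

// In mongosh, connected to a mongos, the input would come from:
//   db.adminCommand({ balancerStatus: 1 })
console.log(describeBalancer({ mode: "off", inBalancerRound: false }));
console.log(describeBalancer({ mode: "off", inBalancerRound: true }));
console.log(describeBalancer({ mode: "full", inBalancerRound: false }));
```

If the state reads "stopping", latency relief may lag until the in-flight round finishes.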
Resolve jumbo chunks
Jumbo chunks cannot be migrated by the balancer. If your shard key cardinality supports it, manually split the chunk range. If the shard key itself is the problem, plan a reshard.
Warning: Resharding is heavy I/O. Schedule it outside peak traffic.
Tune the balancer window
Restrict balancing to off-peak hours so migration I/O does not compete with application traffic. Tradeoff: data distribution lags behind traffic shifts during the day, but you avoid peak-hour latency spikes.
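A balancing window is stored as an `activeWindow` field on the `balancer` document in `config.settings`. The sketch below builds and validates the update arguments so that a typo in the time format is caught before it reaches the cluster. `balancerWindowUpdate` is a hypothetical helper, and the 23:00 to 05:00 window is only an example; pick hours that match your own traffic pattern.

```javascript
// Hypothetical helper: builds the arguments for an updateOne on config.settings.
function balancerWindowUpdate(start, stop) {
  const hhmm = /^([01]\d|2[0-3]):[0-5]\d$/; // 24-hour HH:MM
  if (!hhmm.test(start) || !hhmm.test(stop)) {
    throw new Error(`expected HH:MM times, got "${start}" and "${stop}"`);
  }
  if (start === stop) {
    throw new Error("start and stop must differ");
  }
  return {
    filter: { _id: "balancer" },
    update: { $set: { activeWindow: { start, stop } } },
    options: { upsert: true },
  };
}

const windowArgs = balancerWindowUpdate("23:00", "05:00");
console.log(JSON.stringify(windowArgs));

// In mongosh, connected to a mongos, the result could be applied with:
//   db.getSiblingDB("config").settings.updateOne(
//     windowArgs.filter, windowArgs.update, windowArgs.options)
```

The window needs to be long enough for the balancer to catch up on a day's worth of imbalance, otherwise skew accumulates.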
Throttle writes during catch-up
If sustained write volume keeps extending the critical section, temporarily pause bulk ingestion or lower application write concurrency. Tradeoff: slower pipeline, but shorter range-lock hold time.
Fix config server storage latency
If the config server primary shows elevated opLatencies, investigate its underlying disk with iostat or host-level storage metrics. Do not restart config servers during a storm. Resolve the storage contention first so metadata commits can flow.
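Because `opLatencies` counters are cumulative, judging whether the config primary is currently slow requires the difference between two samples. The sketch below derives the average command latency over the sampling interval and compares it to a baseline using the 2x guideline from the metrics table. Both function names are hypothetical, the sample numbers are made up, and the code assumes `latency` is reported in microseconds.

```javascript
// Hypothetical helper: average command latency (ms) between two opLatencies samples.
function avgCommandLatencyMs(prev, curr) {
  // Number() also handles Long values that mongosh may return for counters.
  const ops = Number(curr.commands.ops) - Number(prev.commands.ops);
  const micros = Number(curr.commands.latency) - Number(prev.commands.latency);
  return ops > 0 ? micros / ops / 1000 : 0;
}

// Hypothetical helper: true when current latency exceeds factor x baseline.
function isElevated(currentMs, baselineMs, factor = 2) {
  return baselineMs > 0 && currentMs > factor * baselineMs;
}

// Made-up samples of db.serverStatus().opLatencies from the config primary.
const prevLat = { commands: { latency: 1000000, ops: 1000 } };
const currLat = { commands: { latency: 4000000, ops: 2000 } };

const currentMs = avgCommandLatencyMs(prevLat, currLat);
console.log(currentMs, isElevated(currentMs, 1));
```

The baseline has to come from your own measurements during a quiet balancing window; the value 1 ms here is only a placeholder.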
Prevention
- Monitor chunk distribution trends. Alert when skew exceeds 15%, before the balancer storms.
- Size the recipient WiredTiger cache for clone load. Ensure the cache can absorb migration writes without dirty-ratio spikes.
- Choose a high-cardinality shard key. This minimizes jumbo chunks and reduces the frequency of rebalancing.
- Watch config server latency as a leading indicator. Elevated command latency on the config primary predicts migration stalls before they cascade to shards.
- Restrict the balancer window. Limit automatic balancing to maintenance or low-traffic periods.
How Netdata helps
- WiredTiger cache dirty ratio and application-thread eviction charts on recipient shards reveal clone-phase I/O saturation.
- opLatencies across shard primaries and config servers in one view surface cross-shard write latency patterns from range-lock queuing.
- Disk latency and utilization alerts flag when migration I/O saturates donor or recipient storage.
- Ticket utilization and queue-depth charts on donor shards highlight admission-control backlog from range locks.
Related guides
- How MongoDB actually works in production: a mental model for operators
- MongoDB pages evicted by application threads: when eviction becomes user latency
- MongoDB WiredTiger cache dirty ratio high: the leading indicator nobody watches
- MongoDB WiredTiger cache pressure cascade: eviction stalls and latency spikes
- MongoDB cache too small: sizing the WiredTiger cache for your working set
- MongoDB checkpoint duration climbing: diagnosing slow WiredTiger checkpoints
- MongoDB checkpoint stall write freeze: when all writes stop with no error
- MongoDB connection churn: high totalCreated rate and thread creation overhead
- MongoDB connection refused at maxIncomingConnections: hitting the connection ceiling
- MongoDB connection storm spiral: reconnection floods after an election or deploy
- MongoDB disk full: emergency recovery when mongod can’t write the journal
- MongoDB disk I/O saturation: correlating iostat with WiredTiger signals







