MongoDB sharding hot shard: one shard saturating while others idle

Application latency climbs while aggregate CPU, I/O, and connection counts look healthy. Drill into individual shards and one node is pinned near 95% utilization while peers sit near 20%. sh.status() shows an even chunk distribution. You have a hot shard.

A hot shard occurs when one member of a sharded cluster receives a disproportionate share of read or write traffic. The cause is usually an access-pattern mismatch against the shard key, not a chunk-count imbalance. The overloaded shard saturates its WiredTiger cache and exhausts read or write tickets, pushing operations into queues. Because other shards are idle, aggregate cluster metrics hide the problem until the hot shard cascades into a latency spike or availability event.

What this means

mongos routes operations to shards based on the shard key. When the key design or workload creates a hotspot, traffic concentrates on a single shard’s replica set. That shard’s mongod instances experience cache pressure, ticket exhaustion, and lock contention while other shards remain underutilized.

The balancer can only redistribute chunks. It cannot redistribute access patterns inside a chunk. If your shard key is monotonically increasing, low cardinality, or subject to temporal skew, chunk counts may look even while one shard absorbs most operations.

flowchart TD
    A[App latency climbs] --> B{Aggregate metrics normal?}
    B -->|Yes| C[Compare per-shard opLatencies]
    B -->|No| D[Check cluster-wide saturation]
    C --> E{One shard elevated?}
    E -->|Yes| F[Check chunk distribution]
    E -->|No| G[Check mongos routing]
    F --> H{Chunks balanced?}
    H -->|Yes| I[Access-pattern hotspot]
    H -->|No| J[Balancer or jumbo chunk issue]
    I --> K[Reshard or redesign key]
    J --> L[Fix jumbos or enable balancer]

Common causes

CauseSymptomsFirst check
Monotonically increasing shard keyNew writes land on the max-value chunk; the host shard’s WiredTiger cache and tickets saturatesh.status() showing the last shard owning the highest chunk range
Low-cardinality shard keyToo few possible chunk boundaries; the balancer cannot create enough chunks to spread loadDistinct shard key value count versus chunk count in config.chunks
Jumbo chunksA single chunk cannot split because many documents share the same shard key value; the balancer skips it, leaving one shard permanently overloadeddb.getSiblingDB("config").chunks.find({ jumbo: true })
Temporal read hotspotRecent data lives on one shard; reads concentrate there while historical shards idlePer-shard opcounters showing read skew on the newest chunk range

Quick checks

Run these read-only commands on mongos or the hot shard primary.

// Chunk distribution summary
sh.status()
// Exact chunk counts per shard
db.getSiblingDB("config").chunks.aggregate([
  { $group: { _id: "$shard", count: { $sum: 1 } } },
  { $sort: { count: -1 } }
])
// Jumbo chunks that block balancing
db.getSiblingDB("config").chunks.find({ jumbo: true })
// Balancer state
sh.isBalancerRunning()
sh.getBalancerState()
// Operation latency on the suspected hot shard
db.serverStatus().opLatencies
// Operation rates on the suspected hot shard
db.serverStatus().opcounters
// Long-running operations consuming resources
db.currentOp({ "active": true, "secs_running": { "$gt": 10 } })
// Data distribution for a specific collection
db.collection.getShardDistribution()

How to diagnose it

  1. Confirm aggregate metrics hide the problem. Run sh.status() and inspect per-shard chunk counts. If counts are roughly even but latency is high, you are likely dealing with an access-pattern hotspot, not a distribution problem.

  2. Check for jumbo chunks. Query config.chunks for jumbo: true. Jumbo chunks cannot migrate. If a jumbo exists on the hot shard, the balancer cannot relieve it regardless of shard key design.

  3. Compare per-shard throughput and latency. Collect opcounters and opLatencies from each shard primary. A hot shard shows a disproportionately high operation rate or elevated latency compared to peers. If one shard handles more than 40% of total traffic, treat it as a hotspot.

  4. Determine whether reads, writes, or both are skewed. Compare opcounters.insert, query, update, and delete on the hot shard. Write skew suggests a monotonic shard key. Read skew suggests temporal concentration on recent data.

  5. Inspect the shard key and recent document distribution. Examine the shard key values of recently inserted documents. If they cluster in a narrow range mapped to a single chunk, the key is likely monotonic or low cardinality.

  6. Review query routing. On mongos, run explain() on slow queries. If targeted queries route to the hot shard and scatter-gather queries are not the issue, the problem is data distribution or the shard key, not the query layer.

Metrics and signals to monitor

SignalWhy it mattersWarning sign
Per-shard opLatenciesReveals latency skew that aggregate metrics hideOne shard’s read or write average exceeds the mean by more than 2x
Per-shard opcountersShows traffic concentration, not just data distributionOne shard handles more than 40% of a given operation type
Chunk distributionConfirms whether the imbalance is count-based or access-basedAny shard exceeds the average chunk count by more than 20%
Jumbo chunk countJumbos cannot migrate, creating permanent imbalanceNonzero jumbo count on the hot shard
Balancer stateA disabled or stuck balancer allows imbalance to persistsh.isBalancerRunning() returns false during the active window
WiredTiger cache dirty ratio per shardHot shards often show elevated dirty ratio before global latency spikesDirty ratio above 10% on one shard while peers remain below 5%
WiredTiger ticket availability per shardDirect measure of storage engine saturation on the hot shardAvailable read or write tickets below 25% of total on one shard

Fixes

Jumbo chunks blocking migration

If config.chunks shows jumbo chunks on the hot shard, the balancer cannot split or move them. Try to manually split the chunk if the shard key allows. Use sh.splitAt() or sh.splitFind() on the collection, targeting a median shard key value inside the jumbo chunk.

Warning: These operations change chunk metadata. Run them during low traffic and verify config server health first. If the jumbo exists because every document in the chunk shares the exact same shard key value, splitting is impossible and the only durable fix is resharding.

Monotonically increasing shard key

If new writes land on the shard owning the max chunk, the shard key is monotonic. The durable fix is resharding with a non-monotonic key.

Warning: Resharding copies the entire collection and rebuilds indexes. Ensure each participating shard has free disk space roughly equal to the collection size plus index size. During resharding, expect elevated I/O and brief write pauses. Schedule around peak traffic.

Low-cardinality shard key

If the shard key has too few distinct values, the balancer cannot create enough chunks to spread load. Reshard with a higher-cardinality compound key. Adding a hashed prefix or a UUID field increases cardinality and breaks hotspots.

Tradeoff: Compound shard keys can increase index size and make range queries less efficient. Test the new key with a representative write workload in a non-production environment before applying it.

Temporal read hotspot

If reads concentrate on recent data that resides on one shard, consider zone sharding. Define zones that pin recent date ranges to specific shards or spread them explicitly. Alternatively, adjust application read preferences to route traffic away from the hot shard, accepting slightly stale data from secondaries.

Tradeoff: Zone sharding adds operational complexity. Moving chunks manually to satisfy zones causes I/O spikes on both source and destination shards. Application-side read preference changes only help read skew; they do not fix write skew.

Prevention

  • Choose shard keys with high cardinality and uniform distribution. Avoid timestamps, auto-incrementing integers, and ObjectId as the sole shard key.
  • Test distribution before deploying. Use a production-like write workload to verify chunks split evenly across shards.
  • Monitor per-shard metrics, not just aggregates. Aggregate dashboards hide hot shards.
  • Alert on per-shard latency deviation. Page when a shard’s opLatencies exceed the cluster mean by 2x.
  • Size shards for peak skew. Ensure each shard can handle 1.5x its fair share without saturating tickets or cache.

How Netdata helps

  • Per-shard charts for opcounters, opLatencies, and WiredTiger cache metrics make skew visible without manual aggregation.
  • The MongoDB collector exposes per-instance ticket utilization, so you can correlate one shard’s ticket exhaustion with application latency spikes.
  • Custom alerts on per-shard dirty ratio or queue depth catch hot shards before they cascade into cluster-wide slowdowns.
  • Netdata charts disk I/O wait alongside cache pressure on the hot shard, helping distinguish a storage bottleneck from a routing problem.