MongoDB storage not reclaimed after delete: WiredTiger, compact, and resync

You deleted a large portion of a collection, and db.collection.stats() confirms the logical size dropped. df -h on the data volume does not. The filesystem blocks were not returned. WiredTiger maintains internal free-space lists and reclaims pages for future writes rather than shrinking files. Dropping a database or collection removes the underlying files and returns space to the OS immediately; this article addresses the case where documents are deleted but the collection files remain. This guide explains how to verify the state and the two operational paths to return blocks to the OS: the compact command and replica set member resync.

What this means

WiredTiger does not return freed disk blocks to the OS after document deletes or in-place updates that reduce record size. Removed documents leave pages marked free inside the block manager; those pages become eligible for reuse by future writes to the same collection. Logical size decreases, but allocated on-disk size typically does not shrink. This avoids expensive file-resize operations during normal writes, but it creates a persistent gap between logical data volume and physical disk consumption.

flowchart LR
    A[Bulk delete] --> B[WiredTiger marks pages free]
    B --> C{Need OS space back?}
    C -->|No| D[Reused by future writes]
    C -->|Yes| E[compact command]
    C -->|Yes| F[Resync member]
    E --> G[Files rewritten, blocks returned to OS]
    F --> G

Common causes

| Cause | What it looks like | First thing to check |
| --- | --- | --- |
| WiredTiger internal free space | size drops after delete; storageSize and filesystem usage stay flat | db.collection.stats(): compare size to storageSize |
| Index fragmentation | Index size dominates collection footprint after deletions | db.collection.stats().totalIndexSize and indexSizes |
| Oplog capped collection | Fixed-size oplog consumes predictable disk regardless of data deletes | rs.printReplicationInfo() |

Quick checks

# Filesystem utilization and actual directory size
df -h /data/db
du -sh /data/db
ls -lh /data/db

// Database-level logical vs allocated size
// In recent versions this also returns fsUsedSize and fsTotalSize
db.stats()

// Per-collection breakdown (sizes in MB)
db.getCollectionNames().forEach(function(c) {
  // getCollection() avoids clashes between collection names and db methods
  var s = db.getCollection(c).stats();
  printjson({
    collection: c,
    dataMB: (s.size/1024/1024).toFixed(1),
    storageMB: (s.storageSize/1024/1024).toFixed(1),
    indexMB: (s.totalIndexSize/1024/1024).toFixed(1),
    // freeStorageSize can legitimately be 0, so test for presence, not truthiness
    freeStorageMB: s.freeStorageSize !== undefined ? (s.freeStorageSize/1024/1024).toFixed(1) : "N/A"
  });
});

// Check oplog footprint
rs.printReplicationInfo()

// List all databases with on-disk size
db.adminCommand({ listDatabases: 1 })

How to diagnose it

  1. Run db.collection.stats() on the affected collection. Compare size (logical, uncompressed) to storageSize (allocated on-disk, compressed). Because storageSize reflects compressed data, it is normally smaller than size, so a storageSize several times larger than size after a bulk delete (for example above 3:1) indicates significant internal free space. In MongoDB 4.4+, freeStorageSize reports how many bytes inside the collection file are available for reuse, which approximates what compact can reclaim. If that value accounts for most of the disk space you expected the delete to free, the collection is your primary target.
  2. Check filesystem utilization with df -h and directory size with du -sh on dbPath. Remember that du includes journal files, diagnostic logs, and other metadata files in addition to WiredTiger data files. If du matches pre-delete values and no other databases or collections grew, WiredTiger has not returned blocks to the OS.
  3. Identify the largest consumers by running db.collection.stats() across all collections. Look for collections where storageSize is disproportionately large relative to size, or where totalIndexSize is unexpectedly large.
  4. Check the oplog. rs.printReplicationInfo() shows the configured oplog size and the time window it currently covers. The oplog is a capped collection and does not shrink when data is deleted.
  5. Check db.stats(). If fsUsedSize and fsTotalSize are present, they show filesystem consumption directly from MongoDB’s perspective.
  6. Decide whether you need immediate filesystem reclamation. If the collection will grow again, the freed space will be reused naturally. If disk is critically full, proceed to compact or resync.
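
The comparison in steps 1 and 3 can be sketched as a small helper that takes a stats document and reports how much of the allocation is reusable. This is a minimal sketch, not an official tool: the function name summarizeReclaimable and the 20% default threshold are assumptions made for illustration, while size, storageSize, totalIndexSize, and freeStorageSize are the fields returned by db.collection.stats().

```javascript
// Summarize one collStats-style document (a plain object) for reclaim triage.
// summarizeReclaimable and freeRatioThreshold are hypothetical names.
function summarizeReclaimable(stats, freeRatioThreshold = 0.2) {
  const toMB = (bytes) => Number((bytes / 1024 / 1024).toFixed(1));
  // freeStorageSize is absent on older versions, so check for presence.
  const hasFree = typeof stats.freeStorageSize === "number";
  const freeRatio = hasFree && stats.storageSize > 0
    ? stats.freeStorageSize / stats.storageSize
    : null;
  return {
    dataMB: toMB(stats.size),
    storageMB: toMB(stats.storageSize),
    indexMB: toMB(stats.totalIndexSize),
    freeMB: hasFree ? toMB(stats.freeStorageSize) : null,
    freeRatio: freeRatio === null ? null : Number(freeRatio.toFixed(2)),
    // Flag the collection only when the reusable share exceeds the threshold.
    compactCandidate: freeRatio !== null && freeRatio > freeRatioThreshold
  };
}

// Example with made-up numbers: 2 GiB allocated, 1.5 GiB of it reusable.
console.log(summarizeReclaimable({
  size: 400 * 1024 * 1024,
  storageSize: 2048 * 1024 * 1024,
  totalIndexSize: 100 * 1024 * 1024,
  freeStorageSize: 1536 * 1024 * 1024
}));
```

In mongosh the same function could be fed live data, for example summarizeReclaimable(db.getCollection("orders").stats()), where "orders" is a placeholder collection name.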

Metrics and signals to monitor

| Signal | Why it matters | Warning sign |
| --- | --- | --- |
| Filesystem usage (df -h) | MongoDB crashes when it cannot write journal entries | >85% sustained |
| storageSize per collection | True on-disk footprint including internal free space | Growing while size is flat |
| freeStorageSize (MongoDB 4.4+) | Direct measure of reclaimable space inside a collection | >20% of storageSize after bulk deletes |
| Database dataSize to storageSize ratio | Indicates fragmentation and reclaimable space | Ratio drops significantly after bulk deletes |
| Oplog window | Fixed cap on oplog size affects disk planning | Window below expected maintenance duration |
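
As an illustration of how the numeric warning signs above might be evaluated, here is a hedged sketch. The input shape and the name evaluateStorageSignals are hypothetical; only the 85% and 20% limits and the oplog-window rule come from the table. It checks a single sample, so a "sustained" condition would need the same check repeated over time.

```javascript
// Evaluate one sample against the warning thresholds quoted in the table.
// The sample object layout is an assumption made for this sketch.
function evaluateStorageSignals(sample) {
  const warnings = [];
  const fsUsedPct = (sample.fsUsedBytes / sample.fsTotalBytes) * 100;
  if (fsUsedPct > 85) {
    warnings.push("filesystem usage above 85%");
  }
  // Guard against an empty collection to avoid dividing by zero.
  if (sample.storageSize > 0 && sample.freeStorageSize / sample.storageSize > 0.2) {
    warnings.push("freeStorageSize above 20% of storageSize");
  }
  if (sample.oplogWindowHours < sample.maintenanceHours) {
    warnings.push("oplog window shorter than planned maintenance");
  }
  return warnings;
}

// Example with made-up values; every field here is illustrative.
console.log(evaluateStorageSignals({
  fsUsedBytes: 900, fsTotalBytes: 1000,
  freeStorageSize: 300, storageSize: 1000,
  oplogWindowHours: 10, maintenanceHours: 24
}));
```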

Fixes

Reclaim space with compact

compact rewrites collection and index data, defragmenting files and releasing obsolete blocks to the filesystem. It is a local operation; run it on each replica set member individually. On sharded clusters, compact each shard independently.

Tradeoffs and risks:

  • Locking behavior depends on version. Before MongoDB 4.4, compact blocks all operations on the database that contains the collection. From 4.4 onward it blocks only metadata operations such as dropping the collection or creating and dropping indexes; reads and writes continue, but the extra I/O can still raise latency.
  • The operation can require temporary free space equal to the collection’s current storageSize. Do not run it on a volume already near full. If df -h shows less than roughly storageSize bytes free, use resync instead.
  • On a replica set, perform a rolling compact. Run on secondaries first, then compact the primary during a maintenance window. Do not run on multiple secondaries simultaneously if it risks degrading application read capacity.
  • On the primary, compact competes with application traffic and, on versions before 4.4, blocks writes outright. If your application cannot tolerate that, step the primary down first and compact it as a secondary. Depending on version, running compact on a primary may also require the force: true option.
  • Large collections on multi-terabyte nodes may take hours. Monitor mongod logs for progress.
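
The free-space rule of thumb in the list above can be written as a pre-flight check. This is a sketch under stated assumptions: chooseReclaimPath is a hypothetical helper, and it relies on db.stats() exposing fsTotalSize and fsUsedSize, which only recent versions do.

```javascript
// Pick a reclaim path from db.stats() and db.collection.stats() documents.
// Returns "unknown" when the filesystem fields are not available.
function chooseReclaimPath(dbStats, collStats) {
  if (typeof dbStats.fsTotalSize !== "number" || typeof dbStats.fsUsedSize !== "number") {
    return "unknown"; // fall back to df -h on the data volume
  }
  const fsFreeBytes = dbStats.fsTotalSize - dbStats.fsUsedSize;
  // Require at least storageSize bytes free before attempting compact.
  return fsFreeBytes >= collStats.storageSize ? "compact" : "resync";
}
```

In mongosh this might be called as chooseReclaimPath(db.stats(), db.getCollection("orders").stats()), with "orders" standing in for the real collection name.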

Command:

db.runCommand({ compact: "collectionName" })
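
Because compact must be run member by member, it can help to derive the rolling order from rs.status(). The sketch below is illustrative only: rollingCompactOrder is a hypothetical helper, and it assumes the members array carries the name and stateStr fields that rs.status() reports.

```javascript
// Order replica set members for a rolling compact: secondaries first, primary last.
// Members in any other state (ARBITER, RECOVERING, and so on) are skipped.
function rollingCompactOrder(members) {
  const secondaries = members
    .filter((m) => m.stateStr === "SECONDARY")
    .map((m) => m.name);
  const primaries = members
    .filter((m) => m.stateStr === "PRIMARY")
    .map((m) => m.name);
  return secondaries.concat(primaries);
}
```

In mongosh, rollingCompactOrder(rs.status().members) would return the host list to work through, connecting to each host directly and running the compact command there.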

Reclaim space by resyncing a member

If fragmentation is severe across many collections, or compact would take too long or require too much temporary space, remove the member’s data files and trigger an initial sync. The new files will be written compactly, reclaiming all wasted space.

Tradeoffs and risks:

  • The member is unusable during initial sync. For large datasets this can last hours or days.
  • Cluster redundancy is reduced until the sync completes. Do not resync more than one member at a time in a three-node replica set.
  • Ensure the oplog window on the sync source is large enough to retain operations for the entire sync duration. If the sync falls off the oplog, it will restart and may never complete. If the oplog window is too short, clone another member’s data files using a filesystem snapshot instead.
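
The oplog-window requirement above comes down to simple arithmetic: estimated sync duration versus the window on the sync source. The following is a rough sketch, assuming you have measured a realistic copy throughput yourself. The function name, the throughputMBps parameter, and the safety factor of 2 are all hypothetical, and index builds during initial sync can make the real duration longer than this estimate.

```javascript
// Estimate whether an initial sync can finish inside the sync source's oplog window.
// dataSizeBytes: data to copy; throughputMBps: assumed copy rate in MiB per second.
function syncFitsOplogWindow(dataSizeBytes, throughputMBps, oplogWindowHours, safetyFactor = 2) {
  const seconds = dataSizeBytes / (throughputMBps * 1024 * 1024);
  const estimatedHours = seconds / 3600;
  return {
    estimatedHours: Number(estimatedHours.toFixed(1)),
    // Require the window to cover the estimate multiplied by the safety factor.
    fits: estimatedHours * safetyFactor <= oplogWindowHours
  };
}
```

The oplog window itself can be read from rs.printReplicationInfo() on the intended sync source.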

Procedure:

  1. Stop mongod on the secondary.
  2. Move the existing dbPath contents to a temporary path. Do not delete them until the resync completes successfully.
  3. Create an empty dbPath directory with correct ownership (mongod:mongod or the user running the process).
  4. Restart mongod. The member enters STARTUP2 and begins an initial sync from a sync source chosen among the other members, which is not necessarily the primary.
  5. Confirm the member reaches SECONDARY with rs.status() before removing the backed-up data.
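
Step 5 can be checked programmatically against the output of rs.status(). This is a minimal sketch with hypothetical helper names; it assumes the members array exposes name and stateStr as rs.status() does, and the host string in the usage note is a placeholder.

```javascript
// Look up one member's state string in an rs.status() document.
function memberState(status, host) {
  const member = (status.members || []).find((m) => m.name === host);
  return member ? member.stateStr : null;
}

// The old data files should be kept until the member is a healthy SECONDARY again.
function safeToRemoveBackup(status, host) {
  return memberState(status, host) === "SECONDARY";
}
```

For example, safeToRemoveBackup(rs.status(), "db2.example.net:27017") returns false while the member is still in STARTUP2.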

Prevention

  • Plan capacity using storageSize or sizeOnDisk, not logical document size.
  • Maintain at least 20% free disk space, plus additional headroom for compaction temp files, index rebuilds, and initial sync.
  • Monitor filesystem growth rate independently of document count.
  • Alert on the ratio of storageSize to size for critical collections.
  • For predictable bulk delete workflows, schedule a maintenance window for compaction or a rolling resync before disk usage becomes critical.
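
The first two points can be turned into a sizing calculation. This is a sketch, not a sizing standard: requiredVolumeBytes is a hypothetical helper, the 20% default comes from the list above, and extraHeadroomBytes is whatever allowance you choose for compaction, index rebuilds, and initial sync.

```javascript
// Size a data volume from on-disk footprint rather than logical document size.
// freeFraction is the share of the volume that should stay free (0.2 = 20%).
function requiredVolumeBytes(storageSizeBytes, indexSizeBytes, freeFraction = 0.2, extraHeadroomBytes = 0) {
  const footprint = storageSizeBytes + indexSizeBytes;
  // The footprint may occupy at most (1 - freeFraction) of the volume.
  return Math.ceil(footprint / (1 - freeFraction)) + extraHeadroomBytes;
}
```

Feeding it storageSize and totalIndexSize from db.collection.stats(), or the database-level equivalents from db.stats(), keeps the plan tied to allocated bytes.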

How Netdata helps

  • Tracks filesystem utilization on the MongoDB data volume and alerts before writes fail.
  • Surfaces MongoDB storageSize, dataSize, and freeStorageSize (where available) to expose the gap between logical and physical size.
  • Correlates disk usage trends with opcounters.delete rates to identify collections growing in allocation without growing in documents.
  • Monitors oplog window and replication lag to warn when a member is at risk of falling off the oplog during maintenance.

Related guides

  • How MongoDB actually works in production: a mental model for operators: /guides/mongodb/how-mongodb-works-in-production/
  • MongoDB pages evicted by application threads: when eviction becomes user latency: /guides/mongodb/mongodb-application-thread-evictions/
  • MongoDB WiredTiger cache dirty ratio high: the leading indicator nobody watches: /guides/mongodb/mongodb-cache-dirty-ratio-high/
  • MongoDB WiredTiger cache pressure cascade: eviction stalls and latency spikes: /guides/mongodb/mongodb-cache-pressure-cascade/
  • MongoDB cache too small: sizing the WiredTiger cache for your working set: /guides/mongodb/mongodb-cache-undersized-working-set/
  • MongoDB checkpoint duration climbing: diagnosing slow WiredTiger checkpoints: /guides/mongodb/mongodb-checkpoint-duration-high/
  • MongoDB checkpoint stall write freeze: when all writes stop with no error: /guides/mongodb/mongodb-checkpoint-stall-write-freeze/
  • MongoDB connection churn: high totalCreated rate and thread creation overhead: /guides/mongodb/mongodb-connection-churn/
  • MongoDB connection refused at maxIncomingConnections: hitting the connection ceiling: /guides/mongodb/mongodb-connection-limit-reached/
  • MongoDB connection storm spiral: reconnection floods after an election or deploy: /guides/mongodb/mongodb-connection-storm-spiral/
  • MongoDB exceeded memory limit for $group – aggregation spills and allowDiskUse: /guides/mongodb/mongodb-exceeded-memory-limit-group-sort/
  • MongoDB flow control throttling writes: when the primary slows itself down: /guides/mongodb/mongodb-flow-control-throttling-writes/