MongoDB Too many open files: file descriptor exhaustion and ulimit tuning
Connection timeouts appear in application logs. MongoDB logs show Too many open files or error accepting new connection. New secondaries fail to sync, or a stable node rejects connections after restart. The mongod process hit the OS file descriptor limit. The failure is often silent until a dependent system breaks.
FD exhaustion is not always about connection count. WiredTiger maintains open file descriptors for data files, indexes, and journals. A dense deployment with thousands of collections can hold tens of thousands of FDs in steady state. A connection surge, index build, or restarted node rebuilding caches can push the process over the limit. MongoDB then cannot accept new connections, open new data files, or continue replication.
The most common root cause is a mismatch between the configured limit and how MongoDB uses file descriptors. Operators set ulimit in a shell profile, restart via systemctl, and find the old limit still applies because systemd overrides it.
What this means
MongoDB uses file descriptors for client connections, WiredTiger data and index files, and internal files such as journals and logs. The default mongod configuration allows up to 65,536 incoming connections, but the OS soft limit is often 1,024 or 4,096. Once the process exceeds the limit, system calls return EMFILE and MongoDB rejects connections or logs errors.
WiredTiger maps each collection and index to at least one file, so baseline FD count scales with schema size. In dense deployments, data files can consume more FDs than client connections. Systemd enforces its own limit via LimitNOFILE, overriding /etc/security/limits.conf. When the effective limit is lower than connections plus data files plus journals and logs, the process hits the ceiling.
Common causes
| Cause | What it looks like | First thing to check |
|---|---|---|
| Connection surge | Connection count spikes and logs show error accepting new connection | `ls /proc/ |
| Dense collections and indexes | Steady-state FD count is already high; new secondaries struggle | db.adminCommand({listDatabases: 1}) and per-collection stats |
| systemd overriding limits.conf | limits.conf is set to 64000 but /proc/<PID>/limits shows 4096 | `cat /proc/ |
| File descriptor leak | FD count grows faster than connection count over days | Compare FD growth to connections.totalCreated and current |
Quick checks
# Pick the mongod PID. -x matches the exact process name and -o picks the oldest match; if several mongod processes run on this host, set MONGOD_PID by hand.
MONGOD_PID=$(pgrep -o -x mongod)
# Current FD count
ls /proc/$MONGOD_PID/fd | wc -l
# Effective limit for the running process
cat /proc/$MONGOD_PID/limits | grep "Max open files"
# System-wide ceiling on open file handles, summed across all processes
cat /proc/sys/fs/file-max
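# Open files grouped by type (TYPE is the fifth column of default lsof output)
lsof -p $MONGOD_PID | awk 'NR>1 {print $5}' | sort | uniq -c | sort -rn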
# MongoDB connection count
mongosh --quiet --eval 'db.serverStatus().connections.current'
# systemd unit limit
systemctl cat mongod | grep -i limitnofile
These checks are read-only and safe during an incident.
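The first two checks can be combined into a single utilization figure. The following is a minimal sketch, assuming a Linux host with /proc mounted and permission to read the target process's fd directory; `fd_utilization` is a hypothetical helper name, not part of any MongoDB tooling.

```shell
# Print open-FD utilization for a PID as a percentage of its soft limit.
# Reads only /proc, so it is as safe as the individual checks above.
fd_utilization() {
  local pid="$1" open limit
  # Number of entries in the process's fd directory (needs same user or root).
  open=$(ls "/proc/$pid/fd" 2>/dev/null | wc -l)
  # Field 4 of the "Max open files" row is the soft limit; field 5 is the hard limit.
  limit=$(awk '/^Max open files/ {print $4}' "/proc/$pid/limits" 2>/dev/null)
  case "$limit" in
    ''|*[!0-9]*) echo "limit unavailable for PID $pid" >&2; return 1 ;;
  esac
  echo "open=$open limit=$limit pct=$(( open * 100 / limit ))"
}
```

Calling `fd_utilization "$MONGOD_PID"` prints one line with the open count, the soft limit, and the integer percentage, which is easy to log or compare against a threshold.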
How to diagnose it
- Confirm the effective limit. Read /proc/<PID>/limits. This shows the kernel-enforced limit, ignoring shell profiles.
- Count open FDs. ls /proc/<PID>/fd | wc -l. If this is within 10% of the limit, exhaustion is imminent.
- Correlate FDs with connections. Compare db.serverStatus().connections.current to the FD count. Low connections with high FDs points to data files or journals.
- Check schema density. db.adminCommand({listDatabases: 1}) and count collections and indexes. Rapid schema growth explains high baseline usage.
- Identify init system enforcement. If /proc/<PID>/limits is lower than limits.conf, mongod was likely started via systemd. LimitNOFILE in the unit file or a drop-in takes precedence over limits.conf.
- Look for leaks. If FDs grow while connections.current and schema size stay flat, the application may be leaking connections, or WiredTiger may be failing to close idle files. Check db.serverStatus().connections.totalCreated for churn.
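The comparison of FDs against connections can be approximated without lsof by reading where each descriptor points. This is a rough sketch, assuming Linux /proc semantics; `fd_breakdown` is a hypothetical helper, and the socket count includes listening and replication sockets, not only client connections.

```shell
# Classify a process's open FDs by the target of each /proc/<PID>/fd symlink.
fd_breakdown() {
  local pid="$1" sockets=0 paths=0 other=0 fd target
  for fd in "/proc/$pid/fd"/*; do
    # Skip descriptors that closed between listing and reading.
    target=$(readlink "$fd" 2>/dev/null) || continue
    case "$target" in
      socket:*) sockets=$((sockets + 1)) ;;  # network and unix sockets
      /*)       paths=$((paths + 1)) ;;      # data files, journals, logs, devices
      *)        other=$((other + 1)) ;;      # pipes, eventfd, epoll, and similar
    esac
  done
  echo "sockets=$sockets paths=$paths other=$other"
}
```

A `paths` count far above `sockets` points toward schema density, while a `sockets` count that tracks connections.current points toward a connection surge.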
flowchart TD
A[Errors or connection rejects] --> B{Check /proc/PID/limits}
B -->|Limit too low| C[systemd or limits.conf mismatch]
B -->|Limit adequate| D{Check FD count vs connections}
D -->|FDs >> connections| E[Schema density or leak]
D -->|FDs ~ connections| F[Connection surge]
E --> G[Count collections/indexes]
G -->|Growth normal| H[Investigate leak or restart]
G -->|Growth high| I[Raise limit or shard]
F --> J[Throttle clients or raise limit]
C --> K[Apply systemd drop-in and reload]
Metrics and signals to monitor
| Signal | Why it matters | Warning sign |
|---|---|---|
| FD utilization | Direct measure of headroom before EMFILE | >80% of effective limit |
| Connection count | Each connection consumes approximately one FD | current trending toward maxIncomingConnections |
| Database and collection count | WiredTiger opens files per collection and index | Rapid growth in multi-tenant schemas |
| Connection errors | Early indicator that MongoDB is rejecting work | Sustained error accepting new connection in logs |
| Connection churn | High totalCreated rate without high current suggests leak | totalCreated delta growing while current is flat |
Fixes
Raise the OS file descriptor limit
The recommended production floor is 64,000 for both nofile and nproc. Edit /etc/security/limits.conf or a file in /etc/security/limits.d/:
<mongodb-user> hard nofile 128000
<mongodb-user> soft nofile 128000
<mongodb-user> hard nproc 64000
<mongodb-user> soft nproc 64000
These limits are applied by pam_limits at login, so they take effect only after a new user session and a restart of mongod from that session.
Apply a systemd override
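On systemd hosts, limits.conf does not apply to services that systemd starts, because pam_limits only runs for PAM login sessions. The unit's LimitNOFILE setting, or the systemd default when it is unset, decides the limit. Create a drop-in: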
# /etc/systemd/system/mongod.service.d/override.conf
[Service]
LimitNOFILE=128000
Then reload and restart:
systemctl daemon-reload
systemctl restart mongod
Warning: systemctl restart mongod is disruptive. It interrupts all connections and can trigger an election on replica sets.
After restart, verify with /proc/<PID>/limits.
Account for schema density
For dense schemas, estimate baseline FD demand: one per connection, two or more per collection, one per additional index, plus journals and logs. If the baseline approaches 64,000, raise the limit to 128,000 before the next growth phase. A higher limit has no performance penalty; only the per-process FD table consumes kernel memory.
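The rule of thumb above reduces to simple arithmetic. A minimal sketch follows; the function names are hypothetical, the default overhead of 100 FDs for journals and logs is an assumption rather than a measured figure, and the 80% threshold is the warning level used elsewhere in this guide.

```shell
# estimate_fd_demand CONNECTIONS COLLECTIONS EXTRA_INDEXES [OVERHEAD]
# One FD per connection, two per collection, one per additional index,
# plus an assumed overhead for journals and logs (default 100).
estimate_fd_demand() {
  local conns="$1" colls="$2" extra_idx="$3" overhead="${4:-100}"
  echo $(( conns + 2 * colls + extra_idx + overhead ))
}

# exceeds_headroom DEMAND LIMIT
# Succeeds when the estimated demand is at or above 80% of the limit.
exceeds_headroom() {
  [ $(( $1 * 100 )) -ge $(( $2 * 80 )) ]
}
```

For example, `estimate_fd_demand 5000 20000 30000` prints 75100, and `exceeds_headroom 75100 64000` succeeds, which signals that a 64,000 limit is too low for that schema.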
Address connection leaks
If FD growth exceeds connection growth, check for application-side connection leaks. Ensure drivers use connection pooling correctly and that clients close cursors.
As a temporary relief, restarting mongod closes all FDs and resets counts. Warning: this is disruptive and does not fix the leak.
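Connection churn can be measured by sampling connections.totalCreated twice and dividing the difference by the interval. The sketch below assumes mongosh can reach the local instance without extra flags and that the counter prints as a plain integer; `churn_rate` and `sample_churn` are hypothetical helper names.

```shell
# Connections created per second between two totalCreated samples.
churn_rate() {
  local before="$1" after="$2" seconds="$3"
  echo $(( (after - before) / seconds ))
}

# Sample the counter twice, INTERVAL seconds apart (default 60).
sample_churn() {
  local interval="${1:-60}" t0 t1
  t0=$(mongosh --quiet --eval 'db.serverStatus().connections.totalCreated') || return 1
  sleep "$interval"
  t1=$(mongosh --quiet --eval 'db.serverStatus().connections.totalCreated') || return 1
  # Refuse to compute if either sample is not a plain integer.
  case "$t0$t1" in
    ''|*[!0-9]*) echo "unexpected mongosh output" >&2; return 1 ;;
  esac
  churn_rate "$t0" "$t1" "$interval"
}
```

A high rate while connections.current stays flat suggests clients are opening and discarding connections instead of reusing a pool.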
Prevention
- Set the limit for peak plus headroom. Use 64,000 as a minimum. Dense multi-tenant deployments often need 128,000 or more.
- Verify limits after every restart. Automate a check that compares /proc/<PID>/limits against your intended value. Package upgrades can reset systemd unit files.
- Monitor FD utilization as a percentage. Alert when utilization exceeds 80%.
- Audit schema growth. Rapid creation of collections or indexes increases the baseline FD footprint. Track collection and index counts alongside connection counts.
- Ensure systemd drop-ins survive upgrades. Store override files in /etc/systemd/system/mongod.service.d/ rather than editing the vendor unit file directly.
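The post-restart verification can be automated with a small check suitable for a cron job or a deployment hook. This is a sketch under the assumption of a Linux host with /proc; `check_nofile_limit` is a hypothetical name and the intended value is whatever you configured.

```shell
# check_nofile_limit PID EXPECTED
# Returns 0 when the soft nofile limit is at least EXPECTED,
# 1 when it is lower, and 2 when the limit cannot be read.
check_nofile_limit() {
  local pid="$1" expected="$2" actual
  actual=$(awk '/^Max open files/ {print $4}' "/proc/$pid/limits" 2>/dev/null)
  case "$actual" in
    ''|*[!0-9]*) echo "cannot read limit for PID $pid" >&2; return 2 ;;
  esac
  if [ "$actual" -lt "$expected" ]; then
    echo "FAIL: soft nofile=$actual, expected >= $expected" >&2
    return 1
  fi
  echo "OK: soft nofile=$actual"
}
```

A typical call would be `check_nofile_limit "$(pgrep -o -x mongod)" 128000`, with the non-zero exit status wired into whatever alerting the host already uses.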
How Netdata helps
- Correlate process-level FD count with MongoDB connection metrics and error logs.
- Chart FD utilization as a percentage of the effective limit.
- Monitor totalCreated connection deltas alongside current to distinguish leaks from surges.
- Compare systemd unit limits against process usage to catch override mismatches after restarts.
Related guides
- How MongoDB actually works in production: a mental model for operators: /guides/mongodb/how-mongodb-works-in-production/
- MongoDB pages evicted by application threads: when eviction becomes user latency: /guides/mongodb/mongodb-application-thread-evictions/
- MongoDB WiredTiger cache dirty ratio high: the leading indicator nobody watches: /guides/mongodb/mongodb-cache-dirty-ratio-high/
- MongoDB WiredTiger cache pressure cascade: eviction stalls and latency spikes: /guides/mongodb/mongodb-cache-pressure-cascade/
- MongoDB cache too small: sizing the WiredTiger cache for your working set: /guides/mongodb/mongodb-cache-undersized-working-set/
- MongoDB checkpoint duration climbing: diagnosing slow WiredTiger checkpoints: /guides/mongodb/mongodb-checkpoint-duration-high/
- MongoDB checkpoint stall write freeze: when all writes stop with no error: /guides/mongodb/mongodb-checkpoint-stall-write-freeze/
- MongoDB connection storm spiral: reconnection floods after an election or deploy: /guides/mongodb/mongodb-connection-storm-spiral/
- MongoDB flow control throttling writes: when the primary slows itself down: /guides/mongodb/mongodb-flow-control-throttling-writes/
- MongoDB journal sync latency high: the storage signal that warns 60 seconds early: /guides/mongodb/mongodb-journal-sync-latency-high/
- MongoDB monitoring checklist: the signals every production cluster needs: /guides/mongodb/mongodb-monitoring-checklist/
- MongoDB monitoring maturity model: from survival to expert: /guides/mongodb/mongodb-monitoring-maturity-model/







