MongoDB connection refused at maxIncomingConnections: hitting the connection ceiling
Application logs show connection timeouts. MongoDB logs show connection refused or error accepting new connection. db.serverStatus().connections shows current well below the configured maximum. This disconnect means you are hitting a hard ceiling at the TCP accept layer, not experiencing gradual degradation.
MongoDB uses a one-thread-per-connection model. Each accepted connection consumes at least one file descriptor and a thread stack sized by the OS ulimit -s. While maxIncomingConnections sets the logical inbound cap, the OS file-descriptor limit (ulimit -n) usually enforces the actual ceiling. Rejections happen before the connection handshake completes, so serverStatus().connections.current never counts refused connections. Look at logs, ratios, and OS-level resource counts to find the real limit.
What this means
maxIncomingConnections controls how many simultaneous inbound TCP connections MongoDB will accept. The default is 65,536. Practical ceilings are lower and are set by RAM, thread scheduling overhead, WiredTiger ticket contention, and file descriptors consumed by data files, journals, and indexes.
serverStatus().connections.current only tracks accepted connections. If the OS file-descriptor limit is lower than maxIncomingConnections – which is common – the server logs error accepting new connection while current sits far below the configured maximum. If maxIncomingConnections is the binding constraint, the log shows connection refused because there are too many open connections. In both cases, rejections occur at accept time and are invisible to the server counters.
flowchart TD
A[Inbound TCP connection] --> B{OS FD limit reached?}
B -->|Yes| C[Error accepting new connection]
B -->|No| D{MongoDB maxIncomingConnections reached?}
D -->|Yes| E[Connection refused too many open connections]
D -->|No| F[Connection accepted]
F --> G[Thread plus FD allocated]
G --> H[connections.current incremented]
C --> I[Not reflected in serverStatus]
E --> I
Common causes
| Cause | What it looks like | First thing to check |
|---|---|---|
| Connection pool bloat | current count is high and flat; few application hosts hold most connections | db.serverStatus().connections and db.currentOp() grouped by client |
| Connection leak | current high but active low; connections never drop after traffic spikes | active versus current ratio in serverStatus().connections |
| Connection storm | Sudden spike in current and totalCreated; memory RSS jumps simultaneously | MongoDB logs for election or network events; totalCreated delta over 10 seconds |
| OS file descriptor exhaustion | error accepting new connection appears before current reaches 80% of max | /proc/<pid>/limits and `ls /proc/ |
| Long-running operations pinning connections | Connections idle but not closing; ticket utilization drops | db.currentOp() for operations with high secs_running |
Quick checks
Run these safe, read-only checks to assess the boundary that is actually enforcing the ceiling.
# Check MongoDB connection counters
mongosh --quiet --eval 'db.serverStatus().connections'
# Check active versus current ratio
mongosh --quiet --eval 'var c = db.serverStatus().connections; print("current:", c.current, "available:", c.available, "active:", c.active, "totalCreated:", c.totalCreated)'
# Search logs for refusals and accept errors (adjust path for your deployment)
grep -iE 'connection refused|error accepting|too many open connections' /var/log/mongodb/mongod.log | tail -n 20
# Check OS file descriptor usage against the process limit (the soft limit, the first number shown, is the one enforced)
# NOTE: If multiple mongod processes exist on this host, specify the correct PID manually.
MONGO_PID=$(pgrep -x mongod)
ls /proc/$MONGO_PID/fd | wc -l
grep 'Max open files' /proc/$MONGO_PID/limits
# Check memory RSS
mongosh --quiet --eval 'db.serverStatus().mem.resident'
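# List current connections by client host to find heavy sources
# currentOp(true) includes idle connections; the port is stripped so connections from one host group together
mongosh --quiet --eval 'db.currentOp(true).inprog.forEach(function(o){ print((o.client || "local").split(":")[0]) })' | sort | uniq -c | sort -rn | head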
How to diagnose it
Confirm the exact rejection site.
- Confirm the exact rejection site. error accepting new connection points to the OS file-descriptor limit. connection refused because there are too many open connections points to maxIncomingConnections.
- Calculate the MongoDB utilization ratio. Divide current by current + available. Sustained ratios above 80% degrade performance through thread overhead and ticket contention, even if rejections have not started.
- Check the real OS limit. File-descriptor exhaustion produces the same client-side symptom but requires a different fix. Compare the fd count in /proc/<pid>/fd to Max open files in /proc/<pid>/limits.
- Identify connection sources. Group db.currentOp() by client. A few hosts holding hundreds of connections each indicates oversized driver pools. A broad distribution across many hosts suggests a legitimate traffic surge or a reconnection storm.
- Measure connection churn. Sample totalCreated over a 10-second window. If current is stable but totalCreated climbs rapidly, pools are thrashing. Churn burns CPU on thread creation and destroys latency before the connection count reaches the ceiling.
- Correlate with recent events. Elections, rolling application deploys, DNS changes, or load-balancer health-check failures can trigger synchronized reconnections. Check rs.status() for recent election timestamps and match them to the start of connection spikes in your metrics.
- Check RSS growth. If mem.resident grows in step with connections.current, thread stack allocation is the dominant memory consumer. If RSS is high but connections are low, look for cursor leaks or large aggregations instead.
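The ratio and churn steps above reduce to simple arithmetic on two serverStatus().connections samples. The following is a minimal sketch, not official tooling: the function names conn_util_pct, churn_per_sec, and conn_field are hypothetical helpers introduced here, and conn_field assumes mongosh is on the PATH and can reach the server without extra connection options.

```shell
# Sketch: utilization ratio and churn rate from serverStatus().connections.
# All function names are illustrative and not part of any MongoDB tooling.

# conn_util_pct CURRENT AVAILABLE -> current / (current + available) as a percentage
conn_util_pct() {
  awk -v c="$1" -v a="$2" 'BEGIN {
    if (c + a == 0) { print "0.0"; exit }
    printf "%.1f\n", 100 * c / (c + a)
  }'
}

# churn_per_sec BEFORE AFTER SECONDS -> new connections per second between two totalCreated samples
churn_per_sec() {
  awk -v before="$1" -v after="$2" -v secs="$3" 'BEGIN {
    printf "%.1f\n", (after - before) / secs
  }'
}

# conn_field NAME -> one counter from a live mongod (assumes default local connection)
conn_field() {
  mongosh --quiet --eval "print(String(db.serverStatus().connections.$1))"
}

# Example against a live server (10-second sampling window):
#   before=$(conn_field totalCreated); sleep 10; after=$(conn_field totalCreated)
#   echo "utilization: $(conn_util_pct "$(conn_field current)" "$(conn_field available)")%"
#   echo "churn: $(churn_per_sec "$before" "$after" 10) connections/s"
```

Keeping the arithmetic in separate functions means the same checks can be run on values copied from a dashboard or an incident ticket when the server itself is not reachable.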
Metrics and signals to monitor
| Signal | Why it matters | Warning sign |
|---|---|---|
| Connection utilization ratio | current / (current + available) is the only reliable headroom metric inside MongoDB | Sustained above 80% |
| Connection churn rate | totalCreated delta reveals pool thrashing that static counts hide | totalCreated increasing faster than current by 5x or more over 10 s |
| File descriptor utilization | OS limits enforce the real ceiling before maxIncomingConnections | FD count above 80% of ulimit -n |
| Memory RSS | Each connection allocates thread stack memory; RSS spikes predict OOM | RSS growth correlating with connection spikes |
| Log rejection messages | Rejections are invisible in serverStatus; logs are the primary source | Any sustained rate of connection refused or error accepting messages |
| Active to current ratio | Idle connections waste resources without doing work | active / current below 20% sustained |
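As a rough illustration of the file-descriptor warning sign in the table, the check below compares an open-descriptor count to a limit and flags anything above a threshold. It is a sketch under stated assumptions: fd_status is a hypothetical helper name, the 80% default mirrors the table, and the limit passed in must be numeric (a limit reported as "unlimited" is not handled).

```shell
# Sketch: flag file-descriptor utilization above a warning threshold.
# fd_status is an illustrative name, not an existing tool.

# fd_status OPEN LIMIT [THRESHOLD_PCT] -> prints "OK <pct>%" or "WARN <pct>%"
fd_status() {
  awk -v o="$1" -v l="$2" -v t="${3:-80}" 'BEGIN {
    pct = int(100 * o / l)                    # whole-percent utilization, truncated
    printf "%s %d%%\n", (pct > t ? "WARN" : "OK"), pct
  }'
}

# Example against a live process (single mongod on the host assumed):
#   pid=$(pgrep -x mongod)
#   open=$(ls "/proc/$pid/fd" | wc -l)
#   limit=$(awk '/Max open files/ {print $4}' "/proc/$pid/limits")   # soft limit column
#   fd_status "$open" "$limit"
```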
Fixes
Raise or fix the OS file descriptor limit
If ulimit -n is the blocker, increase LimitNOFILE in the systemd service unit or /etc/security/limits.conf and restart mongod. This requires a rolling restart in a replica set. Warning: restarting a primary triggers an election; plan maintenance windows accordingly.
Tradeoff: higher fd limits prevent accept errors but do not reduce memory or ticket pressure from open connections.
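For systemd-managed hosts, one way to raise the limit without editing the packaged unit file is a drop-in override. The sketch below assumes the service is named mongod.service; the helper name write_nofile_dropin and the drop-in file name limits.conf are hypothetical, and 64000 is only an example value to adapt to your deployment.

```shell
# Sketch: write a systemd drop-in that raises LimitNOFILE.
# write_nofile_dropin and the file name limits.conf are illustrative choices.

# write_nofile_dropin DIR LIMIT -> creates DIR/limits.conf with a [Service] override
write_nofile_dropin() {
  mkdir -p "$1" || return 1
  printf '[Service]\nLimitNOFILE=%s\n' "$2" > "$1/limits.conf"
}

# Example (as root), assuming the unit is mongod.service:
#   write_nofile_dropin /etc/systemd/system/mongod.service.d 64000
#   systemctl daemon-reload
#   systemctl restart mongod        # on a primary this triggers an election
#   grep 'Max open files' "/proc/$(pgrep -x mongod)/limits"
```

The final grep confirms that the running process actually picked up the new limit, since a changed unit file has no effect until the service restarts.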
Right-size application connection pools
If db.currentOp() shows a few hosts holding hundreds of connections, reduce the driver maxPoolSize. Tradeoff: smaller pools may increase application-side queue latency under burst load, but they prevent server-side thread exhaustion.
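A back-of-the-envelope budget can help pick a maxPoolSize. The sketch below divides the server ceiling, minus a reserve for monitoring, replication, and administrative connections, across the client processes that each hold their own pool. The helper name max_pool_size and the 20% default reserve are assumptions made for illustration, not recommendations from MongoDB.

```shell
# Sketch: per-client pool budget that leaves headroom below the server ceiling.
# max_pool_size and the 20% default reserve are illustrative assumptions.

# max_pool_size CEILING CLIENTS [RESERVE_PCT] -> largest per-client maxPoolSize, rounded down
max_pool_size() {
  awk -v c="$1" -v h="$2" -v r="${3:-20}" 'BEGIN {
    printf "%d\n", int(c * (100 - r) / 100 / h)
  }'
}

# Example: a 5000-connection ceiling shared by 40 client processes with 20% reserved
#   max_pool_size 5000 40 20
```

CLIENTS should count every process that creates its own client object, not just hosts, because each one maintains a separate pool.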
Fix connection leaks
If active is low and current is high, applications are not closing connections. Fix client code to close cursors and dispose of clients on shutdown. Tradeoff: requires an application deploy, but it is the only fix that stops memory growth.
Shed load during a connection storm
During a storm, block or rate-limit new connections at the application tier or load balancer until the storm subsides. Do not restart mongod during an active storm unless the instance is unresponsive. If thread overhead consistently drives OOM, plan a maintenance window to lower net.maxIncomingConnections and restart with a harder boundary.
Kill or timeout long-running operations
If connections are held open by slow operations, use db.killOp() on non-essential long runners, or set maxTimeMS in application queries. Tradeoff: killed operations fail on the client, but they free threads and tickets for new connections.
Prevention
- Monitor current / (current + available), not just absolute connection count.
- Trend totalCreated deltas to detect churn before pools max out.
- Monitor OS file-descriptor utilization alongside MongoDB connection metrics.
- Keep application connection pools sized with headroom; never sum pool maximums to equal the server ceiling.
- Run monitoring tools and backups against hidden secondaries to reserve primary capacity for application traffic.
- Review db.currentOp() for the longest-running operation age during peak hours to catch connection holders early.
How Netdata helps
- Netdata collects serverStatus().connections automatically and charts current, available, and totalCreated deltas, making churn visible without manual sampling.
- Connection utilization percentage is computed and alerted when the ratio crosses 80%.
- Connection spikes are correlated on the same timeline with RSS, WiredTiger ticket utilization, and opLatencies, so you can distinguish a connection storm from a storage-layer slowdown.
- Netdata alerts on rapid totalCreated increases that indicate pool thrashing, even when current appears stable.
- File descriptor utilization is collected at the OS level and correlated with the MongoDB process, exposing the ulimit -n boundary that MongoDB metrics alone cannot show.
Related guides
- How MongoDB actually works in production: a mental model for operators: /guides/mongodb/how-mongodb-works-in-production/
- MongoDB pages evicted by application threads: when eviction becomes user latency: /guides/mongodb/mongodb-application-thread-evictions/
- MongoDB WiredTiger cache dirty ratio high: the leading indicator nobody watches: /guides/mongodb/mongodb-cache-dirty-ratio-high/
- MongoDB WiredTiger cache pressure cascade: eviction stalls and latency spikes: /guides/mongodb/mongodb-cache-pressure-cascade/
- MongoDB cache too small: sizing the WiredTiger cache for your working set: /guides/mongodb/mongodb-cache-undersized-working-set/
- MongoDB checkpoint duration climbing: diagnosing slow WiredTiger checkpoints: /guides/mongodb/mongodb-checkpoint-duration-high/
- MongoDB checkpoint stall write freeze: when all writes stop with no error: /guides/mongodb/mongodb-checkpoint-stall-write-freeze/
- MongoDB connection storm spiral: reconnection floods after an election or deploy: /guides/mongodb/mongodb-connection-storm-spiral/
- MongoDB flow control throttling writes: when the primary slows itself down: /guides/mongodb/mongodb-flow-control-throttling-writes/
- MongoDB journal sync latency high: the storage signal that warns 60 seconds early: /guides/mongodb/mongodb-journal-sync-latency-high/
- MongoDB monitoring checklist: the signals every production cluster needs: /guides/mongodb/mongodb-monitoring-checklist/
- MongoDB monitoring maturity model: from survival to expert: /guides/mongodb/mongodb-monitoring-maturity-model/







