MongoDB Authentication failed: credential rotation, brute force, and the log signal
A spike of Authentication failed entries in the MongoDB log is abnormal: in a healthy deployment the baseline is near zero. The signal usually means one of four things: a brute-force attack, an application presenting stale credentials after rotation, an expired x.509 certificate, or an upstream LDAP or Kerberos directory issue. Distinguishing between these quickly determines whether you are facing a security incident or an impending application outage.
The log line is unambiguous: the server rejected the credentials during the initial handshake. This differs from not authorized (error code 13), which means the user authenticated successfully but lacks privileges. Treat Authentication failed and not authorized as a two-step triage. Pure authentication failures point to the credential itself or the trust path to the authentication provider. If both signals appear together, consider that an attacker may be probing accounts and then testing what any account they got into is allowed to do.
What this means
An Authentication failed entry means the client-supplied credential could not be validated. The failure occurs before authorization, so the server never evaluates roles or permissions. In password-based deployments the hash did not match. In x.509 deployments the certificate may be expired, self-signed when a CA is required, or presented by a client whose identity does not map to a known user. In external-directory deployments the MongoDB server may be unable to reach the directory, or the directory may have rejected the bind request.
Even a handful of failures per minute from a single source is suspicious. Exceeding ten per minute from one IP warrants immediate investigation; exceeding one hundred per minute is attack-like. Failures spread across many internal application IPs after a maintenance window usually indicate credential rotation that was not propagated to all clients.
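As a rough illustration, the thresholds above can be written as a small classifier. This is a sketch only: the function name classifyFailureRate and the label strings are hypothetical, and the sample rates are made-up values.

```javascript
// Thresholds taken from the text above: more than 10 failures per minute
// from one IP is suspicious, more than 100 per minute is attack-like.
// The function name and label strings are illustrative assumptions.
function classifyFailureRate(failuresPerMinute) {
  if (failuresPerMinute > 100) return "attack-like";
  if (failuresPerMinute > 10) return "suspicious";
  if (failuresPerMinute > 0) return "above-baseline";
  return "baseline";
}

// Hypothetical per-IP rates, for example from a log aggregation step.
const sampleRatesByIp = { "203.0.113.7": 240, "10.0.4.12": 14, "10.0.4.13": 0 };
for (const [ip, rate] of Object.entries(sampleRatesByIp)) {
  console.log(ip + ": " + classifyFailureRate(rate));
}
```

The boundaries are strict, so exactly 10 per minute still counts as above-baseline rather than suspicious; adjust them if your environment has a different noise floor.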
flowchart TD
    A[Auth failed spike in logs] --> B{"See 'not authorized' code 13?"}
    B -->|Yes| C[Authorization issue: check roles]
    B -->|No| D{"Single IP >10/min?"}
    D -->|Yes| E[Brute-force or scan: check source]
    D -->|No| F{Recent credential rotation?}
    F -->|Yes| G[Stale creds: check app configs]
    F -->|No| H{TLS/x.509 deployment?}
    H -->|Yes| I[Check certificate expiry]
    H -->|No| J[Check LDAP/Kerberos directory health]

Common causes
| Cause | What it looks like | First thing to check |
|---|---|---|
| Brute-force attack | Sustained failures from one or a few external IPs. >10/min is suspicious; >100/min is attack-like. | Source IP distribution in mongod.log |
| Stale credentials after rotation | Spike from known application hosts minutes after a password or certificate change | Correlation between rotation timestamp and first failure |
| Expired x.509 certificate | Sudden failures in deployments using x.509 client certificate authentication | Certificate validity dates with openssl |
| LDAP/Kerberos directory issue | Failures correlate with directory service maintenance or network partition | Directory service health and reachability |
Quick checks
Run these read-only commands to characterize the scope and source of the failures.
# Count authentication failures in the current log
grep -c "Authentication failed" /var/log/mongodb/mongod.log
# View recent failures with source metadata
grep "Authentication failed" /var/log/mongodb/mongod.log | tail -20
# Check for authorization errors (code 13) to separate auth from privilege issues
grep "not authorized" /var/log/mongodb/mongod.log | tail -20
// Check total connections and churn to see if apps are retrying aggressively
var c = db.serverStatus().connections;
print("Current: " + c.current + ", Available: " + c.available + ", Total created: " + c.totalCreated);
// Per-mechanism authentication counters (received vs. successful), reported by serverStatus on MongoDB 4.4 and later
db.serverStatus().security.authentication
# Check server TLS certificate expiry. For x.509 authentication, also inspect client certificate files on application hosts.
echo | openssl s_client -connect localhost:27017 2>/dev/null | openssl x509 -noout -dates
// Verify TLS mode (falls back to the legacy ssl section; tolerates a missing net section)
var net = db.adminCommand({getCmdLineOpts: 1}).parsed.net || {};
print("TLS mode: " + (net.tls ? net.tls.mode : (net.ssl ? net.ssl.mode : "disabled")));
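The grep checks above show raw lines; grouping them by source and minute makes the rate per IP visible. A minimal sketch, assuming structured JSON log lines (MongoDB 4.4 and later) with the timestamp at t.$date and the client address at attr.remote or attr.client. Those field names and the function name failuresPerIpPerMinute are assumptions for illustration, so check them against your own log lines first.

```javascript
// Assumption: each log line is one JSON object with the timestamp at
// t.$date and the client "ip:port" at attr.remote or attr.client.
// Lines that do not fit this shape are skipped, not guessed at.
function failuresPerIpPerMinute(lines) {
  const counts = new Map(); // key "ip|YYYY-MM-DDTHH:MM" -> failure count
  for (const line of lines) {
    if (!/authentication failed/i.test(line)) continue;
    let entry;
    try {
      entry = JSON.parse(line);
    } catch (err) {
      continue; // not valid JSON (for example a legacy text log line)
    }
    const attr = entry.attr || {};
    const remote = attr.remote || attr.client;
    const ts = entry.t && entry.t.$date;
    if (typeof remote !== "string" || typeof ts !== "string") continue;
    const portSep = remote.lastIndexOf(":");
    const ip = portSep > 0 ? remote.slice(0, portSep) : remote; // drop the port
    const minute = ts.slice(0, 16); // truncate ISO timestamp to the minute
    const key = ip + "|" + minute;
    counts.set(key, (counts.get(key) || 0) + 1);
  }
  return counts;
}
```

Feed it the log split into lines (for example from a file read in Node.js) and sort the resulting entries by count to find the noisiest sources.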
How to diagnose it
- Confirm the baseline deviation. Compute the failure rate over the last hour. A healthy cluster shows near-zero authentication failures; any sustained non-zero rate is abnormal.
- Separate authentication from authorization. Search the same log window for not authorized and error code 13. If code 13 dominates, the credentials are valid but roles are wrong. Stop here and fix privileges.
- Map failures to source IPs. Extract client addresses from the Authentication failed lines. A single IP generating most failures suggests a scanner or misconfigured host. Many internal IPs suggest a widespread credential issue.
- Correlate with recent changes. Check whether the spike began immediately after a credential rotation, application deploy, certificate renewal, or directory service maintenance window.
- Validate certificates. If the deployment uses x.509, verify the not-after dates on the server and client certificates. Renew any certificates that have crossed their expiry boundary.
- Check directory services. If the deployment relies on LDAP or Kerberos, verify that the directory is reachable from the database hosts and that the service account used for binding is not locked or expired.
- Assess severity by rate. Exceeding ten per minute from one IP is suspicious; exceeding one hundred per minute is attack-like. Escalate accordingly.
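The decision order in the flowchart and the steps above can be sketched as a single function. The observation field names (notAuthorizedDominates, maxPerIpPerMinute, recentCredentialRotation, usesX509) are hypothetical inputs you would fill in by hand or from monitoring; nothing here queries MongoDB.

```javascript
// Mirrors the triage order described above. All field names on `obs`
// are assumptions made for this sketch.
function triageAuthFailures(obs) {
  if (obs.notAuthorizedDominates) return "authorization issue: check roles";
  if (obs.maxPerIpPerMinute > 10) return "brute force or scan: check source";
  if (obs.recentCredentialRotation) return "stale credentials: check application configs";
  if (obs.usesX509) return "check certificate expiry";
  return "check LDAP/Kerberos directory health";
}

// Example: failures spread thinly across hosts shortly after a rotation.
console.log(triageAuthFailures({
  notAuthorizedDominates: false,
  maxPerIpPerMinute: 3,
  recentCredentialRotation: true,
  usesX509: false
}));
```

The function returns only the first matching branch, so an attack that happens to coincide with a rotation is reported as brute force; treat the result as a starting point, not a verdict.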
Metrics and signals to monitor
| Signal | Why it matters | Warning sign |
|---|---|---|
| Authentication failure rate | Direct measure of credential rejections | Sustained rate above the near-zero baseline |
| Unauthorized command rate (code 13) | Distinguishes bad credentials from missing privileges | Spikes of not authorized after auth succeeds |
| Connection utilization | Brute-force or retry storms exhaust capacity | current / (current + available) > 80% |
| Connection churn (totalCreated delta) | Credential rotation causes pools to reconnect and re-auth | totalCreated rate spikes alongside auth failures |
| TLS certificate expiry | Expired x.509 certs cause handshake failures | Validity date within the renewal window |
| Network bind address | Exposure increases brute-force attack surface | Bound to 0.0.0.0 without network-level restrictions |
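The utilization and churn signals in the table reduce to simple arithmetic on two samples of db.serverStatus().connections. A minimal sketch, assuming the counters have already been converted to plain JavaScript numbers (some shells return 64-bit values as objects); the function names are hypothetical.

```javascript
// `connections` has the shape of db.serverStatus().connections,
// with current and available as plain numbers.
function connectionUtilization(connections) {
  const capacity = connections.current + connections.available;
  return capacity > 0 ? connections.current / capacity : 0;
}

// Connections created per second between two samples taken `seconds` apart.
function churnPerSecond(earlier, later, seconds) {
  return (later.totalCreated - earlier.totalCreated) / seconds;
}

// Hypothetical samples taken 60 seconds apart.
const sampleBefore = { current: 780, available: 220, totalCreated: 120000 };
const sampleAfter = { current: 810, available: 190, totalCreated: 120900 };
console.log("utilization: " + connectionUtilization(sampleAfter));
console.log("churn/s: " + churnPerSecond(sampleBefore, sampleAfter, 60));
```

A utilization above 0.8 matches the warning sign in the table; there is no fixed churn threshold in the text, so compare the churn value against your own normal rate.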
Fixes
Brute-force attack
Block the offending source at the network firewall or cloud security group. Do not attempt to mitigate by lowering maxIncomingConnections; that harms legitimate clients and can destabilize the replica set. If the attacker is internal, revoke the exposed credentials and force a rotation. Review bind addresses to ensure MongoDB is not reachable from untrusted networks. Tradeoff: IP blocks can be circumvented by distributed scans and may block legitimate users behind NAT.
Stale credentials after rotation
Identify every application host or microservice still using the old password or certificate. Restart their connection pools or processes so they pick up the new credential. A brief spike after rotation is expected, but should resolve within minutes. If it persists, an instance or config file was missed. Tradeoff: restarting pools causes a short burst of reconnections and thread creation overhead.
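To find the instances that were missed, it can help to list the hosts that are still failing once the expected post-rotation spike should have passed. A sketch under stated assumptions: the input shape ({ host, at } with ISO 8601 timestamps), the function name hostsStillFailing, and the 5-minute default grace period are all illustrative choices, not values from MongoDB.

```javascript
// failures: array of { host, at } where `at` is an ISO 8601 timestamp.
// rotatedAt: ISO 8601 time of the credential rotation.
// graceMinutes: how long a post-rotation spike is tolerated (assumed default).
function hostsStillFailing(failures, rotatedAt, graceMinutes = 5) {
  const cutoff = Date.parse(rotatedAt) + graceMinutes * 60 * 1000;
  const hosts = new Set();
  for (const failure of failures) {
    if (Date.parse(failure.at) > cutoff) hosts.add(failure.host);
  }
  return [...hosts].sort();
}
```

Any host this returns is a candidate for a missed config file or a process that was never restarted.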
Expired x.509 certificate
Provision a new certificate, update the configured certificate paths, and perform a rolling restart of the replica set or sharded cluster members. Verify that client certificates are also renewed if mutual TLS is in use. Tradeoff: each restarted member briefly leaves the set, so perform during a maintenance window or tolerate brief replication lag.
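Before and after renewal, the remaining validity can be computed from the output of the openssl x509 -noout -dates quick check. The sketch below assumes the usual notAfter=Mon D HH:MM:SS YYYY GMT line in that output; the function name daysUntilExpiry is hypothetical, and the parser returns null for any other format.

```javascript
const MONTH_NAMES = ["Jan", "Feb", "Mar", "Apr", "May", "Jun",
                     "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"];

// opensslDates: text containing a line like "notAfter=May  1 00:00:00 2025 GMT".
// now: a Date. Returns whole days until expiry (negative once expired),
// or null when no notAfter line in the assumed format is found.
function daysUntilExpiry(opensslDates, now) {
  const match = /notAfter=(\w{3})\s+(\d{1,2})\s+(\d{2}):(\d{2}):(\d{2})\s+(\d{4})\s+GMT/.exec(opensslDates);
  if (!match) return null;
  const month = MONTH_NAMES.indexOf(match[1]);
  if (month < 0) return null;
  const notAfter = Date.UTC(Number(match[6]), month, Number(match[2]),
                            Number(match[3]), Number(match[4]), Number(match[5]));
  return Math.floor((notAfter - now.getTime()) / 86400000);
}
```

Run it against both the server certificate and each client certificate, since either one expiring is enough to break x.509 authentication.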
LDAP or Kerberos integration failure
Restore directory service connectivity. If the outage is prolonged, consider failing over to locally defined MongoDB users as a temporary measure. Document this as a break-glass procedure only; revert to directory authentication once service is restored. Tradeoff: local authentication bypasses central audit and policy controls.
Prevention
- Monitor authentication failure rate. Any sustained rate above the near-zero baseline indicates an active problem.
- Coordinate rotation with connection pool drains. Rotate the database credential only after all clients have been updated, and verify the spike resolves.
- Monitor certificate expiry dates externally. Renew and deploy new certificates before they expire.
- Restrict bindIp to necessary interfaces. Avoid 0.0.0.0 unless a proxy or load balancer sits in front.
- Map services to credentials. Maintain a runbook that lists which applications use which principals so no instance is left behind during rotation.
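The bindIp item can be checked mechanically against the parsed startup options. A minimal sketch, assuming `net` has the shape of db.adminCommand({getCmdLineOpts: 1}).parsed.net, where bindIp is a comma-separated string and bindIpAll is a boolean; the function name listensOnAllInterfaces is hypothetical.

```javascript
// Returns true when the configuration binds to every interface.
// Treats a missing net section as not broadly bound, on the assumption
// that the server then uses its localhost default.
function listensOnAllInterfaces(net) {
  if (!net) return false;
  if (net.bindIpAll === true) return true;
  const addresses = String(net.bindIp || "").split(",").map((a) => a.trim());
  return addresses.includes("0.0.0.0") ||
         addresses.includes("::") ||
         addresses.includes("*");
}

console.log(listensOnAllInterfaces({ bindIp: "127.0.0.1,0.0.0.0" }));
```

A true result is not a finding by itself; it only means network-level restrictions are the sole thing standing between the server and untrusted clients.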
How Netdata helps
- Correlates Authentication failed log volume with connection count and churn, so you can see whether an auth spike is causing a connection storm.
- Tracks connection utilization to warn when brute-force retries or rotation-induced reconnections approach maxIncomingConnections.
- Surfaces Authentication failed patterns alongside not authorized (code 13) errors in the same time window to speed up authentication-vs-authorization triage.
- Alerts on TLS certificate expiry independently of the database server's internal warnings, giving advance notice before x.509 authentication breaks.







