MongoDB Authentication failed: credential rotation, brute force, and the log signal

A spike of Authentication failed entries in the MongoDB log is abnormal: in a healthy deployment the baseline is near zero. The signal usually means one of four things: a brute-force attack, an application presenting stale credentials after rotation, an expired x.509 certificate, or an upstream LDAP or Kerberos directory issue. Distinguishing between these quickly determines whether you are facing a security incident or an impending application outage.

The log line is unambiguous: the server rejected the credentials during the initial handshake. This differs from not authorized (error code 13), which means the user authenticated successfully but lacks privileges. Treat Authentication failed and not authorized as a two-step triage. Pure authentication failures point to the credential itself or the trust path to the authentication provider. If both signals appear together, check for an attacker probing accounts and then using any that were misconfigured with excessive privileges.

What this means

An Authentication failed entry means the client-supplied credential could not be validated. The failure occurs before authorization, so the server never evaluates roles or permissions. In password-based deployments the hash did not match. In x.509 deployments the certificate may be expired, self-signed when a CA is required, or presented by a client whose identity does not map to a known user. In external-directory deployments the MongoDB server may be unable to reach the directory, or the directory may have rejected the bind request.

Even a handful of failures per minute from a single source is suspicious. Exceeding ten per minute from one IP warrants immediate investigation; exceeding one hundred per minute is attack-like. Failures spread across many internal application IPs after a maintenance window usually indicate credential rotation that was not propagated to all clients.
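As a rough illustration, the per-IP thresholds above can be encoded as a small classifier. This is a minimal sketch: the cut-offs of 10 and 100 per minute come from the text, while the function name and the label strings are hypothetical and chosen only for this example.

```javascript
// Thresholds from the text above; the label names are illustrative, not a MongoDB convention.
const SUSPICIOUS_PER_MIN = 10;
const ATTACK_PER_MIN = 100;

// Classify the authentication failure rate observed from a single source IP.
function classifyFailureRate(failuresPerMinute) {
  if (failuresPerMinute > ATTACK_PER_MIN) return "attack-like";
  if (failuresPerMinute > SUSPICIOUS_PER_MIN) return "investigate";
  if (failuresPerMinute > 0) return "above-baseline";
  return "baseline";
}

console.log(classifyFailureRate(3));   // above-baseline
console.log(classifyFailureRate(42));  // investigate
console.log(classifyFailureRate(250)); // attack-like
```

The comparisons are strict, so exactly 10 per minute is still "above-baseline" and exactly 100 is still "investigate", matching the "exceeding" wording above.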

flowchart TD
    A[Auth failed spike in logs] --> B{"See 'not authorized'<br/>code 13?"}
    B -->|Yes| C["Authorization issue:<br/>check roles"]
    B -->|No| D{"Single IP >10/min?"}
    D -->|Yes| E["Brute-force or<br/>scan: check source"]
    D -->|No| F{"Recent credential<br/>rotation?"}
    F -->|Yes| G["Stale creds:<br/>check app configs"]
    F -->|No| H{"TLS/x.509<br/>deployment?"}
    H -->|Yes| I["Check certificate<br/>expiry"]
    H -->|No| J["Check LDAP/Kerberos<br/>directory health"]
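
The decision order in the flowchart can also be written as a short function, which is sometimes easier to embed in a runbook script. This is only a sketch: the shape of the `obs` object and its field names are hypothetical, and the caller is assumed to have gathered those observations from the logs and change history already.

```javascript
// Walk the same questions as the flowchart, in the same order.
// obs fields (all hypothetical names):
//   sawNotAuthorized  - "not authorized" (code 13) appears in the same window
//   maxPerIpPerMinute - highest per-minute failure count from any single IP
//   recentRotation    - a credential or certificate was rotated recently
//   usesX509          - the deployment uses TLS/x.509 client authentication
function triageAuthFailures(obs) {
  if (obs.sawNotAuthorized) return "authorization issue: check roles";
  if (obs.maxPerIpPerMinute > 10) return "brute-force or scan: check source";
  if (obs.recentRotation) return "stale credentials: check app configs";
  if (obs.usesX509) return "check certificate expiry";
  return "check LDAP/Kerberos directory health";
}

console.log(triageAuthFailures({
  sawNotAuthorized: false,
  maxPerIpPerMinute: 2,
  recentRotation: true,
  usesX509: false,
}));
```

Because the checks run top to bottom, the first matching branch wins, so a rotation that coincides with a single noisy IP is reported as a possible scan first.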

Common causes

| Cause | What it looks like | First thing to check |
|---|---|---|
| Brute-force attack | Sustained failures from one or a few external IPs. >10/min is suspicious; >100/min is attack-like. | Source IP distribution in mongod.log |
| Stale credentials after rotation | Spike from known application hosts minutes after a password or certificate change | Correlation between rotation timestamp and first failure |
| Expired x.509 certificate | Sudden failures in deployments using x.509 client certificate authentication | Certificate validity dates with openssl |
| LDAP/Kerberos directory issue | Failures correlate with directory service maintenance or network partition | Directory service health and reachability |

Quick checks

Run these read-only commands to characterize the scope and source of the failures.

# Shell: count authentication failures in the current log
grep -c "Authentication failed" /var/log/mongodb/mongod.log

# Shell: view recent failures with source metadata
grep "Authentication failed" /var/log/mongodb/mongod.log | tail -20

# Shell: check for authorization errors (code 13) to separate auth from privilege issues
grep "not authorized" /var/log/mongodb/mongod.log | tail -20

// mongosh: check total connections and churn to see if apps are retrying aggressively
var c = db.serverStatus().connections;
print("Current: " + c.current + ", Available: " + c.available + ", Total created: " + c.totalCreated);

// mongosh: check whether serverStatus exposes authentication counters on this server version
db.serverStatus().security.authentication

# Shell: check server TLS certificate expiry. For x.509 authentication, also inspect client certificate files on application hosts.
echo | openssl s_client -connect localhost:27017 2>/dev/null | openssl x509 -noout -dates

// mongosh: verify TLS mode and certificate configuration (fall back to {} if no net options are set)
var opts = db.adminCommand({getCmdLineOpts: 1}).parsed.net || {};
print("TLS mode: " + (opts.tls ? opts.tls.mode : (opts.ssl ? opts.ssl.mode : "disabled")));
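
The openssl command above prints `notBefore=` and `notAfter=` lines. As a sketch of how that output could be turned into a days-remaining number for alerting, the helpers below parse the `notAfter` line by hand. The function names are hypothetical, and the parser assumes the usual `Mon DD HH:MM:SS YYYY GMT` layout that `openssl x509 -noout -dates` emits.

```javascript
const MONTHS = ["Jan", "Feb", "Mar", "Apr", "May", "Jun",
                "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"];

// Extract the notAfter timestamp from `openssl x509 -noout -dates` output.
// Returns a Date in UTC, or null if the line is missing or malformed.
function parseNotAfter(opensslOutput) {
  const m = /notAfter=(\w{3})\s+(\d{1,2})\s+(\d{2}):(\d{2}):(\d{2})\s+(\d{4})\s+GMT/
    .exec(opensslOutput);
  if (!m) return null;
  const month = MONTHS.indexOf(m[1]);
  if (month < 0) return null;
  return new Date(Date.UTC(Number(m[6]), month, Number(m[2]),
                           Number(m[3]), Number(m[4]), Number(m[5])));
}

// Whole days between `now` and expiry; negative means already expired.
function daysUntilExpiry(notAfter, now) {
  return Math.floor((notAfter.getTime() - now.getTime()) / 86400000);
}

const sample = "notBefore=May  1 00:00:00 2024 GMT\nnotAfter=Jun  1 12:00:00 2025 GMT";
const expiry = parseNotAfter(sample);
console.log(expiry.toISOString());
console.log(daysUntilExpiry(expiry, new Date("2025-05-02T12:00:00Z")));
```

Parsing the fields explicitly avoids relying on how a given JavaScript engine interprets non-ISO date strings.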

How to diagnose it

  1. Confirm the baseline deviation. Compute the failure rate over the last hour. A healthy cluster shows near-zero authentication failures; any sustained non-zero rate is abnormal.
  2. Separate authentication from authorization. Search the same log window for not authorized and error code 13. If code 13 dominates, the credentials are valid but roles are wrong. Stop here and fix privileges.
  3. Map failures to source IPs. Extract client addresses from the Authentication failed lines. A single IP generating most failures suggests a scanner or misconfigured host. Many internal IPs suggest a widespread credential issue.
  4. Correlate with recent changes. Check whether the spike began immediately after a credential rotation, application deploy, certificate renewal, or directory service maintenance window.
  5. Validate certificates. If the deployment uses x.509, verify the not-after dates on the server and client certificates. Renew any certificates that have crossed their expiry boundary.
  6. Check directory services. If the deployment relies on LDAP or Kerberos, verify that the directory is reachable from the database hosts and that the service account used for binding is not locked or expired.
  7. Assess severity by rate. Exceeding ten per minute from one IP is suspicious; exceeding one hundred per minute is attack-like. Escalate accordingly.
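
Steps 3 and 7 can be sketched as a small log summarizer that groups failures by source IP and minute. This assumes structured JSON log lines with a timestamp in `t.$date`, a message in `msg`, and the client address in `attr.remote` or `attr.client`. Those field names, and the alternative message text `Failed to authenticate`, are assumptions that should be checked against the log format of the server version in use.

```javascript
// Parse one log line. Returns { ip, minute } for an authentication failure,
// or null for non-JSON lines and unrelated entries.
function parseAuthFailure(line) {
  let entry;
  try {
    entry = JSON.parse(line);
  } catch (e) {
    return null;
  }
  if (!entry || typeof entry.msg !== "string") return null;
  if (!/Authentication failed|Failed to authenticate/.test(entry.msg)) return null;
  const attr = entry.attr || {};
  const remote = attr.remote || attr.client || "unknown"; // assumed field names
  // Drop the trailing ":port" so failures group by address.
  const ip = remote.includes(":") ? remote.slice(0, remote.lastIndexOf(":")) : remote;
  const ts = entry.t && entry.t.$date;
  // "2024-05-01T12:00" is the first 16 characters of an ISO timestamp.
  const minute = typeof ts === "string" ? ts.slice(0, 16) : "unknown";
  return { ip, minute };
}

// Return a Map of ip -> highest number of failures seen in any single minute.
function peakFailuresPerIp(lines) {
  const counts = new Map(); // "ip|minute" -> count
  for (const line of lines) {
    const f = parseAuthFailure(line);
    if (!f) continue;
    const key = f.ip + "|" + f.minute;
    counts.set(key, (counts.get(key) || 0) + 1);
  }
  const peaks = new Map();
  for (const [key, n] of counts) {
    const ip = key.slice(0, key.lastIndexOf("|"));
    peaks.set(ip, Math.max(peaks.get(ip) || 0, n));
  }
  return peaks;
}

const sampleLines = [
  '{"t":{"$date":"2024-05-01T12:00:01.000+00:00"},"msg":"Authentication failed","attr":{"remote":"203.0.113.7:41000"}}',
  '{"t":{"$date":"2024-05-01T12:00:09.000+00:00"},"msg":"Authentication failed","attr":{"remote":"203.0.113.7:41002"}}',
  '{"t":{"$date":"2024-05-01T12:00:30.000+00:00"},"msg":"Connection accepted","attr":{"remote":"10.0.0.5:5000"}}',
];
console.log(peakFailuresPerIp(sampleLines)); // one entry: 203.0.113.7 with a peak of 2
```

The peak per-minute count for each IP can then be compared against the ten and one hundred per minute thresholds from step 7.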

Metrics and signals to monitor

| Signal | Why it matters | Warning sign |
|---|---|---|
| Authentication failure rate | Direct measure of credential rejections | Sustained rate above the near-zero baseline |
| Unauthorized command rate (code 13) | Distinguishes bad credentials from missing privileges | Spikes of not authorized after auth succeeds |
| Connection utilization | Brute-force or retry storms exhaust capacity | current / (current + available) > 80% |
| Connection churn (totalCreated delta) | Credential rotation causes pools to reconnect and re-auth | totalCreated rate spikes alongside auth failures |
| TLS certificate expiry | Expired x.509 certs cause handshake failures | Validity date within the renewal window |
| Network bind address | Exposure increases brute-force attack surface | Bound to 0.0.0.0 without network-level restrictions |
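
The two connection signals are simple arithmetic over `db.serverStatus().connections`. A minimal sketch, assuming two snapshots of that document taken a known number of seconds apart; the helper names are hypothetical.

```javascript
// Fraction of connection capacity in use: current / (current + available).
function connectionUtilization(conn) {
  const capacity = conn.current + conn.available;
  return capacity > 0 ? conn.current / capacity : 0;
}

// New connections per second between two serverStatus snapshots.
function churnPerSecond(prev, curr, elapsedSeconds) {
  return (curr.totalCreated - prev.totalCreated) / elapsedSeconds;
}

const before = { current: 800, available: 200, totalCreated: 1000 };
const after = { current: 850, available: 150, totalCreated: 1600 };

console.log(connectionUtilization(after));       // 0.85, above the 80% warning level
console.log(churnPerSecond(before, after, 60));  // 10 new connections per second
```

A churn rate that jumps at the same moment as the authentication failure rate is consistent with clients reconnecting and retrying with rejected credentials.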

Fixes

Brute-force attack

Block the offending source at the network firewall or cloud security group. Do not attempt to mitigate by lowering maxIncomingConnections; that harms legitimate clients and can destabilize the replica set. If the attacker is internal, revoke the exposed credentials and force a rotation. Review bind addresses to ensure MongoDB is not reachable from untrusted networks. Tradeoff: IP blocks can be circumvented by distributed scans and may block legitimate users behind NAT.

Stale credentials after rotation

Identify every application host or microservice still using the old password or certificate. Restart their connection pools or processes so they pick up the new credential. A brief spike after rotation is expected, but should resolve within minutes. If it persists, an instance or config file was missed. Tradeoff: restarting pools causes a short burst of reconnections and thread creation overhead.
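
One way to find the missed instances is to list the hosts that are still failing after a grace period following the rotation. The sketch below assumes failure events have already been extracted from the log as `{ ip, time }` objects; that shape, the function name, and the grace period value are all hypothetical.

```javascript
// Return the sorted list of source IPs that still produced authentication
// failures more than `graceMinutes` after the rotation time.
function hostsStillFailing(failures, rotatedAt, graceMinutes) {
  const cutoff = rotatedAt.getTime() + graceMinutes * 60 * 1000;
  const stale = new Set();
  for (const f of failures) {
    if (f.time.getTime() > cutoff) stale.add(f.ip);
  }
  return [...stale].sort();
}

const rotatedAt = new Date("2024-05-01T12:00:00Z");
const failures = [
  { ip: "10.0.0.5", time: new Date("2024-05-01T12:01:00Z") }, // inside the grace period
  { ip: "10.0.0.8", time: new Date("2024-05-01T12:20:00Z") }, // still failing later
];
console.log(hostsStillFailing(failures, rotatedAt, 5)); // [ '10.0.0.8' ]
```

Hosts that fail only inside the grace period match the expected brief spike; hosts that appear in the result are candidates for a missed restart or config file.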

Expired x.509 certificate

Provision a new certificate, update the configured certificate paths, and perform a rolling restart of the replica set or sharded cluster members. Verify that client certificates are also renewed if mutual TLS is in use. Tradeoff: each restarted member briefly leaves the set, so perform during a maintenance window or tolerate brief replication lag.

LDAP or Kerberos integration failure

Restore directory service connectivity. If the outage is prolonged, consider failing over to locally defined MongoDB users as a temporary measure. Document this as a break-glass procedure only; revert to directory authentication once service is restored. Tradeoff: local authentication bypasses central audit and policy controls.

Prevention

  • Monitor authentication failure rate. Any sustained rate above the near-zero baseline indicates an active problem.
  • Coordinate rotation with connection pool drains. Retire the old database credential only after all clients have been updated to the new one, and verify that authentication failures return to baseline.
  • Monitor certificate expiry dates externally. Renew and deploy new certificates before they expire.
  • Restrict bindIp to necessary interfaces. Avoid 0.0.0.0 unless a proxy or load balancer sits in front.
  • Map services to credentials. Maintain a runbook that lists which applications use which principals so no instance is left behind during rotation.
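
The bindIp check in the list above can be automated against the `parsed.net` section returned by `getCmdLineOpts`. This is a minimal sketch: it assumes `bindIp` is a comma-separated string and `bindIpAll` a boolean, as in the MongoDB configuration file options, and the function name is hypothetical.

```javascript
// Return the wildcard bind settings found in a parsed `net` config section.
// An empty array means no wildcard binding was detected.
function exposedBindAddresses(net) {
  if (!net) return [];
  if (net.bindIpAll === true) return ["bindIpAll"];
  const addresses = String(net.bindIp || "")
    .split(",")
    .map((s) => s.trim())
    .filter(Boolean);
  // 0.0.0.0 (IPv4) and :: (IPv6) both mean "listen on every interface".
  return addresses.filter((addr) => addr === "0.0.0.0" || addr === "::");
}

console.log(exposedBindAddresses({ bindIp: "127.0.0.1,10.0.0.12" })); // []
console.log(exposedBindAddresses({ bindIp: "127.0.0.1, 0.0.0.0" }));  // [ '0.0.0.0' ]
```

A non-empty result is not a finding on its own; it only flags hosts where network-level restrictions or a fronting proxy need to be confirmed.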

How Netdata helps

  • Correlates Authentication failed log volume with connection count and churn, so you can see whether an auth spike is causing a connection storm.
  • Tracks connection utilization to warn when brute-force retries or rotation-induced reconnections approach maxIncomingConnections.
  • Surfaces Authentication failed patterns alongside not authorized (code 13) errors in the same time window to speed up authentication-vs-authorization triage.
  • Alerts on TLS certificate expiry independently of the database server’s internal warnings, giving advance notice before x.509 authentication breaks.