SNMP authentication-failure spikes: misconfiguration vs reconnaissance

SNMP authentication-failure traps are one of the few security signals built into the network monitoring stack. When they spike, the question is never “is something wrong?” - it is “is this a broken poller or someone probing my devices?” The answer changes the response from a quiet config fix to a security incident.

The authenticationFailure trap (OID 1.3.6.1.6.3.1.1.5.5) fires whenever an SNMP agent receives a protocol message that is not properly authenticated. On SNMPv2c, that means a wrong community string. On SNMPv3, it typically means a wrong username, wrong auth protocol, or wrong auth password; a wrong privacy password can also surface here, although some agents count it as a decryption error instead. The trap is defined in SNMPv2-MIB and every compliant agent can generate it, but many vendors ship with it disabled by default. If you have never explicitly enabled it (for example, with snmp-server enable traps snmp authentication on Cisco IOS), you may have no signal at all.

The hard part is interpretation. A burst of auth failures from a single IP can be a monitoring station with a stale credential, or an attacker enumerating community strings. A burst across many devices from one source is almost always scanning. The discriminating signals are source-IP distribution, timing regularity, credential variety, and whether the source is inside or outside the management subnet.

Signal types: traps vs cumulative counters

The authenticationFailure trap is event-based: it fires on each failed authentication attempt. This is distinct from the cumulative counters that track the same failure at the agent level.

For SNMPv2c, the relevant counters are:

.1.3.6.1.2.1.11.4 (snmpInBadCommunityNames) - community name not recognized by the agent
.1.3.6.1.2.1.11.5 (snmpInBadCommunityUses) - community recognized but the SNMP operation was not permitted for that community

For SNMPv3, the USM (User-based Security Model) exposes two separate counters:

.1.3.6.1.6.3.15.1.1.3 (usmStatsUnknownUserNames) - the username was not recognized by the agent
.1.3.6.1.6.3.15.1.1.5 (usmStatsWrongDigests) - the username was recognized but the authentication digest did not match

This distinction matters operationally. Rising usmStatsUnknownUserNames with stable usmStatsWrongDigests means someone is probing usernames that do not exist on the device. Rising usmStatsWrongDigests with stable usmStatsUnknownUserNames means the username is valid but the password or auth protocol is wrong, which points to credential rotation or key compromise rather than blind probing.
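
The interpretation above can be written as a small decision function. This is a minimal sketch, assuming you already have the per-interval increase of each counter; the function name and the returned labels are illustrative and not part of any SNMP tooling.

```python
def interpret_usm_deltas(unknown_users_delta: int, wrong_digests_delta: int) -> str:
    """Map the increase of two USM counters over one interval to a likely cause.

    Both arguments are increases since the previous sample, not raw counter values.
    """
    if unknown_users_delta < 0 or wrong_digests_delta < 0:
        raise ValueError("deltas must be non-negative; handle counter resets first")
    if unknown_users_delta == 0 and wrong_digests_delta == 0:
        return "quiet"
    if wrong_digests_delta == 0:
        # Only unknown usernames are rising: names that do not exist on the device.
        return "username probing"
    if unknown_users_delta == 0:
        # Only wrong digests are rising: a valid username with a bad key.
        return "valid user, wrong password or auth protocol"
    return "mixed: investigate both"


print(interpret_usm_deltas(42, 0))
print(interpret_usm_deltas(0, 7))
```

The "mixed" case is deliberately left unresolved: when both counters rise in the same interval, the counters alone cannot separate a scanner from a broken poller, and the source-IP analysis later in this guide is needed.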

Common causes

Cause	What it looks like	First thing to check
Misconfigured poller	Failures from one known management IP, periodic at the polling interval (often 300s), same credential each time	Verify the poller’s SNMP credential configuration against the device
Credential rotation	Failures from one or a few known IPs, starting at a specific timestamp, correct username but wrong digest	Check whether a credential rotation was applied to one side but not the other
Vulnerability scanner sweep	Failures across many devices from one source IP, varied credentials attempted, non-periodic timing	Identify the scanner source and verify it is authorized
External reconnaissance	Failures from IPs outside the management subnet, rapid succession, varied community strings or usernames	Check perimeter ACL and firewall logs for the source IP
Community string “public” still configured	No auth-failure traps at all from scanned devices, but `snmpInBadCommunityNames` is high because scans succeed against “public”	Poll `snmpInBadCommunityNames` and check device config for default community strings

Quick checks

# Check SNMPv2c bad community name counter
snmpget -v2c -c <community> <device> .1.3.6.1.2.1.11.4.0

# Check SNMPv3 USM unknown usernames counter
# Match -a (SHA/SHA-256/MD5) to the agent's configured auth protocol
snmpget -v3 -l authNoPriv -u <user> -a SHA -A <authpass> <device> .1.3.6.1.6.3.15.1.1.3.0

# Check SNMPv3 USM wrong digests counter
snmpget -v3 -l authNoPriv -u <user> -a SHA -A <authpass> <device> .1.3.6.1.6.3.15.1.1.5.0

# Search trap receiver logs for auth failures
grep "authenticationFailure" /var/log/snmptrapd.log | tail -50

# List auth-failure messages in the device syslog (Cisco IOS/IOS XE)
ssh <device> 'show logging | include SNMP-3-AUTHFAIL'

# Verify the auth-failure trap is enabled on the device (Cisco IOS)
ssh <device> 'show run | include snmp-server enable traps snmp authentication'
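
The counters read by the snmpget commands above are cumulative, so a single reading says little; what matters is the change between two readings. The sketch below turns two samples into a per-minute rate. It assumes the values are Counter32 objects (as these counters are defined in their MIBs) and that at most one wrap occurred between samples; the function names are illustrative.

```python
COUNTER32_MODULUS = 2**32


def counter32_delta(previous: int, current: int) -> int:
    """Increase between two Counter32 samples, assuming at most one wrap."""
    for value in (previous, current):
        if not 0 <= value < COUNTER32_MODULUS:
            raise ValueError("Counter32 samples must be in [0, 2**32)")
    # Modular subtraction gives the right answer both with and without a wrap.
    return (current - previous) % COUNTER32_MODULUS


def failures_per_minute(previous: int, current: int, interval_seconds: float) -> float:
    """Average failure rate over the sampling interval."""
    if interval_seconds <= 0:
        raise ValueError("interval_seconds must be positive")
    return counter32_delta(previous, current) * 60.0 / interval_seconds


print(failures_per_minute(100, 110, 300))
```

An agent restart resets the counter to zero, which this arithmetic cannot tell apart from a wrap. In practice, compare sysUpTime between the two samples and discard the interval if the device rebooted.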

How to diagnose it

flowchart TD
    A["authFailure spike"] --> B{"Single source IP?"}
    B -- Yes, known poller --> C["Misconfigured poller or credential rotation"]
    B -- Yes, unknown --> D{"Inside mgmt subnet?"}
    D -- Yes --> E["Unauthorized internal tool"]
    D -- No --> F["External scanning: escalate to security"]
    B -- No, many sources --> G{"Varied credentials?"}
    G -- Yes --> H["Reconnaissance or vuln scanner"]
    G -- No --> I["Shared wrong credential across estate"]
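
The same decision tree can be expressed as a function, which is convenient when triage is automated. This is a sketch only: the parameter names are assumptions, and the caller is expected to have already worked out the four inputs from trap and syslog data.

```python
def classify_spike(
    source_count: int,
    known_poller: bool,
    inside_mgmt_subnet: bool,
    varied_credentials: bool,
) -> str:
    """Walk the flowchart above and return the matching leaf label."""
    if source_count < 1:
        raise ValueError("a spike needs at least one source")
    if source_count == 1:
        if known_poller:
            return "Misconfigured poller or credential rotation"
        if inside_mgmt_subnet:
            return "Unauthorized internal tool"
        return "External scanning: escalate to security"
    # Many sources: credential variety is the discriminator.
    if varied_credentials:
        return "Reconnaissance or vuln scanner"
    return "Shared wrong credential across estate"


print(classify_spike(1, known_poller=False, inside_mgmt_subnet=False, varied_credentials=True))
```

For a single source, varied_credentials is ignored, mirroring the flowchart, which only asks that question on the many-sources branch.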

Extract source IPs from the trap stream and syslog. The standard authenticationFailure trap carries no varbinds beyond sysUpTime and snmpTrapOID. The source IP of the failed request is more reliably available in device syslog. On Cisco IOS, the SNMP-3-AUTHFAIL syslog message includes the source IP directly. Parse both the trap receiver log and the device syslog to build a source-IP frequency table.
Classify each source IP. For each source, determine: is it a known monitoring station? Is it inside the management subnet? Any nonzero auth-failure rate from a source outside the management subnet is an unauthorized access attempt. Escalate immediately.
Check timing regularity. Misconfigured pollers produce failures at their polling interval, typically every 300 seconds. If failures arrive at a precise, repeating interval, the source is almost certainly a monitoring probe with a stale credential. Random timing or rapid bursts suggest active scanning.
Distinguish USM error types for SNMPv3. Poll usmStatsUnknownUserNames (.1.3.6.1.6.3.15.1.1.3.0) and usmStatsWrongDigests (.1.3.6.1.6.3.15.1.1.5.0) separately. Rising unknown-usernames with stable wrong-digests indicates username probing. The inverse indicates a valid username with a wrong password or auth protocol mismatch, which points to a credential rotation mismatch or a compromised key rather than blind probing.
Check for multi-vector scanning. Correlate with syslog for SSH and HTTP authentication failures from the same source IP. An attacker probing SNMP is often probing other protocols simultaneously. Check AAA logs (TACACS+/RADIUS) for login failures from the same source.
Verify the absence of traps is not hiding a problem. Many devices still have SNMPv2c community string “public” configured for read-only access. Scans against “public” succeed, so they generate neither auth-failure traps nor increments of snmpInBadCommunityNames; the only way to find them is to audit device configuration for “public” and “private” directly. Separately, poll snmpInBadCommunityNames even when no traps are seen. If the counter is rising with no corresponding traps, auth-failure traps are either disabled on the device or being lost before they reach the receiver.
Check for silent v3 failures. Some platforms silently fail SNMPv3 authentication without generating a trap unless auth-failure trapping is explicitly enabled. If you suspect v3 failures but see no traps, verify the trap configuration on the device.
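
The timing-regularity check in the steps above can be approximated by looking at how much the gaps between failures vary. The sketch below uses the coefficient of variation (standard deviation divided by mean) of the gaps; the 0.1 cutoff is a hypothetical starting point, not a value from any standard, and should be tuned against your own pollers.

```python
from statistics import mean, pstdev


def looks_periodic(timestamps: list[float], max_cv: float = 0.1) -> bool:
    """Return True when failures arrive at a near-constant interval.

    timestamps are event times in seconds; max_cv is an assumed tolerance.
    """
    if len(timestamps) < 3:
        return False  # two events give one gap, which cannot show regularity
    ordered = sorted(timestamps)
    gaps = [later - earlier for earlier, later in zip(ordered, ordered[1:])]
    average_gap = mean(gaps)
    if average_gap <= 0:
        return False  # all events share one timestamp: a burst, not a schedule
    return pstdev(gaps) / average_gap <= max_cv


# A poller failing every ~300 s versus an irregular burst.
print(looks_periodic([0, 300, 600, 901, 1200]))
print(looks_periodic([0, 2, 3, 50, 51, 400]))
```

A periodic result supports the stale-credential explanation but does not prove it: a patient scanner can also run on a timer, so combine this with the source classification.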

Metrics and signals to monitor

Signal	Why it matters	Warning sign
`snmpInBadCommunityNames` (`.1.3.6.1.2.1.11.4`)	Cumulative count of community-name mismatches at the agent	Any increase in production; an increase without corresponding traps suggests traps are disabled or being dropped
`snmpInBadCommunityUses` (`.1.3.6.1.2.1.11.5`)	Community recognized but operation not permitted	Any increase indicates a tool attempting an operation the community does not allow, typically a write with read-only credentials
`usmStatsUnknownUserNames` (`.1.3.6.1.6.3.15.1.1.3`)	SNMPv3 username not recognized by agent	Rising counter with varied usernames indicates reconnaissance
`usmStatsWrongDigests` (`.1.3.6.1.6.3.15.1.1.5`)	SNMPv3 username valid but auth digest mismatch	Rising counter with stable user count indicates credential rotation or key compromise
authenticationFailure trap rate (`.1.3.6.1.6.3.1.1.5.5`)	Event-level signal	Rate > 10 events/min from a single source suggests scanning; any event from outside the management subnet is unauthorized
SSH/HTTP auth failure rate (syslog/AAA)	Multi-vector scanning probes multiple protocols from one source	Same source IP failing auth across SNMP, SSH, and HTTP is active reconnaissance
Per-source trap frequency distribution	A single sender dominating trap volume is a finding	One source IP accounting for the majority of auth-failure traps

Fixes

Misconfigured poller or stale credential

The most common cause. A monitoring station was recently updated, moved, or reconfigured, and its SNMP credentials no longer match the device.

Identify which poller is generating the failures from the trap source IP or syslog.
Verify the poller’s configured community string or SNMPv3 credentials against the device’s actual configuration.
For SNMPv3, check the auth protocol (MD5 vs SHA), auth password, and privacy password independently. A mismatch on any one produces failures. SNMPv3 USM (RFC 3414) recommends passwords of at least 8 characters, and many implementations reject shorter ones, sometimes without a clear error.
Apply the correct credential to the poller. Do not change the device credential to match the poller unless the poller’s credential is the intended one.

Credential rotation mismatch

Credential rotation applied to one side but not the other.

Verify whether a credential rotation was recently scheduled or executed.
Check both the device and the monitoring system for the current credential.
Synchronize. Prefer rotating on the device first, then updating the poller, to minimize the failure window.

External reconnaissance or scanning

Failures from IPs outside the management subnet require a security response.

Check perimeter firewall and ACL logs for the source IP.
Block the source IP at the perimeter if policy permits.
Verify that SNMP access (UDP 161) is restricted to the management subnet via ACLs on every device. If it is reachable from outside, that is the configuration error that enabled the scanning.
Escalate to the security team. Correlate with SSH, HTTP, or other protocol auth failures from the same source.
If any device still uses community string “public” or “private”, remediate immediately. Scans against “public” succeed silently and never generate auth-failure traps.
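
The inside-or-outside test that drives this response is easy to automate with Python’s standard ipaddress module. A minimal sketch, assuming the management subnets are known as CIDR strings; the 10.20.0.0/24 range and the sample addresses are placeholders.

```python
import ipaddress


def is_inside_mgmt(source_ip: str, mgmt_subnets: list[str]) -> bool:
    """True when source_ip belongs to any of the given management subnets."""
    address = ipaddress.ip_address(source_ip)
    return any(address in ipaddress.ip_network(subnet) for subnet in mgmt_subnets)


def split_sources(sources: list[str], mgmt_subnets: list[str]) -> tuple[list[str], list[str]]:
    """Partition source IPs into (inside, outside) the management subnets."""
    inside = [ip for ip in sources if is_inside_mgmt(ip, mgmt_subnets)]
    outside = [ip for ip in sources if not is_inside_mgmt(ip, mgmt_subnets)]
    return inside, outside


MGMT = ["10.20.0.0/24"]  # placeholder management range
inside, outside = split_sources(["10.20.0.15", "203.0.113.7"], MGMT)
print("inside:", inside)
print("outside (escalate):", outside)
```

Anything in the outside list is a candidate for the perimeter checks above, and also evidence that UDP 161 is reachable from somewhere it should not be.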

SNMPv3 credential special-character issues

On some platforms, shell-special characters ($, backticks, !) in SNMPv3 credentials cause silent authentication failures when passed through CLI or configuration management tools. Test credentials that contain only alphanumerics first to isolate parsing from genuine auth failures.
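
One way to rule out shell parsing on the poller side is to never hand the credential to a shell at all. The sketch below builds an snmpget argument list that can be passed directly to subprocess.run without shell=True, and uses shlex.join to show what a correctly quoted command line would look like. The helper name, the credential, and the device address are made up for illustration; the flags mirror the snmpget examples in Quick checks.

```python
import shlex


def snmpget_argv(user: str, auth_pass: str, device: str, oid: str,
                 auth_proto: str = "SHA") -> list[str]:
    """Build an snmpget argument list; each element reaches the program verbatim."""
    return [
        "snmpget", "-v3", "-l", "authNoPriv",
        "-u", user, "-a", auth_proto, "-A", auth_pass,
        device, oid,
    ]


# Hypothetical credential containing $, a backtick, and !.
argv = snmpget_argv("monitor", "p$ss`w!rd", "192.0.2.10", ".1.3.6.1.6.3.15.1.1.5.0")

# Quoted form, safe to paste into a POSIX shell.
print(shlex.join(argv))

# Parsing the quoted form back yields the original arguments unchanged.
print(shlex.split(shlex.join(argv)) == argv)
```

If authentication succeeds with the list form but fails when the same password is typed unquoted at a prompt or templated by a configuration management tool, the problem is parsing rather than the credential itself.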

CVE-2025-20352 exposure

CVE-2025-20352 (CVSS 7.7, disclosed September 2025) is a stack overflow vulnerability in the SNMP subsystem of Cisco IOS and IOS XE. It affects all SNMP versions (v1, v2c, v3), and exploitation requires valid SNMP credentials. According to Cisco’s advisory, an authenticated remote attacker with low privileges can cause a device reload, and one who also holds administrative credentials can execute code as root. Because the exploit depends on valid credentials, an auth-failure spike from an external source against Cisco IOS or IOS XE devices may be an attacker searching for them, so treat it as a potential precursor to exploitation. Restrict SNMP access to trusted management IPs via ACLs and patch to the fixed release.

Prevention

Enable auth-failure traps on every device. Many vendors ship with this disabled. On Cisco IOS, use snmp-server enable traps snmp authentication. Without this, you have no event-level signal.
Eliminate default community strings. Any device still using “public” or “private” is a silent finding. Scans against “public” succeed without generating any auth-failure trap, so the absence of traps does not mean the absence of scans.
Restrict SNMP to the management subnet. SNMP on UDP 161 should never be reachable from outside the management network. Apply ACLs on every device.
Prefer SNMPv3 over v2c. SNMPv2c community strings are transmitted in cleartext and can be captured by passive sniffing on the management VLAN. SNMPv3 with authPriv provides both authentication and encryption.
Baseline the auth-failure rate. A healthy estate should have zero auth failures in steady state. Any nonzero value is abnormal. Track the per-source distribution so that a new source is immediately visible.
Monitor cumulative counters, not just traps. Traps can be dropped by a flooded receiver. UDP 162 is lossy under burst. The cumulative counters (snmpInBadCommunityNames, usmStatsUnknownUserNames, usmStatsWrongDigests) persist at the agent and survive trap loss.
Correlate with SSH and AAA auth failures. Multi-vector scanning probes multiple protocols from the same source. Join SNMP auth-failure events with syslog auth failures by source IP and timestamp.
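
The join described in the last item can be sketched as a lookup keyed on source IP with a time window. This assumes events have already been parsed into tuples with epoch-second timestamps; the tuple layout, the function name, and the 300-second window are illustrative choices, not a fixed format.

```python
from collections import defaultdict


def correlate(
    snmp_events: list[tuple[float, str]],
    other_events: list[tuple[float, str, str]],
    window_seconds: float = 300,
) -> dict[str, set[str]]:
    """Find sources that failed SNMP auth and another protocol close in time.

    snmp_events:  (timestamp, source_ip)
    other_events: (timestamp, source_ip, protocol), for example from SSH or AAA logs
    Returns {source_ip: {protocols seen within the window}}.
    """
    by_ip: dict[str, list[tuple[float, str]]] = defaultdict(list)
    for timestamp, ip, protocol in other_events:
        by_ip[ip].append((timestamp, protocol))

    hits: dict[str, set[str]] = {}
    for timestamp, ip in snmp_events:
        nearby = {
            protocol
            for other_ts, protocol in by_ip.get(ip, [])
            if abs(other_ts - timestamp) <= window_seconds
        }
        if nearby:
            hits.setdefault(ip, set()).update(nearby)
    return hits


snmp = [(1000, "203.0.113.7"), (1005, "10.20.0.15")]
other = [(1100, "203.0.113.7", "ssh"), (900, "203.0.113.7", "http"), (5000, "10.20.0.15", "ssh")]
print(correlate(snmp, other))
```

A source that appears in the result with more than one protocol matches the multi-vector pattern; a source that only ever fails SNMP is more likely a poller.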

How Netdata helps

Netdata’s SNMP collector can poll snmpInBadCommunityNames, usmStatsUnknownUserNames, and usmStatsWrongDigests at per-second resolution, giving rate-of-change visibility that 5-minute polling misses.
Trap receiver metrics expose per-trap-type rates, so an authenticationFailure spike is visible as a distinct signal alongside linkDown, coldStart, and enterprise-specific traps.
Correlate SNMP auth-failure spikes with syslog auth failures from the same source IP on the unified timeline, without joining across separate tools.
Anomaly detection on the auth-failure counter rate baselines the normal (zero) state and flags any deviation, including slow-rate probing that stays below fixed thresholds.
UDP socket buffer drop monitoring (Udp_RcvbufErrors) on the trap receiver surfaces when traps are lost under burst, so an auth-failure spike does not silently disappear at the receiver.
