Normalizing syslog severity across vendors: why 'critical' isn't critical

RFC 5424 defines eight syslog severity levels, numbered 0 through 7: Emergency, Alert, Critical, Error, Warning, Notice, Informational, and Debug. Every major network vendor implements the same numeric scale. The integer that means “Critical” on a Cisco router means “Critical” on a Juniper switch.

But the severity a device assigns to a given event is not standardized. A BGP session reset might arrive as severity 5 (Notice) from one vendor and severity 3 (Error) from another. A hardware alarm that one platform logs as Critical (2), another logs as Alert (1) or Warning (4). Same operational condition, different severity label, same RFC.

Teams that forward raw vendor severity into a SIEM without a normalization layer build alerting rules on an unstable foundation. A rule that triggers on “severity at or above Critical” (numeric 2 or lower) fires for some vendors and silently misses the same event class from others.

The production failure pattern

A firewall sends a “Warning” severity message about a threat detection. The SIEM alerting rule triggers only on “Critical” and “Alert.” It never fires. The threat goes undetected for hours.

Any multi-vendor network that feeds syslog into a SIEM, log aggregator, or alerting pipeline inherits this problem. The question is not whether severity mismatch exists. It is whether your normalization layer accounts for it before the data reaches your alert rules.

How it works

The RFC 5424 severity scale

The eight levels, from most to least severe:

Numeric	Label	Typical operational meaning
0	Emergency	System is unusable
1	Alert	Action must be taken immediately
2	Critical	Critical conditions
3	Error	Error conditions
4	Warning	Warning conditions
5	Notice	Normal but significant condition
6	Informational	Informational messages
7	Debug	Debug-level messages

All vendors agree on this mapping. No vendor assigns a different label to numeric 2. The disagreement is upstream: which events get which numbers. That decision is left to each vendor’s firmware developers.

Threshold filtering is inclusive upward

When you configure a severity threshold on a network device, the device sends that level and all numerically lower (more severe) levels. Configuring “error” (3) on a Juniper device captures Emergency (0), Alert (1), Critical (2), and Error (3). The same inclusive behavior holds on Cisco, Arista, and FortiOS.

Set the threshold too permissively and you flood the collector with debug and informational noise. Set it too restrictively and you silently suppress events the vendor assigned to Notice or Informational but that you consider operationally important. Device-side threshold filtering controls volume. It does not normalize severity across vendors.
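The inclusive rule is easy to get backwards because lower numbers are more severe. A minimal sketch of the comparison a device applies, assuming the RFC 5424 labels above (the names `SEVERITY_NAMES` and `passes_threshold` are illustrative, not from any vendor API):

```python
# RFC 5424 severity labels, indexed by their numeric code (0 = most severe).
SEVERITY_NAMES = [
    "emergency", "alert", "critical", "error",
    "warning", "notice", "informational", "debug",
]


def passes_threshold(severity: int, threshold: str) -> bool:
    """Return True if a message at `severity` is sent under `threshold`.

    A threshold includes its own level and every numerically lower
    (more severe) level.
    """
    limit = SEVERITY_NAMES.index(threshold)
    return severity <= limit


# Levels a device forwards when configured with the "error" threshold.
sent = [SEVERITY_NAMES[s] for s in range(8) if passes_threshold(s, "error")]
print(sent)  # ['emergency', 'alert', 'critical', 'error']
```

Note that the filter only looks at the number the vendor chose. An event the vendor logs as Notice (5) never reaches the collector under this threshold, however important it is to you.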

Facility codes differ across vendors

Severity is only half of the PRI field. The other half is the facility code, which identifies the subsystem that generated the message. Facility codes also differ across vendors, and this breaks rules that filter on facility.

Juniper uses various facility codes for different subsystems, including kernel (0), user (1), daemon (3), authorization (4), and local facilities for firewall and forwarding-plane messages.

Cisco routing-plane messages commonly use local6 or local7 depending on platform and configuration. A SIEM rule that filters on a specific facility will silently drop identical-severity messages from a different vendor using a different facility code. If your normalization layer remaps severity but not facility, you have solved only half the problem.

Two message formats in the wild

RFC 5424 obsoleted RFC 3164 (the BSD syslog format). RFC 5424 added structured-data elements, a version field, and a strictly defined message structure. Most legacy network gear still emits RFC 3164-style messages. Both formats remain in active use, and collectors must parse both.

The PRI field (and therefore the severity encoding) is calculated the same way in both formats: `PRI = facility * 8 + severity`. What differs is the surrounding message structure. A parser that expects RFC 5424 and receives RFC 3164 will fail to extract fields correctly, not because the severity is encoded differently, but because the message layout differs. Test your collector against actual device output for both formats.
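Because the PRI arithmetic is identical in both formats, facility and severity can be recovered before the parser knows which layout follows. A minimal sketch, assuming the message starts with the `<PRI>` field (the `Pri` type and `decode_pri` function are illustrative names):

```python
import re
from typing import NamedTuple

# The PRI field is an integer in angle brackets at the very start of the message.
PRI_PATTERN = re.compile(r"^<(\d{1,3})>")


class Pri(NamedTuple):
    facility: int
    severity: int


def decode_pri(message: str) -> Pri:
    """Split the PRI value into (facility, severity): PRI = facility * 8 + severity."""
    match = PRI_PATTERN.match(message)
    if match is None:
        raise ValueError("message does not start with a PRI field")
    pri = int(match.group(1))
    if pri > 191:  # highest valid value: facility 23 * 8 + severity 7
        raise ValueError(f"PRI {pri} is out of range")
    facility, severity = divmod(pri, 8)
    return Pri(facility, severity)


print(decode_pri("<189>example message"))  # Pri(facility=23, severity=5)
```

Here 189 decodes to facility 23 (local7) and severity 5 (Notice), since 23 * 8 + 5 = 189.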

flowchart TD
    A[Multi-vendor syslog stream] --> B[Collector / forwarder]
    B --> C{Per-vendor severity normalization?}
    C -->|No: raw passthrough| D[SIEM rules match on raw severity]
    C -->|Yes: remap table| E[SIEM rules match on normalized severity]
    D --> F[Missed alerts on some vendors]
    E --> G[Consistent alerting across vendors]

Where it shows up in production

The SIEM alert gap

A SIEM rule pages on “critical” events from network devices. It matches on the severity keyword or numeric code in the syslog message. It works for the vendor the team tested against. It silently fails for other vendors that assign different severity to the same event class.

This is most dangerous for security events. A threat detection message logged as Warning (4) by one firewall vendor may represent the same operational urgency as Critical (2) from another. If the alert threshold is set at Critical, the Warning event passes through unflagged.

Compound hostnames breaking inventory matching

Junos OS Evolved appends the node name to the hostname by default, producing hostnames like “ptxhost-re0” or “ptxhost-fpc0” in syslog messages. Some monitoring systems fail to match these compound hostnames against their inventory database. The syslog message arrives with the correct severity but is orphaned because the SIEM cannot map it to a known device.

The workaround is `set system syslog alternate-format`, which prepends the node name to the process identifier instead, keeping the bare hostname intact. This is a Juniper-specific configuration.

Management instance changes after upgrade

Starting in Junos OS Release 24.2R1, syslog traffic no longer defaults to the dedicated management instance when `management-instance` is configured. You must explicitly configure `mgmt_junos` for system log traffic to use the management VRF. Operators upgrading from older Junos OS Evolved releases may find that syslog stops arriving after the upgrade. The cause is not a severity change but a routing-instance change that breaks the transport path. Verify syslog routing-instance configuration post-upgrade.

Windows Event Log does not map to syslog severity

If your estate includes Windows servers alongside network devices, the severity mismatch extends beyond network vendors. Windows Event Log uses a different taxonomy: Critical, Error, Warning, Information, Verbose. There is no Emergency or Alert level. A Windows “Critical” event maps approximately to syslog Error (3) or Critical (2) depending on the application, not to Emergency (0). Correlating Windows and network-device severity in the same SIEM dashboard requires explicit mapping logic.
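One way to make that mapping explicit is a default table plus per-application overrides. The sketch below is illustrative only: the default values are one plausible choice, not a standard, and `ExampleApp`, `APP_OVERRIDES`, and `windows_to_syslog` are hypothetical names.

```python
from typing import Optional

# Assumed default mapping from Windows Event Log level to syslog severity.
# These values are a starting point to calibrate, not an official mapping.
DEFAULT_WINDOWS_MAP = {
    "Critical": 2,     # syslog Critical
    "Error": 3,        # syslog Error
    "Warning": 4,      # syslog Warning
    "Information": 6,  # syslog Informational
    "Verbose": 7,      # syslog Debug
}

# Hypothetical override: this application's "Critical" is treated as Error (3).
APP_OVERRIDES = {
    ("ExampleApp", "Critical"): 3,
}


def windows_to_syslog(level: str, application: Optional[str] = None) -> int:
    """Map a Windows level to a syslog severity, preferring app-specific overrides."""
    override = APP_OVERRIDES.get((application, level))
    if override is not None:
        return override
    return DEFAULT_WINDOWS_MAP[level]
```

Nothing maps to Emergency (0) or Alert (1) by default, which matches the absence of those levels in the Windows taxonomy.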

Common misuses and normalization failures

Alert rules that match on keyword, not numeric code. A rule that matches the string “critical” will miss numeric severity 2 if the syslog message encodes severity numerically. Conversely, a rule that matches numeric severity at or below 2 will match Critical, Alert, and Emergency but will miss a vendor that logs the same event as Warning (4). Match on numeric codes where possible, and normalize the codes per vendor before the alert layer sees them.

Assuming facility consistency. Facility codes differ across vendors. A normalization layer that remaps severity but passes facility through unchanged will produce rules that work for some vendors and silently fail for others.

Relying on device-side threshold filtering as the only control. Device-side filtering reduces volume but cannot normalize across vendors. Two devices configured with the same threshold (“error”) will send different event sets because they assign different severities to the same events. Normalization must happen at the collector or SIEM layer.

Mixing timezones. RFC 3164 timestamps carry no timezone (and no year), so in practice they reflect whatever local time the device is set to. RFC 5424 timestamps follow RFC 3339 and include an explicit UTC offset. Timestamps from devices in different timezones (or with NTP drift) will break event correlation even with perfect severity normalization. Normalize time alongside severity. Force UTC everywhere and monitor device NTP offset.

Ignoring parser format differences. If your parser handles only RFC 3164 or only RFC 5424, messages in the other format will be mis-parsed or dropped. Verify that the collector handles both and test with actual device output.
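A collector can tell the two layouts apart from the first few bytes after the PRI field: RFC 5424 puts a version number and a space there, while RFC 3164 starts with a `Mmm dd hh:mm:ss` timestamp. The following is a heuristic sketch, not a full parser; the function name and the `"unknown"` bucket are illustrative choices.

```python
import re

# RFC 5424: <PRI> followed by a version (currently "1") and a space.
RFC5424_HEADER = re.compile(r"^<\d{1,3}>[1-9]\d{0,2} ")
# RFC 3164: <PRI> followed by "Mmm dd hh:mm:ss " (day is space-padded).
RFC3164_HEADER = re.compile(r"^<\d{1,3}>[A-Z][a-z]{2} [ \d]\d \d{2}:\d{2}:\d{2} ")


def detect_format(message: str) -> str:
    """Classify a raw syslog line as 'rfc5424', 'rfc3164', or 'unknown'."""
    if RFC5424_HEADER.match(message):
        return "rfc5424"
    if RFC3164_HEADER.match(message):
        return "rfc3164"
    # Vendor variants that follow neither layout land here; count them
    # rather than dropping them silently.
    return "unknown"
```

Routing `"unknown"` lines to a counter gives you the parser error rate that a firmware upgrade or a vendor-specific header variant would otherwise hide.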

Building a normalization layer

Audit current severity assignments. Before building remapping rules, query your SIEM or log aggregator for the same event class (BGP peer down, interface down, fan failure) across vendors and compare raw severity values. This reveals the mismatch surface area in your specific estate.

Per-vendor severity remapping. Build a lookup table that maps (vendor, event-class, raw-severity) to a normalized severity. This requires per-vendor calibration: review the syslog output from each vendor in your estate, identify the event classes that matter operationally, and assign a normalized severity based on operational impact, not the vendor’s label.

Facility normalization. Alongside severity, remap facility codes to a normalized scheme. Group vendor-specific facilities into operational categories: routing-plane, firewall, authentication, hardware, config-change. Filter and alert on the normalized category, not the raw facility code.

Format-aware parsing. Ensure the collector handles both RFC 3164 and RFC 5424. Validate that severity extraction works for both formats. Test with actual device output from each vendor, not synthetic messages.
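The severity remapping and facility grouping described above can share one lookup. A minimal sketch, assuming events have already been classified by vendor and event class; the vendor names, event classes, and normalized values below are hypothetical placeholders to replace with the results of your own audit.

```python
from typing import NamedTuple


class Normalized(NamedTuple):
    severity: int   # normalized RFC 5424 severity, chosen by operational impact
    category: str   # operational category that replaces the raw facility


# Keys are (vendor, event_class, raw_severity). All entries are illustrative.
REMAP = {
    ("vendor_a", "bgp_peer_down", 5): Normalized(3, "routing-plane"),
    ("vendor_b", "bgp_peer_down", 3): Normalized(3, "routing-plane"),
    ("vendor_a", "threat_detected", 4): Normalized(2, "firewall"),
}


def normalize(vendor: str, event_class: str, raw_severity: int) -> Normalized:
    """Look up the normalized severity and category for one event."""
    hit = REMAP.get((vendor, event_class, raw_severity))
    if hit is not None:
        return hit
    # No calibration yet: keep the raw severity and flag it for the next audit.
    return Normalized(raw_severity, "unmapped")
```

With this table, a BGP reset logged as Notice (5) by one vendor and Error (3) by another both reach the alert layer as severity 3 in the routing-plane category. Tracking how many events fall into `"unmapped"` shows how much of the estate still alerts on raw vendor severity.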

Signals to watch in production

Signal	Why it matters	Warning sign
Syslog severity distribution per vendor	Reveals vendor-specific severity patterns; baseline what “normal” looks like per vendor	Sudden shift in distribution for one vendor indicates config change or new event type
SIEM alert rate by source vendor	If one vendor generates most alerts, others may be under-alerting due to severity mismatch	One vendor disproportionately quiet compared to peers
UDP receive buffer errors (`RcvbufErrors` in `/proc/net/snmp`)	Drops during syslog storms are indiscriminate, so they can discard the messages that matter most, including root-cause indicators	Counter incrementing during incidents means root-cause syslog may have been lost
Syslog source count (devices actively sending)	A device that stops sending syslog is invisible to severity-based alerting	Drop to zero from one device while others still report indicates isolation or logging subsystem failure
Syslog parser error rate	Format mismatch (RFC 3164 vs 5424) produces parse errors, not silent drops	Rising parse errors after a firmware upgrade indicates format change
Timestamp skew between syslog and corroborating signals	Perfect severity normalization fails if timestamps do not align across sources	Events that should correlate appearing minutes or seconds apart indicates clock drift
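The receive buffer counter in the table can be read directly on a Linux collector. The sketch below parses the `Udp:` header and value lines of `/proc/net/snmp`; the function names are illustrative. The counter is host-wide across all UDP sockets, so treat a rising value as a prompt to investigate rather than proof that syslog specifically was dropped.

```python
def udp_counters(snmp_text: str) -> dict:
    """Parse the 'Udp:' header/value line pair from /proc/net/snmp content."""
    rows = [
        line.split()
        for line in snmp_text.splitlines()
        if line.startswith("Udp:")  # does not match the separate "UdpLite:" lines
    ]
    if len(rows) < 2:
        raise ValueError("no Udp section found")
    names, values = rows[0][1:], rows[1][1:]
    return {name: int(value) for name, value in zip(names, values)}


def read_rcvbuf_errors(path: str = "/proc/net/snmp") -> int:
    """Return the current host-wide UDP RcvbufErrors counter (Linux only)."""
    with open(path) as handle:
        return udp_counters(handle.read())["RcvbufErrors"]
```

Sample the counter periodically and alert on the delta between samples, since the absolute value only ever grows.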

How Netdata helps

Netdata collects syslog metrics alongside SNMP, flow, and system-level signals, letting you correlate severity spikes with interface state changes or BGP session drops.
Per-device syslog collection lets you baseline severity distribution per vendor and detect shifts that indicate config changes before they break alert rules.
Collector health metrics, including receive rate and buffer drops, surface silent syslog data loss at the collector.