Nagios Plugins

Plugin: scripts.d.plugin
Module: nagios

Overview

This collector runs Nagios-compatible plugins and custom scripts. It provides:

  • Check state monitoring — tracks whether each check returns OK, WARNING, CRITICAL, or UNKNOWN
  • Execution metrics — measures run duration, CPU time, and memory usage of each check
  • Automatic performance data charts — any Nagios performance data in the check output is parsed and charted automatically
  • Threshold-based alerting — when performance data includes warning/critical thresholds, Netdata derives threshold state and creates built-in alerts

Netdata executes each configured command on a schedule, reads the process exit code to determine the check state, and parses the standard output for a status message and optional performance data. Any performance data is automatically converted into charts.

:::tip

You can use packaged Nagios plugins or write your own scripts — any executable that follows the Nagios plugin output format will work.

:::

Nagios Plugin Output Format

A Nagios-compatible plugin communicates through two channels: the process exit code and standard output. For the full specification, see the Nagios Plugin Development Guidelines.

Exit Codes

The exit code is the only thing that determines the check state — the output text is for display purposes only.

| Exit Code | State | Meaning |
|---|---|---|
| 0 | OK | Check passed |
| 1 | WARNING | Above warning threshold or degraded |
| 2 | CRITICAL | Above critical threshold or service down |
| 3 | UNKNOWN | Invalid arguments or internal error |

Standard Output

The output follows this structure:

STATUS TEXT | perfdata1=val;warn;crit;min;max perfdata2=val
LONG OUTPUT LINE 1
LONG OUTPUT LINE 2 | more_perfdata=val

| Part | Description |
|---|---|
| Status text | Text before the pipe on the first line. Shown as the job’s status message. |
| Performance data | Text after the pipe on any line. Parsed into charts automatically. |
| Long output | Lines 2+ before the pipe. Additional detail text. |

Note: The pipe separator is optional. Without it, the entire first line is the status text and no performance data charts are created.
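
As a rough illustration of the structure above, the sketch below prints a status line, one plain long-output line, and a second long-output line that carries extra performance data. The function name `emit_check` and the metric names `queue_depth` and `consumers` are made up for this example.

```sh
#!/bin/sh
# Hypothetical check: the function name, metric names, and values are illustrative only.
emit_check() {
    # Line 1: status text, then "|" and the first perfdata metric.
    echo "OK - queue healthy | queue_depth=12;50;100;0;"
    # Line 2: long output only (no pipe, so no perfdata on this line).
    echo "Oldest message is 3 seconds old"
    # Line 3: long output, then "|" and more perfdata.
    echo "Consumers connected: 4 | consumers=4;;;0;"
}

# The script exits with the status of its last command, which is 0 (OK) here.
emit_check
```

Under the rules described above, the first line would supply the status message and both `queue_depth` and `consumers` would be picked up as performance data.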

Performance Data Format

Each performance data metric uses this format:

'label'=value[UOM];[warn];[crit];[min];[max]

| Field | Required | Description |
|---|---|---|
| label | Yes | Metric name. Quote with single quotes if it contains spaces. |
| value | Yes | Numeric value. |
| UOM | No | Unit of measurement (see table below). |
| warn | No | Warning threshold range. |
| crit | No | Critical threshold range. |
| min | No | Minimum possible value. |
| max | No | Maximum possible value. |

Separate multiple metrics with spaces.

Supported Units of Measurement (UOM):

| UOM | Meaning | How Netdata charts it |
|---|---|---|
| (none) | Unitless number | Charted as-is |
| s | Seconds (also ms, us, ns) | Normalized to seconds |
| % | Percentage | Charted as percentage |
| B | Bytes (also KB, MB, GB, TB) | Charted in bytes |
| b | Bits (also Kb, Mb, Gb, Tb) | Charted in bits |
| c | Continuous counter | Charted as incremental rate |
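
A small helper can keep performance data tokens well-formed inside a check script. The sketch below is one possible way to build them. The function name `perfdata` and its argument order are assumptions made for this example and are not part of Netdata or the Nagios plugins.

```sh
#!/bin/sh
# perfdata LABEL VALUE [UOM] [WARN] [CRIT] [MIN] [MAX]
# Hypothetical helper: prints one token as label=value[UOM];[warn];[crit];[min];[max].
perfdata() {
    label=$1 value=$2 uom=${3:-} warn=${4:-} crit=${5:-} min=${6:-} max=${7:-}
    # Wrap the label in single quotes only when it contains a space.
    case $label in
        *" "*) label="'$label'" ;;
    esac
    printf '%s=%s%s;%s;%s;%s;%s' "$label" "$value" "$uom" "$warn" "$crit" "$min" "$max"
}

# Two metrics separated by a space, as the format requires.
echo "OK - disk fine | $(perfdata "free space" 85 % 20: 10: 0 100) $(perfdata used 2380912 KB)"
```

Omitted fields are left empty between the semicolons, so the positions of warn, crit, min, and max stay unambiguous.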

Threshold Ranges

Thresholds use the format [@]start:end, where a bare number like 10 is shorthand for 0:10 and ~ represents negative infinity (no lower bound). An alert triggers when the value falls outside the range (or inside with the @ prefix):

| Range | Alert when… |
|---|---|
| 10 | value < 0 or value > 10 |
| 10: | value < 10 |
| ~:10 | value > 10 |
| 10:20 | value < 10 or value > 20 |
| @10:20 | 10 ≤ value ≤ 20 |

When warn and crit ranges are provided on non-counter metrics, Netdata automatically derives a threshold state (ok / warning / critical) and creates charts with built-in alerts.
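
To sanity-check a range before putting it in a script, the range rules can be modelled in a few lines of shell. The sketch below is only an illustration of the table above, not Netdata's parser. The function name `range_alerts` is made up, and `awk` is used for the numeric comparison so that decimal values work.

```sh
#!/bin/sh
# range_alerts RANGE VALUE
# Hypothetical helper: prints "alert" if VALUE violates RANGE, otherwise "ok".
range_alerts() {
    range=$1 value=$2
    inside=0
    # A leading "@" inverts the test: alert when the value is inside the range.
    case $range in
        @*) inside=1; range=${range#@} ;;
    esac
    # Split into start and end; a bare number N is shorthand for 0:N.
    case $range in
        *:*) start=${range%%:*}; end=${range#*:} ;;
        *)   start=0; end=$range ;;
    esac
    [ -z "$start" ] && start=0
    awk -v v="$value" -v s="$start" -v e="$end" -v inside="$inside" 'BEGIN {
        below = (s != "~" && v + 0 < s + 0)   # "~" means no lower bound
        above = (e != ""  && v + 0 > e + 0)   # empty end means no upper bound
        out = below || above
        if (inside) out = !out
        print (out ? "alert" : "ok")
    }'
}

range_alerts "~:2" 3.5     # response time above 2 seconds
range_alerts "10:" 42      # free space well above the 10% floor
```

The two calls at the end print `alert` and `ok` respectively.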

Common threshold patterns:

| I want to alert when… | warn | crit |
|---|---|---|
| Value exceeds a limit (e.g., response time > 2s) | ~:2 | ~:5 |
| Value drops below a floor (e.g., free space < 10%) | 10: | 5: |
| Value is outside a band (e.g., temperature 20–80) | 20:80 | 10:90 |

Example

A minimal Nagios-compatible script:

#!/bin/sh
echo "OK - 85% free memory | free_pct=85%;20:;10:;0;100 used_kb=2380912KB;;;0;16380000"
exit 0

This produces:

  • Check state: OK (exit code 0)
  • Status text: OK - 85% free memory
  • Charts: free_pct (percentage with warning/critical thresholds) and used_kb (bytes)

:::info

Retry behavior: When a check returns a non-OK state, Netdata does not alert immediately. The check enters a soft state and retries at the retry_interval rate. Only after max_check_attempts consecutive failures does it become a hard state and trigger alerts. If the check recovers during retries, it returns to OK without alerting. The retry dimension on state charts indicates a soft state is in progress.

:::
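
The soft and hard state progression can be pictured with a tiny simulation. The sketch below is a simplified model of the retry rule described above, not the collector's actual code. The function name `simulate_retries` is made up, and the timing side (check_interval versus retry_interval) is left out.

```sh
#!/bin/sh
# simulate_retries MAX_CHECK_ATTEMPTS EXIT_CODE...
# Hypothetical model: prints "ok", "soft", or "hard" for each check result in order.
simulate_retries() {
    max=$1; shift
    attempts=0
    for code in "$@"; do
        if [ "$code" -eq 0 ]; then
            # Recovery resets the counter, so no alert is raised.
            attempts=0
            echo "ok"
        else
            attempts=$((attempts + 1))
            if [ "$attempts" -ge "$max" ]; then
                echo "hard"   # enough consecutive failures: alerts may fire
            else
                echo "soft"   # still retrying
            fi
        fi
    done
}

# Two failures followed by a recovery never reach a hard state with max_check_attempts 3.
simulate_retries 3 0 2 2 0 2 2 2
```

With these inputs the model prints ok, soft, soft, ok, soft, soft, hard: only the final run of three consecutive failures reaches a hard state.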

This collector is supported on all platforms.

This collector supports collecting metrics from multiple instances of this integration, including remote instances.

No additional permissions are required by the collector itself. If a check needs access to protected files, sockets, or system commands, provide that access to the check command or helper it uses.

Default Behavior

Auto-Detection

No automatic detection is performed. Add one or more jobs explicitly and point each job to the script or executable you want Netdata to run.

Limits

Each job runs one configured command. Additional charts are created only when the check emits Nagios performance data.

Performance Impact

Each job starts an external command. The impact depends mostly on how often the job runs and how expensive the check command itself is.

Setup

Prerequisites

Install check commands

Install the Nagios plugins or other Nagios-compatible scripts that you want Netdata to run.

Most Linux distributions provide Nagios plugin packages:

# Debian/Ubuntu
apt install nagios-plugins

# RHEL/CentOS/Fedora
dnf install nagios-plugins-all

Make sure the configured command path exists and is executable by the netdata user.

Prepare custom check scripts

If you are writing your own check scripts instead of using packaged Nagios plugins:

  • Place scripts anywhere accessible to the netdata user (e.g., /usr/local/lib/netdata/checks/)
  • Make scripts executable: chmod +x /path/to/script.sh
  • Test as the netdata user to verify permissions and environment: sudo -u netdata /path/to/script.sh
  • Verify the exit code: echo $? (must be 0, 1, 2, or 3)
  • Verify the output matches the Nagios plugin output format described in the Overview above
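
The manual checks in the list above can be bundled into one helper. The sketch below is an assumption-laden convenience, not a Netdata tool. The function name `validate_check` is made up; it runs a command once, maps the exit code to a state, and reports whether the first output line contains a pipe.

```sh
#!/bin/sh
# validate_check COMMAND [ARGS...]
# Hypothetical helper: runs a check once and reports the state implied by its exit code.
validate_check() {
    output=$("$@" 2>/dev/null)
    code=$?
    case $code in
        0) state=OK ;;
        1) state=WARNING ;;
        2) state=CRITICAL ;;
        3) state=UNKNOWN ;;
        *) echo "invalid exit code $code (expected 0, 1, 2, or 3)"; return 1 ;;
    esac
    # Only the first line is inspected here; later lines may also carry perfdata.
    first_line=$(printf '%s\n' "$output" | sed -n 1p)
    case $first_line in
        *"|"*) perf="perfdata present" ;;
        *)     perf="no perfdata on first line" ;;
    esac
    echo "state=$state; $perf"
}

# Stand-in for a real check script, so the example runs anywhere.
validate_check sh -c 'echo "OK - all good | load=0.42;;;0;"; exit 0'
```

To test a real script, replace the `sh -c` stand-in with the script path and run the whole file with `sudo -u netdata`, so the check executes with the same permissions it will have under the collector.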

Configuration

Options

Add jobs under jobs:. Each job runs one Nagios-compatible check command.

| Group | Option | Description | Default | Required |
|---|---|---|---|---|
| Collection | update_every | How often the collector’s internal scheduler ticks, in seconds. Controls chart granularity. In most cases you only need to set check_interval. | 10 | no |
| Target | plugin | Absolute path to the Nagios-compatible executable to run. This can be a packaged Nagios plugin or your own executable. If you need a script interpreter, point plugin to that interpreter and pass the script path in args. The command should return exit code 0, 1, 2, or 3 and may print performance data after the pipe separator. | | yes |
| | args | Arguments passed to the command. | | no |
| | arg_values | Values exposed to $ARG1$ through $ARG32$ for macro expansion. The first value maps to $ARG1$, the second to $ARG2$, and so on. | | no |
| | working_directory | Working directory used when running the command. | | no |
| Scheduling | timeout | Maximum time allowed for one command run. If the check exceeds this limit, the job state becomes timeout. | 5s | no |
| | check_interval | Interval between regular checks. | 5m | no |
| | retry_interval | Interval between retries while a check remains in a non-OK soft state. | 1m | no |
| | max_check_attempts | Number of attempts before a non-OK result becomes a hard state. | 3 | no |
| | check_period | Name of the time period that controls when the job is allowed to run. The built-in 24x7 period (always allowed) is the default. Outside the active period, the check does not execute and the job state becomes paused. | 24x7 | no |
| | time_periods | Custom named time periods defined inside the same job. Supports weekly, nth_weekday, and date rule types. | | no |
| Environment | environment | Extra environment variables added on top of the collector’s limited execution baseline. The check does not inherit the full Netdata process environment. | | no |
| | custom_vars | Custom service variables exposed to the check as Nagios-style macros. | | no |
| Virtual Node | vnode | Associate the job with a virtual node so the check can use host-specific labels and macros. | | no |
| Misc | notes | Optional notes for the job definition. | | no |

environment

A key-value map of environment variables injected into the check’s process. Use this when your script depends on variables that are not part of the collector’s default environment.

jobs:
  - name: oracle_check
    plugin: /usr/local/bin/check_oracle.sh
    environment:
      ORACLE_HOME: /opt/oracle/product/19c
      LD_LIBRARY_PATH: /opt/oracle/product/19c/lib

custom_vars

A key-value map of custom service variables. Each entry is exposed as a NAGIOS__SERVICE<UPPERCASE_KEY> environment variable and can be referenced in args using the Nagios macro syntax $_SERVICE<KEY>$.

jobs:
  - name: check_db
    plugin: /usr/lib/nagios/plugins/check_pgsql
    args: ["-H", "$_SERVICEDBHOST$", "-d", "$_SERVICEDBNAME$"]
    custom_vars:
      DBHOST: db.example.com
      DBNAME: production
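
Because each custom variable is also exported to the check's environment, a script can read it directly instead of receiving it through args. The sketch below assumes the DBHOST and DBNAME keys from the example above, which per the naming rule would appear as `NAGIOS__SERVICEDBHOST` and `NAGIOS__SERVICEDBNAME`. The function name `db_target` is made up, and the subshell at the end only simulates what the collector would export.

```sh
#!/bin/sh
# Hypothetical check body: reads custom_vars from the environment.
db_target() {
    host=${NAGIOS__SERVICEDBHOST:-}
    name=${NAGIOS__SERVICEDBNAME:-}
    if [ -z "$host" ] || [ -z "$name" ]; then
        # Missing configuration is an internal error, so report UNKNOWN (3).
        echo "UNKNOWN - DBHOST or DBNAME custom variable is not set"
        return 3
    fi
    echo "OK - would check database $name on $host"
    return 0
}

# Simulate the variables the collector would export for the job above.
(
    export NAGIOS__SERVICEDBHOST=db.example.com NAGIOS__SERVICEDBNAME=production
    db_target
)
```

Returning UNKNOWN when a variable is missing keeps a configuration mistake from being reported as a service failure.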

via File

The configuration file name for this integration is scripts.d/nagios.conf.

You can edit the configuration file using the edit-config script from the Netdata config directory.

cd /etc/netdata 2>/dev/null || cd /opt/netdata/etc/netdata
sudo ./edit-config scripts.d/nagios.conf

Examples

Basic check

Run a Nagios check command on a fixed interval.

jobs:
  - name: ping_localhost
    plugin: /usr/lib/nagios/plugins/check_ping
    args: ["-H", "127.0.0.1", "-w", "100.0,20%", "-c", "200.0,40%"]
    timeout: 5s
    check_interval: 1m
    retry_interval: 30s
    max_check_attempts: 3

End-to-end custom script

Write a custom check script, then configure Netdata to run it.

1. Create the script (e.g., /usr/local/lib/netdata/checks/check_api.sh):

#!/bin/sh
# Check HTTP endpoint health
URL="http://localhost:8080/health"

response=$(curl -s -o /dev/null -w "%{http_code} %{time_total}" --max-time 5 "$URL" 2>/dev/null)
curl_exit=$?

if [ "$curl_exit" -ne 0 ]; then
    echo "UNKNOWN - Could not connect to $URL (curl exit code $curl_exit)"
    exit 3
fi

http_code=$(echo "$response" | cut -d' ' -f1)
response_time=$(echo "$response" | cut -d' ' -f2)

if [ "$http_code" -ge 500 ]; then
    echo "CRITICAL - $URL returned HTTP $http_code | response_time=${response_time}s;2;5;0;"
    exit 2
elif [ "$http_code" -ne 200 ]; then
    echo "WARNING - $URL returned HTTP $http_code | response_time=${response_time}s;2;5;0;"
    exit 1
fi

echo "OK - $URL returned HTTP $http_code | response_time=${response_time}s;2;5;0;"
exit 0

2. Make it executable and test it:

chmod +x /usr/local/lib/netdata/checks/check_api.sh
sudo -u netdata /usr/local/lib/netdata/checks/check_api.sh
echo "Exit code: $?"

3. Add the configuration below, then restart Netdata (sudo systemctl restart netdata). After restarting, look for nagios.job.execution_state and related charts in the Netdata dashboard.

jobs:
  - name: api_health
    plugin: /usr/local/lib/netdata/checks/check_api.sh
    timeout: 10s
    check_interval: 1m
    retry_interval: 30s
    max_check_attempts: 3

Custom script (minimal)

Run your own Nagios-compatible shell script with minimal configuration.

jobs:
  - name: custom_memory_check
    plugin: /opt/netdata/check_memory.sh
    timeout: 5s
    check_interval: 1m

Check with a job-local schedule

Run a check only during selected hours by defining time periods inside the job.

jobs:
  - name: business_hours_http
    plugin: /usr/lib/nagios/plugins/check_http
    args: ["-H", "example.com"]
    check_period: business_hours
    time_periods:
      - name: business_hours
        alias: Business hours
        rules:
          - type: weekly
            days: [monday, tuesday, wednesday, thursday, friday]
            ranges: ["09:00-18:00"]

Check with virtual node macros

Run a check against a virtual node and fill command arguments from Nagios-style macros.

jobs:
  - name: check_ssh
    plugin: /usr/lib/nagios/plugins/check_ssh
    args: ["-H", "$HOSTADDRESS$", "-p", "$ARG1$"]
    arg_values: ["22"]
    vnode: remote-server
    check_interval: 5m

Metrics

Metrics grouped by scope.

The scope defines the instance that the metric belongs to. An instance is uniquely identified by a set of labels.

Each configured job produces execution state and resource usage charts. When a check emits Nagios performance data, additional charts are created automatically for each metric. Non-counter perfdata with warning/critical thresholds also get threshold state charts for alerting.

Per job

These metrics refer to each configured check job.

Labels:

| Label | Description |
|---|---|
| nagios_job | Job name as defined in the configuration. |
| perfdata_value | Identifies which performance data metric a threshold state belongs to. Format is <unit_class>_<label>, where <unit_class> is derived from the UOM (time, bytes, bits, percent, or generic) and <label> is the sanitized metric label from the check output. For example, repl_lag=5s produces time_repl_lag. |

Metrics:

| Metric | Dimensions | Unit |
|---|---|---|
| nagios.job.execution_state | ok, warning, critical, unknown, timeout, paused, retry | state |
| nagios.job.perfdata_threshold_state | no_threshold, ok, warning, critical, retry | state |
| nagios.job.execution_duration | duration | seconds |
| nagios.job.execution_cpu_total | total | seconds |
| nagios.job.execution_max_rss | rss | bytes |

Alerts

The following alerts are available:

| Alert name | On metric | Description |
|---|---|---|
| nagios_job_execution_state_warn | nagios.job.execution_state | Nagios job ${label:nagios_job} is in WARNING state |
| nagios_job_execution_state_crit | nagios.job.execution_state | Nagios job ${label:nagios_job} is in CRITICAL state |
| nagios_job_perfdata_threshold_state_warn | nagios.job.perfdata_threshold_state | Nagios job ${label:nagios_job} perfdata ${label:perfdata_value} is in WARNING threshold state |
| nagios_job_perfdata_threshold_state_crit | nagios.job.perfdata_threshold_state | Nagios job ${label:nagios_job} perfdata ${label:perfdata_value} is in CRITICAL threshold state |

Troubleshooting

The command cannot be executed

Confirm that the path in plugin exists, is executable, and can be accessed by the netdata user. If the check depends on external files or helpers, verify those paths and permissions too.

No performance-data charts appear

Performance-data charts are created only when the check prints Nagios performance data after the | separator. If the command returns only a status line without performance data, Netdata will still show the job state but no extra charts.

Some performance-data values are ignored

Check that each metric uses the Nagios performance-data format 'label'=value[UOM];[warn];[crit];[min];[max] and that multiple metrics are separated by spaces. If a label contains spaces, wrap it in single quotes. Netdata charts the main value of every perfdata metric. For non-counter metrics it also derives a threshold state from warn and crit. It does not create separate charts for min, max, or the raw threshold bounds.

The job state does not match the output text

The visible text does not decide the state. Netdata uses the process exit code instead: 0 for OK, 1 for WARNING, 2 for CRITICAL, and 3 for UNKNOWN. If the check exceeds the configured timeout, Netdata reports timeout even if the script never had a chance to print its own final state. If the current time is outside check_period, Netdata reports paused until the check is allowed to run again.

Only the first output line appears as the main status

This is expected. Netdata uses the first line as the summary shown for the job. Additional lines are kept as long output, and any | sections found on later lines are also parsed for performance data.

Macros are not expanded as expected

Check that positional values are provided in arg_values, custom service variables are defined in custom_vars, and any virtual-node labels needed for host macros are present on the selected vnode.

The script works in a shell but fails under Netdata

Nagios checks run with a limited execution environment rather than inheriting the full Netdata process environment. If the script depends on extra variables, set them explicitly in environment instead of relying on ambient shell state.
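
One way to reproduce this class of failure before deploying a check is to run it with a stripped-down environment. The sketch below uses `env -i` for that. The function name `run_clean` is made up, and the PATH value is an assumption for illustration, since the collector's exact baseline is not listed here.

```sh
#!/bin/sh
# run_clean COMMAND [ARGS...]
# Hypothetical helper: runs a command with an almost empty environment.
# The PATH below is an assumed value, not the collector's actual baseline.
run_clean() {
    env -i PATH=/usr/bin:/bin "$@"
}

# A variable exported in the calling shell is not visible to the command...
MY_SETTING=from_shell
export MY_SETTING
run_clean sh -c 'echo "MY_SETTING=${MY_SETTING:-<unset>}"'

# ...unless it is passed explicitly, which is the role of the environment option.
run_clean env MY_SETTING=explicit sh -c 'echo "MY_SETTING=${MY_SETTING:-<unset>}"'
```

If a script works in a login shell but fails under `run_clean`, it probably depends on a variable that needs to be listed under environment in the job definition.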

Built-in alerts cover warning and critical states only

This collector installs stock Netdata health alerts for the warning and critical states on nagios.job.execution_state and nagios.job.perfdata_threshold_state. Both stock alert families suppress soft retry states by checking that retry is not active. If you also want alerts for unknown, timeout, paused, or more specific perfdata behavior, build your own rules on top of these contexts. The nagios.job.perfdata_threshold_state chart uses the perfdata_value label to identify which perfdata metric each threshold state belongs to.

Configuration changes are not picked up

After editing scripts.d/nagios.conf, restart the Netdata Agent for changes to take effect: sudo systemctl restart netdata.

Script stderr output is not visible

Netdata captures the check’s standard output for status and performance data parsing. Standard error (stderr) is logged by the collector but not used for state or charts. If your script writes errors to stderr, check the Netdata error log for details.

Windows checks need an executable entry point

The collector runs the command named in plugin directly. On Windows, point plugin to an executable or to an interpreter such as powershell.exe and pass the script path in args.
