VMware vCenter Server

Plugin: go.d.plugin
Module: vsphere

Overview

Monitors vSphere resources from vCenter servers.

Includes hosts, VMs, datastores, clusters, resource pools, and inventory counts.

Use the vcsa collector for vCenter Server Appliance health.

Use the snmp collector with the vmware-esx profile for ESXi hardware, HBA, and environment sensors.

Those surfaces are intentionally not duplicated here by default.

Warning

The vsphere collector cannot re-login and continue collecting metrics after a vCenter reboot. go.d.plugin needs to be restarted.

This collector is supported on all platforms.

This collector supports collecting metrics from multiple instances of this integration, including remote instances.

Default Behavior

Auto-Detection

This integration doesn’t support auto-detection.

Limits

The default configuration for this integration does not impose any limits on data collection.

Performance Impact

The default update_every is 20 seconds, and lowering it brings no benefit: VMware generates real-time statistics at 20-second granularity.

Note: Datastore and cluster performance metrics use 300-second (5-minute) historical intervals because VMware does not support real-time statistics for these entity types. Datastore capacity/status, cluster properties, and resource pool statistics are updated every collection cycle. Host and VM metrics use real-time 20-second intervals.

In large installations, 20 seconds may not be enough to complete a collection cycle, so the value should be tuned.

To size a job, run the collector in debug mode and compare the discovery and collection timing lines. Discovery runs in a separate goroutine, while collection timing must stay comfortably below update_every.

Useful log lines include:

  • discovering : discovered ... the whole process took ...
  • scraping : scraped metrics for ... hosts, process took ...
  • scraping : scraped metrics for ... vms, process took ...
  • metrics collected, process took ...

Adjust update_every and timeout based on those timings and on the number of enabled optional surfaces.
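
As a rough illustration of that tuning, a job for a larger vCenter might look like the sketch below. The job name, URL, credentials, and all three numeric values are hypothetical placeholders rather than recommendations; take real values from the timings in your own debug log. Keeping update_every a multiple of 20 is an assumption made here to stay aligned with the 20-second real-time statistics interval.

```yaml
jobs:
  - name: vcenter_large               # hypothetical job name
    url: https://vcenter.example.com  # hypothetical endpoint
    username: monitoring-user         # placeholder credentials
    password: somepassword
    update_every: 60        # illustrative: raised because collection took longer than 20 s
    timeout: 40             # illustrative: HTTP request timeout in seconds
    discovery_interval: 600 # illustrative: rediscover inventory less often than the 300 s default
```

After changing these values, rerun the collector in debug mode and confirm that the "metrics collected, process took ..." timing stays comfortably below update_every.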

Setup

You can configure the vsphere collector in two ways:

| Method | Best for | How to |
|---|---|---|
| UI | Fast setup without editing files | Go to Nodes → Configure this node → Collectors → Jobs, search for vsphere, then click + to add a job. |
| File | If you prefer configuring via file, or need to automate deployments (e.g., with Ansible) | Edit go.d/vsphere.conf and add a job. |

Important

UI configuration requires a paid Netdata Cloud plan.

Prerequisites

vCenter read-only access

Configure a vCenter account that can read inventory objects, properties, and performance counters for the datacenters, clusters, ESXi hosts, VMs, datastores, and resource pools selected by the include filters.

Optional vSphere metadata permissions

tag_categories requires access to the vSphere Automation/CIS tagging APIs for the selected categories. custom_attributes requires access to custom field definitions and values for the selected inventory objects.

Optional datastore cluster, vSAN, and network data

collect_datastore_clusters requires read access to StoragePod objects. collect_vsan requires vSAN Management API access and the vSAN Performance Service on the target clusters. collect_network_topology requires read access to Network and Distributed Virtual Port Group inventory objects.

Configuration

Options

The following options can be defined globally: update_every, autodetection_retry.

| Group | Option | Description | Default | Required |
|---|---|---|---|---|
| Collection | update_every | Data collection interval (seconds). | 20 | no |
| Collection | autodetection_retry | Autodetection retry interval (seconds). | 60 | no |
| Target | url | Target endpoint URL. | https://vcenter.local | yes |
| Target | timeout | HTTP request timeout (seconds). | 20 | no |
| Discovery | discovery_interval | Hosts, VMs, datastores, clusters, and resource pools discovery interval (seconds). | 300 | no |
| Labels | tag_categories | vSphere tag category allowlist. | | no |
| Labels | custom_attributes | vSphere custom attribute allowlist. | | no |
| High Cardinality | collect_datastore_clusters | Collect datastore cluster capacity and Storage DRS status. | no | no |
| High Cardinality | datastore_cluster_include | Datastore cluster selector. | /* | no |
| High Cardinality | collect_vsan | Collect vSAN metrics. | no | no |
| High Cardinality | vsan_cluster_include | vSAN cluster selector. | /* | no |
| High Cardinality | vsan_host_include | vSAN host selector. | /* | no |
| High Cardinality | vsan_vm_include | vSAN VM selector. | /* | no |
| Collection | collect_network_topology | Discover networks for the vSphere Topology function. | no | no |
| Filters | host_include | Hosts selector (filter). | /* | no |
| Filters | vm_include | VM selector (filter). | /* | no |
| Filters | datastore_include | Datastore selector (filter). | /* | no |
| Filters | cluster_include | Cluster selector (filter). Resource pools follow their owning cluster. | /* | no |
| HTTP Auth | username | Username for Basic HTTP authentication. | | yes |
| HTTP Auth | password | Password for Basic HTTP authentication. | | yes |
| TLS | tls_skip_verify | Skip TLS certificate and hostname verification (insecure). | no | no |
| TLS | tls_ca | Path to CA bundle used to validate the server certificate. | | no |
| TLS | tls_cert | Path to client TLS certificate (for mTLS). | | no |
| TLS | tls_key | Path to client TLS private key (for mTLS). | | no |
| Virtual Node | vnode | Associates this data collection job with a Virtual Node. | | no |

tag_categories

Disabled by default because vSphere tags are user-defined metadata and can expose internal names, ownership, business unit, or environment details. Each list item is one glob pattern matching vSphere tag category names, so names with spaces are supported. Use * only when every tag category is intentional.

Matching categories are exposed as labels named vsphere_tag_<sanitized_category>. When a resource has multiple tags in the same category, values are sorted and joined with the pipe character.

tag_categories:
  - "Environment"
  - "Business Unit"

custom_attributes

Disabled by default because vSphere custom attributes are user-defined metadata and can expose internal names, ownership, business unit, operational data, or secrets stored by administrators. Custom attribute values are sent verbatim as labels. Each list item is one glob pattern matching custom attribute names, so names with spaces are supported. Use * only when every custom attribute is intentional and none of the matched values contain secrets.

Matching attributes are exposed as labels named vsphere_custom_attribute_<sanitized_name>.

custom_attributes:
  - "Owner"
  - "Cost Center"

collect_datastore_clusters

Disabled by default because it adds a separate vSphere resource class (StoragePod) to the collector output. When enabled, the collector emits aggregate datastore-cluster capacity, utilization, and Storage DRS status.

datastore_cluster_include

Applies only when collect_datastore_clusters is enabled. Values use Netdata simple patterns and match /Datacenter/DatastoreCluster, the datastore-cluster name, or the vSphere managed object ID. Matching datastore clusters are included in metrics, labels, cached discovery state, and topology function output.

datastore_cluster_include:
  - "/*"

collect_vsan

Disabled by default because it uses the vSAN Management API and vSAN Performance Service, and adds extra vCenter queries. When enabled, it emits vSAN cluster capacity, vSAN cluster health, and vSAN cluster, host, and VM performance metrics for discovered vSAN-enabled clusters. Use the vSAN selectors below to choose the concrete vSAN performance entity refs queried. vSAN events are not collected by this option.

vsan_cluster_include

Applies only when collect_vsan is enabled. Values use Netdata simple patterns and match /Datacenter/Cluster, the cluster name, the vSphere managed object ID, or vsan_uuid:<uuid>.

vsan_cluster_include:
  - "/*"
  - "vsan_uuid:52b..."

vsan_host_include

Applies only when collect_vsan is enabled. Values use Netdata simple patterns and match /Datacenter/Cluster/Host, the host name, the vSphere managed object ID, or vsan_node_uuid:<uuid>.

vsan_host_include:
  - "/*"
  - "vsan_node_uuid:52b..."

vsan_vm_include

Applies only when collect_vsan is enabled. Values use Netdata simple patterns and match /Datacenter/Cluster/Host/VM, the VM name, the vSphere managed object ID, or instance_uuid:<uuid>.

vsan_vm_include:
  - "/*"
  - "instance_uuid:52b..."

collect_network_topology

Disabled by default to avoid extra vCenter discovery calls for existing users. When enabled, the collector discovers vSphere Network and Distributed Virtual Port Group objects and includes their cached accessibility/status and host/VM relationships in the vSphere Topology function. It does not create charts or metrics.
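
The optional surfaces described above can be combined in a single job. The following is a minimal sketch, assuming a datacenter named DC1 and placeholder credentials; the job name and selector values are hypothetical, and each enabled option adds vCenter queries, so enable only what you need.

```yaml
jobs:
  - name: vcenter1
    url: https://203.0.113.1
    username: monitoring-user   # placeholder credentials
    password: somepassword
    # Datastore clusters (StoragePod objects)
    collect_datastore_clusters: yes
    datastore_cluster_include:
      - '/DC1/*'                # hypothetical datacenter name
    # vSAN capacity, health, and performance
    collect_vsan: yes
    vsan_cluster_include:
      - '/DC1/*'
    vsan_host_include:
      - '/DC1/*'
    vsan_vm_include:
      - '/DC1/*'
    # Networks for the vSphere Topology function (no charts or metrics)
    collect_network_topology: yes
```

The account used by the job also needs the permissions listed under Prerequisites for each enabled option.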

host_include

Metrics of hosts matching the selector will be collected.

  • Include pattern syntax: “/Datacenter pattern/Cluster pattern/Host pattern”.

  • Match pattern syntax: simple patterns.

  • Syntax:

    host_include:
      - '/DC1/*'           # all hosts from datacenter DC1
      - '/DC2/*/!Host2 *'  # all hosts from datacenter DC2 except Host2
      - '/DC3/Cluster3/*'  # all hosts from DC3, cluster Cluster3
    

vm_include

Metrics of VMs matching the selector will be collected.

  • Include pattern syntax: “/Datacenter pattern/Cluster pattern/Host pattern/VM pattern”.

  • Match pattern syntax: simple patterns.

  • Syntax:

    vm_include:
      - '/DC1/*'           # all VMs from datacenter DC1
      - '/DC2/*/*/!VM2 *'  # all VMs from DC2 except VM2
      - '/DC3/Cluster3/*'  # all VMs from DC3, cluster Cluster3
    

datastore_include

Metrics of datastores matching the selector will be collected.

  • Include pattern syntax: “/Datacenter pattern/Datastore pattern”.

  • Match pattern syntax: simple patterns.

  • Syntax:

    datastore_include:
      - '/DC1/*'           # all datastores from datacenter DC1
      - '/DC2/!DS2 *'      # all datastores from DC2 except DS2
    

cluster_include

Metrics of clusters and their resource pools matching the selector will be collected.

  • Include pattern syntax: “/Datacenter pattern/Cluster pattern”.

  • Match pattern syntax: simple patterns.

  • Syntax:

    cluster_include:
      - '/DC1/*'           # all clusters from datacenter DC1
      - '/DC2/!Cluster2 *' # all clusters from DC2 except Cluster2
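
The four selectors are set per job and can be used together. A minimal sketch, assuming a datacenter named DC1 and a VM named VM2 (both hypothetical, as are the job name and credentials), that limits one job to a single datacenter:

```yaml
jobs:
  - name: vcenter_dc1           # hypothetical job name
    url: https://203.0.113.1
    username: monitoring-user   # placeholder credentials
    password: somepassword
    host_include:
      - '/DC1/*'                # all hosts from datacenter DC1
    vm_include:
      - '/DC1/*/*/!VM2 *'       # all VMs from DC1 except VM2
    datastore_include:
      - '/DC1/*'                # all datastores from datacenter DC1
    cluster_include:
      - '/DC1/*'                # all clusters from DC1, with their resource pools
```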
    

via UI

Configure the vsphere collector from the Netdata web interface:

  1. Go to Nodes.
  2. Select the node where you want the vsphere data-collection job to run and click the gear icon (Configure this node). That node will run the data collection.
  3. The Collectors → Jobs view opens by default.
  4. In the Search box, type vsphere (or scroll the list) to locate the vsphere collector.
  5. Click the + next to the vsphere collector to add a new job.
  6. Fill in the job fields, then click Test to verify the configuration and Submit to save.
    • Test runs the job with the provided settings and shows whether data can be collected.
    • If it fails, an error message appears with details (for example, connection refused, timeout, or command execution errors), so you can adjust and retest.

via File

The configuration file name for this integration is go.d/vsphere.conf.

The file format is YAML. Generally, the structure is:

update_every: 1
autodetection_retry: 0
jobs:
  - name: some_name1
  - name: some_name2

You can edit the configuration file using the edit-config script from the Netdata config directory.

cd /etc/netdata 2>/dev/null || cd /opt/netdata/etc/netdata
sudo ./edit-config go.d/vsphere.conf

Examples

Basic

A basic example configuration.

jobs:
  - name     : vcenter1
    url      : https://203.0.113.1
    username : <vcenter-username>
    password : somepassword

Multi-instance

Note

When you define multiple jobs, their names must be unique.

Collecting metrics from local and remote instances.

jobs:
  - name     : vcenter1
    url      : https://203.0.113.1
    username : <vcenter-username>
    password : somepassword

  - name     : vcenter2
    url      : https://203.0.113.10
    username : <vcenter-username>
    password : somepassword
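
TLS with a custom CA

When vCenter presents a certificate signed by a private CA, the tls_ca option from the table above can point the collector at that CA bundle instead of disabling verification with tls_skip_verify. This is a sketch only: the file path and credentials are hypothetical placeholders.

```yaml
jobs:
  - name: vcenter1
    url: https://203.0.113.1
    username: monitoring-user          # placeholder credentials
    password: somepassword
    tls_ca: /path/to/ca-bundle.pem     # hypothetical path to the CA bundle
    tls_skip_verify: no                # keep certificate and hostname verification on
```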

Metrics

Metrics grouped by scope.

The scope defines the instance that the metric belongs to. An instance is uniquely identified by a set of labels.

Per inventory

These metrics refer to the discovered vSphere inventory for this collector job.

Labels:

| Label | Description |
|---|---|
| id | Static inventory instance ID |

Metrics:

| Metric | Dimensions | Unit |
|---|---|---|
| vsphere.inventory_objects | datacenters, folders, clusters, hosts, vms, datastores, resource_pools | objects |

Per virtual machine

These metrics refer to the Virtual Machine.

Labels:

| Label | Description |
|---|---|
| id | vSphere managed object reference ID |
| datacenter | Datacenter name |
| cluster | Cluster name |
| host | Host name |
| vm | Virtual Machine name |
| vsphere_tag_<category> | vSphere tag label; present only for categories matched by tag_categories; category names are sanitized for label keys and multiple tags in one category are sorted and joined with the pipe character |
| vsphere_custom_attribute_<name> | vSphere custom attribute label; present only for attributes matched by custom_attributes; attribute names are sanitized for label keys |

Metrics:

| Metric | Dimensions | Unit |
|---|---|---|
| vsphere.vm_cpu_utilization | used | percentage |
| vsphere.vm_mem_utilization | used | percentage |
| vsphere.vm_mem_usage | granted, consumed, active, shared | KiB |
| vsphere.vm_mem_swap_usage | swapped | KiB |
| vsphere.vm_mem_swap_io | in, out | KiB/s |
| vsphere.vm_disk_io | read, write | KiB/s |
| vsphere.vm_disk_max_latency | latency | milliseconds |
| vsphere.vm_net_traffic | received, sent | KiB/s |
| vsphere.vm_net_packets | received, sent | packets |
| vsphere.vm_net_drops | received, sent | drops |
| vsphere.vm_overall_status | green, red, yellow, gray | status |
| vsphere.vm_power_state | powered_on, powered_off, suspended | status |
| vsphere.vm_connection_state | connected, disconnected, orphaned, inaccessible, invalid | status |
| vsphere.vm_tools_running_status | running, not_running, executing_scripts, unknown | status |
| vsphere.vm_tools_version_status | current, need_upgrade, not_installed, unmanaged, too_old, supported_old, supported_new, too_new, blacklisted, unknown | status |
| vsphere.vm_consolidation_needed | needed, not_needed | status |
| vsphere.vm_system_uptime | uptime | seconds |
| vsphere.vm_config_cpu | vcpus | vCPUs |
| vsphere.vm_config_memory | memory | MiB |
| vsphere.vm_config_devices | disks, nics | devices |
| vsphere.vm_storage_usage | committed, uncommitted, unshared | bytes |
| vsphere.vm_snapshot_count | count | snapshots |
| vsphere.vm_snapshot_max_age | age | seconds |
| vsphere.vm_snapshot_max_chain_depth | depth | snapshots |

Per virtual machine power

These aggregate metrics refer to VM power and energy and are collected for discovered powered-on VMs when vSphere exposes the corresponding power counters.

Labels:

| Label | Description |
|---|---|
| id | vSphere managed object reference ID of the VM |
| datacenter | Datacenter name |
| cluster | Cluster name |
| host | Host name |
| vm | Virtual Machine name |
| vsphere_tag_<category> | vSphere tag label; present only for categories matched by tag_categories; category names are sanitized for label keys and multiple tags in one category are sorted and joined with the pipe character |
| vsphere_custom_attribute_<name> | vSphere custom attribute label; present only for attributes matched by custom_attributes; attribute names are sanitized for label keys |

Metrics:

| Metric | Dimensions | Unit |
|---|---|---|
| vsphere.vm_power_usage | power | watts |
| vsphere.vm_energy_usage | energy | joules |

Per vSAN virtual machine

These optional metrics refer to VM vSAN performance and are collected only when collect_vsan is enabled.

Labels:

| Label | Description |
|---|---|
| id | vSphere managed object reference ID of the VM |
| datacenter | Datacenter name |
| cluster | Cluster name |
| host | Host name |
| vm | Virtual Machine name |
| vm_instance_uuid | VM instance UUID used by vSAN performance entity references |
| vsphere_tag_<category> | vSphere tag label; present only for categories matched by tag_categories; category names are sanitized for label keys and multiple tags in one category are sorted and joined with the pipe character |
| vsphere_custom_attribute_<name> | vSphere custom attribute label; present only for attributes matched by custom_attributes; attribute names are sanitized for label keys |

Metrics:

| Metric | Dimensions | Unit |
|---|---|---|
| vsphere.vsan_vm_operations | read, write | operations/s |
| vsphere.vsan_vm_throughput | read, write | bytes/s |
| vsphere.vsan_vm_latency | read, write | microseconds |

Per host power

These aggregate metrics refer to ESXi host power, energy, and power capacity and are collected for discovered powered-on hosts when vSphere exposes the corresponding power counters.

Labels:

| Label | Description |
|---|---|
| id | vSphere managed object reference ID of the host |
| datacenter | Datacenter name |
| cluster | Cluster name |
| host | Host name |
| vsphere_tag_<category> | vSphere tag label; present only for categories matched by tag_categories; category names are sanitized for label keys and multiple tags in one category are sorted and joined with the pipe character |
| vsphere_custom_attribute_<name> | vSphere custom attribute label; present only for attributes matched by custom_attributes; attribute names are sanitized for label keys |

Metrics:

| Metric | Dimensions | Unit |
|---|---|---|
| vsphere.host_power_usage | power, cap | watts |
| vsphere.host_power_capacity_usage | used, usable, idle, system, vm | watts |
| vsphere.host_power_capacity_utilization | used | percentage |
| vsphere.host_energy_usage | energy | joules |

Per vSAN host

These optional metrics refer to ESXi host vSAN performance and are collected only when collect_vsan is enabled.

Labels:

| Label | Description |
|---|---|
| id | vSphere managed object reference ID of the host |
| datacenter | Datacenter name |
| cluster | Cluster name |
| host | Host name |
| vsan_node_uuid | vSAN host node UUID used by vSAN performance entity references |
| vsphere_tag_<category> | vSphere tag label; present only for categories matched by tag_categories; category names are sanitized for label keys and multiple tags in one category are sorted and joined with the pipe character |
| vsphere_custom_attribute_<name> | vSphere custom attribute label; present only for attributes matched by custom_attributes; attribute names are sanitized for label keys |

Metrics:

| Metric | Dimensions | Unit |
|---|---|---|
| vsphere.vsan_host_operations | read, write | operations/s |
| vsphere.vsan_host_throughput | read, write | bytes/s |
| vsphere.vsan_host_latency | read, write | microseconds |
| vsphere.vsan_host_congestions | congestions | congestions/s |
| vsphere.vsan_host_cache_hit_rate | hit_rate | percentage |

Per host

These metrics refer to the ESXi host.

Labels:

| Label | Description |
|---|---|
| id | vSphere managed object reference ID |
| datacenter | Datacenter name |
| cluster | Cluster name |
| host | Host name |
| vsphere_tag_<category> | vSphere tag label; present only for categories matched by tag_categories; category names are sanitized for label keys and multiple tags in one category are sorted and joined with the pipe character |
| vsphere_custom_attribute_<name> | vSphere custom attribute label; present only for attributes matched by custom_attributes; attribute names are sanitized for label keys |

Metrics:

| Metric | Dimensions | Unit |
|---|---|---|
| vsphere.host_cpu_utilization | used | percentage |
| vsphere.host_mem_utilization | used | percentage |
| vsphere.host_mem_usage | granted, consumed, active, shared, sharedcommon | KiB |
| vsphere.host_mem_swap_io | in, out | KiB/s |
| vsphere.host_disk_io | read, write | KiB/s |
| vsphere.host_disk_max_latency | latency | milliseconds |
| vsphere.host_net_traffic | received, sent | KiB/s |
| vsphere.host_net_packets | received, sent | packets |
| vsphere.host_net_drops | received, sent | drops |
| vsphere.host_net_errors | received, sent | errors |
| vsphere.host_overall_status | green, red, yellow, gray | status |
| vsphere.host_power_state | powered_on, powered_off, standby, unknown | status |
| vsphere.host_connection_state | connected, not_responding, disconnected | status |
| vsphere.host_maintenance_status | normal, in_maintenance | status |
| vsphere.host_system_uptime | uptime | seconds |

Per datastore

These metrics refer to the Datastore.

Labels:

| Label | Description |
|---|---|
| id | vSphere managed object reference ID |
| datacenter | Datacenter name |
| datastore | Datastore name |
| type | Datastore type (VMFS, NFS, NFS41, vsan, VVOL, PMEM) |
| vsphere_tag_<category> | vSphere tag label; present only for categories matched by tag_categories; category names are sanitized for label keys and multiple tags in one category are sorted and joined with the pipe character |
| vsphere_custom_attribute_<name> | vSphere custom attribute label; present only for attributes matched by custom_attributes; attribute names are sanitized for label keys |

Metrics:

| Metric | Dimensions | Unit |
|---|---|---|
| vsphere.datastore_disk_io | read, write | KiB/s |
| vsphere.datastore_disk_iops | reads, writes | operations/s |
| vsphere.datastore_disk_latency | read, write | milliseconds |
| vsphere.datastore_space_utilization | used | percentage |
| vsphere.datastore_space_usage | capacity, free, used, uncommitted | bytes |
| vsphere.datastore_overall_status | green, red, yellow, gray | status |
| vsphere.datastore_accessibility_status | accessible, inaccessible | status |
| vsphere.datastore_maintenance_status | normal, entering_maintenance, in_maintenance, unknown | status |
| vsphere.datastore_multiple_host_access | enabled, disabled, unknown | status |

Per datastore cluster

These optional metrics refer to datastore clusters (StoragePod objects) and are collected only when collect_datastore_clusters is enabled.
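
Because these metrics appear only when the option is enabled, it can help to confirm what the job configuration actually contains. The sketch below is a plain text match, not a YAML parser. The option_enabled helper is hypothetical, and it assumes the option is written on its own line with a value of yes or true.

```shell
# Hypothetical helper: succeed if OPTION is set to a truthy value in CONF_FILE.
# Assumes one "option: value" pair per line; commented-out lines do not match.
option_enabled() {
  conf_file=$1
  option=$2
  grep -Eq "^[[:space:]-]*${option}:[[:space:]]*(yes|true)[[:space:]]*(#.*)?\$" "$conf_file"
}
```

A possible call is `option_enabled /etc/netdata/go.d/vsphere.conf collect_datastore_clusters && echo enabled`, where the file path is an assumption about a typical installation. The same check works for other optional switches such as collect_vsan.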

Labels:

Label | Description
id | vSphere managed object reference ID
datacenter | Datacenter name
datastore_cluster | Datastore cluster name
vsphere_tag_<category> | vSphere tag label; present only for categories matched by tag_categories; category names are sanitized for label keys and multiple tags in one category are sorted and joined with the pipe character
vsphere_custom_attribute_<name> | vSphere custom attribute label; present only for attributes matched by custom_attributes; attribute names are sanitized for label keys

Metrics:

Metric | Dimensions | Unit
vsphere.datastore_cluster_space_utilization | used | percentage
vsphere.datastore_cluster_space_usage | capacity, free, used | bytes
vsphere.datastore_cluster_storage_drs_status | enabled, disabled | status
vsphere.datastore_cluster_overall_status | green, red, yellow, gray | status

Per vSAN cluster

These optional metrics refer to vSAN cluster capacity, health, and performance and are collected only when collect_vsan is enabled.

Labels:

Label | Description
id | vSphere managed object reference ID of the cluster
datacenter | Datacenter name
cluster | Cluster name
vsan_uuid | vSAN cluster UUID used by vSAN performance entity references
vsphere_tag_<category> | vSphere tag label; present only for categories matched by tag_categories; category names are sanitized for label keys and multiple tags in one category are sorted and joined with the pipe character
vsphere_custom_attribute_<name> | vSphere custom attribute label; present only for attributes matched by custom_attributes; attribute names are sanitized for label keys

Metrics:

Metric | Dimensions | Unit
vsphere.vsan_cluster_space_usage | used, free, total | bytes
vsphere.vsan_cluster_space_utilization | used | percentage
vsphere.vsan_cluster_health_status | green, yellow, red, unknown | status
vsphere.vsan_cluster_operations | read, write | operations/s
vsphere.vsan_cluster_throughput | read, write | bytes/s
vsphere.vsan_cluster_latency | read, write | microseconds
vsphere.vsan_cluster_congestions | congestions | congestions/s

Per cluster

These metrics refer to the vSphere Cluster.

Labels:

Label | Description
id | vSphere managed object reference ID
datacenter | Datacenter name
cluster | Cluster name
vsphere_tag_<category> | vSphere tag label; present only for categories matched by tag_categories; category names are sanitized for label keys and multiple tags in one category are sorted and joined with the pipe character
vsphere_custom_attribute_<name> | vSphere custom attribute label; present only for attributes matched by custom_attributes; attribute names are sanitized for label keys

Metrics:

Metric | Dimensions | Unit
vsphere.cluster_hosts | total, effective | hosts
vsphere.cluster_cpu_capacity | total, effective | MHz
vsphere.cluster_mem_capacity | total, effective | bytes
vsphere.cluster_cpu_topology | cores, threads | count
vsphere.cluster_drs_config | enabled | status
vsphere.cluster_drs_mode | manual, partially_automated, fully_automated, unknown | status
vsphere.cluster_drs_vmotion_rate | rate | level
vsphere.cluster_ha_config | enabled, admission_control | status
vsphere.cluster_ha_host_monitoring | enabled, disabled, unknown | status
vsphere.cluster_ha_vm_monitoring | disabled, vm_monitoring_only, vm_and_app_monitoring, unknown | status
vsphere.cluster_ha_vm_component_protection | enabled, disabled, unknown | status
vsphere.cluster_overall_status | green, red, yellow, gray | status
vsphere.cluster_vmotions | vmotions | migrations
vsphere.cluster_drs_score | score | percentage
vsphere.cluster_drs_balance | current, target | score
vsphere.cluster_vm_count | total, powered_off | VMs
vsphere.cluster_usage_cpu | demand, entitled, reserved | MHz
vsphere.cluster_usage_mem | demand, entitled, reserved | MB
vsphere.cluster_cpu_utilization | used | percentage
vsphere.cluster_cpu_usage | used, total | MHz
vsphere.cluster_mem_utilization | used | percentage
vsphere.cluster_mem_usage | consumed, active, granted, shared, overhead, swap_used | KiB
vsphere.cluster_services_fairness | cpu, memory | score
vsphere.cluster_services_effective_cpu | effective_cpu | MHz
vsphere.cluster_services_effective_mem | effective_mem | MB
vsphere.cluster_services_failover | failures_tolerable | failures
vsphere.cluster_vm_migrations | vmotion, svmotion, xvmotion | operations
vsphere.cluster_vm_lifecycle | poweron, poweroff, create, destroy, clone, deploy | operations
vsphere.cluster_vm_management | reconfigure, reset, suspend, register, unregister | operations
vsphere.cluster_vm_guest_ops | reboot, shutdown, standby | operations
vsphere.cluster_vm_cold_migrations | change_ds, change_host, change_host_ds | operations

Per resource pool

These metrics refer to the vSphere Resource Pool.

Labels:

Label | Description
id | vSphere managed object reference ID
datacenter | Datacenter name
cluster | Cluster name
resource_pool | Resource Pool name
vsphere_tag_<category> | vSphere tag label; present only for categories matched by tag_categories; category names are sanitized for label keys and multiple tags in one category are sorted and joined with the pipe character
vsphere_custom_attribute_<name> | vSphere custom attribute label; present only for attributes matched by custom_attributes; attribute names are sanitized for label keys

Metrics:

Metric | Dimensions | Unit
vsphere.resource_pool_cpu_usage | usage, demand | MHz
vsphere.resource_pool_cpu_entitlement | distributed | MHz
vsphere.resource_pool_cpu_allocation | reservation_used, unreserved_for_vm, max_usage | MHz
vsphere.resource_pool_mem_usage | host, guest | MB
vsphere.resource_pool_mem_entitlement | distributed | MB
vsphere.resource_pool_mem_allocation | reservation_used, unreserved_for_vm, max_usage | bytes
vsphere.resource_pool_mem_breakdown | private, shared, swapped, ballooned, overhead, consumed_overhead, compressed | MB
vsphere.resource_pool_cpu_config | reservation, limit | MHz
vsphere.resource_pool_mem_config | reservation, limit | MB
vsphere.resource_pool_overall_status | green, red, yellow, gray | status

Alerts

The following alerts are available:

Alert name | On metric | Description
vsphere_vm_cpu_utilization | vsphere.vm_cpu_utilization | Virtual Machine CPU utilization
vsphere_vm_mem_utilization | vsphere.vm_mem_utilization | Virtual Machine memory utilization
vsphere_vm_snapshot_chain_depth | vsphere.vm_snapshot_max_chain_depth | Virtual Machine snapshot maximum chain depth
vsphere_vm_snapshot_age | vsphere.vm_snapshot_max_age | Virtual Machine oldest snapshot age
vsphere_host_cpu_utilization | vsphere.host_cpu_utilization | ESXi Host CPU utilization
vsphere_host_mem_utilization | vsphere.host_mem_utilization | ESXi Host memory utilization

Troubleshooting

Debug Mode

Important: Debug mode is not supported for data collection jobs created via the UI using the Dyncfg feature.

To troubleshoot issues with the vsphere collector, run the go.d.plugin with the debug option enabled. The output should give you clues as to why the collector isn’t working.

  • Navigate to the plugins.d directory, usually at /usr/libexec/netdata/plugins.d/. If that’s not the case on your system, open netdata.conf and look for the plugins setting under [directories].

    cd /usr/libexec/netdata/plugins.d/
    
  • Switch to the netdata user.

    sudo -u netdata -s
    
  • Run the go.d.plugin to debug the collector:

    ./go.d.plugin -d -m vsphere
    

    To debug a specific job:

    ./go.d.plugin -d -m vsphere -j jobName
    

Getting Logs

If you’re encountering problems with the vsphere collector, follow these steps to retrieve logs and identify potential issues:

  • Run the command specific to your system (systemd, non-systemd, or Docker container).
  • Examine the output for any warnings or error messages that might indicate issues. These messages should provide clues about the root cause of the problem.

System with systemd

Use the following command to view logs generated since the last Netdata service restart:

journalctl _SYSTEMD_INVOCATION_ID="$(systemctl show --value --property=InvocationID netdata)" --namespace=netdata --grep vsphere

System without systemd

Locate the collector log file, typically at /var/log/netdata/collector.log, and use grep to filter for the collector’s name:

grep vsphere /var/log/netdata/collector.log

Note: This method shows logs from all restarts. Focus on the latest entries for troubleshooting current issues.
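
Since the file contains entries from every restart, limiting the output to the most recent matches keeps the focus on the current problem. The sketch below wraps the grep command above with tail. The latest_vsphere_logs helper name and the default of 50 lines are hypothetical choices.

```shell
# Hypothetical helper: print the last COUNT log lines that mention vsphere.
# Defaults: the typical collector log path and 50 lines (both assumptions).
latest_vsphere_logs() {
  log_file=${1:-/var/log/netdata/collector.log}
  count=${2:-50}
  grep vsphere "$log_file" | tail -n "$count"
}
```

For example, `latest_vsphere_logs /var/log/netdata/collector.log 20` shows only the 20 newest vsphere entries.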

Docker Container

If Netdata runs in a Docker container named “netdata” (replace the name if yours differs), use this command:

docker logs netdata 2>&1 | grep vsphere

Missing performance samples

If the logs show vsphere:host-no-perf-samples or vsphere:vm-no-perf-samples, verify that the configured account can read vCenter performance counters for the selected hosts and VMs, and that the entities are powered on when performance metrics are expected.
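
To see whether hosts, VMs, or both are affected, the two identifiers can be counted separately. The sketch below reads log text on standard input, so it can be fed from any of the log commands in Getting Logs. The count_no_perf_samples helper is hypothetical and matches only the two identifiers named above, without assuming anything else about the log line format.

```shell
# Hypothetical helper: count occurrences of each "no perf samples" identifier
# in the log text supplied on stdin, printing "identifier: count" per line.
count_no_perf_samples() {
  grep -Eo 'vsphere:(host|vm)-no-perf-samples' \
    | LC_ALL=C sort \
    | uniq -c \
    | awk '{ print $2 ": " $1 }'
}
```

For example, `docker logs netdata 2>&1 | count_no_perf_samples` prints one line per identifier found, and prints nothing when neither identifier occurs.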

Periodic discovery errors

If the logs show vsphere:periodic-discovery-error, check vCenter reachability, account permissions for the enabled optional surfaces, and whether the configured timeout is large enough for the inventory size.

vCenter reboot recovery

The collector cannot always recover an existing session after a vCenter reboot. Restart go.d.plugin if collection does not resume after vCenter becomes available again.
