If you are a Windows system administrator or developer, you know how important it is to monitor your Windows servers and make sure they are up and running smoothly.
You also know that sometimes things go south and your servers go kaput, leaving you in the dark as to what really went wrong.
Effective Windows server monitoring requires the following:
Effective Windows server monitoring can help you improve server reliability and availability by reducing crashes and outages. It also lets you track resource utilization and application performance to ensure that your server is neither overloaded nor underutilized, and compare different servers or time periods to identify opportunities for improvement.
Now that you have learned about the benefits and best practices of effective Windows server monitoring, you might be wondering how to implement it in practice. This is where Netdata can help you.
Netdata is a comprehensive monitoring solution that can be used to monitor and troubleshoot various aspects of your infrastructure, including servers, VMs, networks, disks, K8s and a wide variety of applications (databases, gateways, web servers etc.). In addition to Linux, Unix-based systems and macOS, Netdata can also monitor Windows servers.
System level metrics: Netdata can automatically gather system level metrics related to the operating system, network, storage, processes and more.
Packaged application metrics: Netdata can automatically gather operational and performance metrics from packaged applications running on the server, including popular applications such as SQL Server, IIS, Active Directory, .NET Framework and more.
Ready to use dashboards: Netdata automatically organizes and correlates all the information in ready to use dashboards.
Data retention: Netdata maintains a long history of all this data, automatically applying tiering (recent data are high-fidelity - per second - losing granularity as time passes) to keep storage costs low.
Machine Learning: Netdata trains a machine learning model for every single metric it collects, which allows it to predict the expected range of values in the next data collection.
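The tiering idea from the data retention item above can be illustrated with a small downsampling sketch. This is not Netdata's storage engine; the function name `downsample` and the 60-second bucket size are assumptions chosen for illustration. It collapses per-second samples into one summary per bucket and keeps the minimum, maximum and average, so spikes stay visible at lower resolution.

```python
from statistics import mean

def downsample(samples, bucket_seconds=60):
    """Collapse (timestamp, value) samples into one summary per time bucket.

    samples: iterable of (unix_timestamp_seconds, value) pairs.
    Returns a list of dicts sorted by bucket start time.
    """
    buckets = {}
    for ts, value in samples:
        # Align each timestamp to the start of its bucket.
        start = int(ts) - int(ts) % bucket_seconds
        buckets.setdefault(start, []).append(value)
    return [
        {
            "start": start,
            "min": min(vals),
            "max": max(vals),
            "avg": mean(vals),
            "count": len(vals),
        }
        for start, vals in sorted(buckets.items())
    ]

# Two minutes of hypothetical per-second CPU readings.
per_second = [(t, 10.0 if t < 60 else 30.0) for t in range(120)]
per_minute = downsample(per_second)
```

Here 120 high-fidelity points shrink to 2 summaries, which is the trade-off tiering makes: older data costs less to store but loses per-second detail.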
The recommended way to monitor Windows Servers using Netdata is through the Prometheus Windows Exporter tool, which is a native agent that runs on each host and exports metrics which Netdata then collects, stores and visualizes.
To set up Netdata to monitor one or more Windows servers, follow these steps:
Sign up for a free account at Netdata Cloud and copy the installation command.
Install the Netdata agent on a Linux node.
Configure Netdata to collect data remotely from your Windows hosts by adding one job per host to the windows.conf file. See the configuration section for details. Here’s an example:
jobs:
  - name: win_server1
    url: http://203.0.113.10:9182/metrics
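Before adding a job, it can help to confirm that the exporter actually answers at the URL you plan to configure. The following is a minimal sketch using only the Python standard library. The helpers `parse_metrics` and `fetch_metrics` are hypothetical names, and the metric names in the sample text are placeholders rather than real exporter output.

```python
from urllib.request import urlopen

def parse_metrics(text):
    """Parse Prometheus text exposition format into {series: value}.

    Comment lines (# HELP / # TYPE) and blank lines are skipped.
    An optional trailing timestamp is ignored.
    """
    metrics = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        if "}" in line:
            # Series with labels: split right after the closing brace.
            end = line.rindex("}") + 1
            series, rest = line[:end], line[end:].split()
        else:
            parts = line.split()
            series, rest = parts[0], parts[1:]
        if rest:
            metrics[series] = float(rest[0])
    return metrics

def fetch_metrics(url, timeout=5):
    """Download and parse the metrics page served by the exporter."""
    with urlopen(url, timeout=timeout) as response:
        return parse_metrics(response.read().decode("utf-8"))

# Placeholder text in the exposition format, used instead of a live host.
sample = """# HELP example_requests_total Illustrative counter.
# TYPE example_requests_total counter
example_requests_total{site="default"} 42
example_up 1
"""
parsed = parse_metrics(sample)
```

Against a real host you would call `fetch_metrics("http://203.0.113.10:9182/metrics")`, replacing the example address with your own. An empty result or a connection error suggests the exporter is not reachable from the Netdata node.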
Netdata’s new virtual nodes functionality allows you to define nodes in configuration files and have them treated as regular nodes throughout the UI, dashboards, tabs, filters etc. For example, you can create a virtual node for each of your Windows machines and monitor them as discrete entities. Virtual nodes can help you simplify your infrastructure monitoring and focus on the individual node that matters.
To define your Windows Server as a virtual node you need to:
Define virtual nodes in /etc/netdata/vnodes/vnodes.conf
- hostname: win_server1
  guid: <value>
Just remember to use a valid GUID. On Linux you can use the uuidgen command to generate one; on Windows, use the [guid]::NewGuid() command in PowerShell.
Add the vnode setting to the Windows monitoring job we created earlier (see the vnode line below):
jobs:
  - name: win_server1
    vnode: win_server1
    url: http://203.0.113.10:9182/metrics
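With more than a handful of hosts, writing both files by hand gets tedious. The sketch below renders the two configuration fragments for a list of hosts. The field names (`hostname`, `guid`, `name`, `vnode`, `url`) and port 9182 are taken from the examples above; the helper names and the host inventory are hypothetical.

```python
import uuid

def render_vnodes(hosts, guids):
    """Render vnodes.conf entries, one per host."""
    lines = []
    for host in hosts:
        lines.append(f"- hostname: {host}")
        lines.append(f"  guid: {guids[host]}")
    return "\n".join(lines) + "\n"

def render_jobs(hosts, port=9182):
    """Render windows.conf jobs, one per host, each tied to its vnode."""
    lines = ["jobs:"]
    for host, address in hosts.items():
        lines.append(f"  - name: {host}")
        lines.append(f"    vnode: {host}")
        lines.append(f"    url: http://{address}:{port}/metrics")
    return "\n".join(lines) + "\n"

# Hypothetical inventory: job/vnode name -> address of the Windows host.
hosts = {"win_server1": "203.0.113.10", "win_server2": "203.0.113.11"}
guids = {name: str(uuid.uuid4()) for name in hosts}

vnodes_conf = render_vnodes(hosts, guids)
windows_conf = render_jobs(hosts)
```

Generate the GUIDs once and store them. Since the GUID is what identifies a node, regenerating it on every run would likely make the same host show up as a new node.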
That’s it! You can now enjoy real-time charts and alerts for your entire Windows infrastructure. You can also identify each Windows host as a separate node in Netdata Cloud.
Optionally:
For more information on configuration or the metrics collected, please refer to the documentation.
If you have followed the steps so far, you should now see Windows monitoring on the Netdata UI:
Note: Currently, the only native way to install Netdata on Windows is the Netdata MSI installer, which runs Netdata in a custom WSL deployment. However, WSL was not designed for production environments, so we do not recommend using the installer or WSL in production.
Let’s dive a little deeper into some of the key metrics to monitor on your Windows Server. To do the monitoring you can use tools like Task Manager, Performance Monitor, or if you want a single pane of glass for all your monitoring needs you can just use Netdata.
Resource usage can vary depending on factors such as:
Monitoring resource usage can help you:
Here are just some of the important server metrics you should keep an eye on:
CPU usage is a measure of how much of the CPU’s processing power is being used by the server. The CPU is responsible for executing instructions and calculations for various processes and applications running on the server. The higher the CPU usage, the more work the CPU is doing.
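CPU usage is usually derived rather than read directly: exporters tend to publish cumulative CPU time per mode, and the usage over an interval comes from comparing two samples. The sketch below assumes counters of that kind, with cumulative seconds per mode. The mode names and the function `cpu_usage_percent` are illustrative assumptions, not a description of how Netdata computes its charts.

```python
def cpu_usage_percent(previous, current):
    """Return the share of non-idle CPU time between two counter samples.

    previous, current: dicts mapping a mode name (for example "idle",
    "user", "privileged") to cumulative CPU seconds.
    """
    deltas = {mode: current[mode] - previous.get(mode, 0.0) for mode in current}
    total = sum(deltas.values())
    if total <= 0:
        return 0.0  # No time elapsed between samples, or a counter reset.
    busy = total - deltas.get("idle", 0.0)
    return 100.0 * busy / total

# Hypothetical samples taken ten seconds apart on a single core.
before = {"idle": 1000.0, "user": 200.0, "privileged": 100.0}
after = {"idle": 1007.0, "user": 202.0, "privileged": 101.0}
usage = cpu_usage_percent(before, after)
```

In this example, 7 of the 10 elapsed seconds were idle, so the core was busy for the remaining 30% of the interval.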
Memory usage is a measure of how much of the physical memory (RAM) is being used by the server. The memory is responsible for storing data and instructions for various processes and applications running on the server. The higher the memory usage, the more data and instructions are stored in memory.
Network usage is a measure of how much of the network bandwidth is being used by the server. The network bandwidth is responsible for transmitting and receiving data between the server and other devices on the network. The higher the network usage, the more data is being transferred over the network.
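Network usage as a share of available bandwidth can be worked out from two readings of a byte counter and the link speed. This is a minimal sketch; the function name `bandwidth_utilization` and the sample numbers are hypothetical.

```python
def bandwidth_utilization(bytes_before, bytes_after, interval_seconds, link_bits_per_second):
    """Return link utilization as a percentage over the sampling interval."""
    if interval_seconds <= 0 or link_bits_per_second <= 0:
        raise ValueError("interval and link speed must be positive")
    # Convert the byte delta to bits per second before comparing to link speed.
    bits_per_second = (bytes_after - bytes_before) * 8 / interval_seconds
    return 100.0 * bits_per_second / link_bits_per_second

# Hypothetical: 125 MB transferred in 10 seconds on a 1 Gbit/s link.
utilization = bandwidth_utilization(0, 125_000_000, 10, 1_000_000_000)
```

The byte-to-bit conversion is the step most often missed: counters are typically in bytes, while link speeds are quoted in bits per second.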
Monitoring the TCP stack on a Windows server involves checking the status and performance of the network connections and protocols that enable communication between the server and other devices. TCP monitoring can help identify network issues, such as latency, packet loss, congestion, or errors.
Monitoring processes on a Windows server involves checking the activity and resource consumption of the programs that run on the server. Process monitoring can help optimize the performance and efficiency of the server and detect any anomalies or malfunctions.
Netdata monitors all of the metrics we mentioned and a lot more. To see the full list of metrics, please check out the documentation.
If you are more of a visual learner and want to try it out yourself, check out the Windows monitoring rooms on Netdata’s demo space (no login required).
While monitoring and understanding server metrics is the foundation of effective Windows server monitoring, it is not sufficient on its own. The next step is arguably more important and that is to monitor application performance because, at the end of the day, the Windows server is a platform for running various applications that provide essential services and functions for your business.
Monitoring lets you ensure that your applications are running fast and smoothly without any delays or errors. You want to identify any bottlenecks or issues that may affect the user experience or the business outcomes. You also want to ensure that your applications are always up and running without any downtime or interruptions. You want to detect any failures or outages that may affect the service delivery or the business continuity.
Some applications that are commonly run on Windows Servers are:
Netdata enables you to monitor all of these applications on Windows Server effectively, but we will not be going into their details in this article. If you want to learn more about how to monitor these applications with Netdata, please click on the links above.
Netdata uses unsupervised machine learning to detect anomalies across all metrics, on all nodes, out of the box. If you want to know whether any of your Windows metrics or applications are (or were) behaving abnormally, just enable the anomaly view on any Netdata chart or visit the Anomalies tab to explore anomalies across your infrastructure.
Here is a quick video walkthrough of how to get started using the Anomaly Advisor. You can find more related videos on this playlist from our YouTube channel.
One of the benefits of monitoring both application metrics (such as requests, response time, errors etc.) and server metrics (such as CPU, memory, disk, network etc.) is that you can correlate them to get a deeper understanding of your Windows infrastructure.
For example, you can see how CPU utilization affects response time, how disk throughput affects database queries, how network bandwidth affects web requests etc.
You can also see how different applications interact with each other on the same server or across different servers.
Correlating application metrics and server metrics helps you identify root causes, troubleshoot problems, optimize performance and improve the availability of your Windows infrastructure.
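To make the idea of correlating a server metric with an application metric concrete, here is a minimal sketch that computes the Pearson correlation coefficient between two aligned series. It is a generic statistical measure, not Netdata's own correlation feature, and the sample values are hypothetical.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length series."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two series of equal length with at least 2 points")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # A constant series carries no correlation signal.
    return cov / sqrt(var_x * var_y)

# Hypothetical aligned samples: CPU utilization (%) and response time (ms).
cpu = [20, 35, 50, 65, 80, 95]
response_ms = [110, 120, 150, 210, 320, 500]
r = pearson(cpu, response_ms)
```

A value of `r` close to 1 means the two series rise together, which makes CPU saturation a candidate explanation for slow responses. It does not prove causation, so treat it as a pointer for where to look next.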
By using Netdata for Windows monitoring and troubleshooting, you can quickly identify and resolve issues and optimize your Windows application performance.