
Node Exporter

Introduction

Node Exporter is a critical component in the Prometheus monitoring ecosystem that enables system-level metrics collection from Unix-like operating systems. As part of the Prometheus Exporters family, Node Exporter specifically focuses on hardware and OS metrics such as CPU usage, memory, disk I/O, filesystem fullness, and network statistics.

Unlike application-specific exporters, Node Exporter provides visibility into the underlying infrastructure running your applications. This makes it an essential tool for identifying performance bottlenecks, predicting resource exhaustion, and understanding system behavior during incidents.

What is Node Exporter?

Node Exporter is an official Prometheus exporter designed to expose a wide variety of hardware and kernel-related metrics about the host machine. The name "Node" refers to a host machine or node in your infrastructure.

Node Exporter runs as a daemon on the target systems you want to monitor, collecting metrics that aren't available to Prometheus by default. These metrics are exposed via an HTTP endpoint (typically on port 9100) that Prometheus can scrape at regular intervals.
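To make the endpoint's output more concrete, here is a minimal sketch of reading the Prometheus text format that Node Exporter serves. The sample text is hand-written for illustration, and `parse_metrics` is a hypothetical helper rather than part of any Prometheus library; it ignores edge cases such as optional timestamps.

```python
import re

# Matches lines such as:
#   node_load1 0.42
#   node_cpu_seconds_total{cpu="0",mode="idle"} 1234.5
SAMPLE_RE = re.compile(r'^([a-zA-Z_:][a-zA-Z0-9_:]*)(?:\{(.*)\})?\s+(\S+)')
LABEL_RE = re.compile(r'([a-zA-Z_][a-zA-Z0-9_]*)="((?:[^"\\]|\\.)*)"')


def parse_metrics(text):
    """Parse text-format metrics into (name, labels, value) tuples."""
    samples = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):  # skip blanks, HELP and TYPE lines
            continue
        match = SAMPLE_RE.match(line)
        if match is None:
            continue
        name, raw_labels, value = match.groups()
        labels = dict(LABEL_RE.findall(raw_labels or ""))
        samples.append((name, labels, float(value)))
    return samples


# Hand-written sample in the shape of Node Exporter output (assumed values).
SAMPLE_OUTPUT = """\
# HELP node_load1 1m load average.
# TYPE node_load1 gauge
node_load1 0.42
node_cpu_seconds_total{cpu="0",mode="idle"} 1234.5
"""

samples = parse_metrics(SAMPLE_OUTPUT)
for name, labels, value in samples:
    print(name, labels, value)
```

In practice Prometheus does this parsing for you; the sketch only shows that each line is a metric name, an optional label set, and a numeric value.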

Key Features of Node Exporter

  • Comprehensive metrics collection: Collects hundreds of metrics across various subsystems
  • Modular collector design: Lets you enable or disable individual collectors as needed
  • Low resource footprint: Minimal impact on system performance
  • Cross-platform support: Works on various Unix-like systems (Linux, FreeBSD, macOS, etc.)
  • Standardized metric naming: Follows Prometheus naming conventions

Installation and Setup

Prerequisites

  • A Unix-like operating system (Linux, FreeBSD, Darwin, etc.)
  • Root or sudo access (for some collectors)
  • Basic understanding of Prometheus concepts

Installing Node Exporter on Linux

Method 1: Using Binary Release

  1. Download a release archive (v1.6.1 in this example; check the Prometheus downloads page for the latest version):
bash
wget https://github.com/prometheus/node_exporter/releases/download/v1.6.1/node_exporter-1.6.1.linux-amd64.tar.gz
  2. Extract the downloaded archive:
bash
tar xvfz node_exporter-1.6.1.linux-amd64.tar.gz
  3. Change into the extracted directory:
bash
cd node_exporter-1.6.1.linux-amd64
  4. Run Node Exporter:
bash
./node_exporter

You should see output similar to:

level=info ts=2023-06-16T14:30:25.042Z caller=node_exporter.go:182 msg="Starting node_exporter" version="1.6.1"
level=info ts=2023-06-16T14:30:25.042Z caller=node_exporter.go:183 msg="Build context" build_context="go=1.20.4 user=root date=20230607-15:47:21 sha=9a51a674eb32454e9aa91855e2a03cb1"
level=info ts=2023-06-16T14:30:25.042Z caller=node_exporter.go:185 msg="Enabled collectors"
level=info ts=2023-06-16T14:30:25.042Z caller=node_exporter.go:197 collector=arp
level=info ts=2023-06-16T14:30:25.042Z caller=node_exporter.go:197 collector=bcache
...
level=info ts=2023-06-16T14:30:25.043Z caller=tls_config.go:195 msg="TLS is disabled." http2=false
level=info ts=2023-06-16T14:30:25.043Z caller=node_exporter.go:1375 msg="Listening on" address=:9100
level=info ts=2023-06-16T14:30:25.043Z caller=node_exporter.go:1376 msg="Listening on" address=[::]:9100

Method 2: Using Docker

If you prefer using Docker, you can run Node Exporter as a container:

bash
docker run -d \
  --net="host" \
  --pid="host" \
  -v "/:/host:ro,rslave" \
  --name node_exporter \
  prom/node-exporter:latest \
  --path.rootfs=/host

Method 3: Using Package Managers

On Debian/Ubuntu:

bash
sudo apt-get update
sudo apt-get install prometheus-node-exporter

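On RHEL/CentOS (the package is not in the base repositories, so first enable EPEL or another repository that provides it; the package name can differ between repositories):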

bash
sudo yum install prometheus-node-exporter

Running Node Exporter as a Service

For production environments, it's recommended to run Node Exporter as a systemd service:

  1. Create a dedicated, unprivileged Node Exporter user:
bash
sudo useradd --no-create-home --shell /bin/false node_exporter
  2. Copy the binary to a standard location:
bash
sudo cp node_exporter /usr/local/bin/
sudo chown node_exporter:node_exporter /usr/local/bin/node_exporter
  3. Create a systemd service file:
bash
sudo nano /etc/systemd/system/node_exporter.service
  4. Add the following content:
[Unit]
Description=Node Exporter
Wants=network-online.target
After=network-online.target

[Service]
User=node_exporter
Group=node_exporter
Type=simple
ExecStart=/usr/local/bin/node_exporter

[Install]
WantedBy=multi-user.target
  5. Enable and start the service:
bash
sudo systemctl daemon-reload
sudo systemctl enable node_exporter
sudo systemctl start node_exporter
  6. Check the status:
bash
sudo systemctl status node_exporter

Configuring Prometheus to Scrape Node Exporter

Once Node Exporter is running, you need to configure Prometheus to scrape metrics from it:

  1. Open your Prometheus configuration file (prometheus.yml):
bash
sudo nano /etc/prometheus/prometheus.yml
  2. Add a scrape configuration for Node Exporter:
yaml
scrape_configs:
  # Other scrape configs...

  - job_name: 'node_exporter'
    static_configs:
      - targets: ['localhost:9100']
  3. If you're monitoring multiple nodes, list each one as a target:
yaml
scrape_configs:
  - job_name: 'node_exporter'
    static_configs:
      - targets:
          - 'node1.example.com:9100'
          - 'node2.example.com:9100'
          - 'node3.example.com:9100'
  4. Reload Prometheus to apply the changes. The HTTP reload endpoint only works if Prometheus was started with --web.enable-lifecycle; otherwise, restart the service or send the process a SIGHUP:
bash
curl -X POST http://localhost:9090/-/reload
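When the list of nodes grows, typing targets by hand becomes error-prone. The following is a small sketch, not an official tool, that renders the multi-target scrape configuration from a list of hostnames. The function name `render_scrape_config` and the example hostnames are assumptions made for illustration.

```python
def render_scrape_config(hosts, port=9100, job_name="node_exporter"):
    """Return a scrape_configs YAML snippet with one target per host."""
    lines = [
        "scrape_configs:",
        f"  - job_name: '{job_name}'",
        "    static_configs:",
        "      - targets:",
    ]
    # One list entry per host, using Node Exporter's default port unless overridden.
    lines += [f"          - '{host}:{port}'" for host in hosts]
    return "\n".join(lines) + "\n"


config = render_scrape_config(["node1.example.com", "node2.example.com"])
print(config)
```

The output is plain text meant to be pasted or merged into prometheus.yml; the sketch does not validate hostnames or merge with existing scrape jobs.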

Understanding Node Exporter Metrics

Node Exporter exposes a wide variety of metrics. Here are some of the most important categories:

CPU Metrics

  • node_cpu_seconds_total: Seconds the CPUs spent in each mode
  • node_load1, node_load5, node_load15: System load averages

Memory Metrics

  • node_memory_MemTotal_bytes: Total memory
  • node_memory_MemFree_bytes: Free memory
  • node_memory_MemAvailable_bytes: Available memory

Disk Metrics

  • node_filesystem_avail_bytes: Filesystem space available
  • node_filesystem_size_bytes: Filesystem size
  • node_disk_io_time_seconds_total: Total seconds spent doing I/O

Network Metrics

  • node_network_receive_bytes_total: Network bytes received
  • node_network_transmit_bytes_total: Network bytes transmitted
  • node_network_up: Network interface up (1) or down (0)

Working with Node Exporter Metrics

Let's explore some practical examples of how to use Node Exporter metrics with PromQL (Prometheus Query Language).

Example 1: CPU Usage Percentage

To calculate the CPU usage percentage:

100 - (avg by (instance) (rate(node_cpu_seconds_total{mode="idle"}[1m])) * 100)

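This query:

  1. Takes the per-second rate of increase in idle CPU time over the last minute, which gives the fraction of time each core was idle
  2. Averages that fraction across all cores of each instance
  3. Multiplies by 100 to get an idle percentage
  4. Subtracts from 100 to get the usage percentage rather than the idle percentage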
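The same arithmetic can be worked through by hand. This sketch mimics the query for a single instance, using two readings of each core's idle counter; the counter values are invented, and the function is only an approximation of what Prometheus computes (it ignores details such as counter resets).

```python
def cpu_usage_percent(idle_start, idle_end, elapsed_seconds):
    """Approximate the CPU usage query from two samples of each core's idle counter."""
    # rate(): per-second increase of each core's idle counter
    idle_rates = [
        (end - start) / elapsed_seconds
        for start, end in zip(idle_start, idle_end)
    ]
    # avg by (instance): mean idle fraction across all cores
    avg_idle = sum(idle_rates) / len(idle_rates)
    # 100 - idle percentage = usage percentage
    return 100 - avg_idle * 100


# Two cores sampled 60 seconds apart: one was idle for 54 s, the other for 42 s.
usage = cpu_usage_percent([1000.0, 2000.0], [1054.0, 2042.0], 60)
print(f"CPU usage: {usage:.1f}%")
```

With these numbers the cores were idle 90% and 70% of the time, so average usage comes out at roughly 20%.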

Example 2: Memory Usage Percentage

To calculate memory usage percentage:

(1 - (node_memory_MemAvailable_bytes / node_memory_MemTotal_bytes)) * 100

Example 3: Disk Space Usage Percentage

For disk space usage percentage:

(1 - node_filesystem_avail_bytes / node_filesystem_size_bytes) * 100
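The memory and disk queries share one formula: one minus the available fraction, times 100. A minimal worked example, with made-up byte counts and a hypothetical helper name, looks like this:

```python
def usage_percent(available_bytes, total_bytes):
    """Percentage in use: (1 - available / total) * 100."""
    if total_bytes <= 0:
        raise ValueError("total_bytes must be positive")
    return (1 - available_bytes / total_bytes) * 100


GIB = 1024 ** 3

# 4 GiB available out of 16 GiB of memory
memory_used = usage_percent(4 * GIB, 16 * GIB)
# 100 GiB available on a 200 GiB filesystem
disk_used = usage_percent(100 * GIB, 200 * GIB)

print(f"memory: {memory_used:.0f}%  disk: {disk_used:.0f}%")
```

In PromQL the division is applied per time series, so the disk query yields one result for every mounted filesystem rather than a single number.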

Example 4: Network I/O

For network traffic rate:

rate(node_network_receive_bytes_total[5m])
rate(node_network_transmit_bytes_total[5m])
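These metrics are counters that only ever increase (until a reboot resets them), which is why they are wrapped in rate(). The sketch below shows the core idea with invented samples; it is a simplification, since Prometheus's own rate() also extrapolates to the edges of the time window.

```python
def counter_rate(samples):
    """Per-second rate from (timestamp_seconds, counter_value) samples, oldest first."""
    if len(samples) < 2:
        raise ValueError("need at least two samples")
    increase = 0.0
    for (_, prev), (_, curr) in zip(samples, samples[1:]):
        if curr >= prev:
            increase += curr - prev
        else:
            # The counter went down, so it was reset (for example by a reboot);
            # count the new value as an increase from zero.
            increase += curr
    elapsed = samples[-1][0] - samples[0][0]
    return increase / elapsed


# 6000 bytes received over 60 seconds
steady = counter_rate([(0, 1000), (60, 7000)])
# Same window, but the counter was reset between the second and third sample
with_reset = counter_rate([(0, 1000), (30, 4000), (60, 2000)])
print(steady, with_reset)
```

The result is in bytes per second; multiply by 8 if you want bits per second for a bandwidth graph.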

Enabling/Disabling Specific Collectors

Node Exporter has a modular architecture with different collectors for various metrics. You can enable or disable specific collectors based on your needs.

To see available collectors:

bash
./node_exporter --help

To run Node Exporter with only specific collectors:

bash
./node_exporter --collector.disable-defaults --collector.cpu --collector.meminfo --collector.loadavg

To run Node Exporter with all default collectors except a few:

bash
./node_exporter --no-collector.arp --no-collector.hwmon
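If you manage many hosts, you may want to generate these flags from a list rather than typing them. This is a small sketch under that assumption; `collector_args` is a hypothetical helper that only assembles the flag patterns shown above and does not check whether a collector name actually exists.

```python
def collector_args(only=None, disable=None):
    """Build node_exporter flags for an allow-list or a deny-list of collectors."""
    if only and disable:
        raise ValueError("use either 'only' or 'disable', not both")
    if only:
        # Turn off all defaults, then enable just the named collectors.
        return ["--collector.disable-defaults"] + [f"--collector.{name}" for name in only]
    # Keep the defaults and switch off the named collectors.
    return [f"--no-collector.{name}" for name in (disable or [])]


command = ["./node_exporter"] + collector_args(only=["cpu", "meminfo", "loadavg"])
print(" ".join(command))
```

The resulting list can be passed to a process launcher or joined into the ExecStart line of a service file.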

Common Collectors

Collector | Description | Default
--- | --- | ---
cpu | CPU statistics | Enabled
diskstats | Disk I/O statistics | Enabled
filesystem | Filesystem statistics | Enabled
loadavg | System load average | Enabled
meminfo | Memory statistics | Enabled
netdev | Network interface statistics | Enabled
netstat | Network statistics from /proc/net/netstat | Enabled
stat | Kernel statistics from /proc/stat | Enabled
time | Current system time | Enabled
uname | System information | Enabled

Creating Dashboards with Node Exporter Metrics

Grafana is commonly used to visualize Prometheus metrics. Here's a simple example of how to create a basic system monitoring dashboard:

  1. Add Prometheus as a data source in Grafana
  2. Create a new dashboard
  3. Add panels for key metrics:
    • CPU Usage
    • Memory Usage
    • Disk Usage
    • Network I/O

For example, to create a CPU usage panel:

  1. Add a new panel
  2. Use this query:
    100 - (avg by (instance) (rate(node_cpu_seconds_total{mode="idle"}[1m])) * 100)
  3. Set the visualization type to Time series (called Graph in older Grafana versions) or Gauge
  4. Set appropriate thresholds (e.g., yellow at 70%, red at 90%)

Creating Alerts with Node Exporter Metrics

You can set up alerts based on Node Exporter metrics to be notified of potential issues:

Example Alert Rule in Prometheus

yaml
groups:
  - name: node_exporter_alerts
    rules:
      - alert: HighCPULoad
        expr: 100 - (avg by(instance) (rate(node_cpu_seconds_total{mode="idle"}[5m])) * 100) > 80
        for: 5m
        labels:
          severity: warning
        annotations:
          summary: "High CPU load (instance {{ $labels.instance }})"
          description: "CPU load is above 80% for 5 minutes\n  VALUE = {{ $value }}\n  LABELS: {{ $labels }}"

      - alert: HighMemoryLoad
        expr: (1 - (node_memory_MemAvailable_bytes / node_memory_MemTotal_bytes)) * 100 > 80
        for: 5m
        labels:
          severity: warning
        annotations:
          summary: "High memory usage (instance {{ $labels.instance }})"
          description: "Memory usage is above 80% for 5 minutes\n  VALUE = {{ $value }}\n  LABELS: {{ $labels }}"

      - alert: DiskSpaceRunningOut
        expr: (node_filesystem_avail_bytes / node_filesystem_size_bytes) * 100 < 20
        for: 5m
        labels:
          severity: warning
        annotations:
          summary: "Disk space running out (instance {{ $labels.instance }})"
          description: "Disk space is below 20% for 5 minutes\n  VALUE = {{ $value }}\n  LABELS: {{ $labels }}"
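The `for: 5m` clause means an alert does not fire the moment its expression becomes true: it first becomes pending and only fires once the condition has held at every evaluation for the full duration. The following sketch models that behaviour for a "greater than threshold" rule. It is a simplified illustration with invented values, not Prometheus's actual implementation.

```python
def alert_state(evaluations, threshold, for_seconds):
    """Return 'inactive', 'pending', or 'firing' after the last evaluation.

    evaluations: list of (timestamp_seconds, value), oldest first.
    """
    active_since = None
    for timestamp, value in evaluations:
        if value > threshold:
            if active_since is None:
                active_since = timestamp  # condition just became true
        else:
            active_since = None  # any false evaluation resets the timer
    if active_since is None:
        return "inactive"
    held_for = evaluations[-1][0] - active_since
    return "firing" if held_for >= for_seconds else "pending"


# CPU usage evaluated once a minute, rule: > 80 for 5 minutes (300 s)
sustained = [(0, 85), (60, 90), (120, 88), (180, 92), (240, 95), (300, 91)]
with_dip = [(0, 85), (60, 90), (120, 70), (180, 92), (240, 95), (300, 91)]

print(alert_state(sustained, 80, 300))
print(alert_state(with_dip, 80, 300))
```

The dip at the third evaluation restarts the timer, so the second series is still only pending at the end. This is why a longer `for` value filters out short spikes at the cost of slower notification.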

Advanced Node Exporter Usage

Custom Textfile Collector

Node Exporter includes a "textfile" collector that can read metrics from files in a directory. This allows you to extend Node Exporter with custom metrics:

  1. Create a directory for the textfile collector:
bash
sudo mkdir -p /var/lib/node_exporter/textfile_collector
  2. Run Node Exporter with the textfile collector pointed at that directory:
bash
./node_exporter --collector.textfile.directory=/var/lib/node_exporter/textfile_collector
  3. Create a file with metrics in the Prometheus text format. The file name must end in .prom, and because the directory was created with sudo, writing to it needs elevated privileges:
bash
echo '# HELP custom_metric_example Example of a custom metric
# TYPE custom_metric_example gauge
custom_metric_example{label="example"} 1.0' | sudo tee /var/lib/node_exporter/textfile_collector/custom_metrics.prom > /dev/null
  4. Node Exporter now exposes these custom metrics on its /metrics endpoint, so Prometheus scrapes them along with the standard Node Exporter metrics.
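Scripts that refresh a textfile on a schedule should avoid leaving a half-written file where Node Exporter can read it. A common approach is to write to a temporary file and rename it into place. The sketch below assumes that approach; the helper names and the metric name `custom_logged_in_users` are hypothetical.

```python
import os
import tempfile


def write_prom_file(directory, filename, lines):
    """Atomically write metric lines to a .prom file in the textfile directory."""
    if not filename.endswith(".prom"):
        raise ValueError("the textfile collector only reads files ending in .prom")
    # Write to a temporary file in the same directory, then rename it into place,
    # so a scrape never sees a partially written file.
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    with os.fdopen(fd, "w") as tmp:
        tmp.write("\n".join(lines) + "\n")
    # mkstemp creates the file with mode 0600; make it readable by the exporter's user.
    os.chmod(tmp_path, 0o644)
    final_path = os.path.join(directory, filename)
    os.replace(tmp_path, final_path)
    return final_path


def logged_in_users_metric(user_count):
    """Render a gauge for the number of logged-in users (hypothetical metric name)."""
    return [
        "# HELP custom_logged_in_users Number of users currently logged in",
        "# TYPE custom_logged_in_users gauge",
        f"custom_logged_in_users {user_count}",
    ]
```

A cron job could then call `write_prom_file("/var/lib/node_exporter/textfile_collector", "users.prom", logged_in_users_metric(3))`, with the count obtained however suits your system.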

Monitoring NVIDIA GPUs

If you need to monitor NVIDIA GPUs, you can run NVIDIA's DCGM Exporter alongside Node Exporter. It serves metrics on port 9400 by default, and the host needs the NVIDIA Container Toolkit so the container can access the GPUs:

bash
docker run --privileged --rm -e NVIDIA_VISIBLE_DEVICES=all -p 9400:9400 nvcr.io/nvidia/k8s/dcgm-exporter:3.1.7-3.1.4-ubuntu20.04

Then add it to your Prometheus configuration:

yaml
scrape_configs:
  - job_name: 'nvidia_gpu'
    static_configs:
      - targets: ['localhost:9400']

Troubleshooting Node Exporter

Common Issues and Solutions

  1. Node Exporter won't start

    • Check for permission issues
    • Verify the binary is executable
    • Check if another process is using port 9100
  2. Metrics not appearing in Prometheus

    • Verify Node Exporter is running: curl http://localhost:9100/metrics
    • Check Prometheus configuration
    • Check network connectivity and firewall rules
  3. High resource usage

    • Disable collectors you don't need
    • Increase scrape interval in Prometheus

Debugging Tips

  1. Run Node Exporter with debug logging:
bash
./node_exporter --log.level=debug
  2. Check which collectors are enabled and whether each one scraped successfully (a value of 1 means success):
bash
curl -s http://localhost:9100/metrics | grep node_scrape_collector_success
  3. Test the metrics endpoint manually:
bash
curl http://localhost:9100/metrics
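Once you have the endpoint's output, you can check collector health programmatically. This sketch scans metrics text for the `node_scrape_collector_success` series and reports which collectors failed; the sample text is hand-written, and `collector_status` is a hypothetical helper.

```python
import re

SUCCESS_RE = re.compile(
    r'^node_scrape_collector_success\{collector="([^"]+)"\}\s+(\S+)'
)


def collector_status(metrics_text):
    """Map each collector name to True (last scrape succeeded) or False."""
    status = {}
    for line in metrics_text.splitlines():
        match = SUCCESS_RE.match(line)
        if match:
            status[match.group(1)] = float(match.group(2)) == 1.0
    return status


# Hand-written sample; real output comes from http://localhost:9100/metrics
SAMPLE = """\
# TYPE node_scrape_collector_success gauge
node_scrape_collector_success{collector="cpu"} 1
node_scrape_collector_success{collector="hwmon"} 0
node_scrape_collector_success{collector="meminfo"} 1
"""

status = collector_status(SAMPLE)
failed = sorted(name for name, ok in status.items() if not ok)
print("failed collectors:", failed)
```

A collector that is enabled but reports 0 usually points to a permission problem or a missing kernel interface, and debug logging is a good next step to find out which.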

Node Exporter Architecture Diagram

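To better understand how Node Exporter fits into the Prometheus ecosystem, follow the data flow:

Host kernel (/proc, /sys) -> Node Exporter (:9100/metrics) -> Prometheus (scrapes and stores) -> Grafana dashboards and alert rules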

Summary

Node Exporter is a powerful tool for monitoring system-level metrics in Prometheus. Key takeaways include:

  • Node Exporter provides comprehensive hardware and OS metrics collection
  • It's easy to install and configure with minimal resource overhead
  • The modular collector design allows for customization based on your needs
  • Node Exporter metrics can be used to build effective dashboards and alerts
  • The textfile collector enables extending Node Exporter with custom metrics

By monitoring system-level metrics with Node Exporter, you can gain valuable insights into your infrastructure's health and performance, enabling proactive monitoring and faster troubleshooting.

Exercises

  1. Install Node Exporter on a test system and configure Prometheus to scrape it.
  2. Create a Grafana dashboard showing CPU, memory, disk, and network metrics.
  3. Set up alert rules for high CPU usage, low disk space, and high memory usage.
  4. Use the textfile collector to create a custom metric that tracks the number of users logged into the system.
  5. Compare the resource usage of different collectors and determine which ones you need for your specific monitoring requirements.

