System monitor

  • Last Updated 4/15/2024, 2:02:13 PM UTC

Plugin info

name: sys-mon

Collects system-level metrics on a node:

  • Block device IO
  • Network IO
  • Virtual Memory
  • CPU
  • CPU load averages
  • Top processes by CPU and memory
  • Filesystem utilizations

# Prerequisites

# Events

None

# Metrics

# Disk IO

The following metrics are collected for each disk block device on the system. Each device is identified by the device dimension.

| Metric | Description |
|---|---|
| system/disk/reads_per_sec | reads per second |
| system/disk/writes_per_sec | writes per second |
| system/disk/iops | I/O operations per second |
| system/disk/read_kb_sec | kb read per second |
| system/disk/write_kb_sec | kb written per second |
| system/disk/avg_queue_size | average number of operations in progress (queued or being serviced) |
| system/disk/avg_read_latency | average read response time in milliseconds |
| system/disk/avg_write_latency | average write response time in milliseconds |
| system/disk/avg_io_latency | average I/O response time in milliseconds |
| system/disk/util | percentage of elapsed time during which I/O requests were issued to the device (bandwidth utilization for the device) |
| system/disk/avg_read_sz_kb | average read request size in kb |
| system/disk/avg_write_sz_kb | average write request size in kb |
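To make the relationship between these metrics concrete, the sketch below derives them from two snapshots of cumulative per-device counters, in the way iostat-style tools commonly do on Linux. It is an illustration only: the counter names (`reads`, `sectors_read`, `io_ms`, and so on) and the 512-byte sector unit are assumptions modeled on `/proc/diskstats`, not a confirmed description of how sys-mon computes its values.

```python
# Assumption: counters are cumulative and sectors are 512 bytes (0.5 kb),
# as in Linux /proc/diskstats. All names here are hypothetical.
SECTOR_KB = 0.5


def disk_rates(prev, curr, interval_s):
    """Derive per-device rates from two cumulative counter snapshots."""
    d = {k: curr[k] - prev[k] for k in curr}   # change over the interval
    ios = d["reads"] + d["writes"]
    wall_ms = interval_s * 1000.0

    def ratio(num, den):
        # Avoid dividing by zero when the device was idle.
        return num / den if den else 0.0

    return {
        "reads_per_sec": d["reads"] / interval_s,
        "writes_per_sec": d["writes"] / interval_s,
        "iops": ios / interval_s,
        "read_kb_sec": d["sectors_read"] * SECTOR_KB / interval_s,
        "write_kb_sec": d["sectors_written"] * SECTOR_KB / interval_s,
        # Time-weighted I/O time divided by wall time gives the mean queue depth.
        "avg_queue_size": d["weighted_io_ms"] / wall_ms,
        "avg_read_latency": ratio(d["read_ms"], d["reads"]),
        "avg_write_latency": ratio(d["write_ms"], d["writes"]),
        "avg_io_latency": ratio(d["read_ms"] + d["write_ms"], ios),
        # Share of wall time during which the device had I/O in flight.
        "util": 100.0 * d["io_ms"] / wall_ms,
        "avg_read_sz_kb": ratio(d["sectors_read"] * SECTOR_KB, d["reads"]),
        "avg_write_sz_kb": ratio(d["sectors_written"] * SECTOR_KB, d["writes"]),
    }
```

Under these assumptions, system/disk/iops is always the sum of the read and write rates, and system/disk/util can approach 100 while throughput stays low if individual requests are slow.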

# Network IO

The following metrics are collected for each network interface on the system. Each interface is identified by the device dimension.

| Metric | Description |
|---|---|
| system/net/rx_kb_sec | kb received per second |
| system/net/rx_err_sec | receive errors per second |
| system/net/rx_drop_sec | received packets dropped per second |
| system/net/rx_fifo_sec | receive FIFO overruns per second |
| system/net/tx_kb_sec | kb transmitted per second |
| system/net/tx_err_sec | transmit errors per second |
| system/net/tx_drop_sec | transmitted packets dropped per second |
| system/net/tx_fifo_sec | transmit FIFO overruns per second |

# Virtual Memory

| Metric | Description |
|---|---|
| system/vm/used | used memory in kb |
| system/vm/avail | estimate of how much memory, in kb, is available for starting new applications without swapping |
| system/vm/used_pct | percent of memory used: (total mem - avail mem) / total mem |
| system/vm/swap_free | estimate of how much swap memory is free |
| system/vm/swap_used_pct | percent of swap used: (total swap - free swap) / total swap |
| system/vm/p_swap_sec | pages swapped in per second + pages swapped out per second |
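The two percentage formulas above can be worked through with a short sketch. The function name and parameters are hypothetical, and two details are assumptions rather than documented behavior: the ratio is scaled by 100 to express a percentage, and a host with no swap configured reports 0.

```python
def vm_percentages(total_kb, avail_kb, swap_total_kb, swap_free_kb):
    """Return (used_pct, swap_used_pct) from memory and swap totals in kb."""
    # (total mem - avail mem) / total mem, scaled to a percentage.
    used_pct = 100.0 * (total_kb - avail_kb) / total_kb
    # (total swap - free swap) / total swap; assumed 0 when swap is disabled.
    if swap_total_kb:
        swap_used_pct = 100.0 * (swap_total_kb - swap_free_kb) / swap_total_kb
    else:
        swap_used_pct = 0.0
    return used_pct, swap_used_pct
```

For example, a node with 16000 kb of memory of which 4000 kb is available, and 2000 kb of swap of which 1500 kb is free, gives 75 percent memory used and 25 percent swap used. Because the memory formula is based on available rather than free memory, reclaimable cache does not count as used.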

# CPU

Statistics averaged across all CPUs:

| Metric | Description |
|---|---|
| system/cpu/user | user + nice utilization percent |
| system/cpu/system | system utilization percent |
| system/cpu/busy | system + user + nice utilization percent |
| system/cpu/idle | idle percent |
| system/cpu/iowait | percent of time waiting for I/O |
| system/cpu/steal | percent of time spent waiting, inside a VM, for CPU time that the hypervisor should otherwise have allocated to the VM |
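The way these percentages relate to each other can be sketched from two snapshots of cumulative CPU time counters. This is an assumption-laden illustration: the field names follow the aggregate `cpu` line of Linux `/proc/stat`, the irq and softirq fields are ignored for brevity, and nothing here is confirmed as the plugin's own calculation.

```python
# Hypothetical field names, modeled on the aggregate "cpu" line of /proc/stat.
FIELDS = ("user", "nice", "system", "idle", "iowait", "steal")


def cpu_percentages(prev, curr):
    """Convert two cumulative CPU time snapshots into utilization percentages."""
    delta = {f: curr[f] - prev[f] for f in FIELDS}
    total = sum(delta.values())
    if total == 0:
        # No time elapsed between snapshots; report zeros rather than divide by 0.
        return {k: 0.0 for k in ("user", "system", "busy", "idle", "iowait", "steal")}
    pct = {f: 100.0 * delta[f] / total for f in FIELDS}
    user = pct["user"] + pct["nice"]          # system/cpu/user includes nice time
    return {
        "user": user,
        "system": pct["system"],
        "busy": user + pct["system"],         # system + user + nice
        "idle": pct["idle"],
        "iowait": pct["iowait"],
        "steal": pct["steal"],
    }
```

In this sketch busy, idle, iowait, and steal add up to 100, which is a convenient sanity check when reading the metrics side by side.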

# CPU Load Averages

| Metric | Description |
|---|---|
| system/load/cpu_1m | CPU load, 1-minute average |
| system/load/cpu_5m | CPU load, 5-minute average |
| system/load/cpu_15m | CPU load, 15-minute average |

# Top Processes

All metrics are aggregated per process name, which is identified by the cmd dimension.

| Metric | Description |
|---|---|
| system/process/cpu_time_pct | percent of total available CPU time over all CPUs |
| system/process/cpu_pct | percent of CPU time relative to 1 CPU |
| system/process/mem_pct | percent of total available memory |
| system/process/mem | memory utilization (kb) |
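The difference between the two CPU metrics is only the denominator, which a small worked example makes clear. The function and its parameters are hypothetical, and the sketch assumes the CPU seconds are already summed over all processes sharing the same cmd.

```python
def process_cpu_views(cpu_seconds_used, interval_s, num_cpus):
    """Return (cpu_time_pct, cpu_pct) for one process name over an interval."""
    # Relative to a single CPU: can exceed 100 for multi-threaded processes.
    cpu_pct = 100.0 * cpu_seconds_used / interval_s
    # Relative to the capacity of all CPUs: never exceeds 100.
    cpu_time_pct = cpu_pct / num_cpus
    return cpu_time_pct, cpu_pct
```

A process name that consumed 20 CPU seconds during a 10 second interval on an 8-CPU node would report cpu_pct of 200 (two full CPUs) and cpu_time_pct of 25 (a quarter of the node).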

# Filesystem Mounts

Utilization statistics per mount point.

  • Any space reserved by the filesystem is excluded from the statistics
  • On Windows systems, i-node statistics are not provided
  • On Linux, only mounts on block devices are reported
  • Mount points are identified by the mount dimension

| Metric | Description |
|---|---|
| system/fs/total_kb | total available kb |
| system/fs/used_kb | total kb used |
| system/fs/free_kb | total kb free |
| system/fs/used_pct | percent utilization |
| system/fs/total_inodes | total available i-nodes |
| system/fs/used_inodes | total i-nodes used |
| system/fs/free_inodes | total i-nodes free |
| system/fs/used_inodes_pct | percent i-node utilization |
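One plausible reading of "space reserved by the filesystem is excluded" is the convention used by `df`, where blocks reserved for the superuser count neither as used nor as free. The sketch below applies that convention to the block counts that a call such as `os.statvfs` returns on Unix (`f_blocks`, `f_bfree`, `f_bavail`, `f_frsize`). Whether sys-mon computes its values exactly this way is an assumption, and the function name is hypothetical.

```python
def fs_usage(blocks, bfree, bavail, frsize):
    """Compute kb totals and percent used, leaving out reserved blocks.

    blocks: total blocks, bfree: free blocks including reserved ones,
    bavail: free blocks usable by unprivileged users, frsize: block size in bytes.
    """
    reserved = bfree - bavail                  # blocks held back by the filesystem
    kb_per_block = frsize / 1024.0
    total_kb = (blocks - reserved) * kb_per_block
    used_kb = (blocks - bfree) * kb_per_block
    free_kb = bavail * kb_per_block
    used_pct = 100.0 * used_kb / total_kb if total_kb else 0.0
    return {
        "total_kb": total_kb,
        "used_kb": used_kb,
        "free_kb": free_kb,
        "used_pct": used_pct,
    }
```

With this convention used_kb + free_kb equals total_kb, so the percentage reflects only the space an ordinary process can actually consume. On a Unix host the inputs could come from `st = os.statvfs("/")` followed by `fs_usage(st.f_blocks, st.f_bfree, st.f_bavail, st.f_frsize)`.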

# Shared Filesystems

Utilization statistics per shared (nfs, cifs) filesystem.

  • On Windows systems, i-node statistics are not provided
  • Shared filesystems are identified by the service dimension

| Metric | Description |
|---|---|
| system/share_fs/total_kb | total available kb |
| system/share_fs/used_kb | total kb used |
| system/share_fs/free_kb | total kb free |
| system/share_fs/used_pct | percent utilization |
| system/share_fs/total_inodes | total available i-nodes |
| system/share_fs/used_inodes | total i-nodes used |
| system/share_fs/free_inodes | total i-nodes free |
| system/share_fs/used_inodes_pct | percent i-node utilization |

# Configuration

This section describes the configuration settings for this plugin.

| Name | Type | Required | Default | Description |
|---|---|---|---|---|
| fs.ignore_fstypes | []string | No | | List of filesystem types to ignore |
| fs.remotes | []string | No | | List of remote shares to track, in host:/remote/export format. By default no shared filesystems are tracked |
| procs.top_percentile | float | No | 90 | Top percentile of processes to export |
| stats | []string | No | all | Defines which stats are exported, by category |

The plugin samples system counters and calculates statistics at its own frequency, independent of the scheduling interval of the task that extracts the corresponding metrics. Each extracted metric value is the average of the samples collected since the previous extraction run.
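The decoupling of sampling from extraction can be pictured as an accumulator that the sampler feeds and the extraction task drains. The class below is a minimal sketch of that idea with hypothetical names. It is not the plugin's code, and it leaves out concerns such as thread safety.

```python
class SampleAverager:
    """Accumulates samples and returns their mean on each extraction."""

    def __init__(self):
        self._sum = 0.0
        self._count = 0

    def add(self, value):
        # Called by the sampler at its own frequency.
        self._sum += value
        self._count += 1

    def extract(self):
        # Called by the extraction task; returns the mean of the samples seen
        # since the previous call, or None if no samples were collected.
        if self._count == 0:
            return None
        mean = self._sum / self._count
        self._sum = 0.0
        self._count = 0
        return mean
```

If the sampler records 10, 20, and 30 between two extraction runs, the extracted value is 20. A longer extraction interval therefore smooths short spikes rather than missing them entirely.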

# Exported Metrics

All metrics are exported by default. Use the stats setting to restrict the export to specific metric categories. The available categories are:

  • io
  • net
  • vm
  • cpu
  • fs
  • load
  • procs

Export all stats:

stats:

Export specific stats categories:

# all block device io and all cpu stats
stats:
  - io
  - cpu
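The effect of the stats setting can be modeled as a filter over metric name prefixes. The mapping below is inferred from the metric names and section headings on this page and should be treated as an assumption. In particular, this page does not state which category controls the system/share_fs metrics, so they are left unmapped here.

```python
# Assumed mapping from stats category to metric name prefix (inferred, not documented).
CATEGORY_PREFIXES = {
    "io": "system/disk/",
    "net": "system/net/",
    "vm": "system/vm/",
    "cpu": "system/cpu/",
    "fs": "system/fs/",
    "load": "system/load/",
    "procs": "system/process/",
}


def is_exported(metric, stats=None):
    """Return True if the metric would be exported for the given stats list."""
    if not stats:
        # An empty or missing stats list means every metric is exported.
        return True
    return any(
        metric.startswith(CATEGORY_PREFIXES[c])
        for c in stats
        if c in CATEGORY_PREFIXES          # unknown categories are ignored
    )
```

With `stats` set to `["io", "cpu"]`, as in the example above, `is_exported("system/disk/iops", ["io", "cpu"])` is true while `is_exported("system/net/rx_kb_sec", ["io", "cpu"])` is false.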