System monitor
- Last Updated 3/31/2023, 12:34:01 PM UTC
Plugin info
name: sys-mon
Collects system-level metrics on a node:
- Block device IO
- Network IO
- Virtual Memory
- CPU
- CPU load averages
- Top processes by CPU and memory
- Filesystem utilizations
# Prerequisites
- Linux: read access to the /proc filesystem. See the proc(5) man page for information about the hidepid and gid mount options.
# Events
None
# Metrics
# Disk IO
The following metrics are collected for each block device on the system. Each device is identified by the device dimension.
| Metric | Description | 
|---|---|
| system/disk/reads_per_sec | reads per second | 
| system/disk/writes_per_sec | writes per second | 
| system/disk/iops | io per second | 
| system/disk/read_kb_sec | read kb per second | 
| system/disk/write_kb_sec | write kb per second | 
| system/disk/avg_queue_size | avg number of operations in progress (queued or servicing) | 
| system/disk/avg_read_latency | avg read response time in milliseconds | 
| system/disk/avg_write_latency | avg write response time in milliseconds | 
| system/disk/avg_io_latency | avg io response time in milliseconds | 
| system/disk/util | percentage of elapsed time during which I/O requests were issued to the device (bandwidth utilization for the device) | 
| system/disk/avg_read_sz_kb | avg read request size in kb | 
| system/disk/avg_write_sz_kb | avg write request size in kb | 
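The source does not show how the plugin derives these values. The following is a minimal sketch of the conventional iostat-style calculation from two samples of cumulative per-device counters, such as those Linux exposes in /proc/diskstats. The `DiskSample` type, its field names, and the `disk_metrics` function are hypothetical and exist only for illustration; the plugin's actual formulas may differ.

```python
from dataclasses import dataclass


@dataclass
class DiskSample:
    """Hypothetical snapshot of cumulative counters for one block device."""
    reads: int            # reads completed
    writes: int           # writes completed
    sectors_read: int     # 512-byte sectors read
    sectors_written: int  # 512-byte sectors written
    read_ms: int          # milliseconds spent reading
    write_ms: int         # milliseconds spent writing
    io_ms: int            # milliseconds the device had I/O in progress


SECTOR_KB = 0.5  # /proc/diskstats counts sectors in 512-byte units


def disk_metrics(prev: DiskSample, curr: DiskSample, interval_sec: float) -> dict:
    """Derive rate and latency metrics from the difference of two samples."""
    reads = curr.reads - prev.reads
    writes = curr.writes - prev.writes
    read_kb = (curr.sectors_read - prev.sectors_read) * SECTOR_KB
    write_kb = (curr.sectors_written - prev.sectors_written) * SECTOR_KB
    read_ms = curr.read_ms - prev.read_ms
    write_ms = curr.write_ms - prev.write_ms
    io_ms = curr.io_ms - prev.io_ms
    ios = reads + writes
    return {
        "reads_per_sec": reads / interval_sec,
        "writes_per_sec": writes / interval_sec,
        "iops": ios / interval_sec,
        "read_kb_sec": read_kb / interval_sec,
        "write_kb_sec": write_kb / interval_sec,
        # Latencies are averaged per completed request; 0.0 when idle.
        "avg_read_latency": read_ms / reads if reads else 0.0,
        "avg_write_latency": write_ms / writes if writes else 0.0,
        "avg_io_latency": (read_ms + write_ms) / ios if ios else 0.0,
        # Share of the interval during which the device was busy.
        "util": 100.0 * io_ms / (interval_sec * 1000.0),
        "avg_read_sz_kb": read_kb / reads if reads else 0.0,
        "avg_write_sz_kb": write_kb / writes if writes else 0.0,
    }
```

For example, 100 reads that transfer 2000 sectors over a 10 second interval give 10 reads per second, 100 kb read per second, and an average read size of 10 kb.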
# Network IO
The following metrics are collected for each network interface on the system. Each interface is identified by the device dimension.
| Metric | Description | 
|---|---|
| system/net/rx_kb_sec | receive kb per second | 
| system/net/rx_err_sec | receive errors per second | 
| system/net/rx_drop_sec | receive drop packets per second | 
| system/net/rx_fifo_sec | receive fifo overruns per second | 
| system/net/tx_kb_sec | transmit kb per second | 
| system/net/tx_err_sec | transmit errors per second | 
| system/net/tx_drop_sec | transmit drop packets per second | 
| system/net/tx_fifo_sec | transmit fifo overruns per second | 
# Virtual Memory
| Metric | Description | 
|---|---|
| system/vm/used | used mem kb | 
| system/vm/avail | estimation of how much memory in kb is available for starting new applications without swapping | 
| system/vm/used_pct | percent memory used. (total mem - avail mem) / total mem | 
| system/vm/p_swap_sec | pages swapped in per sec + pages swapped out per sec | 
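The used_pct formula in the table can be sketched as follows. The function name is hypothetical, and scaling the result to the 0-100 range is an assumption based on the word "percent" in the description.

```python
def used_pct(total_kb: float, avail_kb: float) -> float:
    """Percent memory used: (total mem - avail mem) / total mem, scaled to 0-100."""
    if total_kb <= 0:
        raise ValueError("total_kb must be positive")
    return (total_kb - avail_kb) / total_kb * 100.0


# 16 GB total with 4 GB available means 75% is in use.
print(used_pct(16_000_000, 4_000_000))  # 75.0
```

Because the calculation uses available rather than free memory, page cache that the kernel can reclaim does not count as used.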
# CPU
Statistics averaged across all CPUs:
| Metric | Description | 
|---|---|
| system/cpu/user | user + nice utilization percent | 
| system/cpu/system | system utilization percent | 
| system/cpu/busy | system + user + nice utilization percent | 
| system/cpu/idle | idle utilization percent | 
| system/cpu/iowait | percent of time waiting for io | 
| system/cpu/steal | percent of time waiting for cpu time inside a VM that should have otherwise been allocated by the hypervisor to the VM | 
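As a rough illustration of how the rows above relate to each other, the sketch below computes the percentages from two samples of cumulative CPU time counters. The counter names are modeled on the fields of /proc/stat. The function name, the dictionary layout, and the decision to ignore other fields such as irq and softirq are assumptions, not the plugin's documented behavior.

```python
FIELDS = ("user", "nice", "system", "idle", "iowait", "steal")
METRICS = ("user", "system", "busy", "idle", "iowait", "steal")


def cpu_percentages(prev: dict, curr: dict) -> dict:
    """Turn two samples of cumulative CPU ticks into utilization percentages."""
    delta = {f: curr[f] - prev[f] for f in FIELDS}
    total = sum(delta.values())
    if total <= 0:
        # No time elapsed between samples: report zeros instead of dividing by 0.
        return {name: 0.0 for name in METRICS}

    def pct(ticks: int) -> float:
        return 100.0 * ticks / total

    user = pct(delta["user"] + delta["nice"])  # user + nice
    system = pct(delta["system"])
    return {
        "user": user,
        "system": system,
        "busy": user + system,                 # system + user + nice
        "idle": pct(delta["idle"]),
        "iowait": pct(delta["iowait"]),
        "steal": pct(delta["steal"]),
    }
```

With deltas of 30 user, 10 nice, 20 system, 30 idle, 5 iowait, and 5 steal ticks, this yields user 40, system 20, and busy 60 percent.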
# CPU Load Averages
| Metric | Description | 
|---|---|
| system/load/cpu_1m | cpu load 1m average | 
| system/load/cpu_5m | cpu load 5m average | 
| system/load/cpu_15m | cpu load 15m average | 
# Top Processes
All metrics are aggregated per process name, which is identified by the cmd dimension.
| Metric | Description | 
|---|---|
| system/process/cpu_time_pct | percent of total available cpu time over all CPUs | 
| system/process/cpu_pct | percent of cpu time relative to 1 cpu | 
| system/process/mem_pct | percent of total available memory | 
| system/process/mem | memory utilization (kb) | 
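The difference between cpu_pct and cpu_time_pct is easiest to see with numbers. The sketch below is one reading of the two descriptions, not the plugin's confirmed implementation, and the function and parameter names are hypothetical.

```python
def process_cpu_metrics(cpu_seconds: float, interval_sec: float, num_cpus: int):
    """Return (cpu_pct, cpu_time_pct) for CPU time consumed during one interval.

    cpu_pct is relative to a single CPU, so it can exceed 100 on multi-core
    systems. cpu_time_pct is relative to the capacity of all CPUs combined.
    """
    if interval_sec <= 0 or num_cpus <= 0:
        raise ValueError("interval_sec and num_cpus must be positive")
    cpu_pct = 100.0 * cpu_seconds / interval_sec
    cpu_time_pct = cpu_pct / num_cpus
    return cpu_pct, cpu_time_pct


# Processes named "worker" consume 20 CPU seconds in a 10 second window on 8 CPUs.
print(process_cpu_metrics(20.0, 10.0, 8))  # (200.0, 25.0)
```

Under this reading, a process group that keeps two cores fully busy on an eight-core node reports a cpu_pct of 200 and a cpu_time_pct of 25.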
# Filesystem Mounts
Utilization statistics per mount point.
- Any space reserved by the filesystem is excluded from the statistics
- On Windows systems, i-node statistics are not provided
- On Linux, only mounts on block devices are reported
- Mount points are identified by the mount dimension
| Metric | Description | 
|---|---|
| system/fs/total_kb | total available kb | 
| system/fs/used_kb | total kb used | 
| system/fs/free_kb | total kb free | 
| system/fs/used_pct | percent utilization | 
| system/fs/total_inodes | total available i-nodes | 
| system/fs/used_inodes | total i-nodes used | 
| system/fs/free_inodes | total i-nodes free | 
| system/fs/used_inodes_pct | i-node utilization | 
# Shared Filesystems
Utilization statistics per shared (nfs, cifs) filesystem.
- On Windows systems, i-node statistics are not provided
- Shared file systems are identified by the service dimension
| Metric | Description | 
|---|---|
| system/share_fs/total_kb | total available kb | 
| system/share_fs/used_kb | total kb used | 
| system/share_fs/free_kb | total kb free | 
| system/share_fs/used_pct | percent utilization | 
| system/share_fs/total_inodes | total available i-nodes | 
| system/share_fs/used_inodes | total i-nodes used | 
| system/share_fs/free_inodes | total i-nodes free | 
| system/share_fs/used_inodes_pct | i-node utilization | 
# Configuration
This section describes the configuration settings for this plugin.
| Name | Type | Required | Default | Description | 
|---|---|---|---|---|
| fs.ignore_fstypes | []string | No | | List of filesystem types to ignore | 
| fs.remotes | []string | No | | List of remote shares to track, in host:/remote/export format. By default no shared filesystems are tracked | 
| procs.top_percentile | float | No | 90 | Top percentile of processes to export | 
| stats | []string | No | all | Stats categories to export | 
The plugin samples system counters and calculates statistics at a frequency independent from the scheduling interval of the task that extracts the corresponding metrics. The extracted metric values are the averages of the collected samples between extraction runs.
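The decoupling of sampling from extraction described above can be pictured as an accumulator that is filled at the sampling frequency and drained on each extraction run. This is a minimal sketch with a hypothetical class name, not the plugin's code.

```python
from typing import Optional


class SampleAverager:
    """Accumulates samples and returns their average on each extraction."""

    def __init__(self) -> None:
        self._sum = 0.0
        self._count = 0

    def add(self, value: float) -> None:
        """Called by the sampler at its own frequency."""
        self._sum += value
        self._count += 1

    def extract(self) -> Optional[float]:
        """Called by the extraction task; averages and resets the window."""
        if self._count == 0:
            return None  # no samples since the last extraction
        average = self._sum / self._count
        self._sum = 0.0
        self._count = 0
        return average


avg = SampleAverager()
for sample in (10.0, 20.0, 30.0):
    avg.add(sample)
print(avg.extract())  # 20.0
print(avg.extract())  # None
```

One consequence of this design is that a longer extraction interval smooths short spikes, because each exported value averages more samples.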
# Exported Metrics
All metrics are exported by default. Use the stats configuration setting to export only specific categories of metrics. The available categories are:
- io
- net
- vm
- cpu
- fs
- load
- procs
Export all stats:
stats:
Export specific stats categories:
# all block device io and all cpu stats
stats:
  - io
  - cpu
