Wednesday, December 16, 2015

Environmental metrics with Cumulus Linux

The article Custom metrics with Cumulus Linux describes how to extend the set of metrics exported by the sFlow agent, using the export of BGP metrics as an example. This article demonstrates how environmental metrics (power supplies, temperatures, fan speeds, etc.) can be exported.

The smonctl command can be used to dump sensor data as JSON formatted text:
cumulus@cumulus$ smonctl -j
[
    {
        "pwm_path": "/sys/devices/soc.0/ffe03100.i2c/i2c-1/1-004d", 
        "all_ok": "1", 
        "driver_hwmon": [
            "fan1"
        ], 
        "min": 2500, 
        "cpld_path": "/sys/devices/ffe05000.localbus/ffb00000.CPLD", 
        "state": "OK", 
        "prev_state": "OK", 
        "msg": null, 
        "input": 8998, 
        "type": "fan", 
        "pwm1": 121, 
        "description": "Fan1", 
        "max": 29000, 
        "start_time": 1450228330, 
        "var": 15, 
        "pwm1_enable": 0, 
        "prev_msg": null, 
        "log_time": 1450228330, 
        "present": "1", 
        "target": 0, 
        "name": "Fan1", 
        "fault": "0", 
        "pwm_hwmon": [
            "pwm1"
        ], 
        "driver_path": "/sys/devices/soc.0/ffe03100.i2c/i2c-1/1-004d", 
        "div": "4", 
        "cpld_hwmon": [
            "fan1"
        ]
    },
    ... 
The following Python script, smon_sflow.py, invokes the command, parses the output, and posts a set of custom sFlow metrics:
#!/usr/bin/env python
import json
import socket
from subprocess import check_output

def pc(up, down):
  # Percentage of sensors in the OK state; 0 if no sensors of this type were found
  total = up + down
  return int(100.0 * up / total) if total else 0

# Dump the sensor data and parse the JSON output
res = check_output(["/usr/sbin/smonctl","-j"])
smon = json.loads(res.decode("utf-8"))
fan_maxpc = 0
fan_down = 0
fan_up = 0
psu_down = 0
psu_up = 0
temp_maxpc = 0
temp_up = 0
temp_down = 0
for s in smon:
  stype = s["type"]
  ok = s["state"] == "OK"
  if stype == "fan":
    if ok:
      # Highest fan speed as a percentage of that sensor's maximum
      fan_maxpc = max(fan_maxpc, 100.0 * s["input"] / s["max"])
      fan_up += 1
    else:
      fan_down += 1
  elif stype == "power":
    if ok:
      psu_up += 1
    else:
      psu_down += 1
  elif stype == "temp":
    if ok:
      # Highest temperature as a percentage of that sensor's maximum
      temp_maxpc = max(temp_maxpc, 100.0 * s["input"] / s["max"])
      temp_up += 1
    else:
      temp_down += 1

metrics = {
  "datasource":"smon",
  "fans-max-pc" : {"type":"gauge32", "value":int(fan_maxpc)},
  "fans-up-pc"  : {"type":"gauge32", "value":pc(fan_up, fan_down)},
  "psu-up-pc"   : {"type":"gauge32", "value":pc(psu_up, psu_down)},
  "temp-max-pc" : {"type":"gauge32", "value":int(temp_maxpc)},
  "temp-up-pc"  : {"type":"gauge32", "value":pc(temp_up, temp_down)}
}
msg = {"rtmetric":metrics}
sock = socket.socket(socket.AF_INET,socket.SOCK_DGRAM)
# Encode to bytes so that the send works on both Python 2 and Python 3
sock.sendto(json.dumps(msg).encode("utf-8"),("127.0.0.1",36343))
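One way to inspect the message format without a running hsflowd is to send the same kind of datagram to a throwaway local listener. The following is a minimal sketch and not part of smon_sflow.py: the helper names send_rtmetric and capture_one are hypothetical, and the metric values are sample data rather than real sensor readings.

```python
import json
import socket

def send_rtmetric(metrics, addr):
    """Send one rtmetric message as a JSON-encoded UDP datagram."""
    payload = json.dumps({"rtmetric": metrics}).encode("utf-8")
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        sock.sendto(payload, addr)
    finally:
        sock.close()

def capture_one(metrics):
    """Bind a temporary UDP listener, send to it, and return the decoded message."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        listener.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
        listener.settimeout(2.0)
        send_rtmetric(metrics, listener.getsockname())
        data, _ = listener.recvfrom(65535)
        return json.loads(data.decode("utf-8"))
    finally:
        listener.close()

# Sample values only; a real run would use the totals computed from smonctl.
sample = {
    "datasource": "smon",
    "fans-max-pc": {"type": "gauge32", "value": 31},
    "psu-up-pc": {"type": "gauge32", "value": 100},
}
received = capture_one(sample)
print(json.dumps(received, indent=2, sort_keys=True))
```

The listener binds to an ephemeral port rather than 36343 so that the test does not collide with hsflowd if it is already running on the same host.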
Note: Make sure the following line is uncommented in the /etc/hsflowd.conf file in order to receive custom metrics. If the file is modified, restart hsflowd for the changes to take effect.
  jsonPort = 36343
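The check for an uncommented jsonPort line can be scripted. The sketch below is only an illustration: the function name json_port is hypothetical, and it assumes that hsflowd.conf marks comments with a leading # and writes the setting in the key = value form shown above.

```python
import re

def json_port(conf_text):
    """Return the active jsonPort value from hsflowd.conf text, or None."""
    for line in conf_text.splitlines():
        stripped = line.strip()
        if stripped.startswith("#"):
            continue  # assumed comment syntax: skip commented-out lines
        match = re.match(r"jsonPort\s*=\s*(\d+)", stripped)
        if match:
            return int(match.group(1))
    return None

# Two sample fragments, not a complete configuration file
commented = "sflow {\n  # jsonPort = 36343\n}\n"
enabled = "sflow {\n  jsonPort = 36343\n}\n"
print(json_port(commented))
print(json_port(enabled))
```

On a switch, the text would come from reading /etc/hsflowd.conf; a result of None suggests that the line still needs to be uncommented.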
Adding the following cron entry runs the script every minute:
* * * * * /home/cumulus/smon_sflow.py > /dev/null 2>&1
This example requires Host sFlow version 1.28.3 or later, which is newer than the version of Host sFlow that currently ships with Cumulus Linux 2.5.5. However, Cumulus Linux is an open platform, so the software can be built from source just as you would on a server:
sudo sh -c 'echo "deb http://ftp.us.debian.org/debian wheezy main contrib" > /etc/apt/sources.list.d/deb.list'
sudo apt-get update
sudo apt-get install gcc make libc-dev
wget https://github.com/sflow/host-sflow/archive/v1.28.3.tar.gz
tar -xvzf v1.28.3.tar.gz
cd host-sflow-1.28.3
make CUMULUS=yes
make deb CUMULUS=yes
sudo dpkg -i hsflowd_1.28.3-1_ppc.deb
The sFlow-RT chart at the top of this page shows a trend of the environmental metrics. Each metric has been constructed as a percentage, so they can all be combined on the same chart.
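As a worked example of the percentage scaling, the Fan1 reading in the smonctl output above (input 8998, max 29000) comes out at 31 percent. The sketch below uses hypothetical helper names, and the count of three fans up and one down is made up for illustration.

```python
def percent_of_max(reading, maximum):
    """Scale a sensor reading to a whole percentage of its maximum."""
    return int(100.0 * reading / maximum)

def percent_up(up, down):
    """Whole percentage of sensors in the OK state; 0 if there are none."""
    total = up + down
    return int(100.0 * up / total) if total else 0

# Fan1 values taken from the smonctl output shown earlier
fan1 = {"input": 8998, "max": 29000}
print(percent_of_max(fan1["input"], fan1["max"]))  # 31

# Hypothetical chassis with three fans OK and one failed
print(percent_up(3, 1))  # 75
```

Because every value lands in the 0 to 100 range, fan speed, temperature, and availability can share one axis.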

While custom metrics are useful, they don't capture the semantics of the data and will vary in form and content. In the case of environmental metrics, a standard set of metrics would add significant value: many different types of device include environmental sensors, and a common set of measurements from all networked devices would provide a comprehensive view of power, temperature, humidity, and cooling. Anyone interested in developing a standard sFlow export for environmental metrics can contribute ideas on the sFlow.org mailing list.
