Host sFlow version 1.28.3 adds support for Custom Metrics. This article demonstrates how the extensive set of standard sFlow measurements can be augmented using custom metrics.
Recent releases of Cumulus Linux simplify the task by making machine-readable JSON a supported output format in command-line tools. For example, the cl-bgp tool can be used to dump BGP summary statistics:
cumulus@leaf1$ sudo cl-bgp summary show json
{
  "router-id": "192.168.0.80",
  "as": 65080,
  "table-version": 5,
  "rib-count": 9,
  "rib-memory": 1080,
  "peer-count": 2,
  "peer-memory": 34240,
  "peer-group-count": 1,
  "peer-group-memory": 56,
  "peers": {
    "swp1": {
      "remote-as": 65082,
      "version": 4,
      "msgrcvd": 52082,
      "msgsent": 52084,
      "table-version": 0,
      "outq": 0,
      "inq": 0,
      "uptime": "05w1d04h",
      "prefix-received-count": 2,
      "prefix-advertised-count": 5,
      "state": "Established",
      "id-type": "interface"
    },
    "swp2": {
      "remote-as": 65083,
      "version": 4,
      "msgrcvd": 52082,
      "msgsent": 52083,
      "table-version": 0,
      "outq": 0,
      "inq": 0,
      "uptime": "05w1d04h",
      "prefix-received-count": 2,
      "prefix-advertised-count": 5,
      "state": "Established",
      "id-type": "interface"
    }
  },
  "total-peers": 2,
  "dynamic-peers": 0
}

The following Python script, bgp_sflow.py, invokes the command, parses the output, and posts a set of custom sFlow metrics:
#!/usr/bin/env python
import json
import socket
from subprocess import check_output

# Dump the BGP summary as JSON and parse it
res = check_output(["/usr/bin/cl-bgp","summary","show","json"])
bgp = json.loads(res.decode("utf-8"))

# Map the summary fields to typed custom metrics
metrics = {
  "datasource":"bgp",
  "bgp-router-id"    : {"type":"string", "value":bgp["router-id"]},
  "bgp-as"           : {"type":"string", "value":str(bgp["as"])},
  "bgp-total-peers"  : {"type":"gauge32", "value":bgp["total-peers"]},
  "bgp-peer-count"   : {"type":"gauge32", "value":bgp["peer-count"]},
  "bgp-dynamic-peers": {"type":"gauge32", "value":bgp["dynamic-peers"]},
  "bgp-rib-memory"   : {"type":"gauge32", "value":bgp["rib-memory"]},
  "bgp-rib-count"    : {"type":"gauge32", "value":bgp["rib-count"]},
  "bgp-peer-memory"  : {"type":"gauge32", "value":bgp["peer-memory"]},
  # Message counters are summed across all peers
  "bgp-msgsent"      : {"type":"counter32", "value":sum(bgp["peers"][c]["msgsent"] for c in bgp["peers"])},
  "bgp-msgrcvd"      : {"type":"counter32", "value":sum(bgp["peers"][c]["msgrcvd"] for c in bgp["peers"])}
}

# Send the metrics as a JSON datagram to UDP port 36343 on localhost
# (encoded to bytes so the script runs under both Python 2 and Python 3)
msg = {"rtmetric":metrics}
sock = socket.socket(socket.AF_INET,socket.SOCK_DGRAM)
sock.sendto(json.dumps(msg).encode("utf-8"),("127.0.0.1",36343))

Adding the following cron entry runs the script every minute:
* * * * * /home/cumulus/bgp_sflow.py > /dev/null 2>&1

The new metrics will now arrive at the sFlow collector. The following sflowtool output verifies that the metrics are being received:
startSample ----------------------
sampleType_tag 4300:1002
sampleType RTMETRIC
rtmetric_datasource_name bgp
rtmetric bgp-as = (string) "65080"
rtmetric bgp-rib-count = (gauge32) 9
rtmetric bgp-dynamic-peers = (gauge32) 0
rtmetric bgp-rib-memory = (gauge32) 1080
rtmetric bgp-peer-count = (gauge32) 2
rtmetric bgp-router-id = (string) "192.168.0.80"
rtmetric bgp-total-peers = (gauge32) 2
rtmetric bgp-msgrcvd = (counter32) 104648
rtmetric bgp-msgsent = (counter32) 104651
rtmetric bgp-peer-memory = (gauge32) 34240
endSample ----------------------

A more interesting way to consume this data is using sFlow-RT. The diagram above shows a leaf and spine network, built using Cumulus VX virtual machines, that was used for a Network virtualization visibility demo. Installing the bgp_sflow.py script on each switch adds centralized visibility into fabric-wide BGP statistics.
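Before installing the script on every switch, it can help to check the metric-building logic offline against a captured JSON sample. The following is a minimal Python 3 sketch, not the script itself: the function names build_rtmetric and send_rtmetric and the SAMPLE constant are hypothetical, only a subset of the metrics is shown, and the default destination simply repeats the 127.0.0.1:36343 address used by bgp_sflow.py.

```python
import json
import socket

# Trimmed copy of the cl-bgp output shown earlier (hypothetical test fixture)
SAMPLE = {
    "router-id": "192.168.0.80",
    "as": 65080,
    "total-peers": 2,
    "peers": {
        "swp1": {"msgrcvd": 52082, "msgsent": 52084},
        "swp2": {"msgrcvd": 52082, "msgsent": 52083},
    },
}


def build_rtmetric(bgp):
    """Map a parsed BGP summary to an rtmetric message (subset of metrics)."""
    peers = list(bgp.get("peers", {}).values())
    metrics = {
        "datasource": "bgp",
        "bgp-router-id": {"type": "string", "value": bgp["router-id"]},
        "bgp-as": {"type": "string", "value": str(bgp["as"])},
        "bgp-total-peers": {"type": "gauge32", "value": bgp["total-peers"]},
        # Counters are summed across peers, as in bgp_sflow.py
        "bgp-msgsent": {"type": "counter32",
                        "value": sum(p["msgsent"] for p in peers)},
        "bgp-msgrcvd": {"type": "counter32",
                        "value": sum(p["msgrcvd"] for p in peers)},
    }
    return {"rtmetric": metrics}


def send_rtmetric(msg, host="127.0.0.1", port=36343):
    """Send the message as a single JSON datagram over UDP."""
    payload = json.dumps(msg).encode("utf-8")
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        sock.sendto(payload, (host, port))
    finally:
        sock.close()
```

Keeping the mapping separate from the subprocess call and the socket means build_rtmetric(SAMPLE) can be inspected on any machine, with send_rtmetric called only on the switch.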
With the script installed on every switch, the following sFlow-RT REST API command returns the total BGP messages sent and received, summed across all switches:
$ curl http://10.0.0.86:8008/metric/ALL/sum:bgp-msgrcvd,sum:bgp-msgsent/json
[
  {
    "lastUpdateMax": 20498,
    "lastUpdateMin": 20359,
    "metricN": 4,
    "metricName": "sum:bgp-msgrcvd",
    "metricValue": 0.10000302901465385
  },
  {
    "lastUpdateMax": 20498,
    "lastUpdateMin": 20359,
    "metricN": 4,
    "metricName": "sum:bgp-msgsent",
    "metricValue": 0.10000302901465385
  }
]

The custom metrics are fully integrated with all the other sFlow metrics. For example, the following query returns the host_name, bgp-as and load_one metrics associated with bgp-router-id 192.168.0.80:
$ curl http://10.0.0.86:8008/metric/ALL/host_name,bgp-as,load_one/json?bgp-router-id=192.168.0.80
[
  {
    "agent": "10.0.0.80",
    "lastUpdate": 12194,
    "lastUpdateMax": 12194,
    "lastUpdateMin": 12194,
    "metricN": 1,
    "metricName": "host_name",
    "metricValue": "leaf1"
  },
  {
    "agent": "10.0.0.80",
    "dataSource": "bgp",
    "lastUpdate": 22232,
    "lastUpdateMax": 22232,
    "lastUpdateMin": 22232,
    "metricN": 1,
    "metricName": "bgp-as",
    "metricValue": "65080"
  },
  {
    "agent": "10.0.0.80",
    "lastUpdate": 12194,
    "lastUpdateMax": 12194,
    "lastUpdateMin": 12194,
    "metricN": 1,
    "metricName": "load_one",
    "metricValue": 0
  }
]

The article Cluster performance metrics describes the metric API in more detail. Additional sFlow-RT APIs can be used to send data to a variety of DevOps tools, including Ganglia, Graphite, InfluxDB and Grafana, Logstash, Splunk, and cloud analytics services.
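The two curl queries above can also be issued from a script. The following is a minimal Python 3 sketch that assumes only the URL layout and response fields visible in those examples; the helper names metric_url, metric_values and fetch_metrics are hypothetical and are not part of sFlow-RT.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen


def metric_url(base, agents, metrics, filters=None):
    """Build a /metric/{agents}/{metrics}/json URL with optional filters."""
    url = "{}/metric/{}/{}/json".format(
        base.rstrip("/"), agents, ",".join(metrics))
    if filters:
        url += "?" + urlencode(filters)
    return url


def metric_values(results):
    """Reduce the JSON result list to a {metricName: metricValue} dict."""
    return {r["metricName"]: r["metricValue"] for r in results}


def fetch_metrics(base, agents, metrics, filters=None, timeout=5):
    """Run the query against a live sFlow-RT instance and return the values."""
    url = metric_url(base, agents, metrics, filters)
    with urlopen(url, timeout=timeout) as resp:
        return metric_values(json.loads(resp.read().decode("utf-8")))
```

For instance, fetch_metrics("http://10.0.0.86:8008", "ALL", ["host_name", "bgp-as", "load_one"], {"bgp-router-id": "192.168.0.80"}) would issue the second query shown above, assuming the collector is reachable at that address.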
Software, documentation, applications, and community support are available on sFlow-RT.com. For example, the sFlow-RT Fabric View application shown in the screen capture calculates and displays fabric-wide traffic analytics.
Just a note here: the current install of hsflowd on Cumulus is 1.27.3-1, so 1.28.3 will need to be compiled.
Thanks for mentioning the need to upgrade. One of the nice things about Cumulus Linux being an open Linux platform is that you can recompile from sources on the switch.
The easiest thing to do is build a .deb package on one switch and install the package on the remaining switches.