The following telegraf.js script instructs sFlow-RT to periodically export host metrics to InfluxDB:
    var influxdb = "http://10.0.0.56:8086/write?db=telegraf";

    function sendToInfluxDB(msg) {
      if(!msg || !msg.length) return;

      var req = {
        url:influxdb,
        operation:'POST',
        headers:{"Content-Type":"text/plain"},
        body:msg.join('\n')
      };
      req.error = function(e) {
        logWarning('InfluxDB POST failed, error=' + e);
      };
      try { httpAsync(req); }
      catch(e) {
        logWarning('bad request ' + req.url + ' ' + e);
      }
    }

    var metric_names = [
      'host_name',
      'load_one',
      'load_five',
      'load_fifteen',
      'cpu_num',
      'uptime',
      'cpu_user',
      'cpu_system',
      'cpu_idle',
      'cpu_nice',
      'cpu_wio',
      'cpu_intr',
      'cpu_sintr',
      'cpu_steal',
      'cpu_guest',
      'cpu_guest_nice'
    ];

    var ntoi;
    function mVal(row,name) {
      if(!ntoi) {
        ntoi = {};
        for(var i = 0; i < metric_names.length; i++) {
          ntoi[metric_names[i]] = i;
        }
      }
      return row[ntoi[name]].metricValue;
    }

    setIntervalHandler(function() {
      var i,r,msg = [];
      var vals = table('ALL',metric_names);
      for(i = 0; i < vals.length; i++) {
        r = vals[i];

        // Telegraf System plugin metrics
        msg.push('system,host='
          +mVal(r,'host_name')
          +' load1='+mVal(r,'load_one')
          +',load5='+mVal(r,'load_five')
          +',load15='+mVal(r,'load_fifteen')
          +',n_cpus='+mVal(r,'cpu_num')+'i');
        msg.push('system,host='
          +mVal(r,'host_name')
          +' uptime='+mVal(r,'uptime')+'i');

        // Telegraf CPU plugin metrics
        msg.push('cpu,cpu=cpu-total,host='
          +mVal(r,'host_name')
          +' usage_user='+(mVal(r,'cpu_user')||0)
          +',usage_system='+(mVal(r,'cpu_system')||0)
          +',usage_idle='+(mVal(r,'cpu_idle')||0)
          +',usage_nice='+(mVal(r,'cpu_nice')||0)
          +',usage_iowait='+(mVal(r,'cpu_wio')||0)
          +',usage_irq='+(mVal(r,'cpu_intr')||0)
          +',usage_softirq='+(mVal(r,'cpu_sintr')||0)
          +',usage_steal='+(mVal(r,'cpu_steal')||0)
          +',usage_guest='+(mVal(r,'cpu_guest')||0)
          +',usage_guest_nice='+(mVal(r,'cpu_guest_nice')||0));
      }
      sendToInfluxDB(msg);
    },15);

Some notes on the script:
- The sendToInfluxDB() function uses the InfluxDB HTTP API (see Writing data using the HTTP API) to POST metrics to InfluxDB.
- The handler registered with setIntervalHandler() retrieves a table of metrics from sFlow-RT every 15 seconds and formats them to use the same names and tags as Telegraf.
- The script implements Telegraf System and CPU plugin functionality.
- Additional metrics can easily be added to proxy additional Telegraf plugins.
- Writing applications provides an overview of the sFlow-RT APIs.
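The string formatting in the script can be sketched in isolation to show what the InfluxDB line protocol expects. The helper below is a minimal sketch and not part of the original script: the function names escapeTag, toLine, and memLine are hypothetical, and the memory metric names (mem_total, mem_free, mem_cached, mem_buffers) and their mapping to Telegraf mem plugin fields are assumptions chosen to illustrate how another plugin might be proxied. It escapes commas, equals signs, and spaces in tag keys and values, appends the i suffix to integer fields, and skips fields that have no numeric value.

```javascript
// Escape characters that are special in line protocol tag keys and values.
function escapeTag(s) {
  return String(s).replace(/([,= ])/g, '\\$1');
}

// Build one line: measurement,tag=value field=value,...
// fields is an array of [name, value, isInteger] entries; entries whose
// value is missing or not a finite number are skipped.
// Returns null when no field is left to send.
function toLine(measurement, tags, fields) {
  var tagStr = Object.keys(tags).map(function(k) {
    return ',' + escapeTag(k) + '=' + escapeTag(tags[k]);
  }).join('');
  var fieldStr = fields.filter(function(f) {
    return typeof f[1] === 'number' && isFinite(f[1]);
  }).map(function(f) {
    return f[0] + '=' + (f[2] ? Math.round(f[1]) + 'i' : f[1]);
  }).join(',');
  return fieldStr ? measurement + tagStr + ' ' + fieldStr : null;
}

// Hypothetical extension: proxy a few Telegraf mem plugin fields.
// The sFlow metric names used here are assumptions.
function memLine(host, m) {
  return toLine('mem', {host: host}, [
    ['total', m.mem_total, true],
    ['free', m.mem_free, true],
    ['cached', m.mem_cached, true],
    ['buffered', m.mem_buffers, true]
  ]);
}

// mem_buffers is omitted on purpose to show that missing values are skipped.
var example = memLine('leaf 1', {mem_total: 2048, mem_free: 512, mem_cached: 256});
```

In a script like the one above, the returned string would be pushed onto msg only when it is not null, which avoids sending a field such as load1=undefined when a host has not yet reported a value.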
Start sFlow-RT with the script using Docker:

    docker run -v `pwd`/telegraf.js:/sflow-rt/telegraf.js \
      -e "RTPROP=-Dscript.file=telegraf.js" \
      -p 8008:8008 -p 6343:6343/udp sflow/sflow-rt

Accessing the Chronograf home page brings up a table of hosts with their status and CPU load.
Clicking on the leaf1 host displays a dashboard trending key performance metrics.
Pre-processing the metrics using sFlow-RT's real-time streaming analytics engine can greatly increase scalability: selectively exporting metrics and calculating higher-level summary statistics reduces the amount of data logged to the time series database. The analytics pipeline can also augment the metrics with additional metadata.
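As a rough illustration of the summarization step, the sketch below collapses per-host load values into a single summary point before export. The function name summarizeLoad, the cluster tag, and the load_summary measurement are hypothetical and do not appear in the original script; a plain array of objects stands in for the values the script reads from sFlow-RT.

```javascript
// Reduce per-host load_one values to one summary line (count, mean, max).
// rows: array of {host: string, load_one: number}; rows without a finite
// numeric load_one are ignored. Returns null if no row has a usable value.
function summarizeLoad(rows, cluster) {
  var n = 0, sum = 0, max = -Infinity;
  for (var i = 0; i < rows.length; i++) {
    var v = rows[i].load_one;
    if (typeof v !== 'number' || !isFinite(v)) continue;
    n++;
    sum += v;
    if (v > max) max = v;
  }
  if (n === 0) return null;
  return 'load_summary,cluster=' + cluster
    + ' hosts=' + n + 'i'
    + ',load1_mean=' + (sum / n)
    + ',load1_max=' + max;
}

// spine1 has no load value yet, so it is left out of the summary.
var summary = summarizeLoad([
  {host: 'leaf1', load_one: 0.5},
  {host: 'leaf2', load_one: 1.5},
  {host: 'spine1'}
], 'lab');
```

Writing one summary point per interval instead of one point per host keeps the write volume constant as the number of hosts grows, at the cost of losing per-host detail in the database.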
For example, Collecting Docker Swarm service metrics demonstrates how sFlow-RT can monitor dynamic service pools running under Docker Swarm and write summary statistics to InfluxDB. In this case, Grafana was used to build the metrics dashboard instead of Chronograf.
The open source Host sFlow agent exports an extensive range of standard sFlow metrics and has been ported to a wide range of platforms. Standard metrics describes how standardization helps reduce operational complexity. The overlap between standard sFlow metrics and Telegraf base plugin metrics makes the task of proxying straightforward.
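The overlap can be made explicit as a lookup table. The sketch below restates the CPU name mapping already used in the telegraf.js script as data; the names cpuFieldMap and cpuFields are illustrative only and are not part of sFlow-RT or Telegraf.

```javascript
// sFlow metric name -> Telegraf CPU plugin field name,
// copied from the string concatenation in the telegraf.js script.
var cpuFieldMap = {
  cpu_user: 'usage_user',
  cpu_system: 'usage_system',
  cpu_idle: 'usage_idle',
  cpu_nice: 'usage_nice',
  cpu_wio: 'usage_iowait',
  cpu_intr: 'usage_irq',
  cpu_sintr: 'usage_softirq',
  cpu_steal: 'usage_steal',
  cpu_guest: 'usage_guest',
  cpu_guest_nice: 'usage_guest_nice'
};

// Build the field set for one host from an object of sFlow metric values.
// Missing values default to 0, matching the (value || 0) pattern in the script.
function cpuFields(values) {
  return Object.keys(cpuFieldMap).map(function(name) {
    return cpuFieldMap[name] + '=' + (values[name] || 0);
  }).join(',');
}

var fields = cpuFields({cpu_user: 12.5, cpu_idle: 80});
```

Keeping the mapping in one table means that proxying a further metric is a one-line change, provided the sFlow metric and the Telegraf field carry the same units.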
The Host sFlow agent (and sFlow agents embedded in network switches and routers) goes beyond simple metrics export to provide detailed visibility into network traffic. Articles on this blog demonstrate how sFlow-RT analytics software can be configured to generate detailed traffic flow metrics that can be streamed into InfluxDB, logged (e.g. Exporting events using syslog), or used to trigger control actions (e.g. DDoS mitigation, Docker 1.12 swarm mode elastic load balancing).