Monday, March 19, 2018

Prometheus and Grafana

Prometheus is an open source time series database optimized to collect large numbers of metrics from cloud infrastructure. This article will explore how industry standard sFlow telemetry streaming supported by network devices and Host sFlow agents (Linux, Windows, FreeBSD, AIX, Solaris, Docker, Systemd, Hyper-V, KVM, Nutanix AHV, Xen) can be integrated with Prometheus.

The diagram above shows the elements of the solution: sFlow telemetry streams from hosts and switches to an instance of sFlow-RT. The sFlow-RT analytics software converts the raw measurements into metrics that are accessible through a REST API.

The following prometheus.php script mediates between the Prometheus metrics export protocol and the sFlow-RT REST API.  HTTP queries from Prometheus are translated into calls to the sFlow-RT REST API and JSON responses are converted into Prometheus metrics.
header('Content-Type: text/plain');
if(isset($_GET['labels'])) {
  $keys = htmlspecialchars($_GET["labels"]);
$vals = htmlspecialchars($_GET["values"]);
if(isset($keys)) {
  $cols = $keys.','.$vals;
} else {
  $cols = $vals;
$key_arr = explode(",",$keys);
$result = file_get_contents('http://localhost:8008/table/ALL/'.$cols.'/json');
$obj = json_decode($result,true);
foreach ($obj as $row) {
  foreach ($row as $cell) {
    if(!isset($labels)) {
      $labels = 'agent="'.$cell['agent'].'",datasource="'.$cell['dataSource'].'"';
    $name = $cell['metricName'];
    $val = $cell['metricValue'];
    if(in_array($name,$key_arr)) {
      $labels .= ','.$name.'="'.$val.'"';
    } else {
      print $name."{".$labels."} ".$val."\n";
Install prometheus.php under the web server home directory (e.g. /var/www/html) on the system running sFlow-RT and verify functionality using cURL:
$ curl "http://localhost/prometheus.php?labels=host_name&values=load_one"
load_one{agent="",datasource="2.1",host_name="server3"} 0.06
load_one{agent="",datasource="2.1",host_name="server2"} 0
load_one{agent="",datasource="2.1",host_name="spine1"} 0
load_one{agent="",datasource="2.1",host_name="spine2"} 0
load_one{agent="",datasource="2.1",host_name="leaf1"} 0
load_one{agent="",datasource="2.1",host_name="leaf2"} 0
Define metrics "scraping" jobs in the Prometheus configuration file, prometheus.yml. In this example sFlow-RT is running on host and two sets of metrics have been defined using prometheus.php interface. The sflow-rt-hosts job retrieves host metrics labeled by host_name and the sflow-rt-ifstats job retrieves network interface metrics labeled by host_name and ifname. The metric values in this example are a small selection from the extensive set of standard sFlow metrics available from sFlow-RT, see Metrics.
  scrape_interval:     15s
  evaluation_interval: 15s

  # - "first.rules"
  # - "second.rules"

  - job_name: 'sflow-rt-hosts'
    scrape_interval: 30s
    metrics_path: /prometheus.php
      labels: ['host_name']
      values: ['load_one,cpu_utilization,proc_run']
      - targets: ['']
  - job_name: 'sflow-rt-ifstats'
    scrape_interval: 30s
    metrics_path: /prometheus.php
      labels: ['host_name,ifname']
      values: ['ifinutilization,ifoututilization,ifindiscards,ifoutdiscards']
      - targets: ['']
Start Prometheus. For example, the following command shows how to run Prometheus under Docker:
docker run --name prometheus --rm -v $PWD/data:/prometheus \
-v $PWD/prometheus.yml:/etc/prometheus/prometheus.yml \
-p 9090:9090 prom/prometheus
The screen capture above shows the Prometheus web interface, accessible on port 9090.
Grafana is open source time series analysis software. The ability to pull data from many data sources and the extensive range of charting options makes Grafana an attractive tool for building operations dashboards.

The following command shows how to run Grafana under Docker:
docker run --name grafana --rm -v $PWD/data:/var/lib/grafana \
-p 3000:3000 grafana/grafana
Access the Grafana web interface on port 3000, configure a data source for the Prometheus database, and start building dashboards. The screen capture above shows the same chart built earlier using the native Prometheus interface.

Standard sFlow telemetry provides a unified method of monitoring large scale network and compute infrastructure. This example focussed on sFlow counter metrics, but sFlow also provides real-time flow information that can be used to generate flow metrics, for example, reporting on interactions between microservices.

Additional examples on this blog include:

No comments:

Post a Comment