Thursday, October 21, 2021

InfluxDB Cloud


InfluxDB Cloud is a cloud-hosted version of InfluxDB. The free tier makes it easy to try out the service and has enough capability to satisfy simple use cases. In this article we will explore how metrics based on sFlow streaming telemetry can be pushed into InfluxDB Cloud.

The diagram shows the elements of the solution. Agents in host and network devices are configured to stream sFlow telemetry to an sFlow-RT real-time analytics engine instance. The Telegraf Agent queries sFlow-RT's REST API for metrics and pushes them to InfluxDB Cloud.

docker run -p 8008:8008 -p 6343:6343/udp --name sflow-rt -d sflow/prometheus

Use Docker to run the pre-built sflow/prometheus image which packages sFlow-RT with the sflow-rt/prometheus application. Configure sFlow agents to stream data to this instance.
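
Once agents are streaming, it can be useful to confirm that the sFlow-RT container is exporting metrics before wiring up Telegraf. The following is a minimal sketch, assuming the container is reachable on localhost and using the same metrics path that appears in the telegraf.conf later in this article. The function names are hypothetical helpers, not part of sFlow-RT.

```python
import urllib.request

# Path taken from the telegraf.conf used in this article.
METRICS_PATH = "/prometheus/metrics/ALL/ifinutilization,ifoututilization/txt"


def metrics_endpoint(host="localhost", port=8008, path=METRICS_PATH):
    """Return the URL of the sFlow-RT Prometheus export endpoint."""
    return f"http://{host}:{port}{path}"


def count_samples(text):
    """Count metric samples, skipping blank lines and '#' comment lines."""
    return sum(
        1 for line in text.splitlines()
        if line.strip() and not line.lstrip().startswith("#")
    )


def check_sflow_rt(host="localhost", port=8008, timeout=5):
    """Fetch the endpoint and return the number of samples currently exported."""
    with urllib.request.urlopen(metrics_endpoint(host, port), timeout=timeout) as resp:
        return count_samples(resp.read().decode("utf-8"))
```

Calling check_sflow_rt() should return a count greater than zero once at least one agent is sending interface counters. A result of zero probably means no sFlow datagrams have arrived on UDP port 6343 yet.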

Create an InfluxDB Cloud account. Click the Data tab. Click on the Telegraf option and the InfluxDB Output Plugin button to get the URL to post data. Click the API Tokens option and generate a token.

[agent]
  interval = "15s"
  round_interval = true
  metric_batch_size = 5000
  metric_buffer_limit = 10000
  collection_jitter = "0s"
  flush_interval = "10s"
  flush_jitter = "0s"
  precision = "1s"
  hostname = ""
  omit_hostname = true

[[outputs.influxdb_v2]]
  urls = ["INFLUXDB_CLOUD_URL"]
  token = "INFLUXDB_CLOUD_TOKEN"
  organization = "INFLUXDB_CLOUD_USER"
  bucket = "sflow"

[[inputs.prometheus]]
  urls = ["http://host.docker.internal:8008/prometheus/metrics/ALL/ifinutilization,ifoututilization/txt"]
  metric_version = 2

Create a telegraf.conf file. Substitute INFLUXDB_CLOUD_URL, INFLUXDB_CLOUD_TOKEN, and INFLUXDB_CLOUD_USER with values retrieved from the InfluxDB Cloud account.
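
The substitution step can be scripted so that the token never has to be pasted into a file by hand. This is a sketch under stated assumptions: the template is the telegraf.conf shown above with its three placeholders left in place, and the values are read from environment variables that share the placeholder names (a convention chosen for this example, not something Telegraf or InfluxDB Cloud requires).

```python
import os

# Placeholder names used in the telegraf.conf shown above.
PLACEHOLDERS = ("INFLUXDB_CLOUD_URL", "INFLUXDB_CLOUD_TOKEN", "INFLUXDB_CLOUD_USER")


def render_config(template, values):
    """Replace each placeholder in the template with its value."""
    missing = [name for name in PLACEHOLDERS if not values.get(name)]
    if missing:
        raise KeyError("missing values for: " + ", ".join(missing))
    for name in PLACEHOLDERS:
        template = template.replace(name, values[name])
    return template


def write_config(template_path, output_path, values=None):
    """Read a template file and write the rendered telegraf.conf.

    Values default to environment variables named after the placeholders
    (an assumption made for this sketch).
    """
    values = dict(os.environ) if values is None else values
    with open(template_path, encoding="utf-8") as src:
        rendered = render_config(src.read(), values)
    with open(output_path, "w", encoding="utf-8") as dst:
        dst.write(rendered)
```

For example, write_config("telegraf.conf.template", "telegraf.conf") would produce the file that is mounted into the Telegraf container in the next step. The template file name is hypothetical.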

docker run -v $PWD/telegraf.conf:/etc/telegraf/telegraf.conf:ro \
-d --name telegraf telegraf

Use Docker to run the Telegraf agent.

Data should start appearing in InfluxDB Cloud. Use the Explore tab to see what data is available and to create charts. In this case we are plotting ingress / egress utilization for each switch port in the network.

Telegraf sFlow input plugin describes why you would normally bypass Telegraf and have InfluxDB directly retrieve metrics from sFlow-RT. However, in the case of InfluxDB Cloud, Telegraf acts as a secure gateway, retrieving metrics locally using the inputs.prometheus module, and forwarding to the InfluxDB Cloud using the outputs.influxdb_v2 module. InfluxDB 2.0 released describes the settings used in the inputs.prometheus module.

Modify the urls setting in the inputs.prometheus section of the telegraf.conf file to add additional metrics and/or define flows.
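
As an illustration of how the urls setting is put together, the sketch below builds the URL from a list of metric names. It assumes the path layout seen in the configuration above, where the first path segment selects agents (ALL) and the second is a comma-separated list of metric names. The helper names are hypothetical, and any metric name beyond the two used in this article should be checked against your sFlow-RT instance.

```python
import json


def metrics_url(metrics, agents="ALL", host="host.docker.internal", port=8008):
    """Build an sFlow-RT Prometheus export URL for the given metric names."""
    if not metrics:
        raise ValueError("at least one metric name is required")
    names = ",".join(metrics)
    return f"http://{host}:{port}/prometheus/metrics/{agents}/{names}/txt"


def urls_setting(metrics):
    """Format the urls line for the [[inputs.prometheus]] section."""
    # json.dumps yields a double-quoted string list, which is also valid TOML here.
    return "urls = " + json.dumps([metrics_url(metrics)])
```

Calling urls_setting(["ifinutilization", "ifoututilization"]) reproduces the line used in the telegraf.conf above. Appending further names to the list extends the set of metrics retrieved in a single request.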

There are important scalability and cost advantages to placing the sFlow-RT analytics engine in front of the metrics collection service. For example, in large-scale cloud environments the metrics for each member of a dynamic pool aren't necessarily worth trending, since virtual machines and containers are frequently added and removed. Instead, sFlow-RT can be instructed to track all the members of the pool, calculate summary statistics for the pool, and log the summary statistics. This pre-processing can significantly reduce storage requirements, lowering costs and increasing query performance.
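
The storage saving comes from replacing one time series per pool member with a fixed number of summary series. The sketch below illustrates the idea only; it is not how sFlow-RT implements it, and the member names and values are made up for the example.

```python
from statistics import mean


def summarize_pool(member_values):
    """Reduce per-member values to a fixed set of pool-level statistics."""
    values = list(member_values.values())
    if not values:
        return {"count": 0}
    return {
        "count": len(values),
        "min": min(values),
        "max": max(values),
        "mean": mean(values),
    }


# Hypothetical CPU load readings for the members of one pool.
pool = {"vm-a": 10, "vm-b": 20, "vm-c": 30}
summary = summarize_pool(pool)
```

However many members join or leave the pool, the number of series written to the database stays at the four keys of the summary, whereas per-member trending creates a new series for every short-lived virtual machine or container.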

Host, Docker, Swarm and Kubernetes monitoring describes how to deploy sFlow agents to monitor compute infrastructure.

The sFlow-RT Prometheus Exporter application exposes a REST API that allows metrics to be summarized, filtered, and synthesized. Exposing these capabilities through a REST API allows the Telegraf inputs.prometheus module to control the behavior of the sFlow-RT analytics pipeline and retrieve a small set of high-value metrics tailored to your requirements.
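
The data returned by the REST API is in the Prometheus text exposition format, which is what the inputs.prometheus module consumes. To inspect a response by hand, a small parser along the following lines may help. It is a simplified sketch that handles plain "name{labels} value" lines and skips comments; it is not a complete implementation of the format, and the sample label values in the usage note are invented.

```python
import re

# name, optional {labels}, value, optional integer timestamp
LINE_RE = re.compile(
    r'^([a-zA-Z_:][a-zA-Z0-9_:]*)(?:\{(.*)\})?\s+(\S+)(?:\s+-?\d+)?$'
)
LABEL_RE = re.compile(r'([a-zA-Z_][a-zA-Z0-9_]*)="((?:[^"\\]|\\.)*)"')


def parse_metrics(text):
    """Return a list of (name, labels, value) tuples from exposition text."""
    samples = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and HELP/TYPE comments
        match = LINE_RE.match(line)
        if not match:
            continue  # ignore lines this simplified parser cannot handle
        name, labels, value = match.groups()
        samples.append((name, dict(LABEL_RE.findall(labels or "")), float(value)))
    return samples
```

For instance, parsing the line ifinutilization{agent="192.0.2.1",datasource="2"} 0.5 yields the metric name, a dictionary with the agent and datasource labels, and the value 0.5.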

Wednesday, October 20, 2021

Telegraf sFlow input plugin

The Telegraf agent is bundled with an SFlow Input Plugin for importing sFlow telemetry into the InfluxDB time series database. However, the plugin has major caveats that severely limit the value that can be derived from sFlow telemetry.

Currently only Flow Samples of Ethernet / IPv4 & IPv4 TCP & UDP headers are turned into metrics. Counters and other header samples are ignored.

Series Cardinality Warning

This plugin may produce a high number of series which, when not controlled for, will cause high load on your database.

InfluxDB 2.0 released describes how to use sFlow-RT to convert sFlow telemetry into useful InfluxDB metrics.

Using sFlow-RT overcomes the limitations of the Telegraf sFlow Input Plugin, making it possible to fully realize the value of sFlow monitoring:

  • Counters are a major component of sFlow, efficiently streaming detailed network counters that would otherwise need to be polled via SNMP. Counter telemetry is ingested by sFlow-RT and used to compute an extensive set of Metrics that can be imported into InfluxDB.
  • Flow Samples are fully decoded by sFlow-RT, yielding visibility that extends beyond the basic Ethernet / IPv4 / TCP / UDP header metrics supported by the Telegraf plugin to include ARP, ICMP, IPv6, DNS, VxLAN tunnels, etc. The high cardinality of raw flow data is mitigated by sFlow-RT's programmable real-time flow analytics pipeline, exposing high-value, low-cardinality flow metrics tailored to business requirements.

In addition, there are important scalability and cost advantages to placing the sFlow-RT analytics engine in front of InfluxDB. For example, in large-scale cloud environments the metrics for each member of a dynamic pool aren't necessarily worth trending, since virtual machines and containers are frequently added and removed. Instead, sFlow-RT can be instructed to track all the members of the pool, calculate summary statistics for the pool, and log the summary statistics. This pre-processing can significantly reduce storage requirements, lowering costs and increasing query performance.
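
To make the cardinality point concrete, the sketch below aggregates the same set of flow records by two different keys and compares the number of resulting series. The records, field names, and byte counts are invented for illustration; this models the effect of choosing a coarser flow key, not the sFlow-RT pipeline itself.

```python
from collections import defaultdict


def aggregate(flows, key_fields):
    """Sum bytes per distinct key; each key becomes one time series."""
    totals = defaultdict(int)
    for flow in flows:
        key = tuple(flow[field] for field in key_fields)
        totals[key] += flow["bytes"]
    return dict(totals)


# Hypothetical decoded flow records (documentation address ranges).
SAMPLE_FLOWS = [
    {"src": "192.0.2.1", "dst": "198.51.100.2", "dport": 443, "country": "US", "bytes": 100},
    {"src": "192.0.2.3", "dst": "198.51.100.2", "dport": 443, "country": "US", "bytes": 200},
    {"src": "192.0.2.4", "dst": "198.51.100.2", "dport": 80, "country": "DE", "bytes": 300},
    {"src": "192.0.2.1", "dst": "198.51.100.2", "dport": 443, "country": "US", "bytes": 50},
]

per_connection = aggregate(SAMPLE_FLOWS, ("src", "dst", "dport"))
per_country = aggregate(SAMPLE_FLOWS, ("country",))
```

Keyed by address and port, the number of series grows with the number of distinct conversations. Keyed by country, it is bounded by the number of countries, no matter how much traffic is observed.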

Tuesday, October 12, 2021

Grafana Cloud


Grafana Cloud is a cloud-hosted version of Grafana, Prometheus, and Loki. The free tier makes it easy to try out the service and has enough capability to satisfy simple use cases. In this article we will explore how metrics based on sFlow streaming telemetry can be pushed into Grafana Cloud.

The diagram shows the elements of the solution. Agents in host and network devices are configured to stream sFlow telemetry to an sFlow-RT real-time analytics engine instance. The Grafana Agent queries sFlow-RT's REST API for metrics and pushes them to Grafana Cloud.

docker run -p 8008:8008 -p 6343:6343/udp --name sflow-rt -d sflow/prometheus

Use Docker to run the pre-built sflow/prometheus image which packages sFlow-RT with the sflow-rt/prometheus application. Configure sFlow agents to stream data to this instance.

Create a Grafana Cloud account. Click on the Agent button on the home page to get the configuration settings for the Grafana Agent.

Click on the Prometheus button to get the configuration to forward metrics from the Grafana Agent.

Enter a name and click on the Create API key button to generate configuration settings that include a URL, username, and password that will be used in the Grafana Agent configuration.

server:
  log_level: info
  http_listen_port: 12345
prometheus:
  wal_directory: /tmp/wal
  global:
    scrape_interval: 15s
  configs:
    - name: agent
      host_filter: false
      scrape_configs:
        - job_name: 'sflow-rt-analyzer'
          metrics_path: /prometheus/analyzer/txt
          static_configs:
            - targets: ['host.docker.internal:8008']
        - job_name: 'sflow-rt-metrics'
          metrics_path: /prometheus/metrics/ALL/ALL/txt
          static_configs:
            - targets: ['host.docker.internal:8008']
          metric_relabel_configs:
            - source_labels: ['agent', 'datasource']
              separator: ':'
              target_label: instance
        - job_name: 'sflow-rt-countries'
          metrics_path: /app/prometheus/scripts/export.js/flows/ALL/txt
          static_configs:
            - targets: ['host.docker.internal:8008']
          params:
            metric: ['sflow_country_bps']
            key: ['null:[country:ipsource:both]:unknown','null:[country:ipdestination:both]:unknown']
            label: ['src','dst']
            value: ['bytes']
            scale: ['8']
            aggMode: ['sum']
            minValue: ['1000']
            maxFlows: ['100']
        - job_name: 'sflow-rt-asns'
          metrics_path: /app/prometheus/scripts/export.js/flows/ALL/txt
          static_configs:
            - targets: ['host.docker.internal:8008']
          params:
            metric: ['sflow_asn_bps']
            key: ['null:[asn:ipsource:both]:unknown','null:[asn:ipdestination:both]:unknown']
            label: ['src','dst']
            value: ['bytes']
            scale: ['8']
            aggMode: ['sum']
            minValue: ['1000']
            maxFlows: ['100']
      remote_write:
        - url: API_URL
          basic_auth:
            username: API_USERID
            password: API_KEY

Create an agent.yaml configuration file. Substitute the API_URL, API_USERID, and API_KEY with values from the API Key settings obtained previously.

docker run -v $PWD/data:/etc/agent/data -v $PWD/agent.yaml:/etc/agent/agent.yaml \
--name grafana-agent -d grafana/agent

Use Docker to run the Grafana Agent.
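
The metric_relabel_configs entry in the sflow-rt-metrics job of agent.yaml sets the instance label from the agent and datasource labels. The sketch below is a simplified model of that rule, assuming the default relabel action, in which the source label values are joined with the separator and written to the target label. It ignores the regex and replacement options, and the label values in the usage note are invented.

```python
def relabel_instance(labels, source_labels=("agent", "datasource"),
                     separator=":", target_label="instance"):
    """Return a copy of labels with target_label set to the joined source values.

    Missing source labels contribute an empty string, as in Prometheus.
    """
    joined = separator.join(labels.get(name, "") for name in source_labels)
    result = dict(labels)
    result[target_label] = joined
    return result
```

A sample carrying agent="192.0.2.1" and datasource="2" ends up with instance="192.0.2.1:2", so each interface on each device appears as its own instance in the dashboards rather than every metric sharing the address of the sFlow-RT host.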

Data should start appearing in Grafana Cloud. Install the sFlow-RT Health, sFlow-RT Countries and Networks, and sFlow-RT Network Interfaces dashboards to view the data. For example, the Countries and Networks dashboard above shows traffic entering and leaving your network broken out by network and country. Flow metrics with Prometheus and Grafana describes how to build Prometheus scrape_configs that will cause sFlow-RT to export custom traffic flow metrics.

There are important scalability and cost advantages to placing the sFlow-RT analytics engine in front of the metrics collection service. For example, in large-scale cloud environments the metrics for each member of a dynamic pool aren't necessarily worth trending, since virtual machines and containers are frequently added and removed. Instead, sFlow-RT can be instructed to track all the members of the pool, calculate summary statistics for the pool, and log the summary statistics. This pre-processing can significantly reduce storage requirements, lowering costs and increasing query performance.

Host, Docker, Swarm and Kubernetes monitoring describes how to deploy sFlow agents to monitor compute infrastructure.

The sFlow-RT Prometheus Exporter application exposes a REST API that allows metrics to be summarized, filtered, and synthesized. Exposing these capabilities through a REST API allows Prometheus scrape_configs to control the behavior of the sFlow-RT analytics pipeline and retrieve a small set of high-value metrics tailored to your requirements.
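
Each scrape job's metrics_path and params are combined into an HTTP request against the target, which is how a scrape configuration steers the analytics pipeline. The sketch below reconstructs the request URL for a job, which can be handy for testing a flow definition with a browser or curl before adding it to agent.yaml. The function name is hypothetical; the path and parameter names are the ones used in the configuration in this article.

```python
from urllib.parse import urlencode


def scrape_url(target, metrics_path, params=None, scheme="http"):
    """Build the URL a scrape job requests from its target."""
    url = f"{scheme}://{target}{metrics_path}"
    # doseq=True emits one key=value pair per list element, as Prometheus does.
    query = urlencode(params or {}, doseq=True)
    return f"{url}?{query}" if query else url


# Parameters copied from the sflow-rt-countries job.
countries = scrape_url(
    "host.docker.internal:8008",
    "/app/prometheus/scripts/export.js/flows/ALL/txt",
    {
        "metric": ["sflow_country_bps"],
        "key": ["null:[country:ipsource:both]:unknown",
                "null:[country:ipdestination:both]:unknown"],
        "label": ["src", "dst"],
        "value": ["bytes"],
        "scale": ["8"],
        "aggMode": ["sum"],
        "minValue": ["1000"],
        "maxFlows": ["100"],
    },
)
```

When running this outside Docker, replace host.docker.internal with the address of the sFlow-RT host.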

Thursday, October 7, 2021

DDoS protection quickstart guide

DDoS Protect is an open source denial of service mitigation tool that uses industry standard sFlow telemetry from routers to detect attacks and automatically deploy BGP remotely triggered blackhole (RTBH) and BGP Flowspec filters to block attacks within seconds.

This document pulls together links to a number of articles that describe how you can quickly try out DDoS Protect and get it running in your environment:

DDoS Protect is a lightweight solution that uses standard telemetry and control (sFlow and BGP) capabilities of routers to automatically block disruptive volumetric denial of service attacks. You can quickly evaluate the technology on your laptop or in a test lab. The solution leverages standard features of modern routing hardware to scale easily to large high traffic networks.