Monday, November 1, 2021

Sunburst

The recently released open source Sunburst application provides a real-time visualization of the protocols running on a network. The Sunburst application runs on the sFlow-RT real-time analytics platform, which receives standard streaming sFlow telemetry from switches and routers throughout the network to provide comprehensive visibility.
docker run -p 8008:8008 -p 6343:6343/udp sflow/prometheus
The pre-built sflow/prometheus Docker image packages sFlow-RT with the applications for exploring real-time sFlow analytics. Run the command above, configure network devices to send sFlow to the application on UDP port 6343 (the default sFlow port) and connect with a web browser to port 8008 to access the user interface.
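To confirm that devices are streaming telemetry, query the sFlow-RT REST API. The following command is a minimal sketch, assuming sFlow-RT is running on the local host:
curl http://localhost:8008/agents/json
The /agents/json endpoint lists the agents that have been heard from, along with statistics such as the number of sFlow datagrams received from each.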
 
The chart at the top of this article demonstrates the visibility that sFlow can provide into nested protocol stacks that result from network virtualization. For example, the most deeply nested set of protocols shown in the chart is:
  1. eth: Ethernet
  2. q: IEEE 802.1Q VLAN
  3. trill: Transparent Interconnection of Lots of Links (TRILL)
  4. eth: Ethernet
  5. q: IEEE 802.1Q VLAN
  6. ip: Internet Protocol (IP) version 4
  7. udp: User Datagram Protocol (UDP)
  8. vxlan: Virtual eXtensible Local Area Network (VXLAN)
  9. eth: Ethernet
  10. ip: Internet Protocol (IP) version 4
  11. esp: IPsec Encapsulating Security Payload (ESP)
Click on a segment in the sunburst chart to further explore the selected protocol using the Flow Browser application.
The Flow Browser allows the full set of flow attributes to be explored; see Defining Flows for details. In this example, the filter was added by clicking on a segment in the Sunburst application and additional keys were entered to show the inner and outer IP addresses in the tunnel.
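The same flow analytics are available through the sFlow-RT REST API. The commands below are a sketch, assuming sFlow-RT is listening on localhost:8008; the flow name stacks is illustrative and the stack key is described in Defining Flows:
# define a flow keyed on each packet's protocol stack
curl -H "Content-Type:application/json" -X PUT \
  --data '{"keys":"stack","value":"bytes"}' \
  http://localhost:8008/flow/stacks/json
# retrieve the top 5 stacks by bytes per second
curl "http://localhost:8008/activeflows/ALL/stacks/json?maxFlows=5&aggMode=sum"
The second command returns the same data the sunburst chart renders, making it easy to script further analysis.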

Thursday, October 21, 2021

InfluxDB Cloud


InfluxDB Cloud is a cloud-hosted version of InfluxDB. The free tier makes it easy to try out the service and has enough capability to satisfy simple use cases. In this article we will explore how metrics based on sFlow streaming telemetry can be pushed into InfluxDB Cloud.

The diagram shows the elements of the solution. Agents in host and network devices are configured to stream sFlow telemetry to an sFlow-RT real-time analytics engine instance. The Telegraf Agent queries sFlow-RT's REST API for metrics and pushes them to InfluxDB Cloud.

docker run -p 8008:8008 -p 6343:6343/udp --name sflow-rt -d sflow/prometheus

Use Docker to run the pre-built sflow/prometheus image which packages sFlow-RT with the sflow-rt/prometheus application. Configure sFlow agents to stream data to this instance.

Create an InfluxDB Cloud account. Click the Data tab. Click on the Telegraf option and the InfluxDB Output Plugin button to get the URL to post data. Click the API Tokens option and generate a token.
[agent]
  interval = "15s"
  round_interval = true
  metric_batch_size = 5000
  metric_buffer_limit = 10000
  collection_jitter = "0s"
  flush_interval = "10s"
  flush_jitter = "0s"
  precision = "1s"
  hostname = ""
  omit_hostname = true

[[outputs.influxdb_v2]]
  urls = ["INFLUXDB_CLOUD_URL"]
  token = "INFLUXDB_CLOUD_TOKEN"
  organization = "INFLUXDB_CLOUD_USER"
  bucket = "sflow"

[[inputs.prometheus]]
  urls = ["http://host.docker.internal:8008/prometheus/metrics/ALL/ifinutilization,ifoututilization/txt"]
  metric_version = 2

Create a telegraf.conf file. Substitute INFLUXDB_CLOUD_URL, INFLUXDB_CLOUD_TOKEN, and INFLUXDB_CLOUD_USER with values retrieved from the InfluxDB Cloud account.
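Before starting Telegraf, verify the export by fetching the URL from the inputs.prometheus section directly. From the Docker host, substitute localhost for host.docker.internal:

curl "http://localhost:8008/prometheus/metrics/ALL/ifinutilization,ifoututilization/txt"

The response should contain Prometheus format ifinutilization and ifoututilization series, one per switch port.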

docker run -v $PWD/telegraf.conf:/etc/telegraf/telegraf.conf:ro \
-d --name telegraf telegraf

Use Docker to run the telegraf agent.

Data should start appearing in InfluxDB Cloud. Use the Explore tab to see what data is available and to create charts. In this case we are plotting ingress / egress utilization for each switch port in the network.

Telegraf sFlow input plugin describes why you would normally bypass Telegraf and have InfluxDB directly retrieve metrics from sFlow-RT. However, in the case of InfluxDB Cloud, Telegraf acts as a secure gateway, retrieving metrics locally using the inputs.prometheus module, and forwarding to the InfluxDB Cloud using the outputs.influxdb_v2 module. InfluxDB 2.0 released describes the settings used in the inputs.prometheus module.

Modify the urls setting in the inputs.prometheus section of the telegraf.conf file to add additional metrics and/or define flows.
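For example, defining a flow creates a new metric, named after the flow, that can be added to the metric list in the URL. The following sketch is illustrative; the flow name tcp_bytes and its key are not part of the standard configuration:

# define a flow named tcp_bytes (name and keys are illustrative)
curl -H "Content-Type:application/json" -X PUT \
  --data '{"keys":"tcpdestinationport","value":"bytes"}' \
  http://localhost:8008/flow/tcp_bytes/json

The metric list in the urls setting should then be able to include the new flow, e.g. http://host.docker.internal:8008/prometheus/metrics/ALL/ifinutilization,ifoututilization,tcp_bytes/txt.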

There are important scalability and cost advantages to placing the sFlow-RT analytics engine in front of the metrics collection service. For example, in large scale cloud environments the metrics for each member of a dynamic pool aren't necessarily worth trending since virtual machines / containers are frequently added and removed. Instead, sFlow-RT can be instructed to track all the members of the pool, calculate summary statistics for the pool, and log the summary statistics. This pre-processing can significantly reduce storage requirements, lowering costs and increasing query performance.

Host, Docker, Swarm and Kubernetes monitoring describes how to deploy sFlow agents to monitor compute infrastructure.

The sFlow-RT Prometheus Exporter application exposes a REST API that allows metrics to be summarized, filtered, and synthesized. Exposing these capabilities through a REST API allows the Telegraf inputs.prometheus module to control the behavior of the sFlow-RT analytics pipeline and retrieve a small set of high value metrics tailored to your requirements.
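For example, prefixing a metric name with a statistic (max:, min:, sum:, avg:, etc., described in the sFlow-RT documentation) summarizes the metric across all agents. The following sketch retrieves the highest ingress utilization reported by any monitored port:

curl "http://localhost:8008/metric/ALL/max:ifinutilization/json"

A single summary series like this can stand in for thousands of per-port series in the time series database.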

Wednesday, October 20, 2021

Telegraf sFlow input plugin

The Telegraf agent is bundled with an sFlow Input Plugin for importing sFlow telemetry into the InfluxDB time series database. However, the plugin has major caveats that severely limit the value that can be derived from sFlow telemetry.

Currently only Flow Samples of Ethernet / IPv4 & IPv4 TCP & UDP headers are turned into metrics. Counters and other header samples are ignored.

Series Cardinality Warning

This plugin may produce a high number of series which, when not controlled for, will cause high load on your database.

InfluxDB 2.0 released describes how to use sFlow-RT to convert sFlow telemetry into useful InfluxDB metrics.

Using sFlow-RT overcomes the limitations of the Telegraf sFlow Input Plugin, making it possible to fully realize the value of sFlow monitoring:

  • Counters are a major component of sFlow, efficiently streaming detailed network counters that would otherwise need to be polled via SNMP. Counter telemetry is ingested by sFlow-RT and used to compute an extensive set of Metrics that can be imported into InfluxDB; see the example following this list.
  • Flow Samples are fully decoded by sFlow-RT, yielding visibility that extends beyond the basic Ethernet / IPv4 / TCP / UDP header metrics supported by the Telegraf plugin to include ARP, ICMP, IPv6, DNS, VxLAN tunnels, etc. The high cardinality of raw flow data is mitigated by sFlow-RT's programmable real-time flow analytics pipeline, exposing high value, low cardinality, flow metrics tailored to business requirements.
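For example, the following command is a minimal sketch, assuming sFlow-RT is listening on localhost:8008, that dumps the ingress utilization metric computed from standard interface counters for every monitored interface:

curl http://localhost:8008/dump/ALL/ifinutilization/json

No SNMP polling is involved; the values are continuously updated from the incoming sFlow counter stream.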
In addition, there are important scalability and cost advantages to placing the sFlow-RT analytics engine in front of InfluxDB. For example, in large scale cloud environments the metrics for each member of a dynamic pool aren't necessarily worth trending since virtual machines / containers are frequently added and removed. Instead, sFlow-RT can be instructed to track all the members of the pool, calculate summary statistics for the pool, and log the summary statistics. This pre-processing can significantly reduce storage requirements, lowering costs and increasing query performance.

Tuesday, October 12, 2021

Grafana Cloud


Grafana Cloud is a cloud-hosted version of Grafana, Prometheus, and Loki. The free tier makes it easy to try out the service and has enough capability to satisfy simple use cases. In this article we will explore how metrics based on sFlow streaming telemetry can be pushed into Grafana Cloud.

The diagram shows the elements of the solution. Agents in host and network devices are configured to stream sFlow telemetry to an sFlow-RT real-time analytics engine instance. The Grafana Agent queries sFlow-RT's REST API for metrics and pushes them to Grafana Cloud.
docker run -p 8008:8008 -p 6343:6343/udp --name sflow-rt -d sflow/prometheus
Use Docker to run the pre-built sflow/prometheus image which packages sFlow-RT with the sflow-rt/prometheus application. Configure sFlow agents to stream data to this instance.
Create a Grafana Cloud account. Click on the Agent button on the home page to get the configuration settings for the Grafana Agent.
Click on the Prometheus button to get the configuration to forward metrics from the Grafana Agent.
Enter a name and click on the Create API key button to generate configuration settings that include a URL, username, and password that will be used in the Grafana Agent configuration.
server:
  log_level: info
  http_listen_port: 12345
prometheus:
  wal_directory: /tmp/wal
  global:
    scrape_interval: 15s
  configs:
    - name: agent
      host_filter: false
      scrape_configs:
        - job_name: 'sflow-rt-analyzer'
          metrics_path: /prometheus/analyzer/txt
          static_configs:
            - targets: ['host.docker.internal:8008']
        - job_name: 'sflow-rt-metrics'
          metrics_path: /prometheus/metrics/ALL/ALL/txt
          static_configs:
            - targets: ['host.docker.internal:8008']
          metric_relabel_configs:
            - source_labels: ['agent', 'datasource']
              separator: ':'
              target_label: instance
        - job_name: 'sflow-rt-countries'
          metrics_path: /app/prometheus/scripts/export.js/flows/ALL/txt
          static_configs:
            - targets: ['host.docker.internal:8008']
          params:
            metric: ['sflow_country_bps']
            key: ['null:[country:ipsource:both]:unknown','null:[country:ipdestination:both]:unknown']
            label: ['src','dst']
            value: ['bytes']
            scale: ['8']
            aggMode: ['sum']
            minValue: ['1000']
            maxFlows: ['100']
        - job_name: 'sflow-rt-asns'
          metrics_path: /app/prometheus/scripts/export.js/flows/ALL/txt
          static_configs:
            - targets: ['host.docker.internal:8008']
          params:
            metric: ['sflow_asn_bps']
            key: ['null:[asn:ipsource:both]:unknown','null:[asn:ipdestination:both]:unknown']
            label: ['src','dst']
            value: ['bytes']
            scale: ['8']
            aggMode: ['sum']
            minValue: ['1000']
            maxFlows: ['100']
      remote_write:
        - url: API_URL
          basic_auth:
            username: API_USERID
            password: API_KEY
Create an agent.yaml configuration file. Substitute the API_URL, API_USERID, and API_KEY with values from the API Key settings obtained previously.
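The scrape targets can be tested before deploying the agent. For example, the following command previews the sflow-rt-countries job; run it on the Docker host, substituting localhost for host.docker.internal. The repeated key and label arguments mirror the lists in the params section and curl's -g option prevents it from interpreting the square brackets:
curl -g "http://localhost:8008/app/prometheus/scripts/export.js/flows/ALL/txt?metric=sflow_country_bps&key=null:[country:ipsource:both]:unknown&key=null:[country:ipdestination:both]:unknown&label=src&label=dst&value=bytes&scale=8&aggMode=sum&minValue=1000&maxFlows=100"
If traffic is flowing, the response will contain sflow_country_bps series labeled by source and destination country.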
docker run -v $PWD/data:/etc/agent/data -v $PWD/agent.yaml:/etc/agent/agent.yaml \
--name grafana-agent -d grafana/agent
Use Docker to run the Grafana Agent.
Data should start appearing in Grafana Cloud. Install the sFlow-RT Health, sFlow-RT Countries and Networks, and sFlow-RT Network Interfaces dashboards to view the data. For example, the Countries and Networks dashboard above shows traffic entering and leaving your network broken out by network and country. Flow metrics with Prometheus and Grafana describes how to build Prometheus scrape_configs that will cause sFlow-RT to export custom traffic flow metrics. 
There are important scalability and cost advantages to placing the sFlow-RT analytics engine in front of the metrics collection service. For example, in large scale cloud environments the metrics for each member of a dynamic pool aren't necessarily worth trending since virtual machines / containers are frequently added and removed. Instead, sFlow-RT can be instructed to track all the members of the pool, calculate summary statistics for the pool, and log the summary statistics. This pre-processing can significantly reduce storage requirements, lowering costs and increasing query performance.
Host, Docker, Swarm and Kubernetes monitoring describes how to deploy sFlow agents to monitor compute infrastructure.
The sFlow-RT Prometheus Exporter application exposes a REST API that allows metrics to be summarized, filtered, and synthesized. Exposing these capabilities through a REST API allows Prometheus scrape_configs to control the behavior of the sFlow-RT analytics pipeline and retrieve a small set of high value metrics tailored to your requirements.

Thursday, October 7, 2021

DDoS protection quickstart guide

DDoS Protect is an open source denial of service mitigation tool that uses industry standard sFlow telemetry from routers to detect attacks and automatically deploy BGP remotely triggered blackhole (RTBH) and BGP Flowspec filters to block attacks within seconds.

This document pulls together links to a number of articles that describe how you can quickly try out DDoS Protect and get it running in your environment.

DDoS Protect is a lightweight solution that uses standard telemetry and control (sFlow and BGP) capabilities of routers to automatically block disruptive volumetric denial of service attacks. You can quickly evaluate the technology on your laptop or in a test lab. The solution leverages standard features of modern routing hardware to scale easily to large high traffic networks.

Monday, September 20, 2021

Containernet

Containernet is a fork of the Mininet network emulator that uses Docker containers as hosts in emulated network topologies.

Multipass describes how to build a Mininet testbed that provides real-time traffic visibility using sFlow-RT. This article adapts the testbed for Containernet.

multipass launch --name=containernet bionic
multipass exec containernet -- sudo apt update
multipass exec containernet -- sudo apt -y install ansible git aptitude default-jre
multipass exec containernet -- git clone https://github.com/containernet/containernet.git
multipass exec containernet -- sudo ansible-playbook -i "localhost," -c local containernet/ansible/install.yml
multipass exec containernet -- sudo /bin/sh -c "cd containernet; make develop"
multipass exec containernet -- wget https://inmon.com/products/sFlow-RT/sflow-rt.tar.gz
multipass exec containernet -- tar -xzf sflow-rt.tar.gz
multipass exec containernet -- ./sflow-rt/get-app.sh sflow-rt mininet-dashboard

Run the above commands in a terminal to create the Containernet virtual machine. 

multipass list

List the virtual machines.

Name                    State             IPv4             Image
primary                 Stopped           --               Ubuntu 20.04 LTS
containernet            Running           192.168.64.12    Ubuntu 18.04 LTS
                                          172.17.0.1

Find the IP address of the containernet virtual machine we just created (192.168.64.12).

multipass exec containernet -- ./sflow-rt/start.sh

Start sFlow-RT. Use a web browser to connect to the VM and access the Mininet Dashboard application running on sFlow-RT, in this case http://192.168.64.12:8008/app/mininet-dashboard/html/

Open a second shell.

multipass shell containernet

Connect to the Containernet virtual machine.

cp containernet/examples/containernet_example.py .

Copy the Containernet Get started example script.

#!/usr/bin/python
"""
This is the most simple example to showcase Containernet.
"""
from mininet.net import Containernet
from mininet.node import Controller
from mininet.cli import CLI
from mininet.link import TCLink
from mininet.log import info, setLogLevel
setLogLevel('info')

exec(open("./sflow-rt/extras/sflow.py").read())

net = Containernet(controller=Controller)
info('*** Adding controller\n')
net.addController('c0')
info('*** Adding docker containers\n')
d1 = net.addDocker('d1', ip='10.0.0.251', dimage="ubuntu:trusty")
d2 = net.addDocker('d2', ip='10.0.0.252', dimage="ubuntu:trusty")
info('*** Adding switches\n')
s1 = net.addSwitch('s1')
s2 = net.addSwitch('s2')
info('*** Creating links\n')
net.addLink(d1, s1)
net.addLink(s1, s2, cls=TCLink, delay='100ms', bw=1)
net.addLink(s2, d2)
info('*** Starting network\n')
net.start()
info('*** Testing connectivity\n')
net.ping([d1, d2])
info('*** Running CLI\n')
CLI(net)
info('*** Stopping network')
net.stop()

Edit the copy to add the exec(open("./sflow-rt/extras/sflow.py").read()) line shown in the listing above; this enables sFlow monitoring of the emulated network.

sudo python3 containernet_example.py

Run the Containernet example script.

Once the script is running, the network topology will appear under the Mininet Dashboard topology tab.
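Generate some traffic to populate the dashboard. At the containernet> prompt, the Mininet CLI runs commands on the emulated hosts, so a ping between the two containers produces a steady traffic stream:

containernet> d1 ping -c 30 d2

The resulting traffic appears in the Mininet Dashboard traffic charts.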

Tuesday, August 31, 2021

Netdev 0x15


The recent Netdev 0x15 conference included a number of papers diving into the technology behind Linux as a network operating system. Slides and videos are now available on the conference web site.
Network wide visibility with Linux networking and sFlow describes the Linux switchdev driver used to integrate network hardware with Linux. The talk focuses on network telemetry, showing how standard Linux APIs are used to configure hardware instrumentation and stream telemetry using the industry standard sFlow protocol for data center wide visibility.
Switchdev in the wild describes Yandex's experience of deploying Linux switchdev based switches in production at scale. The diagram from the talk shows the three layer leaf and spine network architecture used in their data centers. Yandex operates multiple data centers, each containing up to 100,000 servers.
Switchdev Offload Workshop provides updates about the latest developments in the switchdev community. 
FRR Workshop discusses the latest developments in the FRRouting project, the open source routing software that is now a de facto standard on Linux network operating systems.