Thursday, June 26, 2014

Docker performance monitoring

IT’S HERE: DOCKER 1.0 announces the first production release of the Docker Linux container platform. Docker is seeing explosive growth and has already been embraced by IBM, Red Hat and Rackspace. Today the open source Host sFlow project released support for Docker, exporting standard sFlow performance metrics for Linux containers and bringing Linux containers into the broader sFlow ecosystem.
Visibility and the software defined data center
Host sFlow Docker support simplifies data center performance management by unifying monitoring of Linux containers with monitoring of virtual machines (Hyper-V, KVM/libvirt, Xen/XCP/XenServer), virtual switches (Open vSwitch, Hyper-V Virtual Switch, IBM Distributed Virtual Switch, HP FlexFabric Virtual Switch), servers (Linux, Windows, Solaris, AIX, FreeBSD), and physical networks (over 40 vendors, including: A10, Alcatel-Lucent, Arista, Brocade, Cisco, Cumulus, Extreme, F5, Hewlett-Packard, Hitachi, Huawei, IBM, Juniper, Mellanox, NEC, ZTE). In addition, standardizing metrics allows measurements to be shared among different tools, further reducing operational complexity.


The talk provides additional background on the sFlow standard and case studies. The remainder of this article describes how to use Host sFlow to monitor a Docker server pool.

First, download, compile and install the Host sFlow agent on a Docker host (Note: The agent needs to be built from sources since Docker support is currently in the development branch):
svn checkout http://svn.code.sf.net/p/host-sflow/code/trunk host-sflow-code
cd host-sflow-code
make DOCKER=yes
make install
make schedule
service hsflowd start
Next, if SELinux is enabled, run the following commands to allow Host sFlow to retrieve network stats (or disable SELinux):
audit2allow -a -M hsflowd
semodule -i hsflowd.pp
See Installing Host sFlow on a Linux server for additional information on configuring the agent.


The slide presentation describes how Docker can be used with Open vSwitch to create virtual networks connecting containers. In addition to providing advanced SDN capabilities, Open vSwitch includes sFlow instrumentation, providing detailed visibility into network traffic between containers and to the outside network.

The Host sFlow agent makes it easy to enable sFlow on Open vSwitch. Simply enable the sflowovsd daemon and the Host sFlow configuration settings will be automatically applied to the Open vSwitch.
service sflowovsd start
There are a number of tools that consume and report on sFlow data, and these should be able to report on Docker without modification since the metrics being reported are the same standard set reported for virtual machines; several examples are described elsewhere on this blog.
Looking at the big picture, the comprehensive visibility of sFlow combined with the agility of SDN and Docker lays the foundation for optimized workload placement, resource allocation, and scaling by the orchestration system, maximizing the utility of the physical network, storage and compute infrastructure.

Tuesday, June 24, 2014

Microsoft Office 365 outage

6/24/2014 Information Week - Microsoft Exchange Online Suffers Service Outage, "Service disruptions with Microsoft's Exchange Online left many companies with no email on Tuesday."

The following entry on the Office 365 community forum describes the incident:
====================================

Closure Summary: On Tuesday, June 24, 2014, at approximately 1:11 PM UTC, engineers received reports of an issue in which some customers were unable to access the Exchange Online service. Investigation determined that a portion of the networking infrastructure entered into a degraded state. Engineers made configuration changes on the affected capacity to remediate end-user impact. The issue was successfully fixed on Tuesday, June 24, 2014, at 9:50 PM UTC.

Customer Impact: Affected customers were unable to access the Exchange Online service.

Incident Start Time: Tuesday, June 24, 2014, at 1:11 PM UTC

Incident End Time: Tuesday, June 24, 2014, at 9:50 PM UTC

=====================================
The closure summary shows that operators took 8 hours and 39 minutes to manually diagnose and remediate the problem with degraded networking infrastructure. The network related outage described in this example is not an isolated incident; other incidents described on this blog include: Packet loss, Amazon EC2 outage, Gmail outage, Delay vs utilization for adaptive control, and Multi-tenant performance isolation.

The incidents demonstrate two important points:
  1. Cloud services are critically dependent on the physical network
  2. Manually diagnosing problems in large scale networks is a time consuming process that results in extended service outages.
The article, SDN fabric controller for commodity data center switches, describes how the performance and resilience of the physical core can be enhanced through automation. The SDN fabric controller leverages the measurement and control capabilities of commodity switches to rapidly detect and adapt to changing traffic, reducing response times from hours to seconds.

Monday, June 9, 2014

RESTful control of Cumulus Linux ACLs

Figure 1: Elephants and Mice
Elephant Detection in Virtual Switches & Mitigation in Hardware discusses a VMware and Cumulus demonstration, Elephants and Mice, in which the virtual switch on a host detects and marks large "Elephant" flows and the hardware switch enforces priority queueing to prevent Elephant flows from adversely affecting latency of small "Mice" flows.

This article demonstrates a self-contained, real-time Elephant flow marking solution that leverages the visibility and control features of Cumulus Linux.

SDN fabric controller for commodity data center switches provides some background on the capabilities of the commodity switch hardware used to run Cumulus Linux. The article describes how the measurement and control capabilities of the hardware can be used to maximize data center fabric performance.
Exposing the ACL configuration files through a RESTful API offers a straightforward method of remotely creating, reading, updating, deleting and listing ACLs.

For example, the following command creates a filter called ddos1 to drop a DNS amplification attack:
curl -H "Content-Type:application/json" -X PUT --data \
'["[iptables]",\
"-A FORWARD --in-interface swp+ -d 10.10.100.10 -p udp --sport 53 -j DROP"]' \
http://10.0.0.233:8080/acl/ddos1
The filter can be retrieved:
curl http://10.0.0.233:8080/acl/ddos1
The following command lists the filter names:
curl http://10.0.0.233:8080/acl/
The filter can be deleted:
curl -X DELETE http://10.0.0.233:8080/acl/ddos1
Finally, all filters can be deleted:
curl -X DELETE http://10.0.0.233:8080/acl/
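The same operations are easy to drive from a script. The following minimal Python sketch exercises the API using only the standard library; the switch address (10.0.0.233) and filter name (ddos1) are the ones used in the curl examples above:
#!/usr/bin/env python
# Minimal client sketch for the ACL REST service shown below
import json
import urllib2

base = 'http://10.0.0.233:8080/acl/'

def put_acl(name, lines):
  # create or replace a named ACL
  req = urllib2.Request(base + name, json.dumps(lines),
                        {'Content-Type': 'application/json'})
  req.get_method = lambda: 'PUT'
  urllib2.urlopen(req).close()

def get_acl(name=''):
  # retrieve a named ACL, or the list of ACL names if name is empty
  return json.loads(urllib2.urlopen(base + name).read())

def delete_acl(name=''):
  # delete a named ACL, or all REST managed ACLs if name is empty
  req = urllib2.Request(base + name)
  req.get_method = lambda: 'DELETE'
  urllib2.urlopen(req).close()

if __name__ == '__main__':
  put_acl('ddos1', ['[iptables]',
    '-A FORWARD --in-interface swp+ -d 10.10.100.10 -p udp --sport 53 -j DROP'])
  print get_acl('ddos1')  # rules in the filter
  print get_acl()         # names of installed filters
  delete_acl('ddos1')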
Running the following Python script on the Cumulus switches provides a simple proof of concept implementation of the REST API:
#!/usr/bin/env python

from BaseHTTPServer import BaseHTTPRequestHandler,HTTPServer
from os import listdir,remove
from os.path import isfile
from json import dumps,loads
from subprocess import Popen,STDOUT,PIPE
import re

# Request handler exposing the Cumulus ACL policy files as REST resources under /acl/
class ACLRequestHandler(BaseHTTPRequestHandler):
  uripat = re.compile('^/acl/([a-z0-9]+)$')
  dir = '/etc/cumulus/acl/policy.d/'
  priority = '50'
  prefix = 'rest-'
  suffix = '.rules'
  filepat = re.compile('^'+priority+prefix+'([a-z0-9]+)\\'+suffix+'$')

  def commit(self):
    # apply the ACL policy files to the switch hardware
    Popen(["cl-acltool","-i"],stderr=STDOUT,stdout=PIPE).communicate()[0]

  def aclfile(self,name):
    return self.dir+self.priority+self.prefix+name+self.suffix

  def wheaders(self,status):
    self.send_response(status)
    self.send_header('Content-Type','application/json')
    self.end_headers() 

  # PUT /acl/<name> creates or replaces an ACL
  def do_PUT(self):
    m = self.uripat.match(self.path)
    if None != m:
       name = m.group(1)
       len = int(self.headers.getheader('content-length'))
       data = self.rfile.read(len)
       lines = loads(data)
       fn = self.aclfile(name)
       f = open(fn,'w')
       f.write('\n'.join(lines) + '\n')
       f.close()
       self.commit()
       self.wheaders(201)
    else:
       self.wheaders(404)
 
  # DELETE /acl/<name> removes an ACL, DELETE /acl/ removes all REST managed ACLs
  def do_DELETE(self):
    m = self.uripat.match(self.path)
    if None != m:
       name = m.group(1)
       fn = self.aclfile(name)
       if isfile(fn):
          remove(fn)
          self.commit()
       self.wheaders(204)
    elif '/acl/' == self.path:
       for file in listdir(self.dir):
         m = self.filepat.match(file)
         if None != m:
           remove(self.dir+file)
       self.commit()
       self.wheaders(204)
    else:
       self.wheaders(404)

  # GET /acl/<name> returns an ACL, GET /acl/ lists the installed ACL names
  def do_GET(self):
    m = self.uripat.match(self.path)
    if None != m:
       name = m.group(1)
       fn = self.aclfile(name)
       if isfile(fn):
         result = [];
         with open(fn) as f:
           for line in f:
              result.append(line.rstrip('\n'))
         self.wheaders(200)
         self.wfile.write(dumps(result))
       else:
         self.wheaders(404)
    elif '/acl/' == self.path:
       result = []
       for file in listdir(self.dir):
         m = self.filepat.match(file)
         if None != m:
           name = m.group(1)
           result.append(name)
       self.wheaders(200)
       self.wfile.write(dumps(result))
    else:
       self.wheaders(404)

if __name__ == '__main__':
  server = HTTPServer(('',8080), ACLRequestHandler) 
  server.serve_forever()
Some notes on building a production ready solution:
  1. Add authentication (see the sketch after this list)
  2. Add error handling
  3. Run the script as a daemon
  4. Scalability could be improved by asynchronously committing rules in batches
  5. Latency could be improved through the use of persistent connections (SPDY, WebSocket)
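For example, note 1 could be addressed by adding an HTTP basic authentication check that each handler calls before processing a request. This is only a sketch and the credentials are placeholders:
# Sketch of an HTTP basic authentication check for the request handlers
# (the username and password are placeholders)
from base64 import b64encode

AUTH = 'Basic ' + b64encode('admin:secret')

def authorized(handler):
  # returns True if the request carries the expected credentials,
  # otherwise sends a 401 challenge and returns False
  if handler.headers.getheader('Authorization') == AUTH:
    return True
  handler.send_response(401)
  handler.send_header('WWW-Authenticate', 'Basic realm="acl"')
  handler.end_headers()
  return False
Each do_PUT, do_DELETE and do_GET method would then begin with if not authorized(self): return.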
The following sFlow-RT controller application implements large flow marking using sFlow measurements from the switch and control of ACLs using the REST API:
include('extras/json2.js');

// Define large flow as greater than 100Mbits/sec for 1 second or longer
var bytes_per_second = 100000000/8;
var duration_seconds = 1;

var id = 0;
var controls = {};

setFlow('tcp',
 {keys:'ipsource,ipdestination,tcpsourceport,tcpdestinationport',
  value:'bytes', filter:'direction=ingress', t:duration_seconds}
);

setThreshold('elephant',
 {metric:'tcp', value:bytes_per_second, byFlow:true, timeout:4,
  filter:{ifspeed:[1000000000]}}
);

setEventHandler(function(evt) {
 if(controls[evt.flowKey]) return;

 var rulename = 'mark' + id++;
 var keys = evt.flowKey.split(',');
 var acl = [
'[iptables]',
'# mark Elephant',
'-t mangle -A FORWARD --in-interface swp+ -s ' + keys[0] + ' -d ' + keys[1] 
+ ' -p tcp --sport ' + keys[2] + ' --dport ' + keys[3]
+ ' -j SETQOS --set-dscp 10 --set-cos 5'
 ];
 http('http://'+evt.agent+':8080/acl/'+rulename,
      'put','application/json',JSON.stringify(acl));
 controls[evt.flowKey] = {
   agent:evt.agent,
   dataSource:evt.dataSource,
   rulename:rulename,
   time: (new Date()).getTime()
 };
},['elephant']);

setIntervalHandler(function() {
  for(var flowKey in controls) {
    var ctx = controls[flowKey];
    var val = flowValue(ctx.agent,ctx.dataSource + '.tcp',flowKey);
    if(val < 100) {
      http('http://'+ctx.agent+':8080/acl/'+ctx.rulename,'delete');
      delete controls[flowKey]; 
    }
  }
},5);
The following command line argument loads the script:
-Dscript.file=clmark.js
Some notes on the script:
  1. The 100Mbit/s threshold for large flows was selected because it represents 10% of the bandwidth of the 1 Gigabit access ports on the network
  2. The setFlow filter specifies ingress flows since the goal is to mark flows as they enter the network
  3. The setThreshold filter specifies that thresholds are only applied to 1 Gigabit access ports
  4. The event handler function triggers when new Elephant flows are detected, creating and installing an ACL to mark packets in the flow with a DSCP value of 10 and a CoS value of 5
  5. The interval handler function runs every 5 seconds and removes ACLs for flows that have completed
The iperf tool can be used to generate a sequence of large flows to test the controller:
while true; do iperf -c 10.100.10.152 -i 20 -t 20; sleep 20; done
The following screen capture shows a basic test setup and results:
The screen capture shows a mixture of small "mice" flows and large "elephant" flows generated by a server connected to an edge switch (in this case a Penguin Computing Arctica switch running Cumulus Linux). The graph at the bottom right shows the mixture of unmarked large and small flows arriving at the switch. The sFlow-RT controller receives a stream of sFlow measurements from the switch and detects each elephant flow in real-time, immediately installing an ACL that matches the flow and instructs the switch to mark it by setting the DSCP value. The traffic upstream of the switch is shown in the top right chart, where it can be clearly seen that each elephant flow has been identified and marked, while the mice have been left unmarked.

Thursday, June 5, 2014

Cumulus Networks, sFlow and data center automation

Cumulus Networks and InMon Corp have ported the open source Host sFlow agent to the upcoming Cumulus Linux 2.1 release. The Host sFlow agent already supports Linux, Windows, FreeBSD, Solaris, and AIX operating systems and KVM, Xen, XCP, XenServer, and Hyper-V hypervisors, delivering a standard set of performance metrics from switches, servers, hypervisors, virtual switches, and virtual machines - see Visibility and the software defined data center

The Cumulus Linux platform makes it possible to run the same open source agent on switches, servers, and hypervisors - providing unified end-to-end visibility across the data center. The open networking model that Cumulus is pioneering offers exciting opportunities. Cumulus Linux allows popular open source server orchestration tools to also manage the network, and the combination of real-time, data center wide analytics with orchestration make it possible to create self-optimizing data centers.

Install and configure Host sFlow agent

The following command installs the Host sFlow agent on a Cumulus Linux switch:
sudo apt-get install hsflowd
Note: Network managers may find this command odd since it is usually not possible to install third party software on switch hardware. However, what is even more radical is that Cumulus Linux allows users to download source code and compile it on their switch. Instead of being dependent on the switch vendor to fix a bug or add a feature, users are free to change the source code and contribute the changes back to the community.

The sFlow agent requires very little configuration, automatically monitoring all switch ports using the following default settings:

Link Speed    Sampling Rate    Polling Interval
1 Gbit/s      1-in-1,000       30 seconds
10 Gbit/s     1-in-10,000      30 seconds
40 Gbit/s     1-in-40,000      30 seconds
100 Gbit/s    1-in-100,000     30 seconds

Note: The default settings ensure that large flows (defined as consuming 10% of link bandwidth) are detected within approximately 1 second - see Large flow detection
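A rough back-of-the-envelope check of the 1 second claim, assuming full size 1500 byte packets:
# Expected sFlow samples per second from a flow consuming 10% of a 10G link
link_bps      = 10e9            # 10 Gbit/s link
flow_bps      = 0.1 * link_bps  # large flow defined as 10% of link bandwidth
packet_bytes  = 1500            # assumed average packet size
sampling_rate = 10000           # default 1-in-10,000 for 10 Gbit/s links

packets_per_second = flow_bps / (packet_bytes * 8)
samples_per_second = packets_per_second / sampling_rate
print 'samples/second from the flow: %.1f' % samples_per_second  # roughly 8
Around 8 samples per second is easily enough for an analyzer to recognize the flow within approximately a second.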

Once the Host sFlow agent is installed, there are two alternative configuration mechanisms that can be used to tell the agent where to send the measurements:

1. DNS Service Discovery (DNS-SD)

This is the default configuration mechanism for Host sFlow agents. DNS-SD uses a special type of DNS record (the SRV record) to allow hosts to automatically discover servers. For example, adding the following line to the site DNS zone file will enable sFlow on all the agents and direct the sFlow measurements to an sFlow analyzer (10.0.0.1):
_sflow._udp 300 SRV 0 0 6343 10.0.0.1
No Host sFlow agent specific configuration is required; each switch or host will automatically pick up the settings when the Host sFlow agent is installed, when the device is restarted, or if settings on the DNS server are changed.

Default sampling rates and polling interval can be overridden by adding a TXT record to the zone file. For example, the following TXT record reduces the sampling rate on 10G links to 1-in-2000 and the polling interval to 20 seconds:
_sflow._udp 300 TXT (
"txtvers=1"
"sampling.10G=2000"
"polling=20"
)
Note: Currently defined TXT options are described on sFlow.org.
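If the records are in place they can be checked from any host. The following sketch uses the dnspython package (example.com stands in for the site's zone):
# Query the sFlow DNS-SD records (requires the dnspython package)
import dns.resolver

for rr in dns.resolver.query('_sflow._udp.example.com', 'SRV'):
  print 'collector %s port %d' % (rr.target, rr.port)

for rr in dns.resolver.query('_sflow._udp.example.com', 'TXT'):
  print ' '.join(rr.strings)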

The article DNS-SD describes how DNS service discovery allows sFlow agents to automatically discover their configuration settings. The slides DNS Service Discovery from a talk at the SF Bay Area Large Scale Production Engineering Meetup provide additional background.

2. Configuration File

The Host sFlow agent is configured by editing the /etc/hsflowd.conf file. For example, the following configuration disables DNS-SD, instructs the agent to send sFlow to 10.0.0.1, reduces the sampling rate on 10G links to 1-in-2000 and the polling interval to 20 seconds:
sflow {
  DNSSD = off

  polling = 20
  sampling.10G = 2000
  collector {
    ip = 10.0.0.1
  }
}
The Host sFlow agent must be restarted for configuration changes to take effect:
sudo /etc/init.d/hsflowd restart
All hosts and switches can share the same settings and it is straightforward to use orchestration tools such as Puppet, Chef, etc. to manage the sFlow settings.

Collecting and analyzing sFlow

Figure 1: Visibility and the software defined data center
Figure 1 shows the general architecture of sFlow monitoring. Standard sFlow agents embedded within the elements of the infrastructure stream essential performance metrics to management tools, ensuring that every resource in a dynamic cloud infrastructure is immediately detected and continuously monitored.

  • Applications -  e.g. Apache, NGINX, Tomcat, Memcache, HAProxy, F5, A10 ...
  • Virtual Servers - e.g. Xen, Hyper-V, KVM ...
  • Virtual Network - e.g. Open vSwitch, Hyper-V extensible vSwitch
  • Servers - e.g. BSD, Linux, Solaris and Windows
  • Network - over 40 switch vendors, see Drivers for growth

The sFlow data from a Cumulus switch contains standard Linux performance statistics in addition to the interface counters and packet samples that you would typically get from a networking device.

Note: Enhanced visibility into host performance is important on open switch platforms since they may be running a number of user installed services that can stress the limited CPU, memory and IO resources.

For example, the following sflowtool output shows the raw data contained in an sFlow datagram from a switch running Cumulus Linux:
startDatagram =================================
datagramSourceIP 10.0.0.160
datagramSize 1332
unixSecondsUTC 1402004767
datagramVersion 5
agentSubId 100000
agent 10.0.0.233
packetSequenceNo 340132
sysUpTime 17479000
samplesInPacket 7
startSample ----------------------
sampleType_tag 0:2
sampleType COUNTERSSAMPLE
sampleSequenceNo 876
sourceId 2:1
counterBlock_tag 0:2001
adaptor_0_ifIndex 2
adaptor_0_MACs 1
adaptor_0_MAC_0 6c641a000459
counterBlock_tag 0:2005
disk_total 0
disk_free 0
disk_partition_max_used 0.00
disk_reads 980
disk_bytes_read 4014080
disk_read_time 1501
disk_writes 0
disk_bytes_written 0
disk_write_time 0
counterBlock_tag 0:2004
mem_total 2056589312
mem_free 1100533760
mem_shared 0
mem_buffers 33464320
mem_cached 807546880
swap_total 0
swap_free 0
page_in 35947
page_out 0
swap_in 0
swap_out 0
counterBlock_tag 0:2003
cpu_load_one 0.390
cpu_load_five 0.440
cpu_load_fifteen 0.430
cpu_proc_run 1
cpu_proc_total 95
cpu_num 2
cpu_speed 0
cpu_uptime 770774
cpu_user 160600160
cpu_nice 192970
cpu_system 77855100
cpu_idle 1302586110
cpu_wio 4650
cpuintr 0
cpu_sintr 308370
cpuinterrupts 1851322098
cpu_contexts 800650455
counterBlock_tag 0:2006
nio_bytes_in 405248572711
nio_pkts_in 394079084
nio_errs_in 0
nio_drops_in 0
nio_bytes_out 406139719695
nio_pkts_out 394667262
nio_errs_out 0
nio_drops_out 0
counterBlock_tag 0:2000
hostname cumulus
UUID fd-01-78-45-93-93-42-03-a0-5a-a3-d7-42-ac-3c-de
machine_type 7
os_name 2
os_release 3.2.46-1+deb7u1+cl2+1
endSample   ----------------------
startSample ----------------------
sampleType_tag 0:2
sampleType COUNTERSSAMPLE
sampleSequenceNo 876
sourceId 0:44
counterBlock_tag 0:1005
ifName swp42
counterBlock_tag 0:1
ifIndex 44
networkType 6
ifSpeed 0
ifDirection 2
ifStatus 0
ifInOctets 0
ifInUcastPkts 0
ifInMulticastPkts 0
ifInBroadcastPkts 0
ifInDiscards 0
ifInErrors 0
ifInUnknownProtos 4294967295
ifOutOctets 0
ifOutUcastPkts 0
ifOutMulticastPkts 0
ifOutBroadcastPkts 0
ifOutDiscards 0
ifOutErrors 0
ifPromiscuousMode 0
endSample   ----------------------
startSample ----------------------
sampleType_tag 0:1
sampleType FLOWSAMPLE
sampleSequenceNo 1022129
sourceId 0:7
meanSkipCount 128
samplePool 130832512
dropEvents 0
inputPort 7
outputPort 10
flowBlock_tag 0:1
flowSampleType HEADER
headerProtocol 1
sampledPacketSize 1518
strippedBytes 4
headerLen 128
headerBytes 6C-64-1A-00-04-5E-E8-E7-32-77-E2-B5-08-00-45-00-05-DC-63-06-40-00-40-06-9E-21-0A-64-0A-97-0A-64-14-96-9A-6D-13-89-4A-0C-4A-42-EA-3C-14-B5-80-10-00-2E-AB-45-00-00-01-01-08-0A-5D-B2-EB-A5-15-ED-48-B7-34-35-36-37-38-39-30-31-32-33-34-35-36-37-38-39-30-31-32-33-34-35-36-37-38-39-30-31-32-33-34-35-36-37-38-39-30-31-32-33-34-35-36-37-38-39-30-31-32-33-34-35-36-37-38-39-30-31-32-33-34-35
dstMAC 6c641a00045e
srcMAC e8e73277e2b5
IPSize 1500
ip.tot_len 1500
srcIP 10.100.10.151
dstIP 10.100.20.150
IPProtocol 6
IPTOS 0
IPTTL 64
TCPSrcPort 39533
TCPDstPort 5001
TCPFlags 16
endSample   ----------------------
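The text output is easy to post-process. For example, the following sketch pipes sflowtool and prints the 1 minute load average reported in each counter sample (it assumes sflowtool is installed on the collector and relies on the key value format shown above):
#!/usr/bin/env python
# Print the 1 minute load average reported by each sFlow agent
from subprocess import Popen, PIPE

proc = Popen(['sflowtool'], stdout=PIPE)
agent = None
for line in iter(proc.stdout.readline, ''):
  key, _, value = line.rstrip('\n').partition(' ')
  if key == 'agent':
    agent = value
  elif key == 'cpu_load_one':
    print '%s cpu_load_one=%s' % (agent, value)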
While sflowtool is extremely useful, there are many other open source and commercial sFlow analysis tools available.
Note: The sFlow Collectors list on sFlow.org contains a number of additional tools.

There is a great deal of variety among sFlow collectors - many focus on the network, others have a compute infrastructure focus, and yet others report on application performance. The shared sFlow measurement infrastructure delivers value in each of these areas. However, as network, storage, host and application resources are brought together and automated to create cloud data centers, a new set of sFlow analytics tools is emerging to deliver the integrated real-time visibility required to drive automation and optimize performance and efficiency across the data center.
While network administrators are likely to be familiar with sFlow, application development and operations teams may be unfamiliar with the technology. The 2012 O'Reilly Velocity conference talk provides an introduction to sFlow aimed at the DevOps community.
Cumulus Linux presents the switch as a server with a large number of network adapters, an abstraction that will be instantly familiar to anyone with server management experience. For example, displaying interface information on Cumulus Linux uses the standard Linux command:
ifconfig swp2
On the other hand, network administrators experienced with switch CLIs may find that Linux commands take a little time to get used to - the above command is roughly equivalent to:
show interfaces fastEthernet 6/1
However, the basic concepts of networking don't change and these skills are essential to designing, automating, operating and troubleshooting data center networks. Open networking platforms such as Cumulus Linux are an important piece of the automation puzzle, taking networking out of its silo and allowing a combined NetDevOps team to manage network, server, and application resources using proven monitoring and orchestration tools such as Ganglia, Graphite, Nagios, CFEngine, Puppet, Chef, Ansible, and Salt.

Saturday, May 31, 2014

SDN fabric controller for commodity data center switches

Figure 1: Rise of merchant silicon
Figure 1 illustrates the rapid transition to merchant silicon among leading data center network vendors, including: Alcatel-Lucent, Arista, Cisco, Cumulus, Dell, Extreme, Juniper, Hewlett-Packard, and IBM.

This article will examine some of the factors leading to commoditization of network hardware and the role that software defined networking (SDN) plays in coordinating hardware resources to deliver increased network efficiency.
Figure 2: Fabric: A Retrospective on Evolving SDN
The article, Fabric: A Retrospective on Evolving SDN by Martin Casado, Teemu Koponen, Scott Shenker, and Amin Tootoonchian, makes the case for a two tier SDN architecture; comprising a smart edge and an efficient core.
Table 1: Edge vs Fabric Functionality
Virtualization and advances in the networking capability of x86 based servers are drivers behind this separation. Virtual machines are connected to each other and to the physical network using a software virtual switch. The software switch provides the flexibility to quickly develop and deploy advanced features like network virtualization, tenant isolation, distributed firewalls, etc. Network function virtualization (NFV) is moving functions such as firewalls, load balancing, and routing from dedicated appliances to virtual machines, or embedding them within the virtual switches. The increased importance of network centric software has driven dramatic improvements in the performance of commodity x86 based servers, reducing the need for complex hardware functions in network devices.

As complex functions shift to software running on servers at the network edge, the role of the core physical network is simplified. Merchant silicon provides a cost effective way of delivering the high performance forwarding capabilities needed to interconnect servers and Figure 1 shows how Broadcom based switches are now dominating the market.

The Broadcom white paper, Engineered Elephant Flows for Boosting Application Performance in Large-Scale CLOS Networks, describes the challenge posed by large "Elephant" flows and the opportunity to use software defined networking to orchestrate hardware resources and improve network efficiency.
Figure 3: Feedback controller
Figure 3 shows the elements of an SDN feedback controller. Network measurements are analyzed to identify network hot spots, available resources, and large flows. The controller then plans a response and deploys controls in order to allocate resources where they are needed and reduce contention. The control system operates as a continuous loop: the effect of the changes is observed by the measurement system and further changes are made as needed.

Implementing the controller requires an understanding of the measurement and control capabilities of the Broadcom ASICs.

Control Protocol

Figure 4: Programming Pipeline for ECMP
The Broadcom white paper focuses on the ASIC architecture and control mechanisms and includes the functional diagram shown in Figure 4. The paper describes two distinct configuration tasks:
  1. Programming the Routing Flow Table and ECMP Select Groups to perform equal cost multi-path forwarding of the majority of flows.
  2. Programming the ACL Policy Flow Table to selectively override forwarding decisions for the relatively small number of Elephant flows responsible for the bulk of the traffic on the network.
Managing the Routing and ECMP Group tables is well understood and there are a variety of solutions available that can be used to configure ECMP forwarding:
  1. CLI — Use switch CLI to configure distributed routing agents running on each switch (e.g. OSPF, BGP, etc.)
  2. Configuration Protocol — Similar to 1, but programmatic configuration protocols such as NETCONF or JSON-RPC replace the CLI.
  3. Server orchestration — Open Linux based switch platforms allow server management agents to be installed on the switches to manage configuration. For example, Cumulus Linux supports Puppet, Chef, CFEngine, etc.
  4. OpenFlow — The white paper describes using the Ryu controller to calculate routes and update the forwarding and group tables using OpenFlow 1.3+ to communicate with Indigo OpenFlow agents on the switches.  
The end result is very similar whatever method is chosen to populate the Routing and ECMP Group tables - the hardware forwards packets across multiple paths based on a hash function calculated over selected fields in the packets (e.g. source and destination IP addresses + source and destination TCP ports), e.g.
index = hash(packet fields) % group.size
selected_physical_port = group[index]
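The following toy sketch illustrates the idea, using CRC32 in place of the ASIC's hash function (the member port names are hypothetical):
# Toy illustration of hash based ECMP port selection (not the ASIC's actual hash)
import zlib

group = ['swp1', 'swp2', 'swp3', 'swp4']  # equal cost member ports

def select_port(src_ip, dst_ip, src_port, dst_port):
  key = '%s|%s|%d|%d' % (src_ip, dst_ip, src_port, dst_port)
  return group[zlib.crc32(key) % len(group)]

# every packet in a flow hashes to the same port, preserving packet order,
# but two elephant flows can easily be assigned to the same member port
print select_port('10.100.10.151', '10.100.20.150', 39533, 5001)
print select_port('10.100.10.152', '10.100.20.150', 49240, 5001)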
Hash based load balancing works well for the large numbers of small flows "Mice" on the network, but is less suitable for the long lived large "Elephant" flows. The hash function may assign multiple Elephant flows to the same physical port (even if other ports in the group are idle), resulting in congestion and poor network performance.
Figure 5: Long vs Short flows (from The Nature of Datacenter Traffic: Measurements & Analysis)
The traffic engineering controller uses the ACL Policy Flow Table to manage Elephant flows, ensuring that they don't interfere with latency sensitive Mice and are evenly distributed across the available paths - see Marking large flows and ECMP load balancing.
Figure 6: Hybrid Programmable Forwarding Plane, David Ward, ONF Summit, 2011
Integrated hybrid OpenFlow 1.0 is an effective mechanism for exposing the ACL Policy Flow Table to an external controller:
  • Simple, no change to normal forwarding behavior, can be combined with any of the mechanisms used to manage the Routing and ECMP Group tables listed above.
  • Efficient, the Routing and ECMP Group tables efficiently handle most flows. OpenFlow is used to control the ACL Policy Flow Table and selectively override forwarding of specific flows (block, mark, steer, rate-limit), maximizing the effectiveness of the limited number of entries available.
  • Scalable, most flows are handled by the existing control plane, OpenFlow is only used when the controller wants to make an exception.
  • Robust, if the controller fails the network keeps forwarding.
The control protocol is only half the story. An effective measurement protocol is needed to rapidly identify network hot spots, available resources, and large flows so that the controller can identify which flows need to be managed and where to apply the controls.

Measurement Protocol

The Broadcom white paper is limited in its discussion of measurement, but it does list four ways of detecting large flows:
  1. A priori
  2. Monitor end host socket buffers
  3. Maintain per flow statistics in network
  4. sFlow
The first two methods involve signaling the arrival of large flows to the network from the hosts. Both methods have practical difficulties in that they require that every application and / or host implement the measurements and communicate them to the fabric controller - a difficult challenge in a heterogeneous environment. However, the more fundamental problem is that while both methods can usefully identify the arrival of large flows, they don't provide sufficient information for the fabric controller to take action since it also needs to know the load on all the links in the fabric.

The requirement for end to end visibility can only be met if the instrumentation is built into the network devices, which leads to options 3 and 4. Option 3 would require an entry in the ACL table for each flow and the Broadcom paper points out that this approach does not scale.

The solution to the measurement challenge is option 4. Support for the multi-vendor sFlow protocol is included in Broadcom ASICs, is completely independent of the forwarding tables, and can be enabled on all ports and all switches to provide the end to end visibility needed for effective control.
Figure 7: Custom vs. merchant silicon traffic measurement
Figure 7 compares traffic measurement on legacy custom ASIC based switches with standard sFlow measurements supported by merchant silicon vendors. The custom ASIC based switch, shown on top, performs many of the traffic flow analysis functions in hardware. In contrast, merchant silicon based switches shift flow analysis to external software, implementing only the essential measurement functions required for wire speed performance in silicon.

Figure 7 lists a number of benefits that result from moving flow analysis from the custom ASIC to external software, but in the context of large flow traffic engineering the real-time detection of flows made possible by an external flow cache is essential if the traffic engineering controller is to be effective - see Rapidly detecting large flows, sFlow vs. NetFlow/IPFIX
Figure 8: sFlow-RT feedback controller
Figure 8 shows a fully instantiated SDN feedback controller. The sFlow-RT controller leverages the sFlow and OpenFlow standards to optimize the performance of fabrics built using commodity switches; a number of practical applications for the sFlow-RT controller have already been demonstrated on this blog.
While the industry at large appears to be moving to the Edge / Fabric architecture shown in Figure 2, Cisco's Application Centric Infrastructure (ACI) is an anomaly. ACI is a tightly integrated proprietary solution; the Cisco Application Policy Infrastructure Controller (APIC) uses the Cisco OpFlex protocol to manage Cisco Nexus 9000 switches and Cisco AVS virtual switches. For example, the Cisco Nexus 9000 switches are based on Broadcom silicon and provide an interoperable NX-OS mode. However, line cards that include an Application Leaf Engine (ALE) ASIC along with the Broadcom ASIC are required to support ACI mode. The ALE provides visibility and control features for large flow load balancing and prioritization - both of which can be achieved using standard protocols to manage the capabilities of the Broadcom ASIC.

It will be interesting to see whether ACI is able to compete with modular, low cost, solutions based on open standards and commodity hardware. Cisco has offered its customers a choice and given the compelling value of open platforms I expect many will choose not to be locked into the proprietary ACI solution and will favor NX-OS mode on the Nexus 9000 series, pushing Cisco to provide the full set of open APIs currently available on the Nexus 3000 series (sFlow, OpenFlow, Puppet, Python etc.).
Figure 9: Move communicating virtual machines together to reduce network traffic (from NUMA)
Finally, SDN is only one piece of a larger effort to orchestrate network, compute and storage resources to create a software defined data center (SDDC). For example, Figure 9 shows how network analytics from the fabric controller can be used to move virtual machines (e.g. by integrating with OpenStack APIs) to reduce application response times and network traffic. More broadly, feedback control allows efficient matching of resources to workloads and can dramatically increase the efficiency of the data center - see Workload placement.

Tuesday, May 13, 2014

Load balancing large flows on multi-path networks

Figure 1: Active control of large flows in a multi-path topology
Figure 1 shows initial results from the Mininet integrated hybrid OpenFlow testbed demonstrating that active steering of large flows using a performance aware SDN controller significantly improves network throughput of multi-path network topologies.
Figure 2: Two path topology
The graph in Figure 1 summarizes results from topologies with 2, 3 and 4 equal cost paths. For example, the Mininet topology in Figure 2 has two equal cost paths of 10Mbit/s (shown in blue and red). The iperf traffic generator was used to create a continuous stream of 20 second flows from h1 to h3 and from h2 to h4. If traffic were perfectly balanced, each flow would achieve 10Mbit/s throughput. However, Figure 1 shows that the throughput obtained using hash based ECMP load balancing is approximately 6.8Mbit/s. Interestingly, the average link throughput decreases as additional paths are added, dropping to approximately 6.2Mbit/s with four equal cost paths (see the blue bars in Figure 1).

To ensure that packets in a flow arrive in order at their destination, switch s3 computes a hash function over selected fields in the packets (e.g. source and destination IP addresses + source and destination TCP ports) and picks a link based on the value of the hash, e.g.
index = hash(packet fields) % linkgroup.size
selected_link = linkgroup[index]
The drop in throughput occurs when two or more large flows are assigned to the same link by the hash function and must compete for bandwidth.
Figure 3: Performance optimizing hybrid OpenFlow controller
Performance optimizing hybrid OpenFlow controller describes how the sFlow and OpenFlow standards can be combined to provide analytics driven feedback control to automatically adapt resources to changing demand. In this example, the controller has been programmed to detect large flows arriving on busy links and steer them to a less congested alternative path. The results shown in Figure 1 demonstrate that actively steering the large flows increases average link throughput by between 17% and 20% (see the red bars).
These results were obtained using a very simple initial control scheme and there is plenty of scope for further improvement; the results from these experiments suggest that a 50-60% increase in throughput over hash based ECMP load balancing is theoretically possible.
This solution easily scales to 10G data center fabrics. Support for the sFlow standard is included in most vendors' switches (Alcatel-Lucent, Arista, Brocade, Cisco, Dell, Extreme, HP, Huawei, IBM, Juniper, Mellanox, ZTE, etc.), providing data center wide visibility - see Drivers for growth. Combined with the increasing maturity of and vendor support for the OpenFlow standard, this provides the real-time control of packet forwarding needed to adapt the network to changing traffic. Finally, flow steering is one of a number of techniques that combine to amplify performance gains delivered by the controller; other techniques include: large flow marking, DDoS mitigation, and workload placement.

Wednesday, April 23, 2014

Mininet integrated hybrid OpenFlow testbed

Figure 1: Hybrid Programmable Forwarding Planes
Integrated hybrid OpenFlow combines OpenFlow and existing distributed routing protocols to deliver robust software defined networking (SDN) solutions. Performance optimizing hybrid OpenFlow controller describes how the sFlow and OpenFlow standards combine to deliver visibility and control to address challenges including: DDoS mitigation, ECMP load balancing, LAG load balancing, and large flow marking.

A number of vendors support sFlow and integrated hybrid OpenFlow today, examples described on this blog include: Alcatel-Lucent, Brocade, and Hewlett-Packard. However, building a physical testbed is expensive and time consuming. This article describes how to build an sFlow and hybrid OpenFlow testbed using free Mininet network emulation software. The testbed emulates ECMP leaf and spine data center fabrics and provides a platform for experimenting with analytics driven feedback control using the sFlow-RT hybrid OpenFlow controller.

First, build an Ubuntu 13.04 / 13.10 virtual machine, then follow the instructions for installing Mininet - Option 3: Installation from Packages.

Next, install an Apache web server:
sudo apt-get install apache2
Install the sFlow-RT integrated hybrid OpenFlow controller, either on the Mininet virtual machine, or on a different system (Java 1.6+ is required to run sFlow-RT):
wget http://www.inmon.com/products/sFlow-RT/sflow-rt.tar.gz
tar -xvzf sflow-rt.tar.gz
Copy the leafandspine.py script from the sflow-rt/extras directory to the Mininet virtual machine.

The following options are available:
./leafandspine.py --help
Usage: leafandspine.py [options]

Options:
  -h, --help            show this help message and exit
  --spine=SPINE         number of spine switches, default=2
  --leaf=LEAF           number of leaf switches, default=2
  --fanout=FANOUT       number of hosts per leaf switch, default=2
  --collector=COLLECTOR
                        IP address of sFlow collector, default=127.0.0.1
  --controller=CONTROLLER
                        IP address of controller, default=127.0.0.1
  --topofile=TOPOFILE   file used to write out topology, default topology.txt
Figure 2 shows a simple leaf and spine topology consisting of four hosts and four switches:
Figure 2: Simple leaf and spine topology
The following command builds the topology and specifies a remote host (10.0.0.162) running sFlow-RT as the hybrid OpenFlow controller and sFlow collector:
sudo ./leafandspine.py --collector 10.0.0.162 --controller 10.0.0.162 --topofile /var/www/topology.json
Note: All the links are configured to 10Mbit/s and the sFlow sampling rate is set to 1-in-10. These settings are equivalent to a 10Gbit/s network with a 1-in-10,000 sampling rate - see Large flow detection.

The network topology is written to /var/www/topology.json, making it accessible through HTTP. For example, the following command retrieves the topology from the Mininet VM (10.0.0.61):
curl http://10.0.0.61/topology.json
{"nodes": {"s3": {"ports": {"s3-eth4": {"ifindex": "392", "name": "s3-eth4"}, "s3-eth3": {"ifindex": "390", "name": "s3-eth3"}, "s3-eth2": {"ifindex": "402", "name": "s3-eth2"}, "s3-eth1": {"ifindex": "398", "name": "s3-eth1"}}, "tag": "edge", "name": "s3", "agent": "10.0.0.61", "dpid": "0000000000000003"}, "s2": {"ports": {"s2-eth1": {"ifindex": "403", "name": "s2-eth1"}, "s2-eth2": {"ifindex": "405", "name": "s2-eth2"}}, "name": "s2", "agent": "10.0.0.61", "dpid": "0000000000000002"}, "s1": {"ports": {"s1-eth1": {"ifindex": "399", "name": "s1-eth1"}, "s1-eth2": {"ifindex": "401", "name": "s1-eth2"}}, "name": "s1", "agent": "10.0.0.61", "dpid": "0000000000000001"}, "s4": {"ports": {"s4-eth2": {"ifindex": "404", "name": "s4-eth2"}, "s4-eth3": {"ifindex": "394", "name": "s4-eth3"}, "s4-eth1": {"ifindex": "400", "name": "s4-eth1"}, "s4-eth4": {"ifindex": "396", "name": "s4-eth4"}}, "tag": "edge", "name": "s4", "agent": "10.0.0.61", "dpid": "0000000000000004"}}, "links": {"s2-eth1": {"ifindex1": "403", "ifindex2": "402", "node1": "s2", "node2": "s3", "port2": "s3-eth2", "port1": "s2-eth1"}, "s2-eth2": {"ifindex1": "405", "ifindex2": "404", "node1": "s2", "node2": "s4", "port2": "s4-eth2", "port1": "s2-eth2"}, "s1-eth1": {"ifindex1": "399", "ifindex2": "398", "node1": "s1", "node2": "s3", "port2": "s3-eth1", "port1": "s1-eth1"}, "s1-eth2": {"ifindex1": "401", "ifindex2": "400", "node1": "s1", "node2": "s4", "port2": "s4-eth1", "port1": "s1-eth2"}}}
Don't start sFlow-RT yet; it should only be started after Mininet has finished building the topology.

Verify connectivity before starting sFlow-RT:
mininet> pingall
*** Ping: testing ping reachability
h1 -> h2 h3 h4 
h2 -> h1 h3 h4 
h3 -> h1 h2 h4 
h4 -> h1 h2 h3 
*** Results: 0% dropped (12/12 received)
This test demonstrates that the Mininet topology has been constructed with a set of default forwarding rules that provide connectivity without the need for an OpenFlow controller - emulating the behavior of a network of integrated hybrid OpenFlow switches.

The following sFlow-RT script ecmp.js demonstrates ECMP load balancing in the emulated network:
include('extras/json2.js');

// Define large flow as greater than 1Mbits/sec for 1 second or longer
var bytes_per_second = 1000000/8;
var duration_seconds = 1;

var top = JSON.parse(http("http://10.0.0.61/topology.json"));
setTopology(top);

setFlow('tcp',
 {keys:'ipsource,ipdestination,tcpsourceport,tcpdestinationport',
  value:'bytes', t:duration_seconds}
);

setThreshold('elephant',
 {metric:'tcp', value:bytes_per_second, byFlow:true, timeout:2}
);

setEventHandler(function(evt) {
 var rec = topologyInterfaceToLink(evt.agent,evt.dataSource);
 if(!rec || !rec.link) return;
 var link = topologyLink(rec.link);
 logInfo(link.node1 + "-" + link.node2 + " " + evt.flowKey);
},['elephant']);
Modify the sFlow-RT start.sh script to include the following arguments:
RT_OPTS="-Dopenflow.controller.start=yes -Dopenflow.controller.flushRules=no"
SCRIPTS="-Dscript.file=ecmp.js"
Some notes on the script:
  1. The topology is retrieved by making an HTTP request to the Mininet VM (10.0.0.61)
  2. The 1Mbits/s threshold for large flows was selected because it represents 10% of the bandwidth of the 10Mbits/s links in the emulated network
  3. The event handler prints the link the flow traversed - identifying the link by the pair of switches it connects
Start sFlow-RT:
./start.sh
Now generate some large flows between h1 and h3 using the Mininet iperf command:
mininet> iperf h1 h3
*** Iperf: testing TCP bandwidth between h1 and h3
*** Results: ['9.58 Mbits/sec', '10.8 Mbits/sec']
mininet> iperf h1 h3
*** Iperf: testing TCP bandwidth between h1 and h3
*** Results: ['9.58 Mbits/sec', '10.8 Mbits/sec']
mininet> iperf h1 h3
*** Iperf: testing TCP bandwidth between h1 and h3
*** Results: ['9.59 Mbits/sec', '10.3 Mbits/sec']
The following results were logged by sFlow-RT:
2014-04-21T19:00:36-0700 INFO: ecmp.js started
2014-04-21T19:01:16-0700 INFO: s1-s3 10.0.0.1,10.0.1.1,49240,5001
2014-04-21T19:01:16-0700 INFO: s1-s4 10.0.0.1,10.0.1.1,49240,5001
2014-04-21T20:53:19-0700 INFO: s2-s4 10.0.0.1,10.0.1.1,49242,5001
2014-04-21T20:53:19-0700 INFO: s2-s3 10.0.0.1,10.0.1.1,49242,5001
2014-04-21T20:53:29-0700 INFO: s1-s3 10.0.0.1,10.0.1.1,49244,5001
2014-04-21T20:53:29-0700 INFO: s1-s4 10.0.0.1,10.0.1.1,49244,5001
The results demonstrate that the emulated leaf and spine network is performing equal cost multi-path (ECMP) forwarding - different flows between the same pair of hosts take different paths across the fabric (the highlighted lines correspond to the paths shown in Figure 2).
Open vSwitch in Mininet is the key to this emulation, providing sFlow and multi-path forwarding support.
The following script implements the large flow marking example described in Performance optimizing hybrid OpenFlow controller:
include('extras/json2.js');

// Define large flow as greater than 1Mbits/sec for 1 second or longer
var bytes_per_second = 1000000/8;
var duration_seconds = 1;

var idx = 0;

var top = JSON.parse(http("http://10.0.0.61/topology.json"));
setTopology(top);

setFlow('tcp',
 {keys:'ipsource,ipdestination,tcpsourceport,tcpdestinationport',
  value:'bytes', t:duration_seconds}
);

setThreshold('elephant',
 {metric:'tcp', value:bytes_per_second, byFlow:true, timeout:4}
);

setEventHandler(function(evt) {
 var agent = evt.agent;
 var ds = evt.dataSource;
 if(topologyInterfaceToLink(agent,ds)) return;

 var ports = ofInterfaceToPort(agent,ds);
 if(ports && ports.length == 1) {
  var dpid = ports[0].dpid;
  var id = "mark" + idx++;
  var k = evt.flowKey.split(',');
  var rule= {
    priority:1000, idleTimeout:2,
    match:{dl_type:2048, nw_proto:6, nw_src:k[0], nw_dst:k[1],
           tp_src:k[2], tp_dst:k[3]},
    actions:["set_nw_tos=128","output=normal"]
  };
  setOfRule(dpid,id,rule);
 }
},['elephant']);

setFlow('tos0',{value:'bytes',filter:'iptos=00000000',t:1});
setFlow('tos128',{value:'bytes',filter:'iptos=10000000',t:1});
Some notes on the script:
  1. The topologyInterfaceToLink() function looks up link information based on agent and interface. The event handler uses this function to exclude inter-switch links, applying controls to ingress ports only.
  2. The OpenFlow rule priority for rules created by controller scripts must be greater than 500 to override the default rules created by leafandspine.py
  3. The tos0 and tos128 flow definitions have been added so that the re-marking can be seen.
Restart sFlow-RT with the new script and use a web browser to view the default tos0 and the re-marked tos128 traffic.
Figure 3: Marking large flows
Use iperf to generate traffic between h1 and h3 (the traffic needs to cross more than one switch so it can be observed before and after marking). The screen capture in figure 3 demonstrates that the controller immediately detects and marks large flows.
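Instead of (or in addition to) the browser, the tos0 and tos128 flows can be polled programmatically. The following sketch is one possibility; the /activeflows REST path and default HTTP port 8008 are assumptions based on sFlow-RT's documented REST API, and the sFlow-RT address (10.0.0.162) is the one used earlier:
# Poll sFlow-RT for the unmarked (tos0) and re-marked (tos128) traffic rates
import json
import time
import urllib2

rt = 'http://10.0.0.162:8008'
while True:
  for name in ['tos0', 'tos128']:
    url = rt + '/activeflows/ALL/' + name + '/json'
    flows = json.loads(urllib2.urlopen(url).read())
    total = sum(f['value'] for f in flows)  # each entry is expected to carry a value field
    print '%s %.0f bytes/sec' % (name, total)
  time.sleep(5)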