Monday, August 14, 2023

Containerlab dashboard

The GitHub sflow-rt/containerlab project contains example network topologies for the Containerlab network emulation tool that demonstrate real-time streaming telemetry in realistic data center topologies and network configurations. The examples use the same FRRouting (FRR) engine that is part of SONiC, NVIDIA Cumulus Linux, and DENT network operating systems. Containerlab can be used to experiment before deploying solutions into production. Examples include: tracing ECMP flows in leaf and spine topologies, EVPN visibility, and automated DDoS mitigation using BGP Flowspec and RTBH controls.
The screen capture at the top of this article shows a real-time dashboard displaying up to the second traffic analytics gathered from the 5 stage Clos fabric shown above. This article walks through the steps needed to run the example.
git clone https://github.com/sflow-rt/containerlab.git
cd containerlab
./run-clab
Run the above commands to download the project and run Containerlab on a system with Docker installed. Docker Desktop is a convenient way to run the labs on a laptop.
containerlab deploy -t clos5.yml
Start the emulation.
./topo.py clab-clos5
Post the topology to the sFlow-RT REST API. Connect to http://localhost:8008/app/containerlab-dashboard/html/ to access the Dashboard shown at the top of this article.
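As a rough illustration of what posting a topology involves, the sketch below converts Containerlab-style link endpoints into a table of links and prepares a PUT request. The endpoint strings, the /topology/json path, the link names and the helper functions are assumptions made for illustration; the project's topo.py script may work differently.

```python
import json
import urllib.request

def links_from_endpoints(endpoint_pairs):
    """Convert pairs like ("leaf1:eth1", "spine1:eth1") into a link table."""
    links = {}
    for index, (a, b) in enumerate(endpoint_pairs, start=1):
        node1, port1 = a.split(":", 1)
        node2, port2 = b.split(":", 1)
        links[f"L{index}"] = {
            "node1": node1, "port1": port1,
            "node2": node2, "port2": port2,
        }
    return {"links": links}

def topology_request(topology, base_url="http://localhost:8008"):
    """Build (but do not send) a PUT request carrying the topology as JSON."""
    return urllib.request.Request(
        url=f"{base_url}/topology/json",   # assumed REST path
        data=json.dumps(topology).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="PUT",
    )

# Hypothetical links for a small leaf and spine fabric.
topology = links_from_endpoints([
    ("leaf1:eth1", "spine1:eth1"),
    ("leaf1:eth2", "spine2:eth1"),
])
request = topology_request(topology)
# urllib.request.urlopen(request) would send it to a running sFlow-RT instance.
```

Building the request separately from sending it makes it easy to inspect the JSON body before anything is posted.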
docker exec -it clab-clos5-h1 iperf3 -c 172.16.4.2
Each of the hosts in the network has an iperf3 server, so running the above command will test bandwidth between h1 and h4.
docker exec -it clab-clos5-h1 iperf3 -c 2001:172:16:4::2
Generate a large IPv6 flow between h1 and h4. The traffic flows should immediately appear in the Top Flows chart. You can check the accuracy by comparing the values reported by iperf3 with those shown in the chart.
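One way to make the comparison less subjective is to run iperf3 with its -J (JSON output) option and compute the difference from the chart value. The sketch below assumes the report contains an end.sum_received.bits_per_second field, and the numbers are made up for illustration rather than taken from the lab.

```python
import json

def received_bps(iperf3_json_text):
    """Return the receiver-side throughput from `iperf3 -J` output."""
    report = json.loads(iperf3_json_text)
    return report["end"]["sum_received"]["bits_per_second"]

def percent_difference(measured_bps, reference_bps):
    """Signed difference of measured relative to reference, in percent."""
    return 100.0 * (measured_bps - reference_bps) / reference_bps

# Made-up numbers for illustration only, not results from the lab.
sample_report = '{"end": {"sum_received": {"bits_per_second": 2.0e9}}}'
chart_bps = 1.9e9   # value read off the Top Flows chart (hypothetical)
difference = percent_difference(chart_bps, received_bps(sample_report))
```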
Click on the Topology tab to see a real-time weathermap of traffic flowing over the topology. See how repeated iperf3 tests take different ECMP (equal-cost multi-path) routes across the network.
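The path changes between tests because ECMP typically selects a next hop by hashing packet header fields, and each new iperf3 connection normally uses a different source port. The following is a minimal sketch of that idea only: the SHA-256 based hash, the h1 address and the spine names are assumptions, and real switches and the Linux kernel use their own hash functions.

```python
import hashlib

def pick_next_hop(src_ip, dst_ip, protocol, src_port, dst_port, next_hops):
    """Choose a next hop by hashing the flow 5-tuple (illustrative hash)."""
    key = f"{src_ip}|{dst_ip}|{protocol}|{src_port}|{dst_port}".encode()
    digest = hashlib.sha256(key).digest()
    index = int.from_bytes(digest[:4], "big") % len(next_hops)
    return next_hops[index]

spines = ["spine1", "spine2"]   # hypothetical equal-cost next hops
# The same 5-tuple always maps to the same path, so packets stay in order.
first = pick_next_hop("172.16.1.2", "172.16.4.2", 6, 40000, 5201, spines)
again = pick_next_hop("172.16.1.2", "172.16.4.2", 6, 40000, 5201, spines)
# A new test usually gets a new source port, so the chosen path may differ.
paths = {pick_next_hop("172.16.1.2", "172.16.4.2", 6, port, 5201, spines)
         for port in range(40000, 40064)}
```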
docker exec -it clab-clos5-leaf1 vtysh
Linux with open source routing software (FRRouting) is an accessible alternative to vendor routing stacks (no registration / license required, no restriction on copying means you can share images on Docker Hub, no need for virtual machines). FRRouting is popular in production network operating systems (e.g. Cumulus Linux, SONiC, DENT, etc.) and the VTY shell provides an industry standard CLI for configuration, so labs built around FRR allow realistic network configurations to be explored.
Connect to http://localhost:8008/ to access the main sFlow-RT status page, additional applications, and the REST API. See Getting Started for more information.
containerlab destroy -t clos5.yml
When you are finished, run the above command to stop the containers and free the resources associated with the emulation. Try out other topologies from the project to explore topics such as DDoS mitigation, BGP Flowspec, and EVPN.

Moving the monitoring solution from Containerlab to production is straightforward since sFlow is widely implemented in datacenter equipment from vendors including: A10, Arista, Aruba, Cisco, Edge-Core, Extreme, Huawei, Juniper, NEC, Netgear, Nokia, NVIDIA, Quanta, and ZTE. In addition, the open source Host sFlow agent makes it easy to extend visibility beyond the physical network into the compute infrastructure.

Tuesday, August 8, 2023

Grafana Network Weathermap

The screen capture above shows a simple network weathermap, displaying a network topology with links animated by real-time network analytics.
Hovering over a link in the weathermap pops up a trend chart showing traffic on the link over the last 30 minutes.

Deploy real-time network dashboards using Docker compose describes how to quickly deploy a real-time network analytics stack that includes the sFlow-RT analytics engine, Prometheus time series database, and Grafana to create dashboards. This article describes how to extend the example using the Grafana Network Weathermap Plugin to display network topologies like the ones shown here.

First, add a dashboard panel and select the Network Weathermap visualization. Next define the three metrics shown above. The ifinoctets and ifoutoctets metrics need to be scaled by 8 to convert from bytes per second to bits per second. Creating a custom legend entry makes it easier to associate metric instances with weathermap links.
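The arithmetic behind the scaling, and behind coloring links by utilization, can be sketched as follows. The function names and the sample reading are hypothetical; the weathermap plugin performs its own calculation from the metrics and link speed it is given.

```python
def bits_per_second(octets_per_second):
    """Convert a byte rate (e.g. ifinoctets) to a bit rate."""
    return octets_per_second * 8

def utilization_percent(octets_per_second, link_speed_bps):
    """Link utilization as a percentage of the configured link speed."""
    return 100.0 * bits_per_second(octets_per_second) / link_speed_bps

# Hypothetical reading: 25,000,000 bytes/s on a 1 Gbit/s link.
rate = bits_per_second(25_000_000)                     # 200,000,000 bit/s
load = utilization_percent(25_000_000, 1_000_000_000)  # 20.0 percent
```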
Add a color scale that will be used to color links by link utilization. Defining the scale first ensures that links will be displayed correctly when they are added later.
Add the nodes to the canvas and drag them to their desired locations. There is a large library of icons that can be used to indicate the node types. The Enable Node Grid Snapping option makes it easier to line up nodes.
Add links to connect the nodes. Each link needs to be associated with in/out metrics and a link speed. Setting the Side Anchor Point values correctly ensures a clean layout.

Network weathermaps are only one method of displaying network telemetry - work through the examples in Deploy real-time network dashboards using Docker compose to learn how to construct dashboards of trend charts and analyze traffic flows.

Thursday, July 13, 2023

Deploy real-time network dashboards using Docker compose


This article demonstrates how to use docker compose to quickly deploy a real-time network analytics stack that includes the sFlow-RT analytics engine, Prometheus time series database, and Grafana to create dashboards.
git clone https://github.com/sflow-rt/prometheus-grafana.git
cd prometheus-grafana
./start.sh
Download the sflow-rt/prometheus-grafana project from GitHub on a system with Docker installed and start the containers. The start.sh script runs docker compose to bring up the containers specified in the compose.yml file, passing in user information so that the containers have correct permission to write data files in the prometheus and grafana directories.
All the Docker images in this example are available for both x86 and ARM processors, so this stack can be deployed on Intel/AMD platforms as well as Apple M1/M2 or Raspberry Pi. Raspberry Pi 4 real-time network analytics describes how to configure a Raspberry Pi 4 to run Docker and perform real-time network analytics and is a simple way to run this stack for smaller networks.

Configure sFlow Agents in network devices to stream sFlow telemetry to the host running the analytics stack. See Getting Started for information on how to verify that sFlow telemetry is being received.

Connect to the Grafana web interface on port 3000 using the default user name and password (admin/admin). You will be prompted to change the password.
Select the option to Import a new Dashboard.
Enter the code 11201 to import sFlow-RT Network Interfaces dashboard from Grafana.com and click on the Load button.
Select the sflow_rt_data Prometheus database and click on the Import button.
The dashboard should appear showing top interfaces by Utilization, Discards and Errors.
Repeat the steps to add the sFlow-RT Health dashboard, code 11096.

The sFlow-RT Countries and Networks dashboard is an example of a flow based metric, plotting information about source and destination countries and provider networks based on traffic analytics.

Prometheus has already been configured to gather metrics for the previous two examples, but to run this third example, we need to modify the Prometheus configuration to gather the flow based metrics needed for the dashboard.

  - job_name: 'sflow-rt-countries'
    metrics_path: /app/prometheus/scripts/export.js/flows/ALL/txt
    static_configs:
      - targets: ['sflow-rt:8008']
    params:
      metric: ['sflow_country_bps']
      key:
        - 'null:[country:ipsource:both]:unknown'
        - 'null:[country:ipdestination:both]:unknown'
      label: ['src','dst']
      value: ['bytes']
      scale: ['8']
      aggMode: ['sum']
      minValue: ['1000']
      maxFlows: ['100']

  - job_name: 'sflow-rt-asns'
    metrics_path: /app/prometheus/scripts/export.js/flows/ALL/txt
    static_configs:
      - targets: ['sflow-rt:8008']
    params:
      metric: ['sflow_asn_bps']
      key:
        - 'null:[asn:ipsource:both]:unknown'
        - 'null:[asn:ipdestination:both]:unknown'
      label: ['src','dst']
      value: ['bytes']
      scale: ['8']
      aggMode: ['sum']
      minValue: ['1000']
      maxFlows: ['100']
Edit the prometheus/prometheus.yml file and add the above lines to the end of the file.
docker restart prometheus
Restart the prometheus container to pick up the new configuration and start collecting the data.
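Prometheus sends the entries under params as URL query parameters, repeating a key once for each value in its list. The sketch below assembles the URL that the sflow-rt-countries job would request, which can be handy for checking the settings by hand; the helper name is hypothetical and the values are copied from the configuration above.

```python
from urllib.parse import urlencode, urlsplit, parse_qs

def scrape_url(target, metrics_path, params, scheme="http"):
    """Assemble the URL a scrape job with these settings would request."""
    query = urlencode(params, doseq=True)   # list values become repeated keys
    return f"{scheme}://{target}{metrics_path}?{query}"

countries_params = {
    "metric": ["sflow_country_bps"],
    "key": ["null:[country:ipsource:both]:unknown",
            "null:[country:ipdestination:both]:unknown"],
    "label": ["src", "dst"],
    "value": ["bytes"],
    "scale": ["8"],
    "aggMode": ["sum"],
    "minValue": ["1000"],
    "maxFlows": ["100"],
}
url = scrape_url("sflow-rt:8008",
                 "/app/prometheus/scripts/export.js/flows/ALL/txt",
                 countries_params)
decoded = parse_qs(urlsplit(url).query)   # round trip to check the encoding
```

The sflow-rt host name is only likely to resolve inside the compose network, so when trying the URL from the Docker host it would probably need to be replaced with localhost:8008.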
Add dashboard 11146 to load the sFlow-RT Countries and Networks dashboard.

Getting Started describes how to use the sFlow-RT Flow Browser and Metrics Browser applications to explore the data that is available (the sFlow-RT web interface is exposed on port 8008). Once you have found a useful metric, add it to the set of metrics for Prometheus (the Prometheus web interface is exposed on port 9090) to collect and use Grafana to build dashboards that incorporate the new metrics. Flow metrics with Prometheus and Grafana describes how Prometheus can use sFlow-RT's REST API to define and retrieve traffic flow based metrics like the ones in the Countries and Networks dashboard. 

Sunday, June 11, 2023

Raspberry Pi 4 real-time network analytics

CanaKit Raspberry Pi 4 EXTREME Kit - Aluminum
This article describes how to build an inexpensive Raspberry Pi 4 based server for real-time flow analytics of industry standard sFlow streaming telemetry. Support for sFlow is widely implemented in datacenter equipment from vendors including: A10, Arista, Aruba, Cisco, Edge-Core, Extreme, Huawei, Juniper, NEC, Netgear, Nokia, NVIDIA, Quanta, and ZTE.

In this example, we will use an 8G Raspberry Pi 4 running Raspberry Pi OS Lite (64-bit).  The easiest way to format a memory card and install the operating system is to use the Raspberry Pi Imager (shown above).
Click on the gear icon to set a user and password and enable ssh access. These initial settings allow the Raspberry Pi to be accessed over the network without having to attach a screen, keyboard, and mouse.

Next, follow the instructions for installing Docker Engine (Raspberry Pi OS Lite is based on Debian 11).

The diagram shows how the sFlow-RT real-time analytics engine receives a continuous telemetry stream from industry standard sFlow instrumentation built into network, server and application infrastructure. It delivers analytics through APIs and can easily be integrated with a wide variety of on-site and cloud, orchestration, DevOps and Software Defined Networking (SDN) tools.
docker run -p 6343:6343/udp -p 127.0.0.1:8008:8008 \
--name sflow-rt -d --restart unless-stopped sflow/prometheus
Run the pre-built sflow/prometheus Docker image. In this example, access to the user interface is limited to localhost in order to prevent unauthorized access since no access controls are provided by sFlow-RT.
ssh -L 8008:127.0.0.1:8008 pp@192.168.4.163
Use ssh to connect to the Raspberry Pi (192.168.4.163) and tunnel port 8008 to your laptop.
Access the web interface at http://127.0.0.1:8008/. See Getting Started for instructions for enabling monitoring and browsing metrics. Python is installed by default on Raspberry Pi OS, making it convenient to experiment with the sFlow-RT REST API, see Writing Applications.
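As a starting point for experimenting from Python, the sketch below prepares a request that defines a flow metric and builds the URL for retrieving the largest flows through the ssh tunnel. The /flow/{name}/json and /activeflows/ALL/{name}/json paths, the flow name pair and the helper functions are assumptions here; check them against Writing Applications before relying on them.

```python
import json
import urllib.request

BASE_URL = "http://127.0.0.1:8008"   # tunnelled sFlow-RT address from above

def define_flow_request(name, keys, value, base_url=BASE_URL):
    """Build (but do not send) a request that defines a flow metric."""
    body = {"keys": keys, "value": value}
    return urllib.request.Request(
        url=f"{base_url}/flow/{name}/json",          # assumed REST path
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="PUT",
    )

def active_flows_url(name, max_flows=10, base_url=BASE_URL):
    """URL for retrieving the current top flows for a defined metric."""
    return f"{base_url}/activeflows/ALL/{name}/json?maxFlows={max_flows}"

# "pair" is a hypothetical flow name keyed on source and destination address.
request = define_flow_request("pair", "ipsource,ipdestination", "bytes")
top_url = active_flows_url("pair", max_flows=5)
# urllib.request.urlopen(request) would send the definition to sFlow-RT.
```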
If you don't have immediate access to a network and want to experiment, follow the instructions in Leaf and spine network emulation on Mac OS M1/M2 systems to emulate the 5 stage leaf and spine network shown above using Containerlab.
docker stop sflow-rt
Note: If you are going to try the examples, first run the command above to stop the sflow-rt container to avoid port contention when Containerlab starts an instance of sFlow-RT.
The screen capture shows a real-time view of traffic flowing across the emulated leaf and spine network during a series of iperf3 tests. The emulated results are very close to those you can expect when monitoring production traffic on a physical network.

The Raspberry Pi 4 is surprisingly capable. This pocket-sized server can easily monitor hundreds of high speed (100G+) links, providing up to the second visibility into network flows.

Tuesday, May 23, 2023

Leaf and spine network emulation on Mac OS M1/M2 systems


The GitHub sflow-rt/containerlab project contains example network topologies for the Containerlab network emulation tool that demonstrate real-time streaming telemetry in realistic data center topologies and network configurations. The examples use the same FRRouting (FRR) engine that is part of SONiC, NVIDIA Cumulus Linux, and DENT network operating systems. Containerlab can be used to experiment before deploying solutions into production. Examples include: tracing ECMP flows in leaf and spine topologies, EVPN visibility, and automated DDoS mitigation using BGP Flowspec and RTBH controls.

The Containerlab project currently has limited support for Mac OS, stating "ARM-based Macs (M1/2) are not supported, and no binaries are generated for this platform. This is mainly due to the lack of network images built for arm64 architecture as of now." However, this argument doesn't apply to the Linux based images used in these examples.

First install Docker Desktop on your Apple silicon based Mac (select the Apple Chip option).

mkdir clab
cd clab
docker run --rm -it --privileged \
  --network host --pid="host" \
  -v /var/run/docker.sock:/var/run/docker.sock \
  -v /run/netns:/run/netns \
  -v $(pwd):$(pwd) -w $(pwd) \
  sflow/clab bash

Run Containerlab by typing the above commands in a terminal. This command uses a pre-built multi-architecture sflow/clab image. If you are running on an x86 platform, follow the official Containerlab Installation instructions.

git clone https://github.com/sflow-rt/containerlab.git

Download the Containerlab topologies from the sflow-rt/containerlab project.

containerlab deploy -t containerlab/clos5.yml

Start the 5 stage leaf and spine topology shown at the top of this page. The initial launch may take a couple of minutes as the container images are downloaded for the first time. Once the images are downloaded, the topology deploys in around 10 seconds.

An instance of the sFlow-RT real-time analytics engine receives industry standard sFlow telemetry from all the switches in the topology. In this case, Containerlab is running the pre-built sflow/prometheus image which packages sFlow-RT with useful applications for exploring the data.

Connect to the web interface, http://localhost:8008. The sFlow-RT dashboard verifies that telemetry is being received from 10 agents (the 10 switches in the Clos fabric). See the sFlow-RT Quickstart guide for more information.

The screen capture shows a real-time view of traffic flowing across the network during a series of iperf3 tests. Click on the sFlow-RT Apps menu and select the browse-flows application, or click here for a direct link to a chart with the settings shown above.
docker exec -it clab-clos5-h1 iperf3 -c 172.16.4.2

Each of the hosts in the network has an iperf3 server, so running the above command will test bandwidth between h1 and h4.

docker exec -it clab-clos5-leaf1 vtysh

Linux with open source routing software (FRRouting) is an accessible alternative to vendor routing stacks (no registration / license required, no restriction on copying means you can share images on Docker Hub, no need for virtual machines). FRRouting is popular in production network operating systems (e.g. Cumulus Linux, SONiC, DENT, etc.) and the VTY shell provides an industry standard CLI for configuration, so labs built around FRR allow realistic network configurations to be explored.

containerlab destroy -t containerlab/clos5.yml

When you are finished, run the above command to stop the containers and free the resources associated with the emulation. Try out other topologies from the project to explore topics such as DDoS mitigation, BGP Flowspec, and EVPN.

Moving the monitoring solution from Containerlab to production is straightforward since sFlow is widely implemented in datacenter equipment from vendors including: A10, Arista, Aruba, Cisco, Edge-Core, Extreme, Huawei, Juniper, NEC, Netgear, Nokia, NVIDIA, Quanta, and ZTE. In addition, the open source Host sFlow agent makes it easy to extend visibility beyond the physical network into the compute infrastructure.

Monday, April 10, 2023

VyOS DDoS mitigation

Real-time flow analytics on VyOS describes how to install real-time analytics based on sFlow and the sFlow-RT analytics engine. This article extends the example to show how to automatically mitigate DDoS attacks using flow analytics combined with BGP Remotely Triggered Black Hole (RTBH) / Flowspec.
vyos@vyos:~$ add container image sflow/ddos-protect
First, download the sflow/ddos-protect image.
vyos@vyos:~$ mkdir -m 777 /config/sflow-rt
Create a directory to store persistent container state.
set container network sflowrt prefix 192.168.1.0/24
Define an internal network to connect to the container. Currently VyOS BGP does not allow direct connections to local addresses (e.g. 127.0.0.1), so we need to put the controller on its own network so the router can connect and receive DDoS mitigation BGP RTBH / Flowspec controls.
set container name sflow-rt image sflow/ddos-protect
set container name sflow-rt host-name sflow-rt
set container name sflow-rt arguments '-Dddos_protect.router=192.168.1.1 -Dddos_protect.enable.flowspec=yes'
set container name sflow-rt environment RTMEM value 200M
set container name sflow-rt memory 0
set container name sflow-rt volume store source /config/sflow-rt
set container name sflow-rt volume store destination /sflow-rt/store
set container name sflow-rt network sflowrt address 192.168.1.2

Configure a container to run the image. The -Dddos_protect.router argument sets the BGP neighbor address, 192.168.1.1.

vyos@vyos:~$ ifconfig podman-sflowrt
podman-sflowrt: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
        inet 192.168.1.1  netmask 255.255.255.0  broadcast 192.168.1.255
        ether be:9e:69:f4:d0:4e  txqueuelen 1000  (Ethernet)
        RX packets 28  bytes 2662 (2.5 KiB)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 27  bytes 8032 (7.8 KiB)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0
Connections to containers on the sflowrt container network appear to originate from 192.168.1.1, the address assigned to the VyOS interface podman-sflowrt.
set system sflow interface eth0
set system sflow interface eth1
set system sflow interface eth2
set system sflow polling 30
set system sflow sampling-rate 1000
set system sflow drop-monitor-limit 50
set system sflow server 192.168.1.2
Configure sFlow and send to sflow-rt container address 192.168.1.2.
set protocols bgp system-as 64500
set protocols bgp neighbor 192.168.1.2 port 1179
set protocols bgp neighbor 192.168.1.2 remote-as 65000
set protocols bgp neighbor 192.168.1.2 address-family ipv4-unicast
set protocols bgp neighbor 192.168.1.2 address-family ipv4-flowspec
Configure sflow-rt as BGP neighbor. Documentation ASN 64500 should be replaced by your ASN. The private ASN 65000 is a DDoS Protect default and can be changed with the -Dddos_protect.as argument.
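The two AS numbers above fall into different reserved ranges, which is why one must be replaced and the other can be left alone. A minimal sketch of the distinction for 16-bit AS numbers, using the documentation range from RFC 5398 and the private use range from RFC 6996 (the function name is hypothetical):

```python
def classify_asn(asn):
    """Classify a 16-bit AS number by reserved range."""
    if not 0 <= asn <= 65535:
        raise ValueError("not a 16-bit AS number")
    if 64496 <= asn <= 64511:
        return "documentation"   # RFC 5398, for examples only
    if 64512 <= asn <= 65534:
        return "private"         # RFC 6996, usable inside your own network
    return "other"

router_as = classify_asn(64500)       # "documentation"
controller_as = classify_asn(65000)   # "private"
```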
ssh -L 8008:192.168.1.2:8008 vyos@router.example
Use an ssh tunnel to connect to the container network and access the web interface at http://localhost:8008.
Real-time DDoS mitigation using BGP RTBH and FlowSpec describes how to configure the DDoS protect application. The screen capture above shows the Charts page after a couple of simulated DDoS attacks on an address, 198.51.100.129, protected by the VyOS router. The charts show two ip_flood and a single udp_amplification attack - see DDoS attacks and BGP Flowspec responses for information on simulating different types of DDoS attack to test mitigation responses.
The Controls page shows three active controls. The table shows the targeted address, administrative address group, attack type, protocol, detection time, mitigation action and status of each active DDoS attack.
vyos@vyos:~$ show bgp ipv4
BGP table version is 0, local router ID is 192.168.1.1, vrf id 0
Default local pref 100, local AS 64500
Status codes:  s suppressed, d damped, h history, * valid, > best, = multipath,
               i internal, r RIB-failure, S Stale, R Removed
Nexthop codes: @NNN nexthop's vrf id, < announce-nh-self
Origin codes:  i - IGP, e - EGP, ? - incomplete
RPKI validation codes: V valid, I invalid, N Not found

    Network          Next Hop            Metric LocPrf Weight Path
    198.51.100.129/32
                    192.0.2.1                              0 65000 i

Displayed  1 routes and 1 total paths
The show command verifies that a Remotely Triggered Black Hole (RTBH) rule has been received for the drop mitigation actions. Advertising a black hole route risks collateral damage since it drops all traffic to the targeted host in order to protect network bandwidth and services provided by other hosts.
vyos@vyos:~$ show bgp ipv4 flowspec detail 
BGP flowspec entry: (flags 0x418)
        Destination Address 198.51.100.129/32
        IP Protocol = 17 
        Source Port = 53 
        FS:rate 0.000000
        received for 00:00:12
        not installed in PBR
The show command verifies that a Flowspec rule has been received for the filter mitigation action. Using Flowspec to filter traffic is more targeted than using black hole routes. In this case only UDP traffic (IP Protocol 17) with Source Port 53 (DNS) is dropped; all other services provided by the targeted host are still accessible.
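The difference in collateral damage between the two controls can be illustrated by checking sample packets against each rule. This is only a sketch of the matching logic using the values from the show output above; the Packet class and function names are hypothetical and nothing here reflects how a router implements the rules.

```python
import ipaddress
from dataclasses import dataclass
from typing import Optional

@dataclass
class Packet:
    dst_ip: str
    protocol: int
    src_port: Optional[int] = None

def rtbh_drops(packet, prefix="198.51.100.129/32"):
    """A black hole route drops everything sent to the prefix."""
    return ipaddress.ip_address(packet.dst_ip) in ipaddress.ip_network(prefix)

def flowspec_drops(packet, prefix="198.51.100.129/32", protocol=17, src_port=53):
    """The filter only drops packets matching every component of the rule."""
    return (rtbh_drops(packet, prefix)
            and packet.protocol == protocol
            and packet.src_port == src_port)

dns_reflection = Packet("198.51.100.129", 17, 53)   # attack traffic
web_request = Packet("198.51.100.129", 6, 51000)    # legitimate TCP traffic
```

Both packets match the black hole route, but only the DNS reflection packet matches the Flowspec filter.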
vyos@vyos:~$ show log container sflow-rt 
2023-04-08T00:24:14Z INFO: Starting sFlow-RT 3.0-1681
2023-04-08T00:24:16Z INFO: Version check, running latest
2023-04-08T00:24:17Z INFO: Listening, BGP port 1179
2023-04-08T00:24:18Z INFO: Listening, sFlow port 6343
2023-04-08T00:24:19Z INFO: Listening, HTTP port 8008
2023-04-08T00:24:19Z INFO: DNS server 1.1.1.1
2023-04-08T00:24:19Z INFO: app/ddos-protect/scripts/ddos.js started
2023-04-08T00:24:19Z INFO: app/prometheus/scripts/export.js started
2023-04-08T00:24:19Z INFO: app/browse-drops/scripts/top.js started
2023-04-08T00:24:19Z INFO: app/browse-flows/scripts/top.js started
2023-04-08T00:26:11Z INFO: BGP open 192.168.1.1 51252
2023-04-08T14:37:36Z INFO: DDoS drop ip_flood 198.51.100.129 local 47
2023-04-08T14:38:19Z INFO: DDoS filter udp_amplification 198.51.100.129 local 53
2023-04-08T14:38:19Z INFO: DDoS drop ip_flood 198.51.100.129 local 17
Attacks are recorded in the container log. Monitoring DDoS mitigation describes how to use Prometheus / Elasticsearch / Grafana to monitor DDoS activity and build dashboards.
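For quick ad hoc analysis, the DDoS entries in the log can be picked out with a regular expression. The sketch below is based only on the three sample lines above; the field names, in particular group and detail, are guesses informed by the columns on the Controls page rather than a documented log format.

```python
import re
from datetime import datetime, timezone

DDOS_LINE = re.compile(
    r"^(?P<time>\S+) INFO: DDoS (?P<action>\S+) (?P<attack>\S+) "
    r"(?P<target>\S+) (?P<group>\S+) (?P<detail>\S+)$"
)

def parse_ddos_line(line):
    """Return a dict for a DDoS log entry, or None for other log lines."""
    match = DDOS_LINE.match(line.strip())
    if match is None:
        return None
    event = match.groupdict()
    # Timestamps in the sample log are UTC with a trailing Z.
    event["time"] = datetime.strptime(
        event["time"], "%Y-%m-%dT%H:%M:%SZ").replace(tzinfo=timezone.utc)
    return event

event = parse_ddos_line(
    "2023-04-08T14:38:19Z INFO: DDoS filter udp_amplification 198.51.100.129 local 53")
ignored = parse_ddos_line("2023-04-08T00:24:19Z INFO: DNS server 1.1.1.1")
```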

This is only a partial configuration. Peering sessions with upstream routers need to be configured to propagate controls so that DDoS attack traffic can be blocked before it saturates the upstream link. The limited scrubbing capacity of the VyOS software router isn't a factor since traffic will be dropped in hardware upstream. The flexibility of the VyOS router is an advantage in providing visibility and analytics to quickly trigger mitigation actions.

Tuesday, April 4, 2023

Real-time flow analytics on VyOS

VyOS with Host sFlow agent describes support for streaming sFlow telemetry added to the open source VyOS router operating system. This article describes how to install analytics software on a VyOS router by configuring a container.
vyos@vyos:~$ add container image sflow/ddos-protect
First, download the sflow/ddos-protect image.
vyos@vyos:~$ mkdir -m 777 /config/sflow-rt
Create a directory to store persistent container state.
set container name sflow-rt image sflow/ddos-protect
set container name sflow-rt allow-host-networks
set container name sflow-rt arguments '-Dhttp.hostname=10.0.0.240'
set container name sflow-rt environment RTMEM value 200M
set container name sflow-rt memory 0
set container name sflow-rt volume store source /config/sflow-rt
set container name sflow-rt volume store destination /sflow-rt/store
Configure a container to run the image. The RTMEM environment variable setting limits the amount of memory that the container will use to 200M bytes. The -Dhttp.hostname argument sets the internal web server to listen on the management address, 10.0.0.240, assigned to eth0 on this router. The container has no built-in authentication, so access needs to be limited using an ACL or through a reverse proxy - see Download and install.
set system sflow interface eth0
set system sflow interface eth1
set system sflow interface eth2
set system sflow polling 30
set system sflow sampling-rate 1000
set system sflow drop-monitor-limit 50
set system sflow server 127.0.0.1
Next, configure sFlow agent to send to localhost (127.0.0.1).
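With a sampling rate of 1000, each packet sample stands in for roughly 1000 packets, so totals are estimated by scaling up the sample count. The sketch below shows that scaling along with the commonly quoted rule of thumb for sFlow accuracy, a 95% confidence bound on the percentage error of 196 * sqrt(1/c) for c samples. The function names and sample counts are hypothetical.

```python
import math

def estimated_total(samples, sampling_rate):
    """Scale a count of sampled packets up to an estimated packet total."""
    return samples * sampling_rate

def percent_error_bound(samples):
    """Approximate 95% confidence bound on the estimate: 196 * sqrt(1/c)."""
    if samples <= 0:
        raise ValueError("need at least one sample")
    return 196.0 * math.sqrt(1.0 / samples)

total = estimated_total(400, 1000)   # 400,000 packets
bound = percent_error_bound(400)     # about 9.8 percent
```

The bound depends on the number of samples rather than the sampling rate, so large flows are measured accurately within a short time even at a coarse sampling rate.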
Finally, connect to the web interface on the router at port 8008. The status page verifies that the sFlow-RT analytics engine is receiving sFlow from 1 sFlow Agent (the VyOS router). See Getting started for more information.
The included Flow Browser application provides an up to the second view of traffic flows. Defining Flows describes the fields that can be used to break out traffic.
VyOS dropped packet notifications describes how to configure and monitor sFlow dropped packet notifications. The included Discard Browser provides an up to the second view of dropped packets.
The included Metric Browser application lets you explore the metrics that are being streamed. The chart updates in real-time as data arrives and in this case shows CPU utilization on the VyOS router. The standard set of metrics exported by the Host sFlow agent include interface counters as well as host cpu, memory, network and disk performance metrics. Metrics lists the set of available metrics.
Flow metrics with Prometheus and Grafana describes how to integrate flow analytics into operational dashboards. The included Prometheus application exposes flow analytics in the standard Prometheus scrape format so that they can be logged in time series databases.
DDoS protection quickstart guide describes how to use real-time sFlow analytics with BGP Flowspec / RTBH to automatically mitigate DDoS attacks. The included DDoS Protect application detects common volumetric attacks and can apply automated responses. The screen capture shows traffic associated with a series of simulated DDoS attacks against hosts behind the VyOS router, see DDoS attacks and BGP Flowspec responses.
The embedded sFlow-RT analytics engine exposes a REST API that can be used to program flow analytics, set thresholds, monitor events, and gather statistics. In addition, the applications shown in this article were all written using sFlow-RT's embedded scripting API. See Writing Applications for more information.