Thursday, May 20, 2021

Linux as a network operating system

NVIDIA Linux Switch enables any standard Linux distribution to be used as the operating system on the NVIDIA Spectrum™ switches. Unlike Linux-based network operating systems, where you are limited to a specific version of Linux and control of the hardware is restricted to vendor-specific software modules, Linux Switch allows you to install an unmodified version of your favorite Linux distribution along with familiar Linux monitoring and orchestration tools.

The key to giving Linux control of the switch hardware is the switchdev module - a standard part of recent Linux kernels. Linux switchdev is an in-kernel driver model for switch devices which offload the forwarding (data) plane from the kernel. Integrating switch ASIC drivers in the Linux kernel makes switch ports appear as additional Linux network interfaces that can be configured and managed using standard Linux tools.
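The effect is visible from user space through sysfs. The following is a minimal sketch, assuming the kernel's `phys_switch_id` and `phys_port_name` attributes under `/sys/class/net`; the helper name `list_switch_ports` is made up for illustration. Interfaces that are not switch ports are expected to fail the read and are skipped.

```python
import os


def list_switch_ports(sysfs_root="/sys/class/net"):
    """Return {interface: (switch_id, port_name)} for switchdev ports.

    Hypothetical helper: it relies only on the sysfs attributes
    phys_switch_id and phys_port_name being readable for switch ports.
    """
    ports = {}
    for ifname in sorted(os.listdir(sysfs_root)):
        base = os.path.join(sysfs_root, ifname)
        try:
            with open(os.path.join(base, "phys_switch_id")) as f:
                switch_id = f.read().strip()
        except OSError:
            # Not a switch port, or the attribute is unsupported: skip it.
            continue
        if not switch_id:
            continue
        try:
            with open(os.path.join(base, "phys_port_name")) as f:
                port_name = f.read().strip()
        except OSError:
            # The port name is optional; fall back to an empty string.
            port_name = ""
        ports[ifname] = (switch_id, port_name)
    return ports
```

Calling `list_switch_ports()` on a switch should return one entry per front panel port, while ordinary interfaces such as the loopback are left out.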

The mlxsw wiki provides instructions for installing Linux using ONIE or PXE boot on Mellanox switch hardware, for example, on NVIDIA® Spectrum®-3 based SN4000 series switches, providing 1G - 400G port speeds to handle scale-out data center applications.

Major benefits of using standard Linux as the switch operating system include:

  • no licensing fees, feature restrictions, or license management complexity associated with proprietary network operating systems
  • large ecosystem of open source and commercial software available for Linux
  • software updates and security patches available through Linux distribution
  • install same Linux distribution on the switches and servers to reduce operational complexity and leverage existing expertise
  • run instances of the Linux distribution as virtual machines or containers to test configurations and develop automation scripts
  • standard Linux APIs, and the availability of Linux developers, lower the barrier to customization, making it possible to tailor network behavior to address application / business requirements

The switchdev driver for NVIDIA Spectrum ASICs exposes advanced dataplane instrumentation through standard Linux APIs. This article will explore how the open source Host sFlow agent uses the standard Linux APIs to stream real-time telemetry from the ASIC using industry standard sFlow.

The diagram shows the elements of the solution. Host sFlow agents installed on servers and switches stream sFlow telemetry to an instance of the sFlow-RT real-time analytics engine. The analytics provide a comprehensive, up-to-the-second view of performance to drive automation.

Note: If you are unfamiliar with sFlow, or want to hear about the latest developments, Real-time network telemetry for automation provides an overview and includes a demonstration of monitoring and troubleshooting network and system performance of a GPU cluster.

Download the latest Host sFlow agent sources:

git clone

INSTALL.Linux provides information on compiling Host sFlow on Linux. The following instructions assume a DEB based distribution (Debian, Ubuntu):

cd host-sflow/

It isn't necessary to install development tools on the switch. All major Linux distributions are available as Docker images. Select a Docker image that matches the operating system version on the switch and use it to build the package.

Copy the resulting hsflowd package to the switch and install:

sudo dpkg -i hsflowd_2.0.34-3_amd64.deb

Next, edit the /etc/hsflowd.conf file to configure the agent:

sflow {
  collector { ip= }
  systemd { }
  psample { group=1 egress=on }
  dropmon { group=1 start=on sw=off hw=on }
  dent { sw=off switchport=swp.* }
}

In this case, the ip setting in the collector{} section is the address of the sFlow collector and swp.* is a regular expression used to identify front panel switch ports. The systemd{} module monitors services running on the switch - see Monitoring Linux services. The psample{} module receives randomly sampled packets from the switch ASIC - see Linux 4.11 kernel extends packet sampling support. The dropmon{} module receives dropped packet notifications - see Using sFlow to monitor dropped packets. The dent{} module automatically configures packet sampling of traffic on front panel switch ports - see Packet Sampling.

Note: The same configuration file can be used for every switch in the network, making configuration of the agents easy to automate.
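Since the file is the same on every switch, it can be generated once and pushed out by whatever automation tool is in use. The following is a minimal sketch that renders the configuration from a collector address. The function name `render_hsflowd_conf` and the address `192.0.2.1` (a documentation-only placeholder) are illustrative assumptions, not values from this article.

```python
import ipaddress

# Template mirroring the hsflowd.conf settings described in this article.
# Doubled braces are literal braces in str.format().
HSFLOWD_TEMPLATE = """\
sflow {{
  collector {{ ip={collector_ip} }}
  systemd {{ }}
  psample {{ group=1 egress=on }}
  dropmon {{ group=1 start=on sw=off hw=on }}
  dent {{ sw=off switchport={port_regex} }}
}}
"""


def render_hsflowd_conf(collector_ip, port_regex="swp.*"):
    """Return hsflowd.conf text for the given collector (hypothetical helper)."""
    # Fail early on a malformed address; raises ValueError if invalid.
    ipaddress.ip_address(collector_ip)
    return HSFLOWD_TEMPLATE.format(
        collector_ip=collector_ip, port_regex=port_regex
    )
```

For example, `render_hsflowd_conf("192.0.2.1")` returns text that can be written to /etc/hsflowd.conf on each switch before the agent is started.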

Enable and start the agent.

sudo systemctl enable hsflowd.service
sudo systemctl start hsflowd.service

Finally, use the pre-built sflow/prometheus Docker image to start a copy of the sFlow-RT real-time analytics software on the collector host:

docker run -p 8008:8008 -p 6343:6343/udp -d sflow/prometheus

The web interface is accessible on port 8008.

The included Metric Browser application lets you explore the metrics that are being streamed. The chart updates in real time as data arrives and in this case identifies the interface in the network with the greatest utilization. The standard set of metrics exported by the Host sFlow agent includes interface counters as well as host CPU, memory, disk and service performance metrics. Metrics lists the set of available metrics.
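The same metrics can also be retrieved programmatically. The following is a minimal sketch, assuming sFlow-RT's REST metric query path of the form /metric/{agent}/{metric}/json and a JSON array reply whose entries carry a `metricValue` field; check both against the REST API page of your own sFlow-RT instance. The helper names are made up for illustration.

```python
import json
from urllib.parse import quote
from urllib.request import urlopen


def metric_url(base_url, metric, agent="ALL"):
    """Build the (assumed) sFlow-RT metric query URL."""
    return "{}/metric/{}/{}/json".format(
        base_url.rstrip("/"), quote(agent, safe=""), quote(metric, safe=":")
    )


def busiest(results):
    """Return the entry with the largest numeric metricValue, or None."""
    valued = [
        r for r in results if isinstance(r.get("metricValue"), (int, float))
    ]
    return max(valued, key=lambda r: r["metricValue"], default=None)


def fetch_metric(base_url, metric, agent="ALL", timeout=5):
    """Fetch and decode a metric query (performs a network request)."""
    with urlopen(metric_url(base_url, metric, agent), timeout=timeout) as resp:
        return json.load(resp)
```

With sFlow-RT reachable on port 8008, `busiest(fetch_metric("http://localhost:8008", "max:ifinutilization"))` would be expected to return the most heavily utilized interface, assuming `ifinutilization` appears in the Metrics list for your agents.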

The included Flow Browser application provides an up-to-the-second view of traffic flows. Defining Flows describes the fields that can be used to break out traffic.
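Flow definitions can also be installed through the REST API rather than the browser. The following sketch builds, but does not send, the requests involved. It assumes sFlow-RT's /flow/{name}/json and /activeflows/{agent}/{name}/json paths and the `ipsource`, `ipdestination` and `bytes` field names; the flow name `pair` and the helper names are made up for illustration.

```python
import json
from urllib.request import Request


def define_flow_request(base_url, name, keys, value="bytes"):
    """Build a PUT request that defines a flow (assumed REST path)."""
    body = json.dumps({"keys": ",".join(keys), "value": value}).encode("utf-8")
    return Request(
        "{}/flow/{}/json".format(base_url.rstrip("/"), name),
        data=body,
        headers={"Content-Type": "application/json"},
        method="PUT",
    )


def active_flows_url(base_url, name, max_flows=10, agent="ALL"):
    """Build the URL that lists the largest active flows for a definition."""
    return "{}/activeflows/{}/{}/json?maxFlows={}".format(
        base_url.rstrip("/"), agent, name, int(max_flows)
    )
```

Passing the result of `define_flow_request("http://localhost:8008", "pair", ["ipsource", "ipdestination"])` to `urllib.request.urlopen` would install the definition, after which the URL from `active_flows_url` can be polled for the current top flows.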

Note: The NVIDIA Spectrum 2/3 ASIC includes packet transit delay, selected queue and queue depth with each sampled packet. This information is delivered via the Linux PSAMPLE netlink channel to the Host sFlow agent and included in the sFlow telemetry. These fields are accessible when defining flows in sFlow-RT. See Transit delay and queueing for details.

The included Discard Browser is used to explore packets that are being dropped in the network.

Note: The NVIDIA Spectrum 2/3 ASIC includes instrumentation to capture dropped packets and the reason they were dropped. The information is delivered via the Linux drop_monitor netlink channel to the Host sFlow agent and included in the sFlow telemetry. See Real-time trending of dropped packets for more information.

The included Prometheus application exports metrics to the Prometheus time series database where they can be used to drive Grafana dashboards (e.g. sFlow-RT Countries and Networks, sFlow-RT Health, and sFlow-RT Network Interfaces).
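To check what Prometheus will see before pointing a scrape job at sFlow-RT, the exported text can be fetched and parsed by hand. The following is a simplified sketch of a parser for the Prometheus text exposition format. It does not unescape label values, and the metric and label names in the usage note are invented examples rather than guaranteed sFlow-RT output. The export is commonly scraped from a path such as /prometheus/metrics/ALL/ALL/txt, which should be treated as an assumption and checked against your instance.

```python
import re

# metric_name{label="value",...} value [timestamp]
_SAMPLE = re.compile(r'^([A-Za-z_:][A-Za-z0-9_:]*)(?:\{(.*)\})?\s+(\S+)')
_LABEL = re.compile(r'([A-Za-z_][A-Za-z0-9_]*)="((?:[^"\\]|\\.)*)"')


def parse_exposition(text):
    """Return a list of (name, labels, value) tuples from exposition text."""
    samples = []
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines and # HELP / # TYPE comments.
        if not line or line.startswith("#"):
            continue
        match = _SAMPLE.match(line)
        if not match:
            continue
        name, raw_labels, value = match.groups()
        # Label values are kept as written (escape sequences not decoded).
        labels = dict(_LABEL.findall(raw_labels or ""))
        samples.append((name, labels, float(value)))
    return samples
```

For instance, parsing a line such as `ifinutilization{agent="192.0.2.10",ifname="swp1"} 12.5` yields the metric name, a dictionary of its two labels, and the value as a float, which is enough to confirm that the expected switches and ports are present in the export.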

Linux as a network operating system is an exciting advancement if you are interested in simplifying network and system management. Using the Linux networking APIs as a common abstraction layer on servers and switches makes it possible to manage network and compute infrastructure as a unified system.
