Thursday, March 30, 2023

Dropped packet reason codes in Linux 6+ kernels

Using sFlow to monitor dropped packets describes support for standard sFlow Dropped Packet Notifications in the open source Host sFlow agent. This article describes additional capabilities in Linux 6+ kernels that clarify why packets are dropped in the kernel.

The recent addition of dropreason.h in Linux 6+ kernels provides detailed reasons for packet drops. The netlink drop_monitor API has been extended to include the NET_DM_ATTR_REASON attribute to report the drop reason, see net_dropmon.h.

The following example illustrates the value of the reason code in explaining Linux packet drops.

tcp_v4_rcv+0x7c/0xef0
The value of NET_DM_ATTR_SYMBOL shown above indicates that the packet was dropped in the tcp_v4_rcv function in the Linux kernel, at offset 0x7c from the start of the function (0xef0 is the total size of the function). While this information is helpful, there are many reasons why a TCP packet may be dropped.
NO_SOCKET
In this case, the value of NET_DM_ATTR_REASON shown above indicates that the TCP packet was dropped because no application had opened a socket and so there was nowhere to deliver the packet.
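The symbol string is easier to work with once it is split into its parts. The following is a minimal sketch, assuming the symbol always has the function+offset/size form shown in this example; the parse_symbol helper and KernelSymbol type are hypothetical names used for illustration and are not part of the Host sFlow agent.

```python
import re
from typing import NamedTuple


class KernelSymbol(NamedTuple):
    function: str  # kernel function that dropped the packet
    offset: int    # byte offset of the drop location within the function
    size: int      # total size of the function in bytes


# Matches strings of the form "tcp_v4_rcv+0x7c/0xef0".
_SYMBOL_RE = re.compile(
    r"^(?P<function>[\w.]+)\+(?P<offset>0x[0-9a-fA-F]+)/(?P<size>0x[0-9a-fA-F]+)$"
)


def parse_symbol(symbol: str) -> KernelSymbol:
    """Split a NET_DM_ATTR_SYMBOL style string into function, offset and size."""
    match = _SYMBOL_RE.match(symbol.strip())
    if match is None:
        raise ValueError(f"unrecognized symbol format: {symbol!r}")
    return KernelSymbol(
        function=match["function"],
        offset=int(match["offset"], 16),
        size=int(match["size"], 16),
    )


sym = parse_symbol("tcp_v4_rcv+0x7c/0xef0")
print(sym.function, sym.offset, sym.size)  # tcp_v4_rcv 124 3824
```

Grouping drops by the function name alone, and ignoring the offset, is usually enough to see where in the stack packets are being discarded.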

In the case of Linux-based hardware switches or smart network adapters, where packet processing is offloaded to hardware, the netlink drop_monitor events include NET_DM_ATTR_HW_TRAP_GROUP_NAME and NET_DM_ATTR_HW_TRAP_NAME attributes and packet header information supplied by the hardware driver, see Devlink Trap.

The latest version of the open source Host sFlow agent adds support for the NET_DM_ATTR_REASON attribute to improve the accuracy of the sFlow drop_reason.

port_unreachable
In our example, the Host sFlow agent is now able to report port_unreachable as the reason for the dropped packet, rather than the generic unknown_l4 reason reported for older kernels.
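The selection logic can be pictured as a lookup that prefers the kernel reason code and falls back to the function name when no reason is available. This is a minimal sketch, not the Host sFlow agent's actual mapping: the two tables contain only the pairings mentioned in these articles, the sflow_drop_reason helper is hypothetical, and the final "unknown" value is a placeholder.

```python
from typing import Optional

# Hypothetical lookup tables containing only pairings seen in these articles.
REASON_TO_SFLOW = {
    "NO_SOCKET": "port_unreachable",
}
FUNCTION_TO_SFLOW = {
    "tcp_v4_rcv": "unknown_l4",
    "ip_forward": "unknown_l3",
}


def sflow_drop_reason(symbol: str, kernel_reason: Optional[str] = None) -> str:
    """Prefer the kernel reason code; otherwise fall back to the function name."""
    if kernel_reason is not None and kernel_reason in REASON_TO_SFLOW:
        return REASON_TO_SFLOW[kernel_reason]
    # The function name is the part of the symbol before the "+offset/size".
    function = symbol.split("+", 1)[0]
    return FUNCTION_TO_SFLOW.get(function, "unknown")  # placeholder fallback


# Linux 6+ kernel: a reason code is available.
print(sflow_drop_reason("tcp_v4_rcv+0x7c/0xef0", "NO_SOCKET"))  # port_unreachable
# Older kernel: only the symbol is available.
print(sflow_drop_reason("tcp_v4_rcv+0x7c/0xef0"))  # unknown_l4
```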

The screen capture at the top of this article shows dropped packet information displayed in real-time using the Discard Browser application running on the sFlow-RT analytics engine. The chart demonstrates how the combination of information from the header of the dropped packet along with the reason for dropping the packet quickly gets to the root cause of the packet drop. In this case an attempt has been made from 172.16.1.174 to connect to 172.16.1.1 via telnet (tcp port 23) and telnet has not been enabled on the server so the packet was dropped - as it should be since telnet is not a secure method of connecting.

docker run --name sflow-rt -p 8008:8008 -p 6343:6343/udp -d sflow/prometheus

A quick way to experiment with sFlow is to run the pre-built sflow/prometheus image using Docker. The bundled Discard Browser with the settings shown in the screen capture can be launched by clicking here.

Monday, March 27, 2023

VyOS dropped packet notifications

VyOS with Host sFlow agent describes how to configure and analyze industry standard sFlow telemetry recently added to the VyOS open source router platform. This article discusses sFlow dropped packet notifications support added to the latest release.

Dropped packets have a profound impact on network performance and availability. Packet discards due to congestion can significantly impact application performance. Dropped packets due to black hole routes, expired TTLs, MTU mismatches, etc. can result in insidious connection failures that are time consuming and difficult to diagnose. Visibility into dropped packets offers significant benefits for network troubleshooting, providing real-time network-wide visibility into the specific packets that were dropped as well as the reason each packet was dropped. This visibility instantly reveals the root cause of drops and the impacted connections.

vyos@vyos:~$ show version
Version:          VyOS 1.4-rolling-202303260914
Release train:    current

Built by:         autobuild@vyos.net
Built on:         Sun 26 Mar 2023 09:14 UTC
Build UUID:       72b34f74-bfcd-4b51-9b95-544319c2dac5
Build commit ID:  d68bda6a295ba9

Architecture:     x86_64
Boot via:         installed image
System type:       guest

Hardware vendor:  innotek GmbH
Hardware model:   VirtualBox
Hardware S/N:     0
Hardware UUID:    df0a2b79-b8c4-8342-a27f-76aa3e52ad6d

Copyright:        VyOS maintainers and contributors

Verify that the version of VyOS is VyOS 1.4-rolling-202303260914 or later.

On VyOS, dropped packet monitoring relies on instrumentation built into recent Linux kernels and exposed through the netlink drop_monitor API. Enabling drop_monitor in the VyOS kernel configuration allows the Host sFlow agent to capture and export information on dropped packets.
set system sflow interface eth0
set system sflow interface eth1
set system sflow interface eth2
set system sflow polling 30
set system sflow sampling-rate 1000
set system sflow drop-monitor-limit 50
set system sflow server 10.0.0.30 port 6343
The drop-monitor-limit configuration entry enables dropped packet monitoring and sets a rate limit of 50 dropped packet notifications per second.
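To illustrate what a per-second limit means in practice, the following is a minimal sketch of a fixed-window limiter. It is an assumption made for illustration, not a description of how the Host sFlow agent enforces drop-monitor-limit; the DropNotificationLimiter class is hypothetical.

```python
class DropNotificationLimiter:
    """Allow at most `limit_per_second` notifications in each one-second window."""

    def __init__(self, limit_per_second: int) -> None:
        self.limit = limit_per_second
        self.window = None   # the whole second currently being counted
        self.count = 0       # notifications allowed in the current window
        self.suppressed = 0  # notifications discarded because of the limit

    def allow(self, now: float) -> bool:
        window = int(now)
        if window != self.window:
            # A new second has started, so reset the per-window count.
            self.window = window
            self.count = 0
        if self.count < self.limit:
            self.count += 1
            return True
        self.suppressed += 1
        return False


limiter = DropNotificationLimiter(50)
# A burst of 120 drops within the same second (timestamps 10.000 to 10.119).
sent = sum(limiter.allow(10.0 + i / 1000) for i in range(120))
print(sent, limiter.suppressed)  # 50 70
```

A limit like this keeps a sudden burst of drops from overwhelming the agent or the collector while still exporting enough notifications to identify the cause.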
docker run --name sflow-rt -p 8008:8008 -p 6343:6343/udp -d sflow/prometheus

A quick way to experiment with sFlow is to run the pre-built sflow/prometheus image using Docker on the sFlow server (in this case on 10.0.0.30). The chart at the top of the page uses the Discard Browser application to display an up-to-the-second view of packets dropped by the VyOS router. Click on this link to open the application with the settings shown.

The chart shows the results of two tests: the first a failed attempt to connect to the VyOS router using telnet (telnet has been disabled in the router config), and the second a traceroute test between two hosts connected to the router. The reason field reports the sFlow drop reason code and the function field reports the Linux kernel function that dropped the packet. In the telnet test, the packet was dropped in the tcp_v4_rcv function and is reported with an unknown_l4 sFlow reason. In the traceroute test, 3 packets were dropped in the ip_forward function and are reported with an unknown_l3 reason.

Enabling sFlow dropped packet notifications on all switches, routers, and hosts provides end-to-end visibility into dropped packets, rapidly identifying the location and reason for packet drops as well as identifying the impacted services.

Dropped packet monitoring complements sFlow's existing counter polling and packet sampling mechanisms and shares a common data model so that all three sources of data can be correlated. For example, if packets are being discarded because of buffer exhaustion, the discard records don't necessarily tell the whole story. The discarded packets may represent mice flows that are victims of an elephant flow. Packet samples will reveal the traffic that isn't being dropped and provide a more complete picture. Counter data adds additional information such as CPU load, interface speed, link utilization, packet and discard rates that further completes the picture.

Friday, March 17, 2023

VyOS with Host sFlow agent

VyOS described deficiencies with the embedded sFlow implementation in the open source VyOS router operating system and suggested that the open source Host sFlow agent be installed as an alternative. The VyOS developer community embraced the suggestion and has been incredibly responsive, integrating and releasing a version of VyOS with Host sFlow support within a week.
vyos@vyos:~$ show version
Version:          VyOS 1.4-rolling-202303170317
Release train:    current

Built by:         autobuild@vyos.net
Built on:         Fri 17 Mar 2023 03:17 UTC
Build UUID:       45391302-1240-4cc7-95a8-da8ee6390765
Build commit ID:  e887f582cfd7de

Architecture:     x86_64
Boot via:         installed image
System type:       guest

Hardware vendor:  innotek GmbH
Hardware model:   VirtualBox
Hardware S/N:     0
Hardware UUID:    871dd0f0-c4ec-f147-b1a7-ed536511f141

Copyright:        VyOS maintainers and contributors
Verify that the version of VyOS is VyOS 1.4-rolling-202303170317 or later.
set system sflow interface eth0
set system sflow interface eth1
set system sflow interface eth2
set system sflow polling 30
set system sflow sampling-rate 1000
set system sflow server 10.0.0.30 port 6343
The above commands configure sFlow export in the VyOS CLI using the embedded Host sFlow agent.
docker run --name sflow-rt -p 8008:8008 -p 6343:6343/udp -d sflow/prometheus
A quick way to experiment with sFlow is to run the pre-built sflow/prometheus image using Docker on the sFlow server (in this case on 10.0.0.30). The chart at the top of the page uses the Flow Browser application to display an up-to-the-second view of the largest TCP flows through the VyOS router. Click on this link to open the application with the settings shown.
Flow metrics with Prometheus and Grafana describes how to integrate flow analytics into operational dashboards.
DDoS protection quickstart guide describes how to use real-time sFlow analytics with BGP Flowspec / RTBH to automatically mitigate DDoS attacks.

Saturday, March 11, 2023

VyOS

VyOS is an open source router operating system based on Linux. This article discusses how to improve network traffic visibility on VyOS based routers using the open source Host sFlow agent.

VyOS claims sFlow support, so why is it necessary to install an alternative sFlow agent? The following experiment demonstrates that there are significant issues with the VyOS sFlow implementation.

vyos@vyos:~$ show version
Version:          VyOS 1.4-rolling-202301260317
Release train:    current

Built by:         autobuild@vyos.net
Built on:         Thu 26 Jan 2023 03:17 UTC
Build UUID:       a95385b7-12f9-438d-b49c-b91f47ea7ab7
Build commit ID:  d5ea780295ef8e

Architecture:     x86_64
Boot via:         installed image
System type:      KVM guest

Hardware vendor:  innotek GmbH
Hardware model:   VirtualBox
Hardware S/N:     0
Hardware UUID:    6988d219-49a6-0a4a-9413-756b0395a73d

Copyright:        VyOS maintainers and contributors
Install a recent version of VyOS under VirtualBox and configure routing between two Linux virtual machines connected to eth1 and eth2 on the router. Out of band management is configured on eth0.
set system flow-accounting disable-imt
set system flow-accounting sflow agent-address 10.0.0.50
set system flow-accounting sflow sampling-rate 1000
set system flow-accounting sflow server 10.0.0.30 port 6343
set system flow-accounting interface eth0
set system flow-accounting interface eth1
set system flow-accounting interface eth2
The above commands configure sFlow monitoring on VyOS using the native sFlow agent.
The sflow/sflow-test tool is used to test the sFlow implementation while generating traffic consisting of a series of iperf3 tests (each generating approximately 50Mbps). The test fails in a number of significant ways:
  1. The implementation of sFlow is incomplete, omitting required interface counter export
  2. The peak traffic reported (3Mbps) is a fraction of the traffic generated by iperf3
  3. There is an inconsistency in the packet size reported in the sFlow messages
  4. Tests comparing counters and flow data fail because of missing counter export (1)
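The idea behind comparing flow data with counters can be sketched as follows: scale the sampled packet sizes by the sampling rate to estimate total bytes, then check that the estimate agrees with the change in the interface byte counter over the same interval. This is an illustration only, not the logic used by sflow-test; the helper names, the 20% tolerance, and the sample values are all assumptions.

```python
from typing import Iterable


def estimate_bytes(sampled_frame_lengths: Iterable[int], sampling_rate: int) -> int:
    """Scale up sampled frame lengths by the sampling rate (1-in-N sampling)."""
    return sum(sampled_frame_lengths) * sampling_rate


def consistent(estimate: int, counter_delta: int, tolerance: float = 0.2) -> bool:
    """True if the estimate is within `tolerance` (as a fraction) of the counter."""
    if counter_delta == 0:
        return estimate == 0
    return abs(estimate - counter_delta) / counter_delta <= tolerance


# Hypothetical 10 second interval of a 50Mbps test: 50e6 * 10 / 8 bytes.
counter_delta = 62_500_000

# 40 samples of 1500 byte frames at a sampling rate of 1-in-1000.
good_estimate = estimate_bytes([1500] * 40, 1000)
print(good_estimate, consistent(good_estimate, counter_delta))  # 60000000 True

# An estimate far below the counter, similar to the 3Mbps peak reported above.
low_estimate = 3_750_000
print(consistent(low_estimate, counter_delta))  # False
```

When interface counters are not exported at all, as in item 1, there is nothing to compare the estimate against, so a check of this kind cannot pass.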
Fortunately, VyOS is a Linux based operating system, so we can install the Host sFlow agent as an alternative to the native sFlow implementation to provide traffic visibility.
delete system flow-accounting
First, disable the native VyOS sFlow agent.
wget https://github.com/sflow/host-sflow/releases/download/v2.0.38-1/hsflowd-ubuntu20_2.0.38-1_amd64.deb
sudo dpkg -i hsflowd-ubuntu20_2.0.38-1_amd64.deb
Next, download and install the Host sFlow agent by typing the above commands in the VyOS shell.
# hsflowd configuration file
# http://sflow.net/host-sflow-linux-config.php

sflow {
  collector { ip=10.0.0.30 }
  pcap { dev = eth0 }
  pcap { dev = eth1 }
  pcap { dev = eth2 }
}
Edit the /etc/hsflowd.conf file.
systemctl restart hsflowd
Restart the sFlow agent to pick up the new configuration.
Rerunning sflow-test shows that the implementation now passes. The peaks shown in the trend graph are consistent with the traffic generated by iperf3 and with traffic levels reported in interface counters.
The sflow/sflow-test Docker image also includes the Flow Browser application that can be used to monitor traffic flows in real-time. The screen shot above shows traffic from a single iperf3 test.
The sflow/sflow-test Docker image also includes the Metric Browser application that can be used to monitor counters in real-time. The screen shot above shows cpu_utilization.

The sFlow Test, Browse Flows and Browse Metrics applications run on the sFlow-RT analytics engine. Additional examples include Flow metrics with Prometheus and Grafana and DDoS protection quickstart guide.