Monday, March 27, 2023

VyOS dropped packet notifications

The article VyOS with Host sFlow agent describes how to configure and analyze industry-standard sFlow telemetry, recently added to the VyOS open source router platform. This article discusses the sFlow dropped packet notification support added in the latest release.

Dropped packets have a profound impact on network performance and availability. Packet discards due to congestion can significantly impact application performance. Dropped packets due to black hole routes, expired TTLs, MTU mismatches, etc. can result in insidious connection failures that are time-consuming and difficult to diagnose. Visibility into dropped packets offers significant benefits for network troubleshooting, providing real-time, network-wide visibility into the specific packets that were dropped as well as the reason each packet was dropped. This visibility instantly reveals the root cause of drops and the impacted connections.

vyos@vyos:~$ show version
Version:          VyOS 1.4-rolling-202303260914
Release train:    current

Built by:         autobuild@vyos.net
Built on:         Sun 26 Mar 2023 09:14 UTC
Build UUID:       72b34f74-bfcd-4b51-9b95-544319c2dac5
Build commit ID:  d68bda6a295ba9

Architecture:     x86_64
Boot via:         installed image
System type:       guest

Hardware vendor:  innotek GmbH
Hardware model:   VirtualBox
Hardware S/N:     0
Hardware UUID:    df0a2b79-b8c4-8342-a27f-76aa3e52ad6d

Copyright:        VyOS maintainers and contributors

Verify that the version of VyOS is VyOS 1.4-rolling-202303260914 or later.
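
The version check can also be scripted. The following is a minimal Python sketch, assuming the version string follows the rolling-release pattern shown above (a release number, "-rolling-", then a 12-digit build timestamp). The function names are hypothetical, and any string that does not match this pattern is treated as unsupported rather than guessed at.

```python
import re
from datetime import datetime
from typing import Optional

# Build timestamp of the minimum rolling release named in the article.
MINIMUM_BUILD = datetime.strptime("202303260914", "%Y%m%d%H%M")


def rolling_build_stamp(version_line: str) -> Optional[datetime]:
    """Extract the build timestamp from a '<release>-rolling-YYYYMMDDHHMM' string."""
    match = re.search(r"\d+\.\d+-rolling-(\d{12})", version_line)
    if match is None:
        return None  # not a rolling-release string this sketch understands
    return datetime.strptime(match.group(1), "%Y%m%d%H%M")


def supports_drop_monitor(version_line: str) -> bool:
    """Return True if the rolling build is at least as new as the minimum."""
    stamp = rolling_build_stamp(version_line)
    return stamp is not None and stamp >= MINIMUM_BUILD


# Example using the Version line from the show version output above.
print(supports_drop_monitor("Version:          VyOS 1.4-rolling-202303260914"))
```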

On VyOS, dropped packet monitoring relies on instrumentation built into recent Linux kernels and exposed through the netlink drop_monitor API. Enabling drop_monitor in the VyOS kernel configuration allows the Host sFlow agent to capture and export information on dropped packets.

set system sflow interface eth0
set system sflow interface eth1
set system sflow interface eth2
set system sflow polling 30
set system sflow sampling-rate 1000
set system sflow drop-monitor-limit 50
set system sflow server 10.0.0.30 port 6343

The drop-monitor-limit configuration entry enables dropped packet monitoring and sets a rate limit of 50 dropped packet notifications per second.

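To illustrate what a per-second notification limit means in practice, here is a conceptual Python sketch of a fixed-window rate limiter. It is not the Host sFlow agent's implementation, which this article does not describe; the class name and the one-second window scheme are assumptions chosen only to show how a limit of 50 caps the notifications exported during a burst of drops.

```python
class PerSecondLimiter:
    """Allow at most `limit` events in each whole-second window (illustrative only)."""

    def __init__(self, limit: int) -> None:
        self.limit = limit
        self.window = None  # the whole second currently being counted
        self.count = 0      # events already allowed in that second

    def allow(self, now: float) -> bool:
        second = int(now)
        if second != self.window:
            # A new second has started, so reset the counter.
            self.window = second
            self.count = 0
        if self.count < self.limit:
            self.count += 1
            return True
        return False  # over the limit: this notification is suppressed


limiter = PerSecondLimiter(limit=50)

# Simulate 80 drops arriving within the same second.
sent = sum(limiter.allow(10.0 + i / 100) for i in range(80))
print(sent)  # 50 notifications pass, the remaining 30 are suppressed
```

A limit like this bounds the telemetry overhead during a drop storm while still reporting a representative set of the dropped packets.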
docker run --name sflow-rt -p 8008:8008 -p 6343:6343/udp -d sflow/prometheus

A quick way to experiment with sFlow is to run the pre-built sflow/prometheus image using Docker on the sFlow server (in this case, 10.0.0.30). The chart at the top of the page uses the Discard Browser application to display an up-to-the-second view of packets dropped by the VyOS router. Click on this link to open the application with the settings shown.

The chart shows the results of two tests: the first is a failed attempt to connect to the VyOS router using telnet (telnet has been disabled in the router config), and the second is a traceroute test between two hosts connected to the router. The reason field reports the sFlow drop reason code, and the function field reports the Linux kernel function that dropped the packet. In the telnet test, the packet was dropped in the tcp_v4_rcv function and is reported with an unknown_l4 sFlow reason. In the traceroute test, three packets were dropped in the ip_forward function and are reported with an unknown_l3 reason.
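
The grouping performed by the chart can be sketched in a few lines of Python. The records below are illustrative, built only from the two tests just described (one telnet packet, three traceroute packets). The dictionary layout is a hypothetical representation for this example, not the sFlow wire format or an sFlow-RT API.

```python
from collections import Counter

# Illustrative discard records mirroring the telnet and traceroute tests.
# The "reason" and "function" keys follow the field names used in the chart.
discards = [
    {"reason": "unknown_l4", "function": "tcp_v4_rcv"},  # telnet attempt
    {"reason": "unknown_l3", "function": "ip_forward"},  # traceroute packet 1
    {"reason": "unknown_l3", "function": "ip_forward"},  # traceroute packet 2
    {"reason": "unknown_l3", "function": "ip_forward"},  # traceroute packet 3
]

# Count dropped packets for each (reason, function) pair.
totals = Counter((d["reason"], d["function"]) for d in discards)

for (reason, function), packets in totals.most_common():
    print(f"{reason:12} {function:12} {packets}")
```

Grouping by both fields keeps the protocol-level reason and the kernel code path together, which is what makes it possible to tell a refused connection apart from a forwarding drop.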

Enabling sFlow dropped packet notifications on all switches, routers, and hosts provides end-to-end visibility into dropped packets, rapidly identifying the location and reason for packet drops as well as identifying the impacted services.

Dropped packet monitoring complements sFlow's existing counter polling and packet sampling mechanisms and shares a common data model, so all three sources of data can be correlated. For example, if packets are being discarded because of buffer exhaustion, the discard records don't necessarily tell the whole story. The discarded packets may represent mice flows that are victims of an elephant flow. Packet samples will reveal the traffic that isn't being dropped and provide a more complete picture. Counter data adds information such as CPU load, interface speed, link utilization, and packet and discard rates, further completing the picture.
