Tuesday, May 19, 2026

Fixing Ghost Drops: How eBPF Rescued IPv6 Telemetry


A customer complains that they aren't getting IPFIX flow data from a router.

Use socat to check that IPFIX is being received (the IANA-assigned port for IPFIX is 4739):

socat -b 0 -dd -u UDP6-RECV:4739 - 2>&1
The output shows that at least some IPFIX messages are received when listening on port 4739.
2026/05/15 22:46:32 socat[108419] N using stdout for writing
2026/05/15 22:46:32 socat[108419] N starting data transfer loop with FDs [5,5] and [1,1]
2026/05/15 22:46:33 socat[108419] N received packet with 0 bytes from AF=10 [fec0:0000:0000:0000:0001:000c:2744:69f1]:50978
2026/05/15 22:46:33 socat[108419] N received packet with 0 bytes from AF=10 [fec0:0000:0000:0000:0001:000c:2744:69f1]:50978
Use tcpdump to check for IPFIX packets. tcpdump captures packets before they reach the host network stack, so packets are visible even if the network stack or the host firewall later drops them.
tcpdump -i enp0s3 -n udp port 4739
The output shows that IPFIX datagrams are being received from a second source, fec0::1:c:2744:69f0, but they aren't showing up in the socat output, so the Linux kernel must be dropping them for some reason.
dropped privs to tcpdump
tcpdump: verbose output suppressed, use -v[v]... for full protocol decode
listening on enp0s3, link-type EN10MB (Ethernet), snapshot length 262144 bytes
21:09:57.217821 IP6 fec0::1:c:2744:69f1.50978 > fec0::27ff:fe8d:4f0b.ipfix: UDP, length 1432
21:09:57.217921 IP6 fec0::1:c:2744:69f0.50978 > fec0::27ff:fe8d:4f0b.ipfix: UDP, length 1428
A check of the host firewall and reverse path filtering settings doesn't explain the drops, so take a more detailed look using tcpdump with the -v (verbose) option.
tcpdump -i enp0s3 -nv udp port 4739
This time we see that packets from fec0::1:c:2744:69f0 have a bad UDP checksum.
dropped privs to tcpdump
tcpdump: listening on enp0s3, link-type EN10MB (Ethernet), snapshot length 262144 bytes
22:07:36.443823 IP6 (flowlabel 0x1991c, hlim 64, next-header UDP (17) payload length: 1230) fec0::1:c:2744:69f1.50978 > fec0::27ff:fe8d:4f0b.ipfix: [udp sum ok] UDP, length 1222
22:07:37.356528 IP6 (flowlabel 0x1991c, hlim 64, next-header UDP (17) payload length: 1008) fec0::1:c:2744:69f0.50978 > fec0::27ff:fe8d:4f0b.ipfix: [bad udp cksum 0xfc60 -> 0xfb60!] UDP, length 1000
Verify by looking at the UDP error counter:
nstat -az Udp6InErrors
Results show an increasing error counter, confirming that the packets are being dropped by the Linux kernel.
#kernel
Udp6InErrors                    5106               0.0
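The same counters can be read directly from /proc/net/snmp6 if nstat isn't installed on the collector. The following is a minimal Python sketch, not part of the original troubleshooting session. The function names are made up for illustration, and it assumes a kernel that also exposes Udp6InCsumErrors, which counts only checksum failures and so helps separate them from other receive errors.

```python
from pathlib import Path

# Counters of interest. Udp6InErrors is the one shown by nstat above;
# Udp6InCsumErrors is assumed to be present (it is absent on very old kernels).
WATCHED = ("Udp6InErrors", "Udp6InCsumErrors")


def parse_snmp6(text: str) -> dict[str, int]:
    """Parse 'name value' lines in the style of /proc/net/snmp6."""
    counters = {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) == 2 and fields[1].isdigit():
            counters[fields[0]] = int(fields[1])
    return counters


def udp6_error_counters(text: str) -> dict[str, int]:
    """Pick out the UDP/IPv6 error counters, defaulting to 0 if missing."""
    counters = parse_snmp6(text)
    return {name: counters.get(name, 0) for name in WATCHED}


def read_udp6_error_counters(path: str = "/proc/net/snmp6") -> dict[str, int]:
    """Read the counters from a live Linux system."""
    return udp6_error_counters(Path(path).read_text())
```

Calling read_udp6_error_counters() twice, a few seconds apart, and comparing the results shows whether drops are still occurring. If both counters rise together, the drops are checksum failures rather than, say, socket buffer overruns.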
Unfortunately, simple fixes that might work for IPv4 (disabling or ignoring the UDP checksum) are not available for IPv6. RFC 8200, Internet Protocol, Version 6 (IPv6) Specification, states:
Unlike IPv4, the default behavior when UDP packets are originated by an IPv6 node is that the UDP checksum is not optional. That is, whenever originating a UDP packet, an IPv6 node must compute a UDP checksum over the packet and the pseudo-header, and, if that computation yields a result of zero, it must be changed to hex FFFF for placement in the UDP header. IPv6 receivers must discard UDP packets containing a zero checksum and should log the error.
Ideally the router would send IPFIX with a correctly computed UDP checksum, and on closer examination it appears that some IPFIX messages from the router do have a correct UDP checksum while others do not. The messages with the correct checksum originate from the router's management CPU, while the messages with the incorrect checksum are generated directly in hardware by the routing chip.

This isn't too surprising. If this were IPv4 export, the hardware could set the UDP checksum to zero (since it is optional) and there would be no issue. However, with IPv6 the mandatory checksum must be computed, a complex calculation that is prone to implementation errors.

Computation of the UDP checksum involves creating an IPv6 pseudo header (shown above) and then calculating the checksum over the IPv6 pseudo header, the UDP header, and the UDP payload.
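To make the calculation concrete, here is a minimal Python sketch of the UDP checksum over the IPv6 pseudo header as described in RFC 8200. It is an illustration only, not code taken from the router or from any tool used in this article, and the function names are invented for the example. It assumes the caller passes the complete UDP header and payload.

```python
import ipaddress
import struct

NEXT_HEADER_UDP = 17


def ones_complement_sum(data: bytes) -> int:
    """Sum 16-bit big-endian words, folding carries back into the low 16 bits."""
    if len(data) % 2:
        data += b"\x00"  # pad odd-length input with a zero byte
    total = sum(struct.unpack(f"!{len(data) // 2}H", data))
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return total


def pseudo_header(src: str, dst: str, upper_len: int) -> bytes:
    """Source, destination, 32-bit length, three zero bytes, next header."""
    return (
        ipaddress.IPv6Address(src).packed
        + ipaddress.IPv6Address(dst).packed
        + struct.pack("!I3xB", upper_len, NEXT_HEADER_UDP)
    )


def udp6_checksum(src: str, dst: str, udp_segment: bytes) -> int:
    """Checksum for a UDP header plus payload; the existing field is ignored."""
    zeroed = udp_segment[:6] + b"\x00\x00" + udp_segment[8:]
    total = ones_complement_sum(pseudo_header(src, dst, len(zeroed)) + zeroed)
    # A computed value of zero must be sent as 0xFFFF.
    return (~total & 0xFFFF) or 0xFFFF


def udp6_checksum_ok(src: str, dst: str, udp_segment: bytes) -> bool:
    """Receiver-side check: reject a zero checksum, else the sum must be 0xFFFF."""
    if udp_segment[6:8] == b"\x00\x00":
        return False
    data = pseudo_header(src, dst, len(udp_segment)) + udp_segment
    return ones_complement_sum(data) == 0xFFFF
```

Because the source and destination addresses are part of the sum, a checksum that is off by even a single bit, as in the tcpdump output earlier, fails the receiver-side check and the datagram is discarded.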

Ideally, the router vendor will fix the issue, but it may not be possible if there are hardware limitations, or it may take time if a fix isn't seen as a priority, so a workaround is needed.

This is where eBPF comes to the rescue! The GitHub fix-udp6-checksum project uses an eBPF program, fix_checksum.c, to compute and rewrite the UDP checksum before the IPFIX packet is handed to the Linux network stack.
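The idea behind such a fix-up can be sketched in user space. The Python below is not the project's fix_checksum.c and makes no claim about how that program is written. It only illustrates the general technique of recomputing the checksum from the bytes of a raw IPv6 packet and writing it back. It assumes no IPv6 extension headers and a packet that starts at the IPv6 header, and all names are hypothetical.

```python
import struct

IPV6_HEADER_LEN = 40
NEXT_HEADER_UDP = 17
IPFIX_PORT = 4739


def sum16(data: bytes) -> int:
    """Unfolded sum of 16-bit big-endian words (odd input is zero padded)."""
    if len(data) % 2:
        data += b"\x00"
    return sum(struct.unpack(f"!{len(data) // 2}H", data))


def fold(total: int) -> int:
    """Fold carries back in until the value fits in 16 bits."""
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return total


def fix_udp6_checksum(packet: bytes) -> bytes:
    """Return the packet with a recomputed UDP checksum.

    Anything that is not a plain IPv6/UDP datagram addressed to the
    IPFIX port is returned unchanged.
    """
    if len(packet) < IPV6_HEADER_LEN + 8 or packet[0] >> 4 != 6:
        return packet
    if packet[6] != NEXT_HEADER_UDP:  # extension headers are not handled
        return packet
    udp = packet[IPV6_HEADER_LEN:]
    dst_port, udp_len = struct.unpack("!HH", udp[2:6])
    if dst_port != IPFIX_PORT or udp_len != len(udp):
        return packet
    zeroed = udp[:6] + b"\x00\x00" + udp[8:]
    total = sum16(packet[8:40])            # source and destination addresses
    total += udp_len + NEXT_HEADER_UDP     # remaining pseudo header fields
    total += sum16(zeroed)                 # UDP header and payload
    checksum = (~fold(total) & 0xFFFF) or 0xFFFF
    return packet[:IPV6_HEADER_LEN + 6] + struct.pack("!H", checksum) + udp[8:]
```

Restricting the rewrite to the IPFIX port keeps the blast radius small: checksum protection for all other UDP traffic is left untouched. The trade-off is that corruption in transit can no longer be detected for the rewritten datagrams.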

Download and compile the fix_checksum.c program on a system running Docker:

git clone https://github.com/inmoncorp/fix-udp6-checksum.git
cd fix-udp6-checksum
./build.sh

Copy the resulting fix_checksum.o file to the IPFIX collector.

Install the eBPF program on interface enp0s3:

sudo tc qdisc add dev enp0s3 clsact
sudo tc filter add dev enp0s3 ingress bpf da obj fix_checksum.o sec tc/ingress

Check that the filter has been installed:

sudo tc -s filter show dev enp0s3 ingress
The output shows that the filter is installed and that the eBPF Just-In-Time (JIT) compiler has run for maximum performance.
filter protocol all pref 49151 bpf chain 0 
filter protocol all pref 49151 bpf chain 0 handle 0x1 fix_checksum.o:[tc/ingress] direct-action not_in_hw id 230 name fix_ipfix_check tag c6a96524b9f80adb jited
Finally, use socat again to verify that the missing IPFIX data is now being received:
2026/05/18 14:06:02 socat[945320] N using stdout for writing
2026/05/18 14:06:02 socat[945320] N starting data transfer loop with FDs [5,5] and [1,1]
2026/05/18 14:06:03 socat[945320] N received packet with 0 bytes from AF=10 [fec0:0000:0000:0000:0001:000c:2744:69f1]:50978
2026/05/18 14:06:03 socat[945320] N received packet with 0 bytes from AF=10 [fec0:0000:0000:0000:0001:000c:2744:69f0]:50978
eBPF is a game changer, providing the ability to change how Linux processes packets without having to build a new kernel or even reboot a production system. What would otherwise have been a major issue is transformed into a relatively straightforward fix and a happy customer.
