Monday, March 22, 2021

In-band Network Telemetry (INT)

The recent addition of In-band Network Telemetry (INT) measurements to the sFlow industry standard simplifies deployment by addressing the operational challenges of in-band monitoring.

The diagram shows the basic elements of In-band Network Telemetry (INT), in which the ingress switch is programmed to insert a header containing measurements into packets entering the network. Each switch in the path is programmed to append additional measurements to the packet header. The egress switch is programmed to remove the header so that the packet can be delivered to its destination. The egress switch is responsible for processing the measurements or sending them on to analytics software.

There are currently two competing specifications for in-band telemetry:

  1. In-band Network Telemetry (INT) Dataplane Specification
  2. Data Fields for In-situ OAM

Common telemetry attributes from both standards include:

  1. node id
  2. ingress port
  3. egress port
  4. transit delay (egress timestamp - ingress timestamp)
  5. queue depth
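As a rough illustration, the per-hop attributes listed above could be modeled as a small record, with transit delay derived from the two timestamps. This is only a sketch: the `HopMetadata` class, its field names, and the sample values are hypothetical and do not follow the wire format of either specification.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class HopMetadata:
    """Hypothetical per-hop record; field names are illustrative only."""
    node_id: int
    ingress_port: int
    egress_port: int
    ingress_timestamp_ns: int
    egress_timestamp_ns: int
    queue_depth_bytes: int

    @property
    def transit_delay_ns(self) -> int:
        # transit delay = egress timestamp - ingress timestamp
        return self.egress_timestamp_ns - self.ingress_timestamp_ns


def path_delay_ns(hops):
    """Sum the transit delay of every hop that reported measurements."""
    return sum(hop.transit_delay_ns for hop in hops)


# Two made-up hops, used only to exercise the calculation.
hops = [
    HopMetadata(1, 10, 20, 1_000, 1_750, 0),
    HopMetadata(2, 3, 7, 5_000, 9_000, 64_000),
]
total_delay = path_delay_ns(hops)
```

Each hop's delay is computed from timestamps taken on the same device, so the sum does not depend on the switches having synchronized clocks.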

Visibility into network forwarding performance is very useful. However, there are practical issues that should be considered with the in-band telemetry approach for collecting the measurements:

  1. Transporting measurement headers is complex, with different encapsulations for each transport protocol: Geneve, VXLAN, GRE, UDP, TCP, etc.
  2. Addition of headers increases the size of packets and risks causing traffic to be dropped downstream due to maximum transmission unit (MTU) restrictions.
  3. The number of measurements that can be added by each switch and the number of switches adding measurements in the path both need to be limited.
  4. In-band telemetry cannot be incrementally deployed. Ideally, all devices need to participate, or at a minimum, the ingress and egress devices need to be in-band telemetry aware.
  5. In-band telemetry transports data from the data plane to the control/management planes, providing a potential attack surface that could be exploited by crafting malicious packets with fake measurement headers.
  6. There is no standard mechanism for transporting measurements from the egress switch for analysis.
  7. There is no data model to link in-band telemetry to other sources of data (NETCONF, SNMP, etc.)
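The MTU and path-length concerns in the list above can be made concrete with a little arithmetic. The sketch below is illustrative only: the function name `max_int_hops` and the header sizes (a 12-byte telemetry header plus 20 bytes of measurements per hop) are assumptions chosen for the example, not values taken from either specification.

```python
def max_int_hops(packet_bytes, mtu_bytes, base_header_bytes, per_hop_bytes):
    """Return how many hops can append measurements before the packet exceeds the MTU.

    A result of 0 means there is no room for any per-hop measurements.
    """
    headroom = mtu_bytes - packet_bytes - base_header_bytes
    if headroom < 0:
        return 0
    return headroom // per_hop_bytes


# Hypothetical sizes: 12-byte telemetry header, 20 bytes of measurements per hop.
hops_small = max_int_hops(packet_bytes=1000, mtu_bytes=1500,
                          base_header_bytes=12, per_hop_bytes=20)
hops_large = max_int_hops(packet_bytes=1446, mtu_bytes=1500,
                          base_header_bytes=12, per_hop_bytes=20)
```

Under these assumed sizes, a packet that is already close to the MTU leaves room for only a couple of hops, which is why the number of measurements and the path length have to be bounded.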

The sFlow Transit Delay Structures extension addresses these issues by defining how the in-band network telemetry attributes can be exported in real-time using the industry standard sFlow protocol.

The sFlow architecture, shown at the top of this article, provides an out-of-band alternative for transporting the per-packet forwarding plane measurements. The switch ASIC attaches performance measurements as metadata to sampled packets sent to the sFlow Agent instead of adding the measurements to the egress packet. The sFlow Agent immediately forwards the additional packet metadata as part of the standard sFlow telemetry stream to a central sFlow Analyzer. The sFlow Analyzer provides a real-time view of the performance of the entire network.

Using sFlow as the telemetry transport has a number of benefits:

  1. Simple to deploy since there is no modification of packets (no issues with encapsulations, MTU, number of measurements, path length, incremental deployment, etc.)
  2. Extensibility of the sFlow protocol allows additional forwarding plane measurements to augment existing sFlow measurements, fully integrating the new measurements with sFlow data exported from other switches in the network (Arista, Aruba, Cisco, Dell, Huawei, Juniper, etc.)
  3. sFlow is a unidirectional telemetry transport protocol that originates from the device management plane and can be sent out of band, limiting possible attack surfaces.
  4. Measurements are delivered in real-time directly to the sFlow Analyzer.
  5. The sFlow data model links telemetry to external data (SNMP, NETCONF, OpenConfig, etc.)

Transit delay and queueing describes the new sFlow measurements in more detail and demonstrates a working implementation. The instrumentation to support these measurements is widely available in current generation network ASICs. If you are interested in visibility into network performance, ask your network vendor about their plans to implement the sFlow Transit Delay Structures extension.

Wednesday, March 17, 2021

Transit delay and queueing

The recently finalized sFlow Transit Delay Structures extension provides visibility into the performance of packet forwarding in a switch or router using the industry standard sFlow protocol.

The diagram provides a logical representation of packet forwarding. A packet is received at an Ingress Port, the packet header is examined, and a forwarding decision is made to add the packet to one of the queues associated with an Egress Port. Finally, the packet is removed from the queue and sent out the Egress Port to be received by the next device in the chain.

The time between receiving a packet at the Ingress Port and sending it out the Egress Port is the packet's transit delay. The transit delay is affected by the time it takes to make the forwarding decision and the time the packet spends in the queue. Identifying the specific queue selected and the number of bytes already in the queue fills out the set of performance metrics for the forwarding decision. The sFlow Transit Delay Structures extension adds these performance metrics to the metadata associated with each packet sample.

The following output from sflowtool shows the data contained in a packet sample:

startSample ----------------------
sampleType_tag 0:1
sampleSequenceNo 91159
sourceId 0:2216
meanSkipCount 400
samplePool 36463600
dropEvents 0
inputPort 2215
outputPort 2216
flowBlock_tag 0:1036
extendedType egress_queue
egress_queue_id 7
flowBlock_tag 0:1040
extendedType queue_depth
queue_depth_bytes 11354112
flowBlock_tag 0:1039
extendedType transit_delay
transit_delay_nS 839660224
flowBlock_tag 0:1
flowSampleType HEADER
headerProtocol 1
sampledPacketSize 1446
strippedBytes 4
headerLen 128
headerBytes 98-03-9B-8F-B5-CC-98-03-9B-94-C7-D5-08-00-45-16-05-94-12-C7-00-00-FE-11-B8-43-C0-00-02-02-C6-33-64-02-30-39-D4-31-05-80-D7-1D-42-42-42-42-42-42-42-42-42-42-42-42-42-42-42-42-42-42-42-42-42-42-42-42-42-42-42-42-42-42-42-42-42-42-42-42-42-42-42-42-42-42-42-42-42-42-42-42-42-42-42-42-42-42-42-42-42-42-42-42-42-42-42-42-42-42-42-42-42-42-42-42-42-42-42-42-42-42-42-42-42-42-42-42-42-42
dstMAC 98039b8fb5cc
srcMAC 98039b94c7d5
IPSize 1428
ip.tot_len 1428
IPProtocol 17
IPID 50962
UDPSrcPort 12345
UDPDstPort 54321
UDPBytes 1408
endSample   ----------------------

The forwarding performance information is highlighted. The inputPort, outputPort, egress_queue_id, queue_depth_bytes, and transit_delay_nS values describe the performance observed by the sampled packet. The sampled packet header allows performance to be reported by specific hosts, protocols, ports, connections, etc.
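Because sflowtool prints one key and value per line, the performance fields are easy to pull out with a few lines of script. The following is a minimal sketch, assuming the text format shown above; the helper names `parse_samples` and `performance` are hypothetical, and the embedded text is an abbreviated copy of the sample.

```python
def parse_samples(text):
    """Split sflowtool's line-oriented output into one dict per sample."""
    samples, current = [], None
    for line in text.splitlines():
        parts = line.split(None, 1)
        if not parts:
            continue
        if parts[0] == "startSample":
            current = {}
        elif parts[0] == "endSample":
            if current is not None:
                samples.append(current)
            current = None
        elif current is not None and len(parts) == 2:
            # Later keys overwrite earlier ones, e.g. repeated extendedType lines.
            current[parts[0]] = parts[1].strip()
    return samples


def performance(sample):
    """Pick out the forwarding performance fields, converting them to integers."""
    fields = ("inputPort", "outputPort", "egress_queue_id",
              "queue_depth_bytes", "transit_delay_nS")
    return {name: int(sample[name]) for name in fields if name in sample}


# Abbreviated copy of the sample shown above.
output = """\
startSample ----------------------
inputPort 2215
outputPort 2216
extendedType egress_queue
egress_queue_id 7
extendedType queue_depth
queue_depth_bytes 11354112
extendedType transit_delay
transit_delay_nS 839660224
UDPSrcPort 12345
UDPDstPort 54321
endSample   ----------------------
"""
sample = parse_samples(output)[0]
perf = performance(sample)
delay_ms = perf["transit_delay_nS"] / 1e6
```

Keeping the remaining keys (addresses, protocol, ports) in the same dictionary makes it straightforward to group the delay and queue measurements by host, protocol, or connection.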

Linux 4.11 kernel extends packet sampling support describes the Linux PSAMPLE interface used by the Host sFlow agent to receive packet samples. PSAMPLE has been extended in the Linux 5.13 kernel to add the performance metrics needed to populate the sFlow Transit Delay Structures: PSAMPLE_ATTR_OUT_TC (egress queue), PSAMPLE_ATTR_OUT_TC_OCC (egress queue depth), and PSAMPLE_ATTR_LATENCY (transit delay). The Host sFlow agent now exports the performance data when it is available. The decoded sFlow record above was generated by Host sFlow running on a hardware switch and shows measurements made by the switch ASIC.
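The relationship between the PSAMPLE attributes and the sFlow structures can be summarized as a lookup table. This is a conceptual sketch only, not the Host sFlow agent's implementation: the structure tags and field names are the ones printed by sflowtool in the sample above, while the `to_sflow_fields` helper and the dictionary layout are hypothetical.

```python
# PSAMPLE attribute name -> (sFlow structure tag, field name as printed by sflowtool)
PSAMPLE_TO_SFLOW = {
    "PSAMPLE_ATTR_OUT_TC": ("0:1036", "egress_queue_id"),
    "PSAMPLE_ATTR_OUT_TC_OCC": ("0:1040", "queue_depth_bytes"),
    "PSAMPLE_ATTR_LATENCY": ("0:1039", "transit_delay_nS"),
}


def to_sflow_fields(psample_attrs):
    """Translate whichever optional PSAMPLE attributes are present into sFlow fields."""
    fields = {}
    for attr, (tag, name) in PSAMPLE_TO_SFLOW.items():
        if attr in psample_attrs:  # a driver may not report every attribute
            fields[name] = {"tag": tag, "value": psample_attrs[attr]}
    return fields


# Example: a driver that reports the egress queue and latency but not queue depth.
fields = to_sflow_fields({"PSAMPLE_ATTR_OUT_TC": 7,
                          "PSAMPLE_ATTR_LATENCY": 839660224})
```

Treating each attribute as optional mirrors the point that the agent exports the performance data only when it is available.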

PSAMPLE and the Host sFlow agent are becoming the standard for sFlow monitoring of Linux-based operating systems such as Cumulus Linux, DENT, and SONiC. As ASIC vendors include the measurements in their device driver PSAMPLE support, they will automatically be included in the sFlow telemetry.

Support for the new extensions has also been added to the sFlow-RT real-time analytics engine. The open source sFlow-RT Flow Browser application shown in the screen shot above displays a real-time, up to the second, view of traffic based on the packet sample telemetry streaming from network devices (switches, routers, and hosts).

In the chart above, the value being plotted has been changed from Bits per Second and is now displaying flows with the highest transit delay (in nanoseconds). The specific device, ingress port, egress port, and egress queue are also identified. 

In the chart above, queue depth (in bytes) is displayed, showing that the nearly 12 Mbytes queue depth is responsible for the transit delay seen in the previous chart.
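The link between queue depth and delay can be checked with simple arithmetic. The sketch below assumes that the whole transit delay is time spent waiting in the queue and that the queue drains at a constant rate; neither assumption is stated in the article, and the function names are hypothetical. The input values are taken from the decoded sample shown earlier.

```python
def implied_drain_rate_bps(queue_depth_bytes, transit_delay_ns):
    """Rate at which the queue must drain if the delay is entirely queueing time."""
    return queue_depth_bytes * 8 / (transit_delay_ns / 1e9)


def queueing_delay_ns(queue_depth_bytes, drain_rate_bps):
    """Time needed to transmit the bytes already queued ahead of a packet."""
    return queue_depth_bytes * 8 / drain_rate_bps * 1e9


# queue_depth_bytes and transit_delay_nS from the sflowtool output above.
rate = implied_drain_rate_bps(11354112, 839660224)
```

For these values the implied drain rate works out to roughly 108 Mbit/s, so under the stated assumptions a queue of this depth is enough on its own to account for a delay of about 0.84 seconds.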

If the queue is full and the packet is dropped, the sFlow Dropped Packet Notification Structures extension allows the sFlow agent to report details of the dropped packet. Using sFlow to monitor dropped packets describes how the Host sFlow agent uses the Linux drop_monitor interface to implement the extension.

In the final chart above, the open source sFlow-RT Discard Browser application displays a sequence of packets being dropped by a switch as a host attempts, and fails, to establish a TCP connection. The reason for dropping the packets (an access control list) as well as device and ingress port where the packets were dropped are captured.

Transit delay and dropped packet monitoring leverage advanced instrumentation in the latest generation of network ASICs to provide valuable insight into network performance. Integration with industry standard streaming sFlow telemetry provides real-time network-wide visibility into traffic, performance, and errors.

Tuesday, March 9, 2021

InfluxDB 2.0 released

InfluxData advances possibilities of time series data with general availability of InfluxDB 2.0 announces the production release of InfluxDB 2.0. This article demonstrates how to import sFlow data into InfluxDB 2.0 using sFlow-RT to provide visibility into network traffic.

Real-time network and system metrics as a service describes how to use Docker Desktop to replay previously captured sFlow data. Follow the instructions in the article to start an instance of sFlow-RT.

Create a directory for InfluxDB to use to store data and configuration settings:
mkdir data
Now start InfluxDB using the pre-built influxdb image:
docker run --rm --name=influxdb -p 8086:8086 \
-v $PWD/data:/var/lib/influxdb2 influxdb:alpine \
--nats-max-payload-bytes=10000000

Note: sFlow-RT is collecting metrics for all the sFlow agents embedded in switches, routers, and servers. The default value of nats-max-payload-bytes (1048576) may be too small to hold all the metrics returned when sFlow-RT is queried. The error,  nats: maximum payload exceeded, in InfluxDB logs indicates that the limit needs to be increased. In this example, the value has been increased to 10000000.

Now access the InfluxDB web interface at http://localhost:8086/

The screen capture above shows three scrapers configured in InfluxDB 2.0:
  1. sflow-analyzer
    URL: http://host.docker.internal:8008/prometheus/analyzer/txt
  2. sflow-metrics
    URL: http://host.docker.internal:8008/prometheus/metrics/ALL/ALL/txt
  3. sflow-flow-src-dst
    URL: http://host.docker.internal:8008/app/prometheus/scripts/export.js/flows/ALL/txt?metric=flow_src_dst_bps&key=ipsource,ipdestination&value=bytes&aggMode=max&maxFlows=100&minValue=1000&scale=8
The first collects metrics about the performance of the sFlow-RT analytics engine, the second collects all the metrics exported by the sFlow agents, and the third collects a flow metric.
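The flow scraper URL is long enough that it can be easier to assemble it from its parts. The following sketch rebuilds the third URL from the query parameters shown above; the `flow_scrape_url` helper is hypothetical, and the parameter names are simply copied from that URL.

```python
from urllib.parse import urlencode

BASE = "http://host.docker.internal:8008"


def flow_scrape_url(metric, keys, value, **options):
    """Build the export.js URL that a scraper polls for a flow metric."""
    query = {"metric": metric, "key": ",".join(keys), "value": value, **options}
    # Leave commas unescaped so the key list matches the form shown above.
    return (BASE + "/app/prometheus/scripts/export.js/flows/ALL/txt?"
            + urlencode(query, safe=","))


url = flow_scrape_url(
    "flow_src_dst_bps", ["ipsource", "ipdestination"], "bytes",
    aggMode="max", maxFlows=100, minValue=1000, scale=8,
)
```

Changing the key list or the metric name in one place yields additional scraper URLs of the same form.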
InfluxDB 2.0 now includes the data exploration and dashboard building capabilities that were previously in the separate Chronograf application. The screen capture above shows a simple chart trending the flow metric.

Monday, March 1, 2021

DDoS Mitigation with Juniper, sFlow, and BGP Flowspec

Real-time DDoS mitigation using BGP RTBH and FlowSpec, DDoS protection of local address space, Pushing BGP Flowspec rules to multiple routers, Monitoring DDoS mitigation, and Docker DDoS testbed demonstrate how sFlow and BGP Flowspec are combined by the DDoS Protect application running on the sFlow-RT real-time analytics engine to automatically detect and block DDoS attacks.

This article discusses how to deploy the DDoS Protect application in a Juniper Networks environment. Juniper has a long history of supporting BGP Flowspec on their routing platforms and Juniper has added support for sFlow to their entire product range, see sFlow available on Juniper MX series routers.

First, Junos doesn't provide a way to connect to the non-standard BGP port (1179) that sFlow-RT uses by default. Allowing sFlow-RT to open the standard BGP port (179) requires that the service be given additional Linux capabilities. 

docker run --rm --net=host --sysctl net.ipv4.ip_unprivileged_port_start=0 \
sflow/ddos-protect -Dbgp.port=179

The above command launches the prebuilt sflow/ddos-protect Docker image. Alternatively, if sFlow-RT has been installed as a deb / rpm package, then the required permissions can be added to the service.

sudo systemctl edit sflow-rt.service
Type the above command to edit the service configuration and add the following lines:
[Service]
AmbientCapabilities=CAP_NET_BIND_SERVICE
Next, edit the sFlow-RT configuration file for the DDoS Protect application:
sudo vi /usr/local/sflow-rt/conf.d/ddos-protect.conf
and add the line:
bgp.port=179
Finally, restart sFlow-RT:
sudo systemctl restart sflow-rt
The application is now listening for BGP connections on TCP port 179.

Now configure the router to send sFlow telemetry to sFlow-RT - see Junos: sFlow Monitoring Technology
set protocols sflow collector udp-port 6343
set protocols sflow polling-interval 20
set protocols sflow sample-rate ingress 1000
set protocols sflow interfaces ge-0/0/0
set protocols sflow interfaces ge-0/0/1
For example, the above commands enable sFlow monitoring on a Juniper MX router. See sFlow-RT Agents for recommended sFlow configuration settings.
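With a sampling rate of 1-in-1000, each packet sample stands in for roughly 1000 packets on the wire. The sketch below shows the basic scaling arithmetic an analyzer applies; it is a simplified illustration rather than sFlow-RT's implementation, and the function names and sample sizes are hypothetical.

```python
def estimate_totals(sampled_sizes, sampling_rate):
    """Scale packet samples up to estimated packets and bytes."""
    packets = len(sampled_sizes) * sampling_rate
    octets = sum(sampled_sizes) * sampling_rate
    return packets, octets


def estimate_bps(sampled_sizes, sampling_rate, interval_seconds):
    """Estimated bits per second over the interval in which the samples arrived."""
    _, octets = estimate_totals(sampled_sizes, sampling_rate)
    return octets * 8 / interval_seconds


# Three hypothetical samples seen in one second at the 1-in-1000 rate configured above.
packets, octets = estimate_totals([1446, 64, 1446], sampling_rate=1000)
bps = estimate_bps([1446, 64, 1446], sampling_rate=1000, interval_seconds=1)
```

A lower sampling rate value (more samples) gives tighter estimates at the cost of more telemetry, which is the trade-off behind the recommended settings.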

Also configure a BGP Flowspec session with sFlow-RT - see Junos: Multiprotocol BGP.
policy-options {
    policy-statement ACCEPT_ALL {
        from protocol bgp;
        then accept;
    }
}
routing-options {
    autonomous-system 65000;
}
protocols {
    bgp {
        group sflow-rt {
            type internal;
            family inet {
                flow {
                    no-validate ACCEPT_ALL;
                }
            }
            family inet6 {
                flow {
                    no-validate ACCEPT_ALL;
                }
            }
            neighbor {
                import ACCEPT_ALL;
                peer-as 65000;
            }
        }
    }
}
The above configuration establishes the BGP Flowspec session with sFlow-RT.

Real-time DDoS mitigation using BGP RTBH and FlowSpec describes how to simulate a DDoS UDP amplification attack in order to test the automated detection and control functionality.  
root@07358a106c21> show route table inetflow.0 detail    

inetflow.0: 1 destinations, 1 routes (1 active, 0 holddown, 0 hidden)
,*,proto=17,srcport=53/term:N/A (1 entry, 0 announced)
        *BGP    Preference: 170/-101
                Next hop type: Fictitious, Next hop index: 0
                Address: 0x55653aae979c
                Next-hop reference count: 1
                Next hop: 
                State: <Active Int Ext SendNhToPFE>
                Local AS: 65000 Peer AS: 65000
                Age: 6 
                Validation State: unverified 
                Task: BGP_65000.
                AS path: I 
                Communities: traffic-rate:0:0
                Localpref: 100
                Router ID:
Command line output from the router shown above verifies that a Flowspec control blocking the amplification attack has been received. The control will remain in place for 60 minutes (the configured timeout), after which it will be automatically withdrawn. If the attack is still in progress it will be immediately detected and the control reapplied.
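The control lifecycle described above (apply, hold for the timeout, withdraw, reapply if the attack persists) can be sketched as a small table of expiry times. This is an illustrative model under stated assumptions, not the DDoS Protect application's code: the `ControlTable` class is hypothetical, time is passed in explicitly as seconds, and the match string is taken from the route shown above.

```python
class ControlTable:
    """Hypothetical table of active controls keyed by flow match."""

    def __init__(self, timeout_seconds=60 * 60):
        self.timeout = timeout_seconds
        self.expires = {}  # match string -> expiry time in seconds

    def attack_detected(self, match, now):
        """Apply a control if none is active; return True when one is (re)applied."""
        if match in self.expires:
            return False
        self.expires[match] = now + self.timeout
        return True

    def withdraw_expired(self, now):
        """Remove and return controls whose timeout has passed."""
        expired = [m for m, t in self.expires.items() if t <= now]
        for match in expired:
            del self.expires[match]
        return expired


table = ControlTable()
match = "proto=17,srcport=53"
first = table.attack_detected(match, now=0)        # control applied
repeat = table.attack_detected(match, now=1800)    # already in place
withdrawn = table.withdraw_expired(now=3600)       # 60 minute timeout reached
reapplied = table.attack_detected(match, now=3601) # attack still in progress
```

Withdrawing on a timer and relying on detection to reapply keeps stale controls from lingering after an attack ends.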

DDoS Protect can mitigate a wide range of common attacks, including: NTP, DNS, Memcached, SNMP, and SSDP amplification attacks; IP, UDP, ICMP and TCP flood attacks; and IP fragmentation attacks. Mitigation options include: remote triggered black hole (RTBH), filtering, rate limiting, DSCP marking, and redirection. IPv6 is fully supported in detection and mitigation of each of these attack types.