The diagram above, from the BroadView™ 2.0 Instrumentation Ecosystem presentation, illustrates how instrumentation built into the network Data Plane (the Broadcom Trident/Tomahawk ASICs used in most data center switches) provides visibility to Software Defined Networking (SDN) controllers so that they can optimize network performance.
The sFlow measurement standard provides open, scalable, multi-vendor streaming telemetry that supports SDN applications. Broadcom has been augmenting the rich set of counter and flow measurements in the base sFlow standard with additional metrics. For example, Broadcom ASIC table utilization metrics, DevOps, and SDN describes metrics that were added to track ASIC table resource consumption.
The highlighted Buffer congestion state / statistics capability in the slide refers to the BroadView Buffer Statistics Tracking (BST) instrumentation. The Memory Management Unit (MMU) is on-chip logic that manages how the on-chip packet buffers are organized. BST is a feature that enables tracking the usage of these buffers. It includes snapshot views of peak utilization of the on-chip buffer memory across queues, ports, priority groups, service pools, and the entire chip.
While the trend chart is useful, the value of BST instrumentation is fully realized when the data is integrated into the sFlow telemetry stream, allowing buffer utilization to be correlated with the traffic flows consuming the buffers. Broadcom's recently published sFlow extension, sFlow Broadcom Peak Buffer Utilization Structures, standardizes the export of the buffer metrics, ensures multi-vendor interoperability, and provides the comprehensive, actionable telemetry from the network required by SDN applications.
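Once exported, the buffer metrics can be consumed like any other sFlow metric. As a minimal sketch, assuming the sFlow-RT collector and a hypothetical metric name (actual names depend on how the collector labels the new structures), the peak buffer utilization reported by each switch could be polled as follows:

import requests  # HTTP client for the sFlow-RT REST API

RT = 'http://localhost:8008'
METRIC = 'bcm_buffer_util'  # hypothetical metric name

# Query the latest value of the metric from every switch (agent)
# that has reported it.
r = requests.get('%s/metric/ALL/%s/json' % (RT, METRIC))
r.raise_for_status()
for entry in r.json():
    print(entry['agent'], entry.get('metricValue'))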
Ask switch vendors about their plans to support the extension in their sFlow implementations. The enhanced visibility into buffer utilization addresses a number of important use cases:
Fabric-wide visibility into peak buffer utilization
Derive worst case end to end latency
Proactively track microbursts and identify hot spots before packets are lost
Correlate with traffic flows and link utilizations
Improve performance through QoS marking, load spreading, and workload placement
DDoS Blackhole has been released on GitHub, https://github.com/sflow-rt/ddos-blackhole. The application detects Distributed Denial of Service (DDoS) flood attacks in real-time and can automatically install a null / blackhole route to drop the attack traffic and maintain Internet connectivity. See DDoS for additional background.
The screen capture above shows a simulated DNS amplification attack. The Top Targets chart is a real-time view of external traffic to on-site IP addresses. The red line indicates the threshold, set at 10,000 packets per second, and traffic to address 192.168.151.4 clearly exceeds it. The Top Protocols chart below shows that the increase in traffic is predominantly DNS. The Controls chart shows that a control was added the instant the traffic crossed the threshold.
The Controls tab shows a table of the currently active controls. In this case, the controller is running in Manual mode and is listed with a pending status as it awaits manual confirmation (which is why the attack traffic persists in the Charts page). Clicking on the entry brings up a form that can be used to apply the control.
The chart above from the DDoS article shows an actual attack where the controller automatically dropped the attack traffic.
The basic settings are straightforward, allowing the threshold, duration, mode of operation and protected address ranges to be set.
Controls are added and removed by calling an external TCL/Expect script which logs into the site router and applies the following CLI command to drop traffic to the targeted address:
ip route target_ip/32 null0 name "DOS ATTACK"
The script can easily be modified or replaced to apply different controls or to work with different vendor CLIs.
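For example, the following minimal Python sketch, using pexpect in place of TCL/Expect, performs the same function; the router address, credentials, and prompt strings are placeholders that would need to match the site router:

import sys
import pexpect  # Python analog of TCL/Expect

target_ip = sys.argv[1]  # attacked address, passed in by the controller

# Placeholder address, credentials and prompts - match to the site router.
child = pexpect.spawn('ssh admin@10.0.0.1')
child.expect('password:')
child.sendline('admin-password')
child.expect('#')
child.sendline('configure terminal')
child.expect('#')
# Null route the targeted address to drop the attack traffic.
child.sendline('ip route %s/32 null0 name "DOS ATTACK"' % target_ip)
child.expect('#')
child.sendline('exit')
child.close()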
Additional instructions are available under the Help tab. Instructions for downloading and installing the DDoS Blackhole application are available on sFlow-RT.com.
The software will work on any site with sFlow capable switches, even if the router itself doesn't support sFlow. Running the application in Manual mode is a completely safe way to become familiar with the software features and get an understanding of normal traffic levels. Download the software and give it a try.
Enabling extensibility in OVN, by Gal Sagie, Huawei and Liran Schour, IBM, Open vSwitch 2015 Fall Conference describes a method for composing actions from an external application with actions installed by the Open Virtual Network (OVN) controller.
An API allows external services to be attached to elements of the OVN logical topology, resulting in a table in the OVN logical flow pipeline that is under the control of the external service. Changes to the logical table are then automatically instantiated as concrete flows in the Open vSwitch instances responsible for handling the packets in the flow.
The demo presented involves detecting large "Elephant" flows using sFlow instrumentation embedded in Open vSwitch. Once a large flow is detected, logical flows are instantiated in the OVN controller to mark the packets. The concrete marking rules are inserted in the Open vSwitch packet processing pipelines handling the logical flow's packets. In the demo, the marked packets are then diverted by the physical network to a dedicated optical circuit.
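As a rough sketch of the detection step (not the code used in the demo), the sFlow-RT REST API can define the flow to monitor, set a threshold, and long poll for large flow events; the flow definition and 100M bytes per second threshold below are illustrative:

import requests

RT = 'http://localhost:8008'

# Track bytes between source/destination address pairs.
requests.put('%s/flow/elephant/json' % RT,
             json={'keys': 'ipsource,ipdestination', 'value': 'bytes'})

# Raise an event when a flow exceeds 100M bytes per second.
requests.put('%s/threshold/elephant/json' % RT,
             json={'metric': 'elephant', 'value': 100000000})

eventID = -1
while True:
    # Long poll for threshold events newer than the last one seen.
    events = requests.get(
        '%s/events/json?maxEvents=10&timeout=60&eventID=%d' % (RT, eventID)
    ).json()
    if events:
        eventID = events[0]['eventID']  # events are returned newest first
    for e in events:
        # A marking rule for this flow would be instantiated here.
        print('elephant flow detected:', e['flowKey'])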
There are a number of interesting traffic control use cases described on this blog that could leverage the capabilities of Open vSwitch using this approach:
The logical flow abstraction implemented by OVN greatly simplifies the problem of composing policies needed to integrate service injection and chaining within the packet pipeline and is a very promising platform for tackling this class of problem.
Many service injection use cases, including the hybrid packet/optical case demonstrated in this talk, need to be informed by measurements made throughout the network; for example, you wouldn't want to divert traffic to the optical link if it were already at capacity. New OVS instrumentation features aimed at real-time monitoring of virtual networks, another talk from the Open vSwitch conference, describes how sFlow measurements from Open vSwitch and physical network switches can be combined to provide the comprehensive visibility needed for this type of traffic control.
There are significant advantages to applying service injection actions in the virtual switches, primarily avoiding the resource constraints imposed by physical hardware that limit the number and types of rules that can be applied. However, some services are location dependent, and so the external application may need direct access to the injected table on the individual virtual switches. This could be achieved by exposing the tables through the OVN daemons running on the hypervisors, or directly via OpenFlow connections to the Open vSwitch instances that allow the external application to control only the tables created for it.
The OpenFlow approach could be implemented through the OVN Northbound database, attaching the external service specific tables to the virtual network topology and assigning an OpenFlow controller to manage those tables. Once the changes propagate down to the vSwitches, each vSwitch would create an additional OpenFlow connection to the designated controller, exposing only the assigned table. This notion of "slicing" resources is one of the earliest use cases for OpenFlow, see FlowVisor. In addition to supporting location dependent use cases, this approach offloads runtime control of delegated tables from OVN, decoupling the performance of the injected services from OVN applied controls.
Open vSwitch is an open source software virtual switch that is popular in cloud environments such as OpenStack. Open vSwitch is a standard Linux component that forms the basis of a number of commercial and open source solutions for network virtualization, tenant isolation, and network function virtualization (NFV) - implementing distributed virtual firewalls and routers.
The recent Open vSwitch 2015 Fall Conference agenda included a wide variety of speakers addressing a range of topics, including Open Virtual Network (OVN), containers, service chaining, and network function virtualization (NFV).
The video above is a recording of the following sFlow related talk from the conference:
New OVS instrumentation features aimed at real-time monitoring of virtual networks (Peter Phaal, InMon)
The talk will describe the recently added packet-sampling mechanism that returns the full list of OVS actions from the kernel. A demonstration will show how the OVS sFlow agent uses this mechanism to provide real-time tunnel visibility. The motivation for this visibility will be discussed, using examples such as end-to-end troubleshooting across physical and virtual networks, and tuning network packet paths by influencing workload placement in a VM/Container environment.
New OVS instrumentation features aimed at real-time monitoring of virtual networks, Open vSwitch 2015 Fall Conference, included a demonstration of real-time visibility into the logical network overlays created by network virtualization, virtual switches, and the leaf and spine underlay carrying the tunneled traffic between hosts.
The diagram above shows the demonstration testbed. It consists of a leaf and spine network connecting two hosts, each of which is running a pair of Docker containers connected to Open vSwitch (OVS). The vSwitches are controlled by Open Virtual Network (OVN), which has been configured to create two logical switches, the first connecting the left most containers on each host and the second connecting the right most containers. The testbed, described in more detail in Open Virtual Network (OVN), is built from free components and can easily be replicated.
The dashboard in the video illustrates the end to end visibility that is possible by combining standard sFlow instrumentation in the physical switches with sFlow instrumentation in Open vSwitch and Host sFlow agents on the servers.
The diagram on the left of the dashboard shows a logical map of the elements in the testbed. The top panel shows the two logical switches created in OVN, sw1 connecting the left containers and sw0 connecting the right containers. The dotted lines represent the logical switch ports, and their width shows the current traffic rate flowing over the logical link.
The solid lines below show the path that the virtual network traffic actually takes: from a container on Server1 to the virtual switch (OVS), where it is encapsulated in a Geneve tunnel and sent via leaf1, spine1, and leaf2 to the OVS instance on Server2, which decapsulates the traffic and delivers it to the other container on the logical switch.
Both logical networks share the same physical network, and this is shown by link color. Traffic associated with sw1 is shown in red, traffic associated with sw0 in blue, and a mixture of sw1 and sw0 traffic in magenta.
The physical network is a shared resource for the virtual networks that make use of it. Understanding how this resource is being utilized is essential to ensure that virtual networks do not interfere with each other, for example, one tenant backing up data over their virtual network may cause unacceptable response time problems for another tenant.
The strip charts to the right of the diagram show representative examples of the data that is available and demonstrate comprehensive visibility into, and across, layers in the virtualization stack. Going from top to bottom:
Container CPU Utilization: The trend chart shows the per container CPU load. This data comes from the Host sFlow agents.
Container Traffic: This trend chart merges sFlow data from the Host sFlow agents and Open vSwitch to show traffic flowing between containers.
OVN Virtual Switch Traffic: This trend chart merges data from the OVN Northbound interface (specifically, logical port MAC addresses and logical switch names) and Open vSwitch to show traffic flowing through each logical switch.
Open vSwitch Performance: This trend chart shows key performance indicators based on metrics exported by the Open vSwitches, see Open vSwitch performance monitoring.
Leaf/Spine Traffic: This chart combines data from all the switches in the leaf/spine network to show traffic flows. The chart demonstrates that sFlow from the physical network devices provides visibility into the outer (tunnel) addresses and inner (tenant/virtual network) addresses, see Tunnels.
The visibility shown in the diagram and charts is only possible because all the elements of the infrastructure are instrumented using sFlow. No single element or layer has a complete picture, but when you combine information from all the elements a full picture emerges, see Visibility and the software defined data center.
The demonstration is available on GitHub. The following steps will download the demo and provide data that can be used to explore the sFlow-RT APIs described in Open Virtual Network (OVN) that were used to construct this dashboard.
wget http://www.inmon.com/products/sFlow-RT/sflow-rt.tar.gz
tar -xvzf sflow-rt.tar.gz
cd sflow-rt
./get-app.sh pphaal ovs-2015
Edit the start.sh file to play back the included captured sFlow data:
[user@server sflow-rt]$ ./start.sh
2015-11-18T13:22:14-0800 INFO: Reading PCAP file, app/ovs-2015/demo.pcap
2015-11-18T13:22:15-0800 INFO: Starting the Jetty [HTTP/1.1] server on port 8008
2015-11-18T13:22:15-0800 INFO: Starting com.sflow.rt.rest.SFlowApplication application
2015-11-18T13:22:15-0800 INFO: Listening, http://localhost:8008
2015-11-18T13:22:15-0800 INFO: init.js started
2015-11-18T13:22:15-0800 INFO: app/ovs-2015/scripts/status.js started
2015-11-18T13:22:16-0800 INFO: init.js stopped
Finally, access the web interface http://server:8008/app/ovs-2015/html/ and you should see the screen shown in the video.
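The playback data can also be queried directly through the REST API. For example, the following sketch (assuming Python with the requests module and the server address from the URL above) defines a flow and lists the top traffic sources and destinations seen by the switches:

import requests

RT = 'http://server:8008'

# Define a flow to track traffic by source and destination address.
requests.put('%s/flow/demo/json' % RT,
             json={'keys': 'ipsource,ipdestination', 'value': 'bytes'})

# List the current top flows across all agents in the playback data.
r = requests.get('%s/activeflows/ALL/demo/json?maxFlows=10' % RT)
for flow in r.json():
    print(flow['key'], int(flow['value']))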
The sFlow monitoring technology scales to large production networks. The instrumentation is built into physical switch hardware and is available in 1G/10G/25G/40G/50G/100G data center switches from most vendors (see sFlow.org). The sFlow instrumentation in Open vSwitch is built into the Linux kernel and is an extremely efficient method of monitoring large numbers of virtual machines and/or containers.
The demonstration dashboard illustrates the type of operational visibility that can be delivered using sFlow. However, for large scale deployments, sFlow data can be incorporated into existing DevOps tool sets to augment data that is already being collected.
The diagram above shows how the sFlow-RT analytics engine is used to deliver metrics and events to cloud based and on-site DevOps tools, see: Cloud analytics, InfluxDB and Grafana, Metric export to Graphite, and Exporting events using syslog. There are important scalability and cost advantages to placing the sFlow-RT analytics engine in front of metrics collection applications as shown in the diagram. For example, in large scale cloud environments the metrics for individual members of a dynamic pool are not worth trending in isolation, since virtual machines are frequently added and removed. Instead, sFlow-RT can be configured to track all the members of the pool, calculate summary statistics for the pool, and log only the summaries. This pre-processing can significantly reduce storage requirements, reduce costs, and increase query performance.
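The following sketch illustrates the pre-processing idea, assuming Python with the requests module, the standard load_one metric, and a Graphite server address that is purely a placeholder; it polls sFlow-RT for the metric across all members of the pool and logs only the summary statistics:

import socket
import time
import requests

RT = 'http://localhost:8008'
GRAPHITE = ('graphite.example.com', 2003)  # placeholder plaintext port

while True:
    # Poll the metric for every member of the pool.
    values = [m['metricValue']
              for m in requests.get('%s/metric/ALL/load_one/json' % RT).json()
              if m.get('metricValue') is not None]
    if values:
        now = int(time.time())
        summary = {'mean': sum(values) / len(values),
                   'max': max(values), 'min': min(values)}
        # Log only the pool summaries, not the individual members.
        msg = ''.join('pool.load_one.%s %f %d\n' % (k, v, now)
                      for k, v in summary.items())
        sock = socket.create_connection(GRAPHITE)
        sock.sendall(msg.encode())
        sock.close()
    time.sleep(60)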
From the SCinet web page, "SCinet brings to life a very high-capacity network that supports the revolutionary applications and experiments that are a hallmark of the SC conference. SCinet will link the convention center to research and commercial networks around the world. In doing so, SCinet serves as the platform for exhibitors to demonstrate the advanced computing resources of their home institutions and elsewhere by supporting a wide variety of bandwidth-driven applications including supercomputing and cloud computing."
The real-time weathermap leverages industry standard sFlow instrumentation built into network switch and router hardware to provide scalable monitoring of the more than 6 Terabits per second of aggregate link capacity comprising the SCinet network. Link colors are updated every second to reflect the operational status and utilization of each link.
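A sketch of the underlying calculation, with illustrative color bins rather than SCinet's actual scheme: utilization is derived from successive sFlow interface counter samples and then binned into a display color.

def utilization(octets_now, octets_prev, interval_secs, if_speed_bps):
    # Bits transferred over the counter polling interval divided by
    # link capacity gives the fraction of the link in use.
    return (octets_now - octets_prev) * 8 / (interval_secs * if_speed_bps)

def link_color(u):
    # Illustrative bins, not SCinet's actual color scheme.
    if u >= 0.8: return 'red'
    if u >= 0.5: return 'orange'
    if u > 0.0: return 'green'
    return 'gray'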
Clicking on a link in the map pops up a 1 second resolution strip chart showing the protocol mix carried by the link.
sFlow Test has been released on GitHub, https://github.com/sflow-rt/sflow-test. The suite of checks is intended to validate the implementation of sFlow on a data center switch. In particular, the tests are designed to verify that the sFlow agent implementation provides measurements under load with the accuracy needed to drive SDN control applications, including:
Many of the tests can be run while the switches are in production and are a useful way of verifying that a switch is configured and operating correctly.
The stress tests can be scaled to run without specialized equipment. For example, the recommended sampling rate for 10G links in production is 1-in-10,000. Driving a switch with 48x10G ports (960Gbit/s of capacity, counting both directions on full duplex links) to 30% of total capacity would require a load generator capable of generating 288Gbit/s. However, dropping the sampling rate to 1-in-100 and generating a load of 2.88Gbit/s exercises the sFlow agent equivalently and can be achieved with two moderately powerful servers with 10G network adapters.
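The equivalence follows because the number of samples per second generated by the agent depends only on the ratio of load to sampling rate, as this arithmetic sketch shows:

ports, speed_bps = 48, 10e9
capacity = ports * speed_bps * 2      # full duplex: 960 Gbit/s total
load_production = 0.30 * capacity     # 288 Gbit/s at 1-in-10,000 sampling

# Scale the sampling rate and the load by the same factor and the agent
# generates the same number of samples per second.
factor = 100
load_test = load_production / factor
print(load_test / 1e9, 'Gbit/s')      # -> 2.88, at 1-in-100 sampling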
For example, using the test setup above, run an iperf server on Server2:
iperf -su
Then run the following sequence of tests on Server1:
The results of the test are shown in the screen capture at the top of this article. The test pattern is based on the article Large flow detection, and the sequence of steps in Load (shown in gold) is accurately tracked by the Flows metric (shown in red). The Counters metric (shown in blue) doesn't accurately track the shape of the load and is delayed, but its peaks are consistent with the Load and Flows values.
The table below the two strip charts shows that all the tests have passed:
test duration: The test must run for at least 5 minutes (300 seconds).
check sequence numbers: Checks that sequence numbers for datagrams and measurements are correctly reported and that measurements are delivered in order.
check data sources: Checks that sampling rates and polling intervals are correctly configured and that data is being received.
sampled packet size: Checks that sampled packet sizes are correctly reported.
random number generator: Tests the hardware random number generator used for packet sampling.
compare byte flows and counters: Checks that Bits per Second values (shown in the upper strip chart) are consistently reported by interface counters and by packet sampling (see the sketch following this table).
compare packet flows and counters: Checks that Packets per Second values (shown in the lower strip chart) are consistently reported by interface counters and by packet sampling.
check ingress port information: Verifies that ingress port information is included with packet samples.
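The flow/counter comparisons rely on the standard sFlow scaling rule: multiplying sampled packet and byte counts by the sampling rate yields an unbiased estimate of the totals reported by the interface counters. A minimal sketch of such a check (tolerance illustrative):

def estimate_from_samples(sampled_lengths, sampling_rate, interval_secs):
    # Scale sampled packet records up to estimated totals: packets per
    # second and bits per second over the measurement interval.
    pps = len(sampled_lengths) * sampling_rate / interval_secs
    bps = sum(sampled_lengths) * sampling_rate * 8 / interval_secs
    return pps, bps

def consistent(estimate, counter_value, tolerance=0.05):
    # Pass if the sample-based estimate is within an (illustrative)
    # 5% tolerance of the counter-based value.
    return abs(estimate - counter_value) <= tolerance * counter_value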
The project provides an opportunity for users and developers to collaborate on creating a set of tests that capture operational requirements and that can be used in product selection and development. An effective test application will help increase the quality of sFlow implementations and ensure that they address the need for measurement in SDN control applications.