Monday, May 27, 2013

Controlling large flows with OpenFlow

Performance aware software defined networking describes how sFlow and OpenFlow can be combined to create SDN applications such as DDoS mitigation, load balancing large flows, and packet capture. This article takes a deep dive, describing how to build a test bed for developing performance aware SDN applications. The test bed is used to build and test a basic DDoS mitigation application and demonstrate how quickly the controller can adapt the network to changing traffic conditions.

Test bed

Mininet uses Linux containers and Open vSwitch to allow realistic virtual networks of hosts and switches to be constructed using a virtual machine. Mininet is widely used by SDN researchers since it allows anyone with a laptop to build realistic network topologies and experiment with OpenFlow controllers and SDN applications.

The Floodlight OpenFlow Controller was selected for the test bed because its default behavior is to provide basic connectivity which can be selectively overridden using the Static Flow Pusher API. This separation allows simple performance optimizing applications to be developed since they don't need to concern themselves with maintaining connectivity and are free to focus on implementing optimizations. The Static Flow Pusher provides an example of hybrid OpenFlow (in which OpenFlow is used to selectively override forwarding decisions made by the normal switch forwarding logic). This makes it straightforward to move applications from the test bed to physical switches that support hybrid OpenFlow control. Finally, Floodlight is one of the most mature platforms used in production settings, so any applications developed on the test bed can be easily moved into production.

There are a number of ways to get started. Both the Mininet and Floodlight projects distribute a pre-built virtual machine (VM) appliance (the Floodlight VM includes Mininet). Alternatively, it is straightforward to build an Ubuntu 13.04 virtual machine and install Mininet using apt-get (this is the route the author took to build a Mininet VM on a XenServer pool).

Once you have a system with Mininet and Floodlight installed, download and install sFlow-RT:
tar -xvzf sflow-rt.tar.gz
Finally, this example uses node.js as the application language. Node.js is well suited for implementing SDN applications: its asynchronous IO model supports a high degree of parallelism, allowing the SDN application to interact with multiple controllers, monitoring systems, databases etc. without blocking. The result is a fast and consistent response time, which is what control applications need.

The following command installs node.js:
sudo apt-get install nodejs
For development, it is helpful to run each tool in a separate window so that you can see any logging messages (in a production setting the processes would be daemonized).

1. Start Floodlight
cd floodlight
2. Start Mininet, specifying Floodlight as the controller
sudo mn --controller=remote,ip=
*** Creating network
*** Adding controller
*** Adding hosts:
h1 h2 
*** Adding switches:
*** Adding links:
(h1, s1) (h2, s1) 
*** Configuring hosts
h1 h2 
*** Starting controller
*** Starting 1 switches
*** Starting CLI:
3. Configure sFlow monitoring on each of the switches:
sudo ovs-vsctl -- --id=@sflow create sflow agent=eth0  target=\"\" sampling=10 polling=20 -- -- set bridge s1 sflow=@sflow
4. Start sFlow-RT
cd sflow-rt
By default, Floodlight provides a basic layer 2 switching service, ensuring connectivity between hosts connected to the OpenFlow switches. Connectivity can be verified using the Mininet command line:
mininet> h1 ping h2
PING ( 56(84) bytes of data.
64 bytes from icmp_req=1 ttl=64 time=36.7 ms
64 bytes from icmp_req=2 ttl=64 time=0.159 ms
There are many options for using the Mininet test bed: previous articles on this blog have used Python to develop applications, and different SDN controllers can be installed. For example, the PyTapDEMon series of articles describes the use of Python, POX (an OpenFlow controller written in Python) and Mininet to recreate the Microsoft DEMon SDN packet broker.

DDoS mitigation application

The following node.js script is based on the Python script described in Performance Aware Software Defined Networking.
var fs = require("fs");
var http = require('http');

var keys = 'inputifindex,ethernetprotocol,macsource,macdestination,ipprotocol,ipsource,ipdestination';
var value = 'frames';
var filter = 'sourcegroup=external&destinationgroup=internal&outputifindex!=discard';
var thresholdValue = 100;
var metricName = 'ddos';

// mininet mapping between sFlow ifIndex numbers and switch/port names
var ifindexToPort = {};
var nameToPort = {};
var path = '/sys/devices/virtual/net/';
var devs = fs.readdirSync(path);
for(var i = 0; i < devs.length; i++) {
  var dev = devs[i];
  var parts = dev.match(/(.*)-(.*)/);
  if(!parts) continue;

  var ifindex = fs.readFileSync(path + dev + '/ifindex');
  var port = {"switch":parts[1],"port":dev};
  ifindexToPort[parseInt(ifindex).toString()] = port;
  nameToPort[dev] = port;
}

var fl = { hostname: 'localhost', port: 8080 };

var groups = {'external':[''],'internal':['']};
var rt = { hostname: 'localhost', port: 8008 };
var flows = {'keys':keys,'value':value,'filter':filter};
var threshold = {'metric':metricName,'value':thresholdValue};

function extend(destination, source) {
  for (var property in source) {
    if (source.hasOwnProperty(property)) {
      destination[property] = source[property];
    }
  }
  return destination;
}

function jsonGet(target,path,callback) {
  var options = extend({method:'GET',path:path},target);
  var req = http.request(options,function(resp) {
    var chunks = [];
    resp.on('data', function(chunk) { chunks.push(chunk); });
    resp.on('end', function() { callback(JSON.parse(chunks.join(''))); });
  });
  req.end();
}

function jsonPut(target,path,value,callback) {
  var options = extend({method:'PUT',headers:{'content-type':'application/json'},"path":path},target);
  var req = http.request(options,function(resp) {
    var chunks = [];
    resp.on('data', function(chunk) { chunks.push(chunk); });
    resp.on('end', function() { callback(chunks.join('')); });
  });
  // send the JSON encoded body and complete the request
  req.write(JSON.stringify(value));
  req.end();
}

function jsonPost(target,path,value,callback) {
  var options = extend({method:'POST',headers:{'content-type':'application/json'},"path":path},target);
  var req = http.request(options,function(resp) {
    var chunks = [];
    resp.on('data', function(chunk) { chunks.push(chunk); });
    resp.on('end', function() { callback(chunks.join('')); });
  });
  // send the JSON encoded body and complete the request
  req.write(JSON.stringify(value));
  req.end();
}

function lookupOpenFlowPort(agent,ifIndex) {
  return ifindexToPort[ifIndex];
}

function blockFlow(agent,dataSource,topKey) {
  var parts = topKey.split(',');
  var port = lookupOpenFlowPort(agent,parts[0]);
  if(!port || !port.dpid) return;
  var message = {"switch":port.dpid,

  console.log("message=" + JSON.stringify(message));
      function(response) {
         console.log("result=" + JSON.stringify(response));

function getTopFlows(event) {
  jsonGet(rt,'/metric/' + event.agent + '/' + event.dataSource + '.' + event.metric + '/json',
    function(metrics) {
      if(metrics && metrics.length == 1) {
        var metric = metrics[0];
        if(metric.metricValue > thresholdValue
           && metric.topKeys
           && metric.topKeys.length > 0) {
            var topKey = metric.topKeys[0].key;
            blockFlow(event.agent,event.dataSource,topKey);
        }
      }
    }
  );
}

function getEvents(id) {
  jsonGet(rt,'/events/json?maxEvents=10&timeout=60&eventID='+ id,
    function(events) {
      var nextID = id;
      if(events.length > 0) {
        nextID = events[0].eventID;
        for(var i = 0; i < events.length; i++) {
          if(metricName == events[i].thresholdID) getTopFlows(events[i]);
        }
      }
      // long poll for the next batch of events
      getEvents(nextID);
    }
  );
}

// use port names to link dpid and port numbers from Floodlight
function getSwitches() {
  jsonGet(fl,'/wm/core/controller/switches/json',
    function(switches) { 
      for(var i = 0; i < switches.length; i++) {
        var sw = switches[i];
        var ports = sw.ports;
        for(var j = 0; j < ports.length; j++) {
          var port = nameToPort[ports[j].name];
          if(port) {
            port.dpid = sw.dpid;
            port.portNumber = ports[j].portNumber;
          }
        }
      }
      setGroup();
    }
  );
}

function setGroup() {
    function() { setFlows(); }

function setFlows() {
  jsonPut(rt,'/flow/' + metricName + '/json',
    flows,
    function() { setThreshold(); }
  );
}

function setThreshold() {
  jsonPut(rt,'/threshold/' + metricName + '/json',
    threshold,
    function() { getEvents(-1); }
  );
}

function initialize() {
  getSwitches();
}

initialize();

The script should be fairly self explanatory to anyone familiar with JavaScript. The asynchronous style of programming, in which the response to each call is handled by a callback function, may be unfamiliar to non-JavaScript programmers, but it is widely used in JavaScript and is the key to node.js's low latency and ability to handle large numbers of concurrent requests. The sFlow-RT REST API calls and the basic logic for this script are explained in the Performance aware software defined networking article.
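For readers new to the callback style, the following is a minimal sketch of the chaining pattern, not part of the original script. The `chain` and `fakeCall` helpers are hypothetical stand-ins for the REST calls; each step runs only after the previous one has invoked its callback.

```javascript
// Run callback-style steps in order; each step receives a `next` callback
// that it must call when its (possibly asynchronous) work has finished.
function chain(steps, done) {
  var i = 0;
  function next() {
    if (i === steps.length) return done();
    var step = steps[i++];
    step(next);
  }
  next();
}

// Hypothetical stand-in for a REST call: completes on a later event loop turn.
var log = [];
function fakeCall(name) {
  return function(next) {
    setImmediate(function() { log.push(name); next(); });
  };
}

chain([fakeCall('setGroup'), fakeCall('setFlows'), fakeCall('setThreshold')],
  function() { console.log(log.join(' -> ')); });
```

Nothing blocks while a step is waiting, so the same process remains free to service other requests in the meantime.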

There are a couple of topics addressed in the script that warrant mention:

The OpenFlow protocol has its own way of identifying switches and ports on the network, and an SDN application needs to be able to translate between the performance monitoring system's model of the network (identifying switches by their management IP addresses and ports by SNMP ifIndex numbers) and OpenFlow identifiers. Currently, there is no standard way to map between these two models and this deficiency needs to be addressed by the Open Networking Foundation, through extensions to either the configuration or OpenFlow protocols.

However, this script shows how to build these mappings in a Mininet environment by examining files in the /sys/devices/virtual/net directory and combining the information with data about the switches retrieved using Floodlight's /wm/core/controller/switches/json REST API call.
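The mapping logic can be sketched separately from the file system and REST calls. The helper names `buildPortMap` and `linkSwitches` and the sample data are hypothetical; the field names (`dpid`, `ports`, `name`, `portNumber`) are the ones the script reads from Floodlight's response.

```javascript
// Build lookup tables from interface names such as "s1-eth1".
// `devices` is a list of {name, ifindex} pairs, e.g. as read from
// /sys/devices/virtual/net on a Mininet host.
function buildPortMap(devices) {
  var byIfIndex = {}, byName = {};
  devices.forEach(function(d) {
    var parts = d.name.match(/(.*)-(.*)/);
    if (!parts) return; // skip interfaces that are not switch ports (lo, eth0, ...)
    var port = { 'switch': parts[1], 'port': d.name };
    byIfIndex[String(parseInt(d.ifindex, 10))] = port;
    byName[d.name] = port;
  });
  return { byIfIndex: byIfIndex, byName: byName };
}

// Attach OpenFlow identifiers using the switch list returned by the controller.
function linkSwitches(byName, switches) {
  switches.forEach(function(sw) {
    (sw.ports || []).forEach(function(p) {
      var port = byName[p.name];
      if (port) { port.dpid = sw.dpid; port.portNumber = p.portNumber; }
    });
  });
}

// Hypothetical sample data for illustration only.
var maps = buildPortMap([
  { name: 'lo', ifindex: '1\n' },
  { name: 's1-eth1', ifindex: '5\n' },
  { name: 's1-eth2', ifindex: '6\n' }
]);
linkSwitches(maps.byName, [
  { dpid: '00:00:00:00:00:00:00:01',
    ports: [{ name: 's1-eth1', portNumber: 1 }, { name: 's1-eth2', portNumber: 2 }] }
]);
console.log(maps.byIfIndex['5']);
```

Because both tables share the same port objects, linking by name also makes the OpenFlow identifiers available when looking up by ifIndex.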

Finally, the sFlow data from Open vSwitch includes dropped packets. The sFlow-RT filter expression outputifindex!=discard is used to detect flows that aren't being blocked.


This example uses a Ping Flood to demonstrate a basic denial of service attack.

The following Mininet command opens a terminal window connected to host h1:
mininet> xterm h1
Type the following command in the terminal to generate a ping flood between h1 and h2:
# ping -f

The sFlow-RT chart shows that without mitigation the ping flood generates a sustained traffic rate of around 6 thousand packets per second.

Next stop the ping flood attack and let the traffic settle down.

The following command runs the denial of service mitigation script:
nodejs mininet.js
Now start the ping flood again and see what happens.
The chart shows that the controller is able to respond quickly when the traffic flow exceeds the defined threshold of 100 packets per second. The mitigation control is applied within a second and instead of reaching a peak of 6 thousand packets per second, the attack is limited to a peak of 130 packets per second.

The ping flood attack is quickly detected by sFlow-RT, which notifies the mitigation application. The mitigation application retrieves details of the attack from sFlow-RT in order to construct the following message, which is sent to Floodlight's Static Flow Pusher:
The Floodlight controller then uses OpenFlow to push the rule to Open vSwitch which immediately starts dropping packets.
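A minimal sketch of how a blocking rule might be assembled from a top flow key. The order of the key fields comes from the `keys` definition in the script. The rule field names (`ether-type`, `protocol`, `src-ip`, `dst-ip`), the rule name `dos-1`, the priority, and the idea that a rule without actions drops traffic are all assumptions about the Static Flow Pusher rather than details confirmed here, and the addresses are hypothetical.

```javascript
// Order of fields in the flow key, matching the `keys` definition in the script.
var KEY_FIELDS = ['inputifindex', 'ethernetprotocol', 'macsource', 'macdestination',
                  'ipprotocol', 'ipsource', 'ipdestination'];

// Split a comma separated flow key into named fields.
function parseTopKey(topKey) {
  var parts = topKey.split(',');
  var flow = {};
  KEY_FIELDS.forEach(function(name, i) { flow[name] = parts[i]; });
  return flow;
}

// ASSUMPTION: the property names below are illustrative guesses at the
// Static Flow Pusher rule format; check them against the controller in use.
function buildDropRule(dpid, flow) {
  return {
    'switch': dpid,
    'name': 'dos-1',                      // hypothetical rule name
    'ether-type': flow.ethernetprotocol,
    'protocol': flow.ipprotocol,
    'src-ip': flow.ipsource,
    'dst-ip': flow.ipdestination,
    'priority': '32767',                  // hypothetical priority
    'active': 'true'
    // no "actions" entry: assumed to mean matching packets are dropped
  };
}

// Hypothetical flow key using documentation addresses.
var flow = parseTopKey('5,2048,000000000001,000000000002,1,192.0.2.1,192.0.2.2');
var rule = buildDropRule('00:00:00:00:00:00:00:01', flow);
console.log(JSON.stringify(rule));
```

Matching on the specific source and destination addresses keeps the block narrow, so other traffic through the same switch port is unaffected.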

Note: The mitigation script doesn't automatically remove the control once the attack has been stopped, so the following command is needed to clear the controls on Floodlight:
curl http://localhost:8080/wm/staticflowentrypusher/clear/all/json

While far from a complete application, this example demonstrates that the sFlow and OpenFlow standards can be combined to build fast acting performance aware SDN applications that address important use cases, such as DDoS mitigation, large flow load balancing, multi-tenant performance isolation, traffic engineering, and packet capture. The Mininet platform provides a convenient way to develop, test and share applications addressing these use cases.

Saturday, May 11, 2013

SDN and WAN optimization

Amin Vahdat described Google's SDN based wide area network traffic engineering solution at the recent Open Networking Summit. Amin stated that for existing networks, "a typical rule of thumb is to over-provision by a factor of 3." Amin further stated that moving to a logically centralized SDN based control "improves convergence times, improves failover behavior, more deterministic, more efficient, more fault tolerant."
Google's traffic engineering system is able to insert multiple non-shortest path routes depending on traffic priorities and measured demand. Using only 3 non-shortest path routes, the overall throughput can be increased by around 15%.
However, Amin stated that the big win is being able to run the backbone links at close to 100% utilization 24 x 7, a greater than 3 times improvement over traditional WAN designs.

A key element of the Google architecture is the use of traffic prioritization. Generally, over provisioning has prevailed as the technique for ensuring wide area network quality of service, see The economics of the Internet: Utility, utilization, pricing, and Quality of Service and more recently The Concept of Quality of Service in the Internet. This seems like a contradiction - why does it make sense to use quality of service mechanisms in Google's case?

Actually there isn't a contradiction. By using SDN to accurately place traffic into just two classes (high priority and low priority), Google is effectively using over provisioning to ensure high quality of service for the high priority class (which comprises 10-20% of the link traffic). The rest of the bandwidth is filled with low priority traffic that must tolerate packet loss and lower availability, since the low priority traffic may be dropped in the case of link failure.
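The arithmetic behind this observation can be sketched in a few lines. The function name is hypothetical, and the calculation assumes a fully utilized link on which the high priority class always wins contention.

```javascript
// Effective over-provisioning factor seen by the high priority class when it
// makes up `highPriorityShare` (a fraction between 0 and 1) of a full link.
function headroomFactor(highPriorityShare) {
  return 1 / highPriorityShare;
}

console.log(headroomFactor(0.20)); // share at the upper end of the quoted range
console.log(headroomFactor(0.10)); // share at the lower end of the quoted range
```

With a 10-20% share, the high priority class sees between 5 and 10 times its own demand in link capacity, which exceeds the factor of 3 rule of thumb quoted for traditional designs.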

Figure 1: Cloud operation system (from Pragmatic software defined networking)
Pragmatic software defined networking and Multi-tenant traffic in virtualized network environments describe how the visibility and control offered by the OpenFlow and sFlow standards can be used to dynamically engineer traffic in the data center. In the data center, end-to-end control and accurate traffic classification are feasible and should have similar benefits, allowing high bandwidth background activities like data replication and migration to use all the spare capacity in the network without affecting high priority flows.

SDN and large flows describes how steering large flows can significantly increase available bandwidth. Active flow steering and traffic classification are complementary techniques that could be combined to dramatically increase the usable bandwidth in any given physical network.

Wednesday, May 8, 2013


The sflow/haproxy project is an implementation of the sFlow HTTP standard for the open source HAProxy high performance TCP/HTTP load balancer. Load balancers are used to virtualize scale out service pools: clients connect to a virtual IP address and service port associated with the load balancer which selects a member of the server pool to handle the request. This architecture provides operational flexibility, allowing servers to be added and removed from the pool as demand changes.
The load balancer is uniquely positioned to provide information on the overall performance of the entire service pool and link the performance seen by clients with the behavior of individual servers in the pool. The advantage of using sFlow to monitor performance is the scalability it offers when request rates are high and conventional logging solutions generate too much data or impose excessive overhead. In addition, monitoring HTTP services using sFlow is part of an integrated monitoring system that spans the data center, providing real-time visibility into application, server and network performance.

The sflow/haproxy software is designed to integrate with the Host sFlow agent to provide a complete picture of proxy performance. Download, install and configure Host sFlow before proceeding to install sflow/haproxy - see Installing Host sFlow on a Linux Server.

Note: the sflow/haproxy agent picks up its configuration from the Host sFlow agent. The sampling.http setting can be used to override the default sampling setting to set a specific sampling rate for HTTP requests.

The following commands download and install the sFlow instrumented version of HAProxy on a Linux server:
git clone
cd haproxy
make TARGET=linux26 USE_SFLOW=yes
make install
Once installed and configured, HAProxy will stream measurements to a central sFlow Analyzer. To see the raw data and verify that the measurements are being received, download, compile and install sflowtool on the system you are using to receive sFlow.

Running sflowtool will display output of the form:
$ sflowtool
startDatagram =================================
datagramSize 564
unixSecondsUTC 1368058148
datagramVersion 5
agentSubId 80
packetSequenceNo 23
sysUpTime 417000
samplesInPacket 2
startSample ----------------------
sampleType_tag 0:2
sampleSequenceNo 1
sourceId 3:80
counterBlock_tag 0:2201
http_method_option_count 0
http_method_get_count 71
http_method_head_count 0
http_method_post_count 0
http_method_put_count 0
http_method_delete_count 0
http_method_trace_count 0
http_methd_connect_count 0
http_method_other_count 2
http_status_1XX_count 0
http_status_2XX_count 26
http_status_3XX_count 24
http_status_4XX_count 23
http_status_5XX_count 0
http_status_other_count 0
endSample   ----------------------
startSample ----------------------
sampleType_tag 0:1
sampleSequenceNo 71
sourceId 3:80
meanSkipCount 1
samplePool 71
dropEvents 0
inputPort 0
outputPort 1073741823
flowBlock_tag 0:2102
extendedType proxy_socket4
proxy_socket4_ip_protocol 6
proxy_socket4_local_port 0
proxy_socket4_remote_port 80
flowBlock_tag 0:2100
extendedType socket4
socket4_ip_protocol 6
socket4_local_port 0
socket4_remote_port 57642
flowBlock_tag 0:2206
flowSampleType http
http_method 2
http_protocol 1001
http_uri GET /games/animals.php HTTP/1.1
http_useragent Mozilla/5.0 (Macintosh; Intel Mac OS X 10_8_3) AppleWebKit/537.31 (KHTML
http_mimetype text/html; charset=UTF-8
http_request_bytes 346
http_bytes 487
http_duration_uS 13000
http_status 200
endSample   ----------------------
endDatagram   =================================
There are two types of sFlow record shown: COUNTERSAMPLE and FLOWSAMPLE data. The counters are useful for trending overall performance using tools like Ganglia and Graphite. Using sflowtool to output combined logfile format makes the data available to most logfile analyzers.
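For quick experiments, the line-oriented output shown above can be split into one object per sample. This is a minimal sketch, assuming each line is a key followed by a space and a value; `parseSamples` is a hypothetical helper, and repeated keys within a sample (such as flowBlock_tag) keep only the last value.

```javascript
// Split sflowtool's "key value" output into one object per sample.
function parseSamples(text) {
  var samples = [], current = null;
  text.split('\n').forEach(function(line) {
    line = line.trim();
    if (line.indexOf('startSample') === 0) { current = {}; return; }
    if (line.indexOf('endSample') === 0) {
      if (current) samples.push(current);
      current = null;
      return;
    }
    if (!current || !line) return; // ignore datagram level lines and blanks
    var sp = line.indexOf(' ');
    var key = sp < 0 ? line : line.slice(0, sp);
    current[key] = sp < 0 ? '' : line.slice(sp + 1);
  });
  return samples;
}

// Excerpt of the output shown above.
var sample = [
  'startSample ----------------------',
  'sampleType_tag 0:2',
  'http_method_get_count 71',
  'http_status_2XX_count 26',
  'endSample   ----------------------',
  'startSample ----------------------',
  'sampleType_tag 0:1',
  'http_uri GET /games/animals.php HTTP/1.1',
  'http_status 200',
  'endSample   ----------------------'
].join('\n');

var parsed = parseSamples(sample);
console.log(parsed.length, parsed[1].http_uri);
```

In the output above, the counter sample carries sampleType_tag 0:2 and the flow sample 0:1, so that field can be used to route each parsed object to trending or to per-request analysis.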
Note: The highlighted IP addresses in the FLOWSAMPLE correspond to addresses in the diagram and illustrate how request records from the proxy link clients to the back end servers. 
A native sFlow analyzer like sFlowTrend can combine the counters, flows and host performance metrics to provide an integrated view of performance.
Installing sFlow agents on the backend web servers further extends visibility: implementations are available for Apache, NGINX, Tomcat and node.js. Application logic running on the servers can also be instrumented with sFlow, see Scripting languages. Back end Memcache, Java and virtualization pools can also be instrumented with sFlow. sFlow agents embedded in physical and virtual switches provide visibility into the network.

Comprehensive end to end visibility in multi-tiered environments allows the powerful control capabilities of the load balancers to be used to greatest effect: regulating traffic between tiers, protecting overloaded backend systems, defending against denial of service attacks, and moving resources from over provisioned pools to under provisioned pools.

The sFlow-RT real-time analytics engine makes the full set of sFlow metrics accessible through a RESTful API so that they can be used to drive automation. A future article will explore how sFlow metrics can be used to control HAProxy behavior (by issuing UnixSocketCommands).

Wednesday, May 1, 2013

Software defined analytics

Figure 1: Performance aware software defined networking
Software defined networking (SDN) separates the network Data Plane and Control Plane, permitting external software to monitor and control network resources. Open Southbound APIs like sFlow and OpenFlow are an essential part of this separation, connecting network devices to external controllers, which in turn present high level Open Northbound APIs to SDN applications.

This article demonstrates the architectural similarities between OpenFlow and sFlow configuration and use within an SDN stack. Developers working in the SDN field are likely familiar with the configuration and use of OpenFlow and it is hoped that this comparison will be helpful as a way to understand how to incorporate sFlow measurement technology to create performance aware SDN solutions such as load balancing, DDoS protection and packet brokers.

OpenFlow and sFlow

In this example, Open vSwitch, Floodlight and sFlow-RT are used to demonstrate how switches are configured to use the OpenFlow and sFlow protocols to communicate with the centralized control plane. Next, representative Northbound REST API calls are used to illustrate how control plane software presents network wide visibility and control functionality to SDN applications.

1. Connect switches to control plane

Configure each switch to connect to the OpenFlow controller:
ovs-vsctl set-controller br0 tcp:
Similarly, configure each switch to send measurements to the sFlow analyzer:
ovs-vsctl -- --id=@sflow create sflow agent=eth0  target=\"\" sampling=1000 polling=20 -- -- set bridge br0 sflow=@sflow
2. REST APIs for network wide visibility and control

The following command uses the Floodlight static flow pusher API to set up a forwarding path:
curl -d '{"switch": "00:00:00:00:00:00:00:01", "name":"flow-mod-1", "cookie":"0", "priority":"32768", "ingressport":"1","active":"true", "actions":"output=2"}'
The following command uses sFlow-RT's flow API to set up monitoring of TCP flows across all switches:
curl -H "Content-Type:application/json" -X PUT --data "{keys:'ipsource,ipdestination,tcpsourceport,tcpdestinationport', value:'bytes'}"
Next, the following command finds the top TCP flow currently in progress anywhere in the network:
[{
 "agent": "",
 "dataSource": "2",
 "metricN": 14,
 "metricName": "incoming",
 "metricValue": 3.4061718002956964E7,
 "topKeys": [{
  "key": ",,80,52577",
  "updateTime": 1367092118446,
  "value": 3.4061718002956964E7
 }],
 "updateTime": 1367092118446
}]
The response doesn't just identify the flow (HTTP packets from a web server to a client); it also identifies the switch and port carrying the traffic, information that would allow the OpenFlow controller to take action to rate limit, tap, re-route or block this traffic.
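A minimal sketch of how an application might turn such a response into named fields. The helper name `describeTopFlow` and the addresses in the sample are hypothetical; the response field names (`agent`, `dataSource`, `metricValue`, `topKeys`, `key`) are those shown in the output above.

```javascript
// Combine the flow definition's key names with the top key in a metric response.
function describeTopFlow(metrics, keyNames) {
  if (!metrics || metrics.length === 0) return null;
  var metric = metrics[0];
  if (!metric.topKeys || metric.topKeys.length === 0) return null;
  var names = keyNames.split(',');
  var values = metric.topKeys[0].key.split(',');
  var flow = {};
  names.forEach(function(name, i) { flow[name] = values[i]; });
  return {
    agent: metric.agent,           // switch reporting the flow
    dataSource: metric.dataSource, // port on that switch
    value: metric.metricValue,
    flow: flow
  };
}

// Hypothetical response using documentation addresses.
var response = [{
  agent: '192.0.2.10',
  dataSource: '2',
  metricValue: 3.4061718002956964E7,
  topKeys: [{ key: '192.0.2.20,192.0.2.30,80,52577', value: 3.4061718002956964E7 }]
}];
var top = describeTopFlow(response,
  'ipsource,ipdestination,tcpsourceport,tcpdestinationport');
console.log(top.agent, top.dataSource, top.flow.tcpsourceport);
```

The agent and dataSource values are what a controller application would need to translate into an OpenFlow switch and port before applying a control.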
Figure 2: Rapidly detecting large flows, sFlow vs. NetFlow/IPFIX
The flexible Software Defined Analytics (SDA) functionality shown in this example is possible because the sFlow architecture shifts analytic functions to external software, relying on minimal core measurements embedded in the switch hardware data plane to deliver wire-speed performance. The simplicity and openness of the sFlow standard has resulted in widespread adoption in merchant silicon and by switch vendors.

In contrast, measurement technologies such as Cisco NetFlow and IPFIX perform traffic analysis using specialized hardware on the switch.  Configuring the hardware measurement features can be complex: for example, monitoring TCP flows using Cisco's Flexible NetFlow requires the following CLI commands:

1. define a flow record
flow record tcp-analysis
match transport tcp destination-port
match transport tcp source-port
match ipv4 destination address
match ipv4 source address
collect counter bytes
2. specify the collector
flow exporter export-to-server
transport udp 9985
template data timeout 60
3. define a flow cache
flow monitor my-flow-monitor
record tcp-analysis
exporter export-to-server
cache timeout active 60
4. enable flow monitoring on each switch interface:
interface Ethernet 1/0
ip flow monitor my-flow-monitor input
interface Ethernet 1/48
ip flow monitor my-flow-monitor input
5. For network wide visibility, go to step 1 and repeat for each switch in the network

Based on the architecture of on-switch flow analysis and this configuration example, it is apparent that there are limitations to this approach to monitoring, particularly in the context of software defined networking:
  1. Flexible NetFlow is complex to configure, see Complexity Kills.
  2. Configuration changes to switches are typically limited to infrequent maintenance windows making it difficult to deploy new measurements.
  3. Each flow cache (step 3 in the Flexible NetFlow configuration) consumes significant on-switch memory, limiting the number of simultaneous flow measurements that can be made, and taking memory that could be used for additional forwarding rules.
  4. Hardware differences mean that measurements are inconsistent between vendors, or even between different products from the same vendor, see Snowflakes, IPFIX, NetFlow and sFlow.
  5. Adding support for new protocols, like GRE, VXLAN etc. involves upgrading switch firmware and may require new hardware.
What about using OpenFlow counters to drive analytics? Since maintaining OpenFlow counters relies on switch hardware to decode packets and track flows, OpenFlow based traffic measurement shares many of the same limitations described for NetFlow/IPFIX, see Hey, You Darned Counters! Get Off My ASIC!

On the other hand, software defined analytics based on the sFlow standard is highly scalable and extremely flexible. For example, adding an additional flow definition to report on tunneled traffic across the data center involves a single additional REST API call:
curl -H "Content-Type:application/json" -X PUT --data "{keys:'stack,ipsource,ipdestination,ipsource.1,ipdestination.1', value:'bytes'}"
The following command retrieves the top tunneled flow:
[{
 "agent": "",
 "dataSource": "3",
 "metricN": 6,
 "metricName": "stack",
 "metricValue": 74663.29589986047,
 "topKeys": [{
  "key": "eth.ip.gre.ip.tcp,,,,",
  "updateTime": 1367096917146,
  "value": 74663.29589986047
 }],
 "updateTime": 1367096917146
}]
The result shows that the top tunneled flow currently traversing the network is a TCP connection in a GRE tunnel between inner addresses and
Note: Monitoring and controlling tunneled traffic is an important use case since tunnels are widely used for network virtualization and IPv6 migration, see Tunnels and Down the rabbit hole.
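The stack key can also be inspected programmatically. The following is a minimal sketch, assuming the layers are dot separated as in the result above; `tunnelInfo` is a hypothetical helper, and treating `ip6` as the name of an IPv6 layer is an assumption.

```javascript
// Report the encapsulation and inner protocols for a protocol stack string,
// or null when the stack contains fewer than two IP layers (no tunnel).
function tunnelInfo(stack) {
  var layers = stack.split('.');
  var first = -1, last = -1;
  layers.forEach(function(layer, i) {
    if (layer === 'ip' || layer === 'ip6') { // 'ip6' is an assumed layer name
      if (first < 0) first = i;
      last = i;
    }
  });
  if (first < 0 || first === last) return null;
  return {
    encapsulation: layers.slice(first + 1, last), // layers between outer and inner IP
    inner: layers.slice(last + 1)                 // protocols carried inside the tunnel
  };
}

var info = tunnelInfo('eth.ip.gre.ip.tcp');
console.log(info.encapsulation.join('.'), info.inner.join('.'));
```

For the stack in the result above this reports GRE encapsulation carrying TCP, while an untunneled stack such as eth.ip.tcp returns null.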
Perhaps the greatest limitation of on-switch flow analysis is the fact that the measurements are delayed on the switch, making them inaccessible to SDN applications, see Rapidly detecting large flows, sFlow vs. NetFlow/IPFIX. Centralized flow analysis liberates measurements from the devices to deliver real-time network wide analytics that support new classes of performance aware SDN applications such as load balancing, DDoS protection and packet brokers.