Wednesday, August 28, 2013


Figure 1: Embedded, on-switch flow cache with flow record export
This article describes RESTflow™, a new method for exporting flow records that has significant advantages over current approaches to flow export.

A flow record summarizes a set of packets that share common attributes - for example, a typical flow record includes ingress interface, source IP address, destination IP address, IP protocol, source TCP/UDP port, destination TCP/UDP port, IP ToS, start time, end time, packet count and byte count.

Figure 1 shows the steps performed by the switch in order to construct flow records. First the stream of packets is likely to be sampled (particularly in high-speed switches). Next, the sampled packet header is decoded to extract key fields. A hash function is computed over the keys in order to look up the flow record in the flow cache. If an existing record is found, its values are updated, otherwise a record is created for the new flow. Records are flushed from the cache based on protocol information (e.g. if a FIN flag is seen in a TCP packet), a timeout, inactivity, or when the cache is full. The flushed records are finally sent to the traffic analysis application using one of the many formats that switches use to export flow records (e.g. NetFlow, IPFIX, J-Flow, NetStream, etc.).
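The cache pipeline above can be sketched in a few lines of Python. The FlowKey fields and list-of-counters record format are illustrative simplifications, not any switch vendor's implementation:

```python
# Minimal sketch of an on-switch flow cache: decode keys, hash/lookup
# (implicit in the dict), update counters, and flush completed records.
from collections import namedtuple

FlowKey = namedtuple('FlowKey',
    'ipsource ipdestination protocol srcport dstport')

class FlowCache:
    def __init__(self):
        self.cache = {}   # FlowKey -> [packet count, byte count]

    def update(self, key, length):
        # create a record on a cache miss, otherwise update in place
        rec = self.cache.setdefault(key, [0, 0])
        rec[0] += 1
        rec[1] += length

    def flush(self, key):
        # in a real switch this is triggered by a TCP FIN, a timeout,
        # inactivity, or a full cache; the record is then exported
        return self.cache.pop(key, None)

cache = FlowCache()
k = FlowKey('', '', 6, 38834, 3260)  # example addresses
cache.update(k, 1500)
cache.update(k, 1500)
record = cache.flush(k)   # -> [2, 3000]
```

A real implementation hashes the key fields into a fixed-size hardware table; the dictionary above stands in for that lookup.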
Figure 2: External software flow cache with flow record export
Figure 2 shows the relationship between the widely supported sFlow standard for packet export and flow export. With sFlow monitoring, the decode, hash, flow cache and flush functionality are no longer implemented on the switch. Instead, sampled packet headers are immediately sent to the traffic analysis application which decodes the packets and analyzes the data. In typical deployments, large numbers of switches stream sFlow data to a central sFlow analyzer. In addition, sFlow provides a polling function; switches periodically send standard interface counters to the traffic analysis applications, eliminating the need for SNMP polling, see Link utilization.
There are significant advantages to moving the flow cache to external software: the article Superlinear discusses some of the scalability implications of on-device flow caches, and Rapidly detecting large flows, sFlow vs. NetFlow/IPFIX describes how on-device flow caches delay measurements and make them less useful for software defined networking (SDN) applications.
The following example uses the sFlow-RT analyzer to demonstrate flow record export based on sFlow packet data received from a network of switches.
Figure 3: Performance aware software defined networking
Figure 3 from Performance aware software defined networking shows how sFlow-RT exposes the active flow cache to applications that address important use cases, such as DDoS mitigation, large flow load balancing, multi-tenant performance isolation, traffic engineering, and packet capture.

The recent extension of the REST API to support flow record export provides a useful log of network activity that can be incorporated into security information and event management (SIEM) tools.

Three types of query combine to deliver the RESTflow flexible flow definition and export interface:

1. Define flow cache

The following command instructs the central sFlow-RT analytics engine running on host to build a flow cache for TCP flows and log the completed flows:
curl -H "Content-Type:application/json" -X PUT --data '{"keys":"ipsource,ipdestination,tcpsourceport,tcpdestinationport", "value":"bytes", "log":true}'
What might not be apparent is that this single configuration command to sFlow-RT enables network-wide monitoring of TCP connections, even in a network containing hundreds of physical switches, thousands of virtual switches, different switch models, multiple vendors, etc. In contrast, if devices maintain their own flow caches then each switch needs to be re-configured whenever monitoring requirements change - typically a time-consuming and complex manual process, see Software defined analytics.

To illustrate the point, the following command defines an additional network-wide flow cache for records describing DNS (UDP port 53) requests and logs the completed flows:
curl -H "Content-Type:application/json" -X PUT --data '{"keys":"ipsource", "value":"frames", "filter":"udpdestinationport=53", "log":true}'

2. Query flow cache definition

The following command retrieves the flow definitions:
$ curl
{
 "dns": {
  "filter": "udpdestinationport=53",
  "fs": ",",
  "keys": "ipsource",
  "log": true,
  "n": 5,
  "t": 2,
  "value": "frames"
 },
 "tcp": {
  "fs": ",",
  "keys": "ipsource,ipdestination,tcpsourceport,tcpdestinationport",
  "log": true,
  "n": 5,
  "t": 2,
  "value": "bytes"
 }
}
The definition for a specific flow can also be retrieved:
$ curl
{
 "fs": ",",
 "keys": "ipsource,ipdestination,tcpsourceport,tcpdestinationport",
 "log": true,
 "n": 5,
 "t": 2,
 "value": "bytes"
}

3. Retrieve flow records

The following command retrieves flow records logged by all the flow caches:
  "agent": "",
  "dataSource": "2",
  "end": 1377658682679,
  "flowID": 250,
  "flowKeys": "",
  "name": "dns",
  "start": 1377658553679,
  "value": 400
  "agent": "",
  "dataSource": "5",
  "end": 1377658681678,
  "flowID": 249,
  "flowKeys": ",,47571,3260",
  "name": "tcp",
  "start": 1377658613678,
  "value": 1217600
And the following command retrieves flow records from a specific cache:
$ curl ""
  "agent": "",
  "dataSource": "53",
  "end": 1377658938378,
  "flowID": 400,
  "flowKeys": "",
  "name": "dns",
  "start": 1377658398378,
  "value": 400
  "agent": "",
  "dataSource": "2",
  "end": 1377658682679,
  "flowID": 251,
  "flowKeys": "",
  "name": "dns",
  "start": 1377658612679,
  "value": 400
The JSON-encoded, text-based output is easy to read and widely supported by programming tools.
Transporting large amounts of flow data using a text-based protocol might seem inefficient when compared to binary flow record export protocols such as IPFIX, NetFlow, etc. However, one of the advantages of a REST API is that it builds on the mature and extensive capabilities of the HTTP protocol stack. For example, most HTTP clients are capable of handling compression and will set the HTTP Accept-Encoding header to indicate that they are willing to accept compressed data. The sFlow-RT web server responds by compressing the data before sending it, resulting in a 20 times reduction in data volume. Similarly, using a REST API allows users to leverage existing infrastructure to load balance, encrypt, authenticate, cache and proxy requests.
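As a rough illustration of why Accept-Encoding negotiation matters for text-based export, the following sketch gzips a batch of repetitive, made-up flow records (the field names mirror the examples above, but the values and record count are invented):

```python
# Compress a batch of near-identical JSON flow records to see how much
# redundancy gzip removes. The exact ratio depends on the data; this
# is an illustration, not a measurement of sFlow-RT itself.
import gzip
import json

records = [{"agent": "", "dataSource": "2", "name": "tcp",
            "flowKeys": "", "start": 1377658553679 + i,
            "end": 1377658682679 + i, "value": 400}
           for i in range(1000)]
raw = json.dumps(records).encode('utf-8')
compressed = gzip.compress(raw)
ratio = len(raw) / len(compressed)
print(len(raw), len(compressed), round(ratio, 1))
```

With HTTP, the client simply sends `Accept-Encoding: gzip` and the server performs this compression transparently.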
The real power of the RESTflow API becomes apparent when it is accessed programmatically. For example, the following Python script defines the TCP flow described earlier and continuously retrieves new flow records:
#!/usr/bin/env python
import requests
import json
import signal

rt = ''
name = 'tcp'

def sig_handler(signal,frame):
  requests.delete(rt + '/flow/' + name + '/json');
signal.signal(signal.SIGINT, sig_handler)

flow = {'keys':'ipsource,ipdestination,tcpsourceport,tcpdestinationport',
        'value':'bytes',
        'log':True}
r = requests.put(rt + '/flow/' + name + '/json',data=json.dumps(flow))

flowurl = rt + '/flows/json?name=' + name + '&maxFlows=100&timeout=60'
flowID = -1
while True:
  r = requests.get(flowurl + "&flowID=" + str(flowID))
  if r.status_code != 200: break
  flows = r.json()
  if len(flows) == 0: continue

  flowID = flows[0]["flowID"]
  for f in flows:
    print str(f['flowKeys']) + ',' + str(int(f['value'])) + ',' + str(f['end'] - f['start']) + ',' + f['agent'] + ',' + str(f['dataSource'])
The following command runs the script, which results in the newly arriving flow records being printed as comma separated text:
$ ./
,,38834,3260,4000,98100,,5
,,39046,22,837800,60000,,2
,,39046,22,851433,60399,,25
,,443,48859,12597,64000,,1
,,22,39046,67049,61800,,19
Instead of simply printing the flow records, the script could easily add them to scale out databases like MongoDB so that they can be combined with other types of information and easily searched.
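As a sketch of the MongoDB idea, the following converts a flow record into a document with native timestamp types. The to_document() helper and the sflow.flows collection name are assumptions, and the pymongo calls are left in comments to keep the sketch self-contained:

```python
# Shape an sFlow-RT flow record for insertion into a document database,
# converting millisecond epoch timestamps into native datetimes so the
# records can be indexed and queried by time range.
from datetime import datetime, timezone

def to_document(rec):
    """Convert an sFlow-RT flow record dict into a MongoDB-style document."""
    return {
        'name': rec['name'],
        'agent': rec['agent'],
        'dataSource': rec['dataSource'],
        'flowKeys': rec['flowKeys'],
        'value': rec['value'],
        'start': datetime.fromtimestamp(rec['start'] / 1000.0, tz=timezone.utc),
        'end': datetime.fromtimestamp(rec['end'] / 1000.0, tz=timezone.utc),
    }

doc = to_document({'name': 'dns', 'agent': '', 'dataSource': '2',
                   'flowKeys': '', 'value': 400,
                   'start': 1377658553679, 'end': 1377658682679})

# Hypothetical insertion (requires a running MongoDB and pymongo):
# from pymongo import MongoClient
# MongoClient().sflow.flows.insert_one(doc)
```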

The sFlow-RT REST API doesn't just provide access to completed flows; real-time information on in-progress flows is available by querying the central flow cache. For example, the following command searches the flow cache and reports the most active flow in the network (based on current data transfer rate, i.e. bits per second).
$ curl
 "agent": "",
 "dataSource": "2",
 "metricN": 9,
 "metricName": "tcp",
 "metricValue": 29958.06882871171,
 "topKeys": [
   "key": ",,443,40870",
   "updateTime": 1377664899679,
   "value": 29958.06882871171
   "key": ",,3260,56044",
   "updateTime": 1377664888679,
   "value": 23.751630816369214
 "updateTime": 1377664899679
As well as identifying the most active flow, the query result also identifies the switch and port carrying the traffic (out of potentially tens of thousands of ports being monitored).

While flow records are a useful log of completed flows, the ability to track flows in real time transforms traffic monitoring from a reporting tool to a powerful driver for active control, unlocking the capabilities of software defined networking to dynamically adapt the network to changing demand. Embedded flow caches in networking devices are not easily accessible and even if there were a programmatic way to access the on device cache, polling thousands of devices would take so long that the information would be stale by the time it was retrieved.
Figure 4: Visibility and the software defined data center
Looking at the big picture, flow export is only one of many functions that can be performed by an sFlow analyzer, some of which have been described on this blog. Providing simple, programmatic access allows these functions to be integrated into the broader orchestration system. REST APIs are the obvious choice since they are already widely used in data center orchestration and monitoring tools.
Embedded flow monitoring solutions typically require CLI access to the network devices to define flow caches and direct flow record export. Access to switch configurations is tightly controlled by the network management team, and configuration changes are often limited to maintenance windows. This conservatism results in part from hardware resource limitations on the devices that need to be carefully managed - for example, a misconfigured flow cache can destabilize the performance of the switch. In contrast, the central sFlow analyzer is software running on a server with relatively abundant resources that can safely support large numbers of requests without any risk of destabilizing the network.
The REST APIs in sFlow-RT are part of a broader movement to break out of the networking silo and integrate management of network resources with the orchestration tools used to automatically manage compute, storage and application resources. Automation transforms the network from a fragile static resource into a robust and flexible resource that can be adapted to support the changing demands of the applications it supports.

Monday, August 26, 2013

NSX network gateway services

Figure 1: VMware NSX network gateway services partners
VMware recently released the list of Network Gateway Services (top of rack switch) partners. All but one of these vendors supports the sFlow standard for network visibility across their full range of data center switches (Arista Networks, Brocade Networks, Dell Systems, HP and Juniper Networks). The remaining vendor, Cumulus Networks, has developed a version of Linux that runs on merchant silicon based hardware platforms. Merchant silicon switch ASICs include hardware support for sFlow and it is likely that future versions of Cumulus Linux will expose this capability.
Figure 2: Network gateway services / VxLAN tunnel endpoint (VTEP)
Figure 2 from Network Virtualization Gets Physical shows the role that top of rack switches play in virtualizing physical workloads (e.g. servers, load balancers, firewalls, etc.). Essentially, the physical top of rack switch provides the same services for the physical devices as Open vSwitch in the hypervisor provides for virtual machines. The OVSDB protocol, described in the Internet Draft (ID) The Open vSwitch Database Management Protocol, allows the NSX controller to configure physical and virtual switches to set up the VxLAN tunnels used to overlay the virtual networks over the underlying physical network.

The Open vSwitch also supports the sFlow standard, providing a common monitoring solution for virtual switches and top of rack switches. In addition, core switches and routers from the listed partner vendors (and many other switch vendors) also implement sFlow, offering complete end to end visibility into traffic flowing on the virtualized and physical networks.

The packet header export mechanism in sFlow is uniquely suited to monitoring tunneled traffic, see Tunnels, exposing inner and outer addresses and allowing monitoring tools to trace virtualized traffic as it flows over the physical fabric, see Down the rabbit hole.

In addition, F5 Networks is listed as a partner in the Application Delivery category. F5 supports sFlow on their BIG-IP platform, see F5 BIG-IP LTM and TMOS 11.4.0, providing visibility into application performance (including response times, URLs, status, etc.) and linking the front-end performance seen by clients accessing virtual IP addresses (VIPs) with the performance of individual back-end servers.

Embedding visibility in all the elements of the data center provides comprehensive, cost effective, visibility into data center resources. Visibility is critical in reducing operational complexity, improving performance, decreasing the time to identify the root cause of performance problems, and isolating performance between virtual networks, see Multi-tenant performance isolation.

Saturday, August 17, 2013

RESTful control of switches

Figure 1: Performance aware software defined networking with OpenFlow controller
Software defined networking (SDN) controllers typically offer RESTful Northbound APIs to support application developers. RESTful APIs are widely used in systems software, allowing an SDN application to use a common set of tools to orchestrate the network and the services that make use of it. For example, Performance aware software defined networking describes how standard sFlow instrumentation in the switch forwarding plane can be used to drive configuration changes to address important use cases, such as DDoS mitigation, large flow load balancing, multi-tenant performance isolation, traffic engineering, and packet capture.

Programmatic configuration of network devices is one of the challenges in SDN solutions. Even if the OpenFlow protocol is going to be used to control forwarding, there is still a need to perform basic configuration tasks like connecting the switch to the controller, configuring interfaces, enabling monitoring, etc. One interesting option is to implement a RESTful configuration API directly on the switches.
Figure 2: Controller-less performance aware software defined networking
Not only does a RESTful API provide a way for an SDN controller to configure the switches, but it also provides a convenient way for SDN applications to directly access switches - making controller-less SDN applications an option. There are limitations with using HTTP as the Southbound API for applications that require large scale, fine grain control of the data plane. In these use cases an OpenFlow controller based solution offers significant advantages. However, for simple use cases, a controller-less design is an attractive alternative.

This article explores the concept of using a RESTful switch API to develop SDN applications, using Arista Networks' implementation of JSON-RPC as an example, see eAPI: Learning the basics.

The following script re-implements the Large flow detection script in Python:

#!/usr/bin/env python

import requests
import json
import signal
from jsonrpclib import Server

switch_ips = ["",""]
username = "user"
password = "password"

sflow_ip = ""
sflow_port = "6343"
sflow_polling = "20"
sflow_sampling = "10000"

metric = "largeflow"
metric_threshold = 125000000

flows = { "keys":"ipsource,ipdestination",
          "t":2 }
threshold = {"metric":metric,"value":metric_threshold,"byFlow":True}

for switch_ip in switch_ips:
  switch = Server("http://%s:%s@%s/command-api" %
                (username, password, switch_ip))
  response = switch.runCmds(1,
    ["sflow source %s" % switch_ip,
     "sflow destination %s %s" % (sflow_ip, sflow_port),
     "sflow polling-interval %s" % sflow_polling,
     "sflow sample output interface",
     "sflow sample dangerous %s" % sflow_sampling,
     "sflow run"])

r = requests.put("http://%s:8008/flow/%s/json" % (sflow_ip, metric),
r = requests.put("http://%s:8008/threshold/%s/json" % (sflow_ip, metric),

def sig_handler(signal,frame):
  requests.delete("http://%s:8008/flow/%s/json" % (sflow_ip, metric))
  requests.delete("http://%s:8008/threshold/%s/json" % (sflow_ip, metric))
signal.signal(signal.SIGINT, sig_handler)

eventID = -1
while True:
  r = requests.get("http://%s:8008/events/json?maxEvents=10&timeout=60&eventID=%s"
                   % (sflow_ip,eventID))
  if r.status_code != 200: break
  events = r.json()
  if len(events) == 0: continue

  eventID = events[0]["eventID"]
  for e in events:
    if metric == e["metric"]:
      print e["flowKey"]

There are a few points worth noting:
  1. The script starts by making JSON-RPC calls to configure sFlow on a pair of switches ( and The script can easily configure large numbers of switches by adding them to the switch_ips list. See Configuring Arista switches for more information on the sFlow configuration commands.
  2. The switches are configured to send the sFlow data to the sFlow-RT analyzer running on host
  3. The script then makes REST calls to the sFlow analyzer to configure it to track ipsource,ipdestination flows and generate an event when any flow consumes more than 10% of the bandwidth of a 10 Gigabit link, i.e. 125000000 bytes per second.
  4. The script then waits for events, displaying the flows as they are reported.
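The threshold arithmetic from step 3 above, spelled out:

```python
# 10% of a 10 Gigabit/s link, expressed in bytes per second to match
# the sFlow-RT threshold definition in the script.
link_bits_per_second = 10 * 10**9           # 10 Gigabit/s
link_bytes_per_second = link_bits_per_second / 8
threshold = 0.1 * link_bytes_per_second     # 10% of line rate
print(int(threshold))                       # -> 125000000
```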
The script can be extended to develop automated traffic engineering solutions: modifying the flow definitions and adding additional eAPI calls to implement configuration changes in response to the detected flows, including blocking, rate limiting, routing changes etc.

Many switches already contain an embedded web server in order to provide a user friendly web interface for operators to monitor and configure the devices. Moving to a RESTful API for programmatic access allows large numbers of devices to be managed by orchestration systems, reducing costs and eliminating manual errors. Automating configuration management doesn't just reduce costs. More importantly, automated configuration management increases agility, transforming the network from a fragile static resource into a robust and flexible resource that can be adapted to changing demands.

Thursday, August 15, 2013

Frenetic, Pyretic and Resonance

Northbound APIs for traffic engineering describes some of the limitations of current OpenFlow controllers and the features needed to support traffic engineering applications. Looking for alternatives, I was excited to discover the Frenetic project, a collaborative effort between researchers at Princeton and Cornell to develop a language for developing SDN applications.

The Frenetic project has released Pyretic, an embedded implementation of Frenetic in Python. Looking further, I discovered that Pyretic is being used by the PyResonance project to implement a finite state machine (FSM) framework developed by researchers at the Georgia Institute of Technology. Resonance provides a framework for an SDN controller to react to external events by changing forwarding policies. PyResonance expresses policies in the Frenetic language, which can compose policies from multiple logical modules (for example forwarding, intrusion detection and access control) into OpenFlow rules that are then pushed to network switches.
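A toy version of the Resonance idea, with illustrative states, events and policies rather than the actual PyResonance state machine, might look like:

```python
# Per-host finite state machine whose state selects a forwarding policy.
# Events (e.g. from an authentication system or a traffic monitor)
# drive transitions; unknown events leave the state unchanged.
POLICIES = {
    'unauthenticated': 'drop',
    'authenticated': 'forward',
    'quarantined': 'drop',
}

TRANSITIONS = {
    ('unauthenticated', 'auth'): 'authenticated',
    ('authenticated', 'clear'): 'unauthenticated',
    ('authenticated', 'attack_detected'): 'quarantined',
}

class HostFSM:
    def __init__(self):
        self.state = 'unauthenticated'

    def handle(self, event):
        """Apply an event and return the policy for the new state."""
        self.state = TRANSITIONS.get((self.state, event), self.state)
        return POLICIES[self.state]

h1 = HostFSM()
h1.handle('auth')               # -> 'forward'
h1.handle('attack_detected')    # -> 'drop'
```

In PyResonance the per-host policy selected by the FSM is composed with other modules' policies and compiled into OpenFlow rules; here the policy is just a string.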
This article uses the Mininet testbed described in Controlling large flows with OpenFlow to demonstrate how real-time sFlow measurements can be used to drive dynamic changes in the network using the PyResonance controller.

First install Pyretic. Next install PyResonance.

For development, it is helpful to run each tool in a separate window so that you can see any logging messages (in a production setting the processes would be daemonized).

1. Start Mininet, specifying a remote controller

sudo mn --controller=remote --topo=single,3 --mac --arp

2. Start PyResonance

cd pyretic
./ pyretic.pyresonance.resonance_simple -m i

3. Authorize each of the hosts

cd pyretic/pyretic/pyresonance
./ -i -e auth -V authenticated
./ -i -e auth -V authenticated
./ -i -e auth -V authenticated
At this point all three hosts should be able to communicate. This can be verified by running the following command in the Mininet console:
mininet> pingall
*** Ping: testing ping reachability
h1 -> h2 h3 
h2 -> h1 h3 
h3 -> h1 h2
Note: It's interesting to play around with the authenticated forwarding policy. You can clear the authenticated state with the following command and run pingall again to verify that the host has been excluded from the network.
./ -i -e auth -V clear

4. Configure sFlow monitoring on the vSwitch

sudo ovs-vsctl -- --id=@sflow create sflow agent=eth0 target=\"\" sampling=2 polling=20 -- set bridge s1 sflow=@sflow
Note: An aggressive 1-in-2 sampling rate was used because stability problems with the controller didn't allow realistic traffic levels to be used (see Results section below).

5. Start sFlow-RT

cd sflow-rt

DDoS mitigation application

The following node.js script is based on the script in Controlling large flows with OpenFlow.
var http = require('http');
var exec = require('child_process').exec;

var keys = 'ipsource';
var value = 'frames';
var filter = 'outputifindex!=discard&direction=ingress';
var thresholdValue = 4;
var metricName = 'ddos';

var rt = { hostname: 'localhost', port: 8008 };
var flows = {'keys':keys,'value':value,'filter':filter, t:2};
var threshold = {'metric':metricName,'value':thresholdValue, byFlow:true};

function extend(destination, source) {
  for (var property in source) {
    if (source.hasOwnProperty(property)) {
      destination[property] = source[property];
    }
  }
  return destination;
}

function jsonGet(target,path,callback) {
  var options = extend({method:'GET',path:path},target);
  var req = http.request(options,function(resp) {
    var chunks = [];
    resp.on('data', function(chunk) { chunks.push(chunk); });
    resp.on('end', function() { callback(JSON.parse(chunks.join(''))); });
  });
  req.end();
}

function jsonPut(target,path,value,callback) {
  var options = extend({method:'PUT',headers:{'content-type':'application/json'},
                        path:path},target);
  var req = http.request(options,function(resp) {
    var chunks = [];
    resp.on('data', function(chunk) { chunks.push(chunk); });
    resp.on('end', function() { callback(chunks.join('')); });
  });
  req.write(JSON.stringify(value));
  req.end();
}

function sendy(address,type,state) {
  function callback(error, stdout, stderr) { };
  exec("../pyretic/pyretic/pyresonance/ -i "
        + address + " -e " + type + " -V " + state, callback);
}

function getEvents(id) {
  jsonGet(rt,'/events/json?maxEvents=10&timeout=60&eventID='+ id,
    function(events) {
      var nextID = id;
      if(events.length > 0) {
        nextID = events[0].eventID;
        var now = (new Date()).getTime();
        for(var i = 0; i < events.length; i++) {
          var evt = events[i];
          var dt = now - evt.timestamp;
          if(metricName == evt.thresholdID
            && Math.abs(dt) < 5000) {
            var flowKey = evt.flowKey;
            sendy(flowKey,'auth','clear');
          }
        }
      }
      getEvents(nextID);
    });
}

function setFlows() {
  jsonPut(rt,'/flow/' + metricName + '/json', flows,
    function() { setThreshold(); });
}

function setThreshold() {
  jsonPut(rt,'/threshold/' + metricName + '/json', threshold,
    function() { getEvents(-1); });
}

function initialize() {
  setFlows();
}

initialize();

There are a few points worth noting:
  1. This script is simpler than the original in Controlling large flows with OpenFlow. The application is no longer responsible for translating traffic measurements into concrete OpenFlow actions. Instead, the script expresses actions in terms of high level policy and the controller handles the translation of policy into specific OpenFlow actions.
  2. The script currently calls the script to implement policy changes. The Python script simply sends a JSON message over a TCP socket to the PyResonance controller and could easily be re-implemented in node.js, allowing the application to communicate directly with the controller. Alternatively, the controller could be modified to expose a RESTful HTTP version of the API.
  3. The low threshold value of 4 packets per second was assigned because stability problems with the controller didn't allow realistic traffic levels to be used (see Results section below).


The following command runs the denial of service mitigation script:
nodejs resonance.js
This example uses a Ping Flood to demonstrate a basic denial of service attack.

The following Mininet command opens a terminal window connected to host h1:
mininet> xterm h1
Start by typing the following command into the terminal to generate a small amount of traffic between h1 and h2:
Now generate a ping "flood" between h1 and h2:
ping -i 0.3

The chart shows that the controller is able to respond quickly when the traffic flow exceeds the defined threshold of 4 packets per second. The mitigation control is applied within a second, removing the host from the network.

The ping flood attack is quickly detected by sFlow-RT, which notifies the mitigation application, see Large flow detection script for a discussion of detection times and sampling rates. The mitigation application retrieves details of the attack from sFlow-RT and executes the following policy change:
./ -i -e auth -V clear
The PyResonance controller receives this message and the authorization state machine for host is set to "not authenticated". This state change results in a new Frenetic policy blocking traffic from this address. The policy is translated into a new set of OpenFlow rules for the switch that drop packets from the blocked address, but still allow authenticated hosts to send traffic.

Note: The host can easily be re-authenticated by issuing the command:
./ -i -e auth -V authenticated
While far from a complete application, this example demonstrates how the sFlow and OpenFlow standards can be combined to build fast acting performance aware SDN applications that address important use cases, such as DDoS mitigation, large flow load balancing, multi-tenant performance isolation, traffic engineering, and packet capture. The Mininet platform provides a convenient way to develop, test and share applications addressing these use cases.

The performance of the PyResonance controller was disappointing, limiting the tests to very low packet rates. Higher packet rates and large packet sizes quickly crash the controller. It isn't clear where the problem lies. Running a simple learning bridge in Pyretic is stable. The problem could be due to bugs in the PyResonance code, or PyResonance could be exposing bugs in Pyretic. It should be noted that the current version of Pyretic is a simple interpreter designed to test the Frenetic language and that the project plans to provide compilers that will proactively push rules to devices and deliver performance equivalent to custom built controllers, see Composing Software Defined Networks.  The Lithium controller project also looks interesting since they are building a controller based on finite state machines and policies that could also be a useful platform for traffic engineering. 
Aug. 29, 2013 Update: Joshua Reich, the developer of Pyretic, identified a known Python bug as the cause of the instability at high traffic levels and kindly provided a patch for Python 2.7. In addition, Hyojoon Kim, the developer of PyResonance, pointed out that the -m i option included in Step 2 of the instructions above puts the controller into reactive mode (which is much slower than proactive rule insertion). After applying the patch and dropping the -m i option, the PyResonance controller performance is now comparable to the previous tests that used the Floodlight controller, see Controlling large flows with OpenFlow.
Oct. 1, 2013 Update: Embedding SDN applications repeats this experiment using sFlow-RT's embedded JavaScript API and demonstrates that the PyResonance controller performs at the same level as Floodlight on the Mininet testbed, see Controlling large flows with OpenFlow.
Northbound APIs for traffic engineering described some of the frustrations with current OpenFlow controllers and argued that higher levels of abstraction and mechanisms for composing policies from multiple SDN applications are critical to the long term success of software defined networking. It is exciting to see researchers tackling these problems and it will be interesting to see how long it takes for these technologies to make their way into production quality controllers.

Tuesday, August 13, 2013

Northbound APIs for traffic engineering

Figure 1: Performance aware software defined networking
Previous articles on this blog have looked at use cases for performance aware software defined networking, including DDoS mitigation, load balancing large flows, traffic marking and packet brokers. Figure 1 shows the components of a traffic engineering SDN application. The sFlow-RT analytics engine receives a continuous stream of sFlow datagrams from network devices and converts them into actionable metrics, accessible through a REST API. SDN applications respond to changing network traffic by instructing a controller to adapt the network. In the diagram, an OpenFlow controller uses the OpenFlow protocol to communicate with network devices and modify their forwarding behavior.

Based on experiences with a variety of OpenFlow controllers, this article identifies areas where OpenFlow controllers could be improved in order to better support traffic engineering applications.

Firstly, it is worth understanding why external measurements need to be incorporated to build traffic engineering SDN applications. The following entry from the Floodlight FAQ page describes the limitations of OpenFlow for delivering measurements:
How can I measure link bandwidth and latency in Floodlight?
You can find the theoretical max link bandwidth of a port by making a features query through the REST API. Unfortunately finding the real link latency and bandwith is not as simple. OpenFlow, being a control-plane protocol has great difficult accurately determining data-plane measurements. The way to do this with OpenFlow entails making periodic flow stats requests to the switches on the network. The problem is that stats are never truly accurate. By the time they have been processed on the switch, sent over the network, and then processed by the controller they will be out of date. Data-plane measurements are best handled by third party applications.
Instrumentation to support the sFlow standard is embedded in the data plane of most switches, providing the real time visibility into link utilizations and large flows needed to support traffic engineering applications.

The first set of requirements arises from the need to translate between the agent addresses and interface index numbers used to identify switches and ports in sFlow and the identifiers used in OpenFlow.
  • Agent IP address ⟷ OpenFlow switch ID
  • SNMP ifIndex ⟷ OpenFlow port ID
Note: The requirement for these mappings isn't unique to sFlow; incorporating measurements from other measurement protocols requires similar mappings (e.g. NetFlow, IPFIX, jFlow, NetStream, SNMP, etc.)
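A minimal sketch of these lookups, with invented identifier values, shows the shape of the translation an application (or controller) needs to perform:

```python
# Translate sFlow identifiers (agent IP, SNMP ifIndex) into OpenFlow
# identifiers (datapath ID, port number). The table contents here are
# hypothetical; in practice they would be learned from the devices'
# configuration protocol (e.g. OVSDB or OF-Config).
agent_to_dpid = {'': '00:00:00:00:00:00:00:01'}
ifindex_to_ofport = {('', 5): 3}

def to_openflow(agent, ifindex):
    """Map an sFlow data source to an OpenFlow (dpid, port) pair."""
    return agent_to_dpid[agent], ifindex_to_ofport[(agent, ifindex)]

dpid, port = to_openflow('', 5)
```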

In order to support the incorporation of external data, the OpenFlow controller Northbound APIs should allow control actions to be expressed using external identifiers, automatically translating them into OpenFlow identifiers.

Any device supporting OpenFlow and sFlow (or any of the other measurement protocols) will be able to map between the different port and device identifiers. Exposing these mappings to OpenFlow controllers can be easily accomplished through the device configuration protocol. For example, the Open vSwitch project recently added SNMP ifIndex numbers to the Interface table exposed by its configuration protocol (ovs-vsctl):
% ovs-vsctl --format json --columns name,ofport,ifindex list Interface
Similar capabilities could be added to the Open Networking Foundation configuration protocol (OF-Config).
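The JSON output of the ovs-vsctl command above is straightforward to turn into the ifIndex-to-port mapping. The sketch below parses a canned sample of that output; an actual controller would capture the command output itself, and the row values here are illustrative.

```python
# Sketch: building an ifIndex -> OpenFlow port mapping from the JSON
# output of "ovs-vsctl --format json --columns name,ofport,ifindex
# list Interface". The sample output string below is illustrative.
import json

output = '''{"headings": ["name", "ofport", "ifindex"],
             "data": [["eth2", 2, 4], ["eth3", 3, 5]]}'''

table = json.loads(output)
cols = table['headings']
ifindex_to_ofport = {
    row[cols.index('ifindex')]: row[cols.index('ofport')]
    for row in table['data']
}
print(ifindex_to_ofport)  # {4: 2, 5: 3}
```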

The second set of requirements is more challenging. Current generation OpenFlow controllers are limited in their ability to integrate requests from multiple applications into a unified set of OpenFlow commands. For example, traffic engineering applications would be much easier to develop if the OpenFlow controller exposed the following Northbound actions:
  • mark flow at network edge (DSCP, priority etc.)
  • drop flow at network edge
  • rate limit flow at network edge
  • steer flow at network core
The controller should support wildcard matches on flow keys for these commands. In addition, the controller needs to accept commands from other APIs (authentication, access control etc.) and compose the potentially competing settings into OpenFlow rules that are realizable by the switch hardware and ensure that packets reach their destination.
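To make the first of these actions concrete, here is a hypothetical sketch of a "mark flow at network edge" request being compiled into a single edge-switch rule. The rule format, field names, and action types are illustrative (loosely modeled on OpenFlow 1.0 conventions), not a specific controller's API.

```python
# Hypothetical sketch: compiling a Northbound "mark flow at network
# edge" action into an OpenFlow-style rule. The rule/field names are
# illustrative, not a specific controller's API.

def mark_flow_at_edge(edge_dpid, match, dscp):
    """Compile a high-level marking request into one edge-switch rule.

    match uses wildcard semantics: only the listed keys are matched;
    all other flow keys are wildcarded.
    """
    return {
        'dpid': edge_dpid,
        'match': dict(match),
        'actions': [
            {'type': 'SET_NW_TOS', 'value': dscp << 2},  # DSCP in ToS bits
            {'type': 'OUTPUT', 'port': 'NORMAL'},        # then forward as usual
        ],
    }

rule = mark_flow_at_edge(
    '00:00:00:00:00:00:00:01',
    {'nw_src': '10.0.0.50', 'tp_dst': 80},
    dscp=46,  # Expedited Forwarding
)
print(rule['actions'][0])  # {'type': 'SET_NW_TOS', 'value': 184}
```

The hard part, as noted above, is not generating one such rule but composing it with rules from other applications (access control, authentication, etc.) into a set the switch hardware can actually realize.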

It is often argued that OpenFlow controllers are becoming a commodity and that the value of SDN derives from applications running on top of the controller (e.g. Much Ado About SDN Controller Platforms). However, a rich ecosystem of applications is only possible if the controller delivers the platform services needed to safely deploy multiple third party applications on production networks. An effective controller simplifies the task of application developers by factoring its APIs so that each application can address a specific task and rely on the platform to combine applications into a complete service. Customers benefit by being able to draw from a rich selection of pre-built applications in order to meet their specific requirements.

Today, controllers are alike because they are little more than OpenFlow device drivers supporting a single embedded SDN application (e.g. packet capture) that exports a RESTful Northbound API for configuration. The field is still wide open in the OpenFlow controller space: the first controller platform capable of supporting a rich ecosystem of commercial and open source SDN applications has the potential to quickly displace existing controllers, establish dominance in the market, and move SDN into the mainstream.

Sunday, August 4, 2013

Visibility and the software defined data center

The emerging software defined data center (SDDC) involves automated control of all network, server, storage and application resources - resulting in a "cloud operating system." Unified visibility is essential - allowing the cloud operating system to efficiently allocate resources, detect problems and ensure consistent performance.

The diagram shows how standard sFlow agents embedded within the elements of the infrastructure stream essential performance metrics to management tools, ensuring that every resource in a dynamic cloud infrastructure is immediately detected and continuously monitored.
  • Applications - e.g. Apache, NGINX, Tomcat, Memcache, HAProxy, F5 ...
  • Virtual Servers - e.g. Xen, Hyper-V, KVM ...
  • Virtual Network - e.g. Open vSwitch, Hyper-V extensible vSwitch
  • Servers - e.g. BSD, Linux, Solaris and Windows
  • Network - over 30 switch vendors, see Vendor support.
Embedding instrumentation within the infrastructure ensures that all resources are efficiently monitored and eliminates the complexity of installing, configuring and managing add-on measurement components later.

By standardizing the essential metrics, sFlow breaks the dependency between agents and management tools. Instead of having to install monitoring tool specific agents on each server - often multiple agents per server, each exporting similar metrics - a single embedded sFlow agent exports a standard set of metrics that ensure consistency in performance reporting, no matter which tools you choose for analysis.

More importantly, the measurements reported by sFlow agents form part of a consistent data model, allowing them to be combined into an integrated view of the performance of all the application instances and the server and network resources they depend on. The consistent, comprehensive and timely view of data center performance provided by the sFlow standard is an essential component of a cloud operating system.