Wednesday, October 28, 2015

Active Route Manager

SDN Active Route Manager has been released on GitHub, https://github.com/sflow-rt/active-routes. The software is based on the article White box Internet router PoC. Active Route Manager peers with a BGP route reflector to track prefixes and combines routing data with sFlow measurements to identify the most active prefixes. Active prefixes can be advertised via BGP to a commodity switch, which acts as a hardware route cache, accelerating the performance of a software router.
There is an interesting parallel with the Open vSwitch architecture, see Open vSwitch performance monitoring, which maintains a cache of active flows in the Linux kernel to accelerate forwarding. In the SDN routing case, active prefixes are pushed to the switch ASIC in order to bypass the slower software router.
In this example, the software is being used in passive mode, estimating the cache hit / miss rates without offloading routes. The software has been configured to manage a cache of 10,000 prefixes. The first screen shot shows the cache warming up.
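The passive-mode estimate can be understood with a simple model: maintain a fixed-size cache of recently active prefixes and count how often a sampled packet's destination prefix is already present. The following is a minimal sketch of this idea (a hypothetical LRU policy for illustration, not the Active Route Manager implementation):

```javascript
// Minimal sketch of passive cache hit / miss estimation.
// A fixed-size LRU cache of destination prefixes is simulated; each
// sampled packet either hits (prefix already cached) or misses
// (prefix inserted, evicting the least recently used entry).
function CacheSimulator(size) {
  this.size = size;
  this.cache = new Map();  // Map iterates keys in insertion order
  this.hits = 0;
  this.misses = 0;
}

CacheSimulator.prototype.lookup = function(prefix) {
  if(this.cache.has(prefix)) {
    this.hits++;
    // refresh recency by deleting and re-inserting the key
    this.cache.delete(prefix);
    this.cache.set(prefix, true);
    return true;
  }
  this.misses++;
  this.cache.set(prefix, true);
  if(this.cache.size > this.size) {
    // evict the least recently used entry (first key in iteration order)
    this.cache.delete(this.cache.keys().next().value);
  }
  return false;
};

CacheSimulator.prototype.missRate = function() {
  return this.misses / (this.hits + this.misses);
};
```

Feeding the destination prefix of each sampled packet into lookup() yields a running miss-rate estimate for any candidate cache size, without touching the hardware forwarding table.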

The first panel shows routes being learned from the route reflector: the upper chart shows the approximately 600,000 routes being learned from the BGP route reflector, and the lower chart shows the rate at which routes are being added (peaking at just over 300,000 prefixes per second).

The second panel shows traffic analytics: the top chart shows how many of the prefixes are seeing traffic (shown in blue) as well as the number of inactive prefixes that are covered by the active prefixes (shown in red). The lower chart shows the percentage of traffic destined to the active prefixes.

The third panel shows the behavior of the cache: the upper chart shows the total number of prefixes in the cache, the middle chart shows the rate of prefix additions and removals, and the lower chart shows the cache miss rate, which drops to less than 10% within a couple of seconds of the BGP route reflector session being established.
The second screen shot shows the cache a few minutes later once it has warmed up and is in steady state. There are approximately 10,000 prefixes in the cache with prefixes being added and removed as traffic patterns change and routes are added and dropped. The estimated cache miss rate is less than 0.5% and misses are mostly due to newly active prefixes rather than recently deleted cache entries.

This example is a further demonstration that it is possible to use SDN analytics and control to combine the standard sFlow and BGP capabilities of commodity switch hardware and deliver Terabit routing capacity.

Friday, October 9, 2015

Fabric View

The Fabric View application has been released on GitHub, https://github.com/sflow-rt/fabric-view. Fabric View provides real-time visibility into the performance of leaf and spine ECMP fabrics.
A leaf and spine fabric is challenging to monitor. The fabric spreads traffic across all the switches and links in order to maximize bandwidth. Unlike traditional hierarchical network designs, where a small number of links can be monitored to provide visibility, a leaf and spine network has no special links or switches where running CLI commands or attaching a probe would provide visibility. Even if it were possible to attach probes, the effective bandwidth of a leaf and spine network can be as high as a Petabit/second, well beyond the capabilities of current generation monitoring tools.

Fabric View solves the visibility challenge by using the industry standard sFlow instrumentation built into most data center switches. Fabric View represents the fabric as if it were a single large chassis switch, treating each leaf switch as a line card and the spine switches as the backplane. The result is an intuitive tool that is easily understood by anyone familiar with traditional networks.

Fabric View provides real-time, second-by-second visibility to traffic, identifying top talkers, protocols, tenants, tunneled traffic, etc. In addition, Fabric View reveals key fabric performance indicators such as number of congested spine links, links with colliding Elephant flows, discards and errors.
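Detecting congested links of this kind can be sketched from sFlow interface counters: compare each link's measured bit rate against its speed and flag links above a utilization threshold. A minimal illustration (the data structures and threshold are hypothetical, not Fabric View's internal code):

```javascript
// Sketch: flag congested links from interface counter rates.
// Each link record carries its speed (bits/sec) and the measured
// ifOutOctets rate; a link counts as congested above the threshold.
var CONGESTION_THRESHOLD = 0.8; // 80% utilization

function congestedLinks(links) {
  var result = [];
  for(var name in links) {
    var l = links[name];
    var utilization = (l.octetsPerSec * 8) / l.speed;  // octets -> bits
    if(utilization >= CONGESTION_THRESHOLD) {
      result.push({link: name, utilization: utilization});
    }
  }
  return result;
}
```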

If you have a leaf / spine network, download the software and try it out. Community support is available on sFlow-RT.com.

Demo


If you don't have access to a live network, Fabric View includes data captured from the Cumulus Networks workbench network shown above, a 2 leaf / 2 spine 10Gbit/s network.

First, download sFlow-RT
wget http://www.inmon.com/products/sFlow-RT/sflow-rt.tar.gz
tar -xvzf sflow-rt.tar.gz
cd sflow-rt
Next, install Fabric View:
./get-app.sh sflow-rt fabric-view
Now edit the start.sh script to playback the captured packet trace:
#!/bin/sh

HOME=`dirname $0`
cd $HOME

JAR="./lib/sflowrt.jar"
JVM_OPTS="-Xincgc -Xmx200m -Dsflow.file=app/fabric-view/demo/ecmp.pcap"
RT_OPTS="-Dsflow.port=6343 -Dhttp.port=8008"
SCRIPTS="-Dscript.file=init.js"

exec java ${JVM_OPTS} ${RT_OPTS} ${SCRIPTS} -jar ${JAR}
Start sFlow-RT:
[user@server sflow-rt]$ ./start.sh 
2015-10-09T11:08:25-0700 INFO: Reading PCAP file, app/fabric-view/demo/ecmp.pcap
2015-10-09T11:08:26-0700 INFO: Starting the Jetty [HTTP/1.1] server on port 8008
2015-10-09T11:08:26-0700 INFO: Starting com.sflow.rt.rest.SFlowApplication application
2015-10-09T11:08:26-0700 INFO: Listening, http://localhost:8008
2015-10-09T11:08:26-0700 INFO: init.js started
2015-10-09T11:08:26-0700 INFO: app/fabric-view/scripts/fabric-view-stats.js started
2015-10-09T11:08:26-0700 INFO: app/fabric-view/scripts/fabric-view.js started
2015-10-09T11:08:26-0700 INFO: app/fabric-view/scripts/fabric-view-elephants.js started
2015-10-09T11:08:26-0700 INFO: app/fabric-view/scripts/fabric-view-usr.js started
2015-10-09T11:08:27-0700 INFO: init.js stopped
Now install the network topology and address groupings:
[user@server sflow-rt]$ curl -H 'Content-Type:application/json' -X PUT --data @app/fabric-view/demo/topology.json http://127.0.0.1:8008/app/fabric-view/scripts/fabric-view.js/topology/json
[user@server sflow-rt]$ curl -H 'Content-Type:application/json' -X PUT --data @app/fabric-view/demo/groups.json http://127.0.0.1:8008/app/fabric-view/scripts/fabric-view.js/groups/json
Finally, access the web interface at http://server:8008/app/fabric-view/html/ and you should see the screen shown at the top of this article.

Real-time control

Visibility is just the starting point. Real-time detection of congested links and Elephant flows can be used to drive software defined networking (SDN) control actions to improve performance, for example, by marking and/or steering Elephant flows.

Leaf and spine traffic engineering using segment routing and SDN describes a demonstration shown at the 2015 Open Network Summit that integrated Fabric View with the ONOS controller to load balance Elephant flows.

The fabric-view-elephants.js script contains three dummy functions that are placeholders for control actions:
function elephantStart(flowKey, rec) {
  // place holder to mark elephant flows
}

function linkBusy(linkDs, linkRec, now) {
  // place holder to steer elephant flows
}

function elephantEnd(flowKey, rec) {
  // place holder to remove marking
}
RESTful control of Cumulus Linux ACLs demonstrated how flows could be marked. Adding the following code to the fabric-view-elephants.js script demonstrates how the large flow marking function can be integrated in Fabric View:
var aclProtocol = "http";
var aclPort = 8080;
var aclRoot = '/acl/';
var aclUser = null;    // set if the ACL server requires authentication
var aclPasswd = null;
var mark_dscp = 10;
var mark_cos  = 5;
var id = 0;

var marking_enabled = true;

function newRequest(host, path) {
    var url = aclProtocol + "://" + host;
    if(aclPort) url += ":" + aclPort;
    url += aclRoot + path;
    var req = {"url":url};
    if(aclUser) req.user = aclUser;
    if(aclPasswd) req.password = aclPasswd;
    return req;
}

function submitRequest(req) {
    if(!req.error) req.error =  function(err) { logWarning("request error " + req.url + " error: " + err); };
    try { httpAsync(req); }
    catch(e) { logWarning('bad request, ' + req.url + " " + e); }
}

function elephantStart(flowKey, rec) {
   if(!marking_enabled) return;

   var [ipsrc,ipdst,proto,srcprt,dstprt] = flowKey.split(',');

   // only interested in TCP flows (the protocol number is a string here)
   if(proto !== '6') return;

   var acl = [
     '[iptables]',
     '# marking Elephant flow',
     '-t mangle -A FORWARD --in-interface swp+ '
     + ' -s ' + ipsrc + ' -d ' + ipdst 
     + ' -p tcp --sport ' + srcprt + ' --dport ' + dstprt
     + ' -j SETQOS --set-dscp ' + mark_dscp + ' --set-cos ' + mark_cos
   ];

   var rulename = 'mark' + id++;
   var req = newRequest(rec.agent, rulename); // aclRoot supplies the /acl/ prefix
   req.operation = 'PUT';
   req.headers = {
     "Content-Type":"application/json; charset=utf-8",
     "Accept":"application/json"
   };
   req.body = JSON.stringify(acl);
   req.error= function(res) {
      logWarning('mark failed=' + res + ' agent=' + rec.agent);
   };
   req.success = function(res) {
      logInfo('mark rule=' + rulename);
   };
   rec.rulename=rulename; 
   submitRequest(req);
}

function elephantEnd(flowKey, rec) {
   if(!rec.rulename) return;

   var req = newRequest(rec.agent, rec.rulename); // aclRoot supplies the /acl/ prefix
   req.operation = 'DELETE';
   req.error = function(res) {
      logInfo('delete failed='+res);
   };
   req.success = function(res) {
      logInfo('delete rule='+rec.rulename);
   };
   submitRequest(req);
}
This example isn't limited to Cumulus Linux. It can easily be modified to interact with other vendors' switch APIs, for example NX-API for Cisco Nexus 9k/3k switches or eAPI for Arista switches, or with SDN controller REST APIs such as Floodlight, OpenDaylight, and ONOS.
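As a rough illustration of such an adaptation, the request-building logic above can be retargeted at a different REST endpoint. The sketch below constructs a flow-marking request for a generic controller; the endpoint path, port, and JSON field names are invented for illustration and would need to be replaced with the real API schema from the controller's documentation:

```javascript
// Hypothetical adaptation: build a flow-marking request for a generic
// SDN controller REST API. The '/flows/mark' path and the JSON field
// names are illustrative only - consult the controller's API docs.
function controllerMarkRequest(controller, flowKey, dscp) {
  var parts = flowKey.split(',');
  return {
    url: 'http://' + controller + ':8080/flows/mark', // invented path
    operation: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify({
      src: parts[0], dst: parts[1], protocol: parts[2],
      srcPort: parts[3], dstPort: parts[4],
      setDscp: dscp
    })
  };
}
```

The resulting request object has the same shape consumed by submitRequest() above, so only the URL construction and body format change between targets.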