Saturday, September 17, 2016

Triggered remote packet capture using filtered ERSPAN

Packet brokers are typically deployed as a dedicated network connecting network taps and SPAN/mirror ports to packet analysis applications such as Wireshark, Snort, etc.

Traditional hierarchical network designs were relatively straightforward to monitor using a packet broker since traffic flowed through a small number of core switches and so a small number of taps provided network wide visibility. The move to leaf and spine fabric architectures eliminates the performance bottleneck of core switches to deliver low latency and high bandwidth connectivity to data center applications. However, traditional packet brokers are less attractive since spreading traffic across many links with equal cost multi-path (ECMP) routing means that many more links need to be monitored.

This article will explore how the remote Selective Spanning capability in Cumulus Linux 3.0 combined with industry standard sFlow telemetry embedded in commodity switch hardware provides a cost effective alternative to traditional packet brokers.

Cumulus Linux uses iptables rules to specify packet capture sessions. For example, the following rule forwards packets with source IP 20.0.0.2 and destination IP 20.0.1.2 to a packet analyzer on host 20.0.2.2:
-A FORWARD --in-interface swp+ -s 20.0.0.2 -d 20.0.1.2 -j ERSPAN --src-ip 90.0.0.1 --dst-ip 20.0.2.2
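Since a capture rule is a string built from four addresses, it can be generated from parameters. The following is a minimal sketch rather than anything shipped with Cumulus Linux: the function name erspanRule and its option names (src, dst, tunnelSrc, analyzer) are assumptions made for illustration, and the output simply reproduces the rule shown above.

```javascript
// Hypothetical helper (not part of Cumulus Linux): builds an ERSPAN capture
// rule in the same form as the rule shown above.
function erspanRule(opts) {
  return '-A FORWARD --in-interface swp+' +
    ' -s ' + opts.src +
    ' -d ' + opts.dst +
    ' -j ERSPAN --src-ip ' + opts.tunnelSrc +
    ' --dst-ip ' + opts.analyzer;
}

var rule = erspanRule({
  src: '20.0.0.2',        // source address of the traffic to capture
  dst: '20.0.1.2',        // destination address of the traffic to capture
  tunnelSrc: '90.0.0.1',  // value passed to --src-ip
  analyzer: '20.0.2.2'    // value passed to --dst-ip (packet analyzer host)
});
console.log(rule);
```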
REST API for Cumulus Linux ACLs describes a simple Python wrapper that exposes iptables through a RESTful API. For example, the following command remotely installs the capture rule on switch 10.0.0.233:
curl -H "Content-Type:application/json" -X PUT --data \
  '["[iptables]","-A FORWARD --in-interface swp+ -s 20.0.0.2 -d 20.0.1.2 -j ERSPAN --src-ip 90.0.0.1 --dst-ip 20.0.2.2"]' \
  http://10.0.0.233:8080/acl/capture1
The following command deletes the rule:
curl -X DELETE http://10.0.0.233:8080/acl/capture1
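The two curl commands above can also be described as plain request objects, which makes it easier to see what is sent to the switch. This is a sketch under stated assumptions: the helper names installRequest and deleteRequest are hypothetical, and the port (8080), the /acl/ path, and the JSON array body are taken from the curl commands rather than from a published API specification.

```javascript
// Hypothetical helpers: describe the HTTP requests made by the curl commands
// above. Port 8080 and the /acl/<name> path are copied from those commands.
function installRequest(switchIP, name, rules) {
  return {
    method: 'PUT',
    url: 'http://' + switchIP + ':8080/acl/' + encodeURIComponent(name),
    headers: { 'Content-Type': 'application/json' },
    // The body is a JSON array: the section header followed by the rules.
    body: JSON.stringify(['[iptables]'].concat(rules))
  };
}

function deleteRequest(switchIP, name) {
  return {
    method: 'DELETE',
    url: 'http://' + switchIP + ':8080/acl/' + encodeURIComponent(name)
  };
}

var put = installRequest('10.0.0.233', 'capture1', [
  '-A FORWARD --in-interface swp+ -s 20.0.0.2 -d 20.0.1.2 ' +
  '-j ERSPAN --src-ip 90.0.0.1 --dst-ip 20.0.2.2'
]);
var del = deleteRequest('10.0.0.233', 'capture1');
console.log(put.method, put.url);
console.log(put.body);
console.log(del.method, del.url);
```

The objects only describe the requests; they would still need to be handed to an HTTP client to take effect.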
Selective Spanning makes it possible to turn every switch and port in the network into a capture device. However, it is important to carefully select which traffic to capture since the aggregate bandwidth of an ECMP fabric is measured in Terabits per second - far more traffic than can be handled by typical packet analyzers.
SDN packet broker compares the role that sFlow plays in steering the capture network to that of a finderscope, the small wide-angle telescope used to provide an overview of the sky and guide a telescope to its target. The article goes on to describe some of the benefits of combining sFlow analytics with selective packet capture:
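To get a feel for the mismatch in scale, the arithmetic can be worked through for a made-up fabric. Every figure in this sketch (32 leaf switches, four 40 Gbit/s uplinks each, a 10 Gbit/s link to the analyzer) is a hypothetical value chosen for illustration, not a measurement from the article. With these numbers the fabric offers about 5 Tbit/s of uplink capacity, and the analyzer link could carry roughly 0.2% of it.

```javascript
// All figures below are hypothetical, chosen only to illustrate the scale.
var leafSwitches = 32;     // assumed number of leaf switches
var uplinksPerLeaf = 4;    // assumed ECMP uplinks per leaf
var uplinkGbps = 40;       // assumed uplink speed in Gbit/s
var analyzerGbps = 10;     // assumed capacity of the analyzer's link in Gbit/s

// Aggregate uplink capacity of the fabric and the share one analyzer can see.
var fabricGbps = leafSwitches * uplinksPerLeaf * uplinkGbps;
var fraction = analyzerGbps / fabricGbps;

console.log('fabric capacity: ' + (fabricGbps / 1000) + ' Tbit/s');
console.log('analyzer share: ' + (fraction * 100).toFixed(2) + '%');
```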
  1. Offload: The capture network is a limited resource, both in terms of bandwidth and in the number of flows that can be simultaneously captured. Offloading as many tasks as possible to the sFlow analyzer frees up resources in the capture network, allowing the resources to be applied where they add most value. A good sFlow analyzer delivers data center wide visibility that can address many traffic accounting, capacity planning and traffic engineering use cases. In addition, many of the packet analysis tools (such as Wireshark) can accept sFlow data directly, further reducing the cases where a full capture is required.
  2. Context: Data center wide monitoring using sFlow provides context for triggering packet capture. For example, sFlow monitoring might show an unusual packet size distribution for traffic to a particular service. Queries to the sFlow analyzer can identify the set of switches and ports involved in providing the service and identify a set of attributes that can be used to selectively capture the traffic.
  3. DDoS: Certain classes of event such as DDoS flood attacks may be too large for the capture network to handle. DDoS mitigation with Cumulus Linux frees the capture network to focus on identifying more serious application layer attacks.
The diagram at the top of this article shows an example of using sFlow to target selective capture of traffic to blacklisted addresses. In this example sFlow-RT is used to perform real-time sFlow analytics. The following emerging.js script instructs sFlow-RT to download the Emerging Threats blacklist and identify any local hosts that are communicating with addresses in the blacklist. A full packet capture is triggered when a potentially compromised host is detected:
var wireshark = '10.0.0.70';
var idx=0;
function capture(localIP,remoteIP,agent) {
  var acl = [
    '[iptables]',
    '# emerging threat capture',
    '-A FORWARD --in-interface swp+ -s '+localIP+' -d '+remoteIP 
    +' -j ERSPAN --src-ip '+agent+' --dst-ip '+wireshark,
    '-A FORWARD --in-interface swp+ -s '+remoteIP+' -d '+localIP 
    +' -j ERSPAN --src-ip '+agent+' --dst-ip '+wireshark
  ];
  var id = 'emrg'+idx++;
  logWarning('capturing '+localIP+' rule '+id+' on '+agent);
  http('http://'+agent+':8080/acl/'+id,
        'PUT','application/json',JSON.stringify(acl));
}

var groups = {};
function loadGroup(name,url) {
  try {
    var res, cidrs = [], str = http(url);
    var reg = /^(\d{1,3}\.){3}\d{1,3}(\/\d{1,2})?$/mg;
    while((res = reg.exec(str)) != null) cidrs.push(res[0]);
    if(cidrs.length > 0) groups[name]=cidrs;
  } catch(e) {
    logWarning("failed to load " + url + ", " + e);
  }
}

loadGroup('compromised',
  'https://rules.emergingthreats.net/blockrules/compromised-ips.txt');
loadGroup('block',
  'https://rules.emergingthreats.net/fwrules/emerging-Block-IPs.txt');
setGroups('emerging',groups);

setFlow('emerging',
  {keys:'ipsource,ipdestination,group:ipdestination:emerging',value:'frames',
   log:true,flowStart:true});

setFlowHandler(function(rec) {
  var [localIP,remoteIP,group] = rec.flowKeys.split(',');
  try { capture(localIP,remoteIP,rec.agent); }
  catch(e) { logWarning("failed to capture " + e); }
});
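The blacklist parsing step in loadGroup can be tried outside sFlow-RT. The sketch below reuses the regular expression from emerging.js and adds a conceptual CIDR membership test to show what matching a destination address against a group involves. The function names parseCidrs, ipToInt and inCidr are hypothetical, the sample text is made up using documentation address ranges, and sFlow-RT's own group lookup may be implemented differently.

```javascript
// Same regular expression as loadGroup: one IPv4 address per line,
// optionally followed by a /prefix length.
function parseCidrs(text) {
  var res, cidrs = [];
  var reg = /^(\d{1,3}\.){3}\d{1,3}(\/\d{1,2})?$/mg;
  while ((res = reg.exec(text)) !== null) cidrs.push(res[0]);
  return cidrs;
}

// Convert a dotted-quad address to an unsigned 32-bit integer.
function ipToInt(ip) {
  return ip.split('.').reduce(function (acc, octet) {
    return ((acc << 8) | parseInt(octet, 10)) >>> 0;
  }, 0);
}

// Conceptual membership test; a bare address is treated as a /32.
function inCidr(ip, cidr) {
  var parts = cidr.split('/');
  var bits = parts.length > 1 ? parseInt(parts[1], 10) : 32;
  var mask = bits === 0 ? 0 : (0xFFFFFFFF << (32 - bits)) >>> 0;
  return ((ipToInt(ip) & mask) >>> 0) === ((ipToInt(parts[0]) & mask) >>> 0);
}

// Made-up sample in the style of a blocklist file.
var sample = '# comment line\n192.0.2.10\n198.51.100.0/24\nnot an address\n';
var cidrs = parseCidrs(sample);
console.log(cidrs);
console.log(inCidr('198.51.100.77', cidrs[1]));
console.log(inCidr('203.0.113.5', cidrs[1]));
```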
Some comments about the script:
  1. The script uses sFlow telemetry to identify the potentially compromised host and the location (agent) observing the traffic.
  2. The location information is required so that the capture rule can be installed on a switch that is in the traffic path.
  3. The application has been simplified for clarity. In production, the blacklist information would be periodically updated and the capture sessions would be tracked so that they can be deleted when they are no longer required.
  4. Writing Applications provides an introduction to sFlow-RT's API.
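The session tracking mentioned in the third comment could be approached along the following lines. This is only a sketch of one possible design, not code from the article: the CaptureTracker name, its methods, and the idle timeout are all assumptions, and the HTTP calls that would install and delete rules are left to the caller.

```javascript
// Hypothetical session tracker: remembers which capture rules are installed
// so duplicates are skipped and idle sessions can be removed.
function CaptureTracker(maxAgeMs) {
  this.maxAgeMs = maxAgeMs;
  this.sessions = {};   // key -> {id, agent, lastSeen}
  this.nextId = 0;
}

// Returns a new session to install, or null if one is already active.
CaptureTracker.prototype.start = function (localIP, remoteIP, agent, now) {
  var key = agent + ',' + localIP + ',' + remoteIP;
  var existing = this.sessions[key];
  if (existing) {
    existing.lastSeen = now;   // refresh rather than install a duplicate
    return null;
  }
  var session = { id: 'emrg' + this.nextId++, agent: agent, lastSeen: now };
  this.sessions[key] = session;
  return session;
};

// Removes and returns sessions not refreshed within maxAgeMs; the caller
// would issue an HTTP DELETE for /acl/<id> on each session's agent.
CaptureTracker.prototype.expire = function (now) {
  var self = this, expired = [];
  Object.keys(this.sessions).forEach(function (key) {
    if (now - self.sessions[key].lastSeen >= self.maxAgeMs) {
      expired.push(self.sessions[key]);
      delete self.sessions[key];
    }
  });
  return expired;
};

// Timestamps are passed in explicitly (milliseconds) to keep the sketch testable.
var tracker = new CaptureTracker(60000);
var first = tracker.start('10.0.0.162', '192.0.2.10', '10.0.0.253', 0);
var again = tracker.start('10.0.0.162', '192.0.2.10', '10.0.0.253', 30000);
var stale = tracker.expire(90000);
console.log(first.id, again, stale.length);
```

Passing the current time as an argument, rather than reading a clock inside the tracker, is a design choice that keeps the logic independent of any particular runtime.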
Configure sFlow on the Cumulus switches to stream telemetry to a host running Docker. Next, log into the host and run the following command in a directory containing the emerging.js script:
docker run -v "$PWD/emerging.js":/sflow-rt/emerging.js \
 -e "RTPROP=-Dscript.file=emerging.js" -p 6343:6343/udp sflow/sflow-rt
Note: Deploying analytics as a Docker service is a convenient method of packaging and running sFlow-RT. However, you can also download and install sFlow-RT as a package.

Once the software is running, you should see output similar to the following:
2016-09-17T22:19:16+0000 INFO: Listening, sFlow port 6343
2016-09-17T22:19:16+0000 INFO: Listening, HTTP port 8008
2016-09-17T22:19:16+0000 INFO: emerging.js started
2016-09-17T22:19:44+0000 WARNING: capturing 10.0.0.162 rule emrg0 on 10.0.0.253
The last line shows that traffic from host 10.0.0.162 to a blacklisted address has been detected and that a Selective Spanning session has been configured on switch 10.0.0.253 to capture packets and send them to the host running Wireshark (10.0.0.70) for further analysis.
