Sunday, November 24, 2013

Exporting events using syslog

Figure 1: ICMP unreachable
ICMP unreachable described how standard sFlow monitoring built into switches can be used to detect scanning activity on the network. This article shows how sFlow-RT's embedded scripting API can be used to notify Security Information and Event Management (SIEM) tools when unreachable messages are observed.
Figure 2: Components of sFlow-RT
The following sFlow-RT JavaScript application (syslog.js) defines a flow to track ICMP port unreachable messages and generates syslog events that are sent to the SIEM tool running on the host named by the server variable and listening for UDP syslog events on the default syslog port (514):
var server = '';
var port = 514;
var facility = 16; // local0
var severity = 5;  // notice

var flowkeys = ['ipsource','ipdestination','icmpunreachableport'];

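setFlow('uport', {
  keys: flowkeys,
  // remaining settings are missing from the source; these are assumed
  value: 'frames', log: true, flowStart: true
});

setFlowHandler(function(rec) {
  // flowKeys lists the key values, comma separated, in flowkeys order
  var keys = rec.flowKeys.split(',');
  var msg = {};
  for(var i = 0; i < flowkeys.length; i++) msg[flowkeys[i]] = keys[i];
  // assumed completion: send the event with the syslog() function
  syslog(server,port,facility,severity,msg);
});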
The following command line argument loads the script on startup:
The following screen capture shows the events collected by the Splunk SIEM tool:
While Splunk was used in this example, there are a wide variety of open source and commercial tools that can be used to collect and analyze syslog events. For example, the following screen capture shows events in the open source Logstash tool:
Splunk, Logstash and other SIEM tools don't natively understand sFlow records and require a tool like sFlow-RT to extract information and convert it into a text format that can be processed. Using sFlow-RT to selectively forward high value data reduces the load on the SIEM system and in the case of commercial software like Splunk significantly lowers the expense of monitoring since licensing costs are typically based on the volume of data collected and indexed.
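The facility and severity values set in syslog.js combine into a single syslog priority number (facility * 8 + severity). The following is a minimal sketch of that calculation and of a JSON text payload of the kind a SIEM tool can parse. The helper names syslogPriority and formatSyslog are hypothetical, the addresses are placeholders, and the exact message layout that sFlow-RT emits (timestamps, host names) is not shown in this article, so treat the output format as an assumption:

```javascript
// Hypothetical helpers, not part of the sFlow-RT API.

// Syslog priority: PRI = facility * 8 + severity
function syslogPriority(facility, severity) {
  return facility * 8 + severity;
}

// Prefix a JSON-encoded message object with the <PRI> marker.
function formatSyslog(facility, severity, msg) {
  return '<' + syslogPriority(facility, severity) + '>' + JSON.stringify(msg);
}

// local0 (16) and notice (5), as in syslog.js; addresses are placeholders
var example = formatSyslog(16, 5, {
  ipsource: '10.0.0.1',
  ipdestination: '10.0.0.2',
  icmpunreachableport: 'udp_30000'
});
console.log(example);
// prints <133>{"ipsource":"10.0.0.1","ipdestination":"10.0.0.2","icmpunreachableport":"udp_30000"}
```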

ICMP unreachable messages are only one example of the kinds of events that can be generated from sFlow data. The sFlow standard provides a scalable method of monitoring all the network, server and application resources in the data center; see Visibility and the software defined data center.
Figure 3: Visibility and the software defined data center
For example, Cluster performance metrics describes how sFlow-RT can be used to summarize performance metrics. Periodically polling these metrics, or setting thresholds on them, is another source of events for the SIEM system. A hybrid approach that splits the metrics stream so that exceptions are sent to the SIEM system and periodic summaries are sent to a time series database (e.g. Metric export to Graphite) leverages the strengths of the different tools.
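The hybrid approach can be sketched as a function that takes one set of polled metric values and splits it in two: every value goes into a summary for the time series database, and only values that exceed a threshold become events for the SIEM system. The function name splitMetrics, the metric names and the threshold values are all hypothetical and chosen purely for illustration:

```javascript
// Hypothetical helper: split one poll of metric values into
// summaries (for the time series database) and events (for the SIEM).
function splitMetrics(metrics, thresholds) {
  var result = { summaries: {}, events: [] };
  for (var name in metrics) {
    if (!Object.prototype.hasOwnProperty.call(metrics, name)) continue;
    var value = metrics[name];
    // every metric is kept in the periodic summary
    result.summaries[name] = value;
    // only threshold violations are forwarded as events
    if (Object.prototype.hasOwnProperty.call(thresholds, name) && value > thresholds[name]) {
      result.events.push({ metric: name, value: value, threshold: thresholds[name] });
    }
  }
  return result;
}

// Example values are made up
var split = splitMetrics({ load_one: 12, cpu_idle: 80 }, { load_one: 10 });
console.log(JSON.stringify(split.events));
```

In an sFlow-RT script, the summaries would be the natural input to graphite() and each event the input to syslog(), although that wiring is not shown here.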

Finally, log export is only one of many applications for sFlow data, some of which have been described on this blog. The data center wide visibility provided by sFlow-RT supports orchestration tools and allows them to automatically optimize the allocation of compute, storage and application resources and the placement of loads on these resources.

Saturday, November 23, 2013

Metric export to Graphite

Figure 1: Cluster performance metrics
Cluster performance metrics describes how sFlow-RT can be used to calculate summary metrics for cluster performance. The article includes a Python script that polls sFlow-RT's REST API and then sends metrics to Graphite. In this article sFlow-RT's internal scripting API will be used to send metrics directly to Graphite.
Figure 2: Components of sFlow-RT
The following script (graphite.js) re-implements the Python example (generating a sum of the load_one metric for a cluster of Linux machines) in JavaScript using sFlow-RT built-in functions for retrieving metrics and sending them to Graphite:
// author: Peter
// version: 1.0
// date: 11/23/2013
// description: Log metrics to Graphite


var graphiteServer = "";
var graphitePort = null;

var errors = 0;
var sent = 0;
var lastError;

setIntervalHandler(function() {
  var names = ['sum:load_one'];
  var prefix = 'linux.';
  var vals = metric('ALL',names,{os_name:['linux']});
  var metrics = {};
  for(var i = 0; i < names.length; i++) {
    metrics[prefix + names[i]] = vals[i].metricValue;
  }
  try {
    // assumed completion: the body of this block is missing from the source
    graphite(graphiteServer,graphitePort,metrics);
    sent++;
  } catch(e) {
    errors++;
    lastError = e.message;
  }
} , 15);

setHttpHandler(function() {
  var message = { 'errors':errors,'sent':sent };
  if(lastError) message.lastError = lastError;
  return JSON.stringify(message);
});
The interval handler function runs every 15 seconds and retrieves the set of metrics in the names array (in this case just one metric, but multiple metrics could be retrieved). The names are then converted into a Graphite-friendly form (prefixing each metric with the token linux. so that they can be easily grouped) and then sent to the Graphite collector identified by graphiteServer, using the default TCP port 2003. The script also keeps track of any errors and makes them available through the URL /script/graphite.js/json
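For readers curious about what arrives at the collector, Graphite's plaintext protocol on port 2003 expects one line per metric in the form "path value timestamp". The sketch below shows how a metrics object like the one built in graphite.js could be turned into those lines. The function toGraphiteLines is hypothetical and is not how sFlow-RT's graphite() function is necessarily implemented:

```javascript
// Hypothetical helper: render a metrics object as Graphite plaintext
// protocol lines ("<path> <value> <timestamp>\n", timestamp in seconds).
function toGraphiteLines(metrics, timestampSeconds) {
  var lines = [];
  for (var name in metrics) {
    if (!Object.prototype.hasOwnProperty.call(metrics, name)) continue;
    lines.push(name + ' ' + metrics[name] + ' ' + timestampSeconds);
  }
  // each line, including the last, is newline terminated
  return lines.length > 0 ? lines.join('\n') + '\n' : '';
}

// The value 1.5 is made up; the name matches the prefix used in graphite.js
var now = Math.floor(Date.now() / 1000);
console.log(toGraphiteLines({ 'linux.sum:load_one': 1.5 }, now));
```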

The following command line argument loads the script on startup:
The following Graphite screen capture shows a trend of the metric:
There is a virtually unlimited number of core and derived metrics that can be collected by sFlow-RT using standard sFlow instrumentation embedded in switches, servers and applications throughout the data center. For example, Packet loss describes the importance of collecting network packet loss metrics and including them in performance dashboards.
Figure 3: Visibility and the software defined data center
While having access to all these metrics is extremely useful, not all of them need to be stored in Graphite. Using sFlow-RT to calculate and selectively export high value metrics reduces pressure on the time series database, while still allowing any of the remaining metrics to be polled using the REST API when needed.

Finally, metrics export is only one of many applications for sFlow data, some of which have been described on this blog. The data center wide visibility provided by sFlow-RT supports orchestration tools and allows them to automatically optimize the allocation of compute, storage and application resources and the placement of loads on these resources.

Thursday, November 14, 2013

SC13 large flow demo

For the duration of the SC13 conference, Denver will host one of the most powerful and advanced networks in the world - SCinet. Created each year for the conference, SCinet brings to life a very high capacity network that supports the revolutionary applications and experiments that are a hallmark of the SC conference. SCinet will link the Colorado Convention Center to research and commercial networks around the world. In doing so, SCinet serves as the platform for exhibitors to demonstrate the advanced computing resources of their home institutions and elsewhere by supporting a wide variety of bandwidth-driven applications including supercomputing and cloud computing. - SCinet

The screen shot is from a live demonstration of network-wide large flow detection and tracking using standard sFlow instrumentation built into switches in the SCinet network. Currently, switches from multiple vendors, with 1,223 ports at speeds up to 100Gbit/s, are sending sFlow data.
Note: The network is currently being set up; traffic levels will build up and reach a peak next week during the SC13 show (Nov. 17-22). Visit the demonstration site next week to see live traffic on one of the world's busiest networks:
The sFlow-RT real-time analytics engine is receiving the sFlow and centrally tracking large flows. The HTML5 web pages poll the analytics engine every half second for the largest 100 flows in order to update the charts, which represent large flows as follows:
  • Dot represents an IP address
  • Circle represents a logical grouping of IP addresses
  • Line width represents bandwidth consumed by flow
  • Line color identifies traffic type
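The chart update step can be sketched as two small functions: one that keeps only the largest flows from a polled list, and one that maps a flow's bandwidth onto a line width. The names topFlows and lineWidth, the linear scaling and the sample values are assumptions made for illustration; the demonstration's actual drawing code is not shown in this article:

```javascript
// Hypothetical helper: keep the maxFlows largest flows, biggest first.
function topFlows(flows, maxFlows) {
  return flows.slice().sort(function(a, b) {
    return b.value - a.value;
  }).slice(0, maxFlows);
}

// Hypothetical helper: scale bandwidth linearly to a line width in pixels,
// never thinner than 1 so that small flows stay visible.
function lineWidth(value, maxValue, maxWidth) {
  if (maxValue <= 0) return 1;
  return Math.max(1, Math.round(maxWidth * value / maxValue));
}

// Made-up flow records (value in bits per second)
var flows = [
  { key: 'a', value: 5e9 },
  { key: 'b', value: 20e9 },
  { key: 'c', value: 10e9 }
];
var largest = topFlows(flows, 2);
largest.forEach(function(f) {
  console.log(f.key + ' width=' + lineWidth(f.value, largest[0].value, 8));
});
```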
Real-time detection and tracking of large flows has many applications in software defined networking (SDN), including DDoS mitigation, large flow load balancing, and multi-tenant performance isolation. For more information, see Performance Aware SDN.

Sunday, November 10, 2013

UDP packet replication using Open vSwitch

UDP protocols such as sFlow, syslog, NetFlow, IPFIX and SNMP traps have many advantages for large scale network and system monitoring, see Push vs Pull. In a typical deployment each managed element is configured to send UDP packets to a designated collector (specified by an IP address and port). For example, in a simple sFlow monitoring system all the switches might be configured to send sFlow data to UDP port 6343 on the host running the sFlow analysis application. Complex deployments may require multiple analysis applications, for example: a first application providing analytics for software defined networking, a second focused on host performance, a third addressing packet capture and security, and a fourth looking at application performance. In addition, a second copy of each application may be required for redundancy. The challenge is getting copies of the data to all the application instances in an efficient manner.

There are a number of approaches to replicating UDP data, each with limitations:
  1. IP Multicast - if the data is sent to an IP multicast address then each application could subscribe to the multicast channel and receive a copy of the data. This sounds great in theory, but in practice configuring and maintaining IP multicast connectivity can be a challenge, and all the agents and collectors would need to support IP multicast. IP multicast also doesn't address the situation where multiple applications run on a single host, so that each application has to receive the UDP data on a different port.
  2. Replicate at source - each agent could be configured to send a copy of the data to each application. Replicating at source is a configuration challenge (all agents need to be reconfigured if you add an additional application). This approach is also wasteful of bandwidth - multiple copies of the same data are sent across the network.
  3. Replicate at destination - a UDP replicator, or "samplicator" application receives the stream of UDP messages, copies them and resends them to each of the applications. This functionality may be deployed as a standalone application, or be an integrated function within an analysis application. The replicator application is a single point of failure - if it is shut down none of the applications receive data. The replicator adds delay to the measurements and at high data rates can significantly increase UDP loss rate as the datagrams are received, sent, and received again.
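To make the third approach concrete, here is a minimal sketch of a replicate-at-destination application written for Node.js using its dgram module. It is not the samplicator tool itself; the function names, the listening port and the target list are assumptions. The fan-out step takes the send function as a parameter so that it can be exercised without a network:

```javascript
var dgram = require('dgram');

// Send one copy of a received datagram to every target.
// Returns the number of copies sent.
function fanOut(datagram, targets, send) {
  targets.forEach(function(target) {
    send(datagram, target.port, target.host);
  });
  return targets.length;
}

// Hypothetical wiring: listen on one UDP port and resend every datagram.
// The copies carry the replicator's own source IP address.
function startReplicator(listenPort, targets) {
  var socket = dgram.createSocket('udp4');
  socket.on('message', function(datagram) {
    fanOut(datagram, targets, function(buf, port, host) {
      socket.send(buf, port, host);
    });
  });
  socket.bind(listenPort);
  return socket;
}

// Example (not started here; addresses and ports are placeholders):
// startReplicator(6343, [{ host: '127.0.0.1', port: 7343 },
//                        { host: '127.0.0.1', port: 8343 }]);
```

Every datagram passes through the application twice (once in, once out per target), which is the source of the added delay and loss described in the list above.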
This article will examine a fourth option, using software defined networking (SDN) techniques to replicate and distribute data within the network. The Open vSwitch is implemented in the Linux kernel and includes OpenFlow and network virtualization features that will be used to build the replication network.

First, you will need a server (or virtual machine) running a recent version of Linux. Next download and install Open vSwitch.

Next, configure the Open vSwitch to handle networking for the server:
ovs-vsctl add-br br0
ovs-vsctl add-port br0 eth0
ifconfig eth0 0
ifconfig br0
Now configure the UDP agents to send their data to the IP address assigned to br0. You should be able to run a collector application for each service port (e.g. sFlow 6343, syslog 514, etc.).

The first case to consider is replicating the datagrams to a second port on the server (sending packets to App 1 and App 2 in the diagram). First, use the ovs-vsctl command to list the OpenFlow port numbers on the virtual switch:
% ovs-vsctl --format json --columns name,ofport list Interface
We are interested in replicating packets received on eth0, and the output shows that the corresponding OpenFlow port is 1.
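If the port lookup needs to be automated, the JSON printed by the ovs-vsctl command can be parsed with a few lines of JavaScript. The sketch below assumes the output is a single JSON object with a headings array and a data array of rows; the sample string is an illustration of that assumed shape rather than output captured from a real switch, and ofportFor is a hypothetical helper:

```javascript
// Hypothetical helper: find the OpenFlow port number for an interface
// in the JSON table printed by "ovs-vsctl --format json ... list Interface".
function ofportFor(output, ifName) {
  var table = JSON.parse(output);
  var nameCol = table.headings.indexOf('name');
  var portCol = table.headings.indexOf('ofport');
  for (var i = 0; i < table.data.length; i++) {
    if (table.data[i][nameCol] === ifName) return table.data[i][portCol];
  }
  return null; // interface not found
}

// Assumed sample output
var sample = '{"data":[["eth0",1],["br0",65534]],"headings":["name","ofport"]}';
console.log(ofportFor(sample, 'eth0'));
```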

The Open vSwitch provides a command line utility ovs-ofctl that uses the OpenFlow protocol to configure forwarding rules in the vSwitch. The following OpenFlow rule will replicate sFlow datagrams:
in_port=1 dl_type=0x0800 nw_proto=17 tp_dst=6343 actions=LOCAL,mod_tp_dst:7343,normal
The match part of the rule looks for packets received on port 1 (in_port=1), where the Ethernet type is IPv4 (dl_type=0x0800), the IP protocol is UDP (nw_proto=17), and the destination UDP port is 6343 (tp_dst=6343). The actions section of the rule is the key to building the replication function. The LOCAL action delivers the original packet as intended. The destination port is then changed to 7343 (mod_tp_dst:7343) and the modified packet is sent through the normal processing path to be delivered to the second application.
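When several protocols need to be replicated (sFlow, syslog and so on), it can be convenient to generate the rule text rather than type it. The following sketch assembles the same match fields and actions described above from three parameters. The function name replicationRule is hypothetical; the field names are the ones used in the rule in this article:

```javascript
// Hypothetical helper: build an OpenFlow rule that delivers the original
// UDP datagram locally and sends a copy to a second port on the same host.
function replicationRule(inPort, udpPort, copyPort) {
  var match = [
    'in_port=' + inPort,   // OpenFlow port the packets arrive on
    'dl_type=0x0800',      // IPv4
    'nw_proto=17',         // UDP
    'tp_dst=' + udpPort    // original destination UDP port
  ].join(' ');
  var actions = ['LOCAL', 'mod_tp_dst:' + copyPort, 'normal'].join(',');
  return match + ' actions=' + actions;
}

console.log(replicationRule(1, 6343, 7343));
// prints in_port=1 dl_type=0x0800 nw_proto=17 tp_dst=6343 actions=LOCAL,mod_tp_dst:7343,normal
```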

Save this rule to a file, say replicate.txt, and then use ovs-ofctl to apply the rule to br0:
ovs-ofctl add-flows br0 replicate.txt
At this point a second sFlow analysis application listening for sFlow datagrams on port 7343 should start receiving data - sflowtool is a convenient way to verify that the packets are being received:
sflowtool -p 7343
The second case to consider is replicating the datagrams to a remote host (sending packets to App 3 in the diagram).
in_port=1 dl_type=0x0800 nw_proto=17 tp_dst=6343 actions=LOCAL,mod_tp_dst:7343,normal,mod_nw_src:,mod_nw_dst:,normal
The extended rule includes additional actions that modify the source IP address (mod_nw_src) and the destination IP address (mod_nw_dst) of the packets and send the modified packet through the normal processing path. Since we are relying on the routing functionality in the Linux stack to deliver the packet, make sure that routing is enabled - see How to Enable IP Forwarding in Linux.
Unicast reverse path filtering (uRPF) is a mechanism that routers use to drop spoofed packets (i.e. packets where the source address doesn't belong to the subnet on the access port the packet was received on). uRPF should be enabled wherever practical because spoofing is used in a variety of security and denial of service attacks, e.g. DNS amplification attacks. By modifying the IP source address to be the address of the forwarding host rather than the original source IP address, the OpenFlow rule ensures that the packet will pass through uRPF filters, both on the host and on the access router. Rewriting the sFlow source address does not cause any problems because the sFlow protocol identifies the original source of the data within its payload and doesn't rely on the IP source address. However, other UDP protocols (for example, NetFlow/IPFIX) rely on the IP source address to identify the source of the data. In this case, removing the mod_nw_src action will leave the IP source address unchanged, but the packet may well be dropped by uRPF filters. Newer Linux distributions implement strict uRPF by default; however, it can be disabled if necessary, see Reverse Path Filtering.
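The filtering decision can be illustrated with a simplified check: does the packet's source address belong to the subnet configured on the port it arrived on? The sketch below implements only that IPv4 subnet test; a real uRPF implementation consults the routing table, and the function names and addresses here are assumptions:

```javascript
// Convert a dotted IPv4 address to a number (arithmetic rather than
// bitwise operators, to avoid 32-bit sign issues in JavaScript).
function ipToNumber(ip) {
  var parts = ip.split('.');
  var n = 0;
  for (var i = 0; i < 4; i++) n = n * 256 + parseInt(parts[i], 10);
  return n;
}

// True if ip falls inside the subnet written as "a.b.c.d/bits".
function inSubnet(ip, cidr) {
  var pieces = cidr.split('/');
  var size = Math.pow(2, 32 - parseInt(pieces[1], 10));
  return Math.floor(ipToNumber(ip) / size) === Math.floor(ipToNumber(pieces[0]) / size);
}

// Simplified uRPF decision for an access port: accept the packet only
// if its source address belongs to the subnet on that port.
function passesUrpf(sourceIp, accessSubnet) {
  return inSubnet(sourceIp, accessSubnet);
}

// Placeholder addresses: a forwarder on 10.0.0.0/24 resending a packet
console.log(passesUrpf('10.0.0.20', '10.0.0.0/24'));   // source rewritten to the forwarder
console.log(passesUrpf('192.168.1.5', '10.0.0.0/24')); // original remote source kept
```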
This article has only scratched the surface of capabilities of the Open vSwitch. In situations where passing the raw packets across the network isn't feasible the Open vSwitch can be configured to send the packets over a tunnel (sending packets to App 4 in the diagram). Tunnels, in conjunction with OpenFlow, can be used to create a virtual UDP distribution overlay network with its own addressing scheme and topology - Open vSwitch is used by a number of network virtualization vendors (e.g. VMware NSX). In addition, more complex filters can also be implemented, forwarding datagrams based on source subnet to different collectors etc.

The replication functions don't need to be performed in software in the virtual switch. OpenFlow rules can be pushed to OpenFlow capable hardware switches which can perform the replication, or source based forwarding functions at wire speed. A full blown controller based solution isn't necessarily required, the ovs-ofctl command can be used to push OpenFlow rules to physical switches.

More generally, building flexible UDP datagram distribution and replication networks is an interesting use case for software defined networking. The power of software defined networking is that you can adapt the network behavior to suit the needs of the application - in this case overcoming the limitations of existing UDP distribution solutions by modifying the behavior of the network.

Sunday, November 3, 2013

ICMP unreachable

Figure 1: ICMP port unreachable
Figure 1 provides an example that demonstrates how Internet Control Message Protocol (ICMP) destination port unreachable messages are generated. In the example, host h1 sends a UDP packet to port 30000 on host h4. The packet transits switches s1 and s2 on its path to h4. In this case, h4 is not running a service that listens for UDP packets on port 30000, so host h4 sends an ICMP destination port unreachable message (ICMP type 3, code 3) back to host h1 to inform it that the port cannot be reached. ICMP unreachable messages include the header of the original packet within their payload so that the sender can examine the header fields and determine the source of the error.
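Because the original header travels inside the ICMP payload, the unreachable port can be recovered from the message itself. The following sketch decodes an ICMP destination unreachable message held in an array of bytes: 8 bytes of ICMP header, followed by the original IPv4 header and the first 8 bytes of the original datagram. The function name parsePortUnreachable is hypothetical, the addresses standing in for h1 and h4 are placeholders, and the udp_30000 style of result simply mirrors the key format used elsewhere in this article:

```javascript
// Hypothetical decoder for an ICMP port unreachable message (type 3, code 3).
// icmp is an array of byte values starting at the ICMP header.
function parsePortUnreachable(icmp) {
  if (icmp.length < 8 + 20 + 4) return null;       // too short to decode
  if (icmp[0] !== 3 || icmp[1] !== 3) return null; // not port unreachable
  var ip = 8;                                      // embedded IPv4 header offset
  var headerLen = (icmp[ip] & 0x0f) * 4;           // IHL is in 32-bit words
  if (icmp.length < ip + headerLen + 4) return null;
  var protocol = icmp[ip + 9];
  var names = { 6: 'tcp', 17: 'udp' };
  var l4 = ip + headerLen;                         // embedded TCP/UDP header
  var dstPort = icmp[l4 + 2] * 256 + icmp[l4 + 3];
  return {
    host: icmp.slice(ip + 16, ip + 20).join('.'),  // original destination address
    port: (names[protocol] || protocol) + '_' + dstPort
  };
}

// Placeholder message: 10.0.0.1 sent UDP to port 30000 on 10.0.0.4
var message = [
  3, 3, 0, 0, 0, 0, 0, 0,              // ICMP type, code, checksum, unused
  0x45, 0, 0, 28, 0, 0, 0, 0,          // IPv4: version/IHL, TOS, length, id, flags
  64, 17, 0, 0,                        // TTL, protocol (UDP), checksum
  10, 0, 0, 1, 10, 0, 0, 4,            // source, destination
  156, 64, 117, 48, 0, 8, 0, 0         // UDP: src 40000, dst 30000, length, checksum
];
console.log(JSON.stringify(parsePortUnreachable(message)));
```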

ICMP unreachable messages provide a clear indication of configuration errors and should be rare in a well configured network. Typically, the ICMP unreachable messages that are seen result from scanning and network reconnaissance:
  • Scanning a host for open ports will generate ICMP port / protocol unreachable messages
  • Scanning for hosts will generate ICMP host / network unreachable messages
The sources of scanning activity can identify compromised hosts on the network and give information about potential security challenges to the network. In the example, UDP port 30000 is known to be associated with trojan activity, so any requests to connect to this port from host h1 suggest that h1 may be compromised. It also makes sense to follow up to see if any hosts are responding to requests to UDP port 30000.

The challenge in monitoring ICMP messages is that there is no single location that can see all the messages - they take a direct path between sender and receiver. Installing monitoring agents on all the hosts poses practical challenges in a heterogeneous environment, and agent based monitoring may be circumvented since trojans often disable security monitoring software when they infect a host.

Support for the sFlow standard in switches provides an independent method of profiling host behavior. The sFlow standard is widely supported by switch vendors and has the scalability to deliver real-time, network-wide monitoring of host traffic. The switches export packet headers, allowing the central monitoring software to perform deep packet inspection and extract details from the ICMP protocol.

DNS amplification attacks describes how the sFlow-RT analyzer can be used to monitor DNS activity. The SMURF attack uses spoofed ICMP messages as a method of DDoS amplification and similar techniques to those described in the DNS article can be used to detect and mitigate these attacks.

The following example illustrates how sFlow can be used to monitor ICMP unreachable activity; a single instance of sFlow-RT is monitoring 7500 switch ports in a data center network.

The following ICMP attributes are extracted from packet samples and can be used in flow definitions or as filters:

Description | Name | Example
Message type, e.g. Destination Unreachable (3) | icmptype | 3
Message code, e.g. Protocol Unreachable (2) | icmpcode | 2
IP address in network unreachable response | icmpunreachablenet | 10.0.0.1
Host in host unreachable response | icmpunreachablehost | 10.0.0.1
Protocol in protocol unreachable response | icmpunreachableprotocol | 41
Port in port unreachable response | icmpunreachableport | udp_30000

The following flow definitions were created using sFlow-RT's embedded scripting API:
Alternatively, the flow definitions can be specified by making calls to the REST API using cURL:
curl -H "Content-Type:application/json" -X PUT --data "{keys:'icmpunreachableport', value:'frames', t:20}" http://localhost:8008/flow/uports/json
Using the script API has a number of advantages: it ensures that flow definitions are automatically reinstated on a system restart, makes it easy to generate trend charts (for example the graphite() function sends metrics to Graphite for integration in performance dashboards) and to automate the response when ICMP anomalies are detected (for example, using the syslog() function to send an alert or http() to access a REST API on a device or SDN controller to block the traffic).
The table above (http://localhost:8008/activeflows/ALL/uports/html?maxFlows=20&aggMode=sum) shows a continuously updating, real-time, view of the top ICMP unreachable ports - a bit like the Linux top command, but applied to the active flows. The table shows that the most frequently reported unreachable port is UDP port 30000.

There are a number of more detailed flow definitions that can be created:
  • To identify hosts generating scan packets, include ipdestination in the flow definition
  • To identify targets of Smurf attacks, include ipdestination and filter to exclude local addresses
  • To identify target country, include destinationcountry and filter to exclude local addresses
Note: Examples of these detailed flows have been omitted to preserve anonymity.
Figure 2: Performance aware software defined networking
Incorporating sFlow analytics in a performance aware software defined networking solution offers the opportunity to automate a response. The following script monitors for ICMP unreachable messages and generates syslog events when an unreachable message is detected:
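setFlowHandler(function(rec) {
  var name = rec.name; // assumed: the right-hand side is missing from the source
  var keys = rec.flowKeys.split(',');
  var msg = {type:name,host:keys[0],target:keys[1]};
  // the rest of the handler, which sends msg as a syslog event, is missing from the source
});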

setFlow('unets',{keys:'ipdestination,icmpunreachablenet', value:'frames', t:20, log:true, flowStart:true});
setFlow('uhosts',{keys:'ipdestination,icmpunreachablehost', value:'frames', t:20, log:true, flowStart:true});
setFlow('uprotos',{keys:'ipdestination,icmpunreachableprotocol', value:'frames', t:20, log:true, flowStart:true});
setFlow('uports',{keys:'ipdestination,icmpunreachableport', value:'frames', t:20, log:true, flowStart:true});
While this example focused on a data center hosting servers, a similar approach could be used to monitor campus networks, detecting hosts that are scanning or participating in DDoS attacks. In this case, the SDN controller would respond by isolating the compromised hosts from the rest of the network.