Tuesday, September 27, 2016

Docker 1.12 swarm mode elastic load balancing

Docker Built-In Orchestration Ready For Production: Docker 1.12 Goes GA describes the native swarm mode feature that integrates cluster management, virtual networking, and policy-based deployment of services.

This article will demonstrate how real-time streaming telemetry can be used to construct an elastic load balancing solution that dynamically adjusts service capacity to match changing demand.

Getting started with swarm mode describes the steps to configure a swarm cluster. For example, the following command, issued on any of the manager nodes, deploys a web service on the cluster:
docker service create --replicas 2 -p 80:80 --name apache httpd:2.4
And the following command raises the number of containers in the service pool from 2 to 4:
docker service scale apache=4
Asynchronous Docker metrics describes how sFlow telemetry provides the real-time visibility required for elastic load balancing. The diagram shows how streaming telemetry allows the sFlow-RT controller to determine the load on the service pool so that it can use the Docker service API to automatically increase or decrease the size of the pool as demand changes. Elastic load balancing of the service pools ensures consistent service levels by adding resources when demand increases. In addition, efficiency is improved by releasing resources when demand drops so that they can be used by other services. Finally, global visibility into all resources and services makes it possible to load balance between services, reducing service pools for non-critical services to release resources during peak demand.

The first step is to install and configure Host sFlow agents on each of the nodes in the Docker swarm cluster. The following /etc/hsflowd.conf file configures Host sFlow to monitor Docker and send sFlow telemetry to a designated collector:
sflow {
  sampling = 400
  polling = 10
  collector { ip = } 
  docker { }
  pcap { dev = docker0 }
  pcap { dev = docker_gwbridge }
}
Note: The configuration file is identical for all nodes in the cluster, making it easy to automate the installation and configuration of sFlow monitoring using Puppet, Chef, Ansible, etc.
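Because the file is the same on every node, it can be generated from a couple of parameters. The following is a minimal sketch in plain JavaScript, not part of Host sFlow: the helper name renderHsflowdConf is hypothetical, and the collector address 192.0.2.10 is a placeholder from the documentation address range rather than the address used in this article.

```javascript
// Hypothetical helper: renders an /etc/hsflowd.conf body like the one above.
// The collector address and device list are supplied by the caller.
function renderHsflowdConf(collectorIp, devices) {
  var lines = [
    'sflow {',
    '  sampling = 400',
    '  polling = 10',
    '  collector { ip = ' + collectorIp + ' }',
    '  docker { }'
  ];
  devices.forEach(function (dev) {
    lines.push('  pcap { dev = ' + dev + ' }');
  });
  lines.push('}');
  return lines.join('\n') + '\n';
}

// 192.0.2.10 is a placeholder, not the collector used in the article.
var conf = renderHsflowdConf('192.0.2.10', ['docker0', 'docker_gwbridge']);
console.log(conf);
```

A configuration management tool could write the returned text to each node and restart the agent.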

Verify that the sFlow measurements are arriving at the collector node using sflowtool:
docker run -p 6343:6343/udp sflow/sflowtool
The following elb.js script implements elastic load balancer functionality using the sFlow-RT real-time analytics engine:
var api = "";
var certs = '/tls/';
var service = 'apache';

var replicas_min = 1;
var replicas_max = 10;
var util_min = 0.5;
var util_max = 1;
var bytes_min = 50000;
var bytes_max = 100000;
var enabled = false;

function getInfo(name) {
  var info = null;
  var url = api+'/services/'+name;
  try { info = JSON.parse(http2({url:url, certs:certs}).body); }
  catch(e) { logWarning("cannot get " + url + " error=" + e); }
  return info;
}

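function setReplicas(name,count,info) {
  var version = info["Version"]["Index"];
  var spec = info["Spec"];
  // assumed step: write the new replica count into the spec before posting it
  spec["Mode"]["Replicated"]["Replicas"] = count;
  var url = api+'/v1.24/services/'+info["ID"]+'/update?version='+version;
  try {
    // assumed: the http2() wrapper, headers and body mirror the getInfo() call
    http2({
      url:url, certs:certs, method:'POST',
      headers:{'Content-Type':'application/json'},
      body:JSON.stringify(spec)
    });
  }
  catch(e) { logWarning("cannot post to " + url + " error=" + e); }
  logInfo(service+" set replicas="+count);
}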

var hostpat = service+'\\.*';
setIntervalHandler(function() {
  var info = getInfo(service);
  if(!info) return;

  var replicas = info["Spec"]["Mode"]["Replicated"]["Replicas"];
  if(!replicas) {
    logWarning("no active members for service=" + service);
    return;
  }

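  var res = metric(
    'ALL', 'avg:vir_cpu_utilization,avg:vir_bytes_in,avg:vir_bytes_out',
    // assumed filter: restrict the query to containers named after the service
    {'vir_host_name':[hostpat]}
  );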

  var n = res[0].metricN;

  // we aren't seeing all the containers (yet)
  if(replicas !== n) return;

  var util = res[0].metricValue;
  var bytes = res[1].metricValue + res[2].metricValue;

  if(!enabled) return;

  // load balance
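  if(replicas < replicas_max && (util > util_max || bytes > bytes_max)) {
    // add one replica when either threshold is exceeded
    setReplicas(service,replicas+1,info);
  }
  else if(replicas > replicas_min && util < util_min && bytes < bytes_min) {
    // remove one replica when both metrics are below their lower thresholds
    setReplicas(service,replicas-1,info);
  }
},2);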

setHttpHandler(function(req) {
  enabled = req.query && req.query.state && req.query.state[0] === 'enabled';
  return enabled ? "enabled" : "disabled";
});
Some notes on the script:
  1. The setReplicas(name,count,info) function uses the Docker Remote API to implement functionality equivalent to the docker service scale name=count command shown earlier. The address of the REST API is assigned to the api variable at the top of the script.
  2. The setIntervalHandler() function runs every 2 seconds, retrieving metrics for the service pool and scaling the number of replicas in the service up or down based on thresholds.
  3. The setHttpHandler() function exposes a simple REST API for enabling / disabling the load balancer functionality. The API can easily be extended to allow thresholds to be set, to report statistics, etc.
  4. Certificates, key.pem, cert.pem, and ca.pem, required to authenticate API requests must be present in the /tls/ directory.
  5. The thresholds are set to unrealistically low values for the purpose of this demonstration.
  6. The script can easily be extended to load balance multiple services simultaneously.
  7. Writing Applications provides additional information on sFlow-RT scripting.
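The threshold logic in the interval handler can be isolated as a pure function, which makes it easy to reason about and test outside sFlow-RT. The sketch below is illustrative only: the function name nextReplicas and the limits object are hypothetical, the threshold values are copied from the script settings, and the one-replica step is an assumption that matches the single-step changes in the log output shown later.

```javascript
// Thresholds copied from the elb.js settings above.
var limits = {
  replicas_min: 1, replicas_max: 10,
  util_min: 0.5, util_max: 1,
  bytes_min: 50000, bytes_max: 100000
};

// Returns the replica count the pool should move to, one step at a time.
// Scale up if EITHER metric is above its upper threshold; scale down only
// if BOTH metrics are below their lower thresholds; otherwise hold.
function nextReplicas(replicas, util, bytes, lim) {
  if (replicas < lim.replicas_max &&
      (util > lim.util_max || bytes > lim.bytes_max)) {
    return replicas + 1;
  }
  if (replicas > lim.replicas_min &&
      util < lim.util_min && bytes < lim.bytes_min) {
    return replicas - 1;
  }
  return replicas;
}

console.log(nextReplicas(2, 0.2, 150000, limits)); // busy network: 3
console.log(nextReplicas(3, 0.2, 60000, limits));  // inside the band: 3
console.log(nextReplicas(3, 0.1, 1000, limits));   // idle: 2
console.log(nextReplicas(1, 0.1, 1000, limits));   // already at minimum: 1
```

The gap between the lower and upper thresholds acts as a dead band, so the pool holds steady rather than oscillating when load sits between the two.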
Run the controller:
docker run -v `pwd`/tls:/tls -v `pwd`/elb.js:/sflow-rt/elb.js \
 -e "RTPROP=-Dscript.file=elb.js" -p 8008:8008 -p 6343:6343/udp -d sflow/sflow-rt
The autoscaling functionality can be enabled:
curl "http://localhost:8008/script/elb.js/json?state=enabled"
and disabled:
curl "http://localhost:8008/script/elb.js/json?state=disabled"
using the REST API exposed by the script.
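The handler's test on the state parameter can be tried in isolation. This is a minimal sketch run against mock request objects; the function name isEnabled is hypothetical, and it assumes, as the script's use of state[0] suggests, that query parameters arrive as arrays of strings. The Boolean() wrapper is an addition that normalizes the result when the parameter is absent.

```javascript
// Mirrors the expression used in the setHttpHandler() callback above.
// Assumption: each query parameter is delivered as an array of strings,
// which is why the script reads state[0].
function isEnabled(req) {
  return Boolean(req.query && req.query.state &&
                 req.query.state[0] === 'enabled');
}

console.log(isEnabled({ query: { state: ['enabled'] } }));  // true
console.log(isEnabled({ query: { state: ['disabled'] } })); // false
console.log(isEnabled({ query: {} }));                      // false
console.log(isEnabled({}));                                 // false
```

Any value other than enabled, including a missing parameter, switches the load balancer off.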
The chart above shows the results of a simple test to demonstrate the elastic load balancer function. First, ab - Apache HTTP server benchmarking tool was used to generate load on the apache service running under Docker swarm:
ab -rt 60 -n 300000 -c 4
Next, the test was repeated with the elastic load balancer enabled. The chart clearly shows that the load balancer is keeping the average network load on each container under control.
2016-09-24T00:57:10+0000 INFO: Listening, sFlow port 6343
2016-09-24T00:57:10+0000 INFO: Listening, HTTP port 8008
2016-09-24T00:57:10+0000 INFO: elb.js started
2016-09-24T01:00:17+0000 INFO: apache set replicas=2
2016-09-24T01:00:23+0000 INFO: apache set replicas=3
2016-09-24T01:00:27+0000 INFO: apache set replicas=4
2016-09-24T01:00:33+0000 INFO: apache set replicas=5
2016-09-24T01:00:41+0000 INFO: apache set replicas=6
2016-09-24T01:00:47+0000 INFO: apache set replicas=7
2016-09-24T01:00:59+0000 INFO: apache set replicas=8
2016-09-24T01:01:29+0000 INFO: apache set replicas=7
2016-09-24T01:01:33+0000 INFO: apache set replicas=6
2016-09-24T01:01:35+0000 INFO: apache set replicas=5
2016-09-24T01:01:39+0000 INFO: apache set replicas=4
2016-09-24T01:01:43+0000 INFO: apache set replicas=3
2016-09-24T01:01:45+0000 INFO: apache set replicas=2
2016-09-24T01:01:47+0000 INFO: apache set replicas=1
The sFlow-RT log shows that containers are added to the apache service to handle the increased load and removed once demand decreases.

This example relied on a small subset of the information available from the sFlow telemetry stream. In addition to container resource utilization, the Host sFlow agent exports an extensive set of metrics from the nodes in the Docker swarm cluster. If the nodes are virtual machines running in a public or private cloud, the metrics can be used to perform elastic load balancing of the virtual machine pool making up the cluster, increasing the cluster size if demand increases and reducing cluster size when demand decreases. In addition, poorly performing instances can be detected and removed from the cluster (see Stop thief! for an example).
The sFlow agents also efficiently report on traffic flowing within and between microservices running on the swarm cluster. For example, the following command:
docker run -p 6343:6343/udp -p 8008:8008 -d sflow/top-flows
launches the top-flows application to show an up-to-the-second view of active flows in the network.

Comprehensive real-time analytics is critical to effectively managing agile container-based infrastructure. Open source Host sFlow agents provide a lightweight method of instrumenting the infrastructure that unifies network and system monitoring to deliver a full set of standard metrics to performance management applications.

Monday, September 26, 2016

Asynchronous Docker metrics

Docker allows large numbers of lightweight containers to be started and stopped within seconds, creating an agile infrastructure that can rapidly adapt to changing requirements. However, the rapidly changing population of containers poses a challenge to traditional methods of monitoring, which struggle to keep pace with the changes. For example, periodic polling methods take time to detect new containers and can miss short-lived containers entirely.

This article describes how the latest version of the Host sFlow agent is able to track the performance of a rapidly changing population of Docker containers and export a real-time stream of standard sFlow metrics.
The diagram above shows the life cycle status events associated with a container. The Docker Remote API provides a set of methods that allow the Host sFlow agent to communicate with Docker to list containers and receive asynchronous container status events. The Host sFlow agent uses the events to keep track of running containers and periodically exports CPU, memory, network and disk performance counters for each container.

The diagram at the beginning of this article shows the sequence of messages, going from top to bottom, required to track a container. The Host sFlow agent first registers for container lifecycle events before asking for all the currently running containers. Later, when a new container is started, Docker immediately sends an event to the Host sFlow agent, which requests additional information (such as the container process identifier - PID) that it can use to retrieve performance counters from the operating system. Initial counter values are retrieved and exported along with container identity information as an sFlow counters message and a polling task for the new container is initiated. Container counters are periodically retrieved and exported while the container continues to run (2 polling intervals are shown in the diagram). When the Host sFlow agent receives an event from Docker indicating that the container is being stopped, it retrieves the final values of the performance counters, exports a final sFlow message, and removes the polling task for the container.

This method of asynchronously triggered periodic counter export allows an sFlow collector to accurately track rapidly changing container populations in large scale deployments. The diagram only shows the sequence of events relating to monitoring a single container. Docker network visibility demonstration shows the full range of network traffic and system performance information being exported.
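The sequence described above (export on start, export on every polling interval, final export on stop) can be sketched as a small state machine. This is an illustration in plain JavaScript, not the Host sFlow agent's code: the ContainerTracker name, the simplified start/stop event shape, and the readCounters and exportSample callbacks are all hypothetical stand-ins.

```javascript
// Hypothetical tracker illustrating event-triggered periodic export.
// readCounters(id) and exportSample(sample) are caller-supplied stand-ins
// for reading OS counters and sending an sFlow counters message.
function ContainerTracker(readCounters, exportSample) {
  this.polling = {};            // container id -> true while being polled
  this.readCounters = readCounters;
  this.exportSample = exportSample;
}

// Export the current counter values for one container.
ContainerTracker.prototype.exportNow = function (id) {
  this.exportSample({ id: id, counters: this.readCounters(id) });
};

// Handle an asynchronous lifecycle event from the container runtime.
ContainerTracker.prototype.onEvent = function (event) {
  if (event.status === 'start') {
    this.polling[event.id] = true;  // begin periodic polling
    this.exportNow(event.id);       // initial export, no waiting for a poll
  } else if (event.status === 'stop' && this.polling[event.id]) {
    this.exportNow(event.id);       // final export before the data is lost
    delete this.polling[event.id];  // remove the polling task
  }
};

// Called once per polling interval by a timer.
ContainerTracker.prototype.pollAll = function () {
  Object.keys(this.polling).forEach(this.exportNow, this);
};

// Simulated run: one container, two polling intervals, then a stop event.
var exported = [];
var ticks = 0;
var tracker = new ContainerTracker(
  function () { return { cpu: ++ticks }; },
  function (sample) { exported.push(sample); }
);
tracker.onEvent({ status: 'start', id: 'c1' });
tracker.pollAll();
tracker.pollAll();
tracker.onEvent({ status: 'stop', id: 'c1' });
tracker.pollAll();               // nothing left to poll
console.log(exported.length);    // 4 exports: initial, two polls, final
```

Because the first and last exports are triggered by events rather than by the timer, a container that lives for less than one polling interval still produces two samples.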

Detailed real-time visibility is essential for fully realizing the benefits of agile container infrastructure, providing the feedback needed to track and automatically optimize the performance of large scale microservice deployments.

Saturday, September 17, 2016

Triggered remote packet capture using filtered ERSPAN

Packet brokers are typically deployed as a dedicated network connecting network taps and SPAN/mirror ports to packet analysis applications such as Wireshark, Snort, etc.

Traditional hierarchical network designs were relatively straightforward to monitor using a packet broker since traffic flowed through a small number of core switches and so a small number of taps provided network wide visibility. The move to leaf and spine fabric architectures eliminates the performance bottleneck of core switches to deliver low latency and high bandwidth connectivity to data center applications. However, traditional packet brokers are less attractive since spreading traffic across many links with equal cost multi-path (ECMP) routing means that many more links need to be monitored.

This article will explore how the remote Selective Spanning capability in Cumulus Linux 3.0 combined with industry standard sFlow telemetry embedded in commodity switch hardware provides a cost effective alternative to traditional packet brokers.

Cumulus Linux uses iptables rules to specify packet capture sessions. For example, the following rule forwards packets with source IP and destination IP to a packet analyzer on host
-A FORWARD --in-interface swp+ -s -d -j ERSPAN --src-ip --dst-ip
REST API for Cumulus Linux ACLs describes a simple Python wrapper that exposes IP tables through a RESTful API. For example, the following command remotely installs the capture rule on switch
curl -H "Content-Type:application/json" -X PUT --data \
  '["[iptables]","-A FORWARD --in-interface swp+ -s -d -j ERSPAN --src-ip --dst-ip"]' \
The following command deletes the rule:
curl -X DELETE
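The JSON body sent in the PUT request above is an array containing a section marker followed by the rule text. The following sketch builds that body from its parts. The helper name erspanAcl is hypothetical, and every address shown is a placeholder from the documentation address ranges, not an address from this article.

```javascript
// Hypothetical helper: builds the JSON body for the ACL REST wrapper.
// All four addresses are caller-supplied; none come from the article.
function erspanAcl(srcIP, dstIP, switchIP, analyzerIP) {
  return [
    '[iptables]',
    '-A FORWARD --in-interface swp+ -s ' + srcIP + ' -d ' + dstIP +
      ' -j ERSPAN --src-ip ' + switchIP + ' --dst-ip ' + analyzerIP
  ];
}

// Placeholder addresses from 192.0.2.0/24 and 198.51.100.0/24.
var body = JSON.stringify(
  erspanAcl('192.0.2.10', '192.0.2.20', '198.51.100.1', '198.51.100.70'));
console.log(body);
```

Generating the rule text in one place keeps the match fields (-s, -d) and the ERSPAN tunnel endpoints (--src-ip, --dst-ip) from being confused with each other.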
Selective Spanning makes it possible to turn every switch and port in the network into a capture device. However, it is important to carefully select which traffic to capture since the aggregate bandwidth of an ECMP fabric is measured in Terabits per second, far more traffic than can be handled by typical packet analyzers.
SDN packet broker compares the role that sFlow plays in steering the capture network to that of a finderscope, the small wide-angle telescope used to provide an overview of the sky and guide a telescope to its target. The article goes on to describe some of the benefits of combining sFlow analytics with selective packet capture:
  1. Offload: The capture network is a limited resource, both in terms of bandwidth and in the number of flows that can be simultaneously captured. Offloading as many tasks as possible to the sFlow analyzer frees up resources in the capture network, allowing the resources to be applied where they add most value. A good sFlow analyzer delivers data center wide visibility that can address many traffic accounting, capacity planning and traffic engineering use cases. In addition, many of the packet analysis tools (such as Wireshark) can accept sFlow data directly, further reducing the cases where a full capture is required.
  2. Context: Data center wide monitoring using sFlow provides context for triggering packet capture. For example, sFlow monitoring might show an unusual packet size distribution for traffic to a particular service. Queries to the sFlow analyzer can identify the set of switches and ports involved in providing the service and identify a set of attributes that can be used to selectively capture the traffic.
  3. DDoS: Certain classes of event such as DDoS flood attacks may be too large for the capture network to handle. DDoS mitigation with Cumulus Linux frees the capture network to focus on identifying more serious application layer attacks.
The diagram at the top of this article shows an example of using sFlow to target selective capture of traffic to blacklisted addresses. In this example sFlow-RT is used to perform real-time sFlow analytics. The following emerging.js script instructs sFlow-RT to download the Emerging Threats blacklist and identify any local hosts that are communicating with addresses in the blacklist. A full packet capture is triggered when a potentially compromised host is detected:
var wireshark = '';
var idx=0;
function capture(localIP,remoteIP,agent) {
  var acl = [
    '# emerging threat capture',
    '-A FORWARD --in-interface swp+ -s '+localIP+' -d '+remoteIP 
    +' -j ERSPAN --src-ip '+agent+' --dst-ip '+wireshark,
    '-A FORWARD --in-interface swp+ -s '+remoteIP+' -d '+localIP 
    +' -j ERSPAN --src-ip '+agent+' --dst-ip '+wireshark
  ];
  var id = 'emrg'+idx++;
  logWarning('capturing '+localIP+' rule '+id+' on '+agent);
}

var groups = {};
function loadGroup(name,url) {
  try {
    var res, cidrs = [], str = http(url);
    var reg = /^(\d{1,3}\.){3}\d{1,3}(\/\d{1,2})?$/mg;
    while((res = reg.exec(str)) != null) cidrs.push(res[0]);
    if(cidrs.length > 0) groups[name]=cidrs;
  } catch(e) {
    logWarning("failed to load " + url + ", " + e);
  }
}

setFlowHandler(function(rec) {
  var [localIP,remoteIP,group] = rec.flowKeys.split(',');
  try { capture(localIP,remoteIP,rec.agent); }
  catch(e) { logWarning("failed to capture " + e); }
});
Some comments about the script:
  1. The script uses sFlow telemetry to identify the potentially compromised host and the location (agent) observing the traffic.
  2. The location information is required so that the capture rule can be installed on a switch that is in the traffic path.
  3. The application has been simplified for clarity. In production, the blacklist information would be periodically updated and the capture sessions would be tracked so that they can be deleted when they are no longer required.
  4. Writing Applications provides an introduction to sFlow-RT's API.
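The session tracking mentioned in the third comment could take the following shape. This is a sketch under stated assumptions, not part of sFlow-RT or the original script: the CaptureSessions name, the fixed maximum age, and the removeRule callback (standing in for the DELETE request) are hypothetical, and the addresses are placeholders from the documentation ranges.

```javascript
// Hypothetical bookkeeping for capture sessions; not part of sFlow-RT.
// removeRule(agent, id) is a caller-supplied stand-in for the DELETE request.
function CaptureSessions(maxAgeMs, removeRule) {
  this.maxAgeMs = maxAgeMs;
  this.removeRule = removeRule;
  this.sessions = {};   // 'localIP,remoteIP' -> { id, agent, started }
  this.idx = 0;
}

// Returns a new rule id, or null if this host pair is already captured.
CaptureSessions.prototype.open = function (localIP, remoteIP, agent, now) {
  var key = localIP + ',' + remoteIP;
  if (this.sessions[key]) return null;
  var id = 'emrg' + this.idx++;
  this.sessions[key] = { id: id, agent: agent, started: now };
  return id;
};

// Removes sessions older than maxAgeMs and returns the ids that were removed.
CaptureSessions.prototype.expire = function (now) {
  var removed = [];
  Object.keys(this.sessions).forEach(function (key) {
    var s = this.sessions[key];
    if (now - s.started >= this.maxAgeMs) {
      this.removeRule(s.agent, s.id);
      delete this.sessions[key];
      removed.push(s.id);
    }
  }, this);
  return removed;
};

var deleted = [];
var sessions = new CaptureSessions(60000, function (agent, id) {
  deleted.push(agent + '/' + id);
});
var first = sessions.open('192.0.2.10', '203.0.113.5', '198.51.100.1', 0);
var dup = sessions.open('192.0.2.10', '203.0.113.5', '198.51.100.1', 1000);
var gone = sessions.expire(60000);
console.log(first, dup, gone, deleted);
```

Keying sessions on the host pair avoids installing a duplicate rule each time the same flow is reported, and recording the agent with each session means the rule can be removed from the switch that holds it.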
Configure sFlow on the Cumulus switches to stream telemetry to a host running Docker. Next, log into the host and run the following command in a directory containing the emerging.js script:
docker run -v "$PWD/emerging.js":/sflow-rt/emerging.js \
 -e "RTPROP=-Dscript.file=emerging.js" -p 6343:6343/udp sflow/sflow-rt
Note: Deploying analytics as a Docker service is a convenient method of packaging and running sFlow-RT. However, you can also download and install sFlow-RT as a package.

Once the software is running, you should see output similar to the following:
2016-09-17T22:19:16+0000 INFO: Listening, sFlow port 6343
2016-09-17T22:19:16+0000 INFO: Listening, HTTP port 8008
2016-09-17T22:19:16+0000 INFO: emerging.js started
2016-09-17T22:19:44+0000 WARNING: capturing rule emrg0 on
The last line shows that traffic from a local host to a blacklisted address has been detected and that a selective spanning session has been configured on the switch observing the traffic to capture packets and send them to the host running Wireshark for further analysis.

Wednesday, August 17, 2016

Real-time web analytics

The diagram shows a typical scale out web service with a load balancer distributing requests among a pool of web servers. The sFlow HTTP Structures standard is supported by commercial load balancers, including F5 and A10, and open source load balancers and web servers, including HAProxy, NGINX, Apache, and Tomcat.
The simplest way to try out the examples in this article is to download sFlow-RT and install the Host sFlow agent and Apache mod-sflow instrumentation on a Linux web server.

The following sFlow-RT metrics report request rates based on the standard sFlow HTTP counters:
  • http_method_option
  • http_method_get
  • http_method_head
  • http_method_post
  • http_method_put
  • http_method_delete
  • http_method_trace
  • http_method_connect
  • http_method_other
  • http_status_1xx
  • http_status_2xx
  • http_status_3xx
  • http_status_4xx
  • http_status_5xx
  • http_status_other
  • http_requests
In addition, mod-sflow exports the following standard thread pool metrics:
  • workers_active
  • workers_idle
  • workers_max
  • workers_utilization
  • req_delayed
  • req_dropped
Cluster performance metrics describes how sFlow-RT's REST API is used to compute summary statistics for a pool of servers. For example, the following query calculates the cluster wide total request rates:
More interesting is that the sFlow telemetry stream also includes randomly sampled HTTP request records with the following attributes:
  • protocol
  • serveraddress
  • serveraddress6
  • serverport
  • clientaddress
  • clientaddress6
  • clientport
  • proxyprotocol
  • proxyserveraddress
  • proxyserveraddress6
  • proxyserverport
  • proxyclientaddress
  • proxyclientaddress6
  • proxyclientport
  • httpmethod
  • httpprotocol
  • httphost
  • httpuseragent
  • httpxff
  • httpauthuser
  • httpmimetype
  • httpurl
  • httpreferer
  • httpstatus
  • bytes
  • req_bytes
  • resp_bytes
  • duration
  • requests
The sFlow-RT analytics pipeline is programmable. Defining Flows describes how to compute additional metrics based on the sampled requests. For example, the following flow definition creates a new metric called image_bytes that tracks the volume of image data in HTTP responses as a bytes/second value calculated over a 10 second window:
setFlow('image_bytes', {value:'resp_bytes',t:10,filter:'httpmimetype~image/.*'});
The new metric can be queried in exactly the same way as the counter based metrics above, e.g.:
The uri: function is used to extract parts of the httpurl or httpreferer URL fields. The following attributes can be extracted:
  • normalized
  • scheme
  • user
  • authority
  • host
  • port
  • path
  • file
  • extension
  • query
  • fragment
  • isabsolute
  • isopaque
For example, the following flow definition creates a metric called games_reqs that tracks the requests/second hitting URL paths with the prefix /games/:
setFlow('games_reqs', {value:'requests',t:10,filter:'uri:httpurl:path~/games/.*'});
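To make the effect of the path filter concrete, the sketch below applies an equivalent test to a few sampled URLs using Node's built-in URL class. It is a stand-alone illustration, not sFlow-RT code: the helper name pathOf is hypothetical, the query string on one sample URL is invented, and it assumes the filter pattern must match the whole path.

```javascript
// Stand-alone illustration using Node's built-in URL class; sFlow-RT applies
// its own uri: function inside the analytics engine.
function pathOf(httpurl) {
  // The base is only needed so that relative URLs such as "/index.php" parse.
  return new URL(httpurl, 'http://placeholder.invalid').pathname;
}

// Assumption: the filter pattern must match the whole path.
var gamesPattern = /^\/games\/.*$/;

// Sample request URLs; the "?level=2" query string is invented to show
// that only the path component is tested.
var sampled = ['/login.php', '/games/animals.php?level=2',
               '/games/puzzles.php', '/sales/buy.php'];

var gameRequests = sampled.filter(function (u) {
  return gamesPattern.test(pathOf(u));
});
console.log(gameRequests.length); // 2 of the 4 sampled requests match
```

Extracting the path first means query strings and fragments do not affect whether a request is counted.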
Define flow keys to identify slowest requests, most popular URLs, etc. For example, the following definition tracks the top 5 longest duration requests:
setFlow('slow_reqs', {keys:'httpurl',value:'duration',t:10,n:5});
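Conceptually, a keyed flow with a limit of n accumulates a value per key and reports the largest entries. The sketch below shows that idea in plain JavaScript. It is not how sFlow-RT is implemented internally, the function name topFlows is hypothetical, and the sample durations are made up.

```javascript
// Conceptual sketch of a keyed top-N flow: accumulate a value per key, then
// keep the n largest. This is not how sFlow-RT is implemented internally.
function topFlows(records, n) {
  var totals = {};
  records.forEach(function (r) {
    totals[r.key] = (totals[r.key] || 0) + r.value;
  });
  return Object.keys(totals)
    .map(function (key) { return { key: key, value: totals[key] }; })
    .sort(function (a, b) { return b.value - a.value; })
    .slice(0, n);
}

// Made-up sample durations for illustration only.
var records = [
  { key: '/index.php', value: 300 },
  { key: '/login.php', value: 90000 },
  { key: '/games/animals.php', value: 7000 },
  { key: '/index.php', value: 200 },
  { key: '/sales/buy.php', value: 2000 }
];
var top = topFlows(records, 3);
console.log(top.map(function (f) { return f.key; }).join(' '));
```
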
The following query retrieves the result:
$ curl "http://localhost:8008/activeflows/ALL/slow_reqs/json?maxFlows=5"
[
 {
  "dataSource": "3.80",
  "flowN": 1,
  "value": 117009.24305622398,
  "agent": "",
  "key": "/login.php"
 },
 {
  "dataSource": "3.80",
  "flowN": 1,
  "value": 7413.476263017302,
  "agent": "",
  "key": "/games/animals.php"
 },
 {
  "dataSource": "3.80",
  "flowN": 1,
  "value": 4486.286259806839,
  "agent": "",
  "key": "/games/puzzles.php"
 },
 {
  "dataSource": "3.80",
  "flowN": 1,
  "value": 2326.33482623333,
  "agent": "",
  "key": "/sales/buy.php"
 },
 {
  "dataSource": "3.80",
  "flowN": 1,
  "value": 276.3486100676183,
  "agent": "",
  "key": "/index.php"
 }
]
Sampled records are a useful complement to counter based metrics, making it possible to disaggregate counts and identify root causes. For example, suppose a spike in errors is identified through the http_status_4xx or http_status_5xx metrics. The following flow definition breaks out the most frequent failed requests by specific URL and error code:
setFlow('err_reqs', {keys:'httpurl,httpstatus',value:'requests',t:10,n:5,
  filter:'httpstatus~[45].*'}); // assumed filter selecting 4xx and 5xx status codes
Finally, the real-time HTTP analytics don't exist in isolation. The diagram shows how the sFlow-RT real-time analytics engine receives a continuous telemetry stream from sFlow instrumentation built into network, server and application infrastructure, and delivers analytics through APIs that can easily be integrated with a wide variety of on-site and cloud orchestration, DevOps and Software Defined Networking (SDN) tools.

Thursday, August 11, 2016

Network and system analytics as a Docker service

The diagram shows how new and existing cloud based or locally hosted orchestration, operations, and security tools can leverage the sFlow-RT analytics service to gain real-time visibility. Network visibility with Docker describes how to install open source sFlow agents to monitor network activity in a Docker environment in order to gain visibility into Docker Microservices.

The sFlow-RT analytics software is now on Docker Hub, making it easy to deploy real-time sFlow analytics as a Docker service:
docker run -p 8008:8008 -p 6343:6343/udp -d sflow/sflow-rt
Configure standard sFlow Agents to stream telemetry to the analyzer and retrieve analytics using the REST API on port 8008.

Increase memory from default 1G to 2G:
docker run -e "RTMEM=2G" -p 8008:8008 -p 6343:6343/udp -d sflow/sflow-rt
Set System Property to enable country lookups when Defining Flows:
docker run -e "RTPROP=-Dgeo.country=resources/config/GeoIP.dat" -p 8008:8008 -p 6343:6343/udp -d sflow/sflow-rt
Run sFlow-RT Application. Drop the -d option while developing an application to see output of logging commands and use control-c to stop the container.
docker run -v /Users/pp/my-app:/sflow-rt/app/my-app -p 8008:8008 -p 6343:6343/udp -d sflow/sflow-rt
A simple Dockerfile can be used to generate a new image that includes the application:
FROM sflow/sflow-rt:latest
# assumes the build context is the directory that contains my-app
COPY my-app /sflow-rt/app/my-app
Similarly, a Dockerfile can be used to generate a new image from published applications. Any required System Properties can also be set in the Dockerfile.
FROM sflow/sflow-rt:latest
ENV RTPROP="-Dgeo.country=resources/config/GeoIP.dat"
RUN /sflow-rt/get-app.sh sflow-rt top-flows
This solution is highly scalable: a single sFlow-RT instance can monitor thousands of servers and the network devices connecting them.

Wednesday, July 20, 2016

Internet router using Cumulus Linux

Internet router using merchant silicon describes how an inexpensive white box switch running Linux can be used to replace a much costlier Internet router. This article will describe the steps needed to install the software on an x86 based white box switch running Cumulus Linux 3.0.

First, add the Debian Jessie repository:
sudo sh -c 'echo "deb http://ftp.us.debian.org/debian jessie main contrib" > \
Next, install Host sFlow, Java, and Bird:
sudo apt-get update
sudo apt-get install hsflowd
sudo apt-get install unzip
sudo apt-get install default-jre-headless
sudo apt-get install bird
Install sFlow-RT (the latest version is available at sFlow-RT.com):
wget http://www.inmon.com/products/sFlow-RT/sflow-rt_2.0-1116.deb
sudo dpkg -i sflow-rt_2.0-1116.deb
Increase the default virtual memory limit for sflowrt (it needs to be greater than one third of the RAM on the system for the Java virtual machine to start, see Giant Bug: Cannot run java with a virtual mem limit (ulimit -v)):
sudo sh -c 'echo "sflowrt soft as 2000000" > \
Note: Maximum Java heap memory has a default of 1G and is controlled by settings in /usr/local/sflow-rt/conf.d/sflow-rt.jvm file.

Install the Active Route Manager application:
sudo sh -c "/usr/local/sflow-rt/get-app.sh sflow-rt active-routes"
Cumulus Networks, sFlow and data center automation describes how to configure the sFlow agent (hsflowd). The sFlow collector address should be set to

Finally, configure Bird and sFlow-RT as described in Internet router using merchant silicon.

The instructions were tested on a Cumulus VX virtual machine, but should work on physical switches. Cumulus VX is free and provides a convenient way to try out Cumulus Linux and create virtual networks to test configurations.

If you are going to experiment with the solution on Cumulus VX then the following command is needed to enable sFlow traffic monitoring:
sudo iptables -I FORWARD -j NFLOG --nflog-group 1 --nflog-prefix SFLOW
On physical switches the sFlow agent automatically configures packet sampling in the ASIC and is able to monitor all packets (not just the routed packets captured by the iptables command above).

Monday, July 18, 2016

World map

World Map has been released on GitHub, https://github.com/sflow-rt/world-map. The application displays an up-to-the-second view of traffic as animated bubbles overlaid on a world map.

Download and install sFlow-RT to run the world-map application. Enable the System Property, geo.country=resources/config/GeoIP.dat, to allow the application to identify countries based on IP addresses.