Saturday, August 27, 2011


The node-sflow-module project is an open source implementation of sFlow monitoring for node.js, an open source, event-based environment for creating network applications, built on Google's high performance V8 JavaScript Engine.

The advantage of using sFlow is the scalability it offers for monitoring the performance of large web server clusters or load balancers where request rates are high and conventional logging solutions generate too much data or impose excessive overhead. Real-time monitoring of HTTP provides essential visibility into the performance of large-scale, complex, multi-layer services constructed using Representational State Transfer (REST) architectures. In addition, monitoring HTTP services using sFlow is part of an integrated performance monitoring solution that provides real-time visibility into applications, servers and switches (see sFlow Host Structures).

The node-sflow-module software (sflow.js) is designed to integrate with the Host sFlow agent to provide a complete picture of server performance. Download, install and configure Host sFlow before proceeding to install node.js - see Installing Host sFlow on a Linux Server. There are a number of options for analyzing cluster performance using Host sFlow, including Ganglia and sFlowTrend.

Next, download the sflow.js file and copy it into the same directory as your node.js application. Including the sFlow instrumentation is a one-line change to the application; see the simple Hello World example below:

var http = require("http");

http.createServer(function (req, res) {
  res.writeHead(200, {'Content-Type': 'text/plain'});
  res.end('Hello World\n');
}).listen(1337, "");
console.log('Server running at');

Once installed, sflow.js will stream measurements to a central sFlow Analyzer. Currently the only software that can decode HTTP sFlow is sflowtool. Download, compile and install the latest sflowtool sources on the system you are using to receive sFlow from the servers in the node.js cluster.

Running sflowtool will display output of the form:

[pp@pcentos ~]$ sflowtool
startDatagram =================================
datagramSize 116
unixSecondsUTC 1314458638
datagramVersion 5
agentSubId 8124
packetSequenceNo 1
sysUpTime 22002
samplesInPacket 1
startSample ----------------------
sampleType_tag 0:2
sampleSequenceNo 1
sourceId 3:8124
counterBlock_tag 0:2201
http_method_option_count 0
http_method_get_count 2
http_method_head_count 0
http_method_post_count 0
http_method_put_count 0
http_method_delete_count 0
http_method_trace_count 0
http_methd_connect_count 0
http_method_other_count 0
http_status_1XX_count 0
http_status_2XX_count 2
http_status_3XX_count 0
http_status_4XX_count 0
http_status_5XX_count 0
http_status_other_count 0
endSample   ----------------------
endDatagram   =================================
startDatagram =================================
datagramSize 236
unixSecondsUTC 1314458652
datagramVersion 5
agentSubId 8124
packetSequenceNo 2
sysUpTime 35729
samplesInPacket 1
startSample ----------------------
sampleType_tag 0:1
sampleSequenceNo 0
sourceId 3:8124
meanSkipCount 6
samplePool 6
dropEvents 0
inputPort 0
outputPort 1073741823
flowBlock_tag 0:2201
flowSampleType http
http_method 2
http_protocol 1001
http_uri /
http_useragent Mozilla/5.0 (Macintosh; Intel Mac OS X 10_7_1) AppleWebKit/534.
http_bytes 0
http_duration_uS 0
http_status 200
flowBlock_tag 0:2100
extendedType socket4
socket4_ip_protocol 6
socket4_local_port 8124
socket4_remote_port 52609
endSample   ----------------------
endDatagram   =================================
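The output above is a stream of "key value" lines bracketed by startDatagram/endDatagram and startSample/endSample markers, which makes it straightforward to post-process. The following is a minimal sketch of one way to group those lines into objects in node.js. The parseSflowtool function and the shape of the objects it returns are assumptions made for illustration; they are not part of sflowtool or sflow.js.

```javascript
// Group sflowtool's default text output into datagram and sample objects.
// parseSflowtool is a hypothetical helper, not an sflowtool or sflow.js API.
function parseSflowtool(text) {
  var datagrams = [];
  var datagram = null;   // datagram currently being read, if any
  var sample = null;     // sample currently being read, if any

  text.split('\n').forEach(function (line) {
    line = line.trim();
    if (line === '') return;

    // Each line is a key, a space, then the value (which may contain spaces).
    var space = line.indexOf(' ');
    var key = space === -1 ? line : line.slice(0, space);
    var value = space === -1 ? '' : line.slice(space + 1).trim();

    if (key === 'startDatagram') {
      datagram = { fields: {}, samples: [] };
    } else if (key === 'endDatagram') {
      if (datagram) datagrams.push(datagram);
      datagram = null;
    } else if (key === 'startSample') {
      sample = {};
    } else if (key === 'endSample') {
      if (datagram && sample) datagram.samples.push(sample);
      sample = null;
    } else if (sample) {
      // Repeated keys such as flowBlock_tag keep only the last value seen.
      sample[key] = value;
    } else if (datagram) {
      datagram.fields[key] = value;
    }
    // Lines outside any datagram (for example a shell prompt) are ignored.
  });
  return datagrams;
}

// Usage with a shortened copy of the counter sample shown above.
var counterText = [
  'startDatagram =================================',
  'datagramVersion 5',
  'startSample ----------------------',
  'sampleType_tag 0:2',
  'http_method_get_count 2',
  'http_status_2XX_count 2',
  'endSample   ----------------------',
  'endDatagram   ================================='
].join('\n');

var parsed = parseSflowtool(counterText);
console.log(parsed[0].samples[0].http_method_get_count);
```

All values are kept as strings; a real consumer would probably convert the counters to numbers before comparing successive samples.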

The -H option causes sflowtool to output the HTTP request samples using the combined log format:

[pp@pcentos ~]$ sflowtool -H
- - [27/Aug/2011:08:26:52 -0700] "GET / HTTP/1.1" 200 0 "-" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_7_1) AppleWebKit/534."
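For readers who would rather consume these lines directly in node.js than hand them to a log analyzer, here is a rough sketch of splitting a combined log format line into its fields. The parseCombinedLogLine function and the field names on the returned object are illustrative assumptions, and the host address in the usage example is a placeholder documentation address rather than a value taken from the output above.

```javascript
// host ident user [time] "request" status bytes "referer" "user-agent"
var COMBINED_RE =
  /^(\S+) (\S+) (\S+) \[([^\]]+)\] "([^"]*)" (\d{3}) (\S+) "([^"]*)" "([^"]*)"$/;

// Hypothetical helper: returns an object of fields, or null if the line
// does not look like combined log format.
function parseCombinedLogLine(line) {
  var m = COMBINED_RE.exec(line.trim());
  if (!m) return null;

  // The request field is "METHOD URI PROTOCOL".
  var request = m[5].split(' ');
  return {
    host: m[1],
    ident: m[2],
    user: m[3],
    time: m[4],
    method: request[0],
    uri: request[1],
    protocol: request[2],
    status: parseInt(m[6], 10),
    // A "-" byte count means no body was sent; treat it as zero.
    bytes: m[7] === '-' ? 0 : parseInt(m[7], 10),
    referer: m[8],
    userAgent: m[9]
  };
}

// Usage: 192.0.2.1 is a placeholder host, not a value from the example above.
var entry = parseCombinedLogLine(
  '192.0.2.1 - - [27/Aug/2011:08:26:52 -0700] "GET / HTTP/1.1" 200 0 "-" ' +
  '"Mozilla/5.0 (Macintosh; Intel Mac OS X 10_7_1) AppleWebKit/534."'
);
console.log(entry.method, entry.uri, entry.status);
```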

Converting sFlow to combined logfile format allows existing log analyzers to be used to analyze the sFlow data. For example, the following commands use sflowtool and webalizer to create reports:

The resulting webalizer report shows top URLs:
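To give a sense of the kind of tally behind a top URLs table, the sketch below counts requests per URL from combined log format lines. It is not how webalizer works internally; the topUrls function, its regular expression and the sample lines are all assumptions made for illustration. Since sFlow exports sampled requests, the counts reflect samples rather than every request served.

```javascript
// Hypothetical helper: count hits per URI in combined log format lines and
// return the most frequently requested URIs, highest count first.
function topUrls(lines, limit) {
  var counts = Object.create(null);   // no inherited keys to collide with

  lines.forEach(function (line) {
    // The first quoted field is the request: "METHOD URI PROTOCOL".
    var m = /"[A-Z]+ (\S+)[^"]*"/.exec(line);
    if (!m) return;                   // skip lines without a request field
    counts[m[1]] = (counts[m[1]] || 0) + 1;
  });

  return Object.keys(counts)
    .map(function (uri) { return { uri: uri, hits: counts[uri] }; })
    .sort(function (a, b) { return b.hits - a.hits; })
    .slice(0, limit);
}

// Usage with made-up log lines (host field omitted, as in the output above).
var logLines = [
  '- - [27/Aug/2011:08:26:52 -0700] "GET / HTTP/1.1" 200 0 "-" "-"',
  '- - [27/Aug/2011:08:26:53 -0700] "GET /about HTTP/1.1" 200 0 "-" "-"',
  '- - [27/Aug/2011:08:26:54 -0700] "GET / HTTP/1.1" 200 0 "-" "-"'
];
console.log(topUrls(logLines, 10));
```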

Finally, the real potential of HTTP sFlow is as part of a broader performance management system providing real-time visibility into applications, servers, storage and networking across the entire data center.