Friday, January 7, 2011


The mod-sflow project is an open source implementation of sFlow monitoring for the Apache web server. The module exports the counter and transaction structures discussed in sFlow for HTTP.

The advantage of using sFlow is the scalability it offers for monitoring the performance of large web server clusters or load balancers where request rates are high and conventional logging solutions generate too much data or impose excessive overhead. Real-time monitoring of HTTP provides essential visibility into the performance of large-scale, complex, multi-layer services constructed using Representational State Transfer (REST) architectures. In addition, monitoring HTTP services using sFlow is part of an overall performance monitoring solution that provides real-time visibility into applications, servers and switches (see sFlow Host Structures).

The mod-sflow software is designed to integrate with the Host sFlow agent to provide a complete picture of server performance. Download, install and configure Host sFlow before proceeding to install mod-sflow - see Installing Host sFlow on a Linux Server. There are a number of options for analyzing cluster performance using Host sFlow, including Ganglia and sFlowTrend.

Note: mod-sflow picks up its configuration from the Host sFlow agent. The Host sFlow sampling.http setting can be used to override the default sampling setting to set a specific sampling rate for HTTP requests.
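For reference, an hsflowd configuration along these lines would set an explicit 1-in-100 sampling rate for HTTP requests (the collector address and the rates shown are illustrative, not recommendations):

```
sflow {
  polling = 20
  sampling = 400
  sampling.http = 100
  collector {
    ip = 10.0.0.1
    udpport = 6343
  }
}
```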

Next, download the mod-sflow sources from the project's download page. The following commands compile and install mod-sflow:

tar -xvzf mod-sflow-XXX.tar.gz
cd mod-sflow-XXX
apxs -c -i -a mod_sflow.c sflow_api.c
service httpd restart
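The apxs -a flag registers the module by appending a LoadModule directive to httpd.conf. A quick sanity check before restarting httpd is to grep for it; the sketch below simulates the check against a temporary file (the module name sflow_module is what apxs would derive from mod_sflow.c, so verify against your actual httpd.conf):

```shell
# apxs -a adds a line like this to httpd.conf; written to a temporary
# file here so the check can be shown end to end. On a real server,
# grep the actual httpd.conf instead.
conf=$(mktemp)
echo 'LoadModule sflow_module modules/mod_sflow.so' > "$conf"
# Confirm the module is registered (prints the number of matching lines)
grep -c 'sflow_module' "$conf"
rm -f "$conf"
```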

Once installed, mod-sflow will stream measurements to a central sFlow Analyzer. Currently the only software that can decode HTTP sFlow is sflowtool. Download, compile and install the latest sflowtool sources on the system you are using to receive sFlow from the servers in the Apache cluster.

Running sflowtool will display output of the form:

[pp@test]$ /usr/local/bin/sflowtool
startDatagram =================================
datagramSize 116
unixSecondsUTC 1294273499
datagramVersion 5
agentSubId 6486
packetSequenceNo 6
sysUpTime 44000
samplesInPacket 1
startSample ----------------------
sampleType_tag 0:2
sampleSequenceNo 6
sourceId 3:65537
counterBlock_tag 0:2201
http_method_option_count 0
http_method_get_count 247
http_method_head_count 0
http_method_post_count 2
http_method_put_count 0
http_method_delete_count 0
http_method_trace_count 0
http_method_connect_count 0
http_method_other_count 0
http_status_1XX_count 0
http_status_2XX_count 214
http_status_3XX_count 35
http_status_4XX_count 0
http_status_5XX_count 0
http_status_other_count 0
endSample   ----------------------
startSample ----------------------
sampleType_tag 0:1
sampleSequenceNo 3434
sourceId 3:65537
meanSkipCount 2
samplePool 7082
dropEvents 0
inputPort 0
outputPort 1073741823
flowBlock_tag 0:2100
extendedType socket4
socket4_ip_protocol 6
socket4_local_port 80
socket4_remote_port 61401
flowBlock_tag 0:2201
flowSampleType http
http_method 2
http_protocol 1001
http_uri /favicon.ico
http_useragent Mozilla/5.0 (Macintosh; U; Intel Mac OS X 10_6_5; en-us) AppleW
http_bytes 284
http_duration_uS 335
http_status 404
endSample   ----------------------
endDatagram   =================================

The -H option causes sflowtool to output the HTTP request samples using the combined log format:

[pp@test]$ /usr/local/bin/sflowtool -H
- - [05/Jan/2011:22:39:50 -0800] "GET /membase.php HTTP/1.1" 200 3494 "-" "Mozilla/5.0 (Macintosh; U; Intel Mac OS X 10_6_5; en-us) AppleW"
- - [05/Jan/2011:22:39:50 -0800] "GET /favicon.ico HTTP/1.1" 404 284 "" "Mozilla/5.0 (Macintosh; U; Intel Mac OS X 10_6_5; en-us) AppleW"

Converting sFlow to combined logfile format allows existing log analyzers to be used to analyze the sFlow data. For example, the following commands use sflowtool and webalizer to create reports:

/usr/local/bin/sflowtool -H | rotatelogs log/http_log &
webalizer -o report log/*

The resulting webalizer report shows top URLs:

Note: The log analyzer reports are useful for identifying top URLs, clients, etc. However, the values need to be scaled up by the sampling rate - see Packet Sampling Basics for details on how to properly scale sFlow data.
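As a sketch of the scaling step: each sampled request stands in for meanSkipCount actual requests, so estimated totals are the sampled counts multiplied by the sampling rate. Using the meanSkipCount of 2 from the flow sample above and an illustrative sampled count:

```shell
# Each sampled request represents meanSkipCount actual requests,
# so the estimated total is the sampled count times the rate.
sampled=3434   # sampled HTTP requests (illustrative)
rate=2         # meanSkipCount reported in the flow samples
echo $((sampled * rate))
```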

The mod-sflow performance counters can be retrieved using HTTP in addition to being exported by sFlow. To enable web access to the counters, create the following /etc/httpd/conf.d/sflow.conf file and restart httpd:

<IfModule mod_sflow.c>
  <Location /sflow>
    SetHandler sflow
  </Location>
</IfModule>

Note: Use Apache Allow/Deny directives to limit access to the counter page.

The counters are now accessible at the URL http://<server>/sflow
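Individual counters can then be pulled out with standard command-line tools. Assuming each line of the counter page has the form "counter <name> <value>" (the format the Perl script below parses - the values here are hypothetical), awk can extract a single counter:

```shell
# In practice the input would come from: curl -s http://<server>/sflow
# The counter lines are simulated here with printf (values hypothetical).
printf 'counter method_get_count 247\ncounter method_post_count 2\n' |
  awk '$2 == "method_get_count" { print $3 }'
```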

Web access to the counters makes them accessible to a wide variety of performance monitoring tools. For example, the following Perl script allows Cacti to make web requests to mod-sflow and trend HTTP counters:


#!/usr/bin/perl
use strict;
use warnings;
use LWP::UserAgent;

# Exit if no host argument was supplied
if($#ARGV == -1) { exit 1; }

my $host = $ARGV[0];
my $ua = new LWP::UserAgent;
my $req = new HTTP::Request GET => "http://$host/sflow";
my $res = $ua->request($req);
if(!$res->is_success) { exit 1; }

# Each counter line has the form: counter <name> <value>
my %count = ();
foreach my $line (split /\n/, $res->content) {
  my @toks = split(' ', $line);
  $count{$toks[1]} = $toks[2];
}

print "option:$count{'method_option_count'} get:$count{'method_get_count'} head:$count{'method_head_count'} post:$count{'method_post_count'} put:$count{'method_put_count'} delete:$count{'method_delete_count'} trace:$count{'method_trace_count'} connect:$count{'method_connect_count'} other:$count{'method_other_count'}";

The article Simplest Method of Going from Script to Graph (Walkthrough) provides instructions for installing the script and configuring Cacti charting. The following screen capture shows Cacti trend charts of the HTTP performance counters:

Finally, the real potential of HTTP sFlow is as part of a broader performance management system providing real-time visibility into applications, servers, storage and networking across the entire data center.

For example, the diagram above shows typical elements in a Web 2.0 data center (e.g. Facebook, Twitter, Wikipedia, YouTube, etc.). A cluster of web servers handles requests from users. Typically, the application logic for the web site runs on the web servers in the form of server-side scripts (PHP, Ruby, ASP, etc.). The web applications access the database to retrieve and update user data. However, the database can quickly become a bottleneck, so a cache is used to store the results of database queries. The combination of sFlow from all the web servers, Memcached servers and network switches provides end-to-end visibility into performance that scales to handle even the largest data center.
