scalability it offers when request rates are high and conventional logging solutions generate too much data or impose excessive overhead. In addition, monitoring HTTP services using sFlow is part of an integrated monitoring system that spans the data center, providing real-time visibility into application, server and network performance.
The sflow/haproxy software is designed to integrate with the Host sFlow agent to provide a complete picture of proxy performance. Download, install and configure Host sFlow before proceeding to install sflow/haproxy - see Installing Host sFlow on a Linux Server.
Note: the sflow/haproxy agent picks up its configuration from the Host sFlow agent. Use the sampling.http setting to override the default sampling rate with a specific rate for HTTP requests.
The following commands download and install the sFlow instrumented version of HAProxy on a Linux server:
git clone https://github.com/sflow/haproxy.git
cd haproxy
make TARGET=linux26 USE_SFLOW=yes
make install

Once installed and configured, HAProxy will stream measurements to a central sFlow Analyzer. To see the raw data and verify that the measurements are being received, download, compile and install sflowtool on the system you are using to receive sFlow.
Running sflowtool will display output of the form:
$ sflowtool
startDatagram =================================
datagramSourceIP 10.0.0.153
datagramSize 564
unixSecondsUTC 1368058148
datagramVersion 5
agentSubId 80
agent 10.0.0.153
packetSequenceNo 23
sysUpTime 417000
samplesInPacket 2
startSample ----------------------
sampleType_tag 0:2
sampleType COUNTERSSAMPLE
sampleSequenceNo 1
sourceId 3:80
counterBlock_tag 0:2201
http_method_option_count 0
http_method_get_count 71
http_method_head_count 0
http_method_post_count 0
http_method_put_count 0
http_method_delete_count 0
http_method_trace_count 0
http_methd_connect_count 0
http_method_other_count 2
http_status_1XX_count 0
http_status_2XX_count 26
http_status_3XX_count 24
http_status_4XX_count 23
http_status_5XX_count 0
http_status_other_count 0
endSample ----------------------
startSample ----------------------
sampleType_tag 0:1
sampleType FLOWSAMPLE
sampleSequenceNo 71
sourceId 3:80
meanSkipCount 1
samplePool 71
dropEvents 0
inputPort 0
outputPort 1073741823
flowBlock_tag 0:2102
extendedType proxy_socket4
proxy_socket4_ip_protocol 6
proxy_socket4_local_ip 0.0.0.0
proxy_socket4_remote_ip 10.0.0.150
proxy_socket4_local_port 0
proxy_socket4_remote_port 80
flowBlock_tag 0:2100
extendedType socket4
socket4_ip_protocol 6
socket4_local_ip 0.0.0.0
socket4_remote_ip 10.0.0.70
socket4_local_port 0
socket4_remote_port 57642
flowBlock_tag 0:2206
flowSampleType http
http_method 2
http_protocol 1001
http_uri GET /games/animals.php HTTP/1.1
http_host 10.0.0.153
http_referrer http://10.0.0.153/games/
http_useragent Mozilla/5.0 (Macintosh; Intel Mac OS X 10_8_3) AppleWebKit/537.31 (KHTML
http_mimetype text/html; charset=UTF-8
http_request_bytes 346
http_bytes 487
http_duration_uS 13000
http_status 200
endSample ----------------------
endDatagram =================================

Two types of sFlow record are shown: COUNTERSSAMPLE and FLOWSAMPLE. The counters are useful for trending overall performance using tools like Ganglia and Graphite.
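Since sflowtool prints one "key value" pair per line, the output is straightforward to post-process. The following Python is a minimal sketch of that idea, not part of sflowtool or sflow/haproxy: the names parse_sflowtool, status_totals and EXAMPLE are hypothetical, and the sample text is a shortened excerpt of the output shown above.

```python
def parse_sflowtool(lines):
    """Group sflowtool 'key value' lines into one dict per sample.

    Keys that repeat inside a sample (flowBlock_tag, extendedType) keep
    only their last value, which is enough for this illustration.
    """
    samples, current = [], None
    for line in lines:
        key, _, value = line.strip().partition(" ")
        if key == "startSample":
            current = {}
        elif key == "endSample":
            if current is not None:
                samples.append(current)
            current = None
        elif current is not None and key:
            current[key] = value
    return samples


def status_totals(sample):
    """Extract the http_status_*_count counters from one sample."""
    prefix, suffix = "http_status_", "_count"
    return {
        key[len(prefix):-len(suffix)]: int(value)
        for key, value in sample.items()
        if key.startswith(prefix) and key.endswith(suffix)
    }


# Shortened excerpt of the output above (assumption: one pair per line).
EXAMPLE = """\
startSample ----------------------
sampleType COUNTERSSAMPLE
http_status_2XX_count 26
http_status_3XX_count 24
http_status_4XX_count 23
endSample ----------------------
startSample ----------------------
sampleType FLOWSAMPLE
http_uri GET /games/animals.php HTTP/1.1
http_status 200
endSample ----------------------
""".splitlines()

if __name__ == "__main__":
    for sample in parse_sflowtool(EXAMPLE):
        if sample.get("sampleType") == "COUNTERSSAMPLE":
            print(status_totals(sample))
```

In practice the same function could read sys.stdin with sflowtool piped into the script. The counters are cumulative, so a trending tool would plot the difference between successive COUNTERSSAMPLE records rather than the raw values.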
Using sflowtool to output combined logfile format makes the data available to most logfile analyzers.
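As an illustration of what a logfile analyzer does with such records, the Python sketch below parses one line in the widely used combined log format. The regular expression, the parse_combined helper and the sample line are assumptions made for this example: the line is assembled by hand from the FLOWSAMPLE fields above and is not captured sflowtool output.

```python
import re

# Combined log format: client, identity, user, [time], "request",
# status, bytes, "referrer", "user agent".
COMBINED = re.compile(
    r'(?P<client>\S+) (?P<ident>\S+) (?P<user>\S+) \[(?P<time>[^\]]+)\] '
    r'"(?P<request>[^"]*)" (?P<status>\d{3}) (?P<bytes>\d+|-) '
    r'"(?P<referrer>[^"]*)" "(?P<useragent>[^"]*)"'
)


def parse_combined(line):
    """Return a dict of fields for one combined-format line, or None."""
    match = COMBINED.match(line)
    if match is None:
        return None
    record = match.groupdict()
    record["status"] = int(record["status"])
    record["bytes"] = 0 if record["bytes"] == "-" else int(record["bytes"])
    return record


# Hypothetical line built from the sample values shown earlier.
SAMPLE_LINE = (
    '10.0.0.70 - - [09/May/2013:00:09:08 +0000] '
    '"GET /games/animals.php HTTP/1.1" 200 487 '
    '"http://10.0.0.153/games/" "Mozilla/5.0"'
)

if __name__ == "__main__":
    print(parse_combined(SAMPLE_LINE))
```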
Note: The highlighted IP addresses in the FLOWSAMPLE correspond to addresses in the diagram and illustrate how request records from the proxy link clients to the back end servers.

A native sFlow analyzer like sFlowTrend can combine the counters, flows and host performance metrics to provide an integrated view of performance.
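The link between client and back end server can be made explicit with a few lines of Python. This is only a sketch: request_path, requests_per_backend and FLOW are hypothetical names, FLOW copies the socket fields from the FLOWSAMPLE above, and reading socket4_* as the client side and proxy_socket4_* as the back end side is inferred from the port numbers in that sample (57642 versus 80).

```python
from collections import Counter


def request_path(flow):
    """Return ((client_ip, port), (backend_ip, port)) for one flow sample."""
    client = (flow["socket4_remote_ip"], int(flow["socket4_remote_port"]))
    backend = (
        flow["proxy_socket4_remote_ip"],
        int(flow["proxy_socket4_remote_port"]),
    )
    return client, backend


def requests_per_backend(flows):
    """Estimate requests per back end server from sampled flows.

    Each sample is weighted by meanSkipCount, on the assumption that one
    sample stands for roughly that many requests.
    """
    totals = Counter()
    for flow in flows:
        _, (backend_ip, _port) = request_path(flow)
        totals[backend_ip] += int(flow.get("meanSkipCount", 1))
    return totals


# Field values copied from the FLOWSAMPLE shown earlier.
FLOW = {
    "meanSkipCount": "1",
    "proxy_socket4_remote_ip": "10.0.0.150",
    "proxy_socket4_remote_port": "80",
    "socket4_remote_ip": "10.0.0.70",
    "socket4_remote_port": "57642",
}

if __name__ == "__main__":
    print(request_path(FLOW))
    print(requests_per_backend([FLOW]))
```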
Apache, NGINX, Tomcat and node.js. Application logic running on the servers can also be instrumented with sFlow, see Scripting languages. Back end Memcache, Java and virtualization pools can also be instrumented with sFlow. sFlow agents embedded in physical and virtual switches provide visibility into the network.
Comprehensive end-to-end visibility in multi-tiered environments allows the powerful control capabilities of the load balancers to be used to greatest effect: regulating traffic between tiers, protecting overloaded back end systems, defending against denial of service attacks, and moving resources from over-provisioned pools to under-provisioned pools.
The sFlow-RT real-time analytics engine makes the full set of sFlow metrics accessible through a RESTful API so that they can be used to drive automation. A future article will explore how sFlow metrics can be used to control HAProxy behavior (by issuing UnixSocketCommands).
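As a rough sketch of how a script might pull a metric over such an API, the Python below builds a query URL and decodes the JSON reply. Everything specific here is an assumption to check against the sFlow-RT documentation for the installed version: the base address http://localhost:8008, the /metric/{agent}/{metric}/json path layout, and the metric name http_method_get used in the example. The helper names metric_url and fetch_metric are hypothetical.

```python
import json
from urllib.parse import quote
from urllib.request import urlopen


def metric_url(base, agent, metric):
    """Build a metric query URL (assumed path: /metric/<agent>/<metric>/json)."""
    return "{}/metric/{}/{}/json".format(
        base.rstrip("/"), quote(agent, safe=""), quote(metric, safe="")
    )


def fetch_metric(base, agent, metric, timeout=5.0):
    """Fetch and decode the JSON document returned for one metric."""
    with urlopen(metric_url(base, agent, metric), timeout=timeout) as response:
        return json.load(response)


if __name__ == "__main__":
    # Assumed address and metric name; needs a reachable analytics engine.
    url = metric_url("http://localhost:8008", "10.0.0.153", "http_method_get")
    print(url)
```

A control loop could poll such a URL and act when a value crosses a threshold, which is the kind of automation the paragraph above describes.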