Thursday, October 21, 2010

Installing Host sFlow on a Linux server

The Host sFlow agent supports Linux performance monitoring, providing a lightweight, scalable solution for monitoring large numbers of Linux servers.

The following steps demonstrate how to install and configure the Host sFlow agent on a Linux server, sending sFlow to an analyzer with IP address 10.0.0.50.

Note: If there are any firewalls between the Linux servers and the sFlow analyzer, you will need to ensure that packets to the sFlow analyzer (UDP port 6343) are permitted.
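
For example, if a host-based iptables firewall is in the path, a rule along these lines would permit the traffic (a sketch using the analyzer address from this example; adjust the chain and addresses to your environment):

```shell
# Allow sFlow datagrams from this host to the analyzer at 10.0.0.50
iptables -A OUTPUT -p udp -d 10.0.0.50 --dport 6343 -j ACCEPT
```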

First go to the Host sFlow web site and download the RPM file for your Linux distribution. If an RPM doesn't exist, you will need to download the source code.

If you are installing from RPM, the following commands will install and start the Host sFlow agent:

rpm -Uvh hsflowd_XXX.rpm
service hsflowd start

If you are building from sources, use the following commands:

tar -xzf hsflowd-X.XX.tar.gz
cd hsflowd-X.XX
make
make install
make schedule
service hsflowd start

The default configuration method for the Host sFlow agent is DNS-SD (DNS Service Discovery); add the following records to the site DNS server:

analyzer A 10.0.0.50

_sflow._udp SRV 0 0 6343 analyzer
_sflow._udp TXT (
"txtvers=1"
"polling=20"
"sampling=512"
)

Note: These changes must be made to the DNS zone file corresponding to the search domain in the Linux server's /etc/resolv.conf file. Alternatively, you can explicitly configure the domain using the DNSSD_domain setting in /etc/hsflowd.conf.

Once the sFlow settings are added to the DNS server, they will be automatically picked up by the Host sFlow agents. If you need to change the sFlow settings, simply change them on the DNS server and the change will automatically be applied to all the Linux systems in the data center.
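
The TXT record carries the agent settings as key=value strings. The following sketch (a hypothetical helper, not the agent's actual code) shows how such strings map to a settings table:

```python
def parse_dnssd_txt(strings):
    """Map DNS-SD TXT strings like "polling=20" to a settings dictionary."""
    settings = {}
    for s in strings:
        key, sep, value = s.partition("=")
        if sep:  # ignore strings without a key=value form
            settings[key.strip()] = value.strip()
    return settings

# The TXT strings published in the zone file above
settings = parse_dnssd_txt(["txtvers=1", "polling=20", "sampling=512"])
```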

Manual configuration is an option if you do not want to use DNS-SD. Edit the Host sFlow agent configuration file, /etc/hsflowd.conf, on each Linux server:

sflow {
  DNSSD = off
  polling = 20
  sampling = 512
  collector {
    ip = 10.0.0.50
    udpport = 6343
  }
}

After editing the configuration file you will need to restart the Host sFlow agent:

service hsflowd restart

For a complete sFlow monitoring solution you should also collect sFlow from the switches connecting the servers to the network (see Hybrid server monitoring). The sFlow standard is designed to seamlessly integrate monitoring of networks and servers (see sFlow Host Structures).

An sFlow analyzer is needed to receive the sFlow data and report on performance (see Choosing an sFlow analyzer). The free sFlowTrend analyzer is a great way to get started, see sFlowTrend adds server performance monitoring to see examples.

Update: The inclusion of iptables/ULOG support in the Host sFlow agent provides an efficient way to monitor detailed traffic flows if you can't monitor your top of rack switches or if you have virtual machines in a public cloud (see Amazon Elastic Compute Cloud (EC2) and Rackspace cloud servers).

Update: See Configuring Host sFlow for Linux via /etc/hsflowd.conf for the latest configuration information. The Host sFlow agent now supports Linux bridge, macvlan, ipvlan, adapters, Docker, and TCP round trip time.

26 comments:

  1. Hi, does hsflowd support multiple connector in manual mode? Sort of a backup, if the first one isn't available.
    Thanks.

  2. You can define multiple collector{} sections in the configuration file if you are manually configuring hsflowd, or you can have multiple SRV records if you are using DNS-SD. In either case, hsflowd will send each sFlow datagram to all the listed sFlow collectors.
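
    For example, a manual configuration with two collectors might look like this (the second collector address is hypothetical):

```
sflow {
  DNSSD = off
  polling = 20
  sampling = 512
  collector {
    ip = 10.0.0.50
    udpport = 6343
  }
  collector {
    ip = 10.0.0.51
    udpport = 6343
  }
}
```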

    Another way to achieve redundancy is to duplicate the sFlow datagrams at the receiving end, using sflowtool, see Forwarding using sflowtool, or any other UDP replicator.

    You can also send the sFlow to a virtual IP address and move the address to a different server in the event of a failure.
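
    As an illustration of the UDP replicator approach, here is a minimal sketch in Python (for production use, sflowtool or a dedicated replicator is the usual choice; the collector addresses are hypothetical):

```python
import socket

def replicate(listen_port, collectors, max_packets=None):
    """Receive UDP datagrams on listen_port and forward a copy to each collector.

    collectors is a list of (host, port) tuples. max_packets limits the loop
    (useful for testing); None means run forever.
    """
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.bind(("", listen_port))
    count = 0
    while max_packets is None or count < max_packets:
        datagram, _src = sock.recvfrom(65535)  # sFlow datagrams fit well under 64KB
        for collector in collectors:
            sock.sendto(datagram, collector)
        count += 1
    sock.close()

# Example (hypothetical analyzer addresses):
# replicate(6343, [("10.0.0.51", 6343), ("10.0.0.52", 6343)])
```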

  3. Hi, I am evaluating sFlow for monitoring my Hadoop cluster running on KVM for research purposes. I have installed sFlowTrend successfully; however, when I try to install hsflowd I get "missing dependencies". I have been searching Google since yesterday but could not find how to install these dependencies. Please provide some guidance on installing them.

    Steps I have followed:

    root@rkmalaiya:~# rpm -Uvh hsflowd_KVM-1.22.2-1.x86_64.rpm
    rpm: RPM should not be used directly install RPM packages, use Alien instead!
    rpm: However assuming you know what you are doing...
    error: Failed dependencies:
    /bin/sh is needed by hsflowd-1.22.2-1.x86_64
    chkconfig is needed by hsflowd-1.22.2-1.x86_64
    libc.so.6()(64bit) is needed by hsflowd-1.22.2-1.x86_64
    libc.so.6(GLIBC_2.2.5)(64bit) is needed by hsflowd-1.22.2-1.x86_64
    libc.so.6(GLIBC_2.3)(64bit) is needed by hsflowd-1.22.2-1.x86_64
    libm.so.6()(64bit) is needed by hsflowd-1.22.2-1.x86_64
    libm.so.6(GLIBC_2.2.5)(64bit) is needed by hsflowd-1.22.2-1.x86_64
    libpthread.so.0()(64bit) is needed by hsflowd-1.22.2-1.x86_64
    libpthread.so.0(GLIBC_2.2.5)(64bit) is needed by hsflowd-1.22.2-1.x86_64
    libresolv.so.2()(64bit) is needed by hsflowd-1.22.2-1.x86_64
    libresolv.so.2(GLIBC_2.2.5)(64bit) is needed by hsflowd-1.22.2-1.x86_64
    libvirt.so.0()(64bit) is needed by hsflowd-1.22.2-1.x86_64
    libvirt.so.0(LIBVIRT_0.0.3)(64bit) is needed by hsflowd-1.22.2-1.x86_64
    libvirt.so.0(LIBVIRT_0.0.5)(64bit) is needed by hsflowd-1.22.2-1.x86_64
    libvirt.so.0(LIBVIRT_0.1.0)(64bit) is needed by hsflowd-1.22.2-1.x86_64
    libvirt.so.0(LIBVIRT_0.3.2)(64bit) is needed by hsflowd-1.22.2-1.x86_64
    libvirt.so.0(LIBVIRT_0.4.1)(64bit) is needed by hsflowd-1.22.2-1.x86_64
    libxml2.so.2()(64bit) is needed by hsflowd-1.22.2-1.x86_64
    rtld(GNU_HASH) is needed by hsflowd-1.22.2-1.x86_64
    root@rkmalaiya:~#


  4. It looks like you have a Debian / Ubuntu system. For that you should download and install the .deb package for hsflowd, instead of the .rpm.

    Alternatively, you could download the sources and run "make deb".

    For Hadoop (JVM) monitoring you might want to incorporate this sub-agent as well:
    https://code.google.com/p/jmx-sflow-agent/

    It learns the settings from hsflowd, and then pushes standard sFlow performance stats straight from the JVM. It is many times more efficient than polling remotely with JMX.

    Neil


  5. tdw-10-136-149-98:/data/tdwsst/hsflowd-1.23.2 # make
    cd src/sflow; make
    make[1]: Entering directory `/data/tdwsst/hsflowd-1.23.2/src/sflow'
    make[1]: `libsflow.a' is up to date.
    make[1]: Leaving directory `/data/tdwsst/hsflowd-1.23.2/src/sflow'
    cd src/json; make
    make[1]: Entering directory `/data/tdwsst/hsflowd-1.23.2/src/json'
    make[1]: `libjson.a' is up to date.
    make[1]: Leaving directory `/data/tdwsst/hsflowd-1.23.2/src/json'
    PLATFORM=`uname`; \
    MYVER=`./getVersion`; \
    MYREL=`./getRelease`; \
    cd src/$PLATFORM; make VERSION=$MYVER RELEASE=$MYREL
    make[1]: Entering directory `/data/tdwsst/hsflowd-1.23.2/src/Linux'
    gcc -std=gnu99 -I. -I../sflow -O3 -DNDEBUG -Wall -Wstrict-prototypes -Wunused-value -D_GNU_SOURCE -DHSP_VERSION=1.23.2 -DUTHEAP -DHSF_ULOG -DHSF_JSON -I../json -c readInterfaces.c
    In file included from readInterfaces.c:15:
    /usr/include/linux/ethtool.h:17: error: expected specifier-qualifier-list before ‘u32’
    /usr/include/linux/ethtool.h:34: error: expected specifier-qualifier-list before ‘u32’
    /usr/include/linux/ethtool.h:51: error: expected specifier-qualifier-list before ‘u32’
    /usr/include/linux/ethtool.h:59: error: expected specifier-qualifier-list before ‘u32’
    /usr/include/linux/ethtool.h:65: error: expected specifier-qualifier-list before ‘u32’
    /usr/include/linux/ethtool.h:73: error: expected specifier-qualifier-list before ‘u32’
    /usr/include/linux/ethtool.h:82: error: expected specifier-qualifier-list before ‘u32’
    /usr/include/linux/ethtool.h:178: error: expected specifier-qualifier-list before ‘u32’
    /usr/include/linux/ethtool.h:200: error: expected specifier-qualifier-list before ‘u32’
    /usr/include/linux/ethtool.h:225: error: expected specifier-qualifier-list before ‘u32’
    /usr/include/linux/ethtool.h:238: error: expected specifier-qualifier-list before ‘u32’
    /usr/include/linux/ethtool.h:247: error: expected specifier-qualifier-list before ‘u32’
    /usr/include/linux/ethtool.h:253: error: expected specifier-qualifier-list before ‘u32’
    /usr/include/linux/ethtool.h:261: error: expected ‘=’, ‘,’, ‘;’, ‘asm’ or ‘__attribute__’ before ‘ethtool_op_get_link’
    /usr/include/linux/ethtool.h:262: error: expected ‘=’, ‘,’, ‘;’, ‘asm’ or ‘__attribute__’ before ‘ethtool_op_get_tx_csum’
    /usr/include/linux/ethtool.h:263: error: expected declaration specifiers or ‘...’ before ‘u32’
    /usr/include/linux/ethtool.h:264: error: expected declaration specifiers or ‘...’ before ‘u32’
    /usr/include/linux/ethtool.h:265: error: expected ‘=’, ‘,’, ‘;’, ‘asm’ or ‘__attribute__’ before ‘ethtool_op_get_sg’
    /usr/include/linux/ethtool.h:266: error: expected declaration specifiers or ‘...’ before ‘u32’
    /usr/include/linux/ethtool.h:267: error: expected ‘=’, ‘,’, ‘;’, ‘asm’ or ‘__attribute__’ before ‘ethtool_op_get_tso’
    /usr/include/linux/ethtool.h:268: error: expected declaration specifiers or ‘...’ before ‘u32’
    /usr/include/linux/ethtool.h:270: error: expected declaration specifiers or ‘...’ before ‘u8’
    /usr/include/linux/ethtool.h:271: error: expected ‘=’, ‘,’, ‘;’, ‘asm’ or ‘__attribute__’ before ‘ethtool_op_get_ufo’
    /usr/include/linux/ethtool.h:272: error: expected declaration specifiers or ‘...’ before ‘u32’
    /usr/include/linux/ethtool.h:340: error: expected specifier-qualifier-list before ‘u32’
    readInterfaces.c: In function ‘readInterfaces’:
    readInterfaces.c:331: warning: excess elements in struct initializer
    readInterfaces.c:331: warning: (near initialization for ‘ecmd’)
    readInterfaces.c:332: error: ‘struct ethtool_cmd’ has no member named ‘cmd’
    readInterfaces.c:335: error: ‘struct ethtool_cmd’ has no member named ‘duplex’
    readInterfaces.c:336: error: ‘struct ethtool_cmd’ has no member named ‘speed’
    make[1]: *** [readInterfaces.o] Error 1
    make[1]: Leaving directory `/data/tdwsst/hsflowd-1.23.2/src/Linux'
    make: *** [hsflowd] Error 2

  6. What Linux is this?

    It seems like the compiler has no definition for the types used in /usr/include/linux/ethtool.h. On my system these types are __u8, __u16 and __u32, and they are defined by /usr/include/linux/types.h, which is included by ethtool.h (and also by hsflowd's readInterfaces.c).

    It's possible that some other include file (e.g. sys/sysctl.h) is defining _LINUX_TYPES_H without actually including it, which would prevent the __u8, __u16 and __u32 types from being defined, so please try changing readInterfaces.c so that the lines:

    #include
    #include

    appear before:

    #include "hsflowd.h"

    and let me know if that helps.

    Neil

    Replies
    1. I just noticed that the web-formatting removed the include file names in angle-brackets, sorry. The change I suggested was to ensure that linux/types.h and linux/ethtool.h were included before hsflowd.h.

      Neil

  7. Hi, I am trying to use Ganglia and Host sFlow to monitor physical and virtual machine data. My Ganglia version is 3.2.0, and I am using manual configuration (not DNS-SD) to link Host sFlow with gmond. I can see the physical machine's statistics in the Ganglia web interface, but where are the virtual machines'? My virtual machines are created by a cloud platform (Eucalyptus) and use KVM. Must I use Open vSwitch or some other software?
    Would you please tell me where my mistake is?

    Replies
    1. You need to install the KVM version of Host sFlow (or build from sources). The KVM version uses libvirt to export per VM statistics in addition to the physical host statistics.

    2. This comment has been removed by the author.

    3. Can you please run the following test and email the results to the Host sFlow mailing list (https://lists.sourceforge.net/lists/listinfo/host-sflow-discuss):

      /etc/init.d/hsflowd stop
      /usr/sbin/hsflowd -dd 2>&1 | tee /tmp/hsflowd.log

      and let it run for two minutes. Then stop it with control-c and:

      gzip /tmp/hsflowd.log

      then send the (copious) logging info that is now in /tmp/hsflowd.log.gz.

      Please confirm that /etc/gmond.conf still has "accept_vm_metrics = yes" in the sflow { } block.
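
      For reference, the relevant fragment of /etc/gmond.conf looks like this (other sflow{} options omitted):

```
sflow {
  accept_vm_metrics = yes
}
```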

  8. how do you setup dns-sd to work with this via dnsmasq?

    i am trying this right now:
    ptr-record=_sflow._udp 300 SRV 0 0 6343 10.0.1.224.
    ptr-record=_sflow._udp TXT ("txtvers=1", "sampling=2048", "polling=20" )

    Replies
    1. I don't have any experience with dnsmasq. However, you might find the following presentation helpful, it shows how to use dig to verify the DNS settings:
      DNS Service Discovery

      Also on the server you can check the /etc/hsflowd.auto file to verify that the settings have been retrieved.

    2. Here's how I did it on a CentOS 6.4 based system:

      [root@gateway dnsmasq.d]# cat /etc/dnsmasq.d/sflow.conf
      srv-host=_sflow._udp.example.com,192.168.1.50,6343,0,0
      txt-record=_sflow._udp.example.com,("txtvers=1" "polling=20" "sampling=512")
      [root@gateway dnsmasq.d]# service dnsmasq restart
      Shutting down dnsmasq: [ OK ]
      Starting dnsmasq: [ OK ]
      [root@gateway dnsmasq.d]# host -t TXT _sflow._udp.example.com
      _sflow._udp.example.com descriptive text "(txtvers=1 polling=30 sampling=100)"
      [root@gateway dnsmasq.d]# host -t SRV _sflow._udp.example.com
      _sflow._udp.example.com has SRV record 0 0 6343 192.168.1.50.

  9. Can Host sFlow support different sampling settings for different traffic, like a 0.1 sampling rate for HTTP traffic (port 80) and 0.01 for everything else? Can iptables rules support this for sFlow?

    Replies
    1. Host sFlow supports a single packet sampling rate (set using sampling=N in the hsflowd.conf file, or via DNS-SD). However, there are a number of modules that work with the Host sFlow agent to extend visibility (see Host sFlow distributed agent).

      You were asking about setting a sampling rate for port 80 (HTTP). If you are using an Apache web server you could install mod-sflow and configure a sampling rate for HTTP operations in the hsflowd.conf file (sampling.http=M). You will then get application transaction samples and counters for the HTTP traffic, see HTTP.

  10. I have a pretty generic CentOS 6.5 system that refuses to compile hsflowd with DOCKER=yes. I've installed the libcap-devel package, and it apparently doesn't like capability.h:

    make[1]: Entering directory `/root/host-sflow-code/src/sflow'
    make[1]: `libsflow.a' is up to date.
    make[1]: Leaving directory `/root/host-sflow-code/src/sflow'
    cd src/json; make
    make[1]: Entering directory `/root/host-sflow-code/src/json'
    make[1]: `libjson.a' is up to date.
    make[1]: Leaving directory `/root/host-sflow-code/src/json'
    PLATFORM=`uname`; \
    MYVER=`./getVersion`; \
    MYREL=`./getRelease`; \
    cd src/$PLATFORM; make VERSION=$MYVER RELEASE=$MYREL
    make[1]: Entering directory `/root/host-sflow-code/src/Linux'
    gcc -std=gnu99 -I. -I../sflow -O3 -DNDEBUG -Wall -Wstrict-prototypes -Wunused-value -D_GNU_SOURCE -DHSP_VERSION=1.26.0 -DUTHEAP -DHSF_ULOG -DHSF_JSON -I../json -DHSF_DOCKER -I../json -c hsflowconfig.c
    In file included from hsflowd.h:113,
    from hsflowconfig.c:9:
    /usr/include/linux/netlink.h:44: error: expected specifier-qualifier-list before ‘__u16’
    /usr/include/linux/netlink.h:134: error: expected specifier-qualifier-list before ‘__u16’
    make[1]: *** [hsflowconfig.o] Error 1
    make[1]: Leaving directory `/root/host-sflow-code/src/Linux'
    make: *** [hsflowd] Error 2

    I modified capability.h and added:

    #include

    and commented out:

    #define _LINUX_TYPES_H
    #define _LINUX_FS_H
    #define __LINUX_COMPILER_H
    #define __user
    #define _ASM_X86_SIGCONTEXT_H
    #define _ASM_POWERPC_SIGCONTEXT_H
    #define _SPARC_SIGCONTEXT_H

    and it gets a little further:

    cd src/sflow; make
    make[1]: Entering directory `/root/host-sflow-code/src/sflow'
    make[1]: `libsflow.a' is up to date.
    make[1]: Leaving directory `/root/host-sflow-code/src/sflow'
    cd src/json; make
    make[1]: Entering directory `/root/host-sflow-code/src/json'
    make[1]: `libjson.a' is up to date.
    make[1]: Leaving directory `/root/host-sflow-code/src/json'
    PLATFORM=`uname`; \
    MYVER=`./getVersion`; \
    MYREL=`./getRelease`; \
    cd src/$PLATFORM; make VERSION=$MYVER RELEASE=$MYREL
    make[1]: Entering directory `/root/host-sflow-code/src/Linux'
    gcc -std=gnu99 -I. -I../sflow -O3 -DNDEBUG -Wall -Wstrict-prototypes -Wunused-value -D_GNU_SOURCE -DHSP_VERSION=1.26.0 -DUTHEAP -DHSF_ULOG -DHSF_JSON -I../json -DHSF_DOCKER -I../json -c hsflowconfig.c
    gcc -std=gnu99 -I. -I../sflow -O3 -DNDEBUG -Wall -Wstrict-prototypes -Wunused-value -D_GNU_SOURCE -DHSP_VERSION=1.26.0 -DUTHEAP -DHSF_ULOG -DHSF_JSON -I../json -DHSF_DOCKER -I../json -c dnsSD.c
    gcc -std=gnu99 -I. -I../sflow -O3 -DNDEBUG -Wall -Wstrict-prototypes -Wunused-value -D_GNU_SOURCE -DHSP_VERSION=1.26.0 -DUTHEAP -DHSF_ULOG -DHSF_JSON -I../json -DHSF_DOCKER -I../json -c hsflowd.c
    gcc -std=gnu99 -I. -I../sflow -O3 -DNDEBUG -Wall -Wstrict-prototypes -Wunused-value -D_GNU_SOURCE -DHSP_VERSION=1.26.0 -DUTHEAP -DHSF_ULOG -DHSF_JSON -I../json -DHSF_DOCKER -I../json -c util.c
    gcc -std=gnu99 -I. -I../sflow -O3 -DNDEBUG -Wall -Wstrict-prototypes -Wunused-value -D_GNU_SOURCE -DHSP_VERSION=1.26.0 -DUTHEAP -DHSF_ULOG -DHSF_JSON -I../json -DHSF_DOCKER -I../json -c readInterfaces.c
    readInterfaces.c: In function ‘read_ethtool_info’:
    readInterfaces.c:265: error: field ‘ssi’ has incomplete type
    readInterfaces.c:269: error: ‘ETHTOOL_GSSET_INFO’ undeclared (first use in this function)
    readInterfaces.c:269: error: (Each undeclared identifier is reported only once
    readInterfaces.c:269: error: for each function it appears in.)
    readInterfaces.c: In function ‘readContainerInterfaces’:
    readInterfaces.c:606: warning: implicit declaration of function ‘setns’
    make[1]: *** [readInterfaces.o] Error 1
    make[1]: Leaving directory `/root/host-sflow-code/src/Linux'
    make: *** [hsflowd] Error 2

    I was hoping somebody might have seen this before? Is this even supposed to compile on CentOS 6?

    Thanks.

  11. It looks like linux/types.h needs to be included before sys/capability.h, but that only takes care of the first problem. The version of ethtool that comes with CentOS 6 seems to be a few years old, and so we can't use ETHTOOL_GSSET_INFO. We have to fall back on the old approach that uses ETHTOOL_GDRVINFO. I'll see if I can get this fixed today...

  12. I checked changes into the trunk that allow hsflowd to compile on CentOS 6. To check out the trunk sources, make sure "subversion" is installed, then run:

    svn checkout http://svn.code.sf.net/p/host-sflow/code/trunk host-sflow-code

    Please confirm that it works for you.

    Neil

  13. It now compiles with no warnings or errors. Thank you very much for the fix, it's greatly appreciated!

  14. How can I send packets with CounterRecordType = 5 ONLY?

    Replies
    1. The Host sFlow agent sends counters described in the sFlow Host Structures and, depending on the configuration, may also export packet samples and generic interface counters for selected host adapters.

      CounterRecordType=5 is the vlan_counters structure. I don't understand how the VLAN counters are applicable to the Host sFlow agent. Can you explain your requirement?

  15. How can I add a DSCP tag to detected elephant flows using Host sFlow?

    Replies
    1. Have you enabled packet sampling? The simplest method is to enable the pcap{} module, see Configuring Host sFlow for Linux via /etc/hsflowd.conf. The sampled packet headers include the DSCP tag.

      Your sFlow analyzer should be able to include dscp information in the flows. For example, using the following sFlow-RT flow definition would identify large flows broken out by dscp, keys='ipsource,ipdestination,ipdscp', see Defining Flows
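
      As a sketch, a flow definition with those keys can be built as JSON before being sent to the analyzer (the helper name and the REST endpoint mentioned in the comment are assumptions, not sFlow-RT's documented API):

```python
import json

def flow_definition(keys, value="bytes"):
    """Build a JSON flow definition in the style described above."""
    return json.dumps({"keys": keys, "value": value})

# Large-flow definition broken out by DSCP, as suggested above.
# The payload would typically be PUT to the analyzer's REST API,
# e.g. /flow/<name>/json (endpoint path and flow name assumed here).
payload = flow_definition("ipsource,ipdestination,ipdscp")
```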

  16. I am trying to set up Host sFlow on multiple hosts in Mininet, but modifying the /etc/hsflowd.conf file changes the IP for all the hosts, so the agent IP is always the same. Is there a way to uniquely identify the host generating the traffic sent to the sFlow analyzer?
