Friday, December 17, 2010

ULOG

(Netfilter diagram from Wikimedia)

The Host sFlow agent recently added support for netfilter-based traffic monitoring. The netfilter/iptables packet filtering framework is an integral part of recent Linux kernels, providing the mechanisms needed to implement firewalls and perform address translation.

Included within the netfilter framework is a packet sampling facility. In addition to sampling packets, the netfilter framework captures the forwarding path associated with each sampled packet, providing the essential elements needed to implement sFlow standard traffic monitoring on a Linux system.

Instructions for installing Host sFlow are provided in the article Installing Host sFlow on a Linux server. In many cases, configuring traffic monitoring on servers is unnecessary, since sFlow-capable physical and virtual switches already provide end-to-end network visibility (see Hybrid server monitoring). However, if traffic data isn't available from the switches, either because they don't support sFlow or because they are managed by a different organization, then traffic monitoring on the servers is required.

This article describes the additional steps needed to configure sFlow traffic monitoring using netfilter. The following steps configure 1-in-1000 sampling of packets on a Fedora 14 server. The sampling rate of 1-in-1000 was selected based on the 1 Gbit/s speed of the network adapter. See the article Sampling rates for suggested sampling rates.

First, list the existing iptables rules:

[root@fedora14 ~]# iptables --list --line-numbers --verbose
Chain INPUT (policy ACCEPT 0 packets, 0 bytes)
num   pkts bytes target     prot opt in     out     source               destination         
1        0     0 ACCEPT     all  --  lo     any     anywhere             anywhere            
2       93  8415 ACCEPT     all  --  any    any     anywhere             anywhere            state RELATED,ESTABLISHED 
3        1    84 ACCEPT     icmp --  any    any     anywhere             anywhere            
4        1    64 ACCEPT     tcp  --  any    any     anywhere             anywhere            state NEW tcp dpt:ssh 
5        9  1138 REJECT     all  --  any    any     anywhere             anywhere            reject-with icmp-host-prohibited 

Chain FORWARD (policy ACCEPT 0 packets, 0 bytes)
num   pkts bytes target     prot opt in     out     source               destination         
1        0     0 REJECT     all  --  any    any     anywhere             anywhere            reject-with icmp-host-prohibited 

Chain OUTPUT (policy ACCEPT 68 packets, 9509 bytes)
num   pkts bytes target     prot opt in     out     source               destination

Rules are evaluated in order, so it is important to find the correct place to apply sampling. The first rule in the INPUT chain accepts all traffic associated with the internal loopback interface (lo). This rule is needed because many applications use the loopback interface for inter-process communications. Since we are only interested in external traffic, the ULOG rule should be inserted as rule 2 in this rule chain:

iptables -I INPUT 2 -m statistic --mode random --probability 0.001 -j ULOG --ulog-nlgroup 5

There are currently no rules in the OUTPUT chain, so we can simply add the ULOG rule:

iptables -A OUTPUT -m statistic --mode random --probability 0.001 -j ULOG --ulog-nlgroup 5

Note: Sampling rates are expressed as probabilities, so the sampling rate of 1-in-1000 translates to a probability of 0.001. Only add one sFlow sampling rule to each chain. Duplicate sampling rules will result in biased measurements since the probability of sampling a packet will vary depending on where it matches in the chain. Use the same sampling probability in both INPUT and OUTPUT chains for the same reason.
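The conversion from a 1-in-N sampling rate to an iptables probability can be sketched in shell. This is only an illustration: the helper names rate_to_probability and sampling_rule are hypothetical, the netlink group is fixed at 5 to match the example above, and the script prints the commands rather than running them, so the same probability can be reviewed for both chains before anything is applied.

```shell
# Convert a 1-in-N sampling rate into the probability expected by
# the iptables "statistic" match (hypothetical helper).
rate_to_probability() {
    awk -v n="$1" 'BEGIN { printf "%.6f\n", 1 / n }'
}

# Print (not run) one sampling rule for a chain. The optional third
# argument is a rule position: when given, the rule is inserted at
# that position, otherwise it is appended to the chain.
sampling_rule() {
    chain=$1
    prob=$2
    pos=$3
    if [ -n "$pos" ]; then
        action="-I $chain $pos"
    else
        action="-A $chain"
    fi
    echo "iptables $action -m statistic --mode random --probability $prob -j ULOG --ulog-nlgroup 5"
}

# Use one probability for both chains, as recommended in the note.
prob=$(rate_to_probability 1000)
sampling_rule INPUT "$prob" 2
sampling_rule OUTPUT "$prob"
```

Deriving both rules from a single variable is one simple way to avoid the INPUT and OUTPUT probabilities drifting apart.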

Note: There are 32 netlink groups (1-32) that can be used to transmit ULOG messages. Check to see if there are any other ULOG statements in iptables and make sure to select a distinct group for sFlow sampling. In this case group 5 has been selected.
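One way to check which groups are already taken is to scan the saved rule set for ULOG targets. The sketch below is an assumption-laden illustration: used_ulog_groups is a hypothetical helper, it reads iptables-save style text from standard input, and it treats a ULOG rule with no explicit --ulog-nlgroup as using the iptables default of group 1. The sample rules are made up for the demonstration.

```shell
# Read iptables-save style text on stdin and print the netlink groups
# already used by ULOG rules, one per line, sorted and de-duplicated.
used_ulog_groups() {
    awk '
        /-j ULOG/ {
            group = 1    # iptables default when no group is given
            for (i = 1; i < NF; i++)
                if ($i == "--ulog-nlgroup") group = $(i + 1)
            print group
        }
    ' | sort -n -u
}

# Sample input standing in for the output of "iptables-save".
used_ulog_groups <<'EOF'
-A INPUT -i lo -j ACCEPT
-A INPUT -m statistic --mode random --probability 0.001 -j ULOG --ulog-nlgroup 5
-A FORWARD -j ULOG --ulog-prefix "fw: "
EOF
```

On a live system the same function could be fed with "iptables-save | used_ulog_groups" (run as root); any group from 1 to 32 that is not listed is free for sFlow sampling.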

Listing the table again confirms that the changes are correct:

[root@fedora14 ~]# iptables --list
Chain INPUT (policy ACCEPT)
target     prot opt source               destination         
ACCEPT     all  --  anywhere             anywhere            
ULOG       all  --  anywhere             anywhere            statistic mode random probability 0.001000 ULOG copy_range 0 nlgroup 5 queue_threshold 1 
ACCEPT     all  --  anywhere             anywhere            state RELATED,ESTABLISHED 
ACCEPT     icmp --  anywhere             anywhere            
ACCEPT     tcp  --  anywhere             anywhere            state NEW tcp dpt:ssh 
REJECT     all  --  anywhere             anywhere            reject-with icmp-host-prohibited 

Chain FORWARD (policy ACCEPT)
target     prot opt source               destination         
REJECT     all  --  anywhere             anywhere            reject-with icmp-host-prohibited 

Chain OUTPUT (policy ACCEPT)
target     prot opt source               destination         
ULOG       all  --  anywhere             anywhere            statistic mode random probability 0.001000 ULOG copy_range 0 nlgroup 5 queue_threshold 1 

In many deployments, servers run in a secure network behind a firewall, so the overhead of running a stateful firewall on each server is unnecessary. In this case, a very simple, monitoring-only configuration of iptables provides traffic visibility with minimal impact on server performance:

[root@fedora14 ~]# iptables --list 
Chain INPUT (policy ACCEPT)
target     prot opt source               destination         
ULOG       all  --  anywhere             anywhere            statistic mode random probability 0.001000 ULOG copy_range 0 nlgroup 5 queue_threshold 1 

Chain FORWARD (policy ACCEPT)
target     prot opt source               destination         

Chain OUTPUT (policy ACCEPT)
target     prot opt source               destination         
ULOG       all  --  anywhere             anywhere            statistic mode random probability 0.001000 ULOG copy_range 0 nlgroup 5 queue_threshold 1 

Once the rules are correct, they should be saved so that they will automatically be reinstalled if the server is rebooted.

[root@fedora14 ~]# service iptables save

The Host sFlow agent needs to be configured to export the samples (by editing the /etc/hsflowd.conf file). The following configuration instructs the Host sFlow agent to use DNS-SD to automatically configure sFlow receivers and polling intervals. The additional ULOG settings tell the agent which ULOG nlgroup to listen to for packet samples as well as the sampling probability that was configured in iptables:

sflow {
  DNSSD = on

  # ULOG settings
  ulogProbability = 0.001
  ulogGroup = 5
}

Note: Make sure that the sampling probability specified in the Host sFlow configuration matches the probability used in the iptables rules. Any discrepancies will result in incorrectly scaled traffic measurements.
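A quick consistency check along these lines can catch a mismatch before it skews the measurements. This is a minimal sketch, not part of Host sFlow: conf_probability and same_probability are hypothetical helpers, the configuration text is the example from this article, and rule_prob stands in for the value reported by "iptables --list". The comparison is numeric, so 0.001 and 0.001000 are treated as equal.

```shell
# Extract the ulogProbability value from hsflowd.conf style text on stdin.
conf_probability() {
    sed -n 's/^[[:space:]]*ulogProbability[[:space:]]*=[[:space:]]*\([0-9.]*\).*/\1/p'
}

# Succeed only when the two probabilities are numerically equal.
same_probability() {
    awk -v a="$1" -v b="$2" 'BEGIN { exit !(a + 0 == b + 0) }'
}

# Example configuration from this article.
conf='sflow {
  DNSSD = on
  ulogProbability = 0.001
  ulogGroup = 5
}'
rule_prob=0.001000   # value shown by "iptables --list" for the ULOG rule

if same_probability "$(printf '%s\n' "$conf" | conf_probability)" "$rule_prob"; then
    echo "probabilities match"
else
    echo "probabilities differ: measurements would be scaled incorrectly"
fi
```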

Next, restart the Host sFlow agent so that it picks up the new configuration:

[root@fedora14 ~]# service hsflowd restart

Note: The Host sFlow agent can resample ULOG captured packets in order to achieve the sampling rate specified using DNS-SD, or through the sampling setting in the /etc/hsflowd.conf file. Choose a relatively aggressive ULOG sampling probability that reduces the overhead of monitoring, but allows a wide range of sampling rates to be set. For example, configuring the ULOG probability to 0.01 will allow Host sFlow agent sampling rates to be set to 100, 200, 300, 400 etc. The Host sFlow agent will choose the nearest sampling rate it can achieve, so if you configure a sampling rate of 290, it would actually sample with a rate of 300 (i.e. sample every third ULOG packet).
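The resampling arithmetic in the note can be worked through with a short script. It is a sketch of the calculation only: effective_rate is a hypothetical helper, and rounding to the nearest whole number of ULOG packets is an assumption about what "nearest" means here, so the agent's actual tie-breaking may differ. The first number printed is the sub-sampling step (every Nth ULOG packet) and the second is the resulting 1-in-N sampling rate.

```shell
# Given the ULOG probability and the sampling rate requested from the
# agent, print the sub-sampling step and the effective sampling rate.
effective_rate() {
    awk -v p="$1" -v want="$2" 'BEGIN {
        base = 1 / p                     # 1-in-base sampling done by iptables
        step = int(want / base + 0.5)    # nearest whole number of ULOG packets
        if (step < 1) step = 1           # cannot sample more often than ULOG does
        printf "%d %.0f\n", step, step * base
    }'
}

# ULOG probability 0.01 with a requested sampling rate of 290.
effective_rate 0.01 290
```

With a ULOG probability of 0.01 and a requested rate of 290, this prints "3 300", matching the example in the note: every third ULOG packet is kept, giving an effective rate of 1-in-300.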

At this point traffic data from the server should start appearing in the sFlow analyzer. The following chart shows top connections monitored using ULOG/Host sFlow:

(Chart: top connections monitored using ULOG/Host sFlow)

Finally, sFlow monitoring of servers is part of an overall solution that simplifies management by unifying network, storage, server and application performance monitoring within a single scalable system (see sFlow Host Structures). Implementing an sFlow monitoring solution helps break down management silos, ensuring the coordination of resources needed to manage a converged infrastructure.

13 comments:

  1. I was not able to get this working on Ubuntu 12.04. iptables modules for statistics are not built/linked correctly so ulog will not work. I used iptables packages for 14.04 that were supposed to be fixed/patched. I can not get anything other than the normal base host sflow metrics to report.

    Other notes -- I have tried this with and without DNSSD = on/off and various forms of iptables commands -- nothing.

    Is DNSSD a must for this to work?

  2. Ubuntu 12.04 has a busted implementation/build of the stats module, i.e. ULOG does not work. I tried the 14.04 packages on 12.04; they installed, but I get no stats reported other than the host sFlow stats.

    Can anyone advise if they've had success on 12.04 (without having to go build patched up iptables packages). I'm also curious if you must have DNSSD = on. I've tried both with no success.

    Thanks!

  3. The ULOG target was removed in the 3.17.0 kernel release. See:
    http://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=7200135bc1e61f1437dc326ae2ef2f310c50b4eb
    Will it be updated?

    Replies
    1. Yes. We hope to update it to use the newer mechanism. Probably before the end of this month.

      Neil

  4. What is that sflow analyzer in the picture?

  5. Looks indeed like Traffic Sentinel (not free). sFlowTrend cannot do iptables logs; I tried. So I have to find a free tool to do the same. So far no luck.

    Replies
    1. sFlowTrend will work, because it's a standard sFlow feed. You just have to get hsflowd configured appropriately for your servers. If ULOG is not working you can now try NFLOG instead. You may need to do something like this before you build hsflowd from the latest sources:

      sudo apt-get install libnfnetlink-dev

      then hsflowd will compile with the hooks that are required. See the latest hsflowd.conf for a commented-out example of using iptables to send packets to NFLOG:

      https://github.com/sflow/host-sflow/blob/master/src/Linux/scripts/hsflowd.conf#L96

  6. Hi Neil. I have hsflowd configured to receive NFLOG from iptables but how is it supposed to appear on sflowtrend? I have the free version and it looks like it can only receive data from host (cpu, memory etc.) or from SNMP (routers).

    I did not have the libnfnetlink-dev installed so I will give it another try.

  7. Hi Neil, I followed your instructions to set up my hsflowd daemon with NFLOG support.

    Launching the daemon manually, I see that the NFLOG socket is ready:

    configVMs
    NFLOG socket fd=7
    initAgent suceeded
    Arena 0:
    system bytes = 135168
    in use bytes = 25728
    Total (incl. mmap):
    system bytes = 135168
    in use bytes = 25728
    max mmap regions = 0
    max mmap bytes = 0
    drop_priviliges: getuid=0
    getrlimit(__RLIMIT_MEMLOCK) = 65536 (max=65536)
    getrlimit(__RLIMIT_NPROC) = 7339 (max=7339)
    getrlimit(RLIMIT_STACK) = 8388608 (max=4294967295)
    getrlimit(RLIMIT_CORE) = 0 (max=4294967295)
    getrlimit(RLIMIT_CPU) = 4294967295 (max=4294967295)
    getrlimit(RLIMIT_DATA) = 4294967295 (max=4294967295)
    getrlimit(RLIMIT_FSIZE) = 4294967295 (max=4294967295)
    getrlimit(__RLIMIT_RSS) = 4294967295 (max=4294967295)
    getrlimit(RLIMIT_NOFILE) = 1024 (max=4096)
    getrlimit(RLIMIT_AS) = 4294967295 (max=4294967295)
    getrlimit(__RLIMIT_LOCKS) = 4294967295 (max=4294967295)
    state -> RUN
    polling interval changed from 0 to 30
    syncOutputFile
    configVMs
    my_os_calloc(128)
    my_os_calloc(256)
    setAddressPriorities
    interfaces added: 0 removed: 0 cameup: 0 wentdown: 0 changed: 0
    selectAgentAddress
    selectAgentAddress selected agentIP with highest priority
    agentAddressChanged=YES


    and iptables seems to be fine too:
    Chain INPUT (policy ACCEPT 0 packets, 0 bytes)
    pkts bytes target prot opt in out source destination
    10403 1239K NFLOG all -- any any anywhere anywhere


    But on my collector, sflowtool shows me only 0,0,0... so I suppose I missed something, but I don't know where or what.

    # sflowtool -c 172.31.1.149 -d 6343 -l -4
    CNTR,172.31.1.149,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
    CNTR,172.31.1.149,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
    CNTR,172.31.1.149,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
    CNTR,172.31.1.149,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
    CNTR,172.31.1.149,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0

    Do you have any idea?

    Thanks for your help
    Thierry

  8. Three suggestions:

    (1) is iptables seeing packets? When I run "iptables --list --verbose" on a test server here I see:

    Chain INPUT (policy ACCEPT 448M packets, 282G bytes)
    pkts bytes target prot opt in out source destination
    118M 61G NFLOG all -- any any anywhere anywhere statistic mode random probability 0.10000000009 nflog-prefix SFLOW nflog-group 5

    (and hsflowd.conf has "nflogGroup=5" and "nflogProbability=0.1")

    (2) If you run it with "hsflowd -ddd" you should see individual messages for every packet received on the NFLOG channel:

    netlink (228 bytes left) msg [len=208 type=1024 flags=0x0 seq=0 pid=0]

    (3) if you check out the very latest sources from github, and install libpcap-dev(el) then you can "make PCAP=yes" and use this in hsflowd.conf as another way to get packets (alternative to ULOG/NFLOG):

    pcap { dev = eth0 }

    If your kernel is 3.19 or later then this works out to be very efficient:
    https://drive.google.com/a/inmon.com/file/d/0B7iu87Nt-FO9UWw1UE50MzdKLVU/view

    Neil

    Replies
    1. Hi Neil,

      I also have the same issue as tbriche.

      $ sflowtool -l
      CNTR,114.212.80.2,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
      CNTR,114.212.80.2,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
      CNTR,114.212.80.2,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
      CNTR,114.212.80.2,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
      CNTR,114.212.80.2,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
      CNTR,114.212.80.2,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0

      I have verified the output of my iptables --list --verbose --line-numbers:
      Chain INPUT (policy ACCEPT 999 packets, 128K bytes)
      num pkts bytes target prot opt in out source destination
      1 60 48542 NFLOG all -- any any anywhere anywhere statistic mode random probability 0.002500 nflog-prefix "SFLOW" nflog-group 5

      But when I run "hsflowd -ddd", I don't see the output you described:
      netlink (228 bytes left) msg [len=208 type=1024 flags=0x0 seq=0 pid=0]

      I don't know why. Do you have an idea?

      I also checked out the latest sources from GitHub and tried using PCAP, but it doesn't work.

      What should I do? Could you please help me?

      Thanks,
      Hanyang

  9. Hi Neil,
    many thanks for your reply,

    as you suggested, I verified the output of my iptables --list --verbose, and I noticed that no options were passed. That's why I manually forced those params into the iptables conf file.

    After that everything worked fine.

    Many thanks again.

    Thierry
