(Netfilter diagram from Wikimedia)
The Host sFlow agent recently added support for netfilter-based traffic monitoring. The netfilter/iptables packet filtering framework is an integral part of recent Linux kernels, providing the mechanisms needed to implement firewalls and perform address translation.
Included within the netfilter framework is a packet sampling facility. In addition to sampling packets, the netfilter framework captures the forwarding path associated with each sampled packet, providing the essential elements needed to implement sFlow standard traffic monitoring on a Linux system.
Instructions for installing Host sFlow are provided in the article Installing Host sFlow on a Linux server. In many cases, configuring traffic monitoring on servers is unnecessary since sFlow-capable physical and virtual switches already provide end-to-end network visibility (see Hybrid server monitoring). However, if traffic data isn't available from the switches, either because they don't support sFlow or because they are managed by a different organization, then traffic monitoring on the servers is required.
This article describes the additional steps needed to configure sFlow traffic monitoring using netfilter. The following steps configure 1-in-1000 sampling of packets on a Fedora 14 server. The sampling rate of 1-in-1000 was selected based on the 1 Gbit/s speed of the network adapter. See the article Sampling rates for suggested sampling rates.
First, list the existing iptables rules:
[root@fedora14 ~]# iptables --list --line-numbers --verbose
Chain INPUT (policy ACCEPT 0 packets, 0 bytes)
num pkts bytes target prot opt in out source destination
1 0 0 ACCEPT all -- lo any anywhere anywhere
2 93 8415 ACCEPT all -- any any anywhere anywhere state RELATED,ESTABLISHED
3 1 84 ACCEPT icmp -- any any anywhere anywhere
4 1 64 ACCEPT tcp -- any any anywhere anywhere state NEW tcp dpt:ssh
5 9 1138 REJECT all -- any any anywhere anywhere reject-with icmp-host-prohibited
Chain FORWARD (policy ACCEPT 0 packets, 0 bytes)
num pkts bytes target prot opt in out source destination
1 0 0 REJECT all -- any any anywhere anywhere reject-with icmp-host-prohibited
Chain OUTPUT (policy ACCEPT 68 packets, 9509 bytes)
num pkts bytes target prot opt in out source destination
Rules are evaluated in order, so it is important to find the correct place to apply sampling. The first rule in the INPUT chain accepts all traffic associated with the internal loopback interface (lo). This rule is needed because many applications use the loopback interface for inter-process communications. Since we are only interested in external traffic, the ULOG rule should be inserted as rule 2 in this rule chain:
iptables -I INPUT 2 -m statistic --mode random --probability 0.001 -j ULOG --ulog-nlgroup 5
There are currently no rules in the OUTPUT chain, so we can simply add the ULOG rule:
iptables -A OUTPUT -m statistic --mode random --probability 0.001 -j ULOG --ulog-nlgroup 5
Note: Sampling rates are expressed as probabilities, so the sampling rate of 1-in-1000 translates to a probability of 0.001. Only add one sFlow sampling rule to each chain. Duplicate sampling rules will result in biased measurements since the probability of sampling a packet will vary depending on where it matches in the chain. Use the same sampling probability in both INPUT and OUTPUT chains for the same reason.
Note: There are 32 netlink groups (1-32) that can be used to transmit ULOG messages. Check to see if there are any other ULOG statements in iptables and make sure to select a distinct group for sFlow sampling. In this case group 5 has been selected.
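As a rough illustration of the two notes above, the following Python sketch converts a 1-in-N sampling rate into the probability expected by the statistic match and assembles the arguments for a ULOG sampling rule. The helper names (rate_to_probability, ulog_rule_args) are hypothetical and not part of Host sFlow or iptables. Only the iptables options already shown in this article are used, and the script prints the commands rather than running them.

```python
def rate_to_probability(sampling_rate):
    """Convert a 1-in-N sampling rate to a probability (1-in-1000 -> 0.001)."""
    if sampling_rate < 1:
        raise ValueError("sampling rate must be at least 1")
    return 1.0 / sampling_rate


def ulog_rule_args(chain, sampling_rate, nlgroup, position=None):
    """Build the iptables arguments for one sampling rule (hypothetical helper).

    position: rule number to insert at (uses -I); None appends with -A.
    """
    if not 1 <= nlgroup <= 32:
        raise ValueError("ULOG netlink group must be in the range 1-32")
    probability = rate_to_probability(sampling_rate)
    if position is None:
        args = ["iptables", "-A", chain]
    else:
        args = ["iptables", "-I", chain, str(position)]
    args += ["-m", "statistic", "--mode", "random",
             "--probability", f"{probability:.6f}",
             "-j", "ULOG", "--ulog-nlgroup", str(nlgroup)]
    return args


if __name__ == "__main__":
    # Same sampling rate and netlink group in both chains, as the notes recommend.
    print(" ".join(ulog_rule_args("INPUT", 1000, 5, position=2)))
    print(" ".join(ulog_rule_args("OUTPUT", 1000, 5)))
```

Generating both rules from one rate and one group number is a simple way to avoid the mismatched probabilities that the first note warns about.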
Listing the table again confirms that the changes are correct:
[root@fedora14 ~]# iptables --list
Chain INPUT (policy ACCEPT)
target prot opt source destination
ACCEPT all -- anywhere anywhere
ULOG all -- anywhere anywhere statistic mode random probability 0.001000 ULOG copy_range 0 nlgroup 5 queue_threshold 1
ACCEPT all -- anywhere anywhere state RELATED,ESTABLISHED
ACCEPT icmp -- anywhere anywhere
ACCEPT tcp -- anywhere anywhere state NEW tcp dpt:ssh
REJECT all -- anywhere anywhere reject-with icmp-host-prohibited
Chain FORWARD (policy ACCEPT)
target prot opt source destination
REJECT all -- anywhere anywhere reject-with icmp-host-prohibited
Chain OUTPUT (policy ACCEPT)
target prot opt source destination
ULOG all -- anywhere anywhere statistic mode random probability 0.001000 ULOG copy_range 0 nlgroup 5 queue_threshold 1
In many deployments, servers are running in a secure network behind a firewall and so the overhead of running a stateful firewall on each server is unnecessary. In this case a very simple, monitoring only, configuration of iptables provides traffic visibility with minimal impact on server performance:
[root@fedora14 ~]# iptables --list
Chain INPUT (policy ACCEPT)
target prot opt source destination
ULOG all -- anywhere anywhere statistic mode random probability 0.001000 ULOG copy_range 0 nlgroup 5 queue_threshold 1
Chain FORWARD (policy ACCEPT)
target prot opt source destination
Chain OUTPUT (policy ACCEPT)
target prot opt source destination
ULOG all -- anywhere anywhere statistic mode random probability 0.001000 ULOG copy_range 0 nlgroup 5 queue_threshold 1
Once the rules are correct, they should be saved so that they will automatically be reinstalled if the server is rebooted.
[root@fedora14 ~]# service iptables save
The Host sFlow agent needs to be configured to export the samples (by editing the /etc/hsflowd.conf file). The following configuration instructs the Host sFlow agent to use DNS-SD to automatically configure sFlow receivers and polling intervals. The additional ULOG settings tell the agent which ULOG nlgroup to listen to for packet samples as well as the sampling probability that was configured in iptables:
sflow {
  DNSSD = on
  # ULOG settings
  ulogProbability = 0.001
  ulogGroup = 5
}
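Since the agent's ULOG settings have to agree with the iptables rule, a small consistency check can be handy. The Python sketch below is only an illustration: it assumes the configuration is written one setting per line and that the rule is available as the command-line text used earlier in this article. The function names are hypothetical, and this is not how hsflowd itself parses its configuration.

```python
import re


def conf_ulog_settings(conf_text):
    """Extract (ulogProbability, ulogGroup) from hsflowd.conf-style text."""
    settings = {}
    for line in conf_text.splitlines():
        line = line.split("#", 1)[0]  # ignore comments
        match = re.match(r"\s*(ulogProbability|ulogGroup)\s*=\s*(\S+)", line)
        if match:
            settings[match.group(1)] = match.group(2)
    return float(settings["ulogProbability"]), int(settings["ulogGroup"])


def rule_ulog_settings(rule_text):
    """Extract (--probability, --ulog-nlgroup) from an iptables command line."""
    probability = re.search(r"--probability\s+(\S+)", rule_text)
    group = re.search(r"--ulog-nlgroup\s+(\d+)", rule_text)
    if not probability or not group:
        raise ValueError("rule has no sampling probability or ULOG group")
    return float(probability.group(1)), int(group.group(1))


def settings_match(conf_text, rule_text, tolerance=1e-9):
    """True when the agent and the rule agree on probability and group."""
    conf_p, conf_g = conf_ulog_settings(conf_text)
    rule_p, rule_g = rule_ulog_settings(rule_text)
    return conf_g == rule_g and abs(conf_p - rule_p) <= tolerance


CONF = """
sflow {
  DNSSD = on
  # ULOG settings
  ulogProbability = 0.001
  ulogGroup = 5
}
"""
RULE = ("iptables -I INPUT 2 -m statistic --mode random "
        "--probability 0.001 -j ULOG --ulog-nlgroup 5")

if __name__ == "__main__":
    print("settings agree" if settings_match(CONF, RULE) else "MISMATCH")
```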
Note: Make sure that the sampling probability specified in the Host sFlow configuration matches the probability used in the iptables rules. Any discrepancies will result in incorrectly scaled traffic measurements.
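To see why a mismatch matters, consider how sampled counts are scaled back up to traffic estimates. The Python sketch below assumes the simplest possible estimator, dividing the number of sampled packets by the sampling probability. The sample count is made up for illustration and the function name is hypothetical.

```python
def estimate_total(sampled_count, probability):
    """Scale a count of sampled packets up to an estimate of total packets."""
    if not 0 < probability <= 1:
        raise ValueError("probability must be in the range (0, 1]")
    return sampled_count / probability


if __name__ == "__main__":
    samples = 250            # hypothetical number of packets sampled by iptables
    rule_probability = 0.001  # probability used in the iptables rules

    # Agent configured with the same probability as the rules.
    print(estimate_total(samples, rule_probability))

    # Agent mistakenly configured with 0.01: the estimate is ten times too small.
    print(estimate_total(samples, 0.01))
```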
Next, restart the Host sFlow agent so that it picks up the new configuration:
[root@fedora14 ~]# service hsflowd restart
Note: The Host sFlow agent can resample ULOG captured packets in order to achieve the sampling rate specified using DNS-SD, or through the sampling setting in the /etc/hsflowd.conf file. Choose a relatively aggressive ULOG sampling probability that reduces the overhead of monitoring, but allows a wide range of sampling rates to be set. For example, configuring the ULOG probability to 0.01 (1-in-100) will allow Host sFlow agent sampling rates to be set to multiples of 100: 100, 200, 300, 400, etc. The Host sFlow agent will choose the nearest sampling rate it can achieve, so if you configure a sampling rate of 290, it will actually sample with a rate of 300 (i.e. sample every third ULOG packet).
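The arithmetic behind this note can be sketched in a few lines of Python. This is an interpretation of the behavior described above, not code taken from hsflowd: it assumes the agent keeps every k-th ULOG packet and picks k so that the resulting rate is as close as possible to the requested one. The function name is hypothetical.

```python
import math


def effective_sampling(ulog_probability, requested_rate):
    """Return (sub-sampling factor k, effective 1-in-N sampling rate).

    Assumes the agent keeps every k-th ULOG packet, with k chosen so that
    k * (1 / ulog_probability) is as close as possible to requested_rate.
    """
    base_rate = round(1.0 / ulog_probability)  # e.g. 0.01 -> 1-in-100
    factor = max(1, math.floor(requested_rate / base_rate + 0.5))
    return factor, factor * base_rate


if __name__ == "__main__":
    for requested in (100, 290, 400):
        factor, rate = effective_sampling(0.01, requested)
        print(f"requested 1-in-{requested}: keep every {factor} ULOG packet(s), "
              f"effective rate 1-in-{rate}")
```

Under these assumptions, a requested rate below the ULOG base rate cannot be achieved, so the sketch falls back to keeping every ULOG packet.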
At this point traffic data from the server should start appearing in the sFlow analyzer. The following chart shows top connections monitored using ULOG/Host sFlow:
Finally, sFlow monitoring of servers is part of an overall solution that simplifies management by unifying network, storage, server and application performance monitoring within a single scalable system (see sFlow Host Structures). Implementing an sFlow monitoring solution helps break down management silos, ensuring the coordination of resources needed to manage a converged infrastructure.
I was not able to get this working on Ubuntu 12.04. The iptables modules for statistics are not built/linked correctly, so ULOG will not work. I used iptables packages for 14.04 that were supposed to be fixed/patched. I cannot get anything other than the normal base Host sFlow metrics to report.
Other notes -- I have tried this with DNSSD = on and DNSSD = off and with various forms of iptables commands -- nothing.
Is DNSSD a must for this to work?
Ubuntu 12.04 has a busted implementation/build of the stats module, i.e. ULOG does not work. I tried the 14.04 packages on 12.04; they installed, but nothing is reported other than the Host sFlow host stats.
Can anyone advise if they've had success on 12.04 (without having to build patched iptables packages)? I'm also curious whether you must have DNSSD = on. I've tried both with no success.
Thanks!
The ULOG target was removed in the 3.17.0 kernel release. See:
http://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=7200135bc1e61f1437dc326ae2ef2f310c50b4eb
Will it be updated?
Yes. We hope to update it to use the newer mechanism. Probably before the end of this month.
Neil
What is that sflow analyzer in the picture?
It looks like the chart was captured from sFlowTrend or Traffic Sentinel.
It does indeed look like Traffic Sentinel (not free); sFlowTrend cannot do iptables logs. I tried. So I have to find a free tool to do the same. So far no luck.
sFlowTrend will work, because it's a standard sFlow feed. You just have to get hsflowd configured appropriately for your servers. If ULOG is not working you can now try NFLOG instead. You may need to do something like this before you build hsflowd from the latest sources:
sudo apt-get install libnfnetlink-dev
Then hsflowd will compile with the required hooks. See the latest hsflowd.conf for a commented-out example of using iptables to send packets to NFLOG:
https://github.com/sflow/host-sflow/blob/master/src/Linux/scripts/hsflowd.conf#L96
Hi Neil. I have hsflowd configured to receive NFLOG from iptables, but how is it supposed to appear in sFlowTrend? I have the free version and it looks like it can only receive data from hosts (CPU, memory, etc.) or from SNMP (routers).
I did not have libnfnetlink-dev installed, so I will give it another try.
Hi Neil, I followed your instructions to set up my hsflowd daemon with NFLOG support.
When I launch the daemon manually, I see that the NFLOG socket is ready:
configVMs
NFLOG socket fd=7
initAgent suceeded
Arena 0:
system bytes = 135168
in use bytes = 25728
Total (incl. mmap):
system bytes = 135168
in use bytes = 25728
max mmap regions = 0
max mmap bytes = 0
drop_priviliges: getuid=0
getrlimit(__RLIMIT_MEMLOCK) = 65536 (max=65536)
getrlimit(__RLIMIT_NPROC) = 7339 (max=7339)
getrlimit(RLIMIT_STACK) = 8388608 (max=4294967295)
getrlimit(RLIMIT_CORE) = 0 (max=4294967295)
getrlimit(RLIMIT_CPU) = 4294967295 (max=4294967295)
getrlimit(RLIMIT_DATA) = 4294967295 (max=4294967295)
getrlimit(RLIMIT_FSIZE) = 4294967295 (max=4294967295)
getrlimit(__RLIMIT_RSS) = 4294967295 (max=4294967295)
getrlimit(RLIMIT_NOFILE) = 1024 (max=4096)
getrlimit(RLIMIT_AS) = 4294967295 (max=4294967295)
getrlimit(__RLIMIT_LOCKS) = 4294967295 (max=4294967295)
state -> RUN
polling interval changed from 0 to 30
syncOutputFile
configVMs
my_os_calloc(128)
my_os_calloc(256)
setAddressPriorities
interfaces added: 0 removed: 0 cameup: 0 wentdown: 0 changed: 0
selectAgentAddress
selectAgentAddress selected agentIP with highest priority
agentAddressChanged=YES
and iptables seems to be good too:
Chain INPUT (policy ACCEPT 0 packets, 0 bytes)
pkts bytes target prot opt in out source destination
10403 1239K NFLOG all -- any any anywhere anywhere
But on my collector, sflowtool shows me only 0,0,0... so I suppose I missed something, but I don't know where or what.
# sflowtool -c 172.31.1.149 -d 6343 -l -4
CNTR,172.31.1.149,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
CNTR,172.31.1.149,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
CNTR,172.31.1.149,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
CNTR,172.31.1.149,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
CNTR,172.31.1.149,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
Do you have any idea?
Thanks for your help
Thierry
Three suggestions:
(1) Is iptables seeing packets? When I run "iptables --list --verbose" on a test server here I see:
Chain INPUT (policy ACCEPT 448M packets, 282G bytes)
pkts bytes target prot opt in out source destination
118M 61G NFLOG all -- any any anywhere anywhere statistic mode random probability 0.10000000009 nflog-prefix SFLOW nflog-group 5
(and hsflowd.conf has "nflogGroup=5" and "nflogProbability=0.1")
(2) If you run it with "hsflowd -ddd" you should see individual messages for every packet received on the NFLOG channel:
netlink (228 bytes left) msg [len=208 type=1024 flags=0x0 seq=0 pid=0]
(3) if you check out the very latest sources from github, and install libpcap-dev(el) then you can "make PCAP=yes" and use this in hsflowd.conf as another way to get packets (alternative to ULOG/NFLOG):
pcap { dev = eth0 }
If your kernel is 3.19 or later then this works out to be very efficient:
https://drive.google.com/a/inmon.com/file/d/0B7iu87Nt-FO9UWw1UE50MzdKLVU/view
Neil
Hi Neil,
I also have the same issue as tbriche.
$ sflowtool -l
CNTR,114.212.80.2,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
CNTR,114.212.80.2,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
CNTR,114.212.80.2,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
CNTR,114.212.80.2,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
CNTR,114.212.80.2,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
CNTR,114.212.80.2,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
I have verified the output of my iptables --list --verbose --line-numbers:
Chain INPUT (policy ACCEPT 999 packets, 128K bytes)
num pkts bytes target prot opt in out source destination
1 60 48542 NFLOG all -- any any anywhere anywhere statistic mode random probability 0.002500 nflog-prefix "SFLOW" nflog-group 5
But when I run "hsflowd -ddd", I don't see the output you described:
netlink (228 bytes left) msg [len=208 type=1024 flags=0x0 seq=0 pid=0]
I don't know why. Do you have an idea?
I also checked out the latest sources from GitHub and tried using PCAP, but it doesn't work.
What should I do? Could you please help me?
Thanks,
Hanyang
Hi Neil,
many thanks for your reply,
As you suggested, I verified the output of iptables --list --verbose, and I noticed that no options were passed. That's why I manually forced those parameters into the iptables configuration file.
After that everything worked fine.
Many thanks again.
Thierry