Friday, June 7, 2013

Large flow detection

The familiar television test pattern is used to measure display resolution, linearity and calibration. Since fast and accurate detection of large flows is a prerequisite for developing load balancing SDN controllers, this article will develop a large flow test pattern and use it to examine the speed and accuracy of large flow detection based on the sFlow standard.
Step Response from Wikipedia
Step or square wave signals are widely used in electrical and control engineering to monitor the responsiveness of a system. In this case we are interested in detecting large flows, defined as flows consuming at least 10% of a link's bandwidth, see SDN and large flows.

The article, Flow collisions, describes a Mininet 2.0 test bed that realistically emulates network performance. In the test bed, link speeds are scaled down to 10Mbit/s so that they can be accurately emulated in software. Therefore, a large flow in the test bed is any flow of 1Mbit/s or greater. The following script uses iperf to generate a test pattern consisting of 20 second constant rate traffic flows ranging from 1Mbit/s to 10Mbit/s:
iperf -c 10.0.0.3 -t 20 -u -b 1M
sleep 10
iperf -c 10.0.0.3 -t 20 -u -b 2M
sleep 10
iperf -c 10.0.0.3 -t 20 -u -b 3M
sleep 10
iperf -c 10.0.0.3 -t 20 -u -b 4M
sleep 10
iperf -c 10.0.0.3 -t 20 -u -b 5M
sleep 10
iperf -c 10.0.0.3 -t 20 -u -b 6M
sleep 10
iperf -c 10.0.0.3 -t 20 -u -b 7M
sleep 10
iperf -c 10.0.0.3 -t 20 -u -b 8M
sleep 10
iperf -c 10.0.0.3 -t 20 -u -b 9M
sleep 10
iperf -c 10.0.0.3 -t 20 -u -b 10M
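The same test pattern can be written more compactly as a loop; the following sketch is functionally equivalent to the commands above:
# 20 second UDP flows at 1-10 Mbit/s, separated by 10 second gaps
for rate in $(seq 1 10); do
  [ $rate -gt 1 ] && sleep 10
  iperf -c 10.0.0.3 -t 20 -u -b ${rate}M
done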
The following command configures sFlow on the virtual switch with a 1-in-10 sampling probability and a 1 second counter export interval:
ovs-vsctl -- --id=@sflow create sflow agent=eth0 target=127.0.0.1 \
sampling=10 polling=1 \
-- set bridge s1 sflow=@sflow \
-- set bridge s2 sflow=@sflow \
-- set bridge s3 sflow=@sflow \
-- set bridge s4 sflow=@sflow
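The settings can be verified by listing the sFlow configuration on the switch (the exact columns in the output depend on the Open vSwitch version):
ovs-vsctl list sflow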
The following sFlow-RT chart shows a second by second view of the test pattern flows constructed from the real-time sFlow data exported by the virtual switch:
The chart clearly shows the test pattern: a sequence of 10 flows starting at 1Mbit/s and increasing in 1Mbit/s steps. Each large flow is detected within a second or two: the minimum size large flow (1Mbit/s) takes the longest to classify as large (i.e. to cross the 1Mbit/s line), and progressively larger flows are classified progressively faster (the largest flow is identified as large in under a second). The chart displays not just the volume of each flow, but also the source and destination MAC addresses, IP addresses, and UDP ports - the detailed information needed to configure control actions to steer the large flows, see Load balancing LAG/ECMP groups and ECMP load balancing.
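As an illustration of how such a flow might be defined, the following sketch installs a comparable flow definition and a 1Mbit/s (125,000 bytes per second) large flow threshold through sFlow-RT's REST API; the flow name largeudp is arbitrary and sFlow-RT is assumed to be listening on its default port, 8008:
# define a flow keyed on MAC addresses, IP addresses and UDP ports, measured in bytes/second
curl -H "Content-Type:application/json" -X PUT --data \
 '{"keys":"macsource,macdestination,ipsource,ipdestination,udpsourceport,udpdestinationport","value":"bytes"}' \
 http://localhost:8008/flow/largeudp/json

# flag flows exceeding 10% of the 10Mbit/s test bed links (1Mbit/s = 125,000 bytes/second)
curl -H "Content-Type:application/json" -X PUT --data \
 '{"metric":"largeudp","value":125000}' \
 http://localhost:8008/threshold/largeudp/json
Flows crossing the threshold can then be read back from the REST API (for example, the /events/json endpoint) by a controller script.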

The results can be further validated using output from iperf. The iperf tool consists of a traffic source (the client) and a target (the server). The following reports from the server confirm the flow volumes, IP addresses and port numbers:
iperf -su
------------------------------------------------------------
Server listening on UDP port 5001
Receiving 1470 byte datagrams
UDP buffer size:  208 KByte (default)
------------------------------------------------------------
[  3] local 10.0.0.3 port 5001 connected with 10.0.0.1 port 37645
[ ID] Interval       Transfer     Bandwidth        Jitter   Lost/Total Datagrams
[  3]  0.0-20.0 sec  2.39 MBytes  1.00 Mbits/sec   0.033 ms    0/ 1702 (0%)
[  4] local 10.0.0.3 port 5001 connected with 10.0.0.1 port 43101
[  4]  0.0-20.0 sec  4.77 MBytes  2.00 Mbits/sec   0.047 ms    0/ 3402 (0%)
[  4]  0.0-20.0 sec  1 datagrams received out-of-order
[  3] local 10.0.0.3 port 5001 connected with 10.0.0.1 port 49970
[  3]  0.0-20.0 sec  7.15 MBytes  3.00 Mbits/sec   0.023 ms    0/ 5102 (0%)
[  3]  0.0-20.0 sec  1 datagrams received out-of-order
[  4] local 10.0.0.3 port 5001 connected with 10.0.0.1 port 46495
[  4]  0.0-20.0 sec  9.54 MBytes  4.00 Mbits/sec   0.033 ms    0/ 6804 (0%)
[  3] local 10.0.0.3 port 5001 connected with 10.0.0.1 port 34667
[  3]  0.0-20.0 sec  11.9 MBytes  5.00 Mbits/sec   0.050 ms    1/ 8504 (0.012%)
[  3]  0.0-20.0 sec  1 datagrams received out-of-order
[  4] local 10.0.0.3 port 5001 connected with 10.0.0.1 port 47284
[  4]  0.0-20.0 sec  14.3 MBytes  6.00 Mbits/sec   0.050 ms    0/10205 (0%)
[  3] local 10.0.0.3 port 5001 connected with 10.0.0.1 port 55425
[  3]  0.0-20.0 sec  16.7 MBytes  7.00 Mbits/sec   0.028 ms    1/11905 (0.0084%)
[  3]  0.0-20.0 sec  1 datagrams received out-of-order
[  4] local 10.0.0.3 port 5001 connected with 10.0.0.1 port 59881
[  4]  0.0-20.0 sec  19.1 MBytes  8.00 Mbits/sec   0.029 ms    2/13605 (0.015%)
[  4]  0.0-20.0 sec  2 datagrams received out-of-order
[  3] local 10.0.0.3 port 5001 connected with 10.0.0.1 port 44822
[  3]  0.0-20.0 sec  21.5 MBytes  9.00 Mbits/sec   0.037 ms    0/15314 (0%)
[  3]  0.0-20.0 sec  1 datagrams received out-of-order
[  4] local 10.0.0.3 port 5001 connected with 10.0.0.1 port 48150
[  4]  0.0-20.1 sec  23.3 MBytes  9.73 Mbits/sec   0.415 ms    1/16631 (0.006%)
[  4]  0.0-20.1 sec  2 datagrams received out-of-order
A further validation of the results is possible using interface counters exported by sFlow (which were configured to export at 1 second intervals):
The chart shows that the flow measurements (based on packet samples) correspond closely to the rates derived from the periodic counter exports (which report the 100% accurate interface counters maintained by the switch ports).
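For a spot check of this comparison, the counter based interface rates can be read back from sFlow-RT's REST API (a sketch, assuming a default sFlow-RT installation on the local host; ifoutoctets is the transmit rate in bytes per second derived from the counter exports):
curl http://localhost:8008/dump/ALL/ifoutoctets/json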

Note: Normally one would not use a 1 second counter export interval with sFlow; the default interval is 30 seconds, and values in the range of 15 - 30 seconds typically satisfy most requirements, see Measurement delay, counters vs. packet samples.



Link Speed     Large Flow       Sampling Rate    Polling Interval
10 Mbit/s      >= 1 Mbit/s      1-in-10          20 seconds
100 Mbit/s     >= 10 Mbit/s     1-in-100         20 seconds
1 Gbit/s       >= 100 Mbit/s    1-in-1,000       20 seconds
10 Gbit/s      >= 1 Gbit/s      1-in-10,000      20 seconds
40 Gbit/s      >= 4 Gbit/s      1-in-40,000      20 seconds
100 Gbit/s     >= 10 Gbit/s     1-in-100,000     20 seconds

The results scale to higher link speeds using the settings in the table above. Configuring sampling rates from this table ensures that large flows (defined as 10% of link bandwidth) are quickly detected and tracked.
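As an example of applying the table, the sketch below reuses the test bed's ovs-vsctl command for a 10 Gbit/s link; the bridge name br0 is a placeholder:
# 10 Gbit/s link: sample 1-in-10,000 packets, export counters every 20 seconds
ovs-vsctl -- --id=@sflow create sflow agent=eth0 target=127.0.0.1 \
 sampling=10000 polling=20 \
 -- set bridge br0 sflow=@sflow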

Note: Readers may be wondering if other approaches to large flow detection, such as OpenFlow metering, NetFlow, or IPFIX, might be suitable for SDN control. These technologies operate by maintaining a flow table within the switch that can be polled, periodically exported, or exported when the flow ends. In all cases the measurements are delayed, limiting their value for SDN control applications like load balancing, see Rapidly detecting large flows, sFlow vs. NetFlow/IPFIX. The large flow test pattern described in this article can be used to test the fidelity of large flow detection systems and to compare their performance.

Looking at the table, a large flow on a 100 Gbit/s link is any flow of 10Gbit/s or more. This may seem like a large number. However, as link speeds increase, applications are being developed to fully utilize their capacity.
From Monitoring at 100 Gigabits/s
The chart from Monitoring at 100 Gigabits/s shows a link carrying three large flows, each around 40 Gigabits/s. In addition, a flow doesn't have to correspond to an individual UDP/TCP connection. An Internet exchange might define flows as traffic between pairs of MAC addresses, or an Internet Service Provider (ISP) might define flows based on destination BGP AS numbers. The software defined analytics architecture supported by sFlow allows flows to be flexibly defined to suit each environment.
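For example, the following sketch defines flows as traffic between pairs of MAC addresses, the kind of definition an Internet exchange might use (again assuming sFlow-RT's REST API on its default port; the flow name macpair is arbitrary):
# track traffic between source and destination MAC address pairs, in bytes per second
curl -H "Content-Type:application/json" -X PUT --data \
 '{"keys":"macsource,macdestination","value":"bytes"}' \
 http://localhost:8008/flow/macpair/json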

Tailoring flow definitions to minimize the number of flows that need to be managed reduces complexity and churn in the controller, and makes the most efficient use of the hardware flow steering capabilities of network switches, which currently support a limited number of general match forwarding rules (see OpenFlow Switching Performance: Not All TCAM Is Created Equal). Using our definition of large flows (>=10% of link bandwidth), at most 10 such flows can be active on each link, so a 48 port switch would require a maximum of 48 x 10 = 480 general match rules to steer all large flows, which is well within the capabilities of current hardware, while leaving small flows to the normal forwarding logic in the switch, see Pragmatic software defined networking.
