Friday, December 30, 2011

Using Ganglia to monitor Memcache clusters


The Ganglia charts show Memcache performance metrics collected using sFlow. Enabling sFlow monitoring in Memcache servers provides a highly scalable solution for monitoring the performance of large Memcache clusters. Embedded sFlow monitoring simplifies deployments by eliminating the need to poll for metrics. Instead, metrics are pushed directly from each Memcache server to the central Ganglia collector. Currently, there is an implementation of sFlow for Memcached; see http://host-sflow.sourceforge.net/relatedlinks.php.

The article, Ganglia 3.2 released, describes the basic steps needed to configure Ganglia as an sFlow collector. Once configured, Ganglia will automatically discover and track new Memcache servers as they are added to the network.
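
For reference, a minimal sketch of the relevant gmond.conf receive channel, assuming gmond is acting as the sFlow collector and listening on the default sFlow port (see the Ganglia 3.2 released article for the full configuration details):

udp_recv_channel {
  port = 6343
}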

Note: To try out Ganglia's sFlow/Memcache reporting, you will need to download Ganglia 3.3.

By default, Ganglia will automatically start displaying the Memcache metrics. However, there are two optional configuration settings available in the gmond.conf file that can be used to modify how Ganglia handles the sFlow Memcache metrics.

sflow{
  accept_memcache_metrics = yes
  multiple_memcache_instances = no
}

Setting the accept_memcache_metrics flag to no will cause Ganglia to ignore sFlow Memcache metrics.

The multiple_memcache_instances setting must be set to yes in cases where there are multiple Memcache instances running on each server in the cluster. Each Memcache instance will be identified by the server port included in the title of the charts. For example, the following chart is reporting on the Memcache server listening on port 11211 on host ganglia:
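
For example, the settings for a cluster running more than one Memcached instance per server would look along these lines:

sflow{
  accept_memcache_metrics = yes
  multiple_memcache_instances = yes
}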


Ganglia and sFlow offer a comprehensive view of the performance of a cluster of Memcache servers, providing not just Memcache related metrics, but also the server CPU, memory, disk and network IO performance metrics needed to fully characterize cluster performance.

Note: A Memcache sFlow agent does more than simply export performance counters, it also exports detailed data on Memcache operations that can be used to monitor hot keys, missed keys, top clients etc. The operation data complements the counter data displayed in Ganglia, helping to identify the root cause of problems. For example, Ganglia was showing that the Memcache miss rate was high and an examination of the transactions identified a mistyped key in the application code as the root cause. In addition, Memcache performance is critically dependent on network latency and packet loss - here again, sFlow provides the necessary visibility since most switch vendors already include support for the sFlow standard.

Thursday, December 29, 2011

Using Ganglia to monitor Java virtual machines


The Ganglia charts show the standard sFlow Java virtual machine metrics. The combination of Ganglia and sFlow provides a highly scalable solution for monitoring the performance of clustered Java application servers. The sFlow Java agent for stand-alone Java services, or Tomcat sFlow for web-based servlets, simplifies deployments by eliminating the need to poll for metrics using a Java JMX client. Instead, metrics are pushed directly from each Java virtual machine to the central Ganglia collector.

Note: The Tomcat sFlow agent also allows Ganglia to report HTTP performance metrics.

The article, Ganglia 3.2 released, describes the basic steps needed to configure Ganglia as an sFlow collector. Once configured, Ganglia will automatically discover and track new servers as they are added to the network. The articles, Java virtual machine and Tomcat, describe the steps needed to instrument existing Java applications and Apache Tomcat servlet engines respectively. In both cases the sFlow agent is included when starting the Java virtual machine and requires minimal configuration and no change to the application code.

Note: To try out Ganglia's sFlow/Java reporting, you will need to download Ganglia 3.3.

By default, Ganglia will automatically start displaying the Java virtual machine metrics. However, there are two optional configuration settings available in the gmond.conf file that can be used to modify how Ganglia handles the sFlow Java metrics.

sflow{
  accept_jvm_metrics = yes
  multiple_jvm_instances = no
}

Setting the accept_jvm_metrics flag to no will cause Ganglia to ignore Java virtual machine metrics.

The multiple_jvm_instances setting must be set to yes in cases where there are multiple Java virtual machine instances running on each server in the cluster. Charts associated with each Java virtual machine instance will be identified by a unique "hostname" included in the title of its charts. For example, the following chart is identified as being associated with the apache-tomcat Java virtual machine on host xenvm4.sf.inmon.com:


Ganglia and sFlow offer a comprehensive view of the performance of a cluster of Java servers, providing not just Java related metrics, but also the server CPU, memory, disk and network IO performance metrics needed to fully characterize cluster performance.

Wednesday, December 28, 2011

Using Ganglia to monitor web farms


The Ganglia charts show HTTP performance metrics collected using sFlow. Enabling sFlow monitoring in web servers provides a highly scalable solution for monitoring the performance of large web farms. Embedded sFlow monitoring simplifies deployments by eliminating the need to poll for metrics or tail log files. Instead, metrics are pushed directly from each web server to the central Ganglia collector. Currently, there are implementations of sFlow for Apache, NGINX, Tomcat and node.js web servers; see http://www.sflow.net/relatedlinks.php.

The article, Ganglia 3.2 released, describes the basic steps needed to configure Ganglia as an sFlow collector. Once configured, Ganglia will automatically discover and track new web servers as they are added to the network.

Note: To try out Ganglia's sFlow/HTTP reporting, you will need to download Ganglia 3.3.

By default, Ganglia will automatically start displaying the HTTP metrics. However, there are two optional configuration settings available in the gmond.conf file that can be used to modify how Ganglia handles the sFlow HTTP metrics.

sflow{
  accept_http_metrics = yes
  multiple_http_instances = no
}

Setting the accept_http_metrics flag to no will cause Ganglia to ignore sFlow HTTP metrics.

The multiple_http_instances setting must be set to yes in cases where there are multiple HTTP instances running on each server in the cluster. Charts associated with each HTTP instance are identified by the server port included in the title of its charts. For example, the following chart is reporting on the web server listening on port 8080 on host xenvm4.sf.inmon.com:


Ganglia and sFlow provide a comprehensive view of the performance of a cluster of web servers, providing not just HTTP related metrics, but also the server CPU, memory, disk and network IO performance metrics needed to fully characterize cluster performance.

Note: An HTTP sFlow agent does more than simply export performance counters, it also exports detailed transaction data that can be used to monitor top URLs, top Referers, top clients, response times etc. The transaction data complements the counter data displayed in Ganglia, helping to identify the root cause of problems. For example, Ganglia was showing a sudden increase in HTTP requests and an examination of the transactions demonstrated that the increase was a denial of service attack, identifying the targeted URL and the list of attacker IP addresses.

Thursday, December 22, 2011

Merchant silicon


The following chart, from Commoditization of Ethernet Switches: How Value is Flowing into Silicon, shows the rapidly increasing market share of network switches based on Broadcom, Marvell and Intel (Fulcrum)  chipsets (often referred to as "merchant silicon") as switch vendors move from proprietary ASICs to off-the-shelf designs.
Off-the-shelf vs. Internal Silicon Design
As an example, many vendors now base their 10 Gigabit top of rack switches on Broadcom chipsets. Often vendors don't disclose when they are using merchant silicon, however, based on news reports, similarities in specifications and rumors, the following switches appear to use similar Broadcom chipsets: IBM BNT RackSwitch G8264, Juniper QFX3500, Cisco Nexus 3064, Arista 7050S-64, HP 5900-AF, Alcatel-Lucent Omniswitch 6900 and Dell Force10 S4810.

In addition to reducing costs, the move to merchant silicon helps increase multi-vendor interoperability and support for standards. For example, the sFlow standard is widely implemented in merchant chipsets and the adoption of merchant silicon for 10 Gigabit top of rack switches has greatly increased the presence of sFlow in data centers. The Network World article, OpenFlow, Merchant Silicon, and the Future of Networking, suggests that the rising popularity of merchant silicon is also helping to drive adoption of the OpenFlow standard.

Together, the sFlow and OpenFlow standards transform data center networking by providing the integrated visibility and control needed to adapt to changing workloads in converged, virtualized and cloud environments.

Thursday, December 8, 2011

Routing Open vSwitch into the mainline

The December 1st issue of LWN.net kernel development news includes the article Routing Open vSwitch into the mainline, describing Open vSwitch and reporting that "Open vSwitch was pulled into the networking tree on December 3; expect it in the 3.3 kernel."

This is exciting news! The inclusion of Open vSwitch support in the mainline Linux kernel integrates the advanced network visibility and control capabilities (through support of sFlow and OpenFlow) needed for virtualizing networking in cloud environments. Open vSwitch is already the default switch in the Xen Cloud Platform (XCP) and Citrix XenServer 6, and inclusion within the Linux kernel will help to further unify networking across open source virtualization systems, including Xen, KVM, Proxmox VE and VirtualBox. In addition, integrated sFlow and OpenFlow support has been demonstrated for the upcoming Windows 8 version of Microsoft's Hyper-V virtualization platform, and both standards are widely supported by network equipment vendors.

Broad support for open standards like sFlow and OpenFlow is critical: it integrates visibility and control capabilities within physical and virtual network elements, allowing orchestration systems such as OpenStack, openQRM, and OpenNebula to automate and optimize the management of network and server resources in cloud data centers.

Monday, December 5, 2011

sflowtool


The sflowtool command line utility is used to convert standard sFlow records into a variety of different formats. While there are a large number of native sFlow analysis applications, familiarity with sflowtool is worthwhile since it allows a wide variety of additional tools to analyze sFlow data and opens up the data to custom scripting.

First download, compile and install sflowtool using the following commands:

[root@xenvm4 ~]# wget http://www.inmon.com/bin/sflowtool-3.22.tar.gz
[root@xenvm4 ~]# tar -xvzf sflowtool-3.22.tar.gz
[root@xenvm4 ~]# cd sflowtool-3.22
[root@xenvm4 sflowtool-3.22]# ./configure
[root@xenvm4 sflowtool-3.22]# make
[root@xenvm4 sflowtool-3.22]# make install

Update 14 August 2015: Download the latest version of sflowtool from GitHub, https://github.com/sflow/sflowtool/archive/master.zip

The default behavior of sflowtool is to convert sFlow into ASCII text:

[root@xenvm4 ~]# sflowtool
startDatagram =================================
datagramSourceIP 10.0.0.111
datagramSize 144
unixSecondsUTC 1321922602
datagramVersion 5
agentSubId 0
agent 10.0.0.20
packetSequenceNo 3535127
sysUpTime 270660704
samplesInPacket 1
startSample ----------------------
sampleType_tag 0:2
sampleType COUNTERSSAMPLE
sampleSequenceNo 228282
sourceId 0:14
counterBlock_tag 0:1
ifIndex 14
networkType 6
ifSpeed 100000000
ifDirection 0
ifStatus 3
ifInOctets 4839078
ifInUcastPkts 15205
ifInMulticastPkts 0
ifInBroadcastPkts 4294967295
ifInDiscards 0
ifInErrors 0
ifInUnknownProtos 4294967295
ifOutOctets 149581962744
ifOutUcastPkts 158884229
ifOutMulticastPkts 4294967295
ifOutBroadcastPkts 4294967295
ifOutDiscards 101
ifOutErrors 0
ifPromiscuousMode 0
endSample   ----------------------
endDatagram   =================================

The text output of sflowtool is easily processed using scripts. The following example provides a basic skeleton for processing the output of sflowtool in Perl:

#!/usr/bin/perl -w
use strict;
use POSIX;

open(PS, "/usr/local/bin/sflowtool|") || die "Failed: $!\n";
while( <PS> ) {  
  my ($attr,$value) = split;
 
  # process attribute  
}

close(PS);
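
Building on this skeleton, the following sketch counts the flow and counter samples received from each sFlow agent and prints a summary when interrupted. The attribute names match the sflowtool output shown above, and the sflowtool path is assumed to be /usr/local/bin/sflowtool:

#!/usr/bin/perl -w
use strict;

my %samples;
my $agent = "";

# print a summary of sample counts per agent on Ctrl-C
$SIG{INT} = sub {
  foreach my $a (sort keys %samples) {
    foreach my $type (sort keys %{$samples{$a}}) {
      print "$a $type $samples{$a}{$type}\n";
    }
  }
  exit 0;
};

open(PS, "/usr/local/bin/sflowtool|") || die "Failed: $!\n";
while( <PS> ) {
  my ($attr,$value) = split;
  next unless defined $attr && defined $value;

  # remember the agent address for the current datagram
  $agent = $value if $attr eq "agent";

  # count FLOWSAMPLE and COUNTERSSAMPLE records per agent
  $samples{$agent}{$value}++ if $attr eq "sampleType";
}
close(PS);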

Examples of scripts using sflowtool on this blog include Memcached hot keys and Memcached missed keys. Other examples include converting sFlow for Graphite and RRDtool.

The sFlow standard extends to application layer monitoring, including visibility into HTTP performance. Implementations of sFlow for popular web servers, including Apache, NGINX, Tomcat and node.js offer real-time visibility into large web farms.

The -H option causes sflowtool to output the HTTP request samples using the combined log format, making the data accessible to most log analyzers.

[root@xenvm4 ~]# sflowtool -H
10.0.0.70 - - [22/Nov/2011:12:36:32 -0800] "GET http://sflow.org/images/h-photo.jpg HTTP/1.1" 304 0 "http://sflow.org/" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_7_2) AppleWebKit/535.2 (KHTML, like Gecko) Chrome/15.0.874.120 Safari/535.2"
10.0.0.70 - - [22/Nov/2011:12:36:32 -0800] "GET http://sflow.org/inc/nav.js HTTP/1.1" 304 0 "http://sflow.org/" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_7_2) AppleWebKit/535.2 (KHTML, like Gecko) Chrome/15.0.874.120 Safari/535.2"
10.0.0.70 - - [22/Nov/2011:12:36:32 -0800] "GET http://sflow.org/images/participant-foundry.gif HTTP/1.1" 304 0 "http://sflow.org/" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_7_2) AppleWebKit/535.2 (KHTML, like Gecko) Chrome/15.0.874.120 Safari/535.2"

For example, the following commands use sflowtool and webalizer to create reports:

/usr/local/bin/sflowtool -H | rotatelogs log/http_log &
webalizer -o report log/*

The resulting webalizer report shows top URLs:


The sFlow standard operates by randomly sampling packet headers. The sflowtool -t option allows sFlow to be used for remote packet capture, converting packet header information from sFlow to standard pcap format that can be used with packet analysis applications.

The following example uses sflowtool and tcpdump to display a packet trace:

[root@xenvm4 ~]# sflowtool -t | tcpdump -r - -vv
reading from file -, link-type EN10MB (Ethernet)
10:30:01.000000 arp who-has 10.0.0.66 tell 10.0.0.220
10:30:07.000000 IP (tos 0x0, ttl  64, id 49952, offset 0, flags [DF], proto: TCP (6), length: 1500) xenserver1.sf.inmon.com.39120 > openfiler.sf.inmon.com.iscsi-target: . 2757963136:2757964584(1448) ack 4136690254 win 3050 
10:30:07.000000 IP (tos 0x0, ttl  64, id 49953, offset 0, flags [DF], proto: TCP (6), length: 1500) xenserver1.sf.inmon.com.39120 > openfiler.sf.inmon.com.iscsi-target: . 1448:2896(1448) ack 1 win 3050 
10:30:07.000000 IP (tos 0x0, ttl  64, id 49954, offset 0, flags [DF], proto: TCP (6), length: 1500) xenserver1.sf.inmon.com.39120 > openfiler.sf.inmon.com.iscsi-target: . 2896:4344(1448) ack 1 win 3050

The Wireshark article describes how to use sflowtool and Wireshark to graphically display packet information.


sflowtool can also be used to convert sFlow to NetFlow version 5. The following command converts sFlow records into NetFlow records and sends them to UDP port 9991 on netflow.inmon.com:

[root@xenvm4 ~]# sflowtool -c netflow.inmon.com -d 9991

Converting sFlow to NetFlow provides compatibility with NetFlow analyzers.  However, converting sFlow to NetFlow results in a significant loss of information and it is better to use a native sFlow analyzer to get the full value of sFlow. In many cases traffic analysis software supports both sFlow and NetFlow, so conversion is unnecessary.

Finally, sFlow provides information on network, server, virtual machine and application performance and the sflowtool source code offers developers a useful starting point for adding sFlow support to network, server and application performance monitoring software - see Developer resources for additional information.

Wednesday, November 23, 2011

Wireshark


Wireshark (previously called Ethereal) is a popular, free, open source protocol analyzer. This article will demonstrate how Wireshark can be used with sFlow to remotely capture traffic. For background, the article Packet capture describes some of the reasons why the multi-vendor sFlow standard should be considered as an option for packet capture, particularly in high-speed, switched Ethernet, environments.

The first step is to configure the network switches to monitor selected links and send sFlow to the host that will be used for packet analysis -  configuration instructions for most switch vendors are available on this blog. Alternatively, if sFlow is already being used for network-wide visibility then obtaining an sFlow feed can be as simple as directing the sFlow analyzer to forward sFlow to Wireshark.

The article CaptureSetup/Pipes describes how Wireshark can be configured to receive packets on a pipe. The following command launches Wireshark, using sflowtool to extract packets from the sFlow feed and pipe them into Wireshark:

[root@xenvm4 ~]# wireshark -k -i <(sflowtool -t)

Wireshark provides a real-time, graphical display of captured packets. The following screen shot shows packets captured using sFlow:

Packet trace in Wireshark captured using sFlow

In addition to being able to decode and filter packets, Wireshark has a number of statistical reporting capabilities. The following screen shot shows protocol statistics generated using captured sFlow data:

Protocol statistics in Wireshark captured using sFlow

When looking at sFlow statistics in Wireshark, it is important to remember that sFlow is a sampling technology and that the numbers should be scaled up by the sampling rate. In this case a sampling rate of 1 in 1000 was configured, so while the percentages are correct, the Packets, Bytes and Mbit/s numbers need to be multiplied by 1000. Looking at the top, highlighted, line the total values should be 24,000 packets, 25 Megabytes and 2 Mbit/s (not the 24 packets, 24 Kilobytes and 0.002 Mbit/s shown in the table).

Because sFlow is a packet sampling technology there are limits to the kinds of protocol analysis, such as following complete TCP streams, that you can do in Wireshark. However, there are offsetting benefits. If you don't know which links to tap to solve a problem, you can use sFlow to cast a wide net and capture packets from hundreds, or even thousands, of links simultaneously. Using sFlow also lets you easily monitor 1, 10, 40 and 100GigE ports without overwhelming Wireshark.

In addition to its graphical interface, Wireshark also offers a text-only interface to facilitate scripting. The tshark command runs Wireshark in text mode, providing similar functionality to tcpdump. The following example uses sflowtool to extract packets from the sFlow feed and pipe them into tshark:

[root@xenvm4 ~]# tshark -i<(sflowtool -t)
Running as user "root" and group "root". This could be dangerous.
Capturing on /dev/fd/63
  0.000000    10.0.0.16 -> 10.0.0.18    TCP 37366 > iscsi-target [PSH, ACK] Seq=1 Ack=1 Win=3050 Len=1200 TSV=472366446 TSER=1180632633
  5.000000    10.0.0.16 -> 10.0.0.18    TCP twamp-control > nfs [ACK] Seq=1 Ack=1 Win=2560 Len=1448 TSV=472366931 TSER=1180633845[Packet size limited during capture]
  5.000000    10.0.0.16 -> 10.0.0.18    TCP twamp-control > nfs [ACK] Seq=1449 Ack=1 Win=2560 Len=1448 TSV=472366931 TSER=1180633845

Wireshark's interactive filtering and browsing capabilities, combined with an extensive library of protocol decodes, provides the detail needed to diagnose network problems using packet headers captured by switches using sFlow. The protocol analysis capabilities of Wireshark complement the network-wide visibility provided by an sFlow analyzer, extracting additional details that are useful for troubleshooting.

Tuesday, November 22, 2011

Packet capture


Why use sFlow for packet analysis? To rephrase the Heineken slogan, sFlow reaches the parts of the network that other technologies cannot reach. The sFlow standard is widely supported by switch vendors, embedding wire-speed packet monitoring throughout the network. With sFlow, any link or group of links can be remotely monitored. The alternative approach of physically attaching a probe to a SPAN/Mirror port is becoming much less feasible with increasing network sizes (10's of thousands of switch ports) and link speeds (10, 40 and 100 Gigabits). Using sFlow for packet capture doesn't replace traditional packet analysis, instead sFlow extends the capabilities of existing packet capture tools into the high speed switched network.

This article uses the tcpdump packet analyzer, readily available on most platforms, to demonstrate how to use sFlow to remotely capture and analyze network traffic.

The first step is to configure the network switches to monitor selected links and send sFlow to the host that will be used for packet analysis -  configuration instructions for most switch vendors are available on this blog. Alternatively, if sFlow is already being used for network-wide visibility then obtaining an sFlow feed can be as simple as directing the sFlow analyzer to forward sFlow to the packet analyzer.

Next, perform packet analysis on the host. The following command displays a packet trace, using sflowtool to extract packets from the sFlow feed and pipe them into tcpdump:

[root@xenvm4 ~]# sflowtool -t | tcpdump -r - -vv
reading from file -, link-type EN10MB (Ethernet)
10:30:01.000000 arp who-has 10.0.0.66 tell 10.0.0.220
10:30:07.000000 IP (tos 0x0, ttl  64, id 49952, offset 0, flags [DF], proto: TCP (6), length: 1500) xenserver1.sf.inmon.com.39120 > openfiler.sf.inmon.com.iscsi-target: . 2757963136:2757964584(1448) ack 4136690254 win 3050 
10:30:07.000000 IP (tos 0x0, ttl  64, id 49953, offset 0, flags [DF], proto: TCP (6), length: 1500) xenserver1.sf.inmon.com.39120 > openfiler.sf.inmon.com.iscsi-target: . 1448:2896(1448) ack 1 win 3050 
10:30:07.000000 IP (tos 0x0, ttl  64, id 49954, offset 0, flags [DF], proto: TCP (6), length: 1500) xenserver1.sf.inmon.com.39120 > openfiler.sf.inmon.com.iscsi-target: . 2896:4344(1448) ack 1 win 3050 

Note: Using sflowtool to convert sFlow into standard pcap format makes the sFlow data accessible to the wide variety of packet analysis applications that support the standard.
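
The pcap output can also be saved to a file for later, offline, analysis; for example (the file name trace.pcap is arbitrary):

[root@xenvm4 ~]# sflowtool -t > trace.pcap
[root@xenvm4 ~]# tcpdump -r trace.pcap -vv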

Next time you have to diagnose a network problem, rather than spending the night in the data center with a crash cart, stay at your desk and try out remote monitoring with sFlow. It may not be the solution to all problems, but it is surprising how many can be quickly resolved without leaving your desk.

Thursday, November 17, 2011

Monitoring at 100 Gigabits/s

Chart 1: Top Connections on a 100 Gigabit link

Chart 1 shows the top connections on a 100 Gigabit Ethernet link monitored at the 2011 International Conference for High Performance Computing, Networking, Storage and Analysis (SC11).

Chart 2: Packet counters on 100 Gigabit link

Chart 2 shows the packet rates on the link, approximately 10 million packets per second ingress and 5 million packets per second egress. The maximum packet rate on a 100 Gigabit, full duplex, link is approximately 300 million packets per second (150 million in each direction), making traffic monitoring an interesting challenge.
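
Note: The 300 million packets per second figure is a back-of-the-envelope estimate based on minimum size Ethernet frames, assuming 64 byte frames plus 20 bytes of preamble and inter-frame gap per packet:

100,000,000,000 bits/s ÷ (84 bytes × 8 bits/byte) ≈ 150 million packets/s in each direction
150 million packets/s × 2 directions ≈ 300 million packets/s on a full duplex link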

The article, Ok, but how much time do I have? discusses some of the challenges in monitoring at 10 Gigabit speeds: at 100 Gigabits the challenge is 10 times greater, requiring that the probe process each packet within 3 nanoseconds to provide wire-speed monitoring of a full-duplex link. Probe vendors (e.g. the upcoming EndaceExtreme probe) are working to meet this challenge using custom hardware. However, the costs associated with probes and the added operational complexity of maintaining a probe-based solution are prohibitive for most applications, particularly if large numbers of links need to be monitored.

In this instance, the switch hardware, a Brocade MLXe, includes support for the sFlow traffic monitoring standard. Embedding the instrumentation in the switch hardware delivers continuous, wire-speed, monitoring of all switch ports: the switch has a total of 15.36 Terabits per second (4.8 billion packets per second) of routing capacity.

Monitoring using sFlow has minimal overhead and is typically enabled on every interface on every switch to provide network-wide visibility. A central sFlow analyzer continuously monitors all the interfaces and can be queried to generate traffic reports. In this case InMon's Traffic Sentinel was used to monitor the switches in the SC11 network (SCinet) and generate the charts shown in this article.

The sFlow standard is widely supported by switch vendors. Selecting switches with sFlow support when upgrading or building out a new network provides comprehensive visibility into network performance at minimal cost. Retrofitting monitoring with probes is expensive and provides limited coverage.

Wednesday, November 16, 2011

SC11 OpenFlow testbed

ESnet/ECSEL Demo at SC11
The SCinet Research Sandbox, a part of the 2011 International Conference for High Performance Computing, Networking, Storage and Analysis (SC11), is being used to demonstrate OpenFlow applications using switches from IBM (BNT), HP, NEC and Pronto. Of the four switch models that form the testbed, three support standard sFlow monitoring (IBM, HP and NEC), providing detailed visibility into traffic within the testbed.

In the End-to-End Circuit Service at Layer 2 (ECSEL) demonstration, OpenFlow is used to provision RDMA over Converged Ethernet (RoCE) connectivity through an NEC switch.
Chart 1: Top Connections in OpenFlow Testbed

Chart 1 shows RoCE as the dominant traffic in the testbed, saturating the 10G links. This chart demonstrates how convergence is changing the nature of data center traffic as new storage and clustering workloads place heavy demands on the shared network. The sFlow standard embeds monitoring within the network switches to provide the network-wide visibility needed to manage increased demand for bandwidth.

Note: The RoCE protocol, along with FCoE, AoE and many other data center protocols, operates at layer 2, so traditional approaches that rely on layer 3 IP flow records from routers and firewalls are of limited value.

The OpenFlow protocol provides a way for external software to control the forwarding decisions within switches. In this case, OpenFlow is being used to provision the network to carry the RoCE connections. Maintaining connectivity between the switches and the controllers is essential in order to maintain control of the network.
Chart 2: Traffic between OpenFlow Switches and OpenFlow Controllers (NOX)

Chart 2 shows that OpenFlow control traffic is being carried in-band and that it is consuming very little network bandwidth. However, in-band control traffic is potentially vulnerable to interference from large bandwidth consumers (like RoCE).
Chart 3: OpenFlow Traffic and Network Priority

Chart 3 shows that the Quality of Service (QoS) policy for all the OpenFlow connections is Best Effort (BE). Best Effort is the default priority class for traffic in the network, leaving the OpenFlow control channels vulnerable to network congestion. Assigning a high priority to the OpenFlow protocol would ensure that OpenFlow messages could traverse the network during periods of congestion, allowing the controller to take corrective action to mitigate the congestion.

OpenFlow and sFlow
The paper, sFlow and OpenFlow, describes how the OpenFlow and sFlow standards complement one another. Monitoring using sFlow provides visibility into network traffic that allows the OpenFlow controller to dynamically allocate network resources based on changing network demands.

More generally, the Data center convergence, visibility and control presentation describes the critical role that measurement plays in managing costs and optimizing performance in converged, virtualized and cloud environments.

Tuesday, November 15, 2011

Eye of Sauron

Credit: The Lord of the Rings: Return of the King

In The Lord of the Rings: The Return of the King, Sauron's eye is drawn to movement, making it hard for his enemies to escape notice. The sFlow packet sampling mechanism operates in a similar way, devoting resources where they are most needed in order to provide network-wide visibility.

In a typical sFlow deployment, every port on every switch is configured to sample traffic with fixed probabilities. This strategy for setting sampling rates is effective because the distribution of traffic in data centers is extremely irregular: only a small number of links are busy at any given moment and the set of busy links can change quickly. As the traffic on a link increases, additional samples are generated, allowing the central sFlow analyzer to immediately detect the increased traffic and the path the traffic takes across the network. When the link traffic decreases, fewer samples are generated, reducing the load on the sFlow analyzer so that it can focus on active parts of the network.

The sFlow standard offers network-wide surveillance with the scalability to monitor tens of thousands of links. As network convergence and virtualization put increasing pressure on the network, visibility is essential for the effective control of network resources needed to deliver reliable services. Building a network visibility strategy around sFlow maximizes the choice of vendors, ensures interoperable monitoring in mixed vendor environments, eliminates vendor lock-in and facilitates "best in class" product selection.

Tuesday, November 8, 2011

DevOps

Credit: Wikimedia
DevOps is an emerging set of principles, methods and practices for communication, collaboration and integration between software development and IT operations professionals - Wikipedia.

The article Instrumentation and Observability describes the critical role that instrumentation plays in the DevOps process, "To progress, one must ask questions. These questions must be answered." The article goes on to state, "To observe a situation without changing it is the ultimate achievement." Finally, the case is made for pervasively embedding instrumentation within the production environment, "applications should expose this information as a consequence of normal behavior."

The sFlow standard embeds lightweight instrumentation within switches, servers and applications throughout the data center. sFlow is highly scalable, combining an efficient "push" mechanism with statistical sampling in order to provide continuous, real-time, data center wide visibility.

The article, Host-based sFlow: the drop-in, cloud-friendly monitoring standard, describes some of the operational benefits of sFlow. The granular visibility into scale-out web applications provided by sFlow facilitates DevOps by allowing software developers to see how services perform at scale and identify bottlenecks that can be eliminated through continuous refinement of application logic. At the same time, visibility into application transactions, response times and throughput allows operations teams to flexibly allocate network and server resources as demand changes, controlling costs and ensuring optimal performance.

Tuesday, October 4, 2011

Comparing sFlow and NetFlow in a vSwitch



As virtualization shifts the network edge from top of rack switches to software virtual switches running on the hypervisors, visibility in the virtual switching layer is essential in order to provide network, server and storage management teams with the information needed to coordinate resources and ensure optimal performance.

The recent release of Citrix XenServer 6.0 provides an opportunity for a side-by-side comparison of sFlow and NetFlow monitoring technologies since both protocols are supported by the Open vSwitch that is now the default XenServer network stack.

The diagram above shows the experimental setup. Traffic between the virtual machines VM1 and VM2 passes through the Virtual Switch where sFlow and NetFlow measurements are simultaneously generated. The sFlow is sent to an sFlow Analyzer (InMon sFlowTrend) and the NetFlow to a NetFlow Analyzer (SolarWinds Real-Time NetFlow Analyzer). Both tools are running in tandem, making it easy to perform side-by-side comparisons and see differences in the visibility that NetFlow and sFlow provide into the same underlying traffic.

Note: XenServer 6.0, sFlowTrend and Real-Time NetFlow Analyzer are all available at no charge, making it easy for anyone to reproduce these tests.

Configuration

The Host sFlow supplemental pack was installed to automate sFlow configuration of the Open vSwitch and to export standard sFlow Host metrics. The following /etc/hsflowd.conf file sets the packet sampling rate to 1-in-400, counter polling interval to 20 seconds and sends sFlow to sFlowTrend running on host 10.0.0.42 and listening on UDP port 6343.

sflow{
  DNSSD = off
  polling = 20
  sampling = 400
  collector{
    ip = 10.0.0.42
    udpport = 6343
  }
}

The following command was used to manually configure NetFlow monitoring, sending NetFlow to the Real-Time NetFlow Analyzer running on host 10.0.0.42 and listening on UDP port 2055:

ovs-vsctl -- set Bridge xenbr0 netflow=@nf \
-- --id=@nf create NetFlow targets=\"10.0.0.42:2055\" active-timeout=60
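
Note: Although the Host sFlow supplemental pack configures sFlow on the Open vSwitch automatically, sFlow can also be configured manually with ovs-vsctl in much the same way as NetFlow. The following sketch mirrors the hsflowd.conf settings above; the agent interface name, eth0, is an assumption for illustration:

ovs-vsctl -- --id=@s create sflow agent=eth0 target=\"10.0.0.42:6343\" \
header=128 sampling=400 polling=20 -- set Bridge xenbr0 sflow=@s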

Results

The following charts show the top protocols measured using sFlow and NetFlow:

Top Protocols in sFlowTrend
Top Protocols in Real-Time NetFlow Analyzer
Looking at the two charts, both show similar average traffic levels. The sFlowTrend chart shows the ingress Memcache (TCP:11211) traffic at between 0.7 and 0.9 Mb/s. Looking at the Real-Time NetFlow Analyzer total traffic table, 464.41 Mb were seen over the last 11 minutes 47 seconds, giving an average rate of 0.66 Mb/s. The sFlowTrend measurements are consistently higher since they include the bandwidth consumed by layer 2 headers, whereas NetFlow only reports on layer 3 bytes. However, the layer 2 overhead can be estimated by assuming an additional 18 bytes per packet (MAC source, MAC destination, type and CRC) and multiplying by the total packet count (492,036), resulting in an additional 0.1 Mb/s and bringing the NetFlow measurement to 0.76 Mb/s, in agreement with the sFlow measurements.
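
For reference, the layer 2 overhead estimate works out as follows (11 minutes 47 seconds is 707 seconds):

492,036 packets × 18 bytes × 8 bits/byte = 70,853,184 bits
70,853,184 bits ÷ 707 seconds ≈ 0.1 Mb/s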

Note: The overhead associated with Ethernet headers and tunneling protocols can represent a significant fraction of overall bandwidth. By exporting packet headers, sFlow provides detailed information on the encapsulations and their overhead. NetFlow does not provide a direct measure of total bandwidth.

The periodic, 60 second, spikes in traffic shown on the NetFlow Analyzer chart are an artifact of the way NetFlow reports on long running connections. With NetFlow, packet and byte counters are maintained for each connection in a flow cache within the switch. When the connection terminates, a flow record is generated containing the connection information and counters. The active-timeout setting in the NetFlow configuration is used to ensure visibility into long running connections, causing the switch to periodically export NetFlow records for active connections. In contrast, sFlow does not use a flow cache, instead sampled packet headers are continually exported, resulting in real-time charts that more accurately reflect the traffic trend.

In addition, exporting packet headers allows an sFlow analyzer to monitor all types of traffic flowing across the switch; note the ARP and IPv6 traffic displayed in sFlowTrend in addition to the TCP/UDP flows. Visibility into layer 2 traffic is particularly important in switched environments where protocols such as DHCP/BOOTP, STP, LLDP and ARP need to be closely managed. sFlow also provides visibility into networked storage, including Ethernet SAN technologies (e.g. FCoE or AoE), which typically dominates bandwidth usage in the data center. Looking forward, there are a number of tunneling protocols being developed to connect virtual switches, including GRE, MPLS, VPLS, VXLAN and NVGRE. As new protocols are deployed on the network they are easily monitored without any change to existing sFlow agents, ensuring end-to-end visibility across the physical and virtual network.

In contrast, NetFlow relies on the switch to decode the traffic. In this case the switch is exporting NetFlow version 5 which only exports records for IPv4 traffic. The NetFlow analyzer is thus only able to report on IPv4 protocols, all other traffic is invisible. This limitation is not unique to Open vSwitch; NetFlow version 5 is the most widely supported version of NetFlow in network devices and is also the version exported by VMware vSphere 5.0.

The next two charts show top connections flowing through the virtual switch:

Top Connections in sFlowTrend
Top Connections in Real-Time NetFlow Analyzer
The Top Connections charts further demonstrate the limitation in NetFlow visibility where only IPv4 flows are shown. The sFlow analyzer is able to report in detail on all types of traffic flowing through the switch, in this case showing details of IPv6 traffic in addition to IPv4 flows.

The next two charts show interface utilization and packet counts from sFlowTrend:

Link Utilization in sFlowTrend
Link Counters in sFlowTrend
This type of interface trending is a staple of network management, but obtaining the information is challenging in virtual environments. While SNMP is typically used to obtain this information from network equipment, servers are much less likely to be managed using SNMP and so SNMP polling is often not an option. In addition, there may be large numbers of virtual ports associated with each physical switch port. In a virtual environment with 10,000 physical switch ports you might need to monitor as many as 200,000 virtual ports. Even if SNMP agents were installed on all the servers, SNMP polling does not scale well to large numbers of interfaces. The integrated counter polling mechanism built into sFlow provides scalable monitoring of the utilization of every switch port in the network, both physical and virtual, quickly identifying problems wherever they may occur in the network.

In contrast, NetFlow only reports on traffic flows so neither of these charts is available in the NetFlow Analyzer. The remaining charts are based on sFlow counter data so there are no corresponding NetFlow Analyzer charts.

The next sFlowTrend chart shows the CPU load on the hypervisor:

Hypervisor CPU in sFlowTrend
The virtual switch is a software component running on the hypervisor, so if the hypervisor is overloaded, network performance will degrade. The sFlow counter polling mechanism extends to system performance counters in addition to the interface counters shown earlier, allowing the sFlow analyzer to display hypervisor CPU utilization. In this case the chart shows a small spike in system CPU utilization corresponding to the spike in traffic at 9:52AM.

The next sFlowTrend chart shows a trend in disk IO on the virtual machine:

Virtual Machine Disk IO in sFlowTrend
This chart shows that the burst in iSCSI traffic shown in the Top Protocols chart corresponds to a spike in read activity on the virtual machine. Again, sFlow's counter push mechanism efficiently exports information about the performance of virtual machines, allowing the interaction between network and system activity to be understood.

Comments

NetFlow provides limited visibility, focusing on layer 3 network connections. The NetFlow architecture relies on complex functionality within the switches and the complexity of configuring and maintaining NetFlow adds to operational costs and limits scalability. For example, gaining visibility into IPv6 traffic requires firmware (and often hardware) upgrades to the network infrastructure that can be challenging in large scale, always-on, cloud environments.

In contrast, adding support for additional protocols in sFlow requires no change to the network infrastructure, but is simply a matter of upgrading the sFlow analyzer. The sFlow architecture eliminates complexity from the agents, increasing scalability and reducing the operational costs associated with configuration and maintenance. sFlow provides comprehensive visibility into network and system resources needed to manage performance in virtualized and cloud environments.

Friday, September 30, 2011

XenServer 6.0 supplemental pack


Today, the press release, Citrix Optimizes XenServer for the Cloud Era, announced the release of XenServer 6.0. Of particular interest is the inclusion of Open vSwitch (OVS) as the default network stack for XenServer. OVS includes support for the sFlow standard, seamlessly integrating monitoring of physical and virtual networking to provide the end-to-end visibility needed to manage cloud performance.

The Host sFlow agent is available as a XenServer 6.0 supplemental pack, offering a simple way to enable sFlow in OVS and further simplifying operations by exporting standard server and virtual machine statistics. Unifying network and system monitoring provides the integrated view of performance needed for the coordinated management of resources that is essential in virtualized environments.

This article describes the steps needed to install the supplemental pack and configure sFlow monitoring.

Note:  The default vSwitch in XenServer 6.0 includes sFlow support. See the XenServer 5.6FP1/FP2 instructions to enable sFlow on the older platforms.

First, download the Host sFlow XenServer supplemental pack (xenserver60-hsflowd-X_XX.iso).

Then, either copy the file to the host and run the following commands:

mkdir /tmp/iso 
mount -o loop xenserver60-hsflowd-X_XX.iso /tmp/iso 
cd /tmp/iso 
./install.sh 
cd 
umount /tmp/iso

Alternatively, burn the ISO file onto a CD and run the following commands to install:

mount /dev/cdrom /mnt
cd /mnt
./install.sh
cd
umount /mnt

Next, use the following commands to start the monitoring daemons:

service hsflowd start
service sflowovsd start

The following steps configure all the sFlow agents to sample packets at 1-in-512, poll counters every 20 seconds and send sFlow to an analyzer (10.0.0.50) over UDP using the default sFlow port (6343).

Note: A previous posting discussed the selection of sampling rates.

The default configuration method used for sFlow is DNS-SD; enter the following DNS settings in the site DNS server:

analyzer A 10.0.0.50

_sflow._udp SRV 0 0 6343 analyzer
_sflow._udp TXT (
"txtvers=1"
"polling=20"
"sampling=512"
)

Note: These changes must be made to the DNS zone file corresponding to the search domain in the XenServer /etc/resolv.conf file. If you need to modify the search domain, do not edit the resolv.conf file directly since the changes will be lost on a system reboot, instead either follow the directions in How to Add a Permanent Search Domain Entry in the Resolv.conf File of a XenServer Host, or simply edit the DNSSD_domain setting in the /etc/hsflowd.conf file to specify the domain to use to retrieve DNS-SD settings.
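
Once the records have been added, they can be verified from one of the hosts using dig; a sketch, assuming the search domain is sf.inmon.com:

dig +short _sflow._udp.sf.inmon.com SRV
dig +short _sflow._udp.sf.inmon.com TXT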

Once the sFlow settings are added to the DNS server, they will be automatically picked up by the Host sFlow agents. If you need to change the sFlow settings, simply change them on the DNS server and the change will automatically be applied to all the XenServer systems in the data center.

Alternatively, manual configuration is an option if you do not want to use DNS-SD. Simply edit the Host sFlow agent configuration file, /etc/hsflowd.conf, on each XenServer:

sflow{
  DNSSD = off
  polling = 20
  sampling = 512
  collector{
    ip = 10.0.0.50
    udpport = 6343
  }
}

After editing the configuration file you will need to restart the Host sFlow agent:

service hsflowd restart

An sFlow analyzer is needed to receive the sFlow data and report on performance (see Choosing an sFlow analyzer). The free sFlowTrend analyzer is a great way to get started, see sFlowTrend adds server performance monitoring to see examples.

Thursday, September 15, 2011

Microsoft Hyper-V


This week, InMon Corp. demonstrated sFlow monitoring in Microsoft Hyper-V during the Microsoft BUILD conference in Anaheim, California. InMon and Microsoft are working together to integrate sFlow support in Hyper-V 3.0, the hypervisor that will be included in the upcoming Windows Server 8 and Windows 8 releases.

The video above describes the integration of sFlow monitoring in Hyper-V. The extensible virtual switch is a key component of the Hyper-V 3.0 release. The following diagram from the video shows how sFlow is implemented as a virtual switch extension.
The sFlow extension is inserted into the packet forwarding path, randomly sampling packets that flow through the virtual switch. sFlow's packet sampling mechanism is extremely lightweight, delivering detailed visibility while ensuring minimal impact on forwarding performance. Sampled packets are sent to a userspace sFlow agent that exports the sampled packet headers, along with physical and virtual NIC interface counters and server and virtual machine CPU, memory and I/O performance, to an sFlow collector.


The photo above is of a live demonstration of the Hyper-V sFlow extension. The screen shows a real-time chart of network traffic flowing between virtual machines generated using the free sFlowTrend collector.

The sFlow standard is widely supported by switch vendors. Including sFlow monitoring in the Hyper-V virtual switch simplifies network management by allowing a common set of tools to monitor physical and virtual network performance. sFlow in Hyper-V further simplifies management by exporting server and virtual machine statistics. Unifying network and system monitoring provides the integrated view of performance needed for the coordinated management that is essential in virtualized environments. In addition to monitoring data center resources, sFlow can also be used to monitor performance within public clouds (see Rackspace cloudservers and Amazon EC2), providing the integrated visibility needed to optimize workload placement and minimise operating costs in hybrid cloud environments.

For public or private cloud operators, support for sFlow in the Hyper-V extensible switch and the Open vSwitch embeds visibility in leading commercial hypervisors (Microsoft Hyper-V and Citrix XenServer), open source hypervisors (Xen Cloud Platform (XCP), Xen, KVM, Proxmox VE and VirtualBox) and cloud management systems (OpenStack, OpenQRM and OpenNebula), providing the scalable visibility needed for accounting and billing, operational control and cost management.

Friday, September 2, 2011

Java virtual machine


The jmx-sflow-agent project is an open source implementation of sFlow monitoring for Java Virtual Machines (JVM). Instrumenting Java using sFlow provides scalable, real-time monitoring for applications such as Hadoop, Cassandra and Tomcat that typically involve large numbers of virtual machines.

The sFlow Host Structures extension describes a set of standard metrics for monitoring physical and virtual machines. The jmx-sflow-agent complements the Host sFlow agent, which exports the physical server metrics.

The jmx-sflow-agent exports the generic virtual machine structures. In addition, the jmx-sflow-agent uses the Java Management Extensions (JMX) interface to collect performance statistics about Java threads, heap/non-heap memory, garbage collection, compilation and class loading. These additional metrics are exported along with the generic virtual machine statistics.

The jmx-sflow-agent software is designed to integrate with the Host sFlow agent to provide a complete picture of server performance. Download, install and configure Host sFlow before proceeding to install the jmx-sflow-agent - see Installing Host sFlow on a Linux Server. There are a number of options for analyzing cluster performance using Host sFlow, including Ganglia and sFlowTrend.

Next, download the sflowagent.jar file from https://github.com/sflow/jmx-sflow-agent. Copy the sflowagent.jar file into the same directory as your Java application. Enabling the sFlow instrumentation involves including the -javaagent:sflowagent.jar argument when starting the Java application.

The following example shows how instrumentation is added to MyApp:

java -javaagent:sflowagent.jar\
     -Dsflow.hostname=vm.test\
     -Dsflow.uuid=564dd23c-453d-acd2-fa64-85da86713620\
     -Dsflow.dsindex=80001\
     MyApp

Arguments can be passed to the sFlow module as system properties:

  • sflow.hostname, optionally assign a "hostname" to identify the virtual machine. Choose a naming strategy that is helpful in identifying virtual machine instances, for example: "hadoop.node1" etc. The hostname is exported in the sFlow host_descr structure.
  • sflow.uuid, optionally assign a UUID to the virtual machine so that it can be uniquely identified. The UUID is exported in the sFlow host_descr structure.
  • sflow.dsindex, uniquely identifies the data source associated with this virtual machine on this server. The dsindex number only needs to be set if more than one virtual machine is running on the server. For virtual machines offering network services, use the TCP/UDP port number associated with the service as the dsindex; otherwise, use numbers in the range 65536-99999 to avoid clashes with other sFlow agents on the server, as in the sketch below.
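
For example, a sketch of launching a second, separately identified, Java virtual machine instance on the same server (the hostname, dsindex and application names are purely illustrative):

java -javaagent:sflowagent.jar\
     -Dsflow.hostname=hadoop.node2\
     -Dsflow.dsindex=65537\
     MyOtherApp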

The sFlow module shares Host sFlow's configuration and will automatically send data to the set of sFlow analyzers specified in the Host sFlow configuration file, picking up any configuration changes without the need to restart the Java application.

Finally, the real potential of the Java virtual machine sFlow agent is as part of a broader sFlow monitoring system providing scalable, real-time, visibility into applications, servers, storage and networking across the entire data center.

Saturday, August 27, 2011

node.js




The node-sflow-module project is an open source implementation of sFlow monitoring for node.js, an open source event-based environment for creating network applications, built on Google's high performance V8 JavaScript Engine.

The advantage of using sFlow is the scalability it offers for monitoring the performance of large web server clusters or load balancers where request rates are high and conventional logging solutions generate too much data or impose excessive overhead. Real-time monitoring of HTTP provides essential visibility into the performance of large-scale, complex, multi-layer services constructed using Representational State Transfer (REST) architectures. In addition, monitoring HTTP services using sFlow is part of an integrated performance monitoring solution that provides real-time visibility into applications, servers and switches (see sFlow Host Structures).

The node-sflow-module software (sflow.js) is designed to integrate with the Host sFlow agent to provide a complete picture of server performance. Download, install and configure Host sFlow before proceeding to install the node-sflow-module - see Installing Host sFlow on a Linux Server. There are a number of options for analyzing cluster performance using Host sFlow, including Ganglia and sFlowTrend.

Next, download the sflow.js file from http://node-sflow-module.googlecode.com/. Copy the sflow.js file into the same directory as your node.js application. Including the sFlow instrumentation is a one line change to the application, see the simple Hello World example below:

var http = require("http");
require("./sflow.js").instrument(http);

http.createServer(function (req, res) {
  res.writeHead(200, {'Content-Type': 'text/plain'});
  res.end('Hello World\n');
}).listen(1337, "127.0.0.1");
console.log('Server running at http://127.0.0.1:1337/');

Once installed, the sflow.js module will stream measurements to a central sFlow Analyzer. Currently, the only software that can decode HTTP sFlow is sflowtool. Download, compile and install the latest sflowtool sources on the system you are using to receive sFlow from the servers in the node.js cluster.

Running sflowtool will display output of the form:

[pp@pcentos ~]$ sflowtool
startDatagram =================================
datagramSourceIP 10.0.0.112
datagramSize 116
unixSecondsUTC 1314458638
datagramVersion 5
agentSubId 8124
agent 10.0.0.112
packetSequenceNo 1
sysUpTime 22002
samplesInPacket 1
startSample ----------------------
sampleType_tag 0:2
sampleType COUNTERSSAMPLE
sampleSequenceNo 1
sourceId 3:8124
counterBlock_tag 0:2201
http_method_option_count 0
http_method_get_count 2
http_method_head_count 0
http_method_post_count 0
http_method_put_count 0
http_method_delete_count 0
http_method_trace_count 0
http_methd_connect_count 0
http_method_other_count 0
http_status_1XX_count 0
http_status_2XX_count 2
http_status_3XX_count 0
http_status_4XX_count 0
http_status_5XX_count 0
http_status_other_count 0
endSample   ----------------------
endDatagram   =================================
startDatagram =================================
datagramSourceIP 10.0.0.112
datagramSize 236
unixSecondsUTC 1314458652
datagramVersion 5
agentSubId 8124
agent 10.0.0.112
packetSequenceNo 2
sysUpTime 35729
samplesInPacket 1
startSample ----------------------
sampleType_tag 0:1
sampleType FLOWSAMPLE
sampleSequenceNo 0
sourceId 3:8124
meanSkipCount 6
samplePool 6
dropEvents 0
inputPort 0
outputPort 1073741823
flowBlock_tag 0:2201
flowSampleType http
http_method 2
http_protocol 1001
http_uri /
http_host 10.0.0.112:8124
http_useragent Mozilla/5.0 (Macintosh; Intel Mac OS X 10_7_1) AppleWebKit/534.
http_bytes 0
http_duration_uS 0
http_status 200
flowBlock_tag 0:2100
extendedType socket4
socket4_ip_protocol 6
socket4_local_ip 10.0.0.112
socket4_remote_ip 10.1.1.60
socket4_local_port 8124
socket4_remote_port 52609
endSample   ----------------------
endDatagram   =================================

The -H option causes sflowtool to output the HTTP request samples using the combined log format:

[pp@pcentos ~]$ sflowtool -H
10.1.1.60 - - [27/Aug/2011:08:26:52 -0700] "GET / HTTP/1.1" 200 0 "-" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_7_1) AppleWebKit/534."

Converting sFlow to combined logfile format allows existing log analyzers to be used to analyze the sFlow data. For example, the following commands use sflowtool and webalizer to create reports:
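
/usr/local/bin/sflowtool -H | rotatelogs log/http_log &
webalizer -o report log/*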

The resulting webalizer report shows top URLs:


Finally, the real potential of HTTP sFlow is as part of a broader performance management system providing real-time visibility into applications, servers, storage and networking across the entire data center.