Wednesday, August 25, 2010


Recent articles, "Higher Learning, Higher Speed: Campuses Graduate to 802.11n" and "Beyond 802.11n: Enterprise WLAN Trends For 2010", describe some of the trends driving the adoption of 802.11n wireless networking in higher education and enterprise campuses.

Moving to a wireless access network offers many benefits, including flexibility, mobility, energy savings and reduced cabling. However, managing performance in a wireless environment is challenging because wireless bandwidth is a shared, limited resource that can easily become congested. Bandwidth management is further complicated by rapidly changing traffic patterns as users move from one part of the network to another.

The sFlow standard is currently supported by most switch vendors and is widely used to provide network-wide visibility. However, sFlow is not limited to monitoring switches. Deploying sFlow capable wireless access points extends monitoring into the wireless network, delivering the visibility needed for effective bandwidth management. The diagram at the top of the page shows how sFlow is used to centrally monitor the performance of the entire wireless infrastructure. When combined with sFlow from switches, sFlow delivers end-to-end visibility into the performance of the entire wired and wireless network.

The first task in managing wireless performance is rapidly identifying areas of the network experiencing performance problems.  Each wireless access point uses sFlow's scalable "counter push" mechanism to export utilization and error statistics, allowing a central sFlow analyzer to rapidly pinpoint overloaded access points (see Link utilization).
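As a rough illustration of how an analyzer can turn pushed counters into a utilization figure, the sketch below takes two successive counter samples from the same interface and divides the change in octets by the elapsed time and link speed. The record layout and field names are assumptions made for this example, not the sFlow wire format, and the octet counters are assumed to be 64 bits wide. On a shared wireless medium the nominal interface speed is only an upper bound, so treat the result as indicative.

```python
from dataclasses import dataclass

COUNTER_MODULUS = 2 ** 64  # assumption: 64-bit octet counters


@dataclass
class CounterSample:
    """One pushed counter record (illustrative field names)."""
    timestamp: float  # seconds, as recorded by the analyzer
    if_speed: int     # nominal interface speed in bits per second
    in_octets: int
    out_octets: int


def utilization(prev, curr):
    """Return (inbound, outbound) utilization as fractions of if_speed."""
    elapsed = curr.timestamp - prev.timestamp
    if elapsed <= 0 or curr.if_speed <= 0:
        raise ValueError("samples must be time-ordered with a positive speed")
    # Modular subtraction keeps the delta correct if a counter wrapped once.
    in_delta = (curr.in_octets - prev.in_octets) % COUNTER_MODULUS
    out_delta = (curr.out_octets - prev.out_octets) % COUNTER_MODULUS
    capacity_bits = elapsed * curr.if_speed
    return in_delta * 8 / capacity_bits, out_delta * 8 / capacity_bits


earlier = CounterSample(0.0, 100_000_000, 0, 0)
later = CounterSample(20.0, 100_000_000, 125_000_000, 25_000_000)
print(utilization(earlier, later))
```

Ranking access points by the larger of the two values gives a simple list of the most heavily loaded ones.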

(chart created using sFlowTrend)

The trend chart above shows a sharp increase in transmission failure and retry counts, indicating a severe performance problem. The next step is to find the source of this congestion. Each wireless access point uses sFlow's packet sampling mechanism to export packet headers, allowing the sFlow analyzer to identify sources of traffic.

(chart created using sFlowTrend)

The chart above shows the top connections and protocols making use of the wireless access point. Looking at the top connections chart, it is clear that the increased load (shown in red) is due to an afpovertcp connection between dhcp0 and dhcp6. The traffic associated with this connection peaks at nearly 30 Mbit/s, resulting in poor performance. Using the information from the chart to install a rate limit in the wireless access point provides a short-term fix, restoring network service.
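A top connections chart of this kind can be approximated from packet samples by scaling each sampled frame by the sampling rate and summing per connection. The following is a minimal sketch under that assumption; the dictionary keys, host names and sampling rate are hypothetical, and a real analyzer would decode these fields from the sampled packet headers.

```python
from collections import defaultdict


def top_connections(samples, interval_seconds, limit=5):
    """Estimate bits per second for each (src, dst, protocol) connection."""
    estimated_bytes = defaultdict(int)
    for sample in samples:
        key = (sample["src"], sample["dst"], sample["protocol"])
        # Each sampled frame stands in for roughly `sampling_rate` frames.
        estimated_bytes[key] += sample["frame_length"] * sample["sampling_rate"]
    rates = {
        key: total * 8 / interval_seconds
        for key, total in estimated_bytes.items()
    }
    return sorted(rates.items(), key=lambda item: item[1], reverse=True)[:limit]


# Hypothetical samples collected over one minute at a 1-in-400 sampling rate.
samples = [
    {"src": "laptop", "dst": "fileserver", "protocol": "afpovertcp",
     "frame_length": 1500, "sampling_rate": 400},
    {"src": "laptop", "dst": "fileserver", "protocol": "afpovertcp",
     "frame_length": 1500, "sampling_rate": 400},
    {"src": "laptop", "dst": "fileserver", "protocol": "afpovertcp",
     "frame_length": 1500, "sampling_rate": 400},
    {"src": "phone", "dst": "webserver", "protocol": "http",
     "frame_length": 500, "sampling_rate": 400},
]
for connection, bits_per_second in top_connections(samples, 60):
    print(connection, round(bits_per_second))
```

Because the figures are estimates from a sample, they are most reliable for the large flows, which are the ones that matter when looking for the cause of congestion.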

Further investigation reveals that a laptop is using the wireless network to back up its entire hard drive. A longer-term solution uses traffic measurements to develop traffic shaping policies that balance the requirements of different traffic classes. In this case, creating a low-priority class for backup traffic helps prevent future quality of service problems.
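How a rate limit is enforced depends on the access point. One common technique is a token bucket, sketched below purely as an illustration; nothing here describes a particular vendor's implementation, and the class name and parameters are hypothetical. Time is passed in explicitly so the behavior is easy to follow.

```python
class TokenBucket:
    """Token bucket limiter: tokens are bytes, refilled at a fixed rate."""

    def __init__(self, rate_bps, burst_bytes):
        self.bytes_per_second = rate_bps / 8.0
        self.capacity = float(burst_bytes)
        self.tokens = float(burst_bytes)  # start with a full bucket
        self.last_update = 0.0

    def allow(self, size_bytes, now):
        """Return True if a packet of size_bytes may be sent at time now."""
        elapsed = max(0.0, now - self.last_update)
        self.last_update = max(self.last_update, now)
        # Refill for the elapsed time, never beyond the burst capacity.
        self.tokens = min(self.capacity,
                          self.tokens + elapsed * self.bytes_per_second)
        if size_bytes <= self.tokens:
            self.tokens -= size_bytes
            return True
        return False


# Limit a hypothetical backup class to 8 kbit/s with a one-packet burst.
backup_limit = TokenBucket(rate_bps=8000, burst_bytes=1500)
print(backup_limit.allow(1500, now=0.0))  # bucket starts full
print(backup_limit.allow(1500, now=0.5))  # only 500 bytes refilled so far
print(backup_limit.allow(1500, now=1.5))  # refilled to capacity
```

Giving each traffic class its own bucket, with a small rate for backup traffic, is one way to express a low-priority class.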

In this example, sFlow was used to manually identify and manage traffic. However, the real-time, network-wide visibility that sFlow provides makes it possible to automate performance management, ensuring fair access to all network users (see Network edge).

Monday, August 9, 2010

Host sFlow 1.0 Released

(image from Host sFlow)

The Host sFlow project has released an open source agent implementing the sFlow Host Structures specification. The current stable release (version 1.0) supports Linux, Windows, XenServer and Xen. The project is working to add support for additional platforms, including AIX, HP-UX, OS X, FreeBSD, OpenBSD, NetBSD, VMware, KVM and Hyper-V.

The Host sFlow agent provides a highly scalable solution for monitoring clusters of servers (see Top servers and Cluster performance). The combination of sFlow in top of rack (ToR) or end of row (EoR) switches and Host sFlow agents installed on servers delivers visibility into the performance of large scale data center workloads (see Hybrid server monitoring).

The combination of Open vSwitch and the Host sFlow agent provides a lightweight, scalable performance monitoring solution for open source virtualization (Xen, XenServer and KVM) and cloud platforms (OpenStack and Xen Cloud Platform).

The sFlow Host Structures specification has only recently been finalized. When looking for an sFlow analyzer (see Choosing an sFlow analyzer), ask if the vendor supports all optional and extended sFlow fields (including packet headers, interface counters and host structures). If possible, arrange for an evaluation and test the solution in a large scale trial.

Tuesday, August 3, 2010

sFlow Host Structures

The completed sFlow Host Structures specification has been published, extending the sFlow standard to include physical and virtual server performance metrics. The specification describes a coherent framework that builds on the sFlow metrics exported by most switch vendors, linking network, server and application performance monitoring to provide an integrated picture of performance.

The diagram above shows how the packet header information exported by network devices is used to link network performance with performance metrics collected from servers and applications. The packet header contains MAC addresses corresponding to physical and virtual server network adapter cards as well as TCP/UDP socket information identifying individual application instances. Collecting sFlow data from the network devices provides an sFlow analyzer with a real-time map of the physical and logical relationships between entities on the network (see Packet paths and Application mapping).
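To make the linking fields concrete, here is a minimal sketch of pulling the MAC addresses and TCP/UDP socket information out of a sampled header. It assumes an untagged Ethernet frame carrying IPv4, so VLAN tags and IPv6 are not handled, and the function name and returned keys are chosen for this example only.

```python
import socket
import struct

ETHERTYPE_IPV4 = 0x0800
IP_PROTOCOLS = {6: "tcp", 17: "udp"}


def parse_header(frame):
    """Return MAC addresses and, for IPv4 TCP/UDP, the socket endpoints."""
    if len(frame) < 14:
        raise ValueError("truncated Ethernet header")
    dst_mac, src_mac, ethertype = struct.unpack("!6s6sH", frame[:14])
    info = {"src_mac": src_mac.hex(":"), "dst_mac": dst_mac.hex(":")}
    if ethertype != ETHERTYPE_IPV4 or len(frame) < 34:
        return info  # MACs only: not IPv4, or the IP header is incomplete
    ip_header_length = (frame[14] & 0x0F) * 4
    protocol = IP_PROTOCOLS.get(frame[23])
    ports_at = 14 + ip_header_length
    if protocol is None or len(frame) < ports_at + 4:
        return info
    src_port, dst_port = struct.unpack("!HH", frame[ports_at:ports_at + 4])
    info["protocol"] = protocol
    info["src"] = (socket.inet_ntoa(frame[26:30]), src_port)
    info["dst"] = (socket.inet_ntoa(frame[30:34]), dst_port)
    return info


# Hand-built example: 10.0.0.1:49152 to 10.0.0.2:548 (AFP over TCP).
ethernet = (bytes.fromhex("001122334455") + bytes.fromhex("66778899aabb")
            + struct.pack("!H", ETHERTYPE_IPV4))
ipv4 = struct.pack("!BBHHHBBH4s4s", 0x45, 0, 40, 0, 0, 64, 6, 0,
                   socket.inet_aton("10.0.0.1"), socket.inet_aton("10.0.0.2"))
frame = ethernet + ipv4 + struct.pack("!HH", 49152, 548)
print(parse_header(frame))
```

The MAC addresses identify the server adapters at each end, while the address and port pairs identify the application instances.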

A server exporting sFlow performance metrics includes an additional structure containing the MAC addresses associated with each of its network adapters. The inclusion of the MAC addresses provides a common key linking server performance metrics (CPU, memory, I/O, etc.) to network performance measurements (network flows, link utilizations, etc.), providing a complete picture of the server's performance (see Hybrid server monitoring and UUID).
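The join on MAC address can be sketched in a few lines. The record shapes below are assumptions for illustration, not the structures defined in the specification: each host record carries a list of adapter MACs and a metrics dictionary, and each flow record carries the source and destination MAC seen in the sampled packet header.

```python
def link_by_mac(host_records, flow_records):
    """Attach estimated network bytes to each host using MAC as the key."""
    mac_to_host = {}
    for host in host_records:
        for mac in host["macs"]:
            mac_to_host[mac.lower()] = host["hostname"]

    linked = {
        host["hostname"]: {"metrics": host["metrics"], "bytes": 0}
        for host in host_records
    }
    for flow in flow_records:
        endpoints = (flow["src_mac"].lower(), flow["dst_mac"].lower())
        # A set avoids double counting when both MACs belong to one host.
        hosts = {mac_to_host[mac] for mac in endpoints if mac in mac_to_host}
        for hostname in hosts:
            linked[hostname]["bytes"] += flow["bytes"]
    return linked


# Hypothetical host metrics and flows.
hosts = [
    {"hostname": "web1", "macs": ["00:11:22:33:44:55"],
     "metrics": {"cpu_load": 0.9}},
    {"hostname": "db1", "macs": ["66:77:88:99:AA:BB"],
     "metrics": {"cpu_load": 0.2}},
]
flows = [
    {"src_mac": "00:11:22:33:44:55", "dst_mac": "66:77:88:99:aa:bb",
     "bytes": 1000},
    {"src_mac": "00:11:22:33:44:55", "dst_mac": "de:ad:be:ef:00:01",
     "bytes": 500},
]
print(link_by_mac(hosts, flows))
```

The result places each server's CPU figures next to the traffic it sent and received, which is the combined view the MAC key makes possible.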

The sFlow Host Structures specification builds on the scalable "counter push" mechanism that is used by network devices to export standard interface counters (see Link utilization). Most operating systems already maintain performance counters to track CPU, memory and I/O performance. The sFlow Host Structures specification leverages work done by the Ganglia project to define a common set of metrics across different operating systems, including: Windows, Linux (Fedora/RedHat/CentOS, Debian, Gentoo, SuSE/OpenSuSE), Solaris, FreeBSD, NetBSD, OpenBSD, DragonflyBSD and AIX. The extension of sFlow to include server performance metrics integrates network and system monitoring to deliver a data center wide view of performance (see Top servers and Cluster performance).

For virtual machine performance metrics, the sFlow Host Structures specification draws on definitions from the libvirt project, which has defined a standard set of metrics that can be collected from a wide variety of virtualization platforms, including: Xen, QEMU, KVM, LXC, OpenVZ, User Mode Linux, VirtualBox, VMware ESX and GSX. Again, the MAC addresses associated with each virtual machine are exported along with its performance metrics so that the virtual machine's performance can be linked to its network activity.

The sFlow Host Structures document also describes the extension of sFlow's sampling mechanism to include application transaction sampling. Examples of application level transactions include: HTTP requests to a web server, NFS/CIFS requests to a file server, memcached requests and operations performed by a Hadoop cluster. An application sFlow agent samples completed transactions, capturing information about each completed request, including: size, duration, type, URL, file name etc. Each application transaction sample is linked to the network through the inclusion of TCP/UDP socket information which can be matched to packet header information from network devices.
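A simple way to picture transaction sampling is a sampler that sees every completed request, keeps a running count, and records roughly one in N at random. The sketch below assumes that approach; the class name, record fields and example request are hypothetical and do not follow the structures in the specification.

```python
import random


class TransactionSampler:
    """Randomly sample about 1 in `sampling_rate` completed transactions."""

    def __init__(self, sampling_rate, rng=None):
        if sampling_rate < 1:
            raise ValueError("sampling_rate must be at least 1")
        self.sampling_rate = sampling_rate
        self.rng = rng or random.Random()
        self.sample_pool = 0  # transactions seen, sampled or not
        self.samples = []

    def record(self, transaction):
        """Call once per completed transaction; returns True if sampled."""
        self.sample_pool += 1
        if self.rng.random() < 1.0 / self.sampling_rate:
            self.samples.append(dict(transaction,
                                     sampling_rate=self.sampling_rate,
                                     sample_pool=self.sample_pool))
            return True
        return False

    def estimated_total(self):
        """Scale the number of samples back up to a transaction estimate."""
        return len(self.samples) * self.sampling_rate


sampler = TransactionSampler(sampling_rate=10, rng=random.Random(1))
for _ in range(20000):
    sampler.record({
        "type": "GET", "url": "/index.html", "bytes": 5120,
        "duration_us": 850,
        # Socket information lets an analyzer match this to packet samples.
        "socket": ("tcp", "10.0.0.2", 80, "10.0.0.1", 49152),
    })
print(sampler.sample_pool, len(sampler.samples), sampler.estimated_total())
```

Random selection rather than every Nth request avoids locking onto periodic patterns in the workload, and keeping the pool count lets an analyzer check its scaled estimates.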

What clearly distinguishes sFlow from other monitoring technologies is the integrated, end-to-end view of performance that it offers. Integration greatly increases the value of information by making it actionable. For example, identifying that an application is running slowly isn't enough to solve the performance problem. However, if you also know that the server hosting the application is seeing poor disk performance, can link the disk performance to a slow NFS server, can identify the other clients of the NFS server and finally determine that all the requests are competing for access to a single file, then you are in a position to take action. It is this ability to link data together, combined with the scalability to monitor every resource in the data center, that makes sFlow revolutionary.