Wednesday, November 23, 2011

Wireshark


Wireshark (previously called Ethereal) is a popular, free, open source protocol analyzer. This article demonstrates how Wireshark can be used with sFlow to remotely capture traffic. For background, the article Packet capture describes some of the reasons why the multi-vendor sFlow standard should be considered as an option for packet capture, particularly in high-speed, switched Ethernet environments.

The first step is to configure the network switches to monitor selected links and send sFlow to the host that will be used for packet analysis. Configuration instructions for most switch vendors are available on this blog. Alternatively, if sFlow is already being used for network-wide visibility, then obtaining an sFlow feed can be as simple as directing the sFlow analyzer to forward sFlow to Wireshark.

The article CaptureSetup/Pipes describes how Wireshark can be configured to receive packets on a pipe. The following command launches Wireshark, using sflowtool to extract packets from the sFlow feed and pipe them into Wireshark:

[root@xenvm4 ~]# wireshark -k -i <(sflowtool -t)

Wireshark provides a real-time, graphical display of captured packets. The following screen shot shows packets captured using sFlow:

Packet trace in Wireshark captured using sFlow

In addition to being able to decode and filter packets, Wireshark has a number of statistical reporting capabilities. The following screen shot shows protocol statistics generated using captured sFlow data:

Protocol statistics in Wireshark captured using sFlow

When looking at sFlow statistics in Wireshark, it is important to remember that sFlow is a sampling technology and that the numbers should be scaled up by the sampling rate. In this case a sampling rate of 1 in 1000 was configured, so while the percentages are correct, the Packets, Bytes and Mbit/s numbers need to be multiplied by 1000. Looking at the top, highlighted, line the total values should be 24,000 packets, 25 Megabytes and 2 Mbit/s (not 24 packets, 24 Kilobytes and 0.002 Mbit/s shown in the table).
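The scaling step can be sketched in a few lines of Python. This is an illustration only: the function names are hypothetical, and the error bound is a commonly quoted approximation for packet sampling accuracy (about 196 * sqrt(1/c) percent at 95% confidence, where c is the number of samples in the class of interest) rather than something taken from the Wireshark display.

```python
import math


def scale_sampled(value, sampling_rate):
    """Estimate the real total from a value measured on sampled packets."""
    return value * sampling_rate


def percent_error_bound(samples):
    """Approximate 95% confidence bound on the relative error, in percent.

    Assumption: the rule of thumb 196 * sqrt(1 / samples), which only
    depends on how many samples were seen, not on the sampling rate.
    """
    if samples <= 0:
        raise ValueError("need at least one sample")
    return 196.0 * math.sqrt(1.0 / samples)


sampling_rate = 1000  # 1-in-1000, as configured in this article

print("estimated packets:", scale_sampled(24, sampling_rate))
print("estimated Mbit/s:", scale_sampled(0.002, sampling_rate))
print("error bound (%%): %.0f" % percent_error_bound(24))
```

With only 24 samples the estimate is rough; the bound tightens as more samples accumulate, which is why longer capture intervals give more trustworthy totals.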

Because sFlow is a packet sampling technology, there are limitations to the type of protocol following you can do in Wireshark. However, there are offsetting benefits. If you don't know which links to tap to solve a problem, you can use sFlow to cast a wide net and capture packets from hundreds, or even thousands, of links simultaneously. Using sFlow also lets you easily monitor 1, 10, 40 and 100GigE ports without overwhelming Wireshark.

In addition to its graphical interface, Wireshark also offers a text-only interface to facilitate scripting. The tshark command runs Wireshark in text mode, providing similar functionality to tcpdump. The following example uses sflowtool to extract packets from the sFlow feed and pipe them into tshark:

[root@xenvm4 ~]# tshark -i<(sflowtool -t)
Running as user "root" and group "root". This could be dangerous.
Capturing on /dev/fd/63
  0.000000    10.0.0.16 -> 10.0.0.18    TCP 37366 > iscsi-target [PSH, ACK] Seq=1 Ack=1 Win=3050 Len=1200 TSV=472366446 TSER=1180632633
  5.000000    10.0.0.16 -> 10.0.0.18    TCP twamp-control > nfs [ACK] Seq=1 Ack=1 Win=2560 Len=1448 TSV=472366931 TSER=1180633845[Packet size limited during capture]
  5.000000    10.0.0.16 -> 10.0.0.18    TCP twamp-control > nfs [ACK] Seq=1449 Ack=1 Win=2560 Len=1448 TSV=472366931 TSER=1180633845

Wireshark's interactive filtering and browsing capabilities, combined with an extensive library of protocol decodes, provide the detail needed to diagnose network problems using packet headers captured by switches using sFlow. The protocol analysis capabilities of Wireshark complement the network-wide visibility provided by an sFlow analyzer, extracting additional details that are useful for troubleshooting.

Tuesday, November 22, 2011

Packet capture


Why use sFlow for packet analysis? To rephrase the Heineken slogan, sFlow reaches the parts of the network that other technologies cannot reach. The sFlow standard is widely supported by switch vendors, embedding wire-speed packet monitoring throughout the network. With sFlow, any link or group of links can be remotely monitored. The alternative approach of physically attaching a probe to a SPAN/Mirror port is becoming much less feasible with increasing network sizes (tens of thousands of switch ports) and link speeds (10, 40 and 100 Gigabits per second). Using sFlow for packet capture doesn't replace traditional packet analysis; instead, sFlow extends the capabilities of existing packet capture tools into the high-speed switched network.

This article uses the tcpdump packet analyzer, readily available on most platforms, to demonstrate how to use sFlow to remotely capture and analyze network traffic.

The first step is to configure the network switches to monitor selected links and send sFlow to the host that will be used for packet analysis. Configuration instructions for most switch vendors are available on this blog. Alternatively, if sFlow is already being used for network-wide visibility, then obtaining an sFlow feed can be as simple as directing the sFlow analyzer to forward sFlow to the packet analyzer.

Next, perform packet analysis on the host. The following command displays a packet trace, using sflowtool to extract packets from the sFlow feed and pipe them into tcpdump:

[root@xenvm4 ~]# sflowtool -t | tcpdump -r - -vv
reading from file -, link-type EN10MB (Ethernet)
10:30:01.000000 arp who-has 10.0.0.66 tell 10.0.0.220
10:30:07.000000 IP (tos 0x0, ttl  64, id 49952, offset 0, flags [DF], proto: TCP (6), length: 1500) xenserver1.sf.inmon.com.39120 > openfiler.sf.inmon.com.iscsi-target: . 2757963136:2757964584(1448) ack 4136690254 win 3050 
10:30:07.000000 IP (tos 0x0, ttl  64, id 49953, offset 0, flags [DF], proto: TCP (6), length: 1500) xenserver1.sf.inmon.com.39120 > openfiler.sf.inmon.com.iscsi-target: . 1448:2896(1448) ack 1 win 3050 
10:30:07.000000 IP (tos 0x0, ttl  64, id 49954, offset 0, flags [DF], proto: TCP (6), length: 1500) xenserver1.sf.inmon.com.39120 > openfiler.sf.inmon.com.iscsi-target: . 2896:4344(1448) ack 1 win 3050 

Note: Using sflowtool to convert sFlow into standard pcap format makes the sFlow data accessible to the wide variety of packet analysis applications that support the standard.

Next time you have to diagnose a network problem, rather than spending the night in the data center with a crash cart, stay at your desk and try out remote monitoring with sFlow. It may not be the solution to all problems, but it is surprising how many can be quickly resolved without leaving your desk.

Thursday, November 17, 2011

Monitoring at 100 Gigabits/s

Chart 1: Top Connections on a 100 Gigabit link

Chart 1 shows the top connections on a 100 Gigabit Ethernet link monitored at the 2011 International Conference for High Performance Computing, Networking, Storage and Analysis (SC11).

Chart 2: Packet counters on 100 Gigabit link

Chart 2 shows the packet rates on the link, approximately 10 million packets per second ingress and 5 million packets per second egress. The maximum packet rate on a 100 Gigabit, full duplex, link is approximately 300 million packets per second (150 million in each direction) making traffic monitoring an interesting challenge.

The article Ok, but how much time do I have? discusses some of the challenges in monitoring at 10 Gigabit speeds. At 100 Gigabits the challenge is 10 times greater, requiring that the probe process each packet within 3 nanoseconds to provide wire-speed monitoring of a full-duplex link. Probe vendors (e.g. the upcoming EndaceExtreme probe) are working to meet this challenge using custom hardware. However, the costs associated with probes and the added operational complexity of maintaining a probe-based solution are prohibitive for most applications, particularly if large numbers of links need to be monitored.
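The packet rate and per-packet time budget quoted above can be reproduced with a short calculation. This Python sketch assumes minimum-size Ethernet frames (64 bytes) plus the standard 8-byte preamble and 12-byte inter-frame gap; the function names are hypothetical.

```python
PREAMBLE_BYTES = 8         # preamble plus start-of-frame delimiter
MIN_FRAME_BYTES = 64       # minimum Ethernet frame, including FCS
INTERFRAME_GAP_BYTES = 12  # mandatory idle time between frames


def max_packets_per_second(link_bits_per_second):
    """Worst-case packet rate in one direction, using minimum-size frames."""
    bits_on_wire = (PREAMBLE_BYTES + MIN_FRAME_BYTES + INTERFRAME_GAP_BYTES) * 8
    return link_bits_per_second / bits_on_wire


def nanoseconds_per_packet(packets_per_second):
    """Time available to process each packet at the given rate."""
    return 1e9 / packets_per_second


one_way = max_packets_per_second(100e9)  # 100 Gigabit link
full_duplex = 2 * one_way                # both directions together

print("one direction: %.1f million packets/s" % (one_way / 1e6))
print("full duplex:   %.1f million packets/s" % (full_duplex / 1e6))
print("time budget:   %.2f ns per packet" % nanoseconds_per_packet(full_duplex))
```

The results land just under 150 million packets per second in each direction and a little over 3 nanoseconds per packet, matching the rounded figures in the text.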

In this instance, the switch hardware, a Brocade MLXe, includes support for the sFlow traffic monitoring standard. Embedding the instrumentation in the switch hardware delivers continuous, wire-speed monitoring of all switch ports: the switch has a total of 15.36 Terabits per second, 4.8 billion packets per second, of routing capacity.

Monitoring using sFlow has minimal overhead and is typically enabled on every interface on every switch to provide network-wide visibility. A central sFlow analyzer continuously monitors all the interfaces and can be queried to generate traffic reports. In this case InMon's Traffic Sentinel was used to monitor the switches in the SC11 network (SCinet) and generate the charts shown in this article.

The sFlow standard is widely supported by switch vendors. Selecting switches with sFlow support when upgrading or building out a new network provides comprehensive visibility into network performance at minimal cost. Retrofitting monitoring with probes is expensive and provides limited coverage.

Wednesday, November 16, 2011

SC11 OpenFlow testbed

ESnet/ECSEL Demo at SC11
The SCinet Research Sandbox, a part of the 2011 International Conference for High Performance Computing, Networking, Storage and Analysis (SC11), is being used to demonstrate OpenFlow applications using switches from IBM (BNT), HP, NEC and Pronto. Of the four switch models that form the testbed, three support standard sFlow monitoring (IBM, HP and NEC), providing detailed visibility into traffic within the testbed.

In the End-to-End Circuit Service at Layer 2 (ECSEL) demonstration, OpenFlow is used to provision RDMA over Converged Ethernet (RoCE) connectivity through an NEC switch.
Chart 1: Top Connections in OpenFlow Testbed

Chart 1 shows RoCE as the dominant traffic in the testbed, saturating the 10G links. This chart demonstrates how convergence is changing the nature of data center traffic as new storage and clustering workloads place heavy demands on the shared network. The sFlow standard embeds monitoring within the network switches to provide the network-wide visibility needed to manage increased demand for bandwidth.

Note: RoCE, FCoE, AoE and many other data center protocols operate at layer 2, so traditional approaches to monitoring that rely on layer 3 IP flow records from routers and firewalls are of limited value.
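To make the layer 2 point concrete, the following Python sketch classifies a sampled Ethernet header by its EtherType field. The EtherType values are the registered ones for IPv4, IPv6, ARP, RoCE (version 1), FCoE and AoE; the function names and the synthetic test frames are hypothetical. Anything that is not IPv4 or IPv6 would never appear in an IP flow record, while a packet header sampled by sFlow still carries it.

```python
import struct

ETHERTYPES = {
    0x0800: "IPv4",
    0x86DD: "IPv6",
    0x0806: "ARP",
    0x8915: "RoCE",  # RDMA over Converged Ethernet, version 1
    0x8906: "FCoE",  # Fibre Channel over Ethernet
    0x88A2: "AoE",   # ATA over Ethernet
}
VLAN_TAG = 0x8100    # 802.1Q tag; the real EtherType follows 4 bytes later


def classify_frame(frame):
    """Return a protocol name for an Ethernet frame (or sampled header)."""
    if len(frame) < 14:
        raise ValueError("frame shorter than an Ethernet header")
    offset = 12  # skip destination and source MAC addresses
    (ethertype,) = struct.unpack_from("!H", frame, offset)
    # Step over any 802.1Q VLAN tags to reach the encapsulated EtherType.
    while ethertype == VLAN_TAG and len(frame) >= offset + 6:
        offset += 4
        (ethertype,) = struct.unpack_from("!H", frame, offset)
    return ETHERTYPES.get(ethertype, "other (0x%04x)" % ethertype)


def visible_to_ip_flow_records(frame):
    """True only for traffic that a layer 3 flow exporter could report."""
    return classify_frame(frame) in ("IPv4", "IPv6")


macs = b"\x00" * 12  # placeholder MAC addresses for the synthetic frames
roce_frame = macs + b"\x89\x15" + b"\x00" * 46
tagged_fcoe_frame = macs + b"\x81\x00\x00\x64\x89\x06" + b"\x00" * 42

print(classify_frame(roce_frame), visible_to_ip_flow_records(roce_frame))
print(classify_frame(tagged_fcoe_frame))
```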

The OpenFlow protocol provides a way for external software to control the forwarding decisions within switches. In this case, OpenFlow is being used to provision the network to carry the RoCE connections. Maintaining connectivity between the switches and the controllers is essential in order to maintain control of the network.
Chart 2: Traffic between OpenFlow Switches and OpenFlow Controllers (NOX)

Chart 2 shows that OpenFlow control traffic is being carried in-band and that it is consuming very little network bandwidth. However, in-band control traffic is potentially vulnerable to interference from large bandwidth consumers (like RoCE).
Chart 3: OpenFlow Traffic and Network Priority

Chart 3 shows that the Quality of Service (QoS) policy for all the OpenFlow connections is Best Effort (BE). Best Effort is the default priority class for traffic in the network, leaving the OpenFlow control channels vulnerable to network congestion. Assigning a high priority to the OpenFlow protocol would ensure that OpenFlow messages could traverse the network during periods of congestion, allowing the controller to take corrective action to mitigate the congestion.

OpenFlow and sFlow
The paper, sFlow and OpenFlow, describes how the OpenFlow and sFlow standards complement one another. Monitoring using sFlow provides visibility into network traffic that allows the OpenFlow controller to dynamically allocate network resources based on changing network demands.

More generally, the Data center convergence, visibility and control presentation describes the critical role that measurement plays in managing costs and optimizing performance in converged, virtualized and cloud environments.

Tuesday, November 15, 2011

Eye of Sauron

Credit: The Lord of the Rings: The Return of the King

In The Lord of the Rings: The Return of the King, Sauron's eye is drawn to movement, making it hard for his enemies to escape notice. The sFlow packet sampling mechanism operates in a similar way, devoting resources where they are most needed in order to provide network-wide visibility.

In a typical sFlow deployment, every port on every switch is configured to sample traffic with fixed probabilities. This strategy for setting sampling rates is effective because the distribution of traffic in data centers is extremely irregular: only a small number of links are busy at any given moment and the set of busy links can change quickly. As the traffic on a link increases, additional samples are generated, allowing the central sFlow analyzer to immediately detect the increased traffic and the path the traffic takes across the network. When the link traffic decreases, fewer samples are generated, reducing the load on the sFlow analyzer so that it can focus on active parts of the network.
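A small simulation illustrates why a fixed sampling probability concentrates measurement effort on busy links. This Python sketch is a simplification: it flips a biased coin for every packet, whereas a real switch implements sampling in hardware. The packet counts, the seed and the function name are made up for the example.

```python
import random


def sample_link(packet_count, sampling_rate, rng):
    """Count how many packets are selected at 1-in-sampling_rate probability."""
    probability = 1.0 / sampling_rate
    return sum(1 for _ in range(packet_count) if rng.random() < probability)


rng = random.Random(2011)  # fixed seed so repeated runs give the same result
sampling_rate = 1000       # the same setting on every port

busy_packets = 1_000_000   # hypothetical busy link during the interval
idle_packets = 10_000      # hypothetical quiet link during the interval

busy_samples = sample_link(busy_packets, sampling_rate, rng)
idle_samples = sample_link(idle_packets, sampling_rate, rng)

for name, samples in (("busy", busy_samples), ("idle", idle_samples)):
    estimate = samples * sampling_rate
    print("%s link: %d samples, estimated %d packets" % (name, samples, estimate))
```

The busy link produces roughly a hundred times more samples than the idle one, so the analyzer receives the most detail exactly where the traffic is, without any per-link tuning.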

The sFlow standard offers network-wide surveillance with the scalability to monitor tens of thousands of links. As network convergence and virtualization put increasing pressure on the network, visibility is essential for the effective control of network resources needed to deliver reliable services. Building a network visibility strategy around sFlow maximizes the choice of vendors, ensures interoperable monitoring in mixed vendor environments, eliminates vendor lock-in and facilitates "best in class" product selection.

Tuesday, November 8, 2011

DevOps

Credit: Wikimedia
DevOps is an emerging set of principles, methods and practices for communication, collaboration and integration between software development and IT operations professionals - Wikipedia.

The article Instrumentation and Observability describes the critical role that instrumentation plays in the DevOps process, "To progress, one must ask questions. These questions must be answered." The article goes on to state, "To observe a situation without changing it is the ultimate achievement." Finally, the case is made for pervasively embedding instrumentation within the production environment, "applications should expose this information as a consequence of normal behavior."

The sFlow standard embeds lightweight instrumentation within switches, servers and applications throughout the data center. sFlow is highly scalable, combining an efficient "push" mechanism with statistical sampling in order to provide continuous, real-time, data center wide visibility.

The article, Host-based sFlow: the drop-in, cloud-friendly monitoring standard, describes some of the operational benefits of sFlow. The granular visibility into scale-out web applications provided by sFlow facilitates DevOps by allowing software developers to see how services perform at scale and identify bottlenecks that can be eliminated through continuous refinement of application logic. At the same time, visibility into application transactions, response times and throughput allows operations teams to flexibly allocate network and server resources as demand changes, controlling costs and ensuring optimal performance.