Sunday, April 4, 2010

Hybrid server monitoring

Current trends toward convergence tightly link networking, storage and system performance. In a converged environment, system administrators need to be aware of the network traffic linking applications, storage and users in order to avoid performance problems. The dynamic application environment created by virtual machine migration, scale-out storage and elastic service pools requires that server administrators be aware of network I/O in order to optimize performance and avoid creating problems through poor workload placement choices.

Server operating systems and hardware integrate the instrumentation needed to monitor CPU, memory and disk performance. However, server network adapters typically lack the hardware support needed for traffic monitoring, leaving system administrators with a very limited view of server network I/O. Without hardware support, the network monitoring tools that are available to system administrators are typically used only for troubleshooting since using the tools operationally would adversely impact server performance.

Solving the problem of poor server network visibility requires a broader perspective. Depending on the data center network topology, each server is attached to a blade, top of rack (ToR) or an end of row (EoR) switch. The diagram above illustrates the one-to-one relationship between network adapter and the switch connecting the server to storage and networking resources. Monitoring traffic on a server's switch port provides a complete picture of server network I/O.

Switch vendors recognize the need for network-wide visibility and most have implemented hardware support for the sFlow standard in their data center switches. Combining performance metrics from the server with network visibility from the adjacent switch creates a hybrid monitoring solution that exploits the strengths of existing server and switch instrumentation to provide a complete picture of system performance.
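The hybrid approach can be illustrated with a minimal Python sketch. It assumes host metrics and switch port octet counters have already been collected by some means; the class names, field names and the host-to-port attachment table are hypothetical and are not part of the sFlow standard or any vendor API. The sketch converts two counter samples from the adjacent switch port into bit rates and joins them with the server's own CPU and memory measurements.

```python
from dataclasses import dataclass


@dataclass
class HostMetrics:
    """Metrics reported by the server itself (hypothetical fields)."""
    host: str
    cpu_util: float  # fraction of CPU in use, 0.0 to 1.0
    mem_util: float  # fraction of memory in use, 0.0 to 1.0


@dataclass
class PortSample:
    """Octet counters read from the switch port facing the server."""
    in_octets: int   # octets the switch received from the server
    out_octets: int  # octets the switch sent to the server


def rate_bps(prev: int, curr: int, interval_s: float) -> float:
    """Convert two octet counter readings into bits per second.

    Counter wrap is ignored to keep the sketch short.
    """
    return (curr - prev) * 8 / interval_s


def hybrid_view(host, attachment, prev, curr, interval_s):
    """Join server metrics with traffic seen on the adjacent switch port.

    attachment maps a host name to its (switch, port) pair; prev and curr
    map (switch, port) to PortSample readings taken interval_s apart.
    """
    key = attachment[host.host]
    p, c = prev[key], curr[key]
    return {
        "host": host.host,
        "switch_port": key,
        "cpu_util": host.cpu_util,
        "mem_util": host.mem_util,
        # Traffic entering the switch port is traffic the server sent.
        "server_tx_bps": rate_bps(p.in_octets, c.in_octets, interval_s),
        "server_rx_bps": rate_bps(p.out_octets, c.out_octets, interval_s),
    }
```

Because each server connects to exactly one switch port, the join needs only a static attachment table; no traffic measurement has to run on the server itself.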

Similar challenges exist in virtual server environments. The integration of sFlow traffic monitoring in the virtual switch (e.g. Xen Cloud Platform) with system performance metrics obtained from virtual machines provides a complete picture of cloud performance. The emerging VEPA standard allows much of the virtual switch functionality to be offloaded from the server software to the adjacent physical switch hardware. VEPA will be a firmware upgrade for most switches, so selecting a switch with sFlow support today provides visibility into physical server network I/O and also provides an upgrade path that extends visibility into the virtualization layer as VEPA becomes available.

One challenge remains before this hybrid monitoring strategy can be widely implemented: server performance monitoring is currently highly fragmented. Each server hardware, operating system, and system management vendor creates its own agents and software for performance monitoring, none of which interoperate. The emerging sFlow host and power extensions define standard export formats so that performance management tools can easily combine network and server measurements to build a complete picture of data center performance.
