sFlow: EC2

Saturday, February 12, 2011

EC2

The article, Visibility in the cloud, provides a general discussion of how to monitor cloud infrastructure. This article uses the Amazon Elastic Compute Cloud (EC2) service to provide a concrete example of implementing sFlow monitoring in a public cloud.

There are a number of APIs and tools available for managing large cloud server deployments in the Amazon cloud. However, the web interface is the quickest way to set up the small number of cloud servers used in this example.

In this case, two Amazon Linux 64-bit instances have been created. To provide sFlow monitoring, open-source Host sFlow agents were installed on each server.

Note: Amazon does include basic performance monitoring through its CloudWatch service. However, there is a charge for minute-granularity reporting and alerts. Implementing a monitoring solution based on the sFlow standard is free and provides minute-granularity reporting. In addition, a standards-based approach to performance monitoring is portable between public cloud providers (see Rackspace cloudservers for examples of sFlow monitoring in the Rackspace cloud) and private clouds.

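The firewall configuration was modified to implement packet sampling. The changes are the two ULOG rules, one each in the INPUT and OUTPUT chains, which select packets at random with a probability of 0.01: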

[root@ip-10-117-46-49 ~]# more /etc/sysconfig/iptables
# Generated by iptables-save v1.4.7 on Sat Feb 12 18:41:17 2011
*filter
:INPUT ACCEPT [52:3952]
:FORWARD ACCEPT [0:0]
:OUTPUT ACCEPT [28:2896]
-A INPUT -m statistic --mode random --probability 0.010000 -j ULOG --ulog-nlgroup 5 
-A OUTPUT -m statistic --mode random --probability 0.010000 -j ULOG --ulog-nlgroup 5 
COMMIT
# Completed on Sat Feb 12 18:41:17 2011

Note: On Linux systems, Host sFlow uses the iptables ULOG facility to monitor network traffic; see ULOG for a more detailed discussion.
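
To make the 0.01 probability concrete: each sampled packet stands for roughly 100 packets on the wire, so totals are estimated by scaling up the sample count. The sketch below is illustrative only. The function names are hypothetical, and the error bound is the commonly cited sFlow approximation (at most 196 * sqrt(1/c) percent at 95% confidence for c samples), which is an assumption brought in here rather than something stated in this article.

```python
import math


def estimate_total(sample_count, probability):
    """Scale a count of sampled packets up to an estimated packet total."""
    if not 0 < probability <= 1:
        raise ValueError("probability must be in (0, 1]")
    # A probability of 0.01 corresponds to a 1-in-100 sampling rate.
    # Rounding keeps the rate an integer; this is a simplification.
    sampling_rate = round(1 / probability)
    return sample_count * sampling_rate


def percent_error_bound(sample_count):
    """Approximate 95% confidence bound on the relative error, in percent."""
    if sample_count <= 0:
        raise ValueError("need at least one sample")
    return 196 * math.sqrt(1 / sample_count)


if __name__ == "__main__":
    samples = 2500  # hypothetical number of samples seen for one traffic class
    print(estimate_total(samples, 0.01))
    print(round(percent_error_bound(samples), 2))
```

The bound depends only on the number of samples collected, not on the total traffic volume, so busier traffic classes are estimated more accurately at the same sampling probability.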

The Host sFlow agents were configured to poll counters every 30 seconds and pick up the packet samples via ULOG, sending the resulting sFlow to the collector at 10.117.46.49:

[ec2-user@ip-10-244-162-76 ~]$ more /etc/hsflowd.conf
sflow {
  DNSSD = off
  polling = 30
  sampling = 400

  collector {
    ip = 10.117.46.49
  }

  ulogGroup = 5
  ulogProbability = 0.01
}
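
The ULOG group and sampling probability appear in both files, and in this example the values match (group 5, probability 0.01). When many instances are launched, rendering both fragments from a single set of parameters is one way to keep them in step. The following is a minimal sketch under that assumption; the helper names are hypothetical and only the settings shown in the listings above are used.

```python
def render_iptables_rules(probability, group):
    """Return the two sampling rules in the iptables-save format shown above."""
    return [
        f"-A {chain} -m statistic --mode random "
        f"--probability {probability:.6f} -j ULOG --ulog-nlgroup {group}"
        for chain in ("INPUT", "OUTPUT")
    ]


def render_hsflowd_conf(collector_ip, probability, group, polling=30, sampling=400):
    """Return an hsflowd.conf body using only the settings from this example."""
    return "\n".join([
        "sflow {",
        "  DNSSD = off",
        f"  polling = {polling}",
        f"  sampling = {sampling}",
        "",
        "  collector {",
        f"    ip = {collector_ip}",
        "  }",
        "",
        f"  ulogGroup = {group}",
        f"  ulogProbability = {probability}",
        "}",
    ]) + "\n"


if __name__ == "__main__":
    # Same values as the listings in this article.
    for rule in render_iptables_rules(0.01, 5):
        print(rule)
    print(render_hsflowd_conf("10.117.46.49", 0.01, 5), end="")
```

Passing the same `probability` and `group` to both helpers means the firewall rules and the agent configuration cannot drift apart.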

Deploying an sFlow analyzer into the cloud provides real-time reports of performance across all the server instances in the cloud. For example, the following chart shows a cluster-wide view of performance:

The following chart displays the top network connections to the cluster:

In addition to monitoring server and network performance, sFlow can also be used to monitor the performance of the scale-out applications that are typically deployed in the cloud, including web farms and memcached and membase clusters.

The sFlow standard is extremely well suited for cloud performance monitoring. The scalability of sFlow allows tens of thousands of cloud servers to be centrally monitored. With sFlow, data is continuously sent from the cloud servers to the sFlow analyzer, providing a real-time view of performance across the cloud.

The sFlow push model is much more efficient than typical monitoring architectures that require the management system to periodically poll servers for statistics. Polling breaks down in highly dynamic cloud environments where servers can appear and disappear. With sFlow, cloud servers are automatically discovered and continuously monitored as soon as they are created. The sFlow messages act as a server heartbeat, providing rapid notification when a server is deleted and stops sending sFlow.
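
The discovery and heartbeat behaviour described above can be sketched from the collector's side. This is not how any particular sFlow analyzer is implemented: `AgentTracker`, its methods, and the 90-second timeout (three missed 30-second polling intervals) are illustrative assumptions, and no sFlow datagrams are actually decoded here.

```python
class AgentTracker:
    """Track which agents are alive based on when they last sent a datagram."""

    def __init__(self, timeout_seconds=90.0):
        self.timeout_seconds = timeout_seconds
        self.last_seen = {}  # agent address -> time of most recent datagram

    def record_datagram(self, agent, now):
        """Note a datagram from `agent`; return True if the agent is new."""
        is_new = agent not in self.last_seen
        self.last_seen[agent] = now
        return is_new

    def expire(self, now):
        """Remove and return agents silent for longer than the timeout."""
        gone = sorted(
            agent for agent, seen in self.last_seen.items()
            if now - seen > self.timeout_seconds
        )
        for agent in gone:
            del self.last_seen[agent]
        return gone


if __name__ == "__main__":
    tracker = AgentTracker()
    tracker.record_datagram("10.244.162.76", now=0.0)
    tracker.record_datagram("10.117.46.49", now=0.0)
    tracker.record_datagram("10.117.46.49", now=60.0)
    # Only the agent that has stayed silent past the timeout is reported.
    print(tracker.expire(now=100.0))
```

Because agents announce themselves simply by sending data, the collector needs no inventory of servers in advance; a new instance is registered on its first datagram and dropped once it falls silent.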

Finally, sFlow provides the detailed, real-time visibility into network, server and application performance needed to manage performance and control costs. For anyone interested in more information on sFlow, the sFlow presentation provides a strategic view of the role that sFlow monitoring plays in converged, virtualized and cloud environments.
