The Cumulus Linux platform makes it possible to run the same open source agent on switches, servers, and hypervisors - providing unified end-to-end visibility across the data center. The open networking model that Cumulus is pioneering offers exciting opportunities. Cumulus Linux allows popular open source server orchestration tools to also manage the network, and the combination of real-time, data-center-wide analytics with orchestration makes it possible to create self-optimizing data centers.
Install and configure Host sFlow agent
The following command installs the Host sFlow agent on a Cumulus Linux switch:

sudo apt-get install hsflowd

Note: Network managers may find this command odd since it is usually not possible to install third-party software on switch hardware. However, what is even more radical is that Cumulus Linux allows users to download source code and compile it on their switch. Instead of being dependent on the switch vendor to fix a bug or add a feature, users are free to change the source code and contribute the changes back to the community.
The sFlow agent requires very little configuration, automatically monitoring all switch ports using the following default settings:
| Link Speed | Sampling Rate | Polling Interval |
|---|---|---|
| 1 Gbit/s | 1-in-1,000 | 30 seconds |
| 10 Gbit/s | 1-in-10,000 | 30 seconds |
| 40 Gbit/s | 1-in-40,000 | 30 seconds |
| 100 Gbit/s | 1-in-100,000 | 30 seconds |
Note: The default settings ensure that large flows (defined as consuming 10% of link bandwidth) are detected within approximately 1 second - see Large flow detection.
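A quick back-of-the-envelope calculation shows why the defaults behave this way. The sketch below (a hypothetical helper, not part of hsflowd) estimates how many packet samples per second a large flow generates; because the default sampling rates scale with link speed, the detection time stays roughly constant across link speeds:

```python
def samples_per_second(link_bps, sampling_rate, flow_fraction=0.10, pkt_bytes=1500):
    """Estimate samples/second generated by a flow consuming a fraction
    of the link, assuming full-size packets and 1-in-N sampling."""
    pkts_per_sec = (link_bps * flow_fraction) / (pkt_bytes * 8)
    return pkts_per_sec / sampling_rate

# 10 Gbit/s link with the default 1-in-10,000 sampling: a flow using
# 10% of the link yields roughly 8 samples per second, so a handful of
# samples arrive within about a second.
rate_10g = samples_per_second(10e9, 10_000)

# 40 Gbit/s with 1-in-40,000 gives the same rate - detection time is
# independent of link speed under the default settings.
rate_40g = samples_per_second(40e9, 40_000)
```

The assumed 1500-byte packet size is illustrative; smaller packets produce samples even faster.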
Once the Host sFlow agent is installed, there are two alternative configuration mechanisms that can be used to tell the agent where to send the measurements:
1. DNS Service Discovery (DNS-SD)
This is the default configuration mechanism for Host sFlow agents. DNS-SD uses a special type of DNS record (the SRV record) to allow hosts to automatically discover servers. For example, adding the following line to the site DNS zone file will enable sFlow on all the agents and direct the sFlow measurements to an sFlow analyzer (10.0.0.1) on the standard sFlow port (6343):

_sflow._udp 300 SRV 0 0 6343 10.0.0.1

No Host sFlow agent specific configuration is required; each switch or host will automatically pick up the settings when the Host sFlow agent is installed, when the device is restarted, or if settings on the DNS server are changed.
Default sampling rates and polling interval can be overridden by adding a TXT record to the zone file. For example, the following TXT record reduces the sampling rate on 10G links to 1-in-2000 and the polling interval to 20 seconds:
_sflow._udp 300 TXT ( "txtvers=1" "sampling.10G=2000" "polling=20" )

Note: Currently defined TXT options are described on sFlow.org.
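The TXT record strings are simple key=value pairs, so they are straightforward to interpret. The following sketch (illustrative only; hsflowd performs this parsing internally) turns a list of TXT strings like the ones above into a settings dictionary:

```python
def parse_sflow_txt(strings):
    """Parse sFlow DNS-SD TXT record strings ("key=value") into a dict."""
    settings = {}
    for s in strings:
        key, _, value = s.partition("=")
        settings[key] = value
    return settings

# The TXT record from the zone file above:
opts = parse_sflow_txt(["txtvers=1", "sampling.10G=2000", "polling=20"])
```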
The article DNS-SD describes how DNS service discovery allows sFlow agents to automatically discover their configuration settings. The slides DNS Service Discovery from a talk at the SF Bay Area Large Scale Production Engineering Meetup provide additional background.
2. Configuration File
The Host sFlow agent is configured by editing the /etc/hsflowd.conf file. For example, the following configuration disables DNS-SD, instructs the agent to send sFlow to 10.0.0.1, reduces the sampling rate on 10G links to 1-in-2000, and reduces the polling interval to 20 seconds:

sflow {
  DNSSD = off
  polling = 20
  sampling.10G = 2000
  collector {
    ip = 10.0.0.1
  }
}

The Host sFlow agent must be restarted for configuration changes to take effect:

sudo /etc/init.d/hsflowd restart

All hosts and switches can share the same settings, and it is straightforward to use orchestration tools such as Puppet, Chef, etc. to manage the sFlow settings.
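Since the same settings apply everywhere, an orchestration tool can render /etc/hsflowd.conf from a template. A minimal sketch (hypothetical helper, not an official template) might look like this:

```python
def render_hsflowd_conf(collector_ip, polling=20, sampling_10g=2000, dnssd=False):
    """Render an /etc/hsflowd.conf body from shared settings (sketch)."""
    return "\n".join([
        "sflow {",
        f"  DNSSD = {'on' if dnssd else 'off'}",
        f"  polling = {polling}",
        f"  sampling.10G = {sampling_10g}",
        "  collector {",
        f"    ip = {collector_ip}",
        "  }",
        "}",
    ])

# Same configuration as the example above:
conf = render_hsflowd_conf("10.0.0.1")
```

In practice a Puppet or Chef template would do the same substitution and then restart hsflowd to apply the change.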
Collecting and analyzing sFlow
Figure 1: Visibility and the software defined data center
- Applications - e.g. Apache, NGINX, Tomcat, Memcache, HAProxy, F5, A10 ...
- Virtual Servers - e.g. Xen, Hyper-V, KVM ...
- Virtual Network - e.g. Open vSwitch, Hyper-V extensible vSwitch
- Servers - e.g. BSD, Linux, Solaris and Windows
- Network - over 40 switch vendors, see Drivers for growth
The sFlow data from a Cumulus switch contains standard Linux performance statistics in addition to the interface counters and packet samples that you would typically get from a networking device.
Note: Enhanced visibility into host performance is important on open switch platforms since they may be running a number of user installed services that can stress the limited CPU, memory and IO resources.
For example, the following sflowtool output shows the raw data contained in an sFlow datagram from a switch running Cumulus Linux:
startDatagram =================================
datagramSourceIP 10.0.0.160
datagramSize 1332
unixSecondsUTC 1402004767
datagramVersion 5
agentSubId 100000
agent 10.0.0.233
packetSequenceNo 340132
sysUpTime 17479000
samplesInPacket 7
startSample ----------------------
sampleType_tag 0:2
sampleType COUNTERSSAMPLE
sampleSequenceNo 876
sourceId 2:1
counterBlock_tag 0:2001
adaptor_0_ifIndex 2
adaptor_0_MACs 1
adaptor_0_MAC_0 6c641a000459
counterBlock_tag 0:2005
disk_total 0
disk_free 0
disk_partition_max_used 0.00
disk_reads 980
disk_bytes_read 4014080
disk_read_time 1501
disk_writes 0
disk_bytes_written 0
disk_write_time 0
counterBlock_tag 0:2004
mem_total 2056589312
mem_free 1100533760
mem_shared 0
mem_buffers 33464320
mem_cached 807546880
swap_total 0
swap_free 0
page_in 35947
page_out 0
swap_in 0
swap_out 0
counterBlock_tag 0:2003
cpu_load_one 0.390
cpu_load_five 0.440
cpu_load_fifteen 0.430
cpu_proc_run 1
cpu_proc_total 95
cpu_num 2
cpu_speed 0
cpu_uptime 770774
cpu_user 160600160
cpu_nice 192970
cpu_system 77855100
cpu_idle 1302586110
cpu_wio 4650
cpuintr 0
cpu_sintr 308370
cpuinterrupts 1851322098
cpu_contexts 800650455
counterBlock_tag 0:2006
nio_bytes_in 405248572711
nio_pkts_in 394079084
nio_errs_in 0
nio_drops_in 0
nio_bytes_out 406139719695
nio_pkts_out 394667262
nio_errs_out 0
nio_drops_out 0
counterBlock_tag 0:2000
hostname cumulus
UUID fd-01-78-45-93-93-42-03-a0-5a-a3-d7-42-ac-3c-de
machine_type 7
os_name 2
os_release 3.2.46-1+deb7u1+cl2+1
endSample ----------------------
startSample ----------------------
sampleType_tag 0:2
sampleType COUNTERSSAMPLE
sampleSequenceNo 876
sourceId 0:44
counterBlock_tag 0:1005
ifName swp42
counterBlock_tag 0:1
ifIndex 44
networkType 6
ifSpeed 0
ifDirection 2
ifStatus 0
ifInOctets 0
ifInUcastPkts 0
ifInMulticastPkts 0
ifInBroadcastPkts 0
ifInDiscards 0
ifInErrors 0
ifInUnknownProtos 4294967295
ifOutOctets 0
ifOutUcastPkts 0
ifOutMulticastPkts 0
ifOutBroadcastPkts 0
ifOutDiscards 0
ifOutErrors 0
ifPromiscuousMode 0
endSample ----------------------
startSample ----------------------
sampleType_tag 0:1
sampleType FLOWSAMPLE
sampleSequenceNo 1022129
sourceId 0:7
meanSkipCount 128
samplePool 130832512
dropEvents 0
inputPort 7
outputPort 10
flowBlock_tag 0:1
flowSampleType HEADER
headerProtocol 1
sampledPacketSize 1518
strippedBytes 4
headerLen 128
headerBytes 6C-64-1A-00-04-5E-E8-E7-32-77-E2-B5-08-00-45-00-05-DC-63-06-40-00-40-06-9E-21-0A-64-0A-97-0A-64-14-96-9A-6D-13-89-4A-0C-4A-42-EA-3C-14-B5-80-10-00-2E-AB-45-00-00-01-01-08-0A-5D-B2-EB-A5-15-ED-48-B7-34-35-36-37-38-39-30-31-32-33-34-35-36-37-38-39-30-31-32-33-34-35-36-37-38-39-30-31-32-33-34-35-36-37-38-39-30-31-32-33-34-35-36-37-38-39-30-31-32-33-34-35-36-37-38-39-30-31-32-33-34-35
dstMAC 6c641a00045e
srcMAC e8e73277e2b5
IPSize 1500
ip.tot_len 1500
srcIP 10.100.10.151
dstIP 10.100.20.150
IPProtocol 6
IPTOS 0
IPTTL 64
TCPSrcPort 39533
TCPDstPort 5001
TCPFlags 16
endSample ----------------------

While sflowtool is extremely useful, there are many other open source and commercial tools available.
Note: The sFlow Collectors list on sFlow.org contains a number of additional tools.
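The key-value format that sflowtool emits is also easy to post-process in scripts. The following minimal sketch (illustrative only; it assumes the startSample/endSample markers shown in the output above) groups sflowtool output lines into one dictionary per sample:

```python
def parse_sflowtool(lines):
    """Group sflowtool key-value output lines into a dict per sample."""
    samples, current = [], None
    for line in lines:
        fields = line.split()
        if not fields:
            continue
        key = fields[0]
        if key == "startSample":
            current = {}           # begin collecting a new sample
        elif key == "endSample":
            if current is not None:
                samples.append(current)
            current = None
        elif current is not None and len(fields) >= 2:
            current[key] = fields[1]
    return samples

# Example: feed it sflowtool output, e.g. from a pipe:
#   sflowtool | python parse.py
samples = parse_sflowtool([
    "startSample ----------------------",
    "sampleType FLOWSAMPLE",
    "srcIP 10.100.10.151",
    "endSample ----------------------",
])
```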
There is a great deal of variety among sFlow collectors - many focus on the network, others have a compute infrastructure focus, and yet others report on application performance. The shared sFlow measurement infrastructure delivers value in each of these areas. However, as network, storage, host and application resources are brought together and automated to create cloud data centers, a new set of sFlow analytics tools is emerging to deliver the integrated real-time visibility required to drive automation and optimize performance and efficiency across the data center.
While network administrators are likely to be familiar with sFlow, application development and operations teams may be unfamiliar with the technology. The 2012 O'Reilly Velocity conference talk provides an introduction to sFlow aimed at the DevOps community.

Cumulus Linux presents the switch as a server with a large number of network adapters, an abstraction that will be instantly familiar to anyone with server management experience. For example, displaying interface information on Cumulus Linux uses the standard Linux command:
ifconfig swp2

On the other hand, network administrators experienced with switch CLIs may find that Linux commands take a little time to get used to - the above command is roughly equivalent to:
show interfaces fastEthernet 6/1

However, the basic concepts of networking don't change and these skills are essential to designing, automating, operating and troubleshooting data center networks. Open networking platforms such as Cumulus Linux are an important piece of the automation puzzle, taking networking out of its silo and allowing a combined NetDevOps team to manage network, server, and application resources using proven monitoring and orchestration tools such as Ganglia, Graphite, Nagios, CFEngine, Puppet, Chef, Ansible, and Salt.