Open Virtual Network (OVN) is an open source network virtualization solution built as part of the Open vSwitch (OVS) project. OVN provides layer 2/3 virtual networking and firewall services for connecting virtual machines and Linux containers.
OVN is built on the same architectural principles as VMware's commercial NSX and offers the same core network virtualization capability, providing a free alternative that is likely to see rapid adoption in open source orchestration systems (Mirantis: Why the Open Virtual Network (OVN) matters to OpenStack).
This article uses OVN as an example, describing a testbed that demonstrates how the standard sFlow instrumentation built into the physical and virtual switches provides the end-to-end visibility required to manage large scale network virtualization and deliver reliable services.
Open Virtual Network
The Northbound DB provides a way to describe the logical networks that are required. The database abstracts away implementation details, which are handled by ovn-northd and the ovn-controller daemons, and presents an easily consumable network virtualization service to orchestration tools like OpenStack.
The purple tables on the left describe a simple logical switch LS1 that has two logical ports LP1 and LP2 with MAC addresses AA and BB respectively. The green tables on the right show the Southbound DB that is constructed by combining information from the ovn-controllers on hypervisors HV1 and HV2 to build forwarding tables in the vSwitches that realize the virtual network.
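As an illustration of how the Northbound DB is populated, the following sketch shows roughly how the LS1 / LP1 / LP2 topology in the diagram might be created with the ovn-nbctl utility. The full MAC addresses are placeholders standing in for the abbreviated AA and BB in the diagram, and command names vary between OVN releases (older builds used lswitch-add / lport-add):

# Create logical switch LS1 in the Northbound DB
ovn-nbctl ls-add LS1
# Add logical port LP1 and assign it MAC address "AA" (placeholder value)
ovn-nbctl lsp-add LS1 LP1
ovn-nbctl lsp-set-addresses LP1 00:00:00:00:00:aa
# Add logical port LP2 and assign it MAC address "BB" (placeholder value)
ovn-nbctl lsp-add LS1 LP2
ovn-nbctl lsp-set-addresses LP2 00:00:00:00:00:bb

ovn-northd then translates these Northbound records into the Southbound DB entries shown in green.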
Docker, OVN, OVS, ECMP Testbed
- Physical Network The recent release of Cumulus VX by Cumulus Networks makes it possible to build realistic networks out of virtual machines. In this case we built a two-spine, two-leaf network using VirtualBox that provides L3 ECMP connectivity using BGP as the routing protocol, a configuration that is very similar to that used by large cloud providers. The green virtual machines leaf1, leaf2, spine1 and spine2 comprise the ECMP network.
- Servers Server 1, Server 2 and the Orchestration Server virtual machines are ubuntu-14.04.3-server installations. Server 1 and Server 2 are connected to the physical network with addresses 192.168.1.1 and 192.168.2.1 respectively; these addresses form the underlay network. Docker has been installed on Server 1 and Server 2, and each server runs two containers. The containers on Server 1 have been assigned 172.16.1.1 (MAC 00:00:00:CC:01:01) and 172.16.1.2 (MAC 00:00:00:CC:01:02), and the containers on Server 2 have been assigned 172.16.2.1 (MAC 00:00:00:CC:02:01) and 172.16.2.2 (MAC 00:00:00:CC:02:02).
- Virtual Network Open vSwitch (OVS) was installed from sources on Server 1 and Server 2, along with the ovn-controller daemons. The ovn-northd daemon was built and installed on the Orchestration Server. A single logical switch, sw0, has been configured to connect server1-container2 (MAC 00:00:00:CC:01:02) to server2-container2 (MAC 00:00:00:CC:02:02); a configuration sketch follows this list.
- Management Network The out of band management network shown in orange is a VirtualBox bridged network connecting management ports on the physical switches and servers to the Orchestration Server.
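The exact commands used in the testbed are not reproduced here, but the general pattern for connecting a container to an OVN logical switch is to set the external_ids:iface-id field on the container's Open vSwitch interface so that ovn-controller can match it to the logical port of the same name. A rough sketch, in which the logical port name sw0-port1 and the interface name veth-c2 are hypothetical:

# On the Orchestration Server: logical switch sw0 with a port for the container
ovn-nbctl ls-add sw0
ovn-nbctl lsp-add sw0 sw0-port1
ovn-nbctl lsp-set-addresses sw0-port1 00:00:00:CC:01:02

# On Server 1: attach the container's interface to the integration bridge and
# bind it to the logical port by name so that ovn-controller can claim it
ovs-vsctl add-port br-int veth-c2 \
    -- set Interface veth-c2 external_ids:iface-id=sw0-port1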
Visibility
Enabling sFlow instrumentation in the testbed provides visibility into the physical and virtual networks and the server resources associated with the logical network. Most physical switches support sFlow. With Cumulus Linux, installing the Host sFlow agent enables the hardware sFlow support in the bare metal switch, providing line rate monitoring on every 1, 10, 25, 40, 50 and 100 Gbit/s port. Since Cumulus VX isn't a hardware switch, the Host sFlow agent instead uses the Linux iptables/nflog capability to monitor traffic.
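As a rough illustration (not the exact configuration used in the testbed), a minimal /etc/hsflowd.conf pointing the switch agents at the analytics server might look something like the following; option names vary between Host sFlow releases, and on Cumulus VX additional settings select the iptables/nflog sampling mechanism:

sflow {
  # send sFlow to the analytics software on the Orchestration Server
  collector {
    ip = 10.0.0.86
    udpport = 6343
  }
  # 1-in-400 packet sampling, 20 second counter polling (illustrative values)
  sampling = 400
  polling = 20
}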
Host sFlow agents are installed on Server 1 and Server 2. These agents stream server, virtual machine, and container metrics. In addition, the Host sFlow agent automatically enables sFlow in Open vSwitch, which in turn exports traffic flow, interface counter, resource and tunnel encap/decap information.
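Under the hood, enabling sFlow on Open vSwitch amounts to creating an sFlow entry in the OVS database and attaching it to the bridge. The Host sFlow agent automates this step, but the equivalent manual configuration looks roughly like this (the bridge name, agent interface and sampling parameters shown are illustrative):

ovs-vsctl -- --id=@sflow create sflow agent=eth0 \
    target=\"10.0.0.86:6343\" header=128 sampling=400 polling=20 \
    -- set bridge br-int sflow=@sflow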
sFlow data from leaf1, leaf2, spine1, spine2, server1 and server2 is transmitted over the management network to the sFlow-RT real-time analytics software running on the Orchestration Server.
A difficult challenge in managing large scale cloud infrastructure is rapidly identifying overloaded resources (hot spots) and acting on that information. The sections below work through the following examples:
- Congested network link between physical switches
- Poorly performing virtual switch
- Overloaded server
- Overloaded container / virtual machine
- Oversubscribed service pool
- Distributed Denial of Service (DDoS) attack
- DevOps
The sFlow-RT analytics platform is designed with automation in mind, providing REST and embedded script APIs that facilitate metrics-driven control actions. The following examples use the sFlow-RT REST API to demonstrate the type of data available through sFlow.
Congested network link between physical switches
The following query finds the busiest link in the fabric based on sFlow interface counters:
curl "http://10.0.0.86:8008/metric/10.0.0.80;10.0.0.81;10.0.0.82;100.0.0.83/max:ifinoctets,max:ifoutoctets/json" [ { "agent": "10.0.0.80", "dataSource": "4", "lastUpdate": 3374, "lastUpdateMax": 17190, "lastUpdateMin": 3374, "metricN": 21, "metricName": "max:ifinoctets", "metricValue": 101670.72864951608 }, { "agent": "10.0.0.80", "dataSource": "4", "lastUpdate": 3375, "lastUpdateMax": 17191, "lastUpdateMin": 3375, "metricN": 21, "metricName": "max:ifoutoctets", "metricValue": 101671.07968507096 } ]Mapping the sFlow agent and dataSource associated with the busy link to a switch name and interface name is accomplished with a second query:
curl "http://10.0.0.86:8008/metric/10.0.0.80/host_name,4.ifname/json" [ { "agent": "10.0.0.80", "dataSource": "2.1", "lastUpdate": 13011, "lastUpdateMax": 13011, "lastUpdateMin": 13011, "metricN": 1, "metricName": "host_name", "metricValue": "leaf1" }, { "agent": "10.0.0.80", "dataSource": "4", "lastUpdate": 13011, "lastUpdateMax": 13011, "lastUpdateMin": 13011, "metricN": 1, "metricName": "4.ifname", "metricValue": "swp2" } ]Now we know that interface swp2 on switch leaf1 is the busy link, the next step is identifying the traffic flowing on the link by creating a flow definition (see RESTflow):
curl -H "Content-Type:application/json" -X PUT -d '{"keys":"macsource,macdestination,ipsource,ipdestination,stack","value":"bytes"}' http://10.0.0.86:8008/flow/test1/jsonNow that a flow has been defined, we can query the new metric to see traffic on the port:
curl "http://10.0.0.86:8008/metric/10.0.0.80/4.test1/json" [{ "agent": "10.0.0.80", "dataSource": "4", "lastUpdate": 714, "lastUpdateMax": 714, "lastUpdateMin": 714, "metricN": 1, "metricName": "4.test1", "metricValue": 211902.75708445764, "topKeys": [{ "key": "080027AABAA5,08002745B9B4,192.168.2.1,192.168.1.1,eth.ip.udp.geneve.eth.ip.icmp", "lastUpdate": 712, "value": 211902.75708445764 }] }]We can see that the traffic is a Geneve tunnel between Server 2 (192.168.2.1) and Server 1 (192.168.1.1) and that it is carrying encapsulated ICMP traffic. At this point, an additional flow can be created to find the sources of traffic in the virtual overlay network (see Down the rabbit hole).
The following flow definition takes the data from the physical switches and examines the tunnel contents:
curl -H "Content-Type:application/json" -X PUT -d '{"keys":"ipsource,ipdestination,genevevni,macsource.1,host:macsource.1:vir_host_name,macdestination.1,host:macdestination.1:vir_host_name,ipsource.1,ipdestination.1,stack","value":"bytes"}' http://10.0.0.86:8008/flow/test2/jsonQuerying the new metric to find out about the flow:
curl "http://10.0.0.86:8008/metric/10.0.0.80/4.test2/json" [{ "agent": "10.0.0.80", "dataSource": "4", "lastUpdate": 9442, "lastUpdateMax": 9442, "lastUpdateMin": 9442, "metricN": 1, "metricName": "4.test2", "metricValue": 3423.596229984865, "topKeys": [{ "key": "192.168.2.1,192.168.1.1,1,000000CC0202,/lonely_albattani,000000CC0102,/angry_hopper,172.16.2.2,172.16.1.2,eth.ip.udp.geneve.eth.ip.icmp", "lastUpdate": 9442, "value": 3423.596229984865 }]Now it is clear that the encapsulated flow starts at Server 2, Container 2 and ends at Server 1, Container 1.
Querying the OVN Northbound database for the MAC addresses 000000CC0202 and 000000CC0102 links this traffic to the two ports on logical switch sw0. The flow also merges in information about the identity of the containers, obtained from the sFlow export of the Host sFlow agents on the servers. For example, the host:macsource.1:vir_host_name function in the flow definition looks up the virtual host name associated with the inner source MAC address, in this case identifying the Docker container named /lonely_albattani as the source of the traffic.
At this point we have enough information to start putting controls in place. For example, knowing the container name and hosting server would allow the container to be shut down, or the container workload could be moved - a relatively simple task since OVN will automatically update the settings on the destination server to associate the container with its logical network.
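For instance, a crude manual intervention, assuming the container name reported above, would be to stop the offending container on Server 2:

docker stop lonely_albattani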
While this example showed manual steps to demonstrate sFlow-RT APIs, in practice the entire process is automated. For example, Leaf and spine traffic engineering using segment routing and SDN demonstrates how congestion on the physical links can be mitigated in ECMP fabrics.
Poor virtual switch performance
Open vSwitch performance monitoring describes key datapath performance metrics that Open vSwitch includes in its sFlow export. For example, the following query identifies the virtual switch with the lowest cache hit rate, the switch handling the largest number of cache misses, and the switch handling the largest number of active flows:
curl "http://10.0.0.86:8008/metric/ALL/min:ovs_dp_hitrate,max:ovs_dp_misses,max:ovs_dp_flows/json" [ { "agent": "10.0.0.84", "dataSource": "2.1000", "lastUpdate": 19782, "lastUpdateMax": 19782, "lastUpdateMin": 19782, "metricN": 2, "metricName": "min:ovs_dp_hitrate", "metricValue": 99.91260923845194 }, { "agent": "10.0.0.84", "dataSource": "2.1000", "lastUpdate": 19782, "lastUpdateMax": 19782, "lastUpdateMin": 19782, "metricN": 2, "metricName": "max:ovs_dp_misses", "metricValue": 0.3516881028938907 }, { "agent": "10.0.0.85", "dataSource": "2.1000", "lastUpdate": 8090, "lastUpdateMax": 19782, "lastUpdateMin": 8090, "metricN": 2, "metricName": "max:ovs_dp_flows", "metricValue": 11 } ]In this case the vSwitch on Server 1 (10.0.0.84) is handling the largest number of packets in its slow path and has the lowest cache hit rate. The vSwitch on Server 2 (10.0.0.85) has the largest number of active flows in its datapath.
The Open vSwitch datapath integrates sFlow support. The test1 flow definition created in the previous example provides general L2/L3 information, so we can make a query to see the active flows in the datapath on 10.0.0.84:
curl "http://10.0.0.86:8008/activeflows/10.0.0.84/test1/json" [ { "agent": "10.0.0.84", "dataSource": "16", "flowN": 1, "key": "000000CC0102,000000CC0202,172.16.1.2,172.16.2.2,eth.ip.icmp", "value": 97002.07726081279 }, { "agent": "10.0.0.84", "dataSource": "0", "flowN": 1, "key": "000000CC0202,000000CC0102,172.16.2.2,172.16.1.2,eth.ip.icmp", "value": 60884.34095101907 }, { "agent": "10.0.0.84", "dataSource": "3", "flowN": 1, "key": "080027946A4E,0800271AF7F0,192.168.2.1,192.168.1.1,eth.ip.udp.geneve.eth.ip.icmp", "value": 47117.093823014926 }, { "agent": "10.0.0.84", "dataSource": "17", "flowN": 1, "key": "0800271AF7F0,080027946A4E,192.168.1.1,192.168.2.1,eth.ip.udp.geneve.eth.ip.icmp", "value": 37191.709371373545 } ]The previous example showed how the flow information can be associated with Docker containers, logical networks, and physical networks so that control actions can be planned and executed reduce traffic on an overloaded virtual switch.
Overloaded server
The following query finds the server with the highest load average and the server with the highest CPU utilization:
curl "http://10.0.0.86:8008/metric/ALL/max:load_one,max:cpu_utilization/json" [ { "agent": "10.0.0.84", "dataSource": "2.1", "lastUpdate": 10661, "lastUpdateMax": 13769, "lastUpdateMin": 10661, "metricN": 7, "metricName": "max:load_one", "metricValue": 0.82 }, { "agent": "10.0.0.84", "dataSource": "2.1", "lastUpdate": 10661, "lastUpdateMax": 13769, "lastUpdateMin": 10661, "metricN": 7, "metricName": "max:cpu_utilization", "metricValue": 69.68566862013851 } ]In this case Server 1 (10.0.0.84) has the highest CPU load.
Interestingly, the switches in this case are running Cumulus Linux, which for all intents and purposes makes them servers, since Cumulus Linux is based on Debian and can run unmodified Debian packages, including Host sFlow (see Cumulus Networks, sFlow and data center automation). If the busiest server happens to be one of the switches, it will show up as a result in this query. Since many workloads in a cloud environment tend to be network services, following up by examining network traffic, as demonstrated in the previous two examples, is often the next step in identifying the source of the load.
In this case the server is also running Linux containers and the next example shows how to identify busy containers / virtual machines.
Overloaded container / virtual machine
The following query finds the container / virtual machine with the largest CPU utilization:
curl "http://10.0.0.86:8008/metric/ALL/max:vir_cpu_utilization/json" [{ "agent": "10.0.0.84", "dataSource": "3.100002", "lastUpdate": 13949, "lastUpdateMax": 13949, "lastUpdateMin": 13949, "metricN": 2, "metricName": "max:vir_cpu_utilization", "metricValue": 62.7706705162029 }]The following query extracts additional information for the agent and dataSource:
curl "http://10.0.0.86:8008/metric/10.0.0.84/host_name,node_domains,cpu_utilization,3.100002.vir_host_name/json" [ { "agent": "10.0.0.84", "dataSource": "2.1", "lastUpdate": 3377, "lastUpdateMax": 3377, "lastUpdateMin": 3377, "metricN": 1, "metricName": "host_name", "metricValue": "server1" }, { "agent": "10.0.0.84", "dataSource": "2.1", "lastUpdate": 3377, "lastUpdateMax": 3377, "lastUpdateMin": 3377, "metricN": 1, "metricName": "node_domains", "metricValue": 2 }, { "agent": "10.0.0.84", "dataSource": "2.1", "lastUpdate": 3377, "lastUpdateMax": 3377, "lastUpdateMin": 3377, "metricN": 1, "metricName": "cpu_utilization", "metricValue": 69.4535519125683 }, { "agent": "10.0.0.84", "dataSource": "3.100002", "lastUpdate": 19429, "lastUpdateMax": 19429, "lastUpdateMin": 19429, "metricN": 1, "metricName": "3.100002.vir_host_name", "metricValue": "/angry_hopper" } ]The results identify the container /angry_hopper running on server1, which is running two containers and itself has a CPU load of 69%.
Oversubscribed service pool
Cluster performance metrics describes how sFlow metrics can be used to characterize the performance of a pool of servers.
Video: Dynamically Scaling Netflix in the Cloud
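Although the testbed pool contains only two servers, the same REST API pattern used in the earlier examples can summarize metrics across a pool. For example, the following hypothetical query reports the average and maximum CPU utilization across Server 1 and Server 2:

curl "http://10.0.0.86:8008/metric/10.0.0.84;10.0.0.85/avg:cpu_utilization,max:cpu_utilization/json"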
Distributed Denial of Service (DDoS) attack
Multi-tenant performance isolation describes a large scale outage at a cloud service provider caused by a DDoS attack. The real-time traffic information available through sFlow provides the information needed to identify attacks and target mitigation actions in order to maintain service levels. DDoS mitigation with Cumulus Linux describes how hardware filtering capabilities of physical switches can be deployed to automatically filter out large scale attacks that would otherwise overload the servers.
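As a sketch of how detection might be automated with the REST API already shown (the flow name, threshold value and packet-rate metric here are assumptions for illustration), a flow keyed on destination address can be paired with a threshold so that sFlow-RT raises an event when any single destination receives an abnormally high packet rate:

# define a flow counting frames per destination address
curl -H "Content-Type:application/json" -X PUT -d '{"keys":"ipdestination","value":"frames"}' http://10.0.0.86:8008/flow/ddos/json
# raise an event when any single flow exceeds 100,000 packets per second (illustrative value)
curl -H "Content-Type:application/json" -X PUT -d '{"metric":"ddos","value":100000,"byFlow":true}' http://10.0.0.86:8008/threshold/ddos/json
# poll for threshold events
curl "http://10.0.0.86:8008/events/json?maxEvents=10&timeout=60"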
DevOps
The previous examples focused on automation applications for sFlow. The diagram above shows how the sFlow-RT analytics engine is used to deliver metrics and events to cloud based and on-site DevOps tools, see: Cloud analytics, InfluxDB and Grafana, Metric export to Graphite, and Exporting events using syslog. There are important scalability and cost advantages to placing the sFlow-RT analytics engine in front of metrics collection applications as shown in the diagram. For example, in large scale cloud environments the metrics for each member of a dynamic pool are not necessarily worth trending, since virtual machines are frequently added and removed. Instead, sFlow-RT can be configured to track all the members of the pool, calculate summary statistics for the pool, and log only the summaries. This pre-processing can significantly reduce storage requirements, reduce costs and increase query performance.
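As a simple illustration of this pre-processing idea (the Graphite host and metric path are hypothetical, and the jq utility is assumed to be available), a single summary statistic for the server pool can be pulled from sFlow-RT and pushed to Graphite's plaintext protocol instead of logging every pool member:

# fetch the average 1-minute load across the server pool from sFlow-RT
load=$(curl -s "http://10.0.0.86:8008/metric/10.0.0.84;10.0.0.85/avg:load_one/json" \
  | jq -r '.[0].metricValue')

# push the single summary value to Graphite (hypothetical host and metric path)
echo "pools.servers.avg_load_one $load $(date +%s)" | nc graphite.example.com 2003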