Sunday, June 21, 2015

Optimizing software defined data center

The recent Fortune magazine article, Software-defined data center market to hit $77.18 billion by 2020, starts with the quote "Data centers are no longer just about all the hardware gear you can stitch together for better operations. There’s a lot of software involved to squeeze more performance out of your hardware, and all that software is expected to contribute to a burgeoning new market dubbed the software-defined data center."

The recent ONS2015 Keynote from Google's Amin Vahdat describes how Google builds large scale software defined data centers. The presentation is well worth watching in its entirety since Google has a long history of advancing distributed computing with technologies that have later become mainstream.
A number of points in the presentation relate to the role of networking in the performance of cloud applications. Amin states, "Networking is at this inflection point and what computing means is going to be largely determined by our ability to build great networks over the coming years. In this world data center networking in particular is a key differentiator."

This slide shows the large pools of storage and compute connected by the data center network that are used to deliver data center services. Amin states that the dominant costs are compute and storage and that the network can be relatively inexpensive.
In Overall Data Center Costs James Hamilton breaks down the monthly costs of running a data center and puts the cost of network equipment at 8% of the overall cost.
However, Amin goes on to explain why networking has a disproportionate role in the overall value delivered by the data center.
The key to an efficient data center is balance. If a resource is scarce, then other resources are left idle and this increases costs and limits the overall value of the data center. Amin goes on to state, "Typically the resource that is most scarce is the network."
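The balance argument can be illustrated with a small calculation. The following is a minimal sketch, not taken from the presentation: the resource names, capacities and per-job demands are hypothetical numbers chosen only to show how one scarce resource leaves the others idle.

```python
def achievable_jobs(capacity, demand_per_job):
    """Number of concurrent jobs the data center can sustain.

    The limit is set by the scarcest resource, i.e. the one with the
    smallest ratio of capacity to per-job demand.
    """
    return min(capacity[r] / demand_per_job[r] for r in demand_per_job)


def utilization(capacity, demand_per_job):
    """Fraction of each resource in use when running at the achievable limit."""
    jobs = achievable_jobs(capacity, demand_per_job)
    return {r: jobs * demand_per_job[r] / capacity[r] for r in demand_per_job}


# Hypothetical capacities and per-job demands (arbitrary units).
capacity = {"compute": 1000.0, "storage": 800.0, "network": 100.0}
demand_per_job = {"compute": 2.0, "storage": 1.0, "network": 0.5}

jobs = achievable_jobs(capacity, demand_per_job)
util = utilization(capacity, demand_per_job)
bottleneck = max(util, key=util.get)

print(f"achievable jobs: {jobs:.0f}, bottleneck: {bottleneck}")
for resource, fraction in util.items():
    print(f"{resource}: {fraction:.0%} utilized")
```

With these assumed numbers the network is fully used while most of the compute and storage capacity sits idle, which is the imbalance described above.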
The need to build large scale high-performance networks has driven Google to build networks with the following properties:
  • Leaf and Spine (Clos) topology
  • Merchant silicon based switches (white box / brite box / bare metal)
  • Centralized control (SDN)
The components and topology of the network are shown in the following slide.
Here again Google is leading the overall network market transition to inexpensive leaf and spine networks built using commodity hardware.

Google is not alone in leading this trend. Facebook has generated significant support for the Open Compute Project (OCP), which publishes open source designs for data center equipment, including merchant silicon based leaf and spine switches. A key OCP project is the Open Network Install Environment (ONIE), which allows third party software to be installed on the network equipment. ONIE separates hardware from software and has spawned a number of innovative networking software companies, including Cumulus Networks, Big Switch Networks, Pica8 and Pluribus Networks. Open network hardware and the related ecosystem of software are entering the mainstream as leading vendors such as Dell and HP deliver open networking hardware, software and support to enterprise customers.
The ONS2015 keynote from AT&T's John Donovan describes the economic drivers for AT&T's transition to open networking and compute architectures.
John discusses the rapid move from legacy TDM (Time Division Multiplexing) technologies to commodity Ethernet, explaining that "video now makes up the majority of traffic on our network." This is a fundamental shift for AT&T and John states that "We plan to virtualize and control more than 75% of our network using cloud infrastructure and a software defined architecture."

John mentions the CORD (Central Office Re-architected as a Datacenter) project, which proposes an architecture very similar to Google's, consisting of a leaf and spine network built using open merchant silicon based hardware connecting commodity servers and storage. A prototype of the CORD leaf and spine network was shown as part of the ONS2015 Solutions Showcase.
ONS2015 Solutions Showcase: Open-source spine-leaf Fabric
Leaf and spine traffic engineering using segment routing and SDN describes a live demonstration presented in the ONS2015 Solutions Showcase. The demonstration shows how centralized analytics and control can be used to optimize the performance of commodity leaf and spine networks handling the large "Elephant" flows that typically comprise most traffic on the network (for example, video streams; see SDN and large flows for a general discussion).
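To make the idea of large flows concrete, here is a minimal sketch of how a controller might pick them out of per-flow rate measurements. It is not the method used in the demonstration: the function names, the flow labels, the rates and the 10% of link speed threshold are all assumptions made for illustration.

```python
def find_elephants(flow_rates_bps, link_speed_bps, threshold_fraction=0.1):
    """Return the flows whose rate is at least a given fraction of link speed.

    The 10% default threshold is an assumption for this sketch, not a value
    taken from the demonstration.
    """
    threshold = link_speed_bps * threshold_fraction
    return {flow: rate for flow, rate in flow_rates_bps.items() if rate >= threshold}


def elephant_share(flow_rates_bps, elephants):
    """Fraction of total measured traffic carried by the elephant flows."""
    total = sum(flow_rates_bps.values())
    return sum(elephants.values()) / total if total else 0.0


# Hypothetical per-flow rates (bits per second) on a 10 Gbit/s link.
link_speed_bps = 10e9
flow_rates_bps = {
    "video-a": 4e9,
    "video-b": 2e9,
    "web-1": 5e6,
    "web-2": 2e6,
    "dns": 1e5,
}

elephants = find_elephants(flow_rates_bps, link_speed_bps)
share = elephant_share(flow_rates_bps, elephants)

print(f"elephant flows: {sorted(elephants)}")
print(f"share of traffic: {share:.1%}")
```

In this made-up example two of the five flows carry nearly all the bytes, so steering only those few flows would affect almost all of the traffic on the link.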

Getting back to the Fortune article, it is clear that the move to open commodity network, server and storage hardware shifts value from hardware to the software solutions that optimize performance. The network in particular is a critical resource that constrains overall performance, and network optimization solutions can provide disproportionate benefits by eliminating bottlenecks that constrain compute and storage and limit the value delivered by the data center.
