Monday, April 11, 2016

Minimizing cost of visibility

Visibility allows orchestration systems (OpenDaylight, ONOS, OpenStack Heat, Kubernetes, Docker Swarm, Apache Mesos, etc.) to adapt to changing demand by targeting resources where they are needed, increasing efficiency, improving performance, and reducing costs. However, the overhead of monitoring must be low in order to realize these benefits.
An analogous observation that readers may be familiar with is the importance of minimizing costs when investing in order to maximize returns - see Vanguard Principle 3: Minimize cost.
Suppose a 100-server pool is being monitored and that visibility allows the orchestration system to realize a 10% improvement through better workload scheduling and placement - increasing the pool's capacity by 10% without adding 10 more servers and incurring the associated CAPEX/OPEX costs.

The chart shows the impact that measurement overhead has in realizing the potential gains in this example. If the measurement overhead is 0%, then the 10% performance gain is fully realized. However, even a relatively modest 2% measurement overhead reduces the potential improvement to just under 8% (over a 20% drop in the potential gains). A 9% measurement overhead wipes out the potential efficiency gain and measurement overheads greater than 9% result in a net loss of capacity.
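The numbers in the chart can be reproduced with a simple model (an assumption made here for illustration): capacity consumed by monitoring is unavailable to deliver the scheduling improvement, so the net gain is the 10% improvement discounted by the measurement overhead.

```python
# Simple model of net capacity gain: a pool that would gain 10% capacity
# from better scheduling only realizes that gain on the fraction of
# capacity not consumed by monitoring.

def net_gain(gain, overhead):
    """Fraction of extra capacity realized after monitoring overhead."""
    return (1 + gain) * (1 - overhead) - 1

for overhead in (0.00, 0.02, 0.09, 0.10):
    print(f"{overhead:4.0%} overhead -> {net_gain(0.10, overhead):+.1%} net gain")
```

Under this model a 2% overhead yields a 7.8% net gain (just under 8%), a 9% overhead yields roughly 0.1% (the gain is effectively wiped out), and anything above that is a net loss of capacity.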

More specifically, Optimizing software defined data center and Microservices discuss the critical role of network visibility in improving cloud computing performance. Consider the task of monitoring network activity in a high traffic Docker cluster running on the 100-server pool. High performance network monitoring solutions often require at least one dedicated CPU core (if Intel DPDK, or an equivalent technology, is used to accelerate network instrumentation). Suppose each server has 24 cores; dedicating one core to monitoring is a 4.2% measurement overhead that reduces the potential efficiency gain from 10% to 5% (a drop of nearly 50%). On the other hand, industry standard sFlow uses instrumentation built into hardware and software data paths. Docker network visibility demonstration shows how Linux kernel instrumentation can be used to monitor traffic using less than 1% of 1 CPU core, an insignificant 0.04% measurement overhead that allows the orchestration system to achieve the full 10% efficiency gain.
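The two overhead figures above follow directly from the per-core arithmetic, sketched here for the 24-core server in the example:

```python
# Measurement overhead expressed as a fraction of a 24-core server.
cores = 24

# DPDK-style monitoring: one full core dedicated per server.
dedicated_core_overhead = 1 / cores        # ~4.2% of server capacity

# sFlow kernel instrumentation: less than 1% of a single core.
sflow_overhead = 0.01 / cores              # ~0.04% of server capacity

print(f"dedicated core: {dedicated_core_overhead:.1%}")
print(f"sFlow:          {sflow_overhead:.2%}")
```

At 0.04%, the sFlow overhead is roughly a hundredth of the dedicated-core approach, small enough that the 10% efficiency gain is realized essentially in full.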

To conclude, visibility is essential to the operation of cloud infrastructure and can drive greater efficiency. However, the net gains in efficiency are significantly affected by any overhead imposed by monitoring. Industry standard sFlow measurement technology is widely supported, minimizes overhead, and ensures that efficiency gains are fully realizable.
