Monday, September 26, 2016

Asynchronous Docker metrics

Docker allows large numbers of lightweight containers to be started and stopped within seconds, creating an agile infrastructure that can rapidly adapt to changing requirements. However, the rapidly changing population of containers poses a challenge to traditional monitoring methods, which struggle to keep pace with the changes. For example, periodic polling takes time to detect new containers and can miss short-lived containers entirely.

This article describes how the latest version of the Host sFlow agent is able to track the performance of a rapidly changing population of Docker containers and export a real-time stream of standard sFlow metrics.
The diagram above shows the life cycle status events associated with a container. The Docker Remote API provides a set of methods that allow the Host sFlow agent to communicate with Docker to list containers and receive asynchronous container status events. The Host sFlow agent uses the events to keep track of running containers and periodically exports CPU, memory, network and disk performance counters for each container.
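As a rough illustration of tracking running containers from status events, the sketch below applies one JSON event line at a time to a set of container IDs. It is not taken from the Host sFlow source. It assumes events shaped like those streamed by the Remote API `/events` endpoint (`Type`/`Action`/`Actor.ID` in newer API versions, `status`/`id` in older ones), and the choice of `start` and `die` as the actions that add and remove a container is an assumption of this sketch. The function name `apply_event` is hypothetical.

```python
import json

# Actions this sketch treats as "container is running" / "container has stopped".
# The choice of actions is an assumption, not taken from the Host sFlow source.
START_ACTIONS = {"start"}
STOP_ACTIONS = {"die"}


def apply_event(running, line):
    """Update the set of running container IDs from one JSON event line."""
    event = json.loads(line)
    # Newer API versions tag events with Type/Action/Actor; older ones
    # only carry status/id, so fall back to those fields.
    if event.get("Type", "container") != "container":
        return running  # ignore network, volume, image events, etc.
    action = event.get("Action") or event.get("status")
    container_id = event.get("Actor", {}).get("ID") or event.get("id")
    if container_id is None:
        return running
    if action in START_ACTIONS:
        running.add(container_id)
    elif action in STOP_ACTIONS:
        running.discard(container_id)
    return running
```

In a real agent the lines would come from a long-lived HTTP response on the Docker socket; here they can be fed from any iterable of strings, which keeps the sketch easy to exercise without a Docker daemon.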

The diagram at the beginning of this article shows the sequence of messages, going from top to bottom, required to track a container. The Host sFlow agent first registers for container lifecycle events before asking for all the currently running containers. Later, when a new container is started, Docker immediately sends an event to the Host sFlow agent, which requests additional information (such as the container process identifier, or PID) that it can use to retrieve performance counters from the operating system. Initial counter values are retrieved and exported along with container identity information as an sFlow counters message, and a polling task for the new container is initiated. Container counters are periodically retrieved and exported while the container continues to run (two polling intervals are shown in the diagram). When the Host sFlow agent receives an event from Docker indicating that the container is being stopped, it retrieves the final values of the performance counters, exports a final sFlow message, and removes the polling task for the container.
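The sequence described above can be sketched as a small state machine. This is a minimal illustration under stated assumptions, not the Host sFlow implementation: the class `ContainerTracker`, its method names, and the three injected callables (`inspect_pid`, `read_counters`, `export`) are all hypothetical, and the 10-second default interval is arbitrary. Time is passed in explicitly so the behavior is deterministic.

```python
class ContainerTracker:
    """Event-triggered periodic counter export (illustrative sketch only)."""

    def __init__(self, inspect_pid, read_counters, export, interval=10.0):
        self.inspect_pid = inspect_pid      # container id -> PID (hypothetical)
        self.read_counters = read_counters  # PID -> counters dict (hypothetical)
        self.export = export                # (container id, counters) -> None
        self.interval = interval            # polling interval, arbitrary default
        self.polls = {}                     # container id -> (pid, next due time)

    def on_start(self, cid, now):
        if cid in self.polls:
            return  # already tracked, e.g. seen in both the listing and an event
        pid = self.inspect_pid(cid)
        self.export(cid, self.read_counters(pid))  # initial counter values
        self.polls[cid] = (pid, now + self.interval)

    def tick(self, now):
        # Export counters for every container whose polling task is due.
        for cid, (pid, due) in list(self.polls.items()):
            if now >= due:
                self.export(cid, self.read_counters(pid))
                self.polls[cid] = (pid, due + self.interval)

    def on_stop(self, cid):
        entry = self.polls.pop(cid, None)  # remove the polling task
        if entry is not None:
            self.export(cid, self.read_counters(entry[0]))  # final values
```

A plausible reading of the "register first, then list" ordering is that it avoids missing a container that starts between the two calls. The cost is that a container may appear both in the listing and in a start event, which is why `on_start` ignores IDs it is already tracking. For a container that lives through two polling intervals, this sketch produces four exports: initial, two periodic, and final.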

This method of asynchronously triggered periodic counter export allows an sFlow collector to accurately track rapidly changing container populations in large-scale deployments. The diagram only shows the sequence of events relating to monitoring a single container. The article "Docker network visibility demonstration" shows the full range of network traffic and system performance information being exported.

Detailed real-time visibility is essential for fully realizing the benefits of agile container infrastructure, providing the feedback needed to track and automatically optimize the performance of large-scale microservice deployments.
