Wednesday, November 16, 2022

RDMA network visibility

The Remote Direct Memory Access (RDMA) data shown in the chart was gathered from The International Conference for High Performance Computing, Networking, Storage, and Analysis (SC22) being held this week in Dallas. The conference network, SCinet, is described as the fastest and most powerful network on Earth, connecting the SC community to the world.
Resilient Distributed Processing and Reconfigurable Networks is one of the demonstrations using SCinet (Location: Booth 2847, StarLight). The planned SC22 focus is on RDMA-enabled data movement and dynamic network control:
  1. RDMA Tbps performance over global distance for timely Terabyte bulk data transfers (goal: Terabyte transfers in well under 1 minute on an N x 400G network).
  2. Dynamic shifting of processing and network resources from one location/path/system to another (in response to demand and availability).
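To put the first goal in perspective, the transfer time for a Terabyte can be estimated with simple arithmetic. The following is a rough sketch, not a measurement from the demonstration: it assumes a decimal Terabyte (10^12 bytes) and ideal line rate, and the function name and the `efficiency` parameter are hypothetical, introduced only for illustration.

```python
def transfer_seconds(size_bytes: float, links: int,
                     link_gbps: float = 400.0,
                     efficiency: float = 1.0) -> float:
    """Estimate the time to move size_bytes over `links` parallel links.

    Assumptions (not from the article): traffic is spread evenly over
    the links, and `efficiency` is the fraction of line rate actually
    achieved (1.0 means ideal, no protocol overhead).
    """
    bits = size_bytes * 8
    rate_bps = links * link_gbps * 1e9 * efficiency
    return bits / rate_bps


TERABYTE = 1e12  # decimal Terabyte, in bytes (assumption)

if __name__ == "__main__":
    for n in (1, 2, 3):
        seconds = transfer_seconds(TERABYTE, links=n)
        print(f"{n} x 400G: {seconds:.1f} s per Terabyte")
```

Under these idealized assumptions a single 400G link moves a Terabyte in 20 seconds, so the "well under 1 minute" goal depends mainly on sustaining a high fraction of line rate over long-distance paths.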
The real-time chart at the top of this page shows an up-to-the-second view of RDMA traffic (broken out by source, destination, and RDMA operation).
The chart was generated using industry-standard streaming sFlow telemetry from switches and routers in the SCinet network. An instance of the sFlow-RT analytics engine computes the RDMA flow metrics shown in the chart. RESTflow describes how sFlow disaggregates the traditional NetFlow / IPFIX analytics pipeline to offer flexible, scalable, low-latency flow measurements. Flow metrics with Prometheus and Grafana describes how metrics can be stored in a time series database for use in operational dashboards.
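As an illustration of how a flow metric like the one charted might be defined, the sketch below builds (but does not send) a request to the sFlow-RT REST API, which accepts flow definitions as JSON with `keys` and `value` fields. The article does not give the actual definition used at SC22, so everything specific here is an assumption: the flow name `rdma`, the `localhost:8008` address, the helper function name, and in particular the key that identifies the RDMA operation, which is left as a placeholder to be looked up in the sFlow-RT documentation.

```python
import json
import urllib.request


def flow_definition_request(base_url: str, name: str,
                            keys: list[str],
                            value: str = "bytes") -> urllib.request.Request:
    """Build an HTTP PUT request that defines a flow in sFlow-RT.

    Hypothetical helper: it only constructs the request so it can be
    inspected; pass the result to urllib.request.urlopen() to apply it
    to a running sFlow-RT instance.
    """
    body = json.dumps({"keys": ",".join(keys), "value": value}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/flow/{name}/json",
        data=body,
        method="PUT",
        headers={"Content-Type": "application/json"},
    )


# Placeholder only: substitute the real sFlow-RT key for the RDMA operation.
RDMA_OPERATION_KEY = "<rdma-operation-key>"

request = flow_definition_request(
    "http://localhost:8008",  # assumed address of the sFlow-RT instance
    "rdma",                   # hypothetical flow name
    ["ipsource", "ipdestination", RDMA_OPERATION_KEY],
)

if __name__ == "__main__":
    print(request.get_method(), request.full_url)
    print(request.data.decode("utf-8"))
```

Once a flow like this is defined, the largest flows can be retrieved from the same REST API for charting, or exported to a time series database as described in the articles mentioned above.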

Real-time traffic analytics transforms network monitoring from reporting on the past to observing and acting on the present, making it possible to automate troubleshooting and traffic engineering. For examples, see Leaf and spine traffic engineering using segment routing and SDN, and DDoS protection quickstart guide.
