Sunday, November 17, 2024

SC24 Dropped packet visibility demonstration

The real-time dashboard is a joint InMon / Arista demonstration at The International Conference for High Performance Computing, Networking, Storage, and Analysis (SC24) conference being held this week in Atlanta.

The conference network used in the demonstration, SCinet, is described as the most powerful and advanced network on Earth, connecting the SC community to the world.

The sFlow Packet Drop Monitoring In High Performance Networks dashboard combines telemetry from all the Arista switches in the SCinet network to provide real-time network-wide view of performance. Each of the three charts demonstrate a different type of measurement in the sFlow telemetry stream:

  • Counters: Total Traffic shows total traffic calculated from interface counters streamed from all interfaces. Counters provide a useful way of accurately reporting byte, frame, error and discard counters for each network interface. In this case, the chart rolls up data from all interfaces to trend total traffic on the network.
  • Samples: Top Flows shows the top 5 largest traffic flows traversing the network. The chart is based on sFlow's random packet sampling mechanism, providing a scaleable method of determining the hosts and services responsible for the traffic reported by the counters. Visibility into top flows is essential if one wants to take action to manage network usage and capacity: immediately identifying DDoS attacks, elephant flows, and tracking changing service demands.
    Note: Network addresses have been masked for privacy.
  • Notifications: Dropped Packets shows each dropped packet, the device that dropped it, and the reason it was dropped. Dropped packets have a profound impact on network performance and availability. Packet discards due to congestion can significantly impact application performance. Dropped packets due to black hole routes, expired TTLs, MTU mismatches, etc can result in insidious connection failures that are time consuming and difficult to diagnose.
    Note: Network addresses have been masked for privacy.
The sFlow data model integrates the three telemetry streams: counters, packet samples, and drop notifications. Each type of data is useful on its own, but together they provide the system wide observability needed to drive automation.
Dropped packet metrics with Prometheus and Grafana describes how to incorporate real-time dropped packet metrics into operational dashboards for rapid troubleshooting of network performance problems.

If you have Arista switches in your network, try enabling sFlow to gain insight into network traffic. Dropped packet notifications with Arista Networks, describes how to configure sFlow to include dropped packet notifications. Real-time network visibility is particularly relevant to AI / ML data center networks where congestion and dropped packets can result in serious performance degredation.

No comments:

Post a Comment