Thursday, February 26, 2015

Broadcom ASIC table utilization metrics, DevOps, and SDN

Figure 1: Two-Level Folded CLOS Network Topology Example
Figure 1 from the Broadcom white paper, Engineered Elephant Flows for Boosting Application Performance in Large-Scale CLOS Networks, shows a data center leaf and spine topology. Leaf and spine networks are seeing rapid adoption because they provide the scalability needed to cost-effectively deliver the low latency, high bandwidth interconnect required by cloud, big data, and high performance computing workloads.
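As a back-of-envelope illustration of why the topology scales, the cross-section bandwidth of a two-tier leaf and spine fabric follows directly from the port counts (the switch counts and link speed below are hypothetical, chosen only for illustration):

```python
# Back-of-envelope capacity of a hypothetical two-tier leaf/spine fabric.
# Assumes each leaf dedicates one uplink to every spine (full mesh).
spines = 4           # number of spine switches
leaves = 8           # number of leaf switches
uplink_gbps = 40     # speed of each leaf-to-spine link

# Each leaf has one uplink per spine, so the fabric cross-section
# bandwidth is leaves * spines * link speed.
fabric_gbps = leaves * spines * uplink_gbps

# With ECMP spreading flows over all spines, each leaf can send
# spines * uplink_gbps of traffic into the fabric.
per_leaf_gbps = spines * uplink_gbps

print(fabric_gbps)    # 1280
print(per_leaf_gbps)  # 160
```

Adding capacity is a matter of adding spines (more uplinks per leaf) or leaves (more edge ports), which is what makes the design attractive for scale-out workloads.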

Broadcom Trident ASICs are popular in white box, brite-box and branded data center switches from a wide range of vendors, including: Accton, Agema, Alcatel-Lucent, Arista, Cisco, Dell, Edge-Core, Extreme, Hewlett-Packard, IBM, Juniper, Penguin Computing, and Quanta.
Figure 2: OF-DPA Programming Pipeline for ECMP
Figure 2 shows the packet processing pipeline of a Broadcom ASIC. The pipeline consists of a number of linked hardware tables providing bridging, routing, access control list (ACL), and ECMP forwarding group functions. Operations teams need to be able to proactively monitor table utilizations in order to avoid performance problems associated with table exhaustion.

Broadcom's recently released sFlow specification, sFlow Broadcom Switch ASIC Table Utilization Structures, leverages the industry standard sFlow protocol to offer scalable, multi-vendor, network-wide visibility into the utilization of these hardware tables.

Support for the new extension has just been added to the open source Host sFlow agent, which runs on Cumulus Linux, a Debian based Linux distribution that supports open switch hardware from Agema, Dell, Edge-Core, Penguin Computing, and Quanta. Hewlett-Packard recently announced that they will soon be selling a new line of open network switches built by Accton Technologies and supporting Cumulus Linux.
The speed with which this new feature can be delivered on hardware from the wide range of vendors supporting Cumulus Linux is a compelling illustration of the power of open networking. While support for the Broadcom ASIC table extension has been checked into the Host sFlow trunk, it hasn't yet made it into the Cumulus Networks binary repositories. However, Cumulus Linux is an open platform, so users are free to download the sources, then compile and install the latest version of the software directly from SourceForge.
The following output from the open source sflowtool command line utility shows the raw table measurements (this is in addition to the extensive set of sFlow measurements already exported via sFlow on Cumulus Linux):
bcm_asic_host_entries 4
bcm_host_entries_max 8192
bcm_ipv4_entries 0
bcm_ipv4_entries_max 0
bcm_ipv6_entries 0
bcm_ipv6_entries_max 0
bcm_ipv4_ipv6_entries 9
bcm_ipv4_ipv6_entries_max 16284
bcm_long_ipv6_entries 3
bcm_long_ipv6_entries_max 256
bcm_total_routes 10
bcm_total_routes_max 32768
bcm_ecmp_nexthops 0
bcm_ecmp_nexthops_max 2016
bcm_mac_entries 3
bcm_mac_entries_max 32768
bcm_ipv4_neighbors 4
bcm_ipv6_neighbors 0
bcm_ipv4_routes 0
bcm_ipv6_routes 0
bcm_acl_ingress_entries 842
bcm_acl_ingress_entries_max 4096
bcm_acl_ingress_counters 68
bcm_acl_ingress_counters_max 4096
bcm_acl_ingress_meters 18
bcm_acl_ingress_meters_max 8192
bcm_acl_ingress_slices 3
bcm_acl_ingress_slices_max 8
bcm_acl_egress_entries 36
bcm_acl_egress_entries_max 512
bcm_acl_egress_counters 36
bcm_acl_egress_counters_max 1024
bcm_acl_egress_meters 18
bcm_acl_egress_meters_max 512
bcm_acl_egress_slices 2
bcm_acl_egress_slices_max 2
The sflowtool output is useful for troubleshooting and is easy to parse with scripts.
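For instance, a short script can pair each table's current entry count with its corresponding limit to compute percentage utilization. This is a sketch, using a subset of the metric names from the sflowtool output above embedded as sample input; in practice the text would be piped in from sflowtool:

```python
# Compute table utilization percentages from sflowtool text output.
# Sample lines taken from the sflowtool output shown above.
sample = """bcm_mac_entries 3
bcm_mac_entries_max 32768
bcm_total_routes 10
bcm_total_routes_max 32768
bcm_acl_ingress_entries 842
bcm_acl_ingress_entries_max 4096"""

metrics = {}
for line in sample.splitlines():
    name, value = line.split()
    metrics[name] = int(value)

# Pair each "<table>" counter with its "<table>_max" limit.
for name, value in sorted(metrics.items()):
    limit = metrics.get(name + '_max')
    if limit:  # skip tables with no reported limit
        print("%s %.1f%%" % (name, 100.0 * value / limit))
```

Output like `bcm_acl_ingress_entries 20.6%` makes it easy to feed thresholds into an alerting pipeline.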


The diagram shows how the sFlow-RT analytics engine is used to deliver metrics and events to cloud based and on-site DevOps tools, see: Cloud analytics, InfluxDB and Grafana, Metric export to Graphite, and Exporting events using syslog.

For example, the following sFlow-RT application simplifies monitoring of the leaf and spine network by combining measurements from all the switches, identifying the switch with the maximum utilization of each table, pushing the summaries to an operations dashboard every 15 seconds, and sending syslog events immediately when any table exceeds 80% utilization:
var network_wide_metrics = [
  'max:bcm_host_utilization',        // illustrative metric names; use the
  'max:bcm_mac_utilization',         // table utilization metrics exported
  'max:bcm_acl_ingress_utilization'  // by your version of sFlow-RT
];

var max_utilization = 80;

setIntervalHandler(function() {
  var vals = metric('ALL',network_wide_metrics);
  var graphite_metrics = {};
  for each (var val in vals) {
    if(!val.hasOwnProperty('metricValue')) continue;

    // generate syslog events for over utilized tables
    if(val.metricValue >= max_utilization) {
      var event = {
        host:   val.agent,
        metric: val.metricName,
        value:  val.metricValue
      };
      try {
        syslog(
          '',         // syslog collector: splunk>, logstash, etc.
          514,        // syslog port
          16,         // facility = local0
          5,          // severity = notice
          event
        );
      } catch(e) { logWarning("syslog() failed " + e); }
    }

    // add metric to graphite set
    graphite_metrics["network.podA."+val.metricName] = val.metricValue;
  }

  // send metrics to graphite
  try {
    graphite(
      '',            // graphite server
      2003,          // graphite carbon UDP port
      graphite_metrics
    );
  } catch(e) { logWarning("graphite() failed " + e); }
}, 15);
The following screen capture shows the graphs starting to appear in Graphite:
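Under the hood this kind of export uses Graphite's plaintext carbon protocol: each datagram carries one or more `name value timestamp` lines. A minimal standalone sketch of the wire format (the server address is a placeholder, and the function returns the payload so it can be inspected):

```python
import socket, time

def send_to_graphite(metrics, server='127.0.0.1', port=2003):
    """Send a dict of metrics using carbon's plaintext protocol over UDP."""
    now = int(time.time())
    lines = ["%s %s %d" % (name, value, now)
             for name, value in metrics.items()]
    msg = "\n".join(lines) + "\n"
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.sendto(msg.encode(), (server, port))
    sock.close()
    return msg  # returned for inspection

payload = send_to_graphite({"network.podA.bcm_mac_entries": 3})
```

Because the protocol is plain text over UDP or TCP, any script or agent can publish metrics without a client library.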

Real-time traffic analytics

The table utilization metrics are only a part of the visibility that sFlow provides into the performance of a leaf and spine network.

A leaf and spine fabric is challenging to monitor. The fabric spreads traffic across all the switches and links in order to maximize bandwidth. Unlike traditional hierarchical network designs, where a small number of links can be monitored to provide visibility, a leaf and spine network has no special links or switches where running CLI commands or attaching a probe would provide visibility. Even if it were possible to attach probes, the effective bandwidth of a leaf and spine network can be as high as a Petabit/second, well beyond the capabilities of current generation monitoring tools.
Scalable traffic measurement is possible because Broadcom ASICs implement hardware support for sFlow monitoring, providing cost effective, line rate visibility that is built into the switches and scales to all port speeds (1G, 10G, 25G, 40G, 50G, 100G, ...) and the high port counts found in large leaf and spine networks.
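The reason packet sampling scales is that the collector only needs a small, random subset of packets: multiplying sampled byte counts by the sampling rate yields an unbiased traffic estimate. A sketch of the arithmetic, with hypothetical sample values:

```python
# Estimate traffic rate from sFlow packet samples.
# With 1-in-N random sampling, each sample represents ~N packets.
sampling_rate = 10000   # 1-in-10000, a plausible setting for 10G ports
poll_interval = 10.0    # seconds over which the samples were collected

# Hypothetical sampled frame lengths in bytes.
sampled_frame_lengths = [1500, 64, 1500, 9000, 1500]

est_bytes = sampling_rate * sum(sampled_frame_lengths)
est_bits_per_sec = est_bytes * 8 / poll_interval

print(est_bits_per_sec)  # 108512000.0 (roughly 108 Mbit/s)
```

Since only the samples cross the network, the measurement overhead stays tiny even as the fabric approaches Petabit/second aggregate bandwidth.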
The 2 minute video provides an overview of some of the performance challenges with leaf and spine fabrics and demonstrates Fabric View - a monitoring solution that leverages industry standard sFlow instrumentation in commodity data center switches to provide real-time visibility into fabric performance. Fabric visibility with Cumulus Linux describes how to set up Fabric View to monitor a Cumulus Linux leaf and spine network.


Real-time network analytics are a fundamental driver for a number of important SDN use cases, allowing the SDN controller to rapidly detect changes in traffic and respond by applying active controls. SDN fabric controller for commodity data center switches describes how control of the ACL table is the key feature needed to build scalable SDN solutions.

REST API for Cumulus Linux ACLs describes open source software to allow an SDN controller to centrally manage the ACL tables on a large scale network of switches running Cumulus Linux.
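The pattern is plain HTTP: the controller writes ACL rules to a service running on the switch. The sketch below only constructs the request rather than sending it, and the `/acl/<name>` path, port, and rule text are hypothetical placeholders (consult the REST API article for the actual interface):

```python
import urllib.request

def build_acl_request(switch, name, rules, port=8080):
    """Construct an HTTP PUT carrying ACL rules as iptables-style text.
    The http://<switch>:<port>/acl/<name> path is a hypothetical placeholder."""
    url = "http://%s:%d/acl/%s" % (switch, port, name)
    body = "\n".join(rules).encode()
    return urllib.request.Request(url, data=body, method='PUT')

req = build_acl_request('10.0.0.1', 'ddos1',
                        ['[iptables]',
                         '-A FORWARD --in-interface swp+ -s 10.0.0.2 -j DROP'])
```

A controller can push, list, and delete named ACLs this way across a large number of switches, with each switch compiling the rules into its local ACL table.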
The ability to install software on the switches is transformative, giving third party developers and network operators transparent access to the full capabilities of the switch and allowing them to build solutions that efficiently handle automation challenges.
A number of SDN use cases have been demonstrated that build on Cumulus Linux to leverage the real-time visibility and control capabilities of the switch ASIC.
Visit the web site to learn more about SDN control of leaf and spine networks.

Finally, the SDN use cases make extensive use of the ACL table and so this brings us full circle to the importance of the Broadcom sFlow extension providing visibility into the utilization of table resources.
