Wednesday, February 5, 2014

Flow-aware Real-time SDN Analytics (FRSA)

Today at the OpenDaylight Summit in Santa Clara, Ram (Ramki) Krishnan of Brocade Communications presented a framework and set of use cases for applying software defined networking (SDN) techniques to control large (elephant) flows. Ramki is a co-author of two related Internet Drafts: Large Flow Use Cases for I2RS PBR and QoS and Mechanisms for Optimal LAG/ECMP Component Link Utilization in Networks. The slides from the talk are available on the OpenDaylight Summit web site.

This article will review the slides and discuss selected topics in detail.
The FRSA framework identifies four classes of traffic flow based on flow rate and flow duration and identifies long lived large flows as amenable to SDN based control since they can be readily observed, consume significant resources, and last long enough to be effectively controlled. The article, SDN and large flows, discusses the opportunity presented by large flow control in greater detail.
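The four-way classification can be sketched in a few lines of Python. This is only an illustration: the thresholds LARGE_RATE_BPS and LONG_LIVED_SECONDS, the Flow record and the function names are assumptions made for the example; the slides do not give specific values.

```python
from dataclasses import dataclass

# Hypothetical thresholds chosen for illustration; the talk does not specify values.
LARGE_RATE_BPS = 100_000_000   # treat flows at or above 100 Mbit/s as "large"
LONG_LIVED_SECONDS = 10.0      # treat flows lasting 10 s or more as "long-lived"


@dataclass
class Flow:
    key: str           # any identifier for the flow, e.g. a 5-tuple string
    rate_bps: float    # measured flow rate in bits per second
    duration_s: float  # how long the flow has been observed


def classify(flow: Flow) -> str:
    """Place a flow in one of four classes based on rate and duration."""
    size = "large" if flow.rate_bps >= LARGE_RATE_BPS else "small"
    life = "long-lived" if flow.duration_s >= LONG_LIVED_SECONDS else "short-lived"
    return f"{life} {size}"


def is_control_candidate(flow: Flow) -> bool:
    """Only long-lived large flows are worth handing to the SDN controller."""
    return classify(flow) == "long-lived large"
```

Everything that is not a long-lived large flow is left to the normal forwarding plane.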
The two elements required in the FRSA framework are real-time traffic analytics, to rapidly identify the large flows (within seconds), and a control mechanism such as integrated hybrid OpenFlow, which allows the normal switch forwarding protocols to handle traffic but offers a way for the controller to intervene and determine the treatment of large flows.
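The hybrid forwarding model can be pictured as a small table of controller overrides sitting in front of the switch's normal forwarding. The following is a simplified model, not switch code: the rule tuple layout, the field names and the forward function are assumptions made for this example. NORMAL is the OpenFlow reserved port that hands a packet to the switch's conventional forwarding pipeline.

```python
NORMAL = "NORMAL"  # OpenFlow reserved port: use the switch's normal forwarding pipeline


def forward(packet: dict, overrides: list) -> str:
    """Return the action for a packet.

    overrides is a list of (priority, match, action) tuples installed by the
    controller, where match is a dict of header fields that must all be equal
    in the packet. The highest-priority matching rule wins; if nothing
    matches, the packet falls through to normal forwarding.
    """
    for _priority, match, action in sorted(overrides, key=lambda rule: -rule[0]):
        if all(packet.get(field) == value for field, value in match.items()):
            return action
    return NORMAL
```

With an empty override table every packet gets NORMAL treatment, so the controller only needs to install rules for the handful of large flows it wants to steer or drop.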
The first use case described is distributed denial of service (DDoS) mitigation. The slide describes current approaches where a DDoS Appliance is added to the network to detect and filter attack traffic. However, large flood attacks aimed at overwhelming the Internet connection (the link between the Router and the Internet cloud in the diagram) cannot be mitigated using on site resources - they must be handled upstream.
DDoS mitigation is a large and growing problem and the market for DDoS mitigation appliances is significant and growing, see DDoS prevention market to grow by double digits through 2014 and Denial of Service Attacks Surge and Expose Enterprise Infrastructure Vulnerabilities and New Needs, IDC Says. There is an opportunity for service providers to capture a share of this market if they can use SDN to monitor and control their existing network infrastructure and deliver DDoS mitigation as a service to protect their customers' Internet connections from flood attacks. Once the large flood attacks have been removed upstream, existing ADC / load balancers / firewalls can be used to mitigate lower volume application layer attacks.
The following slide details the elements of the SDN DDoS mitigation solution:
This diagram shows how standard sFlow enabled in the switches and routers provides a constant stream of measurement data to an External Collector (sFlow-RT), which notifies the DDoS SDN application when large DDoS flows are detected. The DDoS SDN application selects a mitigation action and instructs the SDN Controller (OpenDaylight) to push the action to selected switches (for example using an OpenFlow rule to drop traffic associated with the DDoS attack). An example of this technique is described in detail in Physical switch hybrid OpenFlow example, which demonstrates that the entire detection and mitigation cycle can be completed within 1 to 2 seconds.
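The detection step can be illustrated with a short Python sketch that scales sampled packet sizes up to an estimated rate per destination and emits a drop rule when a threshold is crossed. It is a simplified model and does not use the sFlow-RT or OpenDaylight APIs: SAMPLING_RATE, THRESHOLD_BPS, the function names and the rule dictionary layout are all assumptions made for the example, and a real mitigation would normally match on more specific attributes than the destination address alone.

```python
from collections import defaultdict

SAMPLING_RATE = 1000           # assumed 1-in-1000 packet sampling on the switches
THRESHOLD_BPS = 1_000_000_000  # assumed trigger: 1 Gbit/s toward a single destination


def estimate_rates(samples, interval_s, sampling_rate=SAMPLING_RATE):
    """Estimate bits per second per destination from sampled packets.

    samples is an iterable of (dst_ip, frame_bytes) pairs seen during the
    interval. Each sampled packet stands in for sampling_rate packets.
    """
    sampled_bytes = defaultdict(int)
    for dst_ip, frame_bytes in samples:
        sampled_bytes[dst_ip] += frame_bytes
    return {dst: total * 8 * sampling_rate / interval_s
            for dst, total in sampled_bytes.items()}


def mitigation_rules(rates, threshold_bps=THRESHOLD_BPS):
    """Build one (hypothetical) drop rule per destination over the threshold."""
    return [{"match": {"ipv4_dst": dst}, "action": "drop"}
            for dst, rate in sorted(rates.items())
            if rate >= threshold_bps]
```

In a deployment the resulting rules would be handed to the SDN controller to push to the selected switches; here they are just returned as plain dictionaries.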
The second use case is to load balance large flows in link aggregation (LAG) groups. The hash function used to spread traffic on a LAG group works for small flows, but large flows can end up on a single LAG member, limiting throughput even though there is spare capacity on other members of the group, see Load balancing LAG/ECMP groups.
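The collision problem is easy to reproduce in a few lines of Python. In this sketch CRC32 is only a stand-in for whatever hash function a particular switch uses, and the function names and the flow tuple layout are assumptions made for the example.

```python
import zlib


def lag_member(flow_key, n_members):
    """Pick a LAG member from a hash of the flow key (e.g. a 5-tuple).

    CRC32 is a stand-in for the switch's hash function. The choice depends
    only on header fields, never on how much traffic the flow carries.
    """
    data = "|".join(str(field) for field in flow_key).encode()
    return zlib.crc32(data) % n_members


def member_loads(flows, n_members):
    """Sum the rate (bits per second) that lands on each LAG member.

    flows maps a flow key to its rate.
    """
    loads = [0.0] * n_members
    for key, rate in flows.items():
        loads[lag_member(key, n_members)] += rate
    return loads
```

Because the hash ignores flow size, nothing prevents two multi-gigabit flows from being assigned to the same member while other members sit nearly idle. With thousands of small flows the loads even out statistically, which is why the hash works well for that traffic.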
The Large Flow LAG load balancing SDN application again makes use of real-time sFlow based analytics to rapidly detect large flows and the SDN Controller to selectively override forwarding decisions in Router 1 in order to load balance the flows across the link group connecting it to Router 2.
The third use case is similar to LAG load balancing. Equal cost multi-path (ECMP) routing is used to spread traffic across a leaf and spine network topology. Again, hash based load balancing can result in large flow collisions and sub-optimal throughput.
The Large Flow Global load balancing SDN application makes use of centralized real-time analytics to identify flow collisions anywhere in the fabric and then instructs the SDN Controller to override forwarding in selected switches in order to shift flows to links with spare capacity, see ECMP load balancing.
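One simple way to decide where to move the large flows, for either a LAG or a set of ECMP paths, is a greedy placement: take the large flows from biggest to smallest and put each on the link that currently carries the least traffic. This heuristic is an assumption made for illustration and is not taken from the talk or the drafts; the function name and data layout are likewise hypothetical.

```python
def rebalance(large_flows, background_load):
    """Greedily assign large flows to the least loaded links.

    large_flows maps a flow id to its rate in bits per second.
    background_load lists the per-link load from small flows, which stay
    under the control of normal hash-based forwarding.
    Returns (assignment, loads): the link index chosen for each large flow
    and the resulting total load per link.
    """
    loads = list(background_load)
    assignment = {}
    # Largest flows first; ties are broken by flow id for repeatable results.
    for flow_id, rate in sorted(large_flows.items(), key=lambda kv: (-kv[1], kv[0])):
        link = min(range(len(loads)), key=lambda i: loads[i])
        assignment[flow_id] = link
        loads[link] += rate
    return assignment, loads
```

The assignment would then be turned into forwarding overrides for just those flows, leaving everything else on the default paths.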

The next three slides from the talk describe deployment opportunities for SDN based large flow load balancing.
The combination of sFlow analytics with integrated Hybrid OpenFlow described in the FRSA framework is a pragmatic approach to addressing the challenges of DDoS mitigation and load balancing in large scale, high speed network environments. The hybrid approach leverages the capabilities of existing distributed control planes to efficiently load balance small flows and combines it with an SDN controller to manage the relatively small number of large long lived flows that dominate network usage.

The key to making this approach work is pervasive support for the sFlow standard among switch vendors and recent breakthroughs in real-time sFlow analytics (sFlow-RT) that together deliver the scalable, data center wide monitoring and real-time detection of large flows needed to drive SDN applications.

It's exciting to see SDN solutions maturing and major networking vendors describing practical SDN solutions that address pressing challenges and that can realistically be deployed in production networks in the near term. It looks like this is the year that SDN will move from proof of concept to deployment in commercially viable solutions.

Update Feb 28, 2014 Video of the talk is now available on YouTube - Flow-Aware Real-Time SDN Analytics | OpenDaylight Summit 2014
