Wednesday, October 16, 2013

DNS amplification attacks

Figure 1: DNS Amplification Variation Used in Recent DDoS Attacks (Update)
DNS Amplification Variation Used in Recent DDoS Attacks (Update) describes how public DNS servers can be used to amplify the effect of Distributed Denial of Service (DDoS) attacks - resulting in some of the largest and most disruptive attacks reported to date.
Figure 2: The DDoS That Knocked Spamhaus Offline (And How We Mitigated It)
The DDoS That Knocked Spamhaus Offline (And How We Mitigated It) describes a large 75Gbps attack using DNS amplification. An even larger 300Gbps attack caused wide-scale disruption; see What you need to know about the world’s biggest DDoS attack.

DDoS describes how the sFlow monitoring standard can be used to rapidly detect and mitigate DDoS attacks on the target (victim) network. This article will examine how data centers that may be inadvertently hosting open DNS servers can use sFlow to identify servers participating in amplification attacks.

A hosting service provider has very little control over the services running on the physical and virtual servers running in the data center, and while one might hope that the customers carefully configure and monitor their DNS servers, the reality is that there are many openly accessible DNS servers. Using the network switches to monitor DNS operations is an attractive option, offering an agentless method of detecting and monitoring DNS servers wherever they are in the data center.

The sFlow standard is well suited to this task:
  1. sFlow is widely supported in physical and virtual switches.
  2. sFlow is embedded within switch hardware and can be enabled in high traffic production networks without impacting performance.
  3. sFlow is scalable: a single software analyzer can monitor hundreds of switches and tens of thousands of switch ports to deliver network-wide visibility.
  4. sFlow exports packet headers, allowing sFlow analysis software to perform deep packet inspection and report on DNS operations.
In the following example, a single instance of sFlow-RT is monitoring 7500 switch ports in a data center network.

The following DNS attributes are extracted from the sFlow packet samples and can be included in flow definitions, or as filters:

Description                               Name          Example
request=false, response=true              dnsqr         false
op code                                   dnsopcode     0
authoritative answer                      dnsaa         false
truncated                                 dnstc         false
recursion desired                         dnsrd         false
recursion available                       dnsra         true
reserved                                  dnsz          0
response code                             dnsrcode      0
number of entries in question             dnsqdcount    1
number of entries in answer               dnsancount    0
number of entries in name server section  dnsnscount    0
number of entries in resources section    dnsarcount    0
domain name in query                      dnsqname      yahoo.com.
query type code                           dnsqtype      15
query type name                           dnsqtypename  MX(15)
query class                               dnsqclass     1
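
To make the header attributes concrete, here is a minimal sketch of how the fixed 12-byte DNS header (RFC 1035, section 4.1.1) maps onto the attribute names listed above. The parseDnsHeader function is a hypothetical helper written for illustration; it is not part of sFlow-RT and says nothing about how sFlow-RT decodes packets internally.

```javascript
// Decode the fixed 12-byte DNS header into fields named after the
// attributes in the table. parseDnsHeader is a hypothetical helper,
// not an sFlow-RT function.
function parseDnsHeader(bytes) {
  if (bytes.length < 12) throw new Error('DNS header needs 12 bytes');
  // read a big-endian 16-bit value starting at offset i
  var u16 = function (i) { return (bytes[i] << 8) | bytes[i + 1]; };
  var flags = u16(2); // bytes 0-1 hold the transaction ID, 2-3 the flags
  return {
    dnsqr: (flags & 0x8000) !== 0,     // false = request, true = response
    dnsopcode: (flags >> 11) & 0xf,
    dnsaa: (flags & 0x0400) !== 0,
    dnstc: (flags & 0x0200) !== 0,
    dnsrd: (flags & 0x0100) !== 0,
    dnsra: (flags & 0x0080) !== 0,
    dnsz: (flags >> 4) & 0x7,
    dnsrcode: flags & 0xf,
    dnsqdcount: u16(4),
    dnsancount: u16(6),
    dnsnscount: u16(8),
    dnsarcount: u16(10)
  };
}

// Hand-built header: a query with recursion desired and one question
var queryHeader = parseDnsHeader([0x12, 0x34, 0x01, 0x00, 0x00, 0x01,
                                  0x00, 0x00, 0x00, 0x00, 0x00, 0x00]);

// Hand-built header: a response with recursion desired and available
var responseHeader = parseDnsHeader([0x12, 0x34, 0x81, 0x80, 0x00, 0x01,
                                     0x00, 0x02, 0x00, 0x00, 0x00, 0x00]);
```

The query name, type and class follow the header in the question section and are not decoded in this sketch.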

The following flow definitions were created using sFlow-RT's embedded scripting API:
// track DNS query types
setFlow('dnsqueries',
  {keys:'dnsqtypename',value:'frames',filter:'dnsqr=false',t:20});

// track DNS query domains for ANY(255) queries
setFlow('dnsqany',
  {keys:'dnsqname', value:'frames', filter:'dnsqr=false&dnsqtype=255',t:20});

// track total DNS request rate for ANY(255) queries
setFlow('dnsqanytot',
  {value:'frames', filter:'dnsqr=false&dnsqtype=255',t:20});
Alternatively, the flow definitions can be specified by making calls to the REST API using cURL:
curl -H "Content-Type:application/json" -X PUT --data "{keys:'dnsqtypename', value:'frames', filter:'dnsqr=false',t:20}" http://localhost:8008/flow/dnsqueries/json
Using the script API has a number of advantages: it ensures that flow definitions are automatically reinstated on a system restart, makes it easy to generate trend charts (for example, the graphite() function sends metrics to Graphite for integration in performance dashboards), and makes it possible to automate the response when DNS anomalies are detected (for example, using the syslog() function to send an alert, or http() to access a REST API on a device or SDN controller to block the traffic).
The table above (http://localhost:8008/activeflows/ALL/dnsqueries/html?aggMode=sum) shows a continuously updating, real-time, view of the top DNS queries - a bit like the Linux top command, but applied to the active flows. The table shows that the fourth most frequent query is the ANY(255) query type.

The ANY(255) query is often used for DNS amplification attacks since it asks the name server for all the records within the domain, resulting in a large response that amplifies the traffic in the attack.
The above chart (http://localhost:8008/activeflows/ALL/dnsqany/html?aggMode=sum) looks at the domain names being queried by the ANY(255) queries. Domain names in the list are known to be associated with DNS amplification attacks, see DNS Amplification Attacks Observer, so it appears that there are open DNS servers being used to amplify DNS attacks in this data center.

The trend chart above (http://localhost:8008/metric/ALL/sum:dnsqanytot/html) looks at the overall level of ANY(255) requests in the data center. The trend is increasing as a new DNS amplification attack is launched.
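
A rise like this could also trigger an automated alert rather than relying on someone watching the chart. The following is a rough sketch, assuming the per-domain dnsqany flow defined earlier and the setThreshold() and setEventHandler() calls as they are used in the DDoS script published on this blog. The limit of 100 frames per second and the flagged table are hypothetical choices for illustration, and the registration is guarded so that the snippet also loads outside sFlow-RT.

```javascript
// Hypothetical limit: ANY query frames per second for a single domain.
// Tune this to the normal query mix in your own network.
var anyLimit = 100;

// domain name -> number of threshold events seen (illustrative bookkeeping)
var flagged = {};

// Record the domain whose ANY query rate crossed the limit.
function onAnyFlood(evt) {
  var domain = evt.flowKey;
  flagged[domain] = (flagged[domain] || 0) + 1;
}

// These functions exist only inside sFlow-RT, so register conditionally.
if (typeof setThreshold === 'function' && typeof setEventHandler === 'function') {
  setThreshold('dnsqany', {metric: 'dnsqany', value: anyLimit, byFlow: true});
  setEventHandler(onAnyFlood, ['dnsqany']);
}
```

The handler only counts events here; in practice it is the natural place to send a notification or invoke a control action.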

There are a number of more detailed flow definitions that can be created:
  1. identify the open name servers: include ipdestination in the flow definition
  2. identify target of the attack: include ipsource in the flow definition 
  3. identify target country: include sourcecountry in the flow definition
  4. identify compromised hosts: include macsource in the flow definition
Note: Examples of these detailed flows have been omitted to preserve anonymity.
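
The definitions themselves contain no site-specific data, so a sketch of what they might look like is shown here. It reuses the filter and timeout from the earlier dnsqany flow and swaps in the keys named in the list above. The flow names (dnsanyservers and so on) are hypothetical, and the setFlow() call is guarded so the snippet also loads outside sFlow-RT.

```javascript
// Same filter as the earlier flows: DNS requests of type ANY(255)
var anyFilter = 'dnsqr=false&dnsqtype=255';

// Build a flow definition keyed on a single attribute
function anyFlow(key) {
  return {keys: key, value: 'frames', filter: anyFilter, t: 20};
}

// Flow names are hypothetical; the keys come from the list above
var detailedFlows = {
  dnsanyservers: anyFlow('ipdestination'), // open name servers
  dnsanytargets: anyFlow('ipsource'),      // spoofed source = attack target
  dnsanycountry: anyFlow('sourcecountry'), // target country
  dnsanyhosts: anyFlow('macsource')        // compromised hosts
};

// setFlow() exists only inside sFlow-RT, so register conditionally
Object.keys(detailedFlows).forEach(function (name) {
  if (typeof setFlow === 'function') setFlow(name, detailedFlows[name]);
});
```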
Figure 3: Performance aware software defined networking
Incorporating sFlow analytics in a performance aware software defined networking solution offers the opportunity to automate a response. By itself, OpenFlow is not DNS aware. However, combining the detection capabilities of sFlow with OpenFlow rules that selectively steer traffic based on IP source, destination, protocol and port allows attacks to be blocked, or a DNS proxy to be inserted in the packet path to selectively drop requests.

While this example focused on a data center hosting DNS servers, a similar approach could be used to monitor campus networks. Detecting hosts that are spoofing their source addresses and generating suspect DNS requests is a useful signature for identifying compromised hosts. In this case, the SDN controller would respond by isolating the compromised system from the rest of the network.
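
One simple way to recognize a spoofed source address is to check whether it falls outside every prefix assigned to the campus. The sketch below shows that check in plain JavaScript, assuming IPv4 only. The function names and the example prefixes are hypothetical, and the code is independent of sFlow-RT.

```javascript
// Convert a dotted-quad IPv4 address to an unsigned integer
function ipToInt(ip) {
  var parts = ip.split('.');
  if (parts.length !== 4) throw new Error('bad IPv4 address: ' + ip);
  var n = 0;
  for (var i = 0; i < 4; i++) {
    if (!/^\d+$/.test(parts[i]) || Number(parts[i]) > 255) {
      throw new Error('bad IPv4 address: ' + ip);
    }
    // multiply rather than shift to avoid 32-bit signed overflow
    n = n * 256 + Number(parts[i]);
  }
  return n;
}

// True if ip lies inside the CIDR block, e.g. '10.0.0.0/8'
function inCidr(ip, cidr) {
  var pieces = cidr.split('/');
  var size = Math.pow(2, 32 - Number(pieces[1]));
  var base = Math.floor(ipToInt(pieces[0]) / size) * size;
  var addr = ipToInt(ip);
  return addr >= base && addr < base + size;
}

// A locally generated packet whose source is outside all local prefixes
// is treated as spoofed
function isSpoofed(source, localPrefixes) {
  return !localPrefixes.some(function (cidr) { return inCidr(source, cidr); });
}

// Hypothetical campus address space
var campus = ['10.0.0.0/8', '192.168.0.0/16'];
var suspect = isSpoofed('198.51.100.7', campus);  // outside both prefixes
var ordinary = isSpoofed('10.20.30.40', campus);  // inside 10.0.0.0/8
```

The check is only meaningful for traffic observed entering the network from campus hosts, where a legitimate source address must belong to a local prefix.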

DNS amplification attacks are a serious problem that is difficult to address because the attacker is two steps removed from their victims (hidden behind compromised hosts and open DNS servers). DNS amplification attacks have limited impact on the intermediate networks and may go unnoticed, even though the combined effect of all the traffic arriving at the target network can be devastating. Software defined networking offers the promise of intelligent networks that can automatically respond to changing traffic conditions and security threats, providing a way to share and automate best practices and reducing operating costs to the point where intermediate networks can play a larger role in reducing the impact of these attacks.

Tuesday, October 1, 2013

Embedding SDN applications

Figure 1: Performance aware software defined networking
Performance aware software defined networking describes a general architecture for integrating real-time analytics in software defined networking (SDN) stacks to support applications such as load balancing, DDoS mitigation, traffic marking, and multi-tenant performance isolation.

Examples on this blog have used Python or node.js scripts to create demonstrations. However, while external scripts are a quick way to build prototypes, moving from prototype to production quality implementations can be a challenge.

Much of the complexity in developing external control applications involves sharing and distributing state with the analytics engine and OpenFlow controller. This complexity can be greatly reduced if the application can be embedded in the analytics software, or in the OpenFlow controller. Deciding whether to embed an application in the analytics engine, or in the controller, should be based on how tightly coupled the application is to each of these services. In the case of performance management applications, most of the interaction is with the analytics engine and so it makes most sense to embed application logic within the analytics engine.

The following example demonstrates the benefits of embedding by taking the DDoS mitigation script described in Frenetic, Pyretic and Resonance and re-implementing it as an embedded application using the recently released sFlow-RT analytics engine scripting API.

The following script implements the DDoS mitigation application:
// author: Peter
// version: 1.0
// date: 9/30/2013
// description: DDoS controller script

include('extras/json2.js');

var flowkeys = 'ipsource';
var value = 'frames';
var filter = 'outputifindex!=discard&direction=ingress&sourcegroup=external';
var threshold = 1000; // 1000 packets per second
var groups = {'external':['0.0.0.0/0'],'internal':['10.0.0.2/32']};

var metricName = 'ddos';
var controls = {};
var enabled = true;

function sendy(address,type,state) {
  var result = runCmd(['../pyretic/pyretic/pyresonance/sendy_json.py',
                       '-i',address,'-e',type,'-V',state]);
}

function block(address) {
  if(!controls[address]) {
     sendy(address,'auth','clear');
     controls[address] = 'blocked';
  }
}
function allow(address) {
  if(controls[address]) {
     sendy(address,'auth','authenticated');
     delete controls[address];
  }
}

setGroups(groups);
setFlow(metricName, {keys:flowkeys, value:value, filter:filter, n:10, t:2});
setThreshold(metricName,{metric:metricName,value:threshold,byFlow:true});

setEventHandler(function(evt) {
  if(!enabled) return;

  var address = evt.flowKey;
  block(address);
},[metricName]);

setHttpHandler(function(request) {
  var result = {};
  try {
    var action = '' + request.query.action;
    switch(action) {
    case 'block':
       var address = request.query.address[0];
       if(address) block(address);
       break;
    case 'allow':
       var address = request.query.address[0];
       if(address) allow(address);
       break;
    case 'enable':
      enabled = true;
      break;
    case 'disable':
      enabled = false;
      break;
    }
  }
  catch(e) { result.error = e.message }
  result.controls = controls;
  result.enabled = enabled;
  return JSON.stringify(result);
});
The following command line argument loads the script on startup:
-Dscript.file=ddos.js
In addition to providing the functionality of the original script, the embedded script also includes an HTTP interface for remotely monitoring and controlling the application. For example, manually blocking or allowing an address is accomplished with the following commands:
$ curl "http://10.0.0.54:8008/script/ddos.js/json?action=block&address=10.0.0.1"
{"controls":{"10.0.0.1":"blocked"},"enabled":true}
$ curl "http://10.0.0.54:8008/script/ddos.js/json?action=allow&address=10.0.0.1"
{"controls":{},"enabled":true}
Enabling and disabling the controller is also possible:
$ curl http://10.0.0.54:8008/script/ddos.js/json?action=enable
{"controls":{},"enabled":true}
$ curl http://10.0.0.54:8008/script/ddos.js/json?action=disable
{"controls":{},"enabled":false}
The following example uses the test bed described in Frenetic, Pyretic and Resonance to demonstrate the DDoS controller. Open a web browser to view a trend of traffic and then perform the following steps:
  1. disable the controller
  2. perform a simulated DoS attack (using a flood ping)
  3. enable the controller
  4. simulate a second DoS attack.
Figure 2: DDoS attack traffic with and without controller
Figure 2 shows the results of the demonstration. When the controller is disabled, the attack traffic reaches 6,000 packets per second and persists until the attacker stops sending. When the controller is enabled, traffic is stopped the instant it hits the 1,000 packet per second threshold in the application.
Figure 3: RESTful control of switches
While the previous example demonstrated integration with an OpenFlow controller, controller-less deployments are also possible, see RESTful control of switches. sFlow-RT scripts can access REST APIs or use TCL/expect to access switch CLIs, using the http() and runCmd() functions to reconfigure routing policy and access controls in order to block or redirect attacks. To do this, modify the block() and allow() functions in the example script. The case study described in DDoS reconfigures router BGP settings and ACLs to stop attacks. In addition, monitoring can be added to the script, for example, using the syslog() function to send notifications to a Security Information and Event Management (SIEM) system.
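
One way to make the control backend swappable is to pass the device-specific actions in as a small driver object, leaving the bookkeeping from the example script unchanged. The sketch below illustrates the idea. The makeController function and the driver shape are hypothetical, and the recording driver is a stand-in for calls to http() or runCmd(), whose arguments depend on the device being controlled.

```javascript
// Wrap the block/allow bookkeeping from the DDoS script around a driver.
// makeController and the driver shape are hypothetical, for illustration.
function makeController(driver) {
  var controls = {}; // address -> 'blocked', as in the original script
  return {
    block: function (address) {
      if (!controls[address]) {
        driver.block(address);
        controls[address] = 'blocked';
      }
    },
    allow: function (address) {
      if (controls[address]) {
        driver.allow(address);
        delete controls[address];
      }
    },
    controls: controls
  };
}

// Stand-in driver that only records what it was asked to do.
// A real driver might call http() or runCmd() here instead.
var calls = [];
var controller = makeController({
  block: function (address) { calls.push('block ' + address); },
  allow: function (address) { calls.push('allow ' + address); }
});

controller.block('10.0.0.1');
controller.block('10.0.0.1'); // ignored: already blocked
controller.allow('10.0.0.1');
controller.allow('10.0.0.1'); // ignored: no longer blocked
```

Keeping the state table separate from the driver means the event handler and HTTP handler in the script would not need to change when the enforcement mechanism does.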

While important classes of SDN application make sense integrated within the SDN controller (e.g. network virtualization, virtual firewalls, routers, etc.), the use cases described in this article demonstrate that integrating performance aware SDN applications (e.g. load balancing, DDoS mitigation, traffic marking, multi-tenant performance isolation, etc.) within the analytics platform makes architectural sense.