sFlow: Cumulus Linux 3.4 REST API

Sunday, August 27, 2017

Cumulus Linux 3.4 REST API

The latest Cumulus Linux 3.4 release includes a REST API. This article demonstrates how the REST API can be used to automatically deploy traffic controls based on real-time sFlow telemetry. DDoS mitigation with Cumulus Linux describes how sFlow-RT can detect Distributed Denial of Service (DDoS) attacks in real time and deploy automated controls.

The following ddos.js script is modified to use the REST API to send Network Command Line Utility (NCLU) commands to add and remove ACLs (see Installing and Managing ACL Rules with NCLU):

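var user = "cumulus";           // REST API user name
var password = "CumulusLinux!"; // REST API password
var thresh = 10000;             // attack threshold, in frames per second
var block_minutes = 1;          // minimum time an ACL is kept in place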

setFlow('udp_target',{keys:'ipdestination,udpsourceport',value:'frames'});

setThreshold('attack',{metric:'udp_target', value:thresh, byFlow:true, timeout:10});

function restCmds(agent,cmds) {
  for(var i = 0; i < cmds.length; i++) {
    let msg = {cmd:cmds[i]};
    http("https://"+agent+":8080/nclu/v1/rpc",
         "post","application/json",JSON.stringify(msg),user,password);
  }
}

var controls = {};
var id = 0;
setEventHandler(function(evt) {
  var key = evt.agent + ',' + evt.flowKey;
  if(controls[key]) return;

  var ifname = metric(evt.agent,evt.dataSource+".ifname")[0].metricValue;
  if(!ifname) return;

  var now = (new Date()).getTime();
  var name = 'ddos'+id++;
  var [ip,port] = evt.flowKey.split(',');
  var cmds = [
    'add acl ipv4 '+name+' drop udp source-ip any source-port '+port+' dest-ip '+ip+' dest-port any',
    'add int '+ifname+' acl ipv4 '+name+' inbound',
    'commit'
  ];
  controls[key] = {time:now, target: ip, port: port, agent:evt.agent, metric:evt.dataSource+'.'+evt.metric, key:evt.flowKey, name:name};
  try { restCmds(evt.agent, cmds); }
  catch(e) { logSevere('failed to add ACL, '+e); }
  logInfo('block target='+ip+' port='+port+' agent=' + evt.agent); 
},['attack']);

setIntervalHandler(function() {
  var now = (new Date()).getTime();
  for(var key in controls) {
    if(now - controls[key].time < 1000 * 60 * block_minutes) continue;
    var ctl = controls[key];
    if(thresholdTriggered('attack',ctl.agent,ctl.metric,ctl.key)) continue;

    delete controls[key];
    var cmds = [
      'del acl ipv4 '+ctl.name,
      'commit'
    ];
    try { restCmds(ctl.agent,cmds); }
    catch(e) { logSevere('failed to remove ACL, ' + e); }
    logInfo('allow target='+ctl.target+' port='+ctl.port+' agent='+ctl.agent);
  }
});
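
To check the REST requests without a switch, the request construction can be separated from the sFlow-RT http() call. The following is a minimal sketch in plain JavaScript. The helper names agentHost and buildRequests are hypothetical and are not part of sFlow-RT or NCLU. It assumes the same port (8080) and path (/nclu/v1/rpc) used in the script above, and it wraps IPv6 agent addresses in square brackets, since a bare IPv6 literal followed by :8080 does not form a valid URL.

```javascript
// Hypothetical helper: return a host string that is safe to use in a URL.
// IPv6 literals contain ':' and must be enclosed in brackets.
// Assumption: zone identifiers (e.g. fe80::1%eth0) are not handled here.
function agentHost(agent) {
  return agent.indexOf(':') >= 0 ? '[' + agent + ']' : agent;
}

// Hypothetical helper: build one request per NCLU command, using the same
// URL layout and JSON body ({cmd: ...}) as restCmds in the script above.
function buildRequests(agent, cmds) {
  var url = 'https://' + agentHost(agent) + ':8080/nclu/v1/rpc';
  return cmds.map(function(cmd) {
    return { url: url, body: JSON.stringify({ cmd: cmd }) };
  });
}
```

Within the script, restCmds could loop over the result of buildRequests(agent, cmds) and pass each url and body to http() along with the user and password.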

The quickest way to test the script is to use Docker to run sFlow-RT:

docker run -v $PWD/ddos.js:/sflow-rt/ddos.js \
-e "RTPROP=-Dscript.file=ddos.js -Dhttp.timeout.read=60000" \
-p 6343:6343/udp -p 8008:8008 sflow/sflow-rt

This solution can be tested using freely available software. The setup shown at the top of this article was constructed using a Cumulus VX virtual machine running on VirtualBox. The Attacker and Target virtual machines are Linux virtual machines used to simulate the DDoS attack.

A DNS amplification attack can be simulated using hping3. Run the following command on the Attacker host:

sudo hping3 --flood --udp -k -s 53 192.168.2.1

Run tcpdump on the Target host to see if the attack is getting through:

sudo tcpdump -i eth1 udp port 53

Each time an attack is launched, a new ACL is added that matches the attack signature and drops the traffic. The ACL is kept in place for at least block_minutes and is removed once the attack ends. The following sFlow-RT log messages show the results:

2017-08-26T17:01:24+0000 INFO: Listening, sFlow port 6343
2017-08-26T17:01:24+0000 INFO: Listening, HTTP port 8008
2017-08-26T17:01:24+0000 INFO: ddos.js started
2017-08-26T17:03:07+0000 INFO: block target=192.168.2.1 port=53 agent=10.0.0.61
2017-08-26T17:03:49+0000 INFO: allow target=192.168.2.1 port=53 agent=10.0.0.61
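
The removal rule described above (hold the ACL for at least block_minutes, then remove it only once the threshold is no longer triggered) can be written as a small pure function. This is a sketch for illustration only. The name shouldRemove is hypothetical, and the stillTriggered argument stands in for the result of thresholdTriggered() in the script.

```javascript
// Hypothetical helper mirroring the checks in the script's interval handler.
// control.time is the time the ACL was added, in milliseconds.
function shouldRemove(control, now, blockMinutes, stillTriggered) {
  var heldMs = now - control.time;
  // Keep the ACL until the minimum block time has elapsed.
  if (heldMs < 1000 * 60 * blockMinutes) return false;
  // After that, remove it only if the attack is no longer over threshold.
  return !stillTriggered;
}
```

For example, with blockMinutes set to 1, an ACL added at time 0 is kept at 30 seconds regardless of traffic, kept at 60 seconds if the attack is still over threshold, and removed at 60 seconds if the attack has stopped.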

REST API for Cumulus Linux ACLs describes the acl_server daemon that was used in the original article. The acl_server daemon is optimized for real-time performance, supporting use cases in which multiple traffic controls need to be quickly added and removed, e.g., DDoS mitigation, marking large flows, ECMP load balancing, and packet brokers.

A key benefit of the openness of Cumulus Linux is that you can install software to suit your use case. Other examples include: BGP FlowSpec on white box switch, Internet router using Cumulus Linux, Topology discovery with Cumulus Linux, Black hole detection, and Docker networking with IPVLAN and Cumulus Linux.

25 comments:

Networker, October 11, 2017 at 3:57 AM
I have tried to run this code, but it does not seem to work. Since NCLU commands start with "net", I tried "net add acl ..." instead of "add acl ...", but that did not help (the script runs and I can see in the sFlow-RT GUI that it has created the filters). I've configured the agent correctly and I can see it in sFlow-RT; however, when I launch the attack, no ACL is added. I created ddos.js in the sFlow-RT directory and started it using the command env "RTPROP=-Dscript.file=ddos.js" ./start.sh. What am I doing wrong?
Networker, November 4, 2017 at 1:44 PM
This is my current topology for the test:

attacker<-->Router<-->CumulusVX<-->Router<-->target
|
|
switch<--->sFlow(DDoS script)
|
internet

Should the script run on the remote sFlow-RT machine? I think the topology is OK. Any feedback?

I used the following command to test the API remotely:

curl -X POST -k -u user:pw -H "Content-Type: application/json" -d '{"cmd": "show counters"}' https://1.1.1.1:8080/nclu/v1/rpc

Of course, that was after changing the username, password and IP address. I was able to get multiple results, not only for show counters, but also for the OSPF configuration on the Cumulus VX, such as interfaces.

The question is, how do I know whether I have the right configuration in "/etc/nginx-restapi-chassis.conf"?

When I ran the command "sudo nginx -c /etc/nginx-restapi-chassis.conf -t",

I got a successful test, with no warnings.

Is there anything else that needs to be configured in the chassis.conf, besides the normal instructions?

Regarding the sampling and polling values, I uncommented the default values in hsflowd.conf. Should they be something other than the defaults?

I also uncommented an option to listen for JSON from applications on a certain pre-defined port number. Should I leave it uncommented?

Regards

Networker, November 6, 2017 at 8:34 AM
Sorry, the Cumulus VX is supposed to be connected to sFlow-RT and the internet via a switch, and not the target.
Networker, November 7, 2017 at 3:30 AM
It seems that nginx-restapi-chassis.conf is pre-configured for Cumulus VX, because I checked it and everything seems all right. Should I leave the option listen [::]:8080 as it is (the default setting), though?

I ran the example of adding a Layer 2 bridge br212 using curl PUT and the bridge was created successfully; however, it took a couple of minutes to take effect. This is the output when I checked it on the switch:

    Name   Master  Speed  MTU    Mode          Remote Host  Remote Port      Summary
--  -----  ------  -----  -----  ------------  -----------  ---------------  ----------------------------------------------------
UP  lo     None    N/A    65536  Loopback                                    IP: 127.0.0.1/8, ::1/128
UP  eth0   None    1G     1500   Mgmt                                        IP: 192.168.122.133/24(DHCP)
UP  swp1   None    1G     1500   Interface/L3  R1           FastEthernet0/0  IP: 10.10.11.2/24
UP  swp2   None    1G     1500   Interface/L3  R2           FastEthernet0/0  IP: 10.10.22.2/24
UP  br212  None    N/A    1500   Bridge/L2                                   802.1q Tag: Untagged STP: Disabled Vlan Aware Bridge

What I don't get is how the attack flow is being detected (target IP, source port and number of frames), yet the threshold is not being triggered and the event is not being logged. sFlow-RT has complete information about the switch and its interfaces, as I checked in the "Agents" tab. Is the problem that the code is failing to execute? Or is the problem with the Cumulus VX platform on GNS3? The curl commands worked fine, though, as I mentioned above.
Networker, November 8, 2017 at 3:44 PM
I changed the threshold value to 300, and it seemed to work somehow. The attack is being detected on the sFlow-RT Flows page. The attack has also been logged on the sFlow-RT Events page, where I can see the attack details. As for the graph, it keeps showing spikes: once the traffic exceeds the threshold value of 300, it immediately drops below 300, then rises again afterwards and drops below 300 again. It is somewhat similar to this graph:
https://robertscribbler.files.wordpress.com/2016/05/stephan-rahmstorf-temperature-anomaly.jpg

However, the sFlow-RT graph is not curving up like the one in the link.

Kali Linux shows that there is 100% packet loss, yet the target's tcpdump output shows that the packets are being received.

The sFlow-RT script runs and prints the following messages:

date/time...-0500 SEVERE: failed to remove ACL, InternalError: Malformed URL java.net.MalformedURLException: For input string: "agent:MAC:address:8080" (ddos.js#13)

date/time...-0500 INFO: block target=192.168.22.2 port=53 agent=agent:MAC:address

Any feedback? I don't fully understand whether the attack is being mitigated.

Thanks a lot for your help and patience.

Regards
Networker, November 9, 2017 at 2:20 AM
I reduced the sampling rate for 1G interfaces (sampling.1G) to 40. Nothing changed, though. You are indeed right: the ACL is not being created, because the UDP traffic is still being captured by tcpdump on the victim. The graph still looks the same; it keeps spiking above and below 300, as I described yesterday.

Sorry, it was my mistake; I didn't check properly. It's not a MAC address, it is the IPv6 address of the management interface (eth0) on the Cumulus switch.

When I check the sFlow-RT Agents tab, I can see the agent, but it is shown with an IPv6 address instead of an IPv4 address.

I think that's why the ACL is not being created, and that's why I'm getting the error:

SEVERE: failed to add ACL, InternalError: Malformed URL java.net.MalformedURLException: For input string: "agentIPv6:8080" (ddos.js#13)

I'm going to try forcing the Cumulus VX to have an IPv4 address and no IPv6 address, because with my connection via VMware it managed to get an IPv6 address from my home router.

Or what do you think?
Networker, November 10, 2017 at 2:26 AM
I eventually used your method of using the IPv6 address for the agent, and it worked. However, if I've understood it right, the target gets blocked, so even if you launch another attack from another machine, the attackers won't be able to access the target until the script unblocks the target again (allow target). Did I understand that right?

Thanks a lot for your help.
Networker, November 12, 2017 at 4:25 AM
OK, noted. During my testing, I noticed that when I launch an attack, it gets successfully mitigated: the GUI and tcpdump confirm it, as does the ACL on the Cumulus VX, and the CLI prints block and then allow after the blocking time has passed. However, when I launch a second attack after the target has been allowed again, the second attack doesn't get mitigated or detected. Does the script run only once? Is something wrong with my virtual environment?

I also had two questions in mind. What if I want to run the script on more than a single agent? And what if I want to run more than one script on the same agent? Does sFlow-RT support multiple scripts?

Regards
Peter, November 16, 2017 at 2:38 PM
RESTful control of Cumulus Linux ACLs describes how to integrate real-time control of ACLs with the Cumulus Linux 3.4 HTTP API.
Networker, November 20, 2017 at 6:47 AM
Thanks Peter, I'll read through it and run a test.
Networker, November 20, 2017 at 9:35 AM
I think I've tracked down why the second attack is not being mitigated. I noticed that if I don't issue the command

sudo iptables -I FORWARD -j NFLOG --nflog-group 1 --nflog-prefix SFLOW

after each attack, the next attack will not be detected or mitigated. Is there any explanation for this?

I was also wondering about the "data source" field on the Events page. What does it mean? For me it was showing "3".

Regards