- sFlow real-time traffic data identifies active BGP routes
- BGP path attributes are available in flow definitions
Setup
First, download sFlow-RT. Next, create a configuration file, bgp.js, in the sFlow-RT home directory with the following contents:

    var reflectorIP = '10.0.0.254';
    var myAS = '65162';
    var myID = '10.0.0.162';
    var sFlowAgentIP = '10.0.0.253';

    // allow BGP connection from reflectorIP
    bgpAddNeighbor(reflectorIP,myAS,myID);

    // direct sFlow from sFlowAgentIP to reflectorIP routing table
    // calculate a 60 second moving average byte rate for each route
    bgpAddSource(sFlowAgentIP,reflectorIP,60,'bytes');

The following sFlow-RT System Properties load the configuration file and enable BGP:
- script.file=bgp.js
- bgp.start=yes
Start sFlow-RT:

    $ ./start.sh
    2016-06-28T13:14:34-0700 INFO: Listening, BGP port 1179
    2016-06-28T13:14:35-0700 INFO: Listening, sFlow port 6343
    2016-06-28T13:14:35-0700 INFO: Starting the Jetty [HTTP/1.1] server on port 8008
    2016-06-28T13:14:35-0700 INFO: Starting com.sflow.rt.rest.SFlowApplication application
    2016-06-28T13:14:35-0700 INFO: Listening, http://localhost:8008
    2016-06-28T13:14:36-0700 INFO: bgp.js started
    2016-06-28T13:14:36-0700 INFO: bgp.js stopped

Configure the switch (10.0.0.253) to send sFlow to the sFlow-RT instance (10.0.0.162); see Switch configurations for vendor-specific configurations. Check the sFlow-RT /agents/html page to verify that sFlow telemetry is being received from the agent.
Next, configure the router (10.0.0.254) to reflect BGP routes to the sFlow-RT instance (10.0.0.162):
    router bgp 65254
     bgp router-id 10.0.0.254
     neighbor 10.0.0.162 remote-as 65162
     neighbor 10.0.0.162 port 1179
     neighbor 10.0.0.162 timers connect 30
     neighbor 10.0.0.162 route-reflector-client
     neighbor 10.0.0.162 activate

The following sFlow-RT log entry confirms that a BGP session has been established:
2016-06-28T13:20:17-0700 INFO: BGP open 10.0.0.254 53975
Query active routes
The following cURL command uses the REST API to identify the top 5 IPv4 prefixes ranked by traffic (measured in bytes/second):

    curl "http://10.0.0.162:8008/bgp/topprefixes/10.0.0.254/json?maxPrefixes=5"

{
"as": 65254,
"direction": "destination",
"id": "10.0.0.254",
"learnedPrefixesAdded": 691838,
"learnedPrefixesRemoved": 0,
"nPrefixes": 691838,
"pushedPrefixesAdded": 0,
"pushedPrefixesRemoved": 0,
"startTime": 1467322582093,
"state": "established",
"topPrefixes": [
{
"aspath": "NNNN-NNNN-NNNNN-NNNNN",
"localpref": 100,
"med": 1,
"nexthop": "NNN.NNN.NNN.N",
"origin": "IGP",
"prefix": "NN.NNN.NN.0/24",
"value": 9.735462342126082E7
},
{
"aspath": "NNN-NNNN",
"localpref": 100,
"med": 1,
"nexthop": "NNN.NNN.NNN.N",
"origin": "IGP",
"prefix": "NN.NNN.NNN.0/24",
"value": 7.347515546153101E7
},
{
"aspath": "NNNN-NNNNNN-NNNNN",
"localpref": 100,
"med": 1,
"nexthop": "NNN.NNN.NNN.N",
"origin": "IGP",
"prefix": "NN.NNN.NN.N/24",
"value": 4.26137765317916E7
},
{
"aspath": "NNNN-NNNN-NNNN",
"localpref": 100,
"med": 1,
"nexthop": "NNN.NNN.NNN.N",
"origin": "IGP",
"prefix": "NNN.NN.NNN.0/24",
"value": 2.6633190792947102E7
},
{
"aspath": "NNNN-NNN-NNNNN",
"localpref": 100,
"med": 10001,
"nexthop": "NNN.NNN.NNN.NN",
"origin": "IGP",
"prefix": "NN.NNN.NNN.0/24",
"value": 1.5500941476103483E7
}
],
"valuePercentCoverage": 71.38452058755995,
"valueTopPrefixes": 2.55577687683634E8,
"valueTotal": 3.5802956380458355E8
}
In addition to returning the top prefixes, the query returns information about the amount of traffic covered by these prefixes. In this case, the valuePercentCoverage of 71.38 indicates that 71.38% of the traffic is covered by the top 5 prefixes.

Note: Identifying numeric digits have been substituted with the letter N to protect privacy.

Additional arguments can be used to refine the top prefixes query:
- maxPrefixes, maximum number of prefixes in the result
- minValue, only include entries with a value greater than the threshold
- direction, specify "ingress" for traffic arriving from remote networks and "egress" for traffic destined for remote networks
- minPrefix, exclude shorter prefixes, e.g. minPrefix=1 would exclude 0.0.0.0/0.
- includeCovered, set to "true" to also include prefixes that are covered by the top prefix, but wouldn't otherwise make the list. For example, if 10.1.0.0/16 was included, then 10.1.3.0/24 would also be included if it were in the set of prefixes advertised by the router.
- pruneCovered, set to "true" to eliminate covered prefixes that share the same next hop.
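
The arguments above are ordinary URL query parameters, so a query can be assembled programmatically. The following is a minimal sketch rather than part of sFlow-RT: the helper names top_prefixes_url and percent_coverage are hypothetical, and the sample values are copied from the JSON response shown earlier. It builds the query URL and recomputes the coverage figure from valueTopPrefixes and valueTotal.

```python
from urllib.parse import urlencode

def top_prefixes_url(base, router, **args):
    """Build a /bgp/topprefixes query URL (hypothetical helper).

    args are the query arguments described above, e.g. maxPrefixes=5.
    """
    url = f"{base}/bgp/topprefixes/{router}/json"
    # Skip arguments left as None; sort by name for a stable query string.
    query = urlencode(sorted((k, v) for k, v in args.items() if v is not None))
    return f"{url}?{query}" if query else url

def percent_coverage(result):
    """Recompute the share of total traffic carried by the returned prefixes."""
    return 100.0 * result["valueTopPrefixes"] / result["valueTotal"]

url = top_prefixes_url("http://10.0.0.162:8008", "10.0.0.254",
                       maxPrefixes=5, minPrefix=1, pruneCovered="true")
print(url)

# Values taken from the example response shown earlier in this article.
sample = {"valueTopPrefixes": 2.55577687683634E8,
          "valueTotal": 3.5802956380458355E8}
print(round(percent_coverage(sample), 2))
```

The resulting URL can be passed to cURL or to requests.get in the same way as the query shown earlier.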
Writing Applications describes how to build analytics-driven controller applications using sFlow-RT's REST and embedded JavaScript APIs. For example, SDN router using merchant silicon top of rack switch, White box Internet router PoC, and Active Route Manager demonstrate how real-time identification of active routes can be used to efficiently manage the limited hardware resources of commodity white box switches so that they can handle a full Internet routing table of over 600,000 routes.
Defining Flows
The following flow attributes learned from the BGP session are merged with sFlow data received from switch 10.0.0.253:
- ipsourcemaskbits
- ipdestinationmaskbits
- bgpnexthop
- bgpnexthop6
- bgpas
- bgpsourceas
- bgpsourcepeeras
- bgpdestinationas
- bgpdestinationpeeras
- bgpdestinationaspath
- bgpcommunities
- bgplocalpref
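
As a rough illustration of how these attributes might be combined into a flow definition, the sketch below builds the JSON body for a flow keyed on destination AS and BGP next hop. The helper flow_definition and the flow name dst_as are hypothetical; the keys, value and filter fields follow the form used by the scripts in this article.

```python
import json

def flow_definition(keys, value="bytes", flow_filter=None):
    """Return a flow definition dict (hypothetical helper).

    keys is a list of flow attribute names, joined with commas
    as in the flow definitions used by the scripts in this article.
    """
    definition = {"keys": ",".join(keys), "value": value}
    if flow_filter:
        definition["filter"] = flow_filter
    return definition

# Hypothetical flow: bytes per destination AS and BGP next hop.
dst_as_flow = flow_definition(["bgpdestinationas", "bgpnexthop"])
print(json.dumps(dst_as_flow, sort_keys=True))

# To install it on a running sFlow-RT instance (needs the requests package):
# requests.put("http://localhost:8008/flow/dst_as/json",
#              data=json.dumps(dst_as_flow))
```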
Writing Applications describes how to program sFlow-RT flow caches, using the flow keys to select and identify traffic flows. For example, the following Python script uses the REST API to identify the source networks associated with a UDP amplification DDoS attack:
    #!/usr/bin/env python
    import requests
    import json

    # DNS port
    reflector_port = '53'
    max_pps = 100000

    rest = 'http://localhost:8008'

    # define flow
    flow = {'keys':'mask:ipsource,bgpsourceas',
            'filter':'udpsourceport='+reflector_port,
            'value':'frames'}
    requests.put(rest+'/flow/ddos/json',data=json.dumps(flow))

    # set threshold
    threshold = {'metric':'ddos', 'value': max_pps, 'byFlow':True}
    requests.put(rest+'/threshold/ddos/json',data=json.dumps(threshold))

    # tail event log
    eventurl = rest+'/events/json?thresholdID=ddos&maxEvents=10&timeout=60'
    eventID = -1
    while True:
        r = requests.get(eventurl + "&eventID=" + str(eventID))
        if r.status_code != 200: break
        events = r.json()
        if len(events) == 0: continue
        eventID = events[0]["eventID"]
        events.reverse()
        for e in events:
            print(e['flowKey'])

Running the script generates a log of the source networks and AS numbers that exceed 100,000 packets per second of DNS response traffic (again, identifying numeric digits have been substituted with the letter N to protect privacy):
    $ ./ddos.py
    NNN.NNN.0.0/13,NNNN
    NNN.NNN.NNN.NNN/27,NNNN
    NNN.NN.NNN.NNN/28,NNNNN
    NNN.NNN.NN.0/24,NNNNN

A variation on the script can be used to identify large "Elephant" flows and their destination AS paths (showing the list of networks that packets traverse en route to their destination):
    #!/usr/bin/env python
    import requests
    import json

    # 1Gbit/s expressed in bytes per second
    max_Bps = 1000000000/8

    rest = 'http://localhost:8008'

    # define flow
    flow = {
        'keys':'ipsource,ipdestination,tcpsourceport,tcpdestinationport,bgpdestinationaspath',
        'value':'bytes'}
    requests.put(rest+'/flow/elephant/json',data=json.dumps(flow))

    # set threshold
    threshold = {'metric':'elephant', 'value': max_Bps, 'byFlow':True}
    requests.put(rest+'/threshold/elephant/json',data=json.dumps(threshold))

    # tail event log
    eventurl = rest+'/events/json?thresholdID=elephant&maxEvents=10&timeout=60'
    eventID = -1
    while True:
        r = requests.get(eventurl + "&eventID=" + str(eventID))
        if r.status_code != 200: break
        events = r.json()
        if len(events) == 0: continue
        eventID = events[0]["eventID"]
        events.reverse()
        for e in events:
            print(e['flowKey'])

Running the script generates real-time notification of the Elephant flows (flows exceeding 1Gbit/s) along with their destination AS paths:
    $ ./elephant.py
    NNN.NN.NN.NNN,NNN.NNN.NN.NN,60789,25,NNNNN
    NNN.NN.NNN.NN,NNN.NN.NN.NNN,443,38016,NNNNN-NNNNN-NNNNN-NNNNN
    NN.NNN.NNN.NNN,NNN.NNN.NN.NN,37030,10059,NNNN-NNN-NNNN
    NNN.NN.NN.NNN,NN.NN.NNN.NNN,34611,25,NNNN

SDN and large flows describes how a small number of Elephant flows typically consume most of the bandwidth, even though they are greatly outnumbered by small (Mice) flows. Dynamic policy-based routing can be targeted at Elephant flows to significantly improve performance and manage network resources: Leaf and spine traffic engineering using segment routing and SDN and WAN optimization using real-time traffic analytics are two examples.
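
In the output of the two scripts, each flowKey appears to be a comma-separated string whose fields follow the order of the flow's keys, with the AS path written as AS numbers joined by hyphens. Under that assumption, the following sketch splits a flowKey back into named fields. The helper parse_flow_key is hypothetical, and the sample addresses and AS numbers are made-up values from the ranges reserved for documentation.

```python
def parse_flow_key(keys, flow_key):
    """Split a flowKey string into a dict of named fields (hypothetical helper).

    Assumes values are comma-separated in the same order as keys and that
    no individual value contains a comma.
    """
    names = keys.split(",")
    values = flow_key.split(",")
    if len(names) != len(values):
        raise ValueError("expected %d fields, got %d" % (len(names), len(values)))
    record = dict(zip(names, values))
    # Turn the hyphen-separated AS path into a list of AS numbers.
    if "bgpdestinationaspath" in record:
        path = record["bgpdestinationaspath"]
        record["bgpdestinationaspath"] = [asn for asn in path.split("-") if asn]
    return record

keys = "ipsource,ipdestination,tcpsourceport,tcpdestinationport,bgpdestinationaspath"
# Made-up sample using documentation address and AS number ranges.
sample_key = "192.0.2.10,198.51.100.20,443,38016,64496-64497-64498"
record = parse_flow_key(keys, sample_key)
print(record["ipdestination"], record["bgpdestinationaspath"])
```

A controller application could use the parsed AS path to decide which upstream route a large flow should take.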
Finally, the real-time BGP analytics don't exist in isolation. The diagram shows how the sFlow-RT real-time analytics engine receives a continuous telemetry stream from sFlow instrumentation built into network, server and application infrastructure, and delivers analytics through APIs that can easily be integrated with a wide variety of on-site and cloud orchestration, DevOps and Software Defined Networking (SDN) tools.
Hello/Hola/Privyet,
I have tried to configure sFlow on Huawei switch devices, and it's working perfectly.
On the other hand, I tried to configure NetStream on Huawei NE40-x3 series routers. I have verified my NetStream configuration against the Huawei user manual, and it's correct.
Why is it not connecting to the sFlow-RT controller? Does sFlow-RT not support the NetStream protocol?
Am I missing any configs?
sFlow-RT is a real-time flow analyzer - see Rapidly detecting large flows, sFlow vs. NetFlow/IPFIX. NetFlow/IPFIX/NetStream aren't supported since they don't provide real-time information.