Wednesday, July 20, 2016

Internet router using Cumulus Linux

Internet router using merchant silicon describes how an inexpensive white box switch running Linux can be used to replace a much costlier Internet router. This article will describe the steps needed to install the software on an x86 based white box switch running Cumulus Linux 3.0.

First, add the Debian Jessie repository:
sudo sh -c 'echo "deb http://ftp.us.debian.org/debian jessie main contrib" > \
/etc/apt/sources.list.d/deb.list'
Next, install Host sFlow, Java, and Bird:
sudo apt-get update
sudo apt-get install hsflowd
sudo apt-get install unzip
sudo apt-get install default-jre-headless
sudo apt-get install bird
Install sFlow-RT (the latest version is available at sFlow-RT.com):
wget http://www.inmon.com/products/sFlow-RT/sflow-rt_2.0-1116.deb
sudo dpkg -i sflow-rt_2.0-1116.deb
Increase the default virtual memory limit for sflowrt (the limit needs to be greater than one third of the RAM on the system for the Java virtual machine to start, see Giant Bug: Cannot run java with a virtual mem limit (ulimit -v)):
sudo sh -c 'echo "sflowrt soft as 2000000" > \
/etc/security/limits.d/99-sflowrt.conf'
Note: Maximum Java heap memory has a default of 1G and is controlled by settings in /usr/local/sflow-rt/conf.d/sflow-rt.jvm file.
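The sizing rule for the virtual memory limit can be checked with a little arithmetic. The following Python sketch is only an illustration: it assumes the "as" limit is expressed in kilobytes (the unit used by limits.conf for address space limits), and the function name and the 4 GB example are hypothetical.

```python
def min_as_limit_kb(total_ram_kb):
    """Smallest whole-KB limit that is strictly greater than 1/3 of RAM."""
    return total_ram_kb // 3 + 1

# Hypothetical switch with 4 GB of RAM, expressed in KB
ram_kb = 4 * 1024 * 1024
print(min_as_limit_kb(ram_kb))
```

On a system of that size, the 2000000 value used in the command above would clear the threshold; a system with more RAM may need a larger value.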

Install the Active Route Manager application:
sudo sh -c "cd /usr/local/sflow-rt; ./get-app.sh sflow-rt active-routes"
Cumulus Networks, sFlow and data center automation describes how to configure the sFlow agent (hsflowd). The sFlow collector address should be set to 127.0.0.1.

Finally, configure Bird and sFlow-RT as described in Internet router using merchant silicon.

The instructions were tested on a Cumulus VX virtual machine, but should work on physical switches. Cumulus VX is free and provides a convenient way to try out Cumulus Linux and create virtual networks to test configurations.

If you are going to experiment with the solution on Cumulus VX, then the following command is needed to enable sFlow traffic monitoring:
sudo iptables -I FORWARD -j NFLOG --nflog-group 1 --nflog-prefix SFLOW
On physical switches the sFlow agent automatically configures packet sampling in the ASIC and is able to monitor all packets (not just the routed packets captured by the iptables command above).

Monday, July 18, 2016

World map

World Map has been released on GitHub, https://github.com/sflow-rt/world-map. The application displays an up to the second view of traffic as animated bubbles overlaid on a world map.

Download and install sFlow-RT to run the world-map application. Enable the System Property, geo.country=resources/config/GeoIP.dat, to allow the application to identify countries based on IP addresses.

Friday, July 15, 2016

Internet router using merchant silicon

SDN router using merchant silicon top of rack switch and Dell OS10 SDN router demo discuss how an inexpensive white box switch running Linux can be used to replace a much costlier Internet router. The key to this solution is the observation that, while the full Internet routing table of over 600,000 routes is too large to fit in white box switch hardware, only a small fraction of the routes carry most of the traffic. Traffic analytics allows the active routes to be identified and installed in the hardware.

This article describes a simple self-contained solution that uses standard APIs and should be able to run on a variety of Linux-based network operating systems, including Cumulus Linux, Dell OS10, Arista EOS, and Cisco NX-OS. The distinguishing feature of this solution is its real-time response: where previous solutions respond to changes in traffic within minutes or hours, this solution updates hardware routes within seconds.

The diagram shows the elements of the solution. Standard sFlow instrumentation embedded in the merchant silicon ASIC data plane in the white box switch provides real-time information on traffic flowing through the switch. The sFlow agent is configured to send the sFlow to an instance of sFlow-RT running on the switch. The Bird routing daemon is used to handle the BGP peering sessions and to install routes in the Linux kernel using the standard netlink interface. The network operating system in turn programs the switch ASIC with the kernel routes so that packets are forwarded by the switch hardware and not by the kernel software.

The key to this solution is Bird's multi-table capabilities. The full Internet routing table learned from BGP peers is installed in a user space table that is not reflected into the kernel. A BGP route reflector session between sFlow-RT and Bird allows sFlow-RT to see the full routing table and combine it with the sFlow telemetry to perform real-time BGP route analytics and identify the currently active routes. A second BGP session allows sFlow-RT to push routes to Bird which in turn pushes the active routes to the kernel, programming the ASIC.

In this example, the following Bird configuration, /etc/bird/bird.conf, was installed on the switch:
# Please refer to the documentation in the bird-doc package or BIRD User's
# Guide on http://bird.network.cz/ for more information on configuring BIRD and
# adding routing protocols.

# Change this into your BIRD router ID. It's a world-wide unique identification
# of your router, usually one of router's IPv4 addresses.
router id 10.0.0.136;

# The Kernel protocol is not a real routing protocol. Instead of communicating
# with other routers in the network, it performs synchronization of BIRD's
# routing tables with the OS kernel.
protocol kernel {
  learn;
  scan time 2;
  import all;
  export all;   # Actually insert routes into the kernel routing table
}

# The Device protocol is not a real routing protocol. It doesn't generate any
# routes and it only serves as a module for getting information about network
# interfaces from the kernel. 
protocol device {
  scan time 60;
}

# Create a new table (disconnected from kernel/master) for peering routes
table peers;

# Create BGP sessions with peers
protocol bgp peer_65134 {
  table peers;
  igp table master;
  local as 65136;
  neighbor 10.0.0.134 as 65134;
  import all;
  export all;
}

protocol bgp peer_65135 {
  table peers;
  igp table master;
  local as 65136;
  neighbor 10.0.0.135 as 65135;
  import all;
  export all;
}

# Copy default route from peers table to master table
protocol pipe {
  table peers;
  peer table master;
  import none;
  export filter {
     if net ~ [ 0.0.0.0/0 ] then accept;
     reject;
  };
}

# Reflect peers table to sFlow-RT
protocol bgp to_sflow_rt {
  table peers;
  igp table master;
  local as 65136;
  neighbor 127.0.0.1 port 1179 as 65136;
  rr client;
  import all;
  export all;
}

# Receive active prefixes from sFlow-RT
protocol bgp from_sflow_rt {
  local as 65136;
  neighbor 10.0.0.136 port 1179 as 65136;
  import all;
  export none;
}
The open source Active Route Manager (ARM) application has been installed in sFlow-RT and the following sFlow-RT configuration, /usr/local/sflow-rt/conf.d/sflow-rt.conf, enables the BGP route reflector and control sessions with Bird:
bgp.start=yes
arm.reflector.ip=127.0.0.1
arm.reflector.as=65136
arm.reflector.id=0.0.0.1
arm.sflow.ip=10.0.0.136
arm.target.ip=10.0.0.136
arm.target.as=65136
arm.target.id=0.0.0.2
arm.target.prefixes=10000
Once configured, operation is entirely automatic. As soon as traffic starts flowing to a new route, the route is identified and installed in the ASIC. If the route later becomes inactive, it is automatically removed from the ASIC to be replaced with a different active route. In this case, the maximum number of routes allowed in the ASIC has been specified as 10,000. This number can be changed to reflect the capacity of the hardware.
The Active Route Manager application has a web interface that provides up to the second visibility into the number of routes, routes installed in hardware, amount of traffic, hardware and software resource utilization etc. In addition, the sFlow-RT REST API can be used to make additional queries.
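The idea of keeping only the busiest routes in a limited hardware table can be sketched in a few lines of Python. This is not the Active Route Manager implementation, only a minimal illustration under assumed inputs: the function name, the dictionary of per-prefix byte rates, and the example prefixes are all hypothetical.

```python
import heapq

def select_active_prefixes(rates, max_prefixes):
    """Return up to max_prefixes prefixes, busiest first, skipping idle ones."""
    active = ((prefix, rate) for prefix, rate in rates.items() if rate > 0)
    top = heapq.nlargest(max_prefixes, active, key=lambda item: item[1])
    return [prefix for prefix, _ in top]

# Hypothetical traffic rates (bytes/second) keyed by prefix
rates = {
    '10.1.0.0/16': 500.0,
    '10.2.0.0/16': 0.0,      # no traffic, never selected
    '10.3.0.0/24': 900.0,
    '10.4.0.0/24': 20.0,
}
print(select_active_prefixes(rates, 2))
```

Re-running the selection as rates change gives the behavior described above: a prefix that goes quiet drops out of the result and a newly active prefix takes its place.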

Wednesday, July 6, 2016

Network, host, and application monitoring for Amazon EC2

Microservices describes how visibility into network traffic is the key to monitoring, managing and securing applications that are composed of large numbers of communicating services running in virtual machines or containers.

Amazon Virtual Private Cloud (VPC) Flow Logs can be used to monitor network traffic.
However, there are limitations on the types of traffic that are logged, a 10-15 minute delay in accessing flow records, and costs associated with using VPC and storing the logs in CloudWatch (currently $0.50 per GB ingested, $0.03 per GB archived per month, and possible additional Data Transfer OUT charges).

In addition, collecting basic host metrics at 1 minute granularity using CloudWatch is an additional $3.50 per instance per month.
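To get a feel for how these charges add up, the following Python sketch applies the prices quoted above to a deployment. It is a simplified model: it ignores Data Transfer OUT charges, the prices are the ones stated in this article and may have changed, and the function name and the example deployment size are hypothetical.

```python
# Prices quoted in this article (USD)
INGEST_PER_GB = 0.50              # CloudWatch log ingestion
ARCHIVE_PER_GB_MONTH = 0.03       # CloudWatch log archive, per month
HOST_METRICS_PER_INSTANCE = 3.50  # 1 minute host metrics, per instance per month

def monthly_cost(instances, log_gb):
    """Estimate one month of cost for flow logs plus detailed host metrics."""
    logs = log_gb * (INGEST_PER_GB + ARCHIVE_PER_GB_MONTH)
    metrics = instances * HOST_METRICS_PER_INSTANCE
    return logs + metrics

# Hypothetical deployment: 100 instances producing 500 GB of flow logs a month
print(round(monthly_cost(100, 500), 2))
```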

The open source Host sFlow agent offers an alternative:
  1. Lightweight, requiring minimal CPU and memory on EC2 instances.
  2. Real-time, up to the second network visibility
  3. Efficient, export of extensive set of host metrics every 10-60 seconds (configurable).
This article will demonstrate how to install Host sFlow on an Amazon Linux instance:
$ cat /etc/issue
Amazon Linux AMI release 2016.03
The following commands build the latest version of the Host sFlow agent from sources:
sudo yum install libcap-devel libpcap-devel
git clone https://github.com/sflow/host-sflow
cd host-sflow
make
sudo make install
You can also make an RPM package (make rpm) so that the Host sFlow agent can be installed on additional EC2 instances without compiling.

Edit the Host sFlow configuration file, /etc/hsflowd.conf, to specify an sFlow collector, sampling rate, polling interval, and interface(s) to monitor:
sflow {
  agent=eth0
  DNSSD=off
  polling=20
  sampling=400
  collector { ip = 10.117.46.49 }
  pcap { dev=eth0 }
}
Note: The same configuration file can be used for all EC2 instances.

Finally, start the Host sFlow daemon:
sudo service hsflowd start
The above steps are easily automated using Puppet, Chef, Ansible, etc. to deploy Host sFlow agents on all your EC2 instances.
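As one possible starting point for that automation, the following Python sketch renders the configuration file shown above from a few parameters. The function name and its defaults are assumptions made for illustration; only the settings that appear in the example configuration are used.

```python
def render_hsflowd_conf(collector_ip, device="eth0", polling=20, sampling=400):
    """Return the text of an hsflowd.conf matching the example in this article."""
    lines = [
        "sflow {",
        "  agent=%s" % device,
        "  DNSSD=off",
        "  polling=%d" % polling,
        "  sampling=%d" % sampling,
        "  collector { ip = %s }" % collector_ip,
        "  pcap { dev=%s }" % device,
        "}",
    ]
    return "\n".join(lines) + "\n"

# Reproduce the configuration shown above
print(render_hsflowd_conf("10.117.46.49"))
```

A deployment tool could write the returned text to /etc/hsflowd.conf on each instance before starting the daemon.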

There are a variety of open source and commercial software packages listed on sFlow.org that can be used to analyze the telemetry stream. The sFlow-RT analyzer has APIs that provide similar functionality to the Amazon VPC and CloudWatch APIs, but with sub-second response times.
The diagram shows how the sFlow-RT real-time analytics engine receives a continuous telemetry stream from sFlow instrumentation built into network, server and application infrastructure, delivers analytics through APIs, and can easily be integrated with a wide variety of on-site and cloud, orchestration, DevOps and Software Defined Networking (SDN) tools.

Download and install sFlow-RT in an EC2 instance. The following articles provide examples of integrations:
Industry standard sFlow is easily deployed, highly scalable, and provides a low cost, low latency alternative to Amazon VPC flow logging for gaining visibility into EC2 microservice deployments. Using sFlow for visibility allows a common monitoring technology to be used in public, private and hybrid cloud deployments, and to extend visibility into physical and virtual networks.

Friday, July 1, 2016

Real-time BGP route analytics

The diagram shows how sFlow-RT real-time analytics software can combine BGP route information and sFlow telemetry to generate route analytics. Merging sFlow traffic with BGP route data significantly enhances both data streams:
  1. sFlow real-time traffic data identifies active BGP routes
  2. BGP path attributes are available in flow definitions
The following example demonstrates how to configure sFlow / BGP route analytics. In this example, the switch IP address is 10.0.0.253, the router IP address is 10.0.0.254, and the sFlow-RT address is 10.0.0.162.

Setup

First download sFlow-RT. Next create a configuration file, bgp.js, in the sFlow-RT home directory with the following contents:
var reflectorIP  = '10.0.0.254';
var myAS         = '65162';
var myID         = '10.0.0.162';
var sFlowAgentIP = '10.0.0.253';

// allow BGP connection from reflectorIP
bgpAddNeighbor(reflectorIP,myAS,myID);

// direct sFlow from sFlowAgentIP to reflectorIP routing table
// calculate a 60 second moving average byte rate for each route
bgpAddSource(sFlowAgentIP,reflectorIP,60,'bytes');
The following sFlow-RT System Properties load the configuration file and enable BGP:
  • script.file=bgp.js
  • bgp.start=yes
Start sFlow-RT and the following log lines will confirm that BGP has been enabled and configured:
$ ./start.sh 
2016-06-28T13:14:34-0700 INFO: Listening, BGP port 1179
2016-06-28T13:14:35-0700 INFO: Listening, sFlow port 6343
2016-06-28T13:14:35-0700 INFO: Starting the Jetty [HTTP/1.1] server on port 8008
2016-06-28T13:14:35-0700 INFO: Starting com.sflow.rt.rest.SFlowApplication application
2016-06-28T13:14:35-0700 INFO: Listening, http://localhost:8008
2016-06-28T13:14:36-0700 INFO: bgp.js started
2016-06-28T13:14:36-0700 INFO: bgp.js stopped
Configure the switch (10.0.0.253) to send sFlow to the sFlow-RT instance (10.0.0.162); see Switch configurations for vendor specific configurations. Check the sFlow-RT /agents/html page to verify that sFlow telemetry is being received from the agent.

Next, configure the router (10.0.0.254) to reflect BGP routes to the sFlow-RT instance (10.0.0.162):
router bgp 65254
 bgp router-id 10.0.0.254
 neighbor 10.0.0.162 remote-as 65162
 neighbor 10.0.0.162 port 1179
 neighbor 10.0.0.162 timers connect 30
 neighbor 10.0.0.162 route-reflector-client
 neighbor 10.0.0.162 activate
The following sFlow-RT log entry confirms that a BGP session has been established:
2016-06-28T13:20:17-0700 INFO: BGP open 10.0.0.254 53975

Query active routes

The following cURL command uses the REST API to identify the top 5 IPv4 prefixes ranked by traffic (measured in bytes/second):
curl "http://10.0.0.162:8008/bgp/topprefixes/10.0.0.254/json?maxPrefixes=5"
{
 "as": 65254,
 "direction": "destination",
 "id": "10.0.0.254",
 "learnedPrefixesAdded": 691838,
 "learnedPrefixesRemoved": 0,
 "nPrefixes": 691838,
 "pushedPrefixesAdded": 0,
 "pushedPrefixesRemoved": 0,
 "startTime": 1467322582093,
 "state": "established",
 "topPrefixes": [
  {
   "aspath": "NNNN-NNNN-NNNNN-NNNNN",
   "localpref": 100,
   "med": 1,
   "nexthop": "NNN.NNN.NNN.N",
   "origin": "IGP",
   "prefix": "NN.NNN.NN.0/24",
   "value": 9.735462342126082E7
  },
  {
   "aspath": "NNN-NNNN",
   "localpref": 100,
   "med": 1,
   "nexthop": "NNN.NNN.NNN.N",
   "origin": "IGP",
   "prefix": "NN.NNN.NNN.0/24",
   "value": 7.347515546153101E7
  },
  {
   "aspath": "NNNN-NNNNNN-NNNNN",
   "localpref": 100,
   "med": 1,
   "nexthop": "NNN.NNN.NNN.N",
   "origin": "IGP",
   "prefix": "NN.NNN.NN.N/24",
   "value": 4.26137765317916E7
  },
  {
   "aspath": "NNNN-NNNN-NNNN",
   "localpref": 100,
   "med": 1,
   "nexthop": "NNN.NNN.NNN.N",
   "origin": "IGP",
   "prefix": "NNN.NN.NNN.0/24",
   "value": 2.6633190792947102E7
  },
  {
   "aspath": "NNNN-NNN-NNNNN",
   "localpref": 100,
   "med": 10001,
   "nexthop": "NNN.NNN.NNN.NN",
   "origin": "IGP",
   "prefix": "NN.NNN.NNN.0/24",
   "value": 1.5500941476103483E7
  }
 ],
 "valuePercentCoverage": 71.38452058755995,
 "valueTopPrefixes": 2.55577687683634E8,
 "valueTotal": 3.5802956380458355E8
}
In addition to returning the top prefixes, the query returns information about the amount of traffic covered by these prefixes. In this case, the valuePercentCoverage of 71.38 indicates that 71.38% of the traffic is covered by the top 5 prefixes.
Note: Identifying numeric digits have been substituted with the letter N to protect privacy.
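The coverage figure can be reproduced from the other fields in the result. The following Python sketch sums the five "value" entries and divides by valueTotal, using the numbers from the query output above; the function name is hypothetical and the calculation is inferred from the field names rather than taken from the sFlow-RT documentation.

```python
def percent_coverage(top_values, value_total):
    """Share of total traffic (in percent) carried by the listed prefixes."""
    if value_total == 0:
        return 0.0
    return 100.0 * sum(top_values) / value_total

# "value" fields of the five top prefixes in the result above
top_values = [
    9.735462342126082E7,
    7.347515546153101E7,
    4.26137765317916E7,
    2.6633190792947102E7,
    1.5500941476103483E7,
]
value_total = 3.5802956380458355E8

print(round(percent_coverage(top_values, value_total), 2))
```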
Additional arguments can be used to refine the top prefixes query:
  • maxPrefixes, maximum number of prefixes in the result 
  • minValue, only include entries with a value greater than the threshold
  • direction, specify "ingress" for traffic arriving from remote networks and "egress" for traffic destined for remote networks
  • minPrefix, exclude shorter prefixes, e.g. minPrefix=1 would exclude 0.0.0.0/0.
  • includeCovered, set to "true" to also include prefixes that are covered by the top prefix, but wouldn't otherwise make the list. For example, if 10.1.0.0/16 was included, then 10.1.3.0/24 would also be included if it were in the set of prefixes advertised by the router.
  • pruneCovered, set to "true" to eliminate covered prefixes that share the same next hop.
IPv6 prefixes can be queried using /bgp/topprefixes6/{router}/json, which takes the same arguments as the topprefixes query shown above.
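When several of these arguments are combined, it can be convenient to build the query URL programmatically. The following Python 3 sketch is one way to do that; the helper name is hypothetical, and it only assembles the URL patterns and argument names listed above without contacting sFlow-RT.

```python
from urllib.parse import urlencode

def top_prefixes_url(base, router, ipv6=False, **args):
    """Build a topprefixes (or topprefixes6) query URL for the given router."""
    path = "topprefixes6" if ipv6 else "topprefixes"
    url = "%s/bgp/%s/%s/json" % (base, path, router)
    if args:
        # Arguments are appended in the order they were passed
        url += "?" + urlencode(args)
    return url

# Same query as the cURL example above
print(top_prefixes_url("http://10.0.0.162:8008", "10.0.0.254", maxPrefixes=5))

# IPv6 variant restricted to egress traffic
print(top_prefixes_url("http://10.0.0.162:8008", "10.0.0.254", ipv6=True,
                       maxPrefixes=10, direction="egress"))
```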

Writing Applications describes how to build analytics driven controller applications using sFlow-RT's REST and embedded JavaScript APIs. For example, SDN router using merchant silicon top of rack switch, White box Internet router PoC, and Active Route Manager demonstrate how real-time identification of active routes can be used to efficiently manage limited hardware resources in commodity white box switches in order to handle a full Internet routing table of over 600,000 routes.

Defining Flows

The following flow attributes learned from the BGP session are merged with sFlow data received from switch 10.0.0.253:
  • ipsourcemaskbits
  • ipdestinationmaskbits
  • bgpnexthop
  • bgpnexthop6
  • bgpas
  • bgpsourceas
  • bgpsourcepeeras
  • bgpdestinationas
  • bgpdestinationpeeras
  • bgpdestinationaspath
  • bgpcommunities
  • bgplocalpref
The sFlow-RT /flowkeys/html page can be queried to verify that the attributes have been merged and to see the full set of attributes that are available from the sFlow feed.

Writing Applications describes how to program sFlow-RT flow caches, using the flow keys to select and identify traffic flows. For example, the following Python script uses the REST API to identify the source networks associated with a UDP amplification DDoS attack:
#!/usr/bin/env python
import requests
import json

# DNS port
reflector_port = '53'
max_pps = 100000

rest = 'http://localhost:8008'

# define flow
flow = {'keys':'mask:ipsource,bgpsourceas',
 'filter':'udpsourceport='+reflector_port,
 'value':'frames'}
requests.put(rest+'/flow/ddos/json',data=json.dumps(flow))

# set threshold
threshold = {'metric':'ddos', 'value': max_pps, 'byFlow':True}
requests.put(rest+'/threshold/ddos/json',data=json.dumps(threshold))

# tail event log
eventurl = rest+'/events/json?thresholdID=ddos&maxEvents=10&timeout=60'
eventID = -1
while True:
  r = requests.get(eventurl + "&eventID=" + str(eventID))
  if r.status_code != 200: break
  events = r.json()
  if len(events) == 0: continue

  eventID = events[0]["eventID"]
  events.reverse()
  for e in events:
    print(e['flowKey'])
Running the script generates a log of the source network and AS number that exceed 100,000 packets per second of DNS response traffic (again, identifying numeric digits have been substituted with the letter N to protect privacy):
$ ./ddos.py 
NNN.NNN.0.0/13,NNNN
NNN.NNN.NNN.NNN/27,NNNN
NNN.NN.NNN.NNN/28,NNNNN
NNN.NNN.NN.0/24,NNNNN
A variation on the script can be used to identify large "Elephant" flows and their destination AS paths (showing the list of networks that packets traverse en route to their destination):
#!/usr/bin/env python
import requests
import json

max_Bps = 1000000000/8

rest = 'http://localhost:8008'

# define flow
flow = {
 'keys':'ipsource,ipdestination,tcpsourceport,tcpdestinationport,bgpdestinationaspath',
 'value':'bytes'}
requests.put(rest+'/flow/elephant/json',data=json.dumps(flow))

# set threshold
threshold = {'metric':'elephant', 'value': max_Bps, 'byFlow':True}
requests.put(rest+'/threshold/elephant/json',data=json.dumps(threshold))

# tail event log
eventurl = rest+'/events/json?thresholdID=elephant&maxEvents=10&timeout=60'
eventID = -1
while True:
  r = requests.get(eventurl + "&eventID=" + str(eventID))
  if r.status_code != 200: break
  events = r.json()
  if len(events) == 0: continue

  eventID = events[0]["eventID"]
  events.reverse()
  for e in events:
    print(e['flowKey'])
Running the script generates real-time notification of the Elephant flows (flows exceeding 1Gbit/s) along with their destination AS paths:
$ ./elephant.py 
NNN.NN.NN.NNN,NNN.NNN.NN.NN,60789,25,NNNNN
NNN.NN.NNN.NN,NNN.NN.NN.NNN,443,38016,NNNNN-NNNNN-NNNNN-NNNNN
NN.NNN.NNN.NNN,NNN.NNN.NN.NN,37030,10059,NNNN-NNN-NNNN
NNN.NN.NN.NNN,NN.NN.NNN.NNN,34611,25,NNNN
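Each line of output is a comma separated flow key whose fields follow the order of the 'keys' in the flow definition. The following Python sketch shows one way to turn such a line into named fields and to split the AS path into individual AS numbers. The function name is hypothetical, the addresses and AS numbers are made up for illustration, and the sketch assumes that field values never contain commas and that AS path entries are separated by dashes, as in the output above.

```python
def parse_flow_key(keys, flow_key):
    """Pair each key name from the flow definition with its value."""
    names = keys.split(',')
    values = flow_key.split(',')
    if len(names) != len(values):
        raise ValueError('flow key does not match the flow definition')
    return dict(zip(names, values))

keys = 'ipsource,ipdestination,tcpsourceport,tcpdestinationport,bgpdestinationaspath'

# Made-up flow key using documentation addresses and private AS numbers
record = parse_flow_key(keys, '192.0.2.10,198.51.100.7,60789,25,64500-64501')
as_path = record['bgpdestinationaspath'].split('-')

print(record['ipdestination'], as_path)
```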
SDN and large flows describes how a small number of Elephant flows typically consume most of the bandwidth, even though they are greatly outnumbered by small (Mice) flows. Dynamic policy based routing can be targeted at Elephant flows to significantly improve performance and manage network resources: Leaf and spine traffic engineering using segment routing and SDN and WAN optimization using real-time traffic analytics are two examples.
Finally, the real-time BGP analytics don't exist in isolation. The diagram shows how the sFlow-RT real-time analytics engine receives a continuous telemetry stream from sFlow instrumentation built into network, server and application infrastructure, delivers analytics through APIs, and can easily be integrated with a wide variety of on-site and cloud, orchestration, DevOps and Software Defined Networking (SDN) tools.

Wednesday, June 29, 2016

Configuring OpenSwitch

The following configuration enables sFlow monitoring of all interfaces on a white box switch running the OpenSwitch operating system, sampling packets at 1-in-4096, polling counters every 20 seconds and sending the sFlow to an analyzer (10.0.0.50) on UDP port 6343 (the default sFlow port):
switch(config)# sflow collector 10.0.0.50
switch(config)# sflow sampling 4096
switch(config)# sflow polling 20
switch(config)# sflow enable
A previous posting discussed the selection of sampling rates.  Additional information can be found in the OpenSwitch sFlow User Guide.
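To illustrate what a 1-in-4096 sampling rate means for the analyzer, the following Python sketch scales a count of samples up to an estimate of total packets and gives a rough accuracy figure. The function names are hypothetical, and the error bound is the commonly quoted rule of thumb for packet sampling at 95% confidence, so treat it as an approximation rather than a guarantee.

```python
import math

def estimate_total(samples, sampling_rate):
    """Estimate total packets from the number of samples at 1-in-N sampling."""
    return samples * sampling_rate

def percent_error_bound(samples):
    """Approximate 95% confidence bound on the relative error, in percent."""
    return 196.0 * math.sqrt(1.0 / samples)

# Hypothetical example: 10,000 samples seen at the 1-in-4096 rate configured above
print(estimate_total(10000, 4096))
print(round(percent_error_bound(10000), 2))
```

The accuracy depends on the number of samples collected rather than on the sampling rate itself, which is why a higher rate still gives useful results on busy links.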

See Trying out sFlow for suggestions on getting started with sFlow monitoring and reporting.

Thursday, June 16, 2016

Cisco Tetration analytics

Cisco Tetration Analytics: the most Comprehensive Data Center Visibility and Analysis in Real Time, at Scale, June 15, 2016, announced the new Cisco Tetration Analytics platform. The platform collects telemetry from proprietary agents on servers and embedded in hardware on certain Nexus 9k switches, analyzes the data, and presents results via Web GUI, REST API, and as events.

Cisco Tetration Analytics Data Sheet describes the hardware requirements:
Platform Hardware                                      Quantity
Cisco Tetration Analytics computing nodes (servers)    16
Cisco Tetration Analytics base nodes (servers)         12
Cisco Tetration Analytics serving nodes (servers)      8
Cisco Nexus 9372PX Switches                            3

And the power requirements:
Property                                                                       Cisco Tetration Analytics Platform
Peak power for Cisco Tetration Analytics Platform (39-RU single-rack option)   22.5 kW
Peak power for Cisco Tetration Analytics Platform (39-RU dual-rack option)     11.25 kW per rack (22.5 kW total)

No pricing is given, but based on the hardware, data center space, power and cooling requirements, this brute force approach to analytics will be reassuringly expensive to purchase and operate.
A much less expensive alternative is to use industry standard sFlow agents embedded in Cisco Nexus 9k/3k switches and in switches from over 40 other vendors. The open source Host sFlow agent extends visibility to servers and applications by streaming telemetry from Linux, Windows, FreeBSD, Solaris, and AIX operating systems, hypervisors, Docker containers, web servers (Apache, NGINX, Tomcat, HAProxy) and Java application servers.

The diagram shows how the sFlow-RT real-time analytics engine receives a continuous telemetry stream from sFlow instrumentation built into network, server and application infrastructure, delivers analytics through APIs, and can easily be integrated with a wide variety of on-site and cloud, orchestration, DevOps and Software Defined Networking (SDN) tools.

Minimizing cost of visibility describes why lightweight monitoring is critical to realizing the value that telemetry can bring to improving operational efficiency. In the case of the sFlow based solution, the critical data path instrumentation is built into the switch ASICs and in the Linux kernel, ensuring that there is negligible impact on operational performance.

The sFlow-RT analytics software shown in the diagram provides real-time (sub-second) visibility for 5,000 unique end points (virtual machines or bare metal servers), the upper limit of scalability in the Tetration data sheet, using a single virtual machine or Docker container with 4 GBytes of RAM and 4 CPU cores. With additional memory and CPU the solution easily scales to 100,000 unique end points.
How can sFlow provide real-time visibility at scale and consume so few resources? Shrink ray describes how advanced statistical techniques are used to select and analyze measurements that capture the essential features of network and system performance. A statistical approach yields fast, accurate answers, while minimizing the resources required to measure, transport and analyze the data.
The sFlow-RT analytics platform was selected as an example because of the overlap in capabilities with the Cisco Tetration analytics platform. However, sFlow is non-proprietary and there are many other open source and commercial sFlow analytics solutions listed on sFlow.org.

The Cisco press release states, "Available in July 2016, the first Tetration platform will be a full rack appliance that is deployed on-premise at the customer’s data center." On the other hand, the sFlow based solution described here is available today and can be installed and running in minutes on a virtual machine or in a Docker container.