Tuesday, November 19, 2019

SC19 SCinet: Grafana network traffic dashboard

The Grafana sFlow-RT Countries and Networks dashboard above shows traffic on the SCinet network, described as the fastest, most powerful volunteer-built network in the world. The network is build each year to support The International Conference for High Performance Computing, Networking, Storage, and Analysis. The SC19 conference is currently underway in Denver, Colorado and the screen capture is live data from the conference network.
The high speed switches and routers used to construct the SCinet network support industry standard sFlow streaming telemetry. In this case an instance of the sFlow-RT analytics engine receives the telemetry stream and generates flow analytics that are scraped every 15 seconds by an instance of the Prometheus time series database. The Prometheus database is in turn queried by an instance of Grafana which generated the dashboard shown at the top of the page.
In addition, sFlow-RT is running an embedded application that generates a real-time, up to the second, view of the traffic over the last 5 minutes.
This solution is extremely scalable. A single sFlow-RT instance, allocated only 1G of memory, easily monitors 158 network devices, while supporting 11 different applications (including the real-time dashboard and Prometheus export applications shown above).

Wednesday, October 30, 2019

Observability in Data Center Networks


Observability in Data Center Networks: In this session, you’ll learn how the sFlow protocol provides broad visibility in modern data center environments as they migrate to highly meshed topologies. Our data center workloads are shifting to take advantage of higher speeds and bandwidth, so visibility to east-west traffic within the data center is becoming more important. Join Peter Phaal—one of the inventors of sFlow—and Joe Reves from SolarWinds product management as they discuss how sFlow differs from other flow instrumentation to deliver visibility in the switching fabric.
THWACKcamp is SolarWinds’ free, annual, worldwide virtual IT learning event connecting thousands of skilled IT professionals with industry experts and SolarWinds technical staff. This video was one of the sessions.

Wednesday, October 9, 2019

InfluxDB 2.0

Introducing the Next-Generation InfluxDB 2.0 Platform mentions that InfluxDB 2.0 will be able to scrape Prometheus exporters. Get started with InfluxDB provides instructions for running an alpha version of the new software using Docker:
docker run --name influxdb -p 9999:9999 quay.io/influxdb/influxdb:2.0.0-alpha
Prometheus exporter describes an application that runs on the sFlow-RT analytics platform that converts real-time streaming telemetry from industry standard sFlow agents. Host, Docker, Swarm and Kubernetes monitoring describes how to deploy agents on popular container orchestration platforms.
The screen capture above shows three scrapers configured in InfluxDB 2.0:
  1. sflow-rt-analyzer,
    URL: http://10.0.0.70:8008/prometheus/analyzer/txt
  2. sflow-rt-dump,
    URL: http://10.0.0.70:8008/prometheus/metrics/ALL/ALL/txt
  3. sflow-rt-flow-src-dst,
    URL: http://10.0.0.70:8008/app/prometheus/scripts/export.js/flows/ALL/txt?metric=flow_src_dst_bps&key=ipsource,ipdestination&value=bytes&aggMode=max&maxFlows=100&minValue=1000&scale=8
The first collects metrics about the performance of the sFlow-RT analytics engine, the second, all the metrics exported by the sFlow agents, and the third, is a flow metric (see Flow metrics with Prometheus and Grafana).

Updated 19 October 2019, native support for Prometheus export added to sFlow-RT, URLs 1 and 2 modified to reflect new API.
InfluxDB 2.0 now includes the data exploration and dashboard building capabilities that were previously in the separate Chronograf application. The screen capture above shows a simple chart trending ifinoctets across a number of switch ports.

Note: There are a number of articles on this blog that demonstrate how to push metrics from sFlow-RT into InfluxDB 1.0 using its REST API. The ability to scrape metrics from a Prometheus exporter simplifies the integration.

Tuesday, October 1, 2019

Flow metrics with Prometheus and Grafana

The Grafana dashboard above shows real-time network traffic flow metrics. This article describes how to define and collect flow metrics using the Prometheus time series database and build Grafana dashboards using those metrics.
Prometheus exporter describes an application that runs on the sFlow-RT analytics platform that converts real-time streaming telemetry from industry standard sFlow agents. Host, Docker, Swarm and Kubernetes monitoring describes how to deploy agents on popular container orchestration platforms.

The latest version of the Prometheus exporter application adds flow export.
global:
  scrape_interval:     15s
  evaluation_interval: 15s

rule_files:
  # - "first.rules"
  # - "second.rules"

scrape_configs:
  - job_name: 'sflow-rt-metrics'
    metrics_path: /prometheus/metrics/ALL/ALL/txt
    static_configs:
      - targets: ['10.0.0.70:8008']
  - job_name: 'sflow-rt-src-dst-bps'
    metrics_path: /app/prometheus/scripts/export.js/flows/ALL/txt
    static_configs:
      - targets: ['10.0.0.70:8008']
    params:
      metric: ['ip_src_dst_bps']
      key: ['ipsource','ipdestination']
      label: ['src','dst']
      value: ['bytes']
      scale: ['8']
      minValue: ['1000']
      maxFlows: ['100']
  - job_name: 'sflow-rt-countries-bps'
    metrics_path: /app/prometheus/scripts/export.js/flows/ALL/txt
    static_configs:
      - targets: ['10.0.0.70:8008']
    params:
      metric: ['ip_countries_bps']
      key: ['null:[country:ipsource]:unknown','null:[country:ipdestination]:unknown']
      label: ['src','dst']
      value: ['bytes']
      scale: ['8']
      aggMode: ['sum']
      minValue: ['1000']
      maxFlows: ['100']
The above prometheus.yml file extends the previous example to add two additional scrape jobs, sflow-rt-src-dst-bps and sflow-rt-countries-bps, that return flow metrics. Defining flows describes the attributes and settings available to build a flow definition. The metric: setting names the Prometheus metric and the label: setting is used to map corresponding sFlow-RT flow keys into Prometheus labels.

Updated 19 October 2019, native support for Prometheus export added to sFlow-RT, sflow-rt-metrics job modified reflect new API.
The first step in building a Grafana dashboard panel to display flow data is to construct a query:
topk(10, sum(ip_src_dst_bps) by (src))
In this case, the query sums the flows by source address and return the top 10 values for each interval in the graph.

The query for the Top Source Countries chart is a little more complex:
topk(10,sum(ip_countries_bps{src!="unknown"}) by (src))
In this case unknown source country values (the value set in the prometheus.yml file to use when a country lookup fails on an ipsource address) are excluded in the query.
In the visualization settings, Null value: null as zeroTooltip Mode: Single, label the Left Y axis, and Legend Show disabled.
Finally, give the chart a title.
The Prometheus exporter application on sFlow-RT (accessible on port 8008) has a REST API explorer, above, that can be used to experiment with flow settings before configuring a Prometheus scraper job. When testing the settings, the first query will not return any data since the flow hasn't been programmed. Click the Execute button a second time to see data. Also consider using the sflow/flow-trend application as a way to gain familiarity with sFlow-RT's flow analytics engine.

Wednesday, September 25, 2019

Host, Docker, Swarm and Kubernetes monitoring

The open source Host sFlow agent incorporates technologies that address the challenges of microservice monitoring; leveraging recent enhancements to Berkeley Packet Filter (BPF) in the Linux kernel to randomly sample packets, and  Asynchronous Docker metrics to track rapidly changing workloads. The continuous stream of real-time telemetry from all compute nodes, transported using the industry standard sFlow protocol, provides comprehensive real-time cluster-wide visibility into all services and the traffic flowing between them.

The Host sFlow agent is available as pre-packaged rpm/deb files that can be downloaded and installed on each node in a cluster.
sflow {
  collector { ip=10.0.0.70 }
  docker { }
  pcap { dev=docker0 }
  pcap { dev=docker_gwbridge } 
}
The above /etc/hsflowd.conf file, see Configuring Host sFlow for Linux via /etc/hsflowd.conf, enables the docker {} and pcap {} modules for detailed visibility into container metrics and network traffic flows, and streams telemetry to an sFlow collector (10.0.0.70). The configuration is the same for every node making it simple to install and configure Host sFlow on all nodes using orchestration software such as Puppet, Chef, Ansible, etc.

The agent is also available as the pre-build sflow/host-sflow image, providing a simple method of instrumenting nodes running container workloads.
docker run \
--detach \
--name=host-sflow \
--env "COLLECTOR=10.0.0.70" \
--net=host \
--volume /var/run/docker.sock:/var/run/docker.sock:ro \
sflow/host-sflow
Execute above command to install and run the Host sFlow agent on a Docker node.
docker service create \
--mode global \
--name host-sflow \
--env "COLLECTOR=10.0.0.70" \
--network host \
--mount type=bind,src=/var/run/docker.sock,dst=/var/run/docker.sock,readonly \
sflow/host-sflow
Install and run an instance of the Host sFlow agent on each node in a Docker Swarm cluster.

Deploying Host sFlow under Kubernetes is a little more complicated.
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: host-sflow
spec:
  selector:
    matchLabels:
      name: host-sflow
  template:
    metadata:
      labels:
        name: host-sflow
    spec:
      hostNetwork: true
      containers:
      - name: host-sflow
        image: sflow/host-sflow:latest
        env:
          - name: COLLECTOR
            value: "10.0.0.70"
          - name: NET
            value: "host"
        volumeMounts:
          - mountPath: /var/run/docker.sock
            name: docker-sock
            readOnly: true
      volumes:
        - name: docker-sock
          hostPath:
            path: /var/run/docker.sock
First, create a deployment description file like the host-sflow.yml file above.
kubectl apply -f host-sflow.yml
Install and run an instance of the Host sFlow agent on each node in the Kubernetes cluster.
docker run -p 6343:6343/udp sflow/sflowtool
Run the command above on the collector (10.0.0.70) to verify that sFlow is arriving, see Running sflowtool using Docker.
docker run -p 6343:6343/udp -p 8008:8008 sflow/sflow-rt
Run the sflow/sflow-rt image to access real-time cluster performance metrics and network traffic flows through a REST API. Forwarding using sFlow-RT describes how to copy sFlow telemetry streams for additional tools.
Install sFlow-RT applications to export metrics to Prometheus, block DDoS attacks, visualize flows, etc. Writing Applications describes how to use APIs to build your own applications to integrate analytics with automation and monitoring tools.

Monday, September 9, 2019

Packet analysis using Docker

Why use sFlow for packet analysis? To rephrase the Heineken slogan, sFlow reaches the parts of the network that other technologies cannot reach. Industry standard sFlow is widely supported by switch vendors, embedding wire-speed packet monitoring throughout the network. With sFlow, any link or group of links can be remotely monitored. The alternative approach of physically attaching a probe to a SPAN/Mirror port is becoming much less feasible with increasing network sizes (10's of thousands of switch ports) and link speeds (10, 100, and 400 Gigabits). Using sFlow for packet capture doesn't replace traditional packet analysis, instead sFlow extends the capabilities of existing packet capture tools into the high speed switched network.

This article describes the sflow/tcpdump  and sflow/tshark Docker images, which provide a convenient way to analyze packets captured using sFlow.

Run the following command to analyze packets using tcpdump:
$ docker run -p 6343:6343/udp -p 8008:8008 sflow/tcpdump

19:06:42.000000 ARP, Reply 10.0.0.254 is-at c0:ea:e4:89:b0:98 (oui Unknown), length 64
19:06:42.000000 IP 10.0.0.236.548 > 10.0.0.70.61719: Flags [P.], seq 3380015689:3380015713, ack 515038158, win 41992, options [nop,nop,TS val 1720029042 ecr 904769627], length 24
19:06:42.000000 IP 10.0.0.236.548 > 10.0.0.70.61719: Flags [P.], seq 149816:149832, ack 510628, win 41992, options [nop,nop,TS val 1720029087 ecr 904770068], length 16
19:06:42.000000 IP 10.0.0.236.548 > 10.0.0.70.61719: Flags [P.], seq 149816:149832, ack 510628, win 41992, options [nop,nop,TS val 1720029087 ecr 904770068], length 16
The normal tcpdump options can be used. For example, to select DNS packets:
$ docker run -p 6343:6343/udp -p 8008:8008 sflow/tcpdump -vv port 53
reading from file -, link-type EN10MB (Ethernet)
19:08:49.000000 IP (tos 0x0, ttl 64, id 22316, offset 0, flags [none], proto UDP (17), length 65)
    10.0.0.70.43801 > dns.google.53: [udp sum ok] 35941+ A? clients2.google.com. (37)
19:09:00.000000 IP (tos 0x0, ttl 255, id 16813, offset 0, flags [none], proto UDP (17), length 66)
    10.0.0.64.50675 > 10.0.0.1.53: [udp sum ok] 57874+ AAAA? p49-imap.mail.me.com. (38)
The following command selects TCP SYN packets:
$ docker run -p 6343:6343/udp sflow/tcpdump 'tcp[tcpflags] == tcp-syn'
reading from file -, link-type EN10MB (Ethernet)
19:10:37.000000 IP 10.0.0.30.46786 > 10.0.0.162.1179: Flags [S], seq 2993962362, win 29200, options [mss 1460,sackOK,TS val 20531427 ecr 0,nop,wscale 9], length 0
Capture 10 packets to a file and then exit:
$ docker run -v $PWD:/pcap -p 6343:6343/udp sflow/tcpdump -w /pcap/packets.pcap -c 10
reading from file -, link-type EN10MB (Ethernet)
A tcpdump Tutorial with Examples — 50 Ways to Isolate Traffic provides an overview of the capabilities of tcpdump with useful examples.

Run the following command to analyze packets using tshark - a terminal based version of Wireshark:
$ docker run -p 6343:6343/udp -p 8008:8008 sflow/tshark
Capturing on '-'
    1   0.000000   10.0.0.236 → 10.0.0.70    AFP 1518 [Reply without query?]
    2   0.000000   10.0.0.236 → 10.0.0.70    AFP 1518 [Reply without query?]
    3   0.000000   10.0.0.114 → 10.0.0.72    SSH 1518 Server: Encrypted packet (len=1448)
Packets can be filtered using Display Filters. For example, the following command selects DNS traffic:
$ docker run -p 6343:6343/udp -p 8008:8008 sflow/tshark -Y 'dns'
Capturing on '-'
  328  22.000000      8.8.8.8 → 10.0.0.70    DNS 136 Standard query response 0xfce4 AAAA img.youtube.com CNAME ytimg.l.google.com AAAA
  472  36.000000    10.0.0.52 → 10.0.0.1     DNS 79 Standard query 0x173e AAAA www.nytimes.com
Print ip source, destination, protocol and packet lengths:
$ docker run -p 6343:6343/udp -p 8008:8008 sflow/tshark -T fields -e ip.src -e ip.dst -e ip.proto -e ip.len
Capturing on '-'
10.0.0.70 10.0.0.236 6 1500
10.0.0.236 10.0.0.70 6 52
10.0.0.70 10.0.0.236 6 1500
10.0.0.236 10.0.0.70 6 52
10.0.0.70 10.0.0.236 6 1500
Capture 100 packets and print summary of the protocols:
$ docker run -p 6343:6343/udp -p 8008:8008 sflow/tshark -q -z io,phs -c 100
Capturing on '-'
100 packets captured

===================================================================
Protocol Hierarchy Statistics
Filter: 

eth                                      frames:100 bytes:85721
  ip                                     frames:99 bytes:85657
    tcp                                  frames:97 bytes:85119
      dsi                                frames:61 bytes:82122
        _ws.short                        frames:54 bytes:77180
        afp                              frames:6 bytes:4856
          _ws.short                      frames:5 bytes:4766
      _ws.short                          frames:15 bytes:1050
      http                               frames:1 bytes:499
        _ws.short                        frames:1 bytes:499
      iscsi                              frames:1 bytes:118
        iscsi.flags                      frames:1 bytes:118
          scsi                           frames:1 bytes:118
            _ws.short                    frames:1 bytes:118
    ipv6                                 frames:2 bytes:538
      tcp                                frames:2 bytes:538
        tls                              frames:2 bytes:538
          _ws.short                      frames:2 bytes:538
  arp                                    frames:1 bytes:64
    _ws.short                            frames:1 bytes:64
===================================================================
Capture 100 packets and print a summary of the IP traffic by address:
$ docker run -p 6343:6343/udp -p 8008:8008 sflow/tshark -q -z endpoints,ip -c 100
Capturing on '-'
100 packets captured

================================================================================
IPv4 Endpoints
Filter:
                       |  Packets  | |  Bytes  | | Tx Packets | | Tx Bytes | | Rx Packets | | Rx Bytes |
10.0.0.70                     95         81713         44           25507          51           56206   
10.0.0.236                    91         80820         50           55956          41           24864   
10.0.0.30                      6          2369          2            1508           4             861   
10.0.0.16                      1           587          1             587           0               0   
10.0.0.28                      1           587          0               0           1             587   
10.0.0.160                     1          1258          0               0           1            1258   
10.0.0.172                     1           218          1             218           0               0   
================================================================================
The following command prints packet decodes as JSON:
$ docker run -p 6343:6343/udp -p 8008:8008 sflow/tshark -T json
Capturing on '-'
[
  {
    "_index": "packets-2019-09-06",
    "_type": "pcap_file",
    "_score": null,
    "_source": {
      "layers": {
        "frame": {
          "frame.interface_id": "0",
          "frame.interface_id_tree": {
            "frame.interface_name": "-"
          },
          "frame.encap_type": "1",
          "frame.time": "Sep  6, 2019 19:41:12.000000000 UTC",
          "frame.offset_shift": "0.000000000",
          "frame.time_epoch": "1567798872.000000000",
          "frame.time_delta": "0.000000000",
          "frame.time_delta_displayed": "0.000000000",
          "frame.time_relative": "0.000000000",
          "frame.number": "1",
          "frame.len": "64",
          "frame.cap_len": "60",
          "frame.marked": "0",
          "frame.ignored": "0",
          "frame.protocols": "eth:ethertype:arp"
        },
        "eth": {
          "eth.dst": "70:10:6f:d8:13:30",
          "eth.dst_tree": {
            "eth.dst_resolved": "HewlettP_d8:13:30",
            "eth.addr": "70:10:6f:d8:13:30",
            "eth.addr_resolved": "HewlettP_d8:13:30",
            "eth.lg": "0",
            "eth.ig": "0"
          },
          "eth.src": "98:4b:e1:03:4a:61",
          "eth.src_tree": {
            "eth.src_resolved": "HewlettP_03:4a:61",
            "eth.addr": "98:4b:e1:03:4a:61",
            "eth.addr_resolved": "HewlettP_03:4a:61",
            "eth.lg": "0",
            "eth.ig": "0"
          },
          "eth.type": "0x00000806",
          "eth.padding": "00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00"
        },
        "arp": {
          "arp.hw.type": "1",
          "arp.proto.type": "0x00000800",
          "arp.hw.size": "6",
          "arp.proto.size": "4",
          "arp.opcode": "1",
          "arp.src.hw_mac": "98:4b:e1:03:4a:61",
          "arp.src.proto_ipv4": "10.0.0.30",
          "arp.dst.hw_mac": "00:00:00:00:00:00",
          "arp.dst.proto_ipv4": "10.0.0.232"
        },
        "_ws.short": "[Packet size limited during capture: Ethertype truncated]"
      }
    }
  },
The tshark -T ek option formats the JSON output as a single line per packet making the output easy to parse in scripts. For example, the following emerging.py script downloads the Emerging Threats compromised IP address database, parses the JSON records, checks to see if source and destination addresses can be found in the database, and prints out information on any matches:
#!/usr/bin/env python

from sys import stdin
from json import loads
from requests import get

blacklist = set()
r = get('https://rules.emergingthreats.net/blockrules/compromised-ips.txt')
for line in r.iter_lines():
  blacklist.add(line)

for line in stdin:
  msg = loads(line)
  try:
    time = msg['timestamp']
    layers = msg['layers']
    ip = layers["ip"]
    src = ip["ip_ip_src"]
    dst = ip["ip_ip_dst"]
    if src in blacklist or dst in blacklist:
      print "%s %s %s" % (time,src,dst)
  except KeyError:
    pass
The following command runs the script:
$ docker run -p 6343:6343/udp -p 8008:8008 sflow/tshark -T ek | ./tshark.py
See the TShark man page for more options.

Forwarding using sFlow-RT describes how to set up and tear down sFlow streams using the sFlow-RT analytics engine. This is a simple way to direct a stream of sFlow to a desktop running sflowtool. For example, suppose sflowtool is running on host 10.0.0.30 and sFlow-RT is running on host 10.0.0.1, the following command would start a session:
curl -H "Content-Type:application/json" -X PUT --data '{"address":"10.0.0.30"}' \
http://10.0.0.1:8008/forwarding/tcpdump/json
and the following command would end the session:
curl -X DELETE http://10.0.0.1:8008/forwarding/tcpdump/json
Note: The sflow/sflow-rt Docker image is a convenient way to run sFlow-RT:
docker run -p 8008:8008 -p 6343:6343/udp sflow/sflow-rt
Finally, Triggered remote packet capture using filtered ERSPAN, shows how the broad visibility provided by sFlow can be combined with hardware filtering to trigger full packet capture of selected traffic.

Friday, September 6, 2019

Running sflowtool using Docker

The sflowtool command line utility is used to convert standard sFlow records into a variety of different formats. While there are a large number of native sFlow analysis applications, familiarity with sflowtool is worthwhile since it provides a simple way to verify receipt of sFlow, understand the contents of the sFlow telemetry stream, and build simple applications through custom scripting.

The sflow/sflowtool Docker image provides a simple way to run sflowtool. Run the following command to print the contents of sFlow packets:
$ docker run -p 6343:6343/udp sflow/sflowtool
startDatagram =================================
datagramSourceIP 10.0.0.111
datagramSize 144
unixSecondsUTC 1321922602
datagramVersion 5
agentSubId 0
agent 10.0.0.20
packetSequenceNo 3535127
sysUpTime 270660704
samplesInPacket 1
startSample ----------------------
sampleType_tag 0:2
sampleType COUNTERSSAMPLE
sampleSequenceNo 228282
sourceId 0:14
counterBlock_tag 0:1
ifIndex 14
networkType 6
ifSpeed 100000000
ifDirection 0
ifStatus 3
ifInOctets 4839078
ifInUcastPkts 15205
ifInMulticastPkts 0
ifInBroadcastPkts 4294967295
ifInDiscards 0
ifInErrors 0
ifInUnknownProtos 4294967295
ifOutOctets 149581962744
ifOutUcastPkts 158884229
ifOutMulticastPkts 4294967295
ifOutBroadcastPkts 4294967295
ifOutDiscards 101
ifOutErrors 0
ifPromiscuousMode 0
endSample   ----------------------
endDatagram   =================================
The -g option flattens the output so that it is more easily filtered using grep:
$ docker run -p 6343:6343/udp sflow/sflowtool -g | grep ifInOctets
2019-09-03T22:37:21+0000 10.0.0.231 0 3203000 0:6 0:2 0:1 ifInOctets 0
2019-09-03T22:37:23+0000 10.0.0.232 0 7242462 0:5 0:2 0:1 ifInOctets 53791415069
2019-09-03T22:37:23+0000 10.0.0.253 0 8178007 0:7 0:2 0:1 ifInOctets 31663763747
2019-09-03T22:37:23+0000 10.0.0.253 0 8178007 0:3 0:2 0:1 ifInOctets 1333603780050
2019-09-03T22:37:26+0000 10.0.0.253 0 8178008 0:1 0:2 0:1 ifInOctets 9116481296
The -L option prints out CSV records with the selected fields:
$ docker run -p 6343:6343/udp sflow/sflowtool -L agent,ifIndex,ifInOctets
10.0.0.253,23,432680126074
10.0.0.30,2,54056144719
10.0.0.253,21,3860664000830
10.0.0.253,3,1345269893416
10.0.0.253,2,1910370790761
The -J option prints out the decoded sFlow datagrams as JSON (with a blank line between each datagram):
$ docker run -p 6343:6343/udp sflow/sflowtool -J
{
 "datagramSourceIP":"172.17.0.1",
 "datagramSize":"1388",
 "unixSecondsUTC":"1567707952",
 "localtime":"2019-09-05T18:25:52+0000",
 "datagramVersion":"5",
 "agentSubId":"0",
 "agent":"10.0.0.253",
 "packetSequenceNo":"8254753",
 "sysUpTime":"165436226",
 "samplesInPacket":"8",
 "samples":[{
   "sampleType_tag":"0:1",
   "sampleType":"FLOWSAMPLE",
   "sampleSequenceNo":"2594544",
   "sourceId":"0:3",
   "meanSkipCount":"500",
   "samplePool":"1622164761",
   "dropEvents":"584479",
   "inputPort":"21",
   "outputPort":"3",
   "elements":[{
     "flowBlock_tag":"0:1",
     "flowSampleType":"HEADER",
     "headerProtocol":"1",
     "sampledPacketSize":"118",
     "strippedBytes":"4",
     "headerLen":"116",
...
The -j option formats the JSON output as a single line per datagram making the output easy to parse in scripts. For example, the following emerging.py script downloads the Emerging Threats compromised IP address database, parses the JSON records, checks to see if source and destination addresses can be found in the database, and prints out information on any matches:
#!/usr/bin/env python

from sys import stdin
from json import loads
from requests import get

blacklist = set()
r = get('https://rules.emergingthreats.net/blockrules/compromised-ips.txt')
for line in r.iter_lines():
  blacklist.add(line)

for line in stdin:
  datagram = loads(line)
  localtime = datagram["localtime"]
  samples = datagram["samples"]
  for sample in samples:
    sampleType = sample["sampleType"]
    elements = sample["elements"]
    if sampleType == "FLOWSAMPLE":
      for element in elements:
        tag = element["flowBlock_tag"]
        if tag == "0:1":
          try:
            src = element["srcIP"]
            dst = element["dstIP"]
            if src in blacklist or dst in blacklist:
              print "%s %s %s" % (localtime,src,dst)
          except KeyError:
            pass
Run the command:
docker run -p 6343:6343/udp sflow/sflowtool -j | ./emerging.py
These were just a few examples, see the sflowtool home page for additional information.

Forwarding using sFlow-RT describes how to set up and tear down sFlow streams using the sFlow-RT analytics engine. This is a simple way to direct a stream of sFlow to a desktop running sflowtool. For example, suppose sflowtool is running on host 10.0.0.30 and sFlow-RT is running on host 10.0.0.1, the following command would start a session:
curl -H "Content-Type:application/json" -X PUT --data '{"address":"10.0.0.30"}' \
http://10.0.0.1:8008/forwarding/sflowtool/json
and the following command would end the session:
curl -X DELETE http://10.0.0.1:8008/forwarding/sflowtool/json
Note: The sflow/sflow-rt Docker image is a convenient way to run sFlow-RT:
docker run -p 8008:8008 -p 6343:6343/udp sflow/sflow-rt

Tuesday, September 3, 2019

Forwarding using sFlow-RT

The diagrams show different two different configurations for sFlow monitoring:
  1. Without Forwarding Each sFlow agent is configured to stream sFlow telemetry to each of the analysis applications. This configuration is appropriate when a small number of applications is being used to continuously monitor performance. However, the overhead on the network and agents increases as additional analyzers are added. Often it is not possible to increase the number of analyzers since many embedded sFlow agents have limited resources and only support a small number of sFlow streams. In addition, the complexity of configuring each agent to add or remove an analysis application can be significant since agents may reside in Ethernet switches, routers, servers, hypervisors and applications on many different platforms from a variety of vendors.
  2. With Forwarding In this case all the agents are configured to send sFlow to a forwarding module which resends the data to the analysis applications. In this case analyzers can be added and removed simply by reconfiguring the forwarder without any changes required to the agent configurations.
There are many variations between these two extremes. Typically there will be one or two analyzers used for continuous monitoring and additional tools, like Wireshark, might be deployed for troubleshooting when the continuous monitoring tools detect anomalies.

This article will demonstrate how to forward sFlow using sFlow-RT.

Download and install and install the software and configure the sFlow agents to stream telemetry to the sFlow-RT instance.
The sFlow-RT status page, accessible on HTTP port 8008, can be used to verify that sFlow is being received from the agents. Click on the API option then click on the Open REST API Explorer button to access documentation on the sFlow-RT REST API.
The following REST API call creates a forwarding session, SessionA, directing a stream of sFlow to analyzer 10.0.0.30:
curl -H "Content-Type:application/json" -X PUT --data '{"address":"10.0.0.30"}' \
http://127.0.0.1:8008/forwarding/SessionA/json
Create a second session, SessionB, to a non-standard port, 7343:
curl -H "Content-Type:application/json" \
-X PUT --data '{"address":"10.0.0.30","port":7343}' \
http://127.0.0.1:8008/forwarding/SessionB/json
Create a third session, SessionC, to forward sFlow from selected agent, 10.0.0.254:
curl -H "Content-Type:application/json" \
-X PUT --data '{"address":"10.0.0.30","port":8343,"agents":["10.0.0.254"]}' \
http://127.0.0.1:8008/forwarding/SessionC/json
See the all forwarding sessions:
curl http://127.0.0.1:8008/forwarding/json
Delete forwarding session, SessionB:
curl -X DELETE http://127.0.0.1:8008/forwarding/SessionB/json
In addition, sFlow-RT supports the complex filtering and forwarding operations needed stream per-tenant views of the sFlow telemetry in a shared network, see Multi-tenant sFlow.
Finally, the streaming analytics capabilities of sFlow-RT can be used to simultaneously deliver metrics to time series databases (e.g. Prometheus and Grafana), send events to SIEM tools like Splunk or Logstash (e.g. Exporting events using syslog), and export flow data (e.g. sFlow to IPFIX/NetFlow) while also running embedded applications to visualize data, mitigate DDoS attacks, and optimize routing.

Tuesday, August 13, 2019

sFlow-RT 3.0 released

The sFlow-RT 3.0 release has a simplified user interface that focusses on metrics needed to manage the performance of the sFlow-RT analytics software and installed applications.

Applications are available that replace features from the previous 2.3 release. The following instructions show how to install sFlow-RT 3.0 along with basic data exploration applications.

On a system with Java 1.8+ installed:
wget https://inmon.com/products/sFlow-RT/sflow-rt.tar.gz
tar -xvzf sflow-rt.tar.gz
./sflow-rt/get-app.sh sflow-rt flow-trend
./sflow-rt/get-app.sh sflow-rt browse-metrics
./sflow-rt/start.sh
On a system with Docker installed:
mkdir app
docker run -v $PWD/app:/sflow-rt/app --entrypoint /sflow-rt/get-app.sh sflow/sflow-rt sflow-rt flow-trend
docker run -v $PWD/app:/sflow-rt/app --entrypoint /sflow-rt/get-app.sh sflow/sflow-rt sflow-rt browse-metrics
docker run -v $PWD/app:/sflow-rt/app -p 6343:6343/udp -p 8008:8008 sflow/sflow-rt
The product user interface can be accessed on port 8008. The Status page, shown at the top of this article, displays key metrics about the performance of the software.
The Apps tab lists the two applications we installed, browse-metrics and flow-trend, and the green color of the buttons indicates both applications are healthy.

Click on the flow-trend button to open the application and trend traffic flows in real-time. The RESTflow article describes the flow analytics capabilities of sFlow-RT in detail.
Click on the browse-metrics button to open the application and trend statistics in real-time. The Cluster performance metrics article describes the metrics analytics capabilities of sFlow-RT in more detail.
The API tab provides a link to Writing Applications, an introductory article on programming sFlow-RT.
Clicking on the Open REST API Explorer button to access documentation on the sFlow-RT REST API and make queries.

Applications lists additional applications that can be downloaded to export metrics to Prometheus, mitigate DDoS attacks, report on performance of leaf and spine networks, monitor an Internet exchange network, visualize real-time flows, etc.

Friday, July 12, 2019

Arista BGP FlowSpec


The video of a talk by Peter Lundqvist from DKNOG9 describes BGP FlowSpec, use cases, and details of Arista's implementation.

FlowSpec for real-time control and sFlow telemetry for real-time visibility is a powerful combination that can be used to automate DDoS mitigation and traffic engineering. The article, Real-time DDoS mitigation using sFlow and BGP FlowSpec, gives an example using the sFlow-RT analytics software.

EOS 4.22 includes support for BGP FlowSpec. This article uses a virtual machine running vEOS-4.22 to demonstrate how to configure FlowSpec and sFlow so that the switch can be controlled by an sFlow-RT application (such as the DDoS mitigation application referenced earlier).

The following output shows the EOS configuration statements related to sFlow and FlowSpec:
!
service routing protocols model multi-agent
!
sflow sample 16384
sflow polling-interval 30
sflow destination 10.0.0.70
sflow run
!
interface Ethernet1
   flow-spec ipv4 ipv6
!
interface Management1
   ip address 10.0.0.96/24
!
ip routing
!
router bgp 65096
   router-id 10.0.0.96
   neighbor 10.0.0.70 remote-as 65070
   neighbor 10.0.0.70 transport remote-port 1179
   neighbor 10.0.0.70 send-community extended
   neighbor 10.0.0.70 maximum-routes 12000 
   !
   address-family flow-spec ipv4
      neighbor 10.0.0.70 activate
   !
   address-family flow-spec ipv6
      neighbor 10.0.0.70 activate
The following JavaScript statement configures the FlowSpec connection on the sFlow-RT side:
bgpAddNeighbor("10.0.0.96","65070","10.0.0.70",{flowspec:true,flowspec6:true});
The FlowSpec functionality is exposed through sFlow-RT's REST API.
The sFlow-RT REST API Explorer is a simple way to exercise the FlowSpec functionality. In this case we are going to push a rule that blocks traffic from UDP port 53 targeted at host 10.0.0.1. This type of rule is typically used to block a DNS amplification attack.

The following output on the switch verifies that the rule has been received:
localhost#sho bgp flow-spec ipv4 detail
BGP Flow Specification rules for VRF default
Router identifier 10.0.0.96, local AS number 65096
BGP Flow Specification Matching Rule for 10.0.0.1/32;*;IP:17;SP:53;
 Rule identifier: 3851506952
 Matching Rule:
   Destination Prefix: 10.0.0.1/32
   Source Prefix: *
   IP Protocol: 17
   Source Port: 53
 Paths: 1 available
  65070
    from 10.0.0.70 (10.0.0.70)
      Origin IGP, metric -, localpref 100, weight 0, valid, external, best
      Actions: Drop
In practice the process of adding and removing filtering rules can be completely automated by an sFlow-RT application. The combination of real-time sFlow analytics with the real-time control provided by FlowSpec allows DDoS attacks to be detected and blocked within seconds.

Friday, June 14, 2019

Mininet flow analytics with custom scripts

Mininet flow analytics describes how to use the sflow.py helper script that ships with the sFlow-RT analytics engine to enable sFlow telemetry, e.g.
sudo mn --custom sflow-rt/extras/sflow.py --link tc,bw=10 \
--topo tree,depth=2,fanout=2
Mininet, ONOS, and segment routing provides an example using a Custom Topology, e.g.
sudo env ONOS=10.0.0.73 mn --custom sr.py,sflow-rt/extras/sflow.py \
--link tc,bw=10 --topo=sr '--controller=remote,ip=$ONOS,port=6653'
This article describes how to incorporate sFlow monitoring in a fully custom Mininet script. Consider the following simpletest.py script based on Working with Mininet:
#!/usr/bin/python                                                                            
                                                                                             
from mininet.topo import Topo
from mininet.net import Mininet
from mininet.util import dumpNodeConnections
from mininet.log import setLogLevel

class SingleSwitchTopo(Topo):
    "Single switch connected to n hosts."
    def build(self, n=2):
        switch = self.addSwitch('s1')
        # Python's range(N) generates 0..N-1
        for h in range(n):
            host = self.addHost('h%s' % (h + 1))
            self.addLink(host, switch)

def simpleTest():
    "Create and test a simple network"
    topo = SingleSwitchTopo(n=4)
    net = Mininet(topo)
    net.start()
    print "Dumping host connections"
    dumpNodeConnections(net.hosts)
    print "Testing bandwidth between h1 and h4"
    h1, h4 = net.get( 'h1', 'h4' )
    net.iperf( (h1, h4) )
    net.stop()

if __name__ == '__main__':
    # Tell mininet to print useful information
    setLogLevel('info')
    simpleTest()
Add the highlighted lines to incorporate sFlow telemetry:
#!/usr/bin/python                                                                            
                                                                                             
from mininet.topo import Topo
from mininet.net import Mininet
from mininet.util import dumpNodeConnections
from mininet.log import setLogLevel
from mininet.util import customClass
from mininet.link import TCLink

# Compile and run sFlow helper script
# - configures sFlow on OVS
# - posts topology to sFlow-RT
execfile('sflow-rt/extras/sflow.py') 

# Rate limit links to 10Mbps
link = customClass({'tc':TCLink}, 'tc,bw=10')

class SingleSwitchTopo(Topo):
    "Single switch connected to n hosts."
    def build(self, n=2):
        switch = self.addSwitch('s1')
        # Python's range(N) generates 0..N-1
        for h in range(n):
            host = self.addHost('h%s' % (h + 1))
            self.addLink(host, switch)

def simpleTest():
    "Create and test a simple network"
    topo = SingleSwitchTopo(n=4)
    net = Mininet(topo,link=link)
    net.start()
    print "Dumping host connections"
    dumpNodeConnections(net.hosts)
    print "Testing bandwidth between h1 and h4"
    h1, h4 = net.get( 'h1', 'h4' )
    net.iperf( (h1, h4) )
    net.stop()

if __name__ == '__main__':
    # Tell mininet to print useful information
    setLogLevel('info')
    simpleTest()
When running the script the highlighted output confirms that sFlow has been enabled and the topology has been posted to sFlow-RT:
pp@mininet:~$ sudo ./simpletest.py
*** Creating network
*** Adding controller
*** Adding hosts:
h1 h2 h3 h4 
*** Adding switches:
s1 
*** Adding links:
(10.00Mbit) (10.00Mbit) (h1, s1) (10.00Mbit) (10.00Mbit) (h2, s1) (10.00Mbit) (10.00Mbit) (h3, s1) (10.00Mbit) (10.00Mbit) (h4, s1) 
*** Configuring hosts
h1 h2 h3 h4 
*** Starting controller
c0 
*** Starting 1 switches
s1 ...(10.00Mbit) (10.00Mbit) (10.00Mbit) (10.00Mbit) 
*** Enabling sFlow:
s1
*** Sending topology
Dumping host connections
h1 h1-eth0:s1-eth1
h2 h2-eth0:s1-eth2
h3 h3-eth0:s1-eth3
h4 h4-eth0:s1-eth4
Testing bandwidth between h1 and h4
*** Iperf: testing TCP bandwidth between h1 and h4 
*** Results: ['6.32 Mbits/sec', '6.55 Mbits/sec']
*** Stopping 1 controllers
c0 
*** Stopping 4 links
....
*** Stopping 1 switches
s1 
*** Stopping 4 hosts
h1 h2 h3 h4 
*** Done
Mininet dashboard and Mininet weathermap describe the sFlow-RT Mininet Dashboard application shown at the top of this article. The tool provides a real-time visualization of traffic flowing over the Mininet network. Writing Applications describes how to develop custom analytics applications for sFlow-RT.

Wednesday, May 8, 2019

Secure forwarding of sFlow using ssh

Typically sFlow datagrams are sent unencrypted from agents embedded in switches and routers to a local collector/analyzer. Sending sFlow datagrams over the management VLAN or out of band management network generally provides adequate isolation and security within the site. Inter-site traffic within an organization is typically carried over a virtual private network (VPN) which encrypts the data and protects it from eavesdropping.

This article describes a simple method of carrying sFlow datagrams over an encrypted ssh connection which can be useful in situations where a VPN is not available, for example, sending sFlow to an analyzer in the public cloud, or to an external consultant.

The diagram shows the elements of the solution. A collector on the site receives sFlow datagrams from the network devices and uses the sflow_fwd.py script to convert the datagrams into line delimited hexadecimal strings that are sent over an ssh connection to another instance of sflow_fwd.py running on the analyzer that converts the hexadecimal strings back to sFlow datagrams.

The following sflow_fwd.py Python script accomplishes the task:
#!/usr/bin/python

import socket
import sys
import argparse

parser = argparse.ArgumentParser(description='Serialize/deserialize sFlow')
parser.add_argument('-c', '--collector', default='')
parser.add_argument('-s', '--server')
parser.add_argument('-p', '--port', type=int, default=6343)
args = parser.parse_args()

sock=socket.socket(socket.AF_INET,socket.SOCK_DGRAM)

if(args.server != None):
  while True:
    line = sys.stdin.readline()
    if not line:
      break
    buf = bytearray.fromhex(line[:-1])
    sock.sendto(buf, (args.server, args.port))
else: 
  sock.bind((args.collector,args.port))
  while True:
    buf = sock.recv(2048)
    if not buf:
      break
    print buf.encode('hex')
    sys.stdout.flush()
Create a user account on both the collector and analyzer machines, in this example the user is pp. Next copy the script to both machines.

If you log into the collector machine, following command will send sFlow to the analyzer machine:
./sflow_fwd.py | ssh pp@analyzer './sflow_fwd.py -s 127.0.0.1'
If you log into the analyzer machine, the following command will retrieve sFlow from the collector machine:
ssh pp@collector './sflow_fwd.py' | ./sflow_fwd.py -s 127.0.0.1
If a permanent connection is required, it is relatively straightforward to create a daemon using systemd. In this example, the service is being installed on the collector machine by performing the following steps:
First log into the collector generate an ssh key:
ssh-keygen
Next, install the key on the analyzer system:
ssh-copy-id pp@analyzer
Now create the systemd service file, /etc/systemd/system/sflow-tunnel.service:
[Unit]
Description=sFlow tunnel
After=network.target

[Service]
Type=simple
User=pp
ExecStart=/bin/sh -c "/home/pp/sflow_fwd.py | /usr/bin/ssh pp@analyzer './sflow_fwd.py -s 127.0.0.1'"
Restart=on-failure
RestartSec=30

[Install]
WantedBy=multi-user.target
Finally, use the systemctl command to enable and start the daemon:
sudo systemctl enable sflow-tunnel.service
sudo systemctl start sflow-tunnel.service
A simple way to confirm that sFlow is arriving on the analyzer machine is to use sflowtool.

There are numerous articles on this blog describing how the sFlow-RT analytics software can be used to integrate sFlow telemetry with popular metrics and SIEM (security information and event management) tools.

Tuesday, April 23, 2019

Prometheus exporter

Prometheus is an open source time series database optimized to collect large numbers of metrics from cloud infrastructure. This article will explore how industry standard sFlow telemetry streaming supported by network devices (Arista, Aruba, Cisco, Dell, Huawei, Juniper, etc.) and Host sFlow agents (Linux, Windows, FreeBSD, AIX, Solaris, Docker, Systemd, Hyper-V, KVM, Nutanix AHV, Xen) can be integrated with Prometheus to extend visibility into the network.

The diagram above shows the elements of the solution: sFlow telemetry streams from hosts and switches to an instance of sFlow-RT. The sFlow-RT analytics software converts the raw measurements into metrics that are accessible through a REST API. The sflow-rt/prometheus application extends the REST API to include native Prometheus exporter functionality allowing Prometheus to retrieve metrics. Prometheus stores metrics in a time series database that can be queries by Grafana to build dashboards.

Update 19 October 2019, native support for Prometheus export added to sFlow-RT, Prometheus application no longer needed to run this example, use URL: /prometheus/metrics/ALL/ALL/txt. The Prometheus application is needed for exporting traffic flows, see Flow metrics with Prometheus and Grafana.

The Docker sflow/prometheus image provides a simple way to run the application:
docker run --name sflow-rt -p 8008:8008 -p 6343:6343/udp -d sflow/prometheus
Configure sFlow agents to send data to the collector, 10.0.0.70, on port 6343.

Verify that the metrics are available using cURL:
$ curl http://10.0.0.70:8008/prometheus/metrics/ALL/ALL/txt
sflow_ifinucastpkts{agent="10.0.0.30",datasource="2",host="server",ifname="enp3s0"} 9.44
sflow_ifoutdiscards{agent="10.0.0.30",datasource="2",host="server",ifname="enp3s0"} 0
sflow_ifoutbroadcastpkts{agent="10.0.0.30",datasource="2",host="server",ifname="enp3s0"} 0
sflow_ifinerrors{agent="10.0.0.30",datasource="2",host="server",ifname="enp3s0"} 0
If the sFlow agents don't provide host and ifname information, enable SNMP to retrieve sysName and ifName data to populate these fields:
docker run --name sflow-rt -p 8008:8008 -p 6343:6343/udp -d sflow/prometheus \
-Dsnmp.ifname=yes
By default SNMP version 2c will be used with the public community string. Additional System Properties can be used to override these defaults.
Now define a metrics "scraping" job in the Prometheus configuration file, prometheus.yml
global:
  scrape_interval:     15s
  evaluation_interval: 15s

rule_files:
  # - "first.rules"
  # - "second.rules"

scrape_configs:
  - job_name: 'sflow-rt'
    metrics_path: /prometheus/metrics/ALL/ALL/txt
    static_configs:
      - targets: ['10.0.0.70:8008']
Now start Prometheus:
docker run --name prometheus --rm -v $PWD/data:/prometheus \
-v $PWD/prometheus.yml:/etc/prometheus/prometheus.yml \
-p 9090:9090 -d prom/prometheus
The screen capture above shows the Prometheus web interface (accessed on port 9090).
Grafana is open source time series analysis software. The ability to pull data from many data sources and the extensive range of charting options makes Grafana an attractive tool for building operations dashboards.

The following command shows how to run Grafana under Docker:
docker run --name grafana -v $PWD/data:/var/lib/grafana \
-p 3000:3000 -d grafana/grafana
Access the Grafana web interface on port 3000, configure a data source for the Prometheus database, and start building dashboards. The screen capture above shows the same chart built earlier using the native Prometheus interface.

Wednesday, March 6, 2019

Loggly

Loggly is a cloud logging and and analysis platform. This article will demonstrate how to integrate network events generated from industry standard sFlow instrumentation build into network switches.
Loggly offers a free 14 day evaluation, so you can try this example at no cost.
ICMP unreachable describes how monitoring ICMP destination unreachable messages can help identify misconfigured hosts and scanning behavior. The article uses the sFlow-RT real-time analytics software to process the raw sFlow and report on unreachable messages.

The following script, loggly.js, modifies the sFlow-RT script from the article to send events to the Loggly HTTP/S Event Endpoint:
var token = 'xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx';
  
var url = 'https://logs-01.loggly.com/inputs/'+token+'/tag/http/';

var keys = [
  'icmpunreachablenet',
  'icmpunreachablehost',
  'icmpunreachableprotocol',
  'icmpunreachableport'
];

for (var i = 0; i < keys.length; i++) {
  var key = keys[i];
  setFlow(key, {
    keys:'macsource,ipsource,macdestination,ipdestination,' + key,
    value:'frames',
    log:true,
    flowStart:true
  });
}

setFlowHandler(function(rec) {
  var keys = rec.flowKeys.split(',');
  var msg = {
    flow_type:rec.name,
    src_mac:keys[0],
    src_ip:keys[1],
    dst_mac:keys[2],
    dst_ip:keys[3],
    unreachable:keys[4]
  };

  try { http(url,'post','application/json',JSON.stringify(msg)); }
  catch(e) { logWarning(e); };
}, keys);
Some notes on the script:
  1. Modify the script to use the correct token for your Loggly account.
  2. Including MAC addresses can help identify hosts even if they spoof IP addresses
  3. See Writing Applications for more information.
Run the script using the sflow/sflow-rt docker image:
docker run -p 6343:6343/udp -v $PWD/loggly.js:/loggly.js \
sflow/sflow-rt -Dscript.file=/loggly.js
Events should now start appearing in Loggly.
The Loggly Live Tail page can be used to verify that the logs are being received. The screen capture at the start of this article shows a chart trending events by the host that triggered them, identifying 10.0.0.30 as the source of the network scan.

The loggly.js script can easily be modified to track and log different types of network activity. For example, Blacklists describes how to download a set of blacklisted addresses, match traffic against the blacklist and generate events for the matches.

Intranet DDoS attacks describes the threats posed by IoT (Internet of Things) devices and the need for visibility throughout the network in order to tackle these threats. Incorporating sFlow in the monitoring strategy extends visibility beyond the firewalls to the entire network.

In addition to generating events, sFlow analytics can be used to deliver performance metrics. The article, Cloud analytics, describes how to use sFlow-RT to send performance metrics to the Librato cloud service - also part of Solarwinds.