Thursday, January 31, 2013

Down the rabbit hole

The article, Tunnels, describes the use of tunneling protocols such as GRE, NVGRE and VXLAN to create virtual networks in cloud environments. Tunneling is also an important tool in addressing challenges posed by IPv6 migration. However, while tunnels are an effective way to virtualize networking, they pose difficult challenges for application development and operations (DevOps) teams trying to optimize network performance and for network administrators who no longer have visibility into the applications running over the physical infrastructure.

This article uses sFlow-RT to demonstrate how sFlow monitoring, built into the physical and virtual network infrastructure, can be used to provide comprehensive visibility into tunneled traffic to application, operations and networking teams.

Note: The sFlow-RT analytics module is primarily intended to be used in automated performance aware software defined networking applications. However, it also provides a rudimentary web based user interface that can be used to demonstrate the visibility into tunneled traffic offered by the sFlow standard.

Application performance

One of the reasons that tunnels are popular for network virtualization is that they provide a useful abstraction that hides the underlying physical network topology. However, while this abstraction offers significant operational flexibility, lack of visibility into the physical network can result in poorly placed workloads, inefficient use of resources, and consequent performance problems (see NUMA).

In this example, consider the problem faced by a system manager troubleshooting poor throughput between two virtual machines: 10.0.201.1 and 10.0.201.2.
Figure 1: Tracing a tunneled flow
Figure 1 shows the Flows table with the following flow definition:
  1. Name: trace
  2. Keys: ipsource,ipdestination,ipprotocol
  3. Value: frames
  4. Filter: ipsource.1=10.0.201.1&ipdestination.1=10.0.201.2
These settings define a new flow called trace that matches traffic whose inner (tenant) addresses are 10.0.201.1 and 10.0.201.2, and reports the outer IP addresses and IP protocol.

Note: ipsource.1 has a suffix of 1, indicating a reference to the inner address. It is possible to have nested tunnels such that the inner, inner ipsource address would be indicated as ipsource.2 etc.
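
As a rough sketch, the trace definition can be written as the JSON object that a flow definition takes. The helper flowKey, the variable names, and the /flow/trace/json path are assumptions made for illustration; the path follows the usual pattern of the sFlow-RT REST API but should be checked against the installed version.

```javascript
// Hypothetical helper: build a flow key for a given nesting depth.
// Depth 0 is the outermost header (no suffix), depth 1 the first inner header (".1"), and so on.
function flowKey(name, depth) {
  return depth === 0 ? name : name + "." + depth;
}

// The "trace" definition from Figure 1: filter on the inner addresses, report the outer ones.
const traceFlow = {
  keys: [flowKey("ipsource", 0), flowKey("ipdestination", 0), flowKey("ipprotocol", 0)].join(","),
  value: "frames",
  filter: flowKey("ipsource", 1) + "=10.0.201.1&" + flowKey("ipdestination", 1) + "=10.0.201.2"
};

// Assumed REST path for installing the definition with an HTTP PUT.
const tracePath = "/flow/trace/json";

console.log(tracePath, JSON.stringify(traceFlow));
```

For a doubly nested tunnel, flowKey("ipsource", 2) yields ipsource.2, matching the suffix convention described in the note.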

Figure 2: Outer addresses of a tunneled flow
Clicking on the flow in the Flows table brings up the chart shown in Figure 2. The chart shows a flow of approximately 15K packets per second and identifies the outer ipsource, ipdestination and ipprotocol as 10.0.0.151, 10.0.0.152 and 47 respectively.

Note: The IP protocol of 47 indicates that this is a GRE tunnel.
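
When post-processing flow records, the outer ipprotocol value can be mapped to a name to spot tunnels quickly. The sketch below uses a handful of IANA-assigned protocol numbers; the function and table names are illustrative and not part of sFlow-RT.

```javascript
// A small subset of IANA-assigned IP protocol numbers.
const IP_PROTOCOLS = { 1: "icmp", 4: "ipip", 6: "tcp", 17: "udp", 41: "ipv6", 47: "gre" };

// Translate a protocol number (as reported by the ipprotocol key) into a name.
function protocolName(number) {
  return IP_PROTOCOLS[number] || "unknown(" + number + ")";
}

// Protocols whose payload is itself another packet: IP-in-IP (4), IPv6 encapsulation (41), GRE (47).
function isTunnelProtocol(number) {
  return [4, 41, 47].includes(Number(number));
}

console.log(protocolName(47), isTunnelProtocol(47));
```

UDP-based encapsulations such as VXLAN cannot be recognized from the protocol number alone, which is one reason the stack key used later in this article is useful.
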
Figure 3: All data sources observing a flow
The sFlow-RT module has a REST/HTTP API, and editing the URL modifies the query to reveal additional information. Figure 3 shows the effect of changing the query from metric to dump. The dump output shows each switch (Agent) and port (Data Source) that saw the traffic. In this case the traffic was seen traversing two virtual switches, 10.0.0.28 and 10.0.0.20, and a physical switch, 10.0.0.253.

Given the switch and port information, follow up queries could be constructed to look at utilizations, errors and discards on the links to see if there are network problems affecting the traffic.
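
The switch from metric to dump amounts to changing one path segment in the query URL. The following is a minimal sketch of that idea; the host and port, the ALL agent wildcard, and the exact path layout are assumptions based on the URLs typically used with sFlow-RT and may differ between versions.

```javascript
// Build a query URL for a named flow. Only the two query types discussed above are accepted.
function queryUrl(base, query, agent, metric) {
  if (query !== "metric" && query !== "dump") {
    throw new Error("unsupported query: " + query);
  }
  return base + "/" + query + "/" + encodeURIComponent(agent) + "/" + encodeURIComponent(metric) + "/json";
}

// Assumed address of the sFlow-RT instance.
const base = "http://localhost:8008";

console.log(queryUrl(base, "metric", "ALL", "trace")); // summary value across agents
console.log(queryUrl(base, "dump", "ALL", "trace"));   // one entry per agent and data source
```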

Network performance

Tunnels hide the applications using the network from network managers, making it difficult to manage capacity, assess the impact of network performance problems and maintain security.

Consider the same example, but this time from a network manager's perspective, having identified a large flow from address 10.0.0.151 to 10.0.0.152.
Figure 4: Looking into a tunnel
Figure 4 shows the Flows table with the following definition:
  1. Name: inside
  2. Keys: ipsource.1,ipdestination.1,stack
  3. Value: frames
  4. Filter: ipsource=10.0.0.151&ipdestination=10.0.0.152
These settings define a new flow called inside that matches traffic whose outer addresses are 10.0.0.151 and 10.0.0.152, and reports the inner (tenant) addresses and the protocol stack.
Figure 5: Inner addresses in a tunneled flow
Again, clicking on the entry in the Flows table brings up the chart shown in Figure 5. The chart shows a flow of 15K packets per second and identifies the inner ipsource.1, ipdestination.1 and stack as 10.0.201.1, 10.0.201.2 and eth.ip.gre.ip.tcp respectively.
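
The stack value is a dot-separated list of protocol layers, so it can be inspected with a few lines of code. The sketch below is illustrative only: describeStack is a hypothetical helper, and it recognizes just the ip and ip6 layer names that appear in this article and its comments.

```javascript
// Summarize a protocol stack string such as "eth.ip.gre.ip.tcp".
function describeStack(stack) {
  const layers = stack.split(".");
  const isIp = (layer) => layer === "ip" || layer === "ip6";
  // Positions of every IP header in the stack, outermost first.
  const ipPositions = layers.map((layer, i) => (isIp(layer) ? i : -1)).filter((i) => i >= 0);
  const tunneled = ipPositions.length > 1;
  return {
    layers: layers,
    tunneled: tunneled,
    // Layers between the outer and the first inner IP header, e.g. ["gre"].
    encapsulation: tunneled ? layers.slice(ipPositions[0] + 1, ipPositions[1]) : [],
    // Highest inner-header suffix usable in keys such as ipsource.1.
    innerDepth: Math.max(ipPositions.length - 1, 0)
  };
}

console.log(describeStack("eth.ip.gre.ip.tcp"));
```

An innerDepth of 1 indicates that keys with the .1 suffix (ipsource.1, ipdestination.1) are available for the flow.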

Given the inner IP addresses and stack, follow up queries can identify the TCP port, server names, application names, CPU loads etc. needed to understand the application demand driving traffic and determine possible actions (moving a virtual machine for example).

Automation

This was a trivial example. In practice, tunneled topologies are more complex and cloud data centers are far too large to be managed using manual processes like the one demonstrated here. sFlow-RT provides visibility into large, complex, multi-layered environments, including: QinQ, TRILL, VXLAN, NVGRE and 6over4. Programmatic access to performance data through sFlow-RT's REST API allows cloud orchestration and software defined networking (SDN) controllers to incorporate real-time network, server and application visibility to automatically load balance and optimize workloads.

9 comments:

  1. Hello,

    How can I get ddos-protect to analyze traffic in a GRE tunnel, please?

    Header of sampled packet: 4c82cbaafc706d1593a500810000ac0800450005ecd4c1…
    Ethernet II, Src: Cisco_00:00:01 (00:00:00:00:00:01), Dst: Cisco_00:00:02 (00:00:00:00:00:02)
    802.1Q Virtual LAN, PRI: 0, DEI: 0, ID: 172
    Internet Protocol Version 4, Src: 172.XXX.XXX.XXX, Dst: 172.XXX.XXX.XXX
    Generic Routing Encapsulation (IP)
    --> Internet Protocol Version 4, Src: 45.XXX.XXX.XXX, Dst: 185.XXX.XXX.XXX <--
    Transmission Control Protocol, Src Port: 443, Dst Port: 39501, Seq: 446843820, Ack: 3665896498
    Transport Layer Security

    Where should I specify in the program ipdestination.1 instead of ipdestination so that ddos-protect can act during an attack ?

    Thanks for your help

    Replies
    1. Rather than modifying ddos-protect, it's probably easier to start with a simpler example, Real-time DDoS mitigation using sFlow and BGP FlowSpec.

      You might also want to install browse-flows and consult Defining Flows as a way to experiment with different flow definitions.

      The following flow keys should work for a UDP reflection attack in a GRE tunnel, 'ipdestination.1,udpsourceport', along with a filter on the protocol stack, e.g. 'prefix:stack:.:5=eth.ip.gre.ip.udp'

      It's not clear to me how you will filter the traffic in your router? Is it an end point for the tunnel? Is there a BGP instance responsible for routing the decapsulated traffic?
  2. We have a couple of Cisco ASR routers interconnected through an old Nexus 3064 switch that does not support VXLAN or MPLS features, so we built an underlay network running OSPF on the Nexus and manually configured a GRE tunnel between our routers. With this solution, we can minimise the routing table on the Nexus.

    So you're right, the tunnel end point routers are running a BGP instance.

    I did some tests with the Browse-flows and then I edited the ddos.js file

    // IPv4 attacks
    var keys = 'ipdestination.1,group:ipdestination.1:ddos_protect';
    var filter = '';
    //var filter = 'first:stack:.:ip:ip6=ip&group:ipsource:ddos_protect='+externalGroup+'&group:ipdestination.1:ddos_protect!='+excludedGroups;
    setFlow('ddos_protect_ip_flood', {
      keys: keys + ',ipprotocol.1',
      value: 'frames',
      filter: filter,
      t: flow_t
    });

    It works I can see the traffic in the GRE tunnel.

    However the filter 'first:stack:.:ip:ip6=ip&group:ipsource:ddos_protect='+externalGroup+'&group:ipdestination.1:ddos_protect!='+excludedGroups does not work, would you have an idea please?

    Replies
    1. It looks like you need to change "group:ipsource" to "group:ipsource.1".

      Also, the "first:stack:.:ip:ip6=ip" filter might not work correctly because it matches on the tunnel protocol, not the inner protocol. If you add the token "stack" to Browse Flows, what do you see? It should be something like "eth.ip.gre.ip.udp", in which case you could filter on the prefix, e.g. "prefix:stack:.:4=eth.ip.gre.ip". You can check the filter in Browse Flows before changing ddos.js.
  3. I just tried but no result :

    keys=ipdestination.1,group:ipdestination.1:ddos_protect,ipprotocol.1
    Value=fps
    Filter=prefix:stack:.:4=eth.ip.gre.ip

    Replies
    1. Try keys=stack,ipdestination.1,group:ipdestination.1:ddos_protect,ipprotocol.1
      Remove the filter. What values are you seeing for stack?
    2. eth.q.ip.gre.ip.tcp
      eth.q.ip.gre.ip.udp
      eth.q.ip.gre.ip.icmp
      eth.q.ip.gre.ip.udp.quic
    3. Thanks for the information. Your filter for ip over the gre tunnel should be:
      prefix:stack:.:5=eth.q.ip.gre.ip
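
The prefix filters in this thread share one pattern: the number after the separator equals the number of layers in the prefix being matched. The sketch below illustrates that pattern. Both helper names are hypothetical, and the matching behaviour is inferred from the examples in this thread rather than taken from the sFlow-RT implementation.

```javascript
// Build a stack prefix filter whose count matches the number of layers in the prefix.
function stackPrefixFilter(prefix) {
  const count = prefix.split(".").length;
  return "prefix:stack:.:" + count + "=" + prefix;
}

// Assumed semantics: compare the first N layers of the stack with the prefix.
function matchesPrefix(stack, prefix) {
  const wanted = prefix.split(".");
  return stack.split(".").slice(0, wanted.length).join(".") === prefix;
}

console.log(stackPrefixFilter("eth.q.ip.gre.ip"));
console.log(matchesPrefix("eth.q.ip.gre.ip.tcp", "eth.ip.gre.ip"));   // VLAN tag breaks the match
console.log(matchesPrefix("eth.q.ip.gre.ip.tcp", "eth.q.ip.gre.ip"));
```

Under these assumptions, the earlier 4-layer filter returned nothing because the 802.1Q layer (q) in the observed stacks shifts every later layer by one position.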