Monday, November 18, 2013

Error Of the Day - Static Route from Global RT to a Loopback Interface On The Same Router - VRF Leaking

This message was generated after I configured a static route in the global routing table pointing to a loopback interface in a VRF.

This is a snippet from the configuration of that router:

ip vrf a
 rd 100:1
 route-target export 100:1
 route-target import 100:1
!
interface Loopback0
 ip address 13.13.13.13 255.255.255.255
 !
!
interface Loopback10
 ip vrf forwarding a
 ip address 130.130.130.130 255.255.255.255
 !
!
ip route 130.130.130.130 255.255.255.255 loopback10 130.130.130.130


Now when I try to ping from Lo0 in the global routing table to Loopback10 in VRF a, I get the following:


*Nov 18 22:14:12.067: %IP-3-LOOPPAK: Looping packet detected and dropped -
src=130.130.130.130, dst=130.130.130.130, hl=20, tl=100, prot=1, sport=0, dport=0 in=Loopback10, nexthop=130.130.130.130, out=Loopback10
options=none -Process= "IP Input", ipl= 0, pid= 97,  -Traceback= 0x61ECB070z 0x61ECC2B4z 0x61ECD200z 0x61ECD63Cz 0x61ECD9FCz 0x61EB6780z 0x61EB71B4z 0x61EB7638z 0x61EB772Cz 0x61EB7980z 0x6339B980z 0x6339B964z

The funny part is that IOS accepted that static route without generating any error at configuration time. What's more interesting is that the source address of the looping packet is 130.130.130.130, although it should have been 13.13.13.13, the address of Lo0 in the GRT.

Feel free to drop a comment regarding this issue :)




Friday, November 15, 2013

Tuesday, November 12, 2013

Conditional Policy-Based Routing

Previously, we've seen how policy-based routing can divert traffic from one path to another even though the best path is a totally different one. The problem with policy-based routing is that it specifies only one option; if that option isn’t available, packets might be dropped or take a path that is still not preferred.

Conditional policy-based routing introduces additional options that help policy-based routing be more flexible. Let’s check this topology




Assume that we have a PBR on R3 that diverts traffic from R4's loopback to interface S0/0 even though the best path is through interface F0/1. What will happen if interface S0/0 goes down? All traffic sourced from R4 Lo0 (4.4.4.4) will be dropped and a black hole will be created in the network, since the traffic is forced through S0/0, which is down.

What if we could tell R3 to force traffic sourced from 4.4.4.4 through S0/0, BUT, in case S0/0 goes down, reroute the traffic to F0/1? This can be done using object tracking in conjunction with policy routing.

Here’s the configuration of PBR on R3

R3#sh ip policy
Interface      Route map
Fa1/0          PBR

R3#sh run int f1/0
Building configuration...

Current configuration : 136 bytes
!
interface FastEthernet1/0
 ip address 10.3.4.3 255.255.255.0
 ip policy route-map PBR
 ip ospf 1 area 0
 duplex auto
 speed auto
end

R3#sh route-map PBR
route-map PBR, permit, sequence 10
  Match clauses:
    ip address (access-lists): PBR
  Set clauses:
    interface Serial0/0
  Policy routing matches: 15 packets, 1278 bytes

R3#sh ip access-lists
Extended IP access list PBR
    10 permit ip host 4.4.4.4 host 1.1.1.1 (21 matches)

Now let’s traceroute from 4.4.4.4 to 1.1.1.1

R4#traceroute 1.1.1.1 source 4.4.4.4

Type escape sequence to abort.
Tracing the route to 1.1.1.1

  1 10.3.4.3 32 msec 24 msec 16 msec
  2 10.1.3.1 56 msec *  32 msec

Assuming a failure happens between R1 and R3, traffic sourced from 4.4.4.4 will be dropped on R3

R3#show ip int br
Interface                  IP-Address      OK? Method Status                Protocol
FastEthernet0/0            unassigned      YES NVRAM  administratively down down   
Serial0/0                  10.1.3.3        YES NVRAM  up                    down   
FastEthernet0/1            10.2.3.3        YES NVRAM  up                    up     
FastEthernet1/0            10.3.4.3        YES NVRAM  up                    up     
Loopback0                  3.3.3.3         YES NVRAM  up                    up     
R3#
*Mar  1 00:17:34.131: %LINEPROTO-5-UPDOWN: Line protocol on Interface Serial0/0, changed state to down
*Mar  1 00:17:34.155: %OSPF-5-ADJCHG: Process 1, Nbr 1.1.1.1 on Serial0/0 from FULL to DOWN, Neighbor Down: Interface down or detached


R4#ping 1.1.1.1 source lo0

Type escape sequence to abort.
Sending 5, 100-byte ICMP Echos to 1.1.1.1, timeout is 2 seconds:
Packet sent with a source address of 4.4.4.4
.....
Success rate is 0 percent (0/5)

The route-map is brainless: it matches the source IP and insists on forwarding the packet through S0/0 even though it's down.

Now, to fix that, we can make the PBR a little more flexible by setting some conditions for it to act upon.

First, let’s create track objects for the Serial0/0 and F0/1 interfaces, each with a delay of 2 seconds for going up or down

R3(config)#track 1 interface s0/0 line-protocol
R3(config-track)#delay up 2
R3(config-track)#delay down 2

R3(config)#track 2 interface f0/1 line-protocol
R3(config-track)#delay up 2
R3(config-track)#delay down 2

Now let’s change our route-map PBR a bit to make it more flexible based on those tracks

R3(config)#route-map PBR permit 10

R3(config-route-map)#set ip next-hop verify-availability 10.1.3.1 ?
  <1-65535>  Sequence to insert into next-hop list

R3(config-route-map)#set ip next-hop verify-availability 10.1.3.1 1 track 1

Basically, we’re telling the route-map to verify the availability of next hop 10.1.3.1 through track object 1 as step #1. Now let’s create a second step in our route-map in case the first one fails

R3(config-route-map)#set ip next-hop verify-availability 10.2.3.2 2 track 2

The moment we do that, showing the route-map reveals that the first path is down and the second is up

R3#show route-map PBR  
route-map PBR, permit, sequence 10
  Match clauses:
    ip address (access-lists): PBR
  Set clauses:
    ip next-hop verify-availability 10.1.3.1 1 track 1  [down]
    ip next-hop verify-availability 10.2.3.2 2 track 2  [up]
  Policy routing matches: 62 packets, 5178 bytes
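
The way the route-map walks through its two set clauses can be modelled in a few lines of Python. This is only a sketch of the behaviour described in this post, not anything that runs on the router: the function name pick_next_hop and the data layout are made up for illustration, and it assumes that when every tracked next hop is down the packet falls back to the normal routing table.

```python
def pick_next_hop(entries, track_state):
    """Return the first next hop (lowest sequence) whose track object is up.

    entries: list of (sequence, next_hop, track_id) tuples
    track_state: dict mapping track_id -> True (up) / False (down)
    Returns None when no tracked next hop is usable, which this sketch
    treats as "fall back to the normal routing table" (an assumption).
    """
    for _seq, next_hop, track_id in sorted(entries):
        if track_state.get(track_id, False):
            return next_hop
    return None


# The two set clauses from route-map PBR on R3
PBR_ENTRIES = [
    (1, "10.1.3.1", 1),  # via Serial0/0, track 1
    (2, "10.2.3.2", 2),  # via FastEthernet0/1, track 2
]
```

Calling pick_next_hop(PBR_ENTRIES, {1: False, 2: True}) models the state shown above, with Serial0/0 down and FastEthernet0/1 up.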

Now let’s traceroute again from R4

R4#traceroute 1.1.1.1 source 4.4.4.4

Type escape sequence to abort.
Tracing the route to 1.1.1.1

  1 10.3.4.3 20 msec 36 msec 20 msec
  2 10.2.3.2 40 msec 36 msec 40 msec
  3 10.1.2.1 72 msec *  60 msec

It seems that R3 is indeed routing through the second path now. For the final test, we’ll bring the interface between R3 and R1 back up and traceroute again

R3#
*Mar  1 00:47:46.379: %TRACKING-5-STATE: 1 interface Se0/0 line-protocol Down->Up

R3#show track
Track 1
  Interface Serial0/0 line-protocol
  Line protocol is Up
    2 changes, last change 00:00:28
  Delay up 2 secs, down 2 secs
  Tracked by:
    ROUTE-MAP 0
Track 2
  Interface FastEthernet0/1 line-protocol
  Line protocol is Up
    1 change, last change 00:12:32
  Delay up 2 secs, down 2 secs
  Tracked by:
    ROUTE-MAP 0

Now both tracks are up and used by the route-map

R3#show route-map PBR
route-map PBR, permit, sequence 10
  Match clauses:
    ip address (access-lists): PBR
  Set clauses:
    ip next-hop verify-availability 10.1.3.1 1 track 1  [up]
    ip next-hop verify-availability 10.2.3.2 2 track 2  [up]
  Policy routing matches: 98 packets, 8634 bytes

A final traceroute should prove that everything is OK now

R4#traceroute 1.1.1.1 source 4.4.4.4

Type escape sequence to abort.
Tracing the route to 1.1.1.1

  1 10.3.4.3 16 msec 36 msec 20 msec
  2 10.1.3.1 40 msec *  32 msec

Now we can rest assured that if there is a problem with the forwarding interface used by PBR, no black hole will be created in the network.



Saturday, November 9, 2013

Cisco Object Tracking using IP SLA

One of the very useful features of Cisco IOS is the track command, which allows you to track certain events and take actions based on them. Since this is a very big topic, I will split it across several posts. Let's first get familiar with the track command, and then we will build our scenarios on what it can do.

The track command, as the name implies, detects the state of a certain variable. This variable can be one of many, but I'll just give a few examples of the things it can observe:


  • a route in the routing table, whether it exists or not
  • an interface state
  • reachability to a certain host (in conjunction with IP SLA)

This can be very useful in situations where the router or switch should be able to act on its own when an event happens. Let's illustrate this with the following topology



Let's assume here that R1 is connected to two switches, and both switches are connected to R4.
R1 has two default routes: the main default route with an admin distance of 1 and the backup default route with an admin distance of 200. There is no IGP running in this topology.

The first default route points to next-hop IP 10.1.4.4, and the backup default route points to next-hop IP 20.1.4.4. Now, here's the real problem. Ethernet, unlike many other L2 protocols, doesn't detect failures on remote hops. If the link between R4 and SW1 goes down, the link between R1 and SW1 will still be up. Even though the Layer-3 endpoints of subnet 10.1.4.0/24 are on R1 and R4, R1 will know nothing about the failed link between R4 and SW1.

Let’s first check the normal operation of the setup we have on hand.

R1#show run | i ip route
ip route 0.0.0.0 0.0.0.0 10.1.4.4
ip route 0.0.0.0 0.0.0.0 20.1.4.4 200

R1#show ip route
Gateway of last resort is 10.1.4.4 to network 0.0.0.0

     1.0.0.0/32 is subnetted, 1 subnets
C       1.1.1.1 is directly connected, Loopback0
     20.0.0.0/24 is subnetted, 1 subnets
C       20.1.4.0 is directly connected, FastEthernet0/1
     10.0.0.0/24 is subnetted, 1 subnets
C       10.1.4.0 is directly connected, FastEthernet0/0
S*   0.0.0.0/0 [1/0] via 10.1.4.4

As you can see, the first default route is the only one installed in the routing table due to its lower admin distance.

R1#show ip int brief
Interface                  IP-Address      OK? Method Status                Protocol
FastEthernet0/0            10.1.4.1        YES manual up                    up     
FastEthernet0/1            20.1.4.1        YES manual up                    up     
Loopback0                  1.1.1.1         YES manual up                    up

All the interfaces are up and everything seems good. Now let’s ping R4's loopback, sourced from R1's loopback

R1#ping 4.4.4.4 source lo0

Type escape sequence to abort.
Sending 5, 100-byte ICMP Echos to 4.4.4.4, timeout is 2 seconds:
Packet sent with a source address of 1.1.1.1
!!!!!
Success rate is 100 percent (5/5), round-trip min/avg/max = 4/52/124 ms

Seems legit. Now let’s simulate a failure between R4 and SW1 by shutting down the interface on the R4 side

R4(config)#int f0/1
R4(config-if)#shut
R4(config-if)#
*Mar  1 00:47:51.691: %LINK-5-CHANGED: Interface FastEthernet0/1, changed state to administratively down
*Mar  1 00:47:52.691: %LINEPROTO-5-UPDOWN: Line protocol on Interface FastEthernet0/1, changed state to down

Now the whole path of the main default route isn’t usable, but F0/0 on R1 is still up and the main default route still points out of F0/0

R1#show ip int brief
Interface                  IP-Address      OK? Method Status                Protocol
FastEthernet0/0            10.1.4.1        YES manual up                    up     
FastEthernet0/1            20.1.4.1        YES manual up                    up     
Loopback0                  1.1.1.1         YES manual up                    up     

R1#show ip route
Gateway of last resort is 10.1.4.4 to network 0.0.0.0

     1.0.0.0/32 is subnetted, 1 subnets
C       1.1.1.1 is directly connected, Loopback0
     20.0.0.0/24 is subnetted, 1 subnets
C       20.1.4.0 is directly connected, FastEthernet0/1
     10.0.0.0/24 is subnetted, 1 subnets
C       10.1.4.0 is directly connected, FastEthernet0/0
S*   0.0.0.0/0 [1/0] via 10.1.4.4

Now if we try to ping, the ping will of course fail

R1#ping 10.1.4.4

Type escape sequence to abort.
Sending 5, 100-byte ICMP Echos to 10.1.4.4, timeout is 2 seconds:
.....
Success rate is 0 percent (0/5)

This is where tracking comes into play. Since R1 isn’t aware of the state of remote Ethernet links by default, we can make it track events that might indicate that the link is down, and based on that it can remove the main default route and install the backup route instead. Here’s how we can do this.

To keep it simple, we’re going to send probes to 10.1.4.4 through interface F0/0 and track them. If the probes fail, the main default route is removed, which ultimately means the other default route will be installed instead.

First let’s create an SLA object to start pinging our next-hop IP from the desired interface

R1(config)#ip sla 1?
<1-2147483647> 

R1(config)#ip sla 1
R1(config-ip-sla)#icmp-echo 10.1.4.4 source-interface f0/0

Now we need to specify at least three parameters to make this work

R1(config-ip-sla-echo)#frequency ?
  <1-604800>  Frequency in seconds (default 60)
R1(config-ip-sla-echo)#frequency 5

R1(config-ip-sla-echo)#timeout ?
  <0-604800000>  Timeout in milliseconds
R1(config-ip-sla-echo)#timeout 1000

R1(config-ip-sla-echo)#threshold ?
  <0-2147483647>  Millisecond threshold value
R1(config-ip-sla-echo)#threshold 1000

Basically, here’s the definition of each of those

  • Frequency (sec) is how often a probe is sent
  • Timeout (msec) is how long to wait for a reply before the probe is considered failed
  • Threshold (msec) is the round-trip time above which a probe that was answered still counts as too slow

Keep in mind that the threshold must not be higher than the timeout, which makes sense.
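
To make the difference between timeout and threshold concrete, here is a small Python sketch that classifies a single probe result. It is an illustration under assumptions, not how IOS implements it: the function name classify_probe is made up, and the "Over threshold" label is only a stand-in for whatever return code the router reports in that case ("OK" and "Timeout" are the two codes that appear in the outputs in this post).

```python
def classify_probe(rtt_ms, timeout_ms=1000, threshold_ms=1000):
    """Classify one probe result.

    rtt_ms is the measured round-trip time in milliseconds,
    or None when no reply arrived at all.
    """
    if threshold_ms > timeout_ms:
        raise ValueError("threshold must not exceed timeout")
    if rtt_ms is None or rtt_ms > timeout_ms:
        return "Timeout"          # no reply within the absolute limit
    if rtt_ms > threshold_ms:
        return "Over threshold"   # replied, but slower than we accept
    return "OK"
```

With timeout_ms=1000 and threshold_ms=500, for example, a 32 ms reply is OK, a 700 ms reply is over the threshold, and a missing reply is a timeout.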

After creating the SLA object, we need to activate it by specifying when it should start and for how long the probes should keep being sent.

R1(config)#ip sla schedule 1 life forever start-time now

We just indicated that we want to start SLA object 1 immediately and keep it running forever

Let’s check if it’s working

R1#sh ip sla statistics 1

Round Trip Time (RTT) for       Index 1
        Latest RTT: 32 milliseconds
Latest operation start time: *01:16:38.391 UTC Fri Mar 1 2002
Latest operation return code: OK
Number of successes: 1
Number of failures: 13
Operation time to live: Forever


R1#sh ip sla statistics 1

Round Trip Time (RTT) for       Index 1
        Latest RTT: 32 milliseconds
Latest operation start time: *01:16:38.391 UTC Fri Mar 1 2002
Latest operation return code: OK
Number of successes: 12
Number of failures: 0
Operation time to live: Forever

It seems that our probes are working just fine. Now all we need to do is track the state of these probes and take action in case they fail.

R1(config)#track 1 rtr 1 reachability

R1(config-track)#delay ?
  down  Delay down change notification
  up    Delay up change notification

R1(config-track)#delay up ?
  <0-180>  Seconds to delay
R1(config-track)#delay up 2

R1(config-track)#delay down ?
  <0-180>  Seconds to delay

R1(config-track)#delay down 2

Keep in mind that Cisco has not always been consistent with its commands. The old name for IP SLA was RTR; the configuration syntax was changed from rtr to ip sla, but on this IOS release the track command still uses the old keyword, so rtr 1 here refers to the sla 1 object.

What we just configured is a tracking instance that observes the state of the SLA object's reachability to 10.1.4.4. Delay up is the amount of time the track waits before reacting after it detects that the SLA has reachability, and delay down is how long it waits before declaring reachability down. This is useful because with flapping links you don’t want the router to act instantly; you may need to give it time before it switches between the two states.
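
The effect of delay up and delay down on a flapping link can be sketched in Python. This is a simplified model, assuming the raw state is sampled at discrete times; the function name apply_delay and the sample format are invented for illustration and do not reflect IOS internals.

```python
def apply_delay(samples, delay_up=2, delay_down=2, initial=True):
    """Replay (time_sec, raw_is_up) samples and return reported state changes.

    A change is reported only after the raw state has differed from the
    reported state continuously for the matching delay.
    Returns a list of (time_sec, new_state) tuples.
    """
    reported = initial
    pending_since = None  # time the raw state first differed from reported
    changes = []
    for t, raw in samples:
        if raw == reported:
            pending_since = None  # the flap ended before the delay expired
            continue
        if pending_since is None:
            pending_since = t
        delay = delay_up if raw else delay_down
        if t - pending_since >= delay:
            reported = raw
            pending_since = None
            changes.append((t, reported))
    return changes
```

In this model a one-second blip never reaches the consumers of the track object, while an outage that lasts at least two seconds does.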

R1#show track 1
Track 1
  Response Time Reporter 1 reachability
  Reachability is Up
    1 change, last change 00:08:31
  Delay up 2 secs, down 2 secs
  Latest operation return code: OK
  Latest RTT (millisecs) 84

After associating the track with our main default route, we should be good to go

R1(config)#ip route 0.0.0.0 0.0.0.0 10.1.4.4 track 1

R1#sh run | i route
ip route 0.0.0.0 0.0.0.0 10.1.4.4 track 1
ip route 0.0.0.0 0.0.0.0 20.1.4.4 200

Now let’s simulate a failure again between R4 and SW1 by shutting the interface F0/1 on R4. Here’s what happens afterwards on R1

R1#
*Mar  1 01:34:18.587: %TRACKING-5-STATE: 1 rtr 1 reachability Up->Down
R1#
*Mar  1 01:34:18.587: RT: del 0.0.0.0 via 10.1.4.4, static metric [1/0]
*Mar  1 01:34:18.591: RT: delete network route to 0.0.0.0
*Mar  1 01:34:18.591: RT: NET-RED 0.0.0.0/0
*Mar  1 01:34:18.591: RT: NET-RED 0.0.0.0/0
*Mar  1 01:34:18.591: RT: add 0.0.0.0/0 via 20.1.4.4, static metric [200/0]
*Mar  1 01:34:18.591: RT: NET-RED 0.0.0.0/0
*Mar  1 01:34:18.591: RT: default path is now 0.0.0.0 via 20.1.4.4
*Mar  1 01:34:18.595: RT: new default network 0.0.0.0
*Mar  1 01:34:18.595: RT: NET-RED 0.0.0.0/0

R1#show ip route

Gateway of last resort is 20.1.4.4 to network 0.0.0.0

     1.0.0.0/32 is subnetted, 1 subnets
C       1.1.1.1 is directly connected, Loopback0
     20.0.0.0/24 is subnetted, 1 subnets
C       20.1.4.0 is directly connected, FastEthernet0/1
     10.0.0.0/24 is subnetted, 1 subnets
C       10.1.4.0 is directly connected, FastEthernet0/0
S*   0.0.0.0/0 [200/0] via 20.1.4.4

The backup default route is now installed in the routing table, eliminating the Ethernet problem we had before.
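
The interaction between the tracked route and the floating backup route can be summarised with a short Python sketch. It is a simplified model of what the debug output above shows, not router code; the function name best_default_route and the dictionary layout are assumptions made for this example.

```python
def best_default_route(routes, track_state):
    """Pick the default route that would be installed.

    routes: list of dicts with "next_hop", "distance" and optional "track"
    track_state: dict mapping track id -> True (up) / False (down)
    A route bound to a track object is a candidate only while that object
    is up; among the candidates the lowest admin distance wins.
    """
    candidates = [
        r for r in routes
        if r.get("track") is None or track_state.get(r["track"], False)
    ]
    if not candidates:
        return None
    return min(candidates, key=lambda r: r["distance"])


# The two static default routes configured on R1
R1_ROUTES = [
    {"next_hop": "10.1.4.4", "distance": 1, "track": 1},
    {"next_hop": "20.1.4.4", "distance": 200},
]
```

While track 1 is up the route via 10.1.4.4 wins on admin distance; once track 1 goes down only the backup via 20.1.4.4 remains a candidate.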

R1#sh ip sla statistics

Round Trip Time (RTT) for       Index 1
        Latest RTT: NoConnection/Busy/Timeout
Latest operation start time: *01:35:58.391 UTC Fri Mar 1 2002
Latest operation return code: Timeout
Number of successes: 176
Number of failures: 70
Operation time to live: Forever

The IP SLA statistics indicate that the failure was due to a timeout; had the threshold been exceeded instead, the return code would have reflected that. The latest RTT also shows that there is no connection.

R1#show track
Track 1
  Response Time Reporter 1 reachability
  Reachability is Down
    4 changes, last change 00:04:01
  Delay up 2 secs, down 2 secs
  Latest operation return code: Timeout
  Tracked by:
    STATIC-IP-ROUTING 0

Now we should be able to ping from R1 lo0 to R4 lo0 without a problem
R1#ping 4.4.4.4 source lo0

Type escape sequence to abort.
Sending 5, 100-byte ICMP Echos to 4.4.4.4, timeout is 2 seconds:
Packet sent with a source address of 1.1.1.1
!!!!!
Success rate is 100 percent (5/5), round-trip min/avg/max = 20/25/36 ms



In the next couple of posts, we’ll dig deeper into advanced configuration of SLA and tracking to overcome many of the problems we face in our networks.