One of the most useful features of Cisco's IOS is the track command, which lets you track certain events and take actions based on them. Since this is a big topic, I will split it across several posts. Let's first get familiar with the track command, and then build our scenarios on what it can do.
The track command, as the name implies, detects the state of a certain variable. That variable can be one of many things; here are a few examples of what it can observe:
- a route in the routing table, whether it exists or not
- the state of an interface
- reachability to a certain host (in conjunction with IP SLA)
This can be very useful in situations where the router or switch needs to act on its own when an event happens. Let's illustrate this with the following topology.
Let's assume that R1 is connected to two switches, and both switches are connected to R4.
R1 has two default routes: the main default route with an admin distance of 1 and a backup default route with an admin distance of 200. There is no IGP running in this topology.
The first default route points to next-hop IP 10.1.4.4, and the backup default route points to next-hop IP 20.1.4.4. Now, here's the real problem. Ethernet, unlike many other L2 protocols, doesn't detect remote link failures. If the link between R4 and SW1 goes down, the link between R1 and SW1 stays up. Even though the Layer-3 endpoints of subnet 10.1.4.0/24 are on R1 and R4, R1 knows nothing about the failed link between R4 and SW1.
Let’s first check the normal operation of the setup we
have on hand.
R1#show run | i ip route
ip route 0.0.0.0 0.0.0.0 10.1.4.4
ip route 0.0.0.0 0.0.0.0 20.1.4.4 200
R1#show ip route
Gateway of last resort is 10.1.4.4 to network 0.0.0.0
1.0.0.0/32 is subnetted, 1 subnets
C 1.1.1.1 is directly connected, Loopback0
20.0.0.0/24 is subnetted, 1 subnets
C 20.1.4.0 is directly connected, FastEthernet0/1
10.0.0.0/24 is subnetted, 1 subnets
C 10.1.4.0 is directly connected, FastEthernet0/0
S* 0.0.0.0/0 [1/0] via 10.1.4.4
As you can see, the first default route is the only one
installed in the routing table due to its lower admin distance.
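To make the selection rule concrete, here is a minimal Python sketch of how a floating static route loses to the primary one. It is only a toy model of the behavior described above, not how IOS is implemented; the StaticRoute class and best_route function are hypothetical names made up for this illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StaticRoute:
    """Hypothetical model of one static route entry (not an IOS structure)."""
    prefix: str
    next_hop: str
    admin_distance: int = 1  # static routes default to an admin distance of 1


def best_route(candidates):
    """Return the candidate with the lowest admin distance, or None if empty."""
    return min(candidates, key=lambda r: r.admin_distance, default=None)


routes = [
    StaticRoute("0.0.0.0/0", "10.1.4.4"),                      # main default route
    StaticRoute("0.0.0.0/0", "20.1.4.4", admin_distance=200),  # backup default route
]

print(best_route(routes).next_hop)  # 10.1.4.4
```

The backup route only wins once the main route is no longer a candidate, which is exactly what we will need tracking for.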
R1#show ip int brief
Interface IP-Address OK? Method Status Protocol
FastEthernet0/0 10.1.4.1 YES manual up up
FastEthernet0/1 20.1.4.1 YES manual up up
Loopback0 1.1.1.1 YES manual up up
All the interfaces are up and everything seems good. Now let's ping R4's loopback, sourced from R1's loopback.
R1#ping 4.4.4.4 source lo0
Type escape sequence to abort.
Sending 5, 100-byte ICMP Echos to 4.4.4.4, timeout is 2 seconds:
Packet sent with a source address of 1.1.1.1
!!!!!
Success rate is 100 percent (5/5), round-trip min/avg/max = 4/52/124 ms
Seems legit. Now let's simulate a failure between R4 and SW1 by shutting down the interface on the R4 side.
R4(config)#int f0/1
R4(config-if)#shut
R4(config-if)#
*Mar 1 00:47:51.691: %LINK-5-CHANGED: Interface FastEthernet0/1, changed state to administratively down
*Mar 1 00:47:52.691: %LINEPROTO-5-UPDOWN: Line protocol on Interface FastEthernet0/1, changed state to down
Now the whole path of the main default route is unusable, but the F0/0 interface on R1 is still up and the main default route is still pointing out of F0/0.
R1#show ip int brief
Interface IP-Address OK? Method Status Protocol
FastEthernet0/0 10.1.4.1 YES manual up up
FastEthernet0/1 20.1.4.1 YES manual up up
Loopback0 1.1.1.1 YES manual up up
R1#show ip route
Gateway of last resort is 10.1.4.4 to network 0.0.0.0
1.0.0.0/32 is subnetted, 1 subnets
C 1.1.1.1 is directly connected, Loopback0
20.0.0.0/24 is subnetted, 1 subnets
C 20.1.4.0 is directly connected, FastEthernet0/1
10.0.0.0/24 is subnetted, 1 subnets
C 10.1.4.0 is directly connected, FastEthernet0/0
S* 0.0.0.0/0 [1/0] via 10.1.4.4
Now if we try to ping, the ping will of course fail.
R1#ping 10.1.4.4
Type escape sequence to abort.
Sending 5, 100-byte ICMP Echos to 10.1.4.4, timeout is 2 seconds:
.....
Success rate is 0 percent (0/5)
This is where tracking comes into play. Since R1 isn't aware of the state of remote Ethernet links by default, we can make it track events that indicate the link is down, and based on that remove the main default route and install the backup route instead. Here's how we can do this.
To keep things simple, we're going to send probes to 10.1.4.4 through interface F0/0 and track them. If the probes fail, we remove the main default route, which ultimately means the other default route is installed instead.
First let's create an SLA object to start pinging our next-hop IP from the desired interface.
R1(config)#ip sla 1?
<1-2147483647>
R1(config)#ip sla 1
R1(config-ip-sla)#icmp-echo 10.1.4.4 source-interface f0/0
Now we need to specify at least three parameters to make this work.
R1(config-ip-sla-echo)#frequency ?
<1-604800> Frequency in seconds (default 60)
R1(config-ip-sla-echo)#frequency 5
R1(config-ip-sla-echo)#timeout ?
<0-604800000> Timeout in milliseconds
R1(config-ip-sla-echo)#timeout 1000
R1(config-ip-sla-echo)#threshold ?
<0-2147483647> Millisecond threshold value
R1(config-ip-sla-echo)#threshold 1000
Basically, here's the definition of each of those:
- Frequency (sec): how often a probe is sent
- Timeout (msec): how long to wait for a reply before the probe is counted as failed
- Threshold (msec): the probe was answered, but the reply took longer than this value
Keep in mind that the threshold must not exceed the timeout (here both are set to 1000), which makes sense: a reply that arrives after the timeout is already counted as a failure.
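The relationship between the timeout and the threshold can be sketched in a few lines of Python. This is a simplified model for illustration only: the classify_probe function and its result strings are hypothetical, not IOS output, and the defaults simply mirror the values configured above.

```python
from typing import Optional


def classify_probe(rtt_ms: Optional[float],
                   timeout_ms: int = 1000,
                   threshold_ms: int = 1000) -> str:
    """Classify one probe result (hypothetical helper, simplified model).

    rtt_ms is None when no reply was received at all.
    """
    if threshold_ms > timeout_ms:
        raise ValueError("threshold must not exceed timeout")
    if rtt_ms is None or rtt_ms > timeout_ms:
        return "Timeout"        # no reply within the absolute timeout
    if rtt_ms > threshold_ms:
        return "OverThreshold"  # replied, but slower than we consider healthy
    return "OK"


print(classify_probe(32))               # OK
print(classify_probe(None))             # Timeout
print(classify_probe(700, 1000, 500))   # OverThreshold
```

With the threshold equal to the timeout, as in our configuration, the middle case can never trigger: any reply slow enough to cross the threshold has already timed out.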
After creating the SLA object, we need to activate it by specifying when it should start and for how long the probe should keep being sent.
R1(config)#ip sla schedule 1 life forever start-time now
We just told the router to start SLA object 1 immediately and keep it running forever.
Let’s check if it’s working
R1#sh ip sla statistics 1
Round Trip Time (RTT) for Index 1
Latest RTT: 32 milliseconds
Latest operation start time: *01:16:38.391 UTC Fri Mar 1 2002
Latest operation return code: OK
Number of successes: 1
Number of failures: 13
Operation time to live: Forever
R1#sh ip sla statistics 1
Round Trip Time (RTT) for Index 1
Latest RTT: 32 milliseconds
Latest operation start time: *01:16:38.391 UTC Fri Mar 1 2002
Latest operation return code: OK
Number of successes: 12
Number of failures: 0
Operation time to live: Forever
It seems that our probes are working just fine. Now all we need to do is track the state of these probes so that we can take action when they fail.
R1(config)#track 1 rtr 1 reachability
R1(config-track)#delay ?
down Delay down change notification
up Delay up change notification
R1(config-track)#delay up ?
<0-180> Seconds to delay
R1(config-track)#delay up 2
R1(config-track)#delay down ?
<0-180> Seconds to delay
R1(config-track)#delay down 2
Keep in mind that Cisco has not always been consistent with its commands. The old name for IP SLA was RTR (Response Time Reporter). The configuration syntax changed from RTR to SLA, but on this IOS release the track command still uses the old keyword, so rtr 1 here refers to the sla 1 object. Newer IOS releases accept track 1 ip sla 1 reachability instead.
What we just configured is a tracking instance that observes the state of the SLA object's reachability to 10.1.4.4. The delay up value is how long the track waits before reacting after it detects that the SLA has reachability, and delay down is how long it waits before declaring that reachability is down. This is useful because with flapping links you don't want the router to act instantly; you may need to give it time to settle between the two states.
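The damping effect of delay up and delay down can be sketched as a small state machine in Python. This is a simplified model under stated assumptions, not how IOS implements it: the DelayedTracker class is a hypothetical name, and it only re-evaluates its state when a probe result arrives, whereas a real router runs its own timers.

```python
class DelayedTracker:
    """Toy model of a track object with up/down delays (hypothetical class)."""

    def __init__(self, delay_up=2.0, delay_down=2.0, state="Up"):
        self.delay_up = delay_up
        self.delay_down = delay_down
        self.state = state
        self._pending_since = None  # when the probes first disagreed with state

    def update(self, now, reachable):
        """Feed one probe result taken at time `now` (seconds); return the state."""
        observed = "Up" if reachable else "Down"
        if observed == self.state:
            self._pending_since = None  # the flap ended before the delay expired
            return self.state
        if self._pending_since is None:
            self._pending_since = now   # start counting the delay
        delay = self.delay_up if observed == "Up" else self.delay_down
        if now - self._pending_since >= delay:
            self.state = observed       # the change persisted long enough
            self._pending_since = None
        return self.state


tracker = DelayedTracker(delay_up=2, delay_down=2)
print(tracker.update(0, False))   # Up   (first failure, delay just started)
print(tracker.update(1, True))    # Up   (brief flap, ignored)
print(tracker.update(5, False))   # Up   (failure seen again, delay restarts)
print(tracker.update(10, False))  # Down (failure persisted past the delay)
```

A single lost probe followed by a good one never changes the state, which is the behavior we want on a flapping link.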
R1#show track 1
Track 1
Response Time Reporter 1 reachability
Reachability is Up
1 change, last change 00:08:31
Delay up 2 secs, down 2 secs
Latest operation return code: OK
Latest RTT (millisecs) 84
Now we associate the track with our main default route, and we should be good to go.
R1(config)#ip route 0.0.0.0 0.0.0.0 10.1.4.4 track 1
R1#sh run | i route
ip route 0.0.0.0 0.0.0.0 10.1.4.4 track 1
ip route 0.0.0.0 0.0.0.0 20.1.4.4 200
Now let’s simulate a failure again between R4 and SW1 by
shutting the interface F0/1 on R4. Here’s what happens afterwards on R1
R1#
*Mar 1 01:34:18.587: %TRACKING-5-STATE: 1 rtr 1 reachability Up->Down
R1#
*Mar 1 01:34:18.587: RT: del 0.0.0.0 via 10.1.4.4, static metric [1/0]
*Mar 1 01:34:18.591: RT: delete network route to 0.0.0.0
*Mar 1 01:34:18.591: RT: NET-RED 0.0.0.0/0
*Mar 1 01:34:18.591: RT: NET-RED 0.0.0.0/0
*Mar 1 01:34:18.591: RT: add 0.0.0.0/0 via 20.1.4.4, static metric [200/0]
*Mar 1 01:34:18.591: RT: NET-RED 0.0.0.0/0
*Mar 1 01:34:18.591: RT: default path is now 0.0.0.0 via 20.1.4.4
*Mar 1 01:34:18.595: RT: new default network 0.0.0.0
*Mar 1 01:34:18.595: RT: NET-RED 0.0.0.0/0
R1#show ip route
Gateway of last resort is 20.1.4.4 to network 0.0.0.0
1.0.0.0/32 is subnetted, 1 subnets
C 1.1.1.1 is directly connected, Loopback0
20.0.0.0/24 is subnetted, 1 subnets
C 20.1.4.0 is directly connected, FastEthernet0/1
10.0.0.0/24 is subnetted, 1 subnets
C 10.1.4.0 is directly connected, FastEthernet0/0
S* 0.0.0.0/0 [200/0] via 20.1.4.4
The backup default route is now installed in the routing
table eliminating the Ethernet problem we had before.
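The failover we just saw can be summarized with a short Python sketch: a static route bound to a track object is only a candidate while that track is Up. Again, this is just an illustrative model of the observed behavior; the install_default function and the tuple layout are hypothetical and do not correspond to anything in IOS.

```python
def install_default(routes, track_state):
    """Pick the next hop of the default route that would be installed.

    routes: list of (next_hop, admin_distance, track_id or None) tuples.
    track_state: dict mapping track_id to "Up" or "Down".
    Hypothetical helper for illustration only.
    """
    eligible = [
        r for r in routes
        if r[2] is None or track_state.get(r[2]) == "Up"  # untracked routes always qualify
    ]
    if not eligible:
        return None
    return min(eligible, key=lambda r: r[1])[0]  # lowest admin distance wins


routes = [
    ("10.1.4.4", 1, 1),       # main default route, bound to track 1
    ("20.1.4.4", 200, None),  # backup default route, not tracked
]

print(install_default(routes, {1: "Up"}))    # 10.1.4.4
print(install_default(routes, {1: "Down"}))  # 20.1.4.4
```

When track 1 comes back Up, the main route becomes eligible again and wins on admin distance, so the fallback reverts on its own.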
R1#sh ip sla statistics
Round Trip Time (RTT) for Index 1
Latest RTT: NoConnection/Busy/Timeout
Latest operation start time: *01:35:58.391 UTC Fri Mar 1 2002
Latest operation return code: Timeout
Number of successes: 176
Number of failures: 70
Operation time to live: Forever
The IP SLA statistics show that the failure was due to a timeout. Had the threshold been exceeded instead, the return code would have reported a threshold violation. The latest RTT also indicates that there is no connection.
R1#show track
Track 1
Response Time Reporter 1 reachability
Reachability is Down
4 changes, last change 00:04:01
Delay up 2 secs, down 2 secs
Latest operation return code: Timeout
Tracked by:
STATIC-IP-ROUTING 0
Now we should be able to ping from R1 lo0 to R4 lo0 without
a problem
R1#ping 4.4.4.4 source lo0
Type escape sequence to abort.
Sending 5, 100-byte ICMP Echos to 4.4.4.4, timeout is 2 seconds:
Packet sent with a source address of 1.1.1.1
!!!!!
Success rate is 100 percent (5/5), round-trip min/avg/max = 20/25/36 ms
In the next couple of posts, we'll dig deeper into advanced SLA and tracking configuration, and use it to overcome many of the problems we face in our networks.