Saturday, January 17, 2015

BGP fast-external-fallover

By default, BGP fast external fallover is enabled on Cisco IOS. This feature lets a router declare a directly connected EBGP neighbor down as soon as the carrier signal is lost on the interface used to reach it, instead of waiting for the hold timer to expire. This gives faster convergence after a link failure, since the failure is detected almost instantly. The downside is that a flapping link will reset the BGP session on every flap.


In this simple topology, R1 and R2 are EBGP peers over the directly connected interface between them.

R1#show ip bgp summary
BGP router identifier 10.1.2.1, local AS number 1
BGP table version is 1, main routing table version 1

Neighbor        V           AS MsgRcvd MsgSent   TblVer  InQ OutQ Up/Down  State/PfxRcd
10.1.2.2        4            2       9      10        1    0    0 00:05:23        0



The peering is up and everything seems fine. As mentioned earlier, fast-external-fallover is enabled by default, so when the link between the two routers goes down, the peering goes down immediately.

R1(config)#int f0/0
R1(config-if)#shut
R1(config-if)#
*Jan 17 01:36:11.143: BGP: tbl IPv4 Unicast:base Service reset requests
*Jan 17 01:36:11.143: BGP: tbl IPv4 Multicast:base Service reset requests
*Jan 17 01:36:11.147: BGP: 10.1.2.2 reset due to Interface flap
*Jan 17 01:36:11.151: %BGP-5-ADJCHANGE: neighbor 10.1.2.2 Down Interface flap
*Jan 17 01:36:11.151: %BGP_SESSION-5-ADJCHANGE: neighbor 10.1.2.2 IPv4 Unicast topology base removed from session  Interface flap
*Jan 17 01:36:13.115: %LINK-5-CHANGED: Interface FastEthernet0/0, changed state to administratively down
*Jan 17 01:36:13.419: BGP: Regular scanner timer event
*Jan 17 01:36:13.419: BGP: Performing BGP general scanning
*Jan 17 01:36:13.419: BGP: tbl IPv4 Unicast:base Performing BGP Nexthop scanning for general scan
*Jan 17 01:36:13.419: BGP(0): Future scanner version: 14, current scanner version: 13
*Jan 17 01:36:13.419: BGP: tbl IPv4 Multicast:base Performing BGP Nexthop scanning for general scan
*Jan 17 01:36:13.419: BGP(6): Future scanner version: 15, current scanner version: 14
*Jan 17 01:36:14.115: %LINEPROTO-5-UPDOWN: Line protocol on Interface FastEthernet0/0, changed state to down



The BGP peering went down because of the interface flap between R1 and R2, and this happened in mere milliseconds.
Now we’ll re-enable the interface between them and disable this feature. As a note, the default BGP hold time is three times the 60-second keepalive interval, which is 180 seconds.

R1(config)#router bgp 1
R1(config-router)#no bgp fast-external-fallover
R1(config)#interface f0/0
R1(config-if)#shut
*Jan 17 01:43:45.659: BGP: 10.1.2.2 reset due to BGP Notification sent
*Jan 17 01:43:45.663: %BGP-5-ADJCHANGE: neighbor 10.1.2.2 Down BGP Notification sent
*Jan 17 01:43:45.663: %BGP-3-NOTIFICATION: sent to neighbor 10.1.2.2 4/0 (hold time expired) 0 bytes
*Jan 17 01:43:45.671: BGP: tbl IPv4 Unicast:base Service reset requests
*Jan 17 01:43:45.671: BGP: tbl IPv4 Multicast:base Service reset requests
*Jan 17 01:43:45.675: %BGP_SESSION-5-ADJCHANGE: neighbor 10.1.2.2 IPv4 Unicast topology base removed from session  BGP Notification sent
After three minutes, the BGP peering went down because the hold timer expired.
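
The three-minute delay comes from the BGP timers. As a minimal sketch, assuming the timers have not been changed on R1, the following configuration spells out the default values explicitly (a 60-second keepalive and a 180-second hold time). It should not change behavior; it only makes the numbers visible:

```
router bgp 1
 ! keepalive = 60 s, hold time = 180 s (the IOS defaults, stated explicitly)
 timers bgp 60 180
```

With fast external fallover disabled, the hold time is the longest R1 will keep a dead neighbor up after the last message it received from it. The timers in use for the session can be checked with show ip bgp neighbors 10.1.2.2.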

So the logical question arises: should I leave this feature enabled or disable it? Well, that depends on many factors.
One of them is link quality: if the feature is enabled and the link flaps a lot, every flap resets the session and causes instability in the BGP routing table. Even worse, remote service providers might dampen the routes you’re advertising, and the more they flap, the greater the penalty on those routes.
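
If only one link is unreliable, the behavior does not have to be changed for every EBGP peer, because IOS also has an interface-level form of the command. A minimal sketch, assuming the IOS release in use supports the interface-level command and that f0/0 is the flapping link (the choice of interface is only for illustration):

```
interface FastEthernet0/0
 ! do not reset EBGP sessions over this link when it goes down;
 ! rely on the hold timer instead
 ip bgp fast-external-fallover deny
```

The global setting under router bgp 1 can then stay at its default for every other interface.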