Delayed Internet Routing Convergence. Craig Labovitz, Abha Ahuja, Abhijit Bose, Farnam Jahanian. Presented by Harpal Singh Bassali.


1 Delayed Internet Routing Convergence. Craig Labovitz, Abha Ahuja, Abhijit Bose, Farnam Jahanian. Presented by Harpal Singh Bassali

2 Introduction
• Conventional wisdom: rapid restoration and rerouting in the event of link or router failure.
• Actual convergence time is on the order of minutes.
• What happens to data packets until then? Loss of connectivity, packet loss, and increased latency.

3 Infrastructure
• Used both passive data collection and fault-injection machines.
• Data collected over a two-year period.
• Injected over 250,000 routing faults from diverse locations.
• Used RouteView probes to monitor BGP updates in core Internet routers.
• Active probe machines measured end-to-end performance by sending ICMP echo messages to random web sites.

4 Infrastructure

5 Taxonomy
• Tup: A previously unavailable route is announced as available.
• Tdown: A previously available route is withdrawn.
• Tshort: An active route with a long ASPath is implicitly replaced by a new route with a shorter ASPath.
• Tlong: An active route with a short ASPath is implicitly replaced by a new route with a longer ASPath.
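
The four event types above can be told apart from the route held before and after the change. The following is a minimal sketch, not the authors' tooling: the function name `classify_event` and the representation of an ASPath as a tuple of AS numbers (with `None` meaning "no route") are assumptions made for illustration.

```python
from typing import Optional, Sequence


def classify_event(old_path: Optional[Sequence[int]],
                   new_path: Optional[Sequence[int]]) -> Optional[str]:
    """Classify a route change by comparing the old and new ASPath.

    None stands for "no route available" (hypothetical convention).
    Returns None when the change fits none of the four categories.
    """
    if old_path is None and new_path is None:
        raise ValueError("no route before or after: not a routing event")
    if old_path is None:
        return "Tup"      # previously unavailable route is announced
    if new_path is None:
        return "Tdown"    # previously available route is withdrawn
    if len(new_path) < len(old_path):
        return "Tshort"   # implicit replacement by a shorter ASPath
    if len(new_path) > len(old_path):
        return "Tlong"    # implicit replacement by a longer ASPath
    return None           # same length: outside the taxonomy


print(classify_event(None, (1, 2)))          # Tup
print(classify_event((1, 2), (1, 3, 4, 2)))  # Tlong
```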

6 Routing Measurements Latency Vs Number of BGP updates

7 Observations
• Long-tailed distribution.
• 20% of Tlong and 40% of Tdown events take more than 3 minutes to converge.
• (Tshort, Tup) and (Tlong, Tdown) form equivalence classes.
• A 20-second separation between Tlong and Tdown.
• Tdown and Tlong had twice as many update messages as Tshort and Tup.
• Strong correlation between number of updates and latency.
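
To make the latency and update-count figures concrete, here is a small sketch of how both could be derived from a log of update timestamps. It assumes convergence latency is the time from fault injection to the last BGP update seen for the prefix; that definition, the function names, and the sample numbers are illustrative assumptions, not values from the study.

```python
from typing import Iterable, List, Tuple


def convergence_stats(fault_time: float,
                      update_times: Iterable[float]) -> Tuple[float, int]:
    """Return (latency_seconds, update_count) for one injected fault.

    Assumption: latency is measured from the fault to the last update
    observed after it; updates before the fault are ignored.
    """
    after = [t for t in update_times if t >= fault_time]
    if not after:
        return 0.0, 0
    return max(after) - fault_time, len(after)


def fraction_slower_than(latencies: List[float],
                         threshold: float = 180.0) -> float:
    """Fraction of events whose latency exceeds the threshold (3 min default)."""
    if not latencies:
        return 0.0
    return sum(1 for x in latencies if x > threshold) / len(latencies)


# Made-up timestamps in seconds, for illustration only.
latency, count = convergence_stats(100.0, [90.0, 105.0, 130.0, 250.0])
print(latency, count)
```

Collecting `(latency, count)` pairs over many faults is enough to check both the long tail and the correlation between update count and latency.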

8 Routing Measurements Latency Vs Type of BGP update

9 Observations
• Significant variation in convergence latencies for the ISPs.
• No correlation between convergence latency and geographic or network distance.
• Factors contributing to Internet fail-over delay are independent of network load and congestion.

10 End-to-End Measurements

11 Observations: Packet Loss Vs Type of BGP update
• Less than 1% packet loss throughout the 10-minute period.
• The Tlong event shows 17% packet loss and the Tshort event 32%.
• Wider curve for Tlong due to the slower speed of routing table convergence.

12 Observations: Latency Vs Type of BGP update
• Wider curve for Tlong due to the slower speed of routing table convergence.
• The Tup event had all its packets within 1 minute.

13 BGP Convergence Upper Bound on Convergence

14 Assumptions
• Each AS is a single node.
• We have a complete graph of ASes.
• MinRouteAdver is excluded from the analysis.
• BGP processing is modeled as a single linear, global queue.
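
The assumptions above are simple enough to simulate directly. The sketch below is an illustration under those assumptions, not the authors' model: every AS is one node in a complete graph, there is no MinRouteAdver timer, and all messages pass through one global FIFO queue. Shortest-ASPath selection, receiver-side loop detection, and the name `simulate_tdown` are choices made for this example.

```python
from collections import deque


def simulate_tdown(n):
    """Simulate a Tdown in a clique of n ASes.

    AS 0 originates the route and withdraws it; ASes 1..n-1 are fully
    meshed. Returns (messages_processed, final_best_paths).
    """
    others = list(range(1, n))
    # rib_in[i][j]: ASPath most recently announced to i by neighbor j.
    rib_in = {i: {0: (0,)} for i in others}
    for i in others:
        for j in others:
            if j != i:
                rib_in[i][j] = (j, 0)
    best = {i: (0,) for i in others}

    def select(i):
        # Prefer the shortest ASPath; break ties by tuple order.
        candidates = [p for p in rib_in[i].values() if p is not None]
        return min(candidates, key=lambda p: (len(p), p)) if candidates else None

    # The fault: AS 0 withdraws the route from every neighbor.
    queue = deque((0, i, None) for i in others)
    processed = 0
    while queue:
        sender, receiver, path = queue.popleft()
        processed += 1
        if path is not None and receiver in path:
            path = None  # receiver-side loop detection
        rib_in[receiver][sender] = path
        new_best = select(receiver)
        if new_best != best[receiver]:
            best[receiver] = new_best
            advert = None if new_best is None else (receiver,) + new_best
            for peer in others:
                if peer != receiver:
                    queue.append((receiver, peer, advert))
    return processed, best


for size in (2, 3, 5):
    messages, _ = simulate_tdown(size)
    print(size, messages)
```

Running it for increasing clique sizes shows how stale longer paths keep being explored before every AS finally gives up on the route.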

15 BGP Convergence Upper Bound on Convergence

16 Results
• If loop detection is performed at both the sender and the receiver, all mutual dependencies can be discovered and eliminated in a single round.
• Convergence latency is independent of geographic and network distance.
• Variations in convergence latency are directly related to topological factors such as the length and number of possible paths between ASes.
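
The two places where loop detection can run are sketched below. This is a hedged illustration: the function names are hypothetical, and the reading of "sender side" as "do not announce a path that already contains the peer's AS" is an assumption about what the slide means.

```python
from typing import Sequence


def receiver_accepts(local_as: int, as_path: Sequence[int]) -> bool:
    """Receiver-side check: reject any path that already contains our own AS."""
    return local_as not in as_path


def sender_should_announce(peer_as: int, as_path: Sequence[int]) -> bool:
    """Sender-side check (assumed meaning): skip paths the peer would reject.

    A sender that fails this check would withdraw the route from that
    peer instead of announcing a path the peer must discard.
    """
    return peer_as not in as_path


path = (3, 2, 1)
print(receiver_accepts(2, path))        # False: AS 2 is already on the path
print(sender_should_announce(4, path))  # True: AS 4 is not on the path
```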

17 The Impact of Internet Policy and Topology on Delayed Routing Convergence. Craig Labovitz, Roger Wattenhofer, Srinivasan Venkatachary and Abha Ahuja

18 Major Results
• Internet fail-over convergence =, where n is the length of the longest backup path between source and destination.
• Customers of bigger ISPs exhibit faster convergence.
• Errant paths are frequently explored during delayed convergence.

19 Methodology
• Injected BGP route transitions into more than 10 geographically and topologically diverse providers.
• A set of probe machines actively injected faults at random intervals of roughly 2 hours.
• Generated faults over a six-month period.
• The cooperating providers treated the address space as a customer with respect to policy and filtering.
• Logged periodic routing table snapshots and all BGP updates from an additional 20 ISPs.

20 Inter-provider Relationships
• Peer: Bilateral exchange of customer and backbone routing information. Routes learnt from other peers and upstream providers are not exchanged.
• Customer/Transit: The customer announces its backbone and downstream routes to an upstream provider.
• Backup transit: A peer relationship in which a provider only provides transit after detection of a fault. Both are peers in steady state, but after a failure the backup transit peer begins advertising its now-downstream peer's backbone and customer routes.
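
The peer and customer/transit rules above amount to an export filter keyed on where a route was learned and who it is being sent to. A minimal sketch follows; the table `EXPORT_RULES`, the labels, and the function name are hypothetical. The row for exports toward a customer is not stated on the slide and is included only as a common-practice assumption.

```python
# Key: relationship of the neighbor we are exporting to.
# Value: sources of routes that may be exported to that neighbor.
# "own" stands for the AS's own backbone routes.
EXPORT_RULES = {
    # Customer -> upstream provider: backbone and downstream routes only.
    "provider": {"own", "customer"},
    # Peer <-> peer: customer and backbone routes only; routes learnt
    # from other peers or upstream providers are not exchanged.
    "peer": {"own", "customer"},
    # ASSUMPTION (not on the slide): a provider sends everything to customers.
    "customer": {"own", "customer", "peer", "provider"},
}


def should_export(learned_from: str, neighbor: str) -> bool:
    """Return True if a route learned from `learned_from` may go to `neighbor`."""
    return learned_from in EXPORT_RULES[neighbor]


print(should_export("customer", "peer"))  # True
print(should_export("provider", "peer"))  # False
```

A backup transit link could be modeled as a "peer" entry that is switched to "customer" once a failure is detected.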

21 Relationships

22 Convergence Topologies

23 Observations

24 Conclusions
• Vagabond paths are responsible for delays in convergence.
• The more densely a router is peered, the longer it takes to converge.
• MinRouteAdver is responsible for significant additional latency during delayed convergence.

25 Topology Impact on Convergence

26 Observations
• Long-tailed distribution due to vagabond paths.
• ISP3 exhibits significantly slower convergence times.
• Average convergence latency for a route failure corresponds to the longest possible backup path allowed by policy and topology.
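
Since latency tracks the longest backup path that policy and topology allow, it can help to compute that path for a small topology. The sketch below is an illustration only: the graph, the AS names, and `longest_backup_path` are hypothetical, and a directed edge X -> Y is assumed to mean that policy lets a route pass from X to Y. The search is exhaustive, so it only suits small graphs.

```python
from typing import Dict, List


def longest_backup_path(graph: Dict[str, List[str]],
                        src: str, dst: str) -> List[str]:
    """Return the longest loop-free path from src to dst ([] if none exists)."""
    best: List[str] = []

    def dfs(node: str, path: List[str]) -> None:
        nonlocal best
        if node == dst:
            if len(path) > len(best):
                best = list(path)
            return
        for nxt in graph.get(node, ()):
            if nxt not in path:  # keep the path loop-free
                path.append(nxt)
                dfs(nxt, path)
                path.pop()

    dfs(src, [src])
    return best


# Hypothetical four-AS topology.
topology = {"A": ["B", "C"], "B": ["C", "D"], "C": ["B", "D"], "D": []}
print(longest_backup_path(topology, "A", "D"))  # ['A', 'B', 'C', 'D']
```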

27 Observations (contd.): Latency Vs Longest ASPath explored

28 Provider Type Vs Observed ASPath length

29 Conclusions
• Customers sensitive to fail-over latency should multi-home to larger providers.
• Smaller providers should limit their number of transit and backup transit interconnections.
• The large number of vagabond paths suggests a need for better route validation and authentication mechanisms.

