A Study of Multiple IP Link Failure Fang Yu


1 A Study of Multiple IP Link Failure Fang Yu fyu@eecs.berkeley.edu

2 Motivation: Baltimore Tunnel Fire, 18 July 2001
- “… Keynote Systems … says the July 19 Internet slowdown was not caused by the spreading of Code Red. Rather, a train wreck in a Baltimore tunnel that knocked out a major UUNet cable caused it.”
- “The fire severed two OC-192 links between Vienna, VA and New York, NY as well as an OC-48 link from D.C. to Chicago. … Metromedia routed traffic around the fiber break, relying heavily on switching centers in Chicago, Dallas, and D.C.”
- “Traffic slowdowns were also seen in Seattle, Los Angeles and Atlanta, possibly resulting from re-routing around the affected backbones.”
- “The accident caused certain connections 10 times slower than normal, such as the ones between Washington, D.C., and San Diego.”
Source: R. Katz, “CS294-3: Distributed Service Architectures in Converged Networks”

3 Why Multiple IP Link Failure?
[Figure: layered diagram showing the IP Layer and the Transport Layer. Source: J. Strand, “Optical Network”, with modification]
- Multiple link failure does occur!

4 Methodology
- Use the SSFNet network simulator with the OSPF v2 extension (RFC 2328)
- Use a 24-node, 54-link IP network from SSFNet
- Evaluation metrics
  - Convergence time
  - Number of route changes
  - Loops: record the duration of each routing loop and sum them as the total loop time
  - Invalid routes: record the duration of each route that contains a failed link and sum them as the total invalid route time
  - Unreachable routes: record the total number of routes that fail due to network partitioning
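The three route-pathology metrics above can be illustrated with a minimal Python sketch. It is not the SSFNet instrumentation used in the study. The data layout is an assumption made for illustration: a forwarding table is a dict keyed by `(node, destination)`, and a run is a list of `(time, table)` snapshots. The names `classify_route` and `pathology_durations` are hypothetical.

```python
from typing import Dict, FrozenSet, List, Set, Tuple

# Assumed layout: (current node, destination) -> next-hop node.
Table = Dict[Tuple[str, str], str]


def link(a: str, b: str) -> FrozenSet[str]:
    """Undirected link identifier."""
    return frozenset((a, b))


def classify_route(table: Table, src: str, dst: str,
                   failed: Set[FrozenSet[str]]) -> str:
    """Follow next hops from src to dst and classify the resulting route."""
    visited = {src}
    current = src
    while current != dst:
        nxt = table.get((current, dst))
        if nxt is None:
            return "unreachable"   # no forwarding entry toward dst
        if link(current, nxt) in failed:
            return "invalid"       # route still crosses a failed link
        if nxt in visited:
            return "loop"          # packet would revisit a node
        visited.add(nxt)
        current = nxt
    return "ok"


def pathology_durations(snapshots: List[Tuple[float, Table]], end_time: float,
                        pairs: List[Tuple[str, str]],
                        failed: Set[FrozenSet[str]]) -> Dict[str, float]:
    """Sum, over all (src, dst) pairs, how long each route looped or was invalid."""
    totals = {"loop": 0.0, "invalid": 0.0}
    for i, (t, table) in enumerate(snapshots):
        # A snapshot stays in force until the next one (or the end of the run).
        t_next = snapshots[i + 1][0] if i + 1 < len(snapshots) else end_time
        for src, dst in pairs:
            status = classify_route(table, src, dst, failed)
            if status in totals:
                totals[status] += t_next - t
    return totals


# Toy triangle A-B-C in which link A-C has failed.
failed_links = {link("A", "C")}
run = [
    (0.0, {("A", "C"): "C", ("B", "C"): "C"}),  # A still uses the dead link
    (2.0, {("A", "C"): "B", ("B", "C"): "A"}),  # transient loop A -> B -> A
    (3.0, {("A", "C"): "B", ("B", "C"): "C"}),  # converged
]
totals = pathology_durations(run, 5.0, [("A", "C")], failed_links)
```

In this toy run the route from A to C is invalid for 2 seconds and loops for 1 second before converging; counting unreachable routes would be a matter of tallying the `"unreachable"` classifications instead of summing durations.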

5 A Case Study of Two-Link Failure
- Two links fail simultaneously from time 50 s to 150 s.
  - OSPF detects the first link failure at 81.03 s and converges at 86.04 s
  - OSPF detects the first link coming back up at 156.08 s and converges at 161.1 s
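The timestamps in this case study can be split into a detection delay and a re-convergence delay. The short Python sketch below does only that arithmetic on the numbers from the slide; the helper name `phase_delays` is hypothetical.

```python
# Timestamps (seconds) taken from the case study above.
fail_time, repair_time = 50.0, 150.0
down_detected, down_converged = 81.03, 86.04
up_detected, up_converged = 156.08, 161.10


def phase_delays(event: float, detected: float, converged: float) -> dict:
    """Split the reaction to an event into detection and re-convergence."""
    return {
        "detection": round(detected - event, 2),
        "reconvergence": round(converged - detected, 2),
        "total": round(converged - event, 2),
    }


down = phase_delays(fail_time, down_detected, down_converged)
up = phase_delays(repair_time, up_detected, up_converged)
print("link down:", down)
print("link up:  ", up)
```

Detection takes 31.03 s after the failure but only 6.08 s after the repair, while re-convergence after detection takes about 5 s in both directions (5.01 s and 5.02 s). The slides do not state the timer settings; the long detection delay on failure would be consistent with OSPF waiting for a neighbor dead interval to expire, but that is an assumption here.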

6 LSA Update Messages and Route Pathology Caused at Link Down Time

7 OSPF Convergence Time and Number of Route Changes
- There are dramatic differences between failure cases
- 3-link failure converges more slowly than node failure, although a node failure brings down an average of 4.5 links!
  - Neighbor nodes have somewhat synchronized clocks, so they detect the multiple link failures at almost the same time
- In 55% of cases, node failure generates fewer route changes than multiple link failure

8 Route Pathology
- Node failure causes many routing loops
- Multiple link failure causes more invalid routes

9 Summary of Observations
- Multiple link failure is more problematic than node failure
  - Each node failure brings down
    - an average of 4.5 links
    - all the connections originating from it or destined to it
  - 2-link and 3-link failures bring down on average only about 1/2 or 2/3 as many links as a node failure (2 or 3 links versus 4.5)
  - Yet in 55% of the cases, they create more route changes and 10 times more invalid routes during the OSPF re-route time
  - Different combinations of multiple link failures have dramatically different impacts on OSPF
- Reason: a multiple link failure does not bring down nodes, so OSPF has to re-route a larger number of connections than under a node failure

10 Future Work
- Study real IP networks
  - E.g. the AT&T Common IP Backbone
- Study correlated IP link failures based on the optical topology
  - IP link failures are not randomly correlated
  - A fiber cut will cause multiple link failures
- Propose a multi-layer routing scheme to effectively deploy the IP layer on the optical network
  - Avoid severe multiple IP link failure scenarios
  - Minimize the re-route duration and route pathology under failure
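The idea that IP link failures are correlated through the optical topology can be sketched in Python as a lookup from fiber spans to the IP links that ride on them. This is only an illustration of the concept, not the routing scheme proposed as future work. The link names, fiber names, and the function `links_sharing_fiber` are all hypothetical.

```python
from collections import defaultdict
from typing import Dict, List, Set

# Hypothetical mapping: each IP link -> the fiber spans its lightpath uses.
ip_link_to_fibers: Dict[str, List[str]] = {
    "IP1": ["F1", "F2"],
    "IP2": ["F2"],
    "IP3": ["F3"],
    "IP4": ["F2", "F3"],
}


def links_sharing_fiber(mapping: Dict[str, List[str]]) -> Dict[str, Set[str]]:
    """Invert the mapping: fiber span -> IP links that fail if it is cut."""
    groups: Dict[str, Set[str]] = defaultdict(set)
    for ip_link, fibers in mapping.items():
        for fiber in fibers:
            groups[fiber].add(ip_link)
    return dict(groups)


risk_groups = links_sharing_fiber(ip_link_to_fibers)
for fiber, links in sorted(risk_groups.items()):
    print(f"cut {fiber}: {len(links)} IP link(s) fail -> {sorted(links)}")
```

In this made-up example a cut of `F2` takes down three IP links at once, which is the kind of multiple-link failure scenario a multi-layer scheme would try to avoid.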

