The Impact of Internet Policy and Topology on Delayed Routing Convergence.

Good Old Days: The Internet always was, and still is, poor in reliability, availability, and QoS. For historical reasons, QoS was simply absent at the start: the “Best Effort” principle. E-mail and web surfing did not demand high standards.

Money Time: This CAN’T be tolerated any more. The Internet is to become a major factor in the economy: e-commerce, VoIP, real-time video, etc.

QoS Battle: Efforts to bring QoS to the Internet are enormous, BUT a stable underlying infrastructure MUST exist for any application-level solution!

Bad News: The existing Internet backbone DOES NOT provide rapid restoration and rerouting. There is NO effective inter-domain path fail-over! Fail-over for a single failure takes milliseconds in the PSTN but minutes in the Internet.

It hurts! The impact on performance is huge. While a path is being restored, packet loss is 30 times higher and end-to-end latency is 4 times higher. Some fail-overs take 15 minutes; the average is 3 minutes.

What’s the Problem? Slow convergence during fail-over: after a failure, routing tables oscillate for a long period while seeking a consistent view of the network. WHO is to blame? BGP, the currently used inter-domain routing protocol.

Nasty things about BGP: Being AS-path based, BGP solves the count-to-infinity problem of RIP, but it exacerbates the number of routing table oscillations. With unbounded delay, BGP may explore ALL possible paths after a single failure: O(N!).
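To get a feel for the O(N!) figure, the sketch below counts the loop-free paths between two fixed nodes of a full mesh, that is, the candidate AS paths that could in principle be explored. The function name and the full-mesh setting are assumptions made for illustration; they are not taken from the slides.

```python
import math


def count_simple_paths_full_mesh(n: int) -> int:
    """Count loop-free paths between two fixed nodes in a full mesh of n nodes.

    A path may pass through any ordered selection of k of the other
    n - 2 nodes, so the total is the sum of (n-2)! / (n-2-k)! over k.
    """
    if n < 2:
        raise ValueError("need at least two nodes")
    intermediates = n - 2
    return sum(math.perm(intermediates, k) for k in range(intermediates + 1))


if __name__ == "__main__":
    for n in (4, 6, 8, 10):
        print(n, count_simple_paths_full_mesh(n))
```

The count grows factorially with the number of nodes, which is why exploring every alternative path becomes impractical even for modest N.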

And More … Even assuming bounded delay, BGP convergence for a full-mesh topology without filters is O(N * T_DELAY). N is the number of ASes, and there are 70000 of them in the Internet. T_DELAY is about 30 sec (the recommendation is 30 sec +/- a short random jitter).

And Even More … Autonomous systems can define “unsafe” policies that cause persistent route oscillation.

So What!? All this is interesting in theory but has little to do with reality.

Any-Way? BGP4 as used in Internet routers has bounded delay, provided by the MinRouteAdver timer, which delays the distribution of updates that follow each other too rapidly. So O(N!) performance is irrelevant in the real Internet.

Who cares? BGP divergence has never been observed in practice and remains a theoretical problem. There are modifications to BGP policies that guarantee convergence.

What a Mesh!? The Internet topology is a long way from being a complete mesh. Almost every BGP node filters BGP updates.

Now What? Experimental results indicate fail-over problems caused by poor BGP performance. To study and resolve those problems, much more realistic models of Internet BGP processes should be developed.

Drug “Providers”? The Internet retains a hierarchy with several tiers of ISPs. This hierarchy is defined by commercial relationships: smaller ISPs are customers of bigger ones.

Talk to me … Transit: the upstream provider provides transit service to the customer. Default-free routing tables are passed downstream; customer and backbone routes are passed upstream. Peer: a symmetric connection providing access to each other’s customers. It is never used for transit to other ISPs; only customer and backbone routes are exchanged. Backup transit: normally acts like a peer, and provides transit after a fault is detected.

It is strictly business: The filtering mechanisms of AS border routers are used to enforce those commercial relationships. If you don’t want the other side to use some route, you should not announce it. So: send customer and backbone routes to all peers; provide other routing information (learned from peers and upstream) only to customers.
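The announcement rule above can be sketched as a small decision function. This is a minimal illustration under stated assumptions: the relationship labels and the name `should_announce` are hypothetical, and real routers express this rule through per-session filter configuration rather than a single function.

```python
# Where a route was learned from, or who the neighbor is (hypothetical labels).
OWN = "own"            # the ISP's own backbone routes
CUSTOMER = "customer"
PEER = "peer"
UPSTREAM = "upstream"


def should_announce(route_source: str, neighbor: str) -> bool:
    """Decide whether a route learned from route_source is sent to neighbor.

    Customer and backbone routes go to every neighbor; routes learned
    from peers or upstream providers go to customers only.
    """
    if route_source in (OWN, CUSTOMER):
        return True
    return neighbor == CUSTOMER


if __name__ == "__main__":
    for source in (OWN, CUSTOMER, PEER, UPSTREAM):
        for neighbor in (CUSTOMER, PEER, UPSTREAM):
            print(f"{source:>8} -> {neighbor:<8} {should_announce(source, neighbor)}")
```

Under this rule an ISP never carries traffic between two of its peers or providers, which is exactly the "no free transit" behaviour the commercial relationships call for.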

[Figure: example topology of ASes A to J across Tier 1, Tier 2, and Tier 3, with peer, transit, and back-up links distinguished in the legend.]

No Free Lunches: Transit relations, inbound filters. Prefix filters limit a customer’s announcements to the “legitimate” address space of that customer. Used by 100% of ISPs. The upstream provider is willing to transit routes for its customers only.

Friend to friend: Peer relations, outbound filters. Community filters are based on tagging routes to distinguish customer routes. Only updates for routes tagged as customer routes pass the filter. Used by 73% of ISPs.
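A community-based outbound filter can be sketched as follows. The `Route` type, the helper names, and the community value `65000:100` are all made up for illustration; actual community values are chosen by each ISP.

```python
from dataclasses import dataclass

# Hypothetical community value used to mark customer routes.
CUSTOMER_COMMUNITY = "65000:100"


@dataclass(frozen=True)
class Route:
    prefix: str
    communities: frozenset = frozenset()


def tag_customer_route(route: Route) -> Route:
    """Return a copy of the route carrying the customer community tag."""
    return Route(route.prefix, route.communities | {CUSTOMER_COMMUNITY})


def peer_outbound_filter(routes):
    """Keep only routes tagged as customer routes for announcement to a peer."""
    return [r for r in routes if CUSTOMER_COMMUNITY in r.communities]


if __name__ == "__main__":
    customer = tag_customer_route(Route("192.0.2.0/24"))   # tagged on ingress
    learned_from_peer = Route("198.51.100.0/24")            # never tagged
    print(peer_outbound_filter([customer, learned_from_peer]))
```

The tag is applied once, where the route enters the network from a customer, so the outbound filter does not need to know which prefixes belong to which customer.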

Don’t talk too much: Peer relations, outbound filters (cont.). Prefix filters may also be used to distinguish customer routes. Applying a prefix filter only (done by 13% of ISPs) may create an unintentional back-up transit path.

Check it: Peer relations, outbound filters (cont.). AS-path regular expressions are used to explicitly permit route advertisements. Combining AS-path and prefix filters prevents the creation of an unintentional back-up transit path. Both AS-path and prefix filters are used by 13% of ISPs.

[Figure: the same topology of ASes A to J across Tier 1, Tier 2, and Tier 3, with peer, transit, and back-up links, plus an unintentional back-up link.] Example: in the absence of an AS-path check, the path “D-C” that A learns after the A-D link failure will be announced by A to B, providing an unintentional back-up path from C to B through A.

Trust Me … Peer relations, outbound filters. Generally, ISPs just trust their peers to send only valid information. Only “bogon” filters, which identify generally illegal (private, unallocated, etc.) addresses, are applied. 80% of ISPs use “bogon” filters; 20% use none.

Let Us Introduce … The model of BGP convergence is a directed graph. Each node represents an AS. The model is given for a fixed destination X. The shortest path is chosen. An arc e(u,v) exists iff u informs v about its best route to X (not vice versa). The graph is not symmetric. The topology of the graph differs for different destinations X.

Up And Down: Given X, a client connected to the network by a single arc to node A (the AS of X). Link goes down: T_DOWN is the time elapsed until every node knows there is no path to X (the new stable state). Connection re-established: T_UP is the time elapsed until all nodes have added a route to X to their tables.

What We Want to Hear: After the connection is established, a node learns its best path to X in a time that depends on its shortest path to X (proof by simple induction). T_UP convergence is governed by d, the maximal shortest distance from X to any node: O(d * T_DELAY), where T_DELAY is T_WAIT + T_SEND. T_DELAY may be of the same order as MinRouteAdver, especially if the timer is implemented on a per-peer (not per peer + destination) basis.
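The T_UP bound can be illustrated on a toy graph: a breadth-first search from X gives d, and d * T_DELAY gives the order of magnitude of the convergence time. The graph, the function names, and the choice of 30 seconds for T_DELAY are assumptions for this sketch; the result is a scale estimate, not a measured value.

```python
from collections import deque

T_DELAY = 30  # seconds; assumed per-hop delay, of the order of MinRouteAdver


def shortest_distances(arcs, source):
    """Breadth-first search over directed arcs; returns hop counts from source."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in arcs.get(u, ()):
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def t_up_estimate(arcs, source, t_delay=T_DELAY):
    """Scale estimate d * T_DELAY, with d the maximal shortest distance."""
    d = max(shortest_distances(arcs, source).values())
    return d * t_delay


# Toy topology: arc u -> v means u informs v about its best route to X.
ARCS = {"X": ["A"], "A": ["B", "C"], "B": ["D"], "C": ["D"], "D": ["E"]}

if __name__ == "__main__":
    print(shortest_distances(ARCS, "X"))
    print(t_up_estimate(ARCS, "X"), "seconds")
```

Here the farthest node E is four hops from X, so the estimate is 4 * 30 = 120 seconds.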

And What We don’t … After the A-X link goes down, multiple update messages are sent along the arcs. Nodes will announce back-up paths for which a withdrawal has not yet been received. Generally, updates propagate more slowly along long paths, because each router adds a delay of 0 to 30 sec, and always about 30 sec once the initial update has been received. A simple path from X to A is covered by time T if every node on the path has received an update from the preceding node and re-sent an update to the next node before time T.

Long Down: Node U has no route to X at time T iff all simple paths from X to U are covered. A simple path of length L is covered in O(L * T_DELAY) time. T_DOWN convergence is governed by D, the length of the longest simple path from X to any node: O(D * T_DELAY).
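The T_DOWN bound depends on the longest simple path rather than the shortest one. The sketch below finds D by exhaustive depth-first search, which is only feasible for tiny graphs; the topology, names, and the 30-second T_DELAY are assumptions for illustration.

```python
T_DELAY = 30  # seconds; assumed per-hop delay


def longest_simple_path(arcs, source):
    """Length in arcs of the longest loop-free path starting at source.

    Exhaustive search: exponential in the worst case, fine for toy graphs.
    """
    best = 0

    def dfs(node, visited, length):
        nonlocal best
        best = max(best, length)
        for nxt in arcs.get(node, ()):
            if nxt not in visited:
                visited.add(nxt)
                dfs(nxt, visited, length + 1)
                visited.remove(nxt)

    dfs(source, {source}, 0)
    return best


def t_down_estimate(arcs, source, t_delay=T_DELAY):
    """Scale estimate D * T_DELAY, with D the longest simple path length."""
    return longest_simple_path(arcs, source) * t_delay


# Toy topology: B and C inform each other, which lengthens the longest path.
ARCS = {"X": ["A"], "A": ["B", "C"], "B": ["C", "D"], "C": ["B", "D"]}

if __name__ == "__main__":
    print(longest_simple_path(ARCS, "X"))
    print(t_down_estimate(ARCS, "X"), "seconds")
```

In this graph D is reachable from X in three hops, but the longest simple path (for example X, A, B, C, D) has four arcs, so the withdrawal estimate is governed by 4 rather than 3.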

What Do You Want? Minimize the network diameter to improve T_UP: increase connectivity! Minimize the longest possible paths to improve T_DOWN: decrease connectivity! This is an NP-complete problem. For a full mesh, the diameter is 1 and the longest path is N.
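The tension between the two goals shows up even on five nodes. The sketch below compares a full mesh with a star (one hub, four leaves); both topologies and all function names are assumptions chosen for illustration, and the longest path is reported as the number of nodes it visits.

```python
from collections import deque


def full_mesh(n):
    """Undirected full mesh on nodes 0..n-1 as an adjacency map."""
    return {u: {v for v in range(n) if v != u} for u in range(n)}


def star(n):
    """Undirected star: node 0 is the hub, nodes 1..n-1 are leaves."""
    adj = {0: set(range(1, n))}
    for leaf in range(1, n):
        adj[leaf] = {0}
    return adj


def diameter(adj):
    """Largest shortest-path distance between any two nodes (BFS from each)."""
    worst = 0
    for src in adj:
        dist = {src: 0}
        queue = deque([src])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        worst = max(worst, max(dist.values()))
    return worst


def longest_simple_path_nodes(adj):
    """Number of nodes on the longest loop-free path (exhaustive search)."""
    best = 0

    def dfs(node, visited):
        nonlocal best
        best = max(best, len(visited))
        for nxt in adj[node]:
            if nxt not in visited:
                visited.add(nxt)
                dfs(nxt, visited)
                visited.remove(nxt)

    for start in adj:
        dfs(start, {start})
    return best


if __name__ == "__main__":
    for name, graph in (("full mesh", full_mesh(5)), ("star", star(5))):
        print(name, diameter(graph), longest_simple_path_nodes(graph))
```

The mesh has the smallest possible diameter but a longest path through all five nodes; the star has a slightly larger diameter but a longest path of only three nodes.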

Welcome to the Reality: 6 months of experimental studies. Geographically and topologically diverse BGP sessions with > 20 ISPs. Artificial BGP transitions (announcements and withdrawals) injected at > 10 providers. A broad spectrum of other ISPs surveyed.

Real World Example: A Japanese ISP (ISP4) has BGP peering sessions with providers ISP1, ISP2, and ISP3 at Mae-West. Withdraw route Ri from ISPi. Observe the paths announced by ISP4 in each case.

[Figure: ISP1, ISP5, and ISP4, showing the R1 fault and the steady state.] The only back-up path explored is ISP1 -> ISP5 -> ISP4. The path was explored in 96% of events, 92 sec on average. No path was explored in 4% of events, 32 sec on average.

[Figure: R2 fault and steady state; topology with ISP2, ISP4, ISP5, ISP6, ISP10, ISP11, ISP12, and ISP13, including a “vagabond” path.] No path was explored in 7% of events, 54 sec on average. Only ISP2-ISP5-ISP4 was explored in 63% of events, 79 sec on average. ISP2-ISP5-ISP4 and ISP2-ISP5-ISP6-ISP4 were explored in 7% of events, 88 sec on average. 11 more unique paths appeared in 45 distinct sequences of announcements. Most of them are “vagabond” back-up paths resulting from router configuration errors.

It Was an Easy One … Withdrawing R3 from ISP3 causes exploration of a fairly complex topology. More than 20 distinct paths were announced. Almost 150 different combinations of announcements. Much longer convergence times (~140 sec). Only 35% of those paths are “legitimate”; the rest are “vagabond” unintentional back-up paths.

Do not Interfere! The selection and order of back-up paths depend on the interaction of MinRouteAdver timers on routers. MinRouteAdver is usually implemented on a per-peer (not per peer + address) basis, so earlier instability interferes. For example, in the ISP1 case, in 4% of events the initial delay at ISP4 was longer than the delay needed to propagate the back-up path.

LA to SF via Haifa: Vagabond paths were found in the majority of the 200 monitored ISP pairs. They usually persist for a short period (several days). These erroneous paths do not conform to any intended or published policy. A single error may have global impact, mainly because of the lack of inbound filters on peer connections. Vagabond paths may affect performance and need to be detected automatically!

You call It line? The average convergence delay clearly corresponds to the length of the longest back-up path. Back-up paths are determined by policy and topology. The data contains significant variability, but it suggests a linear relationship.

But Some are more equal: Topology depends on ISP tier. Smaller ISPs typically purchase transit from multiple upstream providers. Smaller ISPs implement back-up transit policies that are unnecessary in large ISPs. Longest legitimate path: 9 ASes for Tier 1, 12 ASes for Tier 2.

This way: Supported by the example provided: ISP1 is a large tier-1 backbone provider, ISP2 is a moderate-sized US-based tier-2 provider, and ISP3 is a regional tier-3 network. Tier-1 and tier-2 topology is much simpler, and their customers are much less affected by fail-over problems.

Now You See: The Internet lacks the level of reliability required by its future role. Route fail-over complexity scales linearly with the longest back-up path for the route. Back-up path length depends on the number of contractual relationships and on policy implementation.

Advice is free: For customers: if you do mission-critical work, connect to large providers. For small ISPs: limit the number of transit and backup transit connections. For all ASes: avoid vagabond paths. Better route validation and authentication mechanisms are needed.

Any Proposals?? Adaptive MinRouteAdver timers? Including additional information in BGP withdrawal messages? Other?
