Reducing Transient Disconnectivity using Anomaly-Cognizant Forwarding Andrey Ermolinskiy, Scott Shenker University of California – Berkeley and ICSI.

1 Reducing Transient Disconnectivity using Anomaly-Cognizant Forwarding Andrey Ermolinskiy, Scott Shenker University of California – Berkeley and ICSI

2 What's the problem?
One of the central goals of the Internet - continuous end-to-end connectivity
BGP convergence is a major cause of connectivity disruption
- Routers operate upon potentially inconsistent local views
- Temporary inconsistencies give rise to anomalies such as loops and black holes that disrupt end-to-end packet delivery

3 Example: transient routing loop with BGP
[Figure: topology of domains A, B, C, D, E, F, G annotated with ranked routes to A (1. BA 2. CBA; 1. BA 2. DBA; 1. CBA 2. DBA; 1. ECBA 2. GA) and the event "withdraw BA"]

4 Example: transient routing loop with BGP
[Figure: topology of domains A, B, C, D, E, F, G annotated with ranked routes to A (1. BA 2. CBA; 1. BA 2. DBA; 1. CBA 2. DBA; 1. ECBA 2. GA) and the event "withdraw BA"]
Routing loop between C and D incurs temporary loss of connectivity between {B, C, D, E, F} and A.
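The loop on this slide can be reproduced with a few lines of Python. This is a minimal sketch, not the authors' simulator: the assignment of each route table to a domain (C holds BA and DBA, D holds BA and CBA, and so on) is inferred from the figure labels, and the names `ROUTES_TO_A`, `withdraw_ba`, and `trace` are hypothetical. It models the moment after B withdraws BA, when only B's neighbors C and D have reacted.

```python
# Ranked AS paths to A as learned from neighbors; the first character of a
# path is the next hop. The table-to-domain mapping is an assumption.
ROUTES_TO_A = {
    "B": ["A"],
    "C": ["BA", "DBA"],
    "D": ["BA", "CBA"],
    "E": ["CBA", "DBA"],
    "F": ["ECBA", "GA"],
    "G": ["A"],
}


def withdraw_ba(routes):
    """Return the state after B loses its route and C, D drop the path BA."""
    updated = {domain: list(paths) for domain, paths in routes.items()}
    updated["B"] = []
    for domain in ("C", "D"):
        updated[domain] = [p for p in updated[domain] if p != "BA"]
    return updated


def trace(src, routes, dst="A"):
    """Follow best routes hop by hop; report delivery, a loop, or no route."""
    path, seen, current = [src], {src}, src
    while current != dst:
        paths = routes.get(current, [])
        if not paths:
            return path, "no route"
        current = paths[0][0]  # next hop of the most preferred path
        path.append(current)
        if current in seen:
            return path, "loop"
        seen.add(current)
    return path, "delivered"


during_convergence = withdraw_ba(ROUTES_TO_A)
```

Calling `trace("F", during_convergence)` follows F, E, C, D and then returns to C, so F is cut off from A even though it holds the working alternate GA.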

5 Related Work
Shrinking the convergence time window through BGP protocol extensions
- Ghost flushing
- Consistency assertions
Protecting end-to-end packet delivery from adverse effects of convergence
- R-BGP: forward packets on pre-computed failover paths, propagate root cause information to prevent loops
- Consensus Routing: enforce a globally-consistent view via distributed snapshots and strategically delay adoption of incoming BGP updates
- Anomaly-Cognizant Forwarding

6 Anomaly-Cognizant Forwarding (ACF)
Approach
- Accept routing anomalies as an unavoidable fact
- Protect end-to-end packet delivery by detecting and recovering from anomalies on the forwarding path
Main hypothesis
- Several simple and lightweight extensions to conventional IP forwarding enable us to sustain packet delivery during periods of BGP instability
  - without the use of pre-computed backup paths
  - without modifying the core routing protocol or altering its timing dynamics

7 ACF Overview
Domain S has anomalous forwarding state for destination D if S's outgoing packets destined for D arrive back at S as a result of a routing loop.
Main idea of ACF:
- Detect occurrences of anomalous state
- Avoid forwarding packets via domains that are known to have anomalous state
Each packet carries a list of prior AS-level hops (pathTrace)
Each packet carries a blackList of domains with anomalous state
[Figure: domains S and D illustrating anomalous forwarding state; packet header with pathTrace and blackList fields]

8 ACF Overview
Forward(packet p) {
    if (localASNum in p.pathTrace)
        Move loop elements from p.pathTrace to p.blackList
    nextHop ← lookupNextHop(p.destAddr)
    if (nextHop in p.blackList)
        Invoke the control plane, look for alternate non-blacklisted routes in the RIB
    if (nextHop != NONE) {
        Append localASNum to p.pathTrace
        SendPacket(p, nextHop)
    } else
        Initiate recovery-mode forwarding for p
}
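The pseudocode above can be sketched as runnable Python. This is an illustration under stated assumptions rather than the authors' implementation: the `Packet` class, the `forward` signature, and the RIB layout (destination mapped to a ranked list of next-hop domains) are hypothetical, and "loop elements" is read as every pathTrace entry from the first occurrence of the local AS onward, which matches the header states in the worked example.

```python
from dataclasses import dataclass, field


@dataclass
class Packet:
    dest: str
    path_trace: list = field(default_factory=list)
    black_list: set = field(default_factory=set)


def forward(local_as, packet, rib):
    """Run one ACF forwarding step at domain local_as.

    rib maps a destination to a ranked list of next-hop domains (assumed
    layout). Returns the chosen next hop, or None when the caller should
    initiate recovery-mode forwarding.
    """
    # Loop detection: our own AS number is already in the trace.
    if local_as in packet.path_trace:
        start = packet.path_trace.index(local_as)
        packet.black_list.update(packet.path_trace[start:])
        del packet.path_trace[start:]

    candidates = rib.get(packet.dest, [])
    next_hop = candidates[0] if candidates else None

    # Preferred next hop is blacklisted: look for an alternate in the RIB.
    if next_hop in packet.black_list:
        next_hop = next(
            (hop for hop in candidates if hop not in packet.black_list), None
        )

    if next_hop is not None:
        packet.path_trace.append(local_as)
    return next_hop
```

For example, a packet with `path_trace = ["C", "D"]` arriving back at C, whose only route to A is via D, ends up with `black_list = {"C", "D"}`, an empty trace, and a `None` result that signals recovery mode.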

9 ACF Recovery-mode forwarding
If a router is unable to forward a packet because it does not have a valid non-blacklisted route, it initiates recovery forwarding.
- Chooses a recovery destination R from a static and well-known set of highly-connected Tier-1 domains.
- Detours the packet through R.
Intuition: R or some router along the path to R may know a working alternate route to the original destination.
[Figure: normal-mode and recovery-mode forwarding paths; recovery destinations R1 and R2; nextHop=NONE at the router that initiates recovery]
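The header rewrite implied by recovery mode can be sketched as follows. The field names mirror the dst and origDst labels used in the worked example on the later slides; everything else is an assumption made for illustration. In particular, the slide does not say how R is picked from the well-known set, so this sketch takes the first candidate that is not blacklisted, and it clears pathTrace on each mode switch because the worked example shows an empty trace at those points.

```python
from dataclasses import dataclass, field
from typing import Optional

# Placeholder names; the slide only says the set is static and well known.
RECOVERY_DESTINATIONS = ("R1", "R2")


@dataclass
class Packet:
    dst: str
    orig_dst: Optional[str] = None
    path_trace: list = field(default_factory=list)
    black_list: set = field(default_factory=set)


def enter_recovery(packet, recovery_set=RECOVERY_DESTINATIONS):
    """Redirect the packet toward a recovery destination.

    Returns False when every candidate is blacklisted, in which case the
    packet cannot be detoured.
    """
    choice = next((r for r in recovery_set if r not in packet.black_list), None)
    if choice is None:
        return False
    packet.orig_dst = packet.dst  # remember where the packet was going
    packet.dst = choice
    packet.path_trace.clear()
    return True


def resume_normal(packet):
    """Restore the original destination once a working route is found."""
    packet.dst, packet.orig_dst = packet.orig_dst, None
    packet.path_trace.clear()
```

A router on the detour that holds a non-blacklisted route to `orig_dst` would call `resume_normal` and forward the packet as usual.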

10 Anomaly-Cognizant Forwarding
[Figure, repeated on slides 10 to 23: topology of domains A, B, C, D, E, F, G annotated with ranked routes to A (1. BA 2. CBA; 1. BA 2. DBA; 1. CBA 2. DBA; 1. ECBA 2. GA), with packet p in flight]
p.Header: pathTrace = [ C ], blackList = { }, dst = A, origDst = (empty)

11 Anomaly-Cognizant Forwarding
p.Header: pathTrace = [ C D ], blackList = { }, dst = A, origDst = (empty)

12 Anomaly-Cognizant Forwarding
p.Header: pathTrace = [ C D ], blackList = { D }, dst = A, origDst = (empty)
C initiates recovery forwarding through domain F

13 Anomaly-Cognizant Forwarding
p.Header: pathTrace = [ ], blackList = { C D }, dst = F, origDst = A
C initiates recovery forwarding through domain F

14 Anomaly-Cognizant Forwarding
p.Header: pathTrace = [ ], blackList = { C D }, dst = F, origDst = A
C initiates recovery forwarding through domain F

15 Anomaly-Cognizant Forwarding
p.Header: pathTrace = [ C ], blackList = { C D }, dst = F, origDst = A
C initiates recovery forwarding through domain F

16 Anomaly-Cognizant Forwarding
p.Header: pathTrace = [ C ], blackList = { C D }, dst = F, origDst = A
C initiates recovery forwarding through domain F

17 Anomaly-Cognizant Forwarding
p.Header: pathTrace = [ C ], blackList = { C D E }, dst = F, origDst = A
C initiates recovery forwarding through domain F

18 Anomaly-Cognizant Forwarding
p.Header: pathTrace = [ C E ], blackList = { C D E }, dst = F, origDst = A
C initiates recovery forwarding through domain F

19 Anomaly-Cognizant Forwarding
p.Header: pathTrace = [ C E ], blackList = { C D E }, dst = F, origDst = A
C initiates recovery forwarding through domain F

20 Anomaly-Cognizant Forwarding
p.Header: pathTrace = [ ], blackList = { C D E }, dst = F, origDst = A
C initiates recovery forwarding through domain F
F resumes normal-mode forwarding

21 Anomaly-Cognizant Forwarding
p.Header: pathTrace = [ F ], blackList = { C D E }, dst = F, origDst = A
C initiates recovery forwarding through domain F
F resumes normal-mode forwarding

22 Anomaly-Cognizant Forwarding
p.Header: pathTrace = [ F G ], blackList = { C D E }, dst = F, origDst = A
C initiates recovery forwarding through domain F
F resumes normal-mode forwarding

23 Anomaly-Cognizant Forwarding
p.Header: pathTrace = [ F G ], blackList = { C D E }, dst = F, origDst = A
C initiates recovery forwarding through domain F
F resumes normal-mode forwarding

24 Anomaly-Cognizant Forwarding
[Figure: topology of domains A, B, C, D, E, F, G]
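The walkthrough on slides 10 to 23 can be replayed end to end with a small simulation. This is a sketch built on several assumptions, not the authors' code: the per-domain next hops in `RIB` are inferred from the figure's route tables and from the path the packet takes, F is the only recovery destination, and a domain in recovery mode that has no non-blacklisted route to the original destination adds itself to the blackList before forwarding onward (an interpretation that matches E appearing in the blackList on slide 17). All names are hypothetical.

```python
# Ranked next hops per domain and destination after B withdraws BA.
RIB = {
    "C": {"A": ["D"], "F": ["E"]},
    "D": {"A": ["C"]},
    "E": {"A": ["C", "D"], "F": ["F"]},
    "F": {"A": ["E", "G"]},
    "G": {"A": ["A"]},
}
RECOVERY = ["F"]  # assumed recovery destination set


def usable(domain, dest, black):
    """Best next hop toward dest that is not blacklisted, else None."""
    hops = RIB.get(domain, {}).get(dest, [])
    return next((h for h in hops if h not in black), None)


def deliver(src, dest, ttl=32):
    """Return (domains visited, sorted blackList), or None if dropped."""
    at, dst, orig = src, dest, None
    trace, black, visited = [], set(), [src]
    for _ in range(ttl):
        if orig is None and at == dst:
            return visited, sorted(black)
        if at in trace:  # loop detected: blacklist the loop members
            start = trace.index(at)
            black.update(trace[start:])
            del trace[start:]
        if orig is not None:  # recovery mode
            if usable(at, orig, black) is not None:
                dst, orig, trace = orig, None, []  # resume normal mode
            else:
                black.add(at)
        hop = usable(at, dst, black) if at != dst else None
        if hop is None and orig is None:  # initiate recovery forwarding
            target = next((r for r in RECOVERY if r not in black), None)
            if target is None:
                return None
            dst, orig, trace = target, dst, []
            hop = usable(at, dst, black)
        if hop is None:
            return None
        trace.append(at)
        at = hop
        visited.append(at)
    return None


result = deliver("C", "A")
```

Under these assumptions `result` shows the packet visiting C, D, C, E, F, G and finally A, with C, D, and E blacklisted along the way.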

25 ACF: Observations
ACF does not use pre-computed failover paths
- Discovers alternate routes dynamically using state in the packet header
- The two forwarding modes make use of the same forwarding table
Paths to recovery destinations are not assumed to be stable and anomaly-free
- We protect recovery-mode forwarding using the same mechanism (pathTrace and blackList)

26 ACF: Preliminary Evaluation
Evaluation metrics
- Effectiveness in eliminating transient disconnectivity
- Efficiency of alternate paths
- Packet header overhead

27 ACF: Preliminary Evaluation
Simulation methodology
- CAIDA AS-level topology (27969 nodes) annotated with inferred inter-AS relationships
- 12937 multihomed edge domains, 29426 adjacent provider links
- Provider link failure experiment
  For each multihomed domain D, and each provider link L:
  - Fail L and simulate packet delivery from every other domain to D during convergence
Recovery destinations = 10 highly-connected Tier-1 ISPs
Packet TTL = 32 hops
[Figure: multihomed domain D with source domains S1, S2, S3, S4]

28 ACF: Preliminary Evaluation
Transient disconnection after a link failure
- BGP with conventional forwarding
  - 51% of failure cases produce unwarranted disconnection
  - Widespread disconnection (>50% of ASes) in 17% of cases
- BGP with ACF
  - No disconnection in 92% of failure cases
  - <1% of ASes see disconnection in 98% of failure cases

29 ACF: Preliminary Evaluation
Transient path efficiency
- Causes of path dilation in ACF
  - Transient loops
  - Detouring via a recovery destination
- In 65% of failure cases that produce disconnectivity, ACF recovers packets using ≤ 2 extra hops
- 9% of cases require 7 hops or more
F: failure cases that produce transient disconnection with conventional forwarding

30 ACF: Preliminary Evaluation
Packet header overhead
Maximum number of pathTrace and blackList entries in a representative sample of failure cases:
% of ASes disconnected: 0%, 0.09%, 0.9%, 9%, 90%
pathTrace length: 1116 2013
blackList length: 4119 16
- Worst-case pathTrace: 20 entries
  - 40 bytes of overhead assuming 16-bit AS numbers
- Worst-case blackList: 16 entries
  - 10 bytes of overhead for a Bloom filter with 1% error rate
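The header overhead figures can be checked with a short calculation. This is a sketch under assumptions: the slides do not say how the Bloom filter is sized, so the code uses the standard optimal-size formula, m = -n ln(p) / (ln 2)^2 bits for n entries at false-positive rate p, and the function names are hypothetical. The pathTrace figure matches the slide exactly (20 entries at 2 bytes each is 40 bytes). For the blackList, this formula gives about 20 bytes for 16 entries at a 1% rate, while 10 bytes corresponds to roughly 8 entries, so the slide's 10-byte figure may rest on different sizing assumptions.

```python
import math


def path_trace_bytes(entries, as_number_bits=16):
    """Bytes needed to carry a list of fixed-width AS numbers."""
    return entries * as_number_bits // 8


def bloom_filter_bytes(entries, error_rate):
    """Bytes for an optimally sized Bloom filter (standard formula)."""
    bits = math.ceil(-entries * math.log(error_rate) / math.log(2) ** 2)
    return math.ceil(bits / 8)


worst_case_trace = path_trace_bytes(20)
worst_case_black_list = bloom_filter_bytes(16, 0.01)
```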

31 Challenges / Concerns
Feasibility of deployment
- ACF adds fields to packet header and modifies core IP forwarding logic.
Packet processing overhead
- Control plane is invoked only during periods of instability
- Common case: check pathTrace and blackList. Both operations admit efficient implementation in hardware and parallelization.
ACF and routing policies

32 Thank you. Questions?

