4/12/2015
Today
Known problems of BGP
–Multi-homing
–Instability
–Delayed convergence (slow failover)
Discussing fixes
–Root cause notification, ghost flushing, etc.
Failover
BGP is designed for scaling more than for fast failover
–Many mechanisms favor scaling in this tradeoff
–Route flap damping, for example: if a route changes too often (“flapping”), the router ignores it for some time. This has unexpected effects on convergence times.
–Route advertisement/withdrawal timers are in the 30-second range
–Effect: recovery from “simple” failures takes tens of seconds to many minutes
–15-30 minute outages are not uncommon
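The damping idea above can be sketched as a per-route penalty that grows with each flap and decays exponentially over time. This is only an illustration: the class name `FlapDamper` and every parameter value (penalty per flap, suppress and reuse thresholds, half-life) are assumptions chosen for the example, not values taken from the slides.

```python
class FlapDamper:
    """Illustrative flap-damping state for a single route.

    All default values below are assumptions picked for the example.
    """

    def __init__(self, penalty_per_flap=1000.0, suppress_at=2000.0,
                 reuse_at=750.0, half_life_s=900.0):
        self.penalty_per_flap = penalty_per_flap
        self.suppress_at = suppress_at
        self.reuse_at = reuse_at
        self.half_life_s = half_life_s
        self.penalty = 0.0
        self.last_t = 0.0
        self.suppressed = False

    def _decay(self, now):
        # The penalty halves every half_life_s seconds.
        elapsed = now - self.last_t
        self.penalty *= 0.5 ** (elapsed / self.half_life_s)
        self.last_t = now

    def flap(self, now):
        """Record one route change at time `now` (seconds)."""
        self._decay(now)
        self.penalty += self.penalty_per_flap
        if self.penalty >= self.suppress_at:
            self.suppressed = True

    def is_suppressed(self, now):
        """True while the route is being ignored."""
        self._decay(now)
        if self.suppressed and self.penalty < self.reuse_at:
            self.suppressed = False
        return self.suppressed
```

With these assumed values, three flaps one minute apart suppress the route, and it stays suppressed long after the flapping stops. That is one way damping can stretch a short instability into a long outage.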
Multi-homing
Connect to multiple providers
–Goal: higher availability, more capacity
Problems:
–Provider-based addressing breaks
–Everyone needs their own address space
Multi-homing increases routing table size
[Figure: Multi-home.com, holding a /16, connects to both ISP1 and ISP2, each of which has its own /8. ISP1 advertises "You can reach (its /8) and (the /16) via ISP1". ISP3's routing table lists the /16 and a /8 via ISP1, and the /16 and a /8 via ISP2.]
Global routing tables continue to grow
Source: http://bgp.potaroo.net/as6447/
Other BGP problems
Convergence: BGP may explore many routes before finding the right new one.
–Labovitz et al., SIGCOMM 2000
Correctness: routes may not be valid, visible, or loop-free.
Security: There is none!
–Some providers filter what announcements their customers can make. Not all do.
–See paper discussion site for pointers
Internet Routing Instability
Goal: measure how often BGP sends updates to change routes
Methodology:
–Analyze BGP logs collected over a long period
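Terms
WADiff: a route is withdrawn, then announced again as a different route
AADiff: a route is announced, then replaced by an announcement of a different route
WADup: a route is withdrawn, then the same route is announced again
AADup: a route is announced, then the same route is announced again
WWDup: a withdrawal is repeated for a route that is already withdrawn
CPS 214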
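The taxonomy above can be sketched as a small classifier over an update stream. This is an illustration under stated assumptions: the tuple layout `(peer, prefix, kind, attrs)`, the function name `classify_stream`, and the plain labels `"A"` and `"W"` for ordinary first announcements and withdrawals are all hypothetical, and a withdrawal for a prefix the peer never announced is counted as WWDup.

```python
def classify_stream(updates):
    """Label each update with its category from the taxonomy.

    updates: iterable of (peer, prefix, kind, attrs), where kind is
    "A" (announcement) or "W" (withdrawal). The layout is hypothetical.
    """
    last_attrs = {}  # (peer, prefix) -> attributes of the last announcement
    withdrawn = {}   # (peer, prefix) -> True if the last update was a withdrawal
    labels = []
    for peer, prefix, kind, attrs in updates:
        key = (peer, prefix)
        if kind == "W":
            # Withdrawing something already withdrawn, or never announced.
            if withdrawn.get(key, True):
                labels.append("WWDup")
            else:
                labels.append("W")
            withdrawn[key] = True
            continue
        if key not in last_attrs:
            labels.append("A")  # first announcement seen for this route
        else:
            same = attrs == last_attrs[key]
            if withdrawn[key]:
                labels.append("WADup" if same else "WADiff")
            else:
                labels.append("AADup" if same else "AADiff")
        last_attrs[key] = attrs
        withdrawn[key] = False
    return labels
```

Only the "Diff" categories and the plain announcements and withdrawals carry new routing information. The "Dup" categories repeat what the receiver already knows.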
Observed pathologies
Repeated WWDup, WADup, AADup
Why are they pathologies?
The majority of BGP updates are WWDups
WWDups come from ASes that never announced the withdrawn routes
Why?
–Stateless BGP implementations do not remember what they have sent to peers
–So they send withdrawals to all peers
Possible origins of instability
Stateless BGP
Physical link errors
Unjittered timers
IGP/BGP interactions
Conflicting routing policies
Data analysis techniques
Time series analysis
Frequency analysis
–Fast Fourier transform
–Maximum entropy spectral estimation
The two estimation methods differ, but both find significant frequencies at periods of seven days and 24 hours
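To show how frequency analysis exposes daily and weekly cycles, here is a minimal sketch that applies a plain discrete Fourier transform to an hourly series. The series is synthetic and built for the example, not data from the paper, and the function name `dominant_periods` is an assumption. A real analysis would use an FFT library rather than this quadratic-time loop.

```python
import cmath
import math


def dominant_periods(samples, top=2):
    """Return the periods (in samples) of the strongest frequency components."""
    n = len(samples)
    mean = sum(samples) / n
    centered = [x - mean for x in samples]  # drop the constant component
    power = []
    for k in range(1, n // 2 + 1):
        coeff = sum(x * cmath.exp(-2j * math.pi * k * t / n)
                    for t, x in enumerate(centered))
        power.append((abs(coeff) ** 2, n / k))
    power.sort(reverse=True)
    return [period for _, period in power[:top]]


# Synthetic hourly update counts over four weeks: a daily cycle plus a
# weaker weekly cycle. These numbers are made up for illustration.
hours = 4 * 7 * 24
series = [100 + 30 * math.sin(2 * math.pi * t / 24)
          + 20 * math.sin(2 * math.pi * t / 168)
          for t in range(hours)]
```

Calling `dominant_periods(series)` on this synthetic input recovers the 24-hour and 168-hour periods, which mirrors the kind of result reported on the slide.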
Main results
Far more updates than expected
–99% are pathological. Impressive!
–A taxonomy to analyze pathologies
Speculation about causes
–Configuration errors, router bugs
–Correlation with traffic load, perhaps due to router architectures
–Open research question: the root cause of the updates
Motivated much follow-up work
End-to-end routing behavior
Goals: study routing pathologies, route stability, and routing asymmetry
Methodology:
–End-to-end measurements
–Traceroutes from N sites, N^2 paths
–Exponentially spaced-out sampling
Nice properties!
Unbiased
PASTA (Poisson Arrivals See Time Averages): the fraction of measurements that observe a given state equals the fraction of time the system spends in that state
–Two datasets, D1 and D2, with different mean sampling intervals: 1~2 days, 2 hrs, 2.75 days
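The PASTA property can be checked with a small simulation. The sketch below assumes a made-up system that is in a "bad" state for the first 15 minutes of every hour. It compares measurements spaced by exponentially distributed gaps against measurements taken exactly once an hour. All names and numbers are assumptions for illustration.

```python
import random


def poisson_sample_times(mean_interval, horizon, rng):
    """Measurement times whose gaps are exponentially distributed."""
    t = 0.0
    times = []
    while True:
        t += rng.expovariate(1.0 / mean_interval)
        if t >= horizon:
            return times
        times.append(t)


def in_bad_state(t, period=60.0, bad_len=15.0):
    """Hypothetical system: 'bad' for the first 15 of every 60 minutes."""
    return (t % period) < bad_len


rng = random.Random(1)
times = poisson_sample_times(mean_interval=10.0, horizon=1_000_000.0, rng=rng)
observed = sum(in_bad_state(t) for t in times) / len(times)

# Strictly periodic sampling once per hour always lands on the same phase.
periodic_times = range(0, 1_000_000, 60)
periodic = sum(in_bad_state(float(t)) for t in periodic_times) / len(periodic_times)
```

Here `observed` lands close to the true time fraction of 0.25, while `periodic` is 1.0 because every periodic sample hits the bad window. That bias is what exponentially spaced sampling avoids.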
Measurement infrastructure
37 hosts
8% of Internet at that time
Pair-wise traceroute
Pathologies
Persistent loops: 0.13-0.16%, some lasted hours
–Same router appearing in traceroute more than three times
Erroneous routing: packets to UCL sent to Israel
Mid-stream change: 0.16% (D1) and 0.44% (D2)
–Suggests a route change during the measurement
Infrastructure failure: availability 99.8%, 99.5%
–Host unreachable
–Telephone networks: 2 hours of downtime in 40 years, five nines
Outages: more than six packet losses in a traceroute
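The availability figures above are easier to compare as downtime per year. The sketch below does that arithmetic. The helper names are assumptions, and a year is taken as 365 days.

```python
HOURS_PER_YEAR = 365 * 24  # 8760, ignoring leap years


def downtime_hours_per_year(availability):
    """Expected hours of downtime per year at a given availability."""
    return (1.0 - availability) * HOURS_PER_YEAR


def availability_from_downtime(down_hours, years):
    """Availability implied by a total downtime over a number of years."""
    return 1.0 - down_hours / (years * HOURS_PER_YEAR)
```

With these definitions, 99.8% availability is about 17.5 hours of downtime per year and 99.5% is about 43.8 hours. Two hours in 40 years is an availability above 99.999%, which is where "five nines" comes from.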
Stabilities
Prevalence: the probability of observing a particular route
Persistence: how long a route lasts
Examples:
–R1, R2, R1, R2
–R1, R1, R2, R2
–Same prevalence, different persistence
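Prevalence of dominant route
$P_{\mathrm{dom}}^{p} = k_p / n_p$, where $n_p$ is the number of traceroutes of path $p$ and $k_p$ is the number of those that observed the dominant route
Internet paths are strongly dominated by one path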
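The two example sequences can be run through a short sketch to see why prevalence and persistence differ. The function names are assumptions. Counting route changes between consecutive observations is only a crude stand-in for persistence, which strictly concerns how long a route lasts in time.

```python
from collections import Counter


def prevalence(observations):
    """Fraction of observations in which each route was seen."""
    counts = Counter(observations)
    n = len(observations)
    return {route: c / n for route, c in counts.items()}


def dominant_prevalence(observations):
    """k_p / n_p: share of observations showing the most common route."""
    counts = Counter(observations)
    return max(counts.values()) / len(observations)


def route_changes(observations):
    """Number of changes between consecutive observations."""
    return sum(1 for a, b in zip(observations, observations[1:]) if a != b)


alternating = ["R1", "R2", "R1", "R2"]
grouped = ["R1", "R1", "R2", "R2"]
```

Both sequences give each route a prevalence of 0.5, but `route_changes(alternating)` is 3 and `route_changes(grouped)` is 1, so routes in the second sequence persist longer.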
Path asymmetry
Common
–Don’t assume path symmetry in your design
–49% of measurements have asymmetric paths that differ by at least one city
–30% observed different ASes
–20% differ by more than one city/AS
–Q: what might cause it?
Comments on this paper
Seminal work on Internet measurement
Solid data
Rigorous analysis
Delayed Internet Convergence
Methodologies:
Measurement
Problem discovery
Modeling & analysis
Improvement
Experimental setup
Actively inject BGP faults
–How is a fault injected?
Passively listen at peering sessions, and use NTP-synchronized machines to calculate the convergence time
Actively send probe packets to observe end-to-end packet loss and latency
Much later BGP work uses similar measurement techniques.
Results show delayed convergence
Bad news travels slow.
Slow routing convergence results in poor end-to-end performance
What causes the delayed routing convergence?
A simple BGP convergence model reveals that, in the worst case, all possible paths are explored before a prefix is withdrawn.
Model assumptions: no minimum advertisement timer, synchronized network, global message queue
[Figure: ASes 0, 1, and 2 with destination R. Each step shows a node's route table as a triple, with * marking the chosen route and ∞ marking an unreachable entry, for example (*0R, 1R, ∞), (∞, *1R, 2R), (∞, ∞, *2R), and finally (∞, ∞, ∞). Messages exchanged include 01R, 10R, and 20R.]
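Path exploration can be reproduced with a small synchronous-round simulation. The setup is an assumption made for the example, not the exact model from the paper: a full mesh of `n` ASes that each start with a direct route to R and lose it at the same moment, shortest-path selection with ties broken toward the lower AS number, and standard receiver-side loop rejection.

```python
def simulate_withdrawal(n):
    """Full mesh of n ASes that all lose their direct route to R at once.

    Returns (rounds, tried): the number of synchronous message rounds
    until the network is quiet, and the alternate paths each AS adopted.
    """
    nodes = range(n)
    # rib_in[i][j]: last path AS j announced to AS i (None = withdrawn).
    rib_in = {i: {j: (j,) for j in nodes if j != i} for i in nodes}
    best = {i: () for i in nodes}  # () stands for the direct route to R
    tried = {i: [] for i in nodes}
    rounds = 0
    while True:
        outbox = {}
        for i in nodes:
            # The direct route is gone; ignore paths that contain ourselves.
            usable = [p for p in rib_in[i].values()
                      if p is not None and i not in p]
            new_best = min(usable, key=lambda p: (len(p), p)) if usable else None
            if new_best != best[i]:
                best[i] = new_best
                if new_best is not None:
                    tried[i].append(new_best)
                outbox[i] = None if new_best is None else (i,) + new_best
        if not outbox:
            return rounds, tried
        rounds += 1
        # Deliver every message at once (synchronized network).
        for sender, path in outbox.items():
            for receiver in nodes:
                if receiver != sender:
                    rib_in[receiver][sender] = path
```

For three ASes, AS 2 first falls back to the stale path through AS 0, then to the longer stale path through AS 0 and AS 1, and only then gives up. Larger meshes take more rounds because there are more stale paths to try.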
The Minimum Route Advertisement Interval (MRAI) timer reduces message count
Why?
–MRAI introduces synchronization. Multiple announcements are combined into one announcement, reducing the total message count.
However, the convergence time becomes proportional to timer_interval * (n-3)
Let’s brainstorm…
How can we fix the slow convergence problem?
–What is the solution proposed by the authors?
Sender-side loop detection. When a sender detects a loop, it sends a withdrawal to the neighbor immediately. Since a withdrawal is not subject to the MRAI delay, this improvement reduces both message count and convergence time.
–What exactly is the root cause of BGP’s slow convergence problem?
–Can you come up with any solution?
Sender-side loop detection
Without sender-side loop detection:
–AS3->AS1: 301R
–This announcement is sent out when the MRAI timer expires
With sender-side loop detection:
–AS3->AS1: withdrawal
–The withdrawal is sent out immediately. AS1 knows it has no path.
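The sender-side check in this example fits in a few lines. In this sketch the function name, the message tuples, and the path representation (a tuple of AS numbers from the next hop toward R, with R itself left implicit) are assumptions for illustration.

```python
def message_for_neighbor(my_as, best_path, neighbor_as):
    """Decide what to send a neighbor after the best path changes.

    best_path: tuple of AS numbers from our next hop toward the
    destination, or None if we have no route.
    """
    if best_path is None or neighbor_as in best_path:
        # The neighbor would reject this path as a loop anyway, so tell
        # it right away that we have nothing usable for it.
        return ("WITHDRAW", None)
    return ("ANNOUNCE", (my_as,) + best_path)
```

For AS3 with path 0 1 R, `message_for_neighbor(3, (0, 1), 1)` yields a withdrawal for AS1, while a hypothetical neighbor AS2 would still get the announcement 3 0 1 R.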
BGP assertion
Detect path inconsistency between different neighbors
If an inconsistency is found, give paths learned from direct neighbors higher priority
Sensitive to topology
Does not eliminate all invalid paths
[Figure: example topology with nodes R, N1, N2, X, Y and destination D, with path labels "N1 X Y D", "N1 D", and "N2 N1 D" at R.]
Ghost flushing
If the new path is worse than the last announced path, and the router advertisement timer has not expired yet, send a withdrawal immediately.
The withdrawal flushes “ghost” information.
Reduces the convergence time because withdrawals are not delayed by MRAI, but does not help much with “Tlong” (failover to a longer path).
[Figure: the same topology with nodes R, N1, N2, X, Y and destination D. R now sends a withdrawal. Path labels "N1 D" and "N2 N1 D" remain.]
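The ghost-flushing rule can be written as a single decision function. In this sketch "worse" is taken to mean a longer AS path, which is a simplifying assumption. The function name and message tuples are hypothetical as well.

```python
def on_best_path_change(new_path, last_announced, mrai_running):
    """Return the messages to send right now after the best path changes.

    Paths are tuples of AS numbers; None means no route. A longer path
    counts as worse (an assumption made for this sketch).
    """
    if new_path is None:
        return [("WITHDRAW", None)]  # withdrawals are never delayed
    if not mrai_running:
        return [("ANNOUNCE", new_path)]
    if last_announced is not None and len(new_path) > len(last_announced):
        # Ghost flushing: neighbors still hold the better path we
        # announced earlier, which is now stale. Flush it immediately;
        # the new announcement follows when the MRAI timer expires.
        return [("WITHDRAW", None)]
    return []  # announcement waits for the MRAI timer
```

The neighbor still has to wait for the timer before it learns the replacement path, which is why the rule speeds up withdrawals more than failover to a longer path.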
BGP root cause notification
Neither BGP assertion nor ghost flushing works well in this topology.
–Why?
–BGP assertion: 3 and 6 are both direct neighbors of 5, but their announcements may be inconsistent
–BGP ghost flushing: the newer path is subject to the MRAI delay
Explicitly send out link up/down information
Essentially adds link-state information to BGP
A sequence number is used to order the notifications.
Open research problem: can you get rid of the sequence number?
[Figure: topology with nodes 0 through 6, with a failed link marked X.]
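The role of the sequence number can be sketched as follows. This is a loose illustration, not the mechanism from any specific proposal: the class `RcnTable`, its methods, and the rule that a notification with an older sequence number is ignored are all assumptions. The AS numbers in the usage note are hypothetical and do not reproduce the slide's topology.

```python
def links_of(path):
    """Set of undirected links traversed by an AS path."""
    return {frozenset(pair) for pair in zip(path, path[1:])}


class RcnTable:
    """Per-router state for a simplified root cause notification scheme."""

    def __init__(self):
        self.paths = {}        # neighbor AS -> AS path (tuple, neighbor first)
        self.link_seq = {}     # link -> highest sequence number seen
        self.down_links = set()

    def learn(self, neighbor, path):
        """Accept a path unless it crosses a link known to be down."""
        if links_of(path) & self.down_links:
            return False
        self.paths[neighbor] = path
        return True

    def notify(self, a, b, is_up, seq):
        """Process a link up/down notification; stale ones are ignored."""
        link = frozenset((a, b))
        if seq <= self.link_seq.get(link, -1):
            return False
        self.link_seq[link] = seq
        if is_up:
            self.down_links.discard(link)
        else:
            self.down_links.add(link)
            # Drop every path that depends on the failed link at once,
            # instead of exploring them one by one.
            self.paths = {n: p for n, p in self.paths.items()
                          if link not in links_of(p)}
        return True
```

If a router holds the paths 3 2 1 0 and 6 4 0 and hears that link 1-0 is down with sequence number 7, it drops the first path immediately. A late "up" notification with sequence number 6 is ignored, which is the ordering problem the sequence number solves.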
Summary
BGP’s slow convergence problem and other problems
Convergence represents a tradeoff among message overhead, processing overhead, and latency.
We do not yet know the best solution to this problem.
Comments
Measurement paper
–Data
–Data collection techniques
–Data analysis
When to write a measurement paper