Routing Convergence Global Routing Internet Routing Convergence An Experimental Study of Delayed Internet Routing Convergence Craig Labovitz, Abha Ahuja,

Presentation transcript:

Routing Convergence Global Routing

Internet Routing Convergence An Experimental Study of Delayed Internet Routing Convergence Craig Labovitz, Abha Ahuja, Farnam Jahanian, Abhijit Bose ACM Sigcomm September 2000

Hierarchical Routing -- Review scale: with 50 million destinations: can’t store all dest’s in routing tables! routing table exchange would swamp links! administrative autonomy internet = network of networks each network admin may want to control routing in its own network Untruths about Internet Routing: all routers identical network “flat” … not true in practice

Hierarchical Routing aggregate routers into regions, “autonomous systems” (AS) routers in same AS run same routing protocol –“intra-AS” routing protocol –routers in different AS can run different intra-AS routing protocol gateway routers: special routers in AS –run intra-AS routing protocol with all other routers in AS –also responsible for routing to destinations outside AS –run inter-AS routing protocol with other gateway routers

Intra-AS and Inter-AS routing Gateways: perform inter-AS routing amongst themselves perform intra-AS routing with other routers in their AS [Figure: ASes A, B and C with gateway routers A.a, A.c, B.a and C.b; inter-AS and intra-AS routing in gateway A.c sit above the network, link and physical layers]

Intra-AS and Inter-AS routing [Figure: Host h1 and Host h2 attached to the AS topology of ASes A, B and C, with gateway routers A.a, A.c, B.a and C.b] Intra-AS routing within AS A Inter-AS routing between A and B Intra-AS routing within AS B

AS graphs obscure topology! The AS graph may look like this. Reality may be closer to this… Tim Griffin, Leiden 2000

Inter-AS routing (cont) BGP (Border Gateway Protocol): the de facto standard Path Vector protocol: an extension of Distance Vector Each Border Gateway broadcasts to its neighbors (peers) the entire path (i.e., sequence of ASs) to the destination For example, Gateway X may store the following path to destination Z: Path (X,Z) = X,Y1,Y2,Y3,…,Z

Inter-AS routing (cont) Now, suppose Gwy X sends its path to peer Gwy W Gwy W may or may not select the path offered by Gwy X, because of cost, policy ($$$$) or loop prevention reasons. If Gwy W selects the path advertised by Gwy X, then: Path (W,Z) = W, Path (X,Z) Note: path selection is based not so much on cost (e.g., # of AS hops), but mostly on administrative and policy issues (e.g., do not route packets through competitor’s AS)
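
The accept-and-prepend step above can be sketched in a few lines of Python. This is an illustrative sketch only, not BGP's actual decision process: the function name accept_path and the blocked set (standing in for a "do not route through this AS" policy) are assumptions introduced here.

```python
from typing import Optional, Tuple

ASPath = Tuple[str, ...]


def accept_path(own_as: str, advertised: ASPath,
                blocked: frozenset = frozenset()) -> Optional[ASPath]:
    """Return the path own_as would install, or None if it rejects the offer.

    Hypothetical helper: `blocked` models a simple policy of ASes to avoid.
    """
    if own_as in advertised:        # loop prevention: we are already on the path
        return None
    if blocked & set(advertised):   # policy: path crosses an AS we refuse to use
        return None
    return (own_as,) + advertised   # Path(W,Z) = W, Path(X,Z)


path_x_z = ("X", "Y1", "Y2", "Z")
print(accept_path("W", path_x_z))                       # ('W', 'X', 'Y1', 'Y2', 'Z')
print(accept_path("Y2", path_x_z))                      # None (loop)
print(accept_path("W", path_x_z, frozenset({"Y1"})))    # None (policy)
```

Real BGP speakers rank the surviving candidates with a longer list of attributes; the sketch only shows why a path can be refused before any ranking happens.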

Inter-AS routing (cont) Peers exchange BGP messages using TCP. OPEN msg opens TCP connection to peer and authenticates sender UPDATE msg advertises new path (or withdraws old) KEEPALIVE msg keeps connection alive in absence of UPDATES; it also serves as ACK to an OPEN request NOTIFICATION msg reports errors in previous msg; also used to close a connection

Why different Intra- and Inter-AS routing ? Policy: Inter is concerned with policies (which provider we must select/avoid, etc). Intra is contained in a single organization, so, no policy decisions necessary Scale: Inter provides an extra level of routing table size and routing update traffic reduction above the Intra layer Performance: Intra is focused on performance metrics; needs to keep costs low. In Inter it is difficult to propagate performance metrics efficiently (latency, privacy etc). Besides, policy related information is more meaningful. We need BOTH!

What is Routing Policy? Description of the routing relationship between autonomous systems –Who are the peers? –What routes are Originated by a peer? Imported from each peer? Exported to each peer? Preferred when multiple routes exist? –What to do if no route exists?

The example I mentioned earlier Date: Fri, 25 Apr :16: (CDT) Subject: ** ALERT – Massive Routing Failures *** At about 10:30 AM today, one of Sprints customers (AS7007, Florida Internet Exchange) began announcing a /24 route for every CIDR block in the core routing table. This was due to a configuration problem in that they imported all their routing into a classfull interior routing protocol and then redistributed the route back into BGP, becoming a source for the first class C network in every CIDR block. Sprint does no border routing filters, so they happily accepted these routes and gave them away to all…

Motivation Why should we care about convergence? Routing reliability/fault-tolerance on small time scales (minutes) not previously a priority Emerging transaction oriented and interactive applications (e.g. Internet Telephony) will require higher levels of end2end network reliability How well does the Internet routing infrastructure tolerate faults?

Conventional Routing Wisdom The Internet is designed to survive a nuclear cataclysm. Internet routing is robust under faults –Supports path re-routing and restoral on the order of seconds The internet supports fast path rerouting and restoral. BGP has good convergence properties –Does not exhibit looping/bouncing problems of RIP Internet fail-over will improve with faster routers and faster links More redundant connections (multi-homing) to Internet will always improve site fault-tolerance

Contribution Labovitz et al show that most of the conventional wisdom about routing convergence is not accurate… –Measurement of BGP convergence in the Internet –Analysis/intuition behind delayed BGP routing convergence –Modifications to BGP implementations which would improve convergence times

Motivation Why has fail-over and fault-tolerance not previously been a priority? –Applications like not delay sensitive and possess fault-tolerance –TCP/IP fault-tolerance (resend) –Content replication helps improve reliability for static content Network support is required for emerging transaction oriented and interactive applications (e.g. Internet Telephony, QoS)

Building a Reliable Internet What Network support has been proposed already? Significant recent improvement on data-link fail-over (e.g. SRP, Sonet). Solves some enterprise, intra-domain reliability problems Also significant research on QoS and resource reservation protocols for the Internet –But, all of these protocols assume stable underlying IP forwarding path

Background Internet sites multi-home, or purchase connectivity from multiple Internet providers to improve fault tolerance –Goal: tolerate a single link, router or ISP failure –35% of Internet end-sites currently multi-homed

Background: Multi-homing Sprint Verio INTERNET BGP

PSTN versus Internet Public Switched Telephone Network (PSTN) is the “other” network in place. Trade-off between –scalability/extensibility/low cost and –fault-tolerance/service guarantees/high cost PSTN retains significant intermediate state (i.e. circuit setup) and services on relatively few nodes. A “Smart Network” Internet places all intelligence on end-nodes. A “Stupid Network”

Trade-Offs [Table rating the PSTN as High or Low on: Scalability, Flexibility, Distributed Operation, State, Reliability, Service Guarantees, Development Time, Switch Cost, Coordination; the individual cell values did not survive extraction]

Routing Unlike circuit-switched PSTN, packet-switched Internet uses hop-by-hop forwarding and next-hop selection Global state and circuit-setup used in PSTN –this is like owning an atlas and planning route Internet routers only keep local knowledge and routes learned from neighbors –like asking directions at each stop

Internet Routing Inter-domain Internet routing protocols are distance vector (i.e. Bellman-Ford) algorithms. Unlike PSTN, no pre-computed backup paths! Distance vector protocols are problematic –Require time to converge –Suffer from “counting to infinity”

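Problems with Distance Vector Protocols Counting to Infinity [Figure: nodes A and B and destination R, with a Node/Distance table; the advertised distance to R climbs through R=3, R=5, R=7 as A and B keep learning the route from each other]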
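
A minimal Python sketch of counting to infinity, under assumptions that are not in the slides: a chain A - B - R with unit link costs, no split horizon or poison reverse, strictly alternating updates, and RIP's convention that 16 means unreachable.

```python
INFINITY = 16  # RIP treats a metric of 16 as "unreachable"


def count_to_infinity(infinity: int = INFINITY):
    """Replay the A/B exchange after the B-R link fails.

    Returns the list of (node, new_distance_to_R) updates, in order.
    """
    # Before the failure: B is 1 hop from R, A is 2 hops (via B).
    dist = {"A": 2, "B": 1}
    history = []
    turn = "B"  # B notices the failure first and falls back on A's stale route
    while True:
        other = "A" if turn == "B" else "B"
        # Each node believes its neighbor's advertisement and adds one hop.
        dist[turn] = min(dist[other] + 1, infinity)
        history.append((turn, dist[turn]))
        if dist["A"] >= infinity and dist["B"] >= infinity:
            break
        turn = other
    return history


for node, d in count_to_infinity():
    print(node, d)
```

The distances climb 3, 4, 5, ... until both nodes reach 16, which is the odd-numbered progression for B (R=3, R=5, R=7) that the figure labels suggest.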

Internet Routing The Internet inter-domain routing protocol, BGP, “solves” the count-to-infinity problem by keeping a record of the path the route announcement has traveled through the network Internet routing commonly (and incorrectly) believed to converge within 30 seconds

BGP Routing [Figure: the announcement for route R gains one AS per hop as it propagates: “AS1 R”, then “AS2 AS1 R”, then “AS3 AS2 AS1 R”]

BGP Open Question After a fault in a path to multi-homed site, how long does it take for the majority of Internet routers to fail-over to the secondary path? –Routing table convergence (backbone routers reach steady-state) after a fault –End-to-end paths stable (“normal” levels of loss and latency) [Figure labels: Customer, Primary ISP, Backup ISP, BGP, TRAFFIC]

Internet Fail-Over Experiments Instrument the Internet –Inject routes into geographically and topologically diverse provider BGP peering sessions (Mae-West, Japan, Michigan, London) –Periodically fail and change these routes (i.e. send withdraws or new attributes) –Monitor the impact of faults through 1) recordings of BGP peering sessions with 20 tier1/tier2 ISPs and 2) active ICMP ECHO measurements (512 byte/second to 100 random web sites) –Write lots of Perl scripts –Wait two years… (125,000 routing events)

Experiment (For the Last Two Years)

Fault Scenarios Tup -- A new route is advertised Tdown -- A route is withdrawn (i.e. single-homed failure) Tshort -- Advertise a shorter/better ASPath (i.e. primary path repaired) Tlong -- Advertise a longer/worse ASPath (i.e.primary path fails)
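
The four scenarios can be told apart by comparing the ASPath before and after an injected event. The sketch below is an assumption-laden illustration: it uses ASPath length as the only measure of "better/worse", the function name classify_event is invented here, and the AS numbers come from the range reserved for documentation.

```python
from typing import Optional, Tuple

ASPath = Tuple[int, ...]


def classify_event(old: Optional[ASPath], new: Optional[ASPath]) -> str:
    """Label a route change with the slide's scenario names.

    None means "no route". Only path length is compared (an assumption).
    """
    if old is None and new is None:
        raise ValueError("no route before or after: nothing to classify")
    if old is None:
        return "Tup"       # a new route is advertised
    if new is None:
        return "Tdown"     # the route is withdrawn
    if len(new) < len(old):
        return "Tshort"    # shorter/better ASPath (primary path repaired)
    if len(new) > len(old):
        return "Tlong"     # longer/worse ASPath (primary path fails)
    return "same-length"   # not one of the four scenarios


primary = (64500,)                 # hypothetical AS numbers
backup = (64501, 64502, 64500)
print(classify_event(None, primary))     # Tup
print(classify_event(primary, None))     # Tdown
print(classify_event(backup, primary))   # Tshort
print(classify_event(primary, backup))   # Tlong
```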

Major Convergence Results Routing convergence requires an order of magnitude longer than expected (10s of minutes) Routes converge more quickly following Tup/Repair than Tdown/Failure events (“bad news travels more slowly”) Curiously, withdrawals (Tdown) generate several times as many BGP updates as announcements (Tup)

Example of BGP Convergence
TIME  BGP Message/Event
10:40:30  Route Fails/Withdrawn by AS
:41:  announce
:41:  announce
:41:  announce
:42:  announce
:43:  announce
:43:  announce
:43:  sends withdraw
BGP log of updates from AS2117 for route via AS2129 One BGP withdrawal triggers 6 announcements and one withdrawal from 2117 Increasing ASPath length until final withdraw

CDF of BGP Routing Table Convergence Times [Figure: CDF curves labeled Short->Long Fail-Over, New Route, Long->Short Fail-over, Failure] Less than half of Tdown events converge within two minutes Tup/Tshort and Tdown/Tlong form equivalence classes Long tailed distribution (up to 15 minutes)

Impact of Delayed Convergence Why do we care about routing table convergence? It deleteriously impacts end-to-end Internet paths ICMP experiment results –Loss of connectivity, packet loss, latency, and packet re-ordering for an average of 3-5 minutes after a fault –Why? Routers drop packets for which they do not have a valid next hop. Also problems with cache flushing in some older routers.

End-to-End Impact Failover ICMP loss to 100 randomly chosen web sites with VIF source address of our probe Tlong/Tshort exhibit similar relationship as before

Delayed Convergence Background Well known that distance vector protocols exhibit poor convergence behaviors –Counting to infinity, looping, bouncing problem RIP redefines infinity and adds split-horizon, poison reverse, etc. –Still, slow convergence and not scalable BGP advertises ASPaths instead of distance –Solves counting to infinity and RIP looping problem, but… –BGP can still explore “invalid” paths during convergence (i.e. the bouncing problem)

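BGP Convergence Example
[Figure: nodes R, AS0, AS1, AS2, AS3; BGP table entries for route R, with * marking the selected route]
*B R via 3, B R via 03, B R via 23
*B R via 3, B R via 03, B R via 13
*B R via 3, B R via 13, B R via 23
[Column labels: AS0, AS1, AS2]
*B R via 203, *B R via 013, B R via 103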

N > 4? AS1673 AS237 AS5696 AS2497 AS1239 AS6453 AS701 AS2914 AS6461 AS5000 AS6113 AS

MinRouteAdver Rounds Implementation of MinRouteAdver timer and receiver-side loop detection timer leads to 30 second rounds O(n-3)*30 seconds time complexity
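
To make the bound concrete, the arithmetic can be worked for a few mesh sizes. A small sketch, assuming the 30 second timer quoted in the slide; the function name is introduced here for illustration.

```python
MIN_ROUTE_ADVER_SECONDS = 30  # timer value quoted in the slide


def min_route_adver_bound(n: int, timer: int = MIN_ROUTE_ADVER_SECONDS) -> int:
    """Delay in seconds from (n - 3) rounds of one timer interval each."""
    rounds = max(n - 3, 0)  # no extra rounds for fewer than 4 ASes
    return rounds * timer


for n in (4, 5, 10):
    print(n, min_route_adver_bound(n))   # 4 30, then 5 60, then 10 210
```

The delay grows linearly in n: each additional AS in the mesh adds one more 30 second round.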

An Experiment with SSF.OS.BGP4 The Model –Topology: full mesh of N ASes, each with just 1 router –No route filtering –Shortest path is best Advertise, Withdraw, Wait and Watch –Wait for system to reach stable state, then … –AS #1 advertises a bogus destination to everyone else –Wait for system to reach a stable state again, then … –AS #1 tells everyone that the bogus route is not reachable through it any more –Wait for system to reach a stable state again
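
The advertise, withdraw, wait and watch procedure can be imitated with a toy simulator. The sketch below is not SSF.OS.BGP4 and will not reproduce its numbers: it assumes a single FIFO message queue, no MinRouteAdver timer, receiver-side loop detection, and shortest path with a lowest-path tie-break. It counts every update sent after the withdrawal, including AS 1's own withdrawals and messages addressed to AS 1.

```python
from collections import deque


def simulate_withdrawal(n: int) -> int:
    """Full mesh of ASes 1..n; AS 1 advertises a prefix, then withdraws it.

    Returns the number of updates sent from the withdrawal until the
    system is stable again.
    """
    nodes = range(1, n + 1)
    rib_in = {v: {} for v in nodes}   # rib_in[v][u]: path last heard from u
    best = {v: None for v in nodes}   # currently selected path per AS
    queue = deque()                   # (src, dst, path); path None = withdraw
    sent = 0

    def send_all(src, path):
        nonlocal sent
        for dst in nodes:
            if dst != src:
                queue.append((src, dst, path))
                sent += 1

    def decide(v):
        # Drop paths that already contain v (loop detection), prefer shortest.
        candidates = [p for p in rib_in[v].values() if v not in p]
        new = min(candidates, key=lambda p: (len(p), p)) if candidates else None
        if new != best[v]:
            best[v] = new
            send_all(v, None if new is None else (v,) + new)

    def run_until_stable():
        while queue:
            src, dst, path = queue.popleft()
            if path is None:
                rib_in[dst].pop(src, None)
            else:
                rib_in[dst][src] = path
            if dst != 1:              # the origin ignores learned routes
                decide(dst)

    send_all(1, (1,))                 # AS 1 advertises the bogus destination
    run_until_stable()
    sent = 0                          # count only post-withdrawal traffic
    send_all(1, None)                 # AS 1 withdraws it
    run_until_stable()
    return sent


for n in range(2, 8):
    print(n, simulate_withdrawal(n))
```

Even in this simplified model the remaining ASes first fall back on each other's stale, longer paths before giving up, which is the path exploration the slides go on to describe.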

[Table: for each N, the average number of updates due to withdrawal (with range) and the longest-path convergence time after withdrawal (sec); the only values that survived extraction are the ranges (35-84) and (58-397)]

snd update to wds=bogus
snd update to wds=bogus
snd update to wds=bogus
snd update to wds=bogus
snd update to nlri=bogus,asp=
snd update to wds=bogus
snd update to wds=bogus
snd update to nlri=bogus,asp=
snd update to wds=bogus.

The Problem with BGP If we assume –unbounded delay on BGP processing and propagation –Full mesh of BGP peers –Constrained shortest path first selection algorithm BGP is O(N!), where N is the number of default-free BGP speakers There exists a possible ordering of messages such that BGP will explore all possible ASPaths of all possible lengths

BGP and RIP RIP precisely monotonically increasing. Can explore metrics (1…N) BGP monotonically increasing. Multiple (N!) ways to represent a path metric of N. BGP “solved” RIP routing table loop problem by making it exponentially worse…
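
The gap between RIP's N metrics and BGP's factorial number of paths can be checked by brute force. A sketch, assuming a full mesh of n ASes and counting the loop-free AS paths from one AS to one destination AS; the function names are illustrative.

```python
from itertools import permutations
from math import factorial


def loop_free_paths(n: int) -> int:
    """Count loop-free paths between two fixed ASes in a full mesh of n ASes.

    A path picks k of the other (n - 2) ASes as intermediates, in some order.
    """
    others = n - 2
    return sum(factorial(others) // factorial(others - k)
               for k in range(others + 1))


def enumerate_paths(n: int):
    """List the same paths explicitly (AS 0 is the source, AS n-1 the destination)."""
    middle = range(1, n - 1)
    return [(0,) + p + (n - 1,)
            for k in range(len(middle) + 1)
            for p in permutations(middle, k)]


print(enumerate_paths(4))   # 5 paths: direct, via 1, via 2, via 1-2, via 2-1
for n in (4, 6, 10):
    print(n, loop_free_paths(n))
```

For n = 10 the count is already 109601, whereas a distance metric over the same mesh can take at most n distinct values.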

BGP Best Case What is the best we can expect from BGP? Implementation of MinRouteAdver timer leads to 30 second rounds Time complexity is O(n-3)*30 seconds State/Computational complexity O(n) At its best, BGP performs as well as RIP2 (but uses exponentially more memory in the process)

MinRouteAdver Minimum interval between successive updates sent to a peer for a given prefix –Allow for greater efficiency/packing of updates –Rate throttle Applied only to announcements (at least according to BGP RFC) Applied on (prefix destination, peer) basis, but implemented on (peer) basis
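
A minimal sketch of such a timer as a rate limiter, kept on the (peer, prefix) basis and applied to announcements only, as the slide reads the RFC. The class and method names are hypothetical and do not come from any router implementation; as the slide notes, real implementations often keep the timer per peer instead.

```python
class MinRouteAdverTimer:
    """Hold back repeated announcements for the same (peer, prefix)."""

    def __init__(self, interval: float = 30.0):
        self.interval = interval
        self.last_sent = {}   # (peer, prefix) -> time of last announcement
        self.pending = {}     # (peer, prefix) -> newest held-back path

    def announce(self, now: float, peer: str, prefix: str, path: tuple) -> bool:
        """Return True if the announcement may be sent right away."""
        key = (peer, prefix)
        last = self.last_sent.get(key)
        if last is None or now - last >= self.interval:
            self.last_sent[key] = now
            self.pending.pop(key, None)
            return True
        self.pending[key] = path   # packing: only the newest path is kept
        return False

    def withdraw(self, now: float, peer: str, prefix: str) -> bool:
        """Withdrawals are not throttled; any held announcement is dropped."""
        self.pending.pop((peer, prefix), None)
        return True

    def flush(self, now: float):
        """Return (peer, prefix, path) for held announcements whose timer expired."""
        ready = []
        for key, path in list(self.pending.items()):
            if now - self.last_sent[key] >= self.interval:
                ready.append((key[0], key[1], path))
                self.last_sent[key] = now
                del self.pending[key]
        return ready


timer = MinRouteAdverTimer()
print(timer.announce(0, "peerA", "bogus", (1,)))        # True: first update
print(timer.announce(10, "peerA", "bogus", (2, 1)))     # False: held back
print(timer.announce(20, "peerA", "bogus", (3, 2, 1)))  # False: replaces the held path
print(timer.flush(30))                                  # [('peerA', 'bogus', (3, 2, 1))]
```

Holding updates this way gives the packing benefit the slide mentions: the intermediate path (2, 1) is never sent.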

MinRouteAdver 30*(N-3) delay due to creation of mutual dependencies. Provide proof that N-3 rounds necessarily created during bounded BGP MinRouteAdver convergence Rounds due to –Ambiguity in the BGP RFC and lack of receiver-side loop detection –Inclusion of BGP withdrawals with MinRouteAdver (in violation of RFC)

Simulation Results

Intuition for Delayed BGP Convergence There exists a possible ordering of messages such that BGP will explore ALL possible ASPaths of ALL possible lengths –BGP is O(N!), where N is the number of default-free BGP speakers in a complete graph with default policy Although seemingly very different protocols, BGP and RIP share very similar convergence behaviors. Major difference: –RIP explores metrics (1…N) –BGP ASPath provides multiple ways to represent metric (path) of length N, or (N-1)!

Lower Bound on BGP If we assume optimal ordering of messages, what is the best we can expect from BGP? In practice, BGP timers (MinRouteAdver) provide synchronization and limit possible orderings of messages –MinRouteAdver timer specifies interval between successive updates sent to a peer for a given prefix –Useful for bundling updates together –According to RFC, MinRouteAdver applies only to announcements But, interaction of MinRouteAdver and vendor ASPath loop detection implementation introduces “artificial” delay

Conclusions Internet does not possess effective inter-domain fail-over (15 minutes is a long time for a phone call) Majority of BGP convergence delay due to vendor implementation decisions of MinRouteAdver and loop detection In practice, Internet is not a complete graph and same degree of message re-ordering unlikely. Our current work: –What is the impact of ISP policy and topology on BGP convergence? –Can we improve BGP convergence times?