i-4 routing scalability Taekyoung Kwon Some slides are from Geoff Huston, Michalis Faloutsos, Paul Barford, Jim Kurose, Paul Francis, and Jennifer Rexford
outline What is routing? Current Internet routing –Focus on BGP Routing scalability A case study in IP routing: ViAggre What is the design space?
routing How do packets get from A to B in the Internet? [Figure: hosts A and B attached to the Internet] What is routing
One example Connectionless forwarding –Each router (switch) makes a LOCAL decision to forward the packet towards B [Figure: hosts A and B connected through routers R1 to R8]
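The hop-by-hop idea above can be sketched in a few lines of Python. This is a minimal illustration, not a router implementation: the `FORWARDING` tables and the links between routers are assumptions, since the slide's figure does not show which routers are connected.

```python
# Hypothetical per-router forwarding tables: destination -> next hop.
# The chosen links are an assumption made for illustration only.
FORWARDING = {
    "A":  {"B": "R1"},
    "R1": {"B": "R2"},
    "R2": {"B": "R3"},
    "R3": {"B": "B"},
}


def forward(src, dst, tables, max_hops=16):
    """Follow each node's LOCAL next-hop decision until dst is reached."""
    path = [src]
    node = src
    while node != dst:
        if len(path) > max_hops:
            raise RuntimeError("hop limit exceeded (possible forwarding loop)")
        next_hop = tables.get(node, {}).get(dst)
        if next_hop is None:
            raise LookupError(f"{node} has no forwarding entry for {dst}")
        path.append(next_hop)
        node = next_hop
    return path


print(forward("A", "B", FORWARDING))  # ['A', 'R1', 'R2', 'R3', 'B']
```

No node in the sketch knows the whole path; each one only looks up its own next hop, which is the point of connectionless forwarding.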
Routing is… How does each router know the correct local forwarding decision for any possible destination address? –Through information about the network topology –This information is maintained by a routing protocol Information –Table size * update rate
Routing taxonomy Distributed* vs. centralized Static vs. dynamic* –# of hops vs. traffic load Intra-domain vs. inter-domain
Goals of internet routing Inter-connection Fault-tolerant Scalability performance …. Current Internet routing
Internet routing: two levels Autonomous system (AS) level –Inter-domain –BGP Router level –Intra-domain –RIP, OSPF,…
Internet structure Original idea [Figure: a backbone service provider, a large corporation, small corporations, and “consumer” ISPs]
Internet structure The reality is… Source: Arbor Networks * Why peering?
Internet structure And many tiers [Figure: Tier 1 ISP, Tier-2 ISP, Tier 3 ISP, and local ISPs]
Internet routing Prefix is advertised across ASs Client SNU /16 Path: 6, 5, 4, 3, 2, 1
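A rough sketch of how such a path could build up, assuming a simple chain of ASs in which AS 1 originates the prefix and each AS prepends its own number before passing the advertisement on. The chain topology and the function name `propagate` are assumptions for illustration.

```python
def propagate(origin_as, transit_ases):
    """Return the AS path seen after the advertisement crosses each transit AS.

    Each AS prepends its own number, so the origin ends up rightmost.
    """
    path = [origin_as]
    for asn in transit_ases:
        path.insert(0, asn)
    return path


print(propagate(1, [2, 3, 4, 5, 6]))  # [6, 5, 4, 3, 2, 1]
```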
Inter-AS routing: BGP BGP (Border Gateway Protocol): the de facto standard BGP provides each AS a means to: 1. Obtain subnet reachability information from neighboring ASs. 2. Propagate reachability information to all AS-internal routers. 3. Determine “good” routes to subnets based on reachability information and policy. allows subnet to advertise its existence to rest of Internet: “I am here”
BGP basics pairs of routers (BGP peers) exchange routing info over semi-permanent TCP connections: BGP sessions when AS1 advertises a prefix of AS2 to AS3: –AS1 promises it will forward datagrams towards that prefix. –AS1 can aggregate prefixes in its advertisement [Figure: AS1 (routers 1a to 1d), AS2 (routers 2a to 2c), and AS3 (routers 3a to 3c), linked by eBGP and iBGP sessions, with prefix labels]
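Prefix aggregation can be illustrated with Python's standard `ipaddress` module. This is only a sketch of the arithmetic, not of BGP itself; the prefixes below are documentation addresses chosen as an assumption, because the prefixes on the slide are not legible.

```python
import ipaddress

# Hypothetical prefixes: two adjacent /25s and one unrelated /24.
prefixes = [
    ipaddress.ip_network("192.0.2.0/25"),
    ipaddress.ip_network("192.0.2.128/25"),
    ipaddress.ip_network("198.51.100.0/24"),
]

# collapse_addresses merges adjacent or overlapping networks where possible.
aggregated = [str(net) for net in ipaddress.collapse_addresses(prefixes)]
print(aggregated)  # ['192.0.2.0/24', '198.51.100.0/24']
```

The two /25s share a covering /24 and become one advertisement, while the unrelated /24 cannot be merged and still costs its own table entry.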
Distributing reachability info using eBGP session between 3a and 1c, AS3 sends prefix reachability info to AS1. –1c can then use iBGP to distribute new prefix info to all routers in AS1 –1b can then re-advertise new reachability info to AS2 over 1b-to-2a eBGP session when router learns of new prefix, it creates entry for prefix in its forwarding table. [Figure: AS1 (routers 1a to 1d), AS2 (routers 2a to 2c), and AS3 (routers 3a to 3c), linked by eBGP and iBGP sessions]
Path attributes & BGP routes advertised prefix includes BGP attributes. –prefix + attributes = “route” two important attributes: –AS-PATH: contains ASs through which prefix advertisement has passed: e.g. AS 6431, AS 7018 –NEXT-HOP: indicates specific internal-AS router to next-hop AS. (may be multiple links from current AS to next-hop-AS) when gateway router receives route advertisement, uses import policy to accept/decline.
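As a minimal sketch of "prefix + attributes = route" and of an import policy, the following Python models a route as a small record and accepts or declines it. The `Route` type, the `accept` function, the `blocked_ases` parameter, and the example prefix and next-hop address are all assumptions for illustration. The check that declines a route whose AS-PATH already contains the local AS is standard BGP loop prevention; the block list stands in for whatever policy an operator might configure.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Route:
    prefix: str                # advertised prefix
    as_path: Tuple[int, ...]   # AS-PATH: ASs the advertisement has passed through
    next_hop: str              # NEXT-HOP: router leading to the next-hop AS


def accept(route, local_as, blocked_ases=frozenset()):
    """Hypothetical import policy: return True to accept, False to decline."""
    if local_as in route.as_path:
        return False  # our own AS is already in the path: a loop
    # Decline routes that cross any AS the (assumed) local policy blocks.
    return not any(asn in blocked_ases for asn in route.as_path)


route = Route("192.0.2.0/24", (6431, 7018), "198.51.100.1")
print(accept(route, local_as=64500))                       # True
print(accept(route, local_as=7018))                        # False
print(accept(route, local_as=64500, blocked_ases={6431}))  # False
```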
Network Layer 4-17 BGP route selection router may learn about more than 1 route to some prefix. Router must select a route. elimination rules: 1. local preference value attribute: policy decision 2. shortest AS-PATH 3. closest NEXT-HOP router: hot potato routing 4. additional criteria
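The elimination rules can be sketched as a single sort key. This is a simplified illustration, not a full BGP decision process: the `Route` fields and `select_best` are assumed names, "closest NEXT-HOP" is modelled as an intra-AS cost, and the final tie-break on the next-hop string is only a stand-in for the "additional criteria".

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Route:
    prefix: str
    local_pref: int            # rule 1: higher is preferred
    as_path: Tuple[int, ...]   # rule 2: shorter is preferred
    cost_to_next_hop: int      # rule 3: lower intra-AS cost (hot potato)
    next_hop: str              # rule 4 stand-in: arbitrary deterministic tie-break


def select_best(routes):
    """Pick one route by applying the elimination rules in order."""
    return min(
        routes,
        key=lambda r: (-r.local_pref, len(r.as_path), r.cost_to_next_hop, r.next_hop),
    )


candidates = [
    Route("192.0.2.0/24", 100, (64501,), 5, "r-a"),
    Route("192.0.2.0/24", 200, (64502, 64503, 64504), 50, "r-b"),
    Route("192.0.2.0/24", 200, (64505, 64506), 70, "r-c"),
]
print(select_best(candidates).next_hop)  # r-c
```

In the example, the first route loses on local preference even though its AS-PATH is shortest, and `r-c` then beats `r-b` on AS-PATH length despite the higher intra-AS cost, which shows that the rules are applied strictly in order.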
BGP messages BGP messages exchanged using TCP. BGP messages: –OPEN: opens the BGP session with a peer (sent once the TCP connection is established) and authenticates sender –UPDATE: advertises new path (or withdraws old) –KEEPALIVE: keeps connection alive in absence of UPDATES; also ACKs OPEN request –NOTIFICATION: reports errors in previous msg; also used to close connection
BGP routing policy (1/2) A, B, C are provider networks. X, W, Y are customers (of provider networks). X is dual-homed: attached to two networks. X does not want to route from B via X to C, so X will not advertise to B a route to C. [Figure: provider networks A, B, C and customer networks W, X, Y]
BGP routing policy (2/2) A advertises path AW to B. B advertises path BAW to X. Should B advertise path BAW to C? No way! B gets no “revenue” for routing CBAW since neither W nor C is B’s customer. B wants to force C to route to W via A. B wants to route only to/from its customers! [Figure: provider networks A, B, C and customer networks W, X, Y]
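B's reasoning can be written as a one-line export rule: advertise a route only if it was learned from a customer or is being sent to a customer. The sketch below assumes X is B's only customer, as the dual-homing example suggests; `should_export` and `B_CUSTOMERS` are hypothetical names.

```python
# Assumption: X is B's only customer; A and C are other provider networks.
B_CUSTOMERS = {"X"}


def should_export(learned_from, advertise_to, customers):
    """Export only when traffic would flow to or from one of our customers."""
    return learned_from in customers or advertise_to in customers


# B learned the path to W from A.
print(should_export("A", "X", B_CUSTOMERS))  # True: X pays B
print(should_export("A", "C", B_CUSTOMERS))  # False: neither A nor C is B's customer
```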
Why are intra- and inter-AS routing different? Policy: Inter-AS: admin wants control over how its traffic is routed and who routes through its net. Intra-AS: single admin, so no policy decisions needed. Scale: hierarchical routing saves table size and reduces update traffic. Performance: Intra-AS: can focus on performance. Inter-AS: policy may dominate over performance.
Routing table (RT) growth Multi-homing Traffic engineering Non-aggregatable prefix allocation Routing scalability
routing message updates BGP update messages
Why does routing scalability matter? FIB is expensive ViAggre
Virtual aggregation (ViAggre)
ViAggre: Basic Idea
ViAggre: Control Plane
More practically,…
Data plane operations
Route stretch
Ingress -> aggregation point
Aggregation point -> egress (1/3)
Aggregation point -> egress (2/3)
Aggregation point -> egress (3/3)
Now we will consider the general routing design space –IP is just one of the possibilities –But IP networking environments should be taken into account as much as possible Design Space
Design goal of routing 1. Scalability (memory): e.g. sublinear RT size scaling 2. Quality (stretch): the length of a chosen path by a routing scheme compared to shortest path 3. Reliability: fast convergence upon topology changes while minimizing communication costs to maintain coherent non-local knowledge about network topology 4. Name-independent routing: accommodate node addresses/labels assigned independently of the topology (otherwise need to split locator and ID parts in addressing architecture) 5. Message overhead
Issue 1: Addressing and routing Rekhter’s Law: “Addressing can follow topology or topology can follow addressing. Choose one.” Name-dependent routing Name-independent routing
Issue 2: state vs. stretch routing debate We want small state!! We want small stretch!! State: the routing table size describing the network topology Stretch: $\text{stretch} = \frac{\text{path length found by the routing algorithm}}{\text{optimal path length}} \geq 1$
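To make the stretch definition concrete, here is a small sketch that divides the hop count of a routed path by the shortest-path hop count found with breadth-first search. The graph, the node names, and the functions `shortest_hops` and `stretch` are assumptions for illustration, and path length is measured in hops.

```python
from collections import deque


def shortest_hops(graph, src, dst):
    """Breadth-first search: minimum number of hops from src to dst."""
    seen = {src}
    queue = deque([(src, 0)])
    while queue:
        node, dist = queue.popleft()
        if node == dst:
            return dist
        for nbr in graph[node]:
            if nbr not in seen:
                seen.add(nbr)
                queue.append((nbr, dist + 1))
    raise ValueError(f"{dst} is unreachable from {src}")


def stretch(graph, routed_path):
    """Routed path length divided by the optimal path length (always >= 1)."""
    routed_hops = len(routed_path) - 1
    return routed_hops / shortest_hops(graph, routed_path[0], routed_path[-1])


# Hypothetical topology: a-b-d is the shortest route, a-c-e-d is a detour.
GRAPH = {
    "a": ["b", "c"],
    "b": ["a", "d"],
    "c": ["a", "e"],
    "d": ["b", "e"],
    "e": ["c", "d"],
}
print(stretch(GRAPH, ["a", "c", "e", "d"]))  # 1.5
print(stretch(GRAPH, ["a", "b", "d"]))       # 1.0
```

A scheme that keeps less state may only know the detour, which is the trade-off the slide describes: smaller tables can cost longer paths.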
More general trade-off Triangle of trade-offs: Adaptation costs = convergence measures (e.g. number of messages per topology change) Memory space = routing table size Stretch = path length inflation