Routing: Overview and Key Protocols

Slides:



Advertisements
Similar presentations
Rensselaer Polytechnic Institute 1 Today’s Big Picture Large ISP Dial-Up ISP Access Network Small ISP Stub Large number of diverse networks.
Advertisements

1 Interdomain Traffic Engineering with BGP By Behzad Akbari Spring 2011 These slides are based on the slides of Tim. G. Griffin (AT&T) and Shivkumar (RPI)
Border Gateway Protocol Ankit Agarwal Dashang Trivedi Kirti Tiwari.
CS540/TE630 Computer Network Architecture Spring 2009 Tu/Th 10:30am-Noon Sue Moon.
Lecture 9 Overview. Hierarchical Routing scale – with 200 million destinations – can’t store all dests in routing tables! – routing table exchange would.
Path Vector Routing NETE0514 Presented by Dr.Apichan Kanjanavapastit.
© J. Liebeherr, All rights reserved 1 Border Gateway Protocol This lecture is largely based on a BGP tutorial by T. Griffin from AT&T Research.
Border Gateway Protocol Autonomous Systems and Interdomain Routing (Exterior Gateway Protocol EGP)
Fundamentals of Computer Networks ECE 478/578 Lecture #18: Policy-Based Routing Instructor: Loukas Lazos Dept of Electrical and Computer Engineering University.
1 Interdomain Routing Protocols. 2 Autonomous Systems An autonomous system (AS) is a region of the Internet that is administered by a single entity and.
Chapter 4: Network Layer 4. 1 Introduction 4.2 Virtual circuit and datagram networks 4.3 What’s inside a router 4.4 IP: Internet Protocol –Datagram format.
4a-1 CSE401: Computer Networks Hierarchical Routing & Routing in Internet S. M. Hasibul Haque Lecturer Dept. of CSE, BUET.
Practical and Configuration issues of BGP and Policy routing Cameron Harvey Simon Fraser University.
CS 164: Global Internet Slide Set In this set... More about subnets Classless Inter Domain Routing (CIDR) Border Gateway Protocol (BGP) Areas with.
Interdomain Routing and The Border Gateway Protocol (BGP) Courtesy of Timothy G. Griffin Intel Research, Cambridge UK
Network Layer4-1 Chapter 4 roadmap 4.1 Introduction and Network Service Models 4.2 Routing Principles 4.3 Hierarchical Routing 4.4 The Internet (IP) Protocol.
RD-CSY3021 Comparing Routing Protocols. RD-CSY3021 Criteria used to compare routing protocols includes  Time to convergence  Proprietary/open standards.
Shivkumar Kalyanaraman Rensselaer Polytechnic Institute 1 Exterior Gateway Protocols: EGP, BGP-4, CIDR Shivkumar Kalyanaraman Rensselaer Polytechnic Institute.
Ion Stoica October 2, 2002 (* this presentation is based on Lakshmi Subramanian’s slides) EE 122: Inter-domain routing – Border Gateway Protocol (BGP)
CSEE W4140 Networking Laboratory Lecture 5: IP Routing (OSPF and BGP) Jong Yul Kim
14 – Inter/Intra-AS Routing
Announcement r Project 2 Extension ? m Previous grade allocation: Projects 40% –Web client/server7% –TCP stack21% –IP routing12% Midterm 20% Final 20%
ROUTING PROTOCOLS PART IV ET4187/ET5187 Advanced Telecommunication Network.
Border Gateway Protocol(BGP) L.Subramanian 23 rd October, 2001.
1 ECE453 – Introduction to Computer Networks Lecture 10 – Network Layer (Routing II)
Inter-domain Routing Don Fussell CS 395T Measuring Internet Performance.
14 – Inter/Intra-AS Routing Network Layer Hierarchical Routing scale: with > 200 million destinations: can’t store all dest’s in routing tables!
Interior Gateway Protocols: RIP & OSPF
Inter-domain Routing: Today and Tomorrow Dr. Jia Wang AT&T Labs Research Florham Park, NJ 07932, USA
1 Computer Communication & Networks Lecture 22 Network Layer: Delivery, Forwarding, Routing (contd.)
Unicast Routing Protocols  A routing protocol is a combination of rules and procedures that lets routers in the internet inform each other of changes.
© 2009 Cisco Systems, Inc. All rights reserved. ROUTE v1.0—6-1 Connecting an Enterprise Network to an ISP Network BGP Attributes and Path Selection Process.
1 Interdomain Routing (BGP) By Behzad Akbari Fall 2008 These slides are based on the slides of Ion Stoica (UCB) and Shivkumar (RPI)
CS 3700 Networks and Distributed Systems Inter Domain Routing (It’s all about the Money) Revised 8/20/15.
Routing protocols Basic Routing Routing Information Protocol (RIP) Open Shortest Path First (OSPF)
Chapter 9. Implementing Scalability Features in Your Internetwork.
Border Gateway Protocol
Routing in the Internet The Global Internet consists of Autonomous Systems (AS) interconnected with eachother: Stub AS: small corporation Multihomed AS:
Shivkumar Kalyanaraman Rensselaer Polytechnic Institute 1 Exterior Gateway Protocols: BGP-4, CIDR Shivkumar Kalyanaraman Rensselaer Polytechnic Institute.
Xuan Zheng (modified by M. Veeraraghavan) 1 BGP overview BGP operations BGP messages BGP decision algorithm BGP states.
1 Internet Routing. 2 Terminology Forwarding –Refers to datagram transfer –Performed by host or router –Uses routing table Routing –Refers to propagation.
Border Gateway Protocol (BGP) W.lilakiatsakun. BGP Basics (1) BGP is the protocol which is used to make core routing decisions on the Internet It involves.
More on Internet Routing A large portion of this lecture material comes from BGP tutorial given by Philip Smith from Cisco (ftp://ftp- eng.cisco.com/pfs/seminars/APRICOT2004.
T. S. Eugene Ngeugeneng at cs.rice.edu Rice University1 COMP/ELEC 429/556 Introduction to Computer Networks Inter-domain routing Some slides used with.
Shivkumar Kalyanaraman Rensselaer Polytechnic Institute 1 ECSE-6600: Internet Protocols Informal Quiz #08: SOLUTIONS Shivkumar Kalyanaraman: GOOGLE: “Shiv.
Network Layer4-1 Intra-AS Routing r Also known as Interior Gateway Protocols (IGP) r Most common Intra-AS routing protocols: m RIP: Routing Information.
TCOM 509 – Internet Protocols (TCP/IP) Lecture 06_a Routing Protocols: RIP, OSPF, BGP Instructor: Dr. Li-Chuan Chen Date: 10/06/2003 Based in part upon.
Chapter 14 1 Unicast Routing Protocols There isn’t a person anywhere that isn’t capable of doing more than he thinks he can. - Henry Ford.
Internet Protocols. ICMP ICMP – Internet Control Message Protocol Each ICMP message is encapsulated in an IP packet – Treated like any other datagram,
An internet is a combination of networks connected by routers. When a datagram goes from a source to a destination, it will probably pass through many.
CS 640: Introduction to Computer Networks Aditya Akella Lecture 11 - Inter-Domain Routing - BGP (Border Gateway Protocol)
1 Agenda for Today’s Lecture The rationale for BGP’s design –What is interdomain routing and why do we need it? –Why does BGP look the way it does? How.
Network Layer4-1 Chapter 4: Network Layer r 4. 1 Introduction r 4.2 Virtual circuit and datagram networks r 4.3 What’s inside a router r 4.4 IP: Internet.
Text BGP Basics. Document Name CONFIDENTIAL Border Gateway Protocol (BGP) Introduction to BGP BGP Neighbor Establishment Process BGP Message Types BGP.
Border Gateway Protocol BGP-4 BGP environment How BGP works BGP information BGP administration.
ROUTING ON THE INTERNET COSC Jun-16. Routing Protocols  routers receive and forward packets  make decisions based on knowledge of topology.
1 Internet Routing 11/11/2009. Admin. r Assignment 3 2.
Chapter 4: Network Layer
CS 3700 Networks and Distributed Systems
Border Gateway Protocol
CS 3700 Networks and Distributed Systems
Border Gateway Protocol
Chapter 4: Network Layer
Interdomain Traffic Engineering with BGP
Dynamic Routing and OSPF
Chapter 4: Network Layer
Chapter 4: Network Layer
Chapter 4: Network Layer
COMP/ELEC 429/556 Introduction to Computer Networks
Computer Networks Protocols
Presentation transcript:

Routing: Overview and Key Protocols Shivkumar Kalyanaraman Rensselaer Polytechnic Institute shivkuma@ecse.rpi.edu Based in part upon slides of Prof. Raj Jain (OSU), S. Keshav (Cornell), J. Kurose (U Mass), Noel Chiappa (MIT), Tim Griffin (AT&T), Ion Stoica (UCB),

Overview Routing vs Forwarding vs Bridging Distance vector vs Link state routing Addressing and Routing: Scalability OSPF, RIP protocols Inter-domain Routing Issues BGP protocol

Routing vs. Forwarding Forwarding: select an output port based on destination address and routing table Data-plane function Often implemented in hardware Routing: process by which routing table is built.. … so that the series of local forwarding decisions takes the packet to the destination with high probability, and …(reachability condition) … the path chosen/resources consumed by the packet is efficient in some sense… (optimality and filtering condition) Control-plane function Implemented in software

Forwarding Table Can display forwarding table using “netstat -rn” Sometimes called “routing table” Destination Gateway Flags Ref Use Interface 127.0.0.1 127.0.0.1 UH 0 26492 lo0 192.168.2. 192.168.2.5 U 2 13 fa0 193.55.114. 193.55.114.6 U 3 58503 le0 192.168.3. 192.168.3.5 U 2 25 qaa0 224.0.0.0 193.55.114.6 U 3 0 le0 default 193.55.114.129 UG 0 143454

Interconnection Devices Extended LAN =Broadcast domain LAN= Collision Domain H H B H H Router Application Application Router Bridge/Switch Repeater/Hub Gateway Transport Transport Network Network Datalink Datalink Physical Physical

Routing problem Collect, process, and condense global state into local forwarding information Global state inherently large dynamic hard to collect Hard issues: consistency, completeness, scalability Impact of resource needs of sessions

Consistency Defn: A series of independent local forwarding decisions must lead to connectivity between any desired (source, destination) pair in the network. If the states are inconsistent, the network is said not to have “converged” to steady state (I.e. is in a transient state) Inconsistency leads to loops, wandering packets etc In general a part of the routing information may be consistent while the rest may be inconsistent. Large networks => inconsistency is a scalability issue. Consistency can be achieved in two ways: Fully distributed approach: a consistency criterion or invariant across the states of adjacent nodes Signaled approach: the signaling protocol sets up local forwarding information along the path.

Completeness Defn: The network as a whole and every node has sufficient information to be able to compute all paths. In general, with more information available locally, routing algorithms tend to converge faster, because the chances of inconsistency reduce. But this means that more distributed state must be collected at each node and processed. The demand for completeness also limits the scalability of the algorithm. Since both consistency and completeness pose scalability problems, large networks have to be structured hierarchically and abstract entire networks as a single node.

Internet Routing Model 2 key features: Dynamic routing Intra- and Inter-AS routing, AS = locus of admin control Internet organized as “autonomous systems” (AS). AS is internally connected Interior Gateway Protocols (IGPs) within AS. Eg: RIP, OSPF, HELLO Exterior Gateway Protocols (EGPs) for AS to AS routing. Eg: EGP, BGP-4

Dynamic Routing Model

Intra-AS and Inter-AS routing A.c A.a C.b B.a Gateways: perform inter-AS routing amongst themselves perform intra-AS routers with other routers in their AS b c a a C b a B d c A b inter-AS, intra-AS routing in gateway A.c network layer link layer physical layer

Intra-AS and Inter-AS routing: Example between A and B a b C A B d c A.a A.c C.b B.a Host h2 Host h1 Intra-AS routing within AS B Intra-AS routing within AS A

Basic Dynamic Routing Methods Source-based: source gets a map of the network, source finds route, and either signals the route-setup (eg: ATM approach) encodes the route into packets (inefficient) Link state routing: per-link information Get map of network (in terms of link states) at all nodes and find next-hops locally. Maps consistent => next-hops consistent Distance vector: per-node information At every node, set up distance signposts to destination nodes (a vector) Setup this by peeking at neighbors’ signposts.

DV & LS: consistency criterion The subset of a shortest path is also the shortest path between the two intermediate nodes. Corollary: If the shortest path from node i to node j, with distance D(i,j) passes through neighbor k, with link cost c(i,k), then: D(i,j) = c(i,k) + D(k,j) j D(k,j) i c(i,k) k

Distance Vector DV = Set (vector) of Signposts, one for each destination

Distance Vector (DV) Approach Consistency Condition: D(i,j) = c(i,k) + D(k,j) The DV (Bellman-Ford) algorithm evaluates this recursion iteratively. In the mth iteration, the consistency criterion holds, assuming that each node sees all nodes and links m-hops (or smaller) away from it (i.e. an m-hop view) A E D C B 7 8 1 2 Example network A E B 7 1 A’s 1-hop view (After 1st iteration) A E D C B 7 8 1 2 A’s 2-hop view (After 2nd Iteration)

Distance Vector (DV) Example A’s distance vector D(A,*): After Iteration 1 is: [0, 7, INFINITY, INFINITY, 1] After Iteration 2 is: [0, 7, 8, 3, 1] After Iteration 3 is: [0, 7, 5, 3, 1] After Iteration 4 is: [0, 6, 5, 3, 1] A E D C B 7 8 1 2 Example network A E B 7 1 A’s 1-hop view (After 1st iteration) A E D C B 7 8 1 2 A’s 2-hop view (After 2nd Iteration)

Link State (LS) Approach The link state (Dijkstra) approach is iterative, but it pivots around destinations j, and their predecessors k = p(j) Observe that an alternative version of the consistency condition holds for this case: D(i,j) = D(i,k) + c(k,j) Each node i collects all link states c(*,*) first and runs the complete Dijkstra algorithm locally. j c(k,j) i D(i,k) k

Dijkstra’s algorithm: example Step 1 2 3 4 5 set N A AD ADE ADEB ADEBC ADEBCF D(B),p(B) 2,A D(C),p(C) 5,A 4,D 3,E D(D),p(D) 1,A D(E),p(E) infinity 2,D D(F),p(F) infinity 4,E A E D C B F 2 1 3 5 The shortest-paths spanning tree rooted at A is called an SPF-tree

Summary: Distributed Routing Techniques Link State Vectoring Topology information is flooded within the routing domain Best end-to-end paths are computed locally at each router. Best end-to-end paths determine next-hops. Based on minimizing some notion of distance Works only if policy is shared and uniform Examples: OSPF, IS-IS Each router knows little about network topology Only best next-hops are chosen by each router for each destination network. Best end-to-end paths result from composition of all next-hop choices Does not require any notion of distance Does not require uniform policies at all routers Examples: RIP, BGP

RIP: Routing Information Protocol Uses hop count as metric (max: 16 is infinity) Tables (vectors) “advertised” to neighbors every 30 s. Each advertisement: upto 25 entries No advertisement for 180 sec: neighbor/link declared dead routes via neighbor invalidated new advertisements sent to neighbors (Triggered updates) neighbors in turn send out new advertisements (if tables changed) link failure info quickly propagates to entire net poison reverse used to prevent ping-pong loops (infinite distance = 16 hops)

RIPv1 Problems (Continued) Split horizon/poison reverse does not guarantee to solve count-to-infinity problem 16 = infinity => RIP for small networks only! Slow convergence Broadcasts consume non-router resources RIPv1 does not support subnet masks (VLSMs) No authentication

RIPv2 Why ? Installed base of RIP routers Provides: VLSM support Authentication Multicasting “Wire-sharing” by multiple routing domains, Tags to support EGP/BGP routes. Uses reserved fields in RIPv1 header. First route entry replaced by authentication info.

Link State Protocols Key: Create a network “map” at each node. 1. Node collects the state of its connected links and forms a “Link State Packet” (LSP) 2. Flood LSP => reaches every other node in the network and everyone now has a network map. 3. Given map, run Dijkstra’s shortest path algorithm (SPF) => get paths to all destinations 4. Routing table = next-hops of these paths. 5. Hierarchical routing: organization of areas, and filtered control plane information flooded.

Hello: Packet Format

Topology Dissemination A.k.a LSP distribution 1. Flood LSPs on links except incoming link Require at most 2E transfers for n/w with E edges 2. Sequence numbers to detect duplicates Why? Routers/links may go down/up Issue: wrap-around, larger sequence number is not the most recent!

OSPF Router-LSA: Scenario

Router-LSA:

Topology Dissemination (Continued) Checksum field: Drop packet if in error, get retransmission from neighbor Age field (similar to TTL) Number of seconds since LSA originated Periodically incremented after acceptance Originating router refreshes LSA after 30 min Delete if Age = MaxAge Low age field + large seq # => that LSA is flapping or frequently changing …

Recovering from a partition On partition, LSP databases can get out of synch Databases described by database descriptor records Routers on each side of a newly restored link talk to each other to update databases (determine missing and out-of-date LSPs) => selective synchronization

Inter-Domain Routing: Big Picture Large ISP Large ISP Stub Small ISP Dial-Up ISP Access Network Stub Stub Large number of diverse networks

Requirements for Inter-AS Routing Should scale for the size of the global Internet. Focus on reachability, not optimality Use address aggregation techniques to minimize core routing table sizes and associated control traffic At the same time, it should allow flexibility in topological structure (eg: don’t restrict to trees etc) Allow policy-based routing between autonomous systems Policy refers to arbitrary preference among a menu of available routes (based upon routes’ attributes) Fully distributed routing (as opposed to a signaled approach) is the only possibility. Extensible to meet the demands for newer policies.

Who speaks Inter-AS routing? BGP R3 R2 R1 R border router internal router Two types of routers Border router(Edge), Internal router(Core) Two border routers of different ASes will have a BGP session

Customers and Providers IP traffic provider customer customer Customer pays provider for access to the Internet

Nontransit vs. Transit ASes Internet Service providers (ISPs) have transit networks ISP 2 ISP 1 NET A Nontransit AS might be a corporate or campus network. Could be a “content provider” Traffic NEVER flows from ISP 1 through NET A to ISP 2

The Peering Relationship Peers provide transit between their respective customers Peers do not provide transit between peers Peers (often) do not exchange $$$ peer customer provider traffic allowed traffic NOT allowed

BGP-4 BGP = Border Gateway Protocol Is a Policy-Based routing protocol Is the de facto EGP of today’s global Internet Relatively simple protocol, but configuration is complex and the entire world can see, and be impacted by, your mistakes. 1989 : BGP-1 [RFC 1105] Replacement for EGP (1984, RFC 904) 1990 : BGP-2 [RFC 1163] 1991 : BGP-3 [RFC 1267] 1995 : BGP-4 [RFC 1771] Support for Classless Interdomain Routing (CIDR)

BGP Operations (Simplified) Establish session on TCP port 179 AS1 BGP session Exchange all active routes AS2 While connection is ALIVE exchange route UPDATE messages Exchange incremental updates

Four Types of BGP Messages Open : Establish a peering session. Keep Alive : Handshake at regular intervals. Notification : Shuts down a peering session. Update : Announcing new routes or withdrawing previously announced routes. announcement = prefix + attributes values

Two Types of BGP Neighbor Relationships External Neighbor (eBGP) in a different Autonomous Systems Internal Neighbor (iBGP) in the same Autonomous System AS1 iBGP is routed (using IGP!) eBGP iBGP AS2

I-BGP and E-BGP A AS1 AS2 B AS3 border router internal router I-BGP IGP: Interior Gateway Protocol. Examples: IS-IS, OSPF R3 R2 IGP announce B A R1 AS1 E-BGP AS2 R4 R5 B AS3 R border router internal router

IBGP vs EBGP I-BGP nodes: typically ABRs, or other nodes where default routes terminate I-BGP peering sessions between every pair of routers within an AS: full mesh. Physical link A IBGP session D C B AS1

Route Reflection AS1 ER AS2 128.23.0.0/16 RR2 RR-C4 RR-C1 RR1 RR3 EBGP 10.0.0.0/24 AS2 IBGP

AS Confederations Divide and conquer: Divides a large AS into sub-ASs 11 10 14 13 12 R1 AS-1 R2

Address Aggregation: CIDR 204.71.0.0 204.71.0.0 204.71.1.0 Global Internet Routing Mesh Service Provider 204.71.1.0 204.71.2.0 204.71.2.0 …...……. …...……. 204.71.255.0 204.71.255.0 Inter-domain Routing Without CIDR 204.71.0.0 204.71.1.0 Global Internet Routing Mesh Service Provider 204.71.2.0 204.71.0.0/16 …...……. 204.71.255.0 Inter-domain Routing With CIDR

RFC 1519: Classless Inter-Domain Routing (CIDR) Pre-CIDR: Network ID ended on 8-, 16, 24- bit boundary CIDR: Network ID can end at any bit boundary IP Address : 12.4.0.0 IP Mask: 255.254.0.0 Address 00001100 00000100 00000000 11111111 11111110 00000000 Mask Network Prefix for hosts Usually written as 12.4.0.0/15, a.k.a “supernetting”

Longest Prefix Match (Classless) Forwarding Destination =12.5.9.16 ------------------------------- payload Prefix Interface Next Hop 12.0.0.0/8 10.14.22.19 ATM 5/0/8 12.4.0.0/15 12.5.8.0/23 attached Ethernet 0/1/3 Serial 1/0/7 10.1.3.77 IP Forwarding Table 0.0.0.0/0 10.14.11.33 ATM 5/0/9 OK better even better best!

What is Routing Policy Policy refers to arbitrary preference among a menu of available routes (based upon routes’ attributes) Public description of the relationship between external BGP peers Can also describe internal BGP peer relationship Eg: Who are my BGP peers What routes are Originated by a peer Imported from each peer Exported to each peer Preferred when multiple routes exist What to do if no route exists?

BGP Route Processing Apply Policy = filter routes & tweak attributes Receive BGP Updates Based on Attribute Values Best Routes Transmit BGP Updates Apply Import Policies Best Route Selection Best Route Table Apply Export Policies Install forwarding Entries for best Routes. IP Forwarding Table

Policy Implementation Flow Adj RIB In Incom- ing Main BGP RIB Adj RIB Out Outgo- ing Main RIB/ FIB IGPs Static & HW Info

Import and Export Policies For inbound traffic Filter outbound routes Tweak attributes on outbound routes in the hope of influencing your neighbor’s best route selection For outbound traffic Filter inbound routes Tweak attributes on inbound routes to influence best route selection outbound routes inbound traffic inbound routes outbound traffic In general, an AS has more control over outbound traffic

BGP Policy Knob: Attributes Value Code Reference ----- --------------------------------- --------- 1 ORIGIN [RFC1771] 2 AS_PATH [RFC1771] 3 NEXT_HOP [RFC1771] 4 MULTI_EXIT_DISC [RFC1771] 5 LOCAL_PREF [RFC1771] 6 ATOMIC_AGGREGATE [RFC1771] 7 AGGREGATOR [RFC1771] 8 COMMUNITY [RFC1997] 9 ORIGINATOR_ID [RFC2796] 10 CLUSTER_LIST [RFC2796] 11 DPA [Chen] 12 ADVERTISER [RFC1863] 13 RCID_PATH / CLUSTER_ID [RFC1863] 14 MP_REACH_NLRI [RFC2283] 15 MP_UNREACH_NLRI [RFC2283] 16 EXTENDED COMMUNITIES [Rosen] ... 255 reserved for development We will cover a subset of these attributes Not all attributes need to be present in every announcement From IANA: http://www.iana.org/assignments/bgp-parameters

UPDATE message in BGP Primary message between two BGP speakers. Used to advertise/withdraw IP prefixes (NLRI) Path attributes field : unique to BGP Apply to all prefixes specified in NLRI field Optional vs Well-known; Transitive vs Non-transitive 2 octets Withdrawn Routes Length Withdrawn Routes (variable length) Total Path Attributes Length Path Attributes (variable length) Network Layer Reachability Info. (NLRI: variable length)

Path Attributes: ORIGIN Describes how a prefix came to BGP at the origin AS Prefixes are learned from a source and “injected” into BGP: Directly connected interfaces, manually configured static routes, dynamic IGP or EGP Values: IGP (EGP): Prefix learnt from IGP (EGP) INCOMPLETE: Static routes

Path Attributes: AS-PATH List of ASs thru which the prefix announcement has passed. AS on path adds ASN to AS-PATH Eg: 138.39.0.0/16 originates at AS1 and is advertised to AS3 via AS2. Eg: AS-SEQUENCE: “100 200” Used for loop detection and path selection AS1 (100) AS3 (15) 138.39.0.0/16 AS2 (200)

Traffic Often Follows ASPATH 135.207.0.0/16 ASPATH = 3 2 1 AS 1 AS 3 AS 4 AS 2 135.207.0.0/16 IP Packet Dest = 135.207.44.66

… But It Might Not AS 1 AS 3 AS 4 AS 2 AS 5 AS 2 filters all subnets with masks longer than /24 135.207.0.0/16 ASPATH = 1 135.207.0.0/16 ASPATH = 3 2 1 135.207.44.0/25 ASPATH = 5 AS 1 AS 3 AS 4 AS 2 135.207.0.0/16 IP Packet Dest = 135.207.44.66 From AS 4, it may look like this packet will take path 3 2 1, but it actually takes path 3 2 5 AS 5 135.207.44.0/25

Shorter AS-PATH Doesn’t Mean Shorter # Hops BGP says that path 4 1 is better than path 3 2 1 Duh! AS 4 AS 3 AS 2 AS 1

Path Attributes: NEXT-HOP Next-hop: node to which packets must be sent for the IP prefixes. May not be same as peer. UPDATE for 180.20.0.0, NEXT-HOP= 170.10.20.3 BGP Speakers Not a BGP Speaker

Recursive Lookup If routes (prefix) are learnt thru iBGP, NEXT-HOP is the iBGP router which originated the route. Note: iBGP peer might be several IP-level hops away as determined by the IGP Hence BGP NEXT-HOP is not the same as IP next-hop BGP therefore checks if the “NEXT-HOP” is reachable through its IGP. If so, it installs the IGP next-hop for the prefix This process is known as “recursive lookup” – the lookup is done in the control-plane (not data-plane) before populating the forwarding table. Example in next slide

Join EGP with IGP For Connectivity 135.207.0.0/16 Next Hop = 192.0.2.1 135.207.0.0/16 10.10.10.10 AS 1 AS 2 192.0.2.1 192.0.2.0/30 Forwarding Table destination next hop 192.0.2.0/30 10.10.10.10 Forwarding Table + destination next hop EGP 192.0.2.1 135.207.0.0/16 destination next hop 135.207.0.0/16 10.10.10.10 192.0.2.0/30 10.10.10.10

Load-Balancing Knobs in BGP LOCAL-PREF: outbound traffic, local preference (box-level knob) MED: Inbound-traffic, typically from the same ISP (link-level knob) AS1 AS2 Local Preference MED

Path Attribute: LOCAL-PREF Locally configured indication about which path is preferred to exit the AS in order to reach a certain network. Default value = 100. Higher is better.

Attributes: MULTI-EXIT Discriminator Also called METRIC or MED Attribute. Lower is better AS1:multihomed customer. AS2 (provider) includes MED to AS1 AS1 chooses which link (NEXTHOP) to use Eg: traffic to AS3 can go thru Link1, and AS2 thru Link2 Link A AS3 AS2 AS1 Link B AS4

MEDs Can Export Internal Instability Heavy Content Web Farm 2865 17 FLAP FLAP 192.44.78.0/24 MED = 56 OR 10 192.44.78.0/24 MED = 15 10 FLAP FLAP 56 15 FLAP 192.44.78.0/24

ASPATH Padding: Shed inbound traffic provider 192.0.2.0/24 ASPATH = 2 2 2 192.0.2.0/24 ASPATH = 2 Padding will (usually) force inbound traffic from AS 1 to take primary link primary backup customer 192.0.2.0/24 AS 2

Deaggregation + Multihoming If AS 1 does not announce the more specific prefix, then most traffic to AS 2 will go through AS 3 because it is a longer match 12.2.0.0/16 12.2.0.0/16 12.0.0.0/8 AS 3 AS 1 provider provider AS 2 customer 12.2.0.0/16 AS 2 is “punching a hole” in the CIDR block of AS 1=> subverts CIDR

CIDR at Work, No load balancing Table at ISP3 Prefix Next Hop ORIGIN AS 128.32/11 ISP1 140.64/10 ISP2 128.40/16 Link A ISP1 128.32/11 AS1 128.40/16 140.127/16 ISP3 Link B ISP2 140.64/10 140.127/16

CIDR Subverted for Load Balancing Table at ISP3 Prefix Next Hop ORIGIN AS 128.32/11 ISP1 140.64/10 ISP2 140.255.20/24 AS1 128.42.10/24 140.255.20/24, 128.40/16 Link A ISP1 128.32/11 AS1 128.40/16 140.127/16 ISP3 Link B ISP2 140.64/10 128.42.10/24, 140.127/16

How Can Routes be Colored? BGP Communities A community value is 32 bits By convention, first 16 bits is ASN indicating who is giving it an interpretation community number Used within and between ASes The set of ASes must agree on how to interpret the community value Very powerful BECAUSE it has no (predefined) meaning Community Attribute = a list of community values. (So one route can belong to multiple communities) Two reserved communities no_advertise 0xFFFFFF02: don’t pass to BGP neighbors no_export = 0xFFFFFF01: don’t export out of AS RFC 1997 (August 1996)

Communities Example 1:100 Customer routes 1:200 Peer routes 1:300 Provider Routes To Customers 1:100, 1:200, 1:300 To Peers 1:100 To Providers Import Export AS 1

BGP Route Selection Process Series of tie-breaker decisions... If NEXTHOP is inaccessible do not consider the route. Prefer largest LOCAL-PREF If same LOCAL-PREF prefer the shortest AS-PATH. If all paths are external prefer the lowest ORIGIN code (IGP<EGP<INCOMPLETE). If ORIGIN codes are the same prefer the lowest MED. If MED is same, prefer min-cost NEXT-HOP If routes learned from EBGP or IBGP, prefer paths learnt from EBGP Final tie-break: Prefer the route with I-BGP ID (IP address)

Route Selection Summary Highest Local Preference Enforce relationships Shortest ASPATH Lowest MED traffic engineering i-BGP < e-BGP Lowest IGP cost to BGP egress Throw up hands and break ties Lowest router ID

BGP Table Growth Thanks: Geoff Huston. http://www.telstra.net/ops/bgptable.html

Large BGP Tables Considered Harmful Routing tables must store best routes and alternate routes Burden can be large for routers with many alternate routes (route reflectors for example) Routers have been known to die Increases CPU load, especially during session reset

Summary Routing Concepts DV and LS algorithms RIP, OSPF, BGP