The Internet Network Layer forwarding table Host, router network layer functions: Routing protocols path selection RIP, OSPF, BGP IP protocol addressing conventions datagram format packet handling conventions ICMP protocol error reporting router “signaling” Transport layer: TCP, UDP Link layer Physical layer Network layer
CS 4284 Spring 2013 IP Datagram Format ver length 32 bits data (variable length, typically a TCP or UDP segment) 16-bit identifier Internet checksum time to live 32 bit source IP address IP protocol version number header length (bytes) max number remaining hops (decremented at each router) for fragmentation/ reassembly total datagram length (bytes) upper layer protocol to deliver payload to head. len type of service “type” of data flgs fragment offset upper layer 32 bit destination IP address Options (if any) E.g. timestamp, record route taken, specify list of routers to visit.
CS 4284 Spring 2013 IP Fragmentation & Reassembly network links have MTU (max.transfer size) - largest possible link-level frame. –different link types, different MTUs large IP datagram divided (“fragmented”) within net –one datagram becomes several datagrams –“reassembled” only at final destination –IP header bits used to identify, order related fragments fragmentation: in: one large datagram out: 3 smaller datagrams reassembly
CS 4284 Spring 2013 IP Fragmentation and Reassembly ID =x offset =0 fragflag =0 length =4000 ID =x offset =0 fragflag =1 length =1500 ID =x offset =185 fragflag =1 length =1500 ID =x offset =370 fragflag =0 length =1040 One large datagram becomes several smaller datagrams Example 4000 byte datagram MTU = 1500 bytes 1480 bytes in data field offset = 1480/8
CS 4284 Spring 2013 IP Addressing: Introduction IP address: 32-bit identifier for host or router interface interface: connection between host/router and physical link –routers typically have multiple interfaces –host may have multiple interfaces –IP addresses are associated with each interface –Link can be multipoint-link, e.g. LAN – or even entire network, e.g., ATM Key point: no routing table lookup is necessary to get to destination within subnet 188.8.131.52 184.108.40.206 220.127.116.11 18.104.22.168 22.214.171.124 126.96.36.199 188.8.131.52 184.108.40.206 220.127.116.11 18.104.22.168 22.214.171.124 = 11011111 00000001 00000001 00000001 223 111
CS 4284 Spring 2013 Subnets IP address: –subnet part (high order bits) –host part (low order bits) What’s a subnet ? –(a set of) device interfaces with a common subnet part of IP address –can physically reach each other without intervening router 126.96.36.199 188.8.131.52 184.108.40.206 220.127.116.11 18.104.22.168 22.214.171.124 126.96.36.199 188.8.131.52 184.108.40.206 220.127.116.11 network consisting of 3 subnets LAN
CS 4284 Spring 2013 Subnets 18.104.22.168/24 22.214.171.124/24 126.96.36.199/24 Recipe To determine the subnets, detach each interface from its host or router, creating islands of isolated networks. Each isolated network is called a subnet –And needs its own subnet address! Subnet mask: /24 255.255.255.0
CS 4284 Spring 2013 Addressing in IP IP addresses denote interfaces, not hosts Sets of interfaces form subnets –Subnets share common prefix Route to CIDR-ized subnet addresses –a.b.c.d/x Within subnet, reach destination directly 188.8.131.52 184.108.40.206 220.127.116.11 18.104.22.168 22.214.171.124 126.96.36.199 188.8.131.52 184.108.40.206 220.127.116.11 18.104.22.168 22.214.171.124 126.96.36.199 188.8.131.52184.108.40.206 220.127.116.11 18.104.22.168
CS 4284 Spring 2013 IP Addressing: CIDR CIDR: Classless InterDomain Routing –subnet portion of address of arbitrary length –address format: a.b.c.d/x, where x is # bits in subnet portion of address 11001000 00010111 00010000 00000000 subnet part host part 22.214.171.124/23
CS 4284 Spring 2013 Before CIDR: Classful Routing A, B, C: Pretty much only of historical interest today
CS 4284 Spring 2013 R2 R1 R3 Internet __________ Ethernet LAN 1 60 Machines Ethernet LAN 2 120 Machines Subnet address: ______________ Default gateway: ______________ Subnet address: ______________ Default gateway: ______________ __________ PPP Link 1 PPP Link 2
CS 4284 Spring 2013 R2 R1 R3 Internet 126.96.36.199 Ethernet LAN 1 60 Machines Ethernet LAN 2 120 Machines Subnet address: 188.8.131.52/26 Default gateway: 184.108.40.206 Subnet address: 220.127.116.11/25 Default gateway: 18.104.22.168 22.214.171.124 126.96.36.199 188.8.131.52 184.108.40.206 PPP Link 1 PPP Link 2 220.127.116.11/30 18.104.22.168/30
CS 4284 Spring 2013 Routing Tables in End Systems Typical: local subnets + default gateway (“first- hop router”) Example: “route print” on Windows XP –22.214.171.124 FastEthernet –126.96.36.199 802.11g wireless Active Routes: Network Destination Netmask Gateway Interface Metric 0.0.0.0 0.0.0.0 188.8.131.52 184.108.40.206 20 0.0.0.0 0.0.0.0 220.127.116.11 18.104.22.168 25 127.0.0.0 255.0.0.0 127.0.0.1 127.0.0.1 1 22.214.171.124 255.255.248.0 126.96.36.199 188.8.131.52 20 184.108.40.206 255.255.254.0 220.127.116.11 18.104.22.168 25 … Default Gateway: 22.214.171.124
CS 4284 Spring 2013 ICMP: Internet Control Message Protocol used by hosts & routers to communicate network-level information –error reporting: unreachable host, network, port, protocol –echo request/reply (used by ping) network-layer “above” IP: –ICMP msgs carried in IP datagrams ICMP message: type, code plus first 8 bytes of IP datagram causing error Type Code description 0 0 echo reply (ping) 3 0 dest. network unreachable 3 1 dest host unreachable 3 2 dest protocol unreachable 3 3 dest port unreachable 3 6 dest network unknown 3 7 dest host unknown 4 0 source quench (congestion control - not used) 8 0 echo request (ping) 9 0 route advertisement 10 0 router discovery 11 0 TTL expired 12 0 bad IP header
CS 4284 Spring 2013 Traceroute and ICMP Source sends series of UDP segments to dest –First has TTL =1 –Second has TTL=2, etc. –Unlikely port number When nth datagram arrives to nth router: –Router discards datagram –And sends to source an ICMP message (type 11, code 0) –Message includes name of router& IP address When ICMP message arrives, source calculates RTT Traceroute does this 3 times Stopping criterion UDP segment eventually arrives at destination host Destination returns ICMP “port unreachable” packet (type 3, code 3) When source gets this ICMP, stops. See also [Heideman 2008]Heideman 2008
CS 4284 Spring 2013 IP addresses: how to get one? Host gets IP address either hardcoded or via DHCP (Dynamic Host Configuration Protocol) Network gets subnet part of IP address allocated from ISP’s address space ISP gets address space assigned by ICANN (Internet Corporation for Assigned Names and Numbers) ISP's block 11001000 00010111 00010000 00000000 126.96.36.199/20 Organization 0 11001000 00010111 00010000 00000000 188.8.131.52/23 Organization 1 11001000 00010111 00010010 00000000 184.108.40.206/23 Organization 2 11001000 00010111 00010100 00000000 220.127.116.11/23... ….. …. …. Organization 7 11001000 00010111 00011110 00000000 18.104.22.168/23
CS 4284 Spring 2013 IPv6 Initial motivation: 32-bit address space soon to be completely allocated. Additional motivation: –header format helps speed processing/forwarding –header changes to facilitate QoS –easier configuration of both hosts & backbone routers IPv6 datagram format: –fixed-length 40 byte header –no fragmentation allowed
CS 4284 Spring 2013 IPv6 Header (Cont) Priority: identify priority among datagrams in flow Flow Label: identify datagrams in same “flow.” (concept of “flow” not well defined). Next header: identify upper layer protocol for data
CS 4284 Spring 2013 Other Changes from IPv4 Checksum: removed entirely to reduce processing time at each hop ICMPv6: new version of ICMP –additional message types, e.g. “Packet Too Big” –multicast group management functions Options: allowed, but outside of header, indicated by “Next Header” field
CS 4284 Spring 2013 Extension Headers Grouped in six types: –Hop-by-hop options, e.g. Jumbograms –Destination options –Routing, e.g. source routing –Fragment – can be done, but end hosts only! –Authentication –Encapsulation Routers quickly know which headers they must examine and which they can skip
CS 4284 Spring 2013 IPv6 Addresses Written as eight 16bit values –e.g. fe80::020e:7bff:fe32:d716 (made from 00:0E:7B:32:D7:16) 0000 Reserved 0000 0001Unassigned 0000 001Reserved for NSAP (non-IP addresses used by ISO) 0000 010Reserved for IPX (non-IP addresses used by IPX) 0000 011 - 0001Unassigned 001Unicast Address Space 010 - 110Unassigned 1110 - 1111 1110 0Unassigned 1111 110Unique Local Addresses (ULA) 1111 1110 10Link Local Use addresses 1111 1110 11Site Local Use addresses (Deprecated) 1111 Multicast addresses
CS 4284 Spring 2013 IPv6 autoconf stateless autoconfiguration see [Donzé 2004]Donzé 2004 –Plug in and interface creates link-local address based on adapter MAC –Interface can have link-local (fe80::…), site-local & global (2001::…) addresses VT’s campus has had IPv6 testbed since 1998, now connected to public IPv6 network Try it out yourself! –MacOS, Linux: enabled by default of recent installations –Windows XP: “ipv6 install” at command prompt –Tools add 6: ping6, traceroute6, etc..
CS 4284 Spring 2013 Transition From IPv4 To IPv6 Not all routers can be upgraded simultaneously –no “flag days” –How will the network operate with mixed IPv4 and IPv6 routers? Tunneling: IPv6 carried as payload in IPv4 datagram among IPv4 routers
CS 4284 Spring 2013 Tunneling A B E F IPv6 tunnel Logical view: Physical view: A B E F IPv6 C D IPv4 Flow: X Src: A Dest: F data Flow: X Src: A Dest: F data Flow: X Src: A Dest: F data Src:B Dest: E Flow: X Src: A Dest: F data Src:B Dest: E A-to-B: IPv6 E-to-F: IPv6 B-to-C: IPv6 inside IPv4 B-to-C: IPv6 inside IPv4
CS 4284 Spring 2013 IPv6 – Opposing View Bernstein points out some hindrances [The IPv6 mess]The IPv6 mess –Lack of interoperability b/c no embedding of addresses –Transition path (comparison to MX records) IPv6 – the next OSI? DoD requirement by 2008 –What happened to it? Federal 2012 deadline that all public-facing websites talk IPv6 Asian countries are pushing for transition
Hierarchical Addressing: Route Aggregation “Send me anything with addresses beginning 22.214.171.124/20” 126.96.36.199/23188.8.131.52/23184.108.40.206/23 Fly-By-Night-ISP Organization 0 Organization 7 Internet Organization 1 ISPs-R-Us “Send me anything with addresses beginning 220.127.116.11/16” 18.104.22.168/23 Organization 2...... Hierarchical addressing allows efficient advertisement of routing information:
CS 4284 Spring 2013 Hierarchical Addressing: More Specific Routes ISPs-R-Us has a more specific route to Organization 1 “Send me anything with addresses beginning 22.214.171.124/20” 126.96.36.199/23188.8.131.52/23184.108.40.206/23 Fly-By-Night-ISP Organization 0 Organization 7 Internet Organization 1 ISPs-R-Us “Send me anything with addresses beginning 220.127.116.11/16 or 18.104.22.168/23” 22.214.171.124/23 Organization 2......
CS 4284 Spring 2013 Intra-AS vs Inter-AS Routing In Internet: –Intra-AS known as Interior Gateway Protocols (IGP) –Most common Intra-AS routing protocols: RIP: Routing Information Protocol (original protocol, now rarely used) OSPF: Open Shortest Path First IGRP/EIGRP: (Enhanced) Interior Gateway Routing Protocol –Inter-AS known as Border Gateway Protocols: BGP4: Only protocol used
CS 4284 Spring 2013 RIP (Routing Information Protocol) Distance vector algorithm –Included in BSD-UNIX Distribution in 1982 Distance metric: # of hops (max = 15 hops) Distance vectors: exchanged among neighbors every 30 sec via Response Message (also called advertisement) Each advertisement: list of up to 25 destination nets within AS D C BA u v w x y z destination hops u 1 v 2 w 2 x 3 y 3 z 2 A’s routing table
CS 4284 Spring 2013 RIP: Example Destination Network Next Router Num. of hops to dest. wA2 yB2 zB7 x--1 ….…..... w xy z A C D B Routing table in D
CS 4284 Spring 2013 RIP: Example Destination Network Next Router Num. of hops to dest. wA2 yB2 zB A7 5 x--1 ….…..... w xy z A C D B Dest Next hops w - - x - - z C 4 …. …... Advertisement from A to D Routing table in D
CS 4284 Spring 2013 RIP: Link Failure and Recovery If no advertisement heard after 180 sec → neighbor/link declared dead –routes via neighbor invalidated –new advertisements sent to neighbors –neighbors in turn send out new advertisements (if tables changed) –poison reverse used to prevent ping-pong loops (infinite distance = 16 hops)
CS 4284 Spring 2013 RIP Table processing RIP routing tables managed by application-level process called route-d (daemon) advertisements sent in UDP packets, periodically repeated physical link network forwarding (IP) table Transprt (UDP) routed physical link network (IP) Transprt (UDP) routed forwarding table
CS 4284 Spring 2013 EIGRP Cisco proprietary –See [Cisco Whitepaper], [Malhotra 2002]Cisco WhitepaperMalhotra 2002 Distance Vector Protocol with enhancements –Explicit Signaling (HELLO packets) DUAL “diffusing update algorithm” –“feasible successor” concept guarantees loop freedom Intuition: rather than count to infinity, trigger route recomputation unless another loop-free path is known –Optimize this by keeping track of all advertised routes, not just best one
CS 4284 Spring 2013 OSPF (Open Shortest Path First) “open”: publicly available protocol (not proprietary) Uses Link State algorithm –LS packet dissemination –Topology map at each node –Route computation using Dijkstra’s algorithm OSPF advertisement carries one entry per neighbor router –Advertisements have age field to allow for expiration Advertisements disseminated to entire AS (via flooding) –Carried in OSPF messages directly over IP (rather than TCP or UDP)
CS 4284 Spring 2013 OSPF “advanced” features (not in RIP) Security: all OSPF messages authenticated (to prevent malicious intrusion) Multiple same-cost paths allowed (only one path in RIP) For each link, multiple cost metrics for different TOS (e.g., satellite link cost set “low” for best effort; high for real time) Integrated uni- and multicast support: –Multicast OSPF (MOSPF) uses same topology data base as OSPF Hierarchical OSPF in large domains.
CS 4284 Spring 2013 Hierarchical OSPF Two-level hierarchy: local area, backbone. –link-state advertisements only in same area –each nodes has detailed area topology; only know direction (shortest path) to nets in other areas. Area border routers: “summarize” distances to nets in own area, advertise to other Area Border routers. Backbone routers: run OSPF routing limited to backbone. Boundary routers: connect to other AS’s.
CS 4284 Spring 2013 Internet Inter-AS routing: BGP BGP (Border Gateway Protocol): the de facto standard BGP provides each AS a means to: 1.Obtain subnet reachability information from neighboring ASs. 2.Propagate the reachability information to all routers internal to the AS. 3.Determine “good” routes to subnets based on reachability information and policy. Allows a subnet to advertise its existence to rest of the Internet: “I am here”
CS 4284 Spring 2013 BGP Basics 3b 1d 3a 1c 2a AS3 AS1 AS2 1a 2c 2b 1b 3c eBGP session iBGP session Pairs of routers (BGP peers) exchange routing info over semi-permanent TCP conctns: BGP sessions Note that BGP sessions do not always correspond to physical links. When AS2 advertises a prefix to AS1, AS2 is promising it will forward any datagrams destined to that prefix towards the prefix. –AS2 can aggregate prefixes in its advertisement
CS 4284 Spring 2013 Distributing Reachability Info With eBGP session between 3a and 1c, AS3 sends prefix reachability info to AS1. 1c can then use iBGP do distribute this new prefix reach info to all routers in AS1 1b can then re-advertise the new reach info to AS2 over the 1b- to-2a eBGP session When router learns about a new prefix, it creates an entry for the prefix in its forwarding table. 3b 1d 3a 1c 2a AS3 AS1 AS2 1a 2c 2b 1b 3c eBGP session iBGP session
CS 4284 Spring 2013 Path Attributes & BGP Routes When advertising a prefix, advert includes BGP attributes. –prefix + attributes = “route” Two important attributes: –AS-PATH: contains the ASs through which the advert for the prefix passed: AS 67 AS 17 –NEXT-HOP: Indicates the specific internal-AS router to next-hop AS. (There may be multiple links from current AS to next-hop-AS.) When gateway router receives route advert, uses import policy to accept/decline.
CS 4284 Spring 2013 BGP Route Selection Router may learn about more than 1 route to some prefix. Router must select route. Elimination rules: 1.Local preference value attribute: policy decision 2.Shortest AS-PATH (like DV routing, except with more information!) 3.Closest NEXT-HOP router: hot potato routing 4.Additional criteria
CS 4284 Spring 2013 Path Vector Routing in BGP Accomplished via AS-PATH attributes –Each node is entire AS!
CS 4284 Spring 2013 BGP Messages BGP messages exchanged using TCP. BGP messages: –OPEN: opens TCP connection to peer and authenticates sender –UPDATE: advertises new path (or withdraws old) –KEEPALIVE keeps connection alive in absence of UPDATES; also ACKs OPEN request –NOTIFICATION: reports errors in previous msg; also used to close connection
CS 4284 Spring 2013 BGP routing policy A,B,C are provider networks X,W,Y are customer (of provider networks) X is dual-homed: attached to two networks –X does not want to route from B via X to C –.. so X will not advertise to B a route to C
CS 4284 Spring 2013 BGP routing policy (2) A advertises to B the path AW B advertises to X the path BAW Should B advertise to C the path BAW? –No way! B gets no “revenue” for routing CBAW since neither W nor C are B’s customers –B wants to force C to route to w via A –B wants to route only to/from its customers!
CS 4284 Spring 2013 Relationship between OSPF&BGP OSPF hierarchy is intra-AS BGP connects ASs
CS 4284 Spring 2013 Motivation for different Intra/Inter Protocols Policy: Inter-AS: admin wants control over how its traffic routed, who routes through its net. Intra-AS: single admin, so no policy decisions needed Scale: hierarchical routing saves table size, reduced update traffic Performance : Intra-AS: can focus on performance Inter-AS: policy may dominate over performance
CS 4284 Spring 2013 Usage of Routing Protocols Sample obtained by reverse-engineering router config files Source David Maltz et al: –Routing Design in Operational Networks – A Look from the inside, [SIGCOMM 2004]SIGCOMM 2004 EBGP Sessions IGP OSPFEIGRPRIPTotal Intra-1,4909,62412,74115622,521 Inter-13,8301,1611,3421612,664
CS 4284 Spring 2013 Summary IP –Addressing, subnets ICMP RIP OSPF BGP