Network Layer (Chapter 5)
Topics: Design Issues; Routing Algorithms; Congestion Control; Quality of Service; Internetworking; Network Layer of the Internet
Gray units can be optionally omitted without causing later gaps.
Revised: August 2011. CN5E by Tanenbaum & Wetherall, © Pearson Education-Prentice Hall and D. Wetherall, 2011

The Network Layer
Protocol stack (bottom to top): Physical, Link, Network, Transport, Application.
The network layer is responsible for delivering packets between endpoints over multiple links. It is the lowest layer in the OSI Reference Model that deals with end-to-end transmission, and it provides services to the transport layer.

Design Issues
Store-and-forward packet switching »
Connectionless service – datagrams »
Connection-oriented service – virtual circuits »
Comparison of virtual-circuits and datagrams »

Store-and-Forward Packet Switching
Hosts send packets into the network; packets are forwarded by routers (the ISP’s equipment). Routers treat packets as messages: they receive (store) each one and then forward it based on how it is addressed. Strictly speaking, it is a process running on the host that sends the packet into the network, and a process at the destination that receives it.
Questions: If process P1 on host H1 is sending a message to P2 on H2, then for the packet at H1: What is the destination address at the network layer? What is the destination address at the data link layer?

Connectionless Service – Datagrams
Each packet is forwarded using the destination address inside it, so different packets may take different paths through the ISP’s equipment. This model is like the postal service: each letter is sent through the network independently. A datagram is a packet that contains an absolute destination address; a router need only look up that address in a table to find the outgoing line on which to send the packet.
[Figure: routing tables (columns Dest. and Line) for A (initially), A (later), C, and E.]

Connection-Oriented – Virtual Circuits
A packet is forwarded along a virtual circuit (VC) using a tag inside it; the VC is set up ahead of time through the ISP’s equipment. This model is like the telephone network. Packets carry tags that are not full addresses; a tag need only be unique on a given link, so each router rewrites the incoming tag to an outgoing tag. Setup installs the tag mapping along the entire path, and packets are then sent along it. All packets therefore follow the same path (and arrive in order). “Virtual” refers to the fact that a physical telephone circuit has both a path and a fixed bandwidth reservation of 64 kbps, whereas the bandwidth of a virtual circuit may vary depending on how many VCs share a link.
[Figure: VC tables for A, C, and E, mapping incoming (Line, Tag) to outgoing (Line, Tag).]
Question: For the Internet Protocol Suite, is there ANY connection-oriented protocol at the Network Layer whatsoever?
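The tag rewriting described above can be sketched in a few lines. This is a minimal illustration, not the tables from the slide figure: the router names, line names, and tag values in VC_TABLES are hypothetical, chosen so that two circuits entering router A with the same tag are given different outgoing tags.

```python
# Hypothetical per-router VC tables: (incoming line, incoming tag) -> (outgoing line, outgoing tag).
VC_TABLES = {
    "A": {("H1", 1): ("C", 1), ("H3", 1): ("C", 2)},
    "C": {("A", 1): ("E", 1), ("A", 2): ("E", 2)},
    "E": {("C", 1): ("F", 1), ("C", 2): ("F", 2)},
}


def forward_on_vc(tables, first_router, in_line, tag):
    """Follow one packet along its virtual circuit, rewriting the tag at each router.

    Returns the list of (router, incoming tag, outgoing tag) hops and the
    node where the packet leaves the routers that hold VC tables.
    """
    hops = []
    router = first_router
    while router in tables:
        out_line, out_tag = tables[router][(in_line, tag)]
        hops.append((router, tag, out_tag))
        # The packet now arrives at the next node on the line from this router.
        in_line, tag, router = router, out_tag, out_line
    return hops, router


hops, exit_node = forward_on_vc(VC_TABLES, "A", "H3", 1)
print(hops, exit_node)
```

Both H1 and H3 use tag 1 on their own links to A, so A must rewrite one of them; that is why tags only need to be unique per link.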

CONS in the Internet
In addition to telephony, Tanenbaum argues that there are at least two other examples of connection-oriented protocols in the Internet:
MultiProtocol Label Switching (MPLS) – see pages
Virtual LANs (VLANs) – see pages
Question: Are any of these three Network Layer protocols within the Internet Protocol Suite?

Comparison of Virtual-Circuits & Datagrams

Routing Algorithms (1)
Routing logically comprises two processes:
1) Forwarding: processing arriving packets by looking up the appropriate outgoing link in the routing tables.
2) Filling in and updating the routing tables. This is where routing algorithms come in.
Topics:
Optimality principle »
Shortest path algorithm »
Flooding »
Distance vector routing »
Link state routing »
Hierarchical routing »
Broadcast routing »
Multicast routing »
Anycast routing »
Routing for mobile hosts »
Routing in ad hoc networks »
Grayed-out topics are optional and can be omitted without loss of continuity.

Routing Algorithms (2)
Routing is the process of discovering network paths: model the network as a graph of nodes and links; decide what to optimize (e.g., fairness vs. efficiency); update routes for changes in topology (e.g., failures). Forwarding is the sending of packets along a path. Distinguish routing from forwarding.
We focus on adaptive routing schemes that update routes in response to failures. Some traffic-aware schemes also adapt to changes in traffic, but we do not consider them in the algorithms that follow.
Fairness example, in which the vertical communications saturate the horizontal links: the traffic demands on the graph are A->A’, B->B’, C->C’ and X->X’.
What would be fair? Each flow gets the same amount of bandwidth. If all network links have unit capacity, we would give each flow ½ unit of capacity. The total network traffic is then 2 units.
What would be efficient? If we gave the X->X’ flow no bandwidth, we could give each of the other three flows 1 unit. The total network traffic is then 3 units. That is more efficient, but it is not fair. So we have to decide what we want to optimize.

The Optimality Principle
Each portion of a best path is also a best path; the union of the best paths to a router is a tree called the sink tree. In the example, “best” means fewest hops.
[Figure: a network, and the sink tree of best paths to router B.]
Proof by contradiction: if a portion of a best path were not itself a best path, a better portion would exist. Substituting it would give a better overall path, which cannot happen if the overall path is already best.
For sink trees, if multiple paths are equally good, one of them is chosen at random. For example, H can be reached in 3 hops via H-D-A-B as shown, or by H-F-A-D (not shown). This is simple and useful because there is a single route from each router to each destination. If, instead, all equally good paths are kept, their union is a DAG (directed acyclic graph). This more general case permits multiple paths from a router to a destination.
The goal of all routing algorithms is to discover and use either sink trees or DAGs, so that no router has routing loops. DAGs are like sink trees except that they allow any of the non-looping paths to be chosen.

Shortest Path Algorithm (1)
Shortest path selects the most efficient path through a graph in terms of the specific metric used by that Autonomous System (AS), e.g., number of hops, distance, latency, bandwidth, average delay, communication cost, or measured delay.
Dijkstra’s algorithm computes a sink tree on the graph: each link is assigned a non-negative weight/distance; the shortest path is the one with the lowest total weight; using weights of 1 gives paths with the fewest hops.
Algorithm:
1) Start with the sink; set the distance at all other nodes to infinity.
2) Relax (i.e., re-evaluate) the distance to adjacent nodes.
3) Pick the adjacent node with the lowest distance and add it to the sink tree.
4) Repeat until all nodes are in the sink tree.
The notion of weight generalizes distance to other cost metrics. Setting weights to distance gives paths that are shortest or lowest delay. Setting weights lower for higher-capacity links favors higher-capacity paths.
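The four steps above can be sketched as follows. This is a minimal priority-queue version of Dijkstra’s algorithm, not the textbook’s listing; the edge list EDGES is a hypothetical weighted, undirected graph used only for illustration. It computes distances outward from one node, which for undirected links gives the same paths as computing backwards from the sink.

```python
import heapq


def dijkstra(graph, source):
    """Least-cost distances and predecessor links from `source`.

    graph: dict mapping node -> dict of neighbor -> non-negative weight.
    """
    dist = {node: float("inf") for node in graph}  # step 1: all unreachable
    prev = {node: None for node in graph}
    dist[source] = 0
    heap = [(0, source)]
    done = set()
    while heap:
        d, u = heapq.heappop(heap)      # step 3: lowest tentative distance
        if u in done:
            continue                    # stale heap entry, already permanent
        done.add(u)
        for v, w in graph[u].items():   # step 2: relax the neighbors of u
            if d + w < dist[v]:
                dist[v] = d + w
                prev[v] = u
                heapq.heappush(heap, (d + w, v))
    return dist, prev


def path_to(prev, target):
    """Rebuild the path to `target` by walking the predecessor links."""
    path = []
    while target is not None:
        path.append(target)
        target = prev[target]
    return path[::-1]


# Hypothetical example graph as (node, node, weight) edges.
EDGES = [("A", "B", 2), ("A", "G", 6), ("B", "C", 7), ("B", "E", 2),
         ("G", "E", 1), ("G", "H", 4), ("E", "F", 2), ("F", "C", 3),
         ("F", "H", 2), ("C", "D", 3), ("H", "D", 2)]
GRAPH = {}
for u, v, w in EDGES:
    GRAPH.setdefault(u, {})[v] = w
    GRAPH.setdefault(v, {})[u] = w

dist, prev = dijkstra(GRAPH, "A")
print(dist["D"], path_to(prev, "D"))  # 10 ['A', 'B', 'E', 'F', 'H', 'D']
```

Running the loop until the heap is empty yields the whole sink tree; stopping as soon as the destination is popped would find a single path sooner.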

[Figure: a weighted, undirected graph of a network and the first five steps in computing the shortest path from A to D. Labels are (distance, path); pink arrows show the sink tree so far.]
Start at the sink and compute backwards.
Note: Dijkstra’s algorithm == shortest path algorithm.

[Code listing not reproduced. Annotations:] Start with the sink; all other nodes are unreachable. Relaxation step: lower the distance to nodes linked to the newest member of the sink tree.

[Code listing, continued. Annotation:] Find the lowest distance, add it to the sink tree, and repeat until done.
This code keeps going until it reaches the sought-after destination node; if it continued until there were no tentative nodes left, it would have found the entire sink tree. The predecessor links can be reversed to find the path from t -> s instead of from s -> t, because each link has the same cost in both directions, so paths are symmetric.

Flooding
A simple method to send a packet to all network nodes. Each node floods a new packet received on an incoming link by sending it out on all of the other links. Nodes need to keep track of flooded packets to stop the flood; even with a hop limit, the number of copies can blow up exponentially. Note that flooding does not actually find any routes that can be reused.
Flooding is SOLELY used by routing protocols at the IP layer. For example, it is used by the Protocol Independent Multicast – Dense Mode (PIM-DM) routing protocol (i.e., flood and prune to create multicast paths). Flooding is NOT a service that is available to end users.
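To make the bookkeeping concrete, here is a small sketch of flooding in which every node remembers that it has already flooded the packet and drops duplicates. The four-node ring in LINKS is a hypothetical topology, and the simulation processes transmissions one at a time in FIFO order as a simplifying assumption.

```python
from collections import deque


def flood(links, source):
    """Flood one packet from `source`.

    links: dict mapping node -> list of neighbors (undirected).
    Returns (number of link transmissions, set of nodes reached).
    """
    seen = {source}
    queue = deque((source, nbr) for nbr in links[source])
    transmissions = 0
    while queue:
        sender, node = queue.popleft()
        transmissions += 1
        if node in seen:
            continue  # duplicate copy: already flooded here, so drop it
        seen.add(node)
        for nbr in links[node]:
            if nbr != sender:  # send on all links except the incoming one
                queue.append((node, nbr))
    return transmissions, seen


# Hypothetical ring A-B-C-D-A.
LINKS = {"A": ["B", "D"], "B": ["A", "C"], "C": ["B", "D"], "D": ["A", "C"]}
print(flood(LINKS, "A"))
```

Even on this tiny ring the packet crosses 5 links to reach 3 other nodes; without the `seen` check the copies would circulate indefinitely.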

Distance Vector Routing (1)
The Border Gateway Protocol (BGP) uses a form of distance vector routing (path vector). BGP is the inter-domain routing protocol used by the Internet, i.e., the protocol used to route between Autonomous Systems (AS). Distance vector routing uses the Bellman-Ford routing algorithm.
Distance vector is a distributed routing algorithm: the shortest path computation is split across nodes. Each router maintains its own routing table giving the best known distance (and the link to use) to every router in the network.
Algorithm:
1) Each node knows the distance of the links to its neighbors.
2) Each node advertises a vector of lowest known distances to all neighbors.
3) Each node uses the received vectors to update its own.
4) Repeat periodically.
How long will it take to converge? Roughly a number of exchanges equal to the diameter of the network.
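Step 3 of the algorithm, the update a router performs when it has its neighbors’ vectors, can be sketched as below. The link costs and vector entries are illustrative numbers for a router J with neighbors A, I, H, and K; they are assumptions for this sketch, not values read from the slide figure.

```python
INF = float("inf")


def dv_update(me, link_cost, neighbor_vectors, destinations):
    """One Bellman-Ford update at router `me`.

    link_cost: neighbor -> cost of the direct link to that neighbor.
    neighbor_vectors: neighbor -> {destination: advertised distance}.
    Returns {destination: (best distance, neighbor to forward to)}.
    """
    table = {}
    for dest in destinations:
        if dest == me:
            table[dest] = (0, None)
            continue
        best = (INF, None)
        for nbr, cost in link_cost.items():
            d = cost + neighbor_vectors[nbr].get(dest, INF)
            if d < best[0]:
                best = (d, nbr)
        table[dest] = best
    return table


# Illustrative values only.
LINK_COST = {"A": 8, "I": 10, "H": 12, "K": 6}
VECTORS = {
    "A": {"A": 0, "G": 18},
    "I": {"A": 24, "G": 31},
    "H": {"A": 20, "G": 6},
    "K": {"A": 21, "G": 31},
}
print(dv_update("J", LINK_COST, VECTORS, ["A", "G", "J"]))
```

For destination G the candidates are 8+18, 10+31, 12+6, and 6+31, so J records distance 18 via H. Note that J never sees the topology, only its neighbors’ claimed distances.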

Distance Vector Routing (2)
[Figure: a network; the vectors received at J from neighbors A, I, H, and K; and the new vector computed for J.]

The Count-to-Infinity Problem
The distance vector (DV) algorithm has a convergence problem: it can converge to a correct routing map slowly because it reacts rapidly to good news but leisurely to bad news. Failures can cause DV to “count to infinity” while seeking a path to an unreachable node.
In the example, router A is 4 hops away from router E, the metric is hop count, and the tables show each router’s routing entry for A.
Good news of a path to A spreads quickly.
Bad news of no path to A is learned slowly. When the link to A fails (marked X), B knows it no longer has a link to A, so it chooses a route through one of its neighbors, giving a distance of 3 hops. The other routers do not know that their only path to A runs through B, while B thinks there is a path through C.
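The slow spread of bad news can be reproduced with a short simulation. This sketch assumes a line topology A-B-C-D-E with hop-count metric, synchronous exchanges in which every router updates from its neighbors’ previous values, and an assumed cap of 16 standing in for “infinity”.

```python
def count_to_infinity(rounds, infinity=16):
    """Distances to A held by B, C, D, E on the line A-B-C-D-E after A fails.

    Returns the list of distance vectors, one per exchange round,
    starting with the state just before the failure.
    """
    dist = [1, 2, 3, 4]  # B, C, D, E before the failure
    history = [dist[:]]
    for _ in range(rounds):
        new = []
        for i in range(len(dist)):
            # Each router hears only from its remaining neighbors (A is gone).
            nbrs = [dist[j] for j in (i - 1, i + 1) if 0 <= j < len(dist)]
            new.append(min(infinity, 1 + min(nbrs)))
        dist = new
        history.append(dist[:])
    return history


for row in count_to_infinity(4):
    print(row)
```

After four exchanges the routers still believe A is 5 or 6 hops away; the values only stop growing once they hit the cap, which is why the choice of “infinity” bounds the convergence time.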

Link State Routing (1)
Link state is an alternative to distance vector: more computation but simpler dynamics. It is widely used in the Internet for intra-domain routing, i.e., routing within an AS (OSPF, IS-IS).
Algorithm:
1) Each node floods information about its neighbors in LSPs (Link State Packets); all nodes learn the full network graph, with an identical view of the network topology.
2) Each node runs Dijkstra’s algorithm to compute the path to take from itself to each destination.

Link State Routing (2) – LSPs
The LSP (Link State Packet) for a node lists its neighbors and the weights of the links to reach them. Fields: sender ID, sequence number, age, and a list of (neighbor, cost) pairs.
[Figure: a network and the LSP for each node.]
When a router is booted, it learns who its neighbors are by sending a Hello packet on each of its NICs. The adjacent router replies, giving its name. Routers on broadcast LANs select a designated router to reply for the LAN; the LAN is therefore treated as if it were a single node.
Each link is given a distance or cost metric. For systems that use delay as the metric, delay can be determined with ECHO packets.
The LSPs are then constructed, and each router floods its LSP to all routers in the system. The age field is decremented once per second, and the packet is discarded once the age hits zero.

Link State Routing (3) – Reliable Flooding
Sequence number and age are used for reliable flooding. New LSPs are acknowledged on the lines they are received on and sent on all other lines.
[Figure: the packet buffer for router B. E’s info arrived twice: via EAB and via EFB.]
In the example, B has links to A, C, and F. It received the LSPs from A, C, and F directly, so it acknowledged each to its sender and sent that LSP on both other links. But B received the LSPs from E and D on two links each, so it acknowledged both and sent only on the third link.
One row of the database is used for each recently arrived but not yet fully processed LSP. A 1 in a Send flag indicates a link on which the info needs to be sent, and a 1 in an ACK flag indicates a link on which receipt needs to be acknowledged.
The next step is for each node to run Dijkstra’s algorithm locally on the received info. Because each link is reported separately by the routers at its two ends, the two directions of the same path may have different costs.
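The accept-or-discard decision behind the Send and ACK flags can be sketched as follows. This is a simplified model under stated assumptions: the database keeps only the highest sequence number seen per sender, the age field is ignored, and the dictionary field names `sender` and `seq` are hypothetical.

```python
def handle_lsp(database, lsp, arrived_on, lines):
    """Process one LSP arriving at a router.

    database: sender -> highest sequence number seen so far (updated in place).
    lines: all lines (neighbors) of this router.
    Returns (lines to forward the LSP on, lines to acknowledge on).
    """
    sender, seq = lsp["sender"], lsp["seq"]
    if seq > database.get(sender, -1):
        database[sender] = seq  # new information: store it and keep flooding
        send_on = [line for line in lines if line != arrived_on]
        return send_on, [arrived_on]
    # Duplicate or older LSP: acknowledge it but do not forward it again.
    return [], [arrived_on]


db = {}
lines_of_b = ["A", "C", "F"]
print(handle_lsp(db, {"sender": "E", "seq": 21}, "A", lines_of_b))
print(handle_lsp(db, {"sender": "E", "seq": 21}, "F", lines_of_b))
```

A real router holds the LSP briefly before sending, so a second copy arriving on F during that hold cancels the pending send toward F; that is how E’s LSP ends up being sent only on the third link in the slide’s example.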

Hierarchical Routing
Routing tables grow as networks grow, which may cause problems. Hierarchical routing (HR) divides routers into regions for a 2-level hierarchy; 3 or more levels are possible. Kamoun and Kleinrock: the optimal number of levels for an N-router network is ln N.
Hierarchical routing reduces the work of route computation but may result in slightly longer paths than flat routing. It is what you think it is: e.g., to reach a given telephone, first head towards the right country, then the right city in the country, then the phone in the city. Each node keeps only one entry per region for other regions, plus an entry for every node in its local region. The advantages are smaller routing tables, smaller routing computations to run at nodes, and fewer/smaller messages to send to describe the network.
[Figure annotation: best choice to reach nodes in region 5, except for 5C.]
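The table-size saving follows directly from the rule “one entry per local node, plus one entry per other region”. The sketch below applies that rule at every level; the 720-router network and its groupings are an assumed example, not taken from the slide.

```python
def table_entries(group_sizes):
    """Routing-table entries per router in a hierarchy.

    group_sizes: sizes from the lowest level up, e.g.
    [routers per region, regions per cluster, clusters].
    A router needs one entry per router in its own region, plus one
    entry per OTHER group at each higher level.
    """
    return group_sizes[0] + sum(size - 1 for size in group_sizes[1:])


print(table_entries([720]))       # flat routing: 720
print(table_entries([30, 24]))    # 24 regions of 30 routers: 53
print(table_entries([10, 9, 8]))  # 8 clusters of 9 regions of 10 routers: 25
```

Each extra level shrinks the table further, at the price of paths that may be slightly longer than the flat optimum.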

Tanenbaum’s Use of “Broadcast” at the Network Layer
Tanenbaum confusingly uses “broadcast” to describe how routers support multicast (MC). Broadcast is NOT a service available to the end user at the network layer within the Internet protocol suite.
Routing algorithms support multicast via two alternative methods:
1) Flood packets and then prune back to create a spanning tree.
2) Create a spanning tree from a common root location, known as core-based trees.
Forwarding for MC may use Reverse Path Forwarding (RPF).
End users (including applications) have 3 service choices at the Network Layer: unicast, multicast, and anycast.

Broadcast Routing
Broadcast sends a packet to all nodes simultaneously.
RPF (Reverse Path Forwarding): a broadcast received on the link that leads to the source is sent out on all remaining links. When an MC packet arrives at a router, the router checks whether it arrived on the link the router itself would normally use to send packets toward the source. If the routing entry for the source IP address matches, the RPF check passes and the packet is forwarded to all other interfaces of that MC group; otherwise the packet is dropped.
RPF can be used by distance vector routing systems. Alternatively, sink trees can be built (using link state) and used at all nodes.
Why use RPF? It requires only the regular (unicast) routing table at each node, such as the one built by distance vector, so it can be widely used. Sink trees are only available with a protocol that explicitly computes them, such as link state. Note that broadcast with sink trees requires each node to compute all sink trees, since the broadcast is forwarded by looking up the sink tree for the source at each node, not a single broadcast tree for the network (as in the LAN spanning tree). However, using sink trees is more efficient, since RPF over-sends. For example, D is reached from F (going down the sink tree) as well as from G (going out all remaining links).
[Figure: a network; the sink tree for I, which gives an efficient broadcast; and the RPF tree from I, which is larger than the sink tree.]
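The RPF check itself is a single table lookup, as this sketch shows. The router’s unicast table, the link names, and the source name are hypothetical; the only input RPF needs is the link the router would use to send unicast traffic toward the source.

```python
def rpf_forward(unicast_next_hop, source, arrived_on, links):
    """Reverse path forwarding decision at one router.

    unicast_next_hop: destination -> link this router uses to reach it.
    Returns the links to forward the packet on (empty list means drop).
    """
    if unicast_next_hop.get(source) != arrived_on:
        return []  # did not arrive on the reverse path: fail the check, drop
    return [link for link in links if link != arrived_on]


# Hypothetical router that reaches source I via its link to F.
NEXT_HOP = {"I": "F"}
LINKS = ["F", "G", "H"]
print(rpf_forward(NEXT_HOP, "I", "F", LINKS))  # ['G', 'H']
print(rpf_forward(NEXT_HOP, "I", "G", LINKS))  # []
```

The second call is the over-sending case: a copy still crosses the link from G before being dropped, which is the cost RPF pays for not computing sink trees.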

Multicast Routing (1)
Multicast sends to a subset of the nodes called a group. It uses a different tree for each group and source.
[Figure: a network with groups 1 & 2; the spanning tree from source S; the multicast tree from S to group 1; the multicast tree from S to group 2.]
The example shows two of the multicast trees computed in the network. There are many more: the number of nodes times the number of groups. This is worth the effort when the groups densely cover the network, i.e., most groups affect most nodes in the network, so it makes sense for all nodes to build an efficient multicast tree.

Multicast Routing (2) – Sparse Case
CBT (Core-Based Tree) uses a single tree to multicast. The tree is the sink tree from the core node to the group members. A multicast packet heads towards the core until it reaches the CBT, then travels down the tree. Used by PIM-SM.
The tradeoff is that CBT is less efficient than computing the spanning tree for each source to reach each group, but it takes less work to scale to large networks and many groups. With CBT, nodes that are not on the group spanning tree do not need to compute it and can simply send to the core node using their regular routing tables. This is a good tradeoff when the groups sparsely cover the network, i.e., there are many groups that most nodes do not need to know about.
[Figure: the sink tree from the core to group 1; a multicast is sent towards the core, then down once it reaches the sink tree.]

Anycast Routing
Anycast sends a packet to one (the nearest) member of a group; all members of the group use the same well-known IP address. It can be used by services, e.g., DNS.
Anycast falls out of regular routing with a node that appears in many places. Distance vector sends the packet along the shortest path to that address. Link state distinguishes between routers and hosts; it can also resolve an anycast address as long as the anycast nodes are in different parts of the network from each other (e.g., in different network areas or ASes).
We will see later that anycast is used in practice to reach the nearest root DNS server.
[Figure: anycast routes to group 1, and the apparent topology of the sink tree to “node” 1. The example pretends that 1 is a valid IP address.]

Mobility
Routers, data links, applications, and humans may have different concepts of what “mobility” is. Humans think “mobility” means changing locations. Networks only consider that “mobility” exists when the same IP address is used outside of its normal topological location. (Recall from the “Keys to Kingdom” lecture that IP addresses are locators, not identifiers.)
Consider:
A user who moves within a satellite’s “beam” is not considered mobile from the satellite’s perspective, even if the move covers a substantial geographical distance.
A user who moves within the cell phone system is handled by cell protocols; this is not considered mobility from IP’s perspective.
A user who moves between wi-fi (IEEE ) hotspots. From IP’s perspective, it is not mobility if the user gets a new IP address at the new hotspot; it is mobility if the user does not get a new IP address at the new hotspot (but the user probably will, in order to use that wireless LAN).
Mobility from an Application Layer perspective (e.g., DNS): this is where Mobile IP is used, because the retrieved IP address is a locator, not an ID.

Routing for Mobile Hosts
Mobile IP: for many apps (VoIP, VPN), sudden changes of IP address cause problems. The Mobile IP protocol is often used when users carry mobile devices across multiple LAN subnets (e.g., IP over DVB, WLAN, WiMAX, BWA).
Mobile hosts can be reached via a home agent. The fixed home agent tunnels packets to reach the mobile host; the reply can optimize the path for subsequent packets. No changes are needed to routers or fixed hosts.
The tradeoff being made here is that the routing system that computes spanning trees is not changed at all, but routes to reach mobile hosts can be circuitous when the mobile is far from home.

Routing in Ad Hoc Networks
The network topology changes as wireless nodes move. Routes are often made on demand, e.g., with AODV.
[Figure, four panels:] A starts to find a route to I. A’s broadcast reaches B & D. B’s and D’s broadcasts reach C, F & G. C’s, F’s and G’s broadcasts reach H & I.

Congestion Control (1)
Congestion causes packet delay and loss that degrades performance. Handling congestion is the responsibility of the Network and Transport layers working together. We look at the Network portion here.
Traffic-aware routing » Section in textbook
Admission control » Section in textbook
Traffic throttling » Section in textbook
Load shedding » Section in textbook
[Figure: timescales of approaches to congestion control.]

Congestion Control (2)
Congestion results when too much traffic is offered; performance degrades due to loss and retransmissions. Goodput (= useful packets) trails the offered load.
As offered load increases, goodput should increase correspondingly until the capacity of the network is reached. Goodput trails offered load because the load is bursty: queues will occasionally be too full, and a packet will be discarded inside the network.
Congestion collapse can occur if the protocols are not carefully designed. Nodes retransmit packets many times, believing that they have been lost, when copies of the packet are still in the network (in queues at routers) pending delivery. While throughput at a receiver may be high, goodput falls because multiple copies of the same packet are being received, and after the first copy the bandwidth is wasted. This really happened in the late 1980s as the Internet grew, and it led to the design of modern TCP, which includes congestion control mechanisms.

Congestion Control (3) – Approaches
The network must do its best with the offered load, using different approaches at different timescales. Nodes should also reduce the offered load (Transport).
Provisioning – network deployment. Provisioning is simply sizing the network to fit the offered load, i.e., don’t build it too small, or with little West-to-East capacity if there is much West-to-East traffic.
Traffic Aware – e.g., splitting traffic across multiple paths
Admission Control – decrease network load (i.e., traffic entering the network)
Traffic Throttling – e.g., explicit congestion notification (ECN)
Load Shedding – packet drop approaches and algorithms

Traffic-Aware Routing
Choose routes depending on traffic, not just topology. Traffic is shifted away from congested regions by setting the link weight to be a function of the link bandwidth and propagation delay plus the (variable) measured load or queuing delay. Least-weight paths then favor paths that are more lightly loaded. This is rarely done today; traffic engineering (TE), which is done outside of routing protocols, is preferred instead (e.g., QoS).
E.g., use the EI link for West-to-East traffic if CF is loaded. But take care to avoid oscillations (i.e., convergence issues).
Our previous routes only considered topology; this approach can get more traffic through the network. If we are not careful, routing can notice that CF is busy and switch traffic over to EI, only to notice later that EI is busy and switch traffic back to CF. There are various techniques to avoid this: 1) change routes only slowly, e.g., traffic engineering in which an external system sets weights and the routing system does not otherwise adapt; and 2) use multiple paths at once, e.g., both CF and EI.

Admission Control
Admission control allows a new traffic load only if the network has sufficient capacity. The approach is widely used in virtual-circuit networks (e.g., CONS, telephony). It can be combined with looking for an uncongested route.
[Figure: a network with some congested nodes; the uncongested portion and a route AB around the congestion.]

Traffic Throttling
Congested routers signal hosts to slow down traffic. The network aims to operate just before the onset of congestion. This requires that (1) routers can discern when congestion is occurring or about to occur (e.g., from queueing delay), and (2) routers can deliver timely feedback to senders so that they throttle back their rate.
ECN (Explicit Congestion Notification) marks packets, and the receiver returns the signal to the sender. A router sets the 2 ECN bits in the IP packet header to signal that it is experiencing congestion; the destination echoes this back to the sender in its reply. The ECN bits are the 2 least significant (rightmost) bits of the DiffServ field in the IP header. In TCP, the echo is indicated using the ECE bit of the TCP header; the sender then knows to throttle back its packet rate at the TRANSPORT layer.
There are other designs, but this is the main one under deployment in the Internet. By marking existing packets using bits in the IP header, routers avoid sending additional packets at a time of congestion. The signal from receiver to sender is carried using a transport protocol like TCP.
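The bit layout described above can be shown with a few bit operations on the 8-bit field that carries the 6-bit DiffServ codepoint and the 2 ECN bits. The codepoint names follow the ECN specification (RFC 3168); the function names are hypothetical, and treating the field as a plain integer is a simplification for this sketch.

```python
# ECN codepoints in the two least significant bits (RFC 3168).
ECN_NAMES = {0b00: "Not-ECT", 0b01: "ECT(1)", 0b10: "ECT(0)", 0b11: "CE"}


def ecn_bits(tos_byte):
    """The 2 rightmost bits of the 8-bit DiffServ/ECN field."""
    return tos_byte & 0b11


def dscp(tos_byte):
    """The 6 leftmost bits: the DiffServ codepoint."""
    return tos_byte >> 2


def mark_congestion(tos_byte):
    """Set Congestion Experienced (CE) if the packet is ECN-capable.

    Packets marked Not-ECT are returned unchanged; a congested router
    would have to drop those instead of marking them.
    """
    if ecn_bits(tos_byte) == 0b00:
        return tos_byte
    return tos_byte | 0b11


tos = (46 << 2) | 0b10  # DSCP 46 with ECT(0)
marked = mark_congestion(tos)
print(dscp(marked), ECN_NAMES[ecn_bits(marked)])  # 46 CE
```

Marking leaves the DiffServ codepoint untouched, so the same byte carries both the per-hop QoS class and the congestion signal.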

Load Shedding (1)
When all else fails, the network will drop packets (shed load). E.g., Random Early Detection (RED) drops packets when a measure such as the average queue length exceeds a threshold.
Choke notification can be done end-to-end (E2E) or link-by-link (LbL):
E2E – source quench (e.g., at TCP)
LbL – routers start throttling once they get a choke packet
Link-by-link produces rapid relief but requires larger buffering capability in intermediate routers.
[Figure: link-by-link backpressure, steps 1-5.]
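A common way to realize the RED threshold idea is to track a moving average of the queue length and drop with a probability that rises between two thresholds. The sketch below follows that classic scheme; the parameter names (`min_th`, `max_th`, `max_p`, `weight`) and the sample values are assumptions for illustration.

```python
def update_average(avg_queue, current_queue, weight=0.002):
    """Exponentially weighted moving average of the queue length."""
    return (1 - weight) * avg_queue + weight * current_queue


def red_drop_probability(avg_queue, min_th, max_th, max_p):
    """Drop probability for an arriving packet under RED.

    Below min_th nothing is dropped; at or above max_th everything is
    dropped; in between the probability rises linearly up to max_p.
    """
    if avg_queue < min_th:
        return 0.0
    if avg_queue >= max_th:
        return 1.0
    return max_p * (avg_queue - min_th) / (max_th - min_th)


for avg in (4, 10, 15):
    print(avg, red_drop_probability(avg, min_th=5, max_th=15, max_p=0.1))
```

Using the average rather than the instantaneous queue length lets short bursts through while still reacting to sustained build-up.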

Load Shedding (2)
End-to-end takes longer to have an effect, but can better target the cause of congestion.
[Figure: end-to-end backpressure, steps 1-7.]

Quality of Service
Application requirements »
Traffic shaping »
Packet scheduling »
Admission control »
Integrated services »
Differentiated services »

Application Requirements (1)
Different applications care about different properties. We want all applications to get what they need.
[Table not reproduced.] “High” means a demanding requirement, e.g., low delay.

Application Requirements (2)
QoS is part of the routing policy decisions of an Autonomous System (AS). Routers within an AS are configured to reflect the policies of that specific AS. QoS provides an optional mechanism for routers to tailor their behavior to the differing needs of specific applications.
The network provides service with different kinds of QoS (Quality of Service) to meet application requirements. Example of QoS categories from ATM networks:
Network Service                    Application
Constant bit rate                  Telephony
Real-time variable bit rate        Videoconferencing
Non-real-time variable bit rate    Streaming a movie
Available bit rate                 File transfer
Videoconferencing is variable bit rate because video is normally compressed, so the bit rate varies over time. Telephony is typically carried at a lower, fixed rate.

Traffic Shaping (1)
Traffic shaping regulates the average rate and burstiness of a flow of data entering the network, i.e., it regulates the traffic offered to the network. It enables ASes to make Service Level Agreement (SLA) “guarantees”. For example, packets in excess of the agreed-upon pattern might be dropped by the network or marked as having a lower priority.
Traffic policing = monitoring a traffic flow.
Two common algorithms (leaky bucket and token bucket; see next slides) are used to limit the long-term rate of a flow while allowing short-term bursts up to a maximum regulated length.
[Figure: where traffic is shaped, at the point it enters the network.]

Traffic Shaping (2)
Token and leaky bucket algorithms limit both the average rate (R) and the short-term burst (B) of traffic.
Leaky bucket algorithm: no matter at what rate packets enter the bucket, the outflow is at a constant rate (R) or less.
Token bucket algorithm: to send a packet, the sender must be able to take tokens out of the bucket. No more than a fixed number of tokens (B) can accumulate in the bucket.
For the token bucket, the bucket size is B, and water enters at rate R and is removed to send; the leaky bucket is the opposite.
[Figure: leaky bucket (must not be full to send) and token bucket (need some water to send).]
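The token bucket rule can be sketched as a small class. This is a minimal illustration under stated assumptions: tokens are counted in the same unit as packet size, the bucket starts full, and the caller supplies the current time, so the class and method names here are hypothetical rather than any library’s API.

```python
class TokenBucket:
    """Token bucket with fill rate R (tokens per second) and capacity B."""

    def __init__(self, rate, capacity):
        self.rate = rate
        self.capacity = capacity
        self.tokens = capacity  # assumption: the bucket starts full
        self.last = 0.0

    def allow(self, now, size):
        """Spend `size` tokens and return True if the packet may be sent now."""
        elapsed = now - self.last
        # Tokens accumulate at rate R but never beyond the capacity B.
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        self.last = now
        if size <= self.tokens:
            self.tokens -= size
            return True
        return False


bucket = TokenBucket(rate=10, capacity=20)
print(bucket.allow(0.0, 20))  # burst of B is allowed
print(bucket.allow(0.0, 1))   # bucket is empty
print(bucket.allow(0.5, 5))   # 0.5 s at R=10 refilled 5 tokens
```

The capacity B bounds the largest burst, while the rate R bounds the long-term average; a packet larger than B can never be sent by this sketch.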

Traffic Shaping (3)
Figure 5-29 in textbook: shaping with a token bucket, where R = token arrival rate and B = token bucket capacity. Panels (a)-(c) show the traffic that results from different R and B configurations, and panels (d)-(f) show the corresponding token bucket status.
Host traffic (bursty): R=200 Mbps, B=16000 KB
Shaped by token bucket: R=200 Mbps, B=9600 KB
Shaped by token bucket: R=200 Mbps, B=0 KB
Traffic is queued on the host for release into the network; there is always a packet waiting to be sent when allowed. A smaller bucket size delays traffic and reduces burstiness.
For the host traffic, the descriptor R=200 Mbps, B=16000 KB is the smallest token bucket that lets the traffic pass unchanged. To compute it, we work out R as the average rate over the time period; then, given R, we find the smallest B such that the bucket level only just reaches zero at some point.

Packet Scheduling (1)
Packet scheduling provides a mechanism for the network administrators of an AS to reserve resources for certain types of traffic (“flows”). Resources can be bandwidth, buffer space, and/or CPU cycles. Packet scheduling divides router/link resources among traffic flows, with alternatives to FIFO (First In First Out).
[Figure: example of round-robin queuing.]

Packet Scheduling (2)
Fair Queueing approximates bit-level fairness with different packet sizes. Weights change the target levels, so queues can have different rates (i.e., priorities); the result is WFQ (Weighted Fair Queueing). Virtual times are measured in rounds, where a round lets each input queue send 1 bit for weight 1, or W bits for weight W. The time to send a packet of length L is thus L/W. The finish virtual time of a packet is the larger of its arrival time and the finish time of the previous packet in the same queue, plus the time to send it: F_i = max(A_i, F_{i-1}) + L_i/W. Finish virtual times determine transmission order, so packets may be sent out of arrival order.
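The finish-time formula can be checked with a short sketch. The function name, the queue names, and the packet list below are hypothetical; only the rule F_i = max(A_i, F_{i-1}) + L_i/W comes from the slide.

```python
def wfq_finish_times(packets, weights):
    """Tag packets with WFQ finish virtual times and return them in send order.

    packets: list of (queue, arrival_virtual_time, length), listed in
             arrival order within each queue.
    weights: dict mapping queue name to its weight W.
    """
    last_finish = {}  # finish time of the previous packet in each queue
    tagged = []
    for queue, arrival, length in packets:
        start = max(arrival, last_finish.get(queue, 0.0))
        finish = start + length / weights[queue]
        last_finish[queue] = finish
        tagged.append((finish, queue, length))
    # Packets are transmitted in order of increasing finish virtual time.
    return sorted(tagged)


# Hypothetical example: queue "b" has twice the weight of queue "a".
weights = {"a": 1, "b": 2}
packets = [("a", 0, 8), ("b", 0, 8), ("a", 5, 6), ("b", 1, 4)]
order = wfq_finish_times(packets, weights)
finish_times = [finish for finish, _, _ in order]
send_queues = [queue for _, queue, _ in order]
```

In this example both packets of the heavier queue finish before the first packet of queue "a", even though one of them arrived later.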

Overview: 2 Different Protocols for QoS
Two QoS approaches are supported by IP routers. DiffServ: a per-hop mechanism offering better scalability; uses ECN and the DiffServ field in the IP header. Question: Who knows what “per hop” means? IntServ: a tighter end-to-end (E2E) QoS mechanism for real-time traffic, for specific flows that are established using RSVP. Cisco: the two approaches are complementary and NOT mutually exclusive. However, the textbook (Tanenbaum) is oriented to IntServ. Instructor’s experience: never encountered a deployment that didn’t support DiffServ, but have encountered many devices and deployments that could not support IntServ. During the instructor’s career IntServ has been experimental while DiffServ has been mature. Question: What is the practical difference between experimental and mature?

Admission Control (1)
QoS “guarantees” are established through the process of admission control, which is a necessary part of IntServ QoS. Admission control takes a traffic flow specification, decides whether the network can carry it, and sets up packet scheduling to meet the QoS. By contrast, DiffServ is best effort: there, admission control serves solely to ensure that the customer’s DiffServ labels on packets are appropriate for the contractual service level agreements (policing). Example flow specification for IntServ (figure annotations): token bucket for the maximum sustained rate; token bucket for the largest burst; maximum transmission rate tolerated; packet sizes, which reflect the processing overheads supported.

Admission Control (2)
Example showing the Parekh and Gallager method, which relates flow specifications to router resources for IntServ. Construction to guarantee bandwidth B and delay D: shape the traffic source to an (R, B) token bucket (R = average rate; B = burst), then run WFQ with a weight W such that W / (sum of all weights) > R / capacity. The “guarantee” is accomplished by setting a high enough weight to support the flow, and it holds for all traffic patterns and all topologies. Bandwidth is guaranteed at each router by setting a high enough weight on the flow; if this cannot be done then the flow must not be admitted. Delay guarantees are more subtle and the bound is not given here. Essentially, a burst of traffic can arrive at one router and be delayed, but it will not be delayed at other routers because it has already been shaped to be less bursty. So the total delay is roughly the propagation delay plus B/R.

Integrated Services (1)
Design with QoS for each flow; handles multicast traffic. Admission with RSVP (Resource reSerVation Protocol): the receiver sends a request back to the sender; each router along the way reserves resources; routers merge multiple requests for the same flow; either the entire path is set up, or the reservation is not made.

Integrated Services (2)
Figure panels: R3 reserves flow from S1; R3 reserves flow from S2; R5 reserves flow from S1, merged with R3 at H.

Differentiated Services (1)
Design with classes of QoS (done on a router-by-router level through configuring per-hop behaviors (PHBs) for the DiffServ field of the IP header); customers buy what they want through “service level agreements”. The expedited class is sent in preference to the regular class: its PHB is given preferential treatment. There is less expedited traffic, but better quality for applications.

Differentiated Services (2)
Implementation of DiffServ. Classifier: e.g., customers mark the desired PHB class in the DiffServ field of the IP packet. Policer: the ingress router ensures the classification is in line with the service level agreement (i.e., the markings have been paid for). The ISP shapes traffic (priority/drop/queueing configurations) according to how it implemented (configured) the PHB in its AS. For example, routers use WFQ to give different service levels. Figure: possible implementation of Assured Forwarding.

Internetworking
Internetworking joins multiple, different networks into a single larger network. The word “network” may mean several very different things in data communications (e.g., network layer, AS = network). Here the meaning is data link: networks in this section refer to differences between different kinds of data link layer protocols. IP regularizes and hides these differences from the Transport Layer, which is the layer it provides services for. How networks differ » How networks can be connected » Tunneling » Internetwork routing » Packet fragmentation »

How Networks Differ
Differences can be large, which complicates internetworking. The network layer (IP) handles potentially substantial differences between underlying data links. These differences are not apparent to higher layers; this is part of the network layer’s service to the transport layer.

How Networks Can Be Connected
Internetworking is based on a common network layer: IP. Figure annotations: packet mapped to a VC here; common protocol (IP) carried all the way. The top half of the figure shows the difficulties: a packet sent as a datagram may suddenly have to be sent over a virtual circuit, which requires some way to map between the two. The bottom half shows the solution: a common network layer protocol, IP, carries addresses and other information that identify the endpoints across networks.

Tunneling (1)
Connects two networks through a middle one; packets are encapsulated over the middle. Tunneling can also be IPv4 in IPv4 and IPv6 in IPv6, as in IPsec (IP Security) tunnel mode.

Tunneling (2) Tunneling analogy:
The tunnel is a link; a packet can only enter or exit at its ends.

Packet Fragmentation (1)
Links have different packet size limits for many reasons. Large packets are sent with fragmentation and reassembly. Transparent: packets are fragmented and reassembled in each network. Non-transparent: fragments are reassembled at the destination. Figure labels: G1 fragments, G2 reassembles, G3 fragments, G4 reassembles; … destination will reassemble.

Example of IP-style fragmentation. Each fragment header carries a packet number, a start offset, and an end bit. The figure shows the original packet (10 data bytes), the packet fragmented (to 8 data bytes), and then re-fragmented (to 5 bytes).
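The two fragmentation steps can be reproduced with a small sketch. The tuple layout, the function name, the packet number 27, and the payload bytes are assumptions made for illustration; offsets are counted in single bytes here, whereas real IPv4 counts them in 8-byte units.

```python
def fragment(packet, max_data):
    """Split (packet_id, offset, end_bit, data) into pieces of at most max_data bytes."""
    packet_id, offset, end_bit, data = packet
    pieces = []
    for i in range(0, len(data), max_data):
        chunk = data[i:i + max_data]
        is_last_chunk = i + max_data >= len(data)
        # Only the last piece inherits the end bit of the packet being split;
        # every earlier piece gets end bit 0.
        pieces.append((packet_id, offset + i, end_bit if is_last_chunk else 0, chunk))
    return pieces


original = (27, 0, 1, b"ABCDEFGHIJ")  # 10 data bytes, end bit set
first = fragment(original, 8)          # fragmented to 8 data bytes
second = [piece for frag in first for piece in fragment(frag, 5)]  # re-fragmented to 5

# The destination reassembles by sorting on the start offset.
reassembled = b"".join(data for _, _, _, data in sorted(second, key=lambda p: p[1]))
```

Because each piece carries its absolute start offset, fragments of fragments can be reassembled in one pass at the destination.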

Path MTU Discovery avoids network fragmentation. Fragmentation is detrimental to performance because of the header overhead of the fragments and because the whole packet is lost if any fragment is lost. This is why routers do not fragment packets in IPv6 (but they can in IPv4). In IPv6, a router drops a packet that is larger than the MTU (Maximum Transmission Unit) of the outgoing link. The MTU is a function of the routing path (i.e., the underlying links), and MTU discovery is used to learn the MTU for that path. Each packet is sent with header bits set to “no fragmentation”. If a router receives a packet that is too large for the link, it generates an error packet, sends it to the source, and drops the packet. Figure labels: Try 1200; Try 900. Question: In this example, how many times is the packet sent?
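The discovery loop can be simulated as follows. This is a sketch under assumptions: the function name is made up, the starting size of 1400 bytes and the list of link MTUs are hypothetical values chosen to match the "Try 1200" and "Try 900" labels, and the error message is assumed to report the MTU of the link that rejected the packet.

```python
def path_mtu_discovery(packet_size, link_mtus):
    """Return (final_size, transmissions) for a sender probing with 'no fragmentation' set.

    link_mtus lists the MTU of each link along the path, in order.
    """
    size = packet_size
    sends = 0
    while True:
        sends += 1
        # The first link whose MTU is smaller than the packet rejects it.
        too_small = next((mtu for mtu in link_mtus if mtu < size), None)
        if too_small is None:
            return size, sends  # the packet fits every link and is delivered
        # The router drops the packet and reports its link MTU to the source,
        # which retries with a packet of that size.
        size = too_small


final_size, transmissions = path_mtu_discovery(1400, [1400, 1200, 900])
```

Under these assumed MTUs the source needs one transmission per distinct bottleneck it discovers, plus the final successful one.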

Network Layer in the Internet (1)
IP Version 4 » IP Addresses » IP Version 6 » Internet Control Protocols » Label Switching and MPLS » OSPF—An Interior Gateway Routing Protocol » BGP—The Exterior Gateway Routing Protocol » Internet Multicasting » Mobile IP »

IP has been shaped by guiding principles (e.g., RFC 1958): make sure it works; keep it simple; make clear choices; exploit modularity; expect heterogeneity; avoid static options and parameters; look for good design (not perfect); strict sending, tolerant receiving; think about scalability; consider performance and cost. Question: Is the rationale for the entries on this list clear? Please identify any item whose importance is unclear to you. Very much an open working design that has favored simplicity and practical engineering considerations rather than design by committee. CS 450’s Second Writing Assignment contrasts the bulleted list on pages (summarized above) with Noel Chiappa’s Internet-Draft document used in the IPv6 creation process.

The Internet is an interconnected collection of many networks that is held together by the IP protocol. In the IETF, participants often distinguish between three ISP roles: Tier 1, Tier 2, and Tier 3.

IP Version 4 Protocol (1)
The IPv4 (Internet Protocol version 4) header is carried on all packets and has fields for the key parts of the protocol. Transmission must be big endian (left to right, high-order bit first). See Figure 5-46 on page 439 of the textbook.

IP Addresses (5) – Classful Addressing
Old (from the beginning to the mid-1990s) IPv4 addresses came in blocks of fixed size (A, B, C). The address carries the block size within itself, but the scheme lacks flexibility. This is called classful (vs. classless) addressing, and it is just history. Prefixes are now variable size, which is much more flexible and suits different kinds of usage, but the prefix length needs to be carried separately by the routing protocols because it is not part of the address. Remember: IPv4 addresses are 32 bits, written as period-delimited octets in decimal.

IP Addresses (1) – Prefixes
Classless InterDomain Routing (CIDR) addresses (RFC 4632) have been used from the mid-1990s on. Addresses are allocated in blocks called prefixes. Prefix: the network portion (routing topology locator). Host: identifies a specific network interface within that subnetwork. Written: address/length, e.g., /24; “/” is pronounced “slash”. Subnetwork mask for this example is Question: what is a subnetwork mask?
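The relationship between a prefix length and a subnetwork mask can be explored with Python's standard `ipaddress` module. The prefix 192.0.2.0/24 and the host address below are hypothetical values from a documentation range, not the address used on the slide.

```python
import ipaddress

# Hypothetical /24 prefix: 24 network bits, 8 host bits.
net = ipaddress.ip_network("192.0.2.0/24")

mask = str(net.netmask)    # the mask has a 1 bit for every network bit
size = net.num_addresses   # 2^(32 - 24) addresses in the block

# An address belongs to the prefix when its first 24 bits match.
host = ipaddress.ip_address("192.0.2.77")
in_prefix = host in net
network_part = int(host) & int(net.netmask) == int(net.network_address)
```

ANDing an address with the mask keeps only the network portion, which is how a router or host decides whether an address lies inside a prefix.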

IP Addresses (2) – Subnets
Subnetting splits up an IP prefix to help with management: the network divides into subnets internally, but looks like a single prefix outside the network. The ISP gives the network a single prefix. Check the size of the different subnets: the prefix has 2^16 = 64K addresses; CS is the largest subnet, with 2^15 = 32K addresses; EE has 2^14 = 16K addresses; Art has 2^13 = 8K addresses. There are 8K addresses left over. What is the prefix? It is found by writing the address ranges out to see that /19 is left. Can the prefix lengths just be changed between EE/CS/Art? No; then blocks of size 2^N would not always be aligned on a 2^N boundary. Small entities get their IP addresses from their ISP; if they change ISP, their IP addresses also change. Larger entities get their IP addresses from a registrar and own their IP addresses.
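The subnet sizes can be verified with the standard `ipaddress` module. The 10.1.0.0/16 block is a hypothetical stand-in for the prefix in the figure, and the way the halves are assigned to departments is an assumption; only the prefix lengths (/17, /18, /19) follow the slide.

```python
import ipaddress

block = ipaddress.ip_network("10.1.0.0/16")  # hypothetical prefix, 2^16 addresses

# Repeatedly split the remaining space in half: one half becomes a subnet,
# the other half is split again.
cs, rest = block.subnets(new_prefix=17)    # CS:  2^15 addresses
ee, rest = rest.subnets(new_prefix=18)     # EE:  2^14 addresses
art, spare = rest.subnets(new_prefix=19)   # Art: 2^13 addresses; spare is what is left

sizes = [n.num_addresses for n in (cs, ee, art, spare)]
```

Each split lands on a power-of-two boundary, which is why a block of size 2^N is always aligned on a 2^N boundary.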

IP Addresses (3) – Aggregation
Aggregation joins multiple IP prefixes into a single larger prefix to reduce routing table size. CIDR is a key element of the Internet’s scalability because of aggregation. Consider the implication for aggregation of the old classful IPv4 addresses; this example only considers CIDR addresses. The ISP advertises a single prefix. This is the same mechanism as subnets, just with a different motivation (reducing the size of routing tables instead of making it easier to use the block of addresses you have). The ISP’s customers have prefixes with larger slash numbers (thus fewer addresses). Cambridge: plus 2^11 host addr. Oxford: plus 2^12 host addr. Edinburgh: plus 2^10 host addr. Question: How many host addr are really in each subnetwork in this example?
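A quick way to see why one advertisement suffices is to check that every customer prefix falls inside the ISP's larger prefix. The addresses below are hypothetical (taken from the 198.18.0.0/15 benchmarking range); only the block sizes 2^11, 2^12, and 2^10 follow the slide.

```python
import ipaddress

isp = ipaddress.ip_network("198.18.0.0/19")  # hypothetical aggregate prefix

# Hypothetical customer allocations with the block sizes from the slide.
customers = {
    "Cambridge": ipaddress.ip_network("198.18.0.0/21"),   # 2^11 addresses
    "Edinburgh": ipaddress.ip_network("198.18.8.0/22"),   # 2^10 addresses
    "Oxford": ipaddress.ip_network("198.18.16.0/20"),     # 2^12 addresses
}

sizes = {name: net.num_addresses for name, net in customers.items()}

# Every customer prefix is covered by the aggregate, so routers outside
# the ISP need only one routing table entry instead of three.
all_covered = all(net.subnet_of(isp) for net in customers.values())
```

The customer prefixes have larger slash numbers than the aggregate, which is exactly what makes them sub-blocks of it.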

IP Addresses (4) – Longest Matching Prefix
Packets are forwarded to the entry with the longest matching prefix (i.e., the higher slash number, which is the smallest address block). This complicates forwarding but adds flexibility. At New York, the LMP rule makes it easy to add an exception to go elsewhere to reach part of a prefix. Without it, we would have had to split the prefix up into its components and give a route for each component. Figure annotations: main prefix goes this way; except for this part! Longest Matching Prefix forwarding explains how anycast works.
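The forwarding rule can be sketched as a linear scan over a routing table. The table contents, next-hop names, and function name are hypothetical; real routers use specialized data structures such as tries rather than a scan.

```python
import ipaddress


def lookup(table, address):
    """Return the next hop of the longest prefix that contains `address`, or None."""
    addr = ipaddress.ip_address(address)
    matches = [(net, hop) for net, hop in table if addr in net]
    if not matches:
        return None
    # The longest match is the entry with the highest slash number.
    return max(matches, key=lambda entry: entry[0].prefixlen)[1]


# Hypothetical table: a main prefix plus a more specific exception inside it.
table = [
    (ipaddress.ip_network("198.18.0.0/19"), "main route"),
    (ipaddress.ip_network("198.18.16.0/20"), "exception route"),
]

inside_exception = lookup(table, "198.18.17.5")
inside_main_only = lookup(table, "198.18.1.1")
no_match = lookup(table, "203.0.113.1")
```

An address inside the /20 matches both entries, and the more specific one wins, so the exception needs only one extra table entry.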

IP Addresses (6) – NAT
A NAT (Network Address Translation) box maps one external IP address to many internal IP addresses. It uses the TCP/UDP port to tell connections apart. It violates layering, but is very common in homes, etc. So Internet traffic sent to/from port 1111 might really be going to a computer A in the home, while traffic sent to/from port 2222 at the same IP address might be going to a computer B. The mapping in the NAT box is set up when a connection is established. A side effect is that connections can only be made from inside the house to the Internet; you can’t run a server in your home without special configuration. This is a consequence of violating layering.
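The port-mapping idea can be sketched with two dictionaries. The class, its method names, the addresses, and the sequential port allocation starting at 1111 are all assumptions for illustration; real NAT boxes also track the remote endpoint, the protocol, and timeouts.

```python
class Nat:
    """Toy NAT box: one external IP address shared by many internal hosts."""

    def __init__(self, external_ip, first_port=1111):
        self.external_ip = external_ip
        self.next_port = first_port
        self.out = {}   # (internal_ip, internal_port) -> external_port
        self.back = {}  # external_port -> (internal_ip, internal_port)

    def outbound(self, internal_ip, internal_port):
        """Translate an outgoing connection, creating a mapping if needed."""
        key = (internal_ip, internal_port)
        if key not in self.out:
            self.out[key] = self.next_port
            self.back[self.next_port] = key
            self.next_port += 1
        return self.external_ip, self.out[key]

    def inbound(self, external_port):
        """Translate incoming traffic; None means no mapping exists, so drop it."""
        return self.back.get(external_port)


nat = Nat("203.0.113.5")
a = nat.outbound("10.0.0.1", 5000)   # computer A opens a connection
b = nat.outbound("10.0.0.2", 5000)   # computer B uses the same internal port
to_a = nat.inbound(1111)
unsolicited = nat.inbound(2222)      # nothing inside opened this mapping
```

The last lookup shows the side effect described above: traffic arriving on a port with no mapping has nowhere to go, so a server inside the home is unreachable without extra configuration.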

IP Version 6 (1)
Major upgrade in the 1990s due to impending address exhaustion, with various other goals: support billions of hosts; reduce routing table size; simplify the protocol; better security; attention to type of service; aid multicasting; roaming host without changing address; allow future protocol evolution; permit coexistence of old and new protocols, … Deployment has been slow and painful, but may pick up pace now that addresses are all but exhausted.

IP Version 6 (2)
The IPv6 protocol header has much longer addresses (128 vs. 32 bits) and is simpler (by using extension headers).

IP Version 6 (3)
IPv6 extension headers handle other functionality.
Covered in textbook pages

Internet Control Protocols (1)
IP works with the help of several control protocols. ICMP is a companion to IP that returns error info; it is required, and used in many ways, e.g., for traceroute and ping. ARP finds the Ethernet address of a local IP address; it is the glue that is needed to send any IP packets: a host queries an address and the owner replies. DHCP assigns a local IP address to a host; it gets the host started by automatically configuring it: the host sends a request to a server, which grants a lease.

Main ICMP (Internet Control Message Protocol) types: Incomplete list of ICMP message types given here, complete list found at Question: If you were making a ping or traceroute application, which ICMP message type(s) would you use?

ARP (Address Resolution Protocol) lets nodes find target Ethernet addresses (shown in pink in the figure) from their IP addresses. It is the protocol that establishes the mapping between data link and network addresses: a MAC broadcast asks “who owns the destination IP address?” Off-LAN traffic is sent to the local router (i.e., the default gateway) for forwarding.

Label Switching and MPLS (1)
MPLS (Multi-Protocol Label Switching) sends packets along established paths; ISPs can use it for QoS. The path is indicated with a label below the IP layer.

Label Switching and MPLS (2)
The label is added based on the IP address on entering an MPLS network (e.g., an ISP) and removed when leaving it. Forwarding only uses the label inside the MPLS network.

OSPF— Interior Routing Protocol (1)
OSPF computes routes for a single network (e.g., an ISP) and models the network as a graph of weighted edges. It is an intra-domain routing protocol that uses the link state algorithm (textbook pages 373 – 378). The figure shows the network and its graph. The broadcast LAN connecting routers (LAN 3) could be modeled as a mesh, since it connects each of R3, R4 and R5 to all of the others. Instead, it is modeled as a node (LAN 3) to which the other nodes connect: a broadcast LAN is modeled as if it were a well-connected node (one designated router identified per LAN).

OSPF optionally divides one large network (Autonomous System) into areas connected to a backbone area. This helps it to scale; summaries go over area borders.

OSPF (Open Shortest Path First) is link-state routing: it uses the messages below to reliably flood the topology, then runs Dijkstra to compute routes.
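Once the topology has been flooded, each router holds the same weighted graph and can run Dijkstra's algorithm on it. The sketch below is a generic heap-based version; the router names and link costs are hypothetical and do not come from the textbook figure.

```python
import heapq


def dijkstra(graph, source):
    """Return the lowest total cost from `source` to every reachable node.

    graph: dict mapping node -> {neighbor: link_cost}.
    """
    dist = {source: 0}
    heap = [(0, source)]
    while heap:
        d, node = heapq.heappop(heap)
        if d > dist.get(node, float("inf")):
            continue  # stale heap entry; a shorter path was already found
        for neighbor, cost in graph[node].items():
            candidate = d + cost
            if candidate < dist.get(neighbor, float("inf")):
                dist[neighbor] = candidate
                heapq.heappush(heap, (candidate, neighbor))
    return dist


# Hypothetical four-router topology with symmetric link costs.
graph = {
    "R1": {"R2": 1, "R3": 4},
    "R2": {"R1": 1, "R3": 2},
    "R3": {"R1": 4, "R2": 2, "R4": 1},
    "R4": {"R3": 1},
}
costs = dijkstra(graph, "R1")
```

Here the direct R1-R3 link (cost 4) loses to the two-hop path through R2 (cost 3), which is the kind of choice the link weights are meant to control.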

BGP— Exterior Routing Protocol (1)
BGP (Border Gateway Protocol) computes routes across interconnected, autonomous networks. It is the Internet’s inter-domain routing protocol, and its key role is to respect networks’ policy constraints. It uses a path-vector variant of distance vector routing (the Bellman-Ford approach described on pages 370 to 373 of textbook). It implements an AS’s policy vis-à-vis other networks. An AS purposefully has few BGP routers, and they are often co-located with perimeter defense firewalls. BGP connections occur OVER TCP links. Question: what are the implications? Pairwise connections are formed between specific routers in different ASes. Example policy constraints handled by BGP: no commercial traffic for educational network; never put Iraq on route starting at Pentagon; choose cheaper network; choose better performing network; don’t go from Apple to Google to Apple. Since different networks have different practices and goals, we can’t reduce the preferred routes to a single weight number attached to links.

A common policy distinction is transit vs. peering: transit carries traffic for pay; peers carry traffic for mutual benefit. AS1 carries AS2↔AS4 (Transit) but not AS3 (Peer).

BGP propagates messages along policy-compliant routes. Message: prefix, AS path, next-hop IP (to send over the local network). BGP therefore keeps track of the path used. Path = next-hop router and AS path (the sequence of ASes that the route followed, used to detect loops).
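Loop detection with the AS path can be sketched as follows. The message is modeled as a plain dictionary, and the function names, AS numbers, prefix, and next-hop addresses are hypothetical; policy filtering, which real BGP applies before accepting or re-advertising a route, is left out.

```python
def accept_route(local_as, advert):
    """Reject an advertisement whose AS path already contains our own AS (a loop)."""
    return local_as not in advert["as_path"]


def propagate(local_as, advert, next_hop):
    """Build the advertisement we pass on: prepend our AS and set our next hop."""
    return {
        "prefix": advert["prefix"],
        "as_path": [local_as] + advert["as_path"],
        "next_hop": next_hop,
    }


# Hypothetical advertisement originated by AS 64500.
origin = {"prefix": "198.51.100.0/24", "as_path": [64500], "next_hop": "192.0.2.1"}

via_64501 = propagate(64501, origin, "192.0.2.2")
via_64502 = propagate(64502, via_64501, "192.0.2.3")

loop_detected = not accept_route(64501, via_64502)  # AS 64501 sees itself in the path
fresh_as_accepts = accept_route(64503, via_64502)
```

Carrying the whole path, rather than just a distance, is what lets each AS spot its own number and discard the route.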

Internet Multicasting
Groups have a reserved IP address range (class D). Membership in a group is handled by IGMP (Internet Group Management Protocol), which runs at routers. Routes are computed by protocols such as PIM (Protocol Independent Multicast): dense mode uses RPF with pruning (PIM-DM); sparse mode uses core-based trees (PIM-SM). IP multicasting is not widely used except within a single network, e.g., a datacenter or a cable TV network.

Mobile IP Mobile hosts can be reached at fixed IP via a home agent
The home agent tunnels packets to reach the mobile host; the reply can optimize the path for subsequent packets. No changes are needed to routers or fixed hosts. This is a repeat of the earlier “routing for mobile hosts”, which was modeled on the mobile IP protocol.
