TCOM 509: UDP, TCP/IP - Internet Protocols

1 TCOM 509: UDP, TCP/IP - Internet Protocols
* Obtained permission to use Raj Jain’s technical material

2

3 IP Routing
An example routing table
  Columns: Destination, Next-Hop, Flags, Network Interface
  Entries shown: a host route (flag H) on lo0, a Default route (flag G) on emd0, and a further entry on emd0
Flags
  H = 1: Destination is a complete host address
  H = 0: Destination is a network address
  G = 1: Next-Hop is a router
  G = 0: Next-Hop is a directly connected destination
Search Order of routing table
  1: Complete match of destination IP
  2: Match the network addr (including the subnet ID)
  3: Use default router
  4: If all previous steps fail to find a suitable entry, send an ICMP “host unreachable error”

4 ROUTING SUMMARY
Routing is the process of discovering, selecting and following paths from the transmitting host to the receiving host in a network. There are two categories of routing algorithms:
  Source Routing: The transmitting host inserts a list of routers that describe a path through the network.
  Hop-by-Hop Routing: The transmitting host knows how to get to the first router. That router uses its Routing Table to select the next best hop (router), which in turn selects the next best router, and so on.
Routing taxonomy
  Source Routing: Strict, Loose
  Hop-by-Hop Routing: Static, Dynamic, Default
Dynamic routing protocols
  Distance Vector: The router sends a list of networks, how far away they are, and the next hop direction. Examples: 1. RIP 2. IGRP
  Link State: The router has a complete topology map of the network. Examples: 1. OSPF 2. IS-IS
  Path Vector: The router sends a complete path to get to a destination. Examples: 1. EGP 2. BGP

5 IP Routing Strategies
Static Routing
  Pre-determined routes set up by the administrator
Dynamic Routing
  Interior Routing Protocols: RIP, OSPF
  Exterior Routing Protocols: BGP

6 How Does IP Routing Work?
Basic procedure:
  1: Search for a matching host address (/32)
  2: Search for a matching network address (/x, where 0<x<32)
  3: Search for a default entry (0/0)
  4: If all previous steps fail to find a suitable entry, send an ICMP “host unreachable error”
IP packets are routed via a “best-match” or “longest-match” principle.
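The host, network, and default steps of the basic procedure collapse into a single longest-match search, because a /32 host route is simply the longest possible prefix and 0/0 the shortest. The sketch below assumes a tiny table held in a Python list; the prefixes and next-hop names are hypothetical and are not taken from the slides.

```python
import ipaddress

def lookup(table, dst):
    """Return the next hop for dst using longest-prefix match, or None."""
    addr = ipaddress.ip_address(dst)
    best = None
    for prefix, next_hop in table:
        net = ipaddress.ip_network(prefix)
        # Keep the matching entry with the longest prefix seen so far.
        if addr in net and (best is None or net.prefixlen > best[0].prefixlen):
            best = (net, next_hop)
    # None means no entry matched: the router would send ICMP host unreachable.
    return best[1] if best else None

# Hypothetical routing table (illustration only).
ROUTES = [
    ("10.1.2.3/32", "host-route"),
    ("10.1.2.0/24", "subnet-route"),
    ("0.0.0.0/0", "default-router"),
]
```

With this table, `lookup(ROUTES, "10.1.2.3")` picks the host route, `lookup(ROUTES, "10.1.2.9")` the subnet route, and any other address falls through to the default router. A production router would use a trie or hardware lookup rather than a linear scan.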

7 Processing an IP packet

8 IP Source Routing
IP routing has no concept of the source determining the route. What if the source wanted to specify the packet’s path?
The source route option was added to the IP protocol in order to assist in route debugging. Nowadays, it seems to be mainly used by large ISPs, to make sure that their peers aren't inappropriately dumping traffic onto their backbone links.
A packet is given a list of desired hops that should be taken on the way to the final destination.

9 IP Source Routing via IP Options
CODE = 131

10 Source Routing example

11 Types of Routes
Static
  All packets forwarded to predetermined destinations defined by an administrator
Dynamic
  Packets are forwarded to dynamically calculated routes determined by a routing protocol

12

13 Static Routing
Benefits
  Good for small networks
  Can help create a secure network
  Efficiently uses router resources
Drawbacks
  Does not handle network failures well
  Does not scale well

14 Static Routing Example
[Figure: Routers A, B, C, and D interconnecting several networks, including Network 10. Each router has a static routing table with Destination and Next Hop columns; entries are either Direct or point to a neighboring router, and Router D's table holds a Direct entry plus a Default route via Router C.]

15 Static Routing with Link Failure
[Figure: the same four-router topology and static routing tables after a link failure. One router's table now lists destination 10 as Unreachable; the other static entries are unchanged, including Router D's Default route via Router C.]

16 Dynamic Routing: Distance-Vector and Link-State
                   Communicate what?    Between whom?
  Distance-Vector  Routing tables       Neighbors
  Link-State       Interface status     All routers

17 Dynamic IP Routing Protocols
RIP (Distance Vector) OSPF (Link State) IS-IS (Link State) BGP (Path Vector)

18 Distance Vector vs. Link State

19 Concept of Administrative Distance
Administrative distances:
  Route source                                  Administrative distance
  Connected
  Static to Interface or Static to Next Hop     1
  E-IGRP (Cisco only)                           90
  OSPF                                          110
  IS-IS                                         115
  RIP v1 and v2                                 120
Only one IGP route is installed in the routing table.
Administrative Distances of Routing Protocols:
  Measures trustworthiness of the source of route
  Handles preferences when multiple sources of routing info exist in router
  Protocol with lowest admin weight wins
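A minimal sketch of the "lowest admin weight wins" rule, assuming candidate routes for one prefix are given as (source, next hop) pairs. The distances for static, E-IGRP, OSPF, IS-IS, and RIP are the ones listed on the slide; the value 0 for connected routes is an assumption (the usual Cisco default), and the function name is hypothetical.

```python
# Administrative distances from the slide; "connected": 0 is an assumption.
ADMIN_DISTANCE = {
    "connected": 0,
    "static": 1,
    "eigrp": 90,
    "ospf": 110,
    "isis": 115,
    "rip": 120,
}

def preferred_route(candidates):
    """Pick the candidate whose source has the lowest administrative distance.

    candidates: list of (source, next_hop) tuples for the same prefix.
    """
    return min(candidates, key=lambda c: ADMIN_DISTANCE[c[0]])
```

For example, if both RIP and OSPF offer a route to the same prefix, `preferred_route([("rip", "B"), ("ospf", "C")])` selects the OSPF route, and only that one would be installed in the routing table.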

20 Distance-Vector and Link State Protocol
Protocol   Category          Metric                 Algorithm
RIP v1     Distance Vector   Hop Count              Bellman-Ford
RIP v2     Distance Vector   Hop Count              Bellman-Ford
OSPF       Link State        Bandwidth-based cost   Shortest Path First
IGRP       Distance Vector   Composite              Bellman-Ford

21 What is RIP (Routing Info Protocol)?
RIP is an Interior Gateway Protocol (IGP)
  Used within an Autonomous System (AS)
  An AS is a collection of routers under the same administrative authority
Two versions
  RIP v1 (RFC 1058)
  RIP v2 (RFC 2453)

22 Distance Vector Routing Protocol - RIP v1 - Characteristics
Directly connected subnets are known
Routing updates are broadcast to neighbors
Routers listen for routing updates from neighbors
Metrics are used
Routing info consists of subnet and metric
Periodic updates (every 30 sec)
A route is learned via a neighbor
A failed route has a metric of infinity

23 RIP Uses UDP
RIP is a UDP-based protocol. Each router that uses RIP has a routing process that sends and receives datagrams on UDP port number 520, the RIP-1/RIP-2 port. All communications intended for another router's RIP process are sent to the RIP port. All routing update messages are sent from the RIP port.

24

25

26 RIP Characteristics
Distance-vector routing protocol
  Updates contain routes (vectors) and the cost (distance) to reach them. The exchange consists of the following steps:
    1: Each node calculates the distances between itself and all other nodes within the AS and stores this information as a table.
    2: Each node sends its table to all neighboring nodes.
    3: When a node receives distance tables from its neighbors, it calculates the shortest routes to all other nodes and updates its own table to reflect any changes.
Does not scale well for large networks, as every router has to add a RIP route for every newly added network
Hop count is used as the metric for path selection, based on the Bellman-Ford distance-vector routing algorithm
Maximum allowable hop count is 15
Routing updates are broadcast every 30 seconds
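The table-merging step can be sketched as follows. This is an illustrative Bellman-Ford style update, not any vendor's implementation: the table layout (`{destination: (metric, next_hop)}`) and the function name are assumptions, and 16 is used as "infinity" because 15 is the maximum usable hop count.

```python
INFINITY = 16  # one more than the maximum usable hop count of 15

def merge_update(table, neighbor, advertised):
    """Merge a neighbor's advertisement into the local table.

    table:      {dest: (metric, next_hop)}, modified in place
    advertised: {dest: metric} as announced by `neighbor`
    Returns True if any entry changed.
    """
    changed = False
    for dest, metric in advertised.items():
        new_metric = min(metric + 1, INFINITY)  # one extra hop to reach the neighbor
        current = table.get(dest)
        if current is None:
            if new_metric < INFINITY:           # learn a new reachable destination
                table[dest] = (new_metric, neighbor)
                changed = True
        elif current[1] == neighbor:
            # News from the current next hop is always believed, even if worse.
            if new_metric != current[0]:
                table[dest] = (new_metric, neighbor)
                changed = True
        elif new_metric < current[0]:           # a strictly shorter path wins
            table[dest] = (new_metric, neighbor)
            changed = True
    return changed
```

The rule that the current next hop is always believed is what lets bad news (a rising metric) propagate, and it is also what makes counting to infinity possible.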

27 RIP Message Types
Two message types
  Request message: asks neighbors to send routes
  Response message: carries route updates
Advertises up to 25 routes per update
Router decides how to handle routes in update: add, modify, or delete

28 RIP Routing Metrics
Counts the number of hops between source and destination
  Number of hops is the number of router hops
  Hop count equals the RIP metric
RIP cannot determine measured delay, reliability, load, or link bandwidth
With multiple paths to the same prefix, the one with the fewest hops is selected
  May not be the optimum path

29 RIP in Action (1): Routers A and C are down
[Figure: Routers A, B, and C with interfaces Tr0, s0, and E0; routing table entry for network 162.11.5.0]

30 RIP in Action (2): Router C is down, Router A is switched on
[Figure: Routers A, B, and C with interfaces Tr0, s0, s1, and E0; routing table entries for network 162.11.5.0]

31 RIP in Action (3): Router C is switched on
[Figure: Routers A, B, and C with interfaces Tr0, s0, s1, and E0; routing table entries for networks 162.11.5.0 and 162.11.9.0]

32 RIP Timers
RIP uses several timers to regulate its performance: a routing-update timer, a route-timeout timer, and a route-flush timer.
Routing-update timer - clocks the interval between periodic routing updates. Generally, it is set to 30 seconds, with a small random amount of time added whenever the timer is reset. This is done to help prevent congestion, which could result from all routers simultaneously attempting to update their neighbors.
Route-timeout timer - Each routing table entry has a route-timeout timer associated with it, restarted whenever an update for the route arrives. When the route-timeout timer expires, the route is marked invalid but is retained in the table until the route-flush timer expires. In RFC 2453 the default value is 180 secs (six missed updates).
Route-flush timer - Started when a route is marked invalid. While it runs, the route is still advertised with a metric of infinity so that neighbors learn of the failure; when it expires, the route is deleted from the table. RFC 2453 calls this the garbage-collection timer, with a default value of 120 secs. Vendor implementations may use different defaults.

33 Bellman Ford’s Distance Vector Algorithm – Example

34 Disadvantages with the Bellman Ford’s Algorithm
Does not scale well
Changes in network topology are not reflected quickly, since updates are spread node-by-node.
Counting to infinity: if link or node failures render a node unreachable from some set of other nodes, those nodes may spend forever gradually increasing their estimates of the distance to it, and in the meantime there may be routing loops.
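Counting to infinity is easiest to see with two routers. The sketch below is a toy simulation under stated assumptions: Router A was directly attached to network N and has just lost that link, Router B reaches N through A with a metric of 2, neither router uses split horizon, and RIP's cap of 16 stands in for infinity. The function name and the strict alternation of updates are illustrative choices.

```python
INFINITY = 16  # RIP's representation of "unreachable"

def count_to_infinity():
    """Return successive (A metric, B metric) pairs for network N."""
    a, b = INFINITY, 2   # A lost its link; B still holds a stale 2-hop route via A
    history = [(a, b)]
    while a < INFINITY or b < INFINITY:
        a = min(b + 1, INFINITY)   # A hears B's stale route and adopts it
        history.append((a, b))
        b = min(a + 1, INFINITY)   # B's next hop is A, so B must follow A's metric
        history.append((a, b))
    return history
```

Each exchange raises the metrics by one hop, so the two routers bounce the route back and forth (3, 4, 5, ...) until both reach 16. Without the cap the loop would never terminate, which is the "forever" in the description above; during that time packets for N loop between A and B.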

35 Count To Infinity Problem – URL Link

36 Improving Convergence
Split Horizon
  For interface X, don’t advertise routes out X that you learned via X
  Prevents forwarding loops only for the 2 adjacent router case
  Joke analogy: if you tell me a joke and you get it, I don’t need to tell it back to you
Hold Down Timers
  Refuse to accept any information about a route for a period of time (60 secs) after the route is declared unreachable
  Can increase convergence time
Triggered Updates
  When a change occurs, send update immediately (don’t wait for next update interval)
  Change can be defined as an observed increase in hop count over time (1.6 – 2.0 increase in originally stored hop count)
  Attempt to speed up convergence
Hold Down Timers and Triggered Updates
  Can be used together to be more effective
Split Horizon with Poison Reverse
  a.k.a. “Infinite Split Horizon”
  For interface X, DO advertise routes out X that you learned via X, but with a metric of INFINITY
  Advantage: eliminates two-router loops
  Disadvantage: increases the size of routing updates
None of these mechanisms can completely avoid routing loops, and counting to infinity doesn’t go away
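The difference between plain split horizon and poison reverse shows up in how an update for one interface is assembled. This is a minimal sketch, assuming each table entry records the interface on which the route was learned; the table layout, interface names, and function name are hypothetical.

```python
INFINITY = 16  # RIP metric meaning "unreachable"

def build_update(table, out_iface, poison_reverse=False):
    """Build the {dest: metric} advertisement to send out `out_iface`.

    table: {dest: (metric, learned_iface)}
    """
    update = {}
    for dest, (metric, learned_iface) in table.items():
        if learned_iface == out_iface:
            # Route was learned via this interface.
            if poison_reverse:
                update[dest] = INFINITY   # advertise it back as unreachable
            continue                      # plain split horizon: stay silent
        update[dest] = metric
    return update
```

With `table = {"N1": (1, "e0"), "N2": (2, "s0")}`, the update sent out `s0` contains only N1 under split horizon, and contains N1 plus N2 at metric 16 under poison reverse, which is why poison reverse makes updates larger.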

37 RIPv2 (rfc 2453) – Solves some of the RIPv1 Shortcomings
RIPv2 is classless and uses UDP port 520, as does RIPv1 (classful). It is still distance vector and still uses hop count as the metric, with a max hop count of 15. The ability to multicast saves other devices on the network from wasting time opening broadcast packets.

38

39 RIPv2 – Subnet Mask Classless routing protocols carry the subnet mask. This allows all 0 and 1 subnets to be used, eliminating confusion between and Here, one is the 'all subnets' broadcast and one is broadcast on the all 1s subnet - but which is which? If the subnet mask is sent then /16 and /24 can be differentiated.
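A short sketch of the ambiguity that carrying the mask resolves. The prefixes 172.16.0.0/16 and 172.16.255.0/24 are hypothetical examples chosen for illustration; they are not the addresses from the slide.

```python
import ipaddress

def broadcast_of(prefix):
    """Return the broadcast address of a prefix as a dotted-quad string."""
    return str(ipaddress.ip_network(prefix).broadcast_address)

# Hypothetical prefixes (illustration only).
MAJOR_NET = "172.16.0.0/16"        # the whole classful network
ALL_ONES_SUBNET = "172.16.255.0/24"  # its all-1s /24 subnet

# Both prefixes share one broadcast address, so without the mask a receiver
# cannot tell the "all subnets" broadcast from the all-1s subnet broadcast.
SAME_ADDRESS = broadcast_of(MAJOR_NET) == broadcast_of(ALL_ONES_SUBNET)
```

Both calls return 172.16.255.255. Only the prefix length (/16 versus /24) distinguishes the two meanings, and that is the information a classless protocol carries in each route entry.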

40 RIPv2 – Route Tag Each RIPv2 entry includes a Route Tag field, where additional information about a route can be stored. It provides a method for distinguishing between internal routes (learned by RIP) and external routes (learned from other protocols).

41 RIPv2 – Next Hop
In RIPv2, each RIP entry includes a space where an explicit IP address can be entered as the next hop router for datagrams intended for the network in that entry Specifying a value of in this field indicates that routing should be via the originator of the RIP advertisement. The purpose of the Next Hop field is to eliminate packets being routed through extra hops in the system. It is particularly useful when RIP is not being run on all of the routers on a network. A simple example is given in Appendix A. Note that Next Hop is an "advisory" field. That is, if the provided information is ignored, a possibly sub-optimal, but absolutely valid, route may be taken. If the received Next Hop is not directly reachable, it should be treated as |IR1| |IR2| |IR3| |XR1| |XR2| |XR3| | | | | | | < RIP > Assume that IR1, IR2, and IR3 are all "internal" routers which are under one administration (e.g. a campus) which has elected to use RIP-2 as its IGP. XR1, XR2, and XR3, on the other hand, are under separate administration (e.g. a regional network, of which the campus is a member) and are using some other routing protocol (e.g. OSPF). XR1, XR2, and XR3 exchange routing information among themselves such that they know that the best routes to networks N1 and N2 are via XR1, to N3, N4, and N5 are via XR2, and to N6 and N7 are via XR3. By setting the Next Hop field correctly (to XR2 for N3/N4/N5, to XR3 for N6/N7), only XR1 need exchange RIP-2 routes with IR1/IR2/IR3 for routing to occur without additional hops through XR1. Without the Next Hop (for example, if RIP-1 were used) it would be necessary for XR2 and XR3 to also participate in the RIP-2 protocol to eliminate extra hops.

42 RIPv2 - Authentication
Layout of the authentication entry, which follows the Command (8 bits) / Version (8 bits) / Unused (set to all zeros) header:
  0xFFFF | Authentication Type
  Password (bytes 0-3)
  Password (bytes 4-7)
  Password (bytes 8-11)
  Password (bytes 12-15)
A password is indicated if the AFI field is set to 0xFFFF. The authentication type for simple authentication is set to 0x0002. The password is left justified and unused bytes are set to zero.
MD5 authentication may be enabled to overcome the weakness of plain-text authentication. Use the Authentication Type field to identify the method used. MD5 computes a 128-bit hash value from plain text plus password. This hash is transmitted along with the message; the far end recalculates the hash and compares it with the received value. If they match, the message is authenticated.
RIPv2 authenticates the source of the packets. The source of the update takes the first entry of the message, which would normally carry IP address, SM, Next Hop, and Metric, and uses it for authentication. This leaves room for only 24 route entries per packet instead of 25 without authentication.

43 RIP v2 Packet Format
Header (one 32-bit word, bits 0-31):
  Command (bits 0-7) | Version (bits 8-15) | Reserved, must be zero (bits 16-31)
Route entry (five 32-bit words, repeated for each route):
  Address Family Identifier (bits 0-15) | Route Tag (bits 16-31)
  IP Address
  Subnet Mask
  Next Hop
  Metric
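The field layout maps directly onto a byte string. The sketch below packs a RIPv2 response with Python's `struct` module; the command code 2 (response) and address family 2 (IP) follow RFC 2453 rather than the slide, and the function name and tuple layout are illustrative assumptions. It builds the bytes only and does not send anything.

```python
import socket
import struct

RIP_RESPONSE = 2   # command code for a response (per RFC 2453, not shown on the slide)
RIP_VERSION = 2
AF_IP = 2          # address family identifier for IP (per RFC 2453)

def pack_rip2_response(entries):
    """Pack a RIPv2 response.

    entries: list of (ip, mask, next_hop, metric, route_tag) tuples,
             with addresses given as dotted-quad strings.
    """
    if len(entries) > 25:
        raise ValueError("a RIP message carries at most 25 route entries")
    # 4-byte header: command, version, 16-bit reserved field (zero).
    packet = struct.pack("!BBH", RIP_RESPONSE, RIP_VERSION, 0)
    for ip, mask, next_hop, metric, tag in entries:
        # 20-byte entry: AFI, route tag, IP address, subnet mask, next hop, metric.
        packet += struct.pack(
            "!HH4s4s4sI",
            AF_IP,
            tag,
            socket.inet_aton(ip),
            socket.inet_aton(mask),
            socket.inet_aton(next_hop),
            metric,
        )
    return packet
```

A message with one route is 24 bytes long: 4 bytes of header plus one 20-byte entry. With the 25-entry limit, the largest RIP payload is 4 + 25 * 20 = 504 bytes.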

44 RIP Limitations
Maximum network diameter = 15
Lack of alternative routes. RIPv2 keeps only one route to a destination in routing tables. It has to wait for updates after a failure to assess whether a new route (if any) exists
Regular updates include the entire routing table approximately every 30 seconds
Poison reverse increases the size of the routing updates
Count to infinity slows route loop prevention
Metrics only involve hop count
Broadcasts between neighbors (RIPv1 only)
Classful routing means no prefix length carried in route updates (RIPv1 only), and so no VLSM
No authentication mechanism (RIPv1 only)
Slow convergence

45 OSPF

46 OSPF Concept : Having the Same Copy of Network Topology at Every Node
R1 LSA R3 LSA R2 LSA R5 LSA R4 LSA R6 LSA xyz LSA abc LSA pdq LSA

47 SPF Algorithm
The Shortest Path First (SPF) routing algorithm is the basis for OSPF operations. When an SPF router is powered up, it initializes its routing-protocol data structures and then waits for indications from lower-layer protocols that its interfaces are functional.
After a router is assured that its interfaces are functioning, it uses the OSPF Hello protocol to acquire neighbors, which are routers with interfaces to a common network. The router sends hello packets to its neighbors and receives their hello packets. In addition to helping acquire neighbors, hello packets also act as keepalives to let routers know that other routers are still functional.
On multi-access networks (networks supporting more than two routers), the Hello protocol elects a designated router and a backup designated router. Among other things, the designated router is responsible for generating LSAs for the entire multi-access network. Designated routers allow a reduction in network traffic and in the size of the topological database.
When the link-state databases of two neighboring routers are synchronized, the routers are said to be adjacent. On multi-access networks, the designated router determines which routers should become adjacent. Topological databases are synchronized between pairs of adjacent routers. Adjacencies control the distribution of routing-protocol packets, which are sent and received only on adjacencies.
Each router periodically sends an LSA to provide information on a router's adjacencies or to inform others when a router's state changes. By comparing established adjacencies to link states, failed routers can be detected quickly, and the network's topology can be altered appropriately. From the topological database generated from LSAs, each router calculates a shortest-path tree, with itself as root. The shortest-path tree, in turn, yields a routing table.

48 How OSPF Protocol Works
Stage 1: Discovering Neighbors => Hello Message
Stage 2: Electing the Designated Router
Stage 3: Establishing Adjacencies => DB Description Msgs
Stage 4: Propagating Link State Information => Flooding using LS Request/Update Msgs
Stage 5: Calculating the Routing Table(s) => Dijkstra’s Algorithm

49 First Requirements for the new IGP to Replace RIP
Had to be more efficient than RIP
  consume fewer network resources: link bandwidth and CPU cycles
Faster convergence than RIP
  communicate changes quickly: link, interface, and router failures
More descriptive metric than RIP
  hop-count limitations
  ability to include other factors (bandwidth, delay, reliability, etc.) (Figure 3.2)
  OSPF ended up using cost: 16 bits, no limit on total path cost
  eliminated network diameter limitations

50 New IGP Requirements (2)
Support for load balancing over multiple equal-cost links to a destination
  more efficient use of network resources
  implementation was not mandated by the protocol
  multiple strategies exist: flow-based, round-robin, hash function, packet-by-packet
  in theory, multiple vendor strategies can be combined in a single network and it can still work, but this needs to be examined closely in some cases
Support for a routing hierarchy
  split the AS up into mini-AS’s, in a sense
  a scalability mechanism

51 New IGP Requirements (3)
Separate internal and external routes
  RIPv1 had no way to distinguish
  You generally trust info from your AS over routes from outside your AS
Support for more flexible subnetting
  essentially CIDR addressing and notation, no notion of classful routing
Security
  ability to control which routers participate in OSPF routing, based on a password
ToS-based routing
  allow specification of different metrics for each of the original ToS categories
  in reality, never really used; chicken-and-egg problem

52

53 What is OSPF?
An IGP using Link-State technique to update routing tables
Based on the shortest path first (SPF) algorithm, also known as the Dijkstra algorithm
Created to fill the need for a high functionality, standards-based IGP for the TCP/IP protocol family
Main RFCs:
  1587 – OSPF NSSA Option
  2328 – OSPF Version 2 (current implementation)

54 What is a Link-State Protocol ?
Link = router interface
State = description of interface and its relationship to neighboring routers
OSPF routers send link-state advertisements (LSAs) to all other routers within the same hierarchical area
Routers store information in a link-state, or topological, database
Each OSPF router uses the SPF algorithm to calculate the shortest path to each node

55 Three (3) Types of OSPF LS Messages
LSA (Link State Advertisement): LSAs are included in the database description packets (DDPs or DBDs). LSA entries include link-state type, the address of the advertising router, the cost of the link, and the sequence number.
LSR (Link State Request): When a slave router receives a DDP (Database Description Packet), it sends an LSAck packet. Then it compares the received information with the information it has. If the DDP has more recent information, the slave router sends a link-state request (LSR) to the master router.
LSU (Link State Update): An LSU packet is sent in response to an LSR (Link-State Request) packet sent from a slave router to a master router. The LSU contains complete information about the requested entry.

56 What is SPF?
Places each router at the root of a tree and calculates the shortest path to each destination based on the cumulative cost to reach that destination
Each router has its own view of the topology even though all the routers build a shortest path tree using the same link-state database

57 SPF Cost
Cost, or metric, of an interface indicates the overhead required to send packets across that interface
Cost = 10**8/bandwidth (bps)
  Higher bandwidth = lower cost
  10M Ethernet line cost = 10**8/10**7 = 10
  T1 line cost = 10**8/1544000 = 64 (fraction dropped)
To handle hi-speed links, use a value greater than 10**8 in the cost calculation
  This is the Reference Bandwidth
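The cost formula is a single integer division. In the sketch below the function name is hypothetical, the T1 rate of 1,544,000 bps is the standard T1 line rate (assumed here), and the floor of 1 for links faster than the reference bandwidth is also an assumption about typical behavior, not something stated on the slide.

```python
def ospf_cost(bandwidth_bps, reference_bps=10**8):
    """Interface cost = reference bandwidth / interface bandwidth, fraction dropped.

    Clamped to a minimum of 1 (assumption) so that very fast links
    do not end up with a cost of 0.
    """
    return max(1, reference_bps // bandwidth_bps)

ETHERNET_10M = ospf_cost(10**7)                          # 10
T1 = ospf_cost(1_544_000)                                # 64
GIGABIT_DEFAULT_REF = ospf_cost(10**9)                   # 1, same as 100 Mbps
GIGABIT_HIGHER_REF = ospf_cost(10**9, reference_bps=10**10)  # 10
```

With the default reference of 10**8, every link of 100 Mbps or faster gets the same cost of 1, which is why a larger Reference Bandwidth is needed to tell high-speed links apart.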

58

59 Shortest Path Tree
Router A’s SPF tree
  A is the Root; use the least-cost path to each IP prefix
  If a link goes down, the SPF tree is recalculated
  Each router calculates its own SPF tree
[Figure: Router A’s shortest-path tree to Routers B and D, with link costs 5, 7, and 10]

60 Dijkstra’s Link State Algorithm
Principle: Dijkstra's algorithm works on the principle that the shortest possible path from the source has to come from one of the shortest paths already discovered.
Layman’s Terms: Using the street map, you're marking over the streets (tracing the street with a marker) in a certain order, until you have a route marked in from the starting point to the destination. The order is conceptually simple: from all the street intersections of the already marked routes, find the closest unmarked intersection - closest to the starting point (the "greedy" part). It's the whole marked route to the intersection, plus the street to the new, unmarked intersection. Mark that street to that intersection, draw an arrow with the direction, then repeat. Never mark to any intersection twice. When you get to the destination, follow the arrows backwards. There will be only one path back against the arrows, the shortest one.
Demo:
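The marking procedure can be sketched with a priority queue. This is a generic textbook form of Dijkstra's algorithm, not OSPF's actual SPF implementation; the four-node graph and its costs are hypothetical, and `prev` plays the role of the arrows that are followed backwards.

```python
import heapq

def shortest_paths(graph, source):
    """Return (dist, prev) for every node reachable from source.

    graph: {node: {neighbor: cost}} with non-negative costs
    prev:  {node: predecessor on the shortest path}, the "arrows"
    """
    dist = {source: 0}
    prev = {}
    done = set()            # intersections already marked
    heap = [(0, source)]
    while heap:
        d, node = heapq.heappop(heap)   # closest unmarked node (the greedy step)
        if node in done:
            continue                    # never mark a node twice
        done.add(node)
        for neighbor, cost in graph.get(node, {}).items():
            candidate = d + cost
            if candidate < dist.get(neighbor, float("inf")):
                dist[neighbor] = candidate
                prev[neighbor] = node
                heapq.heappush(heap, (candidate, neighbor))
    return dist, prev

def path_to(prev, source, dest):
    """Follow the arrows backwards from dest to source, then reverse."""
    path = [dest]
    while path[-1] != source:
        path.append(prev[path[-1]])
    return path[::-1]

# Hypothetical topology (illustration only).
GRAPH = {
    "A": {"B": 10, "C": 5},
    "B": {"A": 10, "C": 3, "D": 7},
    "C": {"A": 5, "B": 3, "D": 12},
    "D": {"B": 7, "C": 12},
}
```

From root A in this graph, the direct A-B link (cost 10) loses to the path through C (5 + 3 = 8), and D is reached through B for a total of 15, so `path_to(prev, "A", "D")` yields A, C, B, D.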

61 OSPF Breaks an AS into Areas
[Figure: Area 0, Area 10, and Area 234, joined by ABRs]

62

63 Area Sizing Guidelines
Rules of thumb for non-backbone area
  No more than 100 routers
  No more than 50 neighbors per router
Decrease when media unstable
  Consider static/default and demand techniques
Decrease when large numbers of externals injected
  Consider if the incoming externals can be summarized or filtered

64 When Might Single-Area OSPF make sense?
Fewer than 50 routers with alternate paths
Needs:
  multivendor compatibility
  fast convergence
  VLSM
  complex defaults and externals
No clear candidates for core
OSPF power greatest with hierarchy
  Multiple domains may be better than 1 area

65 Design Guidelines – Network Topology (Cont’d) - OSPF Network Size Recommendation

66 Design Guidelines – Network Topology (Cont’d) - How Many Areas Should Be Connected per ABR?

67 OSPF : Location of different routers

68 Different Types of OSPF routers
Internal router: An internal router has all of its interfaces in the same area. All internal routers in an area have the same link-state database.
Backbone router: Backbone routers sit on the perimeter of Area 0, with at least one interface connected to the backbone (Area 0).
Area Border Router (ABR): ABRs are routers that have interfaces attached to multiple areas. These routers maintain a separate link-state database for each area to which they are connected. They are capable of routing traffic destined for or arriving from other areas.
Autonomous System Boundary Router (ASBR): These are the routers that have at least one interface to an external network (another autonomous system). That external network can be non-OSPF. ASBRs are capable of route redistribution: the router can import routing information from non-OSPF networks and distribute it in the OSPF network for which it is responsible, and vice versa.

69 OSPF Terminology
Area ID: A 32 bit number identifying an area. Acquired from IANA.
Router ID: A 32 bit number identifying a router. Normally the lowest numbered IP address belonging to a router.
Router Priority: An 8 bit number that indicates this router’s willingness to be a designated/backup designated router. A router priority of zero indicates that this router is ineligible to be a designated router.
Link State Advertisement: Exchanged by adjacent routers to allow area topology databases to be maintained and inter-area and intra-AS routes to be advertised. There are five types of link state advertisements.
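To show how Router Priority and Router ID interact, here is a deliberately simplified sketch of choosing a designated router: routers with priority zero are excluded, the highest priority wins, and the highest Router ID breaks ties. The tie-break rule comes from the OSPF specification rather than from this slide, and the real election (which also picks a backup and does not preempt an existing designated router) is more involved. The function name and sample Router IDs are hypothetical.

```python
import ipaddress

def elect_dr(routers):
    """Return the Router ID of the designated router, or None.

    routers: list of (router_id, priority) with router_id as a dotted quad.
    Simplified: highest priority wins, ties go to the highest Router ID,
    and priority 0 means the router is ineligible.
    """
    eligible = [
        (priority, int(ipaddress.IPv4Address(router_id)), router_id)
        for router_id, priority in routers
        if priority > 0
    ]
    return max(eligible)[2] if eligible else None
```

For instance, among routers 10.0.0.1 and 10.0.0.2 with equal priority 1 and router 10.0.0.9 with priority 0, the sketch picks 10.0.0.2: the third router is ineligible despite having the highest Router ID.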

70

71

72 OSPF NETWORKS
OSPF supports three kinds of connections and networks.
  Point-to-Point between exactly two routers.
  Multi-access networks with broadcasting (e.g., Ethernet, T-R, etc.)
  Multi-access networks without broadcasting (e.g., packet switching WANs)
[Figure: a multi-access network with broadcasting, a multi-access network without broadcasting (X.25 network), and a point-to-point network]

73 PROTOCOL ENCAPSULATION and OSPF PROTOCOL NUMBER
NOTE: OSPF uses direct IP encapsulation. Protocol 89 is used for OSPF. OSPF is sent as multicast on pt-to-pt and broadcast networks.
[Figure: OSPF encapsulation. Network layer: IP header with Protocol Type 89, Source IP Address, and Destination IP Address, followed by the OSPF packet. Data link layer (Ethernet): PREAMBLE, DESTINATION ADDR, SOURCE ADDR, FIELD TYPE, IP HEADER, OSPF, FCS]

74

75 OSPF MESSAGE TYPES
HELLO (Type 1) is used to:
  identify neighbors,
  elect a Designated Router for a multi-access network,
  find out about an existing Designated Router, and
  act as an "I'm alive" signal.
DATABASE DESCRIPTION (Type 2) is used to exchange information during initialization so that a router can find out what data is missing from its topology database. Each LSA is preceded by a common LS Advertisement Header.
  LSA Specific Type 1: Router Link Advertisement
  LSA Specific Type 2: Network Link Advertisement
  LSA Specific Type 3: Summary Link Advertisement to other Areas
  LSA Specific Type 4: Summary Link Advertisement to ASBR
  LSA Specific Type 5: AS External Link Advertisement
LINK STATE REQUEST (Type 3) is used to ask for data that a router has discovered is missing from its topology database or to replace data that is out of date. Database descriptions are exchanged first, then Link State Requests are submitted to resolve missing or suspicious data.
LINK STATE UPDATE (Type 4) is used to reply to a Link State Request and also to dynamically report changes in network topology.
LINK STATE ACKNOWLEDGEMENT (Type 5) is used to confirm receipt of a Link State Update. The sender will retransmit until an update is ACKed.

76 Link State Advertisement Types
Router Link Advertisement (LSA Type 1) Generated by all OSPF routers and describe the state of the router's interface (links) within the area. They are flooded throughout a single area only. Network Link Advertisement (LSA Type 2) Generated by the Designated Router (DR) on a multi-access network and lists the routers connected to the network. Summary Link Advertisements Generated by Area Border Routers (ABR) and flooded throughout a single area only. There are two types: A summary advertisement (LSA Type 3) describing routes to destinations in other areas within the same AS. A summary advertisement (LSA Type 4) describing routes to AS Boundary Routers. For routers to get information out of the AS. AS External Link Advertisement (LSA Type 5) Generated by the AS Boundary Routers(ASBR) to describe routes to destinations external to the OSPF network. They are flooded to all areas in the OSPF network.

77 LSA Types Used in Flooding
Router Links Type 1 Summary Links Types 3 and 4 ABR Describe the state and cost of the router’s links (interfaces) to the area (Intra-area). Originated by ABRs only. Describe networks in the AS but outside of area (Inter-area). Also describe the location of the ASBR. Network Links Type 2 External Links Type 5 DR ASBR Originated for multi-access segments with more than one attached router. Describe all routers attached to the specific segment. Originated by a Designated Router (discussed later on). Originated by an ASBR. Describe destinations external to the autonomous system or a default route to the outside AS.

78 LSA Specific - Description
LSA Specific Type 1 (Router Description) These are the router-LSAs. They describe the collected states of the router's interfaces. For more information, consult Section LSA Specific Type 2 (Network Description) These are the network-LSAs. They describe the set of routers attached to the network. LSA Specific Type 3 or 4 These are the summary-LSAs. They describe inter-area routes, and enable the condensation of routing information at area borders. Originated by area border routers, The Type 3 summary-LSAs describe routes to networks (Network Description) Type 4 summary-LSAs describe routes to AS boundary routers. (Router Description) LSA Specific Type 5 (Network Description) These are the AS-external-LSAs. Originated by AS boundary routers, they describe routes to destinations external to the Autonomous System. A default route for the Autonomous System can also be described by an AS-external-LSA.
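The five LSA types described on the preceding slides can be summarized as a small lookup table. The sketch below is only illustrative: the `LsaType` class, the `LSA_TYPES` table and the `crosses_area_border` helper are hypothetical names, and the scope strings simply restate what the slides say about flooding.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LsaType:
    code: int
    name: str
    originator: str
    scope: str  # "single area" or "all areas", as stated on the slides


# Hypothetical summary table built from the slide text.
LSA_TYPES = {
    1: LsaType(1, "Router Link", "every OSPF router", "single area"),
    2: LsaType(2, "Network Link", "Designated Router", "single area"),
    3: LsaType(3, "Summary Link (networks)", "Area Border Router", "single area"),
    4: LsaType(4, "Summary Link (ASBR)", "Area Border Router", "single area"),
    5: LsaType(5, "AS External Link", "AS Boundary Router", "all areas"),
}


def crosses_area_border(lsa_type: int) -> bool:
    """Return True if this LSA type is flooded unchanged into every area."""
    return LSA_TYPES[lsa_type].scope == "all areas"


if __name__ == "__main__":
    for lsa in LSA_TYPES.values():
        print(f"Type {lsa.code}: {lsa.name}, from {lsa.originator}, {lsa.scope}")
```

Only Type 5 is flooded across area borders; Types 3 and 4 carry inter-area information but are regenerated by the ABR for each area.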

79 BASIC OSPF OPERATIONAL SEQUENCE
S1: Routers discover their OSPF neighbors. When the OSPF routers first start, they establish and maintain a relationship with their neighbors using the Hello protocol. S2: Routers elect a Designated Router (DR) and a Backup Designated Router (BDR) for a network (LAN) with multiple routers using the Hello protocol. S3: The routers form adjacencies. On multi-access networks, all routers become adjacent to the DR and the BDR.

80 BASIC OSPF OPERATIONAL SEQUENCE (Cont'd)
S4: Adjacent routers then exchange Database Description packets, which may describe part or all of the router's Link State Database. The adjacent routers then synchronize their Link State Databases by requesting missing or outdated information on the advertised links. This is done through a Link State Request packet. The response is a Link State Update packet. A Link State Acknowledgement packet is used to confirm the correct receipt of a Link State Update packet. S5: The routers then calculate the routing table by running the Shortest Path First (SPF) algorithm using the Link State Database as input. The routers periodically advertise their link states, either when a refresh timer expires or when a link state changes. They then recalculate their routing tables.
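Step S5 can be sketched with a textbook Dijkstra implementation. This is a minimal illustration, not OSPF's actual SPF code: the link-state database is assumed to be a plain dictionary of `{router: {neighbor: cost}}`, and the function name `shortest_path_first` is hypothetical.

```python
import heapq


def shortest_path_first(lsdb, root):
    """Run Dijkstra over a link-state database.

    lsdb: {router: {neighbor: link_cost}} (assumed, simplified representation)
    Returns {destination: (total_cost, first_hop)} for every reachable router.
    """
    routes = {}
    visited = set()
    # Heap entries are (cost so far, node, first hop used to reach node).
    heap = [(0, root, root)]
    while heap:
        cost, node, first_hop = heapq.heappop(heap)
        if node in visited:
            continue
        visited.add(node)
        if node != root:
            routes[node] = (cost, first_hop)
        for neighbor, link_cost in lsdb.get(node, {}).items():
            if neighbor not in visited:
                # The first hop is fixed once we leave the root.
                hop = neighbor if node == root else first_hop
                heapq.heappush(heap, (cost + link_cost, neighbor, hop))
    return routes


if __name__ == "__main__":
    example = {
        "A": {"B": 1, "C": 5},
        "B": {"A": 1, "C": 2},
        "C": {"A": 5, "B": 2},
    }
    print(shortest_path_first(example, "A"))
```

In the example topology, router A reaches C through B at cost 3 rather than over the direct link of cost 5.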

81 How OSPF Protocol Works
S1: Discovering Neighbors => Hello Messages S2: Electing the Designated Router S3: Establishing Adjacencies => DB Description Msgs S4: Propagating Link State Information => Flooding using LS Request/Update Msgs S5: Calculating the Routing Table(s) => Dijkstra's Algorithm

82 Discovering Neighbors – Hello Protocol

83 Hello Exchange Process – Pt-to-Pt Link

84 Hello Exchange Process – Ethernet Link

85 OSPF HELLO MESSAGE Common Message Header fields: Version, Message Type, Message Length, Router Identification, Area Identification, Checksum, Authentication Type, Authentication (octets 0-3), Authentication (octets 4-7). Hello fields: Network Mask, Hello Interval, Options (E, T), Router Priority, Dead Interval Timer, Designated Router, Backup Designated Router, Neighbor One IP Address, ... The Hello message type identifies neighbors, elects the Designated Router (DR), finds out about an existing DR, and serves as an alive signal. NETWORK MASK: This field contains the subnet mask of the network over which the message was sent (the mask associated with the interface). If this field does not match the receiving router's mask for that network, the receiving router rejects the Hello message and does not accept the transmitting router as a neighbor. In the absence of subnetting it is set to the default subnet mask. HELLO INTERVAL: This field tells how often, in seconds, this router transmits its Hello messages. On broadcast networks it is normally 10 seconds; on non-broadcast networks it is normally 30 seconds. The HelloInterval and RouterDeadInterval fields in a received OSPF packet must match the settings configured on the receiving interface.
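The matching rule for Hello parameters can be sketched as a simple comparison. The `HelloParams` class and `hello_mismatches` function are hypothetical names for illustration, and the 40-second dead interval in the example is an assumed value that does not come from the slide.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class HelloParams:
    network_mask: str
    hello_interval: int  # seconds
    dead_interval: int   # seconds


def hello_mismatches(received: HelloParams, configured: HelloParams) -> list:
    """Return the names of fields that differ; an empty list means the
    sender can be accepted as a neighbor as far as these checks go."""
    fields = ("network_mask", "hello_interval", "dead_interval")
    return [f for f in fields if getattr(received, f) != getattr(configured, f)]


if __name__ == "__main__":
    local = HelloParams("255.255.255.0", 10, 40)   # dead interval assumed
    remote = HelloParams("255.255.255.0", 30, 40)  # non-broadcast style timer
    print(hello_mismatches(remote, local))
```

A non-empty result means the receiving router would drop the Hello and never list the sender as a neighbor.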

86 How OSPF Protocol Works
S1: Discovering Neighbors => Hello Message S2: Electing the Designated Router S3: Establishing Adjacencies => DB Description Msgs S4: Propagating Link State Information => Flooding using LS Request/Update Msgs S5: Calculating the Routing Table(s) => Dijkstra's Algorithm

87 DR and BDR Election - Example
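As a rough sketch of the election outcome: the router with the highest Router Priority becomes DR, ties are broken by the highest Router ID, and routers with priority 0 are ineligible. The code below is a deliberate simplification that assumes all routers start at the same time; it ignores the fact that an existing DR is not preempted. The function name `elect_dr_bdr` and the example addresses are hypothetical.

```python
import ipaddress


def elect_dr_bdr(routers):
    """routers: list of (router_id, priority) tuples.

    Returns (dr_router_id, bdr_router_id); either may be None.
    Simplified: assumes a cold start with no DR or BDR already in place.
    """
    eligible = [r for r in routers if r[1] > 0]  # priority 0 never becomes DR
    ranked = sorted(
        eligible,
        key=lambda r: (r[1], int(ipaddress.IPv4Address(r[0]))),
        reverse=True,
    )
    dr = ranked[0][0] if ranked else None
    bdr = ranked[1][0] if len(ranked) > 1 else None
    return dr, bdr


if __name__ == "__main__":
    lan = [("10.0.0.1", 1), ("10.0.0.2", 1), ("10.0.0.3", 0), ("10.0.0.9", 5)]
    print(elect_dr_bdr(lan))
```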

88 How OSPF Protocol Works
S1: Discovering Neighbors => Hello Message S2: Electing the Designated Router S3: Establishing Adjacencies => DB Description Msgs S4: Propagating Link State Information => Flooding using LS Request/Update Msgs S5: Calculating the Routing Table(s) => Dijkstra's Algorithm

89

90 Database Sync Process In a link-state routing algorithm, it is very important for all routers' link-state databases to stay synchronized in order to have compatible routing tables. OSPF simplifies this by requiring only adjacent routers to remain synchronized. The synchronization process begins as soon as the routers attempt to bring up the adjacency. Each router describes its database by sending a sequence of Database Description packets to its neighbor. Each Database Description Packet describes a set of LSAs belonging to the router's database. This sending and receiving of Database Description packets is called the "Database Exchange Process". During this process, the two routers form a master/slave relationship. Each Database Description Packet has a sequence number. Database Description Packets (DDPs) sent by the master (polls) are acknowledged by the slave through echoing of the sequence number. The database exchange initially sends only the LSA headers, not the full LSA contents, to save bandwidth and processing. The master is the only one allowed to retransmit Database Description Packets. It does so only at fixed intervals, the length of which is the configured per-interface constant RxmtInterval. Each Database Description Packet contains an indication that there are more packets to follow: the M-bit. The Database Exchange Process is over when a router has received and sent Database Description Packets with the M-bit off. During and after the Database Exchange Process, each router has a list of those LSAs for which the neighbor has more up-to-date instances. These LSAs are requested in Link State Request Packets. Link State Request packets that are not satisfied are retransmitted at fixed intervals of time RxmtInterval. When the Database Exchange Process has completed and all Link State Requests have been satisfied, the databases are deemed synchronized and the routers are marked fully adjacent.
At this time the adjacency is fully functional and is advertised in the two routers' router-LSAs. Criteria for determining adjacency between 2 routers: Have the same number of LSAs in their LSDBs Sum of their LSA’s LS Checksum fields are equal
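The step of deciding which LSAs to request can be sketched as a comparison of LSA headers. This is a simplified illustration: it compares only sequence numbers, whereas a real implementation also considers the checksum and LS Age. The dictionary layout and the name `lsas_to_request` are assumptions made for the example.

```python
def lsas_to_request(local_db, neighbor_headers):
    """Return the keys of LSAs that are missing locally or newer at the neighbor.

    Both arguments map (ls_type, link_state_id, advertising_router) to an
    LS sequence number. Simplified: a higher sequence number means newer.
    """
    return [
        key
        for key, seq in neighbor_headers.items()
        if key not in local_db or seq > local_db[key]
    ]


if __name__ == "__main__":
    local = {
        (1, "10.0.0.1", "10.0.0.1"): 5,
        (1, "10.0.0.2", "10.0.0.2"): 7,
    }
    neighbor = {
        (1, "10.0.0.1", "10.0.0.1"): 6,   # newer instance at the neighbor
        (1, "10.0.0.2", "10.0.0.2"): 7,   # same instance, nothing to do
        (2, "10.0.0.9", "10.0.0.2"): 1,   # unknown locally
    }
    print(lsas_to_request(local, neighbor))
```

The returned keys correspond to the entries a router would place in its Link State Request packets.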

91 OSPF DB DESCRIPTION MESSAGE
Common Message Header fields: Version, Message Type, Message Length, Router Identification, Area Identification, Checksum, Authentication Type, Authentication (octets 0-3), Authentication (octets 4-7). The DB Description message type is used in establishing adjacency. LSA header fields: LS Age, Options, LS Type, Link State Identification, Advertising Router, Link State Sequence Number, LS Checksum, Length.

92 OSPF DB DESCRIPTION MESSAGE WITH LINK ADVERTISEMENT HEADER
Type Version Message Length Headers contain enough information to identify the LS records needed during synchronization. The receiver marks the LS Records to be requested. Router Identification Area Identification Checksum Authentication Type Authentication (octets 0-3) Authentication (octets 4-7) LS Age Options LS Type LS Types are: Type 1: Router Links Type 2: Network Links Type 3/4: Summary Links Type 5: External Links Link State Identification Advertising Router Link State Sequence Number LS Checksum Length LS Age: A 16 bit number indicating the time in seconds since the origin of the advertisements. This time increases as the link state advertisement resides in the router database and/or with each hop count. When it reaches a maximum value, normally one hour, it is discarded unless needed for synchronization. Options: See the Hello Packet. LS Type: This field specifies which of five different link state advertisements is contained in this header. Link State ID: A unique ID for the advertisement which is dependent upon the message type. LSA Message Types 1/4 uses the Router ID. LSA Message Type 2 uses the IP address of the Designated Router. LSA Message Types 3/5 uses an IP network number.

93 OSPF DB DESCRIPTION MESSAGE WITH LINK ADVERTISEMENT HEADER (CONT’D)
Headers contain enough information to identify the LS records needed during synchronization. The receiver marks the LS Records to be requested. Message Type Version Message Length Router Identification Area Identification Checksum Authentication Type Authentication (octets 0-3) Authentication (octets 4-7) LS Age Options LS Type LS Types are: Type 1: Router Links Type 2: Network Links Type 3/4: Summary Links Type 5: External Links Link State Identification Advertising Router Link State Sequence Number LS Checksum Length Advertising Router: The Router ID of the router that originated the link state advertisement. LSA Message Type 1 is identical to the Link State ID. LSA Message Type 2 uses the Router ID of the Network's Designated Router. LSA Message Types 3/4 use Router ID of the Area Border Router. LSA Message Type 5 uses the Router ID of the AS Boundary Router. LS Sequence Number: This field is used to sequence the advertisements and to detect duplicate or old packets. LS Checksum: The Checksum of the complete Link State Advertisement excluding the LS Age field. Length: The length is the size of the advertisements in bytes including the Advertisement Header.
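The LSA header fields described on these two slides occupy 20 bytes on the wire. The sketch below unpacks them with Python's `struct` module; the `LsaHeader` tuple and `parse_lsa_header` function are hypothetical names, and the sample values are invented for illustration.

```python
import ipaddress
import struct
from typing import NamedTuple


class LsaHeader(NamedTuple):
    ls_age: int
    options: int
    ls_type: int
    link_state_id: str
    advertising_router: str
    sequence_number: int
    checksum: int
    length: int


# Network byte order: age(2) options(1) type(1) id(4) adv router(4)
# sequence number(4, signed) checksum(2) length(2) = 20 bytes.
LSA_HEADER = struct.Struct("!HBB4s4siHH")


def parse_lsa_header(data: bytes) -> LsaHeader:
    """Decode the first 20 bytes of data as an LSA header."""
    age, options, ls_type, ls_id, adv, seq, checksum, length = LSA_HEADER.unpack_from(data)
    return LsaHeader(
        age,
        options,
        ls_type,
        str(ipaddress.IPv4Address(ls_id)),
        str(ipaddress.IPv4Address(adv)),
        seq,
        checksum,
        length,
    )


if __name__ == "__main__":
    rid = ipaddress.IPv4Address("10.0.0.1").packed
    raw = LSA_HEADER.pack(1, 2, 1, rid, rid, -0x7FFFFFFF, 0xBEEF, 36)
    print(parse_lsa_header(raw))
```

For a Type 1 LSA the Link State ID and the Advertising Router are the same Router ID, which is why the sample packs the same address twice.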

94 LSA Flooding - Operations

95 Example of Router LSAs

96 Example of Network LSAs

97 Example of Summary LSAs

98 External Route LSAs - Example

99 How OSPF Protocol Works
S1: Discovering Neighbors => Hello Message S2: Electing the Designated Router S3: Establishing Adjacencies => DB Description Msgs S4: Propagating Link State Information => Flooding using LS Request/Update Msgs S5: Calculating the Routing Table(s) => Dijkstra's Algorithm

100 Propagating LS Info: When a link state changes

101 Propagating LS Info: When a link state changes

102

103 How OSPF Protocol Works
S1: Discovering Neighbors => Hello Message S2: Electing the Designated Router S3: Establishing Adjacencies => DB Description Msgs S4: Propagating Link State Information => Flooding using LS Request/Update Msgs S5: Calculating the Routing Table(s) => Dijkstra's Algorithm

104 Pros and Cons of OSPF Advantages of OSPF:
Changes in an OSPF network are propagated quickly. OSPF is hierarchical, using area 0 as the top of the hierarchy. OSPF is a Link State Algorithm. OSPF supports Variable Length Subnet Masks (VLSM). OSPF uses multicasting within areas. After initialization, OSPF only sends updates on routing table sections which have changed; it does not send the entire routing table. Using areas, OSPF networks can be logically segmented to decrease the size of routing tables. Table size can be further reduced by using route summarization. OSPF is an open standard, not related to any particular vendor. Can load-balance up to 6 equal-cost routes with 4 as the default. Disadvantages of OSPF: OSPF is very processor intensive. OSPF maintains multiple copies of routing information, increasing the amount of memory needed. Using areas, OSPF can be logically segmented (this can be a good thing and a bad thing). OSPF is not as easy to learn as some other protocols. In the case where an entire network is running OSPF, and one link within it is "bouncing" every few seconds, OSPF updates would dominate the network by informing every other router every time the link changed state.

105

106 BGP - Autonomous System
Networks and Routers under a single administrative authority Each AS is assigned a number AS numbers range from 1 to 65,535

107 Different AS Types (slide 30)

108 BGP is An Exterior Gateway Protocol (EGP), used to propagate tens or hundreds of thousands of routes between networks (ASs). The only protocol used to do this on the Internet today.

109 What is BGP? BGP is an inter-domain routing protocol that communicates prefix reachability BGP is a path vector protocol Similar to distance vector BGP views the Internet as a collection of autonomous systems Stability is very important to the Internet and BGP BGP supports CIDR BGP routers exchange routing information between peers Defined in RFC 1771 (since obsoleted by RFC 4271)

110 How Does BGP Work? BGP uses TCP as its transport protocol (port 179). Two BGP routers form a TCP connection between one another (peer routers) and exchange messages to open and confirm the connection parameters. BGP routers exchange network reachability information. This information is mainly an indication of the full paths (BGP AS numbers) that a route should take in order to reach the destination network. This information helps in constructing a graph of ASs that are loop-free and where routing policies can be applied in order to enforce some restrictions on the routing behavior. Any two routers that have formed a TCP connection in order to exchange BGP routing information are called peers, or neighbors. BGP peers initially exchange their full BGP routing tables. After this exchange, incremental updates are sent as the routing table changes. BGP keeps a version number of the BGP table, which should be the same for all of its BGP peers. The version number changes whenever BGP updates the table due to routing information changes. Keepalive packets are sent to ensure that the connection is alive between the BGP peers and notification packets are sent in response to errors or special conditions.
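Every BGP message exchanged over the TCP connection starts with the same fixed header: a 16-byte marker, a 2-byte length and a 1-byte type. The sketch below builds and parses that header for a KEEPALIVE, which consists of the header alone. The function names are hypothetical, and no socket is opened; the code only shows the byte layout.

```python
import struct

# Message type codes carried in the BGP header.
BGP_TYPES = {1: "OPEN", 2: "UPDATE", 3: "NOTIFICATION", 4: "KEEPALIVE"}

# 16-byte marker, 2-byte total length, 1-byte type (network byte order).
BGP_HEADER = struct.Struct("!16sHB")
MARKER = b"\xff" * 16


def build_keepalive() -> bytes:
    """A KEEPALIVE is just the 19-byte header with type 4."""
    return BGP_HEADER.pack(MARKER, BGP_HEADER.size, 4)


def parse_header(data: bytes):
    """Return (total_length, type_name) for the message starting at data[0]."""
    marker, length, msg_type = BGP_HEADER.unpack_from(data)
    if marker != MARKER:
        raise ValueError("unexpected marker in BGP header")
    return length, BGP_TYPES.get(msg_type, "UNKNOWN")


if __name__ == "__main__":
    print(parse_header(build_keepalive()))
```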

111 BGP Fundamentals BGP peers exchange routes and send updates not faster than every 90 seconds by default. Routes consist of destination prefixes with an AS path and BGP-specific attributes Each BGP update contains one path advertisement and attributes Many destinations can share the same path BGP compares the AS path and attributes to choose the best path Unfeasible routes can be advertised Unreachable routes are withdrawn

112 BGP Connections BGP updates are incremental
No regular refreshes Except at session establishment, when volume of routing can be high BGP runs over TCP connections TCP port 179 TCP Services Fragmentation, Acknowledgements, Checksums, Sequencing, and Flow Control No automatic neighbor discovery

113 BGP Peering BGP sessions are established between peers
BGP Speakers Two types of peering sessions E-BGP (external) peers with different ASs I-BGP (internal) peers within the same AS Still requires interior gateway protocols (IGPs) IGP connects BGP speakers within the AS IGP advertises internal routes

114 iBGP When BGP speakers in the same AS form a BGP connection for
the purpose of exchanging routing information, they are said to be running IBGP or internal BGP. IBGP speakers are usually fully-meshed. A c B

115 eBGP (1) When BGP speakers in different ASs form a BGP connection for
the purpose of exchanging routing information, they are said to be running EBGP or external BGP. EBGP peers are usually directly connected. AS 3561 A AS 3847 B

116 eBGP (2) AS 2033 AS 7007 AS 4200 AS 2041

117 iBGP and eBGP Diagram AS 1239 XP AS 701 AS 7007 AS 6079 AS 4006

118 eBGP Rules By default, only talks to directly-connected router.
Sends the one best BGP route for each destination. Sends all of the important “attributes”; omits the “local preference” attribute. Adds (prepends) the speaker’s ASN to the “as-path” attribute. Usually rewrites the “next-hop” attribute.

119 iBGP Rules Can talk to routers many hops away by default.
Can only send routes it “injects”, or routes heard DIRECTLY from an external peer. Thus, requires a FULL mesh. Sends all attributes. Leaves the as-path attribute alone. Doesn’t touch the “next hop” attribute.

120 Logical view of 16 routers, fully meshed
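The cost of the full-mesh requirement is easy to quantify: n routers need n(n-1)/2 iBGP sessions. A short sketch, with a hypothetical function name:

```python
def ibgp_full_mesh_sessions(routers: int) -> int:
    """Number of iBGP sessions needed to fully mesh the given number of routers."""
    return routers * (routers - 1) // 2


if __name__ == "__main__":
    for n in (4, 16, 100):
        print(n, "routers ->", ibgp_full_mesh_sessions(n), "sessions")
```

For the 16 routers on this slide, that is 120 sessions, and each router terminates 15 of them.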

121 iBGP Restriction (1) Assume AS1239 sends route /8 to AS Router A will send that route to Routers B and C. B AS 2828 A AS 1239 C

122 iBGP Restriction (2) When Router B receives /8, it will not propagate that route to Router C because it was learned from an iBGP neighbor. Router C will behave similarly. B AS 2828 A AS 1239 C

123 BGP Route Advertisement
Only advertise the active BGP routes to peers (by default) BGP Next-hop must be reachable Never forward I-BGP routes to I-BGP peers Prevents loops Withdraw routes if active BGP routes become unreachable

124 CIDR and Aggregate Addresses
(1) AS 2 has the detailed routes / / / /24 AS 1 Router A (3) AS 1 learns only the aggregate and not the details Router B AS 2 /22 /22 (2) BGP with Routing Policy can advertise a prefix that aggregates the detailed routes One of the main enhancements of BGP4 over BGP3 is Classless Interdomain Routing (CIDR). CIDR or supernetting is a new way of looking at IP addresses. There is no notion of classes anymore (class A, B or C). For example, network which used to be an illegal class C network is now a legal supernet represented by /16 where the 16 is the number of bits in the subnet mask counting from the far left of the IP address. This is similar to Aggregates are used to minimize the size of routing tables. Aggregation is the process of combining the characteristics of several different routes in such a way that a single route can be advertised. In the example below, RTB is generating network We will configure RTC to propagate a supernet of that route to RTA. Router C AS 3
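Aggregation of four contiguous /24 routes into one /22 can be reproduced with Python's standard `ipaddress` module. The prefixes below are hypothetical stand-ins, since the addresses on the slide are not legible, and `aggregate_routes` is an illustrative helper, not a router command.

```python
import ipaddress


def aggregate_routes(prefixes):
    """Collapse a list of prefix strings into the smallest covering set."""
    networks = [ipaddress.ip_network(p) for p in prefixes]
    return [str(n) for n in ipaddress.collapse_addresses(networks)]


if __name__ == "__main__":
    # Hypothetical detailed routes known inside one AS.
    detailed = ["10.1.0.0/24", "10.1.1.0/24", "10.1.2.0/24", "10.1.3.0/24"]
    print(aggregate_routes(detailed))
```

Only prefixes that are contiguous and aligned on the aggregate boundary collapse into a single route; anything else stays as separate entries.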

125 IBGP, EBGP Example AS 1 AS 3 EBGP EBGP AS 2 IBGP

126 Advertising Networks Using the Network command
Redistributing static routes Redistributing Dynamic routes

127 Using Network Command Advertising Networks router bgp 1 router bgp 2
AS1 A AS2 B EBGP Router A router bgp 1 neighbor remote-as 2 network network Router B router bgp 2 neighbor remote-as 1 network network Network Command The format of the network command follows: network network-number [mask network-mask] The network command controls what networks are originated by this box. This is a different concept from what you are used to configuring with IGRP and RIP. With this command we are not trying to run BGP on a certain interface, rather we are trying to indicate to BGP what networks it should originate from this box. The mask portion is used because BGP version 4 (BGP4) can handle subnetting and supernetting. A maximum of 200 entries of the network command are accepted. The network command will work if the network you are trying to advertise is known to the router, whether connected, static or learned dynamically. An example of the network command follows: RTA# router bgp 1 network mask ip route null 0 The above example indicates that router A, will generate a network entry for /16. The /16 indicates that we are using a supernet of the class C address and we are advertizing the first two octets (the first 16 bits). Note that we need the static route to get the router to generate because the static route will put a matching entry in the routing table.

128 By redistributing Static Routes
Advertising Networks By redistributing Static Routes AS1 A AS2 B EBGP Router A router bgp 1 neighbor remote-as 2 redistribute static ip route null 0 ip route null 0 You could always use static routes to originate a network or a subnet. The only difference is that BGP will consider these routes as having an origin of incomplete (unknown). In the above example the same could have been accomplished by doing: RTC# router eigrp 10 network redistribute bgp 200 default-metric router bgp 200 neighbor remote-as 300 redistribute static ... ip route null The null 0 interface means disregard the packet. So if I get the packet and there is a more specific match than (which exists of course) the router will send it to the specific match otherwise it will disregard it. This is a nice way to advertise a supernet. We have discussed how we can use different methods to originate routes out of our autonomous system. Please remember that these routes are generated in addition to other BGP routes that BGP has learned via neighbors (internal or external). BGP passes on information that it learns from one peer to other peers. The difference is that routes generated by the network command, or redistribution or static, will indicate your AS as the origin for these networks. Injecting BGP into IGP is always done by redistribution. Example: RTA# router bgp 100 neighbor remote-as 300 network RTB# router bgp 200 neighbor remote-as 300 network RTC# router bgp 300 neighbor remote-as 100 neighbor remote-as 200 network Note that you do not need network or network in RTC unless you want RTC to also generate these networks on top of passing them on as they come in from AS100 and AS200. Again the difference is that the network command will add an extra advertisement for these same networks indicating that AS300 is also an origin for these routes. An important point to remember is that BGP will not accept updates that have originated from its own AS. This is to insure a loop free interdomain topology. 
For example, assume AS200 above had a direct BGP connection into AS100. RTA will generate a route and will send it to AS300 then RTC will pass this route to AS200 with the origin kept as AS100, RTB will pass to AS100 with origin still AS100. RTA will notice that the update has originated from its own AS and will ignore it.
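The loop-prevention rule in this example can be sketched in a few lines: each AS prepends its own number when advertising to an external peer, and a router ignores any update whose AS path already contains its own AS. The function names are hypothetical.

```python
def advertise_to_ebgp_peer(local_as, as_path):
    """Prepend the local AS number, as done when sending to an eBGP peer."""
    return [local_as] + list(as_path)


def accept_update(local_as, as_path):
    """Ignore updates that already carry our own AS number (a routing loop)."""
    return local_as not in as_path


if __name__ == "__main__":
    # AS100 originates a route; it travels AS100 -> AS300 -> AS200 -> AS100.
    path = advertise_to_ebgp_peer(100, [])      # sent by RTA
    path = advertise_to_ebgp_peer(300, path)    # sent by RTC
    path = advertise_to_ebgp_peer(200, path)    # sent by RTB
    print(path, "accepted by AS100:", accept_update(100, path))
```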

129 By Redistributing Dynamic Routes
Advertising Networks By Redistributing Dynamic Routes AS1 A B EBGP Router A router bgp 1 neighbor remote-as 2 redistribute ospf 1 router ospf 1 network area 0 AS2

130 BGP Attributes AS-path Next-hop Local preference MED Origin
Communities

131 BGP Attributes: AS-Path
A list of ASs that a route has traversed (sequence). Path traversed one or more members of a set, e.g. {1880, 1881, 1882} (as-set). Shortest AS path preferred. [Figure: AS-Path example with ASs 1880, 1881, 1882 and 1883 and their /24 prefixes]

132 BGP Attributes: Multi-Exit Discriminator (MED)
Preference sent to all routers in the remote AS. Where do I want to receive the traffic? [Figure: MED example; labels 690, 1883, 1755, 1880, 200, 209]

133 Multi-Exit Discriminator (MED)
Indication to external peers of the preferred path into an AS. Affects routes with same AS path. Advertised to external neighbors Usually based on IGP metric Lowest MED preferred

134 MED Attribute (2) The MED (multi-exit discriminator) is a commonly used attribute. It comes after the AS_PATH in evaluation, and thus isn’t quite as much of a “hammer” as local-pref. Commonly, MED is used to tack a distance on BGP routes as they move within your network. NSPs advertise MEDs to each other to let it be known which POP the route is “closest” to.

135 BGP Attributes: Local Preference
Preference sent to all routers in the local AS. Where do I want traffic to leave? [Figure: local preference example; labels 690, 1755, 1880, 666; router A needs to go to 690] NW'98 © 1998, Cisco Systems, Inc.

136 Local Preference Attribute
AS 3847 F E G C D / / Preferred by all AS3847 routers Local to AS Used to influence BGP path selection Default 100 Highest local-pref preferred A B /24 AS 6201

137 Local-Pref Attribute (2)
An often-used attribute, local-pref (normally 100) overrides AS_PATH, and is transitive throughout your network. It is never advertised to an eBGP peer. For example, you can express the policy “prefer private interconnects” by making the local_pref be 150 and leaving all other peers at 100. Best used as an intermediate-level knob.

138 The BGP Path Decision Algorithm
BGP determines the best path to each destination for a BGP speaker by comparing path attributes according to the following selection sequence: Select a path with a reachable next hop. Select the path with the highest weight. If path weights are the same, select the path with the highest local preference value. Prefer locally originated routes (network routes, redistributed routes, or aggregated routes) over received routes. Select the route with the shortest AS-path length. If all paths have the same AS-path length, select the path based on origin: IGP is preferred over EGP; EGP is preferred over Incomplete. If the origins are the same, select the path with lowest MED value. If the paths have the same MED values, select the path learned via EBGP over one learned via IBGP. Select the route with the lowest IGP cost to the next hop. Select the route received from the peer with the lowest BGP router ID.
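The selection sequence above maps naturally onto a sort key, where each element of the key is one tie-breaker. The sketch below is an illustration under simplifying assumptions: MED is compared across all paths rather than only between paths from the same neighboring AS, and the `Path` fields and default values are hypothetical.

```python
import ipaddress
from dataclasses import dataclass

ORIGIN_RANK = {"IGP": 0, "EGP": 1, "INCOMPLETE": 2}


@dataclass
class Path:
    name: str
    next_hop_reachable: bool = True
    weight: int = 0
    local_pref: int = 100
    locally_originated: bool = False
    as_path: tuple = ()
    origin: str = "IGP"
    med: int = 0
    learned_via_ebgp: bool = True
    igp_cost: int = 0
    peer_router_id: str = "0.0.0.0"


def preference_key(p: Path):
    """Smaller key means more preferred; order follows the slide's sequence."""
    return (
        -p.weight,                  # highest weight
        -p.local_pref,              # highest local preference
        not p.locally_originated,   # locally originated first
        len(p.as_path),             # shortest AS path
        ORIGIN_RANK[p.origin],      # IGP < EGP < Incomplete
        p.med,                      # lowest MED
        not p.learned_via_ebgp,     # eBGP over iBGP
        p.igp_cost,                 # lowest IGP cost to next hop
        int(ipaddress.IPv4Address(p.peer_router_id)),  # lowest router ID
    )


def best_path(paths):
    """Return the preferred path, or None if no next hop is reachable."""
    usable = [p for p in paths if p.next_hop_reachable]
    return min(usable, key=preference_key) if usable else None


if __name__ == "__main__":
    short = Path("short", as_path=(1,))
    preferred = Path("preferred", local_pref=150, as_path=(1, 2, 3))
    print(best_path([short, preferred]).name)
```

The example shows local preference overriding AS-path length: the path with local preference 150 wins despite its longer AS path.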

139 Common Internet Routing Phenomenon
E-BGP Route Flapping/Oscillation Remedy: Route Flap Dampening

140 BGP - Route Flapping Routing instability
Routes disappear, appear again, then disappear Withdrawal, announcement, withdrawal, announcement Visible to the Internet Waste resources Some causes of route flapping Flaky inter-AS links Flaky or insufficient hardware Link congestion IGP instability Operator error

141 BGP – Route Flap Dampening
If you are running BGP version 4, the BGP process assigns a penalty of 1000 to the route each time it flaps. When the penalty value exceeds the first of two limits (Re-use limit, Suppress limit), the route is moved into the 'historical' list of routes, dampened, and then is no longer accepted from other peers or announced to any peers. After the first limit has been exceeded, the timer which tracks the period for which the route is to be dampened is doubled for each flap. The suppression half-life is 15 minutes. The maximum suppress limit is four times the half-life; thus, one hour is the default. The suppression penalty decays at half the half life (7.5 minutes). So: First flap, penalty 1000 assigned, route placed in 'historical' category and becomes less preferred. Second flap, route has met the suppression limit of 2000 (a Cisco default). The route is dampened and no longer advertised to neighbors or accepted from neighbors. If route does not flap any further the penalty is decayed. The decay process begins 7.5 minutes after the route stabilized and decays exponentially every 5 seconds thereafter. Once the suppression penalty decays below 750 (the default value for the reuse threshold), the route is removed from dampened state and reused. The router parses the historical routes list every 10 seconds for reusable routes.
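The decay arithmetic can be checked with a short calculation. The sketch below assumes plain exponential decay with the 15-minute half-life and uses the suppress (2000) and reuse (750) thresholds quoted above; it does not model the 7.5-minute delay or the 5-second update granularity mentioned on the slide. The function names are hypothetical.

```python
import math


def penalty_after(initial: float, minutes: float, half_life: float = 15.0) -> float:
    """Penalty remaining after the given time, assuming exponential decay."""
    return initial * 0.5 ** (minutes / half_life)


def minutes_until_reuse(
    penalty: float, reuse_limit: float = 750.0, half_life: float = 15.0
) -> float:
    """Time for the penalty to decay to the reuse limit (0 if already below)."""
    if penalty <= reuse_limit:
        return 0.0
    return half_life * math.log2(penalty / reuse_limit)


if __name__ == "__main__":
    print(round(penalty_after(2000, 15), 1))      # one half-life later
    print(round(minutes_until_reuse(2000), 1))    # after two quick flaps
```

Under these assumptions, a route suppressed at a penalty of 2000 becomes reusable after roughly 21 minutes without further flaps.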

142 Route Flap Dampening - Operation

143 Useful Tool To Understand BGP Peering Relationships

144 UDP, TCP/IP - Internet Protocols

145 UDP: User Datagram Protocol

146 UDP Header

147 What is UDP? Relatively simple compared to TCP
UDP provides connectionless service for data delivery between two hosts No logical connection is established by UDP, so no connection-oriented services are supplied The application may need some of those services (e.g., no corrupt packets), so the application is responsible to provide them Applications that use UDP: tftp, DNS (for some functions), SNMP, RIP, VoIP

148 Why Use UDP over TCP?
No connection establishment. As we shall discuss in Section 3.5, TCP uses a three-way handshake before it starts to transfer data. UDP just blasts away without any formal preliminaries. Thus UDP does not introduce any delay to establish a connection. This is probably the principal reason why DNS runs over UDP rather than TCP: DNS would be much slower if it ran over TCP. HTTP uses TCP rather than UDP, since reliability is critical for Web pages with text. But, as we briefly discussed in Section 2.2, the TCP connection establishment delay in HTTP is an important contributor to the "world wide wait".
No connection state. TCP maintains connection state in the end systems. This connection state includes receive and send buffers, congestion control parameters, and sequence and acknowledgment number parameters. We will see in Section 3.5 that this state information is needed to implement TCP's reliable data transfer service and to provide congestion control. UDP, on the other hand, does not maintain connection state and does not track any of these parameters. For this reason, a server devoted to a particular application can typically support many more active clients when the application runs over UDP rather than TCP.
Small segment header overhead. The TCP segment has 20 bytes of header overhead in every segment, whereas UDP has only 8 bytes of overhead.
Unregulated send rate. TCP has a congestion control mechanism that throttles the sender when one or more links between sender and receiver become excessively congested. This throttling can have a severe impact on real-time applications, which can tolerate some packet loss but require a minimum send rate. On the other hand, the speed at which UDP sends data is constrained only by the rate at which the application generates data, the capabilities of the source (CPU, clock rate, etc.) and the access bandwidth to the Internet.
We should keep in mind, however, that the receiving host does not necessarily receive all the data: when the network is congested, a significant fraction of the UDP-transmitted data could be lost due to router buffer overflow. Thus, the receive rate is limited by network congestion even if the sending rate is not constrained.
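To put the 20-byte versus 8-byte header figures in perspective, the following small calculation shows what fraction of a segment the transport header occupies. The function name and the 32-byte payload are illustrative assumptions; the 20-byte TCP figure is the minimum header without options.

```python
TCP_HEADER_BYTES = 20  # minimum TCP header, no options (figure from the slide)
UDP_HEADER_BYTES = 8   # fixed UDP header size (figure from the slide)

def header_overhead(payload_bytes: int, header_bytes: int) -> float:
    """Fraction of the transport segment that is header rather than payload."""
    return header_bytes / (header_bytes + payload_bytes)

# Hypothetical 32-byte application message, e.g. a small query.
udp_share = header_overhead(32, UDP_HEADER_BYTES)   # 8 / 40
tcp_share = header_overhead(32, TCP_HEADER_BYTES)   # 20 / 52
```

For small messages the difference is noticeable; for large payloads both fractions shrink toward zero.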

149 UDP Encapsulation

150 UDP Header

151 Computing the UDP Checksum
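The slide's figure is not preserved in this transcript. As a hedged sketch of the standard calculation (the 16-bit one's-complement Internet checksum over a pseudo-header, the UDP header, and the data, as specified for UDP over IPv4), one could write the following. The function names are illustrative, and the caller is assumed to pass the UDP header with its checksum field set to zero.

```python
import socket
import struct

def ones_complement_sum16(data: bytes) -> int:
    """16-bit one's-complement sum with end-around carry."""
    if len(data) % 2:
        data += b"\x00"  # pad to a 16-bit boundary for the calculation only
    total = 0
    for (word,) in struct.iter_unpack("!H", data):
        total += word
        total = (total & 0xFFFF) + (total >> 16)  # fold the carry back in
    return total

def udp_checksum(src_ip: str, dst_ip: str, udp_segment: bytes) -> int:
    """udp_segment: 8-byte UDP header (checksum field zeroed) plus payload."""
    # IPv4 pseudo-header: source, destination, zero byte, protocol, UDP length.
    pseudo = struct.pack("!4s4sBBH",
                         socket.inet_aton(src_ip), socket.inet_aton(dst_ip),
                         0, socket.IPPROTO_UDP, len(udp_segment))
    checksum = ~ones_complement_sum16(pseudo + udp_segment) & 0xFFFF
    # A transmitted value of 0 means "no checksum", so all ones is sent instead.
    return checksum or 0xFFFF
```

A receiver can verify a datagram by summing the pseudo-header and the segment with the checksum in place; an undamaged datagram sums to 0xFFFF.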

152 IP Fragmentation

153 Other UDP Uses Path MTU Discovery Max UDP datagram size
Using Traceroute Using UDP Max UDP datagram size ICMP Source Quench

154 TCP: Transmission Control Protocol

155 TCP Header

156 What is TCP? Relatively complex compared to UDP
TCP provides connection-oriented service for data delivery between two hosts. Client and server establish a logical TCP connection before exchanging data. TCP segments flow over the network in IP packets (which are connectionless) so that the logical TCP connection can be maintained over a changing physical path. Connections are full-duplex. Timers are used to maintain connections. TCP relies on IP to provide hop-by-hop routing and error detection. Applications that use TCP: telnet, ftp, http, many others.

157 TCP Logical Connections/Ports
TCP and UDP introduce the concept of ports Common ports and the services that run on them: FTP 21 and 20 telnet 23 SMTP 25 http 80 POP3 110 Multiple ports/logical connections can be supported

158 TCP Header Fields and Other Info
Each connection is uniquely identified by the combination of src IP, dest IP, src port, and dest port. socket = IP + port (e.g., )
Sequence numbers: The sequence number of the first data octet in this segment (except when SYN is present). If SYN is present the sequence number is the initial sequence number (ISN) and the first data octet is ISN+1.
Acknowledgements: If the ACK control bit is set this field contains the value of the next sequence number the sender of the segment is expecting to receive. Once a connection is established this is always sent.
Header Length: Length of the header, expressed in 32-bit words (multiply by 4 to get bytes).
Flag bits: Provide connection-oriented service. The SYN and FIN flags are used when establishing and terminating a TCP connection, respectively. The ACK flag is set any time the Acknowledgement field is valid, implying that the receiver should pay attention to it. The URG flag signifies that this segment contains urgent data. When this flag is set, the UrgPtr field indicates where the non-urgent data contained in this segment begins. The PUSH flag signifies that the sender invoked the push operation, which indicates to the receiving side of TCP that it should notify the receiving process of this fact. Finally, the RESET flag signifies that the receiver has become confused and so wants to abort the connection.
Window size: The number of data octets, beginning with the one indicated in the acknowledgment field, which the sender of this segment is willing to accept.
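The header fields discussed on this slide can be pulled out of the fixed 20-byte part of a TCP header with `struct`. This is a minimal sketch; the `TcpHeader` type and `parse_tcp_header` function are illustrative names, and options are ignored.

```python
import struct
from typing import NamedTuple

class TcpHeader(NamedTuple):
    src_port: int
    dst_port: int
    seq: int
    ack: int
    header_len_bytes: int
    flags: frozenset
    window: int

# Bit positions of the six classic flags in the low bits of the flags field.
FLAG_BITS = {"FIN": 0x01, "SYN": 0x02, "RST": 0x04,
             "PSH": 0x08, "ACK": 0x10, "URG": 0x20}

def parse_tcp_header(segment: bytes) -> TcpHeader:
    """Decode the fixed 20-byte part of a TCP header (network byte order)."""
    (src, dst, seq, ack, off_flags,
     window, _checksum, _urgent) = struct.unpack("!HHIIHHHH", segment[:20])
    data_offset_words = off_flags >> 12          # header length in 32-bit words
    flags = frozenset(n for n, bit in FLAG_BITS.items() if off_flags & bit)
    return TcpHeader(src, dst, seq, ack, data_offset_words * 4, flags, window)
```

The source port and destination port decoded here, together with the two IP addresses from the IP header, form the four-tuple that identifies the connection.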

159 TCP Options

160 TCP Encapsulation

161 TCP Functionalities
Connection-Oriented: Identifies traffic flow by some identifier rather than by explicitly listing source and destination addresses.
Stream Data Transfer: From the application's viewpoint, TCP transfers a contiguous stream of bytes. TCP does this by grouping the bytes in TCP segments, which are passed to IP for transmission to the destination. TCP itself decides how to segment the data and it may forward the data at its own convenience.
Reliability: TCP assigns a sequence number to each byte transmitted, and expects a positive acknowledgment (ACK) from the receiving TCP. If the ACK is not received within a timeout interval, the data is retransmitted. The receiving TCP uses the sequence numbers to rearrange the segments when they arrive out of order, and to eliminate duplicate segments.
Flow Control: The receiving TCP, when sending an ACK back to the sender, also indicates to the sender the number of bytes it can receive beyond the last received TCP segment, without causing overrun and overflow in its internal buffers. This is sent in the ACK in the form of the highest sequence number it can receive without problems.
Logical Connections: The reliability and flow control mechanisms described above require that TCP initializes and maintains certain status information for each data stream. The combination of this status, including sockets, sequence numbers and window sizes, is called a logical connection. Each connection is uniquely identified by the pair of sockets used by the sending and receiving processes.
Multiplexing: To allow for many processes within a single host to use TCP communication facilities simultaneously, TCP provides a set of addresses or ports within each host. Concatenated with the network and host addresses from the internet communication layer, this forms a socket. A pair of sockets uniquely identifies each connection.
Full Duplex: TCP provides for concurrent data streams in both directions.
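The reliability item above (reordering by sequence number and discarding duplicates) can be sketched as follows. This is a simplified model, not how any real stack is written: it assumes segments never partially overlap and ignores sequence-number wraparound, and the function name `reassemble` is illustrative.

```python
def reassemble(segments, initial_seq):
    """Rebuild the byte stream from (seq, data) pairs in arrival order.

    Returns the in-order bytes and the next expected sequence number,
    which is the value a cumulative ACK would carry.
    """
    pending = {}             # out-of-order segments waiting for a gap to fill
    next_seq = initial_seq
    stream = bytearray()
    for seq, data in segments:
        if seq < next_seq or seq in pending:
            continue         # duplicate: already delivered or already buffered
        pending[seq] = data
        while next_seq in pending:        # deliver everything now contiguous
            chunk = pending.pop(next_seq)
            stream += chunk
            next_seq += len(chunk)
    return bytes(stream), next_seq
```

For example, segments arriving as 105, 100, 100 (duplicate), 110 are delivered to the application as one contiguous stream starting at 100.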

162 Important Factors That Affect TCP (Application) Performance
Link BW (window size), network delay (RTT) and MTU size (Bit Error Rate)
How a receiver/sender implements the acknowledgment scheme: Stop-N-Wait, Go-back-N, Selective Ack
Speed at which received data is processed and ACKed at the destination (sender/receiver buffer size): maximum TCP buffer (memory) space for use by any TCP connection; socket buffer sizes for an individual TCP connection
Ability to actively manage the speed mismatch between sender and receiver by regulating how much can/should be sent without getting into a congestion state: window size based on the Bandwidth Delay Product
Ability to actively detect and prevent dynamic congestion and react to it: connection timeout, slow start, back-off strategies
Ability for the sender to maximize/improve BW/network throughput: TCP Large Window Scaling

163 TCP Connection-Oriented Protocol: TCP Connection Establishment and Termination

164 Passive and Active Ports
TCP enables two methods to establish a connection: active and passive. An active connection establishment happens when TCP issues a request for the connection, based on an instruction from an upper-level protocol that provides the socket number. A passive approach takes place when the upper-level protocol instructs TCP to wait for the arrival of connection requests from a remote system (usually from an active open instruction). When TCP receives the request, it assigns a port number. This enables a connection to proceed rapidly, without waiting for the active process.
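In the Berkeley sockets API, the passive open corresponds to `listen()` and the active open to `connect()`. The sketch below shows both on the loopback interface; the function name is hypothetical, and the port is chosen by the OS.

```python
import socket

def open_loopback_connection():
    """Return (client, server_side, peer_addr) for one local TCP connection."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))      # port 0: let the OS pick a free port
    listener.listen(1)                   # passive open: wait for requests

    client = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    client.connect(listener.getsockname())   # active open: send the request

    # The connection was already queued by the kernel, so accept() returns
    # the established connection without further waiting.
    server_side, peer_addr = listener.accept()
    listener.close()
    return client, server_side, peer_addr
```

Because the listener is already waiting when the active side calls `connect()`, the handshake completes without the server application having to do anything at that moment.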

165 Connection Establishment

166 Connection Termination

167 MSS: Maximum Segment Size

168

169

170 TCP Half-Close

171 TCP State Diagram

172 States: Establishment and Termination

173 TCP Reset

174 Simultaneous Open
For example: An application at host A uses 7777 as the local port and connects to port 8888 on host B. At the same time, an application at host B uses 8888 as the local port and connects to port 7777 on host A. This is "Simultaneous Open". Here is another example: The Telnet client at host A connects to the Telnet server at host B. At the same time, the Telnet client at host B connects to the Telnet server at host A. Be careful. This time, it's not "Simultaneous Open" because the two Telnet servers on both sides do "passive open" instead of "active open". There are actually two TCP connections, instead of one in "Simultaneous Open".

175 Simultaneous Close

176 TCP Provides a Byte-Stream Service
TCP is a byte-oriented protocol, which means the sender writes bytes into a TCP connection and the receiver reads bytes out of the TCP connection. Although "byte-stream" describes the service TCP offers to application processes, TCP does not, itself, transmit individual bytes over the Internet. Instead, TCP on the source host buffers enough bytes from the sending process to fill a reasonably sized packet, and then sends this packet to its peer on the destination host. TCP on the destination host then empties the contents of the packet into a receive buffer, and the receiving process reads from this buffer at its leisure. This situation is illustrated in the figure below, which for simplicity shows data flowing in only one direction. In general, remember, a single TCP connection supports byte-streams flowing in both directions.

177 TCP Provides a Byte-Stream Service
It is a streaming protocol. No "record markers" are inserted into the data stream. Writes on one end and reads on the other are independent of each other. Ex: data could be written in a sequence of 10 bytes, then 20 bytes, then 50 bytes; that data could be read as four 20-byte reads. No interpretation of the application data. Very similar to the Unix kernel's treatment of files in a filesystem.
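The 10/20/50 example from this slide can be reproduced over a loopback TCP connection. This is a sketch under the assumption that 80 bytes fit in the socket buffers (so the writes do not block); `recv_exact` and `write_10_20_50_read_4x20` are illustrative names.

```python
import socket

def recv_exact(sock, n):
    """Read exactly n bytes; a single recv() may legally return fewer."""
    buf = bytearray()
    while len(buf) < n:
        chunk = sock.recv(n - len(buf))
        if not chunk:
            raise ConnectionError("peer closed before n bytes arrived")
        buf += chunk
    return bytes(buf)

def write_10_20_50_read_4x20():
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))
    listener.listen(1)
    writer = socket.create_connection(listener.getsockname())
    reader, _ = listener.accept()
    try:
        # Three writes of 10, 20 and 50 bytes: 80 bytes in total.
        for size, fill in ((10, b"a"), (20, b"b"), (50, b"c")):
            writer.sendall(fill * size)
        # Four reads of 20 bytes each: write boundaries are not preserved.
        return [recv_exact(reader, 20) for _ in range(4)]
    finally:
        writer.close()
        reader.close()
        listener.close()
```

The first read returns the whole first write plus half of the second, which shows that an application needing message boundaries must encode them itself (for example with a length prefix).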

178 TCP Provides Reliability
TCP chooses the segment size on the application's behalf. Segments must be acknowledged. The TCP checksum covers header and data. Out-of-sequence IP packets can be re-ordered. The receiving TCP must discard duplicate packets. Flow control is employed to manage finite buffer space.

179 TCP Flow Control Algorithms
Sliding Window: Regular Sliding Window; The BW Delay Product (BWDP); Window Scaling option; The Silly Window Syndrome
Tinygram Congestion Prevention: The Nagle algorithm
TCP Timeout: RTT-Timeout Calculation
Acknowledgement schemes: Stop-N-Wait, Go-back-N, Selective Ack schemes
Ambiguous Acknowledgements: Karn's Algorithm
Congestion Avoidance
Slow Start
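The outline names the RTT-based timeout calculation and Karn's algorithm without giving formulas. As a hedged sketch, the widely used smoothed estimator (the form standardized in RFC 6298, with gains 1/8 and 1/4 and a 1-second floor) combined with Karn's rule of ignoring samples from retransmitted segments could look like this. The class name is illustrative, and clock granularity and the upper bound on the timeout are left out for brevity.

```python
class RtoEstimator:
    ALPHA = 1 / 8    # gain for the smoothed RTT
    BETA = 1 / 4     # gain for the RTT variation
    MIN_RTO = 1.0    # seconds

    def __init__(self):
        self.srtt = None     # smoothed round-trip time
        self.rttvar = None   # round-trip time variation
        self.rto = 1.0       # initial timeout before any measurement

    def on_ack(self, rtt_sample, was_retransmitted):
        """Update the timeout from one RTT measurement (seconds)."""
        if was_retransmitted:
            # Karn's algorithm: the ACK is ambiguous (it may answer either
            # transmission), so the sample is not used.
            return self.rto
        if self.srtt is None:
            self.srtt = rtt_sample
            self.rttvar = rtt_sample / 2
        else:
            self.rttvar = ((1 - self.BETA) * self.rttvar
                           + self.BETA * abs(self.srtt - rtt_sample))
            self.srtt = (1 - self.ALPHA) * self.srtt + self.ALPHA * rtt_sample
        self.rto = max(self.MIN_RTO, self.srtt + 4 * self.rttvar)
        return self.rto

    def on_timeout(self):
        """Back off exponentially after a retransmission timeout."""
        self.rto *= 2
        return self.rto
```

The backed-off timeout stays in force until an unambiguous sample arrives, which is the second half of Karn's rule.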

180 End-to-end flow control
Problem: The sender can send more traffic than the receiver can handle (too fast).
Solution: sliding window protocol. TCP uses a variable sliding window protocol: each acknowledgement, which specifies how many octets have been received, contains a window advertisement that specifies how many additional octets the receiver is prepared to accept.
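The window advertisement can be modelled from the receiver's side as the free space left in its buffer. The toy model below assumes in-order arrival and sequence numbers starting at 0; the class and method names are hypothetical.

```python
class ReceiverBuffer:
    """Toy receiver that advertises its free buffer space with every ACK."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.buffered = 0        # bytes received but not yet read by the app
        self.next_expected = 0   # cumulative acknowledgement number

    def on_segment(self, length):
        """Accept in-order data; return (ack_number, window_advertisement)."""
        accepted = min(length, self.capacity - self.buffered)
        self.buffered += accepted
        self.next_expected += accepted
        return self.next_expected, self.capacity - self.buffered

    def app_read(self, length):
        """The application drains data; return the new advertised window."""
        consumed = min(length, self.buffered)
        self.buffered -= consumed
        return self.capacity - self.buffered
```

When the buffer fills, the advertisement drops to zero and the sender must stop; the window reopens only after the application reads some data.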

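181 Variable Window Size
[Figure: the receiver sends window advertisements to the transmitter.]
Free space in receiver buffer increases: value of window advertisement is bigger, transmitter window size increases.
Free space in receiver buffer decreases: value of window advertisement is smaller, transmitter window size decreases.
Receiver buffer is full: transmitter stops transmissions.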

182 Sliding window protocol in TCP
TCP allows the window size to vary over time. Window size changes at the time it slides forward. Advantage: it provides flow control as well as reliable transfer.

183 TCP Sliding Window Algorithm
Flow control with the use of Window concept via specifying an acceptable range of sequence numbers

184 Sliding Windows
[Figure labels: Advertised Window, Available Window]

185 Sliding Window: Details
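[Figure: sender and receiver windows]
Sender: pointers are Max ACK received and Next seqnum. Regions of the sequence space: Sent & Acked, Sent Not Acked, OK to Send (the rest of the sender window), Not Usable.
Receiver: pointers are Next expected and Max acceptable. Regions of the sequence space: Received & Acked, Acceptable Packet (the receiver window), Not Usable.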
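The sender-side regions named on this slide can be expressed as a small classification function. This is a sketch under two assumptions that the transcript does not state: "Max ACK received" is taken to be the first unacknowledged sequence number, and sequence-number wraparound is ignored. The function name is illustrative.

```python
def classify_sender_seq(seq, max_ack_received, next_seqnum, window):
    """Place a sequence number in one of the sender-side regions."""
    if seq < max_ack_received:
        return "sent & acked"
    if seq < next_seqnum:
        return "sent, not acked"
    if seq < max_ack_received + window:
        return "ok to send"          # inside the window, not yet transmitted
    return "not usable"              # beyond the right edge of the window

def usable_window(max_ack_received, next_seqnum, window):
    """Bytes the sender may still transmit before it must wait for an ACK."""
    in_flight = next_seqnum - max_ack_received
    return max(0, window - in_flight)
```

As ACKs arrive, `max_ack_received` advances and the whole window slides to the right, turning "not usable" sequence numbers into "ok to send".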

186 Window Flow Control: Header
[Figure: TCP headers of the packet sent and the packet received, each showing Source Port, Dest. Port, Sequence Number, Acknowledgment, HL/Flags, Window, D. Checksum, Urgent Pointer, Options. Below them, the sender's byte stream from the application write is divided into: acknowledged, sent, to be sent, outside window.]

187 Sliding Windows Example - A Dynamic Parameter

188 Window Size – A Dynamic Parameter

189 How To Determine Optimal Window Size (Keeping the Pipe Full): The Bandwidth-Delay Product (BDP)

190 The TCP Window Scaling option
The original TCP specification allowed a window size no larger than 64 KB. This limitation comes from the 16-bit header field that specifies the window size. To achieve the recommended 1 MB window size, TCP extensions must be enabled that add up to another 14 bits to the window size, making the total equal to 30 bits.

The TCP window scaling option works by including a scale factor in the TCP OPTIONS field of a SYN packet. This scale factor informs the receiver that the sender is willing to do window scaling and offers a scale factor for the communication. The scale factor is used to shift the window field before the data segment is sent.

It's important to note that the window size carried in the 3-way handshake itself is NOT scaled. Scaling applies from the first data packet sent after the 3-way handshake onward. If there is a scaling factor, the initial window size of 65,535 bytes is used, and this value is then multiplied by the scale value identified in the 3-way handshake. The table below shows the scaled window size for various scale factors.

191 The TCP Window Scaling Option
Scale factor    Scale value    Initial window    Window scaled
0               1              65535 or less     65535 or less
1               2              65535             131,070
2               4              65535             262,140
3               8              65535             524,280
4               16             65535             1,048,560
5               32             65535             2,097,120
6               64             65535             4,194,240
14              16384          65535             1,073,725,440
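The scaling arithmetic behind the table can be sketched in a few lines. This is only an illustration: the function name `scaled_window` and the constant `MAX_SHIFT` are made up for this example, and the sketch assumes the scale factor is applied as a left shift of the 16-bit window field.

```python
MAX_SHIFT = 14  # largest scale factor listed in the table above


def scaled_window(window_field: int, shift: int) -> int:
    """Return the effective window for a 16-bit window field and a scale factor."""
    if not 0 <= window_field <= 0xFFFF:
        raise ValueError("window field must fit in 16 bits")
    if not 0 <= shift <= MAX_SHIFT:
        raise ValueError("scale factor must be between 0 and 14")
    # Shifting left by `shift` bits multiplies the field by 2**shift.
    return window_field << shift


# Print scale factor, scale value and scaled window for a few table rows.
for shift in (0, 1, 4, 14):
    print(shift, 1 << shift, scaled_window(65535, shift))
```

With a window field of 65,535 and a scale factor of 4, the effective window is just over 1 MB, which is why a factor of 4 is enough for the 1 MB window mentioned earlier.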

192 TCP Options – Window Scale Factor

193 The TCP Receiver Silly Window Syndrome
Problems associated with "shrinking" the TCP window arise when the receiver is much slower than the sender (in the example below, the receiver can process only one third of what it receives).

This example shows one way in which the phenomenon known as TCP silly window syndrome can arise. The client is trying to send data as fast as possible to the server, which is very busy and cannot clear its buffers promptly. Each time the client sends data, the server reduces its receive window. The size of the messages the client sends shrinks until it is only sending very small, inefficient segments.

The client's send window is 360, and it has lots of data to send. It immediately sends a 360-byte segment to the server. This uses up its entire send window. When the server gets this segment it acknowledges it. However, it can only remove 120 bytes, so the server reduces the window size from 360 to 120. It sends this in the Window field of the acknowledgment.

The client receives an acknowledgment of 360 bytes, and sees that the window size has been reduced to 120. It wants to send its data as soon as possible, so it sends off a 120-byte segment. The server has removed 40 more bytes from the buffer by the time the 120-byte segment arrives. The buffer thus contains 200 bytes (240 from the first segment, less the 40 removed). The server is able to immediately process one third of those 120 bytes, or 40 bytes. This means 80 bytes are added to the 200 that already remain in the buffer, so 280 bytes are used up. The server must reduce the window size to 80 bytes.

The client will see this reduced window size and send an 80-byte segment. The server started with 280 bytes and removed 40 to yield 240 bytes left. It receives 80 bytes from the client and removes one third, so 53 are added to the buffer, which becomes 293 bytes. It reduces the window size to 67 bytes (360 - 293).
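The shrinking window in this example can be reproduced with a short simulation. It is a sketch under the example's own assumptions (a 360-byte buffer, 40 bytes drained while each later segment is in flight, one third of each segment processed on arrival); the names `simulate_sws`, `BUFFER_SIZE` and `DRAIN_BETWEEN` are invented for illustration.

```python
BUFFER_SIZE = 360    # receive buffer from the example, in bytes
DRAIN_BETWEEN = 40   # bytes the server removes while the next segment is in flight


def simulate_sws(rounds: int) -> list[int]:
    """Return the window the server advertises after each segment."""
    used = 0
    window = BUFFER_SIZE
    advertised = []
    for i in range(rounds):
        if i > 0:
            used -= DRAIN_BETWEEN       # buffer drains a little before the segment lands
        segment = window                # the client always fills the whole window
        used += (2 * segment) // 3      # one third is processed at once, the rest is buffered
        window = BUFFER_SIZE - used
        advertised.append(window)
    return advertised


print(simulate_sws(3))  # [120, 80, 67]
```

Running it for more rounds shows the advertised window continuing to shrink, which is the behaviour the avoidance rules are meant to prevent.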

194 Receiver SWS Avoidance
Let's start with SWS avoidance by the receiver. As we saw in the initial example above, the receiver contributed to SWS by reducing the size of its receive window to smaller and smaller values because it was busy. This caused the right edge of the sender's send window to move by ever-smaller increments, leading to smaller and smaller segments.

To avoid SWS, we simply make the rule that the receiver may not update its advertised receive window in a way that leaves the sender with too little usable window space. In other words, we restrict the receiver from moving the right edge of the window by too small an amount. The usual minimum that the edge may be moved is either the value of the MSS parameter, or one-half the buffer size, whichever is less.

Let's see how we might use this in the example above. When the server receives the initial 360-byte segment from the client and can only process 120 bytes, it does not reduce the window size to 120. It reduces it all the way to 0, closing the window. It sends this back to the client, which will then stop and not send a small segment. Once the server has removed 60 more bytes from the buffer, it will have 180 bytes free, half the size of the buffer. It now opens the window up to 180 bytes and sends the new window size to the client. It will continue to advertise only 0 bytes, or 180 or more, and no smaller values in between.

This seems to slow down the operation of TCP, but it really doesn't. Because the server is overloaded, the limiting factor in the overall performance of the connection is the rate at which the server can clear the buffer. We are just exchanging many small segments for a few larger ones.
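The receiver-side rule can be written as a small decision function. This is a simplified sketch of the rule as stated here, not a full TCP implementation; the function name `advertised_window` and the MSS value of 536 bytes used below are assumptions chosen so that half the 360-byte buffer is the smaller limit.

```python
def advertised_window(free_bytes: int, buffer_size: int, mss: int) -> int:
    """Advertise free space only once it reaches min(MSS, half the buffer)."""
    threshold = min(mss, buffer_size // 2)
    # Below the threshold the window is reported as closed (0).
    return free_bytes if free_bytes >= threshold else 0


# The 360-byte buffer from the example, with an assumed MSS of 536 bytes.
print(advertised_window(120, 360, 536))  # 0
print(advertised_window(180, 360, 536))  # 180
```

With these values the server advertises either 0 or at least 180 bytes, matching the walk-through above.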

195 Tinygram Congestion Prevention: The Nagle algorithm (Sender SWS)
Nagle's algorithm, named after John Nagle, is a means of improving the efficiency of TCP/IP networks by reducing the number of packets that need to be sent over the network.

Telnet example: a test over a long-haul link with a 5-second round-trip time. The user sends 25 bytes, typed one character at a time. Without any mechanism to prevent small-packet (tinygram) congestion, 25 new packets would be sent in 5 seconds (the delayed ACK algorithm can add a delay of up to 200 ms to each acknowledgment). The amount of data sent is 41 x 25 bytes, since each 1-byte character carries 20 bytes of IP header and 20 bytes of TCP header. The overhead here is 4000%.

With the Nagle algorithm, however, the first character from the user would be sent immediately. The next 24 characters, arriving from the user at 200 ms intervals, would be queued while the first packet is unacknowledged. When the ACK for the first packet arrived at the end of 5 seconds, a single packet with the 24 queued characters would be sent, i.e. 40 x 2 + 25 = 105 bytes would be sent in total. The overhead is only 320%, with no penalty in response time.

The Nagle algorithm is useful on a slow WAN when it is desirable to reduce tinygram congestion. Sometimes the Nagle algorithm needs to be turned off. For example, in an X Window System server, small messages (mouse movements) must be delivered without delay to provide real-time feedback to the interactive user.
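The overhead figures in the Telnet example follow from simple arithmetic, sketched below. The helper names `total_bytes` and `overhead_percent` are illustrative only, and the sketch assumes 40 bytes of headers per packet (20 IP + 20 TCP, no options).

```python
HEADER_BYTES = 40  # 20-byte IP header + 20-byte TCP header, no options assumed


def total_bytes(payload_bytes: int, packets: int) -> int:
    """Bytes on the wire when the payload is spread over `packets` packets."""
    return payload_bytes + packets * HEADER_BYTES


def overhead_percent(payload_bytes: int, packets: int) -> float:
    """Header bytes as a percentage of payload bytes."""
    return 100.0 * packets * HEADER_BYTES / payload_bytes


# Without Nagle: 25 characters, one packet each.
print(total_bytes(25, 25), overhead_percent(25, 25))  # 1025 4000.0
# With Nagle: one packet for the first character, one for the other 24.
print(total_bytes(25, 2), overhead_percent(25, 2))    # 105 320.0
```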

196 Key TCP Concepts
Modern TCP implementations incorporate a set of SWS avoidance algorithms. When receiving, devices are programmed not to advertise very small windows, waiting instead until there is enough room in the buffer for one of a reasonable size. Transmitters use Nagle’s algorithm to ensure that small segments are not generated when there are unacknowledged bytes outstanding.

197 TCP Timers The Retransmission Timer
The retransmission timer manages retransmission timeouts (RTOs), which occur when a preset interval between the sending of a datagram and the returning acknowledgment is exceeded. The value of the timeout tends to vary, depending on the network type, to compensate for speed differences. If the timer expires, the datagram is retransmitted with an adjusted RTO, which is usually increased exponentially to a maximum preset limit. If the maximum limit is exceeded, connection failure is assumed, and error messages are passed back to the upper-layer application.

Values for the timeout are determined by measuring the average time that data takes to be transmitted to another machine and the acknowledgment received back, which is called the round-trip time, or RTT. From experiments, these RTTs are averaged by a formula that develops an expected value, called the smoothed round-trip time, or SRTT. This value is then increased to account for unforeseen delays.

The Delayed ACK Timer
TCP uses delayed acknowledgments to reduce the number of packets that are sent on the media. Instead of sending an acknowledgment for each TCP segment received, TCP takes a common approach to implementing delayed acknowledgments. As data is received by TCP on a particular connection, it sends an acknowledgment back only if one of the following conditions is true:
No acknowledgment was sent for the previous segment received.
A segment is received, but no other segment arrives within 200 milliseconds for that connection.

198 TCP Timers (Continued)
The Quiet Timer
After a TCP connection is closed, it is possible for datagrams that are still making their way through the network to attempt to access the closed port. The quiet timer is intended to prevent the just-closed port from reopening again quickly and receiving these last datagrams. The quiet timer is usually set to twice the maximum segment lifetime (the same value as the Time to Live field in an IP header), ensuring that all segments still heading for the port have been discarded. Typically, this can result in a port being unavailable for up to 30 seconds, prompting error messages when other applications attempt to access the port during this interval.

The Keep-Alive Timer and the Idle Timer
Both the keep-alive timer and the idle timer were added to the TCP specifications after their original definition. The keep-alive timer sends an empty packet at regular intervals to ensure that the connection to the other machine is still active. If no response has been received after sending the message by the time the idle timer has expired, the connection is assumed to be broken. The keep-alive timer value is usually set by an application, with values ranging from 5 to 45 seconds. The idle timer is usually set to 360 seconds.

199 RTT and TCP Retransmission Timeouts (RTO)
The RTO is typically calculated based on the RTT.

The original TCP specification had TCP update a smoothed RTT estimator (called R) using the low-pass filter

$R \leftarrow aR + (1 - a)M$

where M is the measured RTT and a is a smoothing factor with a recommended value of 0.9. This smoothed RTT is updated every time a new measurement is made. Ninety percent of each new estimate is from the previous estimate and 10% is from the new measurement.

Given this smoothed estimator, which changes as the RTT changes, RFC 793 recommended the retransmission timeout value (RTO) be set to

$\mathrm{RTO} = Rb$

where b is a delay variance factor with a recommended value of 2.
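The estimator and timeout formula can be sketched as a small class. The class name `RttEstimator` is hypothetical, and seeding R with the first measurement is an assumption of this sketch rather than something stated on the slide.

```python
class RttEstimator:
    """Smoothed RTT with R <- a*R + (1 - a)*M and RTO = R*b."""

    def __init__(self, first_sample: float, a: float = 0.9, b: float = 2.0):
        self.a = a               # smoothing factor (recommended 0.9)
        self.b = b               # delay variance factor (recommended 2)
        self.r = first_sample    # assumption: seed R with the first measurement

    def update(self, m: float) -> float:
        """Fold a new RTT measurement M into the smoothed estimate R."""
        self.r = self.a * self.r + (1 - self.a) * m
        return self.r

    @property
    def rto(self) -> float:
        """Retransmission timeout derived from the current estimate."""
        return self.r * self.b


est = RttEstimator(first_sample=1.0)
est.update(2.0)            # R becomes 0.9 * 1.0 + 0.1 * 2.0, about 1.1
print(est.r, est.rto)
```

A single slow measurement of 2.0 s moves the estimate only from 1.0 s to about 1.1 s, which illustrates how heavily the filter weights history.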

200 TCP Performance Enhancement - Congestion Avoidance Using Slow Start
Cwnd = congestion window. Some implementations recommend cwnd = 4 as the initial value to improve throughput.

201 The TCP Delayed ACK timer
Delayed ACK timer = 200 msec by default

202 TCP Reliability – Use of Acknowledgement
TCP is responsible for data recovery by providing a sequence number with each packet that it sends.
TCP requires an ACK (acknowledgement) to ensure that the correct data is received.
A packet can be retransmitted if an error is detected.
Three ACK schemes are available, which will be discussed at length later.

203 TCP Acknowledgement Schemes
Stop-and-Wait: an individual ACK is required for each segment received.
Go-Back-N: a cumulative ACK is required for consecutive segments received.
Selective-ACK: the receiver identifies individual segments within the consecutive segments received, so that only the segments in error need to be retransmitted.

204 Stop-And-Wait ACK Scheme: TCP Interactive Data Flow

205 Interactive Data: character/echo

206 Delayed Acknowledgements
Delayed ACK timer = 200 msec by default

207 Regular Acknowledgements

208 Stop-And-Wait ACK Scheme – Loss Packet Scenario

209 Stop-And-Wait ACK Scheme – Loss ACK Scenario

210 Performance of Stop-And-Wait ACK Scheme
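It works, but performance stinks.

Example: 1 Gbps link, 15 ms end-to-end propagation delay, 1 KB packet:

$T_{transmit} = \frac{L\ \text{(packet length in bits)}}{R\ \text{(transmission rate, bps)}} = \frac{8\ \text{kb/pkt}}{10^{9}\ \text{b/sec}} = 8\ \mu\text{sec}$

U sender: utilization, the fraction of time the sender is busy sending. With RTT = 30 msec:

$U_{sender} = \frac{L/R}{RTT + L/R} = \frac{0.008}{30.008} \approx 0.00027$

A 1 KB packet every 30 msec gives about 33 kB/sec throughput over a 1 Gbps link.
The network protocol limits use of physical resources!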
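The numbers in this example can be checked with a few lines of arithmetic. The function name `stop_and_wait` is illustrative, and the sketch assumes the RTT is twice the 15 ms propagation delay and that 1 KB means 8000 bits, as on the slide.

```python
def stop_and_wait(packet_bits: float, rate_bps: float, rtt_s: float):
    """Return (transmit time, sender utilization, throughput in bits/s)."""
    t_transmit = packet_bits / rate_bps       # L / R
    cycle = rtt_s + t_transmit                # one packet per RTT + L/R
    utilization = t_transmit / cycle
    throughput_bps = packet_bits / cycle
    return t_transmit, utilization, throughput_bps


# 1 KB (8000-bit) packet, 1 Gbps link, RTT = 2 * 15 ms.
t, u, thr = stop_and_wait(8000, 1e9, 0.030)
print(f"transmit time: {t * 1e6:.1f} microsec")
print(f"utilization:   {u:.5f}")
print(f"throughput:    {thr / 8 / 1000:.1f} kB/sec")
```

The sender is busy for roughly 0.03% of the time, so almost the entire link capacity goes unused.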

211 Ambiguous ACK – Loss ACK Scenario

212 Ambiguous Acknowledgement - The Karn Algorithm
Rule 1: Ignore measured RTT for retransmitted packets. This removes ambiguity from RTT measurements.
Rule 2: RTO should be doubled after retransmission. This is called "Exponential Back-off".

A problem occurs when a packet is retransmitted. Say a packet is transmitted, a timeout occurs, the RTO is backed off, the packet is retransmitted with the longer RTO, and an acknowledgment is received. Is the ACK for the first transmission or the second? This is called the retransmission ambiguity problem.

[Karn and Partridge 1987] specify that when a timeout and retransmission occur, we cannot update the RTT estimators when the acknowledgment for the retransmitted data finally arrives. This is because we don't know to which transmission the ACK corresponds. (Perhaps the first transmission was delayed and not thrown away, or perhaps the ACK of the first transmission was delayed.)

Also, since the data was retransmitted, and the exponential backoff has been applied to the RTO, we reuse this backed-off RTO for the next transmission. Don't calculate a new RTO until an acknowledgment is received for a segment that was not retransmitted.
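The two rules can be sketched as a small timer class. The name `KarnTimer`, the initial RTO of 1 second and the 64-second cap are assumptions made for illustration; the smoothing step reuses the filter R <- aR + (1-a)M and RTO = Rb from the RTO slide.

```python
class KarnTimer:
    """Retransmission timer following the two rules above."""

    def __init__(self, initial_rto=1.0, max_rto=64.0, a=0.9, b=2.0):
        self.rto = initial_rto    # assumed starting timeout, in seconds
        self.max_rto = max_rto    # assumed upper limit for the backoff
        self.a, self.b = a, b
        self.srtt = None          # no valid RTT sample yet

    def on_timeout(self):
        """Rule 2: double the RTO after a retransmission."""
        self.rto = min(self.rto * 2, self.max_rto)

    def on_ack(self, measured_rtt, was_retransmitted):
        """Rule 1: ignore RTT samples from retransmitted segments."""
        if was_retransmitted:
            return                # ambiguous sample: keep the backed-off RTO
        if self.srtt is None:
            self.srtt = measured_rtt
        else:
            self.srtt = self.a * self.srtt + (1 - self.a) * measured_rtt
        self.rto = self.b * self.srtt


timer = KarnTimer()
timer.on_timeout()                           # RTO backs off to 2.0
timer.on_ack(0.5, was_retransmitted=True)    # ignored, RTO stays 2.0
timer.on_ack(0.5, was_retransmitted=False)   # clean sample, RTO recomputed
print(timer.rto)
```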

213 TCP Fast Retransmission
Fast Retransmission: When TCP detects segment loss through the retransmission timer, ssthresh is set to half of CWND (ssthresh = CWND / 2) and CWND is set to one full-size segment. Instead of waiting for the retransmission timer to expire, TCP can also detect packet loss by looking for out-of-order arrivals and retransmit the lost packet early. This scheme is called "Fast Retransmit". In this algorithm, the TCP receiver sends an immediate duplicate ACK whenever an out-of-order segment arrives. The TCP at the other end deduces from a small number (normally 3) of consecutive duplicate ACKs that a segment has been lost, and also deduces the starting sequence number of the missing segment. The missing segment is then retransmitted.
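The duplicate-ACK rule can be sketched as a scan over the stream of ACK numbers a sender receives. The function name `fast_retransmit_points` is invented for this example, and the sketch assumes each ACK number names the next byte the receiver expects, so a repeated ACK number is also the start of the missing segment.

```python
DUP_ACK_THRESHOLD = 3  # "normally 3" consecutive duplicate ACKs


def fast_retransmit_points(acks):
    """Return the ACK numbers at which a fast retransmit would be triggered."""
    triggers = []
    last_ack = None
    dup_count = 0
    for ack in acks:
        if ack == last_ack:
            dup_count += 1
            if dup_count == DUP_ACK_THRESHOLD:
                triggers.append(ack)   # retransmit the segment starting here
        else:
            last_ack = ack             # new data acknowledged, reset the count
            dup_count = 0
    return triggers


# One original ACK for 2000 followed by three duplicates triggers a retransmit.
print(fast_retransmit_points([1000, 2000, 2000, 2000, 2000, 6000]))  # [2000]
```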

214 Go-Back-N ACK Scheme: TCP Bulk Data Flow

215 Motivation: Pipelining Increases Utilization
[Figure: pipelining timeline between sender and receiver. First packet bit transmitted, t = 0; last bit transmitted, t = L / R; first packet bit arrives; last packet bit arrives, send ACK; last bit of 2nd packet arrives, send ACK; last bit of 3rd packet arrives, send ACK; ACK arrives, send next packet, t = RTT + L / R.]
Increase utilization by a factor of 3.

216 Go-Back-N (GBN) - Sender
k-bit seq # in pkt header
“sliding window” of up to N consecutive unack’ed pkts allowed
ACK(n): ACKs all pkts up to and including seq # n - “cumulative ACK”
may receive duplicate ACKs (see receiver)
The local node must also keep a buffer of all PDUs which have been sent, but have not yet been acknowledged
Timer for each in-flight pkt
Timeout(n): retransmit pkt n and all higher seq # pkts in window

217 GBN: Receiver
ACK-only: always send ACK for correctly-received pkt with highest in-order seq #
may generate duplicate ACKs
need only remember expectedseqnum
out-of-order pkt: discard (don’t buffer) -> no receiver buffering!
Re-ACK pkt with highest in-order seq #

218 GBN In Action

219 GBN – How To Handle Packet Loss
Example of Go-Back-N. The sender in this example transmits four PDUs (1-4) and the first one (1) of these is not successfully received. The receiver notes that it was expecting a PDU numbered 1 and actually receives a PDU numbered 2. It therefore deduces that (1) was lost. It requests retransmission of the missing PDU by sending a Go-Back-N request (in this case N=1), and discards all received PDUs with a number greater than 1. The sender receives the Go-Back-N request and retransmits the missing PDU (1), followed by all subsequently sent PDUs (2-4), which the receiver then correctly receives and acknowledges.
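The loss scenario can be replayed with a toy simulation. It is a deliberately simplified sketch: the function name `go_back_n` is hypothetical, the sender transmits its window in bursts, ACKs are never lost, and a packet listed in `lost_once` is dropped only on its first transmission.

```python
def go_back_n(num_packets, window, lost_once):
    """Return the sequence numbers transmitted, in order, under Go-Back-N."""
    lost = set(lost_once)
    base = 1          # oldest unacknowledged packet at the sender
    expected = 1      # next in-order sequence number at the receiver
    sent_log = []
    while base <= num_packets:
        last = min(base + window - 1, num_packets)
        for seq in range(base, last + 1):
            sent_log.append(seq)
            if seq in lost:
                lost.discard(seq)   # dropped in the network this one time
                continue
            if seq == expected:
                expected += 1       # in order: accept, cumulative ACK advances
            # otherwise out of order: the receiver discards it (no buffering)
        # The sender goes back to the first packet the receiver still expects.
        base = expected
    return sent_log


# PDUs 1-4 with PDU 1 lost on the first attempt, as in the example.
print(go_back_n(4, 4, lost_once=[1]))  # [1, 2, 3, 4, 1, 2, 3, 4]
```

The output shows the cost of the scheme: PDUs 2-4 arrive intact the first time but are sent again anyway because the receiver does not buffer out-of-order data.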

220 GBN: ACK of Multiple Segments

221 GBN: ACK of Multiple Segments – Slight Differences

222 GBN: Fast Sender, Slow Receiver

223 Limitations With Stop-And-Wait and Go-Back-N Ack Schemes
Poor performance when multiple packets are lost from a window of data
Cumulative ACKs provide limited information
Aggressively re-transmitting packets is inefficient
Selective-ACK solves these problems

