2 Chapter 4 2 Chapter 5 Problem How to turn this host-to-host packet delivery service into a process-to-process communication channel
3 Chapter 4 3 Chapter 5 End-to-end Protocols Application-level processes that use end-to-end protocol services have certain requirements E.g. Common properties that a transport protocol can be expected to provide – Guarantees message delivery – Delivers messages in the same order they were sent – Delivers at most one copy of each message – Supports arbitrarily large messages – Supports synchronization between the sender and the receiver – Allows the receiver to apply flow control to the sender – Supports multiple application processes on each host
4 Chapter 4 4 Chapter 5 End-to-end Protocols Typical limitations of the network on which transport protocol will operate – Drop messages – Reorder messages – Deliver duplicate copies of a given message – Limit messages to some finite size – Deliver messages after an arbitrarily long delay Such a network is said to provide a best-effort level of service.
5 Chapter 4 5 Chapter 5 End-to-end Protocols Challenge for Transport Protocols – Develop algorithms that turn the less-than-desirable properties of the underlying network into the high level of service required by application programs
7 Chapter 4 7 Chapter 5 Simple Demultiplexer (UDP) Extends host-to-host delivery service of the underlying network into a process-to-process communication service Adds a level of demultiplexing which allows multiple application processes on each host to share the network
8 Chapter 4 8 Chapter 5 Simple Demultiplexer (UDP) Sender: multiplex of UDP datagrams. – UDP datagrams are received from multiple application programs. – A single sequence of UDP datagrams is passed to the IP layer. Receiver: demultiplex of UDP datagrams. – A single sequence of UDP datagrams is received from the IP layer. – Each UDP datagram received is passed to appropriate application program. Port 1Port 2Port 3 UDP Demultiplexing based on port IP layer UDP Datagram arrival at receiver
10 Chapter 4 10 Chapter 5 Simple Demultiplexer (UDP) Format for UDP header
11 Chapter 4 11 Chapter 5 Simple Demultiplexer (UDP) Multiplexing and demultiplexing is only based on port numbers. Port can be used for sending a UDP datagram to other hosts. Port can be used for receiving a UDP datagram from other hosts.
12 Chapter 4 12 Chapter 5 Simple Demultiplexer (UDP) Typically, a port is implemented by a message queue (buffer) Upon message receipt: When a message arrives, the protocol (e.g., UDP) appends the message to the end of the queue If the queue is full, the message is discarded Message Process: When an application process wants to receive a message, one is removed from the front of the queue. Application processthen processes the message. UDP Message Queue
13 Chapter 4 13 Chapter 5 Simple Demultiplexer (UDP) How does a process learns the port for the process to which it wants to send a message? – Typically, a client process initiates a message exchange with a server process. – Once a client has contacted a server, the server knows the client’s port (from the SrcPrt field contained in the message header) and can reply to it.
14 Chapter 4 14 Chapter 5 Simple Demultiplexer (UDP) How does the client learns the server’s port in the first place? – A common approach is for the server to accept messages at a well-known port. – i.e each server receives its messages at some fixed port that is widely published (Static Binding) Like well-known emergency phone number 911. – IANA assigns port numbers to specific services. IANA – List of all assignments is universally published by IANA Port numbers have a range of Static Binding: ports DNS : port 53, SMTP: port 25,FTP: port 21
15 Chapter 4 15 Chapter 5 Part of a IANA Port Assignment table:
16 Chapter 4 16 Chapter 5 Static Vs Dynamic Binding Dynamic binding – Server port is well defined (static binding) – Client's port number is often chosen from the dynamic port range – Program dynamically receives port from operating system. – Port range ( ) IANA later changed this to IANA UDP and TCP: hybrid approach. Some port numbers are assigned to fixed services. Many port numbers are available for dynamic assignment to application programs.
17 Chapter 4 17 Chapter 5 Simple Demultiplexer (UDP) Fixed UDP Port Numbers – Linux and Unix: /etc/services
18 Chapter 4 18 Chapter 5 Example 1 A client-server application such as DNS uses the services of UDP because a client needs to send a short request to a server and to receive a quick response from it. – The request and response can each fit in one user datagram. – Since only one message is exchanged in each direction, the connectionless feature is not an issue; – i.e. The client or server does not worry that messages are delivered out of order.
19 Chapter 4 19 Chapter 5 Question Is using UDP for sending s appropriate? Why?
20 Chapter 4 20 Chapter 5 Answer A client-server application such as SMTP, which is used in electronic mail, CANNOT use the services of UDP because a user can send a long message, which may include multimedia (images, audio, or video). If the application uses UDP and the message does not fit in one single user datagram, the message must be split by the application into different user datagrams. Here the connectionless service may create problems. The user datagrams may arrive and be delivered to the receiver application out of order. The receiver application may not be able to reorder the pieces. This means the connectionless service has a disadvantage for an application program that sends long messages.
21 Chapter 4 21 Chapter 5 Question Can downloading a large text file from internet use UDP as transport protocol? Why?
22 Chapter 4 22 Chapter 5 Answer Assume we are downloading a very large text file from the Internet. We definitely need to use a transport layer that provides reliable service. We don’t want part of the file to be missing or corrupted when we open the file. The delay created between the delivery of the parts are not an overriding concern for us; we wait until the whole file is composed before looking at it. In this case, UDP is not a suitable transport layer.
23 Chapter 4 23 Chapter 5 Question Is using UDP for watching a real time stream video OK? Why?
24 Chapter 4 24 Chapter 5 Answer Assume we are watching a real-time stream video on our computer. Such a program is considered a long file; it is divided into many small parts and broadcast in real time. The parts of the message are sent one after another. If the transport layer is supposed to resend a corrupted or lost frame, the synchronizing of the whole transmission may be lost – The viewer suddenly sees a blank screen and needs to wait until the second transmission arrives. This is not tolerable. However, if each small part of the screen is sent using one single user datagram, the receiving UDP can easily ignore the corrupted or lost packet and deliver the rest to the application program. – That part of the screen is blank for a very short period of the time, which most viewers do not even notice. However, video cannot be viewed out of order, so streaming audio, video, and voice applications that run over UDP must reorder or drop frames that are out of sequence.
25 Chapter 4 25 Chapter 5 Reliable Byte Stream (TCP) IP provides a connectionless/unrealible packet delivery to host UDP provides delivery to multiple ports within a host In contrast to UDP, Transmission Control Protocol (TCP) offers the following services – Byte-stream service: Application programs see stream of bytes flowing from sender to receiver. – Reliable: The sequence of bytes sent is the same as the sequence of bytes received. Technique: sliding window protocol. – Connection oriented: A connection between sender and receiver is established before sending data – Like UDP: Delivery to multiple ports within a host
26 Chapter 4 26 Chapter 5 Reliable Byte Stream (TCP) TCP Provides : – Flow control : Prevent senders from overrunning the capacity of the receivers I.e Packages are not sent faster than they can be received. Technique: TCP variant of sliding window protocol. – Congestion control: Prevent too much data from being injected into the network, i.e stop switches or links becoming overloaded
27 Chapter 4 27 Chapter 5 TCP Segment TCP is a byte-oriented protocol: – sender writes bytes into a TCP connection – receiver reads bytes out of the TCP connection. Buffered Transfer: – Application software may transmit individual bytes across the stream. – However, TCP does NOT transmit individual bytes over the Internet. – Rather, bytes are aggregated to packets before sending for efficiency
28 Chapter 4 28 Chapter 5 TCP Segment – TCP on the destination host then empties the contents of the packet into a receive buffer – The receiving process reads from this buffer. – The packets exchanged between TCP peers are called segments.
29 Chapter 4 29 Chapter 5 TCP Segment How TCP manages a byte stream.
30 Chapter 4 30 Chapter 5 TCP Header SrcPort and DstPort: – identify the source and destination ports, respectively. Acknowledgment, SequenceNum, and AdvertisedWindow : – involved in TCP’s sliding window algorithm. SequenceNum: – Each byte of data has a sequence number – Contains the sequence number for the first byte of data carried in that segment. TCP Header Format
31 Chapter 4 31 Chapter 5 TCP Header Flags: relay control information between TCP peers. – Possible flags :SYN, FIN, RESET, PUSH, URG, and ACK. – The SYN and FIN flags are used when establishing and terminating a TCP connection – The ACK flag is set any time the Acknowledgment field is valid – The URG flag signifies that this segment contains urgent data. – The PUSH flag signifies that the sender invoked the push operation – The RESET flag signifies that the receiver has become confused it received a segment it did not expect to receive so receiver wants to abort the connection. Checksum: – like in UDP
32 Chapter 4 32 Chapter 5 TCP Connections Use of protocol port numbers. – A TCP connection is identified by a pair of endpoints. – An endpoint is a pair (host IP address, port number). – E.g. Connection 1: ( , 1069) and ( , 25). Connection 2: ( , 1184) and ( , 53). Connection 3: ( , 1184) and ( , 53). – Different connections may share the endpoints! Multiple clients may be connected to the same service.
33 Chapter 4 33 Chapter 5 Connection Establishment/Termination in TCP Connection setup: – Client do an active open to a server – Two sides engage in an exchange of messages to establish the connection. – Then send data Both sides must signal that they are ready to transfer data. – Both sides must also agree on initial sequence numbers. – Initial sequence numbers are randomly chosen Timeline for three-way handshake algorithm
34 Chapter 4 34 Chapter 5 Connection Establishment 1.Segment1 (Client Server): State the initial sequence number it plans to use Flags = SYN, SequenceNum = x. 2.Segment2(Server Client): Acknowledges client’s sequence number Flags = ACK, Ack = x+1 States its own beginning sequence number Flags = SYN, SequenceNum = y 3.Segment3(Client Server): Acknowledges the server’s sequence number Flags = ACK, Ack = y ACK filed acknowledge next sequence number expected from other side
35 Chapter 4 35 Chapter 5 Connection Establishment/Termination in TCP Connection Teardown: – When participant is done sending data, it closes one direction of the connection – Either side can initiate tear down – Send FIN signal – “I’m not going to send any more data” – Other side can continue sending data – Half open connection – Must continue to acknowledge – Acknowledging FIN – Acknowledge last sequence number + 1 ClientServer FIN, SeqenceNum =a ACK Acknowledgement =a+1 ACK n Frame n FIN, SeqenceNum =b ACK Acknowledgement =b+1
36 Chapter 4 36 Chapter 5 Connection Establishment/Termination in TCP Connection setup is an asymmetric – one side does a passive open – the other side does an active open Connection teardown is symmetric – each side has to close the connection independently
37 Chapter 4 37 Chapter 5 End-to-end Issues At the heart of TCP is the sliding window algorithm TCP runs over the Internet rather than a point-to-point link The following issues need to be addressed by the sliding window algorithm – TCP supports logical connections between processes – TCP connections are likely to have widely different RTT times – – Packets may get reordered in the Internet
39 Chapter 4 39 Chapter 5 Error in transmission: packet loss Under-utilize the network capacity. There is only one message at a time in the network. SenderReceiver Frame 1 Ack 1 Frame 1 Ack 1 Timeout + retransmission
40 Chapter 4 40 Chapter 5 Sliding Window Protocol Better form of positive acknowledgement and retransmission. – Sender may transmit multiple packets before waiting for an acknowledgement. Windows of packets of small fixed size N. – All packets within window may be transmitted. – If first packet inside the window is acknowledged, window slides one element further Send frame1- 4 Ack 1 arrive send frame Ack 2 arrive send frame Sender sliding window
41 Chapter 4 41 Chapter 5 Performance depends on window size N and on speed of packet delivery. – N = 1: simple positive acknowledgement protocol. – By increasing N, network idle time may be eliminated. Steady state: – sender transmits packets as fast as network can transfer them. – Well tuned sliding window protocol keeps network saturated. Sender Receiver Frame 1 Ack 1 Frame 2 Ack 2 Frame 3 Frame 4 Ack 3 Ack 4 Frame 5 Sliding Window Revisited
42 Chapter 4 42 Chapter 5 Sliding Window Revisited Relationship between TCP send buffer (a) and receive buffer (b).
43 Chapter 4 43 Chapter 5 TCP Sliding Window Sending Side – LastByteAcked ≤ LastByteSent ≤ LastByteWritten Receiving Side – LastByteRead < NextByteExpected ≤ LastByteRcvd + 1 E.g. At Sender: Bytes 1-2 acknowledged Bytes 3-6 sent but Not yet acknowledged Bytes 7-9 not sent Will send next Bytes cannot be sent until window moves LastByteAckedLastByteSentLastByteWritten Fig. A Send Window Sending window size (advertised by the receiver)
44 Chapter 4 44 Chapter 5 TCP’s Sliding Window Mechanism TCP’s variant of the sliding window algorithm, which serves several purposes: (1) it guarantees the reliable delivery of data, (2) it ensures that data is delivered in order, and (3) it enforces flow control between the sender and the receiver.
45 Chapter 4 45 Chapter 5 Operates on byte level: – Bytes in window are sent from first to last. – First bytes have to be acknowledged before window may slide. The size of a window is variable. Receiver delivers window advertisements to sender: – On setup of a connection, the receiver informs the sender about the size of its window =how many bytes the receiver is willing to accept = size of buffer on receiver). – The sender sends at most as many bytes as determined by the window advertisement. – Every ACK message from the receiver contains a new window advertisement – The sender adapts its window size correspondingly. TCP Sliding Window
46 Chapter 4 46 Chapter 5 TCP Sliding Window Solves the problem of flow control: – As the window slides, the receiver may adjust the speed of the sender to its own speed by modifying the window size.
47 Chapter 4 47 Chapter 5 Triggering Transmission How does TCP decide to transmit a segment? – TCP supports a byte stream abstraction – Application programs write bytes into streams – It is up to TCP to decide that it has enough bytes to send a segment
48 Chapter 4 48 Chapter 5 Triggering Transmission What factors governs triggering transmission? – Ignore flow control: window is wide open – TCP has three mechanism to trigger the transmission of a segment 1. TCP maintains a variable MSS(Maximum Segment Size) – TCP sends a segment as soon as it has collected MSS bytes from the sending process 2.Sending process has explicitly asked TCP to send it – TCP supports push operation : immediate send rather than buffer – E.g. Telnet application: interactive, each keystroke immediately sent to other side 3.When a timer fires – Resulting segment contains as many bytes as are currently buffered for transmission
49 Chapter 4 49 Chapter 5 Silly Window Syndrome Caused by poorly implemented TCP flow control. Silly Window Syndrome: – Each ACK advertises a small window – Each segment carries a small amount of data Problems: – Poor use of network bandwidth – Unnecessary computational overhead E.g. – Server/Receiver consume data slowly. – So it requests client/sender to reduce the sending window size – If the server continues to be unable to process all incoming data: The window becomes smaller and smaller (shrink to a silly value) Data transmission becomes extremely inefficient.
50 Chapter 4 50 Chapter 5 TCP/IP Protocol Suite50 Silly Window Syndrome 2.Syndrome created by the Receiver – Problem: Receiving application program consumes data slowly So receiver advertise smaller window to sender – Solution : Clark ’ s solution Sending an ACK but announcing a window size of zero until there is enough space to accommodate a segment of max. size or until half of the buffer is empty i.e. receiver consumes data until “enough” space available to advertise
51 Chapter 4 51 Chapter 5 Silly Window Syndrome Syndrome created by the Sender – Problem: Sending application program creates data slowly – Goal: Do not send smaller segments Wait and collect data to send in a larger block – How long should the sending TCP wait before transmitting? Solution: Nagle ’ s algorithm Nagle ’ s algorithm takes into account – (1) the speed of the application program that creates the data, and – (2) the speed of the network that transports the data
52 Chapter 4 52 Chapter 5 Nagle’s Algorithm Solves slow sender problem If there is data to send but the window open is less than MSS, then we may want to wait some amount of time before sending the available data But how long? – If we wait too long, then we hurt interactive applications like Telnet – If we don’t wait long enough, then we risk sending a bunch of tiny packets and falling into the silly window syndrome Solution: – Introduce a timer and to transmit when the timer expires
53 Chapter 4 53 Chapter 5 Nagle’s Algorithm We could use a clock-based timer(e.g. fires every 100 ms) Nagle introduced an elegant self-clocking solution Key Idea: – As long as TCP has any data in flight, the sender will eventually receive an ACK – This ACK can be treated like a timer firing, triggering the transmission of more data
54 Chapter 4 54 Chapter 5 Nagle’s Algorithm When the application produces data to send if both the available data and the window ≥ MSS send a full segment else if there is unACKed data in flight buffer the new data until an ACK arrives else send all the new data now
55 Chapter 4 55 Chapter 5 Adaptive Retransmission TCP estimates the round-trip delay for each active connection For each connection: – TCP generates a sequence of round-trip estimates and – produces a weighted average
56 Chapter 4 56 Chapter 5 Question Why would TCP estimate RTT delay per connection?
57 Chapter 4 57 Chapter 5 Answer The goal is to wait long enough to decide that a packet was lost and needs to be retransmitted, without waiting longer than necessary – We don’t want timeout to expire too soon or too late. Ideally, we would like to set the retransmission timer to a value just slightly larger than the round-trip time (RTT) between the two TCP devices Ref: m
58 Chapter 4 58 Chapter 5 The problem is that there is no such “typical” round-trip time. Why?
59 Chapter 4 59 Chapter 5 There are two main reasons for this: 1.Differences In Connection Distance: 1.Pinging to Georgia State University, GA from within csbsju United State 2.Pinging to a Oxford University in UK from csbsju United States 3.Ping to Kharagpur college in India from csbsju United States
60 Chapter 4 60 Chapter 5 2.Transient Delays and Variability: – The amount of time it takes to send data between any two devices will vary over time due to various happenings on the internetwork: fluctuations in traffic, router loads etc. E.g.
61 Chapter 4 61 Chapter 5 It is for these reasons that TCP does not attempt to use a static, single number for its retransmission timers (timeout). Instead, TCP uses a dynamic, or adaptive retransmission scheme.
62 Chapter 4 62 Chapter 5 Adaptive Retransmission In addition to round trip delay estimations, TCP also maintains an estimate of the variance and uses a linear combination of the estimated mean and variance as the value of the timeout, as shown below Sender Receiver Frame 1 Ack 1 Frame 2 Ack 2 estimate1 estimate2 timeout Packet lost Example 1: Connection with longer RTT Sender Receiver Frame 1 Ack 1 Frame 2 Ack 2 estimate1 estimate2 timeout Packet lost Example 2: Connection with shorter RTT
63 Chapter 4 63 Chapter 5 Adaptive Retransmission Adaptive Retransmission Based on Round-Trip Time Calculations – TCP attempts to determine the approximate round-trip time between the devices, and adjusts it over time to compensate for increases or decreases in the average delay. – To learn how this is done look at RFC 2988RFC 2988 RTT estimated = ( * RTT estimated ) + ((1- ) * RTT sample ) – is a smoothing factor between 0 and 1 Rtt for most recent segment/Ack Previous estimate of Rtt New estimate of Rtt Smoothing factor [0..1]
64 Chapter 4 64 Chapter 5 Question Which values will be appropriate for ?
65 Chapter 4 65 Chapter 5 Answer RTT estimated = ( * RTT estimated ) + ((1- ) * RTT sample ) Higher values of (closer to 1) – provide better smoothing – avoid sudden changes as a result of one very fast or very slow RTT measurement. – Conversely, this also slows down how quickly TCP reacts to more sustained changes in round-trip time. Lower values of (closer to 0) : – make the RTT change more quickly in reaction to changes in measured RTT – can cause “over-reaction” when RTTs fluctuate wildly.
66 Chapter 4 66 Chapter 5 Adaptive Retransmission Original Algorithm 1.Measure SampleRTT for each segment/ ACK pair 2.Compute weighted average of RTT EstRTT = x EstRTT + (1 - )x SampleRTT between 0.8 and Set timeout based on EstRTT TimeOut = 2 x EstRTT – Why?
67 Chapter 4 67 Chapter 5 Original Algorithm Problem (Acknowledgement Ambiguity) – ACK does not really acknowledge a transmission It actually acknowledges the receipt of data – When a segment is retransmitted and then an ACK arrives at the sender It is impossible to decide if this ACK should be associated with the first or the second transmission for calculating RTTs – Why?
68 Chapter 4 68 Chapter 5 Associating the ACK with (a) original transmission versus (b) retransmission RTT appear too high RTT appear too low Original Algorithm
69 Chapter 4 69 Chapter 5 Question How do you correct this problem?
70 Chapter 4 70 Chapter 5 Answer: Karn/Partridge Algorithm Surprisingly simple solution! Refines RTT Calculation: – Do not sample RTT when retransmitting Only measures SampleRTT for segments that have been sent only once. Algorithm also included a second small change: – Double timeout after each retransmission Do NOT base timeout on the last EstimatedRTT Use Exponential backoff (similar to Ethernets) – Why?
71 Chapter 4 71 Chapter 5 Karn/Partridge Algorithm Karn-Partridge algorithm was an improvement over the original approach But it does not eliminate congestion We need to understand how timeout is related to congestion – If you timeout too soon, you may unnecessarily retransmit a segment which adds load to the network
72 Chapter 4 72 Chapter 5 Karn/Partridge Algorithm Main problem with the original computation is that it does not take variance of Sample RTTs into consideration. If the variance among Sample RTTs is small – Then the Estimated RTT can be better trusted – There is no need to multiply this by 2 to compute the timeout EstRTT = x EstRTT + (1 - )x SampleRTT TimeOut = 2 x EstRTT
73 Chapter 4 73 Chapter 5 Karn/Partridge Algorithm On the other hand, a large variance in the samples suggest that timeout value should not be tightly coupled to the Estimated RTT Jacobson/Karels proposed a new scheme for TCP retransmission
74 Chapter 4 74 Chapter 5 Jacobson/Karels Algorithm Difference = SampleRTT − EstimatedRTT EstimatedRTT = EstimatedRTT + ( × Difference) Deviation = Deviation + (|Difference| − Deviation) – is a fraction between 0 and 1. – i.e. we calculate both the mean RTT and the variation in that mean. TimeOut = μ × EstimatedRTT + × Deviation – μ is typically set to 1 – is set to 4. Thus, when the variance is small: – TimeOut is close to EstimatedRTT; When variance is large : – It causes the deviation term to dominate the calculation. Demo:
75 Chapter 4 75 Chapter 5 Summary We have discussed how to convert host-to-host packet delivery service to process-to-process communication channel. We have discussed UDP We have discussed TCP