2 Orientation
- We move one layer up and look at the transport layer across the Internet.
3 Orientation
- TCP and UDP are end-to-end protocols.
- They are implemented only at the hosts.
4 Transport Protocols in the Internet
The Internet supports 2 transport protocols:
UDP - User Datagram Protocol
- datagram oriented
- unreliable, connectionless
- simple
- unicast and multicast
- useful for multimedia applications
- used for control protocols: network management (SNMP), routing (RIP), naming (DNS), etc.
TCP - Transmission Control Protocol
- stream oriented
- reliable, connection-oriented
- complex
- only unicast
- used for data applications: web (HTTP), email (SMTP), file transfer (FTP), remote login (e.g., with a terminal client such as SecureCRT), etc.
5 UDP - User Datagram Protocol
- UDP extends the host-to-host delivery service of IP to an application process-to-application process delivery service.
- It does this by multiplexing and demultiplexing packets from multiple application-to-application communication sessions.
6 UDP packet format
- Port numbers identify the sending and receiving applications (processes). The maximum port number is 2^16 - 1 = 65,535.
- Message Length is the length of the UDP header plus data, in bytes. It ranges from 8 bytes (i.e., the data field can be empty) to 65,535 bytes.
- Checksum covers the UDP header and the UDP data.
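The four header fields described above can be sketched with Python's `struct` module. This is a minimal illustration, not a full UDP implementation: the function names are hypothetical, and the checksum is simply left at 0 instead of being computed.

```python
import struct

# Four 16-bit fields in network byte order:
# source port, destination port, message length, checksum.
UDP_HEADER = struct.Struct("!HHHH")


def build_udp_datagram(src_port: int, dst_port: int, payload: bytes) -> bytes:
    """Prepend an 8-byte UDP header to the payload (checksum left at 0 here)."""
    length = UDP_HEADER.size + len(payload)  # header plus data, in bytes
    if length > 0xFFFF:
        raise ValueError("message length must fit in 16 bits")
    return UDP_HEADER.pack(src_port, dst_port, length, 0) + payload


def parse_udp_datagram(datagram: bytes):
    """Split a datagram into (src_port, dst_port, length, checksum, data)."""
    src_port, dst_port, length, checksum = UDP_HEADER.unpack_from(datagram)
    return src_port, dst_port, length, checksum, datagram[UDP_HEADER.size:length]
```

An empty payload yields the minimum message length of 8 bytes, which matches the lower bound given on the slide.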
7 Port Numbers
- UDP (and TCP) use port numbers to identify applications.
- Port numbers are 16 bits long, so each host has 65,536 possible UDP port numbers (0 to 65,535); port 0 is reserved.
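Demultiplexing by port number can be pictured as a table lookup on the destination port. The sketch below is a toy model with hypothetical names (`UdpDemultiplexer`, `bind`, `deliver`); a real host keys the table on more than the port and hands data to sockets, not Python lists.

```python
class UdpDemultiplexer:
    """Toy model: maps a destination port to the receive queue of one application."""

    def __init__(self):
        self._bound = {}  # destination port -> list used as a receive queue

    def bind(self, port: int) -> list:
        """Register an application on a port and return its receive queue."""
        if not 0 < port <= 65535:
            raise ValueError("port must be in 1..65535")
        if port in self._bound:
            raise OSError("port already in use")
        queue = []
        self._bound[port] = queue
        return queue

    def deliver(self, dst_port: int, payload: bytes) -> bool:
        """Hand the payload to the application bound to dst_port, if any."""
        queue = self._bound.get(dst_port)
        if queue is None:
            return False  # no application is listening on this port
        queue.append(payload)
        return True
```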
8 TCP
- Service offered by TCP
- TCP Header
- TCP Connection Establishment and Termination
- Flow control
- Error control
- Congestion control
9 TCP = Transmission Control Protocol
- Provides a reliable unicast end-to-end byte stream over an unreliable internetwork.
10 TCP is reliable
- Byte stream is broken up into chunks, which are called segments.
- Detecting errors:
  - TCP has checksums for header and data. Segments with invalid checksums are discarded.
  - Each segment that is transmitted has a sequence number.
  - Receiver sends acknowledgments (ACKs) for segments.
  - Sender maintains a timer. An ACK is expected before the timer times out.
- Correcting errors:
  - Lost or errored segments are retransmitted.
  - Selective repeat ARQ scheme
  - Cumulative ACKs
11 Byte Stream Service
- To the lower layers, TCP handles data in "segments".
- To the higher layers, TCP handles data as a sequence of bytes and does not mark any boundaries within that sequence.
- So: higher layers do not know about the beginning and end of segments!
13 TCP Format
- TCP segments have a header of 20 bytes plus options, followed by >= 0 data bytes.
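14 TCP header fields - Port Numbers
- Port Number: A port number identifies the endpoint of a connection.
  - A pair <IP address, port number> identifies one endpoint of a connection.
  - Two pairs <client IP address, client port number> and <server IP address, server port number> identify a TCP connection.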
15 TCP header fields - Sequence Number
- Sequence Number (SeqNo):
  - Sequence number is 32 bits long.
  - So the range of SeqNo is 0 <= SeqNo <= 2^32 - 1 (about 4.3 Gbyte).
  - Each sequence number identifies the byte in the stream of data from the sending TCP to the receiving TCP that the first byte of data in this segment represents.
  - Initial Sequence Number (ISN) of a connection is set during connection establishment.
16 TCP header fields - Ack. No.
- Acknowledgment Number (AckNo):
  - Acknowledgments are piggybacked, i.e., a segment from A -> B contains an acknowledgment for a segment sent in the B -> A direction.
  - The AckNo in the B -> A segment header contains the SeqNo for the next segment expected at B for the A -> B flow.
  - Example: The acknowledgment for a 1500-byte segment with the sequence number 0 is AckNo=1500.
  - A host uses the AckNo field to send acknowledgments.
  - If a host sends an AckNo in a segment, it sets the “ACK flag”.
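The AckNo rule above (the next byte expected in order) can be sketched as a small function. The name `cumulative_ack` and the `(seq_no, length)` representation are assumptions made for illustration, and the sketch ignores 32-bit wraparound.

```python
def cumulative_ack(first_byte: int, received: list[tuple[int, int]]) -> int:
    """Return the AckNo: the sequence number of the next byte expected in order.

    `received` holds (seq_no, length) pairs for every segment that has
    arrived so far, in any order. Sequence number wraparound is ignored.
    """
    next_expected = first_byte
    for seq_no, length in sorted(received):
        if seq_no > next_expected:
            break  # gap in the byte stream: later segments cannot be acknowledged
        next_expected = max(next_expected, seq_no + length)
    return next_expected
```

For the slide's example, `cumulative_ack(0, [(0, 1500)])` gives 1500. If only a later segment has arrived, the AckNo stays at the first missing byte.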
17 TCP header fields - Ack. No. Contd.
- Example: Sender sends two segments with bytes “1..1500” and “1501..3000”, but the receiver only gets the second segment.
  - What is the sequence number of the first segment?
  - What is the sequence number of the second segment?
  - What is the ACK number sent in response by the receiver when it receives the second segment?
18 TCP header fields - Header Length
- Header Length (4 bits):
  - Length of header in 32-bit words
  - Note that the TCP header has variable length (minimum of 20 bytes)
19 TCP header fields - Flags
- Flag bits:
  - URG: Urgent pointer is valid
    - If the bit is set, the following bytes contain an urgent message in the range: SeqNo <= urgent message <= SeqNo + urgent pointer
  - ACK: Acknowledgment Number is valid
  - PSH: PUSH Flag
    - Notification from the sender to the receiver that the receiver should pass all data that it has to the application as soon as possible.
    - Normally set by the sender when the sender’s buffer is empty (so TCP does not wait expecting more data)
20 TCP header fields - Flags Contd.
- Flag bits:
  - RST: Reset the connection
    - The flag causes the receiver to reset the connection
    - The receiver of a RST terminates the connection and notifies the higher-layer application about the reset
  - SYN: Synchronize sequence numbers
    - Sent in the first packet when opening a connection
  - FIN: Sender is finished with sending
    - Used for closing a connection
    - Both sides of a connection must send a FIN
21 TCP header fields
- Window Size:
  - Each side of the connection advertises its receiving window size
  - Window size is the maximum number of bytes that a receiver can accept.
  - Maximum window size is 2^16 - 1 = 65,535 bytes
- TCP Checksum:
  - TCP checksum covers both TCP header and TCP data
- Urgent Pointer:
  - Only valid if URG flag is set
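The fixed 20-byte part of the header described on the preceding slides can be decoded as follows. This is a sketch: the names `TcpHeader` and `parse_tcp_header` are hypothetical, options are not parsed, and the bit positions follow the standard TCP header layout (4-bit header length in the top bits of the 16-bit word that also holds the flags).

```python
import struct
from typing import NamedTuple

# Ports (2 x 16 bits), SeqNo and AckNo (2 x 32 bits), header length plus flags
# (16 bits), window size, checksum, urgent pointer (3 x 16 bits): 20 bytes.
FIXED_HEADER = struct.Struct("!HHIIHHHH")
FLAG_BITS = {"FIN": 0x01, "SYN": 0x02, "RST": 0x04,
             "PSH": 0x08, "ACK": 0x10, "URG": 0x20}


class TcpHeader(NamedTuple):
    src_port: int
    dst_port: int
    seq_no: int
    ack_no: int
    header_len: int  # in bytes (the wire field counts 32-bit words)
    flags: frozenset
    window: int
    checksum: int
    urgent_ptr: int


def parse_tcp_header(segment: bytes) -> TcpHeader:
    """Decode the fixed part of a TCP header; options are left unparsed."""
    src, dst, seq, ack, off_flags, win, csum, urg = FIXED_HEADER.unpack_from(segment)
    header_len = (off_flags >> 12) * 4  # 4-bit word count -> bytes
    flags = frozenset(name for name, bit in FLAG_BITS.items() if off_flags & bit)
    return TcpHeader(src, dst, seq, ack, header_len, flags, win, csum, urg)
```

A header length field of 5 words corresponds to the 20-byte minimum; larger values indicate that options follow.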
22 TCP header fields - Options
- Options - a few examples:
23 TCP header fields
- Options:
  - NOP is used to pad the TCP header to a multiple of 4 bytes
  - Maximum Segment Size: Sets the maximum length of the segments. This option can only appear in a SYN segment.
25 Connection Management in TCP
- Opening a TCP Connection
- Closing a TCP Connection
- Special Scenarios
- State Diagram
26 TCP Connection Establishment
- TCP uses a three-way handshake to open a connection:
  (1) ACTIVE OPEN: Client sends a segment with
    - SYN bit set
    - port number of client, port number of server
    - initial sequence number (ISN) of client
  (2) PASSIVE OPEN: Server responds with a segment with
    - SYN bit set
    - initial sequence number of server
    - ACK for ISN of client
  (3) Client acknowledges by sending a segment with:
    - ACK for ISN of server
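The three segments of the handshake can be traced with concrete numbers. The sketch below assumes the standard rule that a SYN consumes one sequence number, so each side acknowledges the peer's ISN + 1; the function name and the dictionary layout are illustrative only.

```python
SEQ_MODULUS = 2**32  # sequence numbers are 32 bits and wrap around


def three_way_handshake(client_isn: int, server_isn: int) -> list[dict]:
    """Return the three handshake segments as simple dictionaries."""
    syn = {"flags": {"SYN"}, "seq": client_isn}
    syn_ack = {"flags": {"SYN", "ACK"},
               "seq": server_isn,
               "ack": (client_isn + 1) % SEQ_MODULUS}  # ACK for ISN of client
    ack = {"flags": {"ACK"},
           "seq": (client_isn + 1) % SEQ_MODULUS,
           "ack": (server_isn + 1) % SEQ_MODULUS}  # ACK for ISN of server
    return [syn, syn_ack, ack]
```

With a client ISN of 100 and a server ISN of 300, the server acknowledges 101 and the client acknowledges 301.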
30 First data segment sequence number
- Note that the first data segment after the three-way handshake starts with the sequence number following that of the SYN segment (ISN + 1, because the SYN itself consumes one sequence number).
31 Why to start with a new ISN
- The problem with starting off each connection with a sequence number of 1 is that segments from different connections could get mixed up.
- Traditionally, each device chose the ISN from a timed counter, like a clock of sorts, that was incremented every 4 microseconds. This counter was initialized when TCP started up, and its value increased by 1 every 4 microseconds until it reached the largest possible 32-bit value (4,294,967,295), at which point it “wrapped around” to 0 and resumed incrementing.
- Period: 2^32 × 4 microseconds ≈ 17,180 seconds, i.e., about 4.8 hours
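The wrap-around period of the traditional ISN counter follows directly from the 32-bit range and the 4-microsecond tick. The short sketch below works in integer microseconds to avoid rounding; the function names are hypothetical.

```python
SEQ_SPACE = 2**32          # number of distinct 32-bit counter values
TICK_MICROSECONDS = 4      # the counter is incremented every 4 microseconds


def isn_wrap_period_seconds() -> float:
    """Time for the counter to run through all 32-bit values once."""
    return SEQ_SPACE * TICK_MICROSECONDS / 1_000_000


def isn_from_clock(microseconds_since_start: int) -> int:
    """ISN that a counter started at 0 would supply at the given time."""
    return (microseconds_since_start // TICK_MICROSECONDS) % SEQ_SPACE
```

`isn_wrap_period_seconds() / 3600` evaluates to roughly 4.77 hours.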
32 TCP Connection Termination
- Each end of the data flow must be shut down independently (“half-close”).
- If one end is done, it sends a FIN segment. This means that no more data will be sent.
- Four steps involved:
  (1) X sends a FIN to Y (active close)
  (2) Y ACKs the FIN (at this time, Y can still send data to X)
  (3) Y sends a FIN to X (passive close)
  (4) X ACKs the FIN.
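35 TCP Half-close
[Figure: message sequence for a half-close: FIN, ACK of FIN, then DATA and ACK of DATA in the direction that is still open, then FIN and ACK of FIN.]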
36 MSS
[Figure: nodes A, B, and C, with links labeled MTU = 296 and MTU = 1500; the MSS is carried in the SYN segment.]
- Default is generally 536 bytes
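The MTU values on this slide relate to the MSS through the header sizes. The sketch below assumes IPv4 and TCP headers without options (20 bytes each), which is where the 536-byte default comes from (576 - 40). The `min` in `effective_mss` models the common behavior that a sender exceeds neither the peer's announced MSS nor its own limit; both function names are hypothetical.

```python
IP_HEADER_BYTES = 20   # IPv4 header without options (assumption)
TCP_HEADER_BYTES = 20  # TCP header without options (assumption)


def mss_for_mtu(mtu: int) -> int:
    """Largest TCP payload that fits into one IP packet of the given MTU."""
    return mtu - IP_HEADER_BYTES - TCP_HEADER_BYTES


def effective_mss(local_mtu: int, peer_announced_mss: int) -> int:
    """Segment size a sender would use: never more than the peer announced."""
    return min(mss_for_mtu(local_mtu), peer_announced_mss)
```

For the MTUs in the figure, this gives 256 bytes for MTU = 296 and 1460 bytes for MTU = 1500.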
37 Difference between TCP connections and connections in a connection-oriented network
- TCP “connections” are not the same as connections in a connection-oriented network.
- In a connection-oriented network, a signaling procedure is used to reserve bandwidth for the connection on every link of the end-to-end path (e.g., circuit-switched networks).
- A TCP connection involves the maintenance of state information at the end hosts.
  - Purpose is to provide error correction for TCP segments
  - Initial sequence number exchanged to avoid accidentally sending data to an old connection
39 TCP flow control
- Flow Control: How to prevent the sender from overrunning the receiver buffer?
- Flow Control in TCP:
  - TCP implements sliding window flow control
  - Window size is usually sent within acknowledgments.
40 Window Management in TCP
- The receiver returns two parameters to the sender in an ACK: the acknowledgment number (AckNo) and the window size (Win).
- The interpretation is: I am ready to receive new data with SeqNo = AckNo, AckNo+1, ..., AckNo+Win-1
- Receiver can acknowledge data without opening the window
- Receiver can change the window size without acknowledging data
41 TCP Flow Control
- Receive side of a TCP connection has a receive buffer.
- App process may be slow at reading from the buffer.
- Flow control: the sender won’t overflow the receiver’s buffer by transmitting too much, too fast.
- Speed-matching service: matching the send rate to the receiving app’s drain rate.
42 TCP Flow control: how it works
- (Suppose the TCP receiver discards out-of-order segments)
- Spare room in buffer = RcvWindow = RcvBuffer - [LastByteRcvd - LastByteRead]
- Rcvr advertises spare room by including the value of RcvWindow in segments
- Sender limits unACKed data to RcvWindow
  - guarantees receive buffer doesn’t overflow
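The RcvWindow formula and the sender-side limit translate directly into code. `RcvBuffer`, `LastByteRcvd`, and `LastByteRead` come from the slide; `last_byte_sent` and `last_byte_acked` are assumed names for the sender's bookkeeping, introduced here only for illustration.

```python
def rcv_window(rcv_buffer: int, last_byte_rcvd: int, last_byte_read: int) -> int:
    """Spare room in the receive buffer, as advertised to the sender."""
    buffered = last_byte_rcvd - last_byte_read  # received but not yet read by the app
    if not 0 <= buffered <= rcv_buffer:
        raise ValueError("buffered data must fit in the receive buffer")
    return rcv_buffer - buffered


def sender_may_send(last_byte_sent: int, last_byte_acked: int, window: int) -> int:
    """Bytes the sender may still transmit without exceeding RcvWindow."""
    unacked = last_byte_sent - last_byte_acked
    return max(0, window - unacked)
```

For instance, a 4096-byte buffer holding 1024 unread bytes yields an advertised window of 3072 bytes.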
43 Sliding windows
[Figure: bytes 1, 2, 3, ..., 11, … of the stream, divided into four regions: sent and acknowledged; sent, not acked; usable window (can send ASAP); can’t send until window moves. The offered window is advertised by the receiver.]
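45 Sliding Window: In-class example
[Figure: sender and receiver with buffers labeled 4K bytes. The receiver advertises win 4096; the sender sends 1:1025(1024); the receiver replies ack 1025 win 3072 (1K of its buffer is in use); the sender then sends 3 segments: 1025:2049(1024), 2049:3073(1024), 3073:4097(1024).]
- After “ack 1025 win 3072”: How many more segments can it send now?
- After the 3 segments: How many segments can it send now?
- NOTATION 1:1025(1024), sequence number: Is 1025 carried in the TCP header? Is 1024 carried in the TCP header? What is 1024?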
46 Sliding Window: In-class example answers
[Figure: same exchange as on the previous slide: win 4096; 1:1025(1024); ack 1025 win 3072; then 1025:2049(1024), 2049:3073(1024), 3073:4097(1024).]
- After “ack 1025 win 3072”: How many more segments can it send now? 3 segments
- After the 3 segments: How many segments can it send now? 0
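The answers above follow from the usable-window arithmetic: the receiver accepts bytes AckNo through AckNo + Win - 1, so the sender may send whatever lies between its next unsent byte and that upper edge. A minimal sketch with hypothetical function names, assuming 1024-byte segments as in the example:

```python
def usable_window(ack_no: int, win: int, next_seq: int) -> int:
    """Bytes the sender may still send: right window edge minus next byte to send."""
    return max(0, ack_no + win - next_seq)


def segments_sendable(ack_no: int, win: int, next_seq: int,
                      segment_size: int = 1024) -> int:
    """Number of full segments that fit into the usable window."""
    return usable_window(ack_no, win, next_seq) // segment_size
```

After "ack 1025 win 3072" the next byte to send is 1025, so `segments_sendable(1025, 3072, 1025)` is 3; once bytes up to 4096 are sent, `segments_sendable(1025, 3072, 4097)` is 0.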
47 Silly Window Syndrome
- Let's say that the server is only able to remove 1 byte of data from the buffer for every 3 it receives.
- Let's say it also removes 40 additional bytes from the buffer during the time it takes for the client's next segment to arrive.
- In the worst case, the client then sends a segment with exactly one byte, refilling the buffer until the application draws off the next byte.
49 TCP error control
- ARQ scheme with positive cumulative ACKs
- Delayed ACKs:
  - TCP delays transmission of ACKs for up to 200 ms
  - The hope is to have data ready in that time frame. Then, the ACK can be piggybacked with the data segment.
50 Delayed ACK timer
- This timer ticks every 200 ms.
- First timeout occurs based on when the timer was initialized, which is when the system was rebooted.
- The figure below explains why the ACK delay is UP TO 200 ms (and not equal to 200 ms).
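Because the 200 ms timer runs freely and is not restarted when data arrives, the wait until the next tick depends on where in the tick interval the data arrives. The sketch below assumes ticks at multiples of 200 ms since boot, and that data arriving exactly on a tick waits for the following one; the function name is hypothetical.

```python
DELAYED_ACK_TICK_MS = 200


def delayed_ack_wait_ms(arrival_ms: int, tick_ms: int = DELAYED_ACK_TICK_MS) -> int:
    """Time from data arrival until the next tick of the free-running timer."""
    return tick_ms - (arrival_ms % tick_ms)
```

Data arriving 150 ms after a tick is acknowledged after only 50 ms, so the delay is at most 200 ms, not a fixed 200 ms.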
51 TCP Retransmission Timer
- Retransmission Timer:
  - The setting of the retransmission timer is crucial for efficiency
  - Timeout value too small -> results in unnecessary retransmissions
  - Timeout value too large -> long waiting time before a retransmission can be issued
  - A problem is that the delays in the network are not fixed
  - Therefore, the retransmission timers must be adaptive
52 Measuring TCP Retransmission Timers
- Transfer file from aida to rigoletto
- Unplug Ethernet cable in the middle of the file transfer
54 Interpreting the Measurements
- The intervals between retransmission attempts, in seconds, are: 1.03, 3, 6, 12, 24, 48, 64, 64, 64, 64, 64, 64, 64.
- Time between retransmissions is doubled each time (Exponential Backoff Algorithm).
- Timer is not increased beyond 64 seconds.
- TCP gives up after the 13th attempt and about 9 minutes (total timeout; tcp_ip_abort_interval is 2 mins in Solaris and can be set by the administrator; 9 mins is the commonly used old timeout value).
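The measured intervals can be checked against a capped doubling schedule. The sketch below assumes a nominal first timeout of 1.5 seconds (the measured 1.03 seconds is shorter only because of timer granularity) and the 64-second cap from the measurements; the function name is hypothetical.

```python
def backoff_intervals(first: float, attempts: int, cap: float = 64.0) -> list[float]:
    """Exponential backoff: double the timeout after each attempt, up to a cap."""
    intervals = []
    rto = first
    for _ in range(attempts):
        intervals.append(min(rto, cap))
        rto *= 2
    return intervals


# Intervals reported on the slide, in seconds.
OBSERVED = [1.03, 3, 6, 12, 24, 48, 64, 64, 64, 64, 64, 64, 64]
TOTAL_SECONDS = sum(OBSERVED)  # time until TCP gives up
```

Apart from the first value, `backoff_intervals(1.5, 13)` reproduces the observed list, and the observed intervals add up to about 542 seconds, i.e., roughly 9 minutes.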
55 TCP timers
- First timeout occurs based on when the timer was initialized.
- This explains why the first timeout occurs at 1.03 sec and not 1.5 sec.
- If the base timer clock is 500 ms, the first timeout occurs after 3 timer ticks. This happens to occur at 1.03 sec after the first segment was sent. Subsequent retransmissions occur at 3 sec, 6 sec, 12 sec, etc.
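The 1.03-second value can be reproduced by counting ticks of a free-running 500 ms clock. The sketch assumes ticks at multiples of 500 ms and a timeout that fires on the third tick after the segment is sent; the function name and the 470 ms send offset are illustrative choices, not values from the measurement.

```python
def first_timeout_delay_ms(sent_ms: int, ticks: int = 3, tick_ms: int = 500) -> int:
    """Delay between sending a segment and the tick on which its timer expires."""
    next_tick = (sent_ms // tick_ms + 1) * tick_ms  # first tick after sending
    expiry = next_tick + (ticks - 1) * tick_ms      # timer expires on the 3rd tick
    return expiry - sent_ms
```

A segment sent 470 ms into a tick interval times out after 1030 ms; in general the first timeout falls somewhere between just over 1 second and 1.5 seconds.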
56 Adaptive mechanism
- The retransmission mechanism of TCP is adaptive.
- The retransmission timers are set based on round-trip time (RTT) measurements that TCP performs.
- The RTT is based on the time difference between segment transmission and ACK.
- But:
  - TCP does not ACK each segment
  - Can’t start a second RTT measurement if timing on one segment is in progress
  - Each connection has only one timer
57 Computation of RTO in adaptive scheme
- Retransmission timer is set to a Retransmission Timeout (RTO) value.
- RTO is calculated based on the RTT measurements.
- The RTT measurements (M) are smoothed by the following estimators A (mean RTT value) and D (smoothed mean deviation of RTT):
  $Err = M - A$
  $A \leftarrow A + g \cdot Err = (1 - g)A + gM$
  $D \leftarrow D + h(|Err| - D) = (1 - h)D + h|Err|$
  $RTO = A + 4D$
- The gains are set to $h = 1/4$ and $g = 1/8$.
  - In the formula for computing the new smoothed mean RTT A, 0.125 times the newly measured value (M) is added to 0.875 times the old smoothed value of A.
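The estimator update can be written as one small function. The symbols A, D, M, g, and h are those of the slide; the function name `update_rto` is hypothetical.

```python
G = 1 / 8  # gain for the smoothed mean RTT (A)
H = 1 / 4  # gain for the smoothed mean deviation (D)


def update_rto(a: float, d: float, m: float, g: float = G, h: float = H):
    """Apply one RTT measurement m and return the new (A, D, RTO)."""
    err = m - a
    a = a + g * err             # equals (1 - g) * A + g * M
    d = d + h * (abs(err) - d)  # equals (1 - h) * D + h * |Err|
    return a, d, a + 4 * d
```

With A = 1, D = 1 and a measured RTT M = 2, `update_rto(1.0, 1.0, 2.0)` returns A = 1.125, D = 1.0, and RTO = 5.125.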
58 In-class example
- Assume A=1, D=1 (initial values)
- RTO = ?
59 Example of RTO computation (adaptive)
- Assume A=1, D=1 (initial values)
- Err = 2 - 1 = 1 (since M, the measured RTT, is 2)
- A = 1 + 0.125 × 1 = 1.125; D = 1 + 0.25 × (1 - 1) = 1
- RTO = A + 4D = 1.125 + 4 = 5.125
- This is why in the figure below, when segment 2 is lost, it is retransmitted after 5.125 sec.
60 In-class example
- Assume A=1, D=1 (initial values)
- RTO = A + 4D = 5
- RTO = A + 4D = 5.125 (adaptive: new A = 1.125; D = 1)
- RTO = 10.25 (doubling)
- RTO = 10.25 (Karn's algorithm)
- Retransmission after 5.125 sec, since that is the retransmission timer value
61 Karn’s Algorithm
- If an ACK for a retransmitted segment is received, the sender cannot tell if the ACK belongs to the original or the retransmission.
- The RTT measurement started for the original transmission should be terminated.
- There will be no RTT measurement for the original or retransmitted segment.
- Therefore A and D cannot be updated when the ACK is received, and hence there is no new RTO computation at this point.
- Don’t confuse this with the RTO being doubled when the segment is retransmitted following the exponential doubling rule.
- Summary: on a retransmission, RTT measurement is suspended and RTO is doubled.
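The interplay of exponential backoff and Karn's algorithm can be sketched as a small class. The class and method names are hypothetical; the 64-second cap is taken from the measurements earlier in these slides, and the gains 1/8 and 1/4 are those of the adaptive scheme.

```python
class RetransmissionTimer:
    """Toy model of RTO handling with exponential backoff and Karn's algorithm."""

    def __init__(self, a: float, d: float, max_rto: float = 64.0):
        self.a = a              # smoothed mean RTT
        self.d = d              # smoothed mean deviation
        self.rto = a + 4 * d
        self.max_rto = max_rto

    def on_timeout(self) -> None:
        """The segment is retransmitted: double the RTO (exponential backoff)."""
        self.rto = min(2 * self.rto, self.max_rto)

    def on_ack(self, measured_rtt: float, was_retransmitted: bool) -> None:
        """Update A, D, and RTO unless the ACK is ambiguous (Karn's algorithm)."""
        if was_retransmitted:
            return  # no RTT sample: A, D, and the doubled RTO stay unchanged
        err = measured_rtt - self.a
        self.a += err / 8
        self.d += (abs(err) - self.d) / 4
        self.rto = self.a + 4 * self.d
```

Starting from A = 2 and D = 1 (RTO = 6), a timeout raises the RTO to 12, and the ACK for the retransmitted segment leaves it at 12.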
62 In-class example
- At t1: RTO = 6 sec; A = 2; D = 1
- At t2: RTO = ?
- At t3: RTO = ?
- (figure label: 3 sec)
63 In-class example
- At t1: RTO = 6 sec; A = 2; D = 1
- At t2: RTO = 12 sec (doubling)
- At t3: RTO = 12 sec (Karn's algorithm)
- (figure label: 3 sec)
64 Thus there are two schemes for determining RTO and two schemes for controlling RTT measurement
- RTO
  - Exponential backoff if a segment is retransmitted
  - Adaptive RTO as a function of RTT (A+4D)
- RTT measurement
  - Karn’s algorithm: no RTT measurement on a retransmitted segment
  - Can’t start a second RTT measurement if timing on one segment is in progress: if an RTT measurement is in progress and a new segment is sent, then no RTT measurement is taken for the new segment