Fall 2007 CSci232: Transport Layer & TCP

2 Transport Layer
Transport Layer Services
  – connection-oriented vs. connectionless
  – multiplexing and demultiplexing
UDP: Connectionless Unreliable Service
TCP: Connection-Oriented Reliable Service
  – connection management: set-up and tear down
  – reliable data transfer protocols
  – flow and congestion control
Readings: Chapter 5

3 Transport Protocols
Lowest-level end-to-end protocol.
  – Header generated by sender is interpreted only by the destination
  – Routers view transport header as part of the payload
[Figure: protocol stacks (Transport, IP, Datalink, Physical) at the two end hosts, joined through an IP router]

4 Transport Services and Protocols
Provide logical communication between app processes running on different hosts
Transport protocols run in end systems
  – send side: breaks app messages into segments, passes to network layer
  – rcv side: reassembles segments into messages, passes to app layer
More than one transport protocol available to apps
  – Internet: TCP and UDP
[Figure: logical end-end transport between the full stacks (application, transport, network, data link, physical) of two end systems; intermediate nodes implement only network, data link, and physical]

5 Transport Layer Services
Underlying best-effort network
  – drops messages
  – re-orders messages
  – delivers duplicate copies of a given message
  – delivers messages after an arbitrarily long delay
Common end-to-end services
  – guarantee message delivery
  – deliver messages in the same order they are sent
  – deliver at most one copy of each message
  – allow the receiver to flow control the sender
  – support multiple application processes on each host

6 Transport vs. Application and Network Layer
application layer: application processes and message exchange
network layer: logical communication between hosts
transport layer: logical communication support for app processes
  – relies on, enhances, network layer services
Household analogy: 12 kids sending letters to 12 kids
  – processes = kids
  – app messages = letters in envelopes
  – hosts = houses
  – transport protocol = Ann and Bill
  – network-layer protocol = postal service

7 End to End Issues
Transport services built on top of (potentially) unreliable network service
  – packets can be corrupted or lost
  – packets can be delayed or arrive “out of order”
Do we detect and/or recover errors for apps?
  – Error Control & Reliable Data Transfer
Do we provide “in-order” delivery of packets?
  – Connection Management & Reliable Data Transfer
Potentially different capacity at destination, and potentially different network capacity
  – Flow and Congestion Control

8 Internet Transport Protocols
TCP service:
  – connection-oriented: setup required between client, server
  – reliable transport between sender and receiver
  – flow control: sender won’t overwhelm receiver
  – congestion control: throttle sender when network overloaded
UDP service:
  – unreliable data transfer between sender and receiver
  – does not provide: connection setup, reliability, flow control, congestion control
Both provide logical communication between app processes running on different hosts!

9 Multiplexing/Demultiplexing
Demultiplexing at rcv host: delivering received segments to correct application process
Multiplexing at send host: gathering data from multiple app processes, enveloping data with header (later used for demultiplexing)
[Figure: hosts 1, 2, and 3, each with an application / transport / network / link / physical stack; processes P1 to P4 attach to the transport layer through sockets (the API)]

10 How Demultiplexing Works
Host receives IP datagrams
  – each datagram has source IP address, destination IP address
  – each datagram carries 1 transport-layer segment
  – each segment has source, destination port number (recall: well-known port numbers for specific applications)
Host uses IP addresses & port numbers to direct segment to appropriate app process (identified by “socket”)
TCP/UDP segment format (32 bits wide):
  source port # | dest port #
  other header fields
  application data (message)
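The lookup described above can be sketched as a pair of tables. This is only an illustration: the table names, addresses, and process labels are hypothetical, and it assumes the common convention that a connectionless (UDP-style) socket is found by destination port while a connection-oriented (TCP-style) socket is found by the full 4-tuple.

```python
from typing import Dict, Optional, Tuple

# (src_ip, src_port, dst_ip, dst_port); hypothetical example values below.
FourTuple = Tuple[str, int, str, int]

# Assumption: UDP-style sockets are looked up by destination port only.
udp_sockets: Dict[int, str] = {53: "dns-server"}

# Assumption: TCP-style connections are looked up by the whole 4-tuple,
# so two clients using the same source port still reach different sockets.
tcp_connections: Dict[FourTuple, str] = {
    ("10.0.0.1", 1567, "10.0.0.9", 80): "web-worker-1",
    ("10.0.0.2", 1567, "10.0.0.9", 80): "web-worker-2",
}


def demux(proto: str, src_ip: str, src_port: int,
          dst_ip: str, dst_port: int) -> Optional[str]:
    """Return the process that should get the segment, or None if no socket matches."""
    if proto == "udp":
        return udp_sockets.get(dst_port)
    return tcp_connections.get((src_ip, src_port, dst_ip, dst_port))
```

Two TCP segments that differ only in source IP address land on different sockets, while any UDP datagram addressed to port 53 reaches the same one.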

11 UDP: User Datagram Protocol [RFC 768]
“No frills,” “bare bones” Internet transport protocol
“Best effort” service, UDP segments may be:
  – lost
  – delivered out of order to app
Connectionless:
  – no handshaking between UDP sender, receiver
  – each UDP segment handled independently of others
Why is there a UDP?
  – no connection establishment (which can add delay)
  – simple: no connection state at sender, receiver
  – small segment header
  – no congestion control: UDP can blast away as fast as desired

12 UDP (cont’d)
Often used for streaming multimedia apps
  – loss tolerant
  – rate sensitive
Other UDP uses
  – DNS
  – SNMP
Reliable transfer over UDP: add reliability at application layer
  – application-specific error recovery!
UDP segment format (32 bits wide):
  source port # | dest port #
  length | checksum
  application data (message)
Length: in bytes, of UDP segment, including header

13 UDP Checksum
Goal: detect “errors” (e.g., flipped bits) in transmitted segment
Sender:
  – treat segment contents as sequence of 16-bit integers
  – checksum: addition (1’s complement sum) of segment contents
  – sender puts checksum value (1’s complement of the 1’s complement sum of 16-bit words) into UDP checksum field
Receiver:
  – compute checksum of received segment
  – check if computed checksum equals checksum field value:
      NO: error detected
      YES: no error detected. But maybe errors nonetheless? More later ...

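14 Checksum: Example
Arrange data segment in sequences of 16-bit words, add them to get the sum, take the 1’s complement of the sum as the checksum, and verify by adding.
[Figure: worked binary example showing the sum, the checksum (1’s complement), and the verification by adding]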
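The checksum procedure can be sketched in a few lines. This is a minimal illustration, not the full UDP computation: it covers only the bytes passed in (real UDP also sums a pseudo header), and the function names are made up for this example.

```python
def ones_complement_sum16(data: bytes) -> int:
    """1's complement sum of the data taken as big-endian 16-bit words."""
    if len(data) % 2:
        data += b"\x00"  # pad odd-length data with a zero byte
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]
        total = (total & 0xFFFF) + (total >> 16)  # wrap the carry around
    return total


def internet_checksum(data: bytes) -> int:
    """Sender side: 1's complement of the 1's complement sum."""
    return ~ones_complement_sum16(data) & 0xFFFF


def verify(data: bytes, checksum: int) -> bool:
    """Receiver side: words plus checksum must add up to all ones."""
    total = ones_complement_sum16(data) + checksum
    total = (total & 0xFFFF) + (total >> 16)
    return total == 0xFFFF


segment = bytes.fromhex("e666d555")     # two 16-bit words chosen for illustration
checksum = internet_checksum(segment)   # 0x4443
corrupted = bytes([segment[0] ^ 0x01]) + segment[1:]
```

Here `verify(segment, checksum)` holds, while the single flipped bit in `corrupted` makes verification fail.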

15 TCP Overview
Connection-oriented
Byte-stream
  – app writes bytes
  – TCP sends segments
  – app reads bytes
Full duplex
Flow control: keep sender from overrunning receiver
Congestion control: keep sender from overrunning network
[Figure: an application process writes bytes into the TCP send buffer; TCP transmits segments; the receiving TCP places them in its receive buffer, from which the application process reads bytes]

16 Functionality Split
Network provides best-effort delivery
End-systems implement many functions
  – Reliability
  – In-order delivery
  – Demultiplexing
  – Message boundaries
  – Connection abstraction
  – Flow control
  – Congestion control
  – …

17 High-Level TCP Characteristics
Protocol implemented entirely at the ends
  – Fate sharing
Protocol has evolved over time and will continue to do so
  – Nearly impossible to change the header
  – Use options to add information to the header
  – Change processing at endpoints
  – Backward compatibility is what makes it TCP

18 Evolution of TCP
  – TCP described by Vint Cerf and Bob Kahn in IEEE Trans Comm
  – 1975: Three-way handshake, Raymond Tomlinson, in SIGCOMM 75
  – TCP & IP: RFC 793
  – 1983: BSD Unix 4.2 supports TCP/IP
  – 1984: Nagle’s algorithm to reduce overhead of small packets; predicts congestion collapse
  – 1986: Congestion collapse observed
  – 1987: Karn’s algorithm to better estimate round-trip time
  – 1988: Van Jacobson’s algorithms: congestion avoidance and congestion control (most implemented in 4.3BSD Tahoe)
  – BSD Reno: fast retransmit, delayed ACKs

19 TCP Through the 1990s
  – ECN (Floyd): Explicit Congestion Notification
  – 1993: TCP Vegas (Brakmo et al): real congestion avoidance
  – 1994: T/TCP (Braden): Transaction TCP
  – 1996: SACK TCP (Floyd et al): Selective Acknowledgement
  – 1996: Hoe: improving TCP startup
  – 1996: FACK TCP (Mathis et al): extension to SACK

20 TCP Segment Header Structure
Header layout (32 bits wide):
  source port # | dest port #
  sequence number
  acknowledgement number
  head len | not used | U A P R S F | rcvr window size
  checksum | ptr urgent data
  Options (variable length)
  application data (variable length)
Field notes:
  – sequence number, acknowledgement number: counting by bytes of data (not segments!)
  – URG: urgent data (generally not used)
  – ACK: ACK # valid
  – PSH: push data now (generally not used)
  – RST, SYN, FIN: connection estab (setup, teardown commands)
  – rcvr window size: # bytes rcvr willing to accept
  – checksum: Internet checksum (as in UDP)

21 TCP Segment Format (cont)
Each connection identified with 4-tuple:
  – (SrcPort, SrcIPAddr, DstPort, DstIPAddr)
Sliding window + flow control
  – Acknowledgment, SequenceNum, AdvertisedWindow
Flags
  – SYN, FIN, ACK, RESET, PUSH, URG
Checksum
  – pseudo header (src & dst IP addresses) + TCP header + data
[Figure: sender sends Data(SequenceNum) to receiver; receiver returns Acknowledgment + AdvertisedWindow]

22 TCP Connection Set Up
TCP sender, receiver establish “connection” before exchanging data segments
Initialize TCP variables:
  – seq. #
  – buffers, flow control info
Client: end host that initiates connection
Server: end host contacted by client
Three way handshake:
  Step 1: client sends TCP SYN control segment to server
    – specifies initial seq #
  Step 2: server receives SYN, replies with SYN+ACK control segment
    – ACKs received SYN
    – specifies server-to-receiver initial seq. #
  Step 3: client receives SYN+ACK, replies with ACK segment (which may contain 1st data segment)

23 TCP 3-Way Hand-Shake
  1. client -> server: SYN, seq=x (client initiates connection)
  2. server -> client: SYN+ACK, seq=y, ack=x+1 (server has received SYN)
  3. client -> server: ACK, seq=x+1, ack=y+1 (may carry 1st data segment); connection established at client, then at server
Question:
  a. What kind of “state” do client and server need to maintain?
  b. What initial sequence # should client (and server) use?

24 TCP Connection Setup Example
No. Time Source > Destination Proto SrcPort>DstPort [Flags]
TCP 1414 > 22 [SYN] Seq= Len=0 MSS=
TCP 22 > 1414 [SYN, ACK] Seq= Ack= Win=25200 Len=0 MSS=
TCP 1414 > 22 [ACK] Seq= Ack= Win=16384 Len=0

25 TCP Connection Setup Example
No. Time Source > Destination Proto SrcPort>DstPort [Flags]
TCP 1567 > 80 [SYN] Seq= Len=0 MSS=
TCP 80 > 1567 [SYN, ACK] Seq= Ack= Win=25200 Len=0 MSS=
TCP 1567 > 80 [ACK] Seq= Ack= Win=17640 Len=
TCP 1567 > 80 [PSH,ACK] Seq= Ack= Win=17640 Len=
TCP 80 > 1567 [ACK] Seq= Ack= Win=25200 Len=0 MSS=1460

26 3-Way Handshake: Finite State Machine
Client FSM? What info (“state”) is maintained at client?
Server FSM?
[Figure: partial client FSM. closed -> (upper layer: initiate connection / send SYN w/ initial seq = x) -> SYN sent -> (SYN+ACK received / send ACK) -> conn estab’ed; the remaining states and transitions are left as “?”]

27 Connection Setup Error Scenarios
Lost (control) packets
  – What happens if SYN is lost? client vs. server actions
  – What happens if SYN+ACK is lost? client vs. server actions
  – What happens if ACK is lost? client vs. server actions
Duplicate (control) packets
  – What does server do if duplicate SYN received?
  – What does client do if duplicate SYN+ACK received?
  – What does server do if duplicate ACK received?

28 Connection Setup Error Scenarios (cont’d)
Importance of (unique) initial seq. no.?
  – When receiving SYN, how does server know it’s a new connection request?
  – When receiving SYN+ACK, how does client know it’s legitimate, i.e., a response to its SYN request?
Dealing with old duplicate packets from old connections (or from malicious users)
  – If not careful: “TCP Hijacking”
How to choose unique initial seq. no.?
  – randomly choose a number (and add to last syn# used)
Other security concern:
  – “SYN Flood”: denial-of-service attack

29 Detecting Half-Open Connections
      TCP A           TCP B
  1.  (CRASH)         (send 300, receive 100)
  2.  CLOSED          ESTABLISHED
  3.  SYN-SENT   ->   (??)
  4.  (!!)       <-   ESTABLISHED
  5.  SYN-SENT   ->   (Abort!!)
  6.  SYN-SENT        CLOSED
  7.  SYN-SENT   ->

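30 TCP State Diagram: Connection Setup
Transitions (event / action):
  – CLOSED -> LISTEN: passive OPEN / create TCB (server)
  – CLOSED -> SYN SENT: active OPEN / create TCB, snd SYN (client)
  – LISTEN -> SYN RCVD: rcv SYN / snd SYN, ACK
  – LISTEN -> SYN SENT: SEND / snd SYN
  – LISTEN -> CLOSED: CLOSE / delete TCB
  – SYN SENT -> CLOSED: CLOSE / delete TCB
  – SYN SENT -> SYN RCVD: rcv SYN / snd ACK
  – SYN SENT -> ESTAB: rcv SYN, ACK / snd ACK
  – SYN RCVD -> ESTAB: rcv ACK of SYN
  – SYN RCVD, on CLOSE: send FIN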

31 TCP: Closing Connection
Remember: a TCP connection is duplex!
Client wants to close connection:
  Step 1: client end system sends TCP FIN control segment to server.
  Step 2: server receives FIN, replies with ACK. Half closed.
  Step 3: client receives ACK. Half closed; wait for server to close.
Server finishes sending data, also ready to close:
  Step 4: server sends FIN.
[Figure: client/server timeline: FIN (client closing), ACK (half closed), FIN (server closing)]

32 TCP: Closing Connection (cont’d)
  Step 5: client receives FIN, replies with ACK. Connection fully closed.
  Step 6: server receives ACK. Connection fully closed.
Problem solved? Well done!
[Figure: client/server timeline: FIN (client closing), ACK (half closed), FIN (server closing), ACK (fully closed on both sides)]

33 TCP: Closing Connection (revised)
Two Army Problem!
  Step 5: client receives FIN, replies with ACK.
    – Enters “timed wait”: will respond with ACK to received FINs
  Step 6: server receives ACK. Connection fully closed.
  Step 7: client timer expires. Connection fully closed.
[Figure: timeline in which the client’s final ACK is lost (X), the server times out and resends FIN, and the client, still in timed wait, sends ACK again]

34 TCP Connection Tear-Down Example
No. Time Source > Destination Proto SrcPort>DstPort [Flags]
TCP 1414 > 22 [PSH,ACK] Seq= Ack= Win=15920 Len=
TCP 1414 > 22 [FIN, ACK] Seq= Ack= Win=15920 Len=
TCP 22 > 1414 [ACK] Seq= Ack= Win=25200 Len=
TCP 22 > 1414 [ACK] Seq= Ack= Win=25200 Len=
TCP 22 > 1414 [FIN,ACK] Seq= Ack= Win=25200 Len=
TCP 1414 > 22 [ACK] Seq= Ack= Win=15920 Len=0

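35 State Diagram: Connection Tear-down
Active close (event / action):
  – ESTAB -> FIN WAIT-1: CLOSE / send FIN
  – FIN WAIT-1 -> FIN WAIT-2: rcv ACK of FIN
  – FIN WAIT-1 -> CLOSING: rcv FIN / snd ACK
  – FIN WAIT-1 -> TIME WAIT: rcv FIN+ACK / snd ACK
  – FIN WAIT-2 -> TIME WAIT: rcv FIN / snd ACK
  – CLOSING -> TIME WAIT: rcv ACK of FIN
  – TIME WAIT -> CLOSED: Timeout=2min / delete TCB
Passive close (event / action):
  – ESTAB -> CLOSE WAIT: rcv FIN / send ACK
  – CLOSE WAIT -> LAST-ACK: CLOSE / send FIN
  – LAST-ACK -> CLOSED: rcv ACK of FIN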

36 TCP Connection Management FSM
[Figure: TCP client lifecycle]

37 TCP Connection Management FSM
[Figure: TCP server lifecycle]

38 Reliability and Error Recovery
ARQ vs. FEC
  – ARQ: automatic retransmission request
  – FEC: forward error correction
General ARQ algorithms
  – Stop & Wait
      Performance issue: low utilization when delay-bandwidth product is large
  – Sliding Window Protocols
      Go-Back-N
      Selective Repeat
      Key design issue: window size vs. size of seq. no. space

39 Error Recovery: Stop and Wait
ARQ
  – Receiver sends acknowledgement (ACK) when it receives packet
  – Sender waits for ACK and times out if it does not arrive within some time period
Simplest ARQ protocol
Send a packet, stop and wait until ACK arrives
[Figure: sender/receiver timeline showing Packet, ACK, and the sender’s Timeout interval]

40 Recovering from Error
[Figure: three sender/receiver timelines, each with Packet, ACK, and Timeout: (a) ACK lost, (b) packet lost, (c) early timeout. The ACK-lost and early-timeout cases produce DUPLICATE PACKETS!!!]

41 Problems with Stop and Wait
How to recognize a duplicate
Performance
  – Can only send one packet per round trip

42 How to Recognize Resends?
Use sequence numbers
  – on both packets and acks
Sequence # in packet is finite -> How big should it be?
  – For stop and wait? One bit: won’t send seq #1 until received ACK for seq #0
[Figure: packet/ACK exchange: Pkt 0, ACK 0, Pkt 0, ACK 1, Pkt 1, ACK 0]
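A one-bit sequence number is enough for the receiver to tell a resend from a new packet. The sketch below is an illustration with invented names; it assumes the convention that each ACK carries the sequence number of the packet just received, which may differ from the convention drawn in the slide’s figure.

```python
class StopAndWaitReceiver:
    """Alternating-bit receiver: delivers each packet once, re-ACKs duplicates."""

    def __init__(self) -> None:
        self.expected = 0          # sequence bit of the next new packet
        self.delivered = []        # payloads handed to the application

    def receive(self, seq: int, payload: str) -> int:
        """Process one packet and return the ACK number to send back."""
        if seq == self.expected:
            self.delivered.append(payload)   # new data: deliver it
            self.expected ^= 1               # flip the expected bit
        # A duplicate (seq != expected) is not delivered again, but it is
        # still ACKed so that the sender can stop retransmitting it.
        return seq


rx = StopAndWaitReceiver()
acks = [rx.receive(0, "a"), rx.receive(0, "a"), rx.receive(1, "b")]
```

After the three calls, `rx.delivered` holds each payload exactly once even though packet 0 arrived twice.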

43 Problem with Stop & Wait Protocol
Can’t keep the pipe full
  – Utilization is low when bandwidth-delay product (R x RTT) is large!
[Figure: sender/receiver timeline. First packet bit transmitted at t = 0 (data, L bytes); first packet bit arrives; receiver sends ACK; ACK arrives and sender sends next packet at t = RTT + L / R]

44 Stop & Wait: Performance Analysis
Example: 1 Gbps connection, 15 ms end-end prop. delay, data segment size: 1 KB = 8 Kb
  – Transmission time: L / R = 8 Kb / 1 Gbps = 0.008 ms
  – U_sender: utilization, i.e., fraction of time sender busy sending
      U_sender = (L / R) / (RTT + L / R) = 0.008 / 30.008 ≈ 0.00027 (0.027%)
  – 1 KB data segment every 30 msec (round trip time) --> 0.027% x 1 Gbps ≈ 33 kB/sec throughput over 1 Gbps link
Moral of story: network protocol limits use of physical resources!
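The numbers in this example can be reproduced with a short calculation. The function name is made up for this sketch; the inputs are the ones stated on the slide (1 Gbps link, 30 ms round trip, 8000-bit segment).

```python
def stop_and_wait_utilization(link_bps: float, rtt_s: float, packet_bits: float) -> float:
    """Fraction of time the sender is busy: (L/R) / (RTT + L/R)."""
    transmit_s = packet_bits / link_bps
    return transmit_s / (rtt_s + transmit_s)


LINK_BPS = 1e9          # 1 Gbps
RTT_S = 0.030           # 2 x 15 ms propagation delay
PACKET_BITS = 8000      # 1 KB segment, counted as 8 Kb on the slide

utilization = stop_and_wait_utilization(LINK_BPS, RTT_S, PACKET_BITS)
throughput_bytes_per_s = utilization * LINK_BPS / 8

print(f"utilization = {utilization:.5%}")
print(f"throughput  = {throughput_bytes_per_s / 1000:.1f} kB/s")
```

The utilization comes out near 0.027% and the throughput near 33 kB/s, matching the figures quoted above.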

45 How to Keep the Pipe Full?
Send multiple packets without waiting for first to be acked
  – Number of pkts in flight = window
Reliable, unordered delivery
  – Several parallel stop & waits
  – Send new packet after each ack
  – Sender keeps list of unack’ed packets; resends after timeout
  – Receiver same as stop & wait
How large a window is needed?
  – Suppose 10 Mbps link, 4 ms delay, 500 byte pkts: 1? 10? 20?
  – Round trip delay * bandwidth = capacity of pipe
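The window question can be checked numerically. The helper below is a hypothetical sketch using integer arithmetic; the slide does not say whether the 4 ms is a one-way or a round-trip delay, so both readings are computed.

```python
def packets_to_fill_pipe(link_bps: int, rtt_ms: int, packet_bytes: int) -> int:
    """Packets in flight needed to cover round-trip delay x bandwidth."""
    pipe_bits = link_bps * rtt_ms // 1000        # capacity of the pipe in bits
    packet_bits = packet_bytes * 8
    return -(-pipe_bits // packet_bits)          # integer ceiling division


# Assumption A: 4 ms is the one-way delay, so the round trip is 8 ms.
window_if_one_way = packets_to_fill_pipe(10_000_000, 8, 500)

# Assumption B: 4 ms is already the round-trip delay.
window_if_round_trip = packets_to_fill_pipe(10_000_000, 4, 500)
```

Under the one-way reading the pipe holds 20 packets; under the round-trip reading it holds 10.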

46 Pipelined (Sliding Window) Protocols
Pipelining: sender allows multiple, “in-flight”, yet-to-be-acknowledged data segments
  – range of sequence numbers must be increased
  – buffering at sender and/or receiver
Two generic forms of pipelined protocols: Go-Back-N and Selective Repeat

47 Pipelining: Increased Utilization
[Figure: sender/receiver timeline with three packets in flight. First packet bit transmitted at t = 0; last bit transmitted at t = L / R; first packet bit arrives; last packet bit arrives, send ACK; last bit of 2nd packet arrives, send ACK; last bit of 3rd packet arrives, send ACK; ACK arrives, send next packet at t = RTT + L / R]
Increase utilization by a factor of 3!

48 Sliding Window
Reliable, ordered delivery
Receiver has to hold onto a packet until all prior packets have arrived
  – Why might this be difficult for just parallel stop & wait?
  – Sender must prevent buffer overflow at receiver
Circular buffer at sender and receiver
  – Packets in transit ≤ buffer size
  – Advance when sender and receiver agree packets at beginning have been received

49 Sender/Receiver State
Sender, in sequence-number order: Sent & Acked | Sent, Not Acked | OK to Send | Not Usable
  – the sender window is bounded by Max ACK received and Next seqnum
Receiver, in sequence-number order: Received & Acked | Acceptable Packet | Not Usable
  – the receiver window is bounded by Next expected and Max acceptable

50 Window Sliding: Common Case
On reception of new ACK (i.e., ACK for something that was not acked earlier)
  – Increase sequence of max ACK received
  – Send next packet
On reception of new in-order data packet (next expected)
  – Hand packet to application
  – Send cumulative ACK: acknowledges reception of all packets up to sequence number
  – Increase sequence of max acceptable packet

51 Loss Recovery
On reception of out-of-order packet
  – Send nothing (wait for source to timeout)
  – Cumulative ACK (helps source identify loss)
Timeout (Go-Back-N recovery)
  – Set timer upon transmission of packet
  – Retransmit all unacknowledged packets
Performance during loss recovery
  – No longer have an entire window in transit
  – Can have much more clever loss recovery

52 Go-Back-N in Action
[Figure: Go-Back-N sender/receiver timeline]

53 Selective Repeat
Receiver individually acknowledges all correctly received pkts
  – Buffers packets, as needed, for eventual in-order delivery to upper layer
Sender only resends packets for which ACK not received
  – Sender timer for each unACKed packet
Sender window
  – N consecutive seq #’s
  – Again limits seq #s of sent, unACKed packets

54 Selective Repeat: Sender, Receiver Windows
[Figure: sender and receiver views of the sequence-number space]

55 Sequence Numbers
How large does size of sequence number space need to be?
  – Must be able to detect wrap-around
  – Depends on sender/receiver window size
E.g.
  – size of seq. no. space = 8, send win = recv win = 7
  – If pkts 0..6 are sent successfully and all acks lost
      Receiver expects 7, 0..5; sender retransmits old 0..6!!!
Size of sequence no. space must be ≥ send window + recv window
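The wrap-around example can be replayed in code. This is a hypothetical sketch of that one scenario only: a full window is delivered, every ACK is lost, and the sender retransmits the whole window; the function name is invented here.

```python
from typing import List


def old_packets_accepted_as_new(seq_space: int, send_win: int, recv_win: int) -> List[int]:
    """Sequence numbers of retransmitted old packets that the receiver
    would wrongly accept as new data (assumes send_win <= seq_space)."""
    # The receiver got packets 0..send_win-1, so its window has advanced.
    next_expected = send_win % seq_space
    acceptable = {(next_expected + i) % seq_space for i in range(recv_win)}
    # All ACKs were lost, so the sender resends 0..send_win-1.
    retransmitted = set(range(send_win))
    return sorted(acceptable & retransmitted)


ambiguous = old_packets_accepted_as_new(seq_space=8, send_win=7, recv_win=7)
safe = old_packets_accepted_as_new(seq_space=8, send_win=4, recv_win=4)
```

With a space of 8 and windows of 7, old packets 0..5 fall inside the receiver’s new window; with windows of 4 (so that 4 + 4 ≤ 8) no retransmitted packet can be mistaken for new data.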

56 Sequence Numbers in TCP
TCP regards data as a “byte-stream”
  – each byte in byte stream is numbered
      32 bit value, wraps around
      initial values selected at start up time
TCP breaks up byte stream in packets
  – Packet size is limited to the Maximum Segment Size (MSS)
Each packet has a sequence number
  – seq. no of 1st byte indicates where it fits in the byte stream
TCP connection is duplex
  – data in each direction has its own sequence numbers
[Figure: byte stream divided into consecutive packets (packet 8, packet 9, ...)]

57 TCP Seq. #’s and ACKs
Seq. #’s: byte stream “number” of first byte in segment’s data
ACKs: seq # of next byte expected from other side
Simple telnet scenario:
  1. User types ‘C’. Host A -> Host B: Seq=42, ACK=79, data = ‘C’
  2. Host B ACKs receipt of ‘C’, echoes back ‘C’. Host B -> Host A: Seq=79, ACK=43, data = ‘C’
  3. Host A ACKs receipt of echoed ‘C’. Host A -> Host B: Seq=43, ACK=80

58 TCP Reliable Data Transfer
TCP creates reliable data transfer service on top of IP’s unreliable service
  – Pipelined segments
  – Cumulative ACKs
  – TCP uses single retransmission timer
Retransmissions are triggered by:
  – timeout events
  – duplicate acks
Initially consider simplified TCP sender:
  – ignore duplicate acks
  – ignore flow control, congestion control

59 TCP = Go-Back-N Variant
Sliding window with cumulative acks
  – Receiver can only return a single “ack” sequence number to the sender
  – Acknowledges all bytes with a lower sequence number
  – Starting point for retransmission
  – Duplicate acks sent when out-of-order packet received
But: sender only retransmits a single packet.
  – Reason???
      Only one that it knows is lost
      Network is congested -> shouldn’t overload it
Error control is based on byte sequences, not packets.
  – Retransmitted packet can be different from the original lost packet. Why?

60 TCP Sender Events
Data rcvd from app:
  – create segment with seq #
  – seq # is byte-stream number of first data byte in segment
  – start timer if not already running (think of timer as for oldest unacked segment)
  – expiration interval: TimeOutInterval
Timeout:
  – retransmit segment that caused timeout
  – restart timer
ACK received:
  – if it acknowledges previously unACKed segments:
      update what is known to be ACKed
      start timer if there are outstanding segments

61 TCP ACK generation [RFC 1122, RFC 2581]
Event at receiver -> TCP receiver action
  – Arrival of in-order segment with expected seq #; all data up to expected seq # already ACKed -> Delayed ACK: wait up to 500 ms for next segment; if no next segment, send ACK
  – Arrival of in-order segment with expected seq #; one other segment has ACK pending -> Immediately send single cumulative ACK, ACKing both in-order segments
  – Arrival of out-of-order segment with higher-than-expected seq. #; gap detected -> Immediately send duplicate ACK, indicating seq. # of next expected byte
  – Arrival of segment that partially or completely fills gap -> Immediately send ACK, provided that segment starts at lower end of gap

62 TCP Flow Control
Receive side of TCP connection has a receive buffer
  – app process may be slow at reading from buffer
Flow control: sender won’t overflow receiver’s buffer by transmitting too much, too fast
Speed-matching service: matching the send rate to the receiving app’s drain rate

63 TCP Flow Control: How It Works
(Suppose TCP receiver discards out-of-order segments)
Spare room in buffer
  = RcvWindow (dynamically changes)
  = RcvBuffer - [LastByteRcvd - LastByteRead]
Rcvr advertises spare room by including value of RcvWindow in segments
Sender limits unACKed data to RcvWindow
  – guarantees receive buffer doesn’t overflow
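The window arithmetic can be sketched as two small functions. `rcv_window` follows the formula on the slide; `sender_may_send` and its parameters `last_byte_sent` and `last_byte_acked` are names assumed here to express “unACKed data”, they do not appear in the slides.

```python
def rcv_window(rcv_buffer: int, last_byte_rcvd: int, last_byte_read: int) -> int:
    """Spare room in the receive buffer: RcvBuffer - (LastByteRcvd - LastByteRead)."""
    return rcv_buffer - (last_byte_rcvd - last_byte_read)


def sender_may_send(last_byte_sent: int, last_byte_acked: int,
                    advertised_window: int, nbytes: int) -> bool:
    """True if sending nbytes more keeps unACKed data within the advertised window."""
    unacked = last_byte_sent - last_byte_acked
    return unacked + nbytes <= advertised_window


# Example values chosen for illustration only.
window = rcv_window(rcv_buffer=4096, last_byte_rcvd=3000, last_byte_read=1000)
ok_small = sender_may_send(5000, 4000, window, 1000)
ok_large = sender_may_send(5000, 4000, window, 1200)
```

With 2000 bytes waiting unread in a 4096-byte buffer the receiver advertises 2096 bytes, so a sender with 1000 bytes outstanding may send 1000 more but not 1200.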

64 TCP Segment Structure
Header layout (32 bits wide):
  source port # | dest port #
  sequence number
  acknowledgement number
  head len | not used | U A P R S F | rcvr window size
  checksum | ptr urgent data
  Options (variable length)
  application data (variable length)
Field notes:
  – sequence number, acknowledgement number: counting by bytes of data (not segments!)
  – URG: urgent data (generally not used)
  – ACK: ACK # valid
  – PSH: push data now (generally not used)
  – RST, SYN, FIN: connection estab (setup, teardown commands)
  – rcvr window size: # bytes rcvr willing to accept
  – checksum: Internet checksum (as in UDP)

65 Triggering Transmission
How does TCP decide to transmit a segment?
  – MSS (maximum segment size) worth of data is available
      MSS is set to the size of the largest segment TCP can send without local IP fragmentation (MTU of the directly connected network)
  – Sending process explicitly asks it to do so (push to flush)
  – A timer fires
Silly Window Syndrome
  – Flow control needs to be maintained
  – Sender can transmit full segment (MSS) when ACKed by receiver

66 Silly Window Syndrome (cont’d)
  – Window currently closed by receiver
  – ACK opens MSS/2 bytes
  – Should sender transmit MSS/2?
      Original TCP specification was silent
      Early implementations of TCP decided to go ahead
      Sender cannot know when the window will open for a full MSS
  – If the sender is aggressive and sends whatever window is available, silly window syndrome results: the small segment size remains indefinitely
  – Hence a problem when either the sender transmits a small segment or the receiver opens the window a small amount

67 Triggering Transmission (cont’d)
  – Receiver may delay ACKs, but how long?
  – Ultimate solution lies with sender: when does the TCP sender decide to transmit a segment?
Nagle’s Algorithm:
  – Waiting too long hurts interactive applications (Telnet)
  – Without waiting, risk of sending a bunch of tiny packets (silly window syndrome)
  – Rather than waiting until a timer expires, rely on self-clocking: as long as TCP has any data in flight, the sender receives an ACK, which can be used to trigger transmission
  – If no data is in flight, immediately send the segment
  – Applications can turn the algorithm off by setting the TCP_NODELAY option

68 TCP Round Trip Time and Timeout
Q: how to set TCP timeout value?
  – longer than RTT, but RTT varies
  – too short: premature timeout, unnecessary retransmissions
  – too long: slow reaction to segment loss
Q: how to estimate RTT?
  – SampleRTT: measured time from segment transmission until ACK receipt
      ignore retransmissions, why?
  – SampleRTT will vary, want estimated RTT “smoother”
      average several recent measurements, not just current SampleRTT

69 Round-trip Time Estimation
Wait at least one RTT before retransmitting
Importance of accurate RTT estimators:
  – Low RTT estimate: unneeded retransmissions
  – High RTT estimate: poor throughput
RTT estimator must adapt to change in RTT
  – But not too fast, or too slow!
Spurious timeouts
  – “Conservation of packets” principle: never more than a window’s worth of packets in flight

70 Adaptive Retransmission (Original Algorithm)
Measure SampleRTT for each segment/ACK pair
Compute weighted running average of RTT
  – EstRTT = α x EstRTT + (1 - α) x SampleRTT
  – α between 0.8 and 0.9 (to smooth EstRTT)
  – A small α follows temporary fluctuations; a large α is more stable but may not be quick to adapt to real changes
Set timeout based on EstRTT
  – TimeOut = 2 x EstRTT

71 Retransmission Ambiguity
ACK is taken to be for the original transmission but was for the retransmission => SampleRTT is too large
ACK is taken to be for the retransmission but was for the original => SampleRTT is too small
[Figure: two sender/receiver timelines, each showing the original transmission, the retransmission, the ACK, and the resulting SampleRTT]

72 Karn/Partridge Algorithm
Solution: do not sample RTT when retransmitting
  – only measure SampleRTT for segments sent once
Double timeout for each retransmission
  – Next timeout is twice the last timeout, rather than being based on the last EstimatedRTT
Karn and Partridge proposal is exponential backoff
  – Congestion is most likely cause of lost segments
  – TCP sources should not react too aggressively to a timeout
  – The more timeouts, the more cautious the source should become (congestion problem)
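The two rules above (skip RTT samples for retransmitted segments, double the timeout on each retransmission) can be sketched as follows. The class and method names are hypothetical, and the sketch deliberately leaves out how the timeout is recomputed from fresh samples.

```python
from typing import List


class KarnTimer:
    """Tracks the retransmission timeout and which RTT samples are usable."""

    def __init__(self, initial_timeout: float) -> None:
        self.timeout = initial_timeout
        self.samples: List[float] = []   # samples from segments sent exactly once

    def on_timeout(self) -> None:
        """Segment timed out and is retransmitted: back off exponentially."""
        self.timeout *= 2

    def on_ack(self, measured_rtt: float, was_retransmitted: bool) -> None:
        """Record an RTT sample only if the ACKed segment was sent once."""
        if not was_retransmitted:
            self.samples.append(measured_rtt)


timer = KarnTimer(initial_timeout=1.0)
timer.on_timeout()                                        # timeout is now 2.0
timer.on_timeout()                                        # timeout is now 4.0
timer.on_ack(measured_rtt=0.9, was_retransmitted=True)    # ambiguous: ignored
timer.on_ack(measured_rtt=0.2, was_retransmitted=False)   # clean sample: kept
```

The ambiguous sample from the retransmitted segment never reaches the estimator, and two consecutive timeouts have quadrupled the timeout.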

73 Jacobson/Karels Algorithm
Original computation for RTT did not take the variance of sample RTTs into account
  – If variation among samples is small, EstRTT can be trusted more, with no need to double the estimate
  – A large variance in the samples means timeout values should not be too tightly coupled to EstRTT
New calculations for average RTT
  – Diff = SampleRTT - EstRTT
  – EstRTT = EstRTT + (δ x Diff)
  – Dev = Dev + δ x (|Diff| - Dev), where δ is a fraction between 0 and 1
Consider variance when setting timeout value
  – TimeOut = μ x EstRTT + φ x Dev, where μ = 1 and φ = 4

74 TCP Round Trip Time Estimation
EstimatedRTT = (1 - α) * EstimatedRTT + α * SampleRTT
  – exponential weighted moving average
  – influence of past sample decreases exponentially fast
  – typical value: α =
Setting the timeout interval
  – EstimatedRTT plus “safety margin”
  – large variation in EstimatedRTT -> larger safety margin
  – “safety margin”: accommodate variations in EstimatedRTT
DevRTT = (1 - β) * DevRTT + β * |SampleRTT - EstimatedRTT| (typically, β = 0.25)
TimeoutInterval = EstimatedRTT + 4 * DevRTT
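The three formulas can be combined into one update step. Two things here are assumptions rather than content of the slides: α = 0.125 is the value recommended in RFC 6298 (the slide’s own value did not survive), and the deviation is updated using the estimate from before the current sample, which is the order RFC 6298 specifies.

```python
from typing import Tuple

ALPHA = 0.125   # assumption: RFC 6298 recommended gain for the RTT estimate
BETA = 0.25     # typical value quoted on the slide


def update_rtt(estimated_rtt: float, dev_rtt: float,
               sample_rtt: float) -> Tuple[float, float, float]:
    """Return (new EstimatedRTT, new DevRTT, TimeoutInterval) after one sample."""
    # Deviation first, against the previous estimate (assumed ordering).
    dev_rtt = (1 - BETA) * dev_rtt + BETA * abs(sample_rtt - estimated_rtt)
    estimated_rtt = (1 - ALPHA) * estimated_rtt + ALPHA * sample_rtt
    timeout_interval = estimated_rtt + 4 * dev_rtt
    return estimated_rtt, dev_rtt, timeout_interval


# One slow sample (140 ms) against an estimate of 100 ms and deviation of 10 ms.
est, dev, timeout = update_rtt(estimated_rtt=100.0, dev_rtt=10.0, sample_rtt=140.0)
```

A single slow sample moves the estimate only slightly (100 to 105 ms) but widens the safety margin noticeably, raising the timeout to 175 ms.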

75 Example RTT Estimation
[Figure: example RTT estimation plot]

76 Timestamp Extension
Used to improve timeout mechanism by more accurate measurement of RTT
When sending a packet, insert current time into option
  – 4 bytes for time, 4 bytes for echoing a received timestamp
Receiver echoes timestamp in ACK
  – Actually will echo whatever is in timestamp
Removes retransmission ambiguity
  – Can get RTT sample on any packet

77 Timer Granularity
Many TCP implementations set RTO (Retransmission Timeout) in multiples of 200, 500, or 1000 ms
Why?
  – Avoid spurious timeouts: RTTs can vary quickly due to cross traffic
  – Make timer interrupts efficient
What happens for the first couple of packets?
  – Pick a very conservative value (seconds)

78 Important Lessons
TCP state diagram -> setup/teardown
TCP timeout calculation -> how is RTT estimated
Modern TCP loss recovery
  – Why are timeouts bad?
  – How to avoid them? -> e.g. fast retransmit

79 Fast Retransmit
What are duplicate acks (dupacks)?
  – Repeated acks for the same sequence
When can duplicate acks occur?
  – Loss
  – Packet re-ordering
  – Window update: advertisement of new flow control window
Assume re-ordering is infrequent and not of large magnitude
  – Use receipt of 3 or more duplicate acks as indication of loss
  – Don’t wait for timeout to retransmit packet
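The duplicate-ack rule can be sketched as a small counter on the sender side. The class below is a hypothetical illustration: it ignores sequence-number wrap-around and window updates, and it records what would be retransmitted instead of sending anything.

```python
from typing import List


class FastRetransmitSender:
    """Counts duplicate ACKs and flags a retransmission on the third one."""

    DUP_THRESHOLD = 3

    def __init__(self) -> None:
        self.last_ack = 0
        self.dup_count = 0
        self.retransmitted: List[int] = []   # seq #s chosen for fast retransmit

    def on_ack(self, ack: int) -> None:
        if ack > self.last_ack:
            # New data acknowledged: slide forward and reset the counter.
            self.last_ack = ack
            self.dup_count = 0
        elif ack == self.last_ack:
            self.dup_count += 1
            if self.dup_count == self.DUP_THRESHOLD:
                # Third dupack: treat as loss, resend without waiting for timeout.
                self.retransmitted.append(ack)


sender = FastRetransmitSender()
for ack in [1000, 1000, 1000, 1000, 1000, 3000]:
    sender.on_ack(ack)
```

The first ACK for 1000 is new; the next three are duplicates and trigger one retransmission of the segment starting at byte 1000; the fourth duplicate does not trigger another.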

80 Fast Retransmit
[Figure: sequence no. vs. time plot of packets and acks; one packet is lost (X), duplicate acks follow, and the retransmission is sent without waiting for a timeout]

81 TCP (Reno variant)
[Figure: sequence no. vs. time plot of packets and acks with four losses (X) in one window]
Now what? Timeout

82 SACK
Basic problem is that cumulative acks provide little information
Selective acknowledgement (SACK) essentially adds a bitmask of packets received
  – Implemented as a TCP option
  – Encoded as a set of received byte ranges (max of 4 ranges/often max of 3)
When to retransmit?
  – Still need to deal with reordering -> wait for out of order by 3 pkts
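The “set of received byte ranges” can be sketched as a merge of the out-of-order segments a receiver is holding. This is a simplified illustration with an invented function name: ranges are half-open (start, end) pairs, blocks are reported in ascending order, and the rules real TCP uses for ordering blocks are not modelled.

```python
from typing import List, Tuple

ByteRange = Tuple[int, int]   # half-open: bytes start .. end-1


def sack_blocks(received: List[ByteRange], max_blocks: int = 3) -> List[ByteRange]:
    """Merge out-of-order segments into contiguous blocks, keeping at most max_blocks."""
    merged: List[List[int]] = []
    for start, end in sorted(received):
        if merged and start <= merged[-1][1]:
            # Overlaps or touches the previous block: extend it.
            merged[-1][1] = max(merged[-1][1], end)
        else:
            merged.append([start, end])
    return [(block[0], block[1]) for block in merged[:max_blocks]]


# Segments held above a hole; byte values chosen for illustration.
blocks = sack_blocks([(3000, 4000), (1000, 2000), (2000, 2500), (6000, 7000)])
```

The two adjacent segments collapse into one block, so the receiver reports three ranges and the sender can see exactly which holes remain.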

83 SACK
[Figure: sequence no. vs. time plot of packets and acks with four losses (X) in one window]
Now what? Send retransmissions as soon as detected
