Transport Layer 3-2 TCP: Overview RFCs: 793, 1122, 1323, 2018, 2581 r full duplex data: m bi-directional data flow in same connection r connection-oriented: m handshaking (exchange of control msgs) r flow control: m sender will not overwhelm receiver r congestion control r point-to-point: m one sender, one receiver r reliable, in-order byte stream: m no “message boundaries” r pipelined: m TCP congestion and flow control set window size r send & receive buffers
Transport Layer 3-3 TCP segment structure source port # dest port # 32 bits application data (variable length) sequence number acknowledgement number Receive window Urg data pnter checksum F SR PAU head len not used Options (variable length) URG: urgent data (generally not used) ACK: ACK # valid PSH: push data up now (generally not used) RST, SYN, FIN: connection estab (setup, teardown commands) # bytes rcvr willing to accept counting by bytes of data (not segments!) Internet checksum (as in UDP)
Transport Layer 3-4 Data = 500,000 bytes MSS = 1000 bytes
TCP seq. #’s and ACKs Transport Layer 3-5 Seq. #’s: m byte stream “number” of first byte in segment’s data ACKs: m seq # of next byte expected from other side m cumulative ACK(as in GBN) m Buffers at receiver can hold out-of-order bytes, hence when ACK for “n” is not received, retransmit only “n” and not n+1 and all others.(as in SR) m Piggybacked ACKS
Transport Layer 3-6 TCP Round Trip Time and Timeout Q: how to set TCP timeout value? r longer than RTT m but RTT varies r too short: premature timeout m unnecessary retransmissions r too long: slow reaction to segment loss Q: how to estimate RTT? SampleRTT : measured time from segment transmission until ACK receipt m ignore retransmissions SampleRTT will vary, want estimated RTT “smoother” average several recent measurements, not just current SampleRTT
Transport Layer 3-7 TCP Round Trip Time and Timeout EstimatedRTT = (1- )*EstimatedRTT + *SampleRTT r Exponential weighted moving average r influence of past sample decreases exponentially fast typical value: = 0.125
Transport Layer 3-9 TCP Round Trip Time and Timeout Setting the timeout EstimatedRTT plus “safety margin” large variation in EstimatedRTT -> larger safety margin 1. Estimate how much SampleRTT deviates from EstimatedRTT: TimeoutInterval = EstimatedRTT + 4*DevRTT DevRTT = (1- )*DevRTT + *|SampleRTT-EstimatedRTT| (typically, = 0.25) 2. Then set timeout interval:
Transport Layer 3-10 TCP reliable data transfer r TCP creates rdt service on top of IP’s unreliable service r Pipelined segments r Cumulative acks r TCP uses single retransmission timer r Retransmissions are triggered by: m timeout events m duplicate acks r Initially consider simplified TCP sender: m ignore duplicate acks m ignore flow control, congestion control
Transport Layer 3-11 TCP sender events: data rcvd from app: r Create segment with seq # r start timer if not already running (think of timer as for oldest unacked segment) expiration interval: TimeOutInterval (as calculated earlier) timeout: r retransmit segment that caused timeout r restart timer Ack rcvd: r If acknowledges previously unacked segments m update what is known to be acked For example, if ACK=150, then assume that segments with sequence #s 0, 50 and100 have been received at the receiver. m restart timer if there are outstanding segments
Transport Layer 3-12 TCP: retransmission scenarios Host A Seq=100, 20 bytes data ACK=100 time premature timeout Host B Seq=92, 8 bytes data ACK=120 Seq=92, 8 bytes data Seq=92 timeout ACK=120 Host A Seq=92, 8 bytes data ACK=100 loss timeout lost ACK scenario Host B X Seq=92, 8 bytes data ACK=100 time Seq=92 timeout SendBase = 100 SendBase = 120 SendBase = 120 Sendbase = 100
Transport Layer 3-13 TCP retransmission scenarios (more) Host A Seq=92, 8 bytes data ACK=100 loss timeout Cumulative ACK scenario Host B X Seq=100, 20 bytes data ACK=120 time SendBase = 120
Transport Layer 3-14 Doubling the timeout interval r Timeout interval for a segment is doubled every time it has to be re-transmitted. r This is a limited form of congestion control m If the segments are being dropped or are taking longer to get through the network, then TCP eases the congestion in the network, by delaying the retransmission of segments. m Similar to CSMA/CD in Ethernet.
Transport Layer 3-15 Fast Retransmit r Time-out period often relatively long: m long delay before resending lost packet r Detect lost segments via duplicate ACKs. m Sender often sends many segments back-to- back m If segment is lost, there will likely be many duplicate ACKs. r If sender receives 3 ACKs for the same data, it assumes that segment after ACKed data was lost: m fast retransmit
Transport Layer 3-16 TCP Flow Control r receive side of TCP connection has a receive buffer: r speed-matching service: matching the send rate to the receiving app’s drain rate r app process may be slow at reading from buffer sender won’t overflow receiver’s buffer by transmitting too much, too fast flow control
Transport Layer 3-17 TCP Flow control: how it works (Suppose TCP receiver discards out-of-order segments) spare room in buffer RcvWindow = RcvBuffer-[LastByteRcvd - LastByteRead] e.g. RCVWindow = 1000 – [150 – 50] Rcvr advertises spare room by including value of RcvWindow in segments Sender limits unACKed data to RcvWindow m guarantees receive buffer doesn’t overflow m Maintain: LastByteSent-LastByteAcked <= RecvWindow
Transport Layer 3-18 TCP Connection Management Recall: TCP sender, receiver establish “connection” before exchanging data segments r initialize TCP variables: m seq. #s Each host randomly picks an initial value for the sequence number. Following this, sequence number is updated based on number of bytes of data is sent. buffers, flow control info (e.g. RcvWindow ) Three way handshake: Step 1: client host sends TCP SYN segment to server m specifies initial seq # (client_isn) m SYN = 1 m no data Step 2: server host receives SYN, replies with SYNACK segment m server allocates buffers m specifies server initial seq. # (server_seq) m SYN = 1, ACK = client_isn + 1 Step 3: client receives SYNACK, replies with ACK segment, which may contain data. m Client allocates buffers m SYN = 0, ACK = server_isn +1
Transport Layer 3-19 RST and FIN r FIN is the bit set when a client is ready to close the connection. r RST is used by the server to indicate that the port and IP address do not match any sockets on the server.
Transport Layer 3-20 Principles of Congestion Control Congestion: r informally: “too many sources sending too much data too fast for network to handle” r different from flow control!
Transport Layer 3-21 Costs of congestion: scenario 1 r two senders, two receivers r one router, infinite buffers r no retransmission unlimited shared output link buffers Host A in : original data Host B out As the link capacity R is approached, the delay in the system increases.
Transport Layer 3-22 Costs of congestion: scenario 2 r one router, finite buffers, reliable data transfer r sender retransmission of lost packet finite shared output link buffers Host A in : original data Host B out ' in : original data, plus retransmitted data (offered load)
Transport Layer 3-23 Costs of congestion: scenario 2 r Reduced transmission rate m With finite buffers, packets will be dropped and re- transmissions are required m Depending on the length of timeout interval, retransmissions may occur for packets that are delayed (not lost) and hence the transmission rate decreases.
Transport Layer 3-24 Causes/costs of congestion: scenario 3 r four senders r multihop paths r timeout/retransmit finite shared output link buffers Host A Host B Consider what happens when 2 senders compete Host C Host D
Transport Layer 3-25 Costs of congestion: scenario 3 r when packet dropped, any “upstream" transmission capacity used for that packet was wasted!
Transport Layer 3-26 Approaches towards congestion control End-to-end congestion control: r no explicit feedback from network r congestion inferred from end-system observed loss, delay r approach taken by TCP Network-assisted congestion control: r routers provide feedback to end systems m bits indicating congestion m explicit rate sender should send at Two broad approaches towards congestion control: