Reliable Transport II: TCP and Congestion Control

1 Reliable Transport II: TCP and Congestion Control
Brad Karp UCL Computer Science CS 6007/GC15/GA07 27th - 28th February, 2008

2 Outline
Packet header format
Connection establishment
Data transmission
Retransmit timeouts
RTT estimator
AIMD Congestion control
Throughput, loss, and RTT equation
Connection teardown
Protocol state machine

3 TCP Packet Header
TCP packet: IP header + TCP header + data
TCP header: 20 bytes long (without options)
Checksum covers TCP header, data, and “pseudo header”
  IP header source and destination addresses, protocol
  Length of TCP segment (TCP header + data)
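
The checksum coverage described above can be sketched in Python. This is a minimal illustration for IPv4 only, assuming the standard 16-bit one's-complement Internet checksum; the helper names (`ones_complement_sum`, `tcp_checksum`, `build_header`) and the example addresses and ports are hypothetical, not taken from the slides.

```python
import socket
import struct


def ones_complement_sum(data: bytes) -> int:
    """Sum 16-bit big-endian words, folding any carry back into the sum."""
    if len(data) % 2:
        data += b"\x00"  # pad odd-length input with one zero byte
    total = 0
    for (word,) in struct.iter_unpack("!H", data):
        total += word
        total = (total & 0xFFFF) + (total >> 16)  # end-around carry
    return total


def tcp_checksum(src_ip: str, dst_ip: str, segment: bytes) -> int:
    """Checksum over pseudo header + TCP segment (TCP header + data).

    The checksum field inside `segment` must be zero when computing.
    """
    pseudo = (
        socket.inet_aton(src_ip)          # IP source address
        + socket.inet_aton(dst_ip)        # IP destination address
        + struct.pack("!BBH", 0, socket.IPPROTO_TCP, len(segment))
    )
    return ~ones_complement_sum(pseudo + segment) & 0xFFFF


def build_header(src_port, dst_port, seqno, ack, flags, window, checksum=0):
    """Pack a 20-byte TCP header with no options (hypothetical helper)."""
    offset_and_flags = (5 << 12) | flags  # data offset = 5 words = 20 bytes
    return struct.pack("!HHIIHHHH", src_port, dst_port, seqno, ack,
                       offset_and_flags, window, checksum, 0)


# Usage: compute the checksum, insert it, and verify the segment.
payload = b"hello"
header = build_header(12345, 80, 1000, 0, 0x02, 65535)
csum = tcp_checksum("10.0.0.1", "10.0.0.2", header + payload)
verified = build_header(12345, 80, 1000, 0, 0x02, 65535, csum) + payload
```

A receiver that recomputes the checksum over a segment with the checksum field filled in obtains zero when nothing was corrupted.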

4 TCP Header Details
Connections inherently bidirectional; all TCP headers carry both data and ACK sequence numbers
32-bit sequence numbers are in units of bytes
Source and destination ports
  multiplexing of TCP by applications
  UNIX: local ports below 1024 reserved (only root may use them)
Window: advertisement of number of bytes advertiser willing to accept

5 TCP Connection Establishment: Motivation
Goals:
  Start TCP connection between two hosts
  Avoid mixing data from old connection in new connection
  Avoid confusing previous connection attempts with current one
  Prevent (most) third parties from impersonating (spoofing) one endpoint
SYN packets (SYN flag in TCP header set) used to establish connections
Use retransmission timer to recover from lost SYNs
What protocol meets above goals?

6 TCP Connection Establishment: Non-Solution (I)
Use two-way handshake
  A sends SYN to B
  B accepts by returning SYN to A
  A retransmits SYN if not received
  A and B can ignore duplicate SYNs after connection established
What about delayed data packets from old connection?
Connections shouldn’t start with constant sequence number; risks mixing data between old and new connections
[Diagram: time-sequence exchange. An old connection (SYN, SYN, data at seqno = 1, 512, 1024) is closed; a new connection (SYN, SYN) again sends data at seqno = 1, 512, 1024.]

7 TCP Connection Establishment: Non-Solution (II)
Two-way handshake, as before
But enclose random initial sequence numbers on SYNs
What about delayed SYNs from old connection?
  A wrongly believes connection successfully established
  B will drop all of A’s data!
Connection attempts should explicitly acknowledge which SYN they are accepting!
[Diagram: time-sequence exchange showing SYN, seqno = i; closed; SYN, seqno = k; SYN, seqno = j; data, seqno = k+1; data ignored!]

8 TCP Connection Establishment: 3-Way Handshake
Set SYN on connection request
Each side chooses random initial sequence number
Each side explicitly ACKs the sequence number of the SYN it’s responding to
[Diagram: time-sequence exchange: SYN, seqno = i; then SYN, seqno = j, ACK = i+1; then seqno = i+1, ACK = j+1.]

9 Robustness of 3-Way Handshake: Delayed SYN
Suppose A’s SYN i delayed, arrives at B after connection closed
B responds with SYN/ACK for i+1
A doesn’t recognize i+1; responds with reset, RST flag set in TCP header
A rejects connection
[Diagram: between A and B, after the connection is closed: SYN, seqno = i; then SYN, seqno = j, ACK = i+1; then RST, ACK = j.]

10 Robustness of 3-Way Handshake: Delayed SYN/ACK
A attempts connection to B
Suppose B’s SYN k/ACK p delayed, arrives at A during new connection attempt
A rejects SYN k; sends RST to B
Connection from A to B succeeds unimpeded
[Diagram: after the old connection is closed: SYN, seqno = i; delayed SYN, seqno = k, ACK = p; RST, ACK = k; SYN, seqno = j, ACK = i+1; seqno = i+1, ACK = j+1.]
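
The three-way handshake and its handling of delayed SYNs and SYN/ACKs, as described on the preceding slides, can be sketched as a toy model. This is an illustration under stated assumptions, not a real TCP stack: the `Segment`, `Initiator`, and `Responder` classes are hypothetical, segments are plain Python objects rather than packets, and the RST carries only an ACK number, following the slides' notation.

```python
import secrets
from dataclasses import dataclass
from typing import Optional

SEQ_MOD = 2**32  # sequence numbers are 32-bit


@dataclass(frozen=True)
class Segment:
    flags: frozenset
    seqno: Optional[int] = None
    ack: Optional[int] = None


class Initiator:
    """Side A: sends the SYN and validates the SYN/ACK it gets back."""

    def __init__(self) -> None:
        self.isn: Optional[int] = None
        self.established = False

    def connect(self) -> Segment:
        self.isn = secrets.randbelow(SEQ_MOD)  # random initial seqno
        return Segment(frozenset({"SYN"}), seqno=self.isn)

    def on_syn_ack(self, seg: Segment) -> Segment:
        expected = None if self.isn is None else (self.isn + 1) % SEQ_MOD
        if expected is None or seg.ack != expected:
            # The SYN/ACK acknowledges a SYN we are not waiting on: reset.
            return Segment(frozenset({"RST"}), ack=seg.seqno)
        self.established = True
        return Segment(frozenset({"ACK"}), seqno=expected,
                       ack=(seg.seqno + 1) % SEQ_MOD)


class Responder:
    """Side B: answers a SYN with SYN/ACK and waits for the final ACK."""

    def __init__(self) -> None:
        self.isn: Optional[int] = None
        self.established = False

    def on_syn(self, seg: Segment) -> Segment:
        self.isn = secrets.randbelow(SEQ_MOD)
        return Segment(frozenset({"SYN", "ACK"}), seqno=self.isn,
                       ack=(seg.seqno + 1) % SEQ_MOD)

    def on_ack(self, seg: Segment) -> bool:
        if self.isn is not None and seg.ack == (self.isn + 1) % SEQ_MOD:
            self.established = True
        return self.established


# Usage: a normal handshake, then a delayed SYN reaching B after close.
a, b = Initiator(), Responder()
final_ack = a.on_syn_ack(b.on_syn(a.connect()))
b.on_ack(final_ack)

idle_a, old_b = Initiator(), Responder()
delayed_syn = Segment(frozenset({"SYN"}), seqno=42)
reply = idle_a.on_syn_ack(old_b.on_syn(delayed_syn))  # A answers with RST
```

Because each side must echo the other's randomly chosen initial sequence number, a stale SYN or SYN/ACK fails the comparison and triggers a reset instead of a bogus connection.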

11 Robustness of 3-Way Handshake: Source Spoofing
Suppose host B trusts host A, based on A’s IP address
  e.g., allows any account creation request from host A
Adversary M may not control host A, but may seek to impersonate, or spoof, host A
Adversary may not need to receive data from B; only send data (e.g., “create an account l33thax0r”)
Can M establish a connection to B as A?
Unless he is on path between A and B, adversary cannot spoof A to B or vice-versa!
Why: random ISNs on SYNs
[Diagram: hosts A, B, and adversary M. M sends IP = A, SYN, seqno = i; B sends SYN, seqno = j, ACK = i+1 to A; M must then send IP = A, seqno = i+1, ACK = ??]

12 TCP: Data Transmission (I)
Each byte numbered sequentially, mod $2^{32}$
Sender buffers data in case retransmission required
Receiver buffers data for in-order reassembly
Sequence number (seqno) field in TCP header indicates first user payload byte in packet
Receiver indicates receive window size explicitly to sender in window field in TCP header
  corresponds to available buffer space at receiver

13 TCP: Data Transmission (II)
Sender’s transmit window size: amount of buffer space at sender
Sender uses window that is minimum of send and receive window sizes
Receiver sends cumulative ACKs
  ACK number in TCP header names highest contiguous byte number received thus far, +1
  one ACK per received packet, OR
  Delayed ACK also possible: receiver batches ACKs, sends one for every pair of data packets (200 ms max delay)
Current window at sender:
  low byte advances as packets sent
  high byte advances as receive window updates arrive
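
Cumulative ACKs and the min-of-windows rule above can be illustrated with a toy receiver. This is a sketch under simplifying assumptions: no sequence-number wraparound, no partially overlapping segments, one ACK per packet (no delayed ACKs), and an application that consumes in-order data immediately. The class and function names are hypothetical.

```python
from typing import Dict, Tuple


class CumulativeAckReceiver:
    """Buffers out-of-order segments and answers with cumulative ACKs."""

    def __init__(self, first_seqno: int, buffer_bytes: int) -> None:
        self.next_expected = first_seqno      # highest contiguous byte + 1
        self.buffer_bytes = buffer_bytes
        self.pending: Dict[int, bytes] = {}   # out-of-order data by seqno
        self.delivered = bytearray()

    def on_segment(self, seqno: int, payload: bytes) -> Tuple[int, int]:
        """Accept one segment; return (ACK number, advertised window)."""
        if seqno >= self.next_expected:       # ignore old duplicates
            self.pending[seqno] = payload
        # Deliver every buffered segment that is now contiguous.
        while self.next_expected in self.pending:
            data = self.pending.pop(self.next_expected)
            self.delivered += data
            self.next_expected += len(data)
        buffered = sum(len(p) for p in self.pending.values())
        return self.next_expected, self.buffer_bytes - buffered


def usable_window(send_window: int, receive_window: int) -> int:
    """Sender uses the minimum of its own and the receiver's window."""
    return min(send_window, receive_window)


# Usage: the segment at seqno 513 arrives after the one at 1025.
rx = CumulativeAckReceiver(first_seqno=1, buffer_bytes=4096)
acks = [
    rx.on_segment(1, b"a" * 512),
    rx.on_segment(1025, b"c" * 512),   # out of order: ACK stays at 513
    rx.on_segment(513, b"b" * 512),    # fills the hole: ACK jumps ahead
]
```

The middle ACK repeats the earlier ACK number, which is the duplicate-ACK signal that fast retransmit later relies on.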

15 TCP: Retransmit Timeouts
Sender sets timer for each sent packet
  when ACK returns, timer canceled
  if timer expires before ACK returns, packet resent
Expected time for ACK to return: RTT
TCP estimates round-trip time using EWMA
  measurements $m_i$ from timed packet/ACK pairs
  $RTT_i = (1-\alpha) \times RTT_{i-1} + \alpha \times m_i$
Retransmit timeout: $RTO_i = \beta \times RTT_i$
  original TCP: $\beta = 2$
Is this accurate enough?
Recall dangers of too-short and too-long RTT estimates from previous lecture
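
The EWMA estimate and the original fixed-multiplier timeout above can be written out directly. A minimal sketch: the slide gives $\beta = 2$ but no value for $\alpha$, so `alpha=0.125` is an assumption, as is seeding the estimate with the first measurement. The function name `ewma_rto` is hypothetical.

```python
from typing import Iterable, List, Tuple


def ewma_rto(samples: Iterable[float], alpha: float = 0.125,
             beta: float = 2.0) -> List[Tuple[float, float]]:
    """Return (RTT_i, RTO_i) after each measurement m_i."""
    history: List[Tuple[float, float]] = []
    rtt = None
    for m in samples:
        # Assumption: the first measurement seeds the estimate.
        rtt = m if rtt is None else (1 - alpha) * rtt + alpha * m
        history.append((rtt, beta * rtt))  # RTO_i = beta * RTT_i
    return history


# Usage: RTT samples in milliseconds; the last one doubles.
estimates = ewma_rto([100.0, 100.0, 200.0])
```

With these inputs the estimate moves only one eighth of the way toward the new sample, which shows how slowly a small $\alpha$ reacts to a sudden RTT change.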

16 Mean and Variance: Jacobson’s RTT Estimator
Above link load of 30% at router, $\beta \times RTT_i$ will retransmit too early!
Response to increasing load: waste bandwidth on duplicate packets
Result: congestion collapse!
[Jacobson 88]: estimate $v_i$, mean deviation (EWMA of $|m_i - RTT_i|$), stand-in for variance
  $v_i = v_{i-1} \times (1-\gamma) + \gamma \times |m_i - RTT_i|$
Use $RTO_i = RTT_i + 4 v_i$
Mean and Variance RTT estimator used by all modern TCPs
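
The mean-plus-deviation estimator above can be sketched as follows. The update order follows the slide's formulas literally (update $RTT_i$ first, then measure the deviation against it); note that RFC 6298 instead computes the deviation against the previous estimate. The gains `alpha=0.125` and `gamma=0.25`, the zero initial deviation, and the class name are assumptions for illustration.

```python
class MeanDeviationEstimator:
    """Tracks an EWMA of the RTT and of its mean deviation."""

    def __init__(self, first_sample: float, alpha: float = 0.125,
                 gamma: float = 0.25) -> None:
        self.alpha = alpha
        self.gamma = gamma
        self.rtt = first_sample   # RTT_0 seeded with the first measurement
        self.dev = 0.0            # v_0: assumed zero before any deviation

    @property
    def rto(self) -> float:
        return self.rtt + 4 * self.dev  # RTO_i = RTT_i + 4 v_i

    def update(self, m: float) -> float:
        """Fold in measurement m and return the new retransmit timeout."""
        self.rtt = (1 - self.alpha) * self.rtt + self.alpha * m
        self.dev = (1 - self.gamma) * self.dev + self.gamma * abs(m - self.rtt)
        return self.rto


# Usage: a steady RTT keeps the timeout tight; a spike widens it.
est = MeanDeviationEstimator(first_sample=100.0)
steady_rto = est.update(100.0)
spiky_rto = est.update(200.0)
```

Unlike a fixed multiplier, the timeout here stays close to the RTT while samples are stable and grows only when the samples start to vary.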

17 Retransmit Behavior
Original TCP, before [Jacobson 88]:
  at start of connection, send full window of packets
  retransmit each packet immediately after its timer expires
Result: window-sized bursts of packets sent into network

18 Pre-Jacobson TCP (Obsolete!)
[Figure: time-sequence plot taken at sender]
Bursts of packets: vertical lines
Spurious retransmits: repeats at same y value
Dashed line: available 20 Kbps capacity

19 Self-Clocking: Conservation of Packets
Goal: self-clocking transmission
  each time an ACK returns, one data packet sent
  spacing of returning ACKs: matches spacing of packets in time at slowest link on path

20 Reaching Equilibrium: Slow Start
At connection start, sender sets congestion window size, cwnd, to pktSize (one packet’s worth of bytes), not whole window
Sender sends up to minimum of receiver’s advertised window and cwnd
Upon return of each ACK until receiver’s advertised window size reached, increase cwnd by pktSize bytes
“Slow” means exponential window increase!
Takes $\log_2 W$ RTTs to reach receiver’s advertised window size W
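
The exponential growth of slow start can be checked with a short round-by-round simulation. This is a sketch that assumes windows are counted in whole packets, every packet is ACKed within one RTT, and nothing is lost; the function name is hypothetical.

```python
def slow_start_rtts(advertised_window_pkts: int) -> int:
    """Count RTTs until cwnd reaches the receiver's advertised window."""
    cwnd = 1   # start with one packet, not the whole window
    rtts = 0
    while cwnd < advertised_window_pkts:
        acks = cwnd                    # one ACK per packet sent this RTT
        # Each ACK grows cwnd by one packet, so cwnd doubles per RTT.
        cwnd = min(cwnd + acks, advertised_window_pkts)
        rtts += 1
    return rtts


# Usage: a 64-packet window takes log2(64) = 6 round trips.
rtts_for_64 = slow_start_rtts(64)
```

When W is not a power of two, the count rounds up, since the final doubling is capped at the advertised window.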

21 Post-Jacobson TCP: Slow Start and Mean+Variance RTT Estimator
[Figure: time-sequence plot at sender]
“Slower” start
No spurious retransmits

23 Goals in Congestion Control
Achieve high utilization on links; don’t waste capacity!
Divide bottleneck link capacity fairly among users
Be stable: converge to a steady allocation among users
Avoid congestion collapse

24 Congestion Collapse
[Figure: throughput (bps) vs. offered load (bps), with the knee and the cliff marked. Cliff behavior observed in [Jacobson 88].]

25 Congestion Requires Slowing Senders
Recall: bigger buffers cannot prevent congestion
Senders must slow to alleviate congestion
  Absence of ACKs implicitly indicates congestion
  TCP sender’s window size determines sending rate
Recall: correct window size is bottleneck bandwidth-delay product
How can sender learn this value?
  Search for it, by adapting window size
  Feedback from network: ACKs return (window OK) or do not return (window too big)

26 Avoiding Congestion: Multiplicative Decrease
Recall that sender uses sending window of size min(cwnd, rwnd), where rwnd is receiver’s advertised window
Upon timeout for sent packet, sender presumes packet lost to congestion, and:
  sets ssthresh = cwnd / 2
  sets cwnd = pktSize
  uses slow start to grow cwnd up to ssthresh
End result: cwnd = cwnd / 2, via slow start
Sender sends one window per RTT; halving cwnd halves transmit rate

27 Avoiding Congestion: Additive Increase
Drops indicate TCP sending more than its fair share of bottleneck
No feedback to indicate TCP using less than its fair share of bottleneck
Solution: speculatively increase window size as ACKs return
  Additive increase: for each returning ACK, cwnd = cwnd + (pktSize × pktSize)/cwnd
  Increases cwnd by ~pktSize bytes per RTT
Combined algorithm: Additive Increase, Multiplicative Decrease (AIMD)
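
The timeout response and the per-ACK additive increase described on the last two slides combine into a small state machine. This is a sketch, not a complete TCP sender: the class name `AimdSender`, the default `pkt_size`, and the initial `ssthresh` are assumptions, and fast retransmit is left out.

```python
class AimdSender:
    """Tracks cwnd and ssthresh in bytes, per the slides' update rules."""

    def __init__(self, pkt_size: int = 1000, ssthresh: float = 64000.0) -> None:
        self.pkt_size = pkt_size
        self.cwnd = float(pkt_size)   # start from one packet
        self.ssthresh = ssthresh      # assumed initial threshold

    def on_ack(self) -> None:
        if self.cwnd < self.ssthresh:
            # Slow start: one packet per ACK, so cwnd doubles per RTT.
            self.cwnd += self.pkt_size
        else:
            # Additive increase: about one packet per RTT in total.
            self.cwnd += self.pkt_size * self.pkt_size / self.cwnd

    def on_timeout(self) -> None:
        # Multiplicative decrease: remember half the window, then restart
        # slow start from one packet up to that threshold.
        self.ssthresh = self.cwnd / 2
        self.cwnd = float(self.pkt_size)

    def send_window(self, rwnd: float) -> float:
        return min(self.cwnd, rwnd)


# Usage: slow start to ssthresh, one RTT of additive increase, a timeout.
sender = AimdSender(pkt_size=1000, ssthresh=8000.0)
for _ in range(7):
    sender.on_ack()               # 1000 -> 8000 bytes
after_slow_start = sender.cwnd
for _ in range(8):
    sender.on_ack()               # one window of ACKs: just under +1000
after_one_rtt = sender.cwnd
sender.on_timeout()
```

The eight ACKs in the additive phase together add slightly less than one packet, which is the "~pktSize bytes per RTT" behavior the slide describes.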

28 Refinement: Fast Retransmit (I)
Sender must wait well over RTT for timer to expire before loss detected
TCP’s minimum retransmit timeout: 1 second
Another loss indication: duplicate ACKs
  Suppose sender sends 1, 2, 3, 4, but 2 lost
  Receiver receives 1, 3, 4
  Receiver sends cumulative ACKs 2, 2, 2
  Loss causes duplicate ACKs!

29 Fast Retransmit (II)
Upon arrival of 3 duplicate ACKs, sender:
  sets cwnd = cwnd/2
  retransmits “missing” packet
  no slow start
Not only loss causes dup ACKs
  Reordering, too
[Diagram: A sends data, seqno = 1, 513, 1025, 1537; B returns ACK = 513 three times; A retransmits data, seqno = 513.]
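
Duplicate-ACK detection for fast retransmit can be sketched as a counter. One assumption to note: this sketch counts a duplicate as a repeat of the most recently seen ACK number, so the trigger is the fourth identical ACK, which is how RFC 5681 defines three duplicate ACKs; the slide's diagram is more compact. The class name is hypothetical.

```python
from typing import Optional


class FastRetransmit:
    DUP_THRESHOLD = 3

    def __init__(self, cwnd: float) -> None:
        self.cwnd = cwnd
        self.last_ack: Optional[int] = None
        self.dup_count = 0

    def on_ack(self, ack: int) -> Optional[int]:
        """Return the seqno to retransmit, or None if nothing to resend."""
        if ack == self.last_ack:
            self.dup_count += 1
            if self.dup_count == self.DUP_THRESHOLD:
                self.cwnd /= 2        # halve the window, no slow start
                return ack            # the ACK names the missing byte
        else:
            self.last_ack = ack       # new data acknowledged
            self.dup_count = 0
        return None


# Usage: the segment starting at byte 513 is lost.
fr = FastRetransmit(cwnd=8000.0)
actions = [fr.on_ack(a) for a in (513, 513, 513, 513)]
```

Waiting for several duplicates rather than one is what keeps mild reordering from triggering a needless retransmission.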

30 AIMD in Action
Sender searches for correct window size

31 Why AIMD?
Other control rules possible
  E.g., MIMD, AIAD, …
Recall goals:
  Links fully utilized (efficient)
  Users share resources fairly
TCP adapts all flows’ window sizes independently
Must choose a control that will always converge to an efficient and fair allocation of windows

32 Chiu-Jain Phase Plots
Consider two users sharing a bottleneck link
Plot bandwidths allocated to each
Efficiency: sum of two users’ rates fixed
Fairness: two users’ rates equal
Equi-Fairness: ratio of two users’ rates fixed
[Figure: phase plot of User 1 (bps) vs. User 2 (bps), showing the Efficiency Line, Fairness Line (AI), Equi-Fairness Line (MI), Overload and Underload regions, and the Optimum point.]

33 Chiu Jain: AIMD
AIMD converges to optimum efficiency and fairness
[Figure: phase plot with Fairness Line and Efficiency Line.]

34 Chiu Jain: AIAD
AIAD doesn’t converge to optimum point!
Similar oscillations for MIMD
[Figure: phase plot with Fairness Line and Efficiency Line.]
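
The convergence claims in the Chiu-Jain plots can be reproduced numerically. The sketch below assumes a simplified synchronous model: both users see the same overload signal at every step and apply the same rule. The starting rates, capacity, step sizes, and function names are illustrative assumptions, not values from the slides.

```python
from typing import Callable, Tuple

Rule = Callable[[float], float]


def run_control(x1: float, x2: float, capacity: float,
                increase: Rule, decrease: Rule, steps: int) -> Tuple[float, float]:
    """Apply one control rule to two users sharing a bottleneck."""
    for _ in range(steps):
        if x1 + x2 > capacity:            # overload: both back off
            x1, x2 = decrease(x1), decrease(x2)
        else:                             # underload: both probe upward
            x1, x2 = increase(x1), increase(x2)
    return x1, x2


def add_one(x: float) -> float:
    return x + 1.0


def halve(x: float) -> float:
    return x / 2.0


def sub_one(x: float) -> float:
    return max(0.0, x - 1.0)


# Usage: start far from fair, with user 2 holding most of the link.
aimd = run_control(10.0, 80.0, 100.0, add_one, halve, steps=1000)
aiad = run_control(10.0, 80.0, 100.0, add_one, sub_one, steps=1000)
```

Additive steps leave the difference between the two rates unchanged, while each halving cuts that difference in half, so only AIMD drifts toward the fairness line; under AIAD the initial gap persists.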

36 Modeling Throughput, Loss, and RTT
How do packet loss rate and RTT affect throughput TCP achieves?
Assume:
  only fast retransmits
  no timeouts (so no slow starts in steady-state)

37 Evolution of Window Over Time
Average window size: 3W/4
One window sent per RTT
Bandwidth:
  3W/4 packets per RTT
  (3W/4 × packet size) / RTT bytes per second
W depends on loss rate…

38 Loss and Window Size
Assume no delayed ACKs, fixed RTT
cwnd grows by one packet per RTT
So it takes W/2 RTTs to go from window size W/2 to window size W; this period is one cycle
How many packets sent in total?
  $((3W/4) / RTT) \times (W/2 \times RTT) = 3W^2/8$
One loss per cycle (as window reaches W)
  loss rate: $p = 8/(3W^2)$
  $W = \sqrt{8/(3p)}$
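
The packet count per cycle can be checked by summing the window over one sawtooth. This sketch assumes integer windows measured in packets and exactly one window sent per RTT; the function names are hypothetical.

```python
def packets_per_cycle(w_max: int) -> int:
    """Packets sent while the window climbs from W/2 up to W.

    The window takes the values W/2, W/2 + 1, ..., W - 1 over W/2 RTTs,
    and one full window is sent in each RTT.
    """
    return sum(range(w_max // 2, w_max))


def loss_rate(w_max: int) -> float:
    """One loss per cycle, so p is the reciprocal of packets per cycle."""
    return 1.0 / packets_per_cycle(w_max)


# Usage: compare the exact sum with the model's 3W^2/8 for W = 100.
exact = packets_per_cycle(100)
model = 3 * 100**2 / 8
```

The exact sum and the closed form differ only by a term linear in W, so the approximation tightens as the window grows.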

39 Throughput, Loss, and RTT Model
$W = \sqrt{8/(3p)} = (4/3) \times \sqrt{3/(2p)}$
Recall: Bandwidth: $B = (3W/4 \times \text{packet size}) / RTT$
  $B = \text{packet size} / (RTT \times \sqrt{2p/3})$
Consequences:
  Increased loss quickly reduces throughput
  At same bottleneck, flow with longer RTT achieves less throughput than flow with shorter RTT!
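
The model above translates directly into two functions, which also makes the stated consequences easy to check numerically. This is a sketch of the idealized model only (fast retransmits, no timeouts); the function names and the example packet size, RTT, and loss rate are assumptions.

```python
import math


def window_at_loss(p: float) -> float:
    """Peak window W in packets for loss rate p: W = sqrt(8 / (3p))."""
    return math.sqrt(8.0 / (3.0 * p))


def throughput(packet_size: float, rtt: float, p: float) -> float:
    """Model throughput in bytes per second: size / (RTT * sqrt(2p/3))."""
    return packet_size / (rtt * math.sqrt(2.0 * p / 3.0))


# Usage: 1500-byte packets, 100 ms RTT, 1% loss.
b = throughput(1500, 0.1, 0.01)
# The same value follows from the average window of 3W/4 packets per RTT.
b_from_window = 0.75 * window_at_loss(0.01) * 1500 / 0.1
```

Throughput is inversely proportional to both RTT and the square root of the loss rate, so doubling the RTT or quadrupling the loss rate each halves it.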

41 TCP: Connection Teardown
Data may flow bidirectionally
Each side independently decides when to close connection
In each direction, FIN answered by ACK
Must reliably terminate connection for both sides
  During TIME_WAIT state at first side to send FIN, ACK valid FINs that arrive
Must avoid mixing data from old connection with new one
  During TIME_WAIT state, disallow all new connections for 2 × max segment lifetime
[Diagram: time-sequence exchange between A and B: FIN, seqno = i; ACK = i+1; FIN, seqno = j; ACK = j+1; enter TIME_WAIT state.]

42 TCP: Protocol State Machine

43 Summary: TCP and Congestion Control
Connection establishment and teardown
  Robustness against delayed packets crucial
Round-trip time estimation
  EWMAs estimate both RTT mean and deviation
Congestion detection at sender
  Timeout: retransmit timer expires, halve window, slow start from one packet
  Fast Retransmit: three duplicate ACKs, halve window, no slow start
Search for optimal sending window size
  Additive increase, multiplicative decrease (AIMD)
  AIMD converges to high utilization, fair sharing

