Presentation is loading. Please wait.

Presentation is loading. Please wait.

TCP Variations: Tahoe, Reno, New Reno, Vegas, Sack

Similar presentations


Presentation on theme: "TCP Variations: Tahoe, Reno, New Reno, Vegas, Sack"— Presentation transcript:

1 TCP Variations: Tahoe, Reno, New Reno, Vegas, Sack
Paul D. Amer, Professor Computer & Information Sciences University of Delaware

2 What are TCP Variations?
Implementations of TCP that use different algorithms to achieve end-to-end congestion control. Tahoe Reno NewReno Vegas

3 Van Jacobson’s algorithms
Evolution of TCP 1984 Nagel’s algorithm to reduce overhead of small packets; predicts congestion collapse 1990 4.3BSD Reno fast recovery delayed ACK’s 1987 Karn’s algorithm to better estimate round-trip time 1975 Three-way handshake Ray Tomlinson In SIGCOMM 75 1988 Van Jacobson’s algorithms slow start, congestion avoidance, fast retransmit (all implemented in 4.3BSD Tahoe) SIGCOMM 88 1983 BSD Unix 4.2 supports TCP/IP 1986 Congestion collapse 1st observed 1974 TCP described by Vint Cerf, Bob Kahn In IEEE Trans Comm 1981 TCP & IP RFC 793 & 791 1975 1980 1985 1990

4 TCP Vegas(not implemented)
Evolution of TCP 1996 NewReno modified fast recovery SACK TCP Selective Ack (Floyd et al) 1994 T/TCP Transaction TCP (Braden) 1993 TCP Vegas(not implemented) real congestion avoidance (Brakmo et al) 1994 ECN Explicit Congestion Notification (Floyd) 1996 FACK TCP Forward Ack extension to SACK (Mathis et al) 1996 Improving TCP startup (Hoe) 1993 1994 1996

5 How did TCP cause Congestion? (original recipe TCP)
Poor Efficiency In telnet-like applications, TCP sends 1 byte of data with 4000% overhead. Sending too much, too soon Unnecessary retransmits Send window too big Very little change in behavior due to congestion

6 Self-clocking or ACK Clock
Pr Pb Ar Ab Receiver Sender As Implications of ack-clocking: More batching of acks => bursty traffic Less batching leads to a large fraction of Internet traffic being just acks (overhead) Self-clocking systems tend to be very stable under a wide range of bandwidths and delays. The principal issue with self-clocking systems is getting them started.

7 TCP Algorithms: Four intertwined algorithms used commonly in TCP implementations Slow Start - Every ack increases the sender’s window (cwnd) size by 1 Congestion Avoidance - Reducing sender’s window size by half at experience of loss, and increase the sender’s window at the rate of about one packet per RTT (NOTE: not per ack) Fast Retransmit - Don’t wait for retransmit timer to go off, retransmit packet if 3 duplicate acks received Fast Recovery - Since duplicate ack came through, one packet has left the wire. Perform congestion avoidance, don’t jump down to slow start

8 Packet Ack cwnd=1 pkt 0 ack 0 cwnd=2 = pkts 1,2 acks 1,2 cwnd=4
Slow Start TCP Receiver Sender t=0r cwnd=1 pkt 0 ack 0 t=1r cwnd=2 new cwnd = old cwnd + # acks recd pkts 1,2 acks 1,2 t=2r cwnd=4 pkts3,4,5,6 t=3r cwnd=8

9 Packet Ack After cwnd reaches a certain threshold cwnd=4 cwnd=5 =
Congestion Avoidance After cwnd reaches a certain threshold TCP Receiver Sender t=xr cwnd=4 t=(x+1)r cwnd=5 new cwnd = old cwnd + # acks/cwnd t=(x+2)r cwnd=6 t=(x+3)r cwnd=7

10 Example: Slow Start/Congestion Avoidance
cwnd = 1 Example: Slow Start/Congestion Avoidance cwnd = 2 assume ssthresh = 8*MSS cwnd = 4 Eight TCP-PDUs cnwd = 8 ssthresh Eight ACKs nine TCP-PDUs cwnd = 9 nineACKs ten TCP-PDUs cwnd = 10 ten ACKs cwnd = 11

11 Slow Start & Congestion Avoidance
Initally: - cwnd = 1*MSS - ssthresh = very high If a new ACK comes: - if cwnd < ssthresh  update cwnd according to slow start if cwnd > ssthresh  update cwnd according to congestion avoidance If cwnd = ssthresh  either If timeout (i.e. loss) : - ssthresh = flight size/2; cwnd (initial) ssthresh Loss, e.g. timeout ssthresh time slow start – in green congestion avoidance – in blue

12 Packet Ack At a random point in the transfer pkt 15 pkt 16 pkt 17
Fast Retransmit At a random point in the transfer TCP Receiver Sender pkt 15 t=xr pkt 16 pkt 17 pkt 18 pkt 19 pkt 20 ack 15 ack 16 ack 16 t=(x+1)r 3 dup acks Fast Retransmit pkt 17 (retransmit)

13 TCP Tahoe’s Fast Retransmit
Sender receives 3 dupACKS. Sender infers that the segment is lost. Sender re-sends the segment immediately! Sender returns to slow-start. segment 1 cwnd = 1 ACK 1 cwnd = 2 segment 2 segment 3 ACK 2 ACK 3 cwnd = 4 segment 4 segment 5 segment 6 ACK 3 segment 7 3 duplicate ACKs ACK 3 ACK 3 Recall: a duplicate ACK means that an out-of sequence segment was received Notes: duplicate ACKs due packet reordering! if window is small don’t get duplicate ACKs! segment 4 fast-retransmit of segment 4

14 TCP Variants : TCP-Tahoe: TCP-Reno:
implements the slow start, congestion avoidance, and fast retransmit algorithms TCP-Reno: implements the slow start, congestion avoidance, fast retransmit, and fast recovery algorithms Among other implementations are Vegas, NewReno (the most commonly implemented on webservers today, according to a survey) and SACK TCP.

15 TCP Tahoe Trace (with one dropped segment)
Lost segment Fast Retransmit Begin congestion avoidance Begin slow-start

16 TCP Tahoe Trace (with one dropped segment)

17 Could Tahoe be Improved?
Receipt of dupACKs tells the sender that the receiver is still getting new segments, i.e. there is still data flowing between sender and receiver Why does sender go back to slow start after fast retransmit? Why does sender let Ack clock die?

18 TCP Variation: TCP Reno
2nd Improvement was TCP Reno (1990) Nagle’s algorithm Improved RTO calculation and back-off AIMD congestion avoidance with slow-start Fast retransmit & fast recovery

19 Fast Recovery Concept:
After fast retransmit, reduce cwnd by half, and continue sending segments at this reduced level. Problems: Sender has too many outstanding segments. How does sender transmit packets on a dupACK? Need to use a “trick” - inflate cwnd. cwnd (initial) ssthresh fast-retransmit fast-retransmit timeout new ACK new ACK Time Slow Start Congestion Avoidance “inflating” cwnd with dupACKs “deflating” cwnd with a new ACK

20 Fast Retransmit & Fast Recovery
After receiving 3 dupACKS: Retransmit the lost segment. Set ssthresh = flight size/2. Set cwnd = ssthresh, and ndupacks = N.B. In Reno: send_win = min ( rwnd, cwnd + ndupacks ). If dupACK arrives: ++ ndupacks Transmit new segment, if allowed. If new ACK arrives: ndupacks = 0 Exit fast recovery. If RTO expires: Perform slow-start - ( ssthresh = flight size/2, cwnd = 1 )

21 TCP Reno Trace (with one dropped segment)

22 TCP Tahoe & Reno Trace (with two dropped segments)

23 What if There are Multiple Losses in a Window?
With two losses in a window, Reno will occasionally timeout. With three losses in a window, Reno will usually timeout. With four losses in a window, Reno is guaranteed to timeout! With three or more losses in a window, Tahoe typically out performs Reno!

24 TCP Reno Trace (with two dropped segments)

25 TCP Variation: TCP NewReno
3rd Improvement was TCP NewReno (1995) Nagle’s algorithm Improved RTO calculation and back-off AIMD congestion avoidance with slow-start Fast retransmit & modified fast recovery

26 Modifications to Fast Recovery
Partial ACKs: An ACK that acknowledges some but not all the segments that were outstanding at the start of fast recovery. NewReno interprets this as an indication of multiple loss. If partial ACK received, re-transmit the next lost segment immediately and set ndupacks = 0 (deflate send_win). Sender remains in fast recovery until all data outstanding when fast recovery was initiated is ACK’ed. Additional dupACK’s increase ndupacks.

27 TCP NewReno Trace (with two dropped segments)

28 Tahoe, Reno & NewReno Trace (with two dropped segments)

29 Is There a Better Way? The only way Tahoe, Reno and NewReno can detect congestion is by creating congestion! They carefully probe for congestion by slowly increasing their sending rate. When they find (create), congestion, they cut sending rate at least in half! This slow advance and rapid retreat approach results in a saw-toothed sending rate, and highly erratic throughput. What if TCP could detect congestion without causing congestion?

30 TCP Variation: TVP Vegas (True Congestion Avoidance)
Introduced by Brakmo and Peterson (1994) Three changes to TCP Reno Modified congestion avoidance Don’t wait for a timeout, if actual throughput < expected throughput decrease the congestion window. (AIAD!) New retransmission mechanism motivation: what if sender never receives 3-dupACKs (due to lost segments or window size is too small.) mechanism: sender does retransmission after a dupACK received, if RTT estimate > timeout. Modified slow start motivation: sender tries finding correct window size without causing a loss.

31 Vegas vs. NewReno TCP NewReno throughput with simulated background traffic TCP Vegas throughput with simulated background traffic Source: Brakmo and Peterson, TCP Vegas: End to End Congestion Avoidance on a Global Internet, IEEE JSAC, Vol 13, No. 8, Oct. 1995, pp – 1480

32 What Variations are being used?
Experimental results obtained by testing 3728 web servers: NewReno 42% Tahoe 27% (w/o Fast Retransmit) Reno 18% Other 8% Tahoe 5% Source: Padhye and Floyd, SIGCOMM ‘01, August 27-31, San Diego, CA

33 Summary of TCP Behavior
TCP Variation Response to 3 dupACK’s Response to Partial ACK of Fast Retransmission Response to “full” ACK of Fast Retransmission Tahoe Do fast retransmit, enter slow start ++cwnd Reno enter fast recovery Exit fast recovery, deflate window, enter congestion avoidance Exit fast recovery, deflate window, enter congestion avoidance NewReno enter modified fast recovery Fast retransmit and deflate window – remain in modified fast recovery Exit modified fast recovery, deflate window, enter congestion avoidance When entering slow start, if connection is new, ssthresh = arbitrarily large value cwnd = 1. else, ssthresh = max(flight size/2, 2*MSS) In slow start ++cwnd on new ACK When entering either fast recovery or modified fast recovery, ssthresh = max(flight size/2, 2*MSS) cwnd = ssthresh. In congestion avoidance cwnd += 1*MSS per RTT

34 TCP without SACK TCP uses cumulative ACKs
Receiver identifies the last byte of data successfully received Out of order segments are not ACKed Receiver sends duplicate ACKs TCP without SACK forces the TCP sender Either to wait an RTT to find out a segment was lost Or, unnecessarily retransmit data that has been correctly received Can result in reduced overall throughput

35 TCP with Selective Ack (SACK)
SACK + Selective Repeat Retransmission Policy allows receiver informs sender about all segments that are successfully received. sender fast retransmits only the missing data segments SACK is implemented using two TCP Options SACK-Permitted Option SACK Option

36 SACK Example 1 - 100 receiver’s buffer 101 - 200 ACK 201 201-300
1-100 ACK 201 sender receiver ACK 201 SACK 1-100

37 SACK Rules With SACKs, the ACK field is still a cum ACK
A SACK cannot be sent unless the SACK-Permitted option has been received (in the SYN) The 1st SACK block MUST specify the contiguous block of data containing the segment which triggered this acknowledgment If SACKs are sent, SACK option should be included in all ACK’s which do not ACK the highest sequence number in the data receiver’s queue

38 Generating SACKs – data receiver behavior
If the data receiver has not received a SACK-Permitted Option for a given connection, the receiver must not send SACK options on that connection The receiver should send an ACK for every valid segment that arrives containing new data The data receiver should include as many distinct SACK blocks as possible in the SACK option SACK option should be filled out by repeating the most recently reported SACK blocks The data receiver provides the sender with the most up-to-date info about the state of the network and the receiver’s queue

39 Interpreting SACKs - Data Sender behavior
The sender records the SACK for future reference Maintains a retransmission queue containing unacknowledged segments One possible implementation Turns on SACK bit for the segment in retransmission queue when it receives a SACK Skips SACKed data during any later fast retransmission On fast retransmit, retransmits data not SACKed so far and less than the highest SACKed data Turns off SACK bit after retransmission time out

40 Another SACK Example receiver sender Receiver Buffer 100 299 100-299
500 300 699 ACK 300, SACK 699 300 500 900 1099 ACK 300, SACK ,

41 Another SACK Example (cont’d)
699 300 500 900 1099 sender receiver 699 300 500 900 1099 ACK 700, SACK 700 300 500 900 1099 ACK 1100 1100

42 Without SACK vs. With SACK
TCP with SACK TCP without SACK sender receiver ACK 200 ACK 200, SACK ACK 200, SACK fast retransmit ACK 600 ACK 200, SACK sender receiver ACK 200 fast retransmit ACK 600

43 SACK Observations SACK TCP follows standard TCP congestion control; Adding SACK to TCP does not change the basic underlying congestion control algorithms SACK TCP has major advantages when compared TCP Tahoe, Reno, Vegas and New Reno, as PDUs have been provided with additional information due to the SACK Difference in behavior when multiple packets are dropped from one window of data SACK information allows the sender to better decide what to retransmit and what not to

44 Any questions ? …

45 Data Receiver Reneging
Reneging – fail to fulfill a promise or obligation Data receiver is permitted to discard data in its queue that has not been acknowledged to the data sender, even if the data has already been SACKed Such discarding of SACKed segments is discouraged, but may occur if the receiver must give buffer space back to the OS If reneging occurs first SACK should reflect the newest segment even if its going to be discarded Except for the newest segment, all SACK blocks MUST NOT report any old data which is no longer actually held by the receiver

46 reneg occurs; window decreases
Reneging Example sender receiver 100 199 ACK 200 200 399 300 ACK 200; SACK reneg occurs; window decreases 200 200 window increases 599 500 200 ACK 200; SACK

47 Consequences of Reneging
Sender must maintain normal TCP timeouts Data cannot be considered “communicated” until a cum ACK is sent Sender must retransmit the data at the left window edge after a retransmit timeout, even if that data has been SACKed by the receiver Sender MUST NOT discard data before being acked by the Cum Ack

48 SACK-Permitted Option
is allowed only in a SYN Segment. indicates sender handles SACKs, and receiver should send SACKs if possible. SACK option can be used once connection is established TCP Header NOP options kind=1 Source port address Destination port address Sequence Number 6 TCP header length Cumulative Ack No. 1 SYN bit Window size Checksum Urgent pointer SACK-permitted kind=4 length=2

49 SACK-Permitted Option and SACK
SENDER RECEIVER SYN “SACK-permitted” TCP connection establishment phase SYN/ACK “SACK-permitted” ACK Cumulative Ack No. Sequence Number Source port address Destination port address SACK-permitted kind=4 length=2 Window size Urgent pointer Checksum NOP options kind=1 1 SYN bit ACK bit Cumulative Ack No. Sequence Number Source port address Destination port address SACK-permitted kind=4 length=2 Window size Urgent pointer Checksum NOP options kind=1 1 SYN bit Cumulative Ack No. Sequence Number Source port address Destination port address Window size Urgent pointer Checksum kind=1 1 ACK bit data transfer phase cum ack and optional SACKs

50 SACK Option Sequence Number Cumulative Ack No. HLEN Window size
Source port address Destination port address Length of SACK with n blocks? = (2 + 8 * n) bytes Sequence Number Cumulative Ack No. HLEN Window size Max number bytes available for TCP Options? Checksum Urgent pointer = 40 bytes Kind=1 Kind=1 Kind=5 Length=?? Left edge of 1st block Max number of SACK blocks possible? Right edge of 1st block = 4 SACK blocks (barring no other TCP Options) Left edge of nth block Right edge of nth block


Download ppt "TCP Variations: Tahoe, Reno, New Reno, Vegas, Sack"

Similar presentations


Ads by Google