TCP: congestion control and error control
Courtesy of Nitin Vaidya, UIUC, or Kevin Lai, UC Berkeley, or Jim Kurose, UMass


Problem
At what rate do you send data?
– What is the maximum useful sending rate for different apps?
Two components:
– flow control: make sure that the receiver can receive
  sliding-window based flow control:
  – receiver reports window size to sender
  – higher window -> higher throughput
  – throughput = wnd/RTT
– congestion control: make sure that the network can deliver

Goals
Robust
– latency: 50 us (LAN), 133 ms (min, anywhere on Earth, wired), 1 s (satellite), 260 s (average, Mars): a 10^4 to 10^6 difference
– bandwidth: 9.6 Kb/s (then modem, now cellular) to 10 Tb/s: a 10^9 difference
– 0-100% packet loss
– path may change in middle of session (why?)
– network may/may not support explicit congestion signaling
Distributed control (survivability)

Non-decreasing Efficiency under Load
Efficiency = useful_work/time
Critical property of system design
– network technology, protocol or application
Otherwise, the system collapses exactly when there is most demand for its operation
Trade lower overall efficiency for this?
[Figure: efficiency vs. load curves marking the knee and the cliff, labeled "ok?" and "good"]

Congestion Collapse and Efficiency
knee: point after which
– throughput increases slowly
– delay increases quickly
cliff: point after which
– throughput decreases quickly to zero (congestion collapse)
– delay goes to infinity
Congestion avoidance
– stay at knee
Congestion control
– stay left of (but usually close to) cliff
[Figure: throughput and delay vs. load, marking the knee and the cliff; regions labeled under utilization, over utilization, saturation, and congestion collapse]

Transport Layer Congestion Collapse Solutions
Reduce loss by increasing buffer size. Why not?
if congestion, then send slower
else if sending at lower than fair rate, then send faster
– congestion control and avoidance (finally)
– how to detect network congestion?
– how to communicate allocation to sources?
– how to determine efficient allocation?
– how to determine fair allocation?

Metrics for congestion control
Efficiency
– ratio of aggregate throughput to capacity
Fairness
– degree to which everyone is getting equal share
Convergence time (responsiveness)
– how long it takes to reach fairness and efficiency
Size of oscillation (smoothness)
– dynamic system -> oscillations around optimal point

Detecting Congestion
Explicit network signal
– Send packet back to source (e.g. ICMP Source Quench)
  control traffic can worsen congestion collapse
– Set bit in header (e.g. DECbit [CJ89], ECN)
  can be subverted by selfish receiver [SEW01]
– Unless on every router, still need end-to-end signal
Implicit network signal
– Loss (e.g. TCP Tahoe, Reno, New Reno, SACK)
  + relatively robust, - no avoidance
– Delay (e.g. TCP Vegas)
  + avoidance, - difficult to make robust
– Easily deployable
– Robust enough? Wireless?

Communicating Allocation to Sources
Explicit
– Send packet back to source or set in packet header
  control traffic can worsen congestion collapse
  must trust receiver
– Need to keep per-flow state (anti-Internet architecture)
  what happens if router fails, route changes, mobility
– Unless on every router, still need end-to-end signal
– Efficient, fair, responsive, smooth
Implicit: Chiu and Jain 1988
– Can converge to efficiency and fairness without explicit signal of fair rate
– Easily deployable
– Good enough?

Efficient Allocation
Too slow
– fail to take advantage of available bandwidth -> underload
Too fast
– overshoot knee -> overload, high delay, loss
Everyone's doing it
– may all under/over shoot -> large oscillations
Optimal:
– Σ x_i = X_goal
Efficiency = 1 - distance from efficiency line
[Figure: 2 user example; axes User 1: x_1 and User 2: x_2, with the efficiency line separating underload from overload]

Fair Allocation
Maxmin fairness
– flows which share the same bottleneck get the same amount of bandwidth
Assumes no knowledge of priorities
Fairness = 1 - distance from fairness line
[Figure: 2 user example; axes User 1: x_1 and User 2: x_2, with the fairness line separating "1 getting too much" from "2 getting too much"]

Control System Model [CJ89]
[Figure: User 1, User 2, ..., User n send at rates x_1, x_2, ..., x_n; the network compares Σ x_i > X_goal and returns feedback y]

Possible Choices
multiplicative increase, additive decrease
– a_I = 0, b_I > 1, a_D < 0, b_D = 1
additive increase, additive decrease
– a_I > 0, b_I = 1, a_D < 0, b_D = 1
multiplicative increase, multiplicative decrease
– a_I = 0, b_I > 1, a_D = 0, 0 < b_D < 1
additive increase, multiplicative decrease
– a_I > 0, b_I = 1, a_D = 0, 0 < b_D < 1
Which one?

Multiplicative Increase, Additive Decrease
Does not converge to fairness
– Not stable at all
Does not converge to efficiency
– stable iff
[Figure: trajectory in the (x_1, x_2) plane with fairness line and efficiency line: (x_1h, x_2h) -> (x_1h + a_D, x_2h + a_D) -> (b_I(x_1h + a_D), b_I(x_2h + a_D))]

Additive Increase, Additive Decrease
Does not converge to fairness
Does not converge to efficiency
– stable iff
[Figure: trajectory in the (x_1, x_2) plane with fairness line and efficiency line: (x_1h, x_2h) -> (x_1h + a_D, x_2h + a_D) -> (x_1h + a_D + a_I, x_2h + a_D + a_I)]

Multiplicative Increase, Multiplicative Decrease
Does not converge to fairness
Converges to efficiency iff
[Figure: trajectory in the (x_1, x_2) plane with fairness line and efficiency line: (x_1h, x_2h) -> (b_D x_1h, b_D x_2h) -> (b_I b_D x_1h, b_I b_D x_2h)]

Additive (and Multiplicative) Increase, Multiplicative Decrease
Converges to fairness
Converges to efficiency iff
– b_I >= 1
Increments smaller as fairness increases
– effect on metrics?
Additive Increase is better
– why?
[Figure: trajectory in the (x_1, x_2) plane with fairness line and efficiency line: (x_1h, x_2h) -> (b_D x_1h, b_D x_2h) -> (b_I b_D x_1h + a_I, b_I b_D x_2h + a_I)]
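The convergence claims for these policies can be checked with a small simulation. The sketch below assumes the linear control model of [CJ89], where each user applies x(t+1) = a_I + b_I x(t) on an underload signal and x(t+1) = a_D + b_D x(t) on an overload signal. The function name, the capacity of 100, and the starting rates are illustrative assumptions, not values from the slides.

```python
def simulate(a_i, b_i, a_d, b_d, rates, capacity, steps):
    """Run synchronous linear increase/decrease control for all users.

    Every user sees the same binary feedback: overload if the sum of
    rates exceeds capacity (standing in for X_goal), underload otherwise.
    """
    rates = list(rates)
    for _ in range(steps):
        if sum(rates) > capacity:
            rates = [a_d + b_d * x for x in rates]  # decrease step
        else:
            rates = [a_i + b_i * x for x in rates]  # increase step
    return rates


start = [10.0, 80.0]  # hypothetical, deliberately unfair starting point
aimd = simulate(1.0, 1.0, 0.0, 0.5, start, capacity=100.0, steps=200)
aiad = simulate(1.0, 1.0, -1.0, 1.0, start, capacity=100.0, steps=200)
print("AIMD gap between users:", abs(aimd[0] - aimd[1]))
print("AIAD gap between users:", abs(aiad[0] - aiad[1]))
```

Additive steps leave the difference x_1 - x_2 unchanged, while a multiplicative decrease scales it by b_D, so in this model the gap shrinks only when the decrease is multiplicative.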

Significance
Characteristics
– converges to efficiency, fairness
– easily deployable
– fully distributed
– no need to know full state of system (e.g. number of users, bandwidth of links)
Theory that enabled the Internet to grow beyond 1989
– key milestone in Internet development
– fully distributed network architecture requires fully distributed congestion control
– basis for TCP

Modeling
Critical to understanding complex systems
– [CJ89] model relevant for 13 years, 10^6 increase of bandwidth, 1000x increase in number of users
Criteria for good models
– realistic
– simple
  easy to work with
  easy for others to understand
– realistic, complex model -> useless
– unrealistic, simple model -> can teach something about best case, worst case, etc.

TCP Congestion Control
[CJ89] provides theoretical basis
– still many issues to be resolved
How to start?
Implicit congestion signal
– loss
– need to send packets to detect congestion
– must reconcile with AIMD
How to maintain equilibrium?
– use ACK: send a new packet only after you receive an ACK. Why?
– maintain number of packets in network "constant"

TCP Congestion Control
Maintains three variables:
– cwnd: congestion window
– flow_win: flow window, the receiver advertised window
– ssthresh: threshold size (used to update cwnd)
For sending use: win = min(flow_win, cwnd)

TCP: Slow Start
Goal: discover congestion quickly
How?
– quickly increase cwnd until network congested -> get a rough estimate of the optimal cwnd
– whenever starting traffic on a new connection, or whenever increasing traffic after congestion was experienced:
  set cwnd = 1
  each time a segment is acknowledged, increment cwnd by one (cwnd++)
Slow Start is not actually slow
– cwnd increases exponentially

Slow Start Example
The congestion window size grows very rapidly
TCP slows down the increase of cwnd when cwnd >= ssthresh
[Figure: sender/receiver timeline. cwnd = 1: segment 1 sent, ACK for segment 1 returns. cwnd = 2: segments 2 and 3 sent, ACK for segments 2 + 3 returns. cwnd = 4: segments 4 to 7 sent, ACK for segments 4+5+6+7 returns. cwnd = 8]

Congestion Avoidance
Slow down "Slow Start"
If cwnd >= ssthresh then each time a segment is acknowledged increment cwnd by 1/cwnd (cwnd += 1/cwnd).
So cwnd is increased by one only after a full window of segments has been acknowledged.
(more about ssthresh later)

Slow Start/Congestion Avoidance Example
Assume that ssthresh = 8
[Figure: cwnd (in segments) vs. roundtrip times, with the ssthresh level marked]

Putting Everything Together: TCP Pseudocode
Initially:
    cwnd = 1;
    ssthresh = infinite;
New ack received:
    if (cwnd < ssthresh)
        /* Slow Start */
        cwnd = cwnd + 1;
    else
        /* Congestion Avoidance */
        cwnd = cwnd + 1/cwnd;
Timeout:
    /* Multiplicative decrease */
    ssthresh = win/2;
    cwnd = 1;

while (next < unack + win)
    transmit next packet;
where win = min(cwnd, flow_win);
[Figure: sequence number line marking unack, next, and the window win]
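The pseudocode's window updates can be exercised in a few lines of Python. This is a minimal sketch that tracks only cwnd and ssthresh in units of segments; the class name `CongestionWindow` and its method names are illustrative assumptions, and no packets are actually sent.

```python
import math


class CongestionWindow:
    """Tracks cwnd and ssthresh (in segments) as in the slide's pseudocode."""

    def __init__(self, flow_win=math.inf):
        self.cwnd = 1.0
        self.ssthresh = math.inf
        self.flow_win = flow_win  # receiver advertised window

    @property
    def win(self):
        # Sending window: limited by both congestion and flow control.
        return min(self.cwnd, self.flow_win)

    def on_new_ack(self):
        if self.cwnd < self.ssthresh:
            self.cwnd += 1.0              # slow start
        else:
            self.cwnd += 1.0 / self.cwnd  # congestion avoidance

    def on_timeout(self):
        self.ssthresh = self.win / 2.0    # multiplicative decrease
        self.cwnd = 1.0


w = CongestionWindow()
for _ in range(7):
    w.on_new_ack()
print("after 7 acks:", w.cwnd)
w.on_timeout()
print("after timeout:", w.cwnd, w.ssthresh)
```

After the timeout, further acks grow cwnd by one each until it reaches the new ssthresh, and by 1/cwnd per ack from then on.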

The big picture
[Figure: cwnd vs. time, showing Slow Start, Congestion Avoidance, and the drop at each Timeout]
Recall knee-point and cliff-point!

Fast Retransmit
Don't wait for window to drain
Resend a segment after 3 duplicate ACKs
– remember a duplicate ACK means that an out-of-sequence segment was received
Notes:
– duplicate ACKs due to packet reordering or loss
– window may be too small to get duplicate ACKs
[Figure: sender/receiver timeline with cwnd = 1, 2, 4 and segments 1 to 7; one segment is lost, the following segments produce 3 duplicate ACKs, and the lost segment is resent]

Fast Recovery
After a fast retransmit, set ssthresh to cwnd/2 and cwnd to the new ssthresh
– i.e., halve cwnd; don't reset cwnd to 1
Fast Retransmit and Fast Recovery -> implemented by TCP Reno; most widely used version of TCP today

Fast Retransmit and Fast Recovery
Retransmit after 3 duplicate ACKs
– prevent expensive timeouts
No need to slow start again
At steady state, cwnd oscillates around the optimal window size.
[Figure: cwnd vs. time, showing Slow Start followed by Congestion Avoidance]

Reflections on TCP
– assumes that all sources cooperate
– assumes that congestion occurs on time scales greater than 1 RTT
– only useful for reliable, in-order delivery, non-real-time applications
– vulnerable to non-congestion related loss (e.g. wireless)
– can be unfair to long RTT flows

Principles of Reliable data transfer
– important in app., transport, link layers
– characteristics of unreliable channel will determine complexity of reliable data transfer protocol (rdt)

Reliable data transfer: getting started
send side:
– rdt_send(): called from above (e.g., by app.). Passes data to deliver to receiver upper layer
– udt_send(): called by rdt, to transfer packet over unreliable channel to receiver
receive side:
– rdt_rcv(): called when packet arrives on rcv-side of channel
– deliver_data(): called by rdt to deliver data to upper layer

Reliable data transfer: getting started
We'll:
– incrementally develop sender, receiver sides of reliable data transfer protocol (rdt)
– consider only unidirectional data transfer
  but control info will flow in both directions!
– use finite state machines (FSM) to specify sender, receiver
state: when in this "state", next state uniquely determined by next event
[Figure: state 1 -> state 2, arrow labeled with the event causing the state transition above the actions taken on the state transition]

Rdt1.0: reliable transfer over a reliable channel
underlying channel perfectly reliable
– no bit errors
– no loss of packets
separate FSMs for sender, receiver:
– sender sends data into underlying channel
– receiver reads data from underlying channel

Rdt2.0: channel with bit errors (no loss)
underlying channel may flip bits in packet
– recall: UDP checksum to detect bit errors
the question: how to recover from errors:
– acknowledgements (ACKs): receiver explicitly tells sender that pkt received OK
– negative acknowledgements (NAKs): receiver explicitly tells sender that pkt had errors
– sender retransmits pkt on receipt of NAK
new mechanisms in rdt2.0 (beyond rdt1.0):
– error detection
– receiver feedback: control msgs (ACK, NAK) rcvr->sender

rdt2.0: FSM specification
[Figure: sender FSM and receiver FSM]

rdt2.0: in action (no errors)
[Figure: sender FSM and receiver FSM]

rdt2.0: in action (error scenario)
[Figure: sender FSM and receiver FSM]

rdt2.0 has a fatal flaw!
What happens if ACK/NAK corrupted?
– sender doesn't know what happened at receiver!
– can't just retransmit: possible duplicate
What to do?
– sender ACKs/NAKs receiver's ACK/NAK? What if sender ACK/NAK lost?
– retransmit, but this might cause retransmission of correctly received pkt!
Handling duplicates:
– sender adds sequence number to each pkt
– sender retransmits current pkt if ACK/NAK garbled
– receiver discards (doesn't deliver up) duplicate pkt
stop and wait: sender sends one packet, then waits for receiver response

rdt2.1: sender, handles garbled ACK/NAKs
[Figure: sender FSM; surviving transition-guard fragments: && has_seq0(rcvpkt), && has_seq1(rcvpkt)]

rdt2.1: receiver, handles garbled ACK/NAKs
State "Wait for 0":
– rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && has_seq0(rcvpkt): Extract(rcvpkt,data); deliver_data(data); udt_send(ACK[0]); go to "Wait for 1"
– rdt_rcv(rcvpkt) && corrupt(rcvpkt): udt_send(NACK[0])
– rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && has_seq1(rcvpkt): udt_send(ACK[1])
State "Wait for 1":
– rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && has_seq1(rcvpkt): Extract(rcvpkt,data); deliver_data(data); udt_send(ACK[1]); go to "Wait for 0"
– rdt_rcv(rcvpkt) && corrupt(rcvpkt): udt_send(NACK[1])
– rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && has_seq0(rcvpkt): udt_send(ACK[0])

rdt2.1: discussion
Sender:
– seq # added to pkt
– two seq. #'s (0,1) will suffice. Why?
– must check if received ACK/NAK corrupted
– twice as many states
  state must "remember" whether "current" pkt has 0 or 1 seq. #
Receiver:
– must check if received packet is duplicate
  state indicates whether 0 or 1 is expected pkt seq #
– note: receiver cannot know if its last ACK/NAK was received OK at sender

rdt2.2: a NAK-free protocol
same functionality as rdt2.1, using ACKs only
instead of NAK, receiver sends ACK for last pkt received OK
– receiver must explicitly include seq # of pkt being ACKed
duplicate ACK at sender results in same action as NAK: retransmit current pkt
[Figure: sender FSM]

rdt3.0: channels with errors and loss
New assumption: underlying channel can also lose packets (data or ACKs)
– checksum, seq. #, ACKs, retransmissions will be of help, but not enough
Q: how to deal with loss?
– sender waits until certain data or ACK lost, then retransmits
– yuck: drawbacks?
Approach: sender waits "reasonable" amount of time for ACK
– retransmits if no ACK received in this time
– if pkt (or ACK) just delayed (not lost):
  retransmission will be duplicate, but use of seq. #'s already handles this
  receiver must specify seq # of pkt being ACKed
– requires countdown timer

rdt3.0 sender
[Figure: sender FSM; surviving annotations: "(no need to resend)", "stop timer"]

Receiver FSM
State "Wait for 0":
– rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && has_seq0(rcvpkt): Extract(rcvpkt,data); deliver_data(data); udt_send(ACK[0]); go to "Wait for 1"
– rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && has_seq1(rcvpkt): udt_send(ACK[1])
– rdt_rcv(rcvpkt) && corrupt(rcvpkt): no action shown
State "Wait for 1":
– rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && has_seq1(rcvpkt): Extract(rcvpkt,data); deliver_data(data); udt_send(ACK[1]); go to "Wait for 0"
– rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && has_seq0(rcvpkt): udt_send(ACK[0])
– rdt_rcv(rcvpkt) && corrupt(rcvpkt): no action shown

rdt3.0 in action

Performance of rdt3.0
rdt3.0 works, but performance stinks
example: 1 Gbps link, 15 ms e-e prop. delay, 1KB packet:
T_transmit = (8 kb/pkt) / (10**9 b/sec) = 8 microsec
Utilization = U = fraction of time sender busy sending = T_transmit / (RTT + T_transmit) = 8 microsec / 30.008 msec = 0.00027
– 1KB pkt every 30 ms -> 33kB/s thruput over 1 Gbps link
– network protocol limits use of physical resources!
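The utilization arithmetic is easy to reproduce. The sketch below assumes U = T_transmit / (RTT + T_transmit), with RTT taken as twice the one-way propagation delay and ACK transmission time ignored. The function name and the optional `in_flight` parameter (which anticipates pipelining) are illustrative assumptions.

```python
def sender_utilization(packet_bits, link_bps, rtt_s, in_flight=1):
    """Fraction of time the sender is busy transmitting.

    With in_flight=1 this models stop-and-wait; larger values model a
    sender allowed to keep that many packets unacknowledged at once.
    """
    t_transmit = packet_bits / link_bps
    return min(1.0, in_flight * t_transmit / (rtt_s + t_transmit))


# Slide parameters: 1 Gbps link, 15 ms one-way delay (30 ms RTT), 8000-bit packet.
u = sender_utilization(8000, 1e9, 0.030)
print(f"stop-and-wait utilization: {u:.5f}")
print(f"effective throughput: {u * 1e9 / 8 / 1000:.1f} kB/s")
```

Raising `in_flight` scales utilization linearly until the link is saturated, which is the motivation for pipelined protocols.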

Pipelined protocols
Pipelining: sender allows multiple, "in-flight", yet-to-be-acknowledged pkts
– range of sequence numbers must be increased
– buffering at sender and/or receiver
Two generic forms of pipelined protocols: Go-Back-N, selective repeat

Go-Back-N (GBN)
Sender:
– k-bit seq # in pkt header
– "window" of up to N consecutive unACKed pkts allowed
– ACK(n): ACKs all pkts up to and including seq # n ("cumulative ACK")
  may receive duplicate ACKs (see receiver)
– timer for each in-flight pkt
– timeout(n): retransmit pkt n and all higher seq # pkts in window
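The sender-side bookkeeping can be sketched as follows. This is a simplified model that only tracks sequence numbers: it uses unbounded integers instead of a k-bit wrapping field, leaves timers to the caller, and the class and method names are illustrative assumptions.

```python
class GoBackNSender:
    """Tracks the Go-Back-N send window over unbounded sequence numbers."""

    def __init__(self, window_size):
        self.window_size = window_size  # N
        self.base = 0                   # oldest unACKed seq #
        self.next_seq = 0               # next seq # to assign

    def send(self):
        """Return the seq # of the packet to send, or None if the window is full."""
        if self.next_seq >= self.base + self.window_size:
            return None
        seq = self.next_seq
        self.next_seq += 1
        return seq

    def on_ack(self, n):
        """Cumulative ACK: everything up to and including n is acknowledged."""
        if n >= self.base:
            self.base = n + 1
        # ACKs below base are duplicates and change nothing.

    def on_timeout(self, n):
        """Return the seq #s to retransmit: pkt n and all higher sent pkts."""
        return list(range(n, self.next_seq))


s = GoBackNSender(window_size=4)
print([s.send() for _ in range(5)])  # the fifth send is refused
s.on_ack(1)
print(s.send(), s.on_timeout(2))
```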

GBN: sender extended FSM

GBN: receiver extended FSM
receiver simple:
ACK-only: always send ACK for correctly-received pkt with highest in-order seq #
– may generate duplicate ACKs
– need only remember expectedseqnum
out-of-order pkt:
– discard (don't buffer) -> no receiver buffering!
– ACK pkt with highest in-order seq #

GBN in action

When is GBN not that bad?
Error rate or distribution?
– Long-term or short-term fading
– Window size
Loss rate or distribution?
RTT?
Link bandwidth?
Complexity?

Selective Repeat
receiver individually acknowledges all correctly received pkts
– buffers pkts, as needed, for eventual in-order delivery to upper layer
sender only resends pkts for which ACK not received
– sender timer for each unACKed pkt
sender window
– N consecutive seq #'s
– again limits seq #s of sent, unACKed pkts

Selective repeat: sender, receiver windows

Selective repeat
sender:
– data from above: if next available seq # in window, send pkt
– timeout(n): resend pkt n, restart timer
– ACK(n) in [sendbase,sendbase+N]:
  mark pkt n as received
  if n smallest unACKed pkt, advance window base to next unACKed seq #
receiver:
– pkt n in [rcvbase, rcvbase+N-1]:
  send ACK(n)
  out-of-order: buffer
  in-order: deliver (also deliver buffered, in-order pkts), advance window to next not-yet-received pkt
– pkt n in [rcvbase-N,rcvbase-1]:
  ACK(n)
– otherwise: ignore
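The receiver rules translate almost directly into code. The following sketch assumes unbounded integer sequence numbers (no wraparound) and returns the ACK to send together with any data delivered in order; the class name and return convention are illustrative assumptions.

```python
class SelectiveRepeatReceiver:
    """Selective repeat receiver over unbounded sequence numbers."""

    def __init__(self, window_size):
        self.window_size = window_size  # N
        self.rcv_base = 0               # next in-order seq # expected
        self.buffer = {}                # out-of-order pkts awaiting delivery

    def on_packet(self, n, data):
        """Return (ack, delivered): the seq # to ACK (or None) and in-order data."""
        if self.rcv_base <= n <= self.rcv_base + self.window_size - 1:
            self.buffer.setdefault(n, data)
            delivered = []
            # Deliver the in-order run starting at rcv_base, advancing the window.
            while self.rcv_base in self.buffer:
                delivered.append(self.buffer.pop(self.rcv_base))
                self.rcv_base += 1
            return n, delivered
        if self.rcv_base - self.window_size <= n <= self.rcv_base - 1:
            return n, []  # already delivered: re-ACK so the sender can advance
        return None, []   # outside both ranges: ignore


r = SelectiveRepeatReceiver(window_size=4)
print(r.on_packet(1, "b"))  # buffered, nothing delivered yet
print(r.on_packet(0, "a"))  # fills the gap, both delivered
print(r.on_packet(0, "a"))  # duplicate, re-ACKed only
```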

Selective repeat in action

Selective repeat: dilemma
Example:
– seq #'s: 0, 1, 2, 3
– window size = 3
– receiver sees no difference in two scenarios!
– incorrectly passes duplicate data as new in (a)
Q: what relationship between seq # size and window size?
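One way to explore the question is to list the sequence numbers that are ambiguous in the worst case. The sketch assumes the receiver has accepted a full window while none of its ACKs reached the sender, so the sender may still retransmit the old window while the receiver already expects the next one. The function name is a hypothetical helper.

```python
def ambiguous_seq_numbers(seq_space, window):
    """Seq #s that could denote either an old retransmission or new data."""
    old_window = {n % seq_space for n in range(0, window)}           # sender may resend
    new_window = {n % seq_space for n in range(window, 2 * window)}  # receiver expects
    return sorted(old_window & new_window)


print(ambiguous_seq_numbers(seq_space=4, window=3))  # the slide's example
print(ambiguous_seq_numbers(seq_space=4, window=2))
```

Under this worst case, the overlap is empty exactly when the window is at most half the sequence number space.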

TCP on Mobile Ad Hoc Networks

Overview of TCP/IP

Internet Protocol (IP)
– Packets may be delivered out-of-order
– Packets may be lost
– Packets may be duplicated

Transmission Control Protocol (TCP)
Reliable ordered delivery
Implements congestion avoidance and control
Reliability achieved by means of retransmissions if necessary
End-to-end semantics
– Acknowledgements sent to TCP sender to confirm delivery of data received by TCP receiver
– Ack for data sent only after data has reached receiver

TCP Basics
Cumulative acknowledgements
An acknowledgement acks all contiguously received data
TCP assigns byte sequence numbers
For simplicity, we will assign packet sequence numbers
Also, we use slightly different syntax for acks than normal TCP syntax
– In our notation, ack i acknowledges receipt of packets through packet i

Cumulative Acknowledgements
A new cumulative acknowledgement is generated only on receipt of a new in-sequence packet
[Figure: data packets (labeled i) in flight from src to dest and acks (labeled i) returning, shown at two instants as the window of packet numbers advances by one]

Duplicate Acknowledgements
A dupack is generated whenever an out-of-order segment arrives at the receiver
[Figure: data packets and acks in flight at two instants; on receipt of 38, the receiver sends a dupack for 36]
(Above example assumes delayed acks)

Window Based Flow Control
Sliding window protocol
Window size minimum of
– receiver's advertised window: determined by available buffer space at the receiver
– congestion window: determined by the sender, based on feedback from the network
[Figure: packets 1 to 13 in sequence; acks received for the earliest packets, the sender's window over the next packets, and the remaining packets not transmitted]

Window Based Flow Control
[Figure: packets 1 to 13 with the sender's window shown before and after receiving Ack 5; the window slides forward]

Window Based Flow Control
Congestion window size (W) bounds the amount of data that can be sent per round-trip time
Throughput <= W / RTT

Ideal Window Size
Ideal size = delay * bandwidth
– delay-bandwidth product
What if window size < delay*bw ?
– Inefficiency (wasted bandwidth)
What if > delay*bw ?
– Queuing at intermediate routers
  increased RTT due to queuing delays
– Potentially, packet loss
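A short calculation makes the trade-off concrete. This sketch assumes "delay" means the round-trip time and measures windows in bytes; the function names and the 10 Mb/s, 100 ms example link are illustrative assumptions.

```python
def delay_bandwidth_product(bandwidth_bps, rtt_s):
    """Ideal window in bytes: the amount of data that fills the path for one RTT."""
    return bandwidth_bps * rtt_s / 8


def throughput_bound(window_bytes, rtt_s):
    """Upper bound on throughput in bytes/s: at most one window per RTT."""
    return window_bytes / rtt_s


ideal = delay_bandwidth_product(10e6, 0.100)  # hypothetical 10 Mb/s, 100 ms path
print("ideal window (bytes):", ideal)
# A window of half the ideal size can use at most half the link.
print("bound with half window (bytes/s):", throughput_bound(ideal / 2, 0.100))
```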

How does TCP detect a packet loss?
– Retransmission timeout (RTO)
– Duplicate acknowledgements

Detecting Packet Loss Using Retransmission Timeout (RTO)
At any time, TCP sender sets retransmission timer for only one packet
If acknowledgement for the timed packet is not received before timer goes off, the packet is assumed to be lost
RTO dynamically calculated

Retransmission Timeout (RTO) calculation
RTO = mean + 4 * mean deviation
– Standard deviation σ: σ^2 = average of (sample - mean)^2
– Mean deviation δ: δ = average of |sample - mean|
– Mean deviation easier to calculate than standard deviation
– Mean deviation is more conservative
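A running version of this calculation can be sketched with exponentially weighted averages. The smoothing gains (1/8 for the mean, 1/4 for the deviation) and the first-sample initialization follow the style of RFC 6298 and are assumptions relative to the slide, which only gives RTO = mean + 4 * mean deviation. Clock granularity and minimum RTO are omitted.

```python
class RtoEstimator:
    """Smoothed RTT mean and mean deviation, combined into an RTO."""

    def __init__(self, alpha=1 / 8, beta=1 / 4):
        self.alpha = alpha   # gain for the smoothed mean
        self.beta = beta     # gain for the smoothed mean deviation
        self.srtt = None
        self.rttvar = None

    def update(self, sample):
        """Fold in one RTT sample (seconds) and return the new RTO."""
        if self.srtt is None:
            self.srtt = sample
            self.rttvar = sample / 2
        else:
            # Deviation is updated first, using the previous mean.
            self.rttvar = (1 - self.beta) * self.rttvar + self.beta * abs(self.srtt - sample)
            self.srtt = (1 - self.alpha) * self.srtt + self.alpha * sample
        return self.srtt + 4 * self.rttvar


est = RtoEstimator()
for rtt in (0.100, 0.100, 0.180, 0.100):
    print(f"sample={rtt:.3f}s rto={est.update(rtt):.3f}s")
```

A single delayed sample raises the RTO mainly through the deviation term, which then decays as stable samples arrive.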

Exponential Backoff
Double RTO on each timeout
[Figure: timeline. Packet transmitted; time-out T1 occurs before ack received, packet retransmitted; timeout interval doubled, T2 = 2 * T1]
Windows: initially 3s, max. 240s
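The doubling rule yields a simple schedule of timeout intervals. The sketch below reads the slide's "initially 3s, max. 240s" as an initial RTO and an upper cap, which is an interpretation; the function name is a hypothetical helper.

```python
def backoff_schedule(initial_rto, max_rto, timeouts):
    """Timeout interval used for each successive retransmission attempt."""
    rto = initial_rto
    schedule = []
    for _ in range(timeouts):
        schedule.append(rto)
        rto = min(2 * rto, max_rto)  # double, but never exceed the cap
    return schedule


print(backoff_schedule(initial_rto=3, max_rto=240, timeouts=8))
```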

Fast Retransmission
Timeouts can take too long
– how to initiate retransmission sooner?
Fast retransmit

Detecting Packet Loss Using Dupacks: Fast Retransmit Mechanism
Dupacks may be generated due to
– packet loss, or
– out-of-order packet delivery
TCP sender assumes that a packet loss has occurred if it receives three dupacks consecutively
[Figure: packets 7 to 12 in flight, with packet 8 lost]
Receipt of packets 9, 10 and 11 will each generate a dupack from the receiver. The sender, on getting these dupacks, will retransmit packet 8.

Congestion Avoidance and Control
Slow Start: cwnd grows exponentially with time during slow start
When cwnd reaches slow-start threshold, congestion avoidance is performed
Congestion avoidance: cwnd increases linearly with time during congestion avoidance
– Rate of increase could be lower if sender does not always have data to send

[Figure: congestion window vs. time, showing slow start up to the slow start threshold followed by congestion avoidance]
Example assumes that acks are not delayed

Congestion Control
On detecting a packet loss, TCP sender assumes that network congestion has occurred
On detecting packet loss, TCP sender drastically reduces the congestion window
Reducing congestion window reduces amount of data that can be sent per RTT

Congestion Control -- Timeout
On a timeout, the congestion window is reduced to the initial value of 1 MSS
The slow start threshold is set to half the window size before packet loss
– more precisely, ssthresh = maximum of min(cwnd, receiver's advertised window)/2 and 2 MSS
Slow start is initiated

[Figure: congestion window vs. time around a timeout; ssthresh = 8 initially, cwnd = 20 when the timeout occurs, ssthresh = 10 after the timeout]

Congestion Control - Fast retransmit
Fast retransmit occurs when multiple (>= 3) dupacks come back
Fast recovery follows fast retransmit
Different from timeout: slow start follows timeout
– timeout occurs when no more packets are getting across
– fast retransmit occurs when a packet is lost, but later packets get through
– ack clock is still there when fast retransmit occurs
– no need to slow start

Fast Recovery
ssthresh = min(cwnd, receiver's adv. window)/2
– (at least 2 MSS)
retransmit the missing segment (fast retransmit)
cwnd = ssthresh + number of dupacks
– Temporary inflation
when a new ack comes: cwnd = ssthresh
– enter congestion avoidance
Congestion window cut in half
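These steps can be traced with a small state machine. The sketch below counts windows in segments (1 MSS = 1), triggers on the third dupack, and inflates cwnd by one per further dupack. The class and method names are illustrative assumptions, and details of real stacks (partial acks, SACK) are left out.

```python
import math


class RenoWindow:
    """cwnd/ssthresh updates for fast retransmit and fast recovery, in segments."""

    def __init__(self, cwnd, rwnd=math.inf):
        self.cwnd = float(cwnd)
        self.rwnd = rwnd           # receiver's advertised window
        self.ssthresh = math.inf
        self.dupacks = 0
        self.in_recovery = False

    def on_dupack(self):
        """Return True when the missing segment should be retransmitted."""
        self.dupacks += 1
        if self.in_recovery:
            self.cwnd += 1.0       # temporary inflation per extra dupack
            return False
        if self.dupacks == 3:
            self.ssthresh = max(min(self.cwnd, self.rwnd) / 2.0, 2.0)
            self.cwnd = self.ssthresh + self.dupacks
            self.in_recovery = True
            return True            # fast retransmit
        return False

    def on_new_ack(self):
        self.dupacks = 0
        if self.in_recovery:
            self.cwnd = self.ssthresh  # deflate, enter congestion avoidance
            self.in_recovery = False
        elif self.cwnd < self.ssthresh:
            self.cwnd += 1.0           # slow start
        else:
            self.cwnd += 1.0 / self.cwnd  # congestion avoidance


w = RenoWindow(cwnd=20)
print([w.on_dupack() for _ in range(3)], w.cwnd, w.ssthresh)
w.on_new_ack()
print("after recovery:", w.cwnd)
```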

After fast retransmit and fast recovery, window size is reduced in half.
[Figure: congestion window vs. time, with the receiver's advertised window marked and the halved window after fast recovery]

TCP Performance in Mobile Ad Hoc Networks (MANETs)

Performance of TCP
Several factors affect TCP performance in MANET:
– Wireless transmission errors
– Multi-hop routes on shared wireless medium
  For instance, adjacent hops typically cannot transmit simultaneously
– Route failures due to mobility

Random Errors
If number of bit errors is small, they may be corrected by an error correcting code
Excessive bit errors result in a packet being discarded, possibly before it reaches the transport layer

Random Errors May Cause Fast Retransmit
[Figure: data packets and acks in flight between sender and receiver]
Example assumes delayed ack: every other packet ack'd

Random Errors May Cause Fast Retransmit
[Figure: the same flow one step later, with the window advanced by one packet]
Example assumes delayed ack: every other packet ack'd

Random Errors May Cause Fast Retransmit
[Figure: a packet is lost to a random error; the next packet to arrive triggers a dupack for 36]
Duplicate acks are not delayed

Random Errors May Cause Fast Retransmit
[Figure: further packets arrive and generate more duplicate acks for 36]

Random Errors May Cause Fast Retransmit
[Figure: the third duplicate ack for 36 reaches the sender]
3 duplicate acks trigger fast retransmit at sender

Random Errors May Cause Fast Retransmit
Fast retransmit results in
– retransmission of lost packet
– reduction in congestion window
Reducing congestion window in response to errors is unnecessary
Reduction in congestion window reduces the throughput

Sometimes Congestion Response May be Appropriate in Response to Errors
On a CDMA channel, errors occur due to interference from other users, and due to noise [Karn99pilc]
– Interference due to other users is an indication of congestion. If such interference causes transmission errors, it is appropriate to reduce congestion window
– If noise causes errors, it is not appropriate to reduce window
When a channel is in a bad state for a long duration, it might be better to let TCP back off, so that it does not unnecessarily attempt retransmissions while the channel remains in the bad state [Padmanabhan99pilc]
pilc: IETF Performance Implications of Link Characteristics

Impact of Random Errors [Vaidya99]
– Exponential error model
– 2 Mbps wireless full duplex link
– No congestion losses

Burst Errors May Cause Timeouts
If wireless link remains unavailable for extended duration, a window worth of data may be lost
– E.g., driving through a tunnel
Timeout results in slow start
Slow start reduces congestion window to 1 MSS, reducing throughput
Reduction in window in response to errors unnecessary

Random Errors May Also Cause Timeout
Multiple packet losses in a window can result in timeout when using TCP-Reno (and to a lesser extent when using SACK)

Impact of Transmission Errors
TCP cannot distinguish between packet losses due to congestion and transmission errors
Unnecessarily reduces congestion window
Throughput suffers

Mobile Ad Hoc Networks
May need to traverse multiple links to reach a destination

Mobile Ad Hoc Networks
Mobility causes route changes

Throughput over Multi-Hop Wireless Paths
Connections over multiple hops are at a disadvantage compared to shorter connections, because they have to contend for wireless access at each hop

Impact of Multi-Hop Wireless Paths
[Figure: TCP throughput using 2 Mbps 802.11 MAC]

Throughput Degradations with Increasing Number of Hops
Packet transmission can occur on at most one hop among three consecutive hops
– Increasing the number of hops from 1 to 2 to 3 results in increased delay and decreased throughput
Increasing the number of hops beyond 3 allows simultaneous transmissions on more than one link; however, degradation continues due to contention between TCP data and acks traveling in opposite directions
When the number of hops is large enough, the throughput stabilizes due to effective pipelining

Performance Metric
Expected throughput $= \sum_i f_i \, T_i$
$f_i$ = fraction of time TCP source and receiver are $i$ hops away
$T_i$ = TCP throughput across an $i$-hop network
[Figure: min path length (hops, 0 to 3) vs. time (0 to 40 seconds); for the trace shown, expected throughput $= \frac{10}{40} T_1 + \frac{30}{40} T_2$]
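The metric above weights each per-hop throughput T_i by the fraction of time f_i spent at that distance. A minimal sketch of the calculation follows; the function name `expected_throughput`, the once-per-second sampling, and the throughput figures are all hypothetical values chosen for illustration, not measurements from the slides. Hop counts with no known throughput (for example, a partitioned network at 0 hops) are assumed to contribute zero.

```python
from collections import Counter


def expected_throughput(min_path_lengths, throughput_by_hops):
    """Weight each T_i by f_i, the fraction of samples at i hops.

    min_path_lengths: one hop-count sample per equal time slot.
    throughput_by_hops: maps hop count i to T_i (any consistent unit).
    """
    counts = Counter(min_path_lengths)
    total = len(min_path_lengths)
    # Hop counts missing from the table are assumed to yield zero throughput.
    return sum(
        (n / total) * throughput_by_hops.get(hops, 0.0)
        for hops, n in counts.items()
    )


# Hypothetical 40-second trace: 1 hop for 10 s, then 2 hops for 30 s.
samples = [1] * 10 + [2] * 30
rates_kbps = {1: 1500.0, 2: 700.0}  # made-up T_1 and T_2
result = expected_throughput(samples, rates_kbps)
print(result)
```

With these made-up numbers the result is one quarter of T_1 plus three quarters of T_2.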

Impact of Mobility
[Figure: TCP throughput (Kbps), ideal vs. actual, at 2 m/s and 10 m/s]

Impact of Mobility
[Figure: ideal vs. actual throughput at 20 m/s and 30 m/s]

Throughput generally degrades with increasing speed …
[Figure: average throughput over 50 runs vs. speed (m/s), ideal and actual]

But not always

Why Does Throughput Degrade?
[Timeline figure, labels:]
Mobility causes link breakage, resulting in route failure
TCP data and acks en route discarded
Route is repaired
No throughput despite route repair
TCP sender times out. Starts sending packets again

Why Does Throughput Degrade?
[Timeline figure, labels:]
Mobility causes link breakage, resulting in route failure
TCP data and acks en route discarded
TCP sender times out (after t). Backs off timer
Route is repaired
No throughput despite route repair
TCP sender times out (after a further 2t). Resumes sending
Larger route repair delays especially harmful

Why Does Throughput Improve? Low Speed Scenario
[Figure: nodes A, B, C, D at three instants; 1.5 second route failure]
Route from A to D is broken for ~1.5 seconds.
When the TCP sender times out after 1 second, the route is still broken.
TCP times out after another 2 seconds, and only then resumes.

Why Does Throughput Improve? Higher (double) Speed Scenario
[Figure: nodes A, B, C, D at three instants; 0.75 second route failure]
Route from A to D is broken for ~0.75 second.
When the TCP sender times out after 1 second, the route is already repaired.
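The arithmetic behind the two scenarios can be checked with a small sketch. It assumes, as a simplification, that the retransmission timer starts at the moment the route breaks, that the first timeout fires after 1 second, and that each further timeout doubles the interval; the function name `resume_time` is hypothetical.

```python
def resume_time(route_down_for, initial_rto=1.0):
    """Seconds after the route failure at which a timeout first finds
    the route repaired, so that TCP can resume sending.

    Assumes the timer starts at the failure and doubles on every timeout.
    """
    elapsed = 0.0
    rto = initial_rto
    while True:
        elapsed += rto            # the next timeout fires
        if elapsed >= route_down_for:
            return elapsed        # route is up: retransmission succeeds
        rto *= 2                  # route still broken: back off the timer


low_speed = resume_time(1.5)     # 1.5 s failure
high_speed = resume_time(0.75)   # 0.75 s failure
print(low_speed, low_speed - 1.5)
print(high_speed, high_speed - 0.75)
```

Under these assumptions the low-speed sender resumes 3 seconds after the failure and wastes 1.5 seconds of a working route, while the higher-speed sender resumes after 1 second and wastes only 0.25 second.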

Why Does Throughput Improve? General Principle
The previous two slides show a plausible cause for improved throughput
TCP timeout interval somewhat (not entirely) independent of speed
Network state at higher speed, when timeout occurs, may be more favorable than at lower speed
Network state
– Link/route status
– Route caches
– Congestion

How to Improve Throughput (Bring Closer to Ideal)
Network feedback
Inform TCP of route failure by explicit message
Let TCP know when route is repaired
– Probing
– Explicit notification
Reduces repeated TCP timeouts and backoff

Performance with Explicit Notification

Issues: Network Feedback
Network knows best (why packets are lost)
+ Network feedback beneficial
- Need to modify transport & network layer to receive/send feedback
Need mechanisms for information exchange between layers

Impact of Caching
Route caching has been suggested as a mechanism to reduce route discovery overhead [Broch98]
Each node may cache one or more routes to a given destination
When a route from S to D is detected as broken, node S may:
– Use another cached route from local cache, or
– Obtain a new route using cached route at another node

To Cache or Not to Cache
[Figure: actual throughput (as fraction of expected throughput) vs. average speed (m/s)]

Why Performance Degrades With Caching
When a route is broken, route discovery returns a cached route from local cache or from a nearby node
After a time-out, TCP sender transmits a packet on the new route. However, the cached route has also broken after it was cached
Another route discovery, and TCP time-out interval
Process repeats until a good route is found
[Timeline figure: timeout due to route failure; timeout, cached route is broken; timeout, second cached route also broken]

Issues: To Cache or Not to Cache
Caching can result in faster route “repair”
Faster does not necessarily mean correct
If incorrect repairs occur often enough, caching performs poorly
Need mechanisms for determining when cached routes are stale

Caching and TCP performance
Caching can reduce overhead of route discovery even if cache accuracy is not very high
But if cache accuracy is not high enough, gains in routing overhead may be offset by loss of TCP performance due to multiple time-outs

TCP Performance
Two factors result in degraded throughput in presence of mobility:
Loss of throughput that occurs while waiting for TCP sender to time out (as seen earlier)
– This factor can be mitigated by using explicit notifications and better route caching mechanisms
Poor choice of congestion window and RTO values after a new route has been found
– How to choose cwnd and RTO after a route change?

Issues: Window Size After Route Repair
Same as before route break: may be too optimistic
Same as startup: may be too conservative
Better to be conservative than overly optimistic
– Reset window to small value after route repair
– Let TCP figure out the suitable window size
– Impact low on paths with small delay-bandwidth product

Issues: RTO After Route Repair
Same as before route break
– If new route is long, this RTO may be too small, leading to timeouts
Same as TCP start-up
– May be too large
– May result in slow response to next packet loss
Another plausible approach: new RTO = function of old RTO, old route length, and new route length
– Example: new RTO = old RTO * new route length / old route length
– Not evaluated yet
– Pitfall: RTT is not just a function of route length
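The example rule above (new RTO = old RTO * new route length / old route length) could be sketched as below. The slides note that this approach has not been evaluated, so this is only an illustration: the function name `scaled_rto` and the clamping bounds `min_rto` and `max_rto` are assumptions added here, not part of the slide's proposal.

```python
def scaled_rto(old_rto, old_hops, new_hops, min_rto=1.0, max_rto=60.0):
    """Scale the retransmission timeout by the ratio of route lengths.

    min_rto and max_rto are assumed safety bounds (in seconds); the
    slide's rule is just the proportional scaling on the next line.
    """
    if old_hops <= 0 or new_hops <= 0:
        raise ValueError("route lengths must be positive hop counts")
    rto = old_rto * new_hops / old_hops
    # Clamp so that a very short new route cannot produce a tiny RTO.
    return min(max(rto, min_rto), max_rto)


longer = scaled_rto(2.0, old_hops=2, new_hops=5)   # route grew from 2 to 5 hops
shorter = scaled_rto(2.0, old_hops=4, new_hops=1)  # route shrank from 4 hops to 1
print(longer, shorter)
```

The pitfall named on the slide still applies: queueing and contention also shape RTT, so hop count alone can mispredict the right timeout, and the normal RTT estimator should take over once fresh samples arrive on the new route.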

Out-of-Order Packet Delivery
Out-of-order (OOO) delivery may occur due to:
– Route changes
– Link-layer retransmission schemes that deliver OOO
Significant OOO delivery confuses TCP, triggering fast retransmit
Potential solutions:
– Deterministically prefer one route over others, even if multiple routes are known
– Reduce OOO delivery by re-ordering received packets (can result in unnecessary delay in presence of packet loss)
– Turn off fast retransmit (can result in poor performance in presence of congestion)

Impact of Acknowledgements
TCP acks (and link-layer acks) share the wireless bandwidth with TCP data packets
Data and acks travel in opposite directions
– In addition to bandwidth usage, acks require additional receive-send turnarounds, which also incur a time penalty
– Goal: reduce the frequency of send-receive turnarounds and the contention between acks and data

Impact of Acks: Mitigation [Balakrishnan97]
Piggybacking link layer acks with data
Sending fewer TCP acks: ack every d-th packet (d may be chosen dynamically)
– but need to use rate control at sender to reduce burstiness (for large d)
Ack filtering: gateway may drop an older ack in the queue, if a new ack arrives
– reduces number of acks that need to be delivered to the sender
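Ack filtering as described above relies on TCP acks being cumulative: a newer ack makes older queued acks of the same connection redundant. The sketch below shows one way a gateway queue might apply that idea. The class `AckFilteringQueue`, its packet tuples, and the per-flow identifier are hypothetical names for this example, and the choice to keep acks with an equal number (so duplicate acks still reach the sender) is an assumption of the sketch.

```python
from collections import deque


class AckFilteringQueue:
    """Toy gateway queue holding ("data" | "ack", flow_id, value) tuples."""

    def __init__(self):
        self.queue = deque()

    def enqueue_data(self, flow_id, payload):
        self.queue.append(("data", flow_id, payload))

    def enqueue_ack(self, flow_id, ack_no):
        # Drop queued acks of the same flow that the new cumulative ack
        # supersedes. Equal ack numbers are kept, so duplicate acks
        # (needed for fast retransmit) are not filtered out.
        self.queue = deque(
            pkt for pkt in self.queue
            if not (pkt[0] == "ack" and pkt[1] == flow_id and pkt[2] < ack_no)
        )
        self.queue.append(("ack", flow_id, ack_no))


q = AckFilteringQueue()
q.enqueue_ack("A", 10)
q.enqueue_ack("A", 11)   # supersedes ack 10 of flow A
q.enqueue_ack("B", 5)    # different flow, left untouched
q.enqueue_ack("A", 12)   # supersedes ack 11 of flow A
print(list(q.queue))
```

Only one ack per flow remains queued here, which is the reduction in acks delivered to the sender that the slide describes.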
