CSCI-1680 Transport Layer II Data over TCP Based partly on lecture notes by David Mazières, Phil Levis, John Jannotti Theophilus Benson.



TCP goal Last Class: We should not send more data than the receiver can take: flow control This Class: not send more data than the network can take: congestion control

First Goal We should not send more data than the receiver can take: flow control When to send data? –Receiver uses window header field to tell sender how much space it has –Send when there’s space!! How much data to send? –Data is sent in MSS-sized segments Chosen to avoid fragmentation

Two Problems Sending lots of small packets –Telnet or SSH generates very small packets –2 bytes of payload + N bytes of header –Goodput will be bad ACKs waste network bandwidth –A packet that carries only an ACK –is a waste of a whole packet –Very bad throughput

Small Packets: When to Transmit? Nagle’s algorithm Goal: reduce the overhead of small packets
    If available data and window >= MSS
        Send an MSS segment
    else
        If there is unACKed data in flight
            buffer the new data until ACK arrives
        else
            send all the new data now
Receiver should avoid advertising a window <= MSS after advertising a window of 0
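The decision rule above can be sketched as a single function. This is a minimal illustration, not a real TCP stack: the name `nagle_decide` and its parameters are hypothetical, and capping the "send all the new data now" case at the window size is an assumption added here.

```python
def nagle_decide(buffered: int, window: int, mss: int, unacked_in_flight: bool) -> int:
    """Return how many bytes to send right now (0 means keep buffering)."""
    if buffered >= mss and window >= mss:
        return mss                    # a full segment is ready: send it
    if unacked_in_flight:
        return 0                      # hold small data until the ACK arrives
    return min(buffered, window)      # nothing in flight: send what we have
```

With an MSS of 1460 bytes, a 2-byte keystroke is sent immediately on an idle connection but is held back while earlier data is still unACKed.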

Expensive Acks Delayed Acknowledgments Goal: Piggy-back ACKs on Data –Delay ACK for 200ms in case application sends data –If more data received, immediately ACK second segment –Note: never delay duplicate ACKs (if missing a segment) Warning: can interact very badly with Nagle –Temporary deadlock –Can disable Nagle with TCP_NODELAY –Application can also avoid many small writes
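As a small illustration of the TCP_NODELAY option mentioned above, the sketch below disables Nagle's algorithm on a socket using Python's standard `socket` module. The helper name `make_nodelay_socket` is hypothetical; the socket is created but not connected.

```python
import socket


def make_nodelay_socket() -> socket.socket:
    """Create a TCP socket with Nagle's algorithm disabled."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    # A non-zero TCP_NODELAY value tells the stack to send small writes
    # immediately instead of waiting for outstanding ACKs.
    sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1)
    return sock
```

Batching small writes in the application is usually the gentler fix; TCP_NODELAY trades extra small packets for lower latency.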

Limitations of Flow Control Network may be the bottleneck Signal from receiver not enough! Sending too fast will cause queue overflows, heavy packet loss Flow control provides correctness Need more for performance: congestion control

Congestion Collapse Nagle, RFC 896, 1984 Mid 1980s. Problem with the protocol implementations, not the protocol! What was happening? –Load on the network → buffers at routers fill up → round trip time increases If close to capacity, and, e.g., a large flow arrives suddenly… –RTT estimates become too short –Lots of retransmissions → increase in queue size –Eventually many drops happen (full queues) –Fraction of useful packets (not copies) decreases

[Figure: Alice sends to Bob over a link with capacity = 5 pkts and a buffer that is 50% full. A new packet has to get in the queue and wait its turn.] Waiting in a queue == longer RTT

TCP Congestion Control 3 Key Challenges –Determining the available capacity in the first place –Adjusting to changes in the available capacity –Sharing capacity between flows Idea –Each source determines network capacity for itself –Rate is determined by window size –Uses implicit feedback (drops, delay) –ACKs pace transmission (self-clocking)

Dealing with Congestion TCP keeps congestion and flow control windows –Max packets in flight is lesser of two Sending rate: ~Window/RTT The key here is how to set the congestion window to respond to congestion signals
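The two rules on this slide translate directly into arithmetic. The sketch below is illustrative only; the function names are hypothetical, windows are in bytes, and RTT is in seconds.

```python
def effective_window(cwnd: int, rwnd: int) -> int:
    """Bytes allowed in flight: the lesser of congestion and flow-control windows."""
    return min(cwnd, rwnd)


def approx_rate(window_bytes: int, rtt_seconds: float) -> float:
    """Rough sending rate in bytes per second: one window per round trip."""
    return window_bytes / rtt_seconds
```

For example, a 64,000-byte window over a 0.5 s RTT gives roughly 128,000 bytes per second, no matter how fast the link is.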

TCP’s 3 Challenges Determining the available capacity in the first place –Exponential increase in congestion window Sharing capacity between flows –AIMD Adjusting to changes in the available capacity –Slow probing, AIMD

Starting Up Before TCP Tahoe –On connection, nodes send full (rcv) window of packets –Retransmit packet immediately after its timer expires Result: window-sized bursts of packets in network –Lots of loss --- congestion collapse

Bursts of Packets: Lots of Retransmissions Graph from Van Jacobson and Karels, 1988

Determining Initial Capacity Question: how do we set w initially? –Should start at initialCwnd × MSS (to avoid overloading the network) initialCwnd is usually 1, 2, or 4 but can be anything –Could increase additively until we hit congestion –May be too slow on a fast network Start by doubling w each RTT –Then will dump at most one extra window into the network –This is called slow start Slow start? This sounds quite fast! –In contrast to the initial algorithm, where the sender would dump the entire flow control window at once

Startup behavior with Slow Start

Slow start implementation Let w be the size of the window in bytes –We have w/MSS segments per RTT We are doubling w after each RTT –We receive w/MSS ACKs each RTT –So we can set w = w + MSS on every ACK At some point we hit the network limit. –Experience loss –We are at most one window size above the limit –Remember window size (ssthresh) and reduce window If no loss, exit slow start once window > init_ssthresh
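To see why adding one MSS per ACK doubles the window each RTT, here is a small simulation. It is a sketch under simplifying assumptions (no loss, every segment ACKed individually, one full window per RTT); the function name `slow_start_rtts` is hypothetical.

```python
def slow_start_rtts(mss: int, init_cwnd_segments: int, ssthresh: int) -> list[int]:
    """Window size (bytes) at the start of each RTT until slow start exits."""
    w = init_cwnd_segments * mss
    history = [w]
    while w <= ssthresh:            # still in slow start
        acks = w // mss             # one ACK per segment sent this RTT
        for _ in range(acks):
            w += mss                # w = w + MSS on every ACK
        history.append(w)
    return history
```

With an MSS of 1000 bytes, an initial window of one segment, and a threshold of 8000 bytes, the window goes 1000, 2000, 4000, 8000, 16000 and then leaves slow start.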

Easy Way to Detect Loss Timeout: –Loss = pkt not arriving before a timer expires –If the packet arrives after the timeout, it is still considered lost Any other way? –Gap in sequence numbers at receiver –Receiver uses cumulative ACKs: drops => duplicate ACKs
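The link between drops and duplicate ACKs can be seen in a toy receiver. This sketch numbers whole segments rather than bytes and ACKs every arrival, both simplifications; `cumulative_acks` is a hypothetical name.

```python
def cumulative_acks(arrivals: list[int]) -> list[int]:
    """For each arriving segment number, return the ACK sent:
    the number of the next segment the receiver still expects."""
    received = set()
    next_expected = 0
    acks = []
    for seg in arrivals:
        received.add(seg)
        while next_expected in received:   # advance past any filled gap
            next_expected += 1
        acks.append(next_expected)
    return acks
```

If segment 2 is dropped and segments 3, 4, 5 arrive, the receiver keeps ACKing 2, so the sender sees the same ACK repeated.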

TCP’s 3 Challenges Determining the available capacity in the first place –Exponential increase in congestion window (SS) Sharing capacity between flows –AIMD Adjusting to changes in the available capacity –Slow probing, AIMD

Dealing with Congestion Assume losses are due to congestion –When can this be bad? After a loss, reduce congestion window –How much to reduce? Idea: conservation of packets at equilibrium –Want to keep roughly the same number of packets in network –Analogy with water in fixed-size pipe –Put new packet into network when one exits

How much to reduce window? Crude model of the network –Let $L_i$ be the load (# pkts) in the network at time $i$ –No congestion: N packets go in and N packets come out If network uncongested, roughly constant $L_i = N$ What happens under congestion? –Pkts hang out in queues or buffers! Some fraction $\gamma$ of packets can’t exit the network –Now $L_i = N + \gamma L_{i-1}$, or $L_i \approx \gamma^i L_0$ –Exponential increase in congestion Sources must decrease offered rate exponentially –i.e., multiplicative decrease in window size –TCP chooses to cut window in half

[Figure: Alice sends to Bob. Capacity = 5 pkts, buffer = 0 out of 10. Alice constantly sends 5 (N).]

[Figure: Alice sends to Bob. Capacity = 5 pkts, buffer = 0 out of 10. Alice constantly sends 5: 5 enter, 5 leave. Load = 5.]


[Figure: T0. Capacity = 5 pkts, buffer = 2 out of 10. Alice constantly sends 7: 7 enter, 5 leave, 2 stay in the network. Load = 7.]

[Figure: T1. Capacity = 5 pkts, buffer = 4 out of 10. Alice constantly sends 7: 7 enter, 5 leave, 4 stay in the network. Load = 9.]

[Figure: T2. Capacity = 5 pkts, buffer = 6 out of 10. Alice constantly sends 7: 7 enter, 5 leave, 6 stay in the network. Load = 11.]
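The Alice and Bob example can be replayed with a few lines of code. This is a toy model, assuming a fixed send rate per tick and a drop-tail buffer; `queue_trace` and its parameters are hypothetical names.

```python
def queue_trace(send_per_tick: int, capacity: int, buffer_size: int,
                ticks: int) -> list[tuple[int, int]]:
    """Return (packets queued, packets dropped) after each tick."""
    queued = 0
    trace = []
    for _ in range(ticks):
        arriving = queued + send_per_tick        # backlog plus new packets
        delivered = min(arriving, capacity)      # the link drains `capacity`
        backlog = arriving - delivered
        dropped = max(0, backlog - buffer_size)  # drop-tail when buffer is full
        queued = backlog - dropped
        trace.append((queued, dropped))
    return trace
```

Sending 7 per tick into a link that drains 5 leaves 2, 4, then 6 packets queued, matching T0 through T2; by the sixth tick the 10-packet buffer is full and packets start to drop.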


How to use extra capacity? Network signals congestion, but says nothing of underutilization –Senders constantly try to send faster, see if it works –So, increase window if no losses… By how much? –Multiplicative increase? (w = w * 2) Easier to saturate the network than to recover Too fast, will lead to saturation, wild fluctuations –Additive increase? ( w = w + 1) Won’t saturate the network Remember fairness (third challenge)?

In both situations the sum of the flow rates should be A + B = N [Figure: Flow Rate A and Flow Rate B for the cases A = B, A > B, and A < B; labels: FAIR, Efficient]

Chiu Jain Phase Plots [Figure: phase plot with axes Flow Rate A and Flow Rate B; fairness line A = B; efficiency line A + B = N] Goal: fair and efficient!

Fair and Efficient distribution of network capacity Easy to calculate if you know everything How to calculate in a distributed fashion? –Options: increase or decrease window size –Decrease in an exponential manner –How about increase? Additive increase? Multiplicative increase?


Chiu Jain Phase Plots: Multiplicative Increase, Multiplicative Decrease [Figure: phase plot with axes Flow Rate A and Flow Rate B; Fair: A = B; Efficient: A + B = N; MD and MI steps marked] Increase: w = w * 2 Decrease: w = w/2 T0: A + B > N T1: A/2 + B/2 < N T2: (A/2)*2 + (B/2)*2 > N

Chiu Jain Phase Plots: Multiplicative Increase, Multiplicative Decrease [Figure: phase plot with axes Flow Rate A and Flow Rate B; Fair: A = B; Efficient: A + B = N; MD and MI steps marked] Increase: w = w * 2 Decrease: w = w/4 T0: A + B > N T1: A/4 + B/4 < N T2: (A/4)*2 + (B/4)*2 < N T3: (A/4)*2*2 + (B/4)*2*2 > N Return to the beginning after some number of iterations

Chiu Jain Phase Plots: Additive Increase, Multiplicative Decrease [Figure: phase plot with axes Flow Rate A and Flow Rate B; Fair: A = B; Efficient: A + B = N] Increase: w = w + 1 Decrease: w = w/2 T0: A + B > N T1: A/2 + B/2 < N T2: (A/2)+1 + (B/2)+1 < N T3: (A/2)+2 + (B/2)+2 < N Slowly converge on N
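The difference between the MIMD and AIMD phase plots can be checked numerically. The sketch below is a toy model of two flows that see the same congestion signal in lockstep (an assumption); `two_flow_rates` is a hypothetical name, rates are in arbitrary units, and the decrease factor is fixed at 1/2.

```python
def two_flow_rates(a: float, b: float, capacity: float, rounds: int,
                   multiplicative_increase: bool = False) -> tuple[float, float]:
    """Evolve two flow rates sharing one link for `rounds` steps."""
    for _ in range(rounds):
        if a + b > capacity:
            a, b = a / 2, b / 2        # multiplicative decrease on congestion
        elif multiplicative_increase:
            a, b = a * 2, b * 2        # MIMD: the ratio a/b never changes
        else:
            a, b = a + 1, b + 1        # AIMD: the gap a - b halves per decrease
    return a, b
```

Starting from rates 40 and 2 on a link of capacity 50, AIMD drives the two rates together, while MIMD keeps the original 20:1 ratio forever. Additive steps preserve the gap and each halving shrinks it, which is why only AIMD moves toward the fairness line.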

AIMD Implementation In practice, send MSS-sized segments –Let window size in bytes be w (a multiple of MSS) Increase: –After w bytes ACKed, could set w = w + MSS –Smoother to increment on each ACK w = w + MSS * MSS/w (receive w/MSS ACKs per RTT, increase by MSS/(w/MSS) for each) Decrease: –After a packet loss, w = w/2 –But don’t want w < MSS –So react differently to multiple consecutive losses –Back off exponentially (pause with no packets in flight)
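A quick numeric check of the per-ACK increment: applying w = w + MSS * MSS/w once per ACK for one RTT's worth of ACKs grows the window by slightly under one MSS. The sketch assumes one ACK per segment; the function names are hypothetical, and the floor at one MSS follows the "don't want w < MSS" bullet.

```python
def congestion_avoidance_rtt(w: float, mss: int) -> float:
    """Window (bytes) after processing one RTT's worth of ACKs."""
    acks = int(w // mss)            # w/MSS ACKs arrive per RTT
    for _ in range(acks):
        w += mss * mss / w          # smooth additive increase
    return w


def halve_on_loss(w: float, mss: int) -> float:
    """Multiplicative decrease, never going below one segment."""
    return max(w / 2, mss)
```

Starting from 10 segments of 1000 bytes, one RTT of ACKs brings the window to a little under 11,000 bytes, because each increment is computed from a window that has already grown.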

AIMD Trace AIMD produces sawtooth pattern of window size –Always probing available bandwidth

TCP’s 3 Challenges Determining the available capacity in the first place –Exponential increase in congestion window (SS) Sharing capacity between flows –AIMD Adjusting to changes in the network –Changes in RTT –Change in Capacity: Slow probing, AIMD

Putting it together TCP has two states: –Slow Start (SS) –Congestion Avoidance (CA) A window size threshold governs the state transition –Window <= threshold (ssthresh): slow start –Window > threshold (ssthresh): congestion avoidance –Threshold magically defined States differ in how they respond to ACKs –Slow start: w = w + MSS –Congestion Avoidance: w = w + MSS²/w (1 MSS per RTT)
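The two-state rule fits in one ACK handler. This is a minimal sketch with windows in bytes; `on_ack` is a hypothetical name, and how ssthresh gets its value is left out, as on the slide.

```python
def on_ack(cwnd: float, ssthresh: float, mss: int) -> float:
    """Return the new congestion window after one ACK."""
    if cwnd <= ssthresh:
        return cwnd + mss                 # slow start: doubles per RTT
    return cwnd + mss * mss / cwnd        # congestion avoidance: ~1 MSS per RTT
```

The state is implicit: comparing cwnd with ssthresh on every ACK selects the growth rule, so no separate state variable is needed in this simplified model.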

Putting it all together [Figure: cwnd over time, with labels Slow Start, AIMD, Timeout, ssthresh, Init_ssthresh]

TCP’s 3 Challenges Determining the available capacity in the first place –Exponential increase in congestion window (SS) Sharing capacity between flows –AIMD Adjusting to changes in the network –Changes in RTT. Why should you care? –Efficiently adjust to change in Capacity.


How to Calculate TimeOut: RTT We want an estimate of RTT so we can know a packet was likely lost, and not just delayed Key for correct operation Challenge: RTT can be highly variable –Both at long and short time scales! Both average and variance increase a lot with load Solution –Use exponentially weighted moving average (EWMA) –Estimate deviation as well as expected value

Original: Timeout Calculation Est_RTT = (1 – α) × Est_RTT + α × Sample_RTT Timeout = 2 × Est_RTT Problem 1: –When there’s a retransmission, what is the definition of RTT? –Solution: only sample for segments with no retransmission Problem 2: –Does not take variance into account: too aggressive when there is more load!
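The original estimator is a single EWMA step followed by a fixed multiplier. In this sketch `original_timeout` is a hypothetical name, and α is left as a required argument because the slide does not give a value for it.

```python
def original_timeout(est_rtt: float, sample_rtt: float,
                     alpha: float) -> tuple[float, float]:
    """Return (new Est_RTT, Timeout) after one RTT sample."""
    est_rtt = (1 - alpha) * est_rtt + alpha * sample_rtt   # EWMA update
    return est_rtt, 2 * est_rtt                            # Timeout = 2 x Est_RTT
```

Note that the timeout tracks only the mean: if samples swing widely around a steady average, the timeout stays put, which is Problem 2 above.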

Old RTT Estimation

Jacobson/Karels Algorithm (Tahoe)
    EstRTT = (1 – α) × EstRTT + α × SampleRTT
    –Recommended α is
    DevRTT = (1 – β) × DevRTT + β × |SampleRTT – EstRTT|
    –Recommended β is 0.25
    Timeout = EstRTT + 4 × DevRTT
For successive retransmissions: use exponential backoff
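A sketch of one estimator update follows. Two assumptions are made here that the transcript does not settle: the default α of 0.125 is the value given in RFC 6298 (the transcript's own value is missing), and the deviation is updated using the old EstRTT before EstRTT itself is updated, as RFC 6298 specifies. The function names are hypothetical.

```python
def jacobson_karels(est_rtt: float, dev_rtt: float, sample_rtt: float,
                    alpha: float = 0.125,   # assumption: RFC 6298 value
                    beta: float = 0.25) -> tuple[float, float, float]:
    """Return (EstRTT, DevRTT, Timeout) after one RTT sample."""
    # Deviation first, measured against the previous estimate (assumed order).
    dev_rtt = (1 - beta) * dev_rtt + beta * abs(sample_rtt - est_rtt)
    est_rtt = (1 - alpha) * est_rtt + alpha * sample_rtt
    return est_rtt, dev_rtt, est_rtt + 4 * dev_rtt


def backed_off_timeout(timeout: float, retransmissions: int) -> float:
    """Exponential backoff: double the timeout for each successive retransmission."""
    return timeout * 2 ** retransmissions
```

With EstRTT = 100, DevRTT = 10, and a sample of 180, the timeout jumps to 220: the deviation term reacts to the outlier much faster than the mean does.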

Tahoe RTT Estimation

TCP’s 3 Challenges Determining the available capacity in the first place –Exponential increase in congestion window (SS) Sharing capacity between flows –AIMD Adjusting to changes in the network –Change in RTT: Estimated moving average –Efficiently adjust to change in Capacity.


Slow start every time? What’s wrong? [Figure: cwnd over time, with labels Slow Start, AIMD, Timeout, ssthresh, Init_ssthresh]

Slow start every time?! Losses have large effect on throughput Fast Recovery (TCP Reno) –Same as TCP Tahoe on Timeout: w = 1, slow start –On triple duplicate ACKs: w = w/2 –Retransmit missing segment (fast retransmit) –Stay in Congestion Avoidance mode

Fast Recovery and Fast Retransmit [Figure: cwnd over time, with labels Slow Start, AI/MD, Fast retransmit, Init_ssthresh]

TCP Response to Loss
    Slow Start: triggered by a timeout; W = 1; switch to slow start (SS)
    Fast Recovery: triggered by 3 dup-ACKs; W = W/2; stay in congestion avoidance (CA)
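The two reactions can be sketched as one function. The slides give the window updates but not the new threshold; setting ssthresh to half the old window is an assumption here, following common Reno descriptions. `react_to_loss` and the event strings are hypothetical names, and windows are in bytes.

```python
def react_to_loss(cwnd: float, mss: int, event: str) -> tuple[float, float]:
    """Return (new cwnd, new ssthresh) after a loss signal."""
    ssthresh = cwnd / 2                  # assumption: remember half the window
    if event == "timeout":
        return mss, ssthresh             # W = 1 segment, restart slow start
    if event == "triple_dupack":
        return cwnd / 2, ssthresh        # W = W/2, continue additive increase
    raise ValueError(f"unknown loss event: {event}")
```

A timeout from a 16,000-byte window drops to a single 1000-byte segment, while three duplicate ACKs only halve it to 8000 bytes, which is why fast recovery costs far less throughput.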

TCP’s 3 Challenges Determining the available capacity in the first place –Exponential increase in congestion window (SS) Sharing capacity between flows –AIMD Adjusting to changes in the network –Change in RTT: Estimated moving average –Efficiently adjust to change in Capacity: Fast recovery

Next Class More Congestion Control fun Cheating on TCP TCP on extreme conditions TCP Friendliness TCP Future