Congestion control and TCP 2007

Overview
- Congestion sources and collapse
  [RJ90] Raj Jain, "Congestion Control in Computer Networks: Issues and Trends," IEEE Network Magazine, May 1990.
- Congestion control basics
  [CJ89] D.M. Chiu and R. Jain, "Analysis of the Increase and Decrease Algorithms for Congestion Avoidance in Computer Networks," Computer Networks and ISDN Systems, Vol. 17, pp. 1-14, 1989.
- TCP congestion control
  [JK88] V. Jacobson and M. Karels, "Congestion Avoidance and Control," in Proc. ACM SIGCOMM '88 (Stanford, CA, August 1988).

What is Congestion?
- Sum of demands > available resources (during an interval)
- Can be persistent, e.g. an under-provisioned network
- Can be transient, e.g. when a burst of traffic arrives
- Congestion collapse is not just a theory: it has been frequently observed in many networks

What’s Really Happening?
- Knee: point after which
  - Throughput increases very slowly
  - Delay increases fast
- Cliff: point after which
  - Throughput starts to decrease very fast to zero (congestion collapse)
  - Delay approaches infinity
- Note (in an M/M/1 queue): Delay = 1/(1 - utilization), in units of the mean service time
[Figure: throughput vs. load and delay vs. load, marking the knee, the cliff, packet loss, and congestion collapse]
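To make the M/M/1 note concrete, here is a minimal sketch that evaluates the delay formula at a few utilization levels. The function name `mm1_delay` and the `service_time` parameter are illustrative choices, not part of the slides.

```python
def mm1_delay(utilization, service_time=1.0):
    """Mean time in an M/M/1 system: service_time / (1 - utilization)."""
    if not 0.0 <= utilization < 1.0:
        raise ValueError("utilization must be in [0, 1)")
    return service_time / (1.0 - utilization)


# Delay grows slowly at first, then explodes as utilization approaches 1.
for rho in (0.5, 0.9, 0.99):
    print(f"utilization={rho:.2f}  delay={mm1_delay(rho):.1f} service times")
```

The steep growth near full utilization is the behavior the slide attributes to the region past the knee.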

Congestion Control Myths
Congestion can be solved by throwing in:
- Myth 1: More buffer space at the routers
- Myth 2: Faster links
- Myth 3: Faster processors
- Myth 4: Slower access links and high-speed backbone trunks

Myth 1: More buffer space at the routers
- Too much memory in the intermediate nodes is as harmful as too little memory: packets wait in long queues until the sources time out and retransmit them

Myth 2: Faster links
- Introducing a high-speed link may reduce the performance

Myth 3: Faster processors
- A balanced configuration with all processors and links at the same speed is also susceptible to congestion

Summary
- Packet loss due to buffer shortage is a symptom, not a cause, of congestion
  - Sum of demands > available resources
- Networks are becoming more unbalanced, and that imbalance causes congestion
- Congestion is a dynamic resource allocation problem
  - It cannot be solved with static solutions alone
- We need protocol designs that protect networks in the event of congestion
- We need congestion control

Congestion control schemes
- Resource Creation Schemes
  - Power increases on satellite links to increase their bandwidths
  - Dial-up links
  - Path splitting so that extra traffic is sent via alternate routes
- Demand Reduction Schemes
  - Service Denial Schemes (Admission Control)
  - Load Degradation Schemes (Dynamic window)
  - Scheduling Schemes (contention, polling, priority, reservation)

Requirements for congestion control
- Scheme must be socially optimal: allow the total network performance to be maximized (use network resources efficiently)
- Scheme must be fair in allocating network resources
- Scheme must be responsive: match the demand dynamically to the available capacity
- Scheme must have a low overhead
- Scheme must work in bad environments: errors, lost packets, deadlocks

Where to Prevent Collapse?
- Can end hosts prevent the problem?
  - Yes, but we must trust end hosts to do the right thing
  - E.g., a sending host must adjust the amount of data it puts in the network based on detected congestion
- Can routers prevent collapse?
  - No, not all forms of collapse
  - Routers can help by
    - Sending accurate congestion signals
    - Isolating well-behaved from ill-behaved sources

Policies that affect congestion

Principle of Control
- Control and feedback rates should be similar; otherwise the system is oscillatory or unresponsive
- A combination of schemes working at the transport, network, and data link layers is required, since they act on different time scales
- Dynamic schemes are good only for transient congestion; long-term congestion needs extra resources

Overview
- Congestion sources and collapse
- Congestion control basics
- TCP congestion control

Congestion Control vs. Congestion Avoidance
- Congestion control goal: stay left of the cliff
- Congestion avoidance goal: stay left of the knee, operating near the knee point
[Figure: throughput vs. load, marking the knee, the cliff, and congestion collapse]

Objectives
- Simple router behavior
- Distributedness
- Efficiency: $X_{knee} = \sum_i x_i(t)$
- Fairness: $\left(\sum_i x_i\right)^2 / \left(n \sum_i x_i^2\right) \to 1$
- Power: $\text{throughput}^{\alpha} / \text{delay}$
- Convergence: control system must be stable
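The fairness and power objectives can be computed directly from a set of allocations. The sketch below assumes the fairness formula is the usual fairness index over n users; the function names and the default `alpha=1.0` are illustrative.

```python
def fairness_index(allocations):
    """(sum x_i)^2 / (n * sum x_i^2); equals 1.0 when all shares are equal."""
    n = len(allocations)
    sum_sq = sum(x * x for x in allocations)
    if n == 0 or sum_sq == 0:
        raise ValueError("need at least one non-zero allocation")
    total = sum(allocations)
    return total * total / (n * sum_sq)


def power(throughput, delay, alpha=1.0):
    """Power metric: throughput^alpha / delay."""
    return throughput ** alpha / delay


print(fairness_index([5, 5, 5, 5]))   # equal shares
print(fairness_index([10, 0, 0, 0]))  # one user takes everything
```

With n users the index ranges from 1/n (one user holds everything) to 1 (perfectly equal shares).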

Basic Control Model
- Let’s assume window-based control
- Reduce window when congestion is perceived
  - How is congestion signaled? Either mark or drop packets
  - When is a router congested?
    - Drop-tail queues: when the queue is full
    - Average queue length: at some threshold
- Increase window otherwise
  - Probe for available bandwidth: how?

Control System Model [CJ89]
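[Figure: users 1..n offer loads x1..xn; the network compares the total load with Xgoal and returns a binary feedback signal y to every user]
- Simple, yet powerful model
- Explicit binary signal of congestion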

Linear Control
- Many different possibilities for reaction to congestion and probing
  - Examine simple linear controls: Window(t + 1) = a + b * Window(t)
  - Different ($a_i$, $b_i$) for increase and ($a_d$, $b_d$) for decrease
- Supports various reactions to signals
  - Increase/decrease additively
  - Increase/decrease multiplicatively
  - Which of the four combinations is optimal?
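The linear control rule is small enough to write out. This is a minimal sketch; the function name `linear_update` and the default coefficient pairs (additive increase by 1, multiplicative decrease by half) are assumptions chosen for illustration.

```python
def linear_update(window, congested, increase=(1.0, 1.0), decrease=(0.0, 0.5)):
    """Apply Window(t+1) = a + b * Window(t).

    `increase` is the pair (a_i, b_i) used when no congestion is signaled;
    `decrease` is the pair (a_d, b_d) used when congestion is signaled.
    """
    a, b = decrease if congested else increase
    return a + b * window


# The four combinations come from choosing which coefficient does the work:
# b = 1 with a != 0 is additive, a = 0 with b != 1 is multiplicative.
print(linear_update(10.0, congested=False))  # additive increase
print(linear_update(10.0, congested=True))   # multiplicative decrease
```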

Phase plots
- What are desirable properties?
- What if flows are not equal?
[Figure: phase plot of User 1's allocation x1 vs. User 2's allocation x2, showing the efficiency line, the fairness line, the optimal point at their intersection, and the overload and underutilization regions]

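Multiplicative Increase/Decrease
- Both X1 and X2 increase (or decrease) by the same factor over time
  - The allocation moves along a line through the origin: constant fairness
[Figure: phase plot of User 1's allocation x1 vs. User 2's allocation x2 with the efficiency and fairness lines; the allocation moves from T0 to T1 along a line through the origin]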

Additive Increase/Decrease
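- Both X1 and X2 increase/decrease by the same amount over time
  - Additive increase improves fairness and additive decrease reduces fairness
[Figure: phase plot of User 1's allocation x1 vs. User 2's allocation x2 with the efficiency and fairness lines; the allocation moves from T0 to T1 along a 45-degree line, parallel to the fairness line]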
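The effect of multiplicative and additive changes on fairness can be checked numerically. The sketch below uses the fairness index $(\sum x_i)^2 / (n \sum x_i^2)$ as the measure; the starting allocation (2, 8) and the step sizes are arbitrary illustrative values.

```python
def fairness_index(allocations):
    """(sum x_i)^2 / (n * sum x_i^2); 1.0 means perfectly equal shares."""
    n = len(allocations)
    total = sum(allocations)
    return total * total / (n * sum(x * x for x in allocations))


start = (2.0, 8.0)
scaled = tuple(1.5 * x for x in start)   # multiplicative change
raised = tuple(x + 2.0 for x in start)   # additive increase
lowered = tuple(x - 1.0 for x in start)  # additive decrease

print("start    ", fairness_index(start))
print("x 1.5    ", fairness_index(scaled))   # unchanged
print("+ 2      ", fairness_index(raised))   # closer to 1
print("- 1      ", fairness_index(lowered))  # further from 1
```

Scaling keeps the ratio x1/x2 fixed, which is why the index does not move; adding the same amount to both users pulls the ratio towards 1.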

Constraints
- Distributed efficiency
  - I.e., Window(t+1) > Window(t) during increase (for each user): $a_i > 0$ and $b_i \ge 1$
  - Similarly, for decrease: $a_d = 0$ and $b_d < 1$
- Must never decrease fairness
  - The a's and b's must be $\ge 0$
  - $a_i/b_i > 0$ and $a_d/b_d \ge 0$
- Full constraints
  - $a_i > 0$ and $b_i \ge 1$
  - $a_d = 0$ and $0 \le b_d < 1$

What is the Right Choice?
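- Constraints limit us to AIMD (Additive Increase + Multiplicative Decrease)
- AIMD moves towards the optimal point
[Figure: phase plot of User 1's allocation x1 vs. User 2's allocation x2 with the efficiency and fairness lines, showing successive allocations x0, x1, x2 approaching the optimal point]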
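A short simulation illustrates why AIMD moves towards the optimal point. This is a sketch under simplifying assumptions that are not in the slides: two users, synchronous binary feedback, a capacity of 100, an additive step of 1, and a decrease factor of 0.5.

```python
def simulate_aimd(x1, x2, capacity=100.0, add=1.0, mult=0.5, steps=200):
    """Two users applying AIMD to the same binary congestion signal."""
    history = [(x1, x2)]
    for _ in range(steps):
        if x1 + x2 > capacity:
            # Congestion signaled: both users decrease multiplicatively.
            x1, x2 = x1 * mult, x2 * mult
        else:
            # No congestion: both users increase additively.
            x1, x2 = x1 + add, x2 + add
        history.append((x1, x2))
    return history


history = simulate_aimd(5.0, 80.0)
first, last = history[0], history[-1]
print("initial gap:", abs(first[0] - first[1]))
print("final gap:  ", abs(last[0] - last[1]))
```

Additive steps leave the gap between the two users unchanged, while every multiplicative decrease halves it, so the allocations drift towards the fairness line while the total oscillates around the efficiency line.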

Modeling
- [CJ89] provides the theoretical basis for the congestion avoidance mechanism
  - Good science for understanding why it works so well
  - Great engineering built a useful protocol: TCP Reno etc.
- Critical to understanding complex systems
  - The [CJ89] model is still relevant after 15 years, a $10^6$ increase in bandwidth, and a 1000x increase in the number of users
- Criteria for good models
  - Two conflicting goals: reality and simplicity
  - Realistic, complex model -> too hard to understand, too limited in applicability
  - Unrealistic, simple model -> can be misleading

Overview
- Congestion sources and collapse
- Congestion control basics
- TCP congestion control

A Little Bit of History
- Until 1988: TCP with a fixed window (serious problems with loss and retransmissions!)
- In 1988, the adaptive TCP window was introduced (Van Jacobson)
- Various “improved” versions: Tahoe, Reno, Vegas, New Reno, Snoop, Westwood, Peach, etc.
- Recently, the strict “end to end” TCP paradigm has been relaxed
  - Active Queue Management (AQM)
  - Explicit Congestion Notification (ECN): explicit network feedback; eXplicit Control Protocol (XCP)
  - Hybrid end-to-end and network feedback model

Flow Control vs Congestion Control
- TCP Flow Control (not the same as flow control in K-13)
  - Makes sure the receiving end can handle the data
  - Negotiated end-to-end, with no regard to the network
  - Ends must ensure that no more than W packets are in flight
    - Receiver ACKs packets
    - When the sender gets an ACK, it knows the packet has arrived
- TCP Congestion Control
  - Can the network handle the rate of data? The aim is to prevent network buffers from overflowing
  - Determined end-to-end, but TCP is making guesses about the state of the network

Goals (Congestion Control)
- Operate near the knee point
- Remain in equilibrium
- How to maintain equilibrium?
  - Use ACKs: send a new packet only after you receive an ACK
- Design goals (best-effort flow/congestion control)
  - Efficient (good resource utilization; low overhead)
  - Fair (i.e., max-min fair)
  - Stable (converges to equilibrium point; no intermittent “capture”)
  - Scalable (e.g., limit on per-flow processing overhead)

TCP Congestion Control - Solutions
- Reaching equilibrium
  - Slow start
- Adapting to resource availability
  - Congestion avoidance
- Eliminating spurious retransmissions (error control)
  - Accurate estimation of the RTO (retransmission timeout)
  - Fast retransmit (based on duplicate cumulative ACKs)
  - Fast recovery

TCP Congestion Control
- Maintains three variables:
  - flow_win: flow window (receiver advertised window)
  - cwnd: congestion window
  - ssthresh: threshold size (window), used to update cwnd
- For sending use: win = min(flow_win, cwnd)
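The three variables and the sending rule can be captured in a small state object. This is only a sketch: the class name `SenderState`, the `can_send` helper, and the choice to count windows in segments are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class SenderState:
    flow_win: int   # receiver advertised window, in segments
    cwnd: int       # congestion window, in segments
    ssthresh: int   # threshold used when updating cwnd

    def send_window(self):
        """win = min(flow_win, cwnd)."""
        return min(self.flow_win, self.cwnd)

    def can_send(self, in_flight):
        """A new segment may be sent while fewer than win are unacknowledged."""
        return in_flight < self.send_window()


state = SenderState(flow_win=64, cwnd=10, ssthresh=32)
print(state.send_window())   # limited by the network, not the receiver
print(state.can_send(in_flight=10))
```

Whichever limit is smaller governs: a small cwnd protects the network, a small flow_win protects the receiver.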

TCP: Slow Start
- Goal: reach the knee quickly
- How? Quickly increase cwnd until the network is congested -> get a rough estimate of the optimal cwnd
- Upon starting (or restarting):
  - Set cwnd = 1
  - Each time a segment is acknowledged, increment cwnd by one (cwnd++)
- Slow Start is not actually slow: cwnd increases exponentially (it doubles every RTT)
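The exponential growth follows directly from the per-ACK rule. The sketch below assumes an idealized sender in which every segment of the current window is acknowledged within one RTT and nothing is lost; the function name and the `target` parameter are illustrative.

```python
def slow_start_trace(target, cwnd=1):
    """Return cwnd at the start of each RTT until it reaches `target`."""
    trace = [cwnd]
    while cwnd < target:
        acks_this_rtt = cwnd          # one ACK per segment in the window
        for _ in range(acks_this_rtt):
            cwnd += 1                 # cwnd++ on every ACK
        trace.append(cwnd)
    return trace


print(slow_start_trace(16))  # [1, 2, 4, 8, 16]
```

Incrementing by one per ACK adds a full window's worth of segments each round trip, which is why the window doubles.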

Congestion Avoidance
- Slow down “Slow Start”
- ssthresh is a lower-bound guess about the location of the knee
- If cwnd > ssthresh, then each time a segment is acknowledged increment cwnd by 1/cwnd (cwnd += 1/cwnd)
- So cwnd is increased by one only after a full window of segments has been acknowledged
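The per-ACK update for both phases fits in one function. This is a minimal sketch with cwnd counted in segments as a float; the name `on_ack` is hypothetical, and the sketch stays in slow start when cwnd equals ssthresh, following the strict inequality in the text.

```python
def on_ack(cwnd, ssthresh):
    """Window update applied for every acknowledged segment."""
    if cwnd > ssthresh:
        return cwnd + 1.0 / cwnd   # congestion avoidance: about +1 per RTT
    return cwnd + 1.0              # slow start: +1 per ACK


# One round trip in congestion avoidance: a full window of ACKs arrives.
cwnd, ssthresh = 10.0, 8.0
for _ in range(10):
    cwnd = on_ack(cwnd, ssthresh)
print(cwnd)  # slightly below 11, since 1/cwnd shrinks as cwnd grows
```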

Congestion Window (Tahoe)
[Figure: cwnd vs. time for Tahoe, showing slow start, congestion avoidance, and the reset after a timeout]

Fast Retransmit
- Prevents expensive timeouts
- Don’t wait for the window to drain
- Resend a segment after 3 duplicate ACKs
  - Remember, a duplicate ACK means that an out-of-sequence segment was received
[Figure: sender/receiver timeline in which cwnd grows 1, 2, 4 as segments 1 to 7 are sent and ACK 2, ACK 3, ACK 4 return, ending with 3 duplicate ACKs]
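Duplicate-ACK counting can be sketched as follows. The sketch assumes cumulative ACKs that carry the number of the next expected segment, so a repeated value is a duplicate; the function name and the list-based interface are illustrative.

```python
def fast_retransmit_points(acks, threshold=3):
    """Return the segments to resend, given the ACK numbers in arrival order."""
    retransmits = []
    last_ack = None
    dup_count = 0
    for ack in acks:
        if ack == last_ack:
            dup_count += 1
            if dup_count == threshold:
                # Third duplicate: resend the segment the receiver is waiting for.
                retransmits.append(ack)
        else:
            last_ack = ack
            dup_count = 0
    return retransmits


# Segment 4 is lost; segments 5, 6, 7 each trigger another ACK 4.
print(fast_retransmit_points([2, 3, 4, 4, 4, 4]))  # [4]
```

Fewer than three duplicates are tolerated because mild reordering in the network also produces them.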

Fast Recovery
- No need to slow start again
- After a fast retransmit, halve the window: set ssthresh to cwnd/2 and cwnd to ssthresh
  - I.e., don’t reset cwnd to 1
  - But when the RTO expires, still do cwnd = 1
- Fast Retransmit and Fast Recovery
  - Implemented by TCP Reno
  - Most widely used version of TCP today
- Lesson: avoid RTOs at all costs!

Fast Retransmit and Fast Recovery (Reno)
[Figure: cwnd vs. time for Reno, showing slow start, congestion avoidance, a timeout, fast retransmit, and fast recovery]
- Fast retransmit: retransmit a segment after 3 duplicate ACKs
  - Prevents expensive timeouts
- Fast recovery: reduce cwnd to half instead of to one
  - No need to slow start again
  - But when the RTO expires, still do cwnd = 1
- At steady state, cwnd oscillates around the optimal window size
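The two loss reactions differ only in where cwnd restarts. The sketch below is a simplification under stated assumptions: windows are counted in segments, ssthresh is floored at 2 segments (a common convention, not stated in the slides), and the temporary window inflation used during fast recovery is omitted. The function names are hypothetical.

```python
def on_triple_dup_ack(cwnd):
    """Fast retransmit + fast recovery: halve the window, skip slow start."""
    ssthresh = max(cwnd / 2, 2)   # floor of 2 segments is an assumption
    cwnd = ssthresh
    return cwnd, ssthresh


def on_timeout(cwnd):
    """RTO expiry: remember half the window, then restart from one segment."""
    ssthresh = max(cwnd / 2, 2)
    cwnd = 1
    return cwnd, ssthresh


print(on_triple_dup_ack(20))  # resumes in congestion avoidance at half the window
print(on_timeout(20))         # must slow start back up to ssthresh
```

After a timeout the sender needs several round trips of slow start just to get back to half its old window, which is why RTOs are so costly.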

Reflections on TCP
- Assumes that all sources cooperate
- Assumes that congestion occurs on time scales greater than 1 RTT
- Only useful for reliable, in-order delivery, non-real-time applications
- Vulnerable to non-congestion-related loss (e.g. wireless)
- Can be unfair to long-RTT flows

TCP Modeling
- Given the congestion behavior of TCP, can we predict what type of performance (bandwidth BW) we should get?
- What are the important factors?
  - Loss rate: affects how often the window is reduced
  - RTT: affects the increase rate and relates BW to window
  - RTO (retransmission timeout): affects performance during loss recovery
  - MSS (maximum segment/packet size): affects the increase rate

TCP Behavior in steady state
- Let’s concentrate on steady-state behavior with no timeouts and perfect loss recovery
[Figure: sawtooth of window vs. time; the window grows by +1 per RTT from W/2 to W, a loss event halves it, and one cycle lasts W/2 RTTs]

Simple TCP Model
- Some additional assumptions
  - Fixed RTT
  - No delayed ACKs
- In steady state, TCP loses a packet each time the window reaches W packets
  - Window drops to W/2 packets
  - Each RTT the window increases by 1 packet -> W/2 * RTT before the next loss
- BW = MSS * avg window / RTT = MSS * (W + W/2) / (2 * RTT)
- BW = 0.75 * MSS * W / RTT
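The 0.75 factor can be checked by averaging the window over one sawtooth cycle. This sketch assumes the idealized cycle from the model (window W/2, W/2 + 1, ..., W - 1, one value per RTT); W = 100, an MSS of 8000 bits, and an RTT of 0.1 s are arbitrary illustrative values.

```python
def sawtooth_windows(w_max):
    """Window size in each RTT of one cycle: W/2, W/2 + 1, ..., W - 1."""
    return list(range(w_max // 2, w_max))


def average_bandwidth(w_max, mss_bits, rtt_s):
    """BW = MSS * average window / RTT, averaged over one cycle."""
    windows = sawtooth_windows(w_max)
    avg_window = sum(windows) / len(windows)
    return mss_bits * avg_window / rtt_s


w = 100
simulated = average_bandwidth(w, mss_bits=8000, rtt_s=0.1)
closed_form = 0.75 * 8000 * w / 0.1
print(simulated, closed_form)
```

The two values agree to within about one percent; the small gap comes from using discrete per-RTT windows instead of the continuous average (W + W/2)/2.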

Simple Loss Model
- What was the loss event rate p?
  - Packets transferred between losses = $(0.75\,W/RTT) \cdot (W/2 \cdot RTT) = 3W^2/8$
  - Each RTT the window increases by 1 packet -> W/2 * RTT before the next loss -> loss event rate $p = 8/(3W^2)$
  - $W = \sqrt{8/(3p)}$
- $BW = 0.75 \cdot MSS \cdot W / RTT$
- $BW = MSS / (RTT \cdot \sqrt{2p/3})$
- $BW \sim 1/\sqrt{p}$
- Ex. MSS = 8 kb, RTT = 0.1 s
  - p = 1.5E-4 -> BW = 8 Mbps
  - p = 1.5E-6 -> BW = 80 Mbps
  - p = 1.5E-10 -> BW = 8 Gbps
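The closed-form throughput is easy to evaluate for the example values. In this sketch the function name and parameter names are illustrative, and MSS is taken as 8 kb = 8000 bits.

```python
import math


def tcp_throughput(mss_bits, rtt_s, loss_rate):
    """BW = MSS / (RTT * sqrt(2p/3)), in bits per second."""
    return mss_bits / (rtt_s * math.sqrt(2.0 * loss_rate / 3.0))


for p in (1.5e-4, 1.5e-6, 1.5e-10):
    bw = tcp_throughput(mss_bits=8000, rtt_s=0.1, loss_rate=p)
    print(f"p={p:.1e}  BW={bw / 1e6:.0f} Mbps")
```

Each factor of 100 in bandwidth requires the loss rate to drop by a factor of 10,000, which is the $1/\sqrt{p}$ dependence in action.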

TCP Friendliness
- What does it mean to be TCP friendly?
  - TCP is not going away
  - Any new congestion control must compete with TCP flows
    - Should not clobber TCP flows and grab the bulk of the link
    - Should also be able to hold its own, i.e. grab its fair share, or it will never become popular
- How is this quantified/shown?
  - If it shows 1/sqrt(p) loss/throughput behavior, it is OK
  - But is this really true?

Engineering vs Science in CC
- Behavior of TCP
  - Are packets smoothly paced? NO! Ack-compression
  - Are long-lived flows nicely interleaved? NO!
  - How does throughput depend on drop rate? Tput ~ 1/sqrt(d)
- Engineering vs Science in CC
  - Great engineering built a useful protocol: TCP Reno, etc.
  - Good science by CJ and others: the basis for understanding why it works so well

Issues with TCP
- Fairness: throughput depends on RTT
- High speeds: to sustain 10 Gbps, packet losses must be as rare as one every 90 minutes!
- Short flows: how to set the initial cwnd properly
- What about flows that want congestion control, but don’t want reliable delivery? (streaming)
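The high-speed figure can be reproduced from the simple steady-state model (average window 0.75 W, one loss every W/2 RTTs). The packet size of 1500 bytes and the RTT of 100 ms below are assumptions; the slides do not state which values were used.

```python
def loss_spacing(target_bps, rtt_s, packet_bits):
    """Packets and seconds between losses needed to sustain target_bps."""
    avg_window = target_bps * rtt_s / packet_bits    # packets in flight on average
    w_max = avg_window / 0.75                        # average window is 0.75 * W
    packets_between_losses = 3.0 * w_max ** 2 / 8.0  # 3W^2/8 packets per cycle
    seconds_between_losses = (w_max / 2.0) * rtt_s   # one cycle lasts W/2 RTTs
    return packets_between_losses, seconds_between_losses


# Assumed values: 1500-byte packets (12000 bits) and a 100 ms RTT.
packets, seconds = loss_spacing(10e9, rtt_s=0.1, packet_bits=12000)
print(f"{packets:.2e} packets, {seconds / 60:.0f} minutes between losses")
```

Under these assumptions the model gives a spacing of roughly an hour and a half, consistent with the 90-minute figure.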

TCP: Cooperation and Compatibility
- TCP assumes all flows employ TCP-like congestion control
  - TCP-friendly or TCP-compatible
- Selfish flows can get all the bandwidth they like
- If new congestion control algorithms are developed, they must be TCP-friendly
