Presentation is loading. Please wait.

Presentation is loading. Please wait.

Multi-Gbps TCP 9:00-10:00 Harvey Newman (Physics, Caltech) High speed networks & grids 10:00-10:45 Sylvain Ravot (Physics, Caltech/CERN) LHC networks and.

Similar presentations


Presentation on theme: "Multi-Gbps TCP 9:00-10:00 Harvey Newman (Physics, Caltech) High speed networks & grids 10:00-10:45 Sylvain Ravot (Physics, Caltech/CERN) LHC networks and."— Presentation transcript:

1 Multi-Gbps TCP 9:00-10:00 Harvey Newman (Physics, Caltech) High speed networks & grids 10:00-10:45 Sylvain Ravot (Physics, Caltech/CERN) LHC networks and TCP Grid 10:45-11:00 Discussion 11:00-11:15Break 11:15-12:00 Steven Low (CS/EE, Caltech) TCP/AQM protocols and duality model 12:00-12:15 Dohy Hong (INRIA/Caltech) Synchronization effect of TCP 12:15- 1:00Lunch

2 Multi-Gbps TCP 1:00-2:00 Fernando Paganini (EE, UCLA) Control theory and stability of TCP/AQM 2:00-2:30Steven Low (CS/EE, Caltech) Stabilized Vegas (& instability of Reno/RED) 2:30-3:00 Zhikui Wang (EE, UCLA) FAST simulations 3:00-3:15Break 3:15-4:00David Wei, Cheng Hu (CS, Caltech) Some related projects and TCP kernel 4:00-4:15Break 4:15-5:00 Discussion

3 Copyright, 1996 © Dale Carnegie & Associates, Inc. Dualtiy Model of TCP/AQM Steven Low CS & EE, Caltech netlab.caltech.edu 2002

4 Acknowledgment S. Athuraliya, D. Lapsley, V. Li, Q. Yin (UMelb) S. Adlakha (UCLA), D. Choe (Postech/Caltech), J. Doyle (Caltech), K. Kim (SNU/Caltech), L. Li (Caltech), F. Paganini (UCLA), J. Wang (Caltech) L. Peterson, L. Wang (Princeton)

5 Part I Protocols

6 Window Flow Control ~ W packets per RTT Lost packet detected by missing ACK RTT time Source Destination 12W12W12W data ACKs 12W

7 Source Rate Limit the number of packets in the network to window W Source rate = bps If W too small then rate « capacity If W too big then rate > capacity => congestion

8 TCP Window Flow Controls Receiver flow control Avoid overloading receiver Set by receiver awnd:receiver (advertised) window Network flow control Avoid overloading network Set by sender Infer available network capacity cwnd: congestion window Set W = min (cwnd, awnd)

9 Receiver Flow Control Receiver advertises awnd with each ACK Window awnd closed when data is received and ack’d opened when data is read Size of awnd can be the performance limit (e.g. on a LAN) sensible default ~16kB

10 Network Flow Control Source calculates cwnd from indication of network congestion Congestion indications Losses Delay Marks Algorithms to calculate cwnd Tahoe, Reno, Vegas, RED, REM …

11 TCP Congestion Controls Tahoe (Jacobson 1988) Slow Start Congestion Avoidance Fast Retransmit Reno (Jacobson 1990) Fast Recovery Vegas (Brakmo & Peterson 1994) New Congestion Avoidance RED (Floyd & Jacobson 1993) Probabilistic marking BLUE (Feng, Kandlur, Saha, Shin 1999) REM/PI (Athuraliya et al 2000/Hollot et al 2001) Clear buffer, match rate AVQ (Kunniyur & Srikant 2001)

12 TCP/AQM Tahoe (Jacobson 1988) Slow Start Congestion Avoidance Fast Retransmit Reno (Jacobson 1990) Fast Recovery Vegas (Brakmo & Peterson 1994) New Congestion Avoidance BLUE RED (Floyd & Jacobson 1993) REM/PI (Athuraliya et al 2000, Hollot et al 2001) AVQ (Kunniyur & Srikant 2001)

13 TCP Tahoe (Jacobson 1988) SS time window CA SS: Slow Start CA: Congestion Avoidance

14 Slow Start Start with cwnd = 1 (slow start) On each successful ACK increment cwnd cwnd  cnwd + 1 Exponential growth of cwnd each RTT: cwnd  2 x cwnd Enter CA when cwnd >= ssthresh

15 Slow Start data packet ACK receiversender 1 RTT cwnd 1 2 3 4 5 6 7 8 cwnd  cwnd + 1 (for each ACK)

16 Congestion Avoidance Starts when cwnd  ssthresh On each successful ACK: cwnd  cwnd + 1/cwnd Linear growth of cwnd each RTT: cwnd  cwnd + 1

17 Congestion Avoidance cwnd 1 2 3 1 RTT 4 data packet ACK cwnd  cwnd + 1 (for each cwnd ACKS) receiversender

18 Packet Loss Assumption: loss indicates congestion Packet loss detected by Retransmission TimeOuts (RTO timer) Duplicate ACKs (at least 3) 1 23456 123 Packets Acknowledgements 3 3 7 3

19 Fast Retransmit Wait for a timeout is quite long Immediately retransmits after 3 dupACKs without waiting for timeout Adjusts ssthresh flightsize = min(awnd, cwnd) ssthresh  max(flightsize/2, 2) Enter Slow Start (cwnd = 1)

20 Successive Timeouts When there is a timeout, double the RTO Keep doing so for each lost retransmission Exponential back-off Max 64 seconds 1 Max 12 restransmits 1 1 - Net/3 BSD

21 Summary: Tahoe Basic ideas Gently probe network for spare capacity Drastically reduce rate on congestion Windowing: self-clocking Other functions: round trip time estimation, error recovery for every ACK { if (W < ssthresh) then W++ (SS) else W += 1/W (CA) } for every loss { ssthresh = W/2 W = 1 }

22 TCP Reno (Jacobson 1990) CA SS Fast retransmission/fast recovery

23 Fast recovery Motivation: prevent `pipe’ from emptying after fast retransmit Idea: each dupACK represents a packet having left the pipe (successfully received) Enter FR/FR after 3 dupACKs Set ssthresh  max(flightsize/2, 2) Retransmit lost packet Set cwnd  ssthresh + ndup (window inflation) Wait till W=min(awnd, cwnd) is large enough; transmit new packet(s) On non-dup ACK (1 RTT later), set cwnd  ssthresh (window deflation) Enter CA

24 9 9 4 00 Example: FR/FR Fast retransmit Retransmit on 3 dupACKs Fast recovery Inflate window while repairing loss to fill pipe time S R 12345687 8 cwnd8 ssthresh 1 7 4 000 Exit FR/FR 4 4 4 11 00 1011

25 Summary: Reno Basic ideas Fast recovery avoids slow start dupACKs: fast retransmit + fast recovery Timeout: fast retransmit + slow start slow start retransmit congestion avoidance FR/FR dupACKs timeout

26 Active queue management Idea: provide congestion information by probabilistically marking packets Issues How to measure congestion ( p and G )? How to embed congestion measure? How to feed back congestion info? x(t+1) = F( p(t), x(t) ) p(t+1) = G( p(t), x(t) ) Reno, Vegas DropTail, RED, REM

27 RED (Floyd & Jacobson 1993) Congestion measure: average queue length p l (t+1) = [p l (t) + x l (t) - c l ] + Embedding: p-linear probability function Feedback: dropping or ECN marking Avg queue marking 1

28 REM (Athuraliya & Low 2000) Congestion measure: price p l (t+1) = [p l (t) +  (  l b l (t)+ x l (t) - c l )] + Embedding: Feedback: dropping or ECN marking

29 REM (Athuraliya & Low 2000) Congestion measure: price p l (t+1) = [p l (t) +  (  l b l (t)+ x l (t) - c l )] + Embedding: exponential probability function Feedback: dropping or ECN marking

30 Match rate Clear buffer and match rate Clear buffer Key features Sum prices Theorem (Paganini 2000) Global asymptotic stability for general utility function (in the absence of delay)

31 Active Queue Management p l (t) G(p(t), x(t)) DropTailloss [1 - c l / x l (t) ] + (?) REDqueue [ p l (t) + x l (t) - c l ] + Vegasdelay [ p l (t) + x l (t)/c l - 1 ] + REMprice [ p l (t) +  l  b l (t)+ x l (t) - c l ) ] + x(t+1) = F( p(t), x(t) ) p(t+1) = G( p(t), x(t) ) Reno, Vegas DropTail, RED, REM

32 Congestion & performance p l (t) G(p(t), x(t)) Renoloss [1 - c l / x l (t) ] + (?) Reno/REDqueue [ p l (t) + x l (t) - c l ] + Reno/REMprice [ p l (t) +  l  b l (t)+ x l (t) - c l ) ] + Vegasdelay [ p l (t) + x l (t)/c l - 1 ] + Decouple congestion & performance measure RED: `congestion’ = `bad performance’ REM: `congestion’ = `demand exceeds supply’ But performance remains good!

33 Part II Duality Model

34 Congestion Control Heavy tail  Mice-elephants Elephant Internet Mice Congestion control efficient & fair sharing small delay queueing + propagation CDN

35 TCP x i (t)

36 TCP & AQM x i (t) p l (t) Example congestion measure p l (t) Loss (Reno) Queueing delay (Vegas)

37 Outline Protocol (Reno, Vegas, RED, REM/PI…) Equilibrium Performance Throughput, loss, delay Fairness Utility Dynamics Local stability Cost of stabilization

38 TCP & AQM x i (t) p l (t) Duality theory  equilibrium Source rates x i (t) are primal variables Congestion measures p l (t) are dual variables Flow control is optimization process

39 TCP & AQM x i (t) p l (t) Control theory  stability Internet as a feedback system Distributed & delayed

40 Outline TCP/AQM Reno/RED Equilibrium Duality model Stability & optimal control Linearized model A scalable control

41 TCP & AQM x i (t) p l (t) TCP: Reno Vegas AQM: DropTail RED REM/PI AVQ

42 TCP & AQM x i (t) p l (t) TCP: Reno Vegas AQM: DropTail RED REM/PI AVQ

43 Model structure F1F1 FNFN G1G1 GLGL R f (s) R b ’ (s) TCP Network AQM Multi-link multi-source network x y q p from F. Paganini

44 TCP Reno (Jacobson 1990) SS time window CA SS: Slow Start CA: Congestion Avoidance Fast retransmission/fast recovery

45 TCP Vegas (Brakmo & Peterson 1994) SS time window CA Converges, no retransmission … provided buffer is large enough

46 queue size for every RTT { if W/RTT min – W/RTT <  then W ++ if W/RTT min – W/RTT >  then W -- } for every loss W := W/2 Vegas model p l (t+1) = [ p l (t) + y l (t)/c l - 1 ] + Gl:Gl: Fi:Fi:

47 Vegas model F1F1 FNFN G1G1 GLGL R f (s) R b ’ (s) TCP Network AQM x y q p

48 Overview Protocol (Reno, Vegas, RED, REM/PI…) Equilibrium Performance Throughput, loss, delay Fairness Utility Dynamics Local stability Cost of stabilization

49 Outline TCP/AQM Reno/RED Equilibrium Duality model Stability Linearized model A scalable control

50 Flow control Interaction of source rates x s (t) and congestion measures p l (t) Duality theory They are primal and dual variables Flow control is optimization process Example congestion measure Loss (Reno) Queueing delay (Vegas)

51 Model c1c1 c2c2 Network Links l of capacities c l Sources i L(s) - links used by source i U i (x i ) - utility if source rate = x i x1x1 x2x2 x3x3

52 Primal problem Assumptions Strictly concave increasing U i Unique optimal rates x i exist Direct solution impractical

53 Related Work Formulation Kelly 1997 Penalty function approach Kelly, Maulloo & Tan 1998 Kunniyur & Srikant 2000 Duality approach Low & Lapsley 1999 Athuraliya & Low 2000 Extensions Mo & Walrand 2000 La & Anantharam 2000 Dynamics Johari & Tan 2000, Massoulie 2000, Vinnicombe 2000, … Hollot, Misra, Towsley & Gong 2001 Paganini 2000, Paganini, Doyle, Low 2001, …

54 Duality Approach Primal-dual algorithm:

55 Duality Model of TCP Primal-dual algorithm: Reno, VegasDropTail, RED, REM Source algorithm iterates on rates Link algorithm iterates on prices With different utility functions

56 Duality Model of TCP Primal-dual algorithm: Reno, VegasDropTail, RED, REM (x*,p*) primal-dual optimal if and only if

57 Duality Model of TCP Primal-dual algorithm: Reno, VegasDropTail, RED, REM Any link algorithm that stabilizes queue generates Lagrange multipliers solves dual problem

58 Gradient algorithm Theorem (Low & Lapsley ’99) Converge to optimal rates in distributed asynchronous environment

59 Gradient algorithm Vegas: approximate gradient algorithm

60 Summary: equilibrium Flow control problem TCP/AQM Maximize aggregate source utility With different utility functions Primal-dual algorithm Reno, VegasDropTail, RED, REM

61 Implications Performance Rate, delay, queue, loss Fairness Utility function Persistent congestion

62 Performance Delay Congestion measures: end to end queueing delay Sets rate Equilibrium condition: Little’s Law Loss No loss if converge (with sufficient buffer) Otherwise: revert to Reno (loss unavoidable)

63 Vegas Utility Equilibrium (x, p) = (F, G) Proportional fairness

64 Validation (L. Wang, Princeton) Source 1Source 3 Source 5 RTT (ms) 17.1 (17) 21.9 (22) 41.9 (42) Rate (pkts/s)1205 (1200) 1228 (1200) 1161 (1200) Window (pkts) 20.5 (20.4) 27 (26.4) 49.8 (50.4) Avg backlog (pkts) 9.8 (10) NS-2 simulation, single link, capacity = 6 pkts/ms 5 sources with different propagation delays,  s = 2 pkts/RTT meausred theory

65 Persistent congestion Vegas exploits buffer process to compute prices (queueing delays) Persistent congestion due to Coupling of buffer & price Error in propagation delay estimation Consequences Excessive backlog Unfairness to older sources Theorem (Low, Peterson, Wang ’02) A relative error of  s in propagation delay estimation distorts the utility function to

66 Validation (L. Wang, Princeton) Single link, capacity = 6 pkt/ms,  s = 2 pkts/ms, d s = 10 ms With finite buffer: Vegas reverts to Reno Without estimation errorWith estimation error

67 Validation (L. Wang, Princeton) Source rates (pkts/ms) #src1 src2 src3 src4 src5 15.98 (6) 22.05 (2) 3.92 (4) 30.96 (0.94) 1.46 (1.49) 3.54 (3.57) 40.51 (0.50) 0.72 (0.73) 1.34 (1.35) 3.38 (3.39) 50.29 (0.29) 0.40 (0.40) 0.68 (0.67) 1.30 (1.30) 3.28 (3.34) #queue (pkts)baseRTT (ms) 1 19.8 (20) 10.18 (10.18) 2 59.0 (60)13.36 (13.51) 3127.3 (127)20.17 (20.28) 4237.5 (238)31.50 (31.50) 5416.3 (416)49.86 (49.80)

68 Vegas/REM Vegas peak = 43 pkts utilization : 90% - 96%

69 Outline Protocol (Reno, Vegas, RED, REM/PI…) Equilibrium Performance Throughput, loss, delay Fairness Utility Dynamics Local stability Cost of stabilization

70 Vegas model F1F1 FNFN G1G1 GLGL R f (s) R b ’ (s) TCP Network AQM x y q p

71 Vegas model F1F1 FNFN G1G1 GLGL R f (s) R b ’ (s) TCP Network AQM x y q p

72 Approximate model

73

74 Linearized model F1F1 FNFN G1G1 GLGL R f (s) R b ’ (s) TCP Network AQM x y q p  controls equilibrium delay

75 Stability Cannot be satisfied with >1 bottleneck link! Theorem (Choe & Low, ‘02) Locally asymptotically stable if > 0.63

76 Stabilized Vegas

77 Linearized model F1F1 FNFN G1G1 GLGL R f (s) R b ’ (s) TCP Network AQM x y q p  controls equilibrium delay

78 Linearized model F1F1 FNFN G1G1 GLGL R f (s) R b ’ (s) TCP Network AQM x y q p  controls equilibrium delay choose a i = a,  i = 

79 Stability Theorem (Choe & Low, ‘02) Locally asymptotically stable if example LHS < 10*10 = 100 a = 0.1,  = 0.015   ( a,  ) = 120

80 Stability Theorem (Choe & Low, ‘02) Locally asymptotically stable if Application Stabilized TCP with current routers Queueing delay as congestion measure has the right scaling Incremental deployment with ECN

81 Papers A duality model of TCP flow controls (ITC, Sept 2000) Optimization flow control, I: basic algorithm & convergence (ToN, 7(6), Dec 1999) Understanding Vegas: a duality model (J. ACM, 2002) Scalable laws for stable network congestion control (CDC, 2001) REM: active queue management (Network, May/June 2001) netlab.caltech.edu

82 Backup slides

83 Dynamics Small effect on queue AIMD Mice traffic Heterogeneity Big effect on queue Stability!

84 Stable: 20ms delay Window Ns-2 simulations, 50 identical FTP sources, single link 9 pkts/ms, RED marking

85 Queue Stable: 20ms delay Window Ns-2 simulations, 50 identical FTP sources, single link 9 pkts/ms, RED marking

86 Unstable: 200ms delay Window Ns-2 simulations, 50 identical FTP sources, single link 9 pkts/ms, RED marking

87 Unstable: 200ms delay Queue Window Ns-2 simulations, 50 identical FTP sources, single link 9 pkts/ms, RED marking

88 Other effects on queue 20ms 200ms 50% noise avg delay 16ms avg delay 208ms

89 Effect of instability Larger jitters in throughput Lower utilization No noise

90 Linearized system F1F1 FNFN G1G1 GLGL R f (s) R b ’ (s) TCP Network AQM x y q p

91 Linearized system F1F1 FNFN G1G1 GLGL R f (s) R b ’ (s) x y q p TCPREDqueuedelay

92 Stability condition Theorem: TCP/RED stable if  TCP: Small  Small c Large N RED: Small  Large delay

93 Validation

94

95 Stability region Unstable for Large delay Large capacity Small load

96 Role of AQM (Sufficient) stability condition

97 Role of AQM (Sufficient) stability condition TCP: AQM: scale down K & shape H

98 Role of AQM TCP: RED:REM/PI: Queue dynamics (windowing) Problem!!

99 Role of AQM To stabilize a given TCP F i with least cost Linearized F, single link, identical sources Any AQM solves optimal control with appropriate Q and R

100 Role of AQM To stabilize a given TCP F i with least cost Linearized F, single link, identical sources k 1 =0 Nonzero buffer k 3 =0 Slower decay


Download ppt "Multi-Gbps TCP 9:00-10:00 Harvey Newman (Physics, Caltech) High speed networks & grids 10:00-10:45 Sylvain Ravot (Physics, Caltech/CERN) LHC networks and."

Similar presentations


Ads by Google