Characterization and Evaluation of TCP and UDP-based Transport on Real Networks
Les Cottrell, Saad Ansari, Parakram Khandpur, Ruchi Gupta, Richard Hughes-Jones, Michael Chen, Larry McIntosh, Frank Leers


Slide 1: Characterization and Evaluation of TCP and UDP-based Transport on Real Networks
Les Cottrell, Saad Ansari, Parakram Khandpur, Ruchi Gupta, Richard Hughes-Jones, Michael Chen, Larry McIntosh, Frank Leers
SLAC, Manchester University, Chelsio and Sun
Site visit to SLAC by DoE program managers Thomas Ndousse & Mary Anne Scott, April 27, 2005
www.slac.stanford.edu/grp/scs/net/talk05/tcp-apr05.ppt
Partially funded by the DOE/MICS Field Work Proposal on Internet End-to-end Performance Monitoring (IEPM); also supported by IUPAP

Slide 2: Project goals
– Evaluate various techniques for achieving high bulk throughput on fast, long-distance, real production WAN links
– Compare & contrast: ease of configuration, throughput, convergence, fairness, stability, etc., for different RTTs
– Recommend "optimum" techniques for data-intensive science (BaBar) transfers using bbftp, bbcp, GridFTP
– Validate simulator & emulator findings & provide feedback

Slide 3: Techniques rejected
Jumbo frames:
– Not an IEEE standard
– May break some UDP applications
– Not supported on the SLAC LAN
Sender-side modifications only (the HENP model is a few big senders, lots of smaller receivers):
– Simplifies deployment: only a few hosts at a few sending sites
– So no Dynamic Right Sizing (DRS)
Must run on production networks:
– No router modifications (rules out XCP/ECN)

Slide 4: Software transports
Advanced TCP stacks:
– To overcome the AIMD congestion behavior of Reno-based TCPs
– BUT: SLAC "datamovers" are all based on Solaris, while the advanced TCP stacks are currently Linux-only
– SLAC production systems staff are concerned about non-standard kernels, and about keeping TCP patches current with security patches for the SLAC-supported Linux version
So also very interested in transports that run in user space (no kernel modifications):
– Evaluate UDT from the UIC group

Slide 5: Hardware assists
For 1 Gbit/s paths, CPU, bus, etc. are not a problem; for 10 Gbit/s they are more important.
NIC assistance to the CPU is becoming popular:
– Checksum offload
– Interrupt coalescence
– Large send/receive offload (LSO/LRO)
– TCP Offload Engine (TOE)
Several vendors offer 10 Gbit/s NICs, and at least one offers a 1 Gbit/s NIC
– But a TOE currently restricts you to using the NIC vendor's TCP implementation
Most focus is on the LAN:
– A cheap alternative to InfiniBand, Myrinet, etc.

Slide 6: Protocols evaluated
TCP (implementations as of April 2004):
– Linux 2.4 New Reno with SACK: single and parallel streams (Reno)
– Scalable TCP (Scalable)
– FAST TCP
– HighSpeed TCP (HSTCP)
– HighSpeed TCP Low Priority (HSTCP-LP)
– Binary Increase Control TCP (BICTCP)
– Hamilton TCP (HTCP)
– Layering TCP (LTCP)
UDP:
– UDT v2

Slide 7: Methodology (1 Gbit/s)
Chose 3 paths from SLAC:
– Caltech (10 ms), Univ. of Florida (80 ms), CERN (180 ms)
Used iperf/TCP and UDT/UDP to generate traffic; 1 ping/s measured RTT.
Each run was 16 minutes, in 7 regions.
[Figure: test setup, with iperf/UDT traffic and ICMP/ping probes sent from SLAC across the bottleneck to Caltech/UFL/CERN; the region timeline is marked in 2-minute and 4-minute intervals]

Slide 8: Behavior indicators
– Achievable throughput
– Stability: S = σ/μ (standard deviation over average)
– Intra-protocol fairness: F = (Σxᵢ)² / (n·Σxᵢ²), Jain's fairness index over the n flows' throughputs xᵢ (F = 1 means perfectly equal shares)
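Both indicators are straightforward to compute from per-interval throughput samples. A minimal sketch (the sample numbers are made up for illustration):

```python
import statistics

def stability(samples):
    """S = sigma/mu: std dev of a flow's throughput over its mean (smaller = steadier)."""
    return statistics.stdev(samples) / statistics.mean(samples)

def jain_fairness(flows):
    """Jain's index F = (sum x_i)^2 / (n * sum x_i^2) over the n flows' throughputs."""
    n = len(flows)
    return sum(flows) ** 2 / (n * sum(x * x for x in flows))

throughput = [400.0, 420.0, 380.0]      # Mbps samples for one flow (made up)
print(stability(throughput))            # 0.05
print(jain_fairness([450.0, 450.0]))    # 1.0  (perfectly equal shares)
print(jain_fairness([600.0, 300.0]))    # 0.9  (unequal shares pull F below 1)
```

Jain's index is bounded between 1/n (one flow takes everything) and 1 (all flows equal), which matches the "closer to 1 is better" reading used on the following slides.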

Slide 9: Behavior wrt RTT
10 ms (Caltech): throughput, stability (small is good), minimum fairness over regions 2 thru 6 (closer to 1 is better):
– Excluding FAST: ~720±64 Mbps, S ~ 0.18±0.04, F ~ 0.95
– FAST: ~400±120 Mbps, S = 0.33, F ~ 0.88
80 ms (U. Florida):
– All: ~350±103 Mbps, S = 0.3±0.12, F ~ 0.82
180 ms (CERN):
– All: ~340±130 Mbps, S = 0.42±0.17, F ~ 0.81
The stability and fairness effects are more manifest at longer RTT, so we focus on CERN.

Slide 10: Reno single stream
Low performance on fast long-distance paths:
– AIMD: add a = 1 packet to cwnd per RTT; decrease cwnd by factor b = 0.5 on congestion
– Net effect: recovers slowly and does not effectively use the available bandwidth, so poor throughput
Observations (SLAC to CERN):
– Congestion has a dramatic effect and recovery is slow
– Remaining flows do not take up the slack when a flow is removed
– Multiple streams increase the recovery rate
– RTT increases when it achieves its best throughput
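The slow recovery can be made concrete with a back-of-the-envelope AIMD sketch; the 1 Gbit/s / 180 ms figures match this talk's SLAC-CERN path, while the 1500-byte packet size is an assumption:

```python
def rtts_to_recover(bdp_pkts, a=1, b=0.5):
    """RTTs for Reno-style AIMD to regrow cwnd to the path's bandwidth-delay
    product after one multiplicative decrease (cwnd *= b, then +a pkt per RTT)."""
    cwnd = bdp_pkts * b
    rtts = 0
    while cwnd < bdp_pkts:
        cwnd += a
        rtts += 1
    return rtts

# 1 Gbit/s, 180 ms RTT, assuming 1500-byte packets -> BDP ~ 15000 packets
bdp = int(1e9 * 0.180 / (1500 * 8))
print(rtts_to_recover(bdp))               # 7500 RTTs to refill the pipe
print(rtts_to_recover(bdp) * 0.180 / 60)  # 22.5 minutes per loss event
```

A single loss event therefore costs tens of minutes of underutilization on this path, which is exactly why a single Reno stream performs poorly here and why multiple streams (several independent AIMD loops) recover faster.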

Slide 11: FAST
Also uses RTT to detect congestion:
– RTT is very stable: σ(RTT) ~ 9 ms vs 37±0.14 ms for the others (SLAC-CERN)
– Big drops in throughput, which take several seconds to recover from
– The 2nd flow never gets an equal share of the bandwidth
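FAST's delay-based control can be sketched with the published window update w ← min(2w, (1−γ)w + γ((baseRTT/RTT)·w + α)), which steers each flow toward keeping about α packets queued at the bottleneck; the α and γ values below are illustrative, not the tuned ones:

```python
def fast_update(w, base_rtt, rtt, alpha=200.0, gamma=0.5):
    """One FAST-style window update: move toward the window that keeps
    ~alpha packets queued (alpha/gamma values here are illustrative)."""
    target = (base_rtt / rtt) * w + alpha
    return min(2.0 * w, (1.0 - gamma) * w + gamma * target)

# With a stable RTT signal the window converges smoothly instead of sawtoothing.
w, base_rtt, rtt = 1000.0, 0.180, 0.190   # SLAC-CERN-like RTT with 10 ms queueing
for _ in range(500):
    w = fast_update(w, base_rtt, rtt)
print(round(w))   # 3800 = alpha / (1 - base_rtt/rtt), the fixed point
```

This also shows why the scheme depends on a clean RTT signal: the fixed point is set by the measured queueing delay, so a noisy or drifting baseRTT estimate moves the operating point, consistent with the throughput drops and unequal sharing observed above.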

Slide 12: HTCP
One of the best performers:
– Throughput is high
– Big effects on RTT when it achieves its best throughput
– Flows share equally (two flows share equally, SLAC-CERN)
Appears to need >1 flow to achieve its best throughput

Slide 13: BICTCP
Needs >1 flow for best throughput
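BICTCP's name comes from its growth rule: between losses it binary-searches for the largest sustainable window by repeatedly jumping halfway toward w_max, the window at the last loss. A rough per-RTT sketch (the s_max/s_min step caps are illustrative, and real BIC also probes above w_max):

```python
def bic_increase(cwnd, w_max, s_max=32.0, s_min=0.01):
    """One BIC-style per-RTT increase: jump halfway toward w_max (the window
    at the last loss), with the jump clamped to [s_min, s_max]."""
    step = (w_max - cwnd) / 2.0
    return cwnd + max(s_min, min(step, s_max))

cwnd, w_max = 9900.0, 10000.0
trace = []
for _ in range(5):
    cwnd = bic_increase(cwnd, w_max)
    trace.append(round(cwnd))
print(trace)   # steps shrink as cwnd closes in on w_max
```

Far below w_max the clamp makes growth linear (aggressive), while near w_max the halving makes it cautious, which is what lets BIC hold a large window stably on long-RTT paths.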

Slide 14: UDTv2
Similar behavior to the better TCP stacks:
– RTT is very variable at the best throughputs
– Intra-protocol sharing is good
– Behaves well as flows are added & removed

Slide 15: Overall

Proto      Avg thru (Mbps)  S (σ/μ)  min(F)  σ(RTT) (ms)  MHz/Mbps
Scalable   423±115          0.27     0.83    22           0.64
BIC        412±117          0.28     0.98    55           0.71
HTCP       402±113          0.28     0.99    57           0.65
UDT        390±136          0.35     0.95    49           1.2
LTCP       376±137          0.36     0.56    41           0.67
FAST       335±110          0.33     0.58     9           0.66
HSTCP      255±187          0.73     0.79    25           0.9
Reno       248±163          0.66     0.62     2           0.63
HSTCP-LP   228±114          0.5      0.64    33           0.65

– Scalable is one of the best, but its inter-protocol fairness is poor (see Bullot et al.)
– BIC & HTCP are about equal
– UDT is close, BUT CPU intensive (it used to be much worse, by a factor of 10)
– FAST gives low RTT values & variability
– All TCP protocols use similar CPU (HSTCP looks poor because its throughput is low)

Slide 16: Conclusions
– Need testing on real networks: controlled simulation & emulation are critical for understanding, BUT we also need to verify, and results can look different than expected (e.g. FAST)
– Most important for transoceanic paths
– UDT looks promising
– Need to evaluate the various offloads (TOE, LSO, ...)
– Need to repeat inter-protocol fairness tests vs Reno
– More implementations are emerging (Westwood+), along with improvements to existing ones
– Test at 10 Gbit/s

Slide 17: Further information
Web site with lots of plots & analysis:
– www.slac.stanford.edu/grp/scs/net/papers/pfld05/ruchig/Fairness/
Inter-protocol comparison (Journal of Grid Computing, PFLD04):
– www.slac.stanford.edu/cgi-wrap/getdoc/slac-pub-10402.pdf
SC2004 details:
– www-iepm.slac.stanford.edu/monitoring/bulk/sc2004/

