1 Characterization and Evaluation of TCP and UDP-based Transport on Real Networks
Les Cottrell, Saad Ansari, Parakram Khandpur, Ruchi Gupta, Richard Hughes-Jones, Michael Chen, Larry McIntosh, Frank Leers
SLAC, Manchester University, Chelsio and Sun
Protocols for Fast Long Distance Networks, Lyon, France, February
Partially funded by DOE/MICS Field Work Proposal on Internet End-to-end Performance Monitoring (IEPM)

2 Project goals
Evaluate various techniques for achieving high bulk-throughput on fast long-distance real production WAN links
– How useful for production: ease of configuration, throughput, convergence, fairness, stability etc.
– For different RTTs
Recommend “optimum” techniques for data intensive science (BaBar) transfers using bbftp, bbcp, GridFTP
Provide input for validation of simulator & emulator findings

3 Techniques rejected
Jumbo frames
– Not an IEEE standard
– May break some UDP applications
– Not supported on SLAC LAN
Sender mods only: the HENP model is a few big senders, lots of smaller receivers
– Simplifies deployment, only a few hosts at a few sending sites
– So no Dynamic Right Sizing (DRS)
Runs on production nets
– No router mods (XCP/ECN)

4 Software Transports
Advanced TCP stacks
– To overcome the AIMD congestion behavior of Reno-based TCPs
– BUT: the SLAC “datamovers” are all based on Solaris, while the advanced TCPs are currently Linux only
SLAC production systems people are concerned about non-standard kernels, and about keeping TCP patches current with security patches for the SLAC-supported Linux version
So also very interested in a transport that runs in user space (no kernel mods); see the sketch below
– Evaluate UDT from the UIC folks
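To make the user-space point concrete, here is a minimal sketch of a rate-paced sender built entirely on UDP sockets in Python. It is purely illustrative and is not UDT's API: the destination, rate and packet size are made-up parameters, and a real user-space transport such as UDT adds reliability, acknowledgements and congestion control on top of this pacing loop.

```python
import socket
import time

def paced_udp_send(dest=("127.0.0.1", 9000), rate_mbps=100, payload=1400, seconds=5):
    """Illustrative user-space sender: pace UDP datagrams to a target rate.
    Everything here runs in user space; no kernel modifications are needed."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    interval = (payload * 8) / (rate_mbps * 1e6)   # seconds between datagrams
    data = b"\x00" * payload
    deadline = time.time() + seconds
    next_send = time.time()
    sent = 0
    while time.time() < deadline:
        sock.sendto(data, dest)
        sent += 1
        next_send += interval
        delay = next_send - time.time()
        if delay > 0:
            time.sleep(delay)
    sock.close()
    return sent

if __name__ == "__main__":
    print(paced_udp_send())   # number of datagrams sent in the test interval
```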

5 Hardware Assists
For 1 Gbits/s paths, cpu, bus etc. are not a problem; for 10 Gbits/s they are more important
NIC assistance to the CPU is becoming popular
– Checksum offload
– Interrupt coalescence
– Large send/receive offload (LSO/LRO)
– TCP Offload Engine (TOE)
Several vendors for 10 Gbits/s NICs, at least one for a 1 Gbits/s NIC
But this currently restricts us to using the NIC vendor's TCP implementation
Most focus is on the LAN
– Cheap alternative to Infiniband, MyriNet etc.

6 Protocols Evaluated
TCP (implementations as of April 2004)
– Linux 2.4 New Reno with SACK: single and parallel streams (Reno)
– Scalable TCP (Scalable)
– Fast TCP
– HighSpeed TCP (HSTCP)
– HighSpeed TCP Low Priority (HSTCP-LP)
– Binary Increase Control TCP (BICTCP)
– Hamilton TCP (HTCP)
– Layering TCP (LTCP)
UDP
– UDT v2

7 Methodology (1 Gbit/s)
Chose 3 paths from SLAC
– Caltech (10ms), Univ Florida (80ms), CERN (180ms)
Used iperf/TCP and UDT/UDP to generate traffic
Each run was 16 minutes, in 7 regions (a sketch of such a run follows below)
[Figure: test setup — iperf or UDT flows plus 1/s ICMP/ping traffic from SLAC through the TCP/UDP bottleneck to Caltech/UFL/CERN, with regions of 2 and 4 minutes]
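A rough sketch of how such a staged run might be driven is below. It is an assumption about the harness, not the actual SLAC scripts: the hostname, window size and the exact add/remove pattern of flows across the 7 regions are illustrative, and it uses only standard iperf client flags (-c, -P, -t, -w, -i).

```python
import subprocess

# Illustrative driver for one 16-minute run split into regions in which
# parallel flows are added and then removed again.
REMOTE = "remote.example.org"   # assumed remote iperf server

def iperf_region(streams, seconds, window="2M"):
    """Run one region: 'streams' parallel iperf TCP flows for 'seconds'."""
    cmd = ["iperf", "-c", REMOTE, "-P", str(streams),
           "-t", str(seconds), "-w", window, "-i", "5"]
    return subprocess.run(cmd, capture_output=True, text=True).stdout

def one_run():
    # 7 regions totalling 960 s (16 minutes); the pattern is an assumption
    for streams, seconds in [(1, 120), (2, 240), (3, 120), (4, 120),
                             (3, 120), (2, 120), (1, 120)]:
        print(iperf_region(streams, seconds))

if __name__ == "__main__":
    one_run()
```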

8 Behavior Indicators
Achievable throughput
Stability S = σ/μ (standard deviation / average)
Intra-protocol fairness F (see below)
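The fairness formula itself did not survive the transcript; assuming the slide used the standard Jain fairness index over the n concurrent flows, with x_i the throughput of flow i, the two indicators would read:

```latex
S = \frac{\sigma}{\mu},
\qquad
F = \frac{\left(\sum_{i=1}^{n} x_i\right)^{2}}{n \sum_{i=1}^{n} x_i^{2}}
```

Jain's index ranges from 1/n (one flow takes everything) to 1 (all flows get equal shares), which matches the "closer to 1 is better" reading on the next slide.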

9 Behavior wrt RTT
10 ms (Caltech): Throughput, Stability (small is good), Fairness minimum over regions 2 thru 6 (closer to 1 is better)
– Excl. FAST ~ 720±64 Mbps, S ~ 0.18±0.04, F ~ 0.95
– FAST ~ 400±120 Mbps, S = 0.33, F ~
80 ms (U. Florida): Throughput, Stability
– All ~ 350±103 Mbps, S = 0.3±0.12, F ~
180 ms (CERN):
– All ~ 340±130 Mbps, S = 0.42±0.17, F ~ 0.81
The Stability and Fairness effects are more manifest at longer RTT, so focus on CERN

10 Reno single stream
Low performance on fast long distance paths
– AIMD (add a = 1 packet to cwnd per RTT; decrease cwnd by factor b = 0.5 on congestion)
– Net effect: recovers slowly, does not effectively use available bandwidth, so poor throughput (see the toy sketch below)
Remaining flows do not take up the slack when a flow is removed
Congestion has a dramatic effect and recovery is slow
Multiple streams increase the recovery rate
RTT increases when it achieves its best throughput
[Plot: SLAC to CERN]
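A toy back-of-the-envelope sketch of the AIMD recovery described above (idealized: one loss per congestion event, no slow start or timeouts; path numbers taken from the SLAC-CERN case on slide 7):

```python
def aimd_recovery_rtts(bdp_packets, a=1, b=0.5):
    """RTTs needed for Reno-style AIMD to grow cwnd from b*BDP back to the
    full BDP after a single congestion event (additive increase of 'a'
    packets per RTT)."""
    cwnd = b * bdp_packets
    rtts = 0
    while cwnd < bdp_packets:
        cwnd += a
        rtts += 1
    return rtts

# SLAC-CERN-like path: ~1 Gbit/s, 180 ms RTT, 1500-byte packets
bdp = int(1e9 * 0.180 / (1500 * 8))          # ~15,000 packets in flight
print(aimd_recovery_rtts(bdp))               # ~7,500 RTTs ...
print(aimd_recovery_rtts(bdp) * 0.180 / 60)  # ... i.e. ~22 minutes to recover
```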

11 Fast
Also uses RTT to detect congestion
– RTT is very stable: σ(RTT) ~ 9 ms vs 37±0.14 ms for the others
Big drops in throughput which take several seconds to recover from
2nd flow never gets an equal share of the bandwidth
[Plot: SLAC-CERN]

12 HTCP
One of the best performers
– Throughput is high
– Big effects on RTT when it achieves its best throughput
– Flows share equally
Appears to need >1 flow to achieve best throughput
Two flows share equally; >2 flows appears less stable
[Plot: SLAC-CERN]

13 BICTCP
Needs >1 flow for best throughput

14 UDTv2
Similar behavior to the better TCP stacks
– RTT very variable at best throughputs
– Intra-protocol sharing is good
– Behaves well as flows add & subtract

15 Overall
Proto      Avg thru (Mbps)  S (σ/μ)  min(F)  σ(RTT)  MHz/Mbps
Scal.      423±
BIC        412±
HTCP       402±
UDT        390±
LTCP       376±
Fast       335±
HSTCP      255±
Reno       248±
HSTCP-LP   228±
Scalable is one of the best, but its inter-protocol fairness is poor (see Bullot et al.)
BIC & HTCP are about equal
UDT is close, BUT cpu intensive (factor of 2; used to be > factor of 10 worse)
Fast gives low RTT values & variability
All TCP protocols use similar cpu (HSTCP looks poor because its throughput is low)

16 10Gbps tests
At SC2004, using two 10 Gbps dedicated paths between Pittsburgh and Sunnyvale
– Using Solaris 10 (build 69) and Linux 2.6
– On Sunfire Vx0z (dual & quad 2.4 GHz 64-bit AMD Opterons) with PCI-X 133 MHz 64 bit
– Only 1500 Byte MTUs
Achievable performance limits (using iperf)
– Reno TCP (multi-flows) vs UDTv2
– TOE (Chelsio) vs no TOE (S2io)

17 Results
UDT limit was ~ 4.45 Gbits/s
– Cpu limited
TCP limit was about 7.5±0.07 Gbps, regardless of:
– Whether LAN (back to back) or WAN (the WAN used a 2 MB window & 16 streams)
– Whether Solaris 10 or Linux 2.6
– Whether S2io or Chelsio NIC
Gating factor = PCI-X
– Raw bandwidth 8.53 Gbps
– But the transfer is broken into segments to allow interleaving
– E.g. with a max memory read byte count of 4096 Bytes, the Intel Pro/10GbE LR NIC limit is 6.83 Gbits/s
One host with 4 cpus & 2 NICs sent 11.5±0.2 Gbps to two dual-cpu hosts with 1 NIC each
Two hosts to two hosts (1 NIC/host): 9.07 Gbps goodput forward & 5.6 Gbps reverse
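For reference, the quoted 8.53 Gbps raw figure is just the PCI-X bus parameters from slide 16 multiplied out:

```latex
133.3\,\text{MHz} \times 64\,\text{bit} \approx 8.53\,\text{Gbit/s}
```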

18 TCP CPU Utilization
CPU power is important (each cpu = 2.4 GHz)
Throughput increases with flows
Utilization is not linear in throughput; it depends on the number of flows too
Normalize as GHz/Gbps: Chelsio + TOE + Linux vs S2io + CKS offload + Sol10 (see the sketch below)
– S2io supports LSO but Sol10 did not, so it was not used
– Microsoft reports 0.017 GHz/Gbps with Windows + S2io/LSO, 1 flow
[Plot: throughput & CPU utilization vs number of flows, Chelsio (TOE)]
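The GHz/Gbps normalization is presumably total CPU cycles consumed per unit of delivered throughput; a small sketch under that assumption (the function name and the example numbers are made up, not from the slides):

```python
def ghz_per_gbps(cpu_utilization, num_cpus, cpu_ghz, throughput_gbps):
    """Assumed normalization: CPU cycles burned per Gbit/s transferred.
    cpu_utilization is the average fraction (0-1) across all CPUs."""
    return cpu_utilization * num_cpus * cpu_ghz / throughput_gbps

# e.g. two 2.4 GHz Opterons at 50% utilization driving 7.5 Gbit/s
print(ghz_per_gbps(0.5, 2, 2.4, 7.5))   # -> 0.32 GHz/Gbps
```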

19 Conclusions
Need testing on real networks
– Controlled simulation & emulation are critical for understanding
– BUT need to verify, and results can look different than expected (e.g. Fast)
Most important for transoceanic paths
UDT looks promising, but still needs work for > 6 Gbits/s
Need to evaluate various offloads (TOE, LSO...)
Need to repeat inter-protocol fairness vs Reno
New buses are important; need NICs to support them, then evaluate

20 Further Information
Web site with lots of plots & analysis –
Inter-protocol comparison (Journal of Grid Comp, PFLD04) –
SC2004 details – www-iepm.slac.stanford.edu/monitoring/bulk/sc2004/