MB-NG Review: High Performance Network Demonstration
Richard Hughes-Jones, The University of Manchester, UK
21 April 2004

It works? So what's the Problem with TCP?
- TCP has 2 phases: Slow Start & Congestion Avoidance
- AIMD and high-bandwidth, long-distance networks: the poor performance of TCP in high-bandwidth wide-area networks is due in part to the TCP congestion control algorithm and its congestion window (cwnd)
- For each ACK in an RTT without loss: cwnd -> cwnd + a/cwnd (additive increase, a = 1)
- For each window experiencing loss: cwnd -> cwnd - b*cwnd (multiplicative decrease, b = 1/2)
- Time to recover from one packet loss on a ~100 ms RTT path:
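A rough, back-of-the-envelope recovery-time estimate to illustrate this last point (the specific numbers below are assumptions for illustration, not values from the slide):

```latex
% After a single loss, standard TCP halves cwnd and then regains it at
% roughly one segment per RTT (a = 1), so
\[
  t_{\text{recover}} \;\approx\; \frac{\text{cwnd}}{2}\,\text{RTT},
  \qquad
  \text{cwnd} \;=\; \frac{\text{rate} \times \text{RTT}}{\text{MSS}} .
\]
% Illustrative numbers: a 1 Gbit/s path with a 100 ms RTT and 1500-byte
% segments gives cwnd ~ 8300 segments, so recovery takes ~4200 RTTs,
% i.e. roughly 7 minutes; at 10 Gbit/s it is well over an hour.
```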

Investigation of new TCP Stacks
- HighSpeed TCP: a and b vary with the current cwnd, using a table. a increases more rapidly at larger cwnd, so the connection returns to the 'optimal' cwnd for the network path sooner; b decreases less aggressively, so cwnd, and hence throughput, falls less after a loss.
- Scalable TCP: a and b are fixed adjustments for the increase and decrease of cwnd. a = 1/100 – the increase is greater than TCP Reno; b = 1/8 – the decrease on loss is less than TCP Reno. Scalable over any link speed.
- FAST TCP: uses round-trip time as well as packet loss to indicate congestion, with rapid convergence to a fair equilibrium throughput.
- HSTCP-LP: HighSpeed TCP (Low Priority) – backs off if the RTT increases.
- BiC-TCP: additive increase for large cwnd, binary search for small cwnd.
- H-TCP: behaves as standard TCP just after congestion, then switches to a high-performance mode.
- …
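A minimal sketch of the cwnd update rules described above, for Reno and Scalable TCP only (HighSpeed TCP's full a(cwnd)/b(cwnd) table is omitted); this is an illustration, not the kernel implementations:

```python
def reno_on_ack(cwnd, a=1.0):
    """Standard TCP: cwnd grows by a/cwnd per ACK, i.e. ~a segments per RTT."""
    return cwnd + a / cwnd

def reno_on_loss(cwnd, b=0.5):
    """Standard TCP: multiplicative decrease, cwnd halved on a loss event."""
    return cwnd * (1.0 - b)

def scalable_on_ack(cwnd, a=0.01):
    """Scalable TCP: fixed increase of a = 1/100 segment per ACK."""
    return cwnd + a

def scalable_on_loss(cwnd, b=0.125):
    """Scalable TCP: decrease cwnd by only b = 1/8 on a loss event."""
    return cwnd * (1.0 - b)

if __name__ == "__main__":
    # Illustrative growth over one RTT at a large window of 8000 segments:
    cwnd = 8000.0
    print("Reno gains     ~1 segment per RTT")
    print("Scalable gains ~%.0f segments per RTT" % (cwnd * 0.01))
```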

Comparison of TCP Stacks
- TCP response function: throughput vs loss rate – steeper means faster recovery
- Packets dropped in the kernel
- MB-NG: RTT 6 ms; DataTAG: RTT 120 ms
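For reference, rough forms of the response functions behind this throughput-vs-loss comparison (the Reno form is the standard approximation; the HighSpeed TCP constants are those of RFC 3649, and the Scalable form is only the leading behaviour – none of these are taken from the slide itself):

```latex
% Standard (Reno) TCP with segment size MSS and loss probability p:
\[
  T_{\text{Reno}} \;\approx\; \frac{\text{MSS}}{\text{RTT}} \sqrt{\frac{3}{2p}}
\]
% HighSpeed TCP and Scalable TCP fall off much more slowly with p:
\[
  W_{\text{HSTCP}} \;\approx\; \frac{0.12}{p^{0.835}} \ \text{segments},
  \qquad
  T_{\text{Scalable}} \;\propto\; \frac{1}{p},
\]
% which is why their curves sit higher and steeper on the log-log plot
% of throughput vs loss rate, and why they recover faster after a loss.
```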

Multi-Gigabit flows at SC2003 BW Challenge
- Three server systems with 10 Gigabit Ethernet NICs
- Used the DataTAG altAIMD stack, 9000 byte MTU
- Sent memory-to-memory iperf TCP streams from the SLAC/FNAL booth in Phoenix to:
- Chicago Starlight: RTT 65 ms, window 60 MB, Phoenix CPU 2.2 GHz, 3.1 Gbit/s HSTCP, I = 1.6%
- Amsterdam SARA: RTT 175 ms, window 200 MB, Phoenix CPU 2.2 GHz, 4.35 Gbit/s HSTCP, I = 6.9%
- New TCP stacks are very stable
- Both used Abilene to Chicago
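The TCP windows quoted above are of the order of the bandwidth-delay product of each path; a quick sanity check (an illustrative calculation, not from the slide):

```python
def bdp_bytes(rate_bps, rtt_s):
    """Bandwidth-delay product: bytes that must be in flight to fill the pipe."""
    return rate_bps * rtt_s / 8.0

# Rates roughly matching those achieved at SC2003 (illustrative only).
for path, gbps, rtt_ms, window_mb in [
        ("Phoenix -> Chicago Starlight", 3.1, 65, 60),
        ("Phoenix -> Amsterdam SARA",    4.35, 175, 200)]:
    bdp_mb = bdp_bytes(gbps * 1e9, rtt_ms / 1e3) / 2**20
    print(f"{path}: BDP ~{bdp_mb:.0f} MB at {gbps} Gbit/s, {rtt_ms} ms RTT "
          f"(configured window {window_mb} MB)")
```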

Transfer Applications – Throughput [1]
- 2 GByte file transferred, RAID0 disks, Manchester – UCL
- GridFTP: see alternating 600/800 Mbit/s and zeros
- Apache web server + curl-based client: see a steady 720 Mbit/s

Transfer Applications – Throughput [2]
- 2 GByte file transferred, RAID5 (4 disks), Manchester – RAL
- bbcp: mean 710 Mbit/s
- GridFTP: see many zeros
- Plot annotations: mean ~710 and mean ~620 Mbit/s

Topology of the MB-NG Network
[Diagram: Manchester, UCL and RAL domains (hosts man01–man03, lon01–lon03, ral01–ral02) with Cisco 7609 edge and boundary routers, linked across the UKERNA development network; key: Gigabit Ethernet, 2.5 Gbit POS access, MPLS, admin domains]

High Throughput Demo
[Diagram: Manchester host man03 – 1 GEth – Cisco 7609 – MB-NG core (Gbit SDH, Cisco GSR) – Cisco 7609 – 1 GEth – London host; dual Xeon 2.2 GHz machines]
- Send data with TCP
- Drop packets
- Monitor TCP with Web100

Standard to HS-TCP
- No loss, but output queue filled by sender

HS-TCP to Scalable
- No loss, but output queue filled by sender

Standard, HS-TCP, Scalable
- Drop 1 in 25,000 packets

Standard Reno TCP
- Drop 1 in 10^6 packets

Focus on Helping Real Users: Throughput CERN – SARA
- Using the GÉANT backup link, 1 GByte disk-to-disk transfers
- Blue is the data, red is the TCP ACKs
- Standard TCP: average throughput 167 Mbit/s – users see … Mbit/s!
- High-Speed TCP: average throughput 345 Mbit/s
- Scalable TCP: average throughput 340 Mbit/s
- Technology link to EU projects: DataGrid, DataTAG & GÉANT

BaBar Case Study: Host, PCI & RAID Controller Performance
- RAID0 (striped) & RAID5 (striped with redundancy)
- Controllers: 3Ware 7506 parallel ATA 66 MHz; 3Ware 7505 parallel ATA 33 MHz; 3Ware 8506 Serial ATA 66 MHz; ICP Serial ATA 33/66 MHz
- Tested on a dual 2.2 GHz Xeon, Supermicro P4DP8-G2 motherboard
- Disks: Maxtor 160 GB 7200 rpm, 8 MB cache
- Read-ahead kernel tuning: /proc/sys/vm/max-readahead
- Plots: disk–memory read speeds; memory–disk write speeds
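A minimal sketch of the kind of sequential read/write timing behind these disk–memory plots (the file path and sizes are hypothetical; the original measurements used their own benchmark tools, and a real test needs files larger than RAM or a flushed page cache for the read figures to be meaningful):

```python
import os
import time

def write_speed(path, size_mb=2048, block_mb=1):
    """Sequential write: stream size_mb of zeros in block_mb chunks, fsync, time it."""
    block = b"\0" * (block_mb * 2**20)
    t0 = time.time()
    with open(path, "wb") as f:
        for _ in range(size_mb // block_mb):
            f.write(block)
        f.flush()
        os.fsync(f.fileno())              # force the data out to the RAID array
    return size_mb / (time.time() - t0)   # MBytes/s

def read_speed(path, block_mb=1):
    """Sequential read: read the file back in block_mb chunks and time it."""
    size_mb = os.path.getsize(path) / 2**20
    t0 = time.time()
    with open(path, "rb") as f:
        while f.read(block_mb * 2**20):
            pass
    return size_mb / (time.time() - t0)   # MBytes/s

if __name__ == "__main__":
    test_file = "/raid/throughput.tmp"    # hypothetical mount point of the array under test
    print("write: %.0f MBytes/s" % write_speed(test_file))
    print("read:  %.0f MBytes/s" % read_speed(test_file))
```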

Topology of the MB-NG Network (with HW RAID)
[Same diagram as above – Manchester, UCL and RAL domains over the UKERNA development network – with the hardware RAID systems marked]

BaBar Data: Throughput on MB-NG kit
- RAID5 (4 disks), RAL – Manchester; data set includes small files (~KBytes)
- bbftp: 1 stream with compression; 6 streams; 1 stream no compression
- 10 × 2 GByte files – each peak is a 20 GByte transfer
- bbftp 1 stream, files ≥ 1 MByte, with bb diag

Helping Real Users: Radio Astronomy VLBI
- Proof of concept with NRNs & GÉANT
- 1024 Mbit/s, 24 on 7 – NOW

VLBI Project: Throughput, Jitter & 1-way Delay
- 1472 byte packets, Manchester -> JIVE: FWHM 22 µs (back-to-back 3 µs)
- 1-way delay – note the packet loss (points with zero 1-way delay)
- 1472 byte packets, Manchester -> Dwingeloo (JIVE)
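A minimal sketch of the UDP sender/receiver pattern behind these jitter and one-way-delay plots (the real measurements used UDPmon, a C tool; this Python illustration assumes synchronised clocks for the one-way delay and uses crude sleep-based pacing):

```python
import socket
import struct
import sys
import time

PKT_SIZE = 1472                   # UDP payload size used in the VLBI tests
HDR = struct.Struct("!Id")        # sequence number + send timestamp (seconds)

def sender(dest, port, n=10000, spacing_us=200):
    """Send n numbered, timestamped packets at a fixed spacing."""
    s = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    pad = b"\0" * (PKT_SIZE - HDR.size)
    for seq in range(n):
        s.sendto(HDR.pack(seq, time.time()) + pad, (dest, port))
        time.sleep(spacing_us / 1e6)      # real tools busy-wait for accuracy

def receiver(port, n=10000, timeout=5.0):
    """Count lost packets and report the spread of one-way delays."""
    s = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    s.bind(("", port))
    s.settimeout(timeout)
    seen, delays = set(), []
    try:
        while True:
            data, _ = s.recvfrom(PKT_SIZE)
            seq, t_send = HDR.unpack(data[:HDR.size])
            seen.add(seq)
            delays.append(time.time() - t_send)  # valid only with NTP/GPS-synced clocks
    except socket.timeout:
        pass
    print(f"lost {n - len(seen)} of {n} packets")
    if delays:
        print(f"1-way delay spread: {1e6 * (max(delays) - min(delays)):.1f} us")

if __name__ == "__main__":
    if sys.argv[1] == "send":
        sender(sys.argv[2], 5001)
    else:
        receiver(5001)
```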

Case Study: ATLAS (LHC)
- Tests streaming built events from the Level-3 trigger to a remote compute farm in real time, 500 Mbit/s to 1 Gbit/s, CERN – Manchester
- Investigation of the use of new high-performance TCPs
- Testing concepts in the ATLAS offline computing model – more mesh than star: CERN Tier 0 to Tier 1s; Tier 2s to all Tier 1s
- Tests planned over production networks: Lancaster – Manchester (NNW, SuperJANET4); Lancaster – Manchester to CERN


Scalable TCP on DataTAG
- Drop 1 in 10^6 packets

HS-TCP on DataTAG
- Drop 1 in 10^6 packets

Standard Reno TCP on DataTAG
- Drop 1 in 10^6 packets
- Transition from HighSpeed to standard behaviour at 520 s

Summary
- Multi-gigabit transfers are possible and stable
- Demonstrated that new TCP stacks help performance
- DataTAG has made major contributions to the understanding of high-speed networking
- There has been significant technology transfer between DataTAG and other projects
- Now reaching out to real users
- But still much research to do: achieving performance (protocol vs implementation issues); stability and sharing issues; optical transports & hybrid networks

10 Gigabit: Tuning PCI-X
- Intel PRO/10GbE LR adapter; PCI-X bus occupancy vs mmrbc
- Plots for mmrbc = 512, 1024, 2048 and 4096 bytes (4096 bytes gives 5.7 Gbit/s)
- PCI-X sequence: CSR access, data transfer, interrupt & CSR update; … byte packets sent every 200 µs
- Measured times; times based on PCI-X timings from the logic analyser
- Expected throughput ~7 Gbit/s
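A toy model of why mmrbc (maximum memory read byte count) matters, assuming a 133 MHz, 64-bit PCI-X bus and a fixed per-burst overhead; all constants here are assumptions for illustration, not the logic-analyser measurements on the slide:

```python
BUS_HZ = 133e6           # assumed PCI-X clock
BYTES_PER_CLOCK = 8      # 64-bit bus moves 8 bytes per clock
OVERHEAD_CLOCKS = 40     # assumed arbitration/address/turnaround per burst

def ceiling_gbps(mmrbc):
    """Upper bound on PCI-X throughput when reads are split into mmrbc-byte bursts."""
    data_clocks = mmrbc / BYTES_PER_CLOCK
    efficiency = data_clocks / (data_clocks + OVERHEAD_CLOCKS)
    return BUS_HZ * BYTES_PER_CLOCK * 8 * efficiency / 1e9

for mmrbc in (512, 1024, 2048, 4096):
    print(f"mmrbc {mmrbc:4d} bytes -> ~{ceiling_gbps(mmrbc):.1f} Gbit/s bus ceiling")
```

The point of the model is only that larger bursts amortise the fixed per-burst overhead, so the achievable bus ceiling rises towards the raw PCI-X rate as mmrbc grows.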

DataTAG Testbed

BaBar Case Study: Disk Performance
- BaBar disk server: Tyan Tiger S2466N motherboard, one 64-bit 66 MHz PCI bus, Athlon MP2000+ CPU, AMD-760 MPX chipset
- 3Ware RAID5 with 8 × 200 GB Maxtor IDE 7200 rpm disks
- Note the VM parameter readahead max
- Disk to memory (read): max throughput 1.2 Gbit/s (150 MBytes/s)
- Memory to disk (write): max throughput 400 Mbit/s (50 MBytes/s) [not as fast as RAID0]

RAID Controller Performance
[Plots: read speed and write speed for RAID 0 and RAID 5]

BaBar: Serial ATA RAID Controllers (RAID5)
[Plots: 3Ware 66 MHz PCI; ICP 66 MHz PCI]

VLBI Project: Packet Loss Distribution
- Measure the time between lost packets in the time series of packets sent; lost 1410 packets in 0.6 s
- Is it a Poisson process? Assume the Poisson process is stationary, λ(t) = λ
- Use the probability density function P(t) = λ e^(-λt)
- Mean λ = 2360 /s [426 µs]
- Plot the log: compare the measured slope with that expected – could be an additional process involved
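A minimal sketch of the Poisson check described above (the loss times here are synthetic; in practice they come from gaps in the received UDP sequence numbers):

```python
import numpy as np

def poisson_check(loss_times):
    """Estimate the loss rate and compare the log-histogram slope with -lambda."""
    gaps = np.diff(np.sort(loss_times))          # times between successive losses
    lam = 1.0 / gaps.mean()                      # rate estimate, losses per second
    counts, edges = np.histogram(gaps, bins=50)
    centres = 0.5 * (edges[:-1] + edges[1:])
    mask = counts > 0
    # For a stationary Poisson process P(t) = lam * exp(-lam t),
    # so log(counts) vs gap length should be a straight line of slope -lam.
    slope, _ = np.polyfit(centres[mask], np.log(counts[mask]), 1)
    print(f"lambda ~ {lam:.0f} /s (mean gap {1e6 / lam:.0f} us)")
    print(f"log-histogram slope {slope:.0f}, expected ~ {-lam:.0f}")

if __name__ == "__main__":
    # Synthetic example: ~1410 losses with exponential gaps at ~2360 /s.
    rng = np.random.default_rng(0)
    poisson_check(np.cumsum(rng.exponential(1 / 2360.0, size=1410)))
```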

BaBar Case Study: RAID BW & PCI Activity – the performance of the end host and disks
- 3Ware RAID5 with parallel EIDE disks; the 3Ware controller forces the PCI bus to 33 MHz
- BaBar Tyan host to MB-NG SuperMicro host
- Network memory-to-memory: 619 Mbit/s
- Disk-to-disk throughput with bbcp: 320–360 Mbit/s
- PCI bus effectively full! User throughput ~250 Mbit/s
- Plots: read from RAID5 disks; write to RAID5 disks