High Performance Data Transfer over TransPAC
Masaki Hirabaru, NICT
The 3rd International HEP DataGrid Workshop, August 26, 2004, Kyungpook National Univ., Daegu, Korea

Acknowledgements
NICT Kashima Space Research Center: Yasuhiro Koyama, Tetsuro Kondo
MIT Haystack Observatory: David Lapsley, Alan Whitney
APAN Tokyo NOC
JGN II NOC
NICT R&D Management Department
Indiana U. Global NOC

Contents
e-VLBI Performance Measurement
TCP test over TransPAC
TCP test in the Laboratory

Motivations
MIT Haystack – NICT Kashima e-VLBI experiment on August 27, 2003, to measure UT1-UTC within 24 hours
– 41.54 GB CRL => MIT, 107 Mbps (~50 mins); GB MIT => CRL, 44.6 Mbps (~120 mins)
– RTT ~220 ms; UDP throughput Mbps, but TCP only ~6-8 Mbps (per session, tuned)
– BBFTP with 5 x 10 TCP sessions to gain performance
HUT – NICT Kashima Gigabit VLBI experiment
– RTT ~325 ms; UDP throughput ~70 Mbps, but TCP ~2 Mbps (as is), ~10 Mbps (tuned)
– Netants (5 TCP sessions with an ftp stream restart extension)
These applications need high-speed, real-time, reliable, long-haul transfer of huge data volumes.
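The single-stream TCP numbers above are largely a window-size problem. As a quick check, here is a minimal sketch of the bandwidth-delay product calculation in Python; the RTTs come from this slide, while the target rates are assumptions for illustration:

```python
# Bandwidth-delay product (BDP): the amount of data a sender must keep in
# flight to fill a path. RTTs are taken from this slide; target rates are
# illustrative assumptions.
def bdp_bytes(rate_bps, rtt_s):
    return rate_bps * rtt_s / 8

paths = {
    "Kashima <-> Haystack (RTT ~220 ms, 1 Gbps target)": (1e9, 0.220),
    "Kashima <-> HUT (RTT ~325 ms, 100 Mbps target)": (100e6, 0.325),
}

for name, (rate, rtt) in paths.items():
    print(f"{name}: BDP ~ {bdp_bytes(rate, rtt) / 2**20:.1f} MiB")
    # An untuned 64 KiB TCP window caps throughput at roughly window / RTT:
    print(f"  64 KiB window limit ~ {64 * 2**10 * 8 / rtt / 1e6:.1f} Mbps")
```

The ~1.6 Mbps limit for an untuned 64 KiB window on the 325 ms path is close to the ~2 Mbps "as is" figure above.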

VLBI (Very Long Baseline Interferometry)
[Diagram: a radio signal from a star reaches widely separated antennas with a delay; each antenna A/D-samples against a local clock, and the data are brought together at a correlator, over the Internet in the e-VLBI case]
e-VLBI: geographically distributed observation, interconnecting radio antennas around the world
Gigabit / real-time VLBI: multi-gigabit rate sampling, data rates of 512 Mbps and above
A high bandwidth-delay product network issue (NICT Kashima Radio Astronomy Applications Group)

Recent Experiment of UT1-UTC Estimation between NICT Kashima and MIT Haystack (via Washington DC)
Test experiment on July 30, am-6am JST
Kashima was upgraded to 1G through the JGN II 10G link
All processing done in ~4.5 hours (last time: ~21 hours)
Average ~30 Mbps transfer by bbftp (under investigation)

Network Diagram for e-VLBI and test servers
[Map: Kashima and Koganei (1G, upgrading to 10G) connect through Tokyo XP to TransPAC / JGN II (2.5G and 10G trans-Pacific links, ~9,000 km) and Abilene via Los Angeles, Chicago, and Indianapolis (~4,000 km) toward Washington DC and MIT Haystack; the Korean side (Seoul XP, Taegu, Daejon, Kwangju, Busan) connects over KOREN and APII/JGNII 2.5G SONET via Genkai XP, Kitakyushu, and Fukuoka; bwctl, perf, and e-VLBI servers are placed along the path]
*Info and key exchange page needed
JGNII e-VLBI:
– Done: 1 Gbps upgrade at Kashima
– On-going: 2.5 Gbps upgrade at Haystack
– Experiments using 1 Gbps or more
– Using real-time correlation

APAN JP maps, written in Perl and fig2div

Purposes
Measure, analyze, and improve end-to-end performance in high bandwidth-delay product networks
– to support networked science applications
– to help operations find a bottleneck
– to evaluate advanced transport protocols (e.g. Tsunami, SABUL, HSTCP, FAST, XCP, [ours])
Improve TCP under easier conditions
– a single TCP stream
– memory-to-memory transfer
– a bottleneck, but no cross traffic
Consume all the available bandwidth

Path
[Diagram: sender and receiver connected by access links across a backbone]
a) Without a bottleneck queue: backbone B1, access links B2 and B3, with B1 <= B2 and B1 <= B3
b) With a bottleneck queue: B1 > B2 or B1 > B3; a queue builds at the bottleneck

TCP on a path with bottleneck
[Diagram: packets queue and overflow at the bottleneck, causing loss]
The sender may generate burst traffic.
The sender recognizes the overflow only after a delay of up to one RTT.
The bottleneck may change over time.
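A rough calculation shows why bursts are so damaging here. A minimal sketch using the laboratory figures that appear later in this deck (800 Mbps bottleneck, 256 KB buffer, 88 ms delay treated as the RTT); the 1 Gbps sender burst rate is an assumption:

```python
# How long a small bottleneck buffer survives a line-rate burst, compared
# with the time the sender needs to learn about the resulting loss.
sender_rate = 1e9          # bps: GbE sender bursting at line rate (assumption)
bottleneck_rate = 800e6    # bps: emulated bottleneck (lab setup later in the deck)
buffer_bytes = 256 * 1024  # bottleneck queue size
rtt = 0.088                # s: emulated delay, treated here as the round-trip time

fill_rate = sender_rate - bottleneck_rate        # bps of queue growth during a burst
time_to_overflow = buffer_bytes * 8 / fill_rate  # seconds until the queue drops packets

print(f"queue overflows after ~{time_to_overflow * 1e3:.1f} ms")
print(f"loss feedback reaches the sender only after ~{rtt * 1e3:.0f} ms (one RTT)")
# ~10 ms to overflow versus ~88 ms of feedback delay: many packets are lost
# before the sender can slow down, which is why burstiness matters so much.
```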

Limiting the Sending Rate
a) Sending at 1 Gbps into congestion yields only ~20 Mbps of throughput.
b) Limiting the sender to 100 Mbps yields ~90 Mbps of throughput. Better!

Web100: a kernel patch for monitoring and modifying TCP metrics in the Linux kernel. We need to see TCP behavior to identify a problem.
Iperf: TCP/UDP bandwidth measurement.
bwctl: a wrapper around iperf that adds authentication and scheduling.
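As an illustration of how such measurements can be scripted, here is a minimal, hypothetical sketch that drives a classic iperf TCP test from Python; the receiver host name is a placeholder, and only standard iperf options (-c, -t, -i, -w, -f) are used:

```python
import subprocess

# Placeholder receiver; start `iperf -s` there before running this script.
RECEIVER = "perf-server.example.org"

cmd = [
    "iperf",
    "-c", RECEIVER,  # client mode, connect to the receiver
    "-t", "60",      # run for 60 seconds
    "-i", "5",       # interim report every 5 seconds
    "-w", "64M",     # request a 64 MB socket buffer (capped by kernel limits)
    "-f", "m",       # report in Mbits/sec
]

result = subprocess.run(cmd, capture_output=True, text=True, check=True)
print(result.stdout)   # keep the raw report for comparison across runs
```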

1st Step: Tuning a Host with UDP
Remove any bottlenecks on the host: CPU, memory, bus, OS (driver), …
Dell PowerEdge 1650 (*not enough power)
– Intel Xeon 1.4 GHz x1(2), 1 GB memory
– Intel Pro/1000 XT onboard, PCI-X (133 MHz)
Dell PowerEdge 2650
– Intel Xeon 2.8 GHz x1(2), 1 GB memory
– Intel Pro/1000 XT, PCI-X (133 MHz)
Iperf UDP throughput: 957 Mbps
– GbE wire rate minus headers: UDP (8 B) + IP (20 B) + Ethernet II (38 B)
– Linux (RedHat 9) with web100
– PE1650: TxIntDelay=0
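The 957 Mbps figure (and the 941 Mbps TCP figure on the next slide) follows directly from the header overhead. A small check, assuming a 1500-byte IP MTU and 38 bytes of Ethernet framing overhead per frame (MAC header, FCS, preamble, inter-frame gap):

```python
LINE_RATE = 1_000_000_000   # bits/s on the GbE wire
IP_MTU = 1500               # bytes of IP packet carried per frame
ETH_OVERHEAD = 38           # MAC header + FCS + preamble + inter-frame gap, per frame

def goodput_mbps(l4_header_bytes):
    """Application-level throughput for a given transport header size."""
    payload = IP_MTU - 20 - l4_header_bytes    # subtract IP and transport headers
    frame_on_wire = IP_MTU + ETH_OVERHEAD      # bytes the frame occupies on the wire
    return LINE_RATE * payload / frame_on_wire / 1e6

print(f"UDP (8 B header):                  {goodput_mbps(8):.0f} Mbps")   # ~957
print(f"TCP (32 B header with timestamps): {goodput_mbps(32):.0f} Mbps")  # ~941
```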

2nd Step: Tuning a Host with TCP
Maximum socket buffer size (TCP window size)
– net.core.wmem_max, net.core.rmem_max (64 MB)
– net.ipv4.tcp_wmem, net.ipv4.tcp_rmem (64 MB)
Driver descriptor length
– e1000: TxDescriptors=1024, RxDescriptors=256 (default)
Interface queue length
– txqueuelen=100 (default)
– net.core.netdev_max_backlog=300 (default)
Interface queue discipline
– fifo (default)
MTU
– mtu=1500 (IP MTU)
Iperf TCP throughput: 941 Mbps
– GbE wire rate minus headers: TCP (32 B) + IP (20 B) + Ethernet II (38 B)
– Linux (RedHat 9) with web100
Web100 settings (incl. High Speed TCP)
– net.ipv4.web100_no_metric_save=1 (do not store TCP metrics in the route cache)
– net.ipv4.WAD_IFQ=1 (do not send a congestion signal on buffer full)
– net.ipv4.web100_rbufmode=0, net.ipv4.web100_sbufmode=0 (disable auto-tuning)
– net.ipv4.WAD_FloydAIMD=1 (HighSpeed TCP)
– net.ipv4.web100_default_wscale=7 (default)
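The sysctl limits above only raise the ceiling; an application must still request a large buffer on its own socket. Below is a minimal, hypothetical sketch (the commented-out endpoint is a placeholder); Linux caps such requests at net.core.wmem_max / rmem_max:

```python
import socket

BUF = 64 * 1024 * 1024   # 64 MB, matching the wmem_max / rmem_max limits above

sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
sock.setsockopt(socket.SOL_SOCKET, socket.SO_SNDBUF, BUF)
sock.setsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF, BUF)

# Linux reports back roughly double the value it actually granted, so this
# shows whether the kernel limits allowed the full request.
print("send buffer granted:", sock.getsockopt(socket.SOL_SOCKET, socket.SO_SNDBUF))
print("recv buffer granted:", sock.getsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF))

# sock.connect(("receiver.example.org", 5001))   # placeholder endpoint and port
sock.close()
```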

Network Diagram for TransPAC/I2 Measurement (Oct. 2003)
[Map: Kashima (0.1G) and Koganei (1G) connect via Tokyo XP over TransPAC (2.5G, ~9,000 km) to Los Angeles, then over Abilene (10G, ~4,000 km) via Indianapolis to Washington DC, MIT Haystack, and the I2 venue; general and e-VLBI servers at both ends]
Endpoints (sender / receiver):
– Mark5: Linux (RH 7.1), P3 1.3 GHz, 256 MB memory, GbE SK-9843
– PE1650: Linux (RH 9), Xeon 1.4 GHz, 1 GB memory, GbE Intel Pro/1000 XT
Iperf UDP: ~900 Mbps (no loss)

TransPAC/I2 #1: High Speed (60 mins)

TransPAC/I2 #2: Reno (10 mins)

TransPAC/I2 #3: High Speed (Win 12MB)

Test in a laboratory – with bottleneck
[Setup: PE 2650 and PE 1650 endpoints (sender / receiver), GbE/SX and GbE/T, connected through a Packet Sphere network emulator and an L2 switch (FES12GCF)]
Emulated bottleneck: bandwidth 800 Mbps, buffer 256 KB, delay 88 ms, loss 0
2*BDP = 16 MB
#1: Reno => Reno
#2: High Speed TCP => Reno

Laboratory #1, #2: 800M bottleneck (Reno vs. HighSpeed)

Laboratory #3, #4, #5: High Speed (Limiting)
[Plots comparing limiting methods: window size capped at 16 MB, rate control (270 us every 10 packets, 95%), and cwnd clamp, each with limited slow-start (thresholds 100 and 1000)]

How to know when the bottleneck changed
– The end host probes periodically (e.g. with a packet train).
– The router notifies the end host (e.g. XCP).

Another approach: enough buffer on the router
At least 2 x BDP (bandwidth-delay product), e.g. 1 Gbps x 200 ms x 2 = 400 Mb ~ 50 MB
Replace fast SRAM with DRAM to reduce space and cost

Test in a laboratory – with bottleneck (2)
[Setup: same PE 2650 / PE 1650 endpoints and L2 switch (FES12GCF), now with a network emulator providing a large bottleneck buffer]
Emulated bottleneck: bandwidth 800 Mbps, buffer 64 MB, delay 88 ms, loss 0
2*BDP = 16 MB
#6: High Speed TCP => Reno

Laboratory #6: 800M bottleneck (HighSpeed)

Report on MTU
Increasing the MTU (packet size) results in better performance. The standard MTU is 1500 B.
A 9 KB MTU is available throughout the Abilene, TransPAC, and APII backbones.
On Aug 25, 2004, the remaining 1500 B link at Tokyo XP was upgraded to 9 KB.
A 9 KB MTU is now available from Busan to Los Angeles.
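One reason the MTU matters so much for a single stream: in standard congestion avoidance the window grows by about one segment per RTT, so recovery time after a loss scales with the window counted in segments. A rough estimate, assuming a 1 Gbps target and the ~220 ms trans-Pacific RTT quoted earlier (Reno-style growth; High Speed TCP shortens this, but the MTU effect points the same way):

```python
def recovery_minutes(rate_bps, rtt_s, mss_bytes):
    """Rough time for standard congestion avoidance (about +1 MSS per RTT)
    to climb back from half the full window to the full window."""
    window_segments = rate_bps * rtt_s / 8 / mss_bytes   # segments needed to fill the pipe
    return (window_segments / 2) * rtt_s / 60            # W/2 RTTs, in minutes

for mtu, mss in [(1500, 1448), (9000, 8948)]:            # MSS = MTU - IP/TCP headers
    print(f"MTU {mtu:>4} B: ~{recovery_minutes(1e9, 0.220, mss):.0f} min to recover from one loss")
```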

Current and Future Plans of e-VLBI
KOA (Korean Observatory of Astronomy) has one existing radio telescope, though in a different band from ours; they are building another three radio telescopes.
A dedicated light path from Europe to Asia through the US is being considered.
An e-VLBI demonstration at SuperComputing 2004 (November) is being planned, interconnecting radio telescopes in Europe, the US, and Japan.
A gigabit A/D converter is ready, and a 10G version is now being implemented.
Our performance measurement infrastructure will be merged into the Global (Network) Observatory framework maintained by NOC people (Internet2 piPEs, APAN CMM, and e-VLBI).

Questions? See for VLBI