UltraLight: Network & Applications Research at UF Dimitri Bourilkov, University of Florida CISCO - UF Collaborative Team Meeting, Gainesville, FL, September 12, 2006

D.Bourilkov UltraLight 2 Overview: an NSF Project

D.Bourilkov UltraLight 3 The UltraLight Team
- Steering Group: H. Newman (Caltech, PI), P. Avery (U. Florida), J. Ibarra (FIU), S. McKee (U. Michigan)
- Project Management: Richard Cavanaugh (Project Coordinator), PI and Working Group Coordinators:
- Network Engineering: Shawn McKee (Michigan); + S. Ravot (LHCNet), R. Summerhill (Abilene/HOPI), D. Pokorney (FLR), J. Ibarra (WHREN, AW), C. Guok (ESnet), L. Cottrell (SLAC), D. Petravick, M. Crawford (FNAL), S. Bradley, J. Bigrow (BNL), et al.
- Applications Integration: Frank Van Lingen (Caltech); + I. Legrand (MonALISA), J. Bunn (GAE + TG); C. Steenberg, M. Thomas (GAE), Sanjay Ranka (Sphinx), et al.
- Physics Analysis User Group: Dimitri Bourilkov (UF; CAVES, Codesh)
- Network Research, WAN in Lab Liaison: Steven Low (Caltech)
- Education and Outreach: Laird Kramer (FIU); + H. Alvarez, J. Ibarra, H. Newman

D.Bourilkov UltraLight 4 Large Hadron Collider, CERN, Geneva: 2007 Start
27 km tunnel in Switzerland & France; pp collisions at √s = 14 TeV, L = 10^34 cm^-2 s^-1
Experiments: CMS and ATLAS (pp, general purpose; HI), LHCb (B-physics), ALICE (HI), TOTEM
Physics: Higgs, SUSY, Extra Dimensions, CP Violation, QG Plasma, … the Unexpected
Physicists from 250+ institutes in 60+ countries
Challenges: Analyze petabytes of complex data cooperatively; harness global computing, data & NETWORK resources

D.Bourilkov UltraLight 5 LHC Data Grid Hierarchy
CERN/Outside ratio smaller; expanded role of Tier1s & Tier2s: greater reliance on networks
CERN/Outside resource ratio ~1:4; Tier0/(Σ Tier1)/(Σ Tier2) ~ 1:2:2
DISUN: 4 of 7 US CMS Tier2s shown, with ~8 MSi2k and 1.5 PB disk by 2007
>100 Tier2s at LHC

D.Bourilkov UltraLight 6 Tier-2s: ~100 identified – number still growing

D.Bourilkov UltraLight 7 HENP Bandwidth Roadmap for Major Links (in Gbps)
Continuing trend: ~1000 times bandwidth growth per decade; HEP: co-developer as well as application driver of global nets

D.Bourilkov UltraLight 8 Data Samples and Transport Scenarios
[Table: 10^7-event samples (AOD, RECO, RAW+RECO, MC) with data volume in TBytes and transfer times at 0.9, 3 and 8 Gbps]
- 10^7 events is a typical data sample for analysis or reconstruction development [Ref.: MONARC]; equivalent to just ~1 day's running
- Transporting datasets with quantifiable high performance is needed for efficient workflow, and thus efficient use of CPU and storage resources
- One can only transmit ~2 RAW+RECO or MC samples per day on a 10G path (see the sketch below)
- Movement of 10^8-event samples (e.g. after re-reconstruction) will take ~1 day (RECO) to ~1 week (RAW, MC) with a 10G link at high occupancy
- Transport of significant data samples will require one, or multiple, 10G links
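To make the per-day estimates above concrete, here is a minimal Python sketch (added here, not part of the original slides) of the volume/rate arithmetic behind the 10G-path numbers; the 50 TB sample size is a purely hypothetical placeholder, since the table's volumes are not reproduced above.

    # Back-of-the-envelope check of the "samples per day on a 10G path" estimates.
    # The 50 TB dataset size below is a hypothetical placeholder, NOT a value
    # taken from the (partially lost) table on this slide.

    def transfer_time_hours(volume_tbytes: float, link_gbps: float, occupancy: float = 1.0) -> float:
        """Time to move volume_tbytes over a link of link_gbps at the given occupancy."""
        bits = volume_tbytes * 1e12 * 8                    # terabytes -> bits
        seconds = bits / (link_gbps * 1e9 * occupancy)
        return seconds / 3600.0

    # A fully used 10 Gb/s link moves ~108 TB per day:
    tb_per_day = 10e9 * 86400 / 8 / 1e12
    print(f"10 Gb/s link, full occupancy: ~{tb_per_day:.0f} TB/day")

    # Hypothetical 50 TB sample (illustrative only):
    print(f"50 TB sample at 10 Gb/s: ~{transfer_time_hours(50, 10):.1f} hours")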

D.Bourilkov UltraLight 9 UltraLight Goals
Goal: Enable the network as an integrated managed resource
Meta-Goal: Enable physics analysis & discoveries which otherwise could not be achieved
Partners: Caltech, Florida, Michigan, FNAL, SLAC, CERN, BNL, Internet2/HOPI; UERJ (Rio), USP (Sao Paulo), FIU, KNU (Korea), KEK (Japan), TIFR (India), PERN (Pakistan); NLR, ESnet, CENIC, FLR, MiLR, US Net, Abilene, JGN2, GLORIAD, RNP, CA*net4, UKLight, Netherlight, Taiwan; Cisco, Neterion, Sun …
Next-generation information system, with the network as an integrated, actively managed subsystem in a global Grid
Hybrid network infrastructure: packet-switched + dynamic optical paths
End-to-end monitoring; realtime tracking and optimization
Dynamic bandwidth provisioning; agent-based services spanning all layers

D.Bourilkov UltraLight 10 Large Scale Data Transfers
Network aspect: the Bandwidth*Delay Product (BDP); we have to use TCP windows matching it in the kernel AND in the application
On a local connection with 1 GbE and RTT 0.19 ms, to fill the pipe we need around 2*BDP: 2*BDP = 2 * 1 Gb/s * 0.00019 s ≈ 48 KBytes
Or, for a 10 Gb/s LAN: 2*BDP ≈ 480 KBytes
Now on the WAN: from Florida to Caltech the RTT is 115 ms, so to fill the pipe at 1 Gb/s we need 2*BDP = 2 * 1 Gb/s * 0.115 s ≈ 28.8 MBytes, etc. (see the sketch below)
User aspect: are the servers on both ends capable of matching these rates for useful disk-to-disk transfers? Tune kernels, get the highest possible disk read/write speed, etc.
Tables turned: the WAN now outperforms disk speeds!
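As a worked illustration (a sketch added here, not part of the original slides), the window sizes quoted above follow directly from the 2*BDP formula:

    # Minimal sketch of the 2*BDP window sizes quoted on this slide,
    # showing where the ~48 KB, ~480 KB and ~28.8 MB figures come from.

    def two_bdp_bytes(bandwidth_gbps: float, rtt_ms: float) -> float:
        """Return 2 * bandwidth * RTT in bytes: the TCP window needed to fill the pipe."""
        bits = 2 * bandwidth_gbps * 1e9 * (rtt_ms / 1e3)
        return bits / 8

    print(f"1 GbE LAN,  RTT 0.19 ms: ~{two_bdp_bytes(1, 0.19)/1e3:.0f} KBytes")   # ~48 KB
    print(f"10 GbE LAN, RTT 0.19 ms: ~{two_bdp_bytes(10, 0.19)/1e3:.0f} KBytes")  # ~475 KB (the slide rounds this to ~480)
    print(f"1 Gb/s WAN, RTT 115 ms:  ~{two_bdp_bytes(1, 115)/1e6:.1f} MBytes")    # ~28.8 MB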

D.Bourilkov UltraLight 11 bbcp Tests
bbcp was selected as a starting tool for data transfers on the WAN:
- Supports multiple streams, highly tunable (window size etc.), peer-to-peer type
- Well supported by Andy Hanushevsky from SLAC
- Used successfully in BaBar
- I used it in 2002 for CMS production: massive data transfers from Florida to CERN; the only limits observed at the time were disk writing speed (LAN) and the network (WAN)
Starting point, Florida → Caltech: < 0.5 MB/s on the WAN, very poor performance

D.Bourilkov UltraLight 12 Evolution of Tests Leading to SC|05
End points in Florida (uflight1) and Caltech (nw1): AMD Opterons over the UL network
Tuning of Linux kernels (2.6.x) and bbcp window sizes – a coordinated iterative procedure
Current status (for file sizes ~ 2 GB): 6.4 Gb/s with iperf, up to 6 Gb/s memory to memory, 2.2 Gb/s ramdisk → remote disk write
  (the speed was the same writing to a SCSI disk, which is supposedly less than 80 MB/s, or writing to a RAID array, so de facto the data always goes first to the memory cache; the Caltech node has 16 GB RAM)
Used successfully with up to 8 bbcp processes in parallel from Florida to the show floor in Seattle; CPU load still OK (a sketch of such a parallel launch follows below)
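For illustration, here is a minimal Python sketch (not from the original slides) of driving several bbcp transfers in parallel; the destination path and file names are hypothetical placeholders, while the flags (-s streams, -w window, -P progress, -V, -f) are the ones shown in the examples on the next two slides.

    # Hypothetical helper for launching several bbcp transfers in parallel,
    # mirroring the "up to 8 bbcp processes" setup described above.
    # Destination path and file names are placeholders, not the real testbed values.
    import subprocess
    from concurrent.futures import ThreadPoolExecutor

    FILES = [f"big{i}.root" for i in range(8)]      # hypothetical ~2 GB files
    DEST = "nw1.caltech.edu:/data/"                 # hypothetical destination path

    def run_bbcp(filename: str) -> int:
        # -s 8: 8 TCP streams, -w 10m: 10 MB window, -P 10: progress every 10 s
        cmd = ["bbcp", "-s", "8", "-w", "10m", "-P", "10", "-V", "-f", filename, DEST]
        return subprocess.call(cmd)

    if __name__ == "__main__":
        with ThreadPoolExecutor(max_workers=8) as pool:
            results = list(pool.map(run_bbcp, FILES))
        print("exit codes:", results)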

D.Bourilkov UltraLight 13 bbcp Examples: Florida → Caltech

data]$ iperf -i 5 -c t
Client connecting to , TCP port 5001
TCP window size: 256 MByte (default)
[ 3] local port connected with port 5001
[ 3] sec 2.73 GBytes 4.68 Gbits/sec
[ 3] sec 3.73 GBytes 6.41 Gbits/sec
[ 3] sec 3.73 GBytes 6.40 Gbits/sec
[ 3] sec 3.73 GBytes 6.40 Gbits/sec

bbcp: uflight1.ultralight.org kernel using a send window size of not
bbcp -s 8 -f -V -P 10 -w 10m big2.root
bbcp: Sink I/O buffers (245760K) > 25% of available free memory (231836K); copy may be slow
bbcp: Creating /dev/null/big2.root
Source cpu=5.654 mem=0K pflt=0 swap=0
File /dev/null/big2.root created; bytes at KB/s
24 buffers used with 0 reorders; peaking at 0.
Target cpu=3.768 mem=0K pflt=0 swap=0
1 file copied at effectively KB/s

bbcp -s 8 -f -V -P 10 -w 10m big2.root
bbcp: uflight1.ultralight.org kernel using a send window size of not
bbcp: Creating ./dimitri/big2.root
Source cpu=5.455 mem=0K pflt=0 swap=0
File ./dimitri/big2.root created; bytes at KB/s
24 buffers used with 0 reorders; peaking at 0.
Target cpu= mem=0K pflt=0 swap=0
1 file copied at effectively KB/s

D.Bourilkov UltraLight 14 bbcp Examples: Caltech → Florida

dimitri]$ iperf -s -w 256m -i 5 -p l
Server listening on TCP port 5001
TCP window size: 512 MByte (WARNING: requested 256 MByte)
[ 4] local port 5001 connected with port
[ 4] sec 2.72 GBytes 4.68 Gbits/sec
[ 4] sec 3.73 GBytes 6.41 Gbits/sec
[ 4] sec 3.73 GBytes 6.40 Gbits/sec
[ 4] sec 3.73 GBytes 6.40 Gbits/sec
[ 4] sec 3.73 GBytes 6.40 Gbits/sec

bbcp -s 8 -f -V -P 10 -w 10m big2.root
bbcp: Sink I/O buffers (245760K) > 25% of available free memory (853312K); copy may be slow
bbcp: Source I/O buffers (245760K) > 25% of available free memory (839628K); copy may be slow
bbcp: nw1.caltech.edu kernel using a send window size of not
bbcp: Creating /dev/null/big2.root
Source cpu=5.962 mem=0K pflt=0 swap=0
File /dev/null/big2.root created; bytes at KB/s
24 buffers used with 0 reorders; peaking at 0.
Target cpu=4.053 mem=0K pflt=0 swap=0
1 file copied at effectively KB/s

D.Bourilkov UltraLight 15 SuperComputing 05 Bandwidth Challenge: 475 TBytes transported in < 24 h; above 100 Gbps for hours

D.Bourilkov UltraLight 16 Outlook
The UltraLight network already performs very well; SC|05 was a big success
The hard problem from the user perspective now is to match it with servers capable of sustained rates for large files > 20 GB (when the memory caches are exhausted); fast disk writes are key (RAID arrays)
To fill 10 Gb/s pipes we need several pairs (3-4) of servers (rough sizing sketch below)
Next step: disk-to-disk transfers between Florida, Caltech, Michigan, FNAL, BNL and CERN, and preparations for SC|06 (next talk)
More info:
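For a rough feel of the "several pairs of servers" estimate above, a small added sketch (not from the slides); the 350 MB/s sustained disk-to-disk rate per server pair is an assumed figure for illustration only.

    # Rough sizing check: how many server pairs are needed to fill a 10 Gb/s pipe
    # if each pair sustains a given disk-to-disk rate.
    # The 350 MB/s per-pair figure is a hypothetical assumption for illustration.
    import math

    link_gbps = 10
    link_mbytes_per_s = link_gbps * 1e9 / 8 / 1e6      # = 1250 MB/s
    per_pair_mbytes_per_s = 350                        # assumed sustained disk-to-disk rate

    pairs = math.ceil(link_mbytes_per_s / per_pair_mbytes_per_s)
    print(f"{link_gbps} Gb/s = {link_mbytes_per_s:.0f} MB/s -> ~{pairs} server pairs needed")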