Presentation transcript:

Identifying Gaps in Grid Middleware on Fast Networks with the Advanced Networking Initiative
D. Dykstra, G. Garzoglio, H. Kim, P. Mhashilkar, Scientific Computing Division, Fermi National Accelerator Laboratory
Tests on the current ANI 100 GE testbed. CHEP2012 Poster ID 214.
Fermilab is operated by the Fermi Research Alliance, LLC under Contract No. DE-AC02-07CH11359 with the United States Department of Energy.

Motivation
The goal of the High Throughput Data Program (HTDP) at the Fermilab Computing Sector is to prepare Fermilab and its stakeholders for a 100 Gbps network infrastructure. The focus is to compile a list of key services used by research communities and facilities, and to identify gaps in the current infrastructure and tools when interfacing with 100 Gbps networks. We are conducting a series of tests with key tools on a 100 Gbps testbed network operated by the US DOE ESnet Advanced Networking Initiative (ANI).

LIMAN Testbed: 40 GE (May – October 2011)
The main tools tested on the Long Island Metropolitan Area Network were Globus Online (GO) and GridFTP (GF). We compared three transfer mechanisms, to measure the overheads introduced by GO and by the control channels: local GF transfer (server to server), FNAL-controlled GridFTP transfer, and GO-controlled GridFTP transfer. We also compared three sets of files of different sizes, to measure the effect of transfer-protocol overhead on small files. Result: the main overheads were observed for Globus Online with small files.

SC2011 Demo: shared 100 GE (November 2011)
The Grid and Cloud Computing Department of Fermilab demonstrated the use of a 100 GE network to move CMS data with GridFTP. Test characteristics: 15 NERSC and 26 ANL nodes with 10 GE NICs; 10 CMS files of 2 GB each (RAM to RAM only); 30 TB transferred in total in one hour. Result: a sustained data transfer rate of ~70 Gbps, with peaks at 75 Gbps.

Current Testbed: 100 GE (January – May 2012)
Two sites (NERSC, ANL) with 3 nodes each; each node has 4 x 10 GE NICs. We measure various overheads from protocols and file sizes: basic network capacity using nuttcp, GridFTP tests, Globus Online controlling GridFTP tests, and XrootD tests.

Basic Network Throughput Test with nuttcp
Motivation: confirm the basic performance of the network with parameter tuning and compare it with the baseline provided by the ANI team.
Results:
  NIC to NIC: 9.89 Gbps (as expected from a 10 GE NIC).
  4 NICs to 4 NICs between 2 nodes: 39 Gbps (as expected from 4 NICs).
  Aggregate throughput using 10 TCP streams (10 NIC-to-NIC pairs): 99 Gbps.

GridFTP and Globus Online Test
Motivation 1: a single instance of GridFTP client/server is not efficient. What is an efficient way to increase the throughput through each NIC? To transfer a single file efficiently, use multiple parallel streams per transfer (globus-url-copy -p N). To transfer a set of files efficiently, use multiple concurrent globus-gridftp-servers (globus-url-copy -cc M). We launch multiple clients and servers with multiple streams opened between them; an example invocation is sketched after this section.
Motivation 2: we expect protocol overheads to differ across file sizes. Files of various sizes are transferred from client disk to server memory. The dataset is split into three sets: Small (8 KB – 4 MB), Medium (8 MB – 1 GB), and Large (2, 4, 8 GB).
Motivation 3: in addition to locally controlled GridFTP, we tested two remotely controlled configurations: (1) port forwarding to access the GridFTP clients/servers (labeled "Remote") and (2) Globus Online. We also compare server-to-server with client-to-server transfers.
Results: GridFTP does not suffer from protocol overhead for large and medium files; we observe significant overhead for small files; remote use of GridFTP via Globus Online suffers from protocol overhead.

GridFTP throughput by file size:
            Local: Client-Server   Local: Server-Server   Remote: Server-Server   Globus Online
  Large     87.92 Gbps             92.74 Gbps             91.19 Gbps              62.90 Gbps
  Medium    76.90 Gbps             90.94 Gbps             81.79 Gbps              28.49 Gbps
  Small     2.99 Gbps              2.57 Gbps              2.11 Gbps               2.36 Gbps

For each configuration, the globus-url-copy (GUC) tuning parameters were also recorded: GUC instances per core, GUC streams, GUC TCP window size, and files per GUC, together with the maximum and sustained bandwidth achieved.
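As an illustration of the tuning knobs named above, the following is a minimal sketch of a globus-url-copy invocation combining parallel streams (-p) and concurrent file transfers (-cc). The host name, port, paths, and parameter values are hypothetical examples, not the settings used in these tests.

  # Server side (on each data-transfer node): start a GridFTP server.
  # -aa enables anonymous access, acceptable only on a closed testbed; 2811 is the usual GridFTP port.
  globus-gridftp-server -aa -p 2811 &

  # Client side: transfer a directory with 4 parallel TCP streams per file (-p 4),
  # 8 files in flight at a time (-cc 8), and a 2 MB TCP buffer (-tcp-bs 2M).
  globus-url-copy -r -p 4 -cc 8 -tcp-bs 2M \
      file:///data/testset/ ftp://dtn1.example.org:2811/ramdisk/

In the tests described here, several such clients and servers run at once (one set per NIC), so the per-NIC throughput and the aggregate throughput can be tuned independently.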
XrootD Test
Motivation: what is an efficient way to increase the throughput through each NIC? We are focusing on tuning the transfer parameters of xrootd. The test begins with a single instance of xrdcp and xrootd; on the server side, one xrootd writes to RAM disk or HDD.
Are multiple concurrent transfers possible in xrootd? The equivalent of the GridFTP "-cc" option is not available, but we can emulate it by launching multiple xrdcp instances (a sketch of this emulation is given at the end of this transcript); the xrootd server accepts multiple connections using multithreading. How efficient is it?
Are multiple parallel transfers possible in xrootd? Not practical for our test.
Results: limited by the RAM disk, we estimate the aggregate throughput by scaling the single-NIC result.

XrootD throughput scaling with concurrent xrdcp clients:
                            1 client         2 clients        4 clients        8 clients
  8 GB                      3 Gbps           5 Gbps           7.9 Gbps         N/A
  Large (2, 4 GB)           2.4, 2.7 Gbps    3.1, 4.4 Gbps    3.7, 5.8 Gbps    4.9, 8.7 Gbps
  Medium (64 MB, 256 MB)    230, 760 Mbps    406, 1160 Mbps   830, 1228 Mbps   1650, 1890 Mbps
  Small (256 KB, 4 MB)      3, 14 Mbps       6, 31 Mbps       12, 69 Mbps      22, 126 Mbps

Conclusions
The basic network capacity test is close to 100 Gbps, and we can saturate the bandwidth capacity by increasing the number of data streams. GridFTP suffers from protocol overhead for small files. Globus Online: we are working with GO to improve performance. XrootD: testing is at an initial stage but already gives throughput comparable to GridFTP; not many performance-tuning options are available.
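For reference, below is a minimal sketch of the multi-client emulation of GridFTP's -cc option mentioned in the XrootD test: several xrdcp clients are launched in parallel against a single xrootd server. The host name, port, file paths, and client count are hypothetical examples.

  # Launch 8 xrdcp clients concurrently against one xrootd server
  # (1094 is xrootd's default port); each client copies a different file.
  for i in $(seq 1 8); do
      xrdcp -f /data/testset/file_${i}.dat \
            root://xrootd-server.example.org:1094//ramdisk/file_${i}.dat &
  done
  wait   # block until all background transfers complete

The number of concurrent clients plays the same role as GridFTP's -cc value; the scaling table above was obtained by varying exactly this degree of concurrency.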