BNL Service Challenge 3 Site Report
Xin Zhao, Zhenping Liu, Wensheng Deng, Razvan Popescu, Dantong Yu and Bruce Gibbard
USATLAS Computing Facility, Brookhaven National Lab

2 Services at BNL
- FTS (version 2.3.1) client and server, with its back-end Oracle and MyProxy servers.
  - FTS handles reliable file transfer from CERN to BNL.
  - Most functionality is implemented. FTS became reliable at controlling data transfer after several rounds of redeployment for bug fixes: a short timeout value that caused excessive failures, and an incompatibility with dCache/SRM.
  - FTS does not support direct data transfer from CERN to the BNL dCache data pool servers (dCache SRM third-party transfer). Transfers actually go through a few dCache GridFTP door nodes at BNL, which presents a scalability issue; we had to move these door nodes to non-blocking network ports to distribute the traffic.
  - Both BNL and RAL found that the number of streams per file could not be more than 10 (a bug?).
- Networking to CERN:
  - The network for dCache was upgraded to 2 x 1 Gbps around June.
  - Shared link with a long round-trip time (>140 ms), while the RTT from European sites to CERN is about 20 ms; a back-of-the-envelope window calculation illustrating why so many streams are needed is sketched below.
  - Occasional packet losses were observed along the BNL-CERN path.
  - 1.5 Gbps aggregate bandwidth observed with iperf using 160 TCP streams.
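
The long RTT is the crux of the tuning problem: at more than 140 ms, a single TCP stream with an ordinary socket buffer cannot come close to filling a 1 Gbps path, which is why iperf needed on the order of 160 streams. Below is a minimal Python sketch of that bandwidth-delay-product arithmetic; the 256 KB per-stream window is an assumed, illustrative value, not a measured BNL setting.

```python
# Back-of-the-envelope bandwidth-delay-product (BDP) estimate for the
# CERN->BNL path described on this slide. The per-stream TCP window
# (256 KB) is an assumption for illustration, not a measured value.

RTT_S = 0.140                 # round-trip time, seconds (>140 ms on the slide)
LINK_BPS = 2 * 1e9            # 2 x 1 Gbps WAN link
WINDOW_BYTES = 256 * 1024     # assumed effective TCP window per stream

# BDP: how many bytes must be "in flight" to keep the link full.
bdp_bytes = (LINK_BPS / 8) * RTT_S

# Throughput ceiling of one stream limited by its window: window / RTT.
per_stream_bps = WINDOW_BYTES * 8 / RTT_S

# Streams needed to saturate the link under that window assumption.
streams_needed = LINK_BPS / per_stream_bps

print(f"BDP for the full link: {bdp_bytes / 1e6:.1f} MB in flight")
print(f"One stream tops out at {per_stream_bps / 1e6:.1f} Mbps")
print(f"Streams needed to fill 2 Gbps: ~{streams_needed:.0f}")
```

Under these assumptions roughly 130 streams are needed, consistent with the ~160 streams used in the iperf test above; larger per-stream windows would reduce the stream count proportionally.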

3 Services at BNL
- dCache/SRM (SRM 1.1 interface; 332 nodes in total, each with 3.06 GHz CPUs, 2 GB of memory and three 140 GB SCSI drives, for about 170 TB of disk; multiple GridFTP, SRM and dCap doors): the USATLAS production dCache system.
  - All nodes run Scientific Linux 3 with the XFS module compiled in.
  - We experienced high load on the write pool servers during large data transfers; this was fixed by replacing the EXT file systems with XFS.
  - The core server crashed once; the cause was identified and fixed.
  - Small buffer space (1.0 TB) for data written into the dCache system.
  - dCache can now deliver up to 200 MB/s for input/output (limited by network speed).
- LFC (1.3.4) client and server were installed as the BNL replica catalog.
  - The server was installed and the basic functionality tested (lfc-ls, lfc-mkdir, etc.); a sketch of such a smoke test follows this slide.
  - We will populate LFC with the entries in our production Globus RLS server.
- The ATLAS VO Box (DDM + LCG VO Box) was deployed at BNL.
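
The LFC check mentioned above amounts to creating a directory in the catalogue with the client tools and listing it back. Below is a minimal sketch of such a smoke test; it assumes the lfc-mkdir and lfc-ls clients are on PATH and that LFC_HOST points at the BNL catalogue server. The host name and the /grid/atlas/sc3-test path are hypothetical examples, not the values actually used.

```python
# Hedged sketch of an LFC smoke test of the kind described on the slide:
# create a directory in the catalogue and list its parent. Assumes the LFC
# client tools (lfc-mkdir, lfc-ls) are installed and LFC_HOST is set.
import os
import subprocess

os.environ.setdefault("LFC_HOST", "lfc.example.bnl.gov")  # hypothetical host name
test_dir = "/grid/atlas/sc3-test"                         # hypothetical catalogue path

def run(cmd):
    """Run an LFC client command and return (exit code, combined output)."""
    result = subprocess.run(cmd, capture_output=True, text=True)
    return result.returncode, result.stdout + result.stderr

rc, out = run(["lfc-mkdir", test_dir])
print(f"lfc-mkdir rc={rc}")

rc, out = run(["lfc-ls", os.path.dirname(test_dir)])
print(f"lfc-ls rc={rc}\n{out}")
```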

4 BNL dCache Configuration
[Diagram: layout of the BNL dCache system - DCap, SRM and GridFTP doors (control channel); read and write pools (data channel); PnfsManager and Pool Manager; HPSS tape back end; clients include DCap clients, SRM clients, GridFTP clients and the Oak Ridge batch system.]

5 CERN Storage System

6 Data Transfer from CERN to BNL (ATLAS Tier 1)

7 Transfer Plots
[Plot annotation: Castor2 LSF plugin problem]

8 BNL SC3 data transfer
All data are actually routed through the GridFTP doors.
[Plot: SC3 data transfer as monitored at BNL]

9 Data Transfer Status
- BNL stabilized FTS data transfer with a high successful-completion rate, as shown in the plot on the left.
- We attained a 150 MB/s rate for about one hour with a large number (>50) of parallel file transfers. CERN FTS had a limit of 50 files per channel, which is not enough to fill up the CERN-to-BNL data channel; a rough estimate of the concurrency needed is sketched below.
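
The per-file throughput implied by these numbers makes the 50-file limit concrete: roughly 150 MB/s spread over about 50 concurrent files is around 3 MB/s per file, so filling a 2 Gbps (250 MB/s) path needs noticeably more than 50 concurrent transfers. A small illustrative calculation, using the slide's figures as assumptions:

```python
# Rough estimate of how many concurrent FTS file transfers are needed to
# fill the CERN->BNL channel, using the figures quoted on this slide as
# assumptions (150 MB/s observed with ~50 concurrent files; 2 x 1 Gbps link).

observed_rate_mb_s = 150.0              # aggregate rate observed
concurrent_files = 50                   # FTS per-channel file limit at the time
link_capacity_mb_s = 2 * 1000 / 8.0     # 2 Gbps expressed in MB/s (= 250 MB/s)

per_file_mb_s = observed_rate_mb_s / concurrent_files
files_to_fill_link = link_capacity_mb_s / per_file_mb_s

print(f"~{per_file_mb_s:.1f} MB/s per file")
print(f"~{files_to_fill_link:.0f} concurrent files needed to reach {link_capacity_mb_s:.0f} MB/s")
```

Under these assumptions roughly 80 concurrent files would be needed, well above the 50-file channel limit in place at the time.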

10 Final Data Transfer Reports

11 Lessons Learned From SC2
- Four file transfer servers with a 1 Gbps WAN connection to CERN.
- Met the performance/throughput challenge (70-80 MB/s disk to disk).
- Enabled data transfer between BNL dCache/SRM and the CERN SRM at openlab.
  - Designed our own script to control the SRM data transfers.
- Enabled data transfer between BNL GridFTP servers and CERN openlab GridFTP servers, controlled by the Radiant software.
- Many components needed tuning:
  - With a 250 ms RTT and a high packet-drop rate, we had to use multiple TCP streams and multiple concurrent file transfers to fill the network pipe.
  - Sluggish parallel file I/O with EXT2/EXT3: many processes stuck in the "D" state, and the more file streams, the worse the file-system performance.
  - Slight improvement with XFS; the file-system parameters still need tuning.

12 Some Issues
- The Service Challenge also challenges resources:
  - Tuned network pipes and optimized the configuration and performance of the BNL production dCache system and its underlying OS and file systems.
  - Required the involvement of more than one staff member to stabilize the newly deployed FTS, dCache and network infrastructure.
  - The staffing level decreased as the services became stable.
- Limited resources are shared by experiments and users:
  - At CERN, the SC3 infrastructure is shared by multiple Tier 1 sites.
  - Because of the heterogeneous nature of the Tier 1 sites, data transfer for each site should be optimized individually, based on the site's characteristics: network RTT, packet loss rate, experiment requirements, etc.
  - At BNL, the network and dCache are also used by production users.
  - We need to monitor SRM and the network closely to avoid impacting production activities.
- At CERN, James Casey alone handles answering queries, setting up the system, reporting problems and running the data transfers; he provides 7/16 support by himself.
  - How do we scale to 7/24 production support and a production center?
  - How do we handle the time difference between the US and CERN?
  - CERN support phone (tried once, but the operator did not speak English).

13 What has been done
- SC3 Tier 2 data transfer: data were transferred to three selected Tier 2 sites.
- SC3 tape transfer: tape data transfer was stabilized at 60 MB/s using loaned tape resources.
- Met the goals defined at the beginning of the Service Challenge.
- The full chain of data transfer was exercised.

14 ATLAS SC3 Service Phase

15 ATLAS SC3 Service Phase goals
- Exercise the ATLAS data flow.
- Integrate the data flow with the ATLAS Production System.
- Tier-0 exercise.
- More information:

16 ATLAS-SC3 Tier-0
- Quasi-raw data are generated at CERN and reconstruction jobs run at CERN.
- No data are transferred from the pit to the computer centre.
- The "raw" data and the reconstructed ESD and AOD data are replicated to Tier 1 sites using agents on the VO Boxes at each site (a sketch of such an agent loop follows this slide).
- Exercises the CERN infrastructure (Castor 2, LSF), the LCG Grid middleware (FTS, LFC, VO Boxes) and the ATLAS Distributed Data Management (DDM) software.
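
The replication agents mentioned above are, in essence, a polling loop on each VO Box: look up the files of a dataset, find the ones not yet at this Tier 1, and hand them to the transfer service in batches. The sketch below is a hypothetical, simplified illustration of that pattern, not the actual DDM agent code; catalogue_lookup, already_replicated and submit_transfer are placeholder helpers standing in for the LFC and FTS interactions.

```python
# Hypothetical, simplified sketch of a VO Box replication agent of the kind
# described on this slide. The three helpers are placeholders for real
# LFC/FTS interactions, not actual DDM code.
import time

BATCH_SIZE = 50          # e.g. an FTS per-channel file limit
POLL_INTERVAL_S = 300    # how often the agent re-checks the dataset

def catalogue_lookup(dataset):
    """Placeholder: return (logical_file_name, source_surl) pairs for a dataset."""
    raise NotImplementedError

def already_replicated(lfn):
    """Placeholder: check whether this Tier 1 already holds a replica."""
    raise NotImplementedError

def submit_transfer(batch):
    """Placeholder: submit one batch of source->destination copies to the transfer service."""
    raise NotImplementedError

def agent_loop(dataset):
    while True:
        pending = [(lfn, surl) for lfn, surl in catalogue_lookup(dataset)
                   if not already_replicated(lfn)]
        if not pending:
            break  # dataset fully replicated at this site
        for i in range(0, len(pending), BATCH_SIZE):
            submit_transfer(pending[i:i + BATCH_SIZE])
        time.sleep(POLL_INTERVAL_S)
```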

17 ATLAS Tier-0
[Diagram: ATLAS Tier-0 data flow from the event filter (EF) through CASTOR and the CPU farm out to the Tier 1s. Per-datatype rates:]

  Data   File size     Rate      Files/day   Throughput   Volume/day
  RAW    1.6 GB/file   0.2 Hz    17K         320 MB/s     27 TB
  ESD    0.5 GB/file   0.2 Hz    17K         100 MB/s     8 TB
  AOD    10 MB/file    2 Hz      170K        20 MB/s      1.6 TB
  AODm   500 MB/file   0.04 Hz   3.4K        20 MB/s      1.6 TB

[Aggregate flows shown on the diagram: 0.44 Hz, 37K files/day, 440 MB/s; 1 Hz, 85K files/day, 720 MB/s; 0.4 Hz, 190K files/day, 340 MB/s; 2.24 Hz, 170K files/day (temporary) / 20K files/day (permanent), 140 MB/s. Replication to the Tier 1s: RAW, ESD (2x), AODm (10x).]
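
The per-datatype numbers in the table are internally consistent: throughput is file size times rate, and the daily file counts and volumes follow from the same two figures. A quick check in Python, using only the file sizes and rates from the table above:

```python
# Consistency check of the Tier-0 table above: throughput = file_size * rate,
# files/day = rate * 86400, volume/day = throughput * 86400.
SECONDS_PER_DAY = 86400

datatypes = {
    # name: (file size in MB, file rate in Hz), taken from the table
    "RAW":  (1600, 0.2),
    "ESD":  (500,  0.2),
    "AOD":  (10,   2.0),
    "AODm": (500,  0.04),
}

for name, (size_mb, rate_hz) in datatypes.items():
    throughput_mb_s = size_mb * rate_hz
    files_per_day = rate_hz * SECONDS_PER_DAY
    volume_tb_day = throughput_mb_s * SECONDS_PER_DAY / 1e6
    print(f"{name:5s} {throughput_mb_s:6.0f} MB/s  "
          f"{files_per_day / 1000:6.1f}K files/day  {volume_tb_day:5.1f} TB/day")
```

Within rounding, this reproduces the 320/100/20/20 MB/s and 27/8/1.6/1.6 TB/day figures quoted on the slide.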

18 ATLAS-SC3 Tier-0
- The main goal is a 10% exercise: reconstruct "10%" of the number of events ATLAS will record in 2007, using "10%" of the full resources that will be needed at that time.
- Tier-0:
  - ~300 kSI2k
  - "EF" to CASTOR: 32 MB/s
  - Disk to tape: 44 MB/s (32 for RAW and 12 for ESD+AOD)
  - Disk to WN: 34 MB/s
  - T0 to each T1: 72 MB/s
  - 3.8 TB to "tape" per day
- Tier-1 (on average):
  - ~8500 files per day
  - at a rate of ~72 MB/s

19 ATLAS DDM Monitoring
We achieved quite good rates in the testing phase, with sustained transfers to three sites (PIC, BNL and CNAF).
[Plot: the 24 hours before the 4-day intervention, 29/10 - 1/11]

20 Data Distribution
- We use a generated "dataset" containing 6035 files (3 TB) and tried to replicate it to BNL, CNAF and PIC.
- The BNL data transfer is under way.
- PIC: 3600 files copied and registered
  - 2195 'failed replication' after 5 retries by us x 3 FTS retries (problem under investigation)
  - 205 'assigned' - still waiting to be copied
  - 31 'validation failed' because the SE is down
  - 4 'no replicas found' - LFC connection error
- CNAF: 5932 files copied and registered
  - 89 'failed replication'
  - 14 'no replicas found'
- A quick tally of these per-state counts follows below.
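
As a sanity check, the PIC and CNAF per-state counts can be summed against the 6035-file dataset size; the short Python tally below uses exactly the numbers from this slide.

```python
# Tally of the per-site file states quoted on this slide, checked against
# the 6035-file dataset size. All numbers come directly from the slide.
DATASET_FILES = 6035

sites = {
    "PIC":  {"copied": 3600, "failed replication": 2195, "assigned": 205,
             "validation failed": 31, "no replicas found": 4},
    "CNAF": {"copied": 5932, "failed replication": 89, "no replicas found": 14},
}

for site, states in sites.items():
    total = sum(states.values())
    done_pct = 100.0 * states["copied"] / DATASET_FILES
    print(f"{site}: {total} files accounted for "
          f"({DATASET_FILES - total} unaccounted), {done_pct:.0f}% copied")
```

Both sites' counts add up to exactly 6035 files, so every file in the dataset is accounted for by one of the listed states.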

21 General view of SC3
- When everything is running smoothly, ATLAS gets good results.
- The middleware (FTS) is stable, but there are still many compatibility issues:
  - FTS does not work with the new version of dCache/SRM (version 1.3).
  - ATLAS DDM software dependencies can also cause problems when sites upgrade middleware.
- We have not managed to exhaust anything (production s/w, LCG m/w).
- We are still far from concluding the exercise and are not yet running stably in any way.
- The exercise will continue, adding new sites.

22 BNL Service Challenge 4 Plan
- Several steps are needed to set up each piece of hardware or each service (e.g. choose, procure, start installation, finish installation, make operational):
  - LAN, tape system, computing farm, disk storage
  - dCache/SRM, FTS, LFC, DDM
- Continue to maintain and support the services under the defined SLA (Service Level Agreement).
- December 2005: begin installation of the expanded LAN and the new tape system, and make the new installation operational.
- January 2006: begin data transfer on the newly upgraded infrastructure, with a target rate of 200 MB/s, and deploy all required baseline software.

23 BNL Service Challenge 4 Plan
- April 2006: establish stable data transfer at 200 MB/s to disk and 200 MB/s to tape.
- May 2006: disk and computing farm upgrades.
- June 1, 2006: stable data transfer between T0 and T1 (200 MB/s), driven by the ATLAS production system and the ATLAS data management infrastructure, and provide services satisfying the SLA (Service Level Agreement).
- Details of Tier 2 involvement are being planned as well.