HEPiX Rome – April 2006 The High Energy Data Pump A Survey of State-of-the-Art Hardware & Software Solutions Martin Gasthuber / DESY Graeme Stewart / Glasgow.

Presentation transcript:

HEPiX Rome – April 2006 The High Energy Data Pump A Survey of State-of-the-Art Hardware & Software Solutions Martin Gasthuber / DESY Graeme Stewart / Glasgow Jamie Shiers / CERN

The High Energy Data Pump – Survey of Solutions
Summary
- Software and hardware technologies for providing high-bandwidth sustained bulk data transfers well understood
- Still much work to do to reach stability of services over long periods (several weeks at a time)
- This includes recovery from all sorts of inevitable operational problems
- Distributed computing ain't easy – distributed services harder!
- Communication and collaboration fundamental
- The schedule of our future follows (7 – 8 periods of 1 month)…

R. Bailey, Chamonix XV, January 2006 – Breakdown of a normal year
- ~ 140-160 days for physics per year
- Not forgetting ion and TOTEM operation
- Leaves ~ 100-120 days for proton luminosity running?
- Efficiency for physics 50%?
- ~ 50 days ~ 1200 h ~ 4 × 10^6 s of proton luminosity running / year
- From Chamonix XIV – Service upgrade slots?
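As a quick sanity check of that time budget, here is a minimal arithmetic sketch: only the ~50 days and 50% efficiency figures come from the slide, the rest is plain unit conversion.

```python
# Sanity check of the Chamonix time budget quoted above.
# Figures taken from the slide: ~50 days of effective physics time,
# obtained from the proton running period at ~50% physics efficiency.

HOURS_PER_DAY = 24
SECONDS_PER_HOUR = 3600

effective_days = 50                # "~ 50 days" on the slide
physics_efficiency = 0.5           # "Efficiency for physics 50%?"

running_days = effective_days / physics_efficiency        # ~100 days of proton running implied
effective_hours = effective_days * HOURS_PER_DAY          # ~1200 h
effective_seconds = effective_hours * SECONDS_PER_HOUR    # ~4.3e6 s

print(f"~{running_days:.0f} days of running -> ~{effective_days} days / "
      f"~{effective_hours} h / ~{effective_seconds:.1e} s of physics per year")
```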

The High Energy Data Pump – Survey of Solutions

[Daily transfer-rate plot – the red line is the target average daily rate!]
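The point of that comparison is that the target is a daily average, not a peak. A hypothetical sketch of the check (all hourly samples are invented; the 200 MB/s target is only an example, in the range of the Tier-1 nominal rates quoted in the blog entries below):

```python
# Hypothetical illustration: compare a site's daily *average* throughput
# against its nominal target ("the red line"). All numbers are invented.

hourly_mb_per_s = [180, 210, 0, 0, 150, 220, 230, 190] * 3  # 24 invented hourly readings (pattern repeated, incl. zero-rate hours)

daily_average = sum(hourly_mb_per_s) / len(hourly_mb_per_s)
nominal_target_mb_per_s = 200.0     # example target rate

print(f"daily average: {daily_average:.0f} MB/s, "
      f"target: {nominal_target_mb_per_s:.0f} MB/s, "
      f"met: {daily_average >= nominal_target_mb_per_s}")
```

A burst above the red line for a few hours does not compensate for an outage; the whole day has to average out above the target.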

The High Energy Data Pump – Survey of Solutions
ServiceChallengeFourBlog (LCG → SC)
- 06/04 09:00 TRIUMF exceeded their nominal data rate of 50 MB/s yesterday, despite the comments below. Congratulations! Jamie
- 05/04 23:59 A rough day with problems that are not yet understood (see the tech list), but we also reached the highest rate ever (almost 1.3 GB/s) and we got FNAL running with srmcopy. Most sites are below their nominal rates, and at that they need too many concurrent transfers to achieve those rates, so we still have some debugging ahead of us. CASTOR has been giving us timeouts on SRM get requests and Olof had to clean up the request database. To be continued... Maarten
- 05/04 16:30 The Lemon monitoring plots show that almost exactly at noon the output of the SC4 WAN cluster dropped to zero. It looks like the problem was due to an error in the load generator, which might also explain the bumpy transfers BNL saw. Maarten
- 05/04 11:02 Maintenance on USLHCNET routers completed. (During the upgrade of the Chicago router, the traffic was rerouted through GEANT.) Dan
- 05/04 11:06 Database upgrade completed by 10am. DLF database was recreated from scratch. Backup scripts activated. DB Compatibility moved to release , automatic startup/shutdown of the database tested. Nilo
- 05/04 10:50 DB upgrade is finished and CASTOR services have restarted. SC4 activity can resume. Miguel
- 05/04 09:32 SC4 CASTOR services stopped. Miguel
- 05/04 09:30 Stopped all channels to allow for upgrade of Oracle DB backend to a more powerful node in CASTOR. James
- 04/04 IN2P3 meet their target nominal data rate for the past 24 hours (200 MB/s). Congratulations! Jamie
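On the remark that sites "need too many concurrent transfers to achieve those rates", a rough sizing sketch: both the per-transfer throughput and the nominal rate below are assumed, illustrative numbers, not SC4 measurements.

```python
import math

# Rough sizing: how many concurrent file transfers are needed to sustain a
# nominal rate, given what a single transfer actually achieves end-to-end
# over the WAN. Both figures are illustrative assumptions.

nominal_rate_mb_s = 200.0    # example target aggregate rate for a site
per_transfer_mb_s = 5.0      # assumed sustained throughput of one transfer

concurrent_needed = math.ceil(nominal_rate_mb_s / per_transfer_mb_s)
print(f"~{concurrent_needed} concurrent transfers of {per_transfer_mb_s} MB/s each "
      f"to sustain {nominal_rate_mb_s} MB/s")
```

The lower the per-transfer throughput, the more simultaneous transfers (and SRM requests, and failure modes) a site has to juggle, which is why low single-stream rates were flagged as a debugging target.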

The High Energy Data Pump – Survey of Solutions
Conclusions
- We have to practise repeatedly to get sustained daily average data rates
- Proposal is to repeat an LHC running period (reduced – just ten days) every month
- Transfers driven as dteam with low priority
- We still have to add tape backend – not to mention the full DAQ-T0-T1 chain – and drive with experiment software
- First data will arrive next year
- NOT an option to get things going later
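To put the proposed ten-day exercise into perspective, a quick volume estimate: the 200 MB/s rate is the IN2P3 nominal figure quoted in the blog above and is used purely as an example, other sites would scale with their own nominal rates.

```python
# Data volume implied by holding a nominal rate for the proposed ten-day exercise.
# 200 MB/s is used as an example (the IN2P3 nominal figure quoted earlier);
# substitute each site's own nominal rate.

nominal_rate_mb_s = 200.0
exercise_days = 10

seconds = exercise_days * 24 * 3600
volume_tb = nominal_rate_mb_s * seconds / 1e6   # MB -> TB

print(f"{exercise_days} days at {nominal_rate_mb_s:.0f} MB/s ≈ {volume_tb:.0f} TB to move and store")
```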