GridPP18 Glasgow Mar 07 DØ – SAMGrid: Where've we come from, and where are we going? Evolution of a 'long' established plan. Gavin Davies, Imperial College London.

Presentation transcript:

GridPP18 Glasgow Mar 07 DØ – SAMGrid: Where've we come from, and where are we going? Evolution of a 'long' established plan. Gavin Davies, Imperial College London

GridPP18 Glasgow Mar 07 Introduction
Tevatron
– Running experiments (less data than the LHC, but still PBs per experiment)
– Growing: great physics & better still to come. Have 2 fb⁻¹ of data and expect up to 6 fb⁻¹ more by the end of running
Computing model: datagrid (SAM) for all data handling, originally with distributed computing, evolving to automated use of common tools/solutions on the grid (SAMGrid) for all tasks
– Started with production tasks, e.g. MC generation and data processing – the greatest need & the easiest to 'gridify'; ahead of the wave for a running experiment
– Based on SAMGrid, but with a programme of interoperability from very early on: initially LCG and then OSG
– Increased automation; user analysis considered last (SAM already gives remote data analysis)

GridPP18 Glasgow Mar 07 Computing Model
[Diagram: data handling services link central storage, central farms, remote farms, central analysis systems, remote analysis systems and user desktops; data flows shown for raw data, RECO data, RECO MC and user data]

GridPP18 Glasgow Mar 07 Components – Terminology
SAM (Sequential Access to Metadata)
– Well developed metadata & distributed data replication system (illustrated by the toy sketch below)
– Originally developed by DØ & FNAL-CD, now used by CDF & MINOS
JIM (Job Information and Monitoring)
– Handles job submission and monitoring (all but data handling)
– SAM + JIM → SAMGrid, a computational grid
Runjob
– Handles job workflow management
UK role
– Project leadership
– Key technology: runjob, integration of SAMGrid development & production
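A minimal Python sketch of the SAM idea described above – files carry metadata, and a dataset is simply a metadata query that any site can resolve. This is illustrative only, not the real SAM interface; the names (MetadataCatalogue, declare, define_dataset) are invented for the example.

    # Toy illustration of metadata-driven data handling; NOT the SAM API.
    from dataclasses import dataclass, field

    @dataclass
    class FileRecord:
        name: str
        metadata: dict = field(default_factory=dict)  # e.g. run number, data tier

    class MetadataCatalogue:
        """Stand-in for a central metadata database."""
        def __init__(self):
            self._files = []

        def declare(self, record: FileRecord):
            """Register a file together with its metadata."""
            self._files.append(record)

        def define_dataset(self, **constraints):
            """A 'dataset' is just the files whose metadata match the query."""
            return [f.name for f in self._files
                    if all(f.metadata.get(k) == v for k, v in constraints.items())]

    catalogue = MetadataCatalogue()
    catalogue.declare(FileRecord("run178000_raw.dat",
                                 {"run": 178000, "tier": "raw"}))
    catalogue.declare(FileRecord("run178000_reco.dat",
                                 {"run": 178000, "tier": "reconstructed"}))

    # Any station resolving this query gets the same dataset.
    print(catalogue.define_dataset(tier="reconstructed"))

In the real system the catalogue is a central database, and the files resolved for a dataset are delivered to the requesting station's local cache – which is what makes the same dataset usable at remote sites.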

GridPP18 Glasgow Mar 07 SAM plots
– Over 10 PB (250 billion events) in the last year
– Up to 1.2 PB moved per month (a x5 increase over 2 years ago)
– SAM TV – monitors SAM and the SAM stations
– Continued success: SAM shifters, often remote
[Plot: monthly data volume for all stations, reaching 1 PB/month]

GridPP18 Glasgow Mar 07 SAMGrid plots
– JIM: more than 10 active execution sites
– Moving to forwarding nodes, so individual sites no longer appear as new red dots on the site map

GridPP18 Glasgow Mar 07 SAMGrid Interoperability
Long programme of interoperability: LCG first and then OSG
Step 1: Co-existence – use shared resources with a SAM(Grid) headnode
– Widely done for both MC and the 2004/05 data reprocessing
– NIKHEF MC a very good example (GridPP10 talk)
Step 2: SAMGrid-LCG interface
– SAM does the data handling & JIM the job submission
– Basically a forwarding mechanism, with forwarding nodes replicated as needed (see the sketch below)
– Data fixing in early 2006, MC since
OSG activity – learnt from the LCG activity
– P20 data reprocessing now
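The forwarding mechanism in Step 2 can be pictured with a short conceptual sketch (plain Python with invented names such as ForwardingNode and toy_lcg_submit; this is not SAMGrid code): SAM keeps ownership of the data handling, while the forwarding node re-submits the execution part of a job to whichever grid, LCG or OSG, sits behind it.

    # Conceptual sketch of a forwarding node; NOT SAMGrid code.
    from dataclasses import dataclass

    @dataclass
    class SAMGridJob:
        dataset: str        # resolved through SAM, independent of where the job runs
        executable: str
        events_per_job: int

    class ForwardingNode:
        """Accepts SAMGrid job descriptions and forwards execution to a backend grid."""
        def __init__(self, backend_name, submit_fn):
            self.backend_name = backend_name  # e.g. "LCG" or "OSG"
            self.submit_fn = submit_fn        # backend-specific submission hook

        def forward(self, job: SAMGridJob):
            # Data handling stays with SAM; only execution is delegated.
            print(f"Forwarding {job.executable} on {job.dataset} to {self.backend_name}")
            return self.submit_fn(job)

    def toy_lcg_submit(job):
        """Stand-in for a real backend submission command."""
        return f"{job.executable} submitted to LCG"

    lcg_gateway = ForwardingNode("LCG", toy_lcg_submit)
    print(lcg_gateway.forward(SAMGridJob("p20-raw-sample", "d0reco", 10000)))

Scaling up is then a matter of standing up more forwarding nodes in front of additional LCG or OSG resources – the "replicate as needed" point above.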

GridPP18 Glasgow Mar 07 Monte Carlo
– Massive increase with the spread of SAMGrid use & LCG (OSG later)
– P17: 455M events since 09/05, 30M events/month, 80% produced in Europe – almost a constant of nature
– UKRAC – full details on the web: neu/d0_uk_rac/d0_uk_rac.html
– LCG grid-wide submission reached a scaling problem

GridPP18 Glasgow Mar 07 Data – reprocessing & fixing
P14 reprocessing: Winter 2003/04
– 100M events remotely, 25M in the UK
– Distributed computing rather than Grid
P17 reprocessing: Spring – Autumn 05
– x10 larger, i.e. 1B events, 250 TB, from raw
– SAMGrid as the default (using mc_runjob)
– Site certification
P17 fixing: Spring 06
– All Run IIa – 1.4B events in 6 weeks
– SAMGrid-LCG 'burnt in'
Moving to primary processing and skimming

GridPP18 Glasgow Mar 07 A comment... if I may
– Largest data challenges (I believe) in HEP using the grid
– Learnt a lot about the technology, and especially how it scales
– Learnt a lot about the organisation / operation of such projects
– Some of this can be abstracted and be of benefit to others… (a different talk…)

GridPP18 Glasgow Mar 07 A comment – graphically
P20 reprocessing
– I know it's OSG (started with LCG)
– SAMGrid-LCG will be used to catch up
[Plots: P20 reprocessing job status at OSG sites and at IN2P3 – "a lot of green", "a lot of red"]

GridPP18 Glasgow Mar 07 (DØ) Runjob
– Used in all production tasks – a UK responsibility
– In 2004 we froze SAM at v5; mc_runjob was used by SAMGrid for MC and reprocessing from then until summer 2006
DØrunjob – the rewrite
– Joint (CDF,) CMS, DØ, FNAL-CD project
– Base classes from the common Runjob package (see the toy sketch below)
– Things got messy – but a triumph: a sustainable, long-term product with SAM v7
– For details see: Runjob, CDFRunjob, CMSRunjob, DØRunjob
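The "base classes from the common Runjob package" pattern can be sketched as follows; this is a toy illustration with invented class names, not the DØrunjob API. A common layer defines generic workflow steps, and an experiment-specific layer chains them into a production job.

    # Toy workflow-management sketch; NOT the DØrunjob API.
    class Step:
        """Generic workflow step, as a hypothetical common base package might define it."""
        def run(self, context: dict) -> dict:
            raise NotImplementedError

    class FetchInput(Step):
        def run(self, context):
            # Pretend to resolve the dataset into input files.
            context["files"] = [f"{context['dataset']}_part{i}.raw" for i in range(2)]
            return context

    class RunExecutable(Step):
        def __init__(self, executable):
            self.executable = executable
        def run(self, context):
            context["outputs"] = [f.replace(".raw", ".reco") for f in context["files"]]
            print(f"{self.executable} processed {len(context['files'])} files")
            return context

    class StoreOutput(Step):
        def run(self, context):
            print(f"Declaring {context['outputs']} back to the data handling system")
            return context

    class Workflow:
        """Experiment layer: chain generic steps into one production job."""
        def __init__(self, steps):
            self.steps = steps
        def execute(self, context):
            for step in self.steps:
                context = step.run(context)
            return context

    Workflow([FetchInput(), RunExecutable("d0reco"), StoreOutput()]).execute(
        {"dataset": "p17_mc_sample"})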

GridPP18 Glasgow Mar 07 Next steps / issues – I
Complete endgame development – the ability to analyse larger datasets with decreasing manpower
– Additional functionality: skimming, primary processing at multiple sites, MC production at different stages, different outputs…
– Additional resources: completing the forwarding nodes for full data/MC capability; scaling issues in accessing the full LCG and OSG worlds
– Data analysis – how gridified do we go? An open issue. My feeling: we need to be 'interoperable' – FermiGrid, certain large LCG sites. Will need development, deployment and operations effort
– And operations…

GridPP18 Glasgow Mar 07 Next steps / issues – II
"Steady" state – goal to reach by end of CY07 (≥ 2 yrs of running)
– Maintenance of existing functionality
– Continued experimental requests
– Continued evolution as grid standards evolve
– Operations: you do still need manpower, and not just to make sure the hardware works – MC and data production are not fire-and-forget
Manpower is a real issue (especially with data analysis on the grid)

GridPP18 Glasgow Mar 07 Summary / plans
DØ and the Tevatron are performing very well
– Big physics results have come out, with better yet on their way
– Much more data to come → increasing needs, with reduced effort
SAM & SAMGrid are critical to DØ
– Without the grid DØ would not have worked – THANKS
– GridPP is a key part of the effort (technical / leadership) – THANKS
– Users are demanding; it is hard to develop and maintain production-level services
Baseline: ensure (scaling for) production tasks
– Move to SAM v7 and d0runjob
– Accessing all of LCG – establishing the UKRAC – forwarding nodes
In parallel, the open question of data analysis – will need to go at least part way
Manpower for development, integration and operation is a real issue

GridPP18 Glasgow Mar 07 Back-ups

GridPP18 Glasgow Mar 07 SAMGrid Architecture