Tony Doyle - University of Glasgow
E-Science and LCG-2
PPAP Summary
– Results from GridPP1/LCG1
– Value of the UK contribution to LCG?
– Aims of GridPP2/LCG2
– UK special contribution to LCG2?
– How much effort will be needed to continue activities during the LHC era?
Tony Doyle - University of Glasgow, 26 October 2004, PPAP
Outline
1. What has been achieved in GridPP1?
– GridPP I (09/01-08/04): Prototype, complete
2. What is being attempted in GridPP2?
– GridPP II (09/04-08/07): Production, short timescale
– What is the value of a UK LCG Phase-2 contribution?
3. Resources needed in medium-long term?
– (09/07-08/10): Exploitation, medium term. Focus on resources needed in 2008
– (09/10-08/14): Exploitation, long term
Executive Summary
– Introduction: Ref: the Grid is a reality
– Project Management: a project was/is needed (under control via Project Map)
– Resources: deployed according to planning
– CERN: Phase 1.. Phase 2
– Middleware: prototype(s) made impact
– Applications: fully engaged (value added)
– Tier-1/A: Tier-1 production mode
– Tier-2: resources now being utilised
– Dissemination: UK flagship project
– Exploitation: preliminary planning
GridPP Deployment Status
Three Grids on a global scale in HEP (similar functionality):
Grid            sites      CPUs
LCG (GridPP)    82 (14)    7300 (1500)
Grid3 [USA]
NorduGrid
– GridPP deployment is part of LCG (currently the largest Grid in the world)
– The future Grid in the UK is dependent upon LCG releases
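
The LCG (GridPP) figures on this slide can be turned into a UK share with a short calculation. The sketch below assumes only the four numbers quoted on the slide (82 sites and 7300 CPUs for LCG, of which 14 sites and 1500 CPUs are GridPP); the variable and function names are illustrative.

```python
# Figures quoted on the slide: LCG totals, with the GridPP part in parentheses.
lcg = {"sites": 82, "cpus": 7300}
gridpp = {"sites": 14, "cpus": 1500}


def share_percent(part, whole):
    """Return part/whole as a percentage, rounded to one decimal place."""
    return round(100.0 * part / whole, 1)


# GridPP share of LCG sites and CPUs, in percent.
gridpp_share = {key: share_percent(gridpp[key], lcg[key]) for key in lcg}
print(gridpp_share)
```

Under these figures GridPP accounts for roughly one sixth of the LCG sites and about one fifth of the CPUs.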
Deployment Status (26/10/04)
Incremental releases: significant improvements in reliability, performance and scalability
–within the limits of the current architecture
–scalability is much better than expected a year ago
Many more nodes and processors than anticipated
–installation problems of last year overcome
–many small sites have contributed to MC productions
Full-scale testing as part of this year's data challenges
GridPP: "The Grid becomes a reality" – widely reported (British Embassy (Russia), technology sites, British Embassy (USA))
Data Challenges
– Ongoing.. Grid and non-Grid production
– Grid now significant
– CMS - 75 M events and 150 TB: first of this year's Grid data challenges
– ALICE - 35 CPU years: Phase 1 done, Phase 2 ongoing
– LCG
Entering Grid Production Phase..
Data Challenge
– 7.7 M GEANT4 events and 22 TB
– UK ~20% of LCG
– Ongoing.. (3) Grid Production
– ~150 CPU years so far
– Largest total computing requirement
– Small fraction of what ATLAS need..
Entering Grid Production Phase..
LHCb Data Challenge
– 424 CPU years (4,000 kSI2k months), 186 M events
– UK's input significant (>1/4 total)
LCG(UK) resource:
–Tier-1 7.7%
–Tier-2 sites:
–London 3.9%
–South 2.3%
–North 1.4%
DIRAC:
–Imperial 2.0%
–L'pool 3.1%
–Oxford 0.1%
–ScotGrid 5.1%
[Plot of produced events over time, annotated: DIRAC alone; LCG in action; LCG paused; Phase 1 completed; LCG restarted; 186 M produced events]
Entering Grid Production Phase..
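
The ">1/4 total" statement can be checked against the per-site percentages listed on this slide. This is a minimal sketch, assuming the eight quoted entries are disjoint UK contributions that can simply be added; the dictionary and function names are illustrative.

```python
from decimal import Decimal

# Percentages of the LHCb data challenge quoted on the slide.
lcg_uk = {"Tier-1": "7.7", "London": "3.9", "South": "2.3", "North": "1.4"}
dirac_uk = {"Imperial": "2.0", "L'pool": "3.1", "Oxford": "0.1", "ScotGrid": "5.1"}


def total(shares):
    """Add percentage strings exactly, avoiding binary floating-point noise."""
    return sum((Decimal(value) for value in shares.values()), Decimal("0"))


lcg_total = total(lcg_uk)
dirac_total = total(dirac_uk)
uk_total = lcg_total + dirac_total

print("LCG(UK):", lcg_total, "DIRAC:", dirac_total, "UK total:", uk_total)
print("More than a quarter:", uk_total > Decimal("25"))
```

With these inputs the LCG(UK) and DIRAC contributions add up to just over 25%, which is consistent with the claim on the slide.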
Paradigm Shift
Transition to Grid…
Month   Split (as shown)   Share of DC04
May     89%:11%            11%
Jun     80%:20%            25%
Jul     77%:23%            22%
Aug     27%:73%            42%
[Chart axis: CPU · years]
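
The monthly figures on this slide can be combined into an overall split for the whole data challenge. The sketch below rests on two assumptions that the transcript does not state: that the second number in each ratio is the Grid fraction for that month, and that the truncated "22% of DC" entry refers to DC04 like the others.

```python
# (month, first part %, second part %, share of DC04 %), as quoted on the slide.
# Assumption: the second part is the Grid fraction of that month's production.
months = [
    ("May", 89, 11, 11),
    ("Jun", 80, 20, 25),
    ("Jul", 77, 23, 22),
    ("Aug", 27, 73, 42),
]

# The monthly shares should account for the whole data challenge.
share_sum = sum(share for _, _, _, share in months)

# Weight each month's split by that month's share of the total.
first_overall = sum(first * share for _, first, _, share in months) / share_sum
second_overall = sum(second * share for _, _, second, share in months) / share_sum

print(f"shares sum to {share_sum}%")
print(f"overall split: {first_overall:.2f}% : {second_overall:.2f}%")
```

Because August carries the largest share of the challenge, the weighted overall split is much closer to even than the May to July ratios alone would suggest.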
What was GridPP1?
A team that built a working prototype grid of significant scale:
– > 1,500 (7,300) CPUs
– > 500 (6,500) TB of storage
– > 1,000 (6,000) simultaneous jobs
A complex project where 82% of the 190 tasks for the first three years were completed
A Success: "The achievement of something desired, planned, or attempted"
Aims for GridPP2? From Prototype to Production
[Diagram, three stages:
– Separate experiments, resources, multiple accounts: BaBar, D0, CDF, ATLAS, CMS, LHCb, ALICE; 19 UK Institutes; RAL Computer Centre; CERN Computer Centre
– Prototype Grids: UK Prototype Tier-1/A Centre; CERN Prototype Tier-0 Centre; UK Prototype Tier-2 Centres
– 'One' Production Grid: UK Tier-1/A Centre; CERN Tier-0 Centre; 4 UK Tier-2 Centres
Other labels: SAMGrid, BaBarGrid, LCG, EDG, GANGA, EGEE, ARDA]
Planning: GridPP2 ProjectMap
Need to recognise future requirements in each area…
Tier 0 and LCG: Foundation Programme
– Aim: build upon Phase 1
– Ensure development programmes are linked
– Project management: GridPP, LCG
– Shared expertise
– LCG establishes the global computing infrastructure
– Allows all participating physicists to exploit LHC data
– Earmarked UK funding being reviewed
Required Foundation: LCG Deployment
F. LHC Computing Grid Project (LCG Phase 2) [review]
Tier 0 and LCG: RRB meeting today
– Jos Engelen proposal to RRB members (Richard Wade [UK]) on how a 20 MCHF shortfall for LCG Phase 2 can be funded
– Funding from UK (£1m), France, Germany and Italy for 5 staff. Others?
– Spain to fund ~2 staff. Others at this level?
– Now vitally important that the LCG effort established predominantly via UK funding (40%) is sustained at this level (~10%)
– URGENT
– Value to the UK?
Required Foundation: LCG Deployment
Issues
First large-scale Grid production problems being addressed… at all levels
What lies ahead? Some mountain climbing..
– Annual data storage: PetaBytes per year
– 100 Million SPECint ,000 PCs (3 GHz Pentium 4)
– [Figure: height scale with Concorde (15 km), CD stack with 1 year of LHC data (~20 km), and "We are here" (1 km)]
– Quantitatively, we're ~7% of the way there in terms of CPU (7,000 out of 100,000) and disk (4 out of 12-14*3-4 years)…
– In production terms, we've made base camp
– Importance of step-by-step planning… Pre-plan your trip, carry an ice axe and crampons and arrange for a guide…
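
The "~7% of the way there" estimate can be reproduced from the numbers quoted on this slide. The sketch below assumes that "12-14*3-4 years" means an annual volume of 12 to 14 units accumulated over 3 to 4 years, in the same units as the current disk figure of 4; that reading is an interpretation, not something the transcript spells out.

```python
# CPU: current count against the target quoted on the slide.
cpu_now, cpu_target = 7_000, 100_000

# Disk: current figure against (annual volume) x (years of running).
# Assumption: all disk numbers share one unit, which the slide does not name.
disk_now = 4
annual_low, annual_high = 12, 14
years_low, years_high = 3, 4

cpu_fraction = cpu_now / cpu_target

# Smallest target gives the most optimistic fraction, largest the least.
disk_fraction_high = disk_now / (annual_low * years_low)
disk_fraction_low = disk_now / (annual_high * years_high)

print(f"CPU:  {cpu_fraction:.1%}")
print(f"Disk: {disk_fraction_low:.1%} to {disk_fraction_high:.1%}")
```

On this reading the CPU figure is exactly 7%, and the disk figure lies between about 7% and 11% depending on which end of each range is used.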
Grid and e-Science Support in 2008
What areas require support?
– IV: Running the Tier-1 Data Centre
– IV: Hardware annual upgrade
– IV: Contribution to Tier-2 Sysman effort (non-PPARC) hardware
– IV: Frontend Tier-2 hardware
– IV: Contribution to Tier-0 support
– III: One M/S/N expert in each of 6 areas
– III: Production manager and four Tier-2 coordinators
– II: Application/Grid experts (UK support)
– I: ATLAS Computing MoU commitments and support
– I: CMS Computing MoU commitments and support
– I: LHCb Core Tasks and Computing Support
– I: ALICE Computing support
– I: Future experiments adopt e-Infrastructure methods
No GridPP management (assume production mode established + devolved management to Institutes)
Key: I. Experiment Layer; II. Application Middleware; III. Grid Middleware; IV. Facilities and Fabrics
PPARC Financial Input: GridPP1 Components
– LHC Computing Grid Project (LCG): Applications, Fabrics, Technology and Deployment
– European DataGrid (EDG): Middleware Development
– UK Tier-1/A Regional Centre: Hardware and Manpower
– Grid Application Development: LHC and US Experiments + Lattice QCD
– Management, Travel etc
PPARC Financial Input: GridPP2 Components
A. Management, Travel, Operations
B. Middleware, Security, Network Development
C. Grid Application Development: LHC and US Experiments + Lattice QCD + Phenomenology
D. Tier-2 Deployment: 4 Regional Centres - M/S/N support and System Management
E. Tier-1/A Deployment: Hardware, System Management, Experiment Support
F. LHC Computing Grid Project (LCG Phase 2) [review]
IV. Hardware Support
UK Tier: CPU Total (MSI2k) 4.2; Disk Total (PB) 3.8; Total Tape (PB) 2.3
UK Tier: CPU Total (MSI2k) 8.0; Disk Total (PB) 1.0
1. Global shortfall of Tier-1 CPU (-13%) and Disk (-55%)
2. UK Tier-1 input corresponds to ~40% (~10%) of global disk (CPU)
3. UK Tier-2 CPU and disk resources significant
4. Rapid physics analysis turnaround is a necessity
5. Priority is to ensure that ALL required software (experiment, middleware, OS) is routinely deployed on this hardware well before 2008
III. Middleware, Security and Network
M/S/N builds upon UK strengths as part of international development
– Configuration Management
– Storage Interfaces
– Network Monitoring
– Security
– Information Services
– Grid Data Management
[Diagram labels: Security, Middleware, Networking]
Require some support expertise in each of these areas in order to maintain the Grid
II. Application Middleware
GANGA, SAMGrid, Lattice QCD, AliEn, CMS, BaBar, Phenomenology
– Require some support expertise in each of these areas in order to maintain the Grid applications.
– Need to develop e-Infrastructure portals for new experiments starting up in the exploitation era.
ATLAS UK e-science forward look (Roger Jones)
Current core and infrastructure activities:
– Run Time Testing and Validation Framework, tracking and trigger instantiations
– Provision of ATLAS Distributed Analysis & production tools
– Production management
– GANGA development
– Metadata development
– ATLFast simulation
– ATLANTIS Event Display
– Physics Software Tools
~11 FTEs, mainly ATLAS e-science with some GridPP & HEFCE
Current Tracking and Trigger e-science:
– Alignment effort ~6 FTEs
– Core software ~2.5 FTEs
– Tracking tools ~6 FTEs
– Trigger ~2 FTEs
Both will move from development to optimisation & maintenance
– The current e-science funding will only take us (at best) to first data
– Expertise required for the real-world problems and maintenance
– Note for the HLT, the installation and commissioning will continue into the running period because of staging
Need ~15 FTE (beyond existing rolling grant) in 2007/9 - continued e-science/GridPP support
CMS UK e-science forward look (Dave Newbold)
NB: First look estimates; will inevitably change as we approach running
Need ~9 FTE (beyond existing rolling grant) in 2007/9 - continued e-science/GridPP support
Work area: Current FTEs; FTEs; FTEs
– Cmp sys / support (e-science WP1): ramp up for running phase; steady state
– Monitoring / DQM (e-science WP3): initial running; 1.5 - support / maintenance
– Tracker software (e-science WP4): initial deployment; running support / maintenance
– ECAL software (e-science WP5): initial running; 1.0 - support / maintenance
– Data management (GridPP2): final deployment / support; support / maintenance
– Analysis system (GridPP2): final deployment / support; support / maintenance
Computing system / support:
– Development / tuning of computing model + system; management
– User support for T1 / T2 centres (globally); liaison with LCG ops
Monitoring / DQM:
– Online data gathering / expert systems for CMS tracker, trigger
Tracker / ECAL software:
– Installation / calibration support; low-level reconstruction codes
Data management:
– Phedex system for bulk offline data movement and tracking
– System-level metadata; movement of HLT farm data online (new area)
Analysis system:
– CMS-specific parts of distributed analysis system on LCG
LHCb UK e-science forward look (Nick Brook)
Current core activities:
– GANGA development
– Provision of DIRAC & production tools
– Development of conditions DB
– The production bookkeeping DB
– Data management & metadata
– Tracking
– Data Challenge Production Manager
~10 FTEs, mainly GridPP, e-science, studentships with some HEFCE support
Will move from development to maintenance phase - UK pro rata share of LHCb core computing activities ~5 FTEs
Current RICH & VELO e-science:
– RICH: UK provide bulk of the RICH s/w team including s/w coordinator; ~7 FTEs, about 50:50 e-science funding + rolling grant/HEFCE
– VELO: UK provide bulk of the VELO s/w team including s/w coordinator; ~4 FTEs, about 50:50 e-science funding + rolling grant/HEFCE
– ALL essential alignment activities for both detectors through e-science funding
Will move from development to maintenance and operational alignment: ~3 FTEs for alignment
Need ~9 FTE (core + alignment + UK support) in 2007/9 - continued e-science support
Priorities in context of a financial snapshot in 2008
Grid and e-Science funding requirements
– Grid (£5.6m p.a.) and e-Science (£2.7m p.a.)
– Assumes no GridPP project management
– Savings?
–EGEE Phase 2 may contribute
–UK e-Science context is
1. NGS (National Grid Service)
2. OMII (Open Middleware Infrastructure Institute)
3. DCC (Digital Curation Centre)
– Timeline?
To be compared with Road Map: Not a Bid - Preliminary Input
Grid and e-Science Exploitation Timeline?
– PPAP initial input: Oct 2004
– Science Committee initial input
– PPARC call assessment: 2005
– Science Committee outcome: Oct 2005
– PPARC call: Jan 2006
– PPARC close of call: May 2006
– Assessment: Jun-Dec 2006
– PPARC outcome: Dec 2006
– Institute Recruitment/Retention: Jan-Aug 2007
– Grid and e-Science Exploitation: Sep ….
Note: if the assessment from PPARC internal planning differs significantly from this preliminary advice from PPAP and SC, then earlier planning is required.
Summary
1. What has been achieved in GridPP1?
– Widely recognised as successful at many levels
2. What is being attempted in GridPP2?
– Prototype to Production – typically most difficult phase
– UK should invest further in LCG Phase-2
3. Resources needed for Grid and e-Science in medium-long term?
– Current Road Map ~£6m p.a.
– Resources needed in 2008 estimated at £8.3m
– Timeline for decision-making outlined..
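
The £8.3m estimate for 2008 follows from the Grid and e-Science figures given in the financial snapshot earlier in the talk. The sketch below adds them and compares the total with the Road Map figure; it treats the "~£6m" Road Map value as exactly 6.0, which is an approximation made only for this illustration.

```python
from decimal import Decimal

# Annual figures in millions of pounds, as quoted in the talk.
grid_2008 = Decimal("5.6")
escience_2008 = Decimal("2.7")

# Assumption: the "~£6m p.a." Road Map figure is taken as exactly 6.0.
road_map = Decimal("6.0")

total_2008 = grid_2008 + escience_2008
gap_to_road_map = total_2008 - road_map

print(f"Resources needed in 2008: £{total_2008}m")
print(f"Excess over Road Map:     £{gap_to_road_map}m")
```

Under that assumption the 2008 estimate exceeds the current Road Map by a little over £2m per year.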