LHC Computing Grid Project - LCG
The LHC Computing Grid: First Steps towards a Global Computing Facility for Physics
18 September 2003
Les Robertson – LCG Project Leader
CERN – European Organization for Nuclear Research, Geneva, Switzerland

LHC Computing Grid Project
 The LCG Project is a collaboration of –
  the LHC experiments
  the Regional Computing Centres
  physics institutes
 .. working together to prepare and deploy the computing environment that will be used by the experiments to analyse the LHC data
 This includes support for applications
  provision of common tools, frameworks, environment, data persistency
 .. and the development and operation of a computing service
  exploiting the resources available to LHC experiments in computing centres, physics institutes and universities around the world
  presenting this as a reliable, coherent environment for the experiments

Joint with EGEE
 Applications Area – development environment, joint projects, data management, distributed analysis
 Middleware Area – provision of a base set of grid middleware: acquisition, development, integration, testing, support
 CERN Fabric Area – large cluster management, data recording, cluster technology, networking, computing service at CERN
 Grid Deployment Area – establishing and managing the Grid Service: middleware certification, security, operations, registration, authorisation, accounting
 Operational Management of the Project

Applications Area Projects
 Software Process and Infrastructure (SPI) (A. Aimar)
  librarian, QA, testing, developer tools, documentation, training, …
 Persistency Framework (POOL) (D. Duellmann)
  POOL hybrid ROOT/relational data store
 Core Tools and Services (SEAL) (P. Mato)
  foundation and utility libraries, basic framework services, object dictionary and whiteboard, math libraries, (grid-enabled services)
 Physicist Interface (PI) (V. Innocente)
  interfaces and tools by which physicists directly use the software: interactive analysis, visualization, (distributed analysis & grid portals)
 Simulation (T. Wenaus et al.)
  generic framework, Geant4, FLUKA integration, physics validation, generator services
 Close relationship with ROOT (R. Brun)
  ROOT I/O event store; analysis package
 Group currently working on distributed analysis requirements
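The "hybrid ROOT/relational data store" idea can be illustrated with a small sketch: bulk objects live in ordinary files, while a relational catalogue records which file and entry each object sits in. This is a conceptual illustration only, not the POOL API – the file format, table layout and token names here are invented:

```python
# Conceptual sketch of a hybrid store: bulk objects go into files (as ROOT I/O
# does for POOL), while a small relational catalogue maps a logical token to a
# file and entry. Illustration only; not the POOL interface.
import pickle, sqlite3, pathlib

catalog = sqlite3.connect("catalog.db")
catalog.execute("CREATE TABLE IF NOT EXISTS objects (token TEXT PRIMARY KEY, file TEXT, entry INT)")

def write(token, obj, filename, entry):
    path = pathlib.Path(filename)
    contents = pickle.loads(path.read_bytes()) if path.exists() else {}
    contents[entry] = obj                          # stand-in for ROOT object streaming
    path.write_bytes(pickle.dumps(contents))
    catalog.execute("INSERT OR REPLACE INTO objects VALUES (?, ?, ?)", (token, filename, entry))
    catalog.commit()

def read(token):
    filename, entry = catalog.execute(
        "SELECT file, entry FROM objects WHERE token = ?", (token,)).fetchone()
    return pickle.loads(pathlib.Path(filename).read_bytes())[entry]

write("event-42/tracks", {"ntracks": 17}, "run001.dat", entry=0)
print(read("event-42/tracks"))   # -> {'ntracks': 17}
```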

LHC data
 40 million collisions per second
 After filtering, ~100 collisions of interest per second
 1–10 Megabytes of data digitised for each collision = a recording rate of 0.1–1 Gigabytes/sec
 The collisions recorded each year amount to ~15 Petabytes/year of data
 [Figure: the four LHC detectors – ALICE, ATLAS, CMS, LHCb]
 Units of scale:
  1 Megabyte (1 MB) – a digital photo
  1 Gigabyte (1 GB) = 1000 MB – a DVD movie
  1 Terabyte (1 TB) = 1000 GB – world annual book production
  1 Petabyte (1 PB) = 1000 TB – annual data production of one LHC experiment
  1 Exabyte (1 EB) = 1000 PB – world annual information production
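These figures can be checked with a quick back-of-the-envelope calculation. The sketch below uses the numbers quoted above plus an assumed effective running time of ~10^7 seconds per year (an assumption, not a figure from the slide):

```python
# Back-of-the-envelope check of the LHC data rates quoted above.
event_rate_hz = 100            # collisions of interest kept after filtering, per second
event_size_mb = (1, 10)        # 1-10 MB digitised per collision
seconds_per_year = 1e7         # assumed effective running time per year (~10^7 s)

rate_gb_s = tuple(event_rate_hz * s / 1000 for s in event_size_mb)
volume_pb_year = tuple(r * seconds_per_year / 1e6 for r in rate_gb_s)

print(f"recording rate : {rate_gb_s[0]:.1f}-{rate_gb_s[1]:.1f} GB/s")
print(f"annual volume  : {volume_pb_year[0]:.0f}-{volume_pb_year[1]:.0f} PB/year")
# -> roughly 0.1-1 GB/s and 1-10 PB/year of raw data, the same order of
#    magnitude as the ~15 PB/year quoted above.
```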

LHC data
 ~15 PetaBytes – about 20 million CDs – each year!
 [Figure: a CD stack holding 1 year of LHC data (~20 km) compared with Mt Blanc (4.8 km), Concorde's cruising altitude (15 km) and a high-altitude balloon (30 km)]
 Its analysis will need the computing power of ~100,000 of today's fastest PC processors!
 Where will the experiments store all of this data? And where will they find this computing power?
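The CD-stack comparison follows from simple arithmetic; the sketch below assumes a 700 MB capacity and ~1.2 mm thickness per disc (assumptions, not figures from the slide):

```python
# Sanity check of the CD-stack comparison (assumed: 700 MB per CD, ~1.2 mm per disc).
data_pb = 15
cd_capacity_mb = 700
cd_thickness_mm = 1.2

n_cds = data_pb * 1e9 / cd_capacity_mb          # 15 PB = 15e9 MB
stack_km = n_cds * cd_thickness_mm / 1e6        # mm -> km

print(f"{n_cds/1e6:.0f} million CDs, stack ~{stack_km:.0f} km high")
# -> about 21 million CDs and a stack of roughly 26 km; the slide's
#    "20 million CDs, ~20 km" corresponds to a slightly thinner ~1 mm per disc.
```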

The CERN Computing Centre
 ~2,000 processors
 ~100 TBytes of disk
 ~1 PB of magnetic tape
 Even with technology-driven improvements in performance and costs, CERN alone can provide nowhere near enough capacity for LHC!

Computing for LHC
 Solution: computing centres, which were isolated in the past, will be connected into a computing grid
  Europe: 267 institutes, 4603 users
  Elsewhere: 208 institutes, 1632 users
 – uniting the computing resources of particle physicists around the world!

LCG Regional Centres
 First-wave centres: CERN; Academia Sinica, Taiwan; Brookhaven National Lab; PIC Barcelona; CNAF Bologna; Fermilab; FZK Karlsruhe; IN2P3 Lyon; KFKI Budapest; Moscow State University; University of Prague; Rutherford Appleton Lab (UK); University of Tokyo
 Other centres: Caltech; GSI Darmstadt; Italian Tier 2s (Torino, Milano, Legnaro); JINR Dubna; Manno (Switzerland); NIKHEF Amsterdam; Ohio Supercomputing Centre; Sweden (NorduGrid); Tata Institute (India); TRIUMF (Canada); UCSD; UK Tier 2s; University of Florida, Gainesville; …
 Pilot production service: 2003 – 2005

Some of the Sources for Middleware & Tools used by LCG
 The Virtual Data Toolkit – VDT

Goals for LCG-1 – The Pilot Grid Service for LHC Experiments
 Be the principal service for Data Challenges in 2004
  initially focused on batch production work
  and later also interactive analysis
 Get experience in close collaboration between the Regional Centres
  learn how to maintain and operate a global grid
 Focus on a production-quality service
  robustness, fault-tolerance, predictability and supportability take precedence; additional functionality gets prioritised
 LCG should be integrated into the sites' mainline physics computing services – it should not be something apart
 This requires coordination between participating sites in:
  policies and collaborative agreements
  resource planning and scheduling
  operations and support

Elements of a Production LCG Service
 Middleware:
  integration, testing and certification
  packaging, configuration, distribution and site validation
  support – problem determination and resolution; feedback to middleware developers
 Operations:
  grid infrastructure services
  site fabrics run as production services
  operations centres – trouble and performance monitoring, problem resolution – 24×7 globally
 Support:
  experiment integration – ensure optimal use of the system
  user support – call centres/helpdesk – global coverage; documentation; training

LCG Service Status
 Certification and distribution process established
 Middleware package – components from:
  European DataGrid (EDG)
  US projects (Globus, Condor, PPDG, GriPhyN) – the Virtual Data Toolkit
 Agreement reached on principles for registration and security
 Rutherford Lab (UK) to provide the initial Grid Operations Centre
 FZK (Karlsruhe) to operate the Call Centre
 Pre-release middleware deployed in July to the initial 10 centres
 The "certified" release was made available to 13 centres on 1 September – Academia Sinica Taipei, BNL, CERN, CNAF, FNAL, FZK, IN2P3 Lyon, KFKI Budapest, Moscow State Univ., Prague, PIC Barcelona, RAL, Univ. Tokyo
 Next steps:
  get the experiments going
  expand to other centres

Preliminary full simulation and reconstruction tests with ALICE
 AliEn submitting work to the LCG service
 AliRoot fully reconstructed events
 CPU-intensive, RAM-demanding (up to 600 MB), long-lasting jobs (average 14 hours)
 Outcome: > 95% successful job submission, execution and output retrieval

LCG-1 at the Time of First Release
 Impressive improvement in stability w.r.t. the old 1.x EDG releases and corresponding testbeds
 Lots of room for further improvement
 Additional features to be added before the end of the year, in preparation for the data challenges of 2004
 As more centres join, scalability will surely become a major issue

Resources committed for 1Q04 – Resources in the Regional Centres
 Resources planned for the period of the data challenges in 2004
 CERN provides ~12% of the total capacity
 Numbers have to be refined – different standards are used by different countries
 Efficiency of use is a major question mark:
  reliability
  efficient scheduling
  sharing between Virtual Organisations (user groups)
 [Table: committed resources by country – CPU (kSI2K), Disk (TB), Support (FTE), Tape (TB) – for CERN, Czech Republic, France, Germany, Holland, Italy, Japan, Poland, Russia, Taiwan, Spain, Sweden, Switzerland, UK, USA, plus totals]
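The "sharing between Virtual Organisations" concern is essentially a fair-share scheduling problem. A minimal illustrative sketch follows – this is not LCG software, and the VO names, shares and usage figures are invented:

```python
# Toy fair-share scheduler: pick the next VO to serve based on how far each one
# is below its agreed share of the CPU delivered so far. Purely illustrative;
# the VOs, shares and usage figures are invented for the example.
agreed_share = {"alice": 0.2, "atlas": 0.35, "cms": 0.35, "lhcb": 0.1}
cpu_used_hours = {"alice": 120.0, "atlas": 400.0, "cms": 150.0, "lhcb": 30.0}

def next_vo():
    total = sum(cpu_used_hours.values()) or 1.0
    # deficit = agreed fraction minus the fraction actually consumed so far
    deficit = {vo: agreed_share[vo] - cpu_used_hours[vo] / total for vo in agreed_share}
    return max(deficit, key=deficit.get)

print(next_vo())   # -> "cms": currently furthest below its agreed 35% share
```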

From LCG-1 to LHC Startup

Where are we now with Grid Technology?
 For LHC we now understand the basic requirements for batch processing
  and we have a prototype solution developed by Globus and Condor in the US and the DataGrid and related projects in Europe
 It is more difficult than was expected – reliability, scalability, monitoring, operation, …
  and we are not yet seeing useful industrial products
 But we are ready to start re-engineering the components
  part of the large EGEE project proposal submitted to the EU
  a re-write of Globus using a web-services architecture is now available
 Many more practical problems will be discovered now that we start running a grid as a sustained round-the-clock service and the LHC experiments begin to use it for doing real work
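The core of the batch-processing requirement is matchmaking: describe what a job needs, compare it with what each site advertises, and send the job somewhere suitable. The toy sketch below illustrates that idea in the spirit of Condor-style matchmaking; it is not Globus, Condor or EDG code, and the sites, attributes and job are invented:

```python
# Toy resource broker: match a batch job's requirements against the resources
# each site advertises. Sites, attributes and the job are invented for illustration.
sites = {
    "cern": {"free_cpus": 40, "max_wallclock_h": 24, "vo": {"alice", "atlas", "cms", "lhcb"}},
    "ral":  {"free_cpus": 0,  "max_wallclock_h": 48, "vo": {"atlas", "lhcb"}},
    "fzk":  {"free_cpus": 15, "max_wallclock_h": 12, "vo": {"alice", "cms"}},
}

job = {"vo": "alice", "wallclock_h": 14, "cpus": 1}   # e.g. a long reconstruction job

def match(job, sites):
    """Return the names of sites that can run the job, most free CPUs first."""
    ok = [name for name, s in sites.items()
          if job["vo"] in s["vo"]
          and s["free_cpus"] >= job["cpus"]
          and s["max_wallclock_h"] >= job["wallclock_h"]]
    return sorted(ok, key=lambda n: sites[n]["free_cpus"], reverse=True)

print(match(job, sites))   # -> ['cern']  (fzk's queue is too short, ral has no free CPUs)
```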

Grid Middleware for LCG in the Longer Term
 Requirements:
  a second round of specification of the basic grid requirements is being completed now – HEPCAL II
  a team has started to specify the higher-level requirements for distributed analysis – batch and interactive – and to define the HEP-specific tools that will be needed
 For basic middleware the current strategy is to assume:
  that the US DoE/NSF will provide a well-supported Virtual Data Toolkit based on Globus Toolkit 3
  that the EGEE project, approved for EU 6th Framework funding, will develop the additional tools needed by LCG
 And the LCG Applications Area will develop higher-level HEP-specific functionality

ARDA – An Architectural Roadmap Towards Distributed Analysis
 Work in progress!

How close are we to LHC startup?
 [Timeline figure – milestones towards first data: agree spec. of initial service (LCG-1); open LCG-1; LCG-1 used for simulated event productions; stabilise, expand the service; develop operations centre, etc.; first data; physics computing service. Starter toolkit – components from VDT and EDG. (* TDR – technical design report)]

Timeline for the LCG computing service
 [Timeline figure – milestones: agree spec. of initial service (LCG-1); open LCG-1; LCG-1 used for simulated event productions; principal service for LHC data challenges – batch analysis and simulation; LCG-2 – upgraded middleware, management software; computing model TDRs (* TDR – technical design report); stabilise, expand the service; develop operations centre, etc.; first data; physics computing service]
 This is the full complement of the first-generation middleware from VDT/EDG, hardened to provide a stable, reliable service

Timeline for the LCG computing service
 [Timeline figure – as on the previous slide, now adding: validation of computing models; LCG-3 – full multi-tier prototype service, batch and interactive; testing, hardening of 2nd-generation middleware; TDR for the Phase 2 grid]
 At this point we expect to start deploying second-generation middleware components, building this up during the year to attain the full base functionality required for LHC startup

Timeline for the LCG computing service
 [Timeline figure – as on the previous slides, now adding: acquisition, installation and commissioning of the Phase 2 service (for LHC startup); experiment setup & preparation; Phase 2 service in production]
 At CERN the acquisition process will have started already during 2004 with a market survey

Evolution of the Base Technology
 These are still very early days – with very few grids providing a reliable, round-the-clock "production" service
  and few applications that are managing gigantic distributed databases
 Although the basic ideas and tools have been around for a long time, we are only now seeing them applied to large-scale services
 Developing the grid concept continues to attract substantial interest and public funding
 There are major changes taking place in architectures and frameworks –
  e.g. the Open Grid Services Architecture and Infrastructure (OGSA, OGSI)
  and there will be more to come as experience grows
 There is a lot of commercial interest from potential software suppliers (IBM, HP, Microsoft, ..) – but no clear sight of useful products

Adapting to the changing landscape
 In the short term there will be many grids and several middleware implementations – for LCG, inter-operability will be a major headache
  Will we all agree on a common set of tools? – unlikely!
  Or will we have to operate a grid of grids – some sort of federation?
  Or will computing centres be part of several grids?
 The Global Grid Forum promises to provide a mechanism for evolving the architecture and agreeing on standards – but this is a long-term process
 In the medium term, until there is substantial practical experience with different architectures and different implementations, de facto standards will emerge
  How quickly will we recognise the winners?
  Will we have become too attached to our own developments to change?

Access Rights and Security
 The grid idea assumes global authentication, and authorisation based on the user's role in his virtual organisation – one set of credentials for everything you do
 The agreement for LHC is that all members of a physics collaboration will have access to all of its resources – the political implications of this have still to be tested!
 Could be an attractive target for hackers
 This is probably the greatest risk that we take by adopting the grid model for LHC computing
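A minimal sketch of "one set of credentials, with authorisation based on the user's role in a virtual organisation". This is purely illustrative – the VOs, roles and policy table are invented, and it is not the actual LCG security model:

```python
# Toy model of VO-based authorisation: a site grants rights according to the
# user's virtual organisation and role, not per-user accounts.
# VOs, roles and the policy table are invented for illustration.
from dataclasses import dataclass

@dataclass
class Credential:
    subject: str      # identity from the user's single grid certificate
    vo: str           # virtual organisation, e.g. an LHC experiment
    role: str         # role within that VO

# What this (hypothetical) site allows each VO role to do.
site_policy = {
    ("alice", "production"): {"submit_job", "write_storage"},
    ("alice", "analysis"):   {"submit_job", "read_storage"},
    ("atlas", "analysis"):   {"submit_job", "read_storage"},
}

def authorise(cred: Credential, action: str) -> bool:
    return action in site_policy.get((cred.vo, cred.role), set())

user = Credential("/O=Grid/CN=Some Physicist", vo="alice", role="analysis")
print(authorise(user, "submit_job"))     # True  - any ALICE analysis member may submit
print(authorise(user, "write_storage"))  # False - writing is reserved for production roles
```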

Key LCG goals for the next 12 months
 Take-up by the experiments of the first versions of the common applications – starts NOW
 Evolve the LCG-1 service into a production facility for the LHC experiments – validated in the data challenges
 Establish the requirements and a credible implementation plan for baseline distributed grid analysis:
  the model
  HEP-specific tools
  base grid technology – middleware to support the computing models of the experiments (Technical Design Reports due end 2004)

Summary
 The LCG Project has the clear goal of providing the environment and services for recording and analysing the LHC data when the accelerator starts operation in 2007
 The computational requirements of the LHC dictate a geographically distributed solution, taking maximum advantage of the facilities available to LHC around the world – a computational GRID
 A pilot service – LCG-1 – has been opened to learn how to use this technology to provide a reliable, efficient service encompassing many independent computing centres
 It is already clear that the current middleware will have to be re-engineered or replaced to achieve the goals of reliability and scalability
 In the medium term we expect to get this new middleware from EU- and US-funded projects
 But the technology is evolving rapidly, and LCG will have to adapt to a changing environment
  while keeping a strong focus on providing a continuing service for the LHC Collaborations