The LHC Computing Grid Project Status & Plans


1 The LHC Computing Grid Project Status & Plans
LCG CERN-US Co-operation Committee, 18 June 2004. Jürgen Knobloch, IT Department, CERN. This file is available at:

2 The LHC Computing Grid Project - LCG
Collaboration: the LHC experiments; Grid projects in Europe and the US; regional and national centres worldwide.
Choices: adopt Grid technology; go for a “Tier” hierarchy; use Intel CPUs in standard PCs; use the Linux operating system.
Goal: prepare and deploy the computing environment for the analysis of data from the LHC detectors.
[Diagram: the tier hierarchy – CERN Tier-0 and Tier-1 at the centre; Tier-1 centres in the UK, USA (FNAL, BNL), France, Italy, Germany and Japan; Tier-2 centres, Tier-3 physics-department resources and desktops at labs and universities; overlapping grids for a regional group and for a physics study group.]
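As an illustrative aside, the tier hierarchy in the diagram can be written down as a simple structure. This is only a sketch: the site names are the examples that appear on the slide, and the role comments are summarised from the rest of this presentation.

# Illustrative sketch of the tier hierarchy (example sites from the slide diagram).
TIER_MODEL = {
    "Tier-0": ["CERN"],                                   # data recording, first-pass reconstruction
    "Tier-1": ["FNAL (USA)", "BNL (USA)", "UK", "France",
               "Italy", "Germany", "Japan"],              # large regional/national centres
    "Tier-2": ["regional and university centres"],        # e.g. grids for regional groups
    "Tier-3": ["physics department clusters", "desktops"],
}

for tier, examples in TIER_MODEL.items():
    print(f"{tier}: {', '.join(examples)}")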

3 Operational Management of the Project
Applications Area: development environment, joint projects, data management, distributed analysis.
Middleware Area – now EGEE: provision of a base set of grid middleware – acquisition, development, integration, testing, support.
ARDA: A Realisation of Distributed Analysis for LHC.
CERN Fabric Area: large cluster management, data recording, cluster technology, networking, computing service at CERN.
Grid Deployment Area: establishing and managing the Grid Service – middleware certification, security, operations, registration, authorisation, accounting.
Joint with a new European project, EGEE (Enabling Grids for e-Science in Europe): Phase 1 – development of common software, prototyping and operation of a pilot computing service; Phase 2 – acquire, build and operate the LHC computing service.

4 Sites in LCG-2/EGEE-0: June 18 2004
Austria: U-Innsbruck. Canada: Triumf, Alberta, Carleton, Montreal, Toronto. Czech Republic: Prague-FZU, Prague-CESNET. France: CC-IN2P3, Clermont-Ferrand. Germany: FZK, Aachen, DESY, GSI, Karlsruhe, Wuppertal. Greece: HellasGrid. Hungary: Budapest. India: TIFR. Israel: Tel-Aviv, Weizmann. Italy: CNAF, Frascati, Legnaro, Milano, Napoli, Roma, Torino. Japan: Tokyo. Netherlands: NIKHEF, SARA. Pakistan: NCP. Poland: Krakow. Portugal: LIP. Russia: SINP-Moscow, JINR-Dubna. Spain: PIC, UAM, USC, UB-Barcelona, IFCA, CIEMAT, IFIC. Switzerland: CERN, CSCS. Taiwan: ASCC, IPAS, NCU. UK: RAL, Birmingham, Cavendish, Glasgow, Imperial, Lancaster, Manchester, QMUL, RAL-PP, Sheffield, UCL. US: BNL, FNAL. HP: Puerto-Rico.
22 countries, 62 sites (48 Europe, 2 US, 5 Canada, 6 Asia, 1 HP). Coming: New Zealand, China, other HP sites (Brazil, Singapore). 4000 CPUs.

5 LCG-2 sites – status 17 June 2004

6 LCG Service Status
Certification and distribution process established.
Middleware package – components from the European DataGrid (EDG) and the US (Globus, Condor, PPDG, GriPhyN) → the Virtual Data Toolkit.
Principles for registration and security agreed.
Grid Operations Centre at Rutherford Lab (UK); a second centre is coming online at Academia Sinica in Taipei. Call Centre at FZK (Karlsruhe).
LCG-2 software released February 2004; 62 centres connected with more than 4000 processors; four collaborations run data challenges on the grid.
Status on 24 July 2003 – I will send an update on 11 August. There are two activities going on in parallel:
-- Pre-production software is being deployed to 10 Regional Centres (Academia Sinica Taipei, BNL, CERN, CNAF, FNAL, FZK, IN2P3 Lyon, Moscow State Univ., RAL, Univ. Tokyo). This is a final validation of the distribution and installation process and tools, and establishes the network of service support people, but the Grid middleware is not the final version yet. About half of the sites have successfully installed this – a few (including BNL and Lyon) are very slow.
-- Final testing is under way of the first production set of Grid middleware. This is being done by a group of grid experts from the LHC experiments (the “loose cannons”). There are a few “show stoppers” and about 15 very serious bugs. The target is to install the production software by the end of the month – one week from now!

7 Data challenges
The 4 LHC experiments currently run data challenges using the LHC computing grid.
Part 1 (now): world-wide production of simulated data – job submission, resource allocation and monitoring; catalog of distributed data.
Part 2 (summer ’04): test of Tier-0 operation – continuous (24 x 7) recording of data at up to 450 MB/s per experiment (target for ALICE in 2005: 750 MB/s); first-pass data reconstruction and analysis; distribution of data in real time to Tier-1 centres.
Part 3 (fall ’04): distributed analysis on the Grid – access to data from anywhere in the world, in an organized as well as in a “chaotic” access pattern.
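A rough back-of-the-envelope check of what the Part 2 recording target implies, as a minimal sketch (assuming decimal units, 1 TB = 10^12 bytes, and uninterrupted 24 x 7 running):

# Illustrative only: daily data volume implied by a sustained Tier-0 recording rate.
SECONDS_PER_DAY = 24 * 3600

def daily_volume_tb(rate_mb_per_s):
    """Daily data volume in TB for a sustained recording rate given in MB/s."""
    return rate_mb_per_s * SECONDS_PER_DAY / 1e6   # MB -> TB

print(daily_volume_tb(450))   # ~38.9 TB/day per experiment at 450 MB/s
print(daily_volume_tb(750))   # ~64.8 TB/day at the 2005 ALICE target of 750 MB/s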

8 Grid operation After 3 months of intensive use the basic middleware of LCG-2 is proving to be reliable – although there are many issues of functionality, scaling, performance, scheduling The grid deployment process is working well – Integration – certification – debugging Installation and validation of new sites Constructive and useful feedback from first data challenges – especially on data issues from CMS Interoperability with Grid3 in the US being studied (presentation by FNAL at Grid Deployment Board in May) Implementation of a common interface to Mass Storage completed or in plan at all Tier-1 Centres Proposal for convergence on a single Linux flavour – based on the FNAL Scientific Linux package

9 Service Challenges
Exercise the operations and support infrastructure: gain experience in service management; uncover problems with long-term operation.
Focus on data management, batch production and analysis: reliable data transfer; integration of high-bandwidth networking. Also: massive job submission, and testing the reaction to security incidents.
Target by end 2004 – robust and reliable data management services in continuous operation between CERN, Tier-1 and large Tier-2 centres; sufficient experience with sustained high-performance data transfer to guide wide-area network planning.
The Service Challenges are a complement to the experiment Data Challenges.

10 Networking
Key element of the LHC computing strategy.
Aiming at sustained 500 MB/s by end 2004 – requiring 10 Gb/s networks to some Tier-1 centres, based on existing facilities.
Transfer records: 5.44 Gb/s (1.1 TB in 30 min.); 6.25 Gb/s on 20 April 04.
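A quick consistency check of the figures quoted above, as a minimal sketch (assuming decimal units: 1 TB = 10^12 bytes, 1 Gb = 10^9 bits):

# Illustrative only: relate the quoted transfer records and targets to link speeds.
def transfer_minutes(terabytes, gbps):
    """Minutes needed to move a given volume at a given sustained rate."""
    return terabytes * 1e12 * 8 / (gbps * 1e9) / 60

print(transfer_minutes(1.1, 5.44))   # ~27 minutes -- consistent with "1.1 TB in 30 min." at 5.44 Gb/s
print(500e6 * 8 / 1e9)               # sustained 500 MB/s is 4 Gb/s, hence the 10 Gb/s
                                     # links needed to some Tier-1 centres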

11 LCG-2 and Next Generation Middleware
[Diagram: 2004–2005 timeline showing LCG-2 as the production service evolving into LCG-3, with next-generation middleware prototyping alongside.]
LCG-2 – focus on production and large-scale data handling: the service for the 2004 data challenges; provides experience of operating and managing a global grid service; strong development programme driven by data-challenge experience; evolves to LCG-3 as components are progressively replaced with new middleware.
Next-generation middleware – focus on analysis: developed by the EGEE project in collaboration with VDT (US); LHC applications and users closely involved in prototyping and development (ARDA project); short development cycles; completed components integrated into LCG-2.

12 US participation - some examples
US LHC funds Miron Livny on ARDA/EGEE middleware.
US CMS is working on end-to-end ARDA prototypes for the fall; US CMS will participate in the CCS milestone for analysis across grids.
NSF funds the VDT/EGEE joint effort; Wisconsin certification testbeds for national NMI testing, VDT, LCG and EGEE middleware, etc. encourage coherence across the middleware versions.
PPDG helps fund the ATLAS ARDA representative; the PPDG extension proposal is funded, enabling some immediate attention to Open Science Grid needs.
The Open Science Grid “Technical Groups” scope explicitly includes cooperation with the EGEE/LCG peer groups.
A single Linux flavour – based on the FNAL Scientific Linux package.

13 Phase 2 Planning Outline
June 2004 – Establish editorial board for LCG TDR.
September 2004 – Consolidated Tier-1/Tier-2 Regional Centre plan, as background for the draft MoU at the October C-RRB: revised version of the basic computing models; revised estimates of overall Tier-1 and Tier-2 resources; current state of commitments of resources, Regional Centres ↔ experiments; high-level plan for ramping up the Tier-1 and large Tier-2 centres.
October 2004 C-RRB – Draft MoU.
End 2004 – Initial computing models agreed by the experiments.
April 2005 C-RRB – Final MoU.
End June 2005 – TDR.

14 Planning groups
High-level planning group – chair Les Robertson: overall picture and implied resources at centres; representatives from the experiments and LCG management; representatives from three regional centres – Italy, Spain, US.
Grid Deployment Area steering group – chair Ian Bird: interoperability, resources, overview; Service Challenges; representatives from regional centres.
Service challenge group – chair Bernd Panzer: carrying out the service challenges; people working on service challenges.
Networking – chair David Foster: working towards the required end-to-end bandwidth.
MoU taskforce – chair David Jacobs: draft LCG MoU and one for each of the four experiments; representatives from experiments, six countries, LCG management.
TDR editorial board – chair Jürgen Knobloch.

15 102 FTE-years missing for Phase 2
[Chart: LCG staffing for the common activities at CERN – staffing level in the CERN budget plus external funding (LCG Phase 1, EGEE Phase 1, EGEE Phase 2 assumed) compared with the staffing required for the planned service level; 102 FTE-years are missing for Phase 2.]

16 Progressive evolution
Preparing for 2007.
2003 demonstrated event production. In 2004 we must show that we can also handle the data – even if the computing model is very simple. This is a key goal of the Data Challenges and Service Challenges.
Target for end of this year: basic model demonstrated using current grid middleware; all Tier-1s and ~25% of Tier-2s operating a reliable service; security model validated, storage model understood; clear idea of the performance, scaling, and management issues.
[Timeline 2004–2007: demonstrate core data handling and batch analysis; decisions on final core middleware; initial service in operation; installation and commissioning; first data in 2007.]

17 Running on several Grids
Aiming for common interfaces and interoperability

18 Conclusions
We are now in a position to move from an operational environment towards a computing system for LHC by completing functionality, improving reliability, and ramping up capacity and performance.
Data challenges and service challenges are essential to keep us on the right track.
All partners are committed to arriving at interoperability of the grids involved.
We are grateful for the very significant contributions from, and the effective collaboration with, the US.
The full funding for LCG Phase 2 still needs to be secured.

