Data Processing and the LHC Computing Grid (LCG) Jamie Shiers Database Group, IT Division CERN, Geneva, Switzerland


1 Data Processing and the LHC Computing Grid (LCG) Jamie Shiers [Jamie.Shiers@cern.ch] Database Group, IT Division CERN, Geneva, Switzerland http://cern.ch/db/

2 Overview
- Brief overview / recap of the LHC (emphasis on data)
- The LHC Computing Grid
- The importance of the Grid
- The role of the Database Group (IT-DB)
- Summary

3 LHC Overview

4 CERN – European Organization for Nuclear Research

5 The LHC Machine

6 CMS Data Rates
- 1 PB/s from the detector
- 100 MB/s – 1.5 GB/s to 'disk'
- 5–10 PB growth per year
- ~3 GB/s per PB of data
Data processing: ~100,000 of today's fastest PCs.
Trigger chain: collisions at 40 MHz (1000 TB/s) → Level 1: 75 kHz (75 GB/s) → Level 2: 5 kHz (5 GB/s) → Level 3: 100 Hz (100 MB/s) → data recording & offline analysis.
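The trigger chain on this slide is a cascade of rejection steps; a short sketch of the arithmetic (rates are from the slide, the reduction factors and event size are derived from them):

```python
# Event rates at each stage of the CMS trigger chain (figures from the slide).
chain = [
    ("collisions", 40e6),   # 40 MHz off the detector
    ("Level 1",    75e3),   # 75 kHz
    ("Level 2",     5e3),   # 5 kHz
    ("Level 3",     100),   # 100 Hz, written to storage
]

# Each stage keeps only a fraction of what the previous one passed on.
for (prev, r_in), (stage, r_out) in zip(chain, chain[1:]):
    print(f"{stage}: keeps 1 in {r_in / r_out:,.0f} events from {prev}")

overall = chain[0][1] / chain[-1][1]
print(f"overall: 1 event recorded per {overall:,.0f} collisions")  # 400,000

# At 100 Hz and 100 MB/s, each recorded event is about 1 MB.
event_size_mb = 100e6 / 100 / 1e6
print(f"recorded event size ~ {event_size_mb:.0f} MB")
```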

7 LHC: Higgs Decay into 4 Muons
Selectivity: 1 in 10^13
- 1 person in a thousand world populations
- a needle in 20 million haystacks
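The analogies above can be sanity-checked in a few lines (the world-population figure and the implied straws-per-haystack count are illustrative assumptions, not from the slide):

```python
# Sanity-check the "1 in 10^13" selectivity analogies.
selectivity = 1e13            # interesting events: ~1 in 10^13 collisions

world_pop = 6e9               # assumed world population, circa the early 2000s
people = 1000 * world_pop     # "1 person in a thousand world populations"
print(f"1000 world populations ~ {people:.0e} people")  # same order as 1e13

haystacks = 20e6              # "a needle in 20 million haystacks"
straws = selectivity / haystacks
print(f"implied straws per haystack: {straws:,.0f}")    # 500,000
```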

8 Event Data Tiers
- RAW: 1 PB/yr (1 PB/s prior to reduction!) – sequential access
- ESD: 100 TB/yr
- AOD: 10 TB/yr
- TAG: 1 TB/yr – random access
Data flows from Tier0 through Tier1 out to the users.
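Each derived tier in the pyramid above is roughly an order of magnitude smaller than the one it is produced from; a sketch of the reduction (annual volumes taken from the slide):

```python
# Annual data volume per tier (from the slide), in TB/year.
tiers = [("RAW", 1000), ("ESD", 100), ("AOD", 10), ("TAG", 1)]

# Each derivation step shrinks the data by about a factor of 10.
for (name_a, vol_a), (name_b, vol_b) in zip(tiers, tiers[1:]):
    print(f"{name_a} -> {name_b}: {vol_a // vol_b}x reduction")

total = sum(vol for _, vol in tiers)
print(f"total volume across all tiers: {total} TB/year")  # 1111 TB/year
```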

9 LHC Data Grid Hierarchy
- Online system → Tier 0+1 (CERN): 700k SI95, ~1 PB disk, tape robot; ~PB/s from the experiment, ~100–400 MB/s recorded
- Tier 1 centres (FNAL: 200k SI95, 600 TB; IN2P3; INFN; RAL): ~2.5 Gb/s links
- Tier 2 centres (~0.25 TIPS each): ~2.5 Gb/s links
- Tier 3 (institutes): 100–1000 Mb/s
- Tier 4: workstations with a physics data cache
CERN/outside resource ratio ~1:2; Tier0 : (Σ Tier1) : (Σ Tier2) ~ 1:1:1.

10 HEP Data Analysis
- Physicists work on analysis "channels": finding collisions with similar features
- Physics is extracted by collective, iterative discovery – small groups of professors and students
- Each institute has ~10 physicists working on one or more channels
- Order of 1000 physicists in 100 institutes in 10 countries

11 LHC Computing Characteristics
- Perfect parallelism: independent events (collisions)
- Bulk of the data is read-only, in conventional files; new versions rather than updates
- Meta-data (a few %) in databases
⇒ very large aggregate requirements: computation, data, I/O
⇒ chaotic workload – unpredictable demand and data access patterns; no limit to the requirements
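The "perfect parallelism" point is what makes this workload grid-friendly: because events are independent and read-only, they can be farmed out to workers with no coordination or locking. A minimal sketch of that pattern (the toy event data and `reconstruct` function are invented placeholders; a real grid dispatches whole jobs to remote sites rather than threads on one machine):

```python
from concurrent.futures import ThreadPoolExecutor

def reconstruct(event):
    # Placeholder per-event analysis: each event is self-contained and
    # read-only, so workers never share state or coordinate.
    return sum(event)

def process_events(events, workers=4):
    # Embarrassingly parallel map over independent collision events;
    # map() preserves input order in the results.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(reconstruct, events))

events = [[i, i + 1] for i in range(1000)]  # toy stand-in for event records
results = process_events(events)
print(len(results))  # 1000
```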

12 The LHC Computing Grid (LCG) http://cern.ch/LCG/

13 From Mainframes to the Grid


15 The Grid Vision
The Grid connects computing resources, data, knowledge, instruments, and people, turning a complex problem into a solution.

16 And Reality… (diagram: an individual physicist works across many sites – Lab a, Uni a, Lab b, Uni b, … – and national Tier 1 centres in Germany, the USA, the UK, France, Italy and Japan, with the CERN Tier 0 and Tier 1 at the centre) [les.robertson@cern.ch]

17 The Promise of Grid Technology (diagram: the LHC computing centre – CERN Tier 0 and CERN Tier 1 – linked to Tier 1 centres in Germany, the USA, the UK, France, Italy, Spain and Japan; Tier 2 labs and universities – Lab a, Uni a, …; Tier 3 physics departments and desktops; the opportunity of Grid technology: grids for physics study groups (ATLAS, CMS, LHCb) and for regional groups) [les.robertson@cern.ch]

18 Virtual Computing Centre
The user sees the image of a single cluster and does not need to know:
- where the data is
- where the processing capacity is
- how things are interconnected
- the details of the different hardware
and is not concerned by the conflicting policies of the equipment owners and managers.

19 The LHC Computing Grid Project
Goal – prepare and deploy the LHC computing environment:
- applications support – develop and support the common tools, frameworks, and environment needed by the physics applications
- computing system – build and operate a global data analysis environment, integrating large local computing fabrics and high-bandwidth networks, to provide a service for ~6,000 researchers in ~40 countries
This is not yet another grid technology project – it is a grid deployment project.

20 The LHC Computing Grid Project – Two Phases
- Phase 1 (2002–05): development and prototyping; operate a 50% prototype of the facility needed by one of the larger experiments
- Phase 2 (2006–08): installation and operation of the full worldwide initial production Grid for all four experiments

21 Leveraging Other Grid Projects
- US projects, European projects
- Many national and regional Grid projects – GridPP (UK), INFN-Grid (Italy), NorduGrid, CrossGrid, …
- Significant R&D funding for Grid middleware ⇒ scope for divergence
- Global grids need standards – the trick will be to recognise, and be willing to migrate to, the winning solutions

22 LHC Computing Grid Project – The First Milestone
Within one year, deploy a Global Grid Service:
- sustained 24 x 7 service, including sites from three continents
- identical or compatible Grid middleware and infrastructure
- several times the capacity of the CERN facility, and as easy to use
Having stabilised this base service – progressive evolution:
- number of nodes, performance, capacity and quality of service
- integrate new middleware functionality
- migrate to de facto standards as soon as they emerge

23 LCG Production
- CMS is preparing for a distributed data challenge, starting Q3 2003 and ending Q1 2004: Tier0 (CERN), 2–3 Tier1s, 5–10 Tier2s; total data volume ~100 TB
- Need to be production-ready with the Grid computing system and applications by 1 July 2003

24 The Importance of the Grid

25 Birth of the Web
- Original proposal: 1989
- 1992 – explosion inside HEP
- 1993 – explosion across the world, largely due to the NCSA Mosaic browser
- Now totally ubiquitous: every firm must have a website!


27 US Grid Projects
NASA Information Power Grid, DOE Science Grid, NSF National Virtual Observatory, NSF GriPhyN, DOE Particle Physics Data Grid, NSF TeraGrid, DOE ASCI Grid, DOE Earth Systems Grid, DARPA CoABS Grid, NEESGrid, DOH BIRN, NSF iVDGL

28 European Grid Projects
- UK e-Science Grid
- Netherlands – VLAM, PolderGrid
- Germany – UNICORE, Grid proposal
- France – Grid funding approved
- Italy – INFN Grid
- Ireland – Grid proposals
- Switzerland – network/Grid proposal
- Hungary – DemoGrid, Grid proposal
- Nordic Grid, …
- Spain:

29 EU Grid Projects
- DataGrid (CERN, …)
- EuroGrid (UNICORE)
- DataTag (TTT…)
- Astrophysical Virtual Observatory
- GRIP (Globus/UNICORE)
- GRIA (industrial applications)
- GridLab (Cactus Toolkit)
- CrossGrid (infrastructure components)
- EGSO (solar physics)

30 IBM and the Grid
Interview with Irving Wladawsky-Berger:
- "Grid computing is a set of research management services that sit on top of the OS to link different systems together"
- "We will work with the Globus community to build this layer of software to help share resources"
- "All of our systems will be enabled to work with the grid, and all of our middleware will integrate with the software"

31 Industrial Engagement?
- "Grid Computing is one of the three next big things for Sun and our customers" – Ed Zander, COO, Sun
- "The alignment of OGSA with XML Web services is important because it will make Internet-scale, distributed Grid Computing possible" – Robert Wahbe, General Manager of Web Services, Microsoft
- Oracle: starting Grid activities…

32 HP and Grids
The Grid fabric will be:
- soft – share everything, failure-tolerant
- dynamic – resources will constantly come and go; no steady state, ever
- federated – a global structure not owned by any single authority
- heterogeneous – from supercomputer clusters to P2P PCs
John Manley, HP Labs

33 The Role of the Database Group

34 CERN-IT-DB Provides…
- Database infrastructure for the CERN laboratory – currently based on Oracle, for all sectors
- Applications support in certain key areas – Oracle Application Server, the Engineering Data Management Service
- Physics data management support – services for the LHC experiments, applications, …
- Grid data management – European DataGrid WP2 (Data Management) and the corresponding LCG services; the LCG persistency project: POOL

35 Los Endos

