title.open(); revolution {execute};
LHC Computing Challenge Methodology? Hierarchical Information in a Global Grid
Tony Doyle, University of Glasgow (UK)

Presentation transcript:

Slide 1: title.open(); revolution {execute};
LHC Computing Challenge Methodology? Hierarchical Information in a Global Grid
[Word art: Aspiration? Supernet. Aspiration? HIGGS, DataGRID-UK. ALL: data-intensive computation, teamwork.]

Slide 2: Outline
- Starting Point
- The LHC Computing Challenge
- Data Hierarchy
- DataGRID
- Analysis Architectures
- GRID Data Management
- Industrial Partnership
- Regional Centres
- Today's World
- Tomorrow's World
- Summary

Slide 3: Starting Point

Slide 4: Starting Point
Current technology cannot scale to data volumes of this size, which is where the teams at Glasgow and Edinburgh Universities come in. The funding awarded will enable the scientists to prototype a Scottish Computing Centre that could develop the computing technology and infrastructure needed to cope with the high volumes of data produced in Geneva, allowing the data to be processed, transported, stored and mined. Once scaled down, the data will be distributed for analysis by thousands of scientists around the world.
The project will involve participation from Glasgow University's Physics & Astronomy and Computing Science departments, Edinburgh University's Physics & Astronomy department and the Edinburgh Parallel Computing Centre, and is funded by the Scottish Higher Education Funding Council's (SHEFC) Joint Research Equipment Initiative. It is hoped that the computing technology developed during the project will have wider applications in the future, with possible uses in astronomy, computing science and genomics, as well as providing generic technology and software for the next-generation Internet.

Slide 5: The LHC Computing Challenge
[Images: detector for the ALICE experiment; detector for the LHCb experiment.]

Slide 6: A Physics Event
- Gated electronics response from a proton-proton collision.
- Raw data: hit addresses, digitally converted charges and times.
- Marked by a unique code: proton bunch-crossing number, RF bucket, event number.
- Collected, processed, analyzed, archived...
- A variety of data objects become associated with the event as it migrates through the analysis chain: it may be reprocessed, selected for various analyses, and replicated to various locations.
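The "unique code" above suggests a small identifier type. A minimal sketch in Python; the field names and key format are illustrative assumptions, not the ATLAS event model:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class EventID:
    """Unique code attached to each gated detector readout."""
    bunch_crossing: int   # proton bunch-crossing number
    rf_bucket: int        # RF bucket within the crossing
    event_number: int     # sequential event number from the DAQ

    def key(self) -> str:
        # A stable string key, e.g. for catalogue or replica lookups.
        return f"{self.bunch_crossing}/{self.rf_bucket}/{self.event_number}"

evt = EventID(bunch_crossing=31415, rf_bucket=7, event_number=123456)
print(evt.key())  # 31415/7/123456
```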

Slide 7: LHC Computing Model
- Hierarchical, distributed tiers: Tier-0 (CERN) -> Tier-1 (e.g. RAL) -> Tier-2 (e.g. ScotGRID) -> universities.
- The GRID ties the distributed resources together over dedicated or QoS network links.

Slide 8: Data Structure
Coordination is required at collaboration and group levels.
[Data-flow diagram: Trigger System -> Level-3 trigger -> Data Acquisition -> Raw Data -> Reconstruction -> Event Summary Data (ESD) -> Event Tags, with Trigger Tags, Calibration Data and Run Conditions feeding in. Monte Carlo chain: Physics Models -> Monte Carlo Truth Data -> Detector Simulation -> MC Raw Data -> Reconstruction -> MC Event Summary Data -> MC Event Tags.]

Slide 9: Physics Analysis
[Analysis-chain diagram, with data flow increasing down the chain:
- Raw Data and ESD (data or Monte Carlo): Tier 0/1, collaboration-wide.
- Event Tags and event selection -> Analysis Object Data (AOD): Tier 2, analysis groups.
- Analysis and skims, using Calibration Data -> Physics Objects: Tier 3/4, individual physicists.]

Slide 10: ATLAS Parameters
Running conditions at startup:
- Raw event size ~2 MB (recently revised upwards...)
- 2.7 x 10^9 event sample => 5.4 PB/year, before data processing
- Reconstructed events plus Monte Carlo data: ~9 PB/year (2 PB on disk)
- CPU: ~2M SpecInt95
- CERN alone can handle only about 1/3 of these resources.
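A quick back-of-envelope check that the slide's volume figure follows from its event size and sample (a sketch, not the official computing-model estimate):

```python
raw_event_size_mb = 2.0     # ~2 MB per raw event
events_per_year = 2.7e9     # 2.7 x 10^9 event sample

raw_volume_pb = raw_event_size_mb * events_per_year / 1e9  # MB -> PB
print(f"raw data: {raw_volume_pb:.1f} PB/year")  # 5.4 PB/year, matching the slide
```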

Slide 11: Data Hierarchy (RAW, ESD, AOD, TAG)
RAW (~2 MB/event): recorded by the DAQ; triggered events; detector digitisation.
ESD (~100 kB/event): reconstructed, pseudo-physical information: clusters, track candidates (electrons, muons), etc.
AOD (~10 kB/event): selected physical information: transverse momentum, association of particles, jets, (best) particle ID; physical info for relevant objects.
TAG (~1 kB/event): analysis information relevant for fast event selection.
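To see how sharply the hierarchy compresses, a short sketch scaling each tier's per-event size (from this slide) by the 2.7 x 10^9-event annual sample from slide 10; the yearly totals are illustrative:

```python
EVENTS_PER_YEAR = 2.7e9
SIZE_PER_EVENT_KB = {"RAW": 2000, "ESD": 100, "AOD": 10, "TAG": 1}

for tier, kb in SIZE_PER_EVENT_KB.items():
    total_tb = kb * EVENTS_PER_YEAR / 1e9  # kB -> TB
    print(f"{tier:>3}: {kb:>5} kB/event -> {total_tb:>7.1f} TB/year")
# RAW works out to 5400 TB (5.4 PB) while TAG is just 2.7 TB --
# which is why fast event selection runs over TAGs, not RAW data.
```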

Slide 12: Testbed Database
Object model: ATLAS simulated raw events.
- PEvent -> PEventObjVector -> PEventObj.
- PEventObj subtypes: PSiDetector/PSiDigit, PTRT_Detector/PTRT_Digit, PMDT_Detector/PMDT_Digit, PCaloRegion/PCaloDigit, PTruthVertex/PTruthTrack.
- Storage layout: a System DB plus Raw Data DB1, Raw Data DB2, ...; the event container holds PEvent #1, #2, ..., each with a PEventObjVector pointing into the raw-data containers.
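A minimal Python sketch of the containment hierarchy the slide draws. Class names follow the slide; the attributes and the digit fields are assumptions for illustration, not the actual testbed schema:

```python
class PEventObj:
    """Base class for persistent event objects."""

class PSiDigit:
    def __init__(self, address, charge, time):
        self.address, self.charge, self.time = address, charge, time

class PSiDetector(PEventObj):
    def __init__(self):
        self.digits = []          # PSiDigit entries for this detector

class PEvent:
    """One simulated raw event: a vector of persistent objects."""
    def __init__(self, number):
        self.number = number
        self.obj_vector = []      # the PEventObjVector of the slide

    def add(self, obj: PEventObj):
        self.obj_vector.append(obj)

# Build event #1 with one silicon detector holding one digit.
si = PSiDetector()
si.digits.append(PSiDigit(address=0x1A2B, charge=412, time=17))
evt = PEvent(number=1)
evt.add(si)
print(len(evt.obj_vector))  # 1
```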

Slide 13: LHC Computing Challenge
[Tier diagram:
- Online system: one bunch crossing per 25 ns; 100 triggers per second; each event ~1 MB; ~PB/sec of raw detector data in, ~100 MB/sec out to the offline farm.
- Tier 0: offline farm ~20 TIPS; CERN Computer Centre >20 TIPS; ~Gbit/sec links, or air freight, outward.
- Tier 1: regional centres (RAL, US, French, Italian), connected at ~Gbit/sec.
- Tier 2: centres of ~1 TIPS each, e.g. ScotGRID++.
- Tier 3: institute servers (~0.25 TIPS); each institute has ~10 physicists working on one or more analysis channels; data for these channels should be cached by the institute server (physics data cache).
- Tier 4: physicists' workstations, 100-1000 Mbit/sec.
1 TIPS = 25,000 SpecInt95; a PC (1999) = ~15 SpecInt95.]

Slide 14: Database Access Benchmark
Many applications require database functionality. Example: the MySQL database daemon, the currently favoured HEP database application (e.g. in BaBar and ZEUS software), exercised with the basic 'crash-me' and associated tests: access times for basic insert, modify, delete and update operations.
On a 256 MB, 800 MHz Red Hat 6.2 Linux box:
- 350k data insert operations: 149 seconds
- 10k query operations: 97 seconds
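A sketch of how such insert/query timings can be taken. Python's built-in sqlite3 is used as a stand-in so the example runs anywhere; the slide's 149 s and 97 s figures were measured against a MySQL daemon, so absolute numbers will differ:

```python
import sqlite3
import time

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE hits (id INTEGER PRIMARY KEY, charge REAL)")

N_INSERT, N_QUERY = 350_000, 10_000  # operation counts from the slide

t0 = time.perf_counter()
con.executemany(
    "INSERT INTO hits (charge) VALUES (?)",
    ((float(i % 1024),) for i in range(N_INSERT)),
)
con.commit()
print(f"{N_INSERT} inserts: {time.perf_counter() - t0:.1f} s")

t0 = time.perf_counter()
for i in range(N_QUERY):
    con.execute("SELECT charge FROM hits WHERE id = ?", (i + 1,)).fetchone()
print(f"{N_QUERY} point queries: {time.perf_counter() - t0:.1f} s")
```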

Slide 15: CPU-Intensive Applications
Numerically intensive simulations with minimal input and output data:
- ATLAS Monte Carlo (gg -> H -> bb): 228 sec / 3.5 MB per event on an 800 MHz Linux box.
Standalone physics applications:
1. Simulation of neutron/photon/electron interactions for 3D detector design.
2. NLO QCD physics simulation.
Compiler tests:
  Compiler       Speed (MFlops)
  Fortran (g77)  27
  C (gcc)        43
  Java (jdk)     41
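For flavour, a naive floating-point microbenchmark of the kind behind such MFlops figures (a sketch only; the slide's compiler tests presumably used compiled kernels, and a Python loop mostly measures interpreter overhead):

```python
import time

def mflops(n=2_000_000):
    """Time n multiply-add iterations and report MFlops."""
    x, y = 1.0000001, 0.0
    t0 = time.perf_counter()
    for _ in range(n):
        y = y * x + x        # one multiply + one add = 2 flops
    dt = time.perf_counter() - t0
    return 2 * n / dt / 1e6

print(f"{mflops():.0f} MFlops (interpreted Python; compiled code is far faster)")
```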

Slide 16: Network Monitoring
Prototype tools: Java Analysis Studio over TCP/IP.
- Instantaneous CPU usage
- Scalable architecture
- Individual node information

Slide 17: Analysis Architecture
The Gaudi framework: developed by LHCb, adopted by ATLAS (as Athena).
[Component diagram: the Application Manager drives Algorithms, which read and write the Transient Event Store, Transient Detector Store and Transient Histogram Store; the Event Data, Detector Data and Histogram Services, via Converters and Persistency Services, move data between the transient stores and data files; supporting services include the Message Service, JobOptions Service, Particle Properties Service and other services.]
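A minimal Python sketch of the pattern in the diagram: algorithms read and write only a transient store, which a data service fills from persistency behind the scenes. All names are illustrative; the real Gaudi/Athena interfaces are C++ and far richer:

```python
class TransientEventStore:
    """Whiteboard shared by all algorithms for the current event."""
    def __init__(self):
        self._objects = {}
    def record(self, key, obj):
        self._objects[key] = obj
    def retrieve(self, key):
        return self._objects[key]

class EventDataService:
    """Fills the transient store; converters and a persistency service
    would sit behind this in the real framework."""
    def __init__(self, source):
        self.source = source  # stand-in for persistency service + data files
    def next_event(self, store):
        store.record("RawEvent", next(self.source))

class TrackingAlg:
    """An algorithm: touches the transient store only, never files."""
    def execute(self, store):
        raw = store.retrieve("RawEvent")
        store.record("Tracks", [h * 2 for h in raw])  # placeholder "reconstruction"

# Application-manager-style event loop.
events = iter([[1, 2], [3, 4]])
svc, alg = EventDataService(events), TrackingAlg()
for _ in range(2):
    store = TransientEventStore()
    svc.next_event(store)
    alg.execute(store)
    print(store.retrieve("Tracks"))
```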

Slide 18: GRID Services
Grid services: resource discovery, scheduling, security, monitoring, data access, policy.
Athena/Gaudi services: application manager, JobOptions service, event persistency service, detector persistency, histogram service, user interfaces, visualization.
Database: event model, object federations.
Extensible interfaces and protocols being specified and developed:
- Tools: 1. UML; 2. Java.
- Protocols (DataGRID toolkit): 1. XML; 2. MySQL; 3. LDAP.
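Of the protocols listed, XML is the easiest to illustrate self-containedly. A sketch of publishing and reading a grid resource record; the element and attribute names are invented for illustration, not a DataGRID schema:

```python
import xml.etree.ElementTree as ET

record = """
<resource name="scotgrid-node01" site="Glasgow">
  <cpu specint95="15"/>
  <disk free_gb="120"/>
  <service type="event-persistency" protocol="mysql"/>
</resource>
"""

node = ET.fromstring(record)
print(node.get("name"), "at", node.get("site"))
for svc in node.findall("service"):
    print("offers", svc.get("type"), "over", svc.get("protocol"))
```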

Slide 19: GRID Data Management: Virtual Data Scenario
Example analysis scenario:
- A physicist issues a query from Athena for a Monte Carlo dataset.
  Issues: how expressive is this query? What is its nature (declarative)? Creating new queries and a query language. The algorithms are already available in local shared libraries.
- An Athena service consults an ATLAS Virtual Data Catalog.
Consider the possibilities (the resulting fallback chain is sketched below):
- The TAG file exists on the local machine (e.g. Glasgow): analyze it.
- The ESD file exists in a remote store (e.g. Edinburgh): access the relevant event files, then analyze them.
- The RAW file no longer exists (e.g. at RAL): regenerate, re-reconstruct, re-analyze!
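The three possibilities form a fallback chain that a catalog service might implement as below. The function and catalog structure are assumptions; a real virtual data catalog would also track provenance and the recipes needed for regeneration:

```python
def resolve(dataset, catalog):
    """Return (action, location) for the cheapest way to materialise a dataset."""
    if catalog.get((dataset, "TAG")) == "local":
        return "analyze", "local TAG"
    if (dataset, "ESD") in catalog:
        return "fetch-then-analyze", catalog[(dataset, "ESD")]
    # RAW gone: fall back to regeneration from the virtual data recipe.
    return "regenerate-reconstruct-analyze", "simulation"

catalog = {("mc_higgs_bb", "ESD"): "edinburgh"}
print(resolve("mc_higgs_bb", catalog))   # ('fetch-then-analyze', 'edinburgh')
print(resolve("mc_rare_decay", catalog)) # ('regenerate-reconstruct-analyze', 'simulation')
```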

Slide 20: Globus

Slide 21: Globus Data Grid Toolkit

Slide 22: GRID Data Management
Goal: develop middleware infrastructure to manage petabyte-scale data.
Service levels are reasonably well defined, from core services through medium-level to high-level services, inside a secure region. The task is to identify key areas within this software structure.

Slide 23: Identifying Key Areas: 5 areas for development
- Data accessor: hides specific storage-system requirements (Mass Storage Management group).
- Replication: improves access by wide-area caching; the Globus toolkit offers sockets and a communication library, Nexus (a sketch follows this list).
- Meta-data management: data catalogues, monitoring information (e.g. access patterns), grid configuration information, policies; MySQL over the Lightweight Directory Access Protocol (LDAP) is being investigated.
- Security: ensuring consistent levels of security for data and meta data.
- Query optimisation: cost minimisation based on response time and throughput (Monitoring Services group).
Identifiable UK contributions: RAL.
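A minimal sketch of the replication bullet above: resolve a logical file name through a replica catalogue, prefer a local copy, and cache wide-area reads. The names and the gridftp:// scheme here are illustrative, not the DataGrid replica-manager API:

```python
LOCAL_SITE = "glasgow"

replica_catalog = {
    "run0042.esd": ["ral", "cern"],
    "run0042.tag": ["glasgow", "ral"],
}
local_cache = set()

def open_replica(lfn):
    """Map a logical file name to the cheapest physical copy."""
    sites = replica_catalog.get(lfn, [])
    if LOCAL_SITE in sites or lfn in local_cache:
        return f"file://{LOCAL_SITE}/{lfn}"
    if not sites:
        raise FileNotFoundError(lfn)
    src = sites[0]                  # a real manager would rank by network cost
    local_cache.add(lfn)            # wide-area caching: keep a local copy
    replica_catalog[lfn].append(LOCAL_SITE)
    return f"gridftp://{src}/{lfn} (now cached at {LOCAL_SITE})"

print(open_replica("run0042.tag"))  # already local
print(open_replica("run0042.esd"))  # fetched from ral, cached locally
```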

Slide 24: AstroGrid
AstroGrid work packages (emphasis on high-level GUIs etc.):
- WP1 Project management
- WP2 Requirements analysis: existing functionality and future requirements; community consultation
- WP3 System architectures: benchmark and implement
- WP4 Grid-enable current packages: implement and test performance
- WP5 Database systems: requirements analysis and implementation; scalable federation tools
- WP6 Data-mining algorithms: requirements analysis, development and implementation
- WP7 Browser applications: requirements analysis and software development
- WP8 Visualisation: concepts and requirements analysis, software development
- WP9 Information discovery: concepts and requirements analysis, software development
- WP10 Federation of key current datasets: e.g. SuperCOSMOS, INT-WFS, 2MASS, FIRST, 2dF
- WP11 Federation of next-generation optical-IR datasets: esp. Sloan, WFCAM
- WP12 Federation of high-energy astrophysics datasets: esp. Chandra, XMM
- WP13 Federation of space plasma and solar datasets: esp. SOHO, Cluster, IMAGE
- WP14 Collaborative development of VISTA, VST and TERAPIX pipelines
- WP15 Collaboration programme with international partners
- WP16 Collaboration programme with other disciplines
UK DataGrid work packages (emphasis on low-level services etc., including replication and fragmentation):
- WP1 Grid workload management: A. Martin, QMW (0.5)
- WP2 Grid data management: A. Doyle, Glasgow (1.5)
- WP3 Grid monitoring services: R. Middleton, RAL (1.8)
- WP4 Fabric management: A. Sansum, RAL (0.5)
- WP5 Mass storage management: J. Gordon, RAL (1.5)
- WP6 Integration testbed: D. Newbold, Bristol (3.0)
- WP7 Network services: P. Clarke, PPNCG/UCL (2.0)
- WP8 HEP applications: N/A (?) (4.0)
- WP9 EO science applications: c/o R. Middleton, RAL (0.0)
- WP10 Biology applications: c/o P. Jeffreys, RAL (0.1)
- WP11 Dissemination: P. Jeffreys, RAL (0.1)
- WP12 Project management: R. Middleton, RAL (0.5)

Slide 25: GRID Culture
Testbed = learning by example + cloning.
SRIF expansion = expansion of open-source ideas.

Slide 26: Partnership Important
- Mission: to accelerate the exploitation of simulation by industry, commerce and academia.
- 45 staff, £2.5M turnover, externally funded.
- Solves business problems; does not sell technology.

Slide 27: Industrial Partnership
[Diagram: a ping service and ping monitor spanning WAN and LAN.]
Adoption of open industry standards + OO methods.
Inspiration (industry + research council): data-intensive computation.

Slide 28: Regional Centres
SRIF infrastructure: grid data management, security, monitoring, networking.
Local perspective: consolidate research computing.
Optimisation of the number of nodes? 4-5? Relative size dependent on funding dynamics.
Global perspective: a very basic grid skeleton; a regional expertise model?

Slide 29: Today's World
[Map of current partners: Istituto Trentino di Cultura, Helsinki Institute of Physics, Science Research Council, SARA.]

Slide 30: Tomorrow's World
[Expanded partner diagram: a coordinator (CO) and contractors (CR2-CR6: Istituto Trentino di Cultura, Helsinki Institute of Physics, Science Research Council, SARA, ...) with assistant contractors AC7-AC21 attached to them.]

Slide 31: Summary
- General engagement (£ = OK).
- Mutual interest (the ScotGRID example).
- Emphasis on:
  - DataGrid core development (e.g. grid data management): CERN lead + unique UK identity.
  - Extension of the open-source idea: grid culture = academia + industry.
  - Multidisciplinary approach = university + regional basis; use of existing structures (e.g. EPCC, RAL).
  - Hardware infrastructure via SRIF + industrial sponsorship.
- Now: LHC grid data management, security, monitoring, networking.
[Images: detector for the ALICE experiment; detector for the LHCb experiment; ScotGRID.]

