GridPP & The Grid: Who we are & what it is. Tony Doyle.

2 GridPP & The Grid: Who we are & what it is. Tony Doyle

3 Web: information sharing
Invented at CERN by Tim Berners-Lee
Agreed protocols: HTTP, HTML, URLs
Anyone can access information and post their own
Quickly crossed over into public use
[Chart: number of Internet hosts (millions) by year]

4 @Home Projects
SETI@Home: uses home PCs to run numerous calculations with dozens of variables
A distributed computing project, not a grid
Other @home projects: BBC Climate Change Experiment, FightAIDS@home
Peer-to-Peer Networks
No centralised database of files
Legal problems with sharing copyrighted material
Security problems

5 Grid: Resource Sharing
Share more than information: data, computing power, applications
Single computer: your programs (Word/Excel, email/web, your program, games) sit on an operating system, which manages the disks, CPU etc.
The Grid: your program sits on middleware, which manages the user interface machine, resource broker, CPU clusters and disk servers
Middleware handles everything

6 Analogy with the Electricity Power Grid
Power stations correspond to computing and data centres; the distribution infrastructure corresponds to the fibre optics of the Internet; users connect through a 'standard interface'

7 The CERN LHC
The world's most powerful particle accelerator (2007)
4 large experiments

8 The Experiments: "One Grid to Rule Them All"?
ALICE: heavy ion collisions, to create quark-gluon plasmas; 50,000 particles in each collision
LHCb: to study the differences between matter and antimatter; will detect over 100 million b and b-bar mesons each year
ATLAS: general purpose; origin of mass; supersymmetry; 2,000 scientists from 34 countries
CMS: general purpose; 1,800 scientists from over 150 institutes

9 Why do particle physicists need the Grid?
Example from LHC: starting from this event, we are looking for this "signature"
Selectivity: 1 in 10^13
Like looking for 1 person in a thousand world populations
Or for a needle in 20 million haystacks
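As a rough sanity check on those comparisons, here is a minimal arithmetic sketch; the world-population and straws-per-haystack figures are assumptions for illustration, not from the slide:

# Rough arithmetic behind the "1 in 10^13" selectivity comparisons (illustrative sketch).
selectivity = 1e13            # one interesting event per ~10^13 collisions
world_population = 6.5e9      # assumed mid-2000s world population
straws_per_haystack = 5e5     # assumed number of straws in one haystack

print("world populations:", round(selectivity / world_population))       # ~1,500, i.e. "a thousand"
print("haystacks (millions):", selectivity / straws_per_haystack / 1e6)  # ~20 million haystacks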

10 Why do particle physicists need the Grid?
100 million electronic channels
800 million proton-proton interactions per second
0.0002 Higgs per second
10 PBytes of data a year (10 million GBytes = 14 million CDs)
One year's data from LHC would fill a stack of CDs 20 km high (for scale: Concorde at 15 km, Mt Blanc at 4.8 km)
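A minimal sketch of the CD-stack arithmetic behind those numbers; the per-CD capacity and thickness are assumed values, so the exact stack height depends on them:

# Illustrative CD-stack arithmetic (assumed CD capacity and thickness, not from the slide).
data_per_year_gb = 10e6      # 10 PBytes a year = 10 million GBytes
cd_capacity_gb = 0.7         # assumed ~700 MB per CD
cd_thickness_mm = 1.2        # assumed thickness of a bare CD

cds = data_per_year_gb / cd_capacity_gb
stack_km = cds * cd_thickness_mm / 1e6
print(f"{cds / 1e6:.0f} million CDs, a stack roughly {stack_km:.0f} km high")  # ~14 million CDs, ~17-20 km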

11 Who else can use a Grid?
Astronomers: optical and X-ray observations
Healthcare professionals: scanning, remote consultancy
Bioinformatics
Digital curation: to create digital libraries and museums; digitize almost anything

12 Who are GridPP?
19 UK universities plus CCLRC (RAL & Daresbury)
Funded by PPARC
GridPP1 2001-2004: "From Web to Grid"
GridPP2 2004-2007: "From Prototype to Production"
Developed a working, highly functional Grid

13 What Have We Done So Far
Simulated 46 million molecules for medical research in 5 weeks, which would have taken over 80 years on a single PC
Reached transfer speeds of 1 Gigabyte per second in high-speed networking tests from CERN: a DVD every 5 seconds
The BaBar experiment has simulated 500 million particle physics collisions on the UK Grid
UK's #1 producer of data for LHCb, ATLAS and CMS

14 Worldwide LHC Computing Grid
GridPP is part of EGEE and LCG (currently the largest Grid in the world)
EGEE stats: 182 sites, 42 countries, 38,201 CPUs, 9,145 TBytes storage

15 Tier Structure
Detector and online system feed the offline farm at the CERN computer centre (Tier 0)
Tier 1: national centres, e.g. RAL (UK), France, Germany, Italy, USA
Tier 2: regional groups, e.g. ScotGrid, NorthGrid, SouthGrid, London
Tier 3: institutes, e.g. Glasgow, Edinburgh, Durham
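The hierarchy on this slide could be sketched as a simple nested mapping; the names come from the slide, but the data structure itself is only illustrative:

# Illustrative sketch of the tier hierarchy shown on the slide.
tiers = {
    "Tier 0": ["CERN computer centre"],
    "Tier 1": ["RAL (UK)", "France", "Germany", "Italy", "USA"],  # national centres
    "Tier 2": ["ScotGrid", "NorthGrid", "SouthGrid", "London"],   # UK regional groups
    "Tier 3": ["Glasgow", "Edinburgh", "Durham"],                 # institutes (ScotGrid example)
}
for tier, centres in tiers.items():
    print(tier, "->", ", ".join(centres))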

16 UK Tier-1/A Centre
High quality data services
National and international role
UK focus for international Grid development
1,000 dual-CPU nodes; 200 TB disk; 220 TB tape (capacity 1 PB)
Grid Operations Centre

17 UK Tier-2 Centres
ScotGrid: Durham, Edinburgh, Glasgow
NorthGrid: Daresbury, Lancaster, Liverpool, Manchester, Sheffield
SouthGrid: Birmingham, Bristol, Cambridge, Oxford, RAL PPD, Warwick
London: Brunel, Imperial, QMUL, RHUL, UCL

18 What are the Grid challenges?
Must share data between thousands of scientists with multiple interests
Link major and minor computer centres
Ensure all data accessible anywhere, anytime
Grow rapidly, yet remain reliable for more than a decade
Cope with different management policies of different centres
Ensure data security
Be up and running routinely by 2007

19 Other Grids
UK National Grid Service: the UK's core production computational and data Grid
EGEE (Europe): Enabling Grids for E-sciencE
NorduGrid (Europe): Grid research and development collaboration
Open Science Grid (USA): science applications from HEP to biochemistry

20 The Future Grow the LHC Grid Spread beyond science –Healthcare, commercial uses, government, games Will it become part of everyday life?

21 Further Info http://www.gridpp.ac.uk

22 Backups

23 “UK contributes to EGEE's battle with malaria”
WISDOM (Wide In Silico Docking On Malaria): the first biomedical data challenge for drug discovery, which ran on the EGEE grid production service from 11 July 2005 until 19 August 2005
GridPP resources in the UK contributed ~100,000 kSI2k-hours from 9 sites
BioMed successes/day: 1,107; success rate: 77%
[Charts: number of biomedical jobs processed by country; normalised CPU hours contributed to the biomedical VO for UK sites, July-August 2005]

24 Is GridPP a Grid?
1. Coordinates resources that are not subject to centralized control? YES. This is why development and maintenance of LCG is important.
2. ...using standard, open, general-purpose protocols and interfaces? YES. VDT (Globus/Condor-G) + EGEE (gLite) ~meet this requirement.
3. ...to deliver nontrivial qualities of service? YES. LHC experiments data challenges over the summer of 2004.
http://www-fp.mcs.anl.gov/~foster/Articles/WhatIsTheGrid.pdf
http://agenda.cern.ch/fullAgenda.php?ida=a042133

25 Application Development
ATLAS, LHCb, CMS, BaBar (SLAC), SAMGrid (FermiLab), QCDGrid, PhenoGrid

26 Middleware Development

27 Configuration Management, Storage Interfaces, Network Monitoring, Security, Information Services, Grid Data Management

28 gLite Middleware Stack
Requirements: the 15 baseline services for a functional Grid are Storage Element, Basic File Transfer, Reliable File Transfer, Catalogue Services, Data Management tools, Compute Element, Workload Management, VO Agents, VO Membership Services, Database Services, POSIX-like I/O, Application Software Installation Tools, Job Monitoring, Reliable Messaging, Information System
We rely upon gLite components. This middleware builds upon VDT (Globus and Condor) and meets the requirements of all the basic scientific use cases:
1. Purple (amber) areas are (almost) agreed as part of the shared generic middleware stack by each of the application areas
2. Red areas are where generic middleware competes with application-specific software
www.glite.org

29 2005 Metrics and Quality Assurance
Target: current status / Q2 2006 target value
Number of users: ~1,000 / ≥3,000
Number of sites: 120 / 50
Number of CPUs: ~12,000 / 9,500 at month 15
Number of disciplines: 6 / ≥5
Multinational: 24 / ≥15 countries

30 LCG Service Challenges
Jun 05: Technical Design Report
Sep 05: SC3 service phase
May 06: SC4 service phase
Sep 06: initial LHC service in stable operation
Apr 07: LHC service commissioned
SC2: reliable data transfer (disk-network-disk); 5 Tier-1s; aggregate 500 MB/sec sustained at CERN
SC3: reliable base service; most Tier-1s, some Tier-2s; basic experiment software chain; grid data throughput 500 MB/sec, including mass storage (~25% of the nominal final throughput for the proton period)
SC4: all Tier-1s, major Tier-2s; capable of supporting the full experiment software chain including analysis; sustain nominal final grid data throughput
LHC service in operation, September 2006: ramp up to full operational capacity by April 2007; capable of handling twice the nominal data throughput
[Timeline: SC2, SC3, SC4, LHC service operation; cosmics, first beams, first physics, full physics run, 2005-2008]

31 Status?: Exec 2 Summary 2005 was the first full year of a Production Grid: the UK Tier-1 was the largest CPU provider on the LCG and by the end of the year the Tier-2s provided twice the CPU of the Tier-1. The Production Grid is considered to be functional and hence the focus is now on improving performance of the system, especially w.r.t. data storage and management. The GridPP2 Project is now approaching halfway and has met 40% of its original targets with 91% of the metrics within specification.

32 Grid Overview
Aim: by 2008 (full year's data taking):
- CPU ~100 MSI2k (100,000 CPUs)
- Storage ~80 PB
- Involving >100 institutes worldwide
- Build on complex middleware being developed in advanced Grid technology projects, both in Europe (gLite) and in the USA (VDT)
1. Prototype went live in September 2003 in 12 countries
2. Extensively tested by the LHC experiments in September 2004

33 Some of the challenges for 2006
File transfers: good initial progress, but some way still to go with testing (stressing reliability, performance); can only be done with participation of experiments; distribution to other sites being planned
Distributed VO services: plan agreed (T1 will sign off and then VO boxes may be deployed by T2s), but still to deploy pilot services for ALICE, ATLAS, CMS, LHCb
End-to-end testing of the T0-T1-T2 chain: MC production, reconstruction, distribution
Full Tier-1 workload testing: recording, reprocessing, ESD distribution, analysis, Tier-2 support
Understanding the "Analysis Facility": batch analysis at T1 and T2, interactive analysis
Startup scenarios: schedule is known at high level and defined for Service Challenges; testing time ahead (in many ways)

34 Data Processing 9 orders of magnitude

35 Getting Started
1. Get a digital certificate (authentication: who you are): http://ca.grid-support.ac.uk/
2. Join a Virtual Organisation (VO) (authorisation: what you are allowed to do); for LHC, join LCG and choose a VO: http://lcg-registrar.cern.ch/
3. Get access to a local User Interface machine (UI) and copy your files and certificate there

36 Job Preparation
Prepare a file of Job Description Language (JDL):
############# athena.jdl #################
Executable = "athena.sh";
StdOutput = "athena.out";
StdError = "athena.err";
InputSandbox = {"athena.sh", "MyJobOptions.py", "MyAlg.cxx", "MyAlg.h", "MyAlg_entries.cxx", "MyAlg_load.cxx", "login_requirements", "requirements", "Makefile"};
OutputSandbox = {"athena.out", "athena.err", "ntuple.root", "histo.root", "CLIDDBout.txt"};
Requirements = Member("VO-atlas-release-10.0.4", other.GlueHostApplicationSoftwareRunTimeEnvironment);
################################################
Slide callouts: athena.sh is the script to run, MyJobOptions.py the job options, MyAlg*.cxx/.h the user's C++ code; the InputSandbox and OutputSandbox list the input and output files, and the Requirements line chooses the ATLAS version.
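A hypothetical helper for assembling a JDL file like the one above; this is not part of gLite or GridPP tooling, just a sketch of how the fields fit together:

# Hypothetical helper that writes a JDL file like the slide's example (illustrative only).
def write_jdl(path, executable, inputs, outputs, atlas_release):
    requirements = (
        f'Member("VO-atlas-release-{atlas_release}", '
        "other.GlueHostApplicationSoftwareRunTimeEnvironment)"
    )
    lines = [
        f'Executable = "{executable}";',
        'StdOutput = "athena.out";',
        'StdError = "athena.err";',
        "InputSandbox = {" + ", ".join(f'"{f}"' for f in inputs) + "};",
        "OutputSandbox = {" + ", ".join(f'"{f}"' for f in outputs) + "};",
        f"Requirements = {requirements};",
    ]
    with open(path, "w") as jdl:
        jdl.write("\n".join(lines) + "\n")

write_jdl(
    "athena.jdl",
    executable="athena.sh",
    inputs=["athena.sh", "MyJobOptions.py", "MyAlg.cxx", "MyAlg.h"],
    outputs=["athena.out", "athena.err", "ntuple.root", "histo.root"],
    atlas_release="10.0.4",
)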

37 Management: Mapping Grid Structures
Layers: I. Experiment Layer; II. Application Middleware; III. Grid Middleware; IV. Facilities and Fabrics
PMB
User Board: requirements, application development, user feedback
Deployment Board: Tier-1/Tier-2, testbeds, rollout, service specification & provision
Middleware areas: Metadata, Workload, Network, Security, Info. Mon., Storage

38 GridPP Status?
GridPP status (last night): 14 sites, 2,898 CPUs, 124 TBytes storage

