1 GridPP, The Grid & Industry
Who we are, what it is and what we can do.
Tony Doyle, Project Leader
Steve Lloyd, Collaboration Board Chairman
Robin Middleton, Middleware Coordinator
Neasan O'Neill, Events Officer

2 Who are GridPP?
19 UK universities, CERN and CCLRC (RAL & Daresbury)
Funded by PPARC:
- GridPP1, 2001-2004 (£17m): From Web to Grid
- GridPP2, 2004-2007 (£16m): From Prototype to Production
- GridPP3, 2007-2011 (proposed): From Production to Exploitation
Developed a working, highly functional Grid

3 Web: information sharing
- Invented at CERN by Tim Berners-Lee
- Agreed protocols: HTTP, HTML, URLs
- Anyone can access information and post their own
- Quickly crossed over into public use
[Chart: number of Internet hosts (millions) by year]

4 Why do particle physicists need the Grid?
The CERN LHC: the world's most powerful particle accelerator
4 large experiments


6 Example from LHC
Starting from this event, we are looking for this signature.
Selectivity: 1 in 10^13
- Like looking for 1 person in a thousand world populations
- Or for a needle in 20 million haystacks!
~100,000,000 electronic channels
800,000,000 proton-proton interactions per second
0.0002 Higgs per second
10 PBytes of data a year (10 million GBytes = 14 million CDs)
One year's data from LHC would fill a stack of CDs 20 km high
[Figure: CD stack height compared with Concorde (15 km) and Mt. Blanc (4.8 km)]
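The comparisons on this slide can be checked with a few lines of arithmetic. The sketch below only uses the figures quoted on the slide, plus one labelled assumption (a world population of about 6.5 billion, which is not stated in the talk). It derives the CD capacity and thickness that the slide's numbers imply rather than asserting what a real CD holds.

```python
# Figures quoted on the slide.
data_per_year_gb = 10_000_000   # 10 PBytes = 10 million GBytes
cds_per_year = 14_000_000       # "14 million CDs"
stack_height_km = 20            # "a stack of CDs 20 km high"
selectivity = 1e13              # looking for 1 event in 10^13

# Assumption (not from the slide): roughly 6.5 billion people in the world.
world_population = 6.5e9

# Capacity per CD implied by "10 million GBytes = 14 million CDs", in MB.
implied_cd_capacity_mb = data_per_year_gb * 1000 / cds_per_year

# Thickness per CD implied by a 20 km stack of 14 million CDs, in mm.
implied_cd_thickness_mm = stack_height_km * 1e6 / cds_per_year

# "1 person in a thousand world populations", as a head count.
people_in_thousand_worlds = 1000 * world_population

print(f"Implied CD capacity:  {implied_cd_capacity_mb:.0f} MB")
print(f"Implied CD thickness: {implied_cd_thickness_mm:.2f} mm")
print(f"People in 1000 world populations: {people_in_thousand_worlds:.1e}")
print(f"Selectivity denominator:          {selectivity:.1e}")
```

Under that population assumption, a thousand world populations is the same order of magnitude as 10^13, which is all the analogy claims.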

7 Solution – Build a Grid
Share more than information
- Efficient use of resources at many institutes
- Leverage over other sources of funding
- Data, computing power, applications
- Join local communities
Challenges:
- share data between thousands of scientists with multiple interests
- link major and minor computer centres
- ensure all data accessible anywhere, anytime
- grow rapidly, yet remain reliable for more than a decade
- cope with different management policies of different centres
- ensure data security
- be up and running routinely by 2007

8 Middleware is Everything
Middleware is the Operating System of a distributed computing system
[Diagram: on a single PC, programs (Word/Excel, Email/Web, Games, Your Program) sit on the operating system, which sits on the CPU, disks, etc. On the Grid, Your Program sits on the middleware (User Interface Machine, Resource Broker, Information Service, Replica Catalogue, Bookkeeping Service), which sits on CPU clusters and disk servers.]

9 GridPP Middleware Development
- Workload Management
- Storage Interfaces
- Network Monitoring
- Security
- Information Services
- Grid Data Management

10 What you need to use the Grid
1. Get a digital certificate (UK Certificate Authority)
   Authentication: who you are
2. Join a Virtual Organisation (VO)
   Authorisation: what you are allowed to do
3. Get access to a local User Interface Machine (UI) and copy your files and certificate there
4. Write some Job Description Language (JDL) and scripts to wrap your programs

############# HelloWorld.jdl #################
Executable = "/bin/echo";
Arguments = "Hello welcome to the Grid ";
StdOutput = "hello.out";
StdError = "hello.err";
OutputSandbox = {"hello.out","hello.err"};
#########################################
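When many similar jobs are needed, the JDL in step 4 is often generated rather than typed by hand. The following is a minimal sketch of that idea, reproducing the HelloWorld.jdl attributes from the slide. The helper name `render_jdl` is hypothetical and not part of any Grid toolkit; the sketch only handles quoted strings and string lists, which is all the example uses.

```python
def render_jdl(attributes):
    """Render job attributes as JDL 'Name = value;' lines.

    Strings are double-quoted; lists and tuples become {"a","b"} sets.
    Hypothetical helper for illustration, not a complete JDL writer.
    """
    lines = []
    for name, value in attributes.items():
        if isinstance(value, (list, tuple)):
            rendered = "{" + ",".join(f'"{item}"' for item in value) + "}"
        else:
            rendered = f'"{value}"'
        lines.append(f"{name} = {rendered};")
    return "\n".join(lines)


# The same attributes as HelloWorld.jdl on the slide.
hello_world = {
    "Executable": "/bin/echo",
    "Arguments": "Hello welcome to the Grid ",
    "StdOutput": "hello.out",
    "StdError": "hello.err",
    "OutputSandbox": ["hello.out", "hello.err"],
}

jdl_text = render_jdl(hello_world)
print(jdl_text)
```

The printed text can be saved to a file on the User Interface Machine and submitted with whatever job submission command the local middleware provides.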

11 International Context
GridPP is part of EGEE and LCG (currently the largest Grid in the world)
- LHC Computing Grid (LCG): Grid Deployment Project for LHC
- EU Enabling Grids for e-Science (EGEE), 2004-2008: Grid Deployment Project for all disciplines
- UK National Grid Service: UK's core production computational and data Grid
- Open Science Grid (USA): science applications from HEP to biochemistry
- NorduGrid (Scandinavia): Grid Research and Development collaboration

12 The LCG Grid Status

            Sites   CPUs     Disk     CPU time
Worldwide   182     23,438   9.2 PB   2,200 years
UK          21      4,482    180 TB   593 years
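To put the UK figures in proportion, the sketch below computes the UK share of each worldwide total from the status numbers above. It assumes 1 PB = 1000 TB when converting the disk figure; the talk does not say which convention the slide uses.

```python
# Status figures from the slide. Disk is converted to TB,
# assuming 1 PB = 1000 TB (an assumption, not stated in the talk).
worldwide = {"sites": 182, "cpus": 23438, "disk_tb": 9200, "cpu_years": 2200}
uk = {"sites": 21, "cpus": 4482, "disk_tb": 180, "cpu_years": 593}

# UK share of each worldwide total, as a percentage.
uk_share = {key: 100 * uk[key] / worldwide[key] for key in worldwide}

for key, percent in uk_share.items():
    print(f"{key:>9}: {percent:5.1f}% of the worldwide total")
```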

13 What GridPP Has Done So Far
- Reached transfer speeds of 1 Gigabyte per second in high speed networking tests from CERN (a DVD every 5 seconds)
- Simulated 500 million particle physics collisions with the BaBar experiment
- Transformed the way particle physics computing problems are approached
- Analysed 300,000 possible drug components in the fight against the Avian Flu virus
- Simulated 46 million molecules for medical research in 5 weeks, which would have taken over 80 years on a single PC
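Two of these claims are simple ratios, sketched below. The DVD capacity of 4.7 GB (single layer) and the 52-week year are assumptions added for the calculation and are not given in the talk. Since the slide says "over 80 years", the speed-up is a lower bound.

```python
# Transfer test: 1 Gigabyte per second, from the slide.
transfer_rate_gb_per_s = 1.0
dvd_capacity_gb = 4.7  # assumption: single-layer DVD
seconds_per_dvd = dvd_capacity_gb / transfer_rate_gb_per_s

# Molecule simulation: 5 weeks on the Grid versus "over 80 years" on one PC.
grid_weeks = 5
single_pc_years = 80
weeks_per_year = 52  # assumption: approximate weeks in a year
minimum_speedup = single_pc_years * weeks_per_year / grid_weeks

print(f"One DVD every {seconds_per_dvd:.1f} seconds")
print(f"Speed-up over a single PC: at least {minimum_speedup:.0f}x")
```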

14 Who else can use a Grid?
- Astronomy
- Healthcare
- Bioinformatics
- Gaming
- Engineering
- Commerce

15 UK contributes to EGEE's battle with malaria
WISDOM (Wide In Silico Docking On Malaria): the first biomedical data challenge for drug discovery, which ran on the EGEE grid production service from 11 July 2005 until 19 August 2005.
GridPP resources in the UK contributed ~100,000 kSI2k-hours from 9 sites.
BioMed successes/day: 1107. Success rate: 77%.
[Charts: number of biomedical jobs processed by country; normalised CPU hours contributed to the biomedical VO for UK sites, July-August 2005]

16 "GridPP has been developed to help answer questions about the conditions in the Universe just after the Big Bang," said Professor Keith Mason, head of the Particle Physics and Astronomy Research Council (PPARC). "But the same resources and techniques can be exploited by other sciences with a more direct benefit to society."

17 GridPP & Industry: What We Have To Offer
- Our Grid
- Security tools
- GridSite
- R-GMA
- APEL accounting system

18 Our Grid The UK Grid (via one of the individual university sites) can be used to run applications for areas such as finance and image processing.

19 Security Tools & GridSite
Grid Security for the Web, Web platforms for Grids
- Digital Certificates
- Certification Authority
- GridSite identifies users to websites using their digital certificates
- GridSiteWiki is an extension to the tool
- GridSite is open source (

20 R-GMA & APEL accounting system
Relational Grid Monitoring Architecture (R-GMA)
- An information and monitoring system for static and dynamic information about grid resources, applications and networks
Accounting Processor for Event Logs (APEL)
- Provides a summary of the resources consumed based on attributes such as CPU time, Wall Clock Time, Memory and grid user identity
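The kind of summary described for APEL can be pictured as a group-by over per-job records. The sketch below illustrates that idea only and is not APEL's implementation. The record fields (`user`, `cpu_s`, `wall_s`, `memory_mb`), the sample values, and the function name `summarise_usage` are all hypothetical.

```python
from collections import defaultdict

# Hypothetical job records; the field names are illustrative, not APEL's schema.
job_records = [
    {"user": "alice", "cpu_s": 3600, "wall_s": 4000, "memory_mb": 512},
    {"user": "bob", "cpu_s": 1800, "wall_s": 2500, "memory_mb": 1024},
    {"user": "alice", "cpu_s": 7200, "wall_s": 7500, "memory_mb": 768},
]


def summarise_usage(records):
    """Total CPU time, wall clock time and peak memory per grid user."""
    summary = defaultdict(
        lambda: {"jobs": 0, "cpu_s": 0, "wall_s": 0, "peak_memory_mb": 0}
    )
    for record in records:
        totals = summary[record["user"]]
        totals["jobs"] += 1
        totals["cpu_s"] += record["cpu_s"]
        totals["wall_s"] += record["wall_s"]
        totals["peak_memory_mb"] = max(
            totals["peak_memory_mb"], record["memory_mb"]
        )
    return dict(summary)


usage = summarise_usage(job_records)
for user, totals in sorted(usage.items()):
    print(user, totals)
```

Grouping by a different attribute, such as site or Virtual Organisation, would only change the key used to index the summary.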

21 GridPP & Industry: Current Involvement
- HP are sponsoring a joint project with GridPP at Bristol.
- GridPP has an association with IBM through collaboration on ScotGrid and R-GMA.
- Specific sites also have close relationships with various industrial suppliers.

22 GridPP & Industry: Current Involvement
- Posters at "Technology Opportunities from CERN: the impact of Big Physics on Industry".
- Attended KITE club meetings on healthcare, medical image processing, and film and computer games.
- Speakers at a forum on Network and Grid Security organised for the IT industry.

23 Future
Plan to establish a small steering group to lead technology transfer activity. The group, working with various companies, would examine different methods of technology transfer and identify the GridPP activities that can be used in industry and business.

