gLite adoption and opportunities for collaboration with industry
Tony Doyle, Distributed Computing Workshop, Westminster, 21 May 2008
Introduction
- Context: PIPSS projects
- Who are GridPP?
- Why do we need a Grid?
- What is our Grid?
- What do we offer?
PIPSS Projects
- David Sinclair and Chris Town (Cambridge Ontology Ltd) and Andy Parker (Cambridge e-Science Centre): Mini-PIPSS to develop a Content Based Image Retrieval (CBIR) platform powered by gLite. On completion of the Mini-PIPSS project, Cambridge Ontology received £535k of private equity investment, changed its name to Imense, and is now doing a PIPSS project with Andy Parker.
- Oleg Soloviev (Econophysica) and Steve Lloyd (QMUL): Mini-PIPSS to develop a Grid-based automated trading platform for the financial industry.
- Constellation Technologies Ltd and Neil Geddes (RAL): PIPSS to develop a commercial version of the gLite middleware.
- DiGS and George Beckett (Edinburgh, EPCC): PIPSS to develop a Data Grid for Cell Biology, sharing biological images between researchers (an example of inter-disciplinary use of software).
- Other EGEE-wide projects: Total Oil testbed studies (Aberdeen); EU-wide biomed docking studies (anti-malarial and bird-flu drug development).
Who are GridPP?
The UK's contribution to LHC computing: 19 UK universities, STFC and CERN
- GridPP1 ( ) £17m: From Web to Grid
- GridPP2 ( ) £16m: From Prototype to Production
- GridPP3 (2008 – 2011) £25m: From Production to Exploitation
Why do particle physicists need the Grid?
- CERN LHC: the world's most powerful particle accelerator
- 4 large experiments
Example from LHC: starting from this event, we are looking for this signature
- Selectivity: 1 in
- Like looking for 1 person in a thousand world populations, or for a needle in 20 million haystacks!
- ~100,000,000 electronic channels
- 800,000,000 proton-proton interactions per second
- Higgs per second
- 10 PBytes of data a year (10 million GBytes = 14 million CDs)
- One year's data from the LHC would fill a stack of CDs 20 km high (for scale: Concorde at 15 km, Mt. Blanc at 4.8 km)
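
The data-volume comparison on this slide can be checked with a few lines of arithmetic. The sketch below is only an illustration: the slide does not state the CD capacity or disc thickness it used, so the 700 MB and 1.2 mm values are assumptions.

```python
# Back-of-the-envelope check of the "10 PBytes a year" comparison.
# CD capacity and thickness are ASSUMED values, not taken from the slide.
BYTES_PER_PBYTE = 10**15
BYTES_PER_GBYTE = 10**9
CD_CAPACITY_BYTES = 700 * 10**6   # assumption: 700 MB per CD
CD_THICKNESS_MM = 1.2             # assumption: bare disc, no case


def cds_needed(data_bytes: int, cd_bytes: int = CD_CAPACITY_BYTES) -> int:
    """Return the number of CDs needed to hold data_bytes, rounded up."""
    return -(-data_bytes // cd_bytes)


def stack_height_km(n_cds: int, thickness_mm: float = CD_THICKNESS_MM) -> float:
    """Return the height in km of n_cds discs stacked flat."""
    return n_cds * thickness_mm / 1_000_000  # 1 km = 1,000,000 mm


yearly_bytes = 10 * BYTES_PER_PBYTE
yearly_gbytes = yearly_bytes // BYTES_PER_GBYTE
n_cds = cds_needed(yearly_bytes)
height_km = stack_height_km(n_cds)

print(f"{yearly_gbytes:,} GBytes per year")
print(f"{n_cds:,} CDs, stack about {height_km:.1f} km high")
```

Under these assumptions the result is about 14.3 million CDs and a stack of roughly 17 km, the same order of magnitude as the 14 million CDs and 20 km quoted on the slide; a slightly thicker disc or sleeve would account for the difference.
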
A question of scale
Share more than information
- Efficient use of resources at many institutes
- Leverage over other sources of funding
- Data, computing power, applications
- Join local communities
Challenges:
- share data between thousands of scientists with multiple interests
- link major and minor computer centres
- ensure all data accessible anywhere, anytime
- grow rapidly, yet remain reliable for more than a decade
- cope with different management policies of different centres
- ensure data security
- be up and running routinely in 2008
Solution: build a Grid
Middleware is the Key
Middleware is the Operating System of a distributed computing system.
[Diagram: a single PC compared with a Grid, layer by layer]
- Single PC: programs (Word/Excel, Web, Games, Your Program) run on the operating system, which manages the CPU, disks, etc.
- Grid: Your Program runs on the middleware (User Interface Machine, Resource Broker, Information Service, Replica Catalogue, Bookkeeping Service), which manages CPU clusters and disk servers.
Something like this…
[Diagram: job flow through the Grid, with numbered steps from 1 to 11]
- Submitter on gridui: VOMS-proxy-init, Job Submission (JDL), Job Status?, Job Retrieval
- Services: VOMS, WLMS, JS, RB, LFC, BDII, Logging & Bookkeeping
- Grid Enabled Resources (several sites): CPU Nodes, Storage
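
The job in this flow is described to the middleware in a JDL (Job Description Language) file. As a rough illustration, the sketch below renders a minimal job description from a Python dictionary. The attribute names (Executable, StdOutput, StdError, OutputSandbox) are standard JDL attributes; the helper functions, the example job and its file names are hypothetical and not taken from the slides.

```python
# Minimal sketch: render a job description in JDL syntax.
# render_jdl and jdl_value are hypothetical helpers written for this example.

def jdl_value(value):
    """Render a Python value as a JDL literal (string, number, boolean or list)."""
    if isinstance(value, (list, tuple)):
        return "{" + ", ".join(jdl_value(v) for v in value) + "}"
    if isinstance(value, bool):  # must be tested before int: bool is an int subclass
        return "true" if value else "false"
    if isinstance(value, (int, float)):
        return str(value)
    escaped = str(value).replace("\\", "\\\\").replace('"', '\\"')
    return '"' + escaped + '"'


def render_jdl(attributes):
    """Render a dict of attributes as one bracketed JDL record."""
    lines = [f"  {name} = {jdl_value(val)};" for name, val in attributes.items()]
    return "[\n" + "\n".join(lines) + "\n]\n"


# Illustrative job: run /bin/hostname on a worker node and bring back its output.
job = {
    "Executable": "/bin/hostname",
    "StdOutput": "std.out",
    "StdError": "std.err",
    "OutputSandbox": ["std.out", "std.err"],
}

jdl_text = render_jdl(job)
print(jdl_text)
```

The resulting text would be saved on the user interface machine and, after a VOMS proxy has been created, handed to the workload management system with the gLite command-line tools; the exact commands and options depend on the installed middleware version.
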
Grid Infrastructure
- Tier 0: CERN computer centre (online system, offline farm)
- Tier 1: national centres (RAL, UK; France; Italy; Germany; Spain); 11 T1 centres
- Tier 2: regional groups (ScotGrid, NorthGrid, SouthGrid, London)
- Institutes (Glasgow, Edinburgh, Durham)
- Workstations
Structure chosen for particle physics. Different for others.
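
The tier structure is a simple tree, which can be sketched as a small data structure. This is an illustration only: the Node class and find_path function are hypothetical, and attaching the three institutes to ScotGrid and the regional groups to RAL is a reading of the diagram rather than something the transcript states.

```python
# Sketch of the tier hierarchy as a tree, using only names from the slide.
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Node:
    name: str
    tier: str
    children: List["Node"] = field(default_factory=list)


def find_path(root: Node, name: str) -> Optional[List[str]]:
    """Return the chain of names from root down to `name`, or None if absent."""
    if root.name == name:
        return [root.name]
    for child in root.children:
        sub_path = find_path(child, name)
        if sub_path is not None:
            return [root.name] + sub_path
    return None


# Assumption: institutes hang off ScotGrid, regional groups off RAL.
scotgrid = Node("ScotGrid", "Tier 2", [
    Node("Glasgow", "Institute"),
    Node("Edinburgh", "Institute"),
    Node("Durham", "Institute"),
])
ral = Node("RAL, UK", "Tier 1", [
    scotgrid,
    Node("NorthGrid", "Tier 2"),
    Node("SouthGrid", "Tier 2"),
    Node("London", "Tier 2"),
])
cern = Node("CERN computer centre", "Tier 0", [
    ral,
    Node("France", "Tier 1"),
    Node("Italy", "Tier 1"),
    Node("Germany", "Tier 1"),
    Node("Spain", "Tier 1"),
])

print(" -> ".join(find_path(cern, "Glasgow")))
```
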
Middleware Validation: From Testbed to Production
[Diagram: release flow from development to production]
- Stages: Unit Test, Build, Certification, Production
- Development Testbed (~15 CPU): WPs add unit-tested code to the repository; individual WP tests
- Build system: runs nightly build and automatic tests; the Integration Team performs integration and overall release tests, producing tagged release candidates
- Certification Testbed (~40 CPU): the Test Group performs Grid certification on the tagged release selected for certification, producing certified release candidates
- Application Testbed (~1000 CPU): Apps. Representatives perform application certification on the certified release selected for deployment
- Production (24x7): certified public release for use by apps. and users
- Problem reports feed back into the cycle to fix problems
Process to test: frameworks, support, policies, documentation, platforms/compilers
Status, March 2007 to March 2008
- Status in 2007: 177 sites; 32,412 CPUs; ~13 PB storage
- Status in 2008: 250 sites in 50 countries; 55,094 CPUs; ~20 PB storage
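
The year-on-year growth implied by these figures can be worked out directly. The sketch below uses only the numbers on the slide; the storage values are quoted there as approximate, so the storage growth is correspondingly rough.

```python
# Year-on-year growth between the two status snapshots on the slide.
status = {
    2007: {"sites": 177, "cpus": 32_412, "storage_pb": 13},  # storage is approximate
    2008: {"sites": 250, "cpus": 55_094, "storage_pb": 20},  # storage is approximate
}


def percent_growth(old: float, new: float) -> float:
    """Return the relative increase from old to new, in percent."""
    return (new - old) / old * 100.0


growth = {
    key: percent_growth(status[2007][key], status[2008][key])
    for key in status[2007]
}

for key, value in growth.items():
    print(f"{key}: {value:+.1f}%")
```

In round numbers this is about 41% more sites, 70% more CPUs and 54% more storage in one year.
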
GridPP & Industry: What Do We Offer?
- Middleware Expertise
- Our Grid (for test purposes)
- Examples: Adaptable User Interface (GANGA), Security tools (GridSite), Accounting tools (R-GMA & APEL)
Middleware Expertise
- Security
- Network Monitoring
- Information Services
- Grid Data Management
- Storage Interfaces
- Workload Management
Our Grid
The UK Grid (via the individual research sites) has been used to test applications for other areas, e.g.
- biomedical research
- financial modelling
- device modelling
- oil exploration
- image processing
Adaptable User Interface: Ganga GUI
[Screenshot labels: Job details, Logical Folders, Job Monitoring, Log window, Job builder, Scriptor]
Security Tools: Grid Security for the Web
- Web platforms for Grids
- Digital Certificates, Certification Authority
- GridSite identifies users to websites by their digital certificates
- GridSiteWiki is an extension to the tool
- GridSite is open source (http://www.gridsite.org/)
Accounting tools
- Relational Grid Monitoring Architecture (R-GMA): an information and monitoring system for static and dynamic information about grid resources, applications and networks
- Accounting Processor for Event Logs (APEL): provides a summary of the resources consumed based on attributes such as CPU time, Wall Clock Time, Memory and grid user identity
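
The kind of per-user summary described for APEL can be illustrated with a short aggregation. This is a sketch of the idea only, not APEL's implementation or record format: the JobRecord fields, the summarise function and the sample records are all hypothetical.

```python
# Sketch: summarise resource usage per grid user from per-job records.
# Field names and sample data are hypothetical, not APEL's actual schema.
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class JobRecord:
    user: str            # grid user identity
    cpu_seconds: float   # CPU time consumed
    wall_seconds: float  # wall clock time
    memory_mb: float     # peak memory used


def summarise(records):
    """Return per-user totals of CPU and wall time, job count and peak memory."""
    totals = defaultdict(lambda: {
        "jobs": 0, "cpu_seconds": 0.0, "wall_seconds": 0.0, "peak_memory_mb": 0.0,
    })
    for record in records:
        entry = totals[record.user]
        entry["jobs"] += 1
        entry["cpu_seconds"] += record.cpu_seconds
        entry["wall_seconds"] += record.wall_seconds
        entry["peak_memory_mb"] = max(entry["peak_memory_mb"], record.memory_mb)
    return dict(totals)


sample = [
    JobRecord("alice", cpu_seconds=100.0, wall_seconds=120.0, memory_mb=512.0),
    JobRecord("alice", cpu_seconds=50.0, wall_seconds=60.0, memory_mb=768.0),
    JobRecord("bob", cpu_seconds=30.0, wall_seconds=45.0, memory_mb=256.0),
]

summary = summarise(sample)
for user, entry in sorted(summary.items()):
    print(user, entry)
```
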
Knowledge Exchange
[Diagram: knowledge exchange and dissemination between the Research Community and the Business Community]
Topics: Accounting, Standards, Applications Portability, Trust, Security, Business Models, Quality of Service, Open Source Support, Software Licence Management
Dissemination and Knowledge Exchange
- Productise software for your business
- Sustain software on behalf of all users
An essential component within the innovation cycle of any knowledge-driven economy
Summary
1. Opportunity for knowledge creation through improved IT skills and an enhanced research base
2. GridPP supports locally-led activities (based upon an international core of expertise and ongoing examples of collaboration)
3. GridPP will work with companies to examine different methods of technology transfer and identify the activities that can be used for industry and business