Tony Hey Director of UK e-Science Programme e-Science, Databases and the Grid.

1 e-Science, Databases and the Grid
Tony Hey, Director of UK e-Science Programme

2 e-Science and the Grid
‘e-Science is about global collaboration in key areas of science, and the next generation of infrastructure that will enable it.’
‘e-Science will change the dynamic of the way science is undertaken.’
John Taylor, Director General of Research Councils, Office of Science and Technology

3 NASA’s IPG
The vision for the Information Power Grid is to promote a revolution in how NASA addresses large-scale science and engineering problems by providing persistent infrastructure for:
–‘highly capable’ computing and data management services that, on demand, will locate and co-schedule the multi-Center resources needed to address large-scale and/or widely distributed problems
–the ancillary services needed to support the workflow management frameworks that coordinate the processes of distributed science and engineering problems

4 IPG Baseline System
[diagram: NASA and partner sites (ARC, GRC, LaRC, GSFC, MSFC, KSC, JSC, JPL, EDC, NCSA, SDSC, CMU, Boeing) linked via NREN, NGIX and NTON-II/SuperNet; resources include a 300-node Condor pool, O2000 machines and an O2000 cluster, MCAT/SRB, DMF, MDS information services and a CA]

5 Multi-disciplinary Simulations
[diagram: sub-system models (engine, airframe, wing, landing gear, stabilizer, human crew) annotated with capabilities such as thrust and reverse-thrust performance, fuel consumption, lift and drag, deflection, responsiveness, braking, steering, traction and dampening, and crew accuracy, perception, stamina, reaction times and SOPs]
Whole-system simulations are produced by coupling all of the sub-system simulations

6 Multi-disciplinary Simulations: Virtual National Air Space (VNAS)
National Air Space Simulation Environment: many aircraft, flight paths, airport operations, and the environment are combined to get a virtual national airspace
Models: GRC engine models; LaRC airframe and landing gear models; ARC wing, stabilizer and human models
Simulation drivers: FAA ops data, weather data, airline schedule data, digital flight data, radar tracks, terrain data, surface data (being pulled together under the NASA AvSP Aviation ExtraNet (AEN))
Scale: 22,000 commercial US flights a day, driving 50,000 engine runs, 44,000 wing runs, 22,000 airframe impact runs, 132,000 landing/take-off gear runs, 66,000 stabilizer runs and 48,000 human crew runs

7 The Grid as an Enabler for Virtual Organisations
Ian Foster and Carl Kesselman – ‘Take 2’
The Grid is a software infrastructure that enables flexible, secure, coordinated resource sharing among dynamic collections of individuals, institutions and resources
–includes computational systems, data storage resources and specialized facilities
Enabling infrastructure for transient ‘Virtual Organisations’

8 Globus Grid Middleware
Single sign-on
–Proxy credentials, GRAM
Mapping to local security mechanisms
–Kerberos, Unix, GSI
Delegation
–Restricted proxies
Community authorization and policy
–Group membership, trust
File-based data transfer
–GridFTP gives high-performance FTP integrated with GSI
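The mapping from a global grid identity to local security mechanisms mentioned above is, in Globus, driven by a grid-mapfile that pairs a quoted certificate DN with a local account name. A minimal parsing sketch follows; the DNs and account names are made up for illustration, and this is not the Globus implementation itself.

```python
import shlex

def parse_grid_mapfile(text):
    """Parse grid-mapfile lines of the form: "<certificate DN>" <local user>.
    Returns a dict mapping DN -> local account name."""
    mapping = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        # shlex handles the quoted DN, which contains spaces and slashes
        parts = shlex.split(line)
        if len(parts) >= 2:
            dn, local_user = parts[0], parts[1]
            mapping[dn] = local_user
    return mapping

# Illustrative entries (these DNs are invented)
sample = '''
"/C=UK/O=eScience/OU=Southampton/CN=Tony Hey" they
"/C=US/O=NASA/OU=ARC/CN=Grid User" griduser
'''
print(parse_grid_mapfile(sample)["/C=UK/O=eScience/OU=Southampton/CN=Tony Hey"])  # prints: they
```

The point of the file is that a single grid certificate (the single sign-on credential) can be mapped onto whatever local account each site chooses, so sites keep control of their own Unix or Kerberos security.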

9 US Grid Projects
NASA Information Power Grid
DOE Science Grid
NSF National Virtual Observatory
NSF GriPhyN
DOE Particle Physics Data Grid
NSF Distributed Terascale Facility
DOE ASCI Grid
DOE Earth Systems Grid
DARPA CoABS Grid
NEESGrid
DOH BIRN
NSF iVDGL

10 EU Grid Projects
DataGrid (CERN, …)
EuroGrid (Unicore)
DataTag (TTT…)
Astrophysical Virtual Observatory
GRIP (Globus/Unicore)
GRIA (Industrial applications)
GridLab (Cactus Toolkit)
CrossGrid (Infrastructure Components)
EGSO (Solar Physics)

11 National Grid Projects
UK e-Science Grid
Japan – Grid Data Farm, ITBL
Netherlands – VLAM, PolderGrid
Germany – UNICORE, Grid proposal
France – Grid funding approved
Italy – INFN Grid
Eire – Grid proposals
Switzerland – Grid proposal
Hungary – DemoGrid, Grid proposal
ApGrid
……

12 UK e-Science Initiative
£120M programme over 3 years
–£75M for Grid applications in all areas of science and engineering
–£10M for supercomputer upgrade
–£35M for development of ‘industrial strength’ Grid middleware
 Requires £20M additional ‘matching’ funds from industry

13 UK e-Science Grid
[map: e-Science Centres at Edinburgh, Glasgow, Newcastle, Belfast, Manchester, DL, Cambridge, Oxford, RAL, Hinxton, Cardiff, London, Southampton]

14 Generic Grid Middleware
All e-Science Centres donating resources to form a UK ‘national’ Grid
–Supercomputers, clusters, storage, facilities
All Centres will run the same Grid software
–Starting point is Globus, Storage Resource Broker and Condor
Work with the Global Grid Forum and major computing companies (IBM, Oracle, Microsoft, Sun, …)
–Aim to ‘industry harden’ Grid software to be capable of realizing the secure VO vision

15 IBM Grid Press Release: 2/8/01
Interview with Irving Wladawsky-Berger:
‘Grid computing is a set of research management services that sit on top of the OS to link different systems together’
‘We will work with the Globus community to build this layer of software to help share resources’
‘All of our systems will be enabled to work with the grid, and all of our middleware will integrate with the software’

16 Particle Physics and Astronomy e-Science Projects
GridPP – links to EU DataGrid, CERN LHC Computing Project, US GriPhyN and Particle Physics Data Grid projects, and the iVDGL global Grid project
AstroGrid – links to EU AVO and US NVO projects

17 GridPP Project (1)
CERN LHC machine due to be completed by 2006
ATLAS/CMS experiments each involve more than 2000 physicists from more than 100 organisations in the USA, Europe and Asia
Within the first year of operation from 2006 (7?), each project will need to store, access, process and analyze 10 PetaBytes of data
Use hierarchical ‘Tiers’ of data and compute Centres providing 200 Tflop/s
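The storage figure above implies a sustained ingest rate that recurs later in the database-requirements slides. A quick back-of-envelope check, using only the 10 PB/year quoted above (decimal units assumed):

```python
# 10 PB per experiment per year, as quoted above (decimal units: 1 PB = 1000 TB)
petabytes_per_year = 10
terabytes = petabytes_per_year * 1000   # 10,000 TB
hours_per_year = 365 * 24               # 8,760 hours

# Sustained rate needed just to keep up with data taking
ingest_rate = terabytes / hours_per_year
print(round(ingest_rate, 2))            # prints: 1.14  (TB/hr, sustained)
```

So even averaged perfectly over a year, each experiment must absorb on the order of a terabyte per hour, before any reprocessing or replication to the Tier centres; peak rates would be higher.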

18 GridPP Project (2)
LHC data volume expected to reach 1 Exabyte and require several PetaFlop/s of compute power by 2015
Use simulated data production and analysis from 2002
–5% complexity data challenge (Dec 2002)
–20% of the 2007 CPU and 100% complexity (Dec 2005)
Start of LHC operation 2006 (7?)
Testbed Grid deployments from 2001

19 Interdisciplinary Research Collaborations (IRCs)
Equator: technological innovation in physical and digital life
AKT: Advanced Knowledge Technologies
DIRC: Dependability of Computer-Based Systems
From Medical Images and Signals to Clinical Information IRC
e-HealthCare Grand Challenge

20 EPSRC e-Science Projects (1)
Comb-e-Chem: Structure-Property Mapping
–Southampton, Bristol, Roche, Pfizer, IBM
DAME: Distributed Aircraft Maintenance Environment
–York, Oxford, Sheffield, Leeds, Rolls-Royce
RealityGrid: A Tool for Investigating Condensed Matter and Materials
–QMW, Manchester, Edinburgh, IC, Loughborough, Oxford, Schlumberger, …

21 EPSRC e-Science Projects (2)
MyGrid: Personalised Extensible Environments for Data-Intensive in silico Experiments in Biology
–Manchester, EBI, Southampton, Nottingham, Newcastle, Sheffield, GSK, AstraZeneca, IBM
GEODISE: Grid-Enabled Optimisation and Design Search for Engineering
–Southampton, Oxford, Manchester, BAE, Rolls-Royce
Discovery Net: High-Throughput Sensing Applications
–Imperial College, Infosense, …

22 Comb-e-Chem: Structure-Property Mapping
Goal is to integrate structure and property data sources within a knowledge environment to find new chemical compounds with desirable properties
Accumulate, integrate and model an extensive range of primary data from combinatorial methods
Support for provenance and automation, including multimedia and metadata
Southampton, Bristol, Cambridge Crystallographic Data Centre, Roche Discovery, Pfizer, IBM

23 MyGrid: An e-Science Workbench
Goal is to develop a ‘workbench’ to support:
–Experimental process of data accumulation
–Use of community information
–Scientific collaboration
Provide facilities for resource selection, data management and process enactment
Bioinformatics applications
–Functional genomics, database annotation
Manchester, EBI, Newcastle, Nottingham, Sheffield, Southampton, GSK, AstraZeneca, Merck, IBM, Sun, …

24 Grid Database Requirements (1)
Scalability
–Store PetaBytes of data at TB/hr rates
–Low response time for complex queries retrieving data for further processing
–Large numbers of clients needing high access throughput
Grid standards for security, accounting, …
–GSI with digital certificates
–Data from multiple DBMS
–Co-schedule database and compute servers

25 Grid Database Requirements (2)
Handling unpredictable usage
–Most existing DB applications have reasonably predictable access patterns and usage, and DB resources can be restricted
–Typical commercial applications generate large numbers of small transactions from a large number of users
–Grid applications can have a small number of large transactions needing more ad hoc access to DBMS resources
 Much greater variations in time and resource usage

26 Grid Database Requirements (3)
Metadata-driven access
–Expect to need two-step access to data
Step 1: metadata search to locate the required data on one or more DBMS
Step 2: data accessed and sent to a compute server for further analysis
–The application writer does not know which specific DBMS is accessed in Step 2
 Need a standard API for Grid-enabled DBMS
Multiple database integration
–Support distributed queries and transactions
–Scalability requirements
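The two-step pattern above can be sketched as code. Everything here (`MetadataCatalogue`, `DataStore`, the tier names and table names) is hypothetical, invented to illustrate the pattern; the key property is that the application queries the catalogue first and never hard-codes which DBMS holds the data.

```python
# Sketch of the two-step, metadata-driven access pattern described above.
# All names here are illustrative, not a real Grid API.

class MetadataCatalogue:
    """Step 1: a metadata search locates which DBMS holds the required data."""
    def __init__(self, entries):
        self.entries = entries  # list of (metadata dict, dbms name, query)

    def locate(self, **criteria):
        return [(dbms, query) for meta, dbms, query in self.entries
                if all(meta.get(k) == v for k, v in criteria.items())]

class DataStore:
    """Step 2: data is fetched from whichever DBMS the catalogue pointed at.
    A standard API would hide which specific DBMS is being accessed."""
    def __init__(self, tables):
        self.tables = tables

    def fetch(self, query):
        return self.tables[query]

catalogue = MetadataCatalogue([
    ({"experiment": "ATLAS", "year": 2006}, "tier1-ral", "events_2006"),
    ({"experiment": "CMS", "year": 2006}, "tier1-cern", "cms_2006"),
])
stores = {"tier1-ral": DataStore({"events_2006": [1, 2, 3]})}

# Step 1: metadata search; Step 2: fetch from the located store
for dbms, query in catalogue.locate(experiment="ATLAS", year=2006):
    data = stores[dbms].fetch(query)
    print(dbms, data)   # prints: tier1-ral [1, 2, 3]
```

In a real deployment the fetched data would be shipped to a co-scheduled compute server rather than returned in-process, but the indirection through metadata is the same.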

27 Grid-Service Interface to DBs (Thoughts of Paul Watson)
Services would include:
Metadata
–Used by location and directory services
Query
–Use GridFTP; support streaming and computation co-scheduling?
Transactions
–Support distributed transactions via a Virtual DBMS?

28 Grid-Service Interface to DBs (continued)
Bulk loading
–Use GridFTP?
Scheduling
–Allow DBMS and compute resources to be co-scheduled and bandwidth pre-allocated
–Major challenge for DBMS to support resource pre-allocation and management?
Accounting
–Provide information for Grid accounting and capacity planning
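Taken together, the service list on these two slides suggests an interface a Grid-enabled DBMS might expose. The sketch below is hypothetical (no such standard existed at the time; method names are invented), with a toy in-memory implementation only to show the shape of the contract.

```python
from abc import ABC, abstractmethod

class GridDatabaseService(ABC):
    """Hypothetical sketch of the grid-service interface to a DBMS
    suggested above; method names are illustrative, not a real standard."""

    @abstractmethod
    def metadata(self):
        """Describe the held data, for location and directory services."""

    @abstractmethod
    def query(self, predicate):
        """Run a query; results might be streamed (e.g. over GridFTP)."""

    @abstractmethod
    def bulk_load(self, records):
        """Bulk-load data into the DBMS."""

    @abstractmethod
    def reserve(self, cpu, bandwidth):
        """Pre-allocate DBMS resources for co-scheduling with compute."""

    @abstractmethod
    def usage(self):
        """Report usage for Grid accounting and capacity planning."""

class InMemoryService(GridDatabaseService):
    """Toy in-memory implementation, just to exercise the interface."""
    def __init__(self):
        self.rows, self.queries = [], 0
    def metadata(self):
        return {"rows": len(self.rows)}
    def query(self, predicate):
        self.queries += 1
        return [r for r in self.rows if predicate(r)]
    def bulk_load(self, records):
        self.rows.extend(records)
    def reserve(self, cpu, bandwidth):
        return True  # toy: always grants the reservation
    def usage(self):
        return {"queries": self.queries}

svc = InMemoryService()
svc.bulk_load([{"run": 1}, {"run": 2}])
print(svc.query(lambda r: r["run"] > 1))   # prints: [{'run': 2}]
print(svc.metadata(), svc.usage())
```

The hard parts flagged on the slides, distributed transactions and genuine resource pre-allocation inside the DBMS, are exactly the methods a toy version trivialises.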

29 Summary
Application projects use clusters, supercomputers and data repositories
Emphasis on support for data federation and annotation as much as computation
Metadata and ontologies are key to higher-level Grid services
For commercial success, the Grid needs an interface to DBMS
