Research at the National e-Science Centre Dr. Dave Berry Research Manager www.nesc.ac.uk 6th November 2003.


1 Research at the National e-Science Centre Dr. Dave Berry Research Manager www.nesc.ac.uk 6th November 2003

2 Three Pillars of e-Science Research Foundations, Technology, Applications Apply known results Focus for new work Enable new science Steering of development Edinburgh: - Informatics - Physics & Astronomy Glasgow: - Computing Science - Physics & Astronomy EPCC ETF & Testbeds edikt Repositories Computing Industry Research Departments Research Institutes Other Universities Commercial Customers

3 Information Grids Foundations, Technology, Applications Apply known results Focus for new work Enable new science Steering of development Publishing Scientific Data GridPP ScotGrid QCDGrid OGSA-DAI/DAIT edikt – eldas and BinX ODD-Genes AstroGrid BRIDGES FirstDIG Biological Spatio-Temporal Databases 1,000th Download Sep 2003 Peter Buneman’s Group Tony Doyle & Steve Playfer Richard Kenway Richard Baldock

4 Computation Grids Foundations, Technology, Applications Apply known results Focus for new work Enable new science Steering of development GridPP ScotGrid RealityGrid Enhance SunDCG ODD-Genes PGPGrid Murray Cole > 3000 doc downloads Paul Cockshott

5 Fabrics and Platforms Foundations, Technology, Applications Apply known results Focus for new work Enable new science Steering of development AMUSE Dynamic Configuration of Grid Fabrics Dependable Grid Services MS.NETGrid GridWeaver OGSA Test Grid IBM Grid Evaluation Joe Sventek Stuart Anderson LCFG + SmartFrog

6 More foundations Service Composition Deductive Synthesis Techniques … Inferring QoS Properties for Grid Applications Mobile Code Mobile Resource Guarantees IRCs CoAKTinG EQUATOR Security Technologies for Information Environment Security Alan Bundy Don Sannella, Stephen Gilmore Austin Tate Matthew Chalmers

7 More applications Physics CDF Grid Development NeuroInformatics Grid-enabled Modelling Tools and Databases for Neuroinformatics BioInformatics e-Diamond (mammography) http://www.nesc.ac.uk/projects/ David Wilshaw Rob Procter

8 Data Repositories Medical Genetics Generation Scotland Human Genetics Unit Mouse Atlas Nuclear Protein Database Roslin Institute ArkDB, Informatics EUSTACE Corpus FlyTrap GeoSciences Antarctic Survey data Continental seismic survey data BGS offshore survey

9 Example: ODD-Genes ODD-Genes is a demonstrator Demonstrates how Grid technologies enable e-Science, accelerating scientific discovery SunDCG’s TOG software allows for job submission on remote compute resources OGSA-DAI provides access, control and discovery of data resources ODD-Genes used to investigate Wilms Tumour Routine statistical conditioning of microarray results Data-driven discovery of novel targets for investigation and potential therapy Collaborative project NeSC/EPCC Scottish Centre for Genomic Technology and Informatics (GTI) Human Genetics Unit at MRC, Western General Hospital (HGU) "This project has demonstrated how Grid technologies can be used to enable true e-Science - discoveries that would not otherwise have been achieved without this infrastructure in place." Professor Peter Ghazal, Director, GTI.

10 SunDCG – Enabling Routine Statistical Conditioning Choose analysis to perform Automates analysis process Provides predetermined workflow Can run more than one analysis at a time Multiple reproducible avenues for investigation Reduces cost (human, machine), increases availability TOG enables this by allowing access to HPC resources

11 SunDCG Compute Scheduler B Grid Engine abcd e efgh d A Globus 2 User A User B Integrates Grid Engine and Globus 2 GE execution methods provide job submission/control GE job context stores job specific information Globus GSI for security Globus GRAM enables interaction with remote resource GASS for small data transfer, GridFTP for large datasets

12 OGSA-DAI - Results Investigation Multiple views of data Raw Heat Map Cluster Map Wilms Tumour study takes a new direction two genes appear significant in early development Researchers would like more info on these genes…

13 OGSA-DAI - Data Resource Discovery OGSA-DAI uses keywords to locate relevant data resources May return data resources previously unknown to researcher Researcher selects most interesting data resource to query for information about gene Researcher selects Mouse Atlas – narrow, deep database of spatial gene expression in mouse embryonic development Contrast with GTI database of broad, shallow genome-wide gene expression across multiple organisms, stages & conditions

14 OGSA-DAI - Data Resource Query OGSA-DAI returns data from query Data and annotation displayed Data contains references to related images Researcher rapidly moves from numeric and textual description to spatial representation of relevant gene expression These show that the genes are stem cell markers Targets for focussed investigation, potential therapy

15
1a. Request to Registry for sources of data about “x”
1b. Registry responds with Factory handle
2a. Request to Factory for access to database
2b. Factory creates GridDataService to manage access
2c. Factory returns handle of GDS to client
3a. Client queries GDS with XPath, SQL, etc.
3b. GDS interacts with database
3c. Results of query returned to client as XML
Diagram labels: Client, Registry, Factory, Grid Data Service, XML / Relational database, SOAP/HTTP, service creation, API interactions, Data Access & Integration Services
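The Registry, Factory and Grid Data Service interaction on this slide can be sketched in a few lines of Python. This is only an in-memory illustration of the pattern, not the OGSA-DAI API: the class names `Registry`, `Factory` and `GridDataService`, their methods, the `expression` table and the `geneA` row are all hypothetical, SQLite stands in for the real database, and ordinary method calls stand in for SOAP/HTTP.

```python
import sqlite3
from xml.sax.saxutils import escape


class GridDataService:
    """Hypothetical stand-in for a GDS: manages access to one database."""

    def __init__(self, connection):
        self._connection = connection

    def perform(self, sql):
        # 3b. The GDS interacts with the database on the client's behalf.
        cursor = self._connection.execute(sql)
        columns = [description[0] for description in cursor.description]
        # 3c. The results of the query go back to the client as XML.
        rows = []
        for record in cursor.fetchall():
            cells = "".join(
                f"<{name}>{escape(str(value))}</{name}>"
                for name, value in zip(columns, record)
            )
            rows.append(f"<row>{cells}</row>")
        return "<results>" + "".join(rows) + "</results>"


class Factory:
    """Hypothetical factory: creates a GDS for the database it fronts."""

    def __init__(self, connection):
        self._connection = connection

    def create_service(self):
        # 2b. Create a GDS to manage access; 2c. return its handle.
        return GridDataService(self._connection)


class Registry:
    """Hypothetical registry: maps keywords to factory handles."""

    def __init__(self):
        self._factories = {}

    def register(self, keyword, factory):
        self._factories.setdefault(keyword, []).append(factory)

    def find(self, keyword):
        # 1a/1b. Answer a keyword request with the matching factory handles.
        return list(self._factories.get(keyword, []))


def run_demo():
    database = sqlite3.connect(":memory:")
    database.execute("CREATE TABLE expression (gene TEXT, level REAL)")
    database.execute("INSERT INTO expression VALUES ('geneA', 2.5)")

    registry = Registry()
    registry.register("gene expression", Factory(database))

    factory = registry.find("gene expression")[0]   # steps 1a, 1b
    service = factory.create_service()              # steps 2a, 2b, 2c
    return service.perform("SELECT gene, level FROM expression")  # step 3
```

The point of the indirection is that the client only needs to know a keyword: the registry decides which factory is relevant, and the factory decides how a service instance is bound to the underlying database.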

16 Example: Mobile Resource Guarantees The MRG technology consists of programming languages; type systems for the languages; logics for expressing statements of resource consumption; and proof technology for proving these statements. Camelot, a high-level functional programming language with objects and resource control; Grail, a strongly-typed intermediate language which is the target language of the Camelot compiler and is interconvertible with Java byte code; A cost model, a formal semantics for byte code execution which tracks execution time and space allocation; A byte code logic allowing the expression of costs, embedded in a generic proof system (Isabelle).
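To make the idea of a cost model concrete, here is a toy sketch in Python. It is not Grail, Camelot or the MRG byte code logic: the three-instruction stack machine, the function `run_with_costs` and the exception `ResourceBoundExceeded` are invented for illustration. It also checks bounds while the program runs, whereas MRG aims to prove such bounds before execution; the sketch only shows what "tracking execution time and space allocation" means.

```python
class ResourceBoundExceeded(Exception):
    """Raised when a program would exceed its declared time or space bound."""


def run_with_costs(program, max_steps, max_cells):
    """Run a toy stack program while counting steps (time) and cells (space).

    Instructions (all hypothetical):
      ("push", n)  push the constant n
      ("add",)     pop two values and push their sum
      ("alloc",)   pop a value and store it in a fresh heap cell
    """
    stack = []
    heap = []
    steps = 0
    for op, *args in program:
        steps += 1  # every instruction costs one unit of time
        if steps > max_steps:
            raise ResourceBoundExceeded(f"more than {max_steps} steps")
        if op == "push":
            stack.append(args[0])
        elif op == "add":
            right = stack.pop()
            left = stack.pop()
            stack.append(left + right)
        elif op == "alloc":
            # each allocation costs one unit of space
            if len(heap) + 1 > max_cells:
                raise ResourceBoundExceeded(f"more than {max_cells} cells")
            heap.append(stack.pop())
        else:
            raise ValueError(f"unknown instruction: {op}")
    return {"steps": steps, "cells": len(heap), "heap": heap}


# A four-instruction program: compute 2 + 3 and store the result.
PROGRAM = [("push", 2), ("push", 3), ("add",), ("alloc",)]
```

Running `run_with_costs(PROGRAM, max_steps=10, max_cells=1)` reports four steps and one heap cell; lowering either bound makes the same program fail with `ResourceBoundExceeded`.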

17 Resource-bounded mobile code

18 Relevance to Grids Grid service providers need to schedule competing requests for access to resources. With 25Kb of code and 1Pb of sky survey data it is infeasible to ship the data to the code. There are projects which have supported scientific programming in functional languages (e.g. Psicho). An alternative would be to transfer the MRG technology to Java or Java-like languages (ESC/Java, SpecialJ, and Pizza).
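A back-of-the-envelope calculation shows why the data cannot be shipped to the code. The sketch below assumes that "25Kb" means 25 kilobytes and "1Pb" means 1 petabyte (decimal units), and it assumes a 1 Gbit/s link with no protocol overhead; none of these figures come from the slides.

```python
def transfer_seconds(size_bytes, link_bits_per_second):
    """Idealised transfer time: payload bits divided by link speed."""
    return size_bytes * 8 / link_bits_per_second


CODE_BYTES = 25 * 10**3   # assumed reading of "25Kb": 25 kilobytes
DATA_BYTES = 10**15       # assumed reading of "1Pb": 1 petabyte
LINK_BPS = 10**9          # assumed link speed: 1 Gbit/s

code_time = transfer_seconds(CODE_BYTES, LINK_BPS)
data_time = transfer_seconds(DATA_BYTES, LINK_BPS)

print(f"code: {code_time:.4f} s")
print(f"data: {data_time / 86_400:.1f} days")
print(f"data is {DATA_BYTES // CODE_BYTES:,} times larger than the code")
```

Under these assumptions the code moves in a fraction of a millisecond while the data would take about three months, a size ratio of forty billion to one.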

19 Example: AMUSE Autonomic Management of Ubiquitous Systems for e-Health Automated management of complex distributed application systems Architectural pattern and prototype implementations for closed-loop management of such systems Policy-based management AMUSE will integrate these to address automated management of e-Health applications

20 Closed-loop Management Pattern (Self-Managed Cell) Measurement Adapters “System” Under Test Provisioning Analysis, Simulation, Optimization Measurement “System” Configuration Service Goals System Policy Policy Management Topology, Other Event Bus Trends & Prediction Raw Measurement Management Application
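The closed-loop pattern on this slide (measure, analyse against policy, provision, repeat) can be sketched as follows. This is a toy model, not AMUSE code: the `System` and `Policy` classes, the latency formula and the one-worker-at-a-time rule are assumptions made for illustration, and a plain list stands in for the event bus.

```python
from dataclasses import dataclass


@dataclass
class System:
    """Hypothetical managed system: latency falls as workers are added."""
    load: float    # offered load, arbitrary units
    workers: int   # current configuration

    def measure_latency(self):
        return self.load / self.workers


@dataclass
class Policy:
    """Hypothetical service goal plus a limit on provisioning."""
    max_latency: float
    max_workers: int


def analyse(latency, system, policy):
    """Return the desired worker count, or None if no change is needed."""
    if latency > policy.max_latency and system.workers < policy.max_workers:
        return system.workers + 1
    return None


def control_loop(system, policy, max_iterations=20):
    """Run the loop until the goal is met; return the events it produced."""
    events = []  # stands in for the event bus
    for _ in range(max_iterations):
        latency = system.measure_latency()          # measurement
        events.append(("measurement", latency))
        target = analyse(latency, system, policy)   # analysis against policy
        if target is None:
            break
        system.workers = target                     # provisioning
        events.append(("provision", target))
    return events
```

For example, `control_loop(System(load=100.0, workers=2), Policy(max_latency=20.0, max_workers=10))` adds workers one at a time and stops once the measured latency no longer exceeds the goal.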

21 Two-level nesting Management Application Level n Agents “System” Prov Infer Meas Config Policy Event Bus Measurement Adapter Provisioning Analysis, Simulation, Optimization Measurement “System” Configuration Service Goals System Policy Policy Management Topology, Other Event Bus Trends & Prediction Raw Measurement Level n-2 Level n-1

22 GGF: Standardisation Grid Research Oversight Committee & Programme Committee Prof. Malcolm Atkinson Data Access and Integration Services Working Group Dr Mario Antonioletti (Group Secretary & Editor), Dr Amy Krause (Editor) Prof. Malcolm Atkinson, Dr Martin Westhead, Neil Chue Hong (Authors) Dr. Mike Jackson Data Format Definition Language Working Group Dr Martin Westhead (Founder and Chair) Job Submission Definition Language Working Group Dr Ali Anjomshoaa (founder and chair) Open Grid Services Architecture Working Group Dr Dave Berry Open Grid Services Infrastructure Working Group Dr Mike Jackson, Daragh Byrne

23 Data Services GGF Data Access & Integration Services (DAIS) OGSI-compliant interfaces to access relational and XML databases Needs to be generalized to encompass other data sources (see next slide…) Generalized DAIS becomes the foundation for: Replication: Data located in multiple locations Federation: Composition of multiple sources Provenance: How was data generated?

24
1a. Request to Registry for sources of data about “x” & “y”
1b. Registry responds with Factory handle
2a. Request to Factory for access and integration from resources Sx and Sy
2b. Factory creates GridDataServices network
2c. Factory returns handle of GDS to client
3a. Client submits sequence of scripts, each with a set of queries to GDS with XPath, SQL, etc.
3b. Client tells analyst
3c. Sequences of result sets returned to analyst as formatted binary described in a standard XML notation
Diagram labels: Future DAI Services, Client, Analyst, Data Registry, Data Access & Integration master, GDS, GDS 1, GDS 3, GDTS, GDTS 1, GDTS 2, Sx, Sy, XML database, Relational database, SOAP/HTTP, service creation, API interactions, “scientific” Application coding scientific insights, Problem Solving Environment, Semantic Meta data, Application Code

25 Take Home Message In addition to our national services, NeSC has a thriving research programme Foundation departments Technology development (EPCC, NeSC, Globus Alliance) Research scientists Wide breadth of interest Particular focus on scientific data OGSA-DAI is here now Join in making better DAI services & standards Bioinformatics and Astronomy are Priority Application Areas There are many opportunities for collaboration

26 OGSA Infrastructure Architecture OGSI: Interface to Grid Infrastructure Data Intensive Applications for Science X Compute, Data & Storage Resources Distributed Simulation, Analysis & Integration Technology for Science X Data Intensive Users Virtual Integration Architecture Generic Virtual Data Access and Integration Layer Structured Data Integration Structured Data Access Structured Data Relational XML Semi-structured- Transformation Registry Job Submission Data Transport Resource Usage Banking Brokering Workflow Authorisation

27 ODD-Genes Caveats & Further Work ODD-Genes is a demonstrator Need to develop production applications for both routine statistical processing and data resource discovery and query Need to parameterise routine conditioning appropriately to complete automation ODD-Genes requires GRID infrastructure Participating researchers need to partner with centres who host application front-ends (or, host the infrastructure themselves) However, alternatives often proprietary, expensive, less flexible ODD-Genes requires registration by data-hosts Critical mass of registered data sources.

28 SunDCG - Conditioning Results Results of conditioning can be analysed and investigated Researcher has potentially several views of data to explore, all presented simultaneously in parallel (cf. traditional serialised, manual process) Researcher can reproduce this initial condition for repeated analyses Researcher need not perform each step manually and serially, or ask dedicated statistician to do so.

29 “OGSA Data Services” Foster, Tuecke, Unger, editors Describes conceptual model for representing all manner of data sources as Web services Database, filesystems, devices, programs, … Integrates WS-Agreement Data service is an OGSI-compliant Web service that implements one or more of base data interfaces: DataDescription, DataAccess, DataFactory, DataManagement These would be extended and combined for specific domains (including DAIS)

30 OGSA-DAI Approach Reuse existing technologies and standards OGSA, Query languages, Java, transport Build portTypes and services which will enable: controlled exposure of heterogeneous data resources on an OGSI-compliant grid access to these resources via common interfaces using existing underlying query mechanisms (ultimately) data integration across distributed data resources OGSA-DAI (the software) seeks to be a reference implementation of the GGF DAIS WG standard Can’t keep up with frequent standard changes, so software releases track specific drafts See http://www.ogsadai.org.uk/ for details.

