The OGSA-DAI Project Databases and the Grid Neil Chue Hong Project Manager EPCC, Edinburgh


1 http://www.ogsadai.org.uk The OGSA-DAI Project Databases and the Grid Neil Chue Hong Project Manager EPCC, Edinburgh N.ChueHong@epcc.ed.ac.uk

2 http://www.ogsadai.org.uk What is OGSA-DAI?  It is a project: –OGSA Data Access and Integration: funded by the UK eScience Grid Core Programme  It is a vision: –From simple database access to truly virtualised data resources  It is a standard: –The GridDataService Specification from the Data Access and Integration Working Group (DAIS-WG) of the Global Grid Forum (GGF)  It is software that you can use: –Current version is R2.5

3 http://www.ogsadai.org.uk OGSA-DAI Objective  To define: –open standards and –open source based –uniform service interfaces –for accessing heterogeneous data sources –within the Open Grid Services Architecture (OGSA) framework  Why? –Because we increasingly want to integrate data sources from different organisations –The Grid and OGSA appear to provide a framework for producing software to do this

4 http://www.ogsadai.org.uk Who are we? £3 million, 18 months, started February 2002. Funded by the Grid Core Programme. 373 man months. Contributing to the global grid computing community: EPCC & NeSC, IBM UK, IBM USA, Manchester e-SC, Newcastle e-SC, Oracle. [Map labels: IBM USA, Oxford, Glasgow, Cardiff, Southampton, London, Belfast, Daresbury Lab, RAL, EPCC & NeSC, Newcastle, IBM Hursley, Oracle, Manchester, Cambridge, Hinxton]

5 http://www.ogsadai.org.uk What are we doing? Grid Plumbing & Security Infrastructure Scheduling Accounting Monitoring Diagnosis Logging Data Intensive Applications Data & Storage Resources Distributed Scientific Data Mining & Integration Technology

6 http://www.ogsadai.org.uk What are we doing? Grid Plumbing & Security Infrastructure Scheduling Accounting Monitoring Diagnosis Logging Data Intensive Applications Data & Storage Resources Distributed Authorisation Data Access Data Integration Structured Data Scientific Data Mining & Integration Technology

7 http://www.ogsadai.org.uk What are we doing? Grid Plumbing & Security Infrastructure Scheduling Accounting Monitoring Diagnosis Logging Data Intensive Applications Data & Storage Resources Distributed Authorisation Data Access Data Integration Structured Data Scientific Data Mining & Integration Technology Operations Team App. Developers Owners

8 http://www.ogsadai.org.uk What are we doing? Grid Plumbing & Security Infrastructure Scheduling Accounting Monitoring Diagnosis Logging Data Intensive Applications Data & Storage Resources Distributed Authorisation Data Access Data Integration Structured Data Scientific Data Mining & Integration Technology Operations Team App. Developers Owners Data Intensive Application Scientists Data Providers Data Curators Tech. Developers

9 http://www.ogsadai.org.uk What are we doing? Grid Plumbing & Security Infrastructure Scheduling Accounting Monitoring Diagnosis Logging Data Intensive Applications Data & Storage Resources Distributed Authorisation Data Access Data Integration Structured Data Scientific Data Mining & Integration Technology Operations Team App. Developers Owners Data Intensive Application Scientists Data Providers Data Curators Tech. Developers Keep all the groups happy

10 http://www.ogsadai.org.uk Project Requirements  Derived from project requirements survey –see DAIS WG  Driven by Technical Authority and Early Adopters –AstroGrid –MyGrid  Close relationship with many other projects

11 http://www.ogsadai.org.uk DAIS WG  GridDatabaseService Specification –DAIS WG of the GGF –Aims to produce a V1.0 specification by early 2004 –Defines an interface for a GridDatabaseService –Many contributors, not just the OGSA-DAI Project –OGSA-DAI (the software) seeks to be a reference implementation of this standard (but does not necessarily track it exactly just now) –Requirements and Overview Informational documents also published

12 http://www.ogsadai.org.uk The OGSA-DAI Approach  Reuse existing technologies and standards –OGSA, Query languages, Java, transport  Three key services: –GridDataService –GridDataServiceFactory –DAIServiceGroupRegistry  Benefits: –Location independence –Hides heterogeneity –Scalable –Flexible –Dynamic

13 http://www.ogsadai.org.uk OGSA-DAI Positioning - Today [Layered diagram; labels: Location; Meta Data; Notification; OGSA; Lifetime; Drivers; Query (Create, Retrieve, Update, Delete); Data Format; OGSA-DAI Basic Services; OGSA-DAI Distributed Query; Delivery; Database, Communication, OS… Technology; GDS; DAISGR; GDSF]

14 http://www.ogsadai.org.uk OGSA-DAI in one slide

15 http://www.ogsadai.org.uk OGSA-DAI To Date  Assuming that OGSA becomes the standard framework –Have adopted the OGSA approach  Have first concentrated on data access –Released software has only limited data integration so far –Distributed query processor prototype due in July  Implementation provides focus on basic functionality first –But architecturally we have tried to answer many pertinent questions –Functionality will increase over subsequent releases

16 http://www.ogsadai.org.uk GDS in action 1a. Request to Registry for sources of data about “x” 1b. Registry responds with Factory handle 2a. Request to Factory for access to database 2b. Factory creates GridDataService to manage access 2c. Factory returns handle of GDS to client 3a. Client queries GDS with SQL, XPath, XQuery etc 3b. GDS interacts with database 3c. Results of query returned to client as XML, OR 3d. Results of query delivered to consumer as XML [Diagram labels: Analyst; Registry (DAISGR); Factory (GDSF); Grid Data Service (GDS); Consumer; Database (Xindice, MySQL, Oracle, DB2). Legend: SOAP/HTTP; service creation; API interactions]
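
The registry, factory and service steps on this slide can be sketched in a few lines of Python. This is only an in-process illustration, not the OGSA-DAI API: the class names `Registry`, `GridDataServiceFactory` and `GridDataService`, their methods, the `readings` table and the use of SQLite are all assumptions made for the example, and no SOAP/HTTP transport is involved.

```python
import sqlite3
from xml.etree import ElementTree as ET


class GridDataService:
    """Hypothetical stand-in for a GDS: manages access to one database."""

    def __init__(self, connection):
        self._connection = connection

    def perform(self, sql, params=()):
        # Steps 3a-3c: run the query and return the rows as an XML string.
        cursor = self._connection.execute(sql, params)
        columns = [description[0] for description in cursor.description]
        root = ET.Element("results")
        for row in cursor:
            row_element = ET.SubElement(root, "row")
            for name, value in zip(columns, row):
                ET.SubElement(row_element, name).text = str(value)
        return ET.tostring(root, encoding="unicode")


class GridDataServiceFactory:
    """Hypothetical stand-in for a GDSF: creates GDS instances on request."""

    def __init__(self, connection):
        self._connection = connection

    def create_service(self):
        # Steps 2a-2c: create a service and hand it back to the client.
        return GridDataService(self._connection)


class Registry:
    """Hypothetical stand-in for a DAISGR: maps topics to factories."""

    def __init__(self):
        self._factories = {}

    def register(self, topic, factory):
        self._factories.setdefault(topic, []).append(factory)

    def find(self, topic):
        # Steps 1a-1b: answer with the factories that know about the topic.
        return list(self._factories.get(topic, []))


# Set up one database and advertise it under the topic "x".
database = sqlite3.connect(":memory:")
database.execute("CREATE TABLE readings (id INTEGER, label TEXT)")
database.execute("INSERT INTO readings VALUES (1, 'first')")
registry = Registry()
registry.register("x", GridDataServiceFactory(database))

# Client side: registry lookup, service creation, then the query.
factory = registry.find("x")[0]
service = factory.create_service()
xml_result = service.perform("SELECT id, label FROM readings WHERE id = ?", (1,))
print(xml_result)
```

The client never touches the database connection directly; it only holds the handle the factory returned, which is the location independence the earlier slides describe.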

17 http://www.ogsadai.org.uk Activities  OGSA-DAI is structured around the concept of activities  This framework allows new functionality to be added easily  Three types of activity at present: –statement (e.g. SQLQuery, XUpdate) –transformation (e.g. XSL translation, compression) –delivery (e.g. GridFTP)  OGSA-DAI provides implementations of common functionality; others can extend it
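
One way to picture an extensible activity framework is a registry of named steps that are chained together, with one step of each type: statement, transformation, delivery. The sketch below is an assumption-laden illustration in Python, not OGSA-DAI code: the `activity` decorator, the activity names and `run_pipeline` are invented for the example, and delivery to an in-memory list stands in for something like GridFTP.

```python
import sqlite3
import zlib

# Registry of available activities, keyed by a hypothetical activity name.
ACTIVITIES = {}


def activity(name):
    """Register a function as an activity; adding new ones needs no core change."""
    def register(func):
        ACTIVITIES[name] = func
        return func
    return register


@activity("sqlQuery")  # statement activity
def sql_query(_data, *, connection, sql):
    return connection.execute(sql).fetchall()


@activity("toCsv")  # transformation activity
def to_csv(rows):
    return "\n".join(",".join(str(value) for value in row) for row in rows)


@activity("compress")  # transformation activity
def compress(text):
    return zlib.compress(text.encode("utf-8"))


@activity("deliverToList")  # delivery activity (stand-in for a real transport)
def deliver_to_list(data, *, sink):
    sink.append(data)
    return data


def run_pipeline(steps):
    """Run activities in order, feeding each one the previous output."""
    data = None
    for name, options in steps:
        data = ACTIVITIES[name](data, **options)
    return data


database = sqlite3.connect(":memory:")
database.execute("CREATE TABLE t (id INTEGER, label TEXT)")
database.executemany("INSERT INTO t VALUES (?, ?)", [(1, "a"), (2, "b")])

outbox = []
run_pipeline([
    ("sqlQuery", {"connection": database, "sql": "SELECT id, label FROM t ORDER BY id"}),
    ("toCsv", {}),
    ("compress", {}),
    ("deliverToList", {"sink": outbox}),
])
print(zlib.decompress(outbox[0]).decode("utf-8"))
```

Because the pipeline looks activities up by name, a new transformation or delivery mechanism is added by registering one more function.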

18 http://www.ogsadai.org.uk Documents  Accessing a Grid Data Resource is done using Documents –caveat: this may change  A document allows you to: –define parameters –execute activities –deliver results  Written in XML, normally used by a client. [Example document; XML markup lost in extraction. Surviving text: 10 SELECT * FROM littleblackbook WHERE id=?]
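
The fragment that survives on this slide (the value 10 followed by a query containing a `?` placeholder) suggests a document that binds a parameter to a SQL statement. The original XML is not recoverable, so the sketch below uses a made-up document layout: the element names `perform`, `sqlQuery`, `parameter` and `expression`, the `execute_document` helper and the two-column `littleblackbook` table are all assumptions, with SQLite standing in for the real database.

```python
import sqlite3
from xml.etree import ElementTree as ET

# Hypothetical document layout; NOT the real OGSA-DAI document schema.
DOCUMENT = """
<perform>
  <sqlQuery name="statement">
    <parameter position="1">10</parameter>
    <expression>SELECT * FROM littleblackbook WHERE id=?</expression>
  </sqlQuery>
</perform>
"""


def execute_document(connection, document):
    """Run every sqlQuery in the document and return rows keyed by query name."""
    root = ET.fromstring(document.strip())
    results = {}
    for query in root.findall("sqlQuery"):
        # Order the parameters by their declared position before binding.
        parameters = sorted(
            query.findall("parameter"),
            key=lambda element: int(element.get("position")),
        )
        values = [element.text for element in parameters]
        sql = query.findtext("expression")
        # Values are bound to the ? placeholders, never pasted into the SQL text.
        results[query.get("name")] = connection.execute(sql, values).fetchall()
    return results


database = sqlite3.connect(":memory:")
database.execute("CREATE TABLE littleblackbook (id INTEGER, name TEXT)")
database.executemany(
    "INSERT INTO littleblackbook VALUES (?, ?)",
    [(10, "Ada"), (11, "Grace")],
)

document_results = execute_document(database, DOCUMENT)
print(document_results)
```

Keeping the parameter separate from the expression means the same document shape can be reused with different values, and the database driver handles quoting.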

19 http://www.ogsadai.org.uk OGSA-DAI Core Services  OGSA-DAI Release 2.5 – out now –Java, Tomcat, Globus Toolkit 3 Beta –Supports MySQL, DB2, Xindice; SQL92, XPath, XUpdate  OGSA-DAI Release 3 – end July –Java, Tomcat, Globus Toolkit 3.0 –Supports MySQL, DB2, Oracle, Xindice; SQL92, XPath, XUpdate –Adds Notification, Internationalisation, Transactions, Caching  Continue to track Globus Toolkit 3 releases –Experimental, then production, GT3 grids will help

20 http://www.ogsadai.org.uk Data Resource Implementation Mapping

21 http://www.ogsadai.org.uk Activity Mapping

22 http://www.ogsadai.org.uk Asynchronous Delivery  Asynchronous delivery – Pull [Diagram labels: Client, Consumer, DB, GDS, GDT, GDS Instance; Ra, Rs, Q, DT; steps 1, 2, 3; GSH/R + data id; D + GDH]  Asynchronous delivery – Push [Diagram labels: Client, Consumer, DB, GDS, GDT, GDS Instance; Ra, Rs, DT; steps 1, 2, 3; Q + D + GSH/R; GSH/R]

23 http://www.ogsadai.org.uk GDS Composition [Diagram: five numbered configurations (1 to 5) built from Client, GDS, Operation and DB boxes]

24 http://www.ogsadai.org.uk Distributed Query Service  A higher level service: –Extension of the Polar* query processor; partitions and schedules queries –Sits on top of OGSA and OGSA-DAI  Defines new portTypes and services –GridDistributedQuery (GDQ) PortType –GridDistributedQueryService (GDQS) – wraps Polar* –GridQueryEvaluatorService (GQES) – performs subqueries  Currently based on OGSA-DAI Release 1.5
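
The idea of partitioning a query into subqueries that separate evaluators run can be illustrated with a very small coordinator. This is a hand-written sketch, not Polar* and not the GDQS/GQES interfaces: `QueryEvaluator`, `distributed_join`, the `samples` and `measurements` tables and the fixed two-site plan are all assumptions, and the subqueries run one after the other rather than being scheduled in parallel.

```python
import sqlite3


class QueryEvaluator:
    """Hypothetical evaluator that runs one subquery against one database."""

    def __init__(self, connection):
        self._connection = connection

    def evaluate(self, sql, params=()):
        return self._connection.execute(sql, params).fetchall()


def distributed_join(sample_site, measurement_site, min_value):
    """Join sample names with measurements held at a different site."""
    # Subquery 1: filter at the site that owns the measurements, so only
    # matching rows have to be shipped to the coordinator.
    measured = measurement_site.evaluate(
        "SELECT sample_id, value FROM measurements "
        "WHERE value >= ? ORDER BY sample_id",
        (min_value,),
    )
    # Subquery 2: fetch the id-to-name mapping from the other site.
    names = dict(sample_site.evaluate("SELECT id, name FROM samples"))
    # Final step: combine the partial results at the coordinator.
    return [(names[sample_id], value) for sample_id, value in measured
            if sample_id in names]


# Two independent in-memory databases stand in for two remote sites.
site_a = sqlite3.connect(":memory:")
site_a.execute("CREATE TABLE samples (id INTEGER, name TEXT)")
site_a.executemany(
    "INSERT INTO samples VALUES (?, ?)", [(1, "s1"), (2, "s2"), (3, "s3")]
)

site_b = sqlite3.connect(":memory:")
site_b.execute("CREATE TABLE measurements (sample_id INTEGER, value REAL)")
site_b.executemany(
    "INSERT INTO measurements VALUES (?, ?)", [(1, 0.2), (2, 0.9), (3, 0.7)]
)

joined = distributed_join(QueryEvaluator(site_a), QueryEvaluator(site_b), 0.5)
print(joined)
```

A real distributed query processor would choose the plan itself, for instance deciding which site should filter first; here the plan is fixed in code to keep the example short.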

25 http://www.ogsadai.org.uk DQS Architecture

26 http://www.ogsadai.org.uk DQP in action

27 http://www.ogsadai.org.uk DQS: the future  The GridDistributedQueryService –is an example of a higher level data integration service which utilises OGSA-DAI core services –Assumes that GDSF, GDQS Factory and client live in different containers –Really requires a well-defined meta-model for the physical schema of a database Being partially addressed in DAIS WG –Shows how a GDS can be both client and service Service hierarchy and composition  DAIT (proposed follow-on to OGSA-DAI) would produce a robust reference implementation of the DQP components

28 http://www.ogsadai.org.uk Projects using OGSA-DAI  Industry: –FirstDIG: business process analysis (with First Transport Group) OGSA-DAI with datamining  Collaborative: –Bridges: database integration over six geographically distributed genomics research sites (with IBM UK) OGSA-DAI with DiscoveryLink –eDIKT: porting OGSA-DAI to other platforms OGSA-DAI with performance –DEISA: linking Europe’s HPC centres OGSA-DAI with distributed accounting –MS.Net Grid: porting OGSA-DAI to the .Net framework (with Microsoft Research UK) OGSA-DAI with .Net

29 http://www.ogsadai.org.uk ODD Genes  OGSA-DAI used to query gene expression data resources at GTI and HGU –One data resource: low spatial resolution, high gene resolution –Other resource: high spatial resolution, low gene resolution –Query one database and use data to find correct data resource to run more detailed query and produce visualisation –Simple example of data integration at work Client Query Render GTI GDS EPCC HGU

30 http://www.ogsadai.org.uk Project Timeline [Timeline figure. Axis: Feb ’02, May ’02, Jul ’02, Sep ’02, Dec ’02, Feb ’03, May ’03, Sep ’03. Labels: Phase 1 Starts; Phase 2 Starts; RDB + GT2 / OGSA Prototypes Available; XML + OGSA Prototype Available; Design Documents & Demos for DAIS WG @ GGF5; XML + OGSA Prototypes for Early Adopters; WS + GSI; GGF6 WG Papers & Prototypes; Ship Release 1 (Jan 15th 2003); Release 1.5 (Feb 28th 2003); UK support (> 100 downloads); Tutorial @ GGF7; OGSADAI Tutorial @ NeSC; Early Adopters Workshop @ NeSC; Tutorial @ NeSC; Release 2; Release 2.5; today; Release 3; TP4; TP5; GT3 A1; GT3 A2; GT3 A3; GT3 A4; GT3 Beta; GT3 Final]

31 http://www.ogsadai.org.uk A DAIT for the Future  DAIT (Data Access and Integration Two) –follow-on project from OGSA-DAI, funded for two years –continue to research, prototype and productise –release every six months, R4 in December 2003 –R4: support for SQL Server and structured filesystems; extended DBMS management functionality (e.g. archive); bulk load operations (where supported); support for DFDL file access; triggers exposed through notification –R5: Distributed Query Processing, Distributed Transactions; Virtualised views across databases

32 http://www.ogsadai.org.uk Further information  The OGSA-DAI Project Site: –http://www.ogsadai.org.uk  The DAIS-WG site: –http://cs.man.ac.uk/grid-db  OGSA-DAI Users Mailing list –users@ogsadai.org.uk –General discussion on grid data access and integration  Formal support for OGSA-DAI releases –http://www.ogsadai.org.uk/support + support@ogsadai.org.uk  OGSA-DAI training courses –http://www.ogsadai.org.uk/courses/

