The OGSA-DAI Project Databases and the Grid Neil Chue Hong Project Manager EPCC, Edinburgh


What is OGSA-DAI?
• It is a project:
  – OGSA Data Access and Integration, funded by the UK e-Science Grid Core Programme
• It is a vision:
  – From simple database access to truly virtualised data resources
• It is a standard:
  – The GridDataService Specification from the Database Access and Integration Services Working Group (DAIS-WG) of the Global Grid Forum (GGF)
• It is software that you can use:
  – Current version is R2.5

OGSA-DAI Objective
• To define:
  – open standards and
  – open source based
  – uniform service interfaces
  – for accessing heterogeneous data sources
  – within the Open Grid Services Architecture (OGSA) framework
• Why?
  – Because we increasingly want to integrate data sources from different organisations
  – The Grid and OGSA appear to provide a framework for producing software to do this

Who are we?
• £3 million, 18 months, started February 2002
• Funded by the Grid Core Programme
• 373 man months
• EPCC & NeSC, IBM UK, IBM USA, Manchester e-SC, Newcastle e-SC, Oracle
• Contributing to the global grid computing community
[Figure: map of sites. Labels: IBM USA, Oxford, Glasgow, Cardiff, Southampton, London, Belfast, Daresbury Lab, RAL, EPCC & NeSC, Newcastle, IBM Hursley, Oracle, Manchester, Cambridge, Hinxton]

What are we doing?
[Figure, built up over five slides. Diagram labels: Grid Plumbing & Security Infrastructure; Scheduling; Accounting; Monitoring; Diagnosis; Logging; Data Intensive Applications; Data & Storage Resources; Distributed; Authorisation; Data Access; Data Integration; Structured Data; Scientific Data Mining & Integration Technology. Groups: Operations Team; App. Developers; Owners; Data Intensive Application Scientists; Data Providers; Data Curators; Tech. Developers]
• Keep all the groups happy

Project Requirements
• Derived from project requirements survey
  – see DAIS WG
• Driven by Technical Authority and Early Adopters
  – AstroGrid
  – MyGrid
• Close relationship with many other projects

DAIS WG
• GridDatabaseService Specification
  – DAIS WG of the GGF
  – Aim to produce a V1.0 specification by early 2004
  – Defines an interface for a GridDatabaseService
  – Many contributors, not just the OGSA-DAI Project
  – OGSA-DAI (the software) seeks to be a reference implementation of this standard
    – But does not necessarily track it exactly just now
  – Requirements and Overview informational documents also published

The OGSA-DAI Approach
• Reuse existing technologies and standards
  – OGSA, Query languages, Java, transport
• Three key services:
  – GridDataService
  – GridDataServiceFactory
  – DAIServiceGroupRegistry
• Benefits:
  – Location independence
  – Hides heterogeneity
  – Scalable
  – Flexible
  – Dynamic

OGSA-DAI Positioning - Today
[Figure: positioning diagram. Labels: Location; Meta Data; Notification; OGSA; Lifetime; Drivers; Query (Create, Retrieve, Update, Delete); Data Format; OGSA-DAI Basic Services; OGSA-DAI Distributed Query; Delivery; Database, Communication, OS…; Technology; GDS; DAISGR; GDSF]

OGSA-DAI in one slide

OGSA-DAI To Date
• Assuming that OGSA becomes the standard framework
  – Have adopted the OGSA approach
• Have first concentrated on data access
  – Released software has only limited data integration so far
  – Distributed query processor prototype due in July
• Implementation provides focus on basic functionality first
  – But architecturally we have tried to answer many pertinent questions
  – Functionality will increase over subsequent releases

GDS in action
1a. Request to Registry for sources of data about “x”
1b. Registry responds with Factory handle
2a. Request to Factory for access to database
2b. Factory creates GridDataService to manage access
2c. Factory returns handle of GDS to client
3a. Client queries GDS with SQL, XPath, XQuery etc.
3b. GDS interacts with database
3c. Results of query returned to client as XML
OR
3d. Results of query delivered to consumer as XML
[Figure labels: Analyst; Registry (DAISGR); Factory (GDSF); Grid Data Service (GDS); Consumer; Database (Xindice, MySQL, Oracle, DB2); SOAP/HTTP; service creation; API interactions]
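The registry, factory and data service steps can be sketched as a small in-process simulation. This is only an illustration of the interaction pattern, not the OGSA-DAI API: the class names `Registry`, `Factory` and `GridDataService`, their methods, the topic string "contacts" and the sample row are all assumptions made for this example, and an in-memory SQLite database stands in for the real database behind the service.

```python
import sqlite3
import xml.etree.ElementTree as ET


class GridDataService:
    """Hypothetical stand-in for a GDS: manages access to one database."""

    def __init__(self, connection):
        self._connection = connection

    def perform(self, sql, params=()):
        # Step 3b: the service interacts with the database.
        cursor = self._connection.execute(sql, params)
        columns = [description[0] for description in cursor.description]
        # Step 3c: results are returned to the client as XML.
        root = ET.Element("results")
        for row in cursor.fetchall():
            row_element = ET.SubElement(root, "row")
            for name, value in zip(columns, row):
                ET.SubElement(row_element, name).text = str(value)
        return ET.tostring(root, encoding="unicode")


class Factory:
    """Hypothetical stand-in for a GDSF: creates a service per request."""

    def __init__(self, connection):
        self._connection = connection

    def create_service(self):
        # Steps 2a to 2c: create a service and hand it back to the client.
        return GridDataService(self._connection)


class Registry:
    """Hypothetical stand-in for a DAISGR: maps a topic to a factory."""

    def __init__(self):
        self._factories = {}

    def register(self, topic, factory):
        self._factories[topic] = factory

    def find(self, topic):
        # Steps 1a and 1b: look up a factory for data about a topic.
        return self._factories[topic]


def build_demo_registry():
    # Made-up sample data so the example is self-contained.
    connection = sqlite3.connect(":memory:")
    connection.execute("CREATE TABLE littleblackbook (id INTEGER, name TEXT)")
    connection.execute("INSERT INTO littleblackbook VALUES (10, 'Ada')")
    registry = Registry()
    registry.register("contacts", Factory(connection))
    return registry


registry = build_demo_registry()
factory = registry.find("contacts")
gds = factory.create_service()
xml_result = gds.perform("SELECT * FROM littleblackbook WHERE id=?", (10,))
print(xml_result)
```

In the real system each of these hops is a SOAP/HTTP call between separate services, so the client only ever holds handles. The sketch collapses that into ordinary object references.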

Activities
• OGSA-DAI is structured around the concept of activities
• This framework allows new functionality to be added easily
• Three types of activity at present:
  – statement (e.g. SQLQuery, XUpdate)
  – transformation (e.g. XSL translation, compression)
  – delivery (e.g. GridFTP)
• OGSA-DAI provides implementations of common functionality; others can extend it
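One way to picture an extensible activity framework is a registry of named steps that are chained together. The sketch below is an assumption-laden illustration, not OGSA-DAI code: the `activity` decorator, the `ACTIVITIES` table, the activity names and the canned query result are all invented here, and appending to a list stands in for a real delivery mechanism such as GridFTP.

```python
import zlib

# Hypothetical registry mapping activity names to implementations.
ACTIVITIES = {}


def activity(name):
    """Register a function as an activity under the given name."""
    def register(func):
        ACTIVITIES[name] = func
        return func
    return register


@activity("statement")
def run_statement(_previous):
    # Statement activity: a real one would run a query against a database.
    return "<row><id>10</id></row>"


@activity("compress")
def compress(previous):
    # Transformation activity: compress the output of the previous step.
    return zlib.compress(previous.encode("utf-8"))


# Stand-in destination for delivered data.
DELIVERED = []


@activity("deliver")
def deliver(previous):
    # Delivery activity: record the data instead of sending it anywhere.
    DELIVERED.append(previous)
    return previous


def run_pipeline(names):
    """Run the named activities in order, feeding each the previous output."""
    data = None
    for name in names:
        data = ACTIVITIES[name](data)
    return data


result = run_pipeline(["statement", "compress", "deliver"])
print(zlib.decompress(result).decode("utf-8"))
```

Adding new functionality in this sketch means decorating one more function. The pipeline runner does not change.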

Documents
• Accessing a Grid Data Resource is done using Documents
  – caveat: this may change
• A document allows you to:
  – define parameters
  – execute activities
  – deliver results
• Written in XML, normally used by a client.
[Example document; its XML markup did not survive. Remaining content: 10 and SELECT * FROM littleblackbook WHERE id=?]
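As a rough illustration of a document that defines parameters, names a statement to execute and says where results go, the sketch below builds a small XML request. The element and attribute names (`request`, `statement`, `parameter`, `expression`, `delivery`) are assumptions chosen for readability and are not the OGSA-DAI document schema. Only the SQL text and the value 10 come from the slide.

```python
import xml.etree.ElementTree as ET


def build_request_document(sql, parameters):
    """Build an XML request with parameters, one statement and one delivery."""
    root = ET.Element("request")  # hypothetical root element
    statement = ET.SubElement(root, "statement", {"name": "query1"})
    # Define parameters: one element per positional placeholder in the SQL.
    for position, value in enumerate(parameters, start=1):
        parameter = ET.SubElement(
            statement, "parameter", {"position": str(position)}
        )
        parameter.text = str(value)
    # Execute activities: the statement carries the query expression.
    ET.SubElement(statement, "expression").text = sql
    # Deliver results: point the delivery step at the statement's output.
    ET.SubElement(root, "delivery", {"from": "query1"})
    return ET.tostring(root, encoding="unicode")


document = build_request_document(
    "SELECT * FROM littleblackbook WHERE id=?", [10]
)
parsed = ET.fromstring(document)
print(document)
```

Keeping the parameter separate from the expression mirrors the `?` placeholder in the slide's query, so the same document shape can be reused with different values.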

OGSA-DAI Core Services
• OGSA-DAI Release 2.5 – out now
  – Java, Tomcat, Globus Toolkit 3 Beta
  – Supports MySQL, DB2, Xindice; SQL92, XPath, XUpdate
• OGSA-DAI Release 3 – end July
  – Java, Tomcat, Globus Toolkit 3.0
  – Supports MySQL, DB2, Oracle, Xindice; SQL92, XPath, XUpdate
  – Adds Notification, Internationalisation, Transactions, Caching
• Continue to track Globus Toolkit 3 releases
  – Experimental, then production, GT3 grids will help

Data Resource Implementation Mapping

Activity Mapping

Asynchronous Delivery
• Asynchronous delivery – Pull
• Asynchronous delivery – Push
[Figure: two message-flow diagrams, one for pull and one for push. Labels in both: Client, Consumer, DB, GDS, GDT, GDS Instance, Ra, Q, Rs, DT, GSH/R. Pull diagram only: GSH/R + data id; D + GDH. Push diagram only: Q + D + GSH/R]

GDS Composition
[Figure: five numbered configurations (1 to 5) combining GDS Client, Operation, GDS and DB]

Distributed Query Service
• A higher level service:
  – Extension of Polar* query processor, partitions and schedules queries
  – Sits on top of OGSA and OGSA-DAI
• Defines new portTypes and services
  – GridDistributedQuery (GDQ) PortType
  – GridDistributedQueryService (GDQS) – wraps Polar*
  – GridQueryEvaluatorService (GQES) – performs subqueries
• Currently based on OGSA-DAI Release 1.5
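The split between a coordinating query service and evaluators that run subqueries can be illustrated with a toy example. This is a sketch under strong assumptions and does not reproduce Polar* or the GDQS/GQES interfaces: the `Coordinator` and `Evaluator` classes, the `measurements` table and its rows are invented here, and the "partitioning" is simply sending the same filter to each source and merging the partial results.

```python
import sqlite3


def make_source(rows):
    """Create an in-memory database holding made-up measurement rows."""
    connection = sqlite3.connect(":memory:")
    connection.execute("CREATE TABLE measurements (site TEXT, value REAL)")
    connection.executemany("INSERT INTO measurements VALUES (?, ?)", rows)
    return connection


class Evaluator:
    """Hypothetical evaluator: runs one subquery against one data source."""

    def __init__(self, connection):
        self._connection = connection

    def evaluate(self, sql, params=()):
        return self._connection.execute(sql, params).fetchall()


class Coordinator:
    """Hypothetical coordinator: fans a query out and merges the results."""

    def __init__(self, evaluators):
        self._evaluators = evaluators

    def query(self, threshold):
        subquery = "SELECT site, value FROM measurements WHERE value > ?"
        partial_results = []
        # Send the subquery to every evaluator and collect partial results.
        for evaluator in self._evaluators:
            partial_results.extend(evaluator.evaluate(subquery, (threshold,)))
        # Merge step: order the combined rows by value, highest first.
        return sorted(partial_results, key=lambda row: row[1], reverse=True)


source_a = make_source([("site-a", 0.9), ("site-a", 0.2)])
source_b = make_source([("site-b", 0.7), ("site-b", 0.1)])
coordinator = Coordinator([Evaluator(source_a), Evaluator(source_b)])
results = coordinator.query(0.5)
print(results)
```

A real distributed query processor would also plan joins across sources and schedule evaluators on different machines. Here the evaluators run one after another in a single process.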

DQS Architecture

DQP in action

DQS: the future
• The GridDistributedQueryService
  – is an example of a higher level data integration service which utilises OGSA-DAI core services
  – Assumes that GDSF, GDQS Factory and client live in different containers
  – Really requires a well-defined meta-model for the physical schema of a database
    – Being partially addressed in DAIS WG
  – Shows how a GDS can be both client and service
    – Service hierarchy and composition
• DAIT (proposed follow-on to OGSA-DAI) would produce a robust reference implementation of the DQP components

Projects using OGSA-DAI
• Industry:
  – FirstDIG: business process analysis (with First Transport Group)
    – OGSA-DAI with data mining
• Collaborative:
  – Bridges: database integration over six geographically distributed genomics research sites (with IBM UK)
    – OGSA-DAI with DiscoveryLink
  – eDIKT: porting OGSA-DAI to other platforms
    – OGSA-DAI with performance
  – DEISA: linking Europe’s HPC centres
    – OGSA-DAI with distributed accounting
  – MS.Net Grid: porting OGSA-DAI to the .Net framework (with Microsoft Research UK)
    – OGSA-DAI with .Net

ODD Genes
• OGSA-DAI used to query gene expression data resources at GTI and HGU
  – One data resource: low spatial resolution, high gene resolution
  – Other resource: high spatial resolution, low gene resolution
  – Query one database and use data to find correct data resource to run more detailed query and produce visualisation
  – Simple example of data integration at work
[Figure labels: Client, Query, Render, GTI, GDS, EPCC, HGU]

Project Timeline
[Figure: timeline with axis marks at Feb ’02, May ’02, Jul ’02, Sep ’02, Dec ’02, Feb ’03, May ’03 and Sep ’03, and a "today" marker]
• Phases: Phase 1 Starts; Phase 2 Starts
• Prototypes and documents: RDB + GT2 / OGSA Prototypes Available; XML + OGSA Prototype Available; XML + OGSA Prototypes for Early Adopters; Design Documents & Demos for DAIS; WG Papers & Prototypes
• Releases: Ship Release 1 (Jan 15th 2003); Release 1.5 (Feb 28th 2003); Release 2; Release 2.5; Release 3
• Globus labels: GT3 A1, GT3 A2, GT3 A3, GT3 A4, GT3 Beta, GT3 Final, TP4, TP5
• GGF meetings: GGF5, GGF6, GGF7
• Other labels: WS + GSI; UK support (> 100 downloads); OGSADAI; NeSC; Early Adopters

A DAIT for the Future
• DAIT (Data Access and Integration Two)
  – follow on project from OGSA-DAI, funded for two years
  – continue to research, prototype and productise
  – release every six months, R4 in December 2003
  – R4:
    – support for SQL Server and structured filesystems
    – extended DBMS management functionality (e.g. archive)
    – bulk load operations (where supported)
    – support for DFDL file access
    – triggers exposed through notification
  – R5:
    – Distributed Query Processing, Distributed Transactions
    – Virtualised views across databases

Further information
• The OGSA-DAI Project Site
• The DAIS-WG site
• OGSA-DAI Users Mailing list
  – General discussion on grid data access and integration
• Formal support for OGSA-DAI releases
• OGSA-DAI training courses