Www.neresc.ac.uk OGSA-DQP - A Service-Based Distributed Query Processor for The Grid Arijit Mukherjee University of Newcastle Arijit Mukherjee University.

Slides:



Advertisements
Similar presentations
A Lightweight Platform for Integration of Mobile Devices into Pervasive Grids Stavros Isaiadis, Vladimir Getov University of Westminster, London {s.isaiadis,
Advertisements

Large-Scale, Adaptive Fabric Configuration for Grid Computing Peter Toft HP Labs, Bristol June 2003 (v1.03) Localised for UK English.
Pricing for Utility-driven Resource Management and Allocation in Clusters Chee Shin Yeo and Rajkumar Buyya Grid Computing and Distributed Systems (GRIDS)
…to Ontology Repositories Mathieu dAquin Knowledge Media Institute, The Open University From…
Tom Sugden EPCC OGSA-DAI Future Directions OGSA-DAI User's Forum GridWorld 2006, Washington DC 14 September 2006.
OGSA-DAI Activities OGSA-DAI for Developers GridWorld 2006, Washington DC 11 September 2006.
Self-managing Grid Services for Efficient Data Management Anastasios Gounaris (University of Cyprus, University of Manchester) CoreGRID Summer School Budapest,
Designing Services for Grid-based Knowledge Discovery A. Congiusta, A. Pugliese, Domenico Talia, P. Trunfio DEIS University of Calabria ITALY
EGEE is a project funded by the European Union under contract IST R-GMA status and plans Abdeslem DJAOUI / RAL GRIDPP10 meeting at CERN, 3.
Experiences with Converting my Grid Web Services to Grid Services Savas Parastatidis & Paul Watson
December 2009 Data Integration in Grid Environments Alex Poulovassilis, Birkbeck, U. of London.
Dynasoar Dynamic Deployment of Web Services on a Grid or the Internet or Why its good to be Jobless Paul Watson School of Computing Science.
The National Grid Service Mike Mineter.
Distributed Query Processing Donald Kossmann University of Heidelberg
A Workflow Engine with Multi-Level Parallelism Supports Qifeng Huang and Yan Huang School of Computer Science Cardiff University
AHM2006, RSSM: A Rough Sets based Service Matchmaking Algorithm Bin Yu and Maozhen Li School of Engineering and Design.
Service-Based Distributed Query Processing on the Grid M.Nedim Alpdemir Department of Computer Science University of Manchester.
A Peer-to-Peer Database Server based on BitTorrent John Colquhoun Paul Watson John Colquhoun Paul Watson.
Data services on the NGS.
Grid Database Projects Paul Watson, Newcastle Norman Paton, Manchester.
Grid Services and Microsoft.NET The MS.NETGrid Project Dr. Mike Jackson EPCC All Hands Meeting.
Enterprise Java and Data Services Designing for Broadly Available Grid Data Access Services.
Open Grid Service Architecture - Data Access & Integration (OGSA-DAI) Dr Martin Westhead Principal Consultant, EPCC Telephone: Fax:+44.
The National Grid Service and OGSA-DAI Mike Mineter
Eldas 1.0 Enterprise Level Data Access Services Design Issues, Implementation and Future Development Davy Virdee.
M.Nedim Alpdemir, Anastasios Gounaris¹, Arijit Mukherjee², Desmond Fitzgerald, Norman W. Paton¹, Paul Watson², Rizos Sakellariou¹, Alvaro A.A. Fernandes¹,
Current status of grids: the need for standards Mike Mineter TOE-NeSC, Edinburgh.
16-17 October 2003 Grids and Applied Language Theory: Declarative Grid Service Orchestration with OGSA-DQP (A A A Fernandes) 1 Declarative Grid Service.
Architectural Constraints on Current Bioinformatics Integration Systems Norman Paton Department of Computer Science University of Manchester Manchester,
OMII-UK Steven Newhouse, Director. © 2 OMII-UK aims to provide software and support to enable a sustained future for the UK e-Science community and its.
Kensington Oracle Edition: Open Discovery Workflow Meets Oracle 10g Professor Yike Guo.
ICS 434 Advanced Database Systems
Database System Concepts and Architecture
A Prototype Implementation of a Framework for Organising Virtual Exhibitions over the Web Ali Elbekai, Nick Rossiter School of Computing, Engineering and.
©Ian Sommerville 2006Software Engineering, 8th edition. Chapter 31 Slide 1 Service-centric Software Engineering 1.
Grid-Enabling Data: Sticking Plaster, Sellotape, & Chewing Gum? Colin C. Venters National Centre for e-Social Science University.
Institute for Software Science – University of ViennaP.Brezany 1 Databases and the Grid Peter Brezany Institute für Scientific Computing University of.
Variability Oriented Programming – A programming abstraction for adaptive service orientation Prof. Umesh Bellur Dept. of Computer Science & Engg, IIT.
Slides thanks to Steve Lynden Amy Krause EPCC Distributed Query Processing with OGSA-DQP Principles and Architectures for Structured Data Integration:
14-18 March 2004 EDBT'04 : Service-Based Distributed Query Processing for the Grid (M N Alpdemir) 1 Title, places, people, funding, projects Manchester.
TECHNIQUES FOR OPTIMIZING THE QUERY PERFORMANCE OF DISTRIBUTED XML DATABASE - NAHID NEGAR.
1 Chapter 3 Database Architecture and the Web Pearson Education © 2009.
©Ian Sommerville 2006Software Engineering, 8th edition. Chapter 12 Slide 1 Distributed Systems Architectures.
Database Taskforce and the OGSA-DAI Project Norman Paton University of Manchester.
1 Dr. Markus Hillenbrand, ICSY Lab, University of Kaiserslautern, Germany A Generic Database Web Service for the Venice Service Grid Michael Koch, Markus.
DAIS Grid1 Database Access and Integration Services on the Grid * * Authors: N. Paton, M. Atkinson, V.
Expressing Workflows Using Grid Enabled Computer Algebra Systems Palaiseau, France, January 19-21, 2009 Alexandru CÂRSTEA Georgiana MACARIU Marc FRÎNCU.
Grid Workload Management Massimo Sgaravatto INFN Padova.
Grid Computing at Yahoo! Sameer Paranjpye Mahadev Konar Yahoo!
Wrapping Scientific Applications As Web Services Using The Opal Toolkit Wrapping Scientific Applications As Web Services Using The Opal Toolkit Sriram.
OGSA-DAI.
Data access and integration with OGSA-DAI: OGSA-DQP Steven Lynden University of Manchester.
A Dynamic Service Deployment Infrastructure for Grid Computing or Why it’s good to be Jobless Paul Watson School of Computing Science.
OGSA-DAI Neil Chue Hong 29 th January 2007 OGF19, Chapel Hill.
OGSA-DQP:Service-Based Distributed Query Processing on the Grid M.Nedim Alpdemir Department of Computer Science University of Manchester.
1 OGSA-DAI Status Report Neil P Chue Hong 20 th May 2005.
A PPARC funded project Common Execution Architecture Paul Harrison IVOA Interoperability Meeting Cambridge MA May 2004.
Data Manipulation with Globus Toolkit Ivan Ivanovski TU München,
Copyright 2007, Information Builders. Slide 1 iWay Web Services and WebFOCUS Consumption Michael Florkowski Information Builders.
OGSA-DQP Steven Lynden University of Manchester. Data access & integration with OGSA-DAI: GGF 17 2 Introduction OGSA-DQP is a service based distributed.
OGSA-DAI 简介及其它在 China-VO DAS 系统中的应用 杨阳 中国虚拟天文台研发团队 Chinese Virtual Observatory.
Advanced Databases COMP3017 Dr Nicholas Gibbins
ACGT Architecture and Grid Infrastructure Juliusz Pukacki ‏ EGEE Conference Budapest, 4 October 2007.
OGSA-DAI.
A Grid Data Integration Service (OGSA-DQP) Paul Watson, University of Newcastle-upon-Tyne based on the work of… Norman Paton, Tasos Gounaris,
Amy Krause EPCC OGSA-DAI An Overview OGSA-DAI on OMII 2.0 OMII The Open Middleware Infrastructure Institute NeSC,
Ruslan Fomkin and Tore Risch Uppsala DataBase Laboratory
1st International Conference on Semantics, Knowledge and Grid
Grid Systems: What do we need from web service standards?
Presentation transcript:

OGSA-DQP - A Service-Based Distributed Query Processor for The Grid Arijit Mukherjee University of Newcastle Arijit Mukherjee University of Newcastle

2 Motivation behind OGSA-DQP 1)High level Data Access and Integration services are needed if data-intensive distributed applications running on heterogeneous platforms are to benefit from the Grid. 2)Emerging standards for Data Access - OGSA-DAI supports exposure of data resources onto Grids. 3)DQP is an approach to deliver (1) given the availability of (2) 2

3 Service-based in what sense? OGSA-DQP is service-based in two orthogonal senses - Supports querying over data storage and analysis services factored out as services Hence resource virtualisation via SOA Construction of the distributed query plan and their execution over the grid are factored out as services 3

4 OGSA-DQP Goals To benefit from homogeneous access to heterogeneous data sources [OGSA-DAI]. To benefit from Grid abstractions for on-demand allocation of resources required for a task [Condor, OMII, GT*]. To provide transparent, implicit support for parallelism and distribution. [Polar*] To orchestrate the composition of data retrieval and analysis services. To expose this orchestration capability as a Grid data service. 4

5 OGSA-DQP Approach OGSA-DQP uses a middleware approach. It can be seen as a mediator over OGSA-DAI wrappers. It promises bottom-lines regarding: efficiency: leave to schedule in parallel; effectiveness: leave to to orchestrate your services; usability: use it as a Grid data service. DBMS data OGSA-DQP DBMS data QueryResults OGSA-DAI 5

6 OGSA-DQP Innovations OGSA-DQP dynamically allocates evaluators to do work on behalf of the mediator. This allows for runtime circumstances to be taken into account when the optimiser decides how to partition and schedule. OGSA-DQP uses a parallel physical algebra: most mediator-based query processors do not. 6

7 OGSA-DQP Architecture Extends the OGSA-DAI with two new services Grid Distributed Query Service Exposed to client Finds and retrieves service descriptions Parses, compiles, optimizes, schedules the query execution plans over a union of distributed data resources Query Evaluation Service Not exposed to the client Implements the physical query algebra Implements the query execution model and semantics Evaluates a partition of the query execution plan generated by the GDQS Interacts with other QESs/GDSs/Web Services 7

8 Example Query Plan select p.proteinId, Blast(p.sequence) from p in protein, t in proteinTerm where t.termId=GO: and p.proteinId=t.proteinId select (proteinId, sequence) table_scan (proteinTerms) (termed=ABCD) table_scan (proteins) select (proteinId, sequence) scan (proteinTerms) (termed=ABCD) scan (proteins) select (p.proteinId, blast) operation call (blast(p.sequence)) join (p.proteinId=t.proteinId) select (proteinId) select (p.proteinId, blast) operation_call (blast(p.sequence)) hash_join (p.proteinId=t.proteinId) select (proteinId) (a) Single-node logical plan (b) Single-node physical plan select (p.proteinId, blast) operation_call (blast(p.sequence)) exchange hash_join (p.proteinId=t.proteinId) select (proteinId) select (proteinId, sequence) table_scan (proteinTerms) (termed=ABCD) table_scan (proteins) 4, 5 3, 6 2, 3 6 (c) partitioned plan 8

9 Another Example 9

10 OGSA-DQP: Query Evaluation Query installation stage: As many QE services are utilised as there are partitions specified. Each partition is sent to the QE service it is scheduled for. Query evaluation stage: Each QES evaluates its partition using an iterator model. Queries execute under pipelined and partitioned parallelism. Results are conveyed to client. 10

11 OGSA-DQP Execution Flow GDQS GDS-1 GDS-2 Web Service Client Resource list wsdl schema OQL Parser Logical Optimizer Physical Optimizer Scheduler Partitioner Polar* Query Optimizer Engine query schema QES Partition1 Partition3 partition2 results 11

12 What we provide Resource virtualisation through a service-oriented architecture: Data Resource Discovery using service registries; Computational Resource Discovery via Index Services (not implemented yet); Reliance on GDSs for metadata and data access Coarse-grained services with document-oriented interfaces By acquiring and manipulating data in a data-flow architecture that is constructed dynamically, OGSA-DQP constructs, on-the-fly, a lightweight Distributed Query Processing Engine. 12

13 Timeline Release 1.0 in September 2003 Improved Release 2.0 in July 2004 (based on GT3.2 and OGSA-DAI 4.0) Around 700 downloads New Release coming soon! Whats new: Based on OGSA-DAI R7.0 GDQS closer to OGSA-DAI GQES refactored as QES (WS-I) 13

14 Working on… More friendly (!) - Use SQL More portable - Support Cygwin, Solaris for the compiler/optimiser Cygwin - DONE Better performance – we are working on it Some bottlenecks removed More functional - Semi-structured data; Streams. Working on incorporating XML DBs More dynamic - Use Index Services; dynamically install services QES is DynaSOAr-READY More application test-beds - Sensor networks. More adaptive - Queries may be long running, environment is constantly changing - static optimisation is likely to become stale fast. Monitor, assess and respond (e.g., switch operators/ algorithms, spawn more copies, relocate). Ongoing More widely deployable - As OGSA-DAI Introduce Virtual Machines Looking into it. 14

15 Where to find out more Papers - M N Alpdemir, A Mukherjee, A Gounaris, A A A Fernandes, N W Paton, P Watson, J Smith. Service Based Distributed Querying on the Grid. 1 st International Conference on Service Oriented Computing, 2003, LNCS 2910 M N Alpdemir, A Mukherjee, A Gounaris, A A A Fernandes, N W Paton, P Watson, J Smith. OGSA-DQP: A Service for Distributed Querying on the Grid, in Proceedings of the Advances in Database Technology - EDBT 2004, LNCS 2992 M N Alpdemir, A Mukherjee, A Gounaris, A A A Fernandes, N W Paton, P Watson, J Smith. An Experience Report on Designing and Building OGSA-DQP: A Service Based Distributed Query Processor for the Grid. GGF9 Workshop on Designing and Building Grid Services, M N Alpdemir, A Mukherjee, A Gounaris, N W Paton, P Watson, A A A Fernandes, J Smith. OGSA-DQP: A Service-Based Distributed Query Processor for the Grid. 2nd UK e-Science All Hands Meeting, J Smith, A Gounaris, P Watson, N W Paton, A A A Fernandes, R Sakellariou. Distributed Query Processing on the Grid. GRID 2002, LNCS 2536 (papers available at ) Software

16 Peoples and Partners Prof. Paul Watson Dr. Jim Smith Arijit Mukherjee Prof. Norman Paton Dr. Alvaro AA Fernandez Dr. Rizos Sakellariou Anastasios Gounaris Steven Lynden 16

Thank You ?