System Software Laboratory Databases and the Grid by Paul Watson University of Newcastle Grid Computing: Making the Global Infrastructure a Reality June.


1 System Software Laboratory
  Databases and the Grid
  by Paul Watson, University of Newcastle
  Grid Computing: Making the Global Infrastructure a Reality
  June 10th, 2003
  Presentation by Jeong-Hun Shin

2 Korea Advanced Institute of Science and Technology System Software Laboratory | 2
  Contact
  Contact (1997), directed by Robert Zemeckis, story by Carl Sagan

3 Ancestors said that

4 SETI@home
  The Search for ExtraTerrestrial Intelligence
  http://setiathome.ssl.berkeley.edu/

5 Folding@home
  To understand protein folding, protein aggregation, and related diseases

6 Contents
  - Integration of databases into the Grid
  - Database requirements of Grid applications
  - Requirements above existing systems
  - The Grid and databases: the current state
  - Integrating databases across the Grid
  - Federating database systems across the Grid

7 Integration of databases into the Grid
  The Grid
    - Publication of data in a more open manner
    - New results from separate, distributed sources
  Needed if the Grid is to support a wider range of applications
    - e.g., applications in the life/earth sciences, business applications
  Difficulties in integrating databases into the Grid
    - Two main dimensions of complexity
      - Differences between server products within a database paradigm
      - Variety of database paradigms
    - Tradeoff: generic middleware for federating Grid-enabled DBs
      - Full functionality of different database paradigms
      - Common solutions to reduce effort

8 How can DBs be integrated into the Grid?
  How to integrate existing DBMSs into the Grid?
    - Short-term solution: has limitations
    - cf. long-term solution: developments to the Grid middleware and DB server components
  Three main questions
    - What are the requirements of Grid-enabled databases?
    - How far do existing Grid middleware and database servers go towards meeting these requirements?
    - How might the requirements be more fully met?

9 DB requirements of Grid applications
  Prerequisite
    - Requires the functionality provided by current DBMSs: query, update, indexing, transaction, recovery, replication, security, …
    - Building a Grid-enabled DBMS from scratch is not desirable.
    - New facilities are added by enhancing existing DBMSs.
    - The most commonly used DBMSs are commercial, not open-source → enhancement by external wrapping of the DBMS
  Two categories of requirements
    - Generic across all Grid application components
    - Database specific

10 Requirements above existing systems
  Scalability
    - Extremely demanding performance and capacity
    - Low response times for complex queries
    - Support for high access throughput
  Handling unpredictable usage
    - Difficulty in predicting the types of accesses
    - Current DBMS: little support for controlling the sharing of finite resources

11 Requirements (cont'd)
  Metadata-driven access
    - Current use: relatively simple
    - As the Grid expands into new applications: more sophisticated metadata systems and tools → Semantic Grid
    - Two-step access to data: search of metadata catalogs to locate the DBs → data access
    - Need for a standardized interface for all DBS
  Multiple database federation
    - Open publication of data → advances in applications combining information from multiple data sets
    - Federation middleware with a standardized interface
    - Higher-level problem of the semantic integration of multiple DBs
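The two-step access described on this slide can be sketched roughly as follows. This is only an illustration under stated assumptions: the catalog layout, the `locate`/`access` function names, the database names and the sample rows are all hypothetical, and in-memory SQLite databases stand in for remote database systems.

```python
import sqlite3

# Hypothetical metadata catalog: one entry per database system (assumption).
CATALOG = [
    {"name": "protein_db", "topics": {"protein", "folding"}},
    {"name": "climate_db", "topics": {"temperature", "rainfall"}},
]


def make_db(rows):
    """Create an in-memory SQLite database standing in for a remote DBS."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE facts (key TEXT, value TEXT)")
    conn.executemany("INSERT INTO facts VALUES (?, ?)", rows)
    return conn


# Illustrative sample data only.
DATABASES = {
    "protein_db": make_db([("villin", "sample protein record")]),
    "climate_db": make_db([("2003", "sample climate record")]),
}


def locate(topic):
    """Step 1: search the metadata catalog for databases covering a topic."""
    return [entry["name"] for entry in CATALOG if topic in entry["topics"]]


def access(db_name, key):
    """Step 2: query a located database through one common interface."""
    cursor = DATABASES[db_name].execute(
        "SELECT value FROM facts WHERE key = ?", (key,)
    )
    return [row[0] for row in cursor.fetchall()]


def two_step_lookup(topic, key):
    """Locate the relevant databases, then fetch the data from each."""
    return {name: access(name, key) for name in locate(topic)}
```

Because every database is reached through the same `access` call, the client code does not change when the catalog points at a different system, which is the motivation for the standardized interface.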

12 The Grid and databases: the current state
  Globus
    - The dominant middleware used for building computational grids
    - Monitoring and Discovery Service (MDS): Grid information service
    - Globus Resource Allocation Manager (GRAM): resource management
    - Grid Security Infrastructure (GSI)
  Limitations and possibilities
    - No direct support for database integration
    - GSI can provide a single sign-on capability
    - GridFTP can be used for bulk database loading and bulk transfer of query results
    - MDS and GRAM can be used to locate and run DB federation middleware

13 Integrating databases into the Grid
  Service-based framework
    - Individual operations offered by the services would be standardized
    - Standardization would be done by adding wrapper code to map the service operation interface to the vendor-specific interface
    - Advantage: each DBS can provide a metadata service
      - Information on the range of services and operations
  [Figure: DBS with a Grid-enabled service interface. A client reaches the DBMS through a service interface onto the Grid offering Metadata, Query, Transaction, Notification, Bulk loading, Scheduling and Accounting services, implemented by interface code.]
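The wrapper idea on this slide, a standardized operation mapped onto a vendor-specific interface, might look like the sketch below. Both vendor clients and all class and method names are invented for illustration; no real product's API is implied.

```python
from abc import ABC, abstractmethod


class VendorAClient:
    """Hypothetical vendor API: takes a statement, returns tuples."""

    def run_statement(self, text):
        return [("a", 1)] if "SELECT" in text.upper() else []


class VendorBClient:
    """Hypothetical vendor API with a different call shape: returns dicts."""

    def execute(self, *, command):
        if "SELECT" in command.upper():
            return [{"name": "b", "count": 2}]
        return []


class GridQueryService(ABC):
    """Standardized service interface (assumed shape, for illustration)."""

    @abstractmethod
    def query(self, text):
        """Run a query and return rows as a list of lists."""

    def metadata(self):
        """Metadata service: report which operations this DBS offers."""
        return {"services": ["query"], "wrapper": type(self).__name__}


class VendorAWrapper(GridQueryService):
    def __init__(self, client):
        self._client = client

    def query(self, text):
        # Map the standard call onto vendor A's method and normalize tuples.
        return [list(row) for row in self._client.run_statement(text)]


class VendorBWrapper(GridQueryService):
    def __init__(self, client):
        self._client = client

    def query(self, text):
        # Map the standard call onto vendor B's keyword API and normalize dicts.
        return [list(row.values()) for row in self._client.execute(command=text)]
```

A client written against `GridQueryService` can call `query` on either wrapper without knowing which vendor sits behind it, and `metadata` lets it discover what each wrapped DBS supports.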

14 Roles of service wrapper
  Metadata
    - Access to technical metadata about the DBS and the set of services
    - e.g., logical/physical name of the DBS and contents, ownership, version, …
  Query
    - Various DBMSs → definition of the type and level of query language
  Transaction
    - Transactions involving only a single DBS
    - Allows a DBS to participate in application-wide distributed transactions
  Bulk loading
    - For large amounts of data: optimized for transfer of large datasets

15 Roles of service wrapper (cont'd)
  Notification
    - Allows clients to register an interest in a set of data
    - The client receives a message when a change occurs
  Scheduling
    - Allows users to schedule the use of the DBS
  Accounting
    - Information for accounting and payment schemes
    - Monitors performance against agreed service levels
    - Enables users to be charged for resource usage
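As a rough illustration of the notification role, a client registers an interest in a named data set and is called back when that data set changes. The class name, method names and data set names below are hypothetical, and the change event is triggered by hand rather than by a real DBMS.

```python
class NotificationService:
    """Hypothetical notification service for a single DBS."""

    def __init__(self):
        self._interests = {}  # data set name -> list of client callbacks

    def register(self, dataset, callback):
        """Record a client's interest in changes to one data set."""
        self._interests.setdefault(dataset, []).append(callback)

    def data_changed(self, dataset, description):
        """Send a message to every client registered for this data set."""
        callbacks = self._interests.get(dataset, [])
        for callback in callbacks:
            callback(dataset, description)
        return len(callbacks)  # number of clients notified


# Usage: one client watches one data set; other changes are not delivered.
inbox = []
service = NotificationService()
service.register("genome_annotations", lambda ds, msg: inbox.append((ds, msg)))
service.data_changed("genome_annotations", "row 42 updated")
service.data_changed("unwatched_table", "row 1 deleted")
```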

16 Federating DBS across the Grid
  Grid application interfacing directly to a set of DBS
    - Great application complexity
    - Duplication of effort
  [Figure: a client interfacing directly to two DBS, each exposing Metadata, Query, Transaction, Notification, Bulk loading, Scheduling and Accounting services.]

17 Federating DBS across the Grid (cont'd)
  Virtual database system on the Grid
    - Reduces to federating each of the individual services
    - Same interface as the DBS with a Grid-enabled service interface
    - Possibility for federating services of both "real" DBS and Virtual DBS
  [Figure: a client accesses a Virtual DBS through a service interface onto the Grid; service federation middleware combines the Metadata, Query, Transaction, Notification, Bulk loading, Scheduling and Accounting services of two DBS into the corresponding services of the Virtual DBS.]
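For the query service alone, the Virtual DBS idea can be sketched as below. This is a deliberately naive illustration: the `DBS` and `VirtualDBS` classes and the `samples` table are hypothetical, in-memory SQLite stands in for real servers, and simply concatenating member results is only adequate for plain selections, not for joins or aggregates across members.

```python
import sqlite3


class DBS:
    """A 'real' DBS: an in-memory SQLite database behind a query service."""

    def __init__(self, rows):
        self._conn = sqlite3.connect(":memory:")
        self._conn.execute("CREATE TABLE samples (id INTEGER, label TEXT)")
        self._conn.executemany("INSERT INTO samples VALUES (?, ?)", rows)

    def query(self, sql, params=()):
        return self._conn.execute(sql, params).fetchall()


class VirtualDBS:
    """Federation middleware exposing the same query interface as a DBS."""

    def __init__(self, members):
        self._members = list(members)  # real DBS or other VirtualDBS objects

    def query(self, sql, params=()):
        # Forward the query to every member and concatenate the results.
        results = []
        for member in self._members:
            results.extend(member.query(sql, params))
        return results


# Usage: a Virtual DBS can itself be a member of another Virtual DBS.
site_a = DBS([(1, "alpha"), (2, "beta")])
site_b = DBS([(3, "gamma")])
site_c = DBS([(4, "delta")])
inner = VirtualDBS([site_a, site_b])
outer = VirtualDBS([inner, site_c])
rows = sorted(outer.query("SELECT id, label FROM samples WHERE id > ?", (1,)))
```

Since `VirtualDBS` offers the same `query` method as `DBS`, the client and any enclosing federation treat the two interchangeably.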

18 Creation of Virtual DBS
  Ways of creating a Virtual DBS
    - A user decides to create a Virtual DBS
    - Services take a set of DBS and create a Virtual DBS
  Challenge
    - Full standardization of all services is impossible
    - The resulting heterogeneity causes problems
  Automatic creation of a Virtual DBS
    - The tool queries the metadata services of the DBS for their functionality and interfaces
    - Integration of a service is impossible if no options are available
  Service federation middleware
    - Complexity varies from service to service
    - In general, it increases with the degree of heterogeneity of the service
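One possible reading of the automatic-creation step is sketched below: the tool collects the options each DBS advertises per service and keeps only those every member supports. The metadata layout, the option names and the rule that "no options available" means "no option shared by all members" are assumptions made for illustration, not details taken from the slides.

```python
def plan_virtual_dbs(metadata_by_dbs):
    """Plan which services a Virtual DBS can federate.

    metadata_by_dbs maps each DBS name to {service name: set of options},
    as a hypothetical metadata service might report it. The result maps
    each service to the sorted options shared by all members; an empty
    list means the service cannot be integrated under this reading.
    """
    services = set()
    for advertised in metadata_by_dbs.values():
        services.update(advertised)

    plan = {}
    for service in sorted(services):
        option_sets = [
            set(advertised.get(service, ()))
            for advertised in metadata_by_dbs.values()
        ]
        common = set.intersection(*option_sets) if option_sets else set()
        plan[service] = sorted(common)
    return plan


# Usage with made-up option names.
example = plan_virtual_dbs({
    "dbs1": {"query": {"SQL-92", "XPath"}, "transaction": {"2PC"}},
    "dbs2": {"query": {"SQL-92"}},
})
```

Here `example` reports a shared query option but an empty list for the transaction service, since `dbs2` advertises none.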

19 Summary
  A set of requirements for Grid databases
    - Existing Grid middleware does not meet them
    - A set of services should be offered by a Grid-integrated DBS
  Service-based approach
    - Independent of any particular implementation technology
    - Simplifies the task of writing applications that need to combine information from more than one DBS
  Virtual DBS
    - Federating DBS across the Grid

20 System Software Laboratory Thank you for your attention!

