System Software Laboratory Databases and the Grid by Paul Watson University of Newcastle Grid Computing: Making the Global Infrastructure a Reality June.

Presentation transcript:

System Software Laboratory
Databases and the Grid
by Paul Watson, University of Newcastle
Grid Computing: Making the Global Infrastructure a Reality
June 10th, 2003
Presentation by Jeong-Hun Shin

Korea Advanced Institute of Science and Technology, System Software Laboratory

Slide 2: Contact
  Contact (1997), directed by Robert Zemeckis, story by Carl Sagan

Slide 3: Ancestors said that

Slide 4: The Search for ExtraTerrestrial Intelligence

Slide 5: to understand protein folding, protein aggregation, and related diseases

Slide 6: Contents
  - Integration of databases into the Grid
  - Database requirements of Grid applications
  - Requirements above existing systems
  - The Grid and databases: the current state
  - Integrating databases across the Grid
  - Federating database systems across the Grid

Slide 7: Integration of databases into the Grid
  The Grid
    - Publication of data in a more open manner
    - New results from separate, distributed sources
  Database integration is needed if the Grid is to support a wider range of applications
    - e.g., applications in the life/earth sciences, business applications
  Difficulties in integrating databases into the Grid
    - Two main dimensions of complexity
        - Differences between server products within a database paradigm
        - Variety of database paradigms
    - Tradeoff in generic middleware for federating Grid-enabled DBs
        - Full functionality of different database paradigms
        - Common solutions to reduce effort

Slide 8: How can DBs be integrated into the Grid?
  How to integrate existing DBMSs into the Grid?
    - Short-term solution: workable, but with limitations
    - cf. long-term solution: developments to the Grid middleware and DB server components
  Three main questions
    - What are the requirements of Grid-enabled databases?
    - How far do existing Grid middleware and database servers go towards meeting these requirements?
    - How might the requirements be more fully met?

Slide 9: DB requirements of Grid applications
  Prerequisite
    - Grid applications require the functionality provided by current DBMS: query, update, indexing, transaction, recovery, replication, security, …
    - Building a Grid-enabled DBMS from scratch is not desirable.
    - New facilities are added by enhancing existing DBMS.
    - The most commonly used DBMS are commercial, not open-source → enhancement by external wrapping of the DBMS
  Two categories of requirements
    - Generic across all Grid application components
    - Database specific

Slide 10: Requirements above existing systems
  Scalability
    - Extremely demanding performance and capacity
    - Low response times for complex queries
    - Support for high access throughput
  Handling unpredictable usage
    - Difficulty in predicting the types of accesses
    - Current DBMS: little support for controlling the sharing of finite resources

Slide 11: Requirements (cont'd)
  Metadata-driven access
    - Current use: relatively simple
    - As the Grid expands into new applications: more sophisticated metadata systems and tools → Semantic Grid
    - Two-step access to data: search of metadata catalogs to locate the DBs → data access
    - Need for a standardized interface for all DBS
  Multiple database federation
    - Open publication of data → advances in applications combining information from multiple data sets
    - Federation middleware with a standardized interface
    - Higher-level problem of the semantic integration of multiple DBs
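
The two-step access described on this slide can be sketched as follows. This is a minimal illustration, not an interface from the talk: `CatalogEntry`, `MetadataCatalog`, `two_step_access`, and the endpoint strings are hypothetical names, and the "standardized interface" is assumed to be a single callable that accepts a query string.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass(frozen=True)
class CatalogEntry:
    """Metadata describing one DBS (all fields are hypothetical)."""
    name: str            # logical name of the DBS
    endpoint: str        # address of its query service
    keywords: frozenset  # terms describing the contents


class MetadataCatalog:
    """Step 1: a searchable catalog of DBS descriptions."""

    def __init__(self) -> None:
        self._entries: List[CatalogEntry] = []

    def register(self, entry: CatalogEntry) -> None:
        self._entries.append(entry)

    def search(self, keyword: str) -> List[CatalogEntry]:
        return [e for e in self._entries if keyword in e.keywords]


def two_step_access(
    catalog: MetadataCatalog,
    services: Dict[str, Callable[[str], list]],
    keyword: str,
    query: str,
) -> Dict[str, list]:
    """Locate matching DBS in the catalog, then send each the same query."""
    results = {}
    for entry in catalog.search(keyword):
        run_query = services[entry.endpoint]  # Step 2: standardized interface
        results[entry.name] = run_query(query)
    return results
```

Because every located DBS is reached through the same callable signature, the client code does not change when a new DBS is registered in the catalog.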

Slide 12: The Grid and databases: the current state
  Globus
    - The dominant middleware used for building computational grids
    - Monitoring and Discovery Service (MDS): Grid information service
    - Globus Resource Allocation Manager (GRAM): resource management
    - Grid Security Infrastructure (GSI)
  Limitations and possibilities
    - No direct support for database integration
    - GSI can provide a single sign-on capability
    - GridFTP can be used for bulk database loading and bulk transfer of query results
    - MDS and GRAM can be used to locate and run DB federation middleware

Slide 13: Integrating databases into the Grid
  Service-based framework
    - Individual operations offered by the services would be standardized
    - Standardization would be done by adding wrapper code to map the service operation interface to the vendor-specific interface
    - Advantage: each DBS can provide a metadata service
        - Information on the range of services and operations
  [Figure: DBS with a Grid-enabled service interface. A client reaches the DBMS through a service interface onto the Grid; the services (Metadata, Query, Transaction, Notification, Bulk loading, Scheduling, Accounting) are mapped onto the DBMS by interface code.]
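
The wrapper idea on this slide can be illustrated with a small sketch. It assumes a hypothetical standardized interface, `GridDatabaseService`, with only three of the seven services, and uses Python's built-in `sqlite3` module as a stand-in for a vendor DBMS. The class names, method names, and metadata fields are illustrative and not taken from the talk.

```python
import sqlite3
from abc import ABC, abstractmethod


class GridDatabaseService(ABC):
    """Hypothetical standardized service interface (subset of the services)."""

    @abstractmethod
    def metadata(self) -> dict: ...

    @abstractmethod
    def query(self, statement: str, params: tuple = ()) -> list: ...

    @abstractmethod
    def bulk_load(self, table: str, rows: list) -> int: ...


class SQLiteService(GridDatabaseService):
    """Wrapper code mapping the standard operations onto one DBMS."""

    def __init__(self, logical_name: str, path: str = ":memory:") -> None:
        self._name = logical_name
        self._conn = sqlite3.connect(path)

    def metadata(self) -> dict:
        # Technical metadata a client could inspect before using the DBS.
        return {
            "name": self._name,
            "paradigm": "relational",
            "query_language": "SQL",
            "operations": ["metadata", "query", "bulk_load"],
        }

    def query(self, statement: str, params: tuple = ()) -> list:
        return self._conn.execute(statement, params).fetchall()

    def bulk_load(self, table: str, rows: list) -> int:
        if not rows:
            return 0
        placeholders = ", ".join("?" for _ in rows[0])
        # The table name is assumed to come from a trusted caller.
        with self._conn:  # one transaction for the whole batch
            self._conn.executemany(
                f"INSERT INTO {table} VALUES ({placeholders})", rows
            )
        return len(rows)
```

A second wrapper for a different product would implement the same three methods, so clients written against `GridDatabaseService` would not need to change.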

Slide 14: Roles of service wrapper
  Metadata
    - Access to technical metadata about the DBS and the set of services
    - e.g., logical/physical name of the DBS and contents, ownership, version, …
  Query
    - Various DBMS → definition of type and level of query language
  Transaction
    - Transactions involving only a single DBS
    - Allows a DBS to participate in application-wide distributed transactions
  Bulk loading
    - For large amounts of data: optimized for transfer of large datasets
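
As a small illustration of the simpler transaction case above (a transaction involving a single DBS), the sketch below uses Python's built-in `sqlite3` module, whose connection object commits or rolls back when used as a context manager. The `accounts` table and the `transfer` and `balances` helpers are hypothetical; distributed, application-wide transactions would need a coordination protocol that is not shown here.

```python
import sqlite3


def transfer(conn: sqlite3.Connection, src: int, dst: int, amount: int) -> None:
    """Move `amount` between two rows of a hypothetical `accounts` table."""
    with conn:  # commits on success, rolls back if an exception escapes
        conn.execute(
            "UPDATE accounts SET balance = balance - ? WHERE id = ?", (amount, src)
        )
        (remaining,) = conn.execute(
            "SELECT balance FROM accounts WHERE id = ?", (src,)
        ).fetchone()
        if remaining < 0:
            # Raising here undoes the first UPDATE as well.
            raise ValueError("insufficient balance")
        conn.execute(
            "UPDATE accounts SET balance = balance + ? WHERE id = ?", (amount, dst)
        )


def balances(conn: sqlite3.Connection) -> dict:
    """Return {id: balance} for every row, for inspection."""
    return dict(conn.execute("SELECT id, balance FROM accounts ORDER BY id"))
```

Either both updates become visible or neither does, which is the property a Transaction service would expose to Grid clients.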

Slide 15: Roles of service wrapper (cont'd)
  Notification
    - Allows clients to register an interest in a set of data
    - Clients receive a message when a change occurs
  Scheduling
    - Allows users to schedule the use of the DBS
  Accounting
    - Information for accounting and payment schemes
    - Monitors performance against agreed service levels
    - Enables users to be charged for resource usage
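
The Notification role can be sketched as a simple register-and-callback pattern. This is only one possible reading of the slide: `NotificationService`, `register`, and `publish_change` are hypothetical names, and delivery is assumed to be an in-process callback rather than a network message.

```python
from collections import defaultdict
from typing import Callable, DefaultDict, List

Callback = Callable[[str, str], None]


class NotificationService:
    """Clients register interest in a data set and are told about changes."""

    def __init__(self) -> None:
        self._subscribers: DefaultDict[str, List[Callback]] = defaultdict(list)

    def register(self, dataset: str, callback: Callback) -> None:
        self._subscribers[dataset].append(callback)

    def publish_change(self, dataset: str, description: str) -> int:
        """Notify every registered client; return how many were notified."""
        callbacks = self._subscribers.get(dataset, [])
        for callback in callbacks:
            callback(dataset, description)
        return len(callbacks)
```

In a wrapped DBS, `publish_change` would be called by the interface code whenever an update to the named data set is committed.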

Slide 16: Federating DBS across the Grid
  Grid application interfacing directly to a set of DBS
    - Great application complexity
    - Duplication of effort
  [Figure: a client interfacing directly to two DBS, each exposing Metadata, Query, Transaction, Notification, Bulk loading, Scheduling, and Accounting services.]

Slide 17: Federating DBS across the Grid (cont'd)
  Virtual database system on the Grid
    - Reduces to federating each of the individual services
    - Same interface as the DBS with a Grid-enabled service interface
    - Possibility of federating services of both "real" DBS and Virtual DBS
  [Figure: a client accesses a Virtual DBS through a service interface onto the Grid. Service federation middleware combines the Metadata, Query, Transaction, Notification, Bulk loading, Scheduling, and Accounting services of two DBS into the corresponding services of the Virtual DBS.]
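
The point that a Virtual DBS exposes the same interface as a real one, and can therefore be federated again, can be shown with a short sketch. Only the Metadata and Query services are modelled, the "query" is assumed to be a Python predicate over in-memory rows, and `InMemoryDBS` and `VirtualDBS` are hypothetical names.

```python
from typing import Callable, Iterable, List


class InMemoryDBS:
    """Stand-in for a 'real' DBS with a Grid-enabled service interface."""

    def __init__(self, name: str, rows: Iterable) -> None:
        self.name = name
        self._rows = list(rows)

    def metadata(self) -> dict:
        return {"name": self.name, "sources": [self.name]}

    def query(self, predicate: Callable[[object], bool]) -> List:
        return [row for row in self._rows if predicate(row)]


class VirtualDBS:
    """Federates members that offer metadata() and query()."""

    def __init__(self, name: str, members: Iterable) -> None:
        self.name = name
        self._members = list(members)

    def metadata(self) -> dict:
        # Federated metadata: the union of the members' sources, in order.
        sources = []
        for member in self._members:
            sources.extend(member.metadata()["sources"])
        return {"name": self.name, "sources": sources}

    def query(self, predicate: Callable[[object], bool]) -> List:
        # Federated query: forward to each member and concatenate results.
        results = []
        for member in self._members:
            results.extend(member.query(predicate))
        return results
```

Since `VirtualDBS` has the same two methods as `InMemoryDBS`, a `VirtualDBS` can itself be a member of another `VirtualDBS`.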

Slide 18: Creation of Virtual DBS
  Ways of creating a Virtual DBS
    - A user decides to create a Virtual DBS
    - Services take a set of DBS and create a Virtual DBS
  Challenge
    - Full standardization of all services is impossible
    - The resulting heterogeneity causes problems
  Automatic creation of a Virtual DBS
    - The tool queries the metadata service of each DBS for its functionality and interface
    - Integration of a service is impossible if no options are available
  Service federation middleware
    - Complexity varies from service to service
    - In general, complexity increases with the degree of heterogeneity of the service
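
One way to picture the automatic-creation step is a planning function that compares what each DBS's metadata service advertises. The sketch below is an interpretation, not the tool from the talk: it assumes each DBS reports, per service, a set of supported options (for example query languages), and it treats a service as federable only when all DBS share at least one option. `plan_federation` and the option names are hypothetical.

```python
from typing import Dict, List, Set, Tuple

ServiceOptions = Dict[str, Set[str]]


def plan_federation(
    descriptions: List[ServiceOptions],
) -> Tuple[Dict[str, Set[str]], List[str]]:
    """Return (federable services with shared options, excluded services)."""
    if not descriptions:
        return {}, []
    # Every service name advertised by at least one DBS.
    service_names = set().union(*(d.keys() for d in descriptions))
    federable: Dict[str, Set[str]] = {}
    excluded: List[str] = []
    for service in sorted(service_names):
        # A DBS that lacks the service contributes an empty option set.
        option_sets = [set(d.get(service, ())) for d in descriptions]
        common = set.intersection(*option_sets)
        if common:
            federable[service] = common
        else:
            excluded.append(service)
    return federable, excluded
```

A service lands in the excluded list when the DBS have no option in common, which corresponds to the case where integration of that service is not possible.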

Slide 19: Summary
  - A set of requirements for Grid databases
  - Existing Grid middleware does not meet them
  - A set of services should be offered by a Grid-integrated DBS
  Service-based approach
    - Independent of any particular implementation technology
    - Simplifies the task of writing applications that need to combine information from more than one DBS
  Virtual DBS
    - Federating DBS across the Grid

System Software Laboratory Thank you for your attention!