Hash-Based Algorithms For Operator Load-Balancing In Database Middleware Systems
www.walsaip.uprm.edu

Presentation transcript:

Angel L. Villalain-Garcia, M.S. Student
Prof. Manuel Rodriguez, Advisor
ADM Group, ECE Department, University of Puerto Rico, Mayagüez Campus

1. Project Description

Scientific applications are becoming highly dependent on Wide Area Network (WAN) environments to access data and computing resources. Traditional database middleware systems have succeeded in integrating heterogeneous data sources but fail to scale to WANs because they are built around centralized architectures with static characteristics. Our goal is to design a distributed database middleware system that takes into account the dynamic characteristics of each host on the WAN in order to run queries efficiently. We call this system NetTraveler. With NetTraveler we aim to provide an efficient environment for distributed and parallel query execution that improves response time based on the characteristics of the hosts on the WAN.

2. WAN Scenario

[Figure: WAN scenario]

3. NetTraveler Architecture

Query Service Broker (QSB): coordinates the execution process; exposes the interface used by the client; exhibits P2P behavior.
Information Gateway (IG): interface for DB access.
Registration Server (RS): coordinates federations and acts as a catalog manager.
Data Synchronization Server (DSS): acts as a data source when needed; used for recovery of queries.
Data Processing Server (DPS): interface for grid services and sensors.

5. Methodology

We use several Java technologies to implement the proposed architecture:
Web Services, Axis
Data access using JDBC, Hibernate

7. Future Work & Results

Study scheduling effects and improvements
Hash join operators and functionalities
Second-phase optimizer

8. References

1. David J. DeWitt, Shahram Ghandeharizadeh, Donovan A. Schneider, Allan Bricker, Hui-I Hsiao, and Rick Rasmussen. "The Gamma database machine project". Pages 609–626.
2. Sumit Ganguly, Waqar Hasan, and Ravi Krishnamurthy. "Query optimization for parallel execution". In SIGMOD '92: Proceedings of the 1992 ACM SIGMOD International Conference on Management of Data, pages 9–18, New York, NY, USA. ACM Press.
3. Wei Hong and Michael Stonebraker. "Optimization of parallel query execution plans in XPRS". In PDIS '91: Proceedings of the First International Conference on Parallel and Distributed Information Systems, pages 218–225, Los Alamitos, CA, USA. IEEE Computer Society Press.
4. Manuel Rodriguez-Martinez and Nick Roussopoulos. "MOCHA: A Self-Extensible Database Middleware System for Distributed Data Sources". In Proceedings of the ACM SIGMOD International Conference on Management of Data, Dallas, Texas, USA, May.
5. Waqar Hasan, Daniela Florescu, and Patrick Valduriez. "Open Issues in Parallel Query Optimization". SIGMOD Rec., Vol. 25, No. 3, September.

4. Proposed Work

[Figure: Traditional DBMS Plan, Centralized Query Optimizer]
[Figure: Distributed and Parallel DBMS Plan, Decentralized Query Optimizer]

Conventional DBMSs are usually modelled as a centralized architecture with several pipelined steps. One important step is the Query Optimizer (QO), the process that determines the actions needed to answer a query. The first figure represents the most important aspects of a QO and its output. For the reasons cited in [5], a centralized QO fails to scale to WAN environments. With that in mind, we propose a decentralized QO that creates plans exploiting parallel and distributed execution in a WAN environment (second figure). Both the query optimizer and the query parallelization step need information about the resources available on each host. Our basic assumption is that data sources are replicated, so we only need to select the best host for each step of the query execution. The result is illustrated in the accompanying figure.
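The idea of selecting the best host for each step, given replicated data sources, can be sketched in a few lines. This is only an illustration under stated assumptions, not NetTraveler's optimizer: the `Host` fields, the cost function `base_cost`, the `step_penalty` value, and the example data are all hypothetical, since the poster does not say which host characteristics are measured or how they are combined.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Host:
    """Snapshot of one host; both metrics are hypothetical stand-ins."""
    name: str
    cpu_load_pct: int   # CPU in use, 0 to 100
    latency_ms: int     # round-trip time from the coordinating node


def base_cost(host):
    """Assumed cost function (lower is better): load plus latency."""
    return host.cpu_load_pct + host.latency_ms


def choose_hosts(steps, replicas, step_penalty=25):
    """Assign each query step to the cheapest host holding a replica.

    steps:        list of (step_name, data_source) pairs, in execution order
    replicas:     dict mapping data_source -> list of Host objects
    step_penalty: assumed cost added to a host for every step already
                  assigned to it, so work spreads out instead of piling up
    """
    assigned = {}   # host name -> number of steps given to it so far
    plan = {}
    for step_name, source in steps:
        best = min(
            replicas[source],
            key=lambda h: base_cost(h) + step_penalty * assigned.get(h.name, 0),
        )
        plan[step_name] = best.name
        assigned[best.name] = assigned.get(best.name, 0) + 1
    return plan


# Made-up example: three gateways, two replicated data sources.
ig1 = Host("IG 1", cpu_load_pct=20, latency_ms=40)
ig2 = Host("IG 2", cpu_load_pct=70, latency_ms=10)
ig3 = Host("IG 3", cpu_load_pct=10, latency_ms=60)

replicas = {"orders": [ig1, ig2], "customers": [ig2, ig3]}
steps = [("scan_orders", "orders"),
         ("scan_customers", "customers"),
         ("rescan_orders", "orders")]

plan = choose_hosts(steps, replicas)
```

With these made-up numbers the first scan of `orders` goes to IG 1, but the second goes to IG 2, because the penalty for the step already placed on IG 1 outweighs IG 2's higher load. A real optimizer would need fresh measurements from each host rather than a static snapshot.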
[Figure: proposed execution. Nodes: Client, QSB 1 to QSB 4, IG 1 to IG 3, DSS 1. Labels: query Q, sub-queries Q1, Q2, Q3, partial results R1, R2, R3, result R.]

6. Project Status

Currently the system provides an environment for remote query execution and orchestrates the services involved to return the result to the client, as shown in the figure for this section.

[Figure: current execution. Nodes: Client, QSB 1 to QSB 4, IG 1 to IG 3, DSS 1. Labels: query Q, result R.]

Current systems support parallel execution of query operations and a static distribution of the load across participating sites.
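As a rough illustration of what a static distribution of load looks like for a hash join, the sketch below assigns each tuple to a site purely by hashing its join key, then joins each site's share locally. It is a generic textbook-style partitioned hash join, not code from NetTraveler; the function names, the use of CRC32 as the hash, and the tuple layout are assumptions made for the example.

```python
import zlib
from collections import defaultdict


def site_for(key, sites):
    """Statically map a join key to a site using a deterministic hash."""
    digest = zlib.crc32(str(key).encode("utf-8"))
    return sites[digest % len(sites)]


def partition(rows, key_index, sites):
    """Group rows by the site their join key hashes to."""
    buckets = defaultdict(list)
    for row in rows:
        buckets[site_for(row[key_index], sites)].append(row)
    return buckets


def partitioned_hash_join(left, right, left_key, right_key, sites):
    """Equi-join two lists of tuples, one hash-join per site.

    The loop over sites runs sequentially here; each iteration stands in
    for the work that one site would do in parallel.
    """
    left_parts = partition(left, left_key, sites)
    right_parts = partition(right, right_key, sites)
    result = []
    for site in sites:
        # Build phase: hash table over this site's share of the left input.
        table = defaultdict(list)
        for row in left_parts.get(site, []):
            table[row[left_key]].append(row)
        # Probe phase: look up each right row routed to the same site.
        for row in right_parts.get(site, []):
            for match in table.get(row[right_key], []):
                result.append(match + row)
    return result
```

Because the target site depends only on the key hash, a slow or overloaded site receives its share regardless of its current state. That is the kind of limitation a load-aware scheme, which also consults host characteristics, would try to address.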