Optimization of Distributed Queries. Univ.-Prof. Dr. Peter Brezany, Institut für Scientific Computing, Universität Wien.

Presentation transcript:

Optimization of Distributed Queries
Univ.-Prof. Dr. Peter Brezany
Institut für Scientific Computing, Universität Wien
Tel: Office hours: Tue, LV-Portal:

Slide 2: Layers of Query Processing

Slide 3: Query Optimization Process

Slide 4: Search Space
For a given query, the search space can be defined as the set of equivalent operator trees that can be produced using transformation rules. It is useful to concentrate on join trees, i.e., operator trees whose operators are joins or Cartesian products, because permutations of the join order have the largest effect on the performance of relational queries. The next example illustrates three equivalent join trees, which are obtained by exploiting the associativity of binary operators. Join tree (c), which starts with a Cartesian product, may have a much higher cost than the other join trees.

Slide 5: Search Space - Example
Example:
SELECT ENAME, RESP
FROM EMP, ASG, PROJ
WHERE EMP.ENO = ASG.ENO
AND ASG.PNO = PROJ.PNO
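The effect of the join order on this query can be sketched in Python. The snippet is an illustration only, not part of the original slides: it lists the three possible choices of the first pair among EMP, ASG, and PROJ and marks the pair that shares no join predicate, which therefore starts with a Cartesian product. The names `JOIN_PREDICATES` and `first_join_options` are hypothetical.

```python
from itertools import combinations

# Join predicates taken from the WHERE clause of the example query.
JOIN_PREDICATES = {
    frozenset({"EMP", "ASG"}): "EMP.ENO = ASG.ENO",
    frozenset({"ASG", "PROJ"}): "ASG.PNO = PROJ.PNO",
}
RELATIONS = ["EMP", "ASG", "PROJ"]


def first_join_options(relations, predicates):
    """Return (pair, remaining relation, predicate or None) for each first join."""
    options = []
    for a, b in combinations(relations, 2):
        rest = [r for r in relations if r not in (a, b)][0]
        options.append(((a, b), rest, predicates.get(frozenset({a, b}))))
    return options


for (a, b), rest, pred in first_join_options(RELATIONS, JOIN_PREDICATES):
    # A pair without a predicate can only be combined by a Cartesian product.
    kind = f"join on {pred}" if pred else "Cartesian product"
    print(f"({a}, {b}) first, then {rest}: {kind}")
```

Only the pair (EMP, PROJ) has no predicate, which corresponds to the join tree that starts with a Cartesian product.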

Slide 6: Search Space - Shape of the Join Tree
A linear tree is a tree in which at least one operand of each operator node is a base relation. A bushy tree is more general and may have operators both of whose operands are intermediate relations. In a distributed environment, bushy trees are useful for exhibiting parallelism.
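To give a feeling for how much larger the search space becomes when bushy trees are allowed, the following sketch counts join trees for n relations. It rests on assumptions that are not in the slides: left and right operands are treated as distinct, Cartesian products are counted too, and "linear" is narrowed to left-deep trees. Under these assumptions there are n! left-deep trees and (2(n-1))!/(n-1)! binary join trees of any shape. The function names are hypothetical.

```python
from math import factorial


def count_left_deep_trees(n):
    """Left-deep join trees over n relations: one per permutation."""
    return factorial(n)


def count_all_join_trees(n):
    """Binary join trees of any shape over n relations: (2(n-1))! / (n-1)!."""
    return factorial(2 * (n - 1)) // factorial(n - 1)


for n in range(2, 7):
    print(n, count_left_deep_trees(n), count_all_join_trees(n))
```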

Slide 7: Distributed Cost Model
An optimizer's cost model includes:
– Cost functions to predict the cost of operators
– Statistics on the base data and formulas to evaluate the sizes of intermediate results
Cost functions can be expressed with respect to either the total time or the response time. The total time is the sum of all time (cost) components; the response time is the elapsed time from the initiation to the completion of the query.
– Total_time = T_CPU * #insts + T_I/O * #I/Os + T_MSG * #msgs + T_TR * #bytes
» T_CPU: the time of a CPU instruction
» T_I/O: the time of a disk I/O
» T_MSG: the fixed time of initiating and receiving a message
» T_TR: the time it takes to transmit a data unit from one site to another
Costs are generally expressed in terms of time units, which in turn can be translated into other units (e.g., dollars).

Slide 8: Cost Function (cont.)
When the response time of the query is the objective function of the optimizer, parallel local processing and parallel communications must also be considered.
– Response_time = T_CPU * seq_#insts + T_I/O * seq_#I/Os + T_MSG * seq_#msgs + T_TR * seq_#bytes
where seq_#x is the maximum number of x that must be done sequentially.
Most early distributed DBMSs designed for wide area networks ignored the local processing cost and concentrated on minimizing the communication cost.
Example: Site 1 sends x units of data to Site 3, and Site 2 sends y units of data to Site 3. Considering communication cost only:
Total_time = 2 * T_MSG + T_TR * (x + y)
Response_time = max { T_MSG + T_TR * x, T_MSG + T_TR * y }
since the two transfers can be done in parallel.
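The two formulas of the example can be checked with a few lines of Python. This is a minimal sketch that considers communication cost only, as in the example; the function names and the numeric values for T_MSG, T_TR, x, and y are made up for illustration.

```python
def total_time(t_msg, t_tr, transfers):
    """Sum of the costs of all transfers (one message per transfer)."""
    return sum(t_msg + t_tr * units for units in transfers)


def response_time(t_msg, t_tr, transfers):
    """Elapsed time when all transfers run in parallel: the slowest one."""
    return max(t_msg + t_tr * units for units in transfers)


# Hypothetical values: T_MSG = 1.0, T_TR = 0.5 per data unit, x = 100, y = 40.
T_MSG, T_TR = 1.0, 0.5
x, y = 100, 40

print(total_time(T_MSG, T_TR, [x, y]))     # 2 * T_MSG + T_TR * (x + y)
print(response_time(T_MSG, T_TR, [x, y]))  # max over the two transfers
```

With these values the total time is 72 time units while the response time is 51, which shows how parallel transfers lower the response time without lowering the total time.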

Slide 9: Cost Function (cont.)
Minimizing response time is achieved by increasing the degree of parallel execution. This does not imply that the total time is also minimized. On the contrary, it can increase the total time, for example through more parallel local processing (which often includes synchronization overhead) and more transmissions. Minimizing the total time implies that the utilization of the resources improves, thus increasing the system throughput. In practice, a compromise between the total time and the response time is desired.

Slide 10: Database Statistics
The main factor affecting performance is the size of the intermediate relations that are produced during execution. When a subsequent operation is located at a different site, the intermediate relation must be transmitted over the network. It is therefore of prime interest to estimate the size of the intermediate results in order to minimize the size of data transfers. The estimation is based on statistical information about the base relations and on formulas that predict the cardinalities of the results of the relational operations. There is a trade-off: the more precise the statistics, the more costly they are. For a relation R defined over the attributes A = { A_1, A_2, ..., A_n } and fragmented as R_1, R_2, ..., R_r, the statistical data are the following:

Slide 11: Database Statistics (cont.)

Slide 12: Database Statistics (cont.)

Slide 13: Database Statistics (cont.)

Slide 14: Database Statistics (cont.)

Slide 15: Centralized Query Optimization - INGRES Algorithm
Why review centralized optimization?
– A distributed query is translated into local queries, each of which is processed in a centralized way.
– Distributed techniques are extensions of centralized ones.
– Centralized optimization is a simpler problem; the minimization of communication costs makes distributed query optimization more complex.
INGRES is a popular relational DB system, and it has a distributed version whose optimization algorithms are extensions of those of the centralized version.

Slide 16: INGRES Algorithm
INGRES uses a dynamic query optimization algorithm that recursively breaks up a calculus query into smaller pieces. It combines calculus-algebra decomposition and optimization. A query is first decomposed into a sequence of queries having a unique relation in common. Then each monorelation query is processed by a "one-variable query processor" (OVQP). The OVQP optimizes the access to a single relation by selecting the best access method for that relation (e.g., index, sequential scan).

Slide 17: INGRES Algorithm (cont.)
By q: q_{i-1} → q_i we denote a query q decomposed into two subqueries q_{i-1} and q_i, where q_{i-1} is executed first and its result is consumed by q_i. Given an n-relation query q, the INGRES query processor decomposes q into n subqueries q_1 → q_2 → ... → q_n. This decomposition uses two basic techniques: detachment and substitution. Detachment is used first; it breaks q into q' → q'', based on a common relation that is the result of q'. A more detailed explanation and examples follow.

Slide 18: INGRES Algorithm - Detachment
If the query q expressed in SQL is of the form

Slide 19: Database Example

Slide 20: INGRES Algorithm - Detachment - Example
Running example for INGRES: To illustrate the detachment technique, we apply it to the following query: "Names of employees working on the CAD/CAM project". This query can be expressed by the following query q_1 on our example engineering DB.

Slide 21: INGRES Algorithm - Substitution

Slide 22: INGRES Algorithm - Substitution - Example
OVQP = one-variable query processor

Slide 23: INGRES Algorithm (formalized)

Slide 24: Join Ordering in Fragment Queries
Ordering joins is an important aspect of centralized query optimization. Join ordering in a distributed context is even more important, since joins between fragments may increase the communication time. Two basic approaches exist to order joins in fragment queries:
1. Direct optimization of the ordering of joins (e.g., in the Distributed INGRES algorithm).
2. Replacement of joins by combinations of semijoins in order to minimize communication costs.

Slide 25: Join Ordering
Note: To simplify notation, we use the term relation to designate a fragment stored at a particular site. Let the query be R ⋈ S, where R and S are relations stored at different sites and ⋈ denotes the join operator. The obvious choice is to send the smaller relation to the site of the larger one:
– send R to the site of S if size(R) < size(S)
– send S to the site of R if size(R) > size(S)
More interesting is the case where there are more than two relations to join. The objective of the join ordering algorithm is still to transmit smaller operands. The difficulty is that join operations may reduce or increase the size of intermediate results, so estimating the size of join results is mandatory but difficult.
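For the two-relation case, the rule "ship the smaller relation to the site of the larger one" fits in a few lines. The sketch below is illustrative; the function name `plan_two_way_join`, the site labels, and the sizes are hypothetical, and ties are resolved arbitrarily in favour of shipping R.

```python
def plan_two_way_join(size_r, size_s, site_r, site_s):
    """Return (relation to ship, site where the join is executed)."""
    if size_r <= size_s:
        # R is the smaller operand: ship it to the site of S.
        return ("R", site_s)
    # Otherwise S is smaller: ship it to the site of R.
    return ("S", site_r)


print(plan_two_way_join(size_r=1_000, size_s=50_000, site_r="site1", site_s="site2"))
print(plan_two_way_join(size_r=80_000, size_s=50_000, site_r="site1", site_s="site2"))
```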

Slide 26: Join Ordering - Example
Example: Consider the following query expressed in relational algebra:
PROJ ⋈_PNO ASG ⋈_ENO EMP
whose join graph is: EMP (Site 1) -ENO- ASG (Site 2) -PNO- PROJ (Site 3)
This query can be executed in at least five different ways. We describe them by the programs introduced on the next slide.

Slide 27: Join Ordering - Example (cont.)

Slide 28: Semijoin-Based Algorithms
The join of two relations R and S over attribute A, stored at sites 1 and 2 respectively, can be computed by replacing one or both operand relations by a semijoin with the other relation, using the following rules:
Strategy 1:
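The idea of replacing an operand by a semijoin can be illustrated with a small Python sketch. It is not the rule set from the slide (which is not reproduced in the transcript) but a check of one standard equivalence, R ⋈_A S = (R ⋉_A S) ⋈_A S, on made-up data. Relations are modelled as lists of dicts, and `join` and `semijoin` are hypothetical helper names.

```python
def join(r, s, attr):
    """Nested-loop equijoin of two lists of dict tuples on attribute `attr`."""
    return [{**t, **u} for t in r for u in s if t[attr] == u[attr]]


def semijoin(r, s, attr):
    """Tuples of r that have at least one join partner in s."""
    keys = {u[attr] for u in s}  # only this projection of S has to be shipped
    return [t for t in r if t[attr] in keys]


# Hypothetical relations: R stored at site 1, S stored at site 2.
R = [{"A": 1, "B": "x"}, {"A": 2, "B": "y"}, {"A": 3, "B": "z"}]
S = [{"A": 2, "C": 10}, {"A": 3, "C": 20}, {"A": 3, "C": 30}, {"A": 4, "C": 40}]

reduced_R = semijoin(R, S, "A")   # computed at site 1, then shipped to site 2
via_semijoin = join(reduced_R, S, "A")

print(len(R), len(reduced_R))     # the reduced operand is smaller than R
print(via_semijoin == join(R, S, "A"))
```

The semijoin pays off when shipping the projection of S on A plus the reduced R costs less than shipping R as a whole.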

Slide 29: Horizontal fragmentation

Slide 30: Vertical fragmentation

Slide 31: Hybrid (Mixed) Fragmentation
In most cases a simple horizontal or vertical fragmentation will not be sufficient to satisfy the requirements of user applications. In this case a vertical fragmentation may be followed by a horizontal one, or vice versa, producing a tree-structured partitioning.
[Figures: Hybrid fragmentation; Reconstruction of hybrid fragmentation]
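A hybrid fragmentation and its reconstruction can be sketched as follows. The example is an assumption-laden illustration: the relation, its attribute names, and its rows are made up, horizontal fragments are rebuilt by union, and vertical fragments are rebuilt by a join on the key that both fragments retain. All function names are hypothetical.

```python
def horizontal_fragment(rows, predicate):
    """Split rows into (matching, non-matching) horizontal fragments."""
    return ([r for r in rows if predicate(r)],
            [r for r in rows if not predicate(r)])


def vertical_fragment(rows, key, attrs):
    """Project rows onto the key plus the given attributes."""
    return [{k: r[k] for k in (key, *attrs)} for r in rows]


def join_on_key(left, right, key):
    """Rebuild full tuples from two vertical fragments sharing `key`."""
    by_key = {r[key]: r for r in right}
    return [{**l, **by_key[l[key]]} for l in left]


# Hypothetical relation with key ENO.
rows = [
    {"ENO": "E1", "ENAME": "A", "TITLE": "T1"},
    {"ENO": "E2", "ENAME": "B", "TITLE": "T2"},
    {"ENO": "E3", "ENAME": "C", "TITLE": "T1"},
    {"ENO": "E4", "ENAME": "D", "TITLE": "T3"},
]

# Horizontal step first, then a vertical step on the second fragment.
h1, h2 = horizontal_fragment(rows, lambda r: r["ENO"] in {"E1", "E2"})
h2_v1 = vertical_fragment(h2, "ENO", ["ENAME"])
h2_v2 = vertical_fragment(h2, "ENO", ["TITLE"])

# Reconstruction walks the fragmentation tree bottom-up: join, then union.
reconstructed = h1 + join_on_key(h2_v1, h2_v2, "ENO")
print(reconstructed == rows)
```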

Slide 32: Hybrid Fragmentation (cont.)
[Figure: Relation A partitioned into Fragment H_1, Fragment H_2V_2, Fragment H_2V_1H_1, Fragment H_2V_1H_2, and Fragment H_2V_1H_3]

Slide 33: Distributed INGRES Algorithm
Only horizontal fragmentation is handled. Both general and broadcast network topologies are considered (in a broadcast network, the same data unit can be transmitted from one site to all the other sites in a single transfer). For example, broadcasting is used to replicate fragments and thus to maximize the degree of parallelism. The input is a query expressed in tuple relational calculus, together with schema information, the network type, and the location and size of each fragment. The algorithm is executed by the site where the query is initiated, called the master site. The algorithm, called D-INGRES-QOA, is given as Algorithm 9.3 on the next slide.

Slide 34: Distributed INGRES Algorithm
MRQ'_list

Slide 35: Distributed INGRES Algorithm (cont.)
All monorelation queries (e.g., selection and projection) that can be detached are first processed locally [Step (1)].
The reduction algorithm is applied to the original query [Step (2)]. (Reduction is a technique that isolates all irreducible subqueries and monorelation subqueries by detachment.) Monorelation queries are ignored because they have already been processed in Step (1). Thus the REDUCE procedure produces a sequence of irreducible subqueries q_1 → q_2 → ... → q_n, with at most one relation in common between two consecutive subqueries. Our running example on the previous slides, which illustrated the detachment technique, also illustrates what the REDUCE procedure would produce.
Based on the list of irreducible queries isolated in Step (2) and the size of each fragment, the next subquery MRQ', which has at least two variables, is chosen at Step (3.1), and Steps (3.2), (3.3), and (3.4) are applied to it.
Step (3.2) selects the best strategy to process the query MRQ'. This strategy is described by a list of pairs (F, S), in which F is a fragment to transfer to the processing site S.
Step (3.3) transfers all the fragments to their processing sites.
Step (3.4) executes MRQ'.
If there are remaining subqueries, the algorithm goes back to Step (3) and performs the next iteration. Otherwise, the algorithm terminates.

Slide 36: Distributed INGRES Algorithm (cont.)
Optimization occurs in Steps (3.1) and (3.2). The algorithm has produced subqueries with several components and their dependency order (similar to the one given by a relational algebra tree). At Step (3.1), a simple choice for the next subquery is to take the next one that has no predecessor and involves the smaller fragments. This minimizes the size of the intermediate results. For example, if a query q has the subqueries q_1, q_2, and q_3, with dependencies q_1 → q_3 and q_2 → q_3, and if the fragments referred to by q_1 are smaller than those referred to by q_2, then q_1 is selected. The subquery selected must then be executed. Since the relations involved in a subquery may be stored at different sites and even fragmented, the subquery may nevertheless be further subdivided.
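The selection rule of Step (3.1) can be sketched as a small greedy loop. This is an illustration under assumptions, not the D-INGRES-QOA algorithm itself: each subquery is given a single hypothetical number standing for the total size of the fragments it refers to, and `choose_next_subquery` is a made-up name.

```python
def choose_next_subquery(pending, predecessors, fragment_size):
    """Pick the pending subquery with no pending predecessor and the smallest fragments."""
    pending_set = set(pending)
    ready = [q for q in pending
             if not (predecessors.get(q, set()) & pending_set)]
    return min(ready, key=lambda q: fragment_size[q])


# Dependencies from the slide: q1 -> q3 and q2 -> q3.
predecessors = {"q3": {"q1", "q2"}}
# Hypothetical fragment sizes; q1 refers to smaller fragments than q2.
fragment_size = {"q1": 100, "q2": 500, "q3": 50}

pending = ["q1", "q2", "q3"]
order = []
while pending:
    nxt = choose_next_subquery(pending, predecessors, fragment_size)
    order.append(nxt)
    pending.remove(nxt)

print(order)
```

Although q3 refers to the smallest fragments in this made-up data, it is only chosen after both of its predecessors have been executed.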

Slide 37: Distributed INGRES Algorithm (cont.) - Example
Assume that relations EMP, ASG, and PROJ of the query of our running example are stored as follows, where relation EMP is fragmented.

Slide 38: Distributed INGRES Algorithm (cont.)
At Step (3.2), the next optimization problem is to determine how to execute the subquery by selecting the fragments that will be moved and the sites where the processing will take place. For an n-relation subquery, fragments from n-1 relations must be moved to the site(s) of the fragments of the remaining relation, say R_p, and then replicated there. Also, the remaining relation may be further partitioned into k "equalized" fragments in order to increase parallelism. This method is called fragment-and-replicate and performs a substitution of fragments rather than of tuples as in centralized INGRES. The selection of the remaining relation and of the number of processing sites k on which it should be partitioned is based on the objective function and the topology of the network. (Replication is cheaper in broadcast networks than in point-to-point networks.)
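A toy version of fragment-and-replicate for a two-relation subquery may help to see why the method is correct. In this sketch, which is an assumption-based illustration rather than the algorithm from the slides, the remaining relation R_p is split round-robin into k equalized fragments, the other relation is replicated to every processing site, and the union of the k local joins equals the full join. The data and all function names are hypothetical.

```python
def join(r, s, attr):
    """Nested-loop equijoin of two lists of dict tuples on attribute `attr`."""
    return [{**t, **u} for t in r for u in s if t[attr] == u[attr]]


def equalize(rows, k):
    """Split rows round-robin into k fragments of (nearly) equal size."""
    return [rows[i::k] for i in range(k)]


def fragment_and_replicate(r_p, s, attr, k):
    """Join each of the k fragments of R_p with a replica of S; union the results."""
    result = []
    for fragment in equalize(r_p, k):
        # Each processing site holds one fragment of R_p and a full replica
        # of S, so the k local joins could run in parallel.
        result.extend(join(fragment, s, attr))
    return result


def canonical(rows):
    """Order-independent representation used to compare two results."""
    return sorted(tuple(sorted(t.items())) for t in rows)


# Hypothetical data: ASG plays the role of R_p, PROJ is replicated.
ASG = [{"ENO": f"E{i}", "PNO": f"P{i % 3}"} for i in range(6)]
PROJ = [{"PNO": "P0", "PNAME": "a"}, {"PNO": "P1", "PNAME": "b"}]

distributed = fragment_and_replicate(ASG, PROJ, "PNO", k=3)
print(canonical(distributed) == canonical(join(ASG, PROJ, "PNO")))
```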

Slide 39: Distributed INGRES Algorithm (cont.) - Example

Slide 40: Architecture of Distributed DBMS Revisited
[Figures: Components of a Distributed DBMS; Detailed Model of the Distributed Execution Monitor]

Slide 41: Architecture Revisited (cont.)
The Transaction Manager (TM) is responsible for coordinating the execution of the DB operations on behalf of an application. The Scheduler (SC) is responsible for the implementation of a specific concurrency control algorithm for synchronizing access to the database. Each transaction originates at one site, its originating site. The execution of the database operations of a transaction is coordinated by the TM at that transaction's originating site. The TMs implement an interface for the application programs which consists of five commands:
1. Begin_transaction. This is an indicator to the TM that a new transaction is starting. The TM does some bookkeeping, such as recording the transaction's name, the originating application, etc.
2. Read. If the data item x is stored locally, its value is read and returned to the transaction. Otherwise, the TM selects one copy of x and requests that this copy be returned.
3. Write. The TM coordinates the updating of x's value at each site where it resides.
4. Commit. The TM coordinates the physical updating of all databases that contain copies of each data item for which a previous write was issued.
5. Abort. The TM makes sure that no effects of the transaction are reflected in the DB.
In providing these services, a TM can communicate with SCs and data processors at the same or at different sites.
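The five commands can be mirrored by a very small in-memory sketch. It is a toy under stated assumptions, not the architecture from the slides: sites are plain dictionaries, writes are buffered until commit, a transaction sees its own buffered writes, and there is no scheduler, concurrency control, or commit protocol. The class and method names are hypothetical.

```python
class TransactionManager:
    """Toy TM for one originating site; `sites` maps site name -> {item: value}."""

    def __init__(self, local_site, sites):
        self.local_site = local_site
        self.sites = sites
        self.pending = {}  # transaction name -> buffered writes

    def begin_transaction(self, name):
        # Bookkeeping: remember the transaction and start an empty write buffer.
        self.pending[name] = {}

    def read(self, name, item):
        if item in self.pending[name]:        # assumption: read your own writes
            return self.pending[name][item]
        local = self.sites[self.local_site]
        if item in local:                     # stored locally
            return local[item]
        for store in self.sites.values():     # otherwise pick one remote copy
            if item in store:
                return store[item]
        raise KeyError(item)

    def write(self, name, item, value):
        self.pending[name][item] = value

    def commit(self, name):
        # Install each buffered write at every site that holds a copy.
        for item, value in self.pending.pop(name).items():
            for store in self.sites.values():
                if item in store:
                    store[item] = value

    def abort(self, name):
        # Dropping the buffer leaves no effect in any database.
        self.pending.pop(name, None)


sites = {"s1": {"x": 1}, "s2": {"x": 1, "y": 5}}
tm = TransactionManager("s1", sites)

tm.begin_transaction("T1")
tm.write("T1", "x", 2)
tm.abort("T1")                 # x stays 1 at both sites

tm.begin_transaction("T2")
tm.write("T2", "x", 3)
remote_y = tm.read("T2", "y")  # y is not stored at s1, so a remote copy is read
tm.commit("T2")                # x becomes 3 at both sites
print(sites, remote_y)
```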