The Volcano Optimizer Generator: Extensibility and Efficient Search


Background
Emerging database applications demand
  – New functionality
  – High performance
Volcano Project
  – Provides efficient, extensible tools for query and request processing
  – For object-oriented and scientific database systems

Introduction
Performance must not be sacrificed
  – Data volumes stored in database systems continue to grow and must be supported
  – Good performance is needed to overcome acceptance problems
  – Additional software layers must be counterbalanced by performance advantages

New Optimizer Generator
Search engine is more extensible and powerful
Effective support for non-trivial cost models and for physical properties such as sort order
Combines dynamic programming with goal-directed search and branch-and-bound pruning

Properties of New Optimizer
Usability as a stand-alone tool
More efficient resource usage
  – Optimization time, memory consumption
Extensible support for physical properties
  – Sort order, compression status

Properties of New Optimizer
Permit use of heuristics
  – Guide the search and prune futile parts
Support flexible cost models that permit generating dynamic plans
  – For incompletely specified queries
Data model independence

Generator Paradigm

Design Principles
Query processing is based on algebraic techniques
  – Uses transformations and cost-based mapping of logical algebra operators to algorithms
Rules
  – Identified as a general concept for specifying knowledge about patterns in a concise and modular fashion
  – Encode the knowledge of algebraic laws required for equivalence transformations

Design Principles
Optimizer choices are represented as algebraic equivalences in the generator’s input
  – No intermediate levels
  – The search engine applies them suitably
Compiled rule set
Dynamic programming

Optimizer Operation
User queries are specified as algebra expressions of logical operators
Goal: map the logical algebra to the physical algebra
  – Transformation rules and implementation rules (pattern match plus condition)
  – An implementation rule may map multiple logical operators to a single physical operator (e.g. a join followed by a projection)
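The two rule kinds on this slide can be illustrated with a minimal Python sketch. Everything here is an assumption made for illustration: the nested-tuple encoding of expressions, the `Rule` class, and the operator name `HASH_JOIN_WITH_PROJECT` are hypothetical and are not the generator's actual rule language.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Tuple

# Hypothetical encoding: a logical expression is a nested tuple such as
# ("JOIN", left, right), ("PROJECT", child) or ("GET", "table_name").
Expr = Tuple


@dataclass(frozen=True)
class Rule:
    name: str
    matches: Callable[[Expr], bool]    # pattern match on the expression shape
    condition: Callable[[Expr], bool]  # additional condition code
    rewrite: Callable[[Expr], Expr]    # builds the resulting expression

    def apply(self, expr: Expr) -> Optional[Expr]:
        """Return the rewritten expression, or None if the rule does not apply."""
        if self.matches(expr) and self.condition(expr):
            return self.rewrite(expr)
        return None


# Transformation rule: logical expression -> equivalent logical expression.
join_commutativity = Rule(
    name="join_commutativity",
    matches=lambda e: e[0] == "JOIN",
    condition=lambda e: True,
    rewrite=lambda e: ("JOIN", e[2], e[1]),
)

# Implementation rule: two logical operators (a join followed by a
# projection) are mapped to one hypothetical physical operator.
join_project_to_hash_join = Rule(
    name="join_project_to_hash_join",
    matches=lambda e: e[0] == "PROJECT" and e[1][0] == "JOIN",
    condition=lambda e: True,
    rewrite=lambda e: ("HASH_JOIN_WITH_PROJECT", e[1][1], e[1][2]),
)
```

Calling `join_commutativity.apply(("JOIN", r, s))` yields the swapped join, while applying it to a non-join expression returns `None`, which is how the sketch signals that the pattern did not match.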

Optimizer Operation
A physical property vector summarizes the physical properties of intermediate results
Enforcers (sorting, decompression)
  – Physical algebra operators that do not correspond to any logical operator
  – Their purpose is to enforce physical properties

Properties
Properties describe results
  – Logical properties (schema, size, ...)
  – Physical properties (sort order, ...)
Physical properties are summarized in a physical property vector
  – Specified by the optimizer implementor
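As a rough illustration of a physical property vector, the Python sketch below models two properties mentioned in these slides, sort order and compression status. The `PhysProps` class and the `satisfies` function are hypothetical names, and the prefix rule for sort orders is an assumption about how an implementor might define the comparison.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class PhysProps:
    """Hypothetical physical property vector with two properties."""
    sort_order: Tuple[str, ...] = ()   # () means no particular order
    compressed: bool = False


def satisfies(delivered: PhysProps, required: PhysProps) -> bool:
    """Check whether a delivered result meets a required property vector.

    Assumption: a result sorted on (a, b) is also sorted on (a,), so the
    required sort order only has to be a prefix of the delivered one.
    """
    prefix = delivered.sort_order[:len(required.sort_order)]
    return prefix == required.sort_order and delivered.compressed == required.compressed
```

An empty required sort order is satisfied by any delivered order, which matches the idea that an unconstrained request accepts any plan.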

Optimizer Operation
Applicability functions
  – Determine whether an algorithm or enforcer can deliver a logical expression with physical properties that satisfy the physical property vector
  – Determine the physical property vectors that the algorithm’s inputs must satisfy
Cost function
  – Cost is an abstract data type
  – Estimates the cost of an algorithm or enforcer
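A minimal Python sketch of applicability and cost functions follows. It assumes a property vector reduced to a single sort column, and the function names, the choice of join algorithms, and the cost formula are all hypothetical. Each applicability function returns `None` when the algorithm or enforcer cannot deliver the required properties, and otherwise returns the property vectors its inputs must satisfy.

```python
from typing import List, Optional

# Simplified property vector: the name of a sort column, or None for "any order".
SortOrder = Optional[str]


def merge_join_applicability(required: SortOrder, join_col: str) -> Optional[List[SortOrder]]:
    """Merge join delivers output sorted on the join column and needs sorted inputs."""
    if required not in (None, join_col):
        return None                      # cannot deliver any other sort order
    return [join_col, join_col]          # both inputs must be sorted on join_col


def hash_join_applicability(required: SortOrder, join_col: str) -> Optional[List[SortOrder]]:
    """Assumption: hash join output is unordered, so it only fits an unordered request."""
    if required is not None:
        return None
    return [None, None]


def sort_applicability(required: SortOrder) -> Optional[List[SortOrder]]:
    """The sort enforcer delivers the required order from an unordered input."""
    if required is None:
        return None                      # nothing to enforce
    return [None]


def merge_join_cost(left_rows: int, right_rows: int) -> float:
    """Hypothetical cost estimate: one pass over each sorted input."""
    return float(left_rows + right_rows)
```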

Optimizer Operation
Property functions
  – Determine the logical and physical properties of logical and physical algebra expressions
  – One for each logical operator, algorithm, and enforcer

Optimizer Input
The optimizer implementor provides
  – A set of logical operators
  – Algebraic transformation rules (with condition code)
  – A set of algorithms and enforcers
  – Implementation rules (with condition code)
  – An ADT for cost (functions for arithmetic and comparison)
  – An ADT for physical properties
  – Applicability functions
  – Cost functions
  – Property functions
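The cost ADT with arithmetic and comparison functions might look like the following Python sketch. The two components (`io`, `cpu`) and the weight used to combine them are purely illustrative assumptions; the point is only that the search engine needs addition and comparison and does not need to know what a cost is made of.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Cost:
    """Hypothetical cost ADT with I/O and CPU components."""
    io: float = 0.0
    cpu: float = 0.0

    def total(self) -> float:
        # Assumed weighting: one unit of I/O counts as 100 units of CPU work.
        return self.io + 0.01 * self.cpu

    def __add__(self, other: "Cost") -> "Cost":
        # Arithmetic: accumulate the cost of an operator and its inputs.
        return Cost(self.io + other.io, self.cpu + other.cpu)

    def __lt__(self, other: "Cost") -> bool:
        # Comparison: used to pick the cheaper plan.
        return self.total() < other.total()

    def __le__(self, other: "Cost") -> bool:
        # Comparison: used to test a plan against a cost limit.
        return self.total() <= other.total()
```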

The Search Engine
The search engine and its algorithm are central components of a query optimizer
The same search engine is used for all generated optimizers
The search engine is linked automatically with the pattern-matching and rule-application code generated from the data model description

Dynamic Programming
Extends dynamic programming to general algebraic query and request optimization, and combines it with a top-down, goal-oriented control strategy for algebras in which the number of possible plans exceeds the practical limits of pre-computation
Derives equivalent expressions and plans only for those partial queries that are considered as parts of larger subqueries
Directed dynamic programming: goal-driven, backward chaining

Dynamic Programming
Partial optimization results are reused in later optimization decisions
The set of partial results is currently reinitialized for each query
Redundant optimization is prevented by capturing logical expressions and plans in a hash table

FindBestPlan
Takes a logical expression, a physical property vector, and a cost limit as input
First looks in the hash table
  – For a plan satisfying the physical property vector
  – Returns the plan and its cost if the cost is within the limit, otherwise failure
If the expression has not been optimized before, optimization begins

Optimizer Moves
Apply a transformation rule
Choose an algorithm that delivers the logical expression with the desired physical properties
Use an enforcer to permit additional algorithm choices

Search
The most promising move is pursued first
Search is currently exhaustive
  – In the future, a subset of moves will be selected, determined and ordered by another function provided by the optimizer implementor
The cost limit is used to improve the search
  – Branch-and-bound pruning
  – Passed down into the optimization of subexpressions

Transformation Rule
New expression is formed
Optimized with FindBestPlan
Hash table

Algorithm
Cost is calculated by the algorithm’s cost function
The applicability function determines the physical property vectors for the inputs
Costs and optimal plans for the inputs are found by calling FindBestPlan

Enforcer
Cost is estimated by a cost function provided by the optimizer implementor
The physical property vector is modified
The expression is optimized with FindBestPlan
Interesting facts are stored in the hash table
  – For possible future use
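The FindBestPlan procedure and the algorithm and enforcer moves described in the preceding slides can be sketched in Python as follows. This is a simplified illustration under stated assumptions, not the Volcano implementation: transformation rules are omitted, the property vector is a single sort column, and the catalog, row estimates, cost formulas, and operator names are all hypothetical. It does show the three ingredients named in the slides: a hash table of earlier results, a cost limit passed down to subexpressions, and branch-and-bound pruning.

```python
import math

CATALOG = {"R": 1000, "S": 100}  # hypothetical row counts


def rows(expr):
    """Crude, assumed cardinality estimate for ("GET", t) and ("JOIN", l, r, col)."""
    if expr[0] == "GET":
        return CATALOG[expr[1]]
    return max(rows(expr[1]), rows(expr[2]))


def moves(expr, required):
    """Yield (operator, local_cost, [(input_expr, input_required), ...])."""
    if expr[0] == "GET" and required is None:
        yield ("FILE_SCAN " + expr[1], float(rows(expr)), [])
    if expr[0] == "JOIN":
        _, left, right, col = expr
        work = float(rows(left) + rows(right))
        if required is None:                 # hash join output is unordered
            yield ("HASH_JOIN", 2.0 * work, [(left, None), (right, None)])
        if required in (None, col):          # merge join needs sorted inputs
            yield ("MERGE_JOIN", work, [(left, col), (right, col)])
    if required is not None:                 # enforcer: sort an unordered input
        n = rows(expr)
        yield ("SORT " + required, n * math.log2(n), [(expr, None)])


def find_best_plan(expr, required, limit, memo):
    """Return (plan, cost) with cost <= limit, or None on failure."""
    key = (expr, required)
    if key in memo:
        plan, cost = memo[key]               # for failures, cost is the limit that failed
        if plan is not None:
            return (plan, cost) if cost <= limit else None
        if limit <= cost:
            return None                      # already failed with an equal or larger limit
    best, bound = None, limit
    for op, local_cost, inputs in moves(expr, required):
        total, children = local_cost, []
        for sub_expr, sub_required in inputs:
            if total > bound:                # branch-and-bound pruning
                break
            sub = find_best_plan(sub_expr, sub_required, bound - total, memo)
            if sub is None:
                break
            children.append(sub[0])
            total += sub[1]
        else:                                # no break: all inputs were optimized
            if total <= bound:
                best, bound = (op, *children), total
    memo[key] = (best, bound if best is not None else limit)
    return (best, bound) if best is not None else None
```

A call such as `find_best_plan(("JOIN", ("GET", "R"), ("GET", "S"), "id"), "id", math.inf, {})` asks for a join result sorted on `id`. The sketch then weighs a merge join over sorted inputs against sorting the output of the best unordered plan, and the second alternative is optimized under the tighter limit left over from the first.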

Functionality and Extensibility
Distinction between logical expressions and physical expressions
Ability to specify physical properties, which drive the optimization
Search algorithm is driven top-down
Cost is more general
Allows implementation of other search strategies

Search Efficiency and Effectiveness
Much more efficient and effective than the earlier prototype (the EXODUS optimizer generator)