

Relational Operator Evaluation

Overview

[Figure: DBMS architecture. Users: application programmer (e.g., business analyst, data architect); sophisticated application programmer (e.g., SAP admin); DBA, tuner. Components: application, query processor, indexes, storage subsystem, concurrency control, recovery, operating system, hardware (processors, disks, memory).]

Overview of query processing
[Figure: query processing components. High-level query, parser, parsed query, query optimizer, plan generator, plan cost estimator, cost model, statistics, catalog manager, QEP (query evaluation plan), query evaluator, database, query result.]

Relational Operations
We will consider how to implement:
– Selection: selects a subset of rows from a relation.
– Projection: deletes unwanted columns from a relation.
– Join: allows us to combine two relations.
– Set-difference: tuples in relation 1 but not in relation 2.
– Union: tuples in relation 1 or in relation 2.
– Aggregation (SUM, MIN, etc.) and GROUP BY.
Since each operation returns a relation, operations can be composed! After we cover the operations, we will discuss how to optimize queries formed by composing them.

Schema for Examples
Sailors (sid: integer, sname: string, rating: integer, age: real)
Reserves (sid: integer, bid: integer, day: dates, rname: string)
Similar to old schema; rname added for variations.
– Reserves: each tuple is 40 bytes long, 100 tuples per page, 1000 pages.
– Sailors: each tuple is 50 bytes long, 80 tuples per page, 500 pages.

Overview of operation evaluation
Common techniques for implementing algorithms for relational operators:
– Indexing: can use WHERE conditions to retrieve a small set of tuples (selections, joins).
– Iteration: sometimes it is faster to scan all tuples even if there is an index. (And sometimes we can scan the data entries in an index instead of the table itself.)
– Partitioning: by using sorting or hashing, we can partition the input tuples and replace an expensive operation by similar operations on smaller inputs.
* Watch for these techniques as we discuss query evaluation!

Access Paths
An access path is a method of retrieving tuples:
– File scan, or an index that matches a selection (in the query).
A tree index matches (a conjunction of) terms that involve only attributes in a prefix of the search key.
– E.g., a tree index on <a, b, c> matches the selections a=5 AND b=3, and a=5 AND b>6, but not b=3.
A hash index matches (a conjunction of) terms that has a term attribute = value for every attribute in the search key of the index.
– E.g., a hash index on <a, b, c> matches a=5 AND b=3 AND c=5; but it does not match b=3 alone, since a and c then have no equality term.
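The two matching rules can be sketched as small predicates. This is only an illustration under stated assumptions: the search key <a, b, c>, the `(attribute, operator)` term encoding, and both function names are hypothetical, and the tree rule is read as "the attributes used form exactly a prefix of the search key".

```python
from typing import Sequence, Tuple

# A term is (attribute, operator), e.g. ("a", "=") or ("b", ">").
# This encoding is an assumption made for illustration.
Term = Tuple[str, str]

def tree_index_matches(search_key: Sequence[str], terms: Sequence[Term]) -> bool:
    """True if the attributes in the terms form a non-empty prefix of the search key."""
    attrs = {attr for attr, _ in terms}
    if not attrs:
        return False
    return attrs == set(search_key[:len(attrs)])

def hash_index_matches(search_key: Sequence[str], terms: Sequence[Term]) -> bool:
    """True if every search-key attribute has an equality term."""
    eq_attrs = {attr for attr, op in terms if op == "="}
    return set(search_key) <= eq_attrs

KEY = ["a", "b", "c"]  # assumed search key
print(tree_index_matches(KEY, [("a", "="), ("b", "=")]))              # a=5 AND b=3
print(tree_index_matches(KEY, [("b", "=")]))                          # b=3
print(hash_index_matches(KEY, [("a", "="), ("b", "="), ("c", "=")]))  # a=5 AND b=3 AND c=5
```

A real optimizer works term by term and can use an index that matches only part of a conjunction; the sketch checks the whole conjunction at once to keep the rule visible.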

A Note on Complex Selections
Selection conditions are first converted to conjunctive normal form (CNF). For example, the condition
(day<8/9/94 AND rname='Paul') OR bid=5 OR sid=3
becomes
(day<8/9/94 OR bid=5 OR sid=3) AND (rname='Paul' OR bid=5 OR sid=3)
We only discuss the case with no ORs; see the text if you are curious about the general case.
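As a quick sanity check, the two forms of the example condition can be compared over every truth assignment. In this sketch the four Boolean parameters are stand-ins (an assumption for illustration) for the terms day<8/9/94, rname='Paul', bid=5 and sid=3.

```python
from itertools import product

def original(d: bool, r: bool, b: bool, s: bool) -> bool:
    # (day<8/9/94 AND rname='Paul') OR bid=5 OR sid=3
    return (d and r) or b or s

def cnf(d: bool, r: bool, b: bool, s: bool) -> bool:
    # (day<8/9/94 OR bid=5 OR sid=3) AND (rname='Paul' OR bid=5 OR sid=3)
    return (d or b or s) and (r or b or s)

# Compare the two forms on all 16 truth assignments.
equivalent = all(original(*v) == cnf(*v) for v in product([False, True], repeat=4))
print(equivalent)  # True
```

The equivalence is just OR distributing over AND: (d AND r) OR X is the same as (d OR X) AND (r OR X).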

Selection
Simple selection:
– No index, unsorted data: file scan.
– No index, sorted data: binary search, O(log2 M) for a file of M pages.
– B+ tree index or hash index.
General selection condition.
Example:
SELECT * FROM Reserves R WHERE R.rname < 'C%'
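The gap between a file scan and a search over sorted data can be illustrated with a page-level binary search that counts page reads. This is a minimal sketch, assuming a file modeled as a list of pages, each a sorted list of integer keys; `find_first_page` and the toy data are hypothetical.

```python
from typing import List, Tuple

def find_first_page(pages: List[List[int]], key: int) -> Tuple[int, int]:
    """Binary search for the first page whose last key is >= key.

    Returns (page_number, page_reads); each probe counts as one page I/O.
    """
    lo, hi, reads = 0, len(pages), 0
    while lo < hi:
        mid = (lo + hi) // 2
        reads += 1  # inspect page `mid`
        if pages[mid][-1] < key:
            lo = mid + 1
        else:
            hi = mid
    return lo, reads

# Toy sorted file: 1000 pages of 100 keys each, mirroring the size of Reserves.
pages = [list(range(p * 100, (p + 1) * 100)) for p in range(1000)]
page_no, reads = find_first_page(pages, 54321)
print(page_no, reads, "page reads versus", len(pages), "for a full scan")
```

For a range condition, the pages that hold the remaining qualifying tuples still have to be read after the first one is located, so the logarithmic term only covers finding the starting point.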

Using an Index for Selections
Cost depends on the number of qualifying tuples and on clustering.
– Cost of finding qualifying data entries (typically small) plus cost of retrieving records (could be large without clustering).
– In the example, assuming a uniform distribution of names, about 10% of tuples qualify (100 pages, 10,000 tuples). With a clustered index, the cost is little more than 100 I/Os; if unclustered, up to 10,000 I/Os!
Important refinement for unclustered indexes:
– 1. Find the qualifying data entries.
– 2. Sort the rids of the data records to be retrieved.
– 3. Fetch rids in order. This ensures that each data page is looked at just once (though the number of such pages is likely to be higher than with clustering).
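The effect of sorting rids before fetching can be simulated with the Reserves figures (1000 pages, 100 tuples per page, 10% qualifying). This is a rough sketch with assumptions of its own: rids are modeled as `(page_no, slot_no)` pairs, qualifying tuples are picked at random, and the buffer holds a single page, so a page read is counted whenever the page number changes.

```python
import random
from typing import Iterable, Tuple

PAGES, TUPLES_PER_PAGE = 1000, 100
total_tuples = PAGES * TUPLES_PER_PAGE  # 100,000 tuples in Reserves
qualifying = total_tuples // 10         # about 10% qualify: 10,000 tuples

Rid = Tuple[int, int]  # (page_no, slot_no), a modeling assumption

random.seed(0)
all_rids = [(p, s) for p in range(PAGES) for s in range(TUPLES_PER_PAGE)]
rids = random.sample(all_rids, qualifying)  # order in which the index yields rids

def page_fetches(rid_order: Iterable[Rid]) -> int:
    """Count page reads assuming only one page is buffered at a time."""
    fetches, current = 0, None
    for page_no, _ in rid_order:
        if page_no != current:
            fetches += 1
            current = page_no
    return fetches

unsorted_cost = page_fetches(rids)        # close to one I/O per tuple
sorted_cost = page_fetches(sorted(rids))  # one I/O per distinct page
print(unsorted_cost, sorted_cost)
```

With randomly scattered matches, the sorted fetch touches each distinct page once, which is bounded by the 1000 pages of the file, while the unsorted fetch can approach one I/O per qualifying tuple.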

General Selections
First approach: find the most selective access path, retrieve tuples using it, and apply any remaining terms that don't match the index:
– Most selective access path: an index or file scan that we estimate will require the fewest page I/Os.
– Terms that match this index reduce the number of tuples retrieved; other terms are used to discard some retrieved tuples, but do not affect the number of tuples/pages fetched.
– Consider day<8/9/94 AND bid=5 AND sid=3. A B+ tree index on day can be used; then bid=5 and sid=3 must be checked for each retrieved tuple. Similarly, a hash index on <bid, sid> could be used; day<8/9/94 must then be checked.

Intersection of Rids
Second approach (if we have 2 or more matching indexes that use Alternatives (2) or (3) for data entries):
– Get sets of rids of data records using each matching index. Then intersect these sets of rids (we'll discuss intersection soon!).
– Retrieve the records and apply any remaining terms.
– Consider day<8/9/94 AND bid=5 AND sid=3. If we have a B+ tree index on day and an index on sid, both using Alternative (2), we can retrieve rids of records satisfying day<8/9/94 using the first, rids of records satisfying sid=3 using the second, intersect, retrieve records and check bid=5.
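The rid-intersection approach can be walked through on a toy table. Everything in this sketch is illustrative: the four records, the in-memory stand-ins for the two indexes, and the reading of 8/9/94 as August 9, 1994 are assumptions.

```python
from datetime import date

# Toy Reserves records keyed by rid = (page_no, slot_no); the data is made up.
records = {
    (0, 0): {"sid": 3, "bid": 5, "day": date(1994, 7, 1)},
    (0, 1): {"sid": 3, "bid": 7, "day": date(1994, 7, 2)},
    (1, 0): {"sid": 3, "bid": 5, "day": date(1994, 9, 1)},
    (1, 1): {"sid": 4, "bid": 5, "day": date(1994, 6, 1)},
}

# Stand-ins for two indexes whose data entries are <key, rid> pairs.
day_index = sorted((r["day"], rid) for rid, r in records.items())  # ordered, like a B+ tree
sid_index = {}                                                     # key -> list of rids
for rid, r in records.items():
    sid_index.setdefault(r["sid"], []).append(rid)

cutoff = date(1994, 8, 9)  # assumed reading of 8/9/94

# 1. Get a rid set from each matching index.
rids_day = {rid for day, rid in day_index if day < cutoff}
rids_sid = set(sid_index.get(3, []))

# 2. Intersect, then fetch in rid order so each page is visited once.
candidates = sorted(rids_day & rids_sid)

# 3. Retrieve the records and apply the remaining term bid=5.
result = [rid for rid in candidates if records[rid]["bid"] == 5]
print(result)  # [(0, 0)]
```

Only the rids that survive the intersection trigger a record fetch, which is why the approach helps when each index alone matches many records.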

Summary