

Query Processing Reading: CB, Chaps 5 & 23

Dept of Computing Science, University of Aberdeen

In this lecture you will learn
– the basic concepts of Query Processing
– how high-level SQL queries are decomposed, analysed and executed
– how to express basic SQL queries in Relational Algebra
– why Relational Algebra is useful in query processing
– the strategies query optimisers use to generate query execution plans

Query Processing Overview
Objective: provide the correct answer to a query (almost) as efficiently as possible
[Figure: client and server. Labels: SQL Query, Interpret Query, Execute Query, Metadata, Tables, Indexes, Results]

We Are Here!

Query Processing Operations
Query processing involves several operations:
– Lexical & syntactic analysis - transform SQL into an internal form
– Normalisation - collecting AND and OR predicates
– Semantic analysis - i.e. does the query make sense?
– Simplification - e.g. remove common or redundant sub-expressions
– Generating an execution plan - query optimisation
– Executing the plan and returning results to the client
To describe most of these, we need to use Relational Algebra

Introducing Relational Algebra
What is relational algebra (RA) and why is it useful?
– RA is a symbolic, formal way of describing relational operations
– RA says how, as well as what (order is important)
– Can use re-write rules to simplify and optimise complex queries...
Maths example:
– a + bx + cx² + dx³: 3 adds, 3 multiplies, 2 powers
– a + x(b + x(c + xd)): 3 adds, 3 multiplies
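The maths example can be checked by counting operations. The following is a minimal Python sketch; the function names and the coefficient-list representation are illustrative assumptions, not part of the lecture. It evaluates the same cubic both ways and reports how many adds, multiplies and powers each form needs.

```python
def eval_naive(coeffs, x):
    """Evaluate a + b*x + c*x**2 + ... term by term, counting operations."""
    total = coeffs[0]
    adds = mults = powers = 0
    for k, coeff in enumerate(coeffs[1:], start=1):
        if k == 1:
            term = coeff * x          # b*x needs no power
        else:
            term = coeff * x ** k     # c*x**2, d*x**3, ...
            powers += 1
        mults += 1
        total += term
        adds += 1
    return total, {"adds": adds, "mults": mults, "powers": powers}


def eval_nested(coeffs, x):
    """Evaluate the re-written form a + x*(b + x*(c + x*d))."""
    total = coeffs[-1]
    adds = mults = 0
    for coeff in reversed(coeffs[:-1]):
        total = coeff + x * total
        adds += 1
        mults += 1
    return total, {"adds": adds, "mults": mults, "powers": 0}


# Hypothetical coefficients a=1, b=2, c=3, d=4 evaluated at x=2.
print(eval_naive([1, 2, 3, 4], 2))
print(eval_nested([1, 2, 3, 4], 2))
```

Both forms give the same value, but the nested form avoids the power operations entirely. A query optimiser aims for the same kind of saving when it re-writes an RA expression.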

Basic Relational Algebra Operators
The basic RA operators are:
– Selection σ; Projection π; Rename ρ
SQL: SELECT Lname FROM Staff
RA: π_{Lname}(Staff)
SQL: SELECT Lname AS Surname FROM Staff
RA: ρ_{Surname}(Lname) (π_{Lname}(Staff))
SQL: SELECT Lname AS Surname FROM Staff WHERE Salary>1000
RA: ρ_{Surname}(Lname) (π_{Lname}(σ_{Salary>1000}(Staff)))
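To make the three operators concrete, here is a small Python sketch that treats a relation as a list of dictionaries. The sample Staff rows and the helper names `select`, `project` and `rename` are assumptions made for illustration. The final expression mirrors the last RA example above, applied inside-out: selection first, then projection, then rename.

```python
# Hypothetical Staff rows; only the two columns used in the examples.
staff = [
    {"Lname": "White", "Salary": 30000},
    {"Lname": "Beech", "Salary": 900},
    {"Lname": "Ford", "Salary": 18000},
]


def select(predicate, relation):
    """Selection: keep the rows for which the predicate holds."""
    return [row for row in relation if predicate(row)]


def project(columns, relation):
    """Projection: keep the named columns and drop duplicate rows."""
    seen, result = set(), []
    for row in relation:
        key = tuple(row[c] for c in columns)
        if key not in seen:
            seen.add(key)
            result.append(dict(zip(columns, key)))
    return result


def rename(mapping, relation):
    """Rename: change column names according to {old: new}."""
    return [{mapping.get(c, c): v for c, v in row.items()} for row in relation]


# SELECT Lname AS Surname FROM Staff WHERE Salary > 1000
surnames = rename({"Lname": "Surname"},
                  project(["Lname"],
                          select(lambda r: r["Salary"] > 1000, staff)))
print(surnames)
```

Note that projection in RA removes duplicates because a relation is a set, whereas a plain SQL SELECT keeps them unless DISTINCT is used.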

Further Relational Algebra Notation
L ⋈ R - natural join
L ⋈_{P} R - theta join with predicate P = L.a Θ R.b
L × R - Cartesian product
L ∪ R - union
L ∩ R - intersection
P ∧ Q - conjunction (AND)
P ∨ Q - disjunction (OR)
~P - negation (NOT)
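The Cartesian product, theta join and natural join from the list above can be sketched in a few lines of Python. This is an illustrative sketch only: relations are lists of dictionaries, and the sample rows and the `BBrNo` column name are assumptions. The theta join is deliberately written as a selection over the product, which is how it is defined, not how a real DBMS would execute it.

```python
def product(left, right):
    """Cartesian product: every pairing of a left row with a right row."""
    rows = []
    for l in left:
        for r in right:
            if l.keys() & r.keys():
                raise ValueError("column names must be distinct; rename first")
            rows.append({**l, **r})
    return rows


def theta_join(left, right, predicate):
    """Theta join: the rows of the product that satisfy the predicate."""
    return [row for row in product(left, right) if predicate(row)]


def natural_join(left, right):
    """Natural join: pair rows that agree on all shared column names."""
    rows = []
    for l in left:
        for r in right:
            shared = l.keys() & r.keys()
            if all(l[c] == r[c] for c in shared):
                rows.append({**l, **r})
    return rows


# Hypothetical sample data.
staff = [{"Name": "Ann", "BrNo": "B1"}, {"Name": "Bob", "BrNo": "B2"}]
branch = [{"BrNo": "B1", "City": "London"}, {"BrNo": "B3", "City": "Glasgow"}]
# Same branches with the join column renamed so the product has no name clash.
branch_renamed = [{"BBrNo": b["BrNo"], "City": b["City"]} for b in branch]

print(len(product(staff, branch_renamed)))
print(theta_join(staff, branch_renamed, lambda row: row["BrNo"] == row["BBrNo"]))
print(natural_join(staff, branch))
```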

Query Processing Example
Example: find all managers who work at a London branch:
SELECT * FROM Staff S, Branch B WHERE S.BrNo = B.BrNo AND S.Posn = 'Boss' AND B.City = 'London';
There are at least 3 ways of writing this in RA notation:
– σ_{S.Posn='Boss' ∧ B.City='London' ∧ S.BrNo=B.BrNo}(S × B)
– σ_{S.Posn='Boss' ∧ B.City='London'}(S ⋈ B)
– (σ_{S.Posn='Boss'}(S)) ⋈ (σ_{B.City='London'}(B))
One of these will be the most efficient - but which??
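One rough way to compare the three forms is to look at the largest intermediate relation each one builds before the final answer. The sketch below assumes a tiny made-up instance of Staff and Branch; the rows, the `StaffNo` column and all function names are illustrative and not taken from the lecture. The join helper filters a product for simplicity, so only the sizes of the results matter here, not how the join itself is computed.

```python
# Hypothetical data: 4 staff, 3 branches, columns qualified by table alias.
staff = [
    {"S.StaffNo": "S1", "S.Posn": "Boss", "S.BrNo": "B1"},
    {"S.StaffNo": "S2", "S.Posn": "Clerk", "S.BrNo": "B1"},
    {"S.StaffNo": "S3", "S.Posn": "Boss", "S.BrNo": "B2"},
    {"S.StaffNo": "S4", "S.Posn": "Clerk", "S.BrNo": "B3"},
]
branch = [
    {"B.BrNo": "B1", "B.City": "London"},
    {"B.BrNo": "B2", "B.City": "Glasgow"},
    {"B.BrNo": "B3", "B.City": "London"},
]


def is_boss(row):
    return row["S.Posn"] == "Boss"


def in_london(row):
    return row["B.City"] == "London"


def same_branch(row):
    return row["S.BrNo"] == row["B.BrNo"]


def product(left, right):
    return [{**l, **r} for l in left for r in right]


def join(left, right):
    return [row for row in product(left, right) if same_branch(row)]


def plan_1():
    """Select everything over the full Cartesian product."""
    prod = product(staff, branch)
    result = [r for r in prod if is_boss(r) and in_london(r) and same_branch(r)]
    return result, len(prod)


def plan_2():
    """Join first, then apply the two remaining selections."""
    joined = join(staff, branch)
    result = [r for r in joined if is_boss(r) and in_london(r)]
    return result, len(joined)


def plan_3():
    """Select on each table first, then join the reduced tables."""
    bosses = [r for r in staff if is_boss(r)]
    london = [r for r in branch if in_london(r)]
    return join(bosses, london), max(len(bosses), len(london))


for plan in (plan_1, plan_2, plan_3):
    result, largest_intermediate = plan()
    print(plan.__name__, len(result), largest_intermediate)
```

On this sample all three plans return the same single row, while the largest intermediate relation shrinks from 12 rows to 4 to 2. The gap grows quickly with table size.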

Lexical & Syntactical Analysis & Query Trees
Lexical & syntactical analysis involves:
– identifying keywords & literals
– identifying table names & aliases
– mapping aliases to table names
– identifying column names
– checking columns exist in tables
The output of this phase is a relational algebra tree (RAT)
[Figure: relational algebra tree. Leaves S and B feed a Cartesian product (×); a selection σ_{A ∧ B ∧ C} is applied above it to give Result]

Semantic Analysis
Does the query make sense?
– Is the query legal SQL?
– Is the RAT connected? - if not, query is incomplete!
Can the query be simplified? - for example:
– σ_{A ∧ A}(R) = σ_{A}(R) (quite often with views)
– σ_{A ∨ A}(R) = σ_{A}(R)
– σ_{A ∧ ~A}(R) = Empty set (no point executing)
– σ_{A ∨ ~A}(R) = R (tautology: always true)
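The four simplifications rest on ordinary propositional identities, which can be confirmed with a brute-force truth table. The following Python sketch is illustrative; the helper name `equivalent` is an assumption. It checks each predicate pair on every assignment of True/False.

```python
from itertools import product


def equivalent(f, g, n_vars=1):
    """True if predicates f and g agree on every truth assignment."""
    return all(f(*values) == g(*values)
               for values in product([False, True], repeat=n_vars))


checks = {
    "A AND A      = A":     equivalent(lambda a: a and a, lambda a: a),
    "A OR A       = A":     equivalent(lambda a: a or a, lambda a: a),
    "A AND NOT A  = FALSE": equivalent(lambda a: a and not a, lambda a: False),
    "A OR NOT A   = TRUE":  equivalent(lambda a: a or not a, lambda a: True),
}

for rule, holds in checks.items():
    print(rule, holds)
```

A selection whose predicate reduces to FALSE returns the empty set, and one that reduces to TRUE returns R unchanged, so neither needs to be executed.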

Normalisation & Normal Forms
Normalisation re-writes the WHERE predicates as either:
– disjunctive normal form: σ_{(A ∧ B) ∨ C} = σ_{D ∨ C} (where D = A ∧ B)
– conjunctive normal form: σ_{(A ∧ B) ∨ C} = σ_{(A ∨ C) ∧ (B ∨ C)} = σ_{D ∧ E} (where D = A ∨ C and E = B ∨ C)
Why is this useful? - sometimes a query might best be split into subqueries (remember set operations?):
– Disjunctions suggest union: σ_{A ∨ B}(R) = σ_{A}(R) ∪ σ_{B}(R)
– Conjunctions suggest intersection: σ_{A ∧ B}(R) = σ_{A}(R) ∩ σ_{B}(R)
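The link between disjunction and union, and between conjunction and intersection, can be tried out on a small relation. In this Python sketch a relation is a set of tuples; the rows and the predicates `a` and `b` are made up for illustration.

```python
# Hypothetical relation R with columns (Lname, Salary, City).
R = {
    ("White", 30000, "London"),
    ("Beech", 900, "London"),
    ("Ford", 18000, "Glasgow"),
    ("Howe", 800, "Aberdeen"),
}


def select(predicate, relation):
    """Selection over a set of tuples."""
    return {row for row in relation if predicate(row)}


def a(row):
    return row[1] > 1000          # predicate A: Salary > 1000


def b(row):
    return row[2] == "London"     # predicate B: City = 'London'


# A OR B as one selection versus the union of two selections.
union_ok = select(lambda r: a(r) or b(r), R) == (select(a, R) | select(b, R))

# A AND B as one selection versus the intersection of two selections.
intersection_ok = select(lambda r: a(r) and b(r), R) == (select(a, R) & select(b, R))

print(union_ok, intersection_ok)
```

Splitting a query this way can pay off when each sub-selection can use its own index, although that depends on the data and the DBMS.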

Some RA Equivalence Rules (Re-Write Rules)
There are many equivalence rules (see CB p ). Here are a few:
– σ_{A ∧ B}(R) = σ_{A}(σ_{B}(R)) (cascade rule)
– σ_{A}(σ_{B}(R)) = σ_{B}(σ_{A}(R)) (commutativity)
– π_{A}(π_{B}(R)) = π_{A}(R) (if A is a subset of B)
– σ_{P}(π_{A}(R)) = π_{A}(σ_{P}(R)) (if P uses only cols in A)
– σ_{P}(R × S) = R ⋈_{P} S (if P = R.a Θ S.b)
– σ_{P}(R ⋈ S) = σ_{P}(R) ⋈ S (if P uses only cols in R)
Usually, it's obvious which form is more efficient.

Generating Query Plans
Most RDBMSs generate candidate query plans by using RA re-write rules to generate alternative RATs and to move operations around each tree.
For complex queries, there may be a very large number of candidate plans...

Heuristic Query Optimisation Rules
To avoid considering all possible plans, many DBMSs use heuristic rules:
– keep together selections (σ) on the same table
– perform selections as early as possible
– re-write selection on a Cartesian product as a join
– perform small joins first
– keep together projections (π) on the same relation
– apply projections as early as possible
– if duplicates are to be eliminated, use a sort algorithm
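Two of these heuristics, performing selections early and re-writing a selection on a Cartesian product as a join, can be sketched as a single tree re-write. The Python below is an illustrative sketch, not how any particular DBMS represents plans: the node classes, the `(text, tables)` predicate format and the function names are all assumptions, and the re-write handles only one selection directly above one product.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rel:
    name: str


@dataclass(frozen=True)
class Select:
    preds: tuple    # each predicate is (text, frozenset of table names used)
    child: object


@dataclass(frozen=True)
class Product:
    left: object
    right: object


@dataclass(frozen=True)
class Join:
    preds: tuple
    left: object
    right: object


def tables(node):
    """Names of the base tables underneath a node."""
    if isinstance(node, Rel):
        return frozenset({node.name})
    if isinstance(node, Select):
        return tables(node.child)
    return tables(node.left) | tables(node.right)


def push_down(node):
    """Re-write a selection over a product: single-table predicates move
    down to their table, the remaining predicates become the join condition."""
    if not (isinstance(node, Select) and isinstance(node.child, Product)):
        return node
    left, right = node.child.left, node.child.right
    left_preds = tuple(p for p in node.preds if p[1] <= tables(left))
    right_preds = tuple(p for p in node.preds
                        if p not in left_preds and p[1] <= tables(right))
    join_preds = tuple(p for p in node.preds
                       if p not in left_preds and p not in right_preds)
    if left_preds:
        left = Select(left_preds, left)
    if right_preds:
        right = Select(right_preds, right)
    return Join(join_preds, left, right) if join_preds else Product(left, right)


# The London managers query as an initial tree: one selection over S x B.
same_branch = ("S.BrNo=B.BrNo", frozenset({"S", "B"}))
is_boss = ("S.Posn='Boss'", frozenset({"S"}))
in_london = ("B.City='London'", frozenset({"B"}))

initial = Select((same_branch, is_boss, in_london), Product(Rel("S"), Rel("B")))
print(push_down(initial))
```

The re-written tree selects on Staff and Branch separately and joins the two reduced inputs, which is the third RA form of the London managers query.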

Cost-Based Query Optimisation
Remember, accessing disc blocks is expensive!
Ideally, the query optimiser should take into account:
– the size (cardinality) of each table
– which tables have indexes
– the type of each index - clustered, non-clustered
– which predicates can be evaluated using an index
– how much memory the query will need - and for how long
– whether the query can be split over multiple CPUs
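As a rough illustration of why cardinality and index type matter, the sketch below estimates disc block reads for a single selection. The cost formulas are simplified textbook-style assumptions, not taken from the lecture: a full scan reads every block, a clustered index reads the index levels plus the blocks holding the matching rows, and a non-clustered index reads the index levels plus one block per matching row. All numbers and function names are hypothetical.

```python
import math


def full_scan_cost(n_rows, rows_per_block):
    """Block reads to scan the whole table."""
    return math.ceil(n_rows / rows_per_block)


def index_cost(matching_rows, rows_per_block, index_height, clustered):
    """Block reads to fetch the matching rows through an index."""
    if clustered:
        # Matching rows sit next to each other on disc.
        return index_height + math.ceil(matching_rows / rows_per_block)
    # Worst case: every matching row is in a different block.
    return index_height + matching_rows


def cheapest(n_rows, rows_per_block, matching_rows, index_height, clustered):
    """Pick the access path with the lowest estimated block reads."""
    options = {
        "full scan": full_scan_cost(n_rows, rows_per_block),
        "index": index_cost(matching_rows, rows_per_block,
                            index_height, clustered),
    }
    return min(options, key=options.get), options


# Hypothetical table: 10,000 rows, 100 rows per block, index of height 2.
print(cheapest(10000, 100, 50, 2, clustered=False))    # selective predicate
print(cheapest(10000, 100, 5000, 2, clustered=False))  # unselective predicate
print(cheapest(10000, 100, 5000, 2, clustered=True))   # same, clustered index
```

Under these assumptions a non-clustered index helps only when few rows match, while a clustered index stays cheaper than a scan even for an unselective predicate.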