# CS4432: Database Systems II Query Operator & Algebraic Expressions 1.

## Presentation on theme: "CS4432: Database Systems II Query Operator & Algebraic Expressions 1."— Presentation transcript:

CS4432: Database Systems II Query Operator & Algebraic Expressions 1

Why SQL

- SQL is a very high-level language.
- It says “what to do” rather than “how to do it.”
- It avoids many of the data-manipulation details needed in procedural languages like C++ or Java.
- The database management system figures out the “best” way to execute a query. This is called “query optimization.”

Query Processing

SQL query: SELECT pNumber, count(*) AS CNT FROM Student WHERE sNumber > 1 GROUP BY pNumber;

[Figure: SQL Query → Query Plans]

Query Example

SELECT B, D FROM R, S WHERE R.A = “c” AND S.E = 2 AND R.C = S.C

How do we execute the query? One idea:

- Form the Cartesian product of all tables in the FROM clause.
- Select the tuples that match the WHERE clause.
- Project the columns that occur in the SELECT clause.

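R × S

| R.A | R.B | R.C | S.C | S.D | S.E |
|---|---|---|---|---|---|
| a | 1 | 10 | 10 | x | 2 |
| a | 1 | 10 | 20 | y | 2 |
| … | … | … | … | … | … |
| c | 2 | 10 | 10 | x | 2 |

Bingo! Got one: the last row shown satisfies the WHERE clause of the query.

SELECT B, D FROM R, S WHERE R.A = “c” AND S.E = 2 AND R.C = S.C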
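The product-select-project idea can be sketched in a few lines of Python. This is only an illustration: the contents of `R` and `S` are assumed (chosen so that their product contains the rows shown on the slide), and `naive_query` is a hypothetical helper name.

```python
from itertools import product

# Assumed sample data; the slide shows only a few rows of R x S.
R = [{"A": "a", "B": 1, "C": 10}, {"A": "c", "B": 2, "C": 10}]
S = [{"C": 10, "D": "x", "E": 2}, {"C": 20, "D": "y", "E": 2}]

def naive_query(R, S):
    result = []
    # Step 1: Cartesian product of all tables in the FROM clause.
    for r, s in product(R, S):
        # Step 2: keep only the pairs that satisfy the WHERE clause.
        if r["A"] == "c" and s["E"] == 2 and r["C"] == s["C"]:
            # Step 3: project the SELECT-clause columns B and D.
            result.append((r["B"], s["D"]))
    return result

print(naive_query(R, S))  # [(2, 'x')]
```

The loop visits every one of the |R| · |S| pairs even though only one pair qualifies, which is why this approach does not scale.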

But? Performance would be unacceptable! We need a better approach for reasoning about queries, their execution orders, and their respective costs.

Formal Relational Query Languages

Relational Algebra: more operational, and very useful for representing execution plans. Its operators work on relations.

Core Relational Algebra (Recap)

- Union, intersection, and difference: the usual set operations; both operands must have the same relation schema.
- Selection: picking certain rows.
- Projection: picking certain columns.
- Products and joins: compositions of relations.
- Renaming of relations and attributes.
- Grouping and aggregation: grouping matching tuples.
- Duplicate elimination: eliminates all identical copies except one.
- Sorting: orders tuples based on a given criterion.

Relational Algebra Expresses Query Plans

π_{B, D}(σ_{R.A = “c” ∧ S.E = 2 ∧ R.C = S.C}(R × S))

Recap on Relational Algebra & Operators

Algebra Behind the Query Language

Relational Algebra:

- A set of operators that operate on relations.
- Operator semantics are based on set or bag theory.
- Relational algebra forms the underlying basis (and the optimization rules) for SQL.

Example query: SELECT pNumber, count(*) AS CNT FROM Student WHERE sNumber > 1 GROUP BY pNumber;

Relational Algebra

Basic operators:

- Set operations (union: ∪, intersection: ∩, difference: –)
- Select: σ
- Project: π
- Cartesian product: ×
- Rename: ρ

More advanced operators include, e.g., grouping and joins.

The operators take one or two relations as inputs and produce a new relation as an output. One input → unary operator; two inputs → binary operator.

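Union over sets: ∪

Binary operator. Consider two relations R and S that are union-compatible (same schema).

R:

| A | B |
|---|---|
| 1 | 2 |
| 3 | 4 |

S:

| A | B |
|---|---|
| 1 | 2 |
| 3 | 4 |
| 5 | 6 |

R ∪ S:

| A | B |
|---|---|
| 1 | 2 |
| 3 | 4 |
| 5 | 6 |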

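Difference over sets: –

Binary operator. R – S contains the tuples that appear in R and not in S. R and S must be union-compatible. Note that R – S ≠ S – R in general.

Defined as: R – S = {t | t ∈ R and t ∉ S}

R:

| A | B |
|---|---|
| 1 | 2 |
| 3 | 4 |

S:

| A | B |
|---|---|
| 1 | 2 |
| 5 | 6 |

R – S:

| A | B |
|---|---|
| 3 | 4 |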

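Intersection over sets: ∩

Binary operator. Consider two relations R and S that are union-compatible.

R:

| A | B |
|---|---|
| 1 | 2 |
| 3 | 4 |

S:

| A | B |
|---|---|
| 1 | 2 |
| 3 | 4 |
| 5 | 6 |

R ∩ S:

| A | B |
|---|---|
| 1 | 2 |
| 3 | 4 |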
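Under set semantics, the three set operators map directly onto Python's built-in `set` type. The sketch below reuses the (A, B) tuples from the slides; the variable name `S_diff` is only an illustrative label for the S used on the difference slide.

```python
# Relations from the union and intersection slides, as sets of (A, B) tuples.
R = {(1, 2), (3, 4)}
S = {(1, 2), (3, 4), (5, 6)}
# The difference slide uses a different S.
S_diff = {(1, 2), (5, 6)}

print(sorted(R | S))       # union: [(1, 2), (3, 4), (5, 6)]
print(sorted(R & S))       # intersection: [(1, 2), (3, 4)]
print(sorted(R - S_diff))  # difference: [(3, 4)]
print(sorted(S_diff - R))  # difference is not symmetric: [(5, 6)]
```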

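Selection: σ

Unary operator. Select: σ_{c}(R), where c is a condition on R’s attributes. It selects the subset of tuples from R that satisfy the selection condition c.

R:

| A | B | C |
|---|---|---|
| 1 | 2 | 5 |
| 3 | 4 | 6 |
| 1 | 2 | 7 |

σ_{C ≥ 6}(R):

| A | B | C |
|---|---|---|
| 3 | 4 | 6 |
| 1 | 2 | 7 |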
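As a minimal sketch, selection is a filter over the tuples of a relation. Here a relation is modelled as a list of dictionaries, and `select` is a hypothetical helper name; the data is the R from the slide.

```python
# R from the selection slide, one dictionary per tuple.
R = [
    {"A": 1, "B": 2, "C": 5},
    {"A": 3, "B": 4, "C": 6},
    {"A": 1, "B": 2, "C": 7},
]

def select(relation, condition):
    """Return the tuples of `relation` for which `condition` holds."""
    return [t for t in relation if condition(t)]

result = select(R, lambda t: t["C"] >= 6)
print(result)  # the tuples (3, 4, 6) and (1, 2, 7)
```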

Selection: Example

σ_{(A=B) ∧ (D>5)}(R)

σ_{D > C}(R)

[Figure: relation R and the results of both selections]

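Project: π

Unary operator. π_{A1, A2, …, An}(R), with {A1, A2, …, An} ⊆ attributes of R, returns all tuples in R, but only the columns A1, A2, …, An. A1, A2, …, An are called the projection list.

R:

| A | B | C |
|---|---|---|
| 1 | 2 | 5 |
| 3 | 4 | 6 |
| 1 | 2 | 7 |
| 1 | 2 | 8 |

π_{A, C}(R):

| A | C |
|---|---|
| 1 | 5 |
| 3 | 6 |
| 1 | 7 |
| 1 | 8 |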
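Projection can be sketched as picking the listed attributes out of every tuple. The helper name `project` is hypothetical, and the sketch assumes bag semantics (every input tuple yields one output tuple, so duplicates are kept).

```python
# R from the projection slide.
R = [
    {"A": 1, "B": 2, "C": 5},
    {"A": 3, "B": 4, "C": 6},
    {"A": 1, "B": 2, "C": 7},
    {"A": 1, "B": 2, "C": 8},
]

def project(relation, attributes):
    """Keep only the listed attributes of each tuple (bag semantics)."""
    return [tuple(t[a] for a in attributes) for t in relation]

print(project(R, ["A", "C"]))  # [(1, 5), (3, 6), (1, 7), (1, 8)]
print(project(R, ["A", "B"]))  # [(1, 2), (3, 4), (1, 2), (1, 2)]
```

The second call shows why duplicate elimination is a separate operator: projecting onto A and B leaves three copies of (1, 2).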

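Extended Projection: π_{L}(R)

Example: π_{C, V ← A, X ← C*3+B}(R)

- V ← A renames column A to V.
- X ← C*3+B computes this expression and calls it X.

R:

| A | B | C |
|---|---|---|
| 1 | 2 | 5 |
| 3 | 4 | 6 |
| 1 | 2 | 7 |
| 1 | 2 | 8 |

Result:

| C | V | X |
|---|---|---|
| 5 | 1 | 17 |
| 6 | 3 | 22 |
| 7 | 1 | 23 |
| 8 | 1 | 26 |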
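One way to sketch extended projection is to map each output column name to a function of the input tuple. The helper `extended_project` and its `outputs` parameter are illustrative assumptions, not part of the slides.

```python
# R from the extended-projection slide.
R = [
    {"A": 1, "B": 2, "C": 5},
    {"A": 3, "B": 4, "C": 6},
    {"A": 1, "B": 2, "C": 7},
    {"A": 1, "B": 2, "C": 8},
]

def extended_project(relation, outputs):
    """`outputs` maps each output column name to a function of the input tuple."""
    return [{name: f(t) for name, f in outputs.items()} for t in relation]

result = extended_project(R, {
    "C": lambda t: t["C"],               # plain column
    "V": lambda t: t["A"],               # V <- A (rename)
    "X": lambda t: t["C"] * 3 + t["B"],  # X <- C*3+B (computed column)
})
print([(t["C"], t["V"], t["X"]) for t in result])
# [(5, 1, 17), (6, 3, 22), (7, 1, 23), (8, 1, 26)]
```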

Cross Product (Cartesian Product): ×

Binary operator. Each tuple in R is joined with each tuple in S:

R × S = {t q | t ∈ R and q ∈ S}

[Figure: tables R, S, and R × S]

Natural Join: R ⋈ S

Binary operator. The join applies an implicit equality condition on the common columns. In the slide's example the implicit condition is (R.B = S.B and R.D = S.D).

[Figure: tables R, S, and R ⋈ S]
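Since the slide's tables did not survive, the sketch below uses made-up relations whose common columns are B and D, matching the implicit condition stated on the slide. `natural_join` is a hypothetical helper name.

```python
def natural_join(R, S):
    """Join tuples that agree on every attribute name they share."""
    result = []
    for r in R:
        for s in S:
            common = r.keys() & s.keys()
            if all(r[a] == s[a] for a in common):
                # Common columns appear once in the output tuple.
                result.append({**r, **s})
    return result

# Assumed sample data: the common columns are B and D.
R = [{"A": 1, "B": 2, "D": 3}, {"A": 4, "B": 5, "D": 6}]
S = [{"B": 2, "D": 3, "E": 7}, {"B": 5, "D": 9, "E": 8}]

joined = natural_join(R, S)
print(joined)  # one tuple: A=1, B=2, D=3, E=7
```

The second tuple of R matches the second tuple of S on B but not on D, so it does not appear in the result.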

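Theta Join: R ⋈_{C} S

Binary operator. A join based on an arbitrary condition C. It is defined as: R ⋈_{C} S = σ_{C}(R × S)

R:

| A | B |
|---|---|
| 1 | 2 |
| 3 | 2 |

S:

| D | C |
|---|---|
| 2 | 3 |
| 4 | 5 |
| 4 | 5 |

R ⋈_{R.A >= S.C} S:

| A | B | D | C |
|---|---|---|---|
| 3 | 2 | 2 | 3 |

Recommendation: always use the theta join (more explicit and clearer).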
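The definition R ⋈_{C} S = σ_{C}(R × S) translates almost literally into code. The sketch below uses the slide's data; the helper names `cross` and `theta_join` are hypothetical, and it assumes the two relations have disjoint attribute names.

```python
from itertools import product

# R and S from the theta-join slide (S contains a duplicate row).
R = [{"A": 1, "B": 2}, {"A": 3, "B": 2}]
S = [{"D": 2, "C": 3}, {"D": 4, "C": 5}, {"D": 4, "C": 5}]

def cross(R, S):
    """Cartesian product; assumes R and S share no attribute names."""
    return [{**r, **s} for r, s in product(R, S)]

def theta_join(R, S, condition):
    """Selection with `condition` applied on top of the cross product."""
    return [t for t in cross(R, S) if condition(t)]

print(len(cross(R, S)))  # 6
print(theta_join(R, S, lambda t: t["A"] >= t["C"]))  # one tuple: A=3, B=2, D=2, C=3
```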

Duplicate Elimination: δ(R)

Unary operator. Deletes all duplicate records, converting a bag (allows duplicates) to a set (does not allow duplicates).

R:

| A | B |
|---|---|
| 1 | 2 |
| 3 | 4 |
| 1 | 2 |
| 1 | 2 |

δ(R):

| A | B |
|---|---|
| 1 | 2 |
| 3 | 4 |
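A minimal sketch of duplicate elimination keeps the first copy of each tuple and drops the rest. The helper name `distinct` is hypothetical; preserving the input order is a choice made here for readability, not something the operator requires.

```python
def distinct(relation):
    """Convert a bag (list of tuples) into a duplicate-free list."""
    seen = set()
    result = []
    for t in relation:
        if t not in seen:
            seen.add(t)
            result.append(t)
    return result

# R from the duplicate-elimination slide.
R = [(1, 2), (3, 4), (1, 2), (1, 2)]
print(distinct(R))  # [(1, 2), (3, 4)]
```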

Grouping & Aggregation operator: γ

Unary operator. The grouping and aggregate operation in relational algebra:

γ_{g1, g2, …, gm, F1(A1), F2(A2), …, Fn(An)}(R)

- g1, g2, …, gm: group by these columns (the list can be empty).
- F1(A1), F2(A2), …, Fn(An): aggregation functions applied over each group:
  - avg: average value
  - min: minimum value
  - max: maximum value
  - sum: sum of values
  - count: number of values

Grouping & Aggregation Operator: Example

γ_{sum(c)}(R)

γ_{branch_name, sum(balance)}(S)

[Figure: relations R and S and the results of both expressions]
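Grouping with a single aggregate can be sketched as follows. The rows of `S` are made up (the slide's table is not available), and `group_aggregate` is a hypothetical helper that supports only one aggregate function per call.

```python
from collections import defaultdict

def group_aggregate(relation, group_by, agg_attr, agg_func):
    """Group tuples on `group_by`, then apply `agg_func` to `agg_attr` per group."""
    groups = defaultdict(list)
    for t in relation:
        key = tuple(t[g] for g in group_by)
        groups[key].append(t[agg_attr])
    return {key: agg_func(values) for key, values in groups.items()}

# Assumed sample data for S(branch_name, balance).
S = [
    {"branch_name": "Downtown", "balance": 400},
    {"branch_name": "Downtown", "balance": 500},
    {"branch_name": "Uptown", "balance": 500},
]

# Grouped sum, one result per branch.
print(group_aggregate(S, ["branch_name"], "balance", sum))
# {('Downtown',): 900, ('Uptown',): 500}

# Empty grouping list: the whole relation forms one group.
print(group_aggregate(S, [], "balance", sum))  # {(): 1400}
```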

Assignment Operator: ←

Write a query as a sequence of lines consisting of:

- a series of assignments, and
- a result expression containing the final answer.

A variable may be used multiple times in subsequent expressions.

Example:

R1 ← (σ_{(A=B) ∧ (D>5)}(R – S)) ∩ W

R2 ← R1 ⋈_{R.A = T.C} T

Result ← R1 ∪ R2

Banking Example

- branch (branch_name, branch_city, assets)
- customer (customer_name, customer_street, customer_city)
- account (account_number, branch_name, balance)
- loan (loan_number, branch_name, amount)
- depositor (customer_name, account_number)
- borrower (customer_name, loan_number)

Example Queries

Find customer names having an account balance below 100 or above 10,000:

π_{customer_name}(depositor ⋈ π_{account_number}(σ_{balance < 100 OR balance > 10,000}(account)))
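The same query can be traced step by step on a small instance. All rows below are invented for illustration; only the attribute names come from the banking schema.

```python
# Assumed sample instances of account and depositor.
account = [
    {"account_number": "A-1", "branch_name": "Downtown", "balance": 50},
    {"account_number": "A-2", "branch_name": "Uptown", "balance": 5000},
    {"account_number": "A-3", "branch_name": "Uptown", "balance": 20000},
]
depositor = [
    {"customer_name": "Ann", "account_number": "A-1"},
    {"customer_name": "Bob", "account_number": "A-2"},
    {"customer_name": "Cy", "account_number": "A-3"},
]

# Selection: balance below 100 or above 10,000.
selected = [a for a in account if a["balance"] < 100 or a["balance"] > 10000]
# Projection onto account_number.
numbers = {a["account_number"] for a in selected}
# Natural join with depositor on account_number, then project customer_name.
names = sorted({d["customer_name"] for d in depositor if d["account_number"] in numbers})
print(names)  # ['Ann', 'Cy']
```

Projecting `account` down to `account_number` before the join keeps the intermediate result small, which is the point of pushing the projection inside the join.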

Example Queries 30

Example Queries (Cont’d) 31

Example Queries

Find the names of customers who have neither accounts nor loans:

π_{customer_name}(customer) – (π_{customer_name}(borrower) ∪ π_{customer_name}(depositor))
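With each projection modelled as a Python set, the expression becomes a union followed by a difference. The customer names and rows are invented for illustration.

```python
# Assumed sample instances; only the attribute names come from the schema.
customer = [{"customer_name": n} for n in ["Ann", "Bob", "Cy", "Dee"]]
borrower = [{"customer_name": "Bob", "loan_number": "L-1"}]
depositor = [
    {"customer_name": "Ann", "account_number": "A-1"},
    {"customer_name": "Bob", "account_number": "A-2"},
]

def names(relation):
    """Projection onto customer_name under set semantics."""
    return {t["customer_name"] for t in relation}

neither = names(customer) - (names(borrower) | names(depositor))
print(sorted(neither))  # ['Cy', 'Dee']
```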

Example Queries

For branches that gave loans > 100,000 or hold accounts with balances > 50,000, report the branch name along with whether it is reported because of a loan or an account:

R1 ← π_{branch_name, ‘Loan’ As Type}(σ_{amount > 100,000}(loan))

R2 ← π_{branch_name, ‘Account’ As Type}(σ_{balance > 50,000}(account))

Result ← R1 ∪ R2

Example Queries

Find customer names having loans with sum > 20,000:

π_{customer_name}(σ_{sum > 20,000}(γ_{customer_name, sum ← sum(amount)}(loan ⋈ borrower)))

Example Queries

Find the branch name with the largest number of accounts:

R1 ← γ_{branch_name, countAccounts ← count(account_number)}(account)

R2 ← γ_{Max ← max(countAccounts)}(R1)

Result ← π_{branch_name}(R1 ⋈_{countAccounts = Max} R2)
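The three assignments can be mirrored one-to-one in Python. The account rows are invented, and `Counter` counts rows per branch, which matches count(account_number) as long as no account number is missing.

```python
from collections import Counter

# Assumed sample instance of account.
account = [
    {"account_number": "A-1", "branch_name": "Downtown", "balance": 50},
    {"account_number": "A-2", "branch_name": "Uptown", "balance": 5000},
    {"account_number": "A-3", "branch_name": "Uptown", "balance": 20000},
]

# R1: number of accounts per branch.
R1 = Counter(a["branch_name"] for a in account)
# R2: the maximum of those counts.
R2 = max(R1.values())
# Result: branches whose count equals the maximum (ties return several names).
result = sorted(b for b, n in R1.items() if n == R2)
print(result)  # ['Uptown']
```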

Summary of Relational-Algebra Operators

- Set operators: union, intersection, difference
- Selection, projection, and extended projection
- Joins: natural, theta, outer join
- Rename and assignment
- Duplicate elimination
- Grouping and aggregation
