Presentation is loading. Please wait.

Presentation is loading. Please wait.

Relational Algebra Chapter 4 - part I. 2 Relational Query Languages  Query languages: Allow manipulation and retrieval of data from a database.  Relational.

Similar presentations


Presentation on theme: "Relational Algebra Chapter 4 - part I. 2 Relational Query Languages  Query languages: Allow manipulation and retrieval of data from a database.  Relational."— Presentation transcript:

1 Relational Algebra Chapter 4 - part I

2 2 Relational Query Languages  Query languages: Allow manipulation and retrieval of data from a database.  Relational model supports simple powerful QLs:  Strong formal foundation based on logic.  Allows for much optimization.  Query Languages != Programming languages!  QLs not expected to be “Turing complete”.  QLs not intended to be used for complex calculations.  QLs support easy, efficient access to large data sets.

3 3 Formal Relational Query Languages  Two mathematical Query Languages form the basis for “real” languages (e.g. SQL) and for implementation:  Relational Algebra :  More operational, very useful for representing execution plans.  Relational Calculus :  Lets users describe what they want, rather than how to compute it. (Non-operational, rather declarative.)

4 4 Preliminaries  A query is applied to relation instances.  The result of a query is also a relation instance.  Schemas of input relations for a query are fixed ; (but query will run regardless of instance!)  The schema for result of given query is also fixed! Determined by definition of query language.

5 5 Preliminaries  Positional vs. named-field notation:  Positional field notation e.g., S.1  Named field notation e.g., S.sid  Pros/Cons:  Positional notation easier for formal definitions, named-field notation more readable.  Both used in SQL  Assume that names of fields in query results are `inherited’ from names of fields in query input relations.

6 6 Example Instances R1 S1 S2 Sailors Reserves

7 7  Basic operations:  Selection ( ) Selects a subset of rows from relation.  Projection ( ) Deletes unwanted columns from relation.  Cartesian-product ( ) Allows us to combine two relations.  Set-difference ( ) Tuples in reln. 1, but not in reln. 2.  Union ( ) Tuples in reln. 1 and in reln. 2.  Additional operations:  Intersection, join, division, renaming: Not essential, but (very!) useful.  Since each operation returns a relation, operations can be composed ! (Algebra is “closed”.) Relational Algebra

8 8 Selection S2 Sailors

9 9 Selection   condition (R)  Selects rows that satisfy selection condition. attribute op constant attribute op attribute Op is {, =, =, =}  No duplicates in result!  Schema of result identical to schema of (only) input relation.

10 10 Selection  Result relation can be input for another relational algebra operation! ( Operator composition. )

11 11 Projection S2 Sailors

12 12 Projection   projectlist (R)  Deletes attributes that are not in projection list.  Schema of result = contains fields in projection list  Projection operator has to eliminate duplicates ! (Why??)  Note: real systems typically don’t do duplicate elimination unless the user explicitly asks for it. (Why not?)

13 13 Union, Intersection, Set-Difference  All of these operations take two input relations, which must be union-compatible :  Same number of fields.  `Corresponding’ fields have same type.  What is the schema of result?

14 14 Example Instances : Union S1 S2

15 15 Union, Intersection, Set-Difference  All of these operations take two input relations, which must be union- compatible :  Same number of fields.  `Corresponding’ fields have the same type.  What is the schema of result?

16 16 Difference Operation S1 S2

17 17 Union, Intersection, Set-Difference  All of these operations take two input relations, which must be union- compatible :  Same number of fields.  `Corresponding’ fields have the same type.  What is the schema of result?

18 18 Intersection Operation S1 S2

19 19 Union, Intersection, Set-Difference  All of these operations take two input relations, which must be union- compatible :  Same number of fields.  `Corresponding’ fields have the same type.  What is the schema of result?

20 20 Cross-Product (Cartesian Product)  S1 R1 : Each row of S1 is paired with each row of R1. R1 S1 Sailors Reserves

21 21 Cross-Product (Cartesian Product)  S1 R1 : Result schema has one field per field of S1 and R1, with field names `inherited’ if possible.  Conflict : Both S1 and R1 have a field called sid. * Renaming operator :

22 22 Why we need a Join Operator ?  In many cases, Join = Cross-Product + Select + Project  However : Cross-product is too large to materialize Apply Select and Project "On-the-fly"

23 23 Condition Join / Theta Join  Condition Join :  Result schema same as that of cross-product.  Fewer tuples than cross-product, more efficient.

24 24 EquiJoin  Equi-Join : A special case of condition join where the condition c contains only equalities.  Result schema similar to cross-product, but only one copy of fields for which equality is specified.  An extra project: PROJECT ( THETA-JOIN)

25 25 Natural Join  Natural Join : Equijoin on all common fields.

26 26 Division  Not supported as a primitive operator, but useful for expressing queries like: Find sailors who have reserved all boats.

27 27 Division  Let A have 2 fields x and y ; B have only field y :  A/B =  A/B contains all x tuples (sailors) such that for every y tuple (boat) in B, there is an xy tuple in A.  If set of y values (boats) associated with an x value (sailor) in A contains all y values in B, then x value is in A/B.  A/B is the largest relation instance Q such that Q  B  A.

28 28 Division Example e.g., A = all parts supplied by suppliers, B= relation parts A/B = suppliers who supply all parts listed in B

29 29 Examples of Division A/B A B1 B2 B3 A/B1A/B2A/B3

30 30 Expressing A/B Using Basic Operators  Idea : For A/B, compute all x values that are not `disqualified’ by some y value in B.  x value is disqualified if by attaching y value from B,  we obtain an xy tuple that is not in A. Disqualified x values: A/B:

31 31 A few example queries

32 32 Find names of sailors who’ve reserved boat #103 Solution 1 Solution 2 Solution 3

33 33 Find names of sailors who’ve reserved a red boat R1 S1 Sailors Reserves bidbnamecolor 101Interlakeblue 103Clipperred Boats B1

34 34 Find names of sailors who’ve reserved a red boat  Information about boat color only available in Boats; so need an extra join: A more efficient solution: * A query optimizer can find this given the first solution!

35 35 Find sailors who’ve reserved a red or a green boat  Can identify all red or green boats, then find sailors who’ve reserved one of these boats: Can also define Tempboats using union! (How?) What happens if is replaced by in this query?

36 36 Find sailors who’ve reserved a red and a green boat  Previous approach won’t work!  Must identify sailors who reserved red boats, sailors who’ve reserved green boats, then find their intersection

37 37 Find the names of sailors who’ve reserved all boats  Uses division; schemas of the input relations to / must be carefully chosen: v To find sailors who’ve reserved all ‘Interlake’ boats:.....

38 Relational Algebra : Some More Operators Beyond Chapter 4

39 39 Generalized Projection   F1, F2, … (R)  sname, (rating*2 as myrating) (Sailors)

40 40 Aggregation operators  MIN, MAX, COUNT, SUM, AVG  AGG B (R) considers only non-null values of R. R AB 12 34 1null 13 MIN B (R) 2 MAX B (R) 4 COUNT B (R) 3 SUM B (R) 9 AVG B (R) 3 COUNT * (R) 4

41 41 Aggregation Operators  MIN, MAX, SUM, AVG must be on any 1 attribute. COUNT can be on any 1 attribute or COUNT * (R)  An aggregation operator returns a bag, not a single value ! But SQL allows treatment as a single value. AB 34 σ B=MAX B (R) (R)

42 42 Grouping Operator:  GL, AL (R)   GL, AL (R) groups all attributes in GL, and performs the aggregation specified in AL. titleyearstarName SW177HF Matrix99KR 6D&7N93HF SW279HF Speed94KR StarsIn  starName, MIN (year)→year, COUNT(title) →num (StarsIn) starNameyearnum HF773 KR942

43 43 Summary  The relational model has rigorously defined query languages that are simple and powerful.  Relational algebra is operational; useful as internal representation for query evaluation plans.  Several ways of expressing a given query; a query optimizer should choose most efficient version.


Download ppt "Relational Algebra Chapter 4 - part I. 2 Relational Query Languages  Query languages: Allow manipulation and retrieval of data from a database.  Relational."

Similar presentations


Ads by Google