Presentation is loading. Please wait.

Presentation is loading. Please wait.

1 Relational Algebra and Calculus Chapter 4. 2 Relational Query Languages  Query languages: Allow manipulation and retrieval of data from a database.

Similar presentations


Presentation on theme: "1 Relational Algebra and Calculus Chapter 4. 2 Relational Query Languages  Query languages: Allow manipulation and retrieval of data from a database."— Presentation transcript:

1 1 Relational Algebra and Calculus Chapter 4

2 2 Relational Query Languages  Query languages: Allow manipulation and retrieval of data from a database  Relational model supports simple, powerful QLs:  Strong formal foundation based on logic  Allows for much optimization  Query Languages != programming languages!  QLs not expected to be “Turing complete”  QLs not intended to be used for complex calculations  QLs support easy, efficient access to large data sets

3 3 Formal Relational Query Languages  Two mathematical Query Languages form the basis for “real” languages (e.g. SQL), and for implementation:  Relational Algebra : More operational, very useful for representing execution plans  Relational Calculus : Lets users describe what they want, rather than how to compute it (Non- operational, declarative )

4 4 Example Instances R1 S1 S2  “Sailors” and “Reserves” relations for our examples

5 5 Relational Algebra  Basic operations:  Selection ( ) Selects a subset of rows from relation  Projection ( ) Deletes unwanted columns from relation  Cross-product ( ) Allows us to combine two relations  Set-difference ( ) Tuples in reln. 1, but not in reln. 2  Union ( ) Tuples in reln. 1 and in reln. 2  Additional operations:  Intersection, join, division, renaming: Not essential, but (very!) useful.  Since each operation returns a relation, operations can be composed ! (Algebra is “closed”)

6 6 Projection  Deletes attributes that are not in projection list  Schema of result contains exactly the fields in the projection list, with the same names that they had in the (only) input relation  Projection operator has to eliminate duplicates ! (Why??)  Note: real systems typically don’t do duplicate elimination unless the user explicitly asks for it (Why not?)

7 7 Selection  Selects rows that satisfy selection condition  No duplicates in result! (Why?)  Schema of result identical to schema of (only) input relation  Result relation can be the input for another relational algebra operation! ( Operator composition. )

8 8 Union, Intersection, Set-Difference  All of these operations take two input relations, which must be union-compatible :  Same number of fields  `Corresponding’ fields have the same type  What is the schema of result?

9 9 Cross-Product  Each row of S1 is paired with each row of R1  Result schema has one field per field of S1 and R1, with field names `inherited’ if possible  Conflict : Both S1 and R1 have a field called sid  Renaming operator :

10 10 Joins  Condition Join :  Result schema same as that of cross-product  Fewer tuples than cross-product, might be able to compute more efficiently

11 11 Joins  Equi-Join : A special case of condition join where the condition c contains only equalities  Result schema similar to cross-product, but only one copy of fields for which equality is specified  Natural Join : Equijoin on all common fields

12 12 Division  Not supported as a primitive operator, but useful for expressing queries like: Find sailors who have reserved all boats  Let A have 2 fields, x and y ; B have only field y :  A/B =  i.e., A/B contains all x tuples (sailors) such that for every y tuple (boat) in B, there is an xy tuple in A  Or : If the set of y values (boats) associated with an x value (sailor) in A contains all y values in B, the x value is in A/B

13 13 Examples of Division A/B A B1 B2 B3 A/B1A/B2A/B3

14 14 Find names of sailors who’ve reserved boat #103  Solution 1:  Solution 2 :  Solution 3 :

15 15 Find names of sailors who’ve reserved a red boat  Information about boat color only available in Boats; so need an extra join:  A more efficient solution: A query optimizer can find this, given the first solution!

16 16 Find names of sailors who’ve reserved a red or a green boat  Can identify all red or green boats, then find sailors who’ve reserved one of these boats:  Can also define Tempboats using union! (How?)  What happens if is replaced by in this query?

17 17 Find names of sailors who’ve reserved a red and a green boat  Previous approach won’t work! Must identify sailors who’ve reserved red boats, sailors who’ve reserved green boats, then find the intersection (note that sid is a key for Sailors):

18 18 Find names of sailors who’ve reserved all boats  Uses division; schemas of the input relations to / must be carefully chosen:  To find sailors who’ve reserved all ‘Interlake’ boats:.....

19 19 Relational Calculus  Comes in two flavors: Tuple relational calculus (TRC) and Domain relational calculus (DRC)  Calculus has variables, constants, comparison ops, logical connectives and quantifiers  TRC : Variables range over (i.e., get bound to) tuples.  DRC : Variables range over domain elements (= field values)  Both TRC and DRC are simple subsets of first-order logic  Expressions in the calculus are called formulas  An answer tuple is essentially an assignment of constants to variables that make the formula evaluate to true

20 20 Tuple Relational Calculus  Query has the form: { T | p ( T )}  p(T) denotes a formula in which tuple variable T appears  Answer is the set of all tuples T for which the formula p ( T ) evaluates to true.  Formula is recursively defined:  start with simple atomic formulas (get tuples from relations or make comparisons of values)  build bigger and better formulas using the logical connectives

21 21 TRC Formulas  An Atomic formula is one of the following: R  Rel R.a op S.b R.a op constant op is one of  A formula can be:  an atomic formula  where p and q are formulas  where variable R is a tuple variable

22 22 Free and Bound Variables  The use of quantifiers and in a formula is said to bind X in the formula  A variable that is not bound is free  Let us revisit the definition of a query:  { T | p ( T )} There is an important restriction — the variable T that appears to the left of `|’ must be the only free variable in the formula p(T) — in other words, all other tuple variables must be bound using a quantifier

23 23 Selection and Projection  Find names and ages of sailors with rating above 7 {S |S  Sailors  S.rating > 7} {S |  S1  Sailors(S1.rating > 7  S.sname = S1.sname  S.age = S1.age)} –Modify this query to answer: Find sailors who are older than 18 or have a rating under 9, and are called ‘Bob’ Useful convention: S is a tuple variable of 2 fields (i.e. {S} is a projection of sailors), since only 2 fields are ever mentioned and S is never used to range over any relations in the query Find all sailors with rating above 7

24 24 Find sailors rated > 7 who’ve reserved boat #103 Note the use of  to find a tuple in Reserves that `joins with’ the Sailors tuple under consideration {S | S  Sailors  S.rating > 7   R(R  Reserves  R.sid = S.sid  R.bid = 103)} Joins

25 25 Joins (continued) {S | S  Sailors  S.rating > 7   R(R  Reserves  R.sid = S.sid   B(B  Boats  B.bid = R.bid  B.color = ‘red’))} Find sailors rated > 7 who’ve reserved a red boat  Observe how the parentheses control the scope of each quantifier’s binding. (Similar to SQL!)

26 26 Division (makes more sense here???)  Find all sailors S such that for each tuple B in Boats there is a tuple in Reserves showing that sailor S has reserved it Find sailors who’ve reserved all boats (hint, use  ) {S | S  Sailors   B  Boats (  R  Reserves (S.sid = R.sid  B.bid = R.bid))}

27 27 Division – a trickier example… {S | S  Sailors   B  Boats ( B.color = ‘red’   R(R  Reserves  S.sid = R.sid  B.bid = R.bid))} Find sailors who’ve reserved all Red boats {S | S  Sailors   B  Boats ( B.color  ‘red’   R(R  Reserves  S.sid = R.sid  B.bid = R.bid))} Alternatively…

28 28 Unsafe Queries, Expressive Power   syntactically correct calculus queries that have an infinite number of answers! Unsafe queries.  e.g.,  Solution???? Don’t do that!  Expressive Power (Theorem due to Codd):  every query that can be expressed in relational algebra can be expressed as a safe query in DRC / TRC; the converse is also true.  Relational Completeness : Query language (e.g., SQL) can express every query that is expressible in relational algebra (actually, SQL is more powerful, as we will see…)

29 29 Summary  The relational model has rigorously defined query languages that are simple and powerful  Relational algebra is more operational; useful as internal representation for query evaluation plans  Several ways of expressing a given query; a query optimizer should choose the most efficient version

30 30 Summary  Relational calculus is non-operational, and users define queries in terms of what they want, not in terms of how to compute it. (Declarativeness)  Algebra and safe calculus have same expressive power, leading to the notion of relational completeness


Download ppt "1 Relational Algebra and Calculus Chapter 4. 2 Relational Query Languages  Query languages: Allow manipulation and retrieval of data from a database."

Similar presentations


Ads by Google