Relational Algebra Wrap-up and Relational Calculus Zachary G. Ives University of Pennsylvania CIS 550 – Database & Information Systems September 11, 2003.

1 Relational Algebra Wrap-up and Relational Calculus
Zachary G. Ives, University of Pennsylvania
CIS 550 – Database & Information Systems
September 11, 2003
Some slide content courtesy of Susan Davidson & Raghu Ramakrishnan

2 Relational Algebra
• Relational algebra operations operate on relations and produce relations ("closure"):
  $f: \text{Relation} \to \text{Relation}$
  $f: \text{Relation} \times \text{Relation} \to \text{Relation}$
• Six basic operations:
  ◦ Projection: $\pi_{\alpha}(R)$
  ◦ Selection: $\sigma_{\theta}(R)$
  ◦ Union: $R_1 \cup R_2$
  ◦ Difference: $R_1 - R_2$
  ◦ Product: $R_1 \times R_2$
  ◦ (Rename): $\rho_{\alpha \to \beta}(R)$
• And some other useful ones:
  ◦ Join: $R_1 \bowtie_{\theta} R_2$
  ◦ Semijoin: $R_1 \ltimes_{\theta} R_2$
  ◦ Intersection: $R_1 \cap R_2$
  ◦ Division: $R_1 \div R_2$

3 Example Data Instance

STUDENT
sid  name
1    Jill
2    Qun
3    Nitin
4    Marty

PROFESSOR
fid  name
1    Ives
2    Saul
8    Roth

Takes
sid  exp-grade  cid
1    A          550-0103
1    A          700-1003
3    A
3    C          500-0103
4    C

COURSE
cid       subj  sem
550-0103  DB    F03
700-1003  AI    S03
501-0103  Arch  F03

Teaches
fid  cid
1    550-0103
2    700-1003
8    501-0103

4 Natural Join and Intersection
Natural join: special case of join where $\theta$ is implicit; attributes with the same name must be equal:
  $\text{STUDENT} \bowtie \text{Takes} \equiv \text{STUDENT} \bowtie_{\text{STUDENT.sid} = \text{Takes.sid}} \text{Takes}$
Intersection: as with set operations, derivable from difference:
  $A \cap B \equiv (A \cup B) - (A - B) - (B - A) \equiv A - (A - B)$
[Venn diagram of A and B showing the regions $A - B$, $A \cap B$, and $B - A$]
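
The derivation of intersection from difference can be checked on a small instance. This is a minimal sketch using Python sets; the sets `A` and `B` are hypothetical and not taken from the slides.

```python
# Hypothetical example sets; any two sets would do.
A = {1, 2, 3, 4}
B = {3, 4, 5}

# A ∩ B using only difference.
via_difference = A - (A - B)

# A ∩ B using union and difference: remove both "one-sided" regions from A ∪ B.
via_union = ((A | B) - (A - B)) - (B - A)

print(via_difference == A & B, via_union == A & B)
```

Both expressions agree with Python's built-in intersection `A & B` on this instance.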

5 Division
• A somewhat messy operation that can be expressed in terms of the operations we have already defined
• Used to express queries such as "The fids of faculty who have taught all subjects"
• Paraphrased: "The fids of professors for which there does not exist a subject that they haven't taught"

6 Division Using Our Existing Operators
• All possible teaching assignments, Allpairs:
  $\pi_{fid,subj}(\text{PROFESSOR} \times \pi_{subj}(\text{COURSE}))$
• NotTaught, all (fid, subj) pairs for which professor fid has not taught subj:
  $\text{Allpairs} - \pi_{fid,subj}(\text{Teaches} \bowtie \text{COURSE})$
• Answer is all faculty not in NotTaught:
  $\pi_{fid}(\text{PROFESSOR}) - \pi_{fid}(\text{NotTaught})$
  $\equiv \pi_{fid}(\text{PROFESSOR}) - \pi_{fid}(\pi_{fid,subj}(\text{PROFESSOR} \times \pi_{subj}(\text{COURSE})) - \pi_{fid,subj}(\text{Teaches} \bowtie \text{COURSE}))$
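
The three steps can be traced on the example instance. The sketch below models relations as Python sets of tuples; the function name `taught_all_subjects` and the extra Teaches rows in the second call are assumptions made for illustration.

```python
# Rows copied from the example instance.
PROFESSOR = {(1, "Ives"), (2, "Saul"), (8, "Roth")}              # (fid, name)
COURSE = {("550-0103", "DB", "F03"), ("700-1003", "AI", "S03"),
          ("501-0103", "Arch", "F03")}                            # (cid, subj, sem)
TEACHES = {(1, "550-0103"), (2, "700-1003"), (8, "501-0103")}    # (fid, cid)

def taught_all_subjects(professor, course, teaches):
    subjects = {subj for (_, subj, _) in course}                  # π_subj(COURSE)
    all_pairs = {(fid, s) for (fid, _) in professor for s in subjects}
    taught = {(fid, subj) for (fid, cid) in teaches
              for (cid2, subj, _) in course if cid == cid2}       # π_fid,subj(Teaches ⋈ COURSE)
    not_taught = all_pairs - taught
    return {fid for (fid, _) in professor} - {fid for (fid, _) in not_taught}

from_slides = taught_all_subjects(PROFESSOR, COURSE, TEACHES)

# Hypothetical extra rows: suppose professor 1 also taught the other two courses.
extended = taught_all_subjects(PROFESSOR, COURSE,
                               TEACHES | {(1, "700-1003"), (1, "501-0103")})
print(from_slides, extended)
```

On the slide data every professor teaches exactly one subject, so the answer is empty; with the hypothetical extra rows only fid 1 remains.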

7 Division: $R_1 \div R_2$
• Requirement: $\text{schema}(R_1) \supset \text{schema}(R_2)$
• Result schema: $\text{schema}(R_1) - \text{schema}(R_2)$
• "Professors who have taught all courses":
  $\pi_{fid}(\pi_{fid,subj}(\text{Teaches} \bowtie \text{COURSE}) \div \pi_{subj}(\text{COURSE}))$
• What about "Courses that have been taught by all faculty"?
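
As a rough illustration of the operator itself, division can be written directly for the common case where $R_1$ is binary and $R_2$ is unary. The function `divide` and the sample rows are hypothetical, not part of the slides.

```python
def divide(r1, r2):
    """R1 ÷ R2 for R1 a set of (a, b) pairs and R2 a set of b values.

    Returns every a that is paired in R1 with all b in R2.
    """
    candidates = {a for (a, _) in r1}
    return {a for a in candidates if all((a, b) in r1 for b in r2)}

# Hypothetical (fid, subj) pairs, not the instance from the slides.
taught = {(1, "DB"), (1, "AI"), (2, "AI")}
subjects = {"DB", "AI"}
result = divide(taught, subjects)
print(result)
```

Swapping the roles of the two columns, so that `r1` holds (subj, fid) pairs and `r2` holds fids, is one way to approach the second question on the slide.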

8 The Big Picture: SQL to Algebra to Query Plan to Web Page
SELECT * FROM STUDENT, Takes, COURSE WHERE STUDENT.sid = Takes.sid AND Takes.cid = COURSE.cid
[Figure: system layers Web Server / UI / etc., Optimizer, Execution Engine, and Storage Subsystem. The query plan is an operator tree over STUDENT, Takes, and COURSE with operators labeled "Merge" and "Hash by cid".]
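
To make the SQL-to-algebra step concrete, the sketch below runs the query with Python's built-in `sqlite3` module and compares it with the corresponding algebra expression, a selection over the product of the three relations. Assumptions: the column `exp-grade` is renamed `exp_grade` to make it a legal identifier, the ambiguous `cid` is qualified as `COURSE.cid`, and only the Takes rows whose cid is legible in the transcript are loaded.

```python
import sqlite3

STUDENT = [(1, "Jill"), (2, "Qun"), (3, "Nitin"), (4, "Marty")]
# Only the Takes rows whose cid is legible in the transcript.
TAKES = [(1, "A", "550-0103"), (1, "A", "700-1003"), (3, "C", "500-0103")]
COURSE = [("550-0103", "DB", "F03"), ("700-1003", "AI", "S03"),
          ("501-0103", "Arch", "F03")]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE STUDENT (sid INTEGER, name TEXT)")
conn.execute("CREATE TABLE Takes (sid INTEGER, exp_grade TEXT, cid TEXT)")
conn.execute("CREATE TABLE COURSE (cid TEXT, subj TEXT, sem TEXT)")
conn.executemany("INSERT INTO STUDENT VALUES (?, ?)", STUDENT)
conn.executemany("INSERT INTO Takes VALUES (?, ?, ?)", TAKES)
conn.executemany("INSERT INTO COURSE VALUES (?, ?, ?)", COURSE)

sql_rows = set(conn.execute(
    "SELECT * FROM STUDENT, Takes, COURSE "
    "WHERE STUDENT.sid = Takes.sid AND Takes.cid = COURSE.cid"))

# The same query as algebra: σ_{sid match ∧ cid match}(STUDENT × Takes × COURSE).
algebra_rows = {s + t + c for s in STUDENT for t in TAKES for c in COURSE
                if s[0] == t[0] and t[2] == c[0]}

print(sql_rows == algebra_rows, len(sql_rows))
```

A real optimizer would not enumerate the full product as the comprehension does; it would pick join operators such as the merge and hash joins named in the figure.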

9 Hint of Future Things: Optimization Is Based on Algebraic Equivalences
• Relational algebra has laws of commutativity, associativity, etc. that imply certain expressions are equivalent in semantics
• They may be different in cost of evaluation!
  ◦ $\sigma_{c \lor d}(R) \equiv \sigma_c(R) \cup \sigma_d(R)$
  ◦ $\sigma_c(R_1 \times R_2) \equiv R_1 \bowtie_c R_2$
  ◦ $\sigma_{c \land d}(R) \equiv \sigma_c(\sigma_d(R))$
• Query optimization finds the most efficient representation to evaluate (or one that's not bad)
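
The three equivalences can be spot-checked on a small instance. This is only a sanity check on hypothetical data, not a proof; the relations and the predicates `c` and `d` are made up for the example.

```python
from itertools import product

# Hypothetical relation of integer pairs; c and d are arbitrary predicates.
R = {(a, b) for a in range(4) for b in range(4)}
c = lambda t: t[0] > 1
d = lambda t: t[1] < 2

def select(pred, rel):
    """σ_pred(rel)"""
    return {t for t in rel if pred(t)}

# σ_{c∨d}(R) ≡ σ_c(R) ∪ σ_d(R)
assert select(lambda t: c(t) or d(t), R) == select(c, R) | select(d, R)
# σ_{c∧d}(R) ≡ σ_c(σ_d(R))
assert select(lambda t: c(t) and d(t), R) == select(c, select(d, R))

# σ_c(R1 × R2) ≡ R1 ⋈_c R2
R1, R2 = {(1,), (2,), (3,)}, {(2,), (3,), (4,)}
join_cond = lambda t: t[0] == t[1]
cross = {r + s for r, s in product(R1, R2)}               # materializes all 9 pairs
theta_join = {r + s for r in R1 for s in R2 if join_cond(r + s)}
assert select(join_cond, cross) == theta_join
print(sorted(theta_join))
```

The two sides of each equivalence return the same tuples, but the selection over the product builds every pair first, which hints at why equivalent expressions can differ in cost.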

10 Relational Calculus: A Logical Way of Expressing Query Operations
• First-order logic (FOL) can also be thought of as a query language, and can be used in two ways:
  ◦ Tuple relational calculus
  ◦ Domain relational calculus
  ◦ The difference is the level at which variables are used: for attributes (domains) or for tuples
• The calculus is non-procedural (declarative) as compared to the algebra
  ◦ More like what we'll see in SQL
  ◦ More convenient to express certain things

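11 Domain Relational Calculus
Queries have form: $\{\langle x_1, x_2, \ldots, x_n \rangle \mid p\}$, where $x_1, x_2, \ldots, x_n$ are the domain variables and $p$ is the predicate
Predicate: boolean expression over $x_1, x_2, \ldots, x_n$
• Precise operations depend on the domain and query language; they may include special functions, etc.
• Assume the following at minimum:
  ◦ $\langle x_i, x_j, \ldots \rangle \in R$
  ◦ $X \text{ op } Y$, $X \text{ op } const$, $const \text{ op } X$, where op is one of $<, >, \le, \ge, =, \ne$
  ◦ $x_i, x_j, \ldots$ are domain variables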

12 More Complex Predicates
Starting with these atomic predicates, build up new predicates by the following rules:
• Logical connectives: If $p$ and $q$ are predicates, then so are $p \land q$, $p \lor q$, $\lnot p$, and $p \to q$
  ◦ $(x > 2) \land (x < 4)$
  ◦ $(x > 2) \land \lnot (x > 0)$
• Existential quantification: If $p$ is a predicate, then so is $\exists x.\, p$
  ◦ $\exists x.\, (x > 2) \land (x < 4)$
• Universal quantification: If $p$ is a predicate, then so is $\forall x.\, p$
  ◦ $\forall x.\, x > 2$
  ◦ $\forall x.\, \exists y.\, y > x$
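
Over a finite domain, the quantifiers correspond to Python's `any` and `all`. The sketch below evaluates the quantified examples; the domain `range(10)` is an assumption, and the answers depend on it.

```python
# Quantifiers need a domain to range over; a small finite one is assumed here.
DOMAIN = range(10)

def exists(pred):
    """∃x. pred(x) over DOMAIN"""
    return any(pred(x) for x in DOMAIN)

def forall(pred):
    """∀x. pred(x) over DOMAIN"""
    return all(pred(x) for x in DOMAIN)

ex1 = exists(lambda x: x > 2 and x < 4)              # ∃x. (x>2) ∧ (x<4)
ex2 = forall(lambda x: x > 2)                        # ∀x. x>2
ex3 = forall(lambda x: exists(lambda y: y > x))      # ∀x. ∃y. y>x
print(ex1, ex2, ex3)
```

The last formula is true over all integers but false over this finite domain, because the largest element has nothing above it. The truth of a quantified formula can therefore depend on the domain chosen.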

13 Some Examples
• Faculty ids
• Course names for courses with students expecting a "C"
• Courses taken by Jill
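
The formulas for these examples are not present in the transcript. As one possible reading, two of the queries are sketched below as domain-calculus expressions (in comments) next to equivalent Python set comprehensions. The formulas are an assumption, and only the Takes rows whose cid is legible in the example instance are used.

```python
PROFESSOR = {(1, "Ives"), (2, "Saul"), (8, "Roth")}                  # (fid, name)
STUDENT = {(1, "Jill"), (2, "Qun"), (3, "Nitin"), (4, "Marty")}      # (sid, name)
COURSE = {("550-0103", "DB", "F03"), ("700-1003", "AI", "S03"),
          ("501-0103", "Arch", "F03")}                                # (cid, subj, sem)
# Only the Takes rows whose cid is legible in the transcript.
TAKES = {(1, "A", "550-0103"), (1, "A", "700-1003"), (3, "C", "500-0103")}

# Faculty ids: {<fid> | ∃name. <fid, name> ∈ PROFESSOR}
faculty_ids = {fid for (fid, name) in PROFESSOR}

# Courses taken by Jill:
# {<subj> | ∃sid, grade, cid, sem. <sid, "Jill"> ∈ STUDENT
#           ∧ <sid, grade, cid> ∈ Takes ∧ <cid, subj, sem> ∈ COURSE}
jill_courses = {subj
                for (sid, name) in STUDENT if name == "Jill"
                for (sid2, grade, cid) in TAKES if sid2 == sid
                for (cid2, subj, sem) in COURSE if cid2 == cid}

print(sorted(faculty_ids), sorted(jill_courses))
```

Each existentially quantified variable becomes a loop variable that does not appear in the output tuple, and repeated variables become equality tests.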

14 Logical Equivalences
• There are two logical equivalences that will be heavily used:
  ◦ $p \to q \equiv \lnot p \lor q$ (Whenever $p$ is true, $q$ must also be true.)
  ◦ $\forall x.\, p(x) \equiv \lnot \exists x.\, \lnot p(x)$ ($p$ is true for all $x$)
• The second can be a lot easier to check!
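
Both equivalences can be verified mechanically: the first over the full truth table, the second over a finite domain. The domain and the sample predicates below are assumptions chosen for illustration.

```python
from itertools import product

def implies(p, q):
    """p → q, defined directly: false only when p holds and q does not."""
    return q if p else True

# p → q ≡ ¬p ∨ q, checked over the whole truth table.
implication_ok = all(implies(p, q) == ((not p) or q)
                     for p, q in product([False, True], repeat=2))

# ∀x. p(x) ≡ ¬∃x. ¬p(x), checked on a small assumed domain for two predicates.
DOMAIN = range(5)
preds = [lambda x: x >= 0, lambda x: x > 2]
forall_ok = all(all(p(x) for x in DOMAIN) == (not any(not p(x) for x in DOMAIN))
                for p in preds)

print(implication_ok, forall_ok)
```

The second form is often convenient in queries: rather than checking every value, look for a single counterexample and negate the result.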

15 Free and Bound Variables
• A variable $v$ is bound in a predicate $p$ when $p$ is of the form $\forall v \ldots$ or $\exists v \ldots$
• A variable occurs free in $p$ if it occurs in a position where it is not bound by an enclosing $\forall$ or $\exists$
• Examples:
  ◦ $x$ is free in $x > 2$
  ◦ $x$ is bound in $\exists x.\, x > y$

16 Can Rename Bound Variables Only
• When a variable is bound, one can replace it with some other variable without altering the meaning of the expression, provided there are no name clashes
• Example: $\exists x.\, x > 2$ is equivalent to $\exists y.\, y > 2$
• Otherwise, the variable is defined outside our "scope"…

17 Safety
• Pitfall in what we have done so far: how do we interpret
  $\{\langle x, y \rangle \mid \lnot (\langle x, y \rangle \in \text{STUDENT})\}$
  ◦ Set of all binary tuples that are not students: an infinite set (and unsafe query)
• A query is safe if, no matter how we instantiate the relations, it always produces a finite answer
  ◦ Domain independent: answer is the same regardless of the domain in which it is evaluated
• Unfortunately, both this definition of safety and domain independence are semantic conditions, and are undecidable
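
Domain dependence can be seen by evaluating the negated query over two different finite domains. This is a small sketch; the STUDENT instance and both domains are assumptions chosen to keep the output readable.

```python
STUDENT = {(1, "Jill"), (2, "Qun")}   # a small assumed instance

def not_students(sids, names):
    """{<x, y> | ¬(<x, y> ∈ STUDENT)} evaluated over an explicit finite domain."""
    return {(x, y) for x in sids for y in names if (x, y) not in STUDENT}

small = not_students({1, 2}, {"Jill", "Qun"})
large = not_students({1, 2, 3}, {"Jill", "Qun", "Nitin"})
print(len(small), len(large))
```

The relation is the same in both calls, yet the answer grows with the domain, so the query is not domain independent. Over an infinite domain the answer would be infinite.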

18 Safety and Termination Guarantees
• There are syntactic conditions that are used to guarantee "safe" formulas
  ◦ The definition is complicated, and we won't discuss it; you can find it in Ullman's Principles of Database and Knowledge-Base Systems
  ◦ The formulas that are expressible in real query languages based on relational calculus are all "safe"
• Many DB languages include additional features, like recursion, that must be restricted in certain ways to guarantee termination and consistent answers

19 Mini-Quiz
How do you write:
• Which students have taken more than one course from the same professor?
• What is the highest course number offered?

20 Translating from RA to DRC
• Core of relational algebra: $\pi, \sigma, \cup, \times, -$
• We need to work our way through the structure of an RA expression, translating each possible form.
  ◦ Let $TR[e]$ be the translation of RA expression $e$ into DRC.
• Relation names: For the RA expression $R$, the DRC expression is $\{\langle x_1, \ldots, x_n \rangle \mid \langle x_1, \ldots, x_n \rangle \in R\}$

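21 Selection: $TR[\sigma_{\theta} R]$
• Suppose we have $\sigma_{\theta}(e')$, where $e'$ is another RA expression that translates as:
  $TR[e'] = \{\langle x_1, \ldots, x_n \rangle \mid p\}$
• Then the translation of $\sigma_{\theta}(e')$ is
  $\{\langle x_1, \ldots, x_n \rangle \mid p \land \theta'\}$
  where $\theta'$ is obtained from $\theta$ by replacing each attribute with the corresponding variable
• Example: $TR[\sigma_{\#1 = \#2 \land \#4 > 2.5} R]$ (if $R$ has arity 4) is
  $\{\langle x_1, x_2, x_3, x_4 \rangle \mid \langle x_1, x_2, x_3, x_4 \rangle \in R \land x_1 = x_2 \land x_4 > 2.5\}$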
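
The selection example can be checked by evaluating both forms on a concrete relation. In this sketch the arity-4 relation `R` is hypothetical, and the calculus variables range over the values that occur in `R` (its active domain), which is an assumption that keeps the evaluation finite.

```python
from itertools import product

# Hypothetical arity-4 relation R.
R = {(1, 1, 7, 3), (1, 2, 7, 3), (2, 2, 9, 2), (5, 5, 0, 4)}

# Algebra: σ_{#1=#2 ∧ #4>2.5}(R), filtering the tuples of R directly.
algebra = {t for t in R if t[0] == t[1] and t[3] > 2.5}

# Calculus: {<x1,x2,x3,x4> | <x1,x2,x3,x4> ∈ R ∧ x1=x2 ∧ x4>2.5},
# with each variable ranging over the active domain of R.
domain = {v for t in R for v in t}
calculus = {(x1, x2, x3, x4)
            for x1, x2, x3, x4 in product(domain, repeat=4)
            if (x1, x2, x3, x4) in R and x1 == x2 and x4 > 2.5}

print(algebra == calculus, sorted(algebra))
```

The calculus form says what a result tuple must satisfy; the algebra form says how to obtain it by scanning `R`.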

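22 Projection: $TR[\pi_{i_1, \ldots, i_m}(e)]$
• If $TR[e] = \{\langle x_1, \ldots, x_n \rangle \mid p\}$ then
  $TR[\pi_{i_1, i_2, \ldots, i_m}(e)] = \{\langle x_{i_1}, x_{i_2}, \ldots, x_{i_m} \rangle \mid \exists x_{j_1}, x_{j_2}, \ldots, x_{j_k}.\, p\}$,
  where $x_{j_1}, x_{j_2}, \ldots, x_{j_k}$ are the variables in $x_1, x_2, \ldots, x_n$ that are not in $x_{i_1}, x_{i_2}, \ldots, x_{i_m}$
• Example: With $R$ as before,
  $TR[\pi_{\#1,\#3}(R)] = \{\langle x_1, x_3 \rangle \mid \exists x_2, x_4.\, \langle x_1, x_2, x_3, x_4 \rangle \in R\}$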
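
The projection rule can be checked the same way: the projected-away columns become existentially quantified variables. The relation `R` below is hypothetical, and the variables are again assumed to range over the active domain of `R`.

```python
# Hypothetical arity-4 relation R.
R = {(1, 1, 7, 3), (1, 2, 7, 3), (2, 2, 9, 2), (5, 5, 0, 4)}

# Algebra: π_{#1,#3}(R), keeping the first and third columns.
algebra = {(t[0], t[2]) for t in R}

# Calculus: {<x1, x3> | ∃x2, x4. <x1, x2, x3, x4> ∈ R}
domain = {v for t in R for v in t}
calculus = {(x1, x3)
            for x1 in domain for x3 in domain
            if any((x1, x2, x3, x4) in R for x2 in domain for x4 in domain)}

print(algebra == calculus, sorted(algebra))
```

Two tuples of `R` share the same first and third columns, so the projection has three tuples rather than four; set semantics removes the duplicate in both forms.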

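23 Union: $TR[R_1 \cup R_2]$
• $R_1$ and $R_2$ must have the same arity
• For $e_1 \cup e_2$, where $e_1, e_2$ are algebra expressions:
  $TR[e_1] = \{\langle x_1, \ldots, x_n \rangle \mid p\}$ and $TR[e_2] = \{\langle y_1, \ldots, y_n \rangle \mid q\}$
• Relabel the variables in the second: $TR[e_2] = \{\langle x_1, \ldots, x_n \rangle \mid q'\}$
• This may involve relabeling bound variables in $q$ to avoid clashes
  $TR[e_1 \cup e_2] = \{\langle x_1, \ldots, x_n \rangle \mid p \lor q'\}$
• Example: $TR[R_1 \cup R_2] = \{\langle x_1, \ldots, x_n \rangle \mid \langle x_1, \ldots, x_n \rangle \in R_1 \lor \langle x_1, \ldots, x_n \rangle \in R_2\}$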

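24 Other Binary Operators
• Difference: The same conditions hold as for union
  If $TR[e_1] = \{\langle x_1, \ldots, x_n \rangle \mid p\}$ and $TR[e_2] = \{\langle x_1, \ldots, x_n \rangle \mid q\}$
  then $TR[e_1 - e_2] = \{\langle x_1, \ldots, x_n \rangle \mid p \land \lnot q\}$
• Product:
  If $TR[e_1] = \{\langle x_1, \ldots, x_n \rangle \mid p\}$ and $TR[e_2] = \{\langle y_1, \ldots, y_m \rangle \mid q\}$
  then $TR[e_1 \times e_2] = \{\langle x_1, \ldots, x_n, y_1, \ldots, y_m \rangle \mid p \land q\}$
• Example: $TR[R \times S] = \{\langle x_1, \ldots, x_n, y_1, \ldots, y_m \rangle \mid \langle x_1, \ldots, x_n \rangle \in R \land \langle y_1, \ldots, y_m \rangle \in S\}$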
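
The translations of the binary operators (union, difference, product) can be compared against Python's set operations. The unary relations `R1` and `R2` below are hypothetical, and the variables range over the values occurring in either relation, which is an assumption.

```python
from itertools import product

# Hypothetical unary relations over a small domain.
R1, R2 = {(1,), (2,), (3,)}, {(3,), (4,)}
domain = {v for t in R1 | R2 for v in t}

# TR[R1 ∪ R2] = {<x> | <x> ∈ R1 ∨ <x> ∈ R2}
union = {(x,) for x in domain if (x,) in R1 or (x,) in R2}

# TR[R1 − R2] = {<x> | <x> ∈ R1 ∧ ¬(<x> ∈ R2)}
difference = {(x,) for x in domain if (x,) in R1 and (x,) not in R2}

# TR[R1 × R2] = {<x, y> | <x> ∈ R1 ∧ <y> ∈ R2}
prod = {(x, y) for x, y in product(domain, repeat=2)
        if (x,) in R1 and (y,) in R2}

print(union == R1 | R2, difference == R1 - R2,
      prod == {r + s for r in R1 for s in R2})
```

Union maps to disjunction, difference to conjunction with a negation, and product to conjunction over disjoint sets of variables.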

25 Summary
• Can translate relational algebra into (domain) relational calculus.
• Given syntactic restrictions that guarantee safety of a DRC query, can translate back to relational algebra
• These are the principles behind the initial development of relational databases
  ◦ SQL is close to calculus; query plan is close to algebra
  ◦ Great example of theory leading to practice!

26 Limitations of the Relational Algebra / Calculus
Can't do:
• Aggregate operations
• Recursive queries
• Complex (non-tabular) structures
• Most of these are expressible in SQL, OQL, XQuery, using other special operators
• Sometimes we even need the power of a Turing-complete programming language
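
To illustrate the first two limitations: a fixed algebra expression performs a fixed number of joins, whereas a recursive query such as transitive closure needs as many joins as the data requires. The sketch below uses a loop in a general-purpose language; the prerequisite relation `PREREQ` is hypothetical and not part of the example instance.

```python
# Hypothetical (course, prerequisite) edges.
PREREQ = {("700", "550"), ("550", "501"), ("501", "500")}

def transitive_closure(edges):
    """Repeat a self-join until no new pairs appear (a fixpoint)."""
    closure = set(edges)
    while True:
        new = {(a, d) for (a, b) in closure for (c, d) in closure if b == c}
        if new <= closure:
            return closure
        closure |= new

all_prereqs = transitive_closure(PREREQ)

# An aggregate (COUNT) over the result, also outside the basic algebra.
count_for_700 = len({p for (c, p) in all_prereqs if c == "700"})
print(len(all_prereqs), count_for_700)
```

The number of loop iterations depends on the length of the longest chain in the data, which is why no single expression built from the basic operators covers every instance.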

