1 CS 430 Database Theory Winter 2005 Lecture 5: Relational Algebra.



2 What is the Relational Algebra?
Answer: A collection of operations that can be applied to relations, yielding new relations
What's the idea behind the Relational Algebra?
  - Define a complete universe of operations on relations
  - Define the notion of Relationally Complete: a system that can do anything that can be done with the Relational Algebra

3 What are the Operations?
Original Operations (as defined by Codd):
  - SELECT or RESTRICT (σ)
  - PROJECT
  - RENAME
  - Set Operations: UNION, INTERSECTION, and MINUS or DIFFERENCE
  - CARTESIAN PRODUCT
  - Joins: JOIN or THETA JOIN, EQUIJOIN, NATURAL JOIN
  - DIVISION

4 What are the Operations?
Additional Operations:
  - AGGREGATE
  - OUTER JOIN (and OUTER UNION)
  - EXTEND (not in book)
  - Recursive Closure

5 SELECT
σ<condition>(R)
  - <condition> is a predicate (Boolean condition) on the attributes of the relation R
  - Result is a relation with just those tuples of R that satisfy <condition>
Examples:
  - σ(DNO = 5 AND SALARY > 30000)(EMPLOYEE)

6 Notes for SELECT
Booleans AND, OR, NOT have the usual interpretation
σ<cond1>(σ<cond2>(R)) = σ<cond2>(σ<cond1>(R)) = σ<cond1 AND cond2>(R)
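The SELECT identities can be checked on a small example. The following is a minimal sketch, assuming a relation is modeled as a list of dicts; the `select` helper and the sample EMPLOYEE rows are invented for illustration and are not part of the lecture.

```python
# Hypothetical sample data: only the attribute names DNO and SALARY come from the slides.
EMPLOYEE = [
    {"SSN": "111", "DNO": 5, "SALARY": 40000},
    {"SSN": "222", "DNO": 5, "SALARY": 25000},
    {"SSN": "333", "DNO": 4, "SALARY": 43000},
]

def select(pred, relation):
    """SELECT: keep only the tuples for which the predicate holds."""
    return [row for row in relation if pred(row)]

in_dept_5 = lambda t: t["DNO"] == 5
high_paid = lambda t: t["SALARY"] > 30000

# Nested selections in either order, and one selection with the AND of both conditions.
nested = select(in_dept_5, select(high_paid, EMPLOYEE))
swapped = select(high_paid, select(in_dept_5, EMPLOYEE))
combined = select(lambda t: in_dept_5(t) and high_paid(t), EMPLOYEE)
```

All three results contain the same tuples, which is what lets a query optimizer reorder or merge selections freely.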

7 PROJECT
π<attribute list>(R)
  - <attribute list> is a list of some subset of the attributes of R
  - Result is a relation with only those columns named in the attribute list
  - Order of columns is as given in the attribute list
Examples:
  - π<attribute list>(EMPLOYEE)

8 Notes for PROJECT
Duplicates are eliminated
  - The number of rows after a projection is always less than or equal to the number of rows in the original relation
π<list1>(π<list2>(R)) = π<list1>(R), provided every attribute in <list1> also appears in <list2>
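Duplicate elimination and the nested-projection identity can be sketched as follows. This assumes relations are lists of dicts; the `project` helper and the sample rows are hypothetical.

```python
def project(attrs, relation):
    """PROJECT: keep the listed attributes, in the given order, and drop duplicate rows."""
    seen = set()
    result = []
    for row in relation:
        key = tuple(row[a] for a in attrs)
        if key not in seen:  # a relation is a set, so repeated rows collapse
            seen.add(key)
            result.append(dict(zip(attrs, key)))
    return result

# Hypothetical sample data.
EMPLOYEE = [
    {"SSN": "111", "DNO": 5, "SALARY": 40000},
    {"SSN": "222", "DNO": 5, "SALARY": 40000},
    {"SSN": "333", "DNO": 4, "SALARY": 43000},
]

by_dept_salary = project(["DNO", "SALARY"], EMPLOYEE)  # two rows survive out of three
by_dept = project(["DNO"], EMPLOYEE)
nested = project(["DNO"], by_dept_salary)              # same result as by_dept
```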

9 Sequences of Relational Operations and RENAME
R1 ← Relational Expression
  - Defines an intermediate relation R1
  - Column names are determined by the expression
R1(A1, …, An) ← Relational Expression
  - Columns are named A1, …, An
Book defines RENAME operation:
  - ρS(B1, …, Bn)(R), or
  - ρ(B1, …, Bn)(R), or
  - ρS(R)
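Renaming attributes can be sketched in a few lines. Note that the book's form is positional (new names listed in column order), whereas this hypothetical `rename` helper takes an old-name to new-name mapping, which is more natural when rows are modeled as dicts.

```python
def rename(relation, new_names):
    """RENAME: new_names maps old attribute names to new ones; others are kept."""
    return [{new_names.get(a, a): v for a, v in t.items()} for t in relation]

# Hypothetical sample data.
DEPARTMENT = [{"DNUMBER": 5, "DNAME": "Research"}]
renamed = rename(DEPARTMENT, {"DNUMBER": "DNUM"})
```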

10 Set Theory
A relation is a set of tuples
Two relations R(A1, …, An) and S(B1, …, Bn) are union compatible if dom(Ai) = dom(Bi) for all i
  - Concept is that the tuples of R and S have the same type
If two relations are union compatible, we can define their UNION (∪), INTERSECTION (∩), and DIFFERENCE (MINUS, −)
Attribute names are determined by the attribute names of the first relation

11 More Set Theory
Usual Set Theory identities hold (possibly with appropriate attribute renaming):
  R ∪ S = S ∪ R, (R ∪ S) ∪ T = R ∪ (S ∪ T)
  R ∩ S = S ∩ R, (R ∩ S) ∩ T = R ∩ (S ∩ T)
  R − (S ∪ T) = (R − S) ∩ (R − T)
  R − (S ∩ T) = (R − S) ∪ (R − T)
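Because a relation is a set of tuples, Python's built-in `set` type can be used to spot-check these identities. A small sketch on invented, union-compatible relations:

```python
# Three union-compatible relations, each a set of (name, number) tuples (invented data).
R = {("a", 1), ("b", 2), ("c", 3)}
S = {("b", 2), ("d", 4)}
T = {("c", 3), ("d", 4)}

# | is union, & is intersection, - is difference.
commutative = (R | S == S | R) and (R & S == S & R)
associative = ((R | S) | T == R | (S | T)) and ((R & S) & T == R & (S & T))
de_morgan = (R - (S | T) == (R - S) & (R - T)) and (R - (S & T) == (R - S) | (R - T))
```

A check on one data set is only an illustration, not a proof; the identities hold for all sets.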

12 CARTESIAN PRODUCT
Given R(A1, …, Am) and S(B1, …, Bn), the Cartesian Product R × S is the table with attributes (A1, …, Am, B1, …, Bn) and one row for every combination of a row in R and a row in S
  - This assumes that the Ai and Bj are distinct

13 Example
Get all female employees who have dependents, together with their dependents' names:
  FEMALE_EMPS ← σ<condition>(EMPLOYEE)
  EMPNAMES ← π<attribute list>(FEMALE_EMPS)
  EMP_DEPENDENTS ← EMPNAMES × DEPENDENT
  ACTUAL_DEPENDENTS ← σ<condition>(EMP_DEPENDENTS)
  RESULT ← π<attribute list>(ACTUAL_DEPENDENTS)
See Figure 6.5, Text Book
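The five-step sequence can be traced in code. This is a sketch under stated assumptions: the attribute names (SEX, SSN, ESSN, FNAME, LNAME, DEPENDENT_NAME), the conditions, and all sample rows are guesses made for illustration, since the transcript does not preserve the selection conditions or attribute lists. The helper functions are hypothetical as well.

```python
def select(pred, rel):
    return [t for t in rel if pred(t)]

def project(attrs, rel):
    out = []
    for t in rel:
        row = {a: t[a] for a in attrs}
        if row not in out:  # eliminate duplicates
            out.append(row)
    return out

def product(r, s):
    """Cartesian product; assumes the attribute names of r and s are distinct."""
    return [{**tr, **ts} for tr in r for ts in s]

# Assumed schema and invented rows.
EMPLOYEE = [
    {"FNAME": "Ann", "LNAME": "Lee", "SSN": "1", "SEX": "F"},
    {"FNAME": "Bea", "LNAME": "Ortiz", "SSN": "2", "SEX": "F"},
    {"FNAME": "Carl", "LNAME": "Marsh", "SSN": "3", "SEX": "M"},
]
DEPENDENT = [
    {"ESSN": "2", "DEPENDENT_NAME": "Dana"},
    {"ESSN": "3", "DEPENDENT_NAME": "Eli"},
]

female_emps = select(lambda t: t["SEX"] == "F", EMPLOYEE)
empnames = project(["FNAME", "LNAME", "SSN"], female_emps)
emp_dependents = product(empnames, DEPENDENT)  # every employee paired with every dependent
actual_dependents = select(lambda t: t["SSN"] == t["ESSN"], emp_dependents)
result = project(["FNAME", "LNAME", "DEPENDENT_NAME"], actual_dependents)
```

The product step creates every pairing, meaningful or not; the selection that follows keeps only the pairs where the dependent actually belongs to the employee.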

14 Joins
Join two tables
  - Generalization of Cartesian Product
JOIN(R, S, <join condition>)
  - Same as SELECT(<join condition>, R × S)
  - <join condition> usually has the form <condition> and <condition> and … and <condition>
  - <condition> is of the form Ai θ Bj
  - θ is a comparison operator
This general kind of join is called a θ-JOIN (THETA JOIN)

15 More types of Joins
EQUIJOIN: θ-JOIN where all comparisons are for equality (=)
  - Note: EQUIJOIN has redundant attributes
NATURAL JOIN
  - Standard Definition: EQUIJOIN with same named attributes, eliminating redundant attributes
  - Non-standard: include renaming of attributes
  - Notation: R * S
Examples:
  - PROJ_DEPT ← PROJECT * DEPARTMENT
  - DEPT_LOCS ← DEPARTMENT * DEPT_LOCATIONS
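The difference between an equijoin and a natural join shows up in the result's attributes. A minimal sketch, assuming relations are lists of dicts; the helper functions, attribute names (DNUMBER, DNUM, DNAME, DLOCATION, PNAME), and rows are assumptions made for illustration.

```python
def theta_join(r, s, cond):
    """JOIN(R, S, cond): select cond over the Cartesian product of r and s."""
    return [{**tr, **ts} for tr in r for ts in s if cond(tr, ts)]

def natural_join(r, s):
    """Equijoin on all same-named attributes, keeping a single copy of each."""
    out = []
    for tr in r:
        for ts in s:
            common = tr.keys() & ts.keys()
            if all(tr[a] == ts[a] for a in common):
                out.append({**tr, **ts})
    return out

# Assumed schemas and invented rows.
DEPARTMENT = [{"DNUMBER": 5, "DNAME": "Research"}, {"DNUMBER": 4, "DNAME": "Administration"}]
DEPT_LOCATIONS = [{"DNUMBER": 5, "DLOCATION": "Houston"}, {"DNUMBER": 5, "DLOCATION": "Bellaire"}]
PROJECT = [{"PNAME": "ProductX", "DNUM": 5}]

# Natural join: DNUMBER appears once in each result tuple.
dept_locs = natural_join(DEPARTMENT, DEPT_LOCATIONS)

# Equijoin: both DNUM and DNUMBER are kept, so one of them is redundant.
proj_dept = theta_join(PROJECT, DEPARTMENT, lambda p, d: p["DNUM"] == d["DNUMBER"])
```

Department 4 has no location and disappears from `dept_locs`, which is the behavior the outer join later in the lecture is meant to address.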

16 Division
Used for universal quantification
  - E.g. Find all employees that work on all projects that …
Given relations R(X), S(Y) with X ⊇ Y
Let Z = X − Y, that is, Z is the set of attributes of R that are not attributes of S
T(Z) is the set of all tuples t_T such that for every t_S in S there is a tuple t_R in R such that t_R[Z] = t_T and t_R[Y] = t_S
Alternately, T is the biggest table such that T × S ⊆ R
Written as T ← R ÷ S

17 Picture of Division
R
A   B
a1  b1
a2  b1
a3  b1
a4  b1
a1  b2
a3  b2
a2  b3
a3  b3
a4  b3
a1  b4
a2  b4
a3  b4
S
A
a1
a2
a3
T
B
b1
b4
T ← R ÷ S
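The picture can be reproduced by applying the definition of division directly. A small sketch, with R stored as a set of (A, B) pairs and S as a set of A values; the `divide` function is a hypothetical helper specialized to this two-column case.

```python
# R(A, B) and S(A), taken from the slide.
R = {
    ("a1", "b1"), ("a2", "b1"), ("a3", "b1"), ("a4", "b1"),
    ("a1", "b2"), ("a3", "b2"),
    ("a2", "b3"), ("a3", "b3"), ("a4", "b3"),
    ("a1", "b4"), ("a2", "b4"), ("a3", "b4"),
}
S = {"a1", "a2", "a3"}

def divide(r, s):
    """Return every B value that is paired in r with all of the A values in s."""
    candidates = {b for (_, b) in r}
    return {b for b in candidates if all((a, b) in r for a in s)}

T = divide(R, S)
```

Here b2 is rejected because (a2, b2) is missing from R, and b3 because (a1, b3) is missing.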

18 Minimum Set of Operations
We have more operations than we (minimally) need
Examples:
  - Join can be defined using × (Cartesian product) and σ (selection)
  - Divide:
      T1 ← πZ(R)
      T2 ← πZ((S × T1) − R)
      T ← T1 − T2
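The three-step expression for division can be evaluated with plain set operations. The sketch below uses the R(A, B) and S(A) relations from the division picture, so Z = {B}; storing tuples as Python tuples in (A, B) order is an assumption of this illustration.

```python
R = {
    ("a1", "b1"), ("a2", "b1"), ("a3", "b1"), ("a4", "b1"),
    ("a1", "b2"), ("a3", "b2"),
    ("a2", "b3"), ("a3", "b3"), ("a4", "b3"),
    ("a1", "b4"), ("a2", "b4"), ("a3", "b4"),
}
S = {("a1",), ("a2",), ("a3",)}

T1 = {(b,) for (_, b) in R}              # project R onto Z: every candidate B value
S_x_T1 = {s + t for s in S for t in T1}  # S x T1, as (A, B) tuples
T2 = {(b,) for (_, b) in S_x_T1 - R}     # candidates missing at least one required pairing
T = T1 - T2                              # candidates paired with every tuple of S
```

T2 collects the disqualified candidates (here b2 and b3), so subtracting it from T1 leaves exactly the quotient.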

19 Aggregation and Grouping
Aggregation or Summarization Functions:
  - SUM, AVERAGE, MIN, MAX, COUNT, and others
Grouping of tuples
  - Group all tuples that have the same value in some subset of the columns
  - E.g. group all employees in the same department
Aggregation and Grouping cannot be expressed with the prior set of operations

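20 Aggregate Function Operation
AGGREGATE(<grouping attributes>, <function list>, R)
  - <function list> is a list of <function> <attribute> pairs
  - <function> is an aggregation function
  - <attribute> is an attribute of R
  - <grouping attributes> is a list of attributes that group the tuples of R
  - The result is a relation with one attribute for each grouping attribute plus one attribute for each function
Book notation: <grouping attributes> ℱ <function list>(R)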

21 Example: Get Number of Employees and Average Salary by Department
  - AGGREGATE(DNO, COUNT SSN, AVERAGE SALARY, EMPLOYEE)

22 Notes on Aggregation
Duplicates are not eliminated before applying the aggregation function
  - This gives functions like SUM and AVERAGE their normal interpretation
The result of aggregation is a relation, even if it consists of a single value
  - E.g. get the average salary: AGGREGATE(, AVERAGE SALARY, EMPLOYEE)
  - Yields a table with one tuple with one attribute
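Grouping and aggregation can be sketched as follows. The `aggregate` helper, its (name, function, attribute) triples, the result column names, and the sample rows are all assumptions of this illustration; only DNO, SSN, and SALARY come from the slides.

```python
from collections import defaultdict
from statistics import mean

def aggregate(group_attrs, functions, relation):
    """functions is a list of (result name, function, attribute) triples."""
    groups = defaultdict(list)
    for t in relation:
        groups[tuple(t[a] for a in group_attrs)].append(t)
    result = []
    for key, rows in groups.items():
        out = dict(zip(group_attrs, key))
        for name, fn, attr in functions:
            # Values are passed as a list, so duplicates are not eliminated.
            out[name] = fn([t[attr] for t in rows])
        result.append(out)
    return result

# Invented rows; two employees share the same salary on purpose.
EMPLOYEE = [
    {"SSN": "1", "DNO": 5, "SALARY": 30000},
    {"SSN": "2", "DNO": 5, "SALARY": 30000},
    {"SSN": "3", "DNO": 4, "SALARY": 45000},
]

by_dept = aggregate(
    ["DNO"],
    [("COUNT_SSN", len, "SSN"), ("AVERAGE_SALARY", mean, "SALARY")],
    EMPLOYEE,
)
# An empty grouping list puts every tuple in one group: a one-row, one-column relation.
overall = aggregate([], [("AVERAGE_SALARY", mean, "SALARY")], EMPLOYEE)
```

If the two equal salaries were collapsed before averaging, the overall average would come out as 37500 instead of 35000, which is why duplicates are kept.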

23 Outer Join
A JOIN eliminates tuples in one table that have no match in the other table
  - Example: Natural Join (R * S)
  - Tuples with NULL join attributes are also eliminated
An OUTER JOIN keeps unmatched tuples in either R, S or both
  - Additional attributes are padded with null values
  - LEFT (RIGHT) OUTER JOINs keep the unmatched tuples in the first (second) table being joined

24 Outer Join
Example:
  - DEPARTMENT (LEFT OUTER JOIN) DEPT_LOCATIONS would preserve departments that had no associated location
Notes:
  - An OUTER JOIN can (almost) be constructed from the original operations
  - It's the union of the standard join and the unmatched rows extended with nulls
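A left outer join can be sketched as the matched rows plus the unmatched left rows padded with nulls. The helper below is hypothetical, joins on same-named attributes, and uses Python's `None` to stand in for NULL; attribute names and rows are invented.

```python
def left_outer_join(r, s, s_attrs):
    """Natural left outer join; s_attrs lists the attributes of s."""
    out = []
    for tr in r:
        matches = [
            ts for ts in s
            if all(tr[a] == ts[a] for a in tr.keys() & ts.keys())
        ]
        if matches:
            out.extend({**tr, **ts} for ts in matches)
        else:
            # Unmatched left tuple: pad the attributes that only s has with None.
            padding = {a: None for a in s_attrs if a not in tr}
            out.append({**tr, **padding})
    return out

DEPARTMENT = [{"DNUMBER": 5, "DNAME": "Research"}, {"DNUMBER": 4, "DNAME": "Administration"}]
DEPT_LOCATIONS = [{"DNUMBER": 5, "DLOCATION": "Houston"}]

joined = left_outer_join(DEPARTMENT, DEPT_LOCATIONS, ["DNUMBER", "DLOCATION"])
```

The attribute list of `s` is passed explicitly because it cannot be inferred from the rows when `s` is empty.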

25 Outer Union
Union of two relations which are not union compatible
Outer Union of R(X, Y) and S(X, Z) is T(X, Y, Z)
  - Tuples are matched if the common attributes match

26 EXTEND
Extend a table with additional attributes
EXTEND(R, <name>, <expression>)
  - Add a column to R with name <name> and value <expression>
  - <expression> is an expression using the attributes of R
EXTEND is not expressible using the original operations
EXTEND provides a mechanism for performing arithmetic using attributes that is otherwise missing
  - Could be expressed as a join if our Universe contained the appropriate (infinite) relations containing results of computations
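EXTEND amounts to computing one new value per tuple. A minimal sketch, assuming relations are lists of dicts; the `extend` helper, the NEW_SALARY column, and the 10% raise are invented for illustration.

```python
def extend(relation, name, expr):
    """EXTEND: add a column called name whose value is expr applied to each tuple."""
    return [{**t, name: expr(t)} for t in relation]

EMPLOYEE = [{"SSN": "1", "SALARY": 30000}, {"SSN": "2", "SALARY": 42000}]

# Integer arithmetic on an existing attribute: a 10% raise.
with_raise = extend(EMPLOYEE, "NEW_SALARY", lambda t: t["SALARY"] * 11 // 10)
```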

27 Recursive Closure
Examples:
  - Find all employees who work for (either directly or indirectly) a specific manager
  - Find all the constituent parts of a given part, including parts of subassemblies, etc.
Relational Algebra can express any fixed depth of recursion
The SQL3 standard includes a syntax for recursive closure
  - No standard syntax as part of the relational algebra
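The manager example can be computed by repeating a join-like step until nothing new is found, which is exactly the unbounded iteration the algebra lacks. A sketch, assuming a hypothetical SUPERVISES relation of (manager, employee) pairs with invented names:

```python
def subordinates(supervises, manager):
    """All employees who report to manager, directly or indirectly."""
    found = set()
    frontier = {manager}
    while frontier:
        # One level of recursion: employees supervised by anyone in the frontier.
        step = {e for (m, e) in supervises if m in frontier} - found
        found |= step
        frontier = step
    return found

SUPERVISES = {("boss", "ann"), ("ann", "bob"), ("bob", "cy"), ("boss", "dee")}

under_ann = subordinates(SUPERVISES, "ann")
under_boss = subordinates(SUPERVISES, "boss")
```

Each pass of the loop corresponds to one more fixed level of joins; the loop stops when a pass adds no new employees, so it also terminates on cyclic data.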

28 Examples of Relational Algebra
See Examples, Section 6.5 of Text Book

