CM036: Advanced Database Lecture 3 Relational Algebra and SQL.


1 CM036: Advanced Database Lecture 3 Relational Algebra and SQL

2 CM036: Advanced Databases, Lecture 3: Relational Languages
Content
1 Models, Languages and their Use in databases
2 The Relational Algebra (RA)
3 Simulating RA operations in SQL

3 1.1 Relational and other data models
- Relational models (relational algebra, relational calculus): most contemporary RDBMSs are based on them
- Tree models (hierarchical, object-relational): used by both legacy systems and new systems
- Object-oriented models (ODMG): a recent development, still not widely employed
Note: Native XML databases have some similarities with hierarchical database systems (legacy systems), but they have a more elaborate model and query languages, which are close to OQL (the standard query language of object-oriented databases).

4 1.2 Relational Languages and their Use
Data Manipulation Language (DML)
- Use: populates, updates, and queries a relational DB
- Example: relational algebra, SQL DML
Data Definition Language (DDL)
- Use: specifies the data structures and defines the relational schema
- Example: domain calculus, SQL DDL
Data Control Language (DCL)
- Use: specifies operation permissions, resource access discipline and user profiles
- Example: SQL DCL, LDAP
Note: Contemporary relational languages often incorporate some object-relational features of the model, e.g. Oracle 8i SQL has types, Oracle 9i SQL has type inheritance.

5 1.3 Can a DB live without a formal model?
The answer is NO, for several reasons:
- As we will see, SQL has ambiguities, while the relational algebra is unambiguous, so the algebra can provide a semantic interpretation for SQL.
- For the same reason, SQL cannot be executed directly: it must first be translated into a realistic structure of operations, which can then be interpreted.
- Finally, if we want to control the execution of SQL statements, we need to know how that execution works.

6 2 The Relational Algebra
- Proposed by Codd in 1970 as a formal data model.
- Describes the relations and the operations to manipulate relations.
- Relational operations in relational algebra transform either a single relation (unary operation) or a pair of relations (binary operation) into another relation.
- Can also be used to specify retrieval requests (queries). The query result is also in the form of a relation.
Relational operations:
- RESTRICT (σ) and PROJECT (π) are unary operations.
- Set operations: UNION (∪), INTERSECTION (∩), DIFFERENCE (−), CARTESIAN PRODUCT (×).
- JOIN operations (⋈) are binary.
- Other relational operations: DIVISION, OUTER JOIN, AGGREGATE FUNCTIONS.

8 2.1 RESTRICT (σ)
RESTRICT operation (also called SELECT, denoted by σ):
- Selects the tuples (rows) from a relation R that satisfy a certain selection condition c on the attributes of R: σ_{c}(R)
- The resulting relation includes each tuple in R whose attribute values satisfy c, i.e. it has the same attributes as R
Examples:
σ_{DNO=4}(EMPLOYEE)
σ_{SALARY>30000}(EMPLOYEE)
σ_{(DNO=4 AND SALARY>25000) OR DNO=5}(EMPLOYEE)
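The RESTRICT examples above can be mimicked with a short Python sketch. It assumes a relation is held as a list of dictionaries; the function name `restrict` and the sample rows are hypothetical and chosen only for illustration.

```python
# Hypothetical sample rows; attribute names follow the slide's EMPLOYEE relation.
EMPLOYEE = [
    {"FNAME": "John", "DNO": 5, "SALARY": 30000},
    {"FNAME": "Jennifer", "DNO": 4, "SALARY": 43000},
    {"FNAME": "Alicia", "DNO": 4, "SALARY": 25000},
    {"FNAME": "Ahmad", "DNO": 4, "SALARY": 25000},
]


def restrict(relation, condition):
    """sigma_c(R): keep every tuple of R for which condition(t) holds."""
    return [t for t in relation if condition(t)]


# sigma_{DNO=4}(EMPLOYEE)
dept4 = restrict(EMPLOYEE, lambda t: t["DNO"] == 4)

# sigma_{(DNO=4 AND SALARY>25000) OR DNO=5}(EMPLOYEE)
mixed = restrict(
    EMPLOYEE,
    lambda t: (t["DNO"] == 4 and t["SALARY"] > 25000) or t["DNO"] == 5,
)
```

The result keeps all attributes of the input; only the number of tuples changes.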

9 2.2 PROJECT (π)
PROJECT operation (denoted by π):
- Keeps only certain attributes (columns) from a relation R, specified in an attribute list L: π_{L}(R)
- The resulting relation has only those attributes of R specified in L
Example: π_{FNAME,LNAME,SALARY}(EMPLOYEE)
The PROJECT operation eliminates duplicate tuples in the resulting relation so that it remains a true set (no duplicate elements).
Example: π_{SEX,SALARY}(EMPLOYEE)
If several male employees have salary 30000, only a single tuple is kept in the resulting relation.
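A minimal sketch of PROJECT with duplicate elimination, again assuming relations are lists of dictionaries. The helper `project` and the sample rows are hypothetical.

```python
# Hypothetical sample rows: two male employees share the same salary.
EMPLOYEE = [
    {"FNAME": "John", "SEX": "M", "SALARY": 30000},
    {"FNAME": "Franklin", "SEX": "M", "SALARY": 30000},
    {"FNAME": "Alicia", "SEX": "F", "SALARY": 25000},
]


def project(relation, attrs):
    """pi_L(R): keep only the attributes in attrs and drop duplicate tuples."""
    seen = set()
    result = []
    for t in relation:
        key = tuple(t[a] for a in attrs)
        if key not in seen:  # duplicate elimination keeps the result a set
            seen.add(key)
            result.append(dict(zip(attrs, key)))
    return result


# pi_{SEX,SALARY}(EMPLOYEE): the two ("M", 30000) tuples collapse into one.
sex_salary = project(EMPLOYEE, ["SEX", "SALARY"])
```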

11 2.3 Combining Operations
Because of closure, several operations can be combined to form a relational algebra expression.
For example, the names and salaries of employees in Department 4:
π_{FNAME,LNAME,SALARY}(σ_{DNO=4}(EMPLOYEE))
Alternatively, we could specify explicit intermediate relations for each step:
TEMP ← σ_{DNO=4}(EMPLOYEE)
R ← π_{FNAME,LNAME,SALARY}(TEMP)
Attributes can optionally be renamed in a left-hand-side relation (this may be required for some operations that will be presented later), e.g.
R(FIRSTNAME, LASTNAME, SALARY) ← π_{FNAME,LNAME,SALARY}(TEMP)
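The nested and the step-by-step forms give the same relation, which a small sketch can confirm. The helpers `restrict`, `project` and `rename`, and the sample rows, are hypothetical names introduced here for illustration.

```python
# Hypothetical sample rows for the EMPLOYEE relation.
EMPLOYEE = [
    {"FNAME": "Jennifer", "LNAME": "Wallace", "SALARY": 43000, "DNO": 4},
    {"FNAME": "Alicia", "LNAME": "Zelaya", "SALARY": 25000, "DNO": 4},
    {"FNAME": "John", "LNAME": "Smith", "SALARY": 30000, "DNO": 5},
]


def restrict(relation, condition):
    return [t for t in relation if condition(t)]


def project(relation, attrs):
    result = []
    for t in relation:
        row = {a: t[a] for a in attrs}
        if row not in result:  # drop duplicates so the result stays a set
            result.append(row)
    return result


def rename(relation, mapping):
    """Rename attributes according to mapping; other attributes are kept."""
    return [{mapping.get(k, k): v for k, v in t.items()} for t in relation]


attrs = ["FNAME", "LNAME", "SALARY"]

# Single nested expression.
nested = project(restrict(EMPLOYEE, lambda t: t["DNO"] == 4), attrs)

# Same query with explicit intermediate relations.
TEMP = restrict(EMPLOYEE, lambda t: t["DNO"] == 4)
R = project(TEMP, attrs)

# R(FIRSTNAME, LASTNAME, SALARY) <- pi_{FNAME,LNAME,SALARY}(TEMP)
R_renamed = rename(R, {"FNAME": "FIRSTNAME", "LNAME": "LASTNAME"})
```

Closure is what makes this composition possible: every helper takes relations and returns a relation.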

13 2.4 Set Operations
Binary operations from set theory:
- UNION: R1 ∪ R2
- INTERSECTION: R1 ∩ R2
- DIFFERENCE: R1 − R2
For ∪, ∩ and −, the operand relations R1(A1, ..., An) and R2(B1, ..., Bn) must have the same number of attributes, and the domains of the attributes must be compatible; that is, dom(Ai) = dom(Bi) for i = 1, 2, ..., n. This condition is called union compatibility.
The resulting relation for ∪, ∩ or − has the same attribute names as the first operand relation R1 (by convention).
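The three set operations and the naming convention can be sketched as follows. This assumes a relation is a dictionary holding an attribute tuple and a set of row tuples; only the degree is checked, and domain compatibility is assumed rather than verified. All names and data are hypothetical.

```python
def check_union_compatible(r1, r2):
    """Check only the number of attributes; domains are assumed compatible."""
    if len(r1["attrs"]) != len(r2["attrs"]):
        raise ValueError("operands must have the same number of attributes")


def union(r1, r2):
    check_union_compatible(r1, r2)
    # By convention the result takes the attribute names of the first operand.
    return {"attrs": r1["attrs"], "rows": r1["rows"] | r2["rows"]}


def intersection(r1, r2):
    check_union_compatible(r1, r2)
    return {"attrs": r1["attrs"], "rows": r1["rows"] & r2["rows"]}


def difference(r1, r2):
    check_union_compatible(r1, r2)
    return {"attrs": r1["attrs"], "rows": r1["rows"] - r2["rows"]}


# Hypothetical single-attribute relations with differently named attributes.
R1 = {"attrs": ("SSN",), "rows": {("111",), ("222",)}}
R2 = {"attrs": ("ESSN",), "rows": {("222",), ("333",)}}

u = union(R1, R2)
i = intersection(R1, R2)
d = difference(R1, R2)
```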

15 More Set Operations: CARTESIAN PRODUCT
R(A1, A2, ..., Am, B1, ..., Bn) ← R1(A1, A2, ..., Am) × R2(B1, ..., Bn)
A tuple t exists in R for each combination of tuples t1 from R1 and t2 from R2 such that t[A1, A2, ..., Am] = t1 and t[B1, B2, ..., Bn] = t2.
The resulting relation R has m + n columns.
If R1 has n1 tuples and R2 has n2 tuples, then R will have n1 * n2 tuples.
Example:
R1:  A B C
     1 2 3
     2 3 4
     3 4 5
R2:  D E
     a b
     b c
R = R1 × R2:  A B C D E
              1 2 3 a b
              1 2 3 b c
              2 3 4 a b
              2 3 4 b c
              3 4 5 a b
              3 4 5 b c
3 attributes + 2 attributes = 5 attributes
3 tuples * 2 tuples = 6 tuples
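The attribute and tuple counts of the example can be reproduced with a short sketch that uses the slide's own values. The representation (a dictionary with an attribute tuple and a set of rows) and the name `cartesian_product` are assumptions made for illustration.

```python
from itertools import product


def cartesian_product(r1, r2):
    """R1 x R2: concatenate every tuple of R1 with every tuple of R2."""
    return {
        "attrs": r1["attrs"] + r2["attrs"],
        "rows": {t1 + t2 for t1, t2 in product(r1["rows"], r2["rows"])},
    }


# Values taken from the slide's example.
R1 = {"attrs": ("A", "B", "C"), "rows": {(1, 2, 3), (2, 3, 4), (3, 4, 5)}}
R2 = {"attrs": ("D", "E"), "rows": {("a", "b"), ("b", "c")}}

R = cartesian_product(R1, R2)
```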

16 CARTESIAN PRODUCT (continued)
On its own, CARTESIAN PRODUCT is of little use, since it generates all possible combinations. It can combine related tuples from two relations in a more informative way if followed by the appropriate RESTRICT operation.
Example: Retrieve a list of the names of dependents for each female employee
FEMALE_EMPS ← σ_{SEX='F'}(EMPLOYEE)
EMPNAMES ← π_{FNAME,LNAME,SSN}(FEMALE_EMPS)
EMP_DEPENDENTS ← EMPNAMES × DEPENDENT
ACTUAL_DEPENDENTS ← σ_{SSN=ESSN}(EMP_DEPENDENTS)
RESULT ← π_{FNAME,LNAME,DEPENDENT_NAME}(ACTUAL_DEPENDENTS)
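The five steps of the dependents query can be traced one by one in a sketch. The rows below are hypothetical, and the projections skip explicit duplicate elimination because this sample data cannot produce duplicates (SSN is unique).

```python
# Hypothetical sample rows.
EMPLOYEE = [
    {"FNAME": "Alicia", "LNAME": "Zelaya", "SSN": "1", "SEX": "F"},
    {"FNAME": "Jennifer", "LNAME": "Wallace", "SSN": "2", "SEX": "F"},
    {"FNAME": "John", "LNAME": "Smith", "SSN": "3", "SEX": "M"},
]
DEPENDENT = [
    {"ESSN": "2", "DEPENDENT_NAME": "Abner"},
    {"ESSN": "3", "DEPENDENT_NAME": "Alice"},
]

# FEMALE_EMPS <- sigma_{SEX='F'}(EMPLOYEE)
FEMALE_EMPS = [t for t in EMPLOYEE if t["SEX"] == "F"]

# EMPNAMES <- pi_{FNAME,LNAME,SSN}(FEMALE_EMPS)
EMPNAMES = [{k: t[k] for k in ("FNAME", "LNAME", "SSN")} for t in FEMALE_EMPS]

# EMP_DEPENDENTS <- EMPNAMES x DEPENDENT (all combinations)
EMP_DEPENDENTS = [{**e, **d} for e in EMPNAMES for d in DEPENDENT]

# ACTUAL_DEPENDENTS <- sigma_{SSN=ESSN}(EMP_DEPENDENTS)
ACTUAL_DEPENDENTS = [t for t in EMP_DEPENDENTS if t["SSN"] == t["ESSN"]]

# RESULT <- pi_{FNAME,LNAME,DEPENDENT_NAME}(ACTUAL_DEPENDENTS)
RESULT = [
    {k: t[k] for k in ("FNAME", "LNAME", "DEPENDENT_NAME")}
    for t in ACTUAL_DEPENDENTS
]
```

The product produces four combinations, of which only one pairs an employee with her own dependent.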

18 2.5 JOIN Operations
THETA JOIN: Similar to a CARTESIAN PRODUCT followed by a RESTRICT. The condition c is called a join condition.
R(A1, A2, ..., Am, B1, B2, ..., Bn) ← R1(A1, A2, ..., Am) ⋈_{c} R2(B1, B2, ..., Bn)
EQUIJOIN: The join condition c includes only equality comparisons involving attributes from R1 and R2. That is, c is of the form:
(Ai = Bj) AND ... AND (Ah = Bk); 1 ≤ i,h ≤ m, 1 ≤ j,k ≤ n
In the above EQUIJOIN operation:
- Ai, ..., Ah are called the join attributes of R1
- Bj, ..., Bk are called the join attributes of R2
Example: Retrieve each department's name and its manager's name:
T ← DEPARTMENT ⋈_{MGRSSN=SSN} EMPLOYEE
RESULT ← π_{DNAME,FNAME,LNAME}(T)
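A theta join can be sketched as a nested loop that tests the join condition on each combined tuple. This assumes relations are lists of dictionaries with disjoint attribute names (otherwise the merge would overwrite values); `theta_join` and the sample rows are hypothetical.

```python
def theta_join(r1, r2, condition):
    """R1 join_c R2: combine each pair of tuples and keep those satisfying c.

    Assumes r1 and r2 have no attribute names in common.
    """
    result = []
    for t1 in r1:
        for t2 in r2:
            t = {**t1, **t2}
            if condition(t):
                result.append(t)
    return result


# Hypothetical sample rows.
DEPARTMENT = [
    {"DNAME": "Research", "DNUMBER": 5, "MGRSSN": "3"},
    {"DNAME": "Administration", "DNUMBER": 4, "MGRSSN": "2"},
]
EMPLOYEE = [
    {"FNAME": "Alicia", "LNAME": "Zelaya", "SSN": "1"},
    {"FNAME": "Jennifer", "LNAME": "Wallace", "SSN": "2"},
    {"FNAME": "Franklin", "LNAME": "Wong", "SSN": "3"},
]

# T <- DEPARTMENT join_{MGRSSN=SSN} EMPLOYEE (an EQUIJOIN)
T = theta_join(DEPARTMENT, EMPLOYEE, lambda t: t["MGRSSN"] == t["SSN"])

# RESULT <- pi_{DNAME,FNAME,LNAME}(T)
RESULT = [(t["DNAME"], t["FNAME"], t["LNAME"]) for t in T]
```

Note that each tuple of T still carries both MGRSSN and SSN, which hold the same value.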

19 DEPT_MGR ← π_{DNAME,…,MGRSSN,…LNAME,…,SSN…}(DEPARTMENT ⋈_{MGRSSN=SSN} EMPLOYEE)

20 Natural Join
In an EQUIJOIN R ← R1 ⋈_{c} R2, the join attributes of R2 appear redundantly in the result relation R. In a NATURAL JOIN, the join attributes of R2 are eliminated from R. The equality is implied and there is no need to specify it. The form of the operator is R ← R1 * R2
Example: Retrieve each project's details along with the details of its department:
Step 1: (Rename DNUMBER to DNUM)
DEPT(DNAME, DNUM, MGRSSN, MGRSTARTDATE) ← π_{DNAME,DNUMBER,MGRSSN,MGRSTARTDATE}(DEPARTMENT)
Step 2: (Now both DEPT and PROJECT have DNUM)
PROJ_DEPT ← PROJECT * DEPT
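The rename-then-join steps can be sketched as follows, assuming relations are lists of dictionaries and that the join attributes are exactly the attribute names the two relations share. The function `natural_join` and the sample rows (with a reduced attribute set) are hypothetical.

```python
def natural_join(r1, r2):
    """R1 * R2: equate all commonly named attributes and keep one copy of each."""
    if not r1 or not r2:
        return []
    common = [a for a in r1[0] if a in r2[0]]
    return [
        {**t1, **t2}  # shared attributes hold equal values, so one copy remains
        for t1 in r1
        for t2 in r2
        if all(t1[a] == t2[a] for a in common)
    ]


# Hypothetical sample rows with fewer attributes than the full schema.
PROJECT = [
    {"PNAME": "ProductX", "PNUMBER": 1, "DNUM": 5},
    {"PNAME": "Computerization", "PNUMBER": 10, "DNUM": 4},
]
DEPARTMENT = [
    {"DNAME": "Research", "DNUMBER": 5, "MGRSSN": "3"},
    {"DNAME": "Administration", "DNUMBER": 4, "MGRSSN": "2"},
]

# Step 1: rename DNUMBER to DNUM.
DEPT = [
    {"DNAME": d["DNAME"], "DNUM": d["DNUMBER"], "MGRSSN": d["MGRSSN"]}
    for d in DEPARTMENT
]

# Step 2: PROJ_DEPT <- PROJECT * DEPT
PROJ_DEPT = natural_join(PROJECT, DEPT)
```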

22 Multiple Join
Example: Retrieve each employee's name and the name of the department he/she works for:
T ← EMPLOYEE ⋈_{DNO=DNUMBER} DEPARTMENT
RESULT ← π_{FNAME,LNAME,DNAME}(T)
JOIN ATTRIBUTES                       RELATIONSHIP
EMPLOYEE.SSN = DEPARTMENT.MGRSSN      EMPLOYEE manages the DEPARTMENT
EMPLOYEE.DNO = DEPARTMENT.DNUMBER     EMPLOYEE works in the DEPARTMENT

23 Self Join
A relation can have a set of join attributes to join it with itself:
JOIN ATTRIBUTES                        RELATIONSHIP
EMPLOYEE(1).SUPERSSN = EMPLOYEE(2).SSN EMPLOYEE(2) supervises EMPLOYEE(1)
One can think of this as joining two distinct copies of the relation, although only one relation actually exists. In this case, renaming can be useful.
Example: Retrieve each employee's name and the name of his/her supervisor:
SUPERVISOR(SSSN, SFN, SLN) ← π_{SSN,FNAME,LNAME}(EMPLOYEE)
T ← EMPLOYEE ⋈_{SUPERSSN=SSSN} SUPERVISOR
RESULT ← π_{FNAME,LNAME,SFN,SLN}(T)
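The self join can be sketched by building the renamed copy explicitly and joining on SUPERSSN = SSSN. The sample rows are hypothetical; an employee without a supervisor is given SUPERSSN = None here and therefore matches nothing.

```python
# Hypothetical sample rows; the top-level employee has no supervisor.
EMPLOYEE = [
    {"FNAME": "James", "LNAME": "Borg", "SSN": "1", "SUPERSSN": None},
    {"FNAME": "Franklin", "LNAME": "Wong", "SSN": "2", "SUPERSSN": "1"},
    {"FNAME": "John", "LNAME": "Smith", "SSN": "3", "SUPERSSN": "2"},
]

# SUPERVISOR(SSSN, SFN, SLN) <- pi_{SSN,FNAME,LNAME}(EMPLOYEE)
SUPERVISOR = [
    {"SSSN": e["SSN"], "SFN": e["FNAME"], "SLN": e["LNAME"]} for e in EMPLOYEE
]

# T <- EMPLOYEE join_{SUPERSSN=SSSN} SUPERVISOR
T = [
    {**e, **s}
    for e in EMPLOYEE
    for s in SUPERVISOR
    if e["SUPERSSN"] == s["SSSN"]
]

# RESULT <- pi_{FNAME,LNAME,SFN,SLN}(T)
RESULT = [(t["FNAME"], t["LNAME"], t["SFN"], t["SLN"]) for t in T]
```

Renaming is what keeps the employee's and the supervisor's names apart in the combined tuple.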

24 2.6 Complete Set of Operations
All the operations discussed so far can be described as a sequence of only the operations RESTRICT, PROJECT, UNION, SET DIFFERENCE, and CARTESIAN PRODUCT.
Hence, the set {σ, π, ∪, −, ×} is called a complete set of relational algebra operations. Any query language equivalent to these operations is called relationally complete.
For database applications, additional operations are needed that were not part of the original relational algebra. These include:
1. Aggregate functions and grouping.
2. OUTER JOIN and OUTER UNION.
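Two standard reductions illustrate the claim: INTERSECTION can be written with DIFFERENCE alone, and a JOIN is a CARTESIAN PRODUCT followed by a RESTRICT. The sketch below checks both on hypothetical data, using Python sets of tuples as relations.

```python
# Hypothetical union-compatible relations.
R1 = {(1,), (2,), (3,)}
R2 = {(2,), (3,), (4,)}

# R1 intersect R2 = R1 - (R1 - R2)
intersection_via_difference = R1 - (R1 - R2)

# Hypothetical relations EMP(LNAME, DNO) and DEPT(DNUMBER, DNAME).
EMP = {("Smith", 5), ("Wong", 4)}
DEPT = {(5, "Research"), (4, "Administration")}

# EMP join_{DNO=DNUMBER} DEPT = sigma_{DNO=DNUMBER}(EMP x DEPT)
product = {e + d for e in EMP for d in DEPT}
join_via_product = {t for t in product if t[1] == t[2]}
```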

25 2.7 More Relational Operations
OUTER JOIN
- In a regular EQUIJOIN or NATURAL JOIN operation, tuples in R1 or R2 that do not have matching tuples in the other relation do not appear in the result.
- Some queries require all tuples in R1 (or R2 or both) to appear in the result.
- When no matching tuples are found, nulls are placed for the missing attributes.
LEFT OUTER JOIN: R1 ⟕ R2 lets every tuple in R1 appear in the result
RIGHT OUTER JOIN: R1 ⟖ R2 lets every tuple in R2 appear in the result
FULL OUTER JOIN: R1 ⟗ R2 lets every tuple in both R1 and R2 appear in the result

26 TEMP ← EMPLOYEE ⟕_{SSN=MGRSSN} DEPARTMENT
RESULT ← π_{FNAME,MINIT,LNAME,DNAME}(TEMP)
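A left outer join of EMPLOYEE with DEPARTMENT on SSN = MGRSSN can be sketched as below, with Python's None standing in for null. The choice of the left variant is an assumption, and `left_outer_join` and the sample rows are hypothetical.

```python
def left_outer_join(r1, r2, condition, r2_attrs):
    """Keep every tuple of r1; pad with None when no tuple of r2 matches."""
    result = []
    for t1 in r1:
        matches = [{**t1, **t2} for t2 in r2 if condition(t1, t2)]
        if not matches:
            # No partner found: fill the attributes of r2 with nulls.
            matches = [{**t1, **{a: None for a in r2_attrs}}]
        result.extend(matches)
    return result


# Hypothetical sample rows: only Franklin manages a department.
EMPLOYEE = [
    {"FNAME": "John", "MINIT": "B", "LNAME": "Smith", "SSN": "1"},
    {"FNAME": "Franklin", "MINIT": "T", "LNAME": "Wong", "SSN": "2"},
]
DEPARTMENT = [{"DNAME": "Research", "MGRSSN": "2"}]

TEMP = left_outer_join(
    EMPLOYEE,
    DEPARTMENT,
    lambda e, d: e["SSN"] == d["MGRSSN"],
    ["DNAME", "MGRSSN"],
)

# RESULT <- pi_{FNAME,MINIT,LNAME,DNAME}(TEMP)
RESULT = [(t["FNAME"], t["MINIT"], t["LNAME"], t["DNAME"]) for t in TEMP]
```

A regular EQUIJOIN on the same data would return only the Franklin row.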

27 3 Simulating RA operations
All relational algebra operations can be simulated in SQL using SELECT statements only. For this purpose, the following parts of the SELECT statement can be varied:
- The list of attributes to be selected in the SELECT clause
- The number of tables to be looked into in the FROM clause
- The conditions specified in the WHERE clause
- The resulting relation is printed out, or transferred row by row to some variables if an INTO clause is present (e.g. using PL/SQL).
There is no structural correspondence between the expressions of relational algebra and SQL expressions; in a single SQL statement the following combinations are possible:
- a single unary operator applied to one single relation
- a single binary operator applied to a pair of relations
- a sequence of unary operators applied from the inside out to one relation and all the intermediate results
- combinations of binary and unary operators applied to several relations and the intermediate results according to nesting rules
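As an illustration of how the SELECT, FROM and WHERE clauses map onto RA operators, the sketch below runs two queries against an in-memory SQLite database through Python's standard sqlite3 module. The table contents are hypothetical, and SQLite is used only as a convenient stand-in for an SQL engine; the ORDER BY clauses are there solely to make the output order predictable.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE EMPLOYEE (FNAME TEXT, LNAME TEXT, SSN TEXT,
                           SALARY INTEGER, DNO INTEGER);
    CREATE TABLE DEPARTMENT (DNAME TEXT, DNUMBER INTEGER, MGRSSN TEXT);
    INSERT INTO EMPLOYEE VALUES ('Jennifer', 'Wallace', '2', 43000, 4);
    INSERT INTO EMPLOYEE VALUES ('Alicia', 'Zelaya', '1', 25000, 4);
    INSERT INTO EMPLOYEE VALUES ('Franklin', 'Wong', '3', 40000, 5);
    INSERT INTO DEPARTMENT VALUES ('Administration', 4, '2');
    INSERT INTO DEPARTMENT VALUES ('Research', 5, '3');
    """
)

# pi_{FNAME,LNAME,SALARY}(sigma_{DNO=4}(EMPLOYEE)):
# the SELECT list plays the role of PROJECT, WHERE the role of RESTRICT,
# and DISTINCT mirrors PROJECT's duplicate elimination.
dept4 = conn.execute(
    "SELECT DISTINCT FNAME, LNAME, SALARY FROM EMPLOYEE "
    "WHERE DNO = 4 ORDER BY LNAME"
).fetchall()

# pi_{DNAME,FNAME,LNAME}(DEPARTMENT join_{MGRSSN=SSN} EMPLOYEE):
# two tables in FROM give the product, WHERE supplies the join condition.
managers = conn.execute(
    "SELECT DISTINCT DNAME, FNAME, LNAME FROM DEPARTMENT, EMPLOYEE "
    "WHERE MGRSSN = SSN ORDER BY DNAME"
).fetchall()

conn.close()
```

Both queries are single SQL statements, yet each corresponds to a nested sequence of RA operators.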

28 3.1 Summary of RA vs. SQL

