YV - Relational Model and Algebra 110 Chapter 3: THE RELATIONAL MODEL.


1 YV - Relational Model and Algebra 110 Chapter 3: THE RELATIONAL MODEL

2 DATABASE SYSTEMS: The Relational Model and Relational Database Systems. OUTLINE –Informal and Formal Definition of the Model: Structures, Constraints, Operations –Relational Algebra –Relational Calculus –The languages SQL and QBE –Views and Integrity Constraints using SQL –Normalization and Relational Database Design –Relational Database Systems

3 Relational Model: Informal Definition. Proposed in 1970 by E.F. Codd (“A Relational Model of Data for Large Shared Data Banks”, CACM) as a theory of a database model. It spurred tremendous research in the database field and became the most popular logical data model; many relational DBMSs are available today on nearly all platforms. A relational database is a set of relations. RELATION: A table of values. Each column in the table has a header, called an attribute (field). Each row in the table is called a tuple (record) and stands for an entity or a relationship.

4 Formal Definition. STRUCTURES –Only one kind: relations (which have a name). A domain D is a set of values, D = {d_1, d_2, ..., d_n}, e.g., DOMAIN of NAMES = the set of all names, DOMAIN of WEIGHT = the set of all weights, CHAR STRINGS from 1 to 10 in length, etc. An attribute A names a property of interest in a relation and takes its values from some associated domain D(A), e.g., EMPLOYEE_NAME, WEIGHT, etc. Attributes are the column names (headers) in a relation (Notation: R.A, or R[A], where R is the relation name).

5 Structure Definitions (2). A relation schema R is the name and attributes of a relation, together with the underlying domains of the attributes. When obvious, the domains are omitted. Notation: R(A_1, A_2, ..., A_n), e.g., STUDENT(Name, SSN, BirthDate, Address). The degree n of a relation R is the number of attributes in R. A database schema S is a set of relation schemas. Notation: S = {R_1, R_2, ..., R_m}, e.g., COMPANY = {EMPLOYEE, PROJECT, ...}

6 Structure Definitions (3). -- A tuple t of a relation R(A_1, A_2, ..., A_n) is an ordered list of values t = <v_1, v_2, ..., v_n>, where each v_i is an element of the domain D(A_i). -- A relation instance r(R), or simply a relation, is a set of tuples r(R) = {t_1, t_2, ..., t_k}; alternatively, it is a subset of the Cartesian product: r(R) ⊆ D(A_1) × D(A_2) × ... × D(A_n). -- The cardinality of R is the number of tuples in r(R); it is denoted by CARD R. -- A relational database is a set of relations (instances).
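The definitions above can be made concrete with a minimal Python sketch. The schema R(Name, Age), its two domains, and the tuple values are invented for illustration and do not come from the slides.

```python
from itertools import product

# Hypothetical domains for a two-attribute schema R(Name, Age).
D_NAME = {"ann", "bob"}
D_AGE = {20, 21, 22}

# A relation instance is a set of tuples, one value per attribute.
r = {("ann", 20), ("bob", 22)}

# Every instance is a subset of the Cartesian product of its domains.
cartesian = set(product(D_NAME, D_AGE))
is_valid_instance = r <= cartesian

degree = 2             # number of attributes in R
cardinality = len(r)   # CARD R: number of tuples in r(R)
```

Using a Python set mirrors the definition directly: tuples are unordered and no tuple can be stored twice.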

7 Characteristics of Relations. ORDERING of attributes in a relational schema is essential. ORDERING of tuples in a relation is not important. Every tuple is stored only ONCE in a relation (it is a set). A value may appear MULTIPLE TIMES in a column and is considered ATOMIC (indivisible); a relation whose values are all atomic is sometimes said to be in First Normal Form (1-NF). A special value, called NULL, is used to represent values that are inapplicable or unknown to the database –e.g., the PhoneNumber value of someone without a phone, or the Address value of someone who did not supply an address. Notation: the component value of a tuple t is t[A_i] = v_i

8 Constraints in the Model. CONSTRAINTS –Three kinds of constraints are inherent to the model: KEY, ENTITY INTEGRITY, and REFERENTIAL INTEGRITY. –Three basic explicit constraints: DOMAIN, COLUMN and USER-DEFINED (some other explicit constraints, like Functional Dependencies, will be discussed later). KEY CONSTRAINTS: The various keys, as defined for Entities and Relationships, hold in the Relational Model. –Note that a key is a property of the relation schema (not a property of a particular relation instance)

9 Inherent Constraints (1). –A set of attributes SK of a relation schema R for which each tuple in any relation instance r(R) must have unique value(s) is a superkey. That is, for distinct t_1 and t_2, t_1[SK] ≠ t_2[SK]. For instance, SSN of EMPLOYEE, NAME and ADDRESS of EMPLOYEE, SSN and NAME of EMPLOYEE, etc. –A candidate key K is a minimal superkey (that is, no proper subset of the attributes in K is a superkey). K is also called a key. For instance, SSN is a candidate key for EMPLOYEE, but the combination {SSN, NAME} is not. –A primary key PK is one of the candidate keys that is agreed to serve as an identifier for the relation (primary keys are usually distinguished by underlining). For instance, SSN is the primary key of relation EMPLOYEE.
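As a rough illustration of the superkey and candidate key definitions, the sketch below tests them against one hypothetical EMPLOYEE instance (the rows are invented). Because a key is a property of the schema, a check on a single instance can only refute a proposed key, never prove it.

```python
from itertools import combinations

def is_superkey(rows, attrs):
    """True if no two rows of this instance agree on all of attrs."""
    seen = set()
    for row in rows:
        key = tuple(row[a] for a in attrs)
        if key in seen:
            return False
        seen.add(key)
    return True

def is_candidate_key(rows, attrs):
    """A superkey none of whose non-empty proper subsets is a superkey."""
    if not is_superkey(rows, attrs):
        return False
    return not any(
        is_superkey(rows, subset)
        for size in range(1, len(attrs))
        for subset in combinations(attrs, size)
    )

# Hypothetical EMPLOYEE instance (values invented for illustration).
employee = [
    {"SSN": 1, "Name": "ann", "Address": "athens"},
    {"SSN": 2, "Name": "ann", "Address": "patras"},
    {"SSN": 3, "Name": "bob", "Address": "athens"},
]
```

On this instance, `["SSN", "Name"]` passes `is_superkey` but fails `is_candidate_key`, because `["SSN"]` alone already identifies every row.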

10 Inherent Constraints (2). ENTITY INTEGRITY: The primary key attributes PK in a relation schema R cannot have NULL values in any tuple of a relation instance r(R): t[PK] ≠ NULL, for all t in r(R). –The reason behind this constraint is that a primary key is used to identify a tuple in a relation. –Note that more attributes in R may be constrained to have no NULLs by explicit constraints.

11 Inherent Constraints (3). REFERENTIAL INTEGRITY: These constraints involve TWO relations and are used to specify a relationship among tuples of the two relations. They are also called foreign key constraints. –A foreign key FK is a set of one or more attributes of a relation R_1 that forms a primary key PK for another relation R_2. A tuple t_1 in r(R_1) is said to reference tuple t_2 in r(R_2) IF t_1[FK] = t_2[PK]. For instance, for the relation WORKS_ON the attribute SSN is a foreign key (it is the primary key of EMPLOYEE).
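A referential integrity check can be sketched as follows. The function name `dangling_references` and the sample rows are hypothetical, and the sketch assumes that a NULL (`None`) foreign key value references nothing and is therefore not a violation.

```python
def dangling_references(referencing, fk, referenced, pk):
    """Return rows of `referencing` whose non-NULL FK matches no PK value."""
    pk_values = {row[pk] for row in referenced}
    return [
        row for row in referencing
        if row[fk] is not None and row[fk] not in pk_values
    ]

# Invented sample data for EMPLOYEE and WORKS_ON.
employee = [{"SSN": 1}, {"SSN": 2}]
works_on = [
    {"SSN": 1, "PNumber": 10},
    {"SSN": 3, "PNumber": 10},   # SSN 3 does not exist in employee
]
violations = dangling_references(works_on, "SSN", employee, "SSN")
```

An empty result means every WORKS_ON tuple references an existing EMPLOYEE tuple.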

12 Explicit Constraints (1). DOMAIN CONSTRAINTS: They are the rules defined in the domain definition and inherited by columns (attributes) based on that domain. –A domain can be defined together with all its integrity rules (e.g., the domain of integers having all rules that apply to integers). Domains are usually the basic data types. –The ideal support is through strong data typing (very rare). COLUMN CONSTRAINTS: They are additional to the domain constraints and restrict the values allowed in a column. –Column rules go beyond the rules inherited from the domain (e.g., a column of small integers, or of integers between 1 and 10, further restricts the domain of integers). –In many systems, support is given with a CHECK option

13 Explicit Constraints (2). USER-DEFINED CONSTRAINTS: Any integrity rule not among the ones discussed before is classified as user-defined. To enforce certain business rules, integrity constraints of arbitrary complexity are required. Such constraints are expressed either procedurally or declaratively (the preferred way). Several mechanisms can be used to implement the enforcement of such rules: stored procedures, triggers, methods (for object-oriented systems). Generally, relational DBMSs are weak in enforcing rules

14 The COMPANY Database in the Relational Model - Schema
EMPLOYEE (SSN, Name, BirthDate, Address, Sex, Salary, SupSSN, DNumber)
DEPARTMENT (DNumber, DName, MgrSSN, MgrStartDate)
PROJECT (PNumber, PName, Location, DNumber)
DEPT_LOCATION (DNumber, DLocation)
WORKS_ON (SSN, PNumber, HoursPW)
DEPENDENT (SSN, DependName, Sex, BirthDate, Relationship)

15 The COMPANY Database in the Relational Model - Instance. Tables shown on the slide: DEPARTMENT, EMPLOYEE, DEPT_LOCATION

16 The COMPANY Database in the Relational Model - Instance. Tables shown on the slide: WORKS_ON, PROJECT, DEPENDENT

17 Definition of the Model: Operations. OPERATIONS –They fall into two groups: (a) UPDATE, (b) RETRIEVAL. UPDATE operations on Relations –INSERT a tuple –DELETE a tuple –MODIFY a tuple. Integrity constraints should not be violated by the execution of any update operation. For this reason, updates may propagate and cause other updates automatically. –e.g., when a tuple of EMPLOYEE is deleted, all tuples in WORKS_ON which have the same value for SSN are also deleted (non-existent employees cannot work on projects!)
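The propagated delete in the example can be sketched in a few lines of Python. The dictionary-of-lists database and the function name `delete_employee` are assumptions made for illustration; a real DBMS would typically express this rule declaratively on the foreign key.

```python
def delete_employee(db, ssn):
    """Delete an EMPLOYEE tuple and propagate the delete to WORKS_ON."""
    db["EMPLOYEE"] = [e for e in db["EMPLOYEE"] if e["SSN"] != ssn]
    # Cascade: remove the tuples that referenced the deleted employee.
    db["WORKS_ON"] = [w for w in db["WORKS_ON"] if w["SSN"] != ssn]

# Invented sample instance.
db = {
    "EMPLOYEE": [{"SSN": 1}, {"SSN": 2}],
    "WORKS_ON": [{"SSN": 1, "PNumber": 10}, {"SSN": 2, "PNumber": 10}],
}
delete_employee(db, 1)
```

After the call, no WORKS_ON tuple refers to SSN 1, so referential integrity still holds.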

18 Operations: Relational Languages. RETRIEVAL operations on Relations –There are two flavors: –(a) RELATIONAL ALGEBRA -- somewhat procedural, tells how to compute the result –(b) RELATIONAL CALCULUS -- somewhat declarative, tells what properties the result should have. No database system supports the two flavors of retrieval languages in their pure forms. This is because the issues of “ease of use”, “convenience”, etc., play an essential role in user interaction. Yet, the languages supported in DBMSs have their roots in either relational algebra or calculus.

19 Relational Languages. Query languages: Allow manipulation and retrieval of data from a database. The relational model supports simple, powerful QLs: –Strong formal foundation based on logic. –Allows for much optimization. Query Languages != programming languages! –QLs not expected to be “Turing complete”. –QLs not intended to be used for complex calculations. –QLs support easy, efficient access to large data sets.

20 Relational Algebra Operations. RELATIONAL ALGEBRA: A set of operators, each of which maps one or more relations into a new relation (the algebra is CLOSED). The operators, just as in arithmetic, can be nested, since the result of each operation is itself a relation. There are two types of operators: –traditional (regular) set operators »union, intersection, difference,... –database specific set operators »projection, selection, join,...

21 Relational Algebra: Set Ops. Traditional Set Operators –Union, Intersection, Difference, Cartesian Product –For the first three operators to apply, we must have UNION COMPATIBILITY between the two operand relations. That is: R_1(A_1, A_2, ..., A_n) and R_2(B_1, B_2, ..., B_n) must have the same number of attributes and the domains of the corresponding attributes must be compatible, i.e., D(A_i) = D(B_i), for i = 1, 2, ..., n –By convention, the resulting relation for these operators has the same attribute names as the first operand relation R_1

22 Relational Algebra: Example Database. Database Schema: STUDENT (SName, SAge), relation schema R; INSTRUCTOR (IName, IAge), relation schema S. Database Instance: tables R (STUDENT) and S (INSTRUCTOR) shown on the slide, with CARD R = 5 and CARD S = 4

23 Relational Algebra: Set Ops (2). UNION - Put all the tuples of two relations in one relation –Notation: R ∪ S –Formally: R ∪ S = { t | t is in R or t is in S } –Example: STUDENT ∪ INSTRUCTOR (result table shown on the slide) –CARD (R ∪ S) <= CARD R + CARD S

24 Relational Algebra: Set Ops (3). INTERSECTION - Put the common tuples of two relations in one relation –Notation: R ∩ S –Formally: R ∩ S = { t | t is in R and t is in S } –Example: STUDENT ∩ INSTRUCTOR (result table shown on the slide) –CARD (R ∩ S) <= min(CARD R, CARD S)

25 Relational Algebra: Set Ops (4). SET DIFFERENCE - Select the tuples of the first relation which are not members of the second relation –Notation: R − S –Formally: R − S = { t | t is in R and t is not in S } –Example: STUDENT − INSTRUCTOR (result table shown on the slide) –CARD (R − S) <= CARD R

26 Relational Algebra: Set Ops (5). CARTESIAN PRODUCT - Combine each tuple of one relation with each tuple of the other –Notation: R × S –Formally: R × S = { t | t is the concatenation of a tuple in R with a tuple in S } –Example: STUDENT × INSTRUCTOR –CARD (R × S) = CARD R × CARD S
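The four set operators map directly onto Python's built-in set type, as the sketch below suggests. The STUDENT and INSTRUCTOR tuples here are invented (the slide's actual instance tables are not part of this transcript), so the cardinalities differ from the CARD R = 5 and CARD S = 4 of the example database.

```python
from itertools import product

# Hypothetical union-compatible instances: (name, age) tuples.
student = {("ann", 20), ("bob", 22), ("eve", 30)}
instructor = {("eve", 30), ("dan", 45)}

union = student | instructor
intersection = student & instructor
difference = student - instructor
# Cartesian product: concatenate every student tuple with every
# instructor tuple.
cartesian = {s + i for s, i in product(student, instructor)}
```

The cardinality bounds from the slides can be checked on the results with `len`, for example `len(cartesian) == len(student) * len(instructor)`.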

27 Relational Algebra Operators: SELECTION. SELECTION - Selects the subset of tuples of a relation that satisfy a certain condition (qualification) c, which is an arbitrary Boolean expression on the attributes of R (“horizontal” subset of R) –Notation: σ_c(R) or R[c] –Formally: σ_c(R) = { t | t is in r(R) and condition c holds for t } –Examples: σ_{DNumber = 4}(EMPLOYEE), σ_{Salary > 30000}(EMPLOYEE), σ_{(Salary > 30000 AND DNumber = 4) OR DNumber = 5}(EMPLOYEE), EMPLOYEE [DNumber = 4], EMPLOYEE [Salary > 30000]

28 Relational Algebra Operators: SELECTION (2). –Selection is both commutative and associative: (a) σ_c1(σ_c2(R)) = σ_c2(σ_c1(R)) (b) σ_c1(σ_c2(R)) = σ_{c1 AND c2}(R) = σ_{c1, c2}(R) (c) σ_c1(σ_c2(σ_c3(R))) = σ_c2(σ_c3(σ_c1(R))) –Example Result: σ_{DNumber = 4}(EMPLOYEE), all employees in department number 4
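Properties (a) and (b) can be observed on a small example. This is only a sketch: the helper `select` and the three EMPLOYEE rows are invented for illustration.

```python
def select(rows, cond):
    """Selection: keep the tuples for which cond holds."""
    return [row for row in rows if cond(row)]

def in_dept_4(e):
    return e["DNumber"] == 4

def high_paid(e):
    return e["Salary"] > 30000

# Invented EMPLOYEE instance.
employee = [
    {"SSN": 1, "Salary": 25000, "DNumber": 4},
    {"SSN": 2, "Salary": 40000, "DNumber": 4},
    {"SSN": 3, "Salary": 35000, "DNumber": 5},
]

nested_1 = select(select(employee, high_paid), in_dept_4)
nested_2 = select(select(employee, in_dept_4), high_paid)
combined = select(employee, lambda e: in_dept_4(e) and high_paid(e))
```

All three expressions yield the same tuples, which is what lets a query optimizer reorder or merge selections freely.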

29 Relational Algebra Operators: PROJECTION. PROJECTION - Keeps only certain attributes (specified by a list L) and eliminates the other attributes of a relation R and also all duplicate tuples (“vertical” subset of R) –Notation: π_L(R) or R[L] –Formally: π_L(R) = { t[L] | t is in r(R) and L ⊆ R } –Example: π_{Location}(PROJECT), or PROJECT[Location], all locations where projects are
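A projection that also eliminates duplicates might be sketched like this. The helper `project` and the PROJECT rows are hypothetical.

```python
def project(rows, attrs):
    """Projection: keep only attrs; the set removes duplicate tuples."""
    return {tuple(row[a] for a in attrs) for row in rows}

# Invented PROJECT instance.
project_rel = [
    {"PNumber": 1, "PName": "X", "Location": "athens"},
    {"PNumber": 2, "PName": "Y", "Location": "athens"},
    {"PNumber": 3, "PName": "Z", "Location": "patras"},
]
locations = project(project_rel, ["Location"])
```

Three input tuples produce only two result tuples, because the two projects located in athens collapse into one.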

30 Relational Algebra Operators: JOINS. There are several types of JOINs - all combining two relations to form a new one: –(theta) join, equality join, natural join, semi-join, outer join. THETA (CONDITION) JOIN: Connect tuples from two relations that match (satisfy a Boolean condition c) on certain attributes –A theta-join is equivalent to a Cartesian product followed by a selection on the condition c. –Notation: R ⋈_c S or R [c] S –The resulting relation has ALL the attributes of R and of S

31 Relational Algebra Operators: THETA JOIN. –Example: DEPARTMENT ⋈_{MgrSSN > MgrSSN} DEPARTMENT, all department-department combinations where the first department’s MgrSSN is greater than the second’s
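The equivalence "Cartesian product followed by a selection" can be written down almost literally. In this sketch the helper `theta_join` and the DEPARTMENT rows are invented, and each result is kept as a pair of tuples so that the identical attribute names of the self-join do not clash.

```python
from itertools import product

def theta_join(r, s, cond):
    """Cartesian product of r and s followed by a selection on cond."""
    return [(t_r, t_s) for t_r, t_s in product(r, s) if cond(t_r, t_s)]

# Invented DEPARTMENT instance.
department = [
    {"DNumber": 1, "MgrSSN": 300},
    {"DNumber": 4, "MgrSSN": 100},
    {"DNumber": 5, "MgrSSN": 200},
]
pairs = theta_join(
    department,
    department,
    lambda d1, d2: d1["MgrSSN"] > d2["MgrSSN"],
)
```

Building the full product first is the definition, not an efficient evaluation strategy; real systems avoid materializing it.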

32 Relational Algebra Operators: EQUALITY JOIN. EQUALITY JOIN: Connect tuples from two relations that match (have equal values) on certain attributes. This is exactly like THETA JOIN, except that the condition c is only allowed to have equalities. –Notation: R ⋈_c S or R [c] S –Example: WORKS_ON ⋈_{HoursPW = DNumber} PROJECT, a totally MEANINGLESS relation

33 Relational Algebra Operators: NATURAL JOIN. NATURAL JOIN: Connect tuples from two relations that match (have equal values) on all common attributes. In the result, the common attributes are kept only once –Notation: R ⋈ S or R [X = X] S –Example: DEPARTMENT ⋈ DEPT_LOCATION
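A natural join over relations stored as lists of dictionaries could look like the sketch below. The helper name and the sample rows are assumptions, and the common attributes are derived from the first row of each relation, so every row of a relation is assumed to have the same attribute names.

```python
def natural_join(r, s):
    """Join rows that agree on every attribute name the relations share."""
    if not r or not s:
        return []
    common = set(r[0]) & set(s[0])
    return [
        {**t_r, **t_s}          # common attributes appear only once
        for t_r in r
        for t_s in s
        if all(t_r[a] == t_s[a] for a in common)
    ]

# Invented DEPARTMENT and DEPT_LOCATION instances.
department = [
    {"DNumber": 4, "DName": "Admin"},
    {"DNumber": 5, "DName": "Research"},
]
dept_location = [
    {"DNumber": 4, "DLocation": "athens"},
    {"DNumber": 5, "DLocation": "athens"},
    {"DNumber": 5, "DLocation": "patras"},
]
joined = natural_join(department, dept_location)
```

Merging the two dictionaries keeps DNumber only once in each result tuple, as the definition requires.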

34 Relational Algebra Operators: SEMI-JOIN. SEMI-JOIN: Select the subset of one relation that joins with another. A semi-join is equivalent to a join followed by a projection onto the attributes of the first relation. –Notation: R ⋉_c S or R <c] S –Example: EMPLOYEE ⋉_{SSN=MgrSSN} DEPARTMENT. Semi-joins are USEFUL in distributed database operations

35 Relational Algebra Operators: OUTER-JOIN. Motivation: In a regular join operation, tuples in relations R or S that do not have matching tuples in the other relation do not appear in the result. In some queries, all tuples in R (or S) must appear in the result - when no matching tuples are found, NULLs are placed for the missing attribute values. –Notation: R ⟗ S. OUTER-JOINs are distinguished into: –Left outer join (all tuples in R appear in the result) –Right outer join (all tuples in S appear) –Full outer join (all tuples in R and S appear)
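A left outer join can be sketched as below, with Python's `None` standing in for NULL. The function signature (a single join attribute `on` and an explicit list `s_attrs` of attributes to pad) and the sample rows are assumptions made to keep the example short.

```python
def left_outer_join(r, s, on, s_attrs):
    """Join r and s on `on`; pad unmatched r rows with None (NULL)."""
    result = []
    for t_r in r:
        matches = [t_s for t_s in s if t_s[on] == t_r[on]]
        if matches:
            result.extend({**t_r, **t_s} for t_s in matches)
        else:
            # No partner in s: keep the r tuple and fill s's attributes.
            result.append({**t_r, **{a: None for a in s_attrs}})
    return result

# Invented EMPLOYEE and DEPENDENT instances.
employee = [{"SSN": 1, "Name": "ann"}, {"SSN": 2, "Name": "bob"}]
dependent = [{"SSN": 1, "DependName": "zoe"}]
rows = left_outer_join(employee, dependent, "SSN", ["DependName"])
```

A right outer join is the same operation with the operands swapped, and a full outer join additionally keeps the unmatched tuples of S.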

36 Relational Algebra Operators: DIVISION. DIVISION: Given relations R(X, Y) and S(Y), where X and Y are sets of attributes, a tuple t is a member of the division (denoted (R ÷ S)[X]) IF for all t_S in S there exists a t_R in R such that t_R[Y] = t_S[Y] and t_R[X] = t[X] –Analogy with number arithmetic: The quotient q of a/b is the largest integer s.t. q × b <= a. The quotient Q of R ÷ S is the maximal relation s.t. Q × S ⊆ R

37 Relational Algebra Queries (1). A series of queries in relational algebra is presented on the following slides, using an example relational database that involves SAILORS who RESERVE some BOATS.
SAILORS (Sid, SName, Rating)
BOATS (Bid, BName, Color)
RESERVE (Sid, Bid, Date)

38 Relational Algebra Queries (2)
QUERY1: Find the names of sailors who have reserved boat number 2
( RESERVE [Bid=2] [Sid=Sid] SAILORS ) [SName]
π_{SName}( σ_{Bid=2}(RESERVE) ⋈_{Sid=Sid} SAILORS )
QUERY2: Find the names of sailors who have reserved a red boat
( BOATS [Color=red] [Bid=Bid] RESERVE [Sid=Sid] SAILORS ) [SName]
π_{SName}( σ_{Color=red}(BOATS) ⋈_{Bid=Bid} RESERVE ⋈_{Sid=Sid} SAILORS )
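QUERY1 can be traced step by step in Python as a selection, a join, and a projection. The sample tuples are invented; only the schema and the sailor name eleni come from the slides.

```python
# Invented instances of SAILORS and RESERVE.
sailors = [
    {"Sid": 1, "SName": "eleni", "Rating": 9},
    {"Sid": 2, "SName": "nikos", "Rating": 7},
]
reserve = [
    {"Sid": 1, "Bid": 2, "Date": "2000-05-01"},
    {"Sid": 2, "Bid": 3, "Date": "2000-05-02"},
]

# Selection: the reservations of boat number 2.
reserved_boat_2 = [r for r in reserve if r["Bid"] == 2]

# Equality join on Sid, then projection on SName. The set
# comprehension removes duplicate names, as projection requires.
query1 = {
    s["SName"]
    for r in reserved_boat_2
    for s in sailors
    if r["Sid"] == s["Sid"]
}
```

Applying the selection before the join keeps the intermediate result small, which is the order the algebraic expression itself suggests.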

39 Relational Algebra Queries (3)
QUERY3: Find the colors of the boats reserved by eleni
( SAILORS [SName=eleni] [Sid=Sid] RESERVE [Bid=Bid] BOATS ) [Color]
π_{Color}( σ_{SName=eleni}(SAILORS) ⋈_{Sid=Sid} RESERVE ⋈_{Bid=Bid} BOATS )
QUERY4: Find the names of the sailors who have reserved at least one boat
( RESERVE [Sid=Sid] SAILORS ) [SName]
π_{SName}( RESERVE ⋈_{Sid=Sid} SAILORS )

40 Relational Algebra Queries (4)
QUERY5: Find the names of sailors who have reserved a red or a green boat
π_{SName}( ( σ_{Color=red}(BOATS) ∪ σ_{Color=green}(BOATS) ) ⋈_{Bid=Bid} RESERVE ⋈_{Sid=Sid} SAILORS )
QUERY6: Find the names of sailors who have reserved both a red and a green boat
π_{SName}( ( π_{Sid}( σ_{Color=red}(BOATS) ⋈_{Bid=Bid} RESERVE ) ∩ π_{Sid}( σ_{Color=green}(BOATS) ⋈_{Bid=Bid} RESERVE ) ) ⋈_{Sid=Sid} SAILORS )

41 Relational Algebra Queries (5)
QUERY7: Find the names of sailors who have reserved all boats
π_{SName}( ( π_{Sid, Bid}(RESERVE) / π_{Bid}(BOATS) ) ⋈_{Sid=Sid} SAILORS )
QUERY8: Find the names and ratings of sailors who have reserved all red boats
π_{SName, Rating}( ( π_{Sid, Bid}(RESERVE) / π_{Bid}( σ_{Color=red}(BOATS) ) ) ⋈_{Sid=Sid} SAILORS )

42 Relational Algebra: Comments. There are several properties that hold in a relational algebra expression (commutativity, associativity, etc.) –Examples: σ_c1(π_L(R)) = π_L(σ_c1(R)), when c1 refers only to attributes in L; σ_c1(R ⋈_c2 S) = σ_c1(R) ⋈_c2 S, when c1 refers only to attributes of R; σ_c1(R ∪ S) = σ_c1(R) ∪ σ_c1(S); ... Such properties are very useful in query optimization

43 Relational Algebra: Comments. COMPLETE SET OF OPERATIONS –The set of operators {σ, π, ∪, −, ×} is called a complete set of relational algebra operations. The implication is that ALL other operators can be described as a sequence of the above operators. –For example, the division operator can be described as: R / S = π_X(R) − π_X( (π_X(R) × S) − R ), where X are the attributes of R that do not appear in S –Equivalently, in bracket notation: (R / S)[X] = R[X] - ( ( R[X] x S ) - R )[X]
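The expression of division through the primitive operators can be followed term by term in code. This is a sketch for the simplest case only, assuming X and Y are single attributes: R is a set of (x, y) pairs, S is a set of 1-tuples (y,), and the result is returned as a set of bare x values. The (Sid, Bid) data is invented.

```python
def divide(r, s):
    """R(X, Y) / S(Y), following the formula term by term."""
    xs = {x for x, _ in r}                           # projection of R on X
    all_pairs = {(x, y) for x in xs for (y,) in s}   # product with S
    missing = all_pairs - r                          # pairs that R lacks
    disqualified = {x for x, _ in missing}           # projection on X
    return xs - disqualified

reserve = {(1, 101), (1, 102), (2, 101)}   # (Sid, Bid)
boats = {(101,), (102,)}                   # (Bid,)
sailors_with_all_boats = divide(reserve, boats)
```

Sailor 2 is disqualified because the pair (2, 102) is missing from RESERVE, which is the "for all" condition of the division definition seen from the negative side.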

44 Relational Algebra Completeness. There are several combinations of relational algebra operators that define a complete set. Any Query Language equivalent to a complete set of operations is called RELATIONALLY COMPLETE –NOTE: This does not imply that the language is adequate to do all database operations (e.g., a good language must support aggregates, many forms of joins, built-in functions,...). An interesting operator, which goes beyond the expressive power of the relational set of operators as defined by Codd, is that of transitive closure. This is a form of recursion in relational databases and is very useful in many applications.
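Transitive closure needs a loop whose length depends on the data, which is why no fixed algebra expression can compute it. The sketch below uses a naive fixpoint iteration; the supervision pairs are invented, loosely modeled on the SupSSN attribute of EMPLOYEE.

```python
def transitive_closure(pairs):
    """Smallest transitive relation containing `pairs` (naive fixpoint)."""
    closure = set(pairs)
    while True:
        # Join the closure with itself: (a, b) and (b, d) give (a, d).
        new_pairs = {
            (a, d)
            for a, b in closure
            for c, d in closure
            if b == c
        }
        if new_pairs <= closure:
            return closure      # nothing new: fixpoint reached
        closure |= new_pairs

# Hypothetical (employee SSN, supervisor SSN) pairs.
supervises = {(1, 2), (2, 3), (3, 4)}
all_superiors = transitive_closure(supervises)
```

Each pass is an ordinary join followed by a union; only the repetition until nothing changes lies outside the algebra.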

