RELATIONAL ALGEBRA (III) Prof. Sin-Min LEE Department of Computer Science.



Unary Relational Operations: SELECT and PROJECT
- The SELECT Operation
- The PROJECT Operation
- Sequences of Operations and the RENAME Operation

Relational Algebra Operations from Set Theory
- The UNION, INTERSECTION, and MINUS Operations
- The CARTESIAN PRODUCT (or CROSS PRODUCT) Operation

Binary Relational Operations: JOIN and DIVISION
- The JOIN Operation
- The EQUIJOIN and NATURAL JOIN Variations of JOIN
- A Complete Set of Relational Algebra Operations
- The DIVISION Operation

Additional Relational Operations
- Aggregate Functions and Grouping
- Recursive Closure Operations
- OUTER JOIN Operations
- The OUTER UNION Operation

SPECIAL RELATIONAL OPERATORS
The following operators are peculiar to relations:
- Join operators. There are several kinds of join operators. We consider only three of them here (others will be considered when we discuss null values):
  (1) Condition joins
  (2) Equijoins
  (3) Natural joins
- Division

JOIN OPERATORS
Condition Joins:
- Defined as a cross-product followed by a selection: R ⋈_c S = σ_c(R × S), where c is the condition (⋈ is called the bow-tie).
- Example: Given the sample relational instances S1 and R1, the condition join S1 ⋈_{S1.sid<R1.sid} R1 yields [result table not captured in the transcript]
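The definition above (cross-product, then selection) can be sketched in a few lines of Python. This is only an illustration: relations are modeled as lists of dicts, the helper names `cross_product` and `condition_join` are made up for this sketch, and the rows of S1 and R1 are hypothetical sample data, since the transcript does not include the actual instances.

```python
from itertools import product

def cross_product(r, s, r_name, s_name):
    """R x S: pair every row of r with every row of s.
    Field names are qualified (e.g. 'S1.sid') so shared names do not collide."""
    return [
        {**{f"{r_name}.{k}": v for k, v in t.items()},
         **{f"{s_name}.{k}": v for k, v in u.items()}}
        for t, u in product(r, s)
    ]

def condition_join(r, s, cond, r_name="R", s_name="S"):
    """Condition join: keep the rows of the cross-product that satisfy cond."""
    return [row for row in cross_product(r, s, r_name, s_name) if cond(row)]

# Hypothetical sample instances (not taken from the slides).
S1 = [{"sid": 22, "sname": "dustin"},
      {"sid": 31, "sname": "lubber"},
      {"sid": 58, "sname": "rusty"}]
R1 = [{"sid": 22, "bid": 101},
      {"sid": 58, "bid": 103}]

# S1 joined with R1 under the condition S1.sid < R1.sid
result = condition_join(S1, R1,
                        lambda row: row["S1.sid"] < row["R1.sid"],
                        "S1", "R1")
for row in result:
    print(row)
```

With this sample data only the pairs whose S1.sid is smaller than R1.sid survive the selection; an equijoin is the same call with an equality test as the condition.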

Equijoin: A special case of the condition join in which the join condition consists solely of equalities between fields of R and S, connected by the logical AND operator (∧).
Example: Given the two sample relational instances S1 and R1, the equijoin S1 ⋈_{R1.sid=S1.sid} R1 yields [result table not captured in the transcript]

Natural Join
- Special case of equijoin where equalities are implicitly specified on all fields having the same name in R and S.
- The condition c is now left out, so that the "bow tie" operator by itself signifies a natural join.
- N.B. If the two relations have no attributes in common, the natural join is simply the cross-product.
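A minimal Python sketch of the natural join, assuming relations are lists of dicts whose keys are the attribute names. The function name `natural_join` and all sample rows are hypothetical. It also shows the N.B. case: with no shared attribute names the result is the full cross-product.

```python
def natural_join(r, s):
    """Join rows of r and s that agree on every attribute name they share.
    Shared attributes appear once in the result."""
    if not r or not s:
        return []
    common = set(r[0]) & set(s[0])
    return [
        {**t, **u}
        for t in r
        for u in s
        # all() over an empty 'common' is True, which gives the cross-product
        if all(t[a] == u[a] for a in common)
    ]

# Hypothetical sample instances (not taken from the slides).
S1 = [{"sid": 22, "sname": "dustin"},
      {"sid": 31, "sname": "lubber"},
      {"sid": 58, "sname": "rusty"}]
R1 = [{"sid": 22, "bid": 101},
      {"sid": 58, "bid": 103}]
B1 = [{"color": "red"}, {"color": "green"}]

joined = natural_join(S1, R1)    # equality on the shared field 'sid'
crossed = natural_join(S1, B1)   # no shared field: cross-product
print(joined)
print(len(crossed))
```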

Functional Dependency
α → β holds on schema R if, in any legal relation r(R), for all pairs of tuples t1 and t2 in r such that t1[α] = t2[α], it is also the case that t1[β] = t2[β].

Functional Dependencies
- FDs are defined over two sets of attributes: X, Y ⊆ R
- Notation: X → Y reads as "X determines Y"
- If X → Y, then all tuples that agree on X must also agree on Y
[Figure: relation R with attribute groups X, Y, Z]

A → C, but not C → A

      A    B    C    D
t0    a1   b1   c1   d1
t1    a1   b2   c1   d2
t2    a2   b2   c2   d2
t3    a2   b2   c2   d3
t4    a3   b3   c3   d4
t5    a4   b3   c3   d4
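The claim about the instance above can be checked mechanically. The sketch below assumes the six tuples t0 to t5 are encoded as dicts; the helper name `fd_holds` is introduced only for this illustration. It tests whether an FD holds on one given instance, which is weaker than holding on the schema.

```python
def fd_holds(rows, lhs, rhs):
    """Return True if, in this instance, rows that agree on lhs also agree on rhs."""
    seen = {}
    for row in rows:
        key = tuple(row[a] for a in lhs)
        val = tuple(row[a] for a in rhs)
        # setdefault stores val the first time a key is seen and
        # returns the stored value on every later occurrence
        if seen.setdefault(key, val) != val:
            return False
    return True

rows = [
    {"A": "a1", "B": "b1", "C": "c1", "D": "d1"},  # t0
    {"A": "a1", "B": "b2", "C": "c1", "D": "d2"},  # t1
    {"A": "a2", "B": "b2", "C": "c2", "D": "d2"},  # t2
    {"A": "a2", "B": "b2", "C": "c2", "D": "d3"},  # t3
    {"A": "a3", "B": "b3", "C": "c3", "D": "d4"},  # t4
    {"A": "a4", "B": "b3", "C": "c3", "D": "d4"},  # t5
]

print(fd_holds(rows, ["A"], ["C"]))  # A -> C
print(fd_holds(rows, ["C"], ["A"]))  # C -> A
```

C → A fails because t4 and t5 agree on C (c3) but differ on A (a3 versus a4).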

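Minimal cover: A → B, D → B
R(ABCD): {A, C, D} is a candidate key, since A, C, and D appear on no right-hand side and {A, C, D}+ = {A, B, C, D}. The full set {A, B, C, D} is a superkey but is not minimal.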

Functional Dependencies Graph (example)
[Figure: dependency graph over the attribute sets X, Y, Z]

Closure
- Let F be a set of functional dependencies.
- The closure of F, denoted by F+, is the set of all functional dependencies logically implied by F.
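F+ is usually too large to enumerate, so in practice one tests whether a single FD X → Y belongs to F+ by computing the attribute closure X+ and checking that it contains Y. The sketch below assumes single-letter attribute names and FDs written as pairs of strings; the names `attribute_closure` and `implies`, and the example set F, are hypothetical.

```python
def attribute_closure(attrs, fds):
    """Compute X+ : all attributes functionally determined by attrs under fds."""
    closure = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            # fire an FD when its left side is covered and it adds something new
            if set(lhs) <= closure and not set(rhs) <= closure:
                closure |= set(rhs)
                changed = True
    return closure

def implies(fds, lhs, rhs):
    """True if lhs -> rhs is in F+, i.e. rhs is contained in lhs+."""
    return set(rhs) <= attribute_closure(lhs, fds)

# Hypothetical example set: A -> B, B -> C
F = [("A", "B"), ("B", "C")]

print(sorted(attribute_closure("A", F)))
print(implies(F, "A", "C"))  # follows by transitivity
print(implies(F, "C", "A"))
```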

Minimal cover
A minimal cover of F is sometimes called an irreducible set of F. To find a minimal cover of a set of functional dependencies F, we first transform F so that each FD with more than one attribute on the right-hand side is replaced by a set of FDs that each have exactly one attribute on the right-hand side.

A minimal cover of F is then a set of FDs such that:
(a) the right-hand side of every dependency is a single attribute;
(b) for no X -> A in F is the set F - {X -> A} equivalent to F;
(c) for no X -> A in F and proper subset Z of X is F - {X -> A} U {Z -> A} equivalent to F.

ALGORITHM. Finding a minimal cover G for F
1. Set G := F.
2. Replace each functional dependency X->A1,A2,...,An in G by the n functional dependencies X->A1, X->A2, ..., X->An.
3. For each functional dependency X->A in G, and for each attribute B that is an element of X: if G is equivalent to ((G - (X->A)) UNION ((X-B)->A)), then replace X->A with (X-B)->A in G.
4. For each remaining functional dependency X->A in G: compute X+ with respect to the set of dependencies (G - (X->A)); if X+ contains A, then remove X->A from G.
Note: In step 3, to determine whether G is equivalent to ((G - (X->A)) UNION ((X-B)->A)), check whether (X-B)+ computed with respect to G contains A. If it does, the two sets are equivalent.
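The four steps can be sketched in Python as follows. This is one possible reading of the algorithm, not a reference implementation: it assumes single-letter attribute names, takes FDs as pairs of strings, and returns FDs as pairs of frozensets. The function names are hypothetical. A minimal cover is not unique in general, so the result can depend on the order in which attributes and FDs are tried.

```python
def closure(attrs, fds):
    """Attribute closure of attrs with respect to fds (pairs of frozensets)."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result

def minimal_cover(fds):
    # Steps 1-2: copy F and split every right-hand side into single attributes.
    g = []
    for lhs, rhs in fds:
        for a in sorted(rhs):
            fd = (frozenset(lhs), frozenset(a))
            if fd not in g:
                g.append(fd)
    # Step 3: drop attribute B from X when (X - B)+ w.r.t. G still contains A.
    for i in range(len(g)):
        lhs, rhs = g[i]
        for b in sorted(lhs):
            reduced = lhs - {b}
            if reduced and rhs <= closure(reduced, g):
                lhs = reduced
                g[i] = (lhs, rhs)
    g = list(dict.fromkeys(g))  # step 3 may have produced duplicates
    # Step 4: drop X -> A when X+ w.r.t. G - {X -> A} still contains A.
    for fd in list(g):
        rest = [x for x in g if x != fd]
        if fd[1] <= closure(fd[0], rest):
            g = rest
    return g

def show(fds):
    """Render FDs as sorted strings such as 'AB->C' for printing."""
    return sorted("".join(sorted(l)) + "->" + "".join(sorted(r)) for l, r in fds)

print(show(minimal_cover([("AB", "CD")])))
print(show(minimal_cover([("AB", "C"), ("B", "CD"), ("D", "EF"), ("B", "F")])))
```

The first call uses F={AB->CD}, where nothing can be reduced after the split; the second uses a larger set in which one left-hand side shrinks and one FD is redundant.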

R(A,B,C,D), F={AB->CD}
Following the algorithm:
2. G={AB->C,AB->D}
3. a) Try to replace AB->C with B->C: {AB->C,AB->D} is not equivalent to {B->C,AB->D}. Note that B+ wrt (with respect to) G = {B}. Since it does not contain C, they are not equivalent.
   b) Try to replace AB->C with A->C: {AB->C,AB->D} is not equivalent to {A->C,AB->D}. Note that A+ wrt G = {A}. Since it does not contain C, they are not equivalent.
   c) Try to replace AB->D with B->D: {AB->C,AB->D} is not equivalent to {AB->C,B->D}. Note that B+ wrt G = {B}. Since it does not contain D, they are not equivalent.
   d) Try to replace AB->D with A->D: {AB->C,AB->D} is not equivalent to {AB->C,A->D}. Note that A+ wrt G = {A}. Since it does not contain D, they are not equivalent.
   Therefore, we cannot make any changes to G in this step.
4. a) Try to remove AB->C: We can do this if G is equivalent to H={AB->D}. However, AB+ wrt G = {A,B,C,D} <> AB+ wrt H = {A,B,D}.
   b) Try to remove AB->D: We can do this if G is equivalent to I={AB->C}. However, AB+ wrt G = {A,B,C,D} <> AB+ wrt I = {A,B,C}.
   Therefore, we cannot make any changes to G in this step.
Therefore, G={AB->C,AB->D} is a minimal cover of F.

R={A,B,C,D,E,F}, G={AB->C,B->CD,D->EF,B->F}

ALTERNATIVE I (Synthesis Approach - p 422 in Ramakrishnan)
Place into Minimal Cover (p420 in Ramakrishnan Book):
1) Split the right-hand sides: G1={AB->C,B->C,B->D,D->E,D->F,B->F}
2) Remove extra attributes on the LHS:
   AB->C: Can remove A, as B+ in G1 does contain C.
   Thus we get G2={B->C,B->D,D->E,D->F,B->F}
   We do not need to look at the remaining FDs because each has only one attribute on the LHS.
3) Remove extra FDs from G2:
   B->C: Can't be removed since B+ would then not contain C. No other FD in G2 has C on the RHS.
   B->D: Can't be removed since B+ would then not contain D. No other FD in G2 has D on the RHS.
   D->E: Can't be removed since D+ would then not contain E. No other FD in G2 has E on the RHS.
   D->F: Can't be removed since D+ would then not contain F.
   B->F: Can be removed since B->D, D->F.
   Thus we have the Minimal Cover: G3={B->C,B->D,D->E,D->F}
We now decompose. We get: R1={B,C,D}, R2={D,E,F}

Note that A is not in either scheme. We also need to add another scheme because neither of these contains a candidate key. Since A and B do not appear on any RHS, every candidate key must contain them. AB+={A,B,C,D,E,F}=R. Thus AB is the key. Since no scheme contains AB, we must add one more scheme: R3={A,B}
So we have:
R1={B,C,D}, F1={B->C,B->D}
R2={D,E,F}, F2={D->E,D->F}
R3={A,B}, F3={}
Note that this is dependency preserving, as F1 union F2 = G3.
Is this lossless? (See p 414 in Ramakrishnan)
R1 intersect R2 = {D} and D->R2
(R1 union R2) intersect R3 = {B} and B->{B,C,D,E,F}
Therefore this is lossless.
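The lossless-join argument above applies the standard binary test twice: a split into two schemes is lossless if their common attributes functionally determine one of the two schemes. A minimal sketch, assuming single-letter attributes and the minimal cover G3 from this example; the function names are hypothetical.

```python
def closure(attrs, fds):
    """Attribute closure of attrs under fds, given as (lhs, rhs) string pairs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

def lossless_binary(r1, r2, fds):
    """Binary test: (R1 intersect R2) must determine all of R1 or all of R2."""
    common_closure = closure(set(r1) & set(r2), fds)
    return set(r1) <= common_closure or set(r2) <= common_closure

G3 = [("B", "C"), ("B", "D"), ("D", "E"), ("D", "F")]

# First join R1 with R2, then join the result with R3, as in the text.
print(lossless_binary("BCD", "DEF", G3))    # common attribute D determines R2
print(lossless_binary("BCDEF", "AB", G3))   # common attribute B determines R1 union R2
print(sorted(closure("AB", G3)))            # AB+ covers all of R, so AB is a key
```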

ALTERNATIVE II (Decomposition Approach - p 421 in Ramakrishnan)
Decompose R using BCNF Decomposition (p416 in Ramakrishnan).
From above we know that AB is the key. So we need to look at any FD in G whose LHS is not a superkey. The following have an LHS that is not a superkey: {B->CD,D->EF,B->F}
Decompose using B->CD into:
R1={A,B,E,F}, F1={B->EF}
R2={B,C,D}, F2={B->CD}
R2 is in BCNF, but R1 is not. So we split R1 into:
R6={A,B}, F6={}
R7={B,E,F}, F7={B->EF}
We put G into minimal cover: G3={B->C,B->D,D->E,D->F}

The following functional dependencies are not preserved: {D->E,D->F}
So we create a relation scheme for each of these:
R3={D,E}, F3={D->E}
R4={D,F}, F4={D->F}
Combining R3 and R4 into R5, we get:
R6={A,B}, F6={}
R7={B,E,F}, F7={B->EF}
R2={B,C,D}, F2={B->CD}
R5={D,E,F}, F5={D->EF}
This is dependency preserving and lossless.
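Whether a decomposition preserves a given FD X -> A can be checked without computing the projected dependency sets: start from X and repeatedly add, for each scheme Ri, the attributes of Ri that follow from the part of the current set lying inside Ri. A sketch of that standard test, assuming single-letter attributes and the minimal cover G3; the function names are hypothetical.

```python
def closure(attrs, fds):
    """Attribute closure of attrs under fds, given as (lhs, rhs) string pairs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

def preserved(fd, schemas, fds):
    """True if fd can be enforced using only dependencies local to the schemas."""
    lhs, rhs = fd
    result = set(lhs)
    changed = True
    while changed:
        changed = False
        for schema in schemas:
            s = set(schema)
            # attributes of this scheme implied by what we already have inside it
            gained = (closure(result & s, fds) & s) - result
            if gained:
                result |= gained
                changed = True
    return set(rhs) <= result

G3 = [("B", "C"), ("B", "D"), ("D", "E"), ("D", "F")]

bcnf_only = ["AB", "BEF", "BCD"]          # R6, R7, R2
with_r5 = ["AB", "BEF", "BCD", "DEF"]     # R6, R7, R2, R5

lost = [fd for fd in G3 if not preserved(fd, bcnf_only, G3)]
print(lost)
print(all(preserved(fd, with_r5, G3) for fd in G3))
```

On the BCNF-only decomposition the test reports D->E and D->F as lost, which is what motivates adding R5={D,E,F}.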

Candidate Keys
- A candidate key is an attribute (or set of attributes) that uniquely identifies a row.
- The primary key is a special candidate key: its values cannot be null.
- e.g. ENROLL (Student_ID, Name, Address, …)
  PK = Student_ID
  candidate key = Name, Address

… candidate key
A candidate key must satisfy:
- Unique identification: implies that each nonkey attribute is functionally dependent on the key (for not(A → B) to be true, A must occur more than once (with a different B), or A must map to more than one B in a given row).
- Nonredundancy: no attribute can be deleted from the key without losing uniqueness; the key is a minimal set of columns (Simsion).
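The two conditions translate directly into a brute-force search: a set of attributes identifies rows uniquely when its closure is the whole schema, and it is nonredundant when no proper subset already qualifies. The sketch below assumes single-letter attributes and reuses the dependency set G={AB->C,B->CD,D->EF,B->F} from the synthesis example; the function names are hypothetical, and the search is exponential, so it suits only small schemas.

```python
from itertools import combinations

def closure(attrs, fds):
    """Attribute closure of attrs under fds, given as (lhs, rhs) string pairs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

def candidate_keys(attrs, fds):
    """Enumerate attribute sets by increasing size and keep the minimal superkeys."""
    attrs = set(attrs)
    keys = []
    for size in range(1, len(attrs) + 1):
        for combo in combinations(sorted(attrs), size):
            candidate = set(combo)
            # nonredundancy: skip any superset of a key found earlier
            if any(k <= candidate for k in keys):
                continue
            # unique identification: the closure must cover every attribute
            if closure(candidate, fds) >= attrs:
                keys.append(candidate)
    return keys

G = [("AB", "C"), ("B", "CD"), ("D", "EF"), ("B", "F")]
print(candidate_keys("ABCDEF", G))
```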

keys and dependencies
EMPLOYEE1 (Emp_ID, Name, Dept_Name, Salary)
[Figure: dependency diagram over Emp_ID, Name, Dept_Name, Salary, with the labels "functional dependency" and "determinant"]

EMPLOYEE2 (Emp_ID, Course_Title, Name, Dept_Name, Salary, Date_Completed)
[Figure: dependency diagram over Emp_ID, Course_Title, Name, Dept_Name, Salary, Date_Comp., with the label "not fully functionally dependent on the primary key"]

Trivial Functional Dependency
In general, a functional dependency of the form α → β is trivial if β ⊆ α.
(Example) BC → C is trivial, since {C} ⊆ {B, C}; A → B is not trivial, since {B} is not a subset of {A}.
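The definition is a plain subset test, so it can be checked in one line. A small sketch, assuming single-letter attributes written as strings; the name `is_trivial` is hypothetical.

```python
def is_trivial(lhs, rhs):
    """An FD lhs -> rhs is trivial when every attribute of rhs already occurs in lhs."""
    return set(rhs) <= set(lhs)

print(is_trivial("BC", "C"))  # C is contained in {B, C}
print(is_trivial("A", "B"))   # B is not contained in {A}
```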