SCU, J. Holliday - COEN 178, 4–1. Schedule. Today: Normal Forms (Section 3.6). Next: Relational Algebra (read chapter 5 to page 199). After that: SQL Queries.

Presentation transcript:

Normalization
Goal = BCNF = Boyce-Codd Normal Form = all FD's follow from the fact "key → everything."
Formally, R is in BCNF if, for every nontrivial FD X → A that holds in R, X is a superkey.
- "Nontrivial" = right-side attribute not in left side.
Why?
1. Guarantees no redundancy due to FD's.
2. Guarantees no update anomalies (anomaly = one occurrence of a fact is updated, not all).
3. Guarantees no deletion anomalies (anomaly = a valid fact is lost when a tuple is deleted).

Example of Problems
Drinkers(name, addr, beerLiked, manf, favoriteBeer)
FD's:
1. name → addr
2. name → favoriteBeer
3. beerLiked → manf
Table entries marked ??? are redundant, since we can figure them out from the FD's.
Update anomalies: if Bill transfers to UC Berkeley, will we remember to change addr in each of his tuples?
Deletion anomalies: if nobody likes Bud, we lose track of Bud's manufacturer.

Each of the 3 given FD's is a BCNF violation:
Key = {name, beerLiked}
- Each of the given FD's has a left side that is a proper subset of the key.
Another Example
Beers(name, manf, manfAddr)
FD's: name → manf, manf → manfAddr
Only key is name.
- manf → manfAddr violates BCNF: its left side is not a superkey.
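The BCNF test in the definition can be sketched in a few lines of Python. This is a minimal sketch, not code from the course: the names `closure` and `bcnf_violations` and the representation of FD's as (left side, right side) pairs of sets are assumptions made for illustration. It checks only the FD's that are given, which is enough for the original relation (but not for a relation produced by decomposition, whose FD's must first be projected).

```python
def closure(attrs, fds):
    """Closure of attrs under fds, given as a list of (lhs, rhs) pairs of sets."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            # Once the left side is implied, the right side is implied too.
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


def bcnf_violations(relation, fds):
    """Return the given FD's that are nontrivial and whose left side is not a superkey."""
    return [
        (lhs, rhs)
        for lhs, rhs in fds
        if not rhs <= lhs and not set(relation) <= closure(lhs, fds)
    ]


# Beers(name, manf, manfAddr) with name -> manf and manf -> manfAddr.
beers = {"name", "manf", "manfAddr"}
beers_fds = [({"name"}, {"manf"}), ({"manf"}, {"manfAddr"})]
beers_violations = bcnf_violations(beers, beers_fds)

# Drinkers with its three FD's.
drinkers = {"name", "addr", "beerLiked", "manf", "favoriteBeer"}
drinkers_fds = [
    ({"name"}, {"addr"}),
    ({"name"}, {"favoriteBeer"}),
    ({"beerLiked"}, {"manf"}),
]
drinkers_violations = bcnf_violations(drinkers, drinkers_fds)

print(beers_violations)
print(len(drinkers_violations))
```

For Beers the only FD reported is manf → manfAddr; for Drinkers all three FD's are reported, matching the slides.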

Decomposition to Reach BCNF
Given: relation R and FD's F.
If there is a nontrivial FD X → B in F and X is not a superkey, then R is not in BCNF.
Suppose relation R has BCNF violation X → B. We can decompose R into two or more relations so that each of the relations is in BCNF.

1. Compute X+.
- Cannot be all attributes. Why?
2. Decompose R into X+ and (R − X+) ∪ X.
3. Find the FD's for the decomposed relations.
- Project the FD's from F = calculate all consequents of F that involve only attributes from X+ or only from (R − X+) ∪ X.
[Figure: diagram of R showing X+ and X]

Example 1
R = (A, B, C, D), F = {A → BC, C → D}
The key is A (why?)
The functional dependency C → D violates BCNF (why?)
Decomposition:
1. Compute X+: C+ = CD
2. Decompose R: R1 = X+ and R2 = (R − X+) ∪ X, so R1 = CD and R2 = ABC
3. Find the FD's for the decomposed relations. (why?)

Example 2
R = Drinkers(name, addr, beerLiked, manf, favoriteBeer)
F =
1. name → addr
2. name → favoriteBeer
3. beerLiked → manf
Pick BCNF violation name → addr.
Close left side: name+ = name addr favoriteBeer.
Decomposed relations:
Drinkers1(name, addr, favoriteBeer)
Drinkers2(name, beerLiked, manf)
Projected FD's (skipping a lot of work):
- For Drinkers1: name → addr and name → favoriteBeer.
- For Drinkers2: beerLiked → manf.

(Repeating)
Decomposed relations:
Drinkers1(name, addr, favoriteBeer)
Drinkers2(name, beerLiked, manf)
Projected FD's:
- For Drinkers1: name → addr and name → favoriteBeer.
- For Drinkers2: beerLiked → manf.
BCNF violations?
- For Drinkers1, name is key and all left sides of FD's are superkeys.
- For Drinkers2, {name, beerLiked} is the key, and beerLiked → manf violates BCNF.

Decompose Drinkers2
First set of decomposed relations:
Drinkers1(name, addr, favoriteBeer)
Drinkers2(name, beerLiked, manf)
Close beerLiked: beerLiked+ = beerLiked manf
Decompose Drinkers2 into:
Drinkers3(beerLiked, manf)
Drinkers4(name, beerLiked)
Resulting relations are all in BCNF:
Drinkers1(name, addr, favoriteBeer)
Drinkers3(beerLiked, manf)
Drinkers4(name, beerLiked)
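The repeated decomposition can be sketched as a short recursive procedure. This is an illustrative sketch under stated assumptions, not the course's code: `find_violation` and `bcnf_decompose` are hypothetical names, FD's are (left side, right side) pairs of sets, and instead of projecting FD's explicitly the sketch uses the fact that the closure of X under the projected FD's equals (X+ under F) restricted to the relation. It tries every subset of attributes, so it is exponential and only suitable for small examples. A different choice of violation can give a different (still valid) result.

```python
from itertools import combinations


def closure(attrs, fds):
    """Closure of attrs under fds, given as a list of (lhs, rhs) pairs of sets."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


def find_violation(relation, fds):
    """Return (X, X+ restricted to relation) for some BCNF violation, or None."""
    attrs = sorted(relation)
    for size in range(1, len(attrs)):
        for combo in combinations(attrs, size):
            x = set(combo)
            x_plus = closure(x, fds) & relation
            # X determines something new, yet not the whole relation.
            if x_plus != x and x_plus != relation:
                return x, x_plus
    return None


def bcnf_decompose(relation, fds):
    """Split relation into X+ and (R - X+) | X until no violation remains."""
    relation = frozenset(relation)
    violation = find_violation(relation, fds)
    if violation is None:
        return {relation}
    x, x_plus = violation
    r1 = x_plus
    r2 = (relation - x_plus) | x
    return bcnf_decompose(r1, fds) | bcnf_decompose(r2, fds)


drinkers = {"name", "addr", "beerLiked", "manf", "favoriteBeer"}
drinkers_fds = [
    ({"name"}, {"addr"}),
    ({"name"}, {"favoriteBeer"}),
    ({"beerLiked"}, {"manf"}),
]
drinkers_result = bcnf_decompose(drinkers, drinkers_fds)
for schema in drinkers_result:
    print(sorted(schema))
```

On Drinkers the sketch happens to pick beerLiked → manf first rather than name → addr, but it ends with the same three relations as the slides.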

Why Decompose This Way?
- Eliminate unnecessary redundancy (update and delete anomalies).
- Lossless join decomposition (recover original information with join on equality).
- Dependency preserving (efficient checking of constraints).

Lossless Join Decomposition
If decomposition of a schema to avoid redundant info is not done carefully, we can lose information and generate extra tuples when we try to reconstruct the original table.
Consider the decomposition of
emp-dept (ename, ssn, bdate, address, dnumber, dname, dmgrssn)
into
emp-mgr (ename, ssn, bdate, address, dmgrssn)
dept (dnumber, dname, dmgrssn)
This decomposition solves the redundant info problem. However, there can be problems joining the tables emp-mgr and dept.

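Lossless Join Decomposition

emp-dept
ename  ssn  bdate     address          dnumber  dname  dmgrssn
John   …    …/12/…    … th Street      5        CS     …
Sue    …    …/22/…    … th Street      1        EX     …
Mike   …    …/25/75   23 A Street      3        TS     …
Bob    …    …/21/62   568 Main Street  1        EX     …

emp-mgr
ename  ssn  bdate     address          dmgrssn
John   …    …/12/…    … th Street      …
Sue    …    …/22/…    … th Street      …
Mike   …    …/25/75   23 A Street      …
Bob    …    …/21/62   568 Main Street  …

dept
dnumber  dname  dmgrssn
1        EX     …
3        TS     …
5        CS     …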

If we try to recover the original info by doing a join of emp-mgr and dept, we get:

emp-mgr join dept
ename  ssn  bdate     address          dmgrssn  dname
John   …    …/12/…    … th Street      …        TS
John   …    …/12/…    … th Street      …        CS
Sue    …    …/22/…    … th Street      …        EX
Mike   …    …/25/75   23 A Street      …        TS
Mike   …    …/25/75   23 A Street      …        CS
Bob    …    …/21/62   568 Main Street  …        EX

There are some extra rows here! Another way of looking at it is that we lost some information when we decomposed emp-dept into emp-mgr and dept.
Why did this happen? We have the functional dependency dnumber → dname dmgrssn, but not dmgrssn → dnumber, and we joined on dmgrssn.
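The spurious rows can be reproduced with a small Python sketch. The data here is hypothetical: the ssn and dmgrssn values are invented for illustration, bdate and address are left out, and the only property taken from the slides is that one manager heads both TS and CS, which is what the join result implies.

```python
# Hypothetical rows: (ename, ssn, dnumber, dname, dmgrssn).
# The numeric ssn/dmgrssn values are invented; manager 900 heads both CS and TS.
emp_dept = [
    ("John", 101, 5, "CS", 900),
    ("Sue", 102, 1, "EX", 800),
    ("Mike", 103, 3, "TS", 900),
    ("Bob", 104, 1, "EX", 800),
]

# Project onto the two decomposed schemas.
emp_mgr = {(ename, ssn, mgr) for ename, ssn, _, _, mgr in emp_dept}
dept = {(dnum, dname, mgr) for _, _, dnum, dname, mgr in emp_dept}

# Natural join of emp-mgr and dept on the shared attribute dmgrssn.
joined = {
    (ename, ssn, dnum, dname, mgr)
    for ename, ssn, mgr in emp_mgr
    for dnum, dname, dept_mgr in dept
    if mgr == dept_mgr
}

# Rows that the join produces but that were never in emp-dept.
spurious = joined - set(emp_dept)
print(len(joined))
print(sorted(spurious))
```

The join has six rows instead of four: John is paired with TS and Mike with CS, because dmgrssn does not determine dnumber.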

Lossless Join Decomposition
In a lossless join decomposition of R into R1 and R2, at least one of the following dependencies is in F+:
R1 ∩ R2 → R1 or R1 ∩ R2 → R2
Example: R = (A, B, C), F = {A → B, B → C}
Decomposition is: R1 = (A, B), R2 = (B, C)
This is a lossless join decomposition because R1 ∩ R2 = {B} and B → BC, so R1 ∩ R2 → R2.
What about the decomposition R1 = (A, C), R2 = (B, C)?
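The test for a two-relation decomposition reduces to one closure computation: close R1 ∩ R2 and see whether it covers R1 or R2. A minimal sketch follows; the function names and the set-pair representation of FD's are assumptions made for illustration.

```python
def closure(attrs, fds):
    """Closure of attrs under fds, given as a list of (lhs, rhs) pairs of sets."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


def is_lossless(r1, r2, fds):
    """True if (R1 ∩ R2) -> R1 or (R1 ∩ R2) -> R2 follows from fds."""
    common_plus = closure(set(r1) & set(r2), fds)
    return set(r1) <= common_plus or set(r2) <= common_plus


fds = [({"A"}, {"B"}), ({"B"}, {"C"})]
first = is_lossless({"A", "B"}, {"B", "C"}, fds)
second = is_lossless({"A", "C"}, {"B", "C"}, fds)
print(first, second)
```

For the second decomposition the shared attribute is C, and C+ = C under F, so neither dependency holds.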

Dependency Preserving Decomposition
This property ensures that checking updates for violation of FD's is efficient.
Dependency preservation: let Fi be the set of dependencies in F+ that include only attributes in Ri. The decomposition is dependency preserving if (∪ Fi)+ = F+, that is, for a 2-relation decomposition, (F1 ∪ F2)+ = F+.
Example: R = (A, B, C), F = {A → B, B → C}
R1 = (A, B), R2 = (B, C)
This is dependency preserving.
What about the decomposition R1 = (A, C), R2 = (B, C)?
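Computing F+ directly is expensive, so the check is usually done per FD. The sketch below uses a standard textbook technique that is not stated on the slide: for each FD X → Y in F, grow X by repeatedly closing its overlap with each Ri under F and keeping only attributes of Ri, then test whether Y was reached. The function names are hypothetical.

```python
def closure(attrs, fds):
    """Closure of attrs under fds, given as a list of (lhs, rhs) pairs of sets."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


def preserves_dependencies(decomposition, fds):
    """True if every FD in fds can be derived from the FD's projected onto the Ri."""
    for lhs, rhs in fds:
        result = set(lhs)
        changed = True
        while changed:
            changed = False
            for ri in decomposition:
                ri = set(ri)
                # Attributes of Ri derivable using only what is visible inside Ri.
                gained = closure(result & ri, fds) & ri
                if not gained <= result:
                    result |= gained
                    changed = True
        if not rhs <= result:
            return False
    return True


fds = [({"A"}, {"B"}), ({"B"}, {"C"})]
first = preserves_dependencies([{"A", "B"}, {"B", "C"}], fds)
second = preserves_dependencies([{"A", "C"}, {"B", "C"}], fds)
print(first, second)
```

In the second decomposition A → B cannot be checked: A and B never appear together, and nothing derivable inside (A, C) or (B, C) leads from A to B.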

3NF
One FD structure causes problems:
- If you decompose, you can't check all the FD's only in the decomposed relations.
- If you don't decompose, you violate BCNF.
Structure: AB → C and C → B.
Example 1: title city → theatre and theatre → city.
Example 2: street city → zip, zip → city.
Keys: {A, B} and {A, C}, but C → B has a left side that is not a superkey.
Decompose into BC and AC.
- But you can't check the FD AB → C in only these relations.

"Elegant" Workaround
Define the problem away.
A relation R is in 3NF iff (if and only if) for every nontrivial FD X → A, either:
1. X is a superkey, or
2. A is prime = member of at least one key.
Thus, the canonical problem goes away: you don't have to decompose because all attributes are prime.
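The 3NF definition can be checked mechanically once the keys are known. The following is a rough sketch with hypothetical function names: it finds candidate keys by brute force over attribute subsets (exponential, fine for classroom examples) and checks only the given FD's, which suffices for the original relation.

```python
from itertools import combinations


def closure(attrs, fds):
    """Closure of attrs under fds, given as a list of (lhs, rhs) pairs of sets."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


def candidate_keys(relation, fds):
    """All minimal attribute sets whose closure is the whole relation."""
    relation = set(relation)
    keys = []
    for size in range(1, len(relation) + 1):
        for combo in combinations(sorted(relation), size):
            x = set(combo)
            if any(key <= x for key in keys):
                continue  # contains a smaller key, so it is not minimal
            if relation <= closure(x, fds):
                keys.append(x)
    return keys


def is_3nf(relation, fds):
    """True if each given FD X -> A has X a superkey or A prime."""
    prime = set().union(*candidate_keys(relation, fds))
    for lhs, rhs in fds:
        is_superkey = set(relation) <= closure(lhs, fds)
        for attr in rhs - lhs:
            if not is_superkey and attr not in prime:
                return False
    return True


address = {"street", "city", "zip"}
address_fds = [({"street", "city"}, {"zip"}), ({"zip"}, {"city"})]
address_keys = candidate_keys(address, address_fds)
print(address_keys)
print(is_3nf(address, address_fds))
```

For street-city-zip the keys are {street, city} and {street, zip}, so every attribute is prime and the relation is in 3NF even though zip → city violates BCNF.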

What 3NF Gives You
There are two important properties of a decomposition:
1. We should be able to recover from the decomposed relations the data of the original.
- Recovery involves projection and join.
2. We should be able to check that the FD's for the original relation are satisfied by checking the projections of those FD's in the decomposed relations.
You can always decompose into BCNF and satisfy (1).
We can decompose into 3NF and satisfy both (1) and (2).
But it is not always possible to decompose into BCNF and get both (1) and (2).
- Street-city-zip is an example of this point.

BCNF and 3NF
BCNF: Whenever a nontrivial functional dependency X → A holds in R, then X is a superkey of R.
3NF: Whenever a nontrivial functional dependency X → A holds in R, then X is a superkey of R OR each attribute of A is a member of a candidate key (prime).

Exercise
Consider the schema and 2 sets of FD's F and E:
emp-dept (ename, ssn, bdate, address, dnumber, dname, dmgrssn)
F = { ssn → ename bdate address dnumber, dnumber → dname dmgrssn }
E = { ssn → ename address dnumber, ssn → dname bdate, dnumber → dname dmgrssn }
Are F and E equivalent?
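Two sets of FD's are equivalent when each one implies every FD of the other, and that can be tested with attribute closures. The sketch below is one way to check an answer to the exercise; `covers` and `equivalent` are hypothetical helper names.

```python
def closure(attrs, fds):
    """Closure of attrs under fds, given as a list of (lhs, rhs) pairs of sets."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


def covers(f, g):
    """True if every FD in g follows from f."""
    return all(rhs <= closure(lhs, f) for lhs, rhs in g)


def equivalent(f, g):
    """True if f and g imply each other, that is, have the same closure."""
    return covers(f, g) and covers(g, f)


F = [
    ({"ssn"}, {"ename", "bdate", "address", "dnumber"}),
    ({"dnumber"}, {"dname", "dmgrssn"}),
]
E = [
    ({"ssn"}, {"ename", "address", "dnumber"}),
    ({"ssn"}, {"dname", "bdate"}),
    ({"dnumber"}, {"dname", "dmgrssn"}),
]
print(equivalent(F, E))
```

Working the exercise by hand follows the same steps: compute ssn+ and dnumber+ under each set and compare them with the right sides in the other set.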

3NF Decomposition
Find the canonical cover of F.
A canonical cover of a set of dependencies F has the following properties:
- No functional dependency contains an extraneous attribute, that is, an attribute that can be removed from the dependency without changing the closure of F.
- Each left side of a functional dependency in the cover is unique.
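One way to compute a canonical cover is to merge FD's with the same left side and then remove extraneous attributes one at a time until nothing changes. The sketch below follows that outline; it is an illustrative implementation with hypothetical names, and the result can depend on the order in which attributes are tried.

```python
def closure(attrs, fds):
    """Closure of attrs under fds, given as a list of (lhs, rhs) pairs of sets."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


def canonical_cover(fds):
    """Return {left side: right side} with unique left sides and no extraneous attributes."""
    cover = {}
    for lhs, rhs in fds:
        extra = set(rhs) - set(lhs)  # drop trivial right-side attributes
        if extra:
            cover.setdefault(frozenset(lhs), set()).update(extra)

    def pairs(mapping):
        return [(set(lhs), set(rhs)) for lhs, rhs in mapping.items()]

    def reduce_once():
        for lhs, rhs in cover.items():
            # Left side: attr is extraneous if the rest of lhs already gives rhs.
            for attr in sorted(lhs):
                smaller = lhs - {attr}
                if smaller and rhs <= closure(smaller, pairs(cover)):
                    del cover[lhs]
                    cover.setdefault(smaller, set()).update(rhs)
                    return True
            # Right side: attr is extraneous if lhs still gives it without this FD's help.
            for attr in sorted(rhs):
                trial = dict(cover)
                trial[lhs] = rhs - {attr}
                if attr in closure(lhs, pairs(trial)):
                    cover[lhs] = rhs - {attr}
                    if not cover[lhs]:
                        del cover[lhs]
                    return True
        return False

    while reduce_once():
        pass
    return cover


# A standard textbook-style input: A -> BC, B -> C, A -> B, AB -> C.
fds = [
    ({"A"}, {"B", "C"}),
    ({"B"}, {"C"}),
    ({"A"}, {"B"}),
    ({"A", "B"}, {"C"}),
]
cover = canonical_cover(fds)
print(cover)
```

For this input C is extraneous in A → BC and A is extraneous in AB → C, leaving A → B and B → C.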

3NF Decomposition Algorithm
1. Calculate the canonical cover Fc of F.
2. Set j = 0.
3. For each FD A → B in Fc: if none of the current schemas contains AB, then j = j + 1 and Rj = (A B).
4. If none of the schemas in the result contains a candidate key for the original R, then j = j + 1 and Rj = (any candidate key).
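The algorithm translates almost line for line into Python. This sketch assumes the canonical cover has already been computed and is passed in as a list of (left side, right side) set pairs; the function names are hypothetical. It uses the observation that a schema contains a candidate key exactly when its closure is all of R, and it finds a candidate key by brute force when one must be added.

```python
from itertools import combinations


def closure(attrs, fds):
    """Closure of attrs under fds, given as a list of (lhs, rhs) pairs of sets."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


def first_candidate_key(relation, fds):
    """Smallest attribute set (in sorted order) whose closure is the whole relation."""
    attrs = sorted(relation)
    for size in range(1, len(attrs) + 1):
        for combo in combinations(attrs, size):
            if set(relation) <= closure(combo, fds):
                return set(combo)


def synthesize_3nf(relation, cover):
    """Steps 2 to 4 of the algorithm; cover is assumed to be canonical already."""
    schemas = []
    for lhs, rhs in cover:
        candidate = set(lhs) | set(rhs)
        # Step 3: add (A B) unless an existing schema already contains it.
        if not any(candidate <= schema for schema in schemas):
            schemas.append(candidate)
    # Step 4: make sure some schema contains a candidate key of R.
    if not any(set(relation) <= closure(schema, cover) for schema in schemas):
        schemas.append(first_candidate_key(relation, cover))
    return schemas


relation = {"A", "B", "C", "D", "E"}
cover = [({"A"}, {"B", "C"}), ({"C"}, {"D", "E"}), ({"D", "E"}, {"A"})]
schemas = synthesize_3nf(relation, cover)
print([sorted(schema) for schema in schemas])
```

With R = (A, B, C, D, E) and F = {A → BC, C → DE, DE → A} this yields (A, B, C), (C, D, E), and (A, D, E); no extra key schema is needed because (A, B, C) already contains the key A.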

3NF Example
Example: R = (A, B, C, D, E), F = {A → BC, C → DE, DE → A}
Solution:
Candidate keys are: A, C, and DE.
F is in canonical form.
R1 = (A, B, C), R2 = (C, D, E), R3 = (A, D, E)
No natural join produces spurious tuples, so the decomposition is lossless.
Dependencies are preserved.
All relations are in 3NF.

Example - BCNF
R = (b-name, b-city, assets, c-name, loan#, amount)
F = {b-name → b-city assets, loan# → amount b-name}
Primary key = {loan#, c-name}
** R is not in BCNF: "b-name → b-city assets" is a nontrivial FD that holds on R, and "b-name → R" is not in F+ (that is, b-name is not a superkey).
Split R into R1 and R2:
R1 = (b-name, b-city, assets)
R2 = (b-name, c-name, loan#, amount)

Continued
R1 = (b-name, b-city, assets)
R2 = (b-name, c-name, loan#, amount)
** Now R1 is in BCNF, but R2 is not (why?)
So, we split R2 into R3 and R4:
R3 = (b-name, loan#, amount)
R4 = (c-name, loan#)
The final decomposition is R1, R3, R4.

Exercise
1. Find the keys.
2. How should this be decomposed?
R = (A, B, C, D, E), F = {A → BC, C → DE, DE → A}

Answers
R = (A, B, C, D, E), F = {A → BC, C → DE, DE → A}
Keys are: A, C, and DE.
R is already in BCNF.