Fall 2001Arthur Keller – CS 1804–1 Schedule Today Oct. 4 (TH) Functional Dependencies and Normalization. u Read Sections 3.5-3.6. Project Part 1 due. Oct.

Slides:



Advertisements
Similar presentations
1 Lecture 7 Design Theory for Relational Databases (part 1) Slides based on
Advertisements

Functional Dependencies (FDs)
Lecture 21 CS 157 B Revision of Midterm3 Prof. Sin-Min Lee.
4NF and 5NF Prof. Sin-Min Lee Department of Computer Science.
1 Design Theory. 2 Minimal Sets of Dependancies A set of dependencies is minimal if: 1.Every right side is a single attribute 2.For no X  A in F and.
Chapter 3 Notes. 3.1 Functional Dependencies A functional dependency is a statement that – two tuples of a relation that agree on some particular set.
Functional Dependencies - Example
Functional Dependencies, Normalization Rose-Hulman Institute of Technology Curt Clifton.
CMPT 354, Simon Fraser University, Fall 2008, Martin Ester 227 Database Systems I Design Theory for Relational Databases.
Functional Dependencies
Functional Dependencies. Babies At a birth, there is one baby (twins would be represented by two births), one mother, any number of nurses, and a doctor.
Classroom Exercise: Normalization
Relational Design. DatabaseDesign Process Conceptual Modeling -- ER diagrams ER schema transformed to relational schema Designer may add additional integrity.
1 Normalization Anomalies Boyce-Codd Normal Form 3 rd Normal Form Source: Slides by Jeffrey Ullman.
Functional Dependencies Definition: If two tuples agree on the attributes A, A, … A 12n then they must also agree on the attributes B, B, … B 12m Formally:
1 Functional Dependencies Meaning of FD’s Keys and Superkeys Inferring FD’s.
The principal problem that we encounter is redundancy, where a fact is repeated in more than one tuple. Most common cause: attempts to group into one relation.
1 CMSC424, Spring 2005 CMSC424: Database Design Lecture 9.
Winter 2002Arthur Keller – CS 1804–1 Schedule Today: Jan. 15 (T) u Normal Forms, Multivalued Dependencies. u Read Sections Assignment 1 due. Jan.
Decomposition By Yuhung Chen CS157A Section 2 October
1 Normalization Anomalies Boyce-Codd Normal Form 3 rd Normal Form.
M.P. Johnson, DBMS, Stern/NYU, Spring C : Database Management Systems Lecture #7 Matthew P. Johnson Stern School of Business, NYU Spring,
1 Normalization Anomalies Boyce-Codd Normal Form 3 rd Normal Form Source: Slides by Jeffrey Ullman.
Schema Refinement and Normalization Nobody realizes that some people expend tremendous energy merely to be normal. Albert Camus.
Cs3431 Normalization Part II. cs3431 Attribute Closure : Example Consider R (A, B, C, D, E) with FDs A  B, B  C, CD  E Does A  E hold ? (Is A  E.
Winter 2002Arthur Keller – CS 1803–1 Schedule Today: Jan. 10 (TH) u Relational Model, Functional Dependencies. u Read Sections Jan. 15 (T) u Normal.
1 Normalization Anomalies Boyce-Codd Normal Form 3 rd Normal Form.
1 Design Theory for Relational Databases Functional Dependencies Decompositions Normal Forms.
Decompositions uDo we need to decompose a relation? wSeveral normal forms for relations. If schema in these normal forms certain problems don’t.
Database Management Systems Chapter 3 The Relational Data Model (III) Instructor: Li Ma Department of Computer Science Texas Southern University, Houston.
Databases 1 Seventh lecture. Topics of the lecture Extended relational algebra Normalization Normal forms 2.
1 Normalization Anomalies Boyce-Codd Normal Form 3 rd Normal Form.
Normalization Goal = BCNF = Boyce-Codd Normal Form = all FD’s follow from the fact “key  everything.” Formally, R is in BCNF if for every nontrivial FD.
Normal Forms1. 2 The Problems of Redundancy Redundancy is at the root of several problems associated with relational schemas: Wastes storage Causes problems.
CS143 Review: Normalization Theory Q: Is it a good table design? We can start with an ER diagram or with a large relation that contain a sample of the.
SCUJ. Holliday - coen 1784–1 Schedule Today: u Normal Forms. u Section 3.6. Next u Relational Algebra. Read chapter 5 to page 199 After that u SQL Queries.
BCNF & Lossless Decomposition Prof. Sin-Min Lee Department of Computer Science.
Functional Dependencies. FarkasCSCE 5202 Reading and Exercises Database Systems- The Complete Book: Chapter 3.1, 3.2, 3.3., 3.4 Following lecture slides.
IST 210 Normalization 2 Todd Bacastow IST 210. Normalization Methods Inspection Closure Functional dependencies are key.
© D. Wong Ch. 3 (continued)  Database design problems  Functional Dependency  Keys of relations  Decompositions based on Functional Dependency.
Third Normal Form (3NF) Zaki Malik October 23, 2008.
Santa Clara UniversityHolliday – COEN 1783–1 Today’s Topic Today: u Relational Model, Functional Dependencies. u Read Sections Next time u Normal.
1 Multivalued Dependencies Fourth Normal Form Reasoning About FD’s + MVD’s.
Design Theory for RDB Normal Forms. Lu Chaojun, SJTU 2 Redundant because these info may be figured out by using FD s1  … What’s Bad Design? Redundancy.
3 Spring Chapter Normalization of Database Tables.
Functional Dependencies CIS 4301 Lecture Notes Lecture 8 - 2/7/2006.
Multivalued Dependencies and 4th NF CIS 4301 Lecture Notes Lecture /21/2006.
Rensselaer Polytechnic Institute CSCI-4380 – Database Systems David Goldschmidt, Ph.D.
Databases 1 Sixth lecture. 2 Functional Dependencies X -> A is an assertion about a relation R that whenever two tuples of R agree on all the attributes.
© D. Wong Functional Dependencies (FD)  Given: relation schema R(A1, …, An), and X and Y be subsets of (A1, … An). FD : X  Y means X functionally.
Chapter 8 Relational Database Design. 2 Relational Database Design: Goals n Reduce data redundancy (undesirable replication of data values) n Minimize.
Normalization and FUNctional Dependencies. Redundancy: root of several problems with relational schemas: –redundant storage, insert/delete/update anomalies.
Design Theory for Relational Databases Functional Dependencies Decompositions Normal Forms: BCNF, Third Normal Form Introduction to Multivalued Dependencies.
1 Lecture 8 Design Theory for Relational Databases (part 2) Slides from
4NF & MULTIVALUED DEPENDENCY By Kristina Miguel. Review  Superkey – a set of attributes which will uniquely identify each tuple in a relation  Candidate.
1 Database Design: DBS CB, 2 nd Edition Physical RDBMS Model: Schema Design and Normalization Ch. 3.
Design Theory for Relational Databases
Schedule Today: Next After that Normal Forms. Section 3.6.
Cushing, Keller, UlmanJudy Cushing
CS422 Principles of Database Systems Normalization
CPSC-310 Database Systems
Schedule Today: Jan. 23 (wed) Week of Jan 28
3.1 Functional Dependencies
BCNF and Normalization
Multivalued Dependencies & Fourth Normal Form (4NF)
Functional Dependencies
Functional Dependencies
Multivalued Dependencies
Anomalies Boyce-Codd Normal Form 3rd Normal Form
CS4222 Principles of Database System
Presentation transcript:

Fall 2001Arthur Keller – CS 1804–1 Schedule Today Oct. 4 (TH) Functional Dependencies and Normalization. u Read Sections Project Part 1 due. Oct. 9 (T) Multivalued Dependencies, Relational Algebra u Read Sections 3.7, Assignment 2 due. Oct. 11 (TH) SQL Queries. u Read Sections Project Part 2 due.

Fall 2001Arthur Keller – CS 1804–2 Significance of functional dependencies (FD's)? Consider R(A1,…, An) and X is a key then X  Y for any attributes Y in A1,…, An even if they overlap with X. Why? Suppose R is used to represent a many  one relationship: E1 entity set  E2 entity set where X key for E1, Y key for E2, Then, X  Y holds, And Y  X does not hold unless the relationship is 1-1. What about many-many relationships?

Fall 2001Arthur Keller – CS 1804–3 Inferring FD’s And this is important because … When we talk about improving relational designs, we often need to ask “does this FD hold in this relation?” Given FD’s X1  A1, X2  A2,…, Xn  An, does FD Y  B necessarily hold in the same relation? Start by assuming two tuples agree in Y. Use given FD’s to infer other attributes on which they must agree. If B is among them, then yes, else no.

Fall 2001Arthur Keller – CS 1804–4 Algorithm Define Y + = closure of Y = set of attributes functionally determined by Y: Basis: Y + :=Y. Induction: If X  Y +, and X  A is a given FD, then add A to Y +. End when Y + cannot be changed.

Fall 2001Arthur Keller – CS 1804–5 Example A  B, BC  D. A + = AB. C + =C. (AC) + = ABCD.

Fall 2001Arthur Keller – CS 1804–6 Given Versus Implied FD’s Typically, we state a few FD’s that are known to hold for a relation R. Other FD’s may follow logically from the given FD’s; these are implied FD’s. We are free to choose any basis for the FD’s of R – a set of FD’s that imply all the FD’s that hold for R.

Fall 2001Arthur Keller – CS 1804–7 Finding All Implied FD’s Motivation: Suppose we have a relation ABCD with some FD’s F. If we decide to decompose ABCD into ABC and AD, what are the FD’s for ABC, AD? Example: F = AB  C, C  D, D  A. It looks like just AB  C holds in ABC, but in fact C  A follows from F and applies to relation ABC. Problem is exponential in worst case.

Fall 2001Arthur Keller – CS 1804–8 Algorithm For each set of attributes X compute X +. Add X  A for each A in X + –X. Ignore or drop some “obvious” dependencies that follow from others: 1. Trivial FD’s: right side is a subset of left side. u Consequence: no point in computing  + or closure of full set of attributes. 2. Drop XY  A if X  A holds. u Consequence: If X + is all attributes, then there is no point in computing closure of supersets of X. 3. Ignore FD’s whose right sides are not single attributes. Notice that after we project the discovered FD’s onto some relation, the FD’s eliminated by rules 1, 2, and 3 can be inferred in the projected relation.

Fall 2001Arthur Keller – CS 1804–9 Example F = AB  C, C  D, D  A. What FD’s follow? A + = A; B + =B (nothing). C + =ACD (add C  A). D + =AD (nothing new). (AB) + =ABCD (add AB  D; skip all supersets of AB). (BC) + =ABCD (nothing new; skip all supersets of BC). (BD) + =ABCD (add BD  C; skip all supersets of BD). (AC) + =ACD; (AD) + =AD; (CD) + =ACD (nothing new). (ACD) + =ACD (nothing new). All other sets contain AB, BC, or BD, so skip. Thus, the only interesting FD’s that follow from F are: C  A, AB  D, BD  C.

Fall 2001Arthur Keller – CS 1804–10 Example 2 Set of FD’s in ABCGHI: A  B A  C CG  H CG  I B  H Compute (CG) +, (BG) +, (AG) +

Fall 2001Arthur Keller – CS 1804–11 Normalization Goal = BCNF = Boyce-Codd Normal Form = all FD’s follow from the fact “key  everything.” Formally, R is in BCNF if for every nontrivial FD for R, say X  A, then X is a superkey. u “Nontrivial” = right-side attribute not in left side. Why? 1. Guarantees no redundancy due to FD’s. 2. Guarantees no update anomalies = one occurrence of a fact is updated, not all. 3. Guarantees no deletion anomalies = valid fact is lost when tuple is deleted.

Fall 2001Arthur Keller – CS 1804–12 Example of Problems Drinkers(name, addr, beersLiked, manf, favoriteBeer) FD’s: 1. name  addr 2. name  favoriteBeer 3. beersLiked  manf ???’s are redundant, since we can figure them out from the FD’s. Update anomalies: If Janeway gets transferred to the Intrepid, will we change addr in each of her tuples? Deletion anomalies: If nobody likes Bud, we lose track of Bud’s manufacturer.

Fall 2001Arthur Keller – CS 1804–13 Each of the given FD’s is a BCNF violation: Key = { name, beersLiked } u Each of the given FD’s has a left side that is a proper subset of the key. Another Example Beers(name, manf, manfAddr). FD’s = name  manf, manf  manfAddr. Only key is name.  Manf  manfAddr violates BCNF with a left side unrelated to any key.

Fall 2001Arthur Keller – CS 1804–14 Decomposition to Reach BCNF Setting: relation R, given FD’s F. Suppose relation R has BCNF violation X  B. We need only look among FD’s of F for a BCNF violation. Proof: If Y  A is a BCNF violation and follows from F, then the computation of Y + used at least one FD X  B from F. u X must be a subset of Y. u Thus, if Y is not a superkey, X cannot be a superkey either, and X  B is also a BCNF violation.

Fall 2001Arthur Keller – CS 1804–15 1. Compute X +. u Cannot be all attributes – why? 2. Decompose R into X + and (R–X + )  X. 3. Find the FD’s for the decomposed relations. u Project the FD’s from F = calculate all consequents of F that involve only attributes from X + or only from (R  X + )  X.

Fall 2001Arthur Keller – CS 1804–16 Example R = Drinkers(name, addr, beersLiked, manf, favoriteBeer) F = 1. name  addr 2. name  favoriteBeer 3. beersLiked  manf Pick BCNF violation name  addr. Close the left side: name + = name addr favoriteBeer. Decomposed relations: Drinkers1(name, addr, favoriteBeer) Drinkers2(name, beersLiked, manf) Projected FD’s (skipping a lot of work that leads nowhere interesting):  For Drinkers1 : name  addr and name  favoriteBeer.  For Drinkers2 : beersLiked  manf.

Fall 2001Arthur Keller – CS 1804–17 Decomposed relations: Drinkers1(name, addr, favoriteBeer) Drinkers2(name, beersLiked, manf) Projected FD’s (skipping a lot of work that leads nowhere interesting):  For Drinkers1 : name  addr and name  favoriteBeer.  For Drinkers2 : beersLiked  manf. BCNF violations?  For Drinkers1, name is key and all left sides of FD’s are superkeys.  For Drinkers2, { name, beersLiked } is the key, and beersLiked  manf violates BCNF.

Fall 2001Arthur Keller – CS 1804–18 Decompose Drinkers2 Decomposed relations: Drinkers1(name, addr, favoriteBeer) Drinkers2(name, beersLiked, manf) Close beersLiked + = beersLiked, manf. Decompose: Drinkers3(beersLiked, manf) Drinkers4(name, beersLiked) Resulting relations are all in BCNF: Drinkers1(name, addr, favoriteBeer) Drinkers3(beersLiked, manf) Drinkers4(name, beersLiked)

Fall 2001Arthur Keller – CS 1804–19 3NF One FD structure causes problems: If you decompose, you can’t check all the FD’s only in the decomposed relations. If you don’t decompose, you violate BCNF. Abstractly: AB  C and C  B. Example 1: title city  theatre and theatre  city. Example 2: street city  zip, zip  city. Keys: {A, B} and {A, C}, but C  B has a left side that is not a superkey. Suggests decomposition into BC and AC. u But you can’t check the FD AB  C in only these relations.

Fall 2001Arthur Keller – CS 1804–20 Example A = street, B = city, C = zip. Join: zip  city street city  zip

Fall 2001Arthur Keller – CS 1804–21 “Elegant” Workaround Define the problem away. A relation R is in 3NF iff (if and only if) for every nontrivial FD X  A, either: 1. X is a superkey, or 2. A is prime = member of at least one key. Thus, the canonical problem goes away: you don’t have to decompose because all attributes are prime.

Fall 2001Arthur Keller – CS 1804–22 What 3NF Gives You There are two important properties of a decomposition: 1.We should be able to recover from the decomposed relations the data of the original. u Recovery involves projection and join, which we shall defer until we’ve discussed relational algebra. 2.We should be able to check that the FD’s for the original relation are satisfied by checking the projections of those FD’s in the decomposed relations. Without proof, we assert that it is always possible to decompose into BCNF and satisfy (1). Also without proof, we can decompose into 3NF and satisfy both (1) and (2). But it is not possible to decompose into BNCF and get both (1) and (2). u Street-city-zip is an example of this point.