1 Lecture 7: Schema refinement: Normalisation www.cl.cam.ac.uk/Teaching/current/Databases/

1 Lecture 7: Schema refinement: Normalisation

2 Decomposing relations
In the previous lecture, we saw that we could ‘decompose’ the bad relation schema Data(sid,sname,address,cid,cname,grade) into a ‘better’ set of relation schemas:
–Student(sid,sname,address)
–Course(cid,cname)
–Enrolled(sid,cid,grade)

3 Are all decompositions good?
Consider our motivating example: Data(sid,sname,address,cid,cname,grade). Alternatively, we could decompose it into
–R1(sid,sname,address)
–R2(cid,cname,grade)
But this decomposition loses information about the relationship between students and courses: R1 and R2 share no attribute, so nothing records which student took which course.

4 Decomposition
A decomposition of a relation R = R(A1:τ1, …, An:τn) is a collection of relations {R1, …, Rk} and a set of queries … if … then … such that … (the formulas on this slide are not preserved in the transcript).
This is Tim’s somewhat non-standard definition.

5 Special case: Lossless-join decomposition
{R1, …, Rk} is a lossless-join decomposition of R with respect to an FD set F if, for every relation instance r of R that satisfies F,
π_R1(r) ⋈ … ⋈ π_Rk(r) = r
(here π_Ri(r) means: project r onto the attributes of Ri’s schema).

6 Lossless-join: Example 2
Lossless-join? [Tables with attribute headers ABC, BC and AB; the table contents are not preserved in the transcript.]

7 Lossless-join: Example
sid  sname  address  cid  cname      grade
124  Julia  USA      206  Database   A++
204  Kim    Essex    202  Semantics  C
124  Julia  USA      201  S/Eng I    A+
206  Tim    London   206  Database   B-
124  Julia  USA      202  Semantics  B+
What happens if we decompose on (sid,sname,address) and (cid,cname,grade)?
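The question can be explored with a minimal Python sketch, assuming each row is stored as a set of (attribute, value) pairs. The helper names project, natural_join and is_lossless_on are illustrative and not from the lecture. Note that it tests only the single instance in the table, whereas the definition of lossless-join quantifies over every instance that satisfies F.

```python
ATTRS = ("sid", "sname", "address", "cid", "cname", "grade")
DATA = [
    (124, "Julia", "USA", 206, "Database", "A++"),
    (204, "Kim", "Essex", 202, "Semantics", "C"),
    (124, "Julia", "USA", 201, "S/Eng I", "A+"),
    (206, "Tim", "London", 206, "Database", "B-"),
    (124, "Julia", "USA", 202, "Semantics", "B+"),
]
# Each row is a frozenset of (attribute, value) pairs; a relation is a set of rows.
r = {frozenset(zip(ATTRS, row)) for row in DATA}


def project(rows, attrs):
    """Projection onto attrs; duplicates vanish because the result is a set."""
    return {frozenset((a, v) for a, v in row if a in attrs) for row in rows}


def natural_join(left, right):
    """Natural join: combine rows that agree on all shared attributes."""
    joined = set()
    for l_row in left:
        for r_row in right:
            l_dict, r_dict = dict(l_row), dict(r_row)
            shared = l_dict.keys() & r_dict.keys()
            if all(l_dict[a] == r_dict[a] for a in shared):
                joined.add(frozenset({**l_dict, **r_dict}.items()))
    return joined


def is_lossless_on(rows, schemas):
    """True if joining the projections of rows onto schemas gives back rows."""
    joined = project(rows, schemas[0])
    for schema in schemas[1:]:
        joined = natural_join(joined, project(rows, schema))
    return joined == rows


bad = [("sid", "sname", "address"), ("cid", "cname", "grade")]
good = [("sid", "sname", "address"), ("cid", "cname"), ("sid", "cid", "grade")]

# Tuples that appear in the join but were never in the original relation.
spurious = natural_join(project(r, bad[0]), project(r, bad[1])) - r
print(is_lossless_on(r, bad), len(spurious))  # False 10
print(is_lossless_on(r, good))                # True
```

Because the two schemas of the bad decomposition share no attribute, the join degenerates to a cross product of 3 student rows and 5 course rows, which is where the 10 spurious tuples come from.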

8 Dependency preservation
Intuition: If R is decomposed into R1, R2 and R3, say, and we enforce the FDs that hold individually on R1, on R2 and on R3, then all FDs that were given to hold on R must also hold.
Reason: Otherwise, checking updates for violation of FDs may require computing joins ☹

9 Dependency preservation
The projection of an FD set F onto a set of attributes Z, written F_Z, is defined as
F_Z = {X → Y | X → Y ∈ F⁺ and X ∪ Y ⊆ Z}
A decomposition {R1, …, Rk} is dependency preserving if
F⁺ = (F_R1 ∪ … ∪ F_Rk)⁺
GOAL OF SCHEMA REFINEMENT: REDUCE REDUNDANCY WHILE PRESERVING DEPENDENCIES IN A LOSSLESS-JOIN MANNER.
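The two definitions can be turned into a small Python sketch. It is a direct, exponential-time reading of the definitions rather than an efficient algorithm, and the toy relation R(A,B,C) with FDs A → B and B → C is a hypothetical example, not taken from the lecture. The names closure, project_fds and is_dependency_preserving are likewise illustrative.

```python
from itertools import combinations


def closure(attrs, fds):
    """Attribute closure of attrs under fds, a list of (lhs, rhs) frozensets."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return frozenset(result)


def project_fds(fds, z):
    """A cover of F_Z: for each non-empty X inside Z, X -> (closure(X) in Z) minus X."""
    z = frozenset(z)
    projected = []
    for size in range(1, len(z) + 1):
        for combo in combinations(sorted(z), size):
            lhs = frozenset(combo)
            rhs = (closure(lhs, fds) & z) - lhs
            if rhs:  # skip purely trivial dependencies
                projected.append((lhs, rhs))
    return projected


def is_dependency_preserving(fds, schemas):
    """Every FD in F must follow from the union of the projected FD sets.

    The union is always implied by F, so this one-directional check is
    enough to establish equality of the two closures.
    """
    union = [fd for schema in schemas for fd in project_fds(fds, schema)]
    return all(rhs <= closure(lhs, union) for lhs, rhs in fds)


def fd(lhs, rhs):
    return frozenset(lhs), frozenset(rhs)


F = [fd("A", "B"), fd("B", "C")]  # hypothetical FDs over R(A, B, C)
print(is_dependency_preserving(F, ["AB", "BC"]))  # True
print(is_dependency_preserving(F, ["AB", "AC"]))  # False: B -> C is lost
```

In the second decomposition the projections yield only A → B and A → C, and B → C cannot be derived from those, so enforcing it would need a join.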

10 Dependency preservation: example
Take R=R(city, street&no, zipcode) with FDs:
–city,street&no → zipcode
–zipcode → city
Decompose to
–R1(street&no,zipcode)
–R2(city,zipcode)
Claim: This is a lossless-join decomposition.
Is it dependency preserving?
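Both questions on this slide can be checked mechanically. The sketch below is one possible way to do it and makes two assumptions that are not stated on the slide: the lossless-join claim is tested with the standard textbook criterion for two-way decompositions (the shared attributes must functionally determine all of R1 or all of R2), and preservation is tested one FD at a time by growing the left-hand side using only closures restricted to a single schema. The function names are illustrative.

```python
def closure(attrs, fds):
    """Attribute closure of attrs under fds, a list of (lhs, rhs) frozensets."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return frozenset(result)


def is_lossless_binary(r1, r2, fds):
    """Two-way test: the common attributes determine all of r1 or all of r2."""
    r1, r2 = frozenset(r1), frozenset(r2)
    determined = closure(r1 & r2, fds)
    return r1 <= determined or r2 <= determined


def is_preserved(fd, schemas, fds):
    """Can fd be derived using only FDs that live inside a single schema?"""
    lhs, rhs = fd
    result = set(lhs)
    changed = True
    while changed:
        changed = False
        for schema in schemas:
            schema = frozenset(schema)
            gained = (closure(result & schema, fds) & schema) - result
            if gained:
                result |= gained
                changed = True
    return rhs <= result


F = [
    (frozenset({"city", "street&no"}), frozenset({"zipcode"})),
    (frozenset({"zipcode"}), frozenset({"city"})),
]
R1 = {"street&no", "zipcode"}
R2 = {"city", "zipcode"}

print(is_lossless_binary(R1, R2, F))                # True
print([is_preserved(fd, [R1, R2], F) for fd in F])  # [False, True]
```

Under these assumptions the claim holds, since zipcode is the only shared attribute and zipcode → city covers R2. The decomposition is not dependency preserving, because city,street&no → zipcode spans both schemas and cannot be checked on either one alone.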

11 Boyce-Codd normal form
“Represent Every Fact Only ONCE”
A relation R with FDs F is said to be in Boyce-Codd normal form (BCNF) if, for all X → A in F⁺,
–either A ∈ X (‘trivial dependency’), or
–X is a superkey for R
Intuition: A relation R is in BCNF if the left side of every non-trivial FD contains a key.

12 BCNF: Example
Consider R=R(city, street&no, zipcode) with FDs:
–city,street&no → zipcode
–zipcode → city
This is not in BCNF, because zipcode → city is non-trivial and zipcode is not a superkey for R.
–We potentially duplicate information relating zipcodes and cities ☹

13 BCNF: Example
BankerSchema(brname,cname,bname) with FDs:
–bname → brname
–brname,cname → bname
Not in BCNF (Why?)
We might decompose to
–BBSchema(bname,brname)
–CBrSchema(cname,bname)
This is in BCNF, BUT it is not dependency-preserving ☹
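The two BCNF examples can be checked with a short Python sketch. It relies on an assumption worth stating: for the relation over which F is given, testing the FDs listed in F is enough and F⁺ need not be enumerated. That shortcut does not carry over to the sub-relations of a decomposition, where the projected FDs would be needed. The name bcnf_violations is illustrative.

```python
def closure(attrs, fds):
    """Attribute closure of attrs under fds, a list of (lhs, rhs) frozensets."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return frozenset(result)


def bcnf_violations(attrs, fds):
    """Non-trivial FDs of the given set whose left side is not a superkey."""
    attrs = frozenset(attrs)
    return [
        (lhs, rhs)
        for lhs, rhs in fds
        if not rhs <= lhs and not attrs <= closure(lhs, fds)
    ]


def fd(lhs, rhs):
    return frozenset(lhs), frozenset(rhs)


address_fds = [
    fd(["city", "street&no"], ["zipcode"]),
    fd(["zipcode"], ["city"]),
]
banker_fds = [
    fd(["bname"], ["brname"]),
    fd(["brname", "cname"], ["bname"]),
]

address_bad = bcnf_violations(["city", "street&no", "zipcode"], address_fds)
banker_bad = bcnf_violations(["brname", "cname", "bname"], banker_fds)
print(address_bad)  # only zipcode -> city
print(banker_bad)   # only bname -> brname
```

For BankerSchema the offending dependency is bname → brname: the closure of {bname} is {bname, brname}, which misses cname, so bname is not a superkey.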

14 Third normal form
A relation R with FDs F is said to be in third normal form (3NF) if, for all X → A in F⁺,
–either A ∈ X (‘trivial dependency’), or
–X is a superkey for R, or
–A is a member of some candidate key for R (that is, A is a prime attribute: one which appears in a candidate key)
Notice that 3NF is strictly weaker than BCNF.
It is always possible to find a dependency-preserving lossless-join decomposition that is in 3NF.

15 3NF: Example
Recall R=R(city, street&no, zipcode) with FDs:
–city,street&no → zipcode
–zipcode → city
We saw earlier that this is not in BCNF. However, it is in 3NF, because city is a member of a candidate key ({city,street&no}).
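A rough Python sketch of this check follows. It finds candidate keys by brute force over attribute subsets, which is only sensible for tiny schemas, and, as an assumption, tests just the FDs listed in F rather than all of F⁺. The names candidate_keys and threenf_violations are illustrative.

```python
from itertools import combinations


def closure(attrs, fds):
    """Attribute closure of attrs under fds, a list of (lhs, rhs) frozensets."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return frozenset(result)


def candidate_keys(attrs, fds):
    """Minimal attribute sets whose closure is the whole schema (brute force)."""
    attrs = frozenset(attrs)
    keys = []
    for size in range(1, len(attrs) + 1):
        for combo in combinations(sorted(attrs), size):
            candidate = frozenset(combo)
            if any(key <= candidate for key in keys):
                continue  # contains a smaller key, so it is not minimal
            if attrs <= closure(candidate, fds):
                keys.append(candidate)
    return keys


def threenf_violations(attrs, fds):
    """Pairs (X, A) where X -> A is non-trivial, X is no superkey, A is non-prime."""
    attrs = frozenset(attrs)
    prime = frozenset().union(*candidate_keys(attrs, fds))
    bad = []
    for lhs, rhs in fds:
        if attrs <= closure(lhs, fds):
            continue  # left side is a superkey
        for a in sorted(rhs - lhs):
            if a not in prime:
                bad.append((lhs, a))
    return bad


R = ["city", "street&no", "zipcode"]
F = [
    (frozenset({"city", "street&no"}), frozenset({"zipcode"})),
    (frozenset({"zipcode"}), frozenset({"city"})),
]
keys = candidate_keys(R, F)
print(keys)                      # {city, street&no} and {street&no, zipcode}
print(threenf_violations(R, F))  # []
```

The search also turns up a second candidate key, {street&no, zipcode}, so every attribute of R is prime and no FD can violate 3NF.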

16 Prehistory: First normal form
First normal form (1NF) is now considered part of the formal definition of the relational model. It states that the domain of all attributes must be atomic (indivisible), and that the value of any attribute in a tuple must be a single value from the domain.
NOTE: Modern databases have moved away from this restriction.

17 Prehistory: Second normal form
A partial functional dependency X → Y is an FD where, for some attribute A ∈ X, (X − {A}) → Y also holds.
A relation schema R is in second normal form (2NF) if no non-prime attribute A in R is partially dependent on any key of R.
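As an illustration, the sketch below looks for partial dependencies in the Data relation from the start of the lecture. The FDs used (sid → sname,address; cid → cname; sid,cid → grade) and the choice of {sid, cid} as the only key are assumptions made for this example, since they are not listed in this lecture. The name partial_dependencies is illustrative.

```python
def closure(attrs, fds):
    """Attribute closure of attrs under fds, a list of (lhs, rhs) frozensets."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return frozenset(result)


def partial_dependencies(key, non_prime, fds):
    """Non-prime attributes already determined by the key minus one attribute."""
    found = []
    for dropped in sorted(key):
        reduced = frozenset(key) - {dropped}
        determined = closure(reduced, fds)
        for attr in sorted(non_prime):
            if attr in determined:
                found.append((reduced, attr))
    return found


def fd(lhs, rhs):
    return frozenset(lhs), frozenset(rhs)


# Assumed FDs for Data(sid, sname, address, cid, cname, grade).
F = [
    fd(["sid"], ["sname", "address"]),
    fd(["cid"], ["cname"]),
    fd(["sid", "cid"], ["grade"]),
]
key = {"sid", "cid"}                                 # assumed to be the only key
non_prime = {"sname", "address", "cname", "grade"}

violations = partial_dependencies(key, non_prime, F)
print(violations)  # sid -> address, sid -> sname, cid -> cname
```

Under these assumptions only grade depends on the whole key, so Data is not in 2NF. Dropping one attribute at a time is sufficient because closures only grow as attributes are added.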

18 Summary: Normal forms
Each normal form is contained in the previous one: 1NF ⊇ 2NF ⊇ 3NF ⊇ BCNF

19 Not the end of problems…
Course     Teacher  Book
Databases  gmb      Date
Databases  gmb      Elmasri
Databases  jkmm     Date
Databases  jkmm     Elmasri
OSF        gmb      Silberschatz
OSF        tlh      Silberschatz
ONLY TRIVIAL FDs!! (see Date) So it is in BCNF! Yet there are obvious insertion anomalies…

20 Decomposition
Even though it’s in BCNF, we’d prefer to decompose it to the schema
–Teaches(Course,Teacher)
–Books(Course,Title)
We need to extend our underlying theory to capture this form of redundancy.
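A small Python sketch can confirm that, for the Course/Teacher/Book instance of the previous slide, this decomposition loses nothing even though no non-trivial FD justifies it. The row encoding and helper names are illustrative assumptions. The attribute is called Book here, following the table header, and plays the role of Title in Books(Course,Title).

```python
ROWS = [
    ("Databases", "gmb", "Date"),
    ("Databases", "gmb", "Elmasri"),
    ("Databases", "jkmm", "Date"),
    ("Databases", "jkmm", "Elmasri"),
    ("OSF", "gmb", "Silberschatz"),
    ("OSF", "tlh", "Silberschatz"),
]
ATTRS = ("Course", "Teacher", "Book")
# Each row is a frozenset of (attribute, value) pairs; a relation is a set of rows.
r = {frozenset(zip(ATTRS, row)) for row in ROWS}


def project(rows, attrs):
    """Projection onto attrs; duplicates vanish because the result is a set."""
    return {frozenset((a, v) for a, v in row if a in attrs) for row in rows}


def natural_join(left, right):
    """Natural join: combine rows that agree on all shared attributes."""
    joined = set()
    for l_row in left:
        for r_row in right:
            l_dict, r_dict = dict(l_row), dict(r_row)
            shared = l_dict.keys() & r_dict.keys()
            if all(l_dict[a] == r_dict[a] for a in shared):
                joined.add(frozenset({**l_dict, **r_dict}.items()))
    return joined


teaches = project(r, ("Course", "Teacher"))
books = project(r, ("Course", "Book"))

print(len(teaches), len(books))          # 4 3
print(natural_join(teaches, books) == r)  # True
```

Every teacher of a course is paired with every book of that course, so the original table is exactly the join of the two projections. That pattern is what a multi-valued dependency describes.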

21 Further normal forms
We can generalise the notion of FD to a ‘multi-valued dependency’, and define two further normal forms (4NF and 5NF). These are detailed in the textbooks.
In practice, BCNF (preferably) and 3NF (at the very least) are good enough.

22 Design goals: Summary
Our goal for relational database design is
–BCNF
–Lossless-join decomposition
–Dependency preservation
If we can’t achieve this, we accept
–Lack of dependency preservation, or
–3NF

23 Summary
You should now understand:
–Decomposition of relations
–Lossless-join decompositions
–Dependency preserving decompositions
–BCNF and 3NF
–2NF and 1NF
Next lecture: More algebra, more SQL