CSE 480: Database Systems Lecture 18: Normal Forms and Normalization

Presentation transcript:

1 CSE 480: Database Systems Lecture 18: Normal Forms and Normalization

2 Functional Dependencies
• A functional dependency (FD) has the form X → Y, where X and Y are subsets of the attributes of a relation
• What does X → Y mean?
  – The values of the attributes in X determine the values of the attributes in Y
  – Equivalently, the values of the attributes in Y depend on the values of the attributes in X
  – Suppose t1 and t2 are two tuples in the relation. If t1 and t2 have the same values for attribute set X, then they must also have the same values for attribute set Y
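
The tuple-based definition above can be checked mechanically against a relation instance. The sketch below is only an illustration: the helper name `fd_holds` and the sample rows are assumptions made for this example and are not part of the lecture.

```python
def fd_holds(rows, lhs, rhs):
    """Return True if no two rows agree on lhs but differ on rhs."""
    seen = {}  # maps each X-value to the Y-value first seen with it
    for row in rows:
        x = tuple(row[a] for a in lhs)
        y = tuple(row[a] for a in rhs)
        # setdefault stores y for a new x, otherwise returns the stored value
        if seen.setdefault(x, y) != y:
            return False
    return True


# Hypothetical sample tuples, invented for illustration only.
rows = [
    {"Ssn": "111", "Pnumber": 1, "Ename": "Smith", "Pname": "ProductX"},
    {"Ssn": "111", "Pnumber": 2, "Ename": "Smith", "Pname": "ProductY"},
    {"Ssn": "222", "Pnumber": 1, "Ename": "Wong", "Pname": "ProductX"},
]

print(fd_holds(rows, ["Ssn"], ["Ename"]))  # True
print(fd_holds(rows, ["Ssn"], ["Pname"]))  # False
```

Note that a single instance can only refute an FD. An FD is a constraint on every legal instance of the schema, so a `True` result here means "not violated by this data", not "proven".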

3 Functional Dependencies
• EMP_PROJ(Ssn, Pnumber, Hours, Ename, Pname, Plocation)
• {Ssn} → {Ename} is an FD
  – Ename depends on Ssn
• {Pnumber} → {Pname, Plocation} is an FD
  – Pname and Plocation depend on Pnumber
  – Two rows with the same Pnumber must have the same values of Pname and Plocation
• {Plocation} → {Pnumber} is not an FD
• {Ename, Plocation} → {Pnumber} is not an FD

4 Functional Dependencies
• Graphical representation of FDs:
• FD1: {SSN, Pnumber} → {Hours}
• FD2: {SSN} → {Ename}
• FD3: {Pnumber} → {Pname, Plocation}

5 Functional Dependencies
• A relation may contain many functional dependencies
  – How to derive all of them?
• Given a set of functional dependencies of a relation R: Σ = {AC → B, A → C, D → A}
  – Does Σ entail AD → BC (i.e., is AD → BC also an FD of R)?

6 Inference Rules (Example)
Given Σ = {AC → B, A → C, D → A}, does Σ entail AD → BC?
1. D → A (given in Σ)
2. AD → A (augmenting (1) with A)
3. A → C (given in Σ)
4. A → AC (augmenting (3) with A)
5. AC → B (given in Σ)
6. AC → BC (augmenting (5) with C)
7. A → BC (transitivity between (4) and (6))
8. AD → BC (transitivity between (2) and (7))
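
The derivation above applies inference rules one step at a time. A common alternative, which is not what the slide shows, is to compute the attribute closure of the left-hand side and check whether it contains the right-hand side. The sketch below assumes FDs are represented as pairs of Python sets; the names `closure`, `entails`, and `SIGMA` are chosen for this example.

```python
def closure(attrs, fds):
    """Attribute closure of attrs under fds, a list of (lhs, rhs) set pairs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            # Fire an FD when its whole left side is already in the closure.
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


def entails(fds, lhs, rhs):
    """True if lhs -> rhs follows from fds."""
    return set(rhs) <= closure(lhs, fds)


# The FD set from the slide: AC -> B, A -> C, D -> A
SIGMA = [({"A", "C"}, {"B"}), ({"A"}, {"C"}), ({"D"}, {"A"})]

print(sorted(closure({"A", "D"}, SIGMA)))      # ['A', 'B', 'C', 'D']
print(entails(SIGMA, {"A", "D"}, {"B", "C"}))  # True
```

Because the closure of AD contains both B and C, the FD AD → BC is entailed, which agrees with step 8 of the derivation.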

7 Normal Forms and Normalization
• Functional dependencies can help us analyze whether a relational schema is “good” or “bad”
• In the relational model, we don’t say that a schema is good or bad. We say it is in 1NF, 2NF, 3NF, etc.
  – Properties
    ▪ The higher the NF, the stricter the conditions placed on the schema
    ▪ A relation in a higher NF is also in every lower NF, but not vice versa
  – A 3NF relation is in 2NF and 1NF (but not necessarily in 4NF or 5NF)
• Normalization:
  – The process of decomposing "bad" (lower normal form) relations by breaking up their attributes into smaller relations

8 First Normal Form
• A schema is in 1NF if it permits only atomic (indivisible) attribute values
• 1NF disallows
  – composite attributes
  – multivalued attributes
• The relational model itself prohibits relations that contain composite and multivalued attributes
  – Therefore, all schemas in the relational model are at least in 1NF

9 Example
• The relation is not in 1NF because it has a multivalued attribute (Dlocations)

10 Normalization into 1NF
• 3 strategies for normalization:
  – Place the “offending” attribute in a separate relation
    ▪ DEPARTMENT(Dname, Dnumber, Dmgr_ssn)
    ▪ DEPTLOCATIONS(Dnumber, Dlocation)
  – Change Dlocations into Dlocation and modify the primary key to (Dnumber, Dlocation)
    ▪ DEPARTMENT(Dname, Dnumber, Dmgr_ssn, Dlocation)
  – If the maximum number of locations per department is 3:
    ▪ DEPARTMENT(Dname, Dnumber, Dmgr_ssn, Dloc1, Dloc2, Dloc3)
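
The first strategy (moving the multivalued attribute into its own relation) can be sketched in a few lines. The helper name `split_multivalued` and the department rows below are assumptions made for illustration, not data from the lecture.

```python
def split_multivalued(rows, key, multi_attr, single_name):
    """Split rows into a base relation and a (key, value) detail relation."""
    base, detail = [], []
    for row in rows:
        # Base relation: every attribute except the multivalued one.
        base.append({a: v for a, v in row.items() if a != multi_attr})
        # Detail relation: one tuple per value of the multivalued attribute.
        for value in row[multi_attr]:
            detail.append({key: row[key], single_name: value})
    return base, detail


# Hypothetical non-1NF rows: Dlocations holds a list of values.
departments = [
    {"Dname": "Research", "Dnumber": 5, "Dmgr_ssn": "333445555",
     "Dlocations": ["Bellaire", "Sugarland", "Houston"]},
    {"Dname": "Headquarters", "Dnumber": 1, "Dmgr_ssn": "888665555",
     "Dlocations": ["Houston"]},
]

department, dept_locations = split_multivalued(
    departments, "Dnumber", "Dlocations", "Dlocation")

print(department)
print(dept_locations)
```

This strategy avoids both the repeated Dname and Dmgr_ssn values of the second strategy and the null-filled Dloc columns of the third.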

11 Is 1NF Sufficient?
• The key of the relation is the combination (Dnumber, Dlocation)
• The relation is in 1NF, but there are redundancies:
  – Two rows with the same Dnumber must have the same Dname and Dmgr_ssn (even though their Dlocation values are different)

12 2NF (Motivating Example)
• Functional dependencies
  – {Dnumber, Dlocation} → {Dname, Dmgr_ssn} (from primary key)
  – {Dnumber} → {Dname, Dmgr_ssn}
• Consequence: two tuples with the same Dnumber but different Dlocation will have the same Dname and Dmgr_ssn, which leads to redundancy!
• If {Dnumber} → {Dname, Dmgr_ssn} were not an FD, there would be no redundancy problem

13 2NF (Motivating Example)
• This example suggests that if X → Y is an FD, where X is the key, then X’ → Y must not also be an FD of the same table for any proper subset X’ of X; otherwise, there will be redundancies in the table
  – We say that X → Y must be a full FD
• {Dnumber, Dlocation} → {Dname, Dmgr_ssn} (from primary key)
• {Dnumber} → {Dname, Dmgr_ssn}

14 Full versus Partial Dependencies
• X → Y is a full FD if removing any attribute from X means the FD no longer holds
• X → Y is a partial FD if there is an FD X’ → Y where X’ is a proper subset of X
• Example:
  – {Dnumber, Dlocation} → {Dname, Dmgr_ssn} is a partial FD because {Dnumber} → {Dname, Dmgr_ssn} is also an FD of the schema
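
The full/partial distinction can be tested by dropping one attribute at a time from the left-hand side and checking whether the dependency still follows. This is a minimal sketch under the assumption that FDs are given as pairs of sets; `closure` and `is_full_fd` are names introduced for this example.

```python
def closure(attrs, fds):
    """Attribute closure of attrs under fds, a list of (lhs, rhs) set pairs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


def is_full_fd(lhs, rhs, fds):
    """True if lhs -> rhs holds and fails once any attribute is dropped from lhs."""
    lhs, rhs = set(lhs), set(rhs)
    if not rhs <= closure(lhs, fds):
        return False  # the FD does not hold at all
    for attr in lhs:
        if rhs <= closure(lhs - {attr}, fds):
            return False  # a proper subset already determines rhs: partial FD
    return True


# FDs of the DEPARTMENT example from the slides.
FDS = [
    ({"Dnumber", "Dlocation"}, {"Dname", "Dmgr_ssn"}),
    ({"Dnumber"}, {"Dname", "Dmgr_ssn"}),
]

print(is_full_fd({"Dnumber", "Dlocation"}, {"Dname", "Dmgr_ssn"}, FDS))  # False
print(is_full_fd({"Dnumber"}, {"Dname", "Dmgr_ssn"}, FDS))               # True
```

Dropping single attributes is enough: if some smaller proper subset determines the right-hand side, then so does every larger subset that contains it.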

15 Prime versus Nonprime Attributes
• Prime attribute:
  – an attribute that is a member of some candidate key
  – Example (from previous slide): Dnumber, Dlocation
• Nonprime attribute:
  – an attribute that is not a member of any candidate key
  – Example (from previous slide): Dname, Dmgr_ssn

16 2NF Definition
• A relation schema R is in second normal form (2NF) if every nonprime attribute A in R is fully functionally dependent on the key of R
• Since {Dnumber, Dlocation} is the key
  – {Dnumber, Dlocation} → {Dname, Dmgr_ssn} is an FD of the schema
  – But {Dnumber} → {Dname, Dmgr_ssn} is also an FD of the schema
    ▪ The nonprime attributes are not fully functionally dependent on the key
    ▪ So the schema is not in 2NF

17 Example
• FDs:
  – {SSN, Pnumber} → {Hours, Ename, Pname, Plocation}
  – {SSN} → {Ename}
  – {Pnumber} → {Pname, Plocation}

18 Example
  – {SSN, PNUMBER} → HOURS is a full FD since neither SSN → HOURS nor PNUMBER → HOURS holds
  – But {SSN, PNUMBER} → ENAME is a partial dependency since SSN → ENAME also holds

19 2NF
  – Is {SSN, PNUMBER} → {Hours} a full FD? Yes
  – Is {SSN, PNUMBER} → {Ename} a full FD? No
  – Is {SSN, PNUMBER} → {Pname} a full FD? No
  – Is {SSN, PNUMBER} → {Plocation} a full FD? No
• Conclusion: The EMP_PROJ relation is not in 2NF
• 2NF normalization: take the “offending” FDs and create separate relations

20 Normalizing into 2NF
• {SSN, Pnumber} → {Hours}
• {SSN} → {Ename}
• {Pnumber} → {Pname, Plocation}
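
One way to arrive at this grouping is to find, for each nonprime attribute, the smallest part of the key that determines it, and then give each such part its own relation. The sketch below assumes the relation has a single candidate key, which is passed in explicitly; `group_by_determinant` is a hypothetical helper name.

```python
from itertools import combinations


def closure(attrs, fds):
    """Attribute closure of attrs under fds, a list of (lhs, rhs) set pairs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


def group_by_determinant(key, attributes, fds):
    """Map each smallest determining part of the key to the attributes it fixes.

    Assumes `key` is the only candidate key, so every other attribute is nonprime.
    """
    key = sorted(key)
    groups = {}
    for attr in sorted(set(attributes) - set(key)):
        for size in range(1, len(key) + 1):
            hits = [s for s in combinations(key, size) if attr in closure(s, fds)]
            if hits:
                groups.setdefault(hits[0], []).append(attr)
                break
    return groups


# EMP_PROJ from the slides.
ATTRS = {"SSN", "Pnumber", "Hours", "Ename", "Pname", "Plocation"}
FDS = [
    ({"SSN", "Pnumber"}, {"Hours"}),
    ({"SSN"}, {"Ename"}),
    ({"Pnumber"}, {"Pname", "Plocation"}),
]

for determinant, dependents in group_by_determinant({"SSN", "Pnumber"}, ATTRS, FDS).items():
    print(determinant, "->", dependents)
```

Each printed group corresponds to one relation of the 2NF decomposition: the determinant becomes that relation's key and the dependents become its remaining attributes.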

21 Is 2NF sufficient?
• Key is SSN
• FDs:
  – {SSN} → {Ename, Bdate, Address, Dnumber, Dname, Dmgr_ssn}
  – {Dnumber} → {Dname, Dmgr_ssn}
• Is the table in 2NF?
  – Yes, because every nonprime attribute is fully functionally dependent on the key (the key is a single attribute, so no partial dependency is possible)

22 Is 2NF sufficient?
• Are there still redundancies in the relation? Yes
  – Two tuples with the same Dnumber have the same Dname and Dmgr_ssn
• What is the “offending” FD that causes redundancy?

23 Is 2NF sufficient?
• Functional dependencies:
  – {SSN} → {Ename, Bdate, Address, Dnumber, Dname, Dmgr_ssn}
  – {Dnumber} → {Dname, Dmgr_ssn}
• Since Dnumber is not a key, you can have two rows with the same Dnumber. Hence their Dname and Dmgr_ssn must be the same => redundancy!

24 3NF
• A relation schema R is in third normal form (3NF) if
  – It is in 2NF and
  – No nonprime attribute in R is transitively dependent on the primary key
    ▪ If X → Y and Y → Z are FDs, with X as the primary key, we consider Z to be transitively dependent on X only if Y is not a candidate key. If Y is a candidate key, then we do not consider this a transitive dependency problem

25 Example of 3NF
• FDs:
  – SSN → Ename, Bdate, Address, Dnumber
  – SSN → Dnumber
  – Dnumber → Dname, Dmgr_ssn
• Dname is transitively dependent on the primary key SSN because SSN → Dnumber and Dnumber → Dname are FDs of the relation
  – Therefore the relation is not in 3NF

26 Third Normal Form
• Another way to check whether a relation is in 3NF (without checking for partial and transitive dependencies):
  – A relation schema R is in 3NF if whenever a nontrivial FD X → A holds, either
    ▪ X is a superkey of R or
    ▪ A is a prime attribute of R
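
This superkey-or-prime condition translates directly into a check over the FDs. The sketch below is an illustration only: it assumes the candidate keys are supplied by the caller, it inspects only the FDs that are listed, and the names `closure` and `violations_3nf` are introduced here.

```python
def closure(attrs, fds):
    """Attribute closure of attrs under fds, a list of (lhs, rhs) set pairs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


def violations_3nf(attributes, fds, candidate_keys):
    """Return (X, A) pairs where X -> A is nontrivial, X is no superkey, A is nonprime."""
    attributes = set(attributes)
    prime = set().union(*candidate_keys)
    bad = []
    for lhs, rhs in fds:
        is_superkey = closure(lhs, fds) >= attributes
        for attr in sorted(rhs - lhs):  # rhs - lhs skips the trivial part
            if not is_superkey and attr not in prime:
                bad.append((sorted(lhs), attr))
    return bad


# The employee/department relation from the slides, with key SSN.
ATTRS = {"SSN", "Ename", "Bdate", "Address", "Dnumber", "Dname", "Dmgr_ssn"}
FDS = [
    ({"SSN"}, {"Ename", "Bdate", "Address", "Dnumber"}),
    ({"Dnumber"}, {"Dname", "Dmgr_ssn"}),
]

print(violations_3nf(ATTRS, FDS, [{"SSN"}]))
# [(['Dnumber'], 'Dmgr_ssn'), (['Dnumber'], 'Dname')]
```

An empty result means none of the listed FDs violates the condition. Here both violations come from Dnumber, which is not a superkey and determines nonprime attributes.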

27 3NF
• FDs:
  – SSN → Ename, Bdate, Address
  – SSN → Dnumber
  – Dnumber → Dname, Dmgr_ssn (transitive dependency)
    ▪ But Dnumber is not a superkey, and Dname and Dmgr_ssn are not prime attributes
• Therefore the relation is not in 3NF

28 Normalizing into 3NF
• Take the “offending” FDs and create separate relations

29 Is 3NF enough to remove redundancy?
• Assume every instructor teaches only 1 course
• FDs:
  – {Student, Course} → Instructor
  – Instructor → Course
• Key is (Student, Course)
• The relation is in 3NF (but there is still redundancy)
  – Instructor → Course does not violate 3NF: Instructor is not a superkey, but Course is a prime attribute

30 BCNF (Boyce-Codd Normal Form)
• A relation schema R is in BCNF if whenever a nontrivial FD X → A holds in R, then X must be a superkey of R
• FDs:
  – {Student, Course} → Instructor
  – Instructor → Course
• The relation is not in BCNF because Instructor is not a superkey
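
The BCNF condition can be checked with the same closure machinery. The sketch below also enumerates candidate keys by brute force, which is only practical for small schemas; the function names are assumptions made for this example.

```python
from itertools import combinations


def closure(attrs, fds):
    """Attribute closure of attrs under fds, a list of (lhs, rhs) set pairs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


def candidate_keys(attributes, fds):
    """Brute-force search for all minimal attribute sets that determine everything."""
    attributes = set(attributes)
    keys = []
    for size in range(1, len(attributes) + 1):
        for combo in combinations(sorted(attributes), size):
            combo = set(combo)
            if any(k <= combo for k in keys):
                continue  # contains a smaller key, so it is not minimal
            if closure(combo, fds) >= attributes:
                keys.append(combo)
    return keys


def bcnf_violations(attributes, fds):
    """Listed nontrivial FDs whose left-hand side is not a superkey."""
    attributes = set(attributes)
    return [(lhs, rhs) for lhs, rhs in fds
            if not rhs <= lhs and not closure(lhs, fds) >= attributes]


ATTRS = {"Student", "Course", "Instructor"}
FDS = [({"Student", "Course"}, {"Instructor"}), ({"Instructor"}, {"Course"})]

print(candidate_keys(ATTRS, FDS))   # two keys: {Course, Student} and {Instructor, Student}
print(bcnf_violations(ATTRS, FDS))  # only Instructor -> Course
```

The search finds a second candidate key, {Student, Instructor}, in addition to {Student, Course}. Every attribute is therefore prime, which is why the relation satisfies 3NF even though Instructor → Course violates BCNF.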

31 Achieving BCNF by Decomposition
• STUD_COURSE
  – Key is {Student, Course}
• COURSE_INSTRUCT
  – Key is {Instructor}
  – FD: Instructor → Course
• Loses the FD: {Student, Course} → Instructor
  – But no redundancy

32 Decomposition 1
• Problem: the decomposition does not result in a lossless join (i.e., does not have the nonadditive join property)
  – i.e., spurious tuples may be generated

33 Decomposition 2
• Dependency preserving? No
  – loses the FD: {Student, Course} → Instructor
• Lossless join? Yes

34 Decomposition 3
• Dependency preserving? No
  – loses the FD: {Student, Course} → Instructor
• Lossless join? No
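
Whether a two-way decomposition has the lossless (nonadditive) join property can be tested with the standard binary test: the common attributes must functionally determine all of the remaining attributes of one of the two parts. The sketch below applies it to the three ways of splitting the Student/Course/Instructor relation into two-attribute relations. The slide figures are not part of this transcript, so the order of the splits below is an assumption and is not claimed to match the slide numbering.

```python
def closure(attrs, fds):
    """Attribute closure of attrs under fds, a list of (lhs, rhs) set pairs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


def is_lossless_binary(r1, r2, fds):
    """Binary test: (R1 ∩ R2) -> (R1 - R2) or (R1 ∩ R2) -> (R2 - R1)."""
    common = closure(r1 & r2, fds)
    return (r1 - r2) <= common or (r2 - r1) <= common


FDS = [({"Student", "Course"}, {"Instructor"}), ({"Instructor"}, {"Course"})]

splits = [
    ({"Student", "Instructor"}, {"Student", "Course"}),
    ({"Course", "Instructor"}, {"Course", "Student"}),
    ({"Instructor", "Course"}, {"Instructor", "Student"}),
]

for r1, r2 in splits:
    print(sorted(r1), sorted(r2), is_lossless_binary(r1, r2, FDS))
```

Only the split that shares Instructor passes the test, because Instructor → Course lets the join reconstruct each original tuple without producing spurious ones.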

35 Summary
• 1st normal form
  – no composite/multivalued attributes in relations
• 2nd, 3rd, and Boyce-Codd normal forms
  – eliminate redundancies based on FDs
• More normal forms (see textbook)
  – 4th: deals with multivalued dependencies
  – 5th: deals with join dependencies