Lecture 3 on Data Normalization (1/22/2009)
Study the methods of first, second, third, Boyce-Codd, fourth and fifth normal form for relational database design, in order to eliminate data redundancy and update anomalies.

Normalization Theory
Refine the database design to eliminate anomalies (irregularities) in manipulating the database.

1NF, 2NF and 3NF
Normalization theory is built around the concept of normal forms.
– First normal form: the relation contains atomic values only
– All normalized relations are in 1NF
– 2NF relations are a subset of 1NF relations, 3NF relations are a subset of 2NF relations, and so on
– 3NF is more desirable than 2NF, and 2NF is more desirable than 1NF

BCNF, 4NF and 5NF (PJNF)
Boyce-Codd Normal Form
– A stronger form of 3NF
– Every BCNF relation is also in 3NF, but some 3NF relations are not in BCNF
4NF and 5NF
– Defined later than BCNF
– Deal with multi-valued dependency (MVD) and join dependency (JD)

Relationship between Normal Forms
Each class of relations contains the next:
Universe of relations ⊃ 1NF relations ⊃ 2NF relations ⊃ 3NF relations ⊃ BCNF relations ⊃ 4NF relations ⊃ 5NF/PJNF relations

First Normal Form
A relation is in 1NF if each attribute contains only one value (not a set of values).
The primary key (PK) cannot be null.

First Normal Form
Relation STUDENT-A

S#   S-name   Enrollments
S1   Brown    C1 Math, C2 Chem, C3 Phys
S2   Smith    C2 Chem, C3 Phys, C4 Math
S3   Brown    C2 Chem, C3 Phys

Is this relation in 1NF?

First Normal Form
Relation STUDENT-A

S#   S-name   Enrollments
S1   Brown    C1 Math, C2 Chem, C3 Phys
S2   Smith    C2 Chem, C3 Phys, C4 Math
S3   Brown    C2 Chem, C3 Phys

NO! The elements of the domain Enrollments are not atomic.
Enrollments could be split into two domains: C# and C-Name.

First Normal Form
Enrollments is split into C# and C-Name.
S# and C# are used as a compound PK.
A student may attend several courses and a course may have several students, so S# and C# have an m:n mapping.

Relation STUDENT-B

S#   S-Name   C#   C-Name
S1   Brown    C1   Math
S1   Brown    C2   Chem
S1   Brown    C3   Phys
S2   Smith    C2   Chem
S2   Smith    C3   Phys
S2   Smith    C4   Math
S3   Brown    C2   Chem
S3   Brown    C3   Phys
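The split of Enrollments into atomic values can be sketched in a few lines of Python. This is only an illustration: the nested-list layout of STUDENT-A and the function name `to_first_normal_form` are assumptions made for the example, not part of the lecture.

```python
# Unnormalized STUDENT-A: Enrollments holds a set of (C#, C-Name) pairs per student.
student_a = [
    ("S1", "Brown", [("C1", "Math"), ("C2", "Chem"), ("C3", "Phys")]),
    ("S2", "Smith", [("C2", "Chem"), ("C3", "Phys"), ("C4", "Math")]),
    ("S3", "Brown", [("C2", "Chem"), ("C3", "Phys")]),
]


def to_first_normal_form(rows):
    """Emit one tuple per (student, course) pair so that every attribute is atomic."""
    return [
        (s_no, s_name, c_no, c_name)
        for s_no, s_name, enrollments in rows
        for c_no, c_name in enrollments
    ]


student_b = to_first_normal_form(student_a)

# The compound key (S#, C#) must identify each tuple uniquely.
compound_keys = {(row[0], row[2]) for row in student_b}

for row in student_b:
    print(row)
```

Each student row fans out into one tuple per enrollment, which is why S-Name and C-Name end up repeated in STUDENT-B.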

Functional Dependency (FD)
Attribute Y of relation R is functionally dependent on attribute X of R ⇔ each value of X is associated with exactly one value of Y.
Denoted by X → Y.
In the relation STUDENT-B:
– S# → S-Name
– C# → C-Name
– S#, C# → ∅ (no non-key attribute depends on the full compound key)
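Whether a given set of tuples is consistent with an FD can be tested directly from the definition. The sketch below assumes the STUDENT-B rows are held as Python dictionaries; `fd_holds` is a hypothetical helper name. Note that a sample of data can only refute an FD; that an FD holds in general is a statement about the meaning of the data.

```python
ATTRS = ("S#", "S-Name", "C#", "C-Name")
DATA = [
    ("S1", "Brown", "C1", "Math"), ("S1", "Brown", "C2", "Chem"),
    ("S1", "Brown", "C3", "Phys"), ("S2", "Smith", "C2", "Chem"),
    ("S2", "Smith", "C3", "Phys"), ("S2", "Smith", "C4", "Math"),
    ("S3", "Brown", "C2", "Chem"), ("S3", "Brown", "C3", "Phys"),
]
student_b = [dict(zip(ATTRS, t)) for t in DATA]


def fd_holds(rows, lhs, rhs):
    """Return True if every lhs value is paired with exactly one rhs value in rows."""
    seen = {}
    for row in rows:
        x = tuple(row[a] for a in lhs)
        y = tuple(row[a] for a in rhs)
        # setdefault stores y the first time x is seen and returns the stored value after that.
        if seen.setdefault(x, y) != y:
            return False
    return True


print(fd_holds(student_b, ("S#",), ("S-Name",)))   # S# -> S-Name
print(fd_holds(student_b, ("C#",), ("C-Name",)))   # C# -> C-Name
print(fd_holds(student_b, ("S-Name",), ("S#",)))   # two students are called Brown
```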

Anomalies using 1NF
1NF relations require less complicated applications to operate on than unnormalized relations, but anomalies remain.
Anomalies in insert:
– Since the PK is composed of C# and S#, the details of both student and course must be known before an entry can be inserted
– E.g. a course cannot be added until at least one student is enrolled in it

Anomalies using 1NF
Anomalies in delete:
– If all students attending a particular course are deleted, the course will no longer be found in the database
Anomalies in update:
– S-Name and C-Name are stored redundantly
– The redundancy increases storage space and the effort needed to modify a data item
– If a course is modified, all tuples containing that course must be updated

Second Normal Form
A relation is in 2NF if it is in 1NF and every non-PK attribute is fully functionally dependent on the PK.
In the relation STUDENT-B:
– PK: C#, S#
– Non-PK attributes: C-Name, S-Name
– C#, S# → S-Name
– S# → S-Name
– Since S-Name is only partially dependent on the PK, relation STUDENT-B is not in 2NF
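The search for partial dependencies can be sketched as a loop over the proper subsets of the compound key. The helper names `fd_holds` and `partial_dependencies` are illustrative, and the check is run against the sample tuples only, so it shows which partial dependencies the data is consistent with rather than proving them.

```python
from itertools import combinations

ATTRS = ("S#", "S-Name", "C#", "C-Name")
DATA = [
    ("S1", "Brown", "C1", "Math"), ("S1", "Brown", "C2", "Chem"),
    ("S1", "Brown", "C3", "Phys"), ("S2", "Smith", "C2", "Chem"),
    ("S2", "Smith", "C3", "Phys"), ("S2", "Smith", "C4", "Math"),
    ("S3", "Brown", "C2", "Chem"), ("S3", "Brown", "C3", "Phys"),
]
student_b = [dict(zip(ATTRS, t)) for t in DATA]


def fd_holds(rows, lhs, rhs):
    """Return True if every lhs value is paired with exactly one rhs value in rows."""
    seen = {}
    for row in rows:
        x = tuple(row[a] for a in lhs)
        y = tuple(row[a] for a in rhs)
        if seen.setdefault(x, y) != y:
            return False
    return True


def partial_dependencies(rows, key, non_key):
    """List (subset of key, attribute) pairs where a proper part of the key determines the attribute."""
    found = []
    for attr in non_key:
        for size in range(1, len(key)):
            for subset in combinations(key, size):
                if fd_holds(rows, subset, (attr,)):
                    found.append((subset, attr))
    return found


violations = partial_dependencies(student_b, ("S#", "C#"), ("S-Name", "C-Name"))
print(violations)
```

Any pair reported here is a reason the relation fails 2NF; an empty list would mean no partial dependency was found in the sample.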

Second Normal Form
Relation STUDENT

S#   S-Name
S1   Brown
S2   Smith
S3   Brown

Relation COURSE

C#   C-Name
C1   Math
C2   Chem
C3   Phys
C4   Math

Relation SC

S#   C#
S1   C1
S1   C2
S1   C3
S2   C2
S2   C3
S2   C4
S3   C2
S3   C3

All of them are in 2NF, as none of them has a partial dependency.
The original information can be reconstructed by the natural join operation.
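The claim that the original information can be reconstructed is easy to check on the sample data. The sketch below implements projection and natural join over lists of dictionaries; `project`, `natural_join` and `as_set` are hypothetical helper names chosen for the example.

```python
ATTRS = ("S#", "S-Name", "C#", "C-Name")
DATA = [
    ("S1", "Brown", "C1", "Math"), ("S1", "Brown", "C2", "Chem"),
    ("S1", "Brown", "C3", "Phys"), ("S2", "Smith", "C2", "Chem"),
    ("S2", "Smith", "C3", "Phys"), ("S2", "Smith", "C4", "Math"),
    ("S3", "Brown", "C2", "Chem"), ("S3", "Brown", "C3", "Phys"),
]
student_b = [dict(zip(ATTRS, t)) for t in DATA]


def project(rows, attrs):
    """Projection on attrs with duplicate tuples removed."""
    distinct = {tuple(row[a] for a in attrs) for row in rows}
    return [dict(zip(attrs, t)) for t in sorted(distinct)]


def natural_join(left, right):
    """Natural join on every attribute name the two relations share."""
    if not left or not right:
        return []
    common = [a for a in left[0] if a in right[0]]
    return [
        {**l, **r}
        for l in left
        for r in right
        if all(l[a] == r[a] for a in common)
    ]


def as_set(rows):
    """Order-independent form of a relation, used only for comparison."""
    return {tuple(sorted(row.items())) for row in rows}


student = project(student_b, ("S#", "S-Name"))
course = project(student_b, ("C#", "C-Name"))
sc = project(student_b, ("S#", "C#"))

rejoined = natural_join(natural_join(sc, student), course)
print(as_set(rejoined) == as_set(student_b))
```

S-Name is now stored once per student and C-Name once per course, while SC keeps only the m:n enrollment facts.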

Anomalies in 2NF
Suppose a relation MANUFACTURE links PRODUCT (P#), MACHINE (M#) and EMPLOYEE (E#), with:
– P# → M#
– P# → E#
– M# → E#
The tuple (P1, M1, E1) means that product P1 is manufactured on machine M1, which is operated by employee E1.

Anomalies in 2NF
Anomalies in insert:
– It is not possible to store which machine is operated by which employee without knowing at least one product produced by that machine
Anomalies in delete:
– If an employee is fired, the facts of which machine he operated and which products that machine produced are also lost

Anomalies in 2NF
Anomalies in update:
– If an employee is assigned to operate another machine, the tuple of every product made on that machine has to be updated

Third Normal Form
A relation is in 3NF if it is in 2NF and no non-PK attribute is transitively dependent on the PK.
In the MANUFACTURE relation:
– P# → M# and M# → E# imply P# → E#
– So P# → E# is a transitive dependency

Third Normal Form
Relation MANUFACTURE

P#   M#   E#
P1   M1   E1
P2   M2   E3
P3   M1   E1
P4   M1   E1
P5   M3   E2
P6   M4   E1

Relation R1

P#   M#
P1   M1
P2   M2
P3   M1
P4   M1
P5   M3
P6   M4

Relation R2

M#   E#
M1   E1
M2   E3
M3   E2
M4   E1

No loss of information.
Insert, delete and update anomalies are eliminated.
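The transitive dependency that motivates this decomposition can be derived mechanically with the standard attribute-closure algorithm. The sketch below states only P# → M# and M# → E# and lets the closure discover P# → E#; the function name `closure` and the frozenset encoding of FDs are choices made for the example.

```python
def closure(attrs, fds):
    """Return every attribute determined by attrs under fds, a list of (lhs, rhs) frozensets."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            # Apply an FD when its left side is already covered and it adds something new.
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


fds = [
    (frozenset({"P#"}), frozenset({"M#"})),
    (frozenset({"M#"}), frozenset({"E#"})),
]

p_closure = closure({"P#"}, fds)   # contains E#, so P# -> E# holds transitively
m_closure = closure({"M#"}, fds)   # lacks P#, so M# is not a key of MANUFACTURE

print(sorted(p_closure))
print(sorted(m_closure))
```

Because M# determines E# without being a key, M# → E# is moved into its own relation R2, where M# is the key.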

Boyce/Codd Normal Form
A relation is in BCNF ⇔ every determinant is a candidate key.
A determinant is an attribute, possibly composite, on which some other attribute is fully functionally dependent.

Boyce/Codd Normal Form
There exists a relation SJT with attributes S (student), J (subject) and T (teacher). The meaning of an SJT tuple is that the specified student is taught the specified subject by the specified teacher.

Relation SJT

S       J         T
Smith   Math      Prof. White
Smith   Physics   Prof. Green
Jones   Math      Prof. White
Jones   Physics   Prof. Brown

1. For each subject (J), each student (S) of that subject is taught by only one teacher (T). FD: S, J → T
2. Each teacher (T) teaches only one subject (J). FD: T → J
3. Each subject (J) is taught by several teachers, so J does not functionally determine T.

Boyce/Codd Normal Form
There are two determinants in the functional dependencies: (S, J) and T.
Anomalies in delete:
– If the fact that Jones studies physics is deleted, the fact that Professor Brown teaches physics is also lost
– This is because T is a determinant but not a candidate key
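The BCNF condition can be checked from the FDs alone. The sketch below uses the common equivalent test that the left side of every nontrivial FD must be a superkey, that is, its attribute closure must cover the whole relation. The names `closure` and `bcnf_violations` are illustrative.

```python
def closure(attrs, fds):
    """Return every attribute determined by attrs under fds, a list of (lhs, rhs) frozensets."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


def bcnf_violations(attributes, fds):
    """Return the nontrivial FDs whose determinant does not determine the whole relation."""
    return [
        (lhs, rhs)
        for lhs, rhs in fds
        if not rhs <= lhs and closure(lhs, fds) != set(attributes)
    ]


sjt_fds = [
    (frozenset({"S", "J"}), frozenset({"T"})),   # S, J -> T
    (frozenset({"T"}), frozenset({"J"})),        # T -> J
]

sjt_violations = bcnf_violations({"S", "J", "T"}, sjt_fds)
tj_violations = bcnf_violations({"T", "J"}, [(frozenset({"T"}), frozenset({"J"}))])

print(sjt_violations)   # T -> J is reported: T is a determinant but not a key of SJT
print(tj_violations)    # empty: in a relation over (T, J), T is a key
```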

Boyce/Codd Normal Form
Relation ST

S       T
Smith   Prof. White
Smith   Prof. Green
Jones   Prof. White
Jones   Prof. Brown

Relation TJ

T             J
Prof. White   Math
Prof. Green   Physics
Prof. Brown   Physics

Relations ST (S, T) and TJ (T, J) are in BCNF because all determinants are candidate keys.
Joining ST and TJ over T reconstructs SJT.

Multi-valued Dependency
Given a relation R with attributes A, B and C, the multi-valued dependence R.A →→ R.B holds ⇔ the set of B-values matching a given (A-value, C-value) pair in R depends only on the A-value and is independent of the C-value.

Fourth Normal Form
A relation is in 4NF ⇔ whenever there exists a nontrivial multi-valued dependence (MVD), say A →→ B, all attributes are also functionally dependent on A, i.e. A → X for every attribute X of the relation.

Fourth Normal Form
Relation CTX (not in 4NF)

Course    Teacher       Text
Physics   Prof. Green   Basic Mechanics
Physics   Prof. Green   Principles of Optics
Physics   Prof. Brown   Basic Mechanics
Physics   Prof. Brown   Principles of Optics
Physics   Prof. Black   Basic Mechanics
Physics   Prof. Black   Principles of Optics
Math      Prof. White   Modern Algebra
Math      Prof. White   Projective Geometry

Fourth Normal Form
A tuple (C, T, X) appears in CTX ⇔ course C can be taught by teacher T and uses text X as a reference.
For a given course, all possible combinations of teacher and text appear; that is, CTX satisfies the constraint:
if tuples (C, T1, X1) and (C, T2, X2) both appear, then tuples (C, T1, X2) and (C, T2, X1) both appear also.
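The swap constraint above is exactly what an MVD check has to test. The following sketch assumes the relation is given as (A, B, C) triples and tests A →→ B, here Course →→ Teacher in CTX; the function name `mvd_holds` is hypothetical.

```python
ctx = [
    ("Physics", "Prof. Green", "Basic Mechanics"),
    ("Physics", "Prof. Green", "Principles of Optics"),
    ("Physics", "Prof. Brown", "Basic Mechanics"),
    ("Physics", "Prof. Brown", "Principles of Optics"),
    ("Physics", "Prof. Black", "Basic Mechanics"),
    ("Physics", "Prof. Black", "Principles of Optics"),
    ("Math", "Prof. White", "Modern Algebra"),
    ("Math", "Prof. White", "Projective Geometry"),
]


def mvd_holds(rows):
    """Test A ->> B on (a, b, c) triples: for tuples sharing a, every (b, c) combination must appear."""
    present = set(rows)
    return all(
        (a1, b1, c2) in present
        for a1, b1, _ in present
        for a2, _, c2 in present
        if a1 == a2
    )


complete = mvd_holds(ctx)
# Dropping a single Physics tuple leaves a teacher/text combination missing.
incomplete = mvd_holds(ctx[1:])

print(complete, incomplete)
```

The second call shows why CTX is awkward to maintain: removing one tuple silently breaks the constraint unless the application enforces it.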

Fourth Normal Form
CTX contains redundancy.
CTX is in BCNF, as it is all-key and there are no other functional determinants.
But CTX is not in 4NF, as it involves an MVD that is not an FD at all, let alone an FD in which the determinant is a candidate key.

Anomalies in insert
For example, to add the information that the physics course uses a new text called Advanced Mechanism, it is necessary to create three new tuples, one for each of the three teachers.

Fourth Normal Form
Relation CT

Course    Teacher
Physics   Prof. Green
Physics   Prof. Brown
Physics   Prof. Black
Math      Prof. White

Relation CX

Course    Text
Physics   Basic Mechanics
Physics   Principles of Optics
Math      Modern Algebra
Math      Projective Geometry

4NF is an improvement over BCNF, in that it eliminates another form of undesirable structure.
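A short sketch can show both that CT and CX carry the same information as CTX and why the insert anomaly disappears. The helper name `join_on_course` is an assumption made for the example; the new text title is the one used in the insert example of this lecture.

```python
ct = [
    ("Physics", "Prof. Green"),
    ("Physics", "Prof. Brown"),
    ("Physics", "Prof. Black"),
    ("Math", "Prof. White"),
]
cx = [
    ("Physics", "Basic Mechanics"),
    ("Physics", "Principles of Optics"),
    ("Math", "Modern Algebra"),
    ("Math", "Projective Geometry"),
]


def join_on_course(ct_rows, cx_rows):
    """Natural join of CT and CX over Course, returned as sorted (course, teacher, text) triples."""
    return sorted(
        (course, teacher, text)
        for course, teacher in ct_rows
        for course2, text in cx_rows
        if course == course2
    )


ctx = join_on_course(ct, cx)

# Adding one text for Physics is a single new tuple in CX ...
cx_with_new_text = cx + [("Physics", "Advanced Mechanism")]
ctx_after = join_on_course(ct, cx_with_new_text)

# ... but corresponds to one new CTX tuple per Physics teacher.
extra_ctx_tuples = len(ctx_after) - len(ctx)

print(len(ctx), len(ctx_after), extra_ctx_tuples)
```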

Fifth Normal Form
Join dependency: relation R satisfies the JD (X, Y, …, Z) ⇔ it is the join of its projections on X, Y, …, Z, where X, Y, …, Z are subsets of the set of attributes of R.
A relation is in 5NF/PJNF (projection-join normal form) ⇔ every join dependency in R is implied by the candidate keys of R.
5NF is the ultimate normal form with respect to projection and join.

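Fifth Normal Form
Relation SPJ

S#   P#   J#
S1   P1   J2
S1   P2   J1
S2   P1   J1
S1   P1   J1

Projection SP

S#   P#
S1   P1
S1   P2
S2   P1

Projection PJ

P#   J#
P1   J2
P2   J1
P1   J1

Projection JS

J#   S#
J2   S1
J1   S1
J1   S2

Join of SP and PJ over P#

S#   P#   J#
S1   P1   J2
S1   P1   J1
S1   P2   J1
S2   P1   J2   (spurious)
S2   P1   J1

Joining this result with JS over (J#, S#) removes the spurious tuple and gives back the original SPJ.
SPJ is the join of all three of its projections, not of any two!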
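The two-step join in this example can be replayed with plain Python sets. This is a sketch on the sample tuples only; the variable names are chosen for the illustration.

```python
spj = {
    ("S1", "P1", "J2"),
    ("S1", "P2", "J1"),
    ("S2", "P1", "J1"),
    ("S1", "P1", "J1"),
}

# The three binary projections of SPJ.
sp = {(s, p) for s, p, j in spj}
pj = {(p, j) for s, p, j in spj}
js = {(j, s) for s, p, j in spj}

# Step 1: join SP and PJ over P#.
sp_pj = {(s, p, j) for s, p in sp for p2, j in pj if p == p2}
spurious = sp_pj - spj

# Step 2: join the result with JS over (J#, S#).
three_way = {(s, p, j) for s, p, j in sp_pj if (j, s) in js}

print(sorted(spurious))
print(three_way == spj)
```

The spurious tuple appears after the first join and is filtered out only when the third projection is brought in.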

Join Dependence Constraint
Condition: a JD (join dependence) holds in relation R(S#, P#, J#).
Constraint: if (s, p) appears in R1(S#, P#), (p, j) appears in R2(P#, J#) and (j, s) appears in R3(J#, S#), then (s, p, j) appears in R(S#, P#, J#).

Connection Trap
Condition: no JD (join dependence) holds in relation R(S#, P#, J#).
Connection trap: even if (s, p) appears in R1(S#, P#), (p, j) in R2(P#, J#) and (j, s) in R3(J#, S#), the tuple (s, p, j) may not exist in R(S#, P#, J#), so R1, R2 and R3 cannot be reliably connected.

Anomalies in insert with JD
If (S1, P1, J2), (S1, P2, J1) and (S2, P1, J1) are inserted, then (S1, P1, J1) must also be inserted.
On the other hand, if (S1, P1, J1) is deleted, then at least one of (S1, P1, J2), (S1, P2, J1) and (S2, P1, J1) must also be deleted, because together they would force (S1, P1, J1) back in.
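The insertion rule can be sketched as a function that lists the tuples a join dependency over (S#, P#), (P#, J#) and (J#, S#) would force into the relation. The name `forced_tuples` is hypothetical, and the sketch assumes the JD is meant to hold for the relation being checked.

```python
def forced_tuples(rows):
    """Return the tuples required by the three-way join dependency but missing from rows."""
    rows = set(rows)
    sp = {(s, p) for s, p, j in rows}
    pj = {(p, j) for s, p, j in rows}
    js = {(j, s) for s, p, j in rows}
    joined = {
        (s, p, j)
        for s, p in sp
        for p2, j in pj
        if p == p2 and (j, s) in js
    }
    return joined - rows


inserted = {
    ("S1", "P1", "J2"),
    ("S1", "P2", "J1"),
    ("S2", "P1", "J1"),
}

missing = forced_tuples(inserted)
after_fix = forced_tuples(inserted | missing)

print(sorted(missing))     # the tuple that must accompany the three inserts
print(sorted(after_fix))   # nothing further is forced once it is added
```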

Fifth Normal Form (5NF)
Relation SP

S#   P#
S1   P1
S1   P2
S2   P1

Relation PJ

P#   J#
P1   J2
P2   J1
P1   J1

Relation JS

J#   S#
J2   S1
J1   S1
J1   S2

Steps in normalization
Unnormalized form
  1. Decompose all data structures that are not 2D into 2D relations of segments
1NF
  2. Eliminate any partial dependency
2NF
  3. Eliminate any transitive dependency
3NF
  4. Eliminate any remaining FD in which the determinant is not a candidate key
BCNF
  5. Eliminate any MVD that is not also an FD
4NF
  6. Eliminate any JD that is not implied by the candidate keys
5NF/PJNF

Lecture Summary
The 1NF, 2NF, 3NF, BCNF, 4NF and 5NF split an unnormalized table into normalized table(s), which eliminates data redundancy and update anomalies.
A higher normal form implies the lower normal forms.

Review Question
Explain the differences between Third Normal Form and Boyce-Codd Normal Form with respect to functional dependencies.
Why is Boyce-Codd Normal Form called "strong" third normal form?
How can one normalize relations in Third Normal Form into Boyce-Codd Normal Form?

Tutorial Question
Describe and derive the unnormalized, first, second and third normal forms for the following unnormalized table, which includes 12 data fields, 4 of which are in repeating groups. Identify the functional dependencies of each normal form.

Reading Assignment
Chapter 10, Functional Dependencies and Normalization for Relational Databases, and Chapter 11, Relational Database Design Algorithms and Further Dependencies, of "Fundamentals of Database Systems", fifth edition, by Elmasri & Navathe, Pearson, 2007.