Copyright © 2007 Ramez Elmasri and Shamkant B. Navathe Chapter 4 Normalization.



Chapter Outline
3 Normal Forms Based on Primary Keys
  3.1 Normalization of Relations
  3.2 Practical Use of Normal Forms
  3.3 First Normal Form
  3.4 Second Normal Form
  3.5 Third Normal Form
5 BCNF (Boyce-Codd Normal Form)
Multivalued Dependencies and Fourth Normal Form
Mapping EER Model Constructs to Relations
  Step 8: Options for Mapping Specialization or Generalization
  Step 9: Mapping of Union Types (Categories)

Normalization of Relations (1)
Normalization: the process of decomposing unsatisfactory ("bad") relations by breaking up their attributes into smaller relations.
Normal form: a condition, stated using the keys and FDs of a relation, that certifies whether a relation schema is in a particular normal form.

Normalization of Relations (2)
2NF, 3NF, and BCNF are based on the keys and FDs of a relation schema.
4NF is based on keys and multivalued dependencies (MVDs); 5NF is based on keys and join dependencies (JDs) (Chapter 11).
Additional properties may be needed to ensure a good relational design: lossless join and dependency preservation (Chapter 11).

Practical Use of Normal Forms
Normalization is carried out in practice so that the resulting designs are of high quality and meet the desirable properties.
The practical utility of these normal forms becomes questionable when the constraints on which they are based are hard to understand or to detect.
Database designers need not normalize to the highest possible normal form; they usually stop at 3NF, BCNF, or 4NF.
Denormalization: the process of storing the join of higher normal form relations as a base relation, which is then in a lower normal form.

First Normal Form
Disallows composite attributes, multivalued attributes, and nested relations, that is, attributes whose values for an individual tuple are non-atomic.
Considered to be part of the definition of a relation.

Figure 10.8 Normalization into 1NF

Figure 10.9 Normalizing nested relations into 1NF
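The 1NF rule can be illustrated with a small sketch. The relation state below is hypothetical (the attribute names and values are assumptions, not taken from the figures): a DEPARTMENT tuple holds a set of locations, and `to_1nf` emits one tuple per location so that every value is atomic.

```python
# Hypothetical non-1NF state: Dlocations holds a set of values per tuple.
department = [
    {"Dnumber": 5, "Dname": "Research", "Dlocations": {"Bellaire", "Sugarland", "Houston"}},
    {"Dnumber": 4, "Dname": "Administration", "Dlocations": {"Stafford"}},
]

def to_1nf(rows, multivalued):
    """Emit one tuple per value of the multivalued attribute."""
    flat = []
    for row in rows:
        for value in sorted(row[multivalued]):
            new_row = {k: v for k, v in row.items() if k != multivalued}
            new_row[multivalued] = value  # now a single atomic value
            flat.append(new_row)
    return flat

flat_department = to_1nf(department, "Dlocations")
for row in flat_department:
    print(row)
```

In the flattened state the key is no longer Dnumber alone; it becomes the combination {Dnumber, Dlocations}, which introduces redundancy in Dname. That is one reason a separate locations relation is often preferred.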

Second Normal Form (1)
Uses the concepts of FDs and primary key.
Definitions:
Prime attribute: an attribute that is a member of the primary key K.
Full functional dependency: an FD Y -> Z where removal of any attribute from Y means the FD no longer holds.
Examples:
{SSN, PNUMBER} -> HOURS is a full FD, since neither SSN -> HOURS nor PNUMBER -> HOURS holds.
{SSN, PNUMBER} -> ENAME is not a full FD (it is called a partial dependency), since SSN -> ENAME also holds.

Second Normal Form (2)
A relation schema R is in second normal form (2NF) if every non-prime attribute A in R is fully functionally dependent on the primary key.
R can be decomposed into 2NF relations via the process of 2NF normalization.
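A minimal sketch of the full-versus-partial dependency test, assuming FDs are given as (left side, right side) pairs of attribute sets. The helper names `closure` and `is_full_fd` are illustrative, and only the two FDs named in the examples above are used.

```python
def closure(attrs, fds):
    """Attribute closure of attrs under fds, a list of (lhs, rhs) frozenset pairs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result

def is_full_fd(lhs, rhs, fds):
    """True if lhs -> rhs holds and dropping any one attribute of lhs breaks it."""
    lhs, rhs = set(lhs), set(rhs)
    if not rhs <= closure(lhs, fds):
        return False  # the FD does not hold at all
    return all(not rhs <= closure(lhs - {a}, fds) for a in lhs)

# FDs taken from the examples on the slide.
fds = [
    (frozenset({"SSN", "PNUMBER"}), frozenset({"HOURS"})),
    (frozenset({"SSN"}), frozenset({"ENAME"})),
]

print(is_full_fd({"SSN", "PNUMBER"}, {"HOURS"}, fds))  # full FD
print(is_full_fd({"SSN", "PNUMBER"}, {"ENAME"}, fds))  # partial dependency
```

Removing a single attribute at a time is enough: if some smaller subset of the left side determined the right side, then every superset of that subset would too.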

Figure: Normalizing into 2NF and 3NF

Figure: Normalization into 2NF and 3NF

Third Normal Form (1)
Definition:
Transitive functional dependency: an FD X -> Z that can be derived from two FDs X -> Y and Y -> Z.
Examples:
SSN -> DMGRSSN is a transitive FD, since SSN -> DNUMBER and DNUMBER -> DMGRSSN hold.
SSN -> ENAME is non-transitive, since there is no set of attributes X where SSN -> X and X -> ENAME.

Third Normal Form (2)
A relation schema R is in third normal form (3NF) if it is in 2NF and no non-prime attribute A in R is transitively dependent on the primary key.
R can be decomposed into 3NF relations via the process of 3NF normalization.
NOTE: In X -> Y and Y -> Z, with X as the primary key, we consider this a problem only if Y is not a candidate key. When Y is a candidate key, there is no problem with the transitive dependency.
E.g., consider EMP (SSN, Emp#, Salary). Here, SSN -> Emp# -> Salary and Emp# is a candidate key.

General Normal Form Definitions (2)
Definition:
Superkey of relation schema R: a set of attributes S of R that contains a key of R.
A relation schema R is in third normal form (3NF) if whenever an FD X -> A holds in R, then either:
(a) X is a superkey of R, or
(b) A is a prime attribute of R.
NOTE: Boyce-Codd normal form disallows condition (b) above.
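The general 3NF definition translates into a short check. This is a sketch under stated assumptions: candidate keys are found by brute force (fine for classroom-sized schemas only), only the listed FDs are inspected, and the attribute name EMPNO stands in for Emp#. For the EMP example, the FD EMPNO -> SSN is an assumption added here to make Emp# a candidate key, as the note states.

```python
from itertools import combinations

def closure(attrs, fds):
    """Attribute closure of attrs under fds, a list of (lhs, rhs) frozenset pairs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result

def candidate_keys(attrs, fds):
    """All minimal attribute sets whose closure is the whole schema (brute force)."""
    attrs = set(attrs)
    keys = []
    for size in range(1, len(attrs) + 1):
        for combo in combinations(sorted(attrs), size):
            s = set(combo)
            if closure(s, fds) == attrs and not any(k <= s for k in keys):
                keys.append(s)
    return keys

def violations_3nf(attrs, fds):
    """Return (X, A) pairs where X is not a superkey and A is not prime."""
    attrs = set(attrs)
    prime = set().union(*candidate_keys(attrs, fds))
    bad = []
    for lhs, rhs in fds:
        is_superkey = closure(lhs, fds) == attrs
        for a in sorted(rhs - lhs):
            if not is_superkey and a not in prime:
                bad.append((set(lhs), a))
    return bad

# Transitive dependency from the slides: SSN -> DNUMBER -> DMGRSSN.
emp_dept_attrs = {"SSN", "ENAME", "DNUMBER", "DMGRSSN"}
emp_dept_fds = [
    (frozenset({"SSN"}), frozenset({"ENAME", "DNUMBER"})),
    (frozenset({"DNUMBER"}), frozenset({"DMGRSSN"})),
]

# EMP(SSN, Emp#, Salary); EMPNO -> SSN is assumed so that Emp# is a candidate key.
emp_attrs = {"SSN", "EMPNO", "SALARY"}
emp_fds = [
    (frozenset({"SSN"}), frozenset({"EMPNO"})),
    (frozenset({"EMPNO"}), frozenset({"SALARY"})),
    (frozenset({"EMPNO"}), frozenset({"SSN"})),
]

print(violations_3nf(emp_dept_attrs, emp_dept_fds))
print(violations_3nf(emp_attrs, emp_fds))
```

The first schema reports DNUMBER -> DMGRSSN as a violation; the second reports none, which matches the note that a transitive dependency through a candidate key is not a problem.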

BCNF (Boyce-Codd Normal Form)
A relation schema R is in Boyce-Codd Normal Form (BCNF) if whenever an FD X -> A holds in R, then X is a superkey of R.
Each normal form is strictly stronger than the previous one:
Every 2NF relation is in 1NF.
Every 3NF relation is in 2NF.
Every BCNF relation is in 3NF.
There exist relations that are in 3NF but not in BCNF.
The goal is to have each relation in BCNF (or 3NF).

Figure: Boyce-Codd normal form

Figure: A relation TEACH that is in 3NF but not in BCNF

Achieving the BCNF by Decomposition (1)
Two FDs exist in the relation TEACH:
fd1: {student, course} -> instructor
fd2: instructor -> course
{student, course} is a candidate key for this relation, and the dependencies shown follow the pattern in Figure (b). So this relation is in 3NF but not in BCNF.
A relation not in BCNF should be decomposed so as to meet this property, while possibly forgoing the preservation of all functional dependencies in the decomposed relations. (See Algorithm 11.3.)
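A minimal sketch of the BCNF test applied to TEACH, using the two FDs above. The function names are illustrative; the check inspects only the listed FDs and flags any nontrivial FD whose left side is not a superkey.

```python
def closure(attrs, fds):
    """Attribute closure of attrs under fds, a list of (lhs, rhs) frozenset pairs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result

def bcnf_violations(attrs, fds):
    """Nontrivial FDs X -> Y in fds where X is not a superkey of the schema."""
    attrs = set(attrs)
    return [
        (set(lhs), set(rhs))
        for lhs, rhs in fds
        if not rhs <= lhs and closure(lhs, fds) != attrs
    ]

teach = {"student", "course", "instructor"}
teach_fds = [
    (frozenset({"student", "course"}), frozenset({"instructor"})),  # fd1
    (frozenset({"instructor"}), frozenset({"course"})),             # fd2
]

print(bcnf_violations(teach, teach_fds))
```

Only fd2 is reported: instructor is not a superkey, but course is a prime attribute, which is why TEACH still satisfies 3NF.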

Achieving the BCNF by Decomposition (2)
Three possible decompositions for relation TEACH:
1. {student, instructor} and {student, course}
2. {course, instructor} and {course, student}
3. {instructor, course} and {instructor, student}
All three decompositions lose fd1. We have to settle for sacrificing functional dependency preservation, but we cannot sacrifice the non-additivity property after decomposition.
Of the three, only the third decomposition will not generate spurious tuples after a join (and hence has the non-additivity property).
A test to determine whether a binary decomposition (decomposition into two relations) is non-additive (lossless) is discussed in section under Property LJ1. Verify that the third decomposition above meets the property.
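The verification can be sketched in code. The test used here is the usual condition for binary decompositions: {R1, R2} is non-additive if the common attributes functionally determine either R1 − R2 or R2 − R1. It is assumed, not confirmed by the slides, that this is the test they refer to as Property LJ1.

```python
def closure(attrs, fds):
    """Attribute closure of attrs under fds, a list of (lhs, rhs) frozenset pairs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result

def is_lossless_binary(r1, r2, fds):
    """True if (R1 ∩ R2) -> (R1 − R2) or (R1 ∩ R2) -> (R2 − R1) follows from fds."""
    r1, r2 = set(r1), set(r2)
    determined = closure(r1 & r2, fds)
    return (r1 - r2) <= determined or (r2 - r1) <= determined

teach_fds = [
    (frozenset({"student", "course"}), frozenset({"instructor"})),  # fd1
    (frozenset({"instructor"}), frozenset({"course"})),             # fd2
]

decompositions = [
    ({"student", "instructor"}, {"student", "course"}),
    ({"course", "instructor"}, {"course", "student"}),
    ({"instructor", "course"}, {"instructor", "student"}),
]

results = [is_lossless_binary(r1, r2, teach_fds) for r1, r2 in decompositions]
print(results)
```

For the third decomposition the common attribute is instructor, and fd2 gives instructor -> course, so the condition holds; in the first two the common attribute (student or course) determines nothing else.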

Multivalued Dependencies and Fourth Normal Form (1)
(a) The EMP relation with two MVDs: ENAME ->> PNAME and ENAME ->> DNAME.
(b) Decomposing the EMP relation into two 4NF relations EMP_PROJECTS and EMP_DEPENDENTS.

Multivalued Dependencies and Fourth Normal Form (1)
(c) The relation SUPPLY with no MVDs is in 4NF but not in 5NF if it has the JD(R1, R2, R3).
(d) Decomposing the relation SUPPLY into the 5NF relations R1, R2, and R3.

Multivalued Dependencies and Fourth Normal Form (2)
Definition:
A multivalued dependency (MVD) X ->> Y specified on relation schema R, where X and Y are both subsets of R, specifies the following constraint on any relation state r of R: if two tuples t1 and t2 exist in r such that t1[X] = t2[X], then two tuples t3 and t4 should also exist in r with the following properties, where Z denotes R − (X ∪ Y):
t3[X] = t4[X] = t1[X] = t2[X].
t3[Y] = t1[Y] and t4[Y] = t2[Y].
t3[Z] = t2[Z] and t4[Z] = t1[Z].
An MVD X ->> Y in R is called a trivial MVD if (a) Y is a subset of X or (b) X ∪ Y = R.
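The MVD definition can be checked mechanically on a relation state. In this sketch (function names and tuple values are hypothetical), the check looks for the tuple t3 for every ordered pair (t1, t2) that agrees on X; t4 is covered when the same pair is visited in the opposite order.

```python
def project(row, names):
    """Values of row on the given attribute names, as a tuple."""
    return tuple(row[n] for n in names)

def satisfies_mvd(rows, attrs, x, y):
    """True if the relation state rows satisfies X ->> Y."""
    x = sorted(x)
    y = sorted(set(y) - set(x))                # X ->> Y is equivalent to X ->> (Y − X)
    z = sorted(set(attrs) - set(x) - set(y))   # Z = R − (X ∪ Y)
    present = {(project(r, x), project(r, y), project(r, z)) for r in rows}
    for t1 in rows:
        for t2 in rows:
            if project(t1, x) == project(t2, x):
                # Required t3: X and Y values from t1, Z values from t2.
                t3 = (project(t1, x), project(t1, y), project(t2, z))
                if t3 not in present:
                    return False
    return True

attrs = ["ENAME", "PNAME", "DNAME"]
# Hypothetical EMP state: one employee, two projects, two dependents.
emp = [
    {"ENAME": "Smith", "PNAME": "X", "DNAME": "John"},
    {"ENAME": "Smith", "PNAME": "Y", "DNAME": "Anna"},
    {"ENAME": "Smith", "PNAME": "X", "DNAME": "Anna"},
    {"ENAME": "Smith", "PNAME": "Y", "DNAME": "John"},
]

print(satisfies_mvd(emp, attrs, ["ENAME"], ["PNAME"]))      # all four combinations present
print(satisfies_mvd(emp[:2], attrs, ["ENAME"], ["PNAME"]))  # two combinations missing
```

With only the first two tuples, the required tuple (Smith, X, Anna) is absent, so the MVD does not hold in that state.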

Multivalued Dependencies and Fourth Normal Form (4)
Definition:
A relation schema R is in 4NF with respect to a set of dependencies F (that includes functional dependencies and multivalued dependencies) if, for every nontrivial multivalued dependency X ->> Y in F+, X is a superkey for R.
Note: F+ is the (complete) set of all dependencies (functional or multivalued) that will hold in every relation state r of R that satisfies F. It is also called the closure of F.

Multivalued Dependencies and Fourth Normal Form (5)
Decomposing a relation state of EMP that is not in 4NF:
(a) EMP relation with additional tuples.
(b) Two corresponding 4NF relations EMP_PROJECTS and EMP_DEPENDENTS.
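A small sketch of the decomposition, using a hypothetical EMP state (the tuple values are assumptions, not the figure's data). Projecting onto (ENAME, PNAME) and (ENAME, DNAME) and joining back on ENAME should reproduce the original state when both MVDs hold.

```python
# Hypothetical EMP state as (ENAME, PNAME, DNAME) tuples.
emp = {
    ("Smith", "X", "John"),
    ("Smith", "Y", "Anna"),
    ("Smith", "X", "Anna"),
    ("Smith", "Y", "John"),
}

# Project onto the two 4NF relations.
emp_projects = {(ename, pname) for ename, pname, dname in emp}
emp_dependents = {(ename, dname) for ename, pname, dname in emp}

# Natural join on ENAME.
rejoined = {
    (ename, pname, dname)
    for ename, pname in emp_projects
    for ename2, dname in emp_dependents
    if ename == ename2
}

print(len(emp), len(emp_projects), len(emp_dependents))
print(rejoined == emp)
```

The join adds no spurious tuples here, and the two projections store each project and each dependent once per employee instead of once per combination.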

Mapping EER Model Constructs to Relations
Step 8: Options for Mapping Specialization or Generalization.
Convert each specialization with m subclasses {S1, S2, …, Sm} and generalized superclass C, where the attributes of C are {k, a1, …, an} and k is the (primary) key, into relational schemas using one of the following four options:
Option 8A: Multiple relations, superclass and subclasses
Option 8B: Multiple relations, subclass relations only
Option 8C: Single relation with one type attribute
Option 8D: Single relation with multiple type attributes

Mapping EER Model Constructs to Relations
Option 8A: Multiple relations, superclass and subclasses
Create a relation L for C with attributes Attrs(L) = {k, a1, …, an} and PK(L) = k. Create a relation Li for each subclass Si, 1 ≤ i ≤ m, with the attributes Attrs(Li) = {k} U {attributes of Si} and PK(Li) = k. This option works for any specialization (total or partial, disjoint or overlapping).
Option 8B: Multiple relations, subclass relations only
Create a relation Li for each subclass Si, 1 ≤ i ≤ m, with the attributes Attrs(Li) = {attributes of Si} U {k, a1, …, an} and PK(Li) = k. This option only works for a specialization whose subclasses are total (every entity in the superclass must belong to at least one of the subclasses).
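Options 8A and 8B can be sketched as schema generators. The functions and all attribute names below are hypothetical; each relation is represented as an attribute list whose first element is the primary key k.

```python
def map_8a(superclass, key, super_attrs, subclasses):
    """Option 8A: one relation for the superclass, one per subclass keyed by k."""
    relations = {superclass: [key] + super_attrs}
    for name, attrs in subclasses.items():
        relations[name] = [key] + attrs
    return relations

def map_8b(key, super_attrs, subclasses):
    """Option 8B: subclass relations only, each repeating the superclass attributes."""
    return {name: [key] + super_attrs + attrs for name, attrs in subclasses.items()}

# Assumed attributes for illustration only.
schema_8a = map_8a(
    "EMPLOYEE", "Ssn", ["Name"],
    {"SECRETARY": ["TypingSpeed"], "ENGINEER": ["EngType"]},
)
schema_8b = map_8b(
    "VehicleId", ["Price"],
    {"CAR": ["MaxSpeed"], "TRUCK": ["Tonnage"]},
)

print(schema_8a)
print(schema_8b)
```

Under 8B there is no VEHICLE relation at all, which is why the option only suits total specializations: an entity belonging to no subclass would have nowhere to be stored.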

FIGURE 4.4 EER diagram notation for an attribute-defined specialization on JobType.

FIGURE 7.4 Options for mapping specialization or generalization. (a) Mapping the EER schema in Figure 4.4 using option 8A.

FIGURE 4.3 Generalization. (b) Generalizing CAR and TRUCK into the superclass VEHICLE.

FIGURE 7.4 Options for mapping specialization or generalization. (b) Mapping the EER schema in Figure 4.3b using option 8B.

Mapping EER Model Constructs to Relations (contd.)
Option 8C: Single relation with one type attribute
Create a single relation L with attributes Attrs(L) = {k, a1, …, an} U {attributes of S1} U … U {attributes of Sm} U {t} and PK(L) = k. The attribute t is called a type (or discriminating) attribute that indicates the subclass to which each tuple belongs.
Option 8D: Single relation with multiple type attributes
Create a single relation schema L with attributes Attrs(L) = {k, a1, …, an} U {attributes of S1} U … U {attributes of Sm} U {t1, t2, …, tm} and PK(L) = k. Each ti, 1 ≤ i ≤ m, is a Boolean type attribute indicating whether a tuple belongs to the subclass Si.
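Options 8C and 8D can be sketched the same way. The functions, relation names, and most attribute names are hypothetical; JobType, Mflag, and Pflag are the type attributes named in the figure captions.

```python
def map_8c(name, key, super_attrs, subclasses, type_attr):
    """Option 8C: one relation with all attributes plus a single type attribute t."""
    attrs = [key] + super_attrs
    for sub_attrs in subclasses.values():
        attrs += sub_attrs
    return {name: attrs + [type_attr]}

def map_8d(name, key, super_attrs, subclasses, flags):
    """Option 8D: one relation with all attributes plus one Boolean flag per subclass."""
    if len(flags) != len(subclasses):
        raise ValueError("need exactly one flag per subclass")
    attrs = [key] + super_attrs
    for sub_attrs in subclasses.values():
        attrs += sub_attrs
    return {name: attrs + list(flags)}

# Assumed attributes for illustration only.
schema_8c = map_8c(
    "EMPLOYEE", "Ssn", ["Name"],
    {"SECRETARY": ["TypingSpeed"], "ENGINEER": ["EngType"]},
    type_attr="JobType",
)
schema_8d = map_8d(
    "PART", "PartNo", ["Description"],
    {"MANUFACTURED_PART": ["BatchNo"], "PURCHASED_PART": ["ListPrice"]},
    flags=["Mflag", "Pflag"],
)

print(schema_8c)
print(schema_8d)
```

A single type attribute can name only one subclass per tuple, so 8C fits disjoint subclasses; separate Boolean flags let a tuple belong to several subclasses at once, which is what an overlapping specialization needs. Both designs leave subclass-specific attributes NULL for tuples outside that subclass.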

FIGURE 4.4 EER diagram notation for an attribute-defined specialization on JobType.

FIGURE 7.4 Options for mapping specialization or generalization. (c) Mapping the EER schema in Figure 4.4 using option 8C.

FIGURE 4.5 EER diagram notation for an overlapping (non-disjoint) specialization.

FIGURE 7.4 Options for mapping specialization or generalization. (d) Mapping Figure 4.5 using option 8D with Boolean type fields Mflag and Pflag.

Mapping EER Model Constructs to Relations (contd.)
Mapping of Shared Subclasses (Multiple Inheritance)
A shared subclass, such as STUDENT_ASSISTANT, is a subclass of several classes, indicating multiple inheritance. These classes must all have the same key attribute; otherwise, the shared subclass would be modeled as a category.
We can apply any of the options discussed in Step 8 to a shared subclass, subject to the restrictions discussed in Step 8 of the mapping algorithm. Below, both 8C and 8D are used for the shared class STUDENT_ASSISTANT.

FIGURE 4.7 A specialization lattice with multiple inheritance for a UNIVERSITY database.

FIGURE 7.5 Mapping the EER specialization lattice in Figure 4.6 using multiple options.

Mapping EER Model Constructs to Relations (contd.)
Step 9: Mapping of Union Types (Categories).
For mapping a category whose defining superclasses have different keys, it is customary to specify a new key attribute, called a surrogate key, when creating a relation to correspond to the category.
In the example below we can create a relation OWNER to correspond to the OWNER category and include any attributes of the category in this relation. The primary key of the OWNER relation is the surrogate key, which we call OwnerId.
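A rough sketch of the surrogate-key mapping using Python's built-in sqlite3 module. OWNER and OwnerId come from the slide; the superclass relations PERSON and COMPANY, their keys, and the sample rows are assumptions made for illustration. Each superclass keeps its own key and carries OwnerId as a foreign key to OWNER.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE OWNER (
    OwnerId INTEGER PRIMARY KEY              -- surrogate key for the category
);
CREATE TABLE PERSON (                        -- assumed superclass, keyed by Ssn
    Ssn     TEXT PRIMARY KEY,
    Name    TEXT,
    OwnerId INTEGER REFERENCES OWNER(OwnerId)
);
CREATE TABLE COMPANY (                       -- assumed superclass, keyed by CName
    CName   TEXT PRIMARY KEY,
    OwnerId INTEGER REFERENCES OWNER(OwnerId)
);
""")

# Hypothetical data: one person and one company are owners, one person is not.
conn.executemany("INSERT INTO OWNER (OwnerId) VALUES (?)", [(1,), (2,)])
conn.executemany(
    "INSERT INTO PERSON (Ssn, Name, OwnerId) VALUES (?, ?, ?)",
    [("123-45-6789", "Smith", 1), ("987-65-4321", "Wong", None)],
)
conn.execute("INSERT INTO COMPANY (CName, OwnerId) VALUES (?, ?)", ("Acme", 2))

# List every owner together with the superclass it comes from.
owners = conn.execute("""
    SELECT OwnerId, 'PERSON' FROM PERSON WHERE OwnerId IS NOT NULL
    UNION ALL
    SELECT OwnerId, 'COMPANY' FROM COMPANY WHERE OwnerId IS NOT NULL
    ORDER BY 1
""").fetchall()
print(owners)
```

A superclass entity that is not a member of the category simply has a NULL OwnerId, so the differing keys of the superclasses never have to be reconciled.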

FIGURE 4.8 Two categories (union types): OWNER and REGISTERED_VEHICLE.

FIGURE 7.6 Mapping the EER categories (union types) in Figure 4.7 to relations.

Mapping Exercise
Exercise 7.4. FIGURE 7.7 An ER schema for a SHIP_TRACKING database.