Databases 6: Normalization

Slides:



Advertisements
Similar presentations
Copyright © 2011 Pearson Education, Inc. Publishing as Pearson Addison-Wesley Chapter 16 Relational Database Design Algorithms and Further Dependencies.
Advertisements

Shantanu Narang.  Background  Why and What of Normalization  Quick Overview of Lower Normal Forms  Higher Order Normal Forms.
Chapter 3 Notes. 3.1 Functional Dependencies A functional dependency is a statement that – two tuples of a relation that agree on some particular set.
Database Management COP4540, SCS, FIU Functional Dependencies (Chapter 14)
Ch 10, Functional Dependencies and Normal forms
Functional Dependencies and Normalization for Relational Databases.
Functional Dependencies and Normalization for Relational Databases
Copyright © 2011 Pearson Education, Inc. Publishing as Pearson Addison-Wesley Chapter 15 Basics of Functional Dependencies and Normalization for Relational.
Part 6 Chapter 15 Normalization of Relational Database Csci455 r 1.
Copyright © 2007 Ramez Elmasri and Shamkant B. Navathe Slide
METU Department of Computer Eng Ceng 302 Introduction to DBMS Functional Dependencies and Normalization for Relational Databases by Pinar Senkul resources:
Copyright © 2007 Pearson Education, Inc. Publishing as Pearson Addison-Wesley Slide
Copyright © 2007 Ramez Elmasri and Shamkant B. Navathe Slide
1 Functional Dependency and Normalization Informal design guidelines for relation schemas. Functional dependencies. Normal forms. Normalization.
Copyright © 2007 Pearson Education, Inc. Publishing as Pearson Addison-Wesley Slide
Copyright © 2011 Pearson Education, Inc. Publishing as Pearson Addison-Wesley Chapter 15 Basics of Functional Dependencies and Normalization for Relational.
©Silberschatz, Korth and Sudarshan7.1Database System Concepts Chapter 7: Relational Database Design First Normal Form Pitfalls in Relational Database Design.
Chapter 10 Functional Dependencies and Normalization for Relational Databases.
Chapter 10 Functional Dependencies and Normalization for Relational Databases Copyright © 2004 Pearson Education, Inc.
CS 405G: Introduction to Database Systems 16. Functional Dependency.
Copyright © 2007 Pearson Education, Inc. Publishing as Pearson Addison-Wesley Chapter 10 Functional Dependencies and Normalization for Relational Databases.
Introduction to Normalization CPSC 356 Database Ellen Walker Hiram College.
King Saud University College of Computer & Information Sciences Computer Science Department CS 380 Introduction to Database Systems Functional Dependencies.
Your name here. Improving Schemas and Normalization What are redundancies and anomalies? What are functional dependencies and how are they related to.
DatabaseIM ISU1 Chapter 10 Functional Dependencies and Normalization for RDBs Fundamentals of Database Systems.
Copyright © 2007 Ramez Elmasri and Shamkant B. Navathe Slide
Lecture 5: Functional dependencies and normalization Jose M. Peña
Topic 10 Functional Dependencies and Normalization for Relational Databases Faculty of Information Science and Technology Mahanakorn University of Technology.
Instructor: Churee Techawut Functional Dependencies and Normalization for Relational Databases Chapter 4 CS (204)321 Database System I.
CS143 Review: Normalization Theory Q: Is it a good table design? We can start with an ER diagram or with a large relation that contain a sample of the.
BCNF & Lossless Decomposition Prof. Sin-Min Lee Department of Computer Science.
Functional Dependencies and Normalization for Relational Databases.
Chapter 10 Functional Dependencies and Normalization for Relational Databases Copyright © 2004 Pearson Education, Inc.
Chapter 10 Functional Dependencies and Normalization for Relational Databases Copyright © 2004 Pearson Education, Inc.
Chapter 10 Functional Dependencies and Normalization for Relational Databases Copyright © 2004 Pearson Education, Inc.
Copyright © 2011 Pearson Education, Inc. Publishing as Pearson Addison-Wesley Chapter 15 Basics of Functional Dependencies and Normalization for Relational.
Chapter Functional Dependencies and Normalization for Relational Databases.
CSE314 Database Systems Basics of Functional Dependencies and Normalization for Relational Databases Doç. Dr. Mehmet Göktürk src: Elmasri & Navanthe 6E.
Functional Dependencies and Normalization Jose M. Peña
Logical Database Design (1 of 3) John Ortiz Lecture 6Logical Database Design (1)2 Introduction  The logical design is a process of refining DB schema.
11/07/2003Akbar Mokhtarani (LBNL)1 Normalization of Relational Tables Akbar Mokhtarani LBNL (HENPC group) November 7, 2003.
1 Functional Dependencies and Normalization Chapter 15.
Copyright © 2007 Pearson Education, Inc. Publishing as Pearson Addison-Wesley Slide
Copyright © 2007 Ramez Elmasri and Shamkant B. Navathe Slide
Lecture 8: Database Concepts May 4, Outline From last lecture: creating views Normalization.
1 Functional Dependencies. 2 Motivation v E/R  Relational translation problems : –Often discover more “detailed” constraints after translation (upcoming.
Chapter 5.1 and 5.2 Brian Cobarrubia Database Management Systems II January 31, 2008.
Chapter 7 Functional Dependencies Copyright © 2004 Pearson Education, Inc.
CS 405G: Introduction to Database Systems Instructor: Jinze Liu Fall 2009.
Riyadh Philanthropic Society For Science Prince Sultan College For Woman Dept. of Computer & Information Sciences CS 340 Introduction to Database Systems.
CS 338Database Design and Normal Forms9-1 Database Design and Normal Forms Lecture Topics Measuring the quality of a schema Schema design with normalization.
Ch 7: Normalization-Part 1
© D. Wong Functional Dependencies (FD)  Given: relation schema R(A1, …, An), and X and Y be subsets of (A1, … An). FD : X  Y means X functionally.
Chapter 8 Relational Database Design. 2 Relational Database Design: Goals n Reduce data redundancy (undesirable replication of data values) n Minimize.
11/06/97J-1 Principles of Relational Design Chapter 12.
1 CS 430 Database Theory Winter 2005 Lecture 8: Functional Dependencies Second, Third, and Boyce-Codd Normal Forms.
Chapter 10 Functional Dependencies and Normalization for Relational Databases Copyright © 2004 Pearson Education, Inc.
Copyright © 2007 Pearson Education, Inc. Publishing as Pearson Addison-Wesley Slide
Chapter 14 Functional Dependencies and Normalization Informal Design Guidelines for Relational Databases –Semantics of the Relation Attributes –Redundant.
Copyright © 2007 Pearson Education, Inc. Publishing as Pearson Addison-Wesley Slide
CSC 411/511: DBMS Design Dr. Nan Wang 1 Schema Refinement and Normal Forms Chapter 19.
10/3/2017.
Functional Dependency and Normalization
Chapter 15 Basics of Functional Dependencies and Normalization for Relational Databases.
Chapter 15 Basics of Functional Dependencies and Normalization for Relational Databases.
Database Management systems Subject Code: 10CS54 Prepared By:
Chapter 15 Basics of Functional Dependencies and Normalization for Relational Databases.
Schema Refinement and Normal Forms
Lecture 5: Functional dependencies and normalization
Relational Database Design
Presentation transcript:

Databases 6: Normalization Hossein Rahmani

Design of a Relational DB-Schema What is a good relational database schema? Conceptual level (ER-schema, Views) Implementation level (consistency, efficiency of base relations = stored tables) How do you achieve a good schema? Bottom-up (start with relations between attributes) design by synthesis Top-down (start with ER-schema and then further decomposition): design by analysis 2 Databases lecture 6

Obtaining a good design We first give (informally) 4 guidelines Then we will give formal requirements incorporating the guidelines (based on functional dependencies) Databases lecture 6

Guideline 1: Semantics Make sure that the semantics (meaning) of all base relations and attributes is clear Tuples must be easily interpreted as ‘facts’ Do not mix, if possible, attributes of more than one entity or relation type in one base relation 4 Databases lecture 6

Examples of poor design Databases lecture 6

Guideline 2: Redundancy and anomalies Avoid redundancy: reduce the space that is needed to store the database as much as possible Prevent anomalies when changing data in the database update (insertion / deletion / modification) anomalies 6 Databases lecture 6

Redundancy - example Databases lecture 6

Update Anomalies Cause: doubly stored data, wrong design (example: see next slide) insertion anomalies: new tuple contains incorrect attribute value for an already stored entity new entity has a null key deletion anomalies Incomplete deletion of an entity unwanted deletion of an entity modification anomalies incomplete modification of an entity 8 Databases lecture 6

Guideline 3: NULL-values Some base relations contain many attributes that often are ‘NULL’ Unnecessary use of space Multiple meanings of ‘NULL’ JOIN operations can have undesired effects COUNT and SUM can go wrong SO: place an attribute in a base relation in which it is as least as possible ‘NULL’ 9 Databases lecture 6

Guideline 4: False (Spurious) Tuples If we select base relations wrong, a (NATURAL-)JOIN can create tuples that do not have any connection with the mini world (see next slides) So: select base relations such that at a JOIN on primary or foreign keys, no spurious tuples can occur. Don’t JOIN on other attributes 10 Databases lecture 6

Wrong choice - relations Databases lecture 6

Wrong choice: states Databases lecture 6

Natural join -> spurious tuples (marked *) Databases lecture 6

Normalization Using ‘normalization’, you can adhere to these guidelines for a large part In a number of steps (algorithms) you transfer a given relational database schema into an ever higher normal form Base concept: functional dependency 14 Databases lecture 6

Functional dependency Start with one universal relation schema R containing all attributes A1,..,An Given two attribute sets X and Y in R Functional dependency X  Y exists (X functionally determines Y; Y is functionally dependent on X) if: r(R): t1,t2  r: t1[X] = t2[X]  t1[Y] = t2[Y] i.e.: component X determines component Y 15 Databases lecture 6

Functional dependency AB  C A B C 16 Databases lecture 6

Functional dependency If X is a superkey of R then X  Y holds for each set Y of attributes in R If X  Y, then nothing can be concluded on the existence of Y  X X  Y follows from the semantics of the attributes in X and Y (which means that the designer should note and declare it) r(R) is legal if it agrees with all functional dependencies (FDs) declared on R 17 Databases lecture 6

Inference rules for FDs Six rules for deriving FDs: IR1 (reflexive): if X  Y then X  Y (trivial) As a special case: X  X IR2 (extension): {X  Y} |= XZ  YZ IR3 (transitive): {X  Y, Y  Z} |= X  Z IR4 (project): {X  YZ} |= X  Y IR5 (combine): {X  Y, X  Z} |= X  YZ IR6 (pseudotransitive): {X  Y, WY  Z} |= WX  Z 18 Databases lecture 6

Closure F+ of F Given a set of FDs for R: F(R) IR1-3 is sound & complete (Armstrong) Sound: If a new FD f can be derived from F(R) using IR1-3, and r(R) is legal for F, then r(R) is also legal for F  {f} Complete: If FD f holds on R then f can be derived from F using IR1-3 The set F+(R) of all FDs that can be derived from F, is called the closure F+ of F(R) 19 Databases lecture 6

Closure X+ under F Given X  Y  F. The closure X+ of X under F is the set of all attributes that are also functionally dependent on X Algorithm to determine X+: X+ := X ; do { oldX+ = X+ ; for all Y  Z  F : if X+  Y then X+ = X+  Z; } while (oldX+  X+) 20 Databases lecture 6

Equivalence Two sets of FDs, F and E are equivalent (F  E) iff F+ = E+ Semantically: if F  E, then r(R) is legal for F iff r(R) is legal for E By definition: F |= f iff F  F  {f} For each set F there exist many equivalent sets of FDs. We prefer simplicity: minimal cover 21 Databases lecture 6

Minimal Cover We can translate any F into an equivalent minimal cover G A set FDs G is a minimal cover of F if G  F and for all X  Y  G, Y has exactly one attribute (so, if X  YZ, then split into X  Y and X  Z) We cannot remove any X  Y from G without loosing equivalence with F We cannot replace any X  Y in G by W  Y with W  X, without loosing equivalence with F 22 Databases lecture 6

Algorithm for Minimal Cover 1) Start with G := F ; 2) Replace all X  Y with Y = {A1,..,An} by X  Ai ; (IR4) 3) For all XY  A: if G - {XY  A}  G  {X  A} then replace XY  A with X  A; 4) If G - {X  A}  G then remove X  A; Example: 1) AB  CD; C  D; A  CB; 2) AB  C; AB  D; C  D; A  C; A  B; 3) A  C; A  D; C  D ; A  B; 4) A  C; C  D ; A  B; 23 Databases lecture 6

Normal forms Invented by Codd as a test on relational database schemas The tests (‘normal forms’) grow more severe. The more severe the test, the higher the normal form, the more robust the database If a schema does not pass the test, it is decomposed in partial schemas that do pass the test It is not always necessary to reach the highest possible normal form 24 Databases lecture 6

1NF - First Normal Form Attributes can only be single-valued Is a basic demand of most relational databases Example of a non-1NF relation (see next slide). This normally is already ruled out by the definition of a relation (so using the relational database model automatically ensures 1NF) 25 Databases lecture 6

Non-1NF - example Databases lecture 6

1NF - First Normal Form Solutions for a multi-valued attribute A in R: Preferred: create new relation S with A and a foreign key to R Extend the key of R with an index number for the values of A (redundancy!); e.g. department has no. 5A, or 5B, or 5C Determine the maximum number of values per tuple for A (say k) and replace attribute by k attributes (say, loc1, loc2, and loc3). This introduces null-values! 27 Databases lecture 6

2NF - Second Normal form Definition: X  Y is a partial functional dependency if there is an attribute A in X s.t. X-{A}  Y X  Y is total if it is not partial 2NF: each non-primary attribute is totally dependent on primary key (and not on parts of the primary key) 28 Databases lecture 6

2NF - Normalizing Break up the relation such that every partial key with their dependent attributes is in a separate relation. Only keep those attributes that depend totally on the primary key Example (see next slide) 29 Databases lecture 6

2NF - example Databases lecture 6

3NF - Third Normal Form Definition: X  Y is a transitive dependency if there is a Z that is not (part of) a candidate key s.t. X  Z and Z  Y 3NF: no non-primary attribute is transitively depending on the primary key 31 Databases lecture 6

3NF - Normalizing Break up the relation such that the attributes that are depending on not-key attributes appear in a separate table (together with the attributes on which they depend) Example (see next slide) 32 Databases lecture 6

3NF - example Databases lecture 6

General form 2NF and 3NF Put the same demands on all candidate keys (super keys) – which is more severe 2NF: every non-key attribute is totally dependent on all keys 3NF: no non-key attribute is transitively dependent on any key Other formulation: if X  A then A is prime or X is a super key Example (see next slide) 34 Databases lecture 6

General form 2NF and 3NF - example Databases lecture 6

General form 2NF and 3NF - example Databases lecture 6

Boyce-Codd Normal Form Simpler, but stronger than 3NF BCNF: for each non-trivial dependency X  A holds that X is a super key Difference: in 3NF if A is a prime attribute, X does not have to be super key In many cases a 3NF schema is also BCNF 37 Databases lecture 6

BCNF example Databases lecture 6

Decompositions Only adhering to a normal form is not enough We must not lose attributes in the process! Non-additive Join-property: a natural join of the result of a decomposition should result in the original table, without spurious tuples There exist algorithms to automatically find good decompositions 39 Databases lecture 6

ER-schema to relational schemas A relational database schema that is mapped from an ER-schema is often in BCNF, but always in 3NF (so, check if BCNF is applicable and useful) Many CASE-tools can map an ER-schema automatically into a good relational schema (e.g., SQL create-table commands) 40 Databases lecture 6