7 Copyright © 2006, Oracle. All rights reserved. Normalization of Relational Tables (Part I)


Outline
Modification anomalies
Functional dependencies
Major normal forms
Practical concerns

Modification Anomalies (anomalies that occur when data is modified)
Definition:
  – Unexpected side effects that occur when changing the data in a table designed with excessive redundancy.
Result of the side effects:
  – More data is inserted, updated, or deleted than desired.
Types:
  – Insertion anomaly
  – Update anomaly
  – Deletion anomaly

Example of a Poor Table Design (Big University Database)
PK design: combination of StdSSN and OfferNo
Pros: easier to query (no join is needed)
  – enrollments of student S1 or S2
  – students of offering O2
  – students or offerings of course C2
Cons: the table has obvious redundancies (shown by colored blocks)
  – Result: more difficult to change

StdSSN  StdCity  StdClass  OfferNo  OffTerm  OffYear  EnrGrade  CourseNo  CrsDesc
S1      Seattle  JUN       O1       Fall              3.5       C1        DB
S1      Seattle  JUN       O2       Fall     2006     3.3       C2        VB
S2      Bothell  JUN       O3       Spring            3.1       C3        OO
S2      Bothell  JUN       O2       Fall     2006     3.4       C2        VB

Insertion Anomaly
Definition: In an insertion, extra data beyond the desired data may have to be added to the database.
Example: A new student cannot be inserted without being enrolled in an offering (because OfferNo is part of the PK).
  – More column data must be inserted than desired.
Other examples?
Why? Each row denotes a student, an offering, a course, and an enrollment. The PK consists of StdSSN (denoting the student) and OfferNo (denoting the offering).
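The insertion anomaly can be reproduced with a small in-memory SQLite database. This is a minimal sketch, not part of the original slides: the table name UnivTable, the column types, and the city value for student S3 are assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Single wide table from the slide. NOT NULL is spelled out on the key
# columns because SQLite does not imply it for a composite PRIMARY KEY.
conn.execute("""
    CREATE TABLE UnivTable (
        StdSSN   TEXT NOT NULL,
        StdCity  TEXT,
        StdClass TEXT,
        OfferNo  TEXT NOT NULL,
        OffTerm  TEXT,
        OffYear  INTEGER,
        EnrGrade REAL,
        CourseNo TEXT,
        CrsDesc  TEXT,
        PRIMARY KEY (StdSSN, OfferNo)
    )
""")

def try_insert_student_only(ssn, city, std_class):
    """Try to store a student with no offering; return True on success."""
    try:
        conn.execute(
            "INSERT INTO UnivTable (StdSSN, StdCity, StdClass) VALUES (?, ?, ?)",
            (ssn, city, std_class),
        )
        return True
    except sqlite3.IntegrityError:
        # OfferNo is part of the PK, so it cannot be left empty.
        return False

# "Tacoma" is a made-up city used only for this illustration.
inserted = try_insert_student_only("S3", "Tacoma", "JUN")
row_count = conn.execute("SELECT COUNT(*) FROM UnivTable").fetchone()[0]
```

The insert is rejected, so the student can only be stored together with offering data that may not exist yet.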

Update Anomaly
Definition: To modify a single fact, it may be necessary to change multiple rows.
Example: Changing a course description requires changing every enrollment row of that course.
  – Try to change C2's course description, …
Other examples?
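A short sketch of the update anomaly on the sample rows, reduced to the columns that matter here. The function name and the new description "Visual Basic" are illustrative assumptions.

```python
# Rows of the wide table, reduced to the columns relevant to the example.
rows = [
    {"StdSSN": "S1", "OfferNo": "O1", "CourseNo": "C1", "CrsDesc": "DB"},
    {"StdSSN": "S1", "OfferNo": "O2", "CourseNo": "C2", "CrsDesc": "VB"},
    {"StdSSN": "S2", "OfferNo": "O3", "CourseNo": "C3", "CrsDesc": "OO"},
    {"StdSSN": "S2", "OfferNo": "O2", "CourseNo": "C2", "CrsDesc": "VB"},
]

def change_description(table, course_no, new_desc):
    """Change one fact (a course description); return how many rows changed."""
    touched = 0
    for row in table:
        if row["CourseNo"] == course_no:
            row["CrsDesc"] = new_desc
            touched += 1
    return touched

# One fact changes, yet every enrollment row of C2 has to be rewritten.
touched = change_description(rows, "C2", "Visual Basic")
```

If any one of those rows were missed, the table would hold two different descriptions for the same course.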

Deletion Anomaly
Definition: Deleting a row may inadvertently cause other data to be deleted.
Example: Removing the enrollment of student S2 in offering O3 also loses the information about offering O3 and course C3.
Other examples?
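The same sample rows can illustrate the deletion anomaly. This is a sketch with assumed helper names; the rows are tuples of (StdSSN, OfferNo, CourseNo, CrsDesc).

```python
# (StdSSN, OfferNo, CourseNo, CrsDesc) for each row of the wide table.
rows = [
    ("S1", "O1", "C1", "DB"),
    ("S1", "O2", "C2", "VB"),
    ("S2", "O3", "C3", "OO"),
    ("S2", "O2", "C2", "VB"),
]

def known_offerings(table):
    """Offerings the table still has any information about."""
    return {offer_no for _, offer_no, _, _ in table}

def known_courses(table):
    """Courses the table still has any information about."""
    return {course_no for _, _, course_no, _ in table}

# Remove only the enrollment of S2 in O3.
remaining = [r for r in rows if not (r[0] == "S2" and r[1] == "O3")]

lost_offerings = known_offerings(rows) - known_offerings(remaining)
lost_courses = known_courses(rows) - known_courses(remaining)
```

Deleting one enrollment also removes the only record of offering O3 and course C3.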

Example of a Better Table Design (Big University Database: 4 tables denoting 4 objects, plus FKs)

Table Name: Student
StdSSN  StdLastName  StdClass
S1      WELLS        JUN
S2      NORBERT      JUN
S3      KENDALL      JUN

Table Name: Offering
OfferNo  OffYear  CourseNo
O1       MW       C1
O2       MW       C2
O3       MW       C3

Table Name: Enrollment
StdSSN  OfferNo  EnrGrade
S1      O1       3.5
S1      O2       3.3
S2      O3       3.1
S2      O2       3.4

Table Name: Course
CourseNo  CrsDesc
C1        DB
C2        VB
C3        OO

Insert anomaly? Update anomaly? Delete anomaly?

Normalization
A good database design ensures that users can change the contents of a database without unexpected side effects (modification anomalies).
  – A better solution is to modify the table design to remove the redundancies that cause the anomalies.
Normalization:
  – The process of removing redundancies in a table so that the table is easier to modify.

Constraints on Database Content
Value-based constraints: a comparison of a column to a constant
  – Example: Age >= 21
Value-neutral constraints: a comparison of columns (column to column)
  – PK (entity integrity constraint): constraint about the PK column of one or more rows
  – FK (referential integrity constraint): constraint about the parent PK and child FK of one or more rows
  – Functional dependency: constraint about two or more columns of a table

Functional Dependency
"X determines Y" is denoted as X → Y.
For each X value, there is at most one Y value.
X: left-hand side (LHS) or determinant
Y: right-hand side (RHS)
Like a mathematical function: Y = f(X)
  – f: like a table
  – X: like the key of a table
  – Y: like a column of a table
Example: StdSSN → StdName, StdSSN → StdClass

Functional Dependency (FD)
Think of functional dependencies as identifying potential candidate keys.
X → Y denotes an FD between columns X and Y.
  – If X and Y are placed together in a table without other columns, X is a candidate key.

Functional Dependency Diagram and List
[Figure: functional dependency diagram drawn over the table scheme]
List of functional dependencies:
  StdSSN, OfferNo → EnrGrade
  StdSSN → StdCity, StdClass
  OfferNo → OffTerm, OffYear, CourseNo, CrsDesc
  CourseNo → CrsDesc
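Given the FD list above, the candidate key of the wide table can be derived mechanically with an attribute closure. The sketch below uses the standard closure algorithm and a brute-force search over column subsets; the function names are assumptions, and the search is only practical for small tables like this one.

```python
from itertools import combinations

# The four FDs from the list, written as (LHS, RHS) pairs of column sets.
FDS = [
    ({"StdSSN", "OfferNo"}, {"EnrGrade"}),
    ({"StdSSN"}, {"StdCity", "StdClass"}),
    ({"OfferNo"}, {"OffTerm", "OffYear", "CourseNo", "CrsDesc"}),
    ({"CourseNo"}, {"CrsDesc"}),
]
ALL_COLUMNS = set().union(*(lhs | rhs for lhs, rhs in FDS))

def closure(columns, fds):
    """All columns determined by `columns` under the given FDs."""
    result = set(columns)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result

def candidate_keys(all_columns, fds):
    """Minimal column sets whose closure covers every column."""
    keys = []
    for size in range(1, len(all_columns) + 1):
        for combo in combinations(sorted(all_columns), size):
            combo = set(combo)
            if any(key <= combo for key in keys):
                continue  # a superset of a known key is not minimal
            if closure(combo, fds) == all_columns:
                keys.append(combo)
    return keys

keys = candidate_keys(ALL_COLUMNS, FDS)
```

For this FD list the only candidate key is the combination of StdSSN and OfferNo, which matches the PK design of the wide table.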

How to Identify Functional Dependencies
Derive them from uniqueness statements
Derive them from 1-M relationships
Consider minimalism of the FD's LHS (determinant)

How to Identify Functional Dependencies: Deriving from a Uniqueness Statement
Example:
  – A user may state that each course offering has a unique offering number along with the year and term of the offering: OfferNo → OffYear, OffTerm

How to Identify Functional Dependencies: Deriving from 1-M Relationships
For a 1-M relationship, an FD exists in
  – the child-to-parent direction (not the parent-to-child direction),
  – because each LHS value of an FD can be associated with at most one RHS value.
  – Example: A faculty member teaches many offerings, but an offering is taught by one faculty member: OfferNo → FacNo

How to Identify Functional Dependencies: Minimalism of the FD's LHS (Determinant)
The determinant of an FD (the columns appearing on the LHS)
  – must be minimal (it cannot contain extra columns).
One column vs. a combination of columns
  – An FD whose LHS contains more than one column may represent an M-N relationship.
  – Example: OrdNo, ProdNo → OrdQty. Order quantity depends on the combination of order number and product number.

Eliminating FDs Using Sample Data
An FD cannot exist if
  – two rows of a table have the same value for the LHS but different values for the RHS of the FD.
An FD cannot be proven to exist by examining only the rows of a table. However, you can falsify an FD by examining the contents of a table.
  – Use sample data to eliminate potential FDs.

Eliminating FDs Using Sample Data
To disprove X → Y, find two rows that have the same X value but different Y values.
Examples: OfferNo → StdSSN (?)   StdSSN → OffYear (?)

StdSSN  StdClass  OfferNo  OffYear  EnrGrade  CourseNo  CrsDesc
S1      JUN       O1                3.5       C1        DB
S1      JUN       O2       2006     3.3       C2        VB
S2      JUN       O3                3.1       C3        OO
S2      JUN       O2       2006     3.4       C2        VB
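Falsifying an FD amounts to searching for a pair of witness rows. The sketch below is one way to do it; the function name is an assumption, and the sample is restricted to the columns whose values are certain. The second check (CourseNo → OfferNo) is an extra illustration, not one of the slide's examples.

```python
sample = [
    {"StdSSN": "S1", "OfferNo": "O1", "EnrGrade": 3.5, "CourseNo": "C1"},
    {"StdSSN": "S1", "OfferNo": "O2", "EnrGrade": 3.3, "CourseNo": "C2"},
    {"StdSSN": "S2", "OfferNo": "O3", "EnrGrade": 3.1, "CourseNo": "C3"},
    {"StdSSN": "S2", "OfferNo": "O2", "EnrGrade": 3.4, "CourseNo": "C2"},
]

def find_counterexample(rows, lhs, rhs):
    """Return two rows that agree on lhs but differ on rhs, or None."""
    first_seen = {}
    for row in rows:
        x = tuple(row[c] for c in lhs)
        earlier = first_seen.setdefault(x, row)
        if any(earlier[c] != row[c] for c in rhs):
            return earlier, row
    return None

# OfferNo -> StdSSN is falsified: offering O2 appears with S1 and with S2.
witness = find_counterexample(sample, ["OfferNo"], ["StdSSN"])

# CourseNo -> OfferNo is NOT falsified by this sample, but that proves
# nothing: a larger sample with a second offering of C2 would break it.
not_falsified = find_counterexample(sample, ["CourseNo"], ["OfferNo"])
```

A result of None only means the sample contains no counterexample; whether the FD really holds must come from the business rules.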

Normal Forms
Normalization: the process of removing redundancies in a table so that the table is easier to modify.
A normal form is a rule about allowable FDs in tables.
Each normal form removes certain kinds of redundancies.
First normal form (1NF) is the starting point.
Second normal form (2NF) is stronger (stricter) than 1NF.
  – Only a subset of the 1NF tables is in 2NF.
3NF/BCNF is the most important in practice, because normal forms higher than 3NF/BCNF involve other kinds of dependencies that are less common and more difficult to understand.

Relationships of Normal Forms

First Normal Form (1NF)
1NF prohibits nesting or repeating data groups in a table.
Starting point of normalization for most relational DBMSs
  – Most commercial DBMSs use 1NF tables.
A table not in 1NF is unnormalized or nonnormalized.

First Normal Form

StdSSN  StdClass  OfferNo  OffYear  EnrGrade  CourseNo  CrsDesc
S1      JUN       O1                3.5       C1        DB
                  O2       2006     3.3       C2        VB
S2      JUN       O3                3.1       C3        OO
                  O2       2006     3.4       C2        VB

The table above is not normalized (not in 1NF).
The table has 2 rows containing repeating groups or nested columns.
  – The S1 row has 5 nested columns (OfferNo, OffYear, …)
  – The S2 row has 5 nested columns (OfferNo, OffYear, …)

Convert to First Normal Form
Replace each repeating group with a row.
In each new row, copy the nonrepeating columns.
  – (S1, JUN) for row two with (O2, 2006, 3.3, C2, VB)
  – (S2, JUN) for row four with (O2, 2006, 3.4, C2, VB)
Redefine the PK if necessary.

StdSSN  StdClass  OfferNo  OffYear  EnrGrade  CourseNo  CrsDesc
S1      JUN       O1                3.5       C1        DB
S1      JUN       O2       2006     3.3       C2        VB
S2      JUN       O3                3.1       C3        OO
S2      JUN       O2       2006     3.4       C2        VB
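The conversion to 1NF can be sketched as flattening a nested structure. In this illustration the repeating group is modeled as a list under a hypothetical "Enrollments" field, and the OffYear column is left out because not all of its values are available.

```python
# Unnormalized data: each student row carries a nested repeating group.
nested = [
    {"StdSSN": "S1", "StdClass": "JUN", "Enrollments": [
        {"OfferNo": "O1", "EnrGrade": 3.5, "CourseNo": "C1", "CrsDesc": "DB"},
        {"OfferNo": "O2", "EnrGrade": 3.3, "CourseNo": "C2", "CrsDesc": "VB"},
    ]},
    {"StdSSN": "S2", "StdClass": "JUN", "Enrollments": [
        {"OfferNo": "O3", "EnrGrade": 3.1, "CourseNo": "C3", "CrsDesc": "OO"},
        {"OfferNo": "O2", "EnrGrade": 3.4, "CourseNo": "C2", "CrsDesc": "VB"},
    ]},
]

def to_1nf(nested_rows, group_field):
    """Emit one flat row per group member, copying the nonrepeating columns."""
    flat = []
    for parent in nested_rows:
        fixed = {k: v for k, v in parent.items() if k != group_field}
        for member in parent[group_field]:
            flat.append({**fixed, **member})
    return flat

flat = to_1nf(nested, "Enrollments")
```

After flattening, StdSSN alone no longer identifies a row, which is why the PK is redefined as the combination of StdSSN and OfferNo.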

Second Normal Form (2NF)
The goal of 2NF and 3NF is to produce tables in which every key determines the other columns.
The definitions of 2NF and 3NF distinguish between key and nonkey columns.
  – A column is a key column if it is a candidate key or part of a candidate key.
  – A nonkey column is any other column.

Second Normal Form (2NF)
Partial dependency: a nonkey column depends on a subset of the columns in a candidate key (part of a compound key → a nonkey column).
A table is in 2NF if no partial dependency exists, that is, if
  – every candidate key contains only one column, or
  – each nonkey column depends on all of the columns of every candidate key, not on a subset of them.

Second Normal Form
Violation of 2NF (a partial dependency exists)
  – Part of a compound key → a nonkey column
  – Only compound keys need to be checked. A table with a single-column key cannot violate 2NF.
Steps for converting to 2NF
  1. Analyze the FDs.
  2. Find the FDs that violate 2NF (FD1, FD2 in the next slide).
  3. Split the original table into smaller tables that satisfy the 2NF definition (split the columns of every violating FD into a new table).
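The steps above can be sketched for the Big University table. This is a simplified illustration with assumed function names: it inspects only the FDs as listed (not FDs implied by them) and assumes the single candidate key (StdSSN, OfferNo).

```python
KEY = {"StdSSN", "OfferNo"}
FDS = [
    ({"StdSSN", "OfferNo"}, {"EnrGrade"}),
    ({"StdSSN"}, {"StdCity", "StdClass"}),
    ({"OfferNo"}, {"OffTerm", "OffYear", "CourseNo", "CrsDesc"}),
    ({"CourseNo"}, {"CrsDesc"}),
]
ALL_COLUMNS = set().union(*(lhs | rhs for lhs, rhs in FDS))

def partial_dependencies(key, fds):
    """FDs whose LHS is a proper subset of the key and whose RHS has nonkey columns."""
    return [(lhs, rhs) for lhs, rhs in fds if lhs < key and rhs - key]

def split_for_2nf(all_columns, key, fds):
    """Move the columns of each violating FD into a new table."""
    remaining = set(all_columns)
    new_tables = []
    for lhs, rhs in partial_dependencies(key, fds):
        new_tables.append(lhs | rhs)
        remaining -= rhs  # the LHS stays behind as a foreign key
    return remaining, new_tables

violations = partial_dependencies(KEY, FDS)
remaining, new_tables = split_for_2nf(ALL_COLUMNS, KEY, FDS)
```

CourseNo → CrsDesc is not reported here because CourseNo is not part of the key; that FD is a 3NF concern, not a 2NF one.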

Convert to Second Normal Form (Analyze FDs)
[Figure: the poor-design table (StdSSN, StdCity, StdClass, OfferNo, OffTerm, OffYear, EnrGrade, CourseNo, CrsDesc) annotated with FD 1, FD 2, FD 3, and FD 4]
PK = ?
Any partial dependency? FD1, FD2, FD3, FD4?

Convert to Second Normal Form (Splitting the Original Table)
Split the original table into smaller tables that satisfy the 2NF definition.
  – In each smaller table, the entire primary key should determine the nonkey columns.
  – The original table should be recoverable by applying natural join operations to the smaller tables.
  – The FDs in the original table should be derivable from the FDs in the smaller tables.
The splitting process uses the project operator of relational algebra.
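The recoverability requirement can be checked on the sample rows with hand-written project and natural join operators. This is a sketch with assumed helper names; OffYear is omitted because not all of its values are available.

```python
COLUMNS = ["StdSSN", "StdCity", "StdClass", "OfferNo", "OffTerm",
           "EnrGrade", "CourseNo", "CrsDesc"]
wide = [dict(zip(COLUMNS, values)) for values in [
    ("S1", "Seattle", "JUN", "O1", "Fall", 3.5, "C1", "DB"),
    ("S1", "Seattle", "JUN", "O2", "Fall", 3.3, "C2", "VB"),
    ("S2", "Bothell", "JUN", "O3", "Spring", 3.1, "C3", "OO"),
    ("S2", "Bothell", "JUN", "O2", "Fall", 3.4, "C2", "VB"),
]]

def project(rows, columns):
    """Relational projection: keep the given columns and drop duplicate rows."""
    seen, out = set(), []
    for row in rows:
        values = tuple(row[c] for c in columns)
        if values not in seen:
            seen.add(values)
            out.append(dict(zip(columns, values)))
    return out

def natural_join(left, right):
    """Combine rows that agree on every column name the two inputs share."""
    if not left or not right:
        return []
    shared = [c for c in left[0] if c in right[0]]
    return [{**l, **r} for l in left for r in right
            if all(l[c] == r[c] for c in shared)]

def as_set(rows):
    """Order-independent view of a table, for comparison."""
    return {tuple(sorted(row.items())) for row in rows}

students = project(wide, ["StdSSN", "StdCity", "StdClass"])
offerings = project(wide, ["OfferNo", "OffTerm", "CourseNo", "CrsDesc"])
enrollments = project(wide, ["StdSSN", "OfferNo", "EnrGrade"])

rebuilt = natural_join(natural_join(enrollments, students), offerings)
lossless = as_set(rebuilt) == as_set(wide)
```

The projections shrink to 2 student rows and 3 offering rows, and joining them back through the 4 enrollment rows reproduces the original table exactly.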

Convert to Second Normal Form
After splitting, add referential integrity constraints to connect the tables.
UnivTable1 (StdSSN, StdCity, StdClass)
UnivTable2 (OfferNo, OffTerm, OffYear, CourseNo, CrsDesc)
UnivTable3 (StdSSN, OfferNo, EnrGrade)
  FOREIGN KEY (StdSSN) REFERENCES UnivTable1
  FOREIGN KEY (OfferNo) REFERENCES UnivTable2
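The three tables and their foreign keys can be tried out in an in-memory SQLite database. This is a sketch under assumptions: the column types are guesses, offering O9 is a made-up value that does not exist, and SQLite needs foreign key enforcement switched on explicitly.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only when enabled
conn.executescript("""
    CREATE TABLE UnivTable1 (
        StdSSN TEXT PRIMARY KEY, StdCity TEXT, StdClass TEXT
    );
    CREATE TABLE UnivTable2 (
        OfferNo TEXT PRIMARY KEY, OffTerm TEXT, OffYear INTEGER,
        CourseNo TEXT, CrsDesc TEXT
    );
    CREATE TABLE UnivTable3 (
        StdSSN   TEXT NOT NULL REFERENCES UnivTable1 (StdSSN),
        OfferNo  TEXT NOT NULL REFERENCES UnivTable2 (OfferNo),
        EnrGrade REAL,
        PRIMARY KEY (StdSSN, OfferNo)
    );
""")

# A student can now be stored without any enrollment.
conn.execute("INSERT INTO UnivTable1 VALUES ('S3', NULL, 'JUN')")
student_count = conn.execute("SELECT COUNT(*) FROM UnivTable1").fetchone()[0]

def enrollment_rejected(ssn, offer_no, grade):
    """Return True if the referential integrity constraints block the insert."""
    try:
        conn.execute("INSERT INTO UnivTable3 VALUES (?, ?, ?)",
                     (ssn, offer_no, grade))
        return False
    except sqlite3.IntegrityError:
        return True

# Offering O9 is hypothetical and absent from UnivTable2.
rejected = enrollment_rejected("S3", "O9", 3.0)
```

The split removes the insertion anomaly for students, while the foreign keys keep enrollments from pointing at offerings that do not exist.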

Third Normal Form (3NF)
A table is in 3NF if
  – it is in 2NF (no partial dependency), and
  – each nonkey column depends only on candidate keys, not on other nonkey columns (no transitive dependency).
Transitive dependency
  – A nonkey column depends on other nonkey columns.
  – If A → B and B → C, then A → C. So A → C is a transitive dependency, and B → C causes a violation of 3NF.
  – Example: OfferNo → CourseNo and CourseNo → CrsDesc, so OfferNo → CrsDesc.

Convert to Third Normal Form
Consider UnivTable2 (OfferNo, OffTerm, OffYear, CourseNo, CrsDesc)
  – OfferNo → CourseNo
  – CourseNo → CrsDesc
  – OfferNo → CrsDesc (transitive)
CourseNo → CrsDesc causes a violation of 3NF in UnivTable2.

Convert to Third Normal Form
UnivTable2 (OfferNo, OffTerm, OffYear, CourseNo, CrsDesc) is split into:
UnivTable2-1 (CourseNo, CrsDesc)
UnivTable2-2 (OfferNo, OffTerm, OffYear, CourseNo)
  FOREIGN KEY (CourseNo) REFERENCES UnivTable2-1
Steps for converting to 3NF
  1. Find the FDs that violate 3NF.
  2. Split the original table into smaller tables that satisfy the 3NF definition (split the columns of every violating FD into a new table).
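The two steps can be sketched for UnivTable2. As with the 2NF sketch, this is a simplified illustration with assumed function names: it follows the slide's definition (a nonkey column determined by nonkey columns) and looks only at the listed FDs.

```python
UNIV_TABLE2 = {"OfferNo", "OffTerm", "OffYear", "CourseNo", "CrsDesc"}
KEY = {"OfferNo"}
FDS = [
    ({"OfferNo"}, {"OffTerm", "OffYear", "CourseNo", "CrsDesc"}),
    ({"CourseNo"}, {"CrsDesc"}),
]

def violations_of_3nf(key, fds):
    """FDs in which nonkey columns determine other nonkey columns."""
    return [(lhs, rhs) for lhs, rhs in fds
            if lhs.isdisjoint(key) and rhs - key]

def split_for_3nf(columns, key, fds):
    """Move the columns of each violating FD into a new table."""
    remaining = set(columns)
    new_tables = []
    for lhs, rhs in violations_of_3nf(key, fds):
        new_tables.append(lhs | rhs)
        remaining -= rhs  # the LHS stays behind as a foreign key
    return remaining, new_tables

violating = violations_of_3nf(KEY, FDS)
remaining, new_tables = split_for_3nf(UNIV_TABLE2, KEY, FDS)
```

The result corresponds to UnivTable2-1 (the new table) and UnivTable2-2 (the remaining columns, with CourseNo kept as the foreign key).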

Self-practice homework (HW): Chapter 7, page 239, Questions 1, 2, 3, 14, 15, 24