7 Copyright © 2006, Oracle. All rights reserved. Normalization of Relational Tables (Part I)


1 7 Copyright © 2006, Oracle. All rights reserved. Normalization of Relational Tables (Part I)

2 7.1 - 2 Outline
Modification anomalies
Functional dependencies
Major normal forms
Practical concerns

3 7.1 - 3 Modification Anomalies
Definition:
  – Unexpected side effects that occur when changing the data in a table designed with excessive redundancy.
Result of the side effects:
  – More data is inserted, updated, or deleted than desired.
Types:
  – Insertion anomaly
  – Update anomaly
  – Deletion anomaly

4 7.1 - 4 Example of a Poor Table Design (Big University Database)
PK design: the combination of StdSSN and OfferNo
Pros: easier to query (no join is needed)
  – enrollments of student S1 or S2
  – students of offering O2
  – students or offerings of course C2
Cons: the table has obvious redundancies (shown by colored blocks on the slide)
  – Result: the table is more difficult to change
StdSSN  StdCity  StdClass  OfferNo  OffTerm  OffYear  EnrGrade  CourseNo  CrsDesc
S1      Seattle  JUN       O1       Fall     2006     3.5       C1        DB
S1      Seattle  JUN       O2       Fall     2006     3.3       C2        VB
S2      Bothell  JUN       O3       Spring   2007     3.1       C3        OO
S2      Bothell  JUN       O2       Fall     2006     3.4       C2        VB

5 7.1 - 5 Insertion Anomaly
Definition: to insert the desired data, extra data beyond it must also be added to the database.
Example: a new student cannot be inserted without being enrolled in an offering, because OfferNo is part of the PK.
  – More column data must be inserted than desired.
Why? Each row denotes a student, an offering, a course, and an enrollment. The PK consists of StdSSN (denoting the student) and OfferNo (denoting the offering).
Other examples?
StdSSN  StdCity  StdClass  OfferNo  OffTerm  OffYear  EnrGrade  CourseNo  CrsDesc
S1      Seattle  JUN       O1       Fall     2006     3.5       C1        DB
S1      Seattle  JUN       O2       Fall     2006     3.3       C2        VB
S2      Bothell  JUN       O3       Spring   2007     3.1       C3        OO
S2      Bothell  JUN       O2       Fall     2006     3.4       C2        VB
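The insertion anomaly can be reproduced with Python's built-in sqlite3 module. This is a minimal sketch, not part of the original slides: the table name UnivTable, the in-memory database, and the new student S3 from "Tacoma" are assumptions for illustration. NOT NULL is written out on the key columns because SQLite does not otherwise reject NULL in a non-integer primary key.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE UnivTable (
        StdSSN   TEXT NOT NULL,
        StdCity  TEXT,
        StdClass TEXT,
        OfferNo  TEXT NOT NULL,
        OffTerm  TEXT,
        OffYear  INTEGER,
        EnrGrade REAL,
        CourseNo TEXT,
        CrsDesc  TEXT,
        PRIMARY KEY (StdSSN, OfferNo)
    )
""")


def try_insert_student_only():
    """Try to record a new student who has no enrollment yet."""
    try:
        # Hypothetical student S3; no OfferNo is supplied.
        conn.execute(
            "INSERT INTO UnivTable (StdSSN, StdCity, StdClass) VALUES (?, ?, ?)",
            ("S3", "Tacoma", "JUN"),
        )
        return "inserted"
    except sqlite3.IntegrityError as exc:
        # OfferNo is part of the PK, so the row is refused.
        return f"rejected: {exc}"


result = try_insert_student_only()
print(result)
```

The student can only be stored by also supplying an offering, which is exactly the extra data the definition refers to.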

6 7.1 - 6 Update Anomaly
Definition: to modify a single fact, it may be necessary to change multiple rows.
Example: changing a course description requires changing every enrollment row of that course.
  – Try to change C2's course description, ….
Other examples?
StdSSN  StdCity  StdClass  OfferNo  OffTerm  OffYear  EnrGrade  CourseNo  CrsDesc
S1      Seattle  JUN       O1       Fall     2006     3.5       C1        DB
S1      Seattle  JUN       O2       Fall     2006     3.3       C2        VB
S2      Bothell  JUN       O3       Spring   2007     3.1       C3        OO
S2      Bothell  JUN       O2       Fall     2006     3.4       C2        VB
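A short sketch of the update anomaly, again using sqlite3 with an assumed table name (UnivTable) and only the columns needed for the point. The replacement description "Visual Basic" is a made-up value.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE UnivTable (StdSSN TEXT, OfferNo TEXT, CourseNo TEXT, CrsDesc TEXT)"
)
conn.executemany(
    "INSERT INTO UnivTable VALUES (?, ?, ?, ?)",
    [
        ("S1", "O1", "C1", "DB"),
        ("S1", "O2", "C2", "VB"),
        ("S2", "O3", "C3", "OO"),
        ("S2", "O2", "C2", "VB"),
    ],
)

# One fact changes (the description of course C2) ...
cursor = conn.execute(
    "UPDATE UnivTable SET CrsDesc = ? WHERE CourseNo = ?",
    ("Visual Basic", "C2"),
)

# ... but every enrollment row of C2 has to be rewritten.
rows_changed = cursor.rowcount
print(rows_changed)
```

If the UPDATE missed one of those rows, the table would hold two different descriptions for the same course.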

7 7.1 - 7 Deletion Anomaly
Definition: deleting a row may inadvertently cause other data to be deleted.
Example: removing the enrollment of student S2 in offering O3 also loses the information about offering O3 and course C3.
Other examples?
StdSSN  StdCity  StdClass  OfferNo  OffTerm  OffYear  EnrGrade  CourseNo  CrsDesc
S1      Seattle  JUN       O1       Fall     2006     3.5       C1        DB
S1      Seattle  JUN       O2       Fall     2006     3.3       C2        VB
S2      Bothell  JUN       O3       Spring   2007     3.1       C3        OO
S2      Bothell  JUN       O2       Fall     2006     3.4       C2        VB
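The deletion example can be checked the same way. This is an illustrative sketch with an assumed table name (UnivTable) and a hypothetical helper, count_rows, which counts how many rows still mention a given value.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE UnivTable (StdSSN TEXT, OfferNo TEXT, OffTerm TEXT, "
    "OffYear INTEGER, CourseNo TEXT, CrsDesc TEXT)"
)
conn.executemany(
    "INSERT INTO UnivTable VALUES (?, ?, ?, ?, ?, ?)",
    [
        ("S1", "O1", "Fall", 2006, "C1", "DB"),
        ("S1", "O2", "Fall", 2006, "C2", "VB"),
        ("S2", "O3", "Spring", 2007, "C3", "OO"),
        ("S2", "O2", "Fall", 2006, "C2", "VB"),
    ],
)


def count_rows(column, value):
    """Count rows that mention value in column (column is a trusted name)."""
    query = f"SELECT COUNT(*) FROM UnivTable WHERE {column} = ?"
    return conn.execute(query, (value,)).fetchone()[0]


before = (count_rows("OfferNo", "O3"), count_rows("CourseNo", "C3"))

# Remove only the enrollment of S2 in O3.
conn.execute("DELETE FROM UnivTable WHERE StdSSN = ? AND OfferNo = ?", ("S2", "O3"))

after = (count_rows("OfferNo", "O3"), count_rows("CourseNo", "C3"))
print(before, after)
```

After the delete, no row mentions offering O3 or course C3, so their term, year, and description are gone along with the enrollment.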

8 7.1 - 8 Example of a Better Table Design (Big University Database: 4 Tables Denoting 4 Objects + FKs)
Table name: Student
StdSSN  StdLastName  StdClass
S1      WELLS        JUN
S2      NORBERT      JUN
S3      KENDALL      JUN
Table name: Course
CourseNo  CrsDesc
C1        DB
C2        VB
C3        OO
Table name: Offering
OfferNo  OffYear  CourseNo
O1       MW       C1
O2       MW       C2
O3       MW       C3
Table name: Enrollment
StdSSN  OfferNo  EnrGrade
S1      O1       3.5
S1      O2       3.3
S2      O3       3.1
S2      O2       3.4
Callouts on the slide: Delete anomaly? Update anomaly? Insert anomaly?

9 7.1 - 9 Normalization
A good database design ensures that users can change the contents of a database without unexpected side effects (modification anomalies).
  – A better solution is to modify the table design to remove the redundancies that cause the anomalies.
Normalization:
  – The process of removing redundancies in a table so that the table is easier to modify.

10 7.1 - 10 Constraints of Database Content
Value-based constraints: a comparison of a column to a constant
  – Example: Age >= 21
Value-neutral constraints: a comparison of columns (column to column)
  – PK (entity integrity constraint): a constraint about the PK column of one or more rows
  – FK (referential integrity constraint): a constraint about the parent PK and child FK of one or more rows
  – Functional dependency: a constraint about two or more columns of a table

11 7.1 - 11 Functional Dependency
"X determines Y" is denoted as X → Y
For each X value, there is at most one Y value
X: left-hand side (LHS) or determinant
Y: right-hand side (RHS)
Like a mathematical function: Y = f(X)
  – f: like a table
  – X: like the key of a table
  – Y: like a column of a table
Example:
  StdSSN → StdName
  StdSSN → StdClass

12 7.1 - 12 Functional Dependency (FD)
Think about functional dependencies as identifying potential candidate keys
X → Y denotes an FD between columns X and Y
  – If X and Y are placed together in a table without other columns, X is a candidate key.

13 7.1 - 13 Functional Dependency Diagram and List
Table scheme and functional dependency diagram: (figure not included in the transcript)
List of functional dependencies:
  StdSSN, OfferNo → EnrGrade
  StdSSN → StdCity, StdClass
  OfferNo → OffTerm, OffYear, CourseNo, CrsDesc
  CourseNo → CrsDesc

14 7.1 - 14 How to Identify Functional Dependencies
Deriving from uniqueness statements
Deriving from 1-M relationships
Considering minimalism of an FD's LHS (determinant)

15 7.1 - 15 How to Identify Functional Dependencies
Deriving from uniqueness statements
Example:
  – A user may state that each course offering has a unique offering number along with the year and term of the offering: OfferNo → OffYear, OffTerm

16 7.1 - 16 How to Identify Functional Dependencies
Deriving from 1-M relationships
For a 1-M relationship, an FD exists in
  – the child-to-parent direction (not the parent-to-child direction),
  – because each LHS value of an FD can be associated with at most one RHS value.
  – Example: a faculty member teaches many offerings, but each offering is taught by one faculty member: OfferNo → FacNo

17 7.1 - 17 How to Identify Functional Dependencies
Minimalism of an FD's LHS (determinant)
The determinant of an FD (the columns appearing on the LHS)
  – must be minimal (it cannot contain extra columns).
One column vs. a combination of columns
  – An FD whose LHS contains more than one column may represent an M-N relationship.
  – Example: OrdNo, ProdNo → OrdQty. The order quantity depends on the combination of order number and product number.

18 7.1 - 18 Eliminating FDs Using Sample Data
An FD cannot exist if
  – two rows of a table have the same value for the LHS but different values for the RHS of the FD.
An FD cannot be proven to exist by only examining the rows of a table. However, you can falsify an FD by examining the content of a table.
  – Use sample data to eliminate potential FDs.

19 7.1 - 19 Eliminating FDs Using Sample Data
To disprove X → Y, find two rows that have the same X value but different Y values.
Example:
  OfferNo → StdSSN (?)
  StdSSN → OffYear (?)
StdSSN  StdClass  OfferNo  OffYear  EnrGrade  CourseNo  CrsDesc
S1      JUN       O1       2006     3.5       C1        DB
S1      JUN       O2       2006     3.3       C2        VB
S2      JUN       O3       2007     3.1       C3        OO
S2      JUN       O2       2006     3.4       C2        VB
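The falsification test on this slide is mechanical enough to sketch in a few lines of Python. The function name find_counterexample is a hypothetical helper, not something from the slides. It searches the sample rows for a pair that agrees on the LHS but differs on the RHS.

```python
from itertools import combinations

COLUMNS = ("StdSSN", "StdClass", "OfferNo", "OffYear", "EnrGrade", "CourseNo", "CrsDesc")
DATA = [
    ("S1", "JUN", "O1", 2006, 3.5, "C1", "DB"),
    ("S1", "JUN", "O2", 2006, 3.3, "C2", "VB"),
    ("S2", "JUN", "O3", 2007, 3.1, "C3", "OO"),
    ("S2", "JUN", "O2", 2006, 3.4, "C2", "VB"),
]
ROWS = [dict(zip(COLUMNS, values)) for values in DATA]


def find_counterexample(rows, lhs, rhs):
    """Return two rows that agree on lhs but differ on rhs, or None."""
    for a, b in combinations(rows, 2):
        same_lhs = all(a[col] == b[col] for col in lhs)
        same_rhs = all(a[col] == b[col] for col in rhs)
        if same_lhs and not same_rhs:
            return a, b
    return None


# Both candidate FDs from the slide are falsified by the sample data.
offer_to_student = find_counterexample(ROWS, ["OfferNo"], ["StdSSN"])
student_to_year = find_counterexample(ROWS, ["StdSSN"], ["OffYear"])

# No counterexample here, which does not prove the FD; it is merely not ruled out.
course_to_desc = find_counterexample(ROWS, ["CourseNo"], ["CrsDesc"])
```

Offering O2 appears with both S1 and S2, and student S2 appears with both 2007 and 2006, so neither candidate FD can hold. A result of None only means the sample data failed to eliminate the FD.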

20 7.1 - 20 Normal Forms
Normalization: the process of removing redundancies in a table so that the table is easier to modify.
A normal form is a rule about allowable FDs in tables.
Each normal form removes certain kinds of redundancies.
First normal form (1NF) is the starting point.
Second normal form (2NF) is stricter than 1NF.
  – Only a subset of the 1NF tables is in 2NF.
3NF/BCNF is the most important in practice, because normal forms higher than 3NF/BCNF involve other kinds of dependencies that are less common and more difficult to understand.

21 7.1 - 21 Relationships of Normal Forms

22 7.1 - 22 First Normal Form (1NF)
1NF prohibits nesting or repeating data groups in a table.
Starting point of normalization for most relational DBMSs
  – Most commercial DBMSs use 1NF tables.
A table not in 1NF is unnormalized or non-normalized.

23 7.1 - 23 First Normal Form
The table on this slide is not normalized (not in 1NF).
The table has 2 rows, each containing repeating groups (nested columns).
  – The S1 row has 5 nested columns (OfferNo, OffYear, …)
  – The S2 row has 5 nested columns (OfferNo, OffYear, …)
StdSSN  StdClass  OfferNo  OffYear  EnrGrade  CourseNo  CrsDesc
S1      JUN       O1       2006     3.5       C1        DB
                  O2       2006     3.3       C2        VB
S2      JUN       O3       2007     3.1       C3        OO
                  O2       2006     3.4       C2        VB

24 7.1 - 24 Convert to First Normal Form
Replace each repeating group with a row.
In each new row, copy the nonrepeating columns.
  – (S1, JUN) for row two with (O2, 2006, 3.3, C2, VB)
  – (S2, JUN) for row four with (O2, 2006, 3.4, C2, VB)
Redefine the PK if necessary.
StdSSN  StdClass  OfferNo  OffYear  EnrGrade  CourseNo  CrsDesc
S1      JUN       O1       2006     3.5       C1        DB
S1      JUN       O2       2006     3.3       C2        VB
S2      JUN       O3       2007     3.1       C3        OO
S2      JUN       O2       2006     3.4       C2        VB
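The conversion rule (one row per repeating group, with the nonrepeating columns copied into each) might look like this in Python. The nested representation, including the "Enrollments" key and the function name to_first_normal_form, is an assumption chosen for illustration.

```python
GROUP_COLUMNS = ("OfferNo", "OffYear", "EnrGrade", "CourseNo", "CrsDesc")

# Unnormalized form: each student row nests a repeating group of enrollments.
nested = [
    {
        "StdSSN": "S1",
        "StdClass": "JUN",
        "Enrollments": [("O1", 2006, 3.5, "C1", "DB"), ("O2", 2006, 3.3, "C2", "VB")],
    },
    {
        "StdSSN": "S2",
        "StdClass": "JUN",
        "Enrollments": [("O3", 2007, 3.1, "C3", "OO"), ("O2", 2006, 3.4, "C2", "VB")],
    },
]


def to_first_normal_form(nested_rows):
    """Emit one flat row per repeating group, copying the nonrepeating columns."""
    flat = []
    for student in nested_rows:
        for group in student["Enrollments"]:
            row = {"StdSSN": student["StdSSN"], "StdClass": student["StdClass"]}
            row.update(zip(GROUP_COLUMNS, group))
            flat.append(row)
    return flat


flat_rows = to_first_normal_form(nested)
for row in flat_rows:
    print(row)
```

StdSSN alone no longer identifies a row in the flat result, which is why the slide says to redefine the PK (here, StdSSN combined with OfferNo).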

25 7.1 - 25 Second Normal Form (2NF)
The goal of 2NF and 3NF is to produce tables in which every key determines the other columns.
The definitions of 2NF and 3NF distinguish between key and nonkey columns.
  – A column is a key column if it is a candidate key or part of a candidate key.
  – A nonkey column is any other column.

26 7.1 - 26 Second Normal Form (2NF)
Partial dependency: a nonkey column depends on a subset of the columns in a candidate key (part of a compound key → a nonkey column).
A table is in 2NF if no partial dependency exists, that is, if
  – the key contains only one column, or
  – each nonkey column depends on all of the columns in each candidate key, not on a subset of the columns in any candidate key.

27 7.1 - 27 Second Normal Form
Violation of 2NF (a partial dependency exists)
  – Part of a compound key → a nonkey column
  – Only compound keys need to be checked. A table with a single-column key cannot violate 2NF.
Steps for converting to 2NF
  1. Analyze the FDs.
  2. Find the FDs that violate 2NF (FD1 and FD2 on the next slide).
  3. Split the original table into smaller tables that satisfy the 2NF definition (move the columns of every violating FD into a new table).
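Step 2 can be sketched as a simple set test. The FD list below is the one given on the "Functional Dependency Diagram and List" slide. The function name partial_dependencies is hypothetical, and the sketch assumes the table has a single candidate key, (StdSSN, OfferNo).

```python
KEY = frozenset({"StdSSN", "OfferNo"})

# (LHS, RHS) pairs from the FD list of the big university table.
FDS = [
    ({"StdSSN", "OfferNo"}, {"EnrGrade"}),
    ({"StdSSN"}, {"StdCity", "StdClass"}),
    ({"OfferNo"}, {"OffTerm", "OffYear", "CourseNo", "CrsDesc"}),
    ({"CourseNo"}, {"CrsDesc"}),
]


def partial_dependencies(key, fds):
    """Return FDs whose LHS is a proper subset of key and whose RHS has nonkey columns."""
    violations = []
    for lhs, rhs in fds:
        lhs_is_part_of_key = set(lhs) < set(key)      # proper subset
        determines_nonkey = bool(set(rhs) - set(key))
        if lhs_is_part_of_key and determines_nonkey:
            violations.append((lhs, rhs))
    return violations


violating = partial_dependencies(KEY, FDS)
for lhs, rhs in violating:
    print(sorted(lhs), "->", sorted(rhs))
```

Under these assumptions the two FDs with StdSSN alone and OfferNo alone on the LHS are reported. CourseNo → CrsDesc is not a partial dependency, because CourseNo is not part of the key.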

28 7.1 - 28 Convert to Second Normal Form (Analyze FDs)
The FD diagram on the slide labels four dependencies: FD 1, FD 2, FD 3, FD 4.
StdSSN  StdCity  StdClass  OfferNo  OffTerm  OffYear  EnrGrade  CourseNo  CrsDesc
S1      Seattle  JUN       O1       Fall     2006     3.5       C1        DB
S1      Seattle  JUN       O2       Fall     2006     3.3       C2        VB
S2      Bothell  JUN       O3       Spring   2007     3.1       C3        OO
S2      Bothell  JUN       O2       Fall     2006     3.4       C2        VB
PK = ?
Any partial dependency among FD1, FD2, FD3, FD4?

29 7.1 - 29 Convert to Second Normal Form (Splitting the Original Table)
Split the original table into smaller tables that satisfy the 2NF definition.
  – In each smaller table, the entire primary key should determine the nonkey columns.
  – The original table should be recoverable by using natural join operations on the smaller tables.
  – The FDs in the original table should be derivable from the FDs in the smaller tables.
The splitting process uses the project operator of relational algebra.
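The recoverability requirement can be checked on the sample data. The sketch below implements projection and natural join over lists of dictionaries. The helper names (project, natural_join, as_set) are hypothetical, and the three column groups follow the 2NF split of the big university table.

```python
COLUMNS = ("StdSSN", "StdCity", "StdClass", "OfferNo", "OffTerm",
           "OffYear", "EnrGrade", "CourseNo", "CrsDesc")
DATA = [
    ("S1", "Seattle", "JUN", "O1", "Fall", 2006, 3.5, "C1", "DB"),
    ("S1", "Seattle", "JUN", "O2", "Fall", 2006, 3.3, "C2", "VB"),
    ("S2", "Bothell", "JUN", "O3", "Spring", 2007, 3.1, "C3", "OO"),
    ("S2", "Bothell", "JUN", "O2", "Fall", 2006, 3.4, "C2", "VB"),
]
ORIGINAL = [dict(zip(COLUMNS, values)) for values in DATA]


def project(rows, columns):
    """Relational projection: keep the named columns and drop duplicate rows."""
    seen, result = set(), []
    for row in rows:
        values = tuple(row[col] for col in columns)
        if values not in seen:
            seen.add(values)
            result.append(dict(zip(columns, values)))
    return result


def natural_join(left, right):
    """Combine rows that agree on every column name the two inputs share."""
    if not left or not right:
        return []
    shared = [col for col in left[0] if col in right[0]]
    return [{**l, **r} for l in left for r in right
            if all(l[col] == r[col] for col in shared)]


def as_set(rows):
    """Order-insensitive view of a table, for comparison."""
    return {tuple(sorted(row.items())) for row in rows}


students = project(ORIGINAL, ("StdSSN", "StdCity", "StdClass"))
offerings = project(ORIGINAL, ("OfferNo", "OffTerm", "OffYear", "CourseNo", "CrsDesc"))
enrollments = project(ORIGINAL, ("StdSSN", "OfferNo", "EnrGrade"))

rebuilt = natural_join(natural_join(enrollments, students), offerings)
lossless = as_set(rebuilt) == as_set(ORIGINAL)
print(len(students), len(offerings), len(enrollments), lossless)
```

Projection removes the duplicated student and offering rows, and joining the three pieces back on their shared columns reproduces the original four rows.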

30 7.1 - 30 Convert to Second Normal Form
After splitting, add referential integrity constraints to connect the tables.
UnivTable1 (StdSSN, StdCity, StdClass)
UnivTable2 (OfferNo, OffTerm, OffYear, CourseNo, CrsDesc)
UnivTable3 (StdSSN, OfferNo, EnrGrade)
  FOREIGN KEY (StdSSN) REFERENCES UnivTable1
  FOREIGN KEY (OfferNo) REFERENCES UnivTable2
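As one possible rendering of this table list, the sketch below creates the three tables in SQLite through Python's sqlite3 module. The column types, the NOT NULL constraints, and the sample student S3 from "Tacoma" are assumptions. SQLite enforces foreign keys only after PRAGMA foreign_keys = ON has been issued on the connection.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # FK enforcement is off by default in SQLite
conn.executescript("""
    CREATE TABLE UnivTable1 (
        StdSSN   TEXT PRIMARY KEY,
        StdCity  TEXT,
        StdClass TEXT
    );
    CREATE TABLE UnivTable2 (
        OfferNo  TEXT PRIMARY KEY,
        OffTerm  TEXT,
        OffYear  INTEGER,
        CourseNo TEXT,
        CrsDesc  TEXT
    );
    CREATE TABLE UnivTable3 (
        StdSSN   TEXT NOT NULL,
        OfferNo  TEXT NOT NULL,
        EnrGrade REAL,
        PRIMARY KEY (StdSSN, OfferNo),
        FOREIGN KEY (StdSSN)  REFERENCES UnivTable1 (StdSSN),
        FOREIGN KEY (OfferNo) REFERENCES UnivTable2 (OfferNo)
    );
""")

# A student with no enrollment can now be stored (hypothetical values).
conn.execute("INSERT INTO UnivTable1 VALUES (?, ?, ?)", ("S3", "Tacoma", "JUN"))
student_count = conn.execute("SELECT COUNT(*) FROM UnivTable1").fetchone()[0]

# An enrollment that points at a missing student or offering is refused.
try:
    conn.execute("INSERT INTO UnivTable3 VALUES (?, ?, ?)", ("S9", "O9", 3.0))
    orphan_rejected = False
except sqlite3.IntegrityError:
    orphan_rejected = True

print(student_count, orphan_rejected)
```

The split removes the insertion anomaly for students, and the foreign keys keep enrollment rows from referring to students or offerings that do not exist.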

31 7.1 - 31 Third Normal Form (3NF)
A table is in 3NF if
  – it is in 2NF (no partial dependency), and
  – each nonkey column depends only on candidate keys, not on other nonkey columns (no transitive dependency).
Transitive dependency
  – A nonkey column depends on other nonkey columns.
  – If A → B and B → C, then A → C. So A → C is a transitive dependency, and B → C causes a violation of 3NF.
  – Example: OfferNo → CourseNo and CourseNo → CrsDesc, so OfferNo → CrsDesc.
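A simplified check for the "nonkey column depends on other nonkey columns" condition is sketched below for the offering table. It assumes a single candidate key, OfferNo, and flags only FDs whose LHS contains no key column at all. The function name nonkey_determinants is hypothetical.

```python
KEY = {"OfferNo"}

# FDs that apply to (OfferNo, OffTerm, OffYear, CourseNo, CrsDesc).
FDS = [
    ({"OfferNo"}, {"OffTerm", "OffYear", "CourseNo", "CrsDesc"}),
    ({"CourseNo"}, {"CrsDesc"}),
]


def nonkey_determinants(key, fds):
    """Return FDs in which nonkey columns determine other nonkey columns."""
    violations = []
    for lhs, rhs in fds:
        lhs_is_nonkey = not (set(lhs) & set(key))
        determines_nonkey = bool(set(rhs) - set(key))
        if lhs_is_nonkey and determines_nonkey:
            violations.append((lhs, rhs))
    return violations


violating = nonkey_determinants(KEY, FDS)
for lhs, rhs in violating:
    print(sorted(lhs), "->", sorted(rhs))
```

Only CourseNo → CrsDesc is reported, which matches the B → C role in the definition: it is the dependency that makes OfferNo → CrsDesc transitive.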

32 7.1 - 32 Convert to Third Normal Form
Consider UnivTable2 (OfferNo, OffTerm, OffYear, CourseNo, CrsDesc).
OfferNo → CourseNo and CourseNo → CrsDesc, so OfferNo → CrsDesc is a transitive dependency.
CourseNo → CrsDesc causes a violation of 3NF in UnivTable2.

33 7.1 - 33 Convert to Third Normal Form
Steps for converting to 3NF
  1. Find the FDs that violate 3NF.
  2. Split the original table into smaller tables that satisfy the 3NF definition (move the columns of every violating FD into a new table).
Before: UnivTable2 (OfferNo, OffTerm, OffYear, CourseNo, CrsDesc)
After:
UnivTable2-1 (CourseNo, CrsDesc)
UnivTable2-2 (OfferNo, OffTerm, OffYear, CourseNo)
  FOREIGN KEY (CourseNo) REFERENCES UnivTable2-1
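The effect of the split can be seen by projecting sample rows of UnivTable2 onto the two new column sets. In this sketch the fourth offering (O4, a second offering of course C2) is a made-up row added so that the redundancy is visible, and project is a hypothetical helper.

```python
UNIV_TABLE2 = [
    {"OfferNo": "O1", "OffTerm": "Fall", "OffYear": 2006, "CourseNo": "C1", "CrsDesc": "DB"},
    {"OfferNo": "O2", "OffTerm": "Fall", "OffYear": 2006, "CourseNo": "C2", "CrsDesc": "VB"},
    {"OfferNo": "O3", "OffTerm": "Spring", "OffYear": 2007, "CourseNo": "C3", "CrsDesc": "OO"},
    # Hypothetical extra offering of C2, not in the slides.
    {"OfferNo": "O4", "OffTerm": "Spring", "OffYear": 2007, "CourseNo": "C2", "CrsDesc": "VB"},
]


def project(rows, columns):
    """Relational projection: keep the named columns and drop duplicate rows."""
    seen, result = set(), []
    for row in rows:
        values = tuple(row[col] for col in columns)
        if values not in seen:
            seen.add(values)
            result.append(dict(zip(columns, values)))
    return result


# UnivTable2-1: the columns of the violating FD CourseNo -> CrsDesc.
courses = project(UNIV_TABLE2, ("CourseNo", "CrsDesc"))
# UnivTable2-2: the remaining columns, keeping CourseNo as the foreign key.
offerings = project(UNIV_TABLE2, ("OfferNo", "OffTerm", "OffYear", "CourseNo"))

copies_before = sum(1 for row in UNIV_TABLE2 if row["CrsDesc"] == "VB")
copies_after = sum(1 for row in courses if row["CrsDesc"] == "VB")
print(copies_before, copies_after)
```

After the split each course description is stored once, so changing it touches a single row, and an offering can be deleted without losing the course.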

34 7.1 - 34 Self-Practice Homework
Chapter 7, page 239, Questions: 1, 2, 3, 14, 15, 24

