Dr. T. Y. Lin | SJSU | CS 157A | Fall 2015 Chapter 3 Database Normalization 1.


1 Chapter 3: Database Normalization

2 Sections
- Database Anomalies
- What is Normalization?
- The Normal Forms

3 Database Design
Database design requires choosing a suitable logical structure. Most importantly:
- What relations are needed to store the values
- What attributes they should use
- How to optimize the relation design for clarity and efficiency

4 Data Anomalies
Edgar Codd, inventor of the relational model, described data anomalies in the 1970s.
They are unintended consequences of a database modification.
There are 3 kinds of anomalies:
- Insert Anomaly
- Delete Anomaly
- Update Anomaly

5 Insert Anomaly
An insert anomaly happens when a tuple is inserted into a relation with some attribute values missing (null attributes).
If we view the relation as a set in which the whole tuple serves as the key, this is an illegal operation, because a key attribute may not be null.
We don't want the database to have holes in its information.

6 Delete Anomaly
If a database is not normalized, deleting a tuple from a relation can also delete other information that we wanted to keep.
An example of insert and delete anomalies is on the next slide.

7 Insert and Delete Anomalies: Example
Insert anomaly: the S5 tuple cannot be inserted cleanly because its P# is empty.
Delete anomaly: if the S5 tuple is deleted, all information about S5 (salary 60000, status 30, city Athens) is lost. Likewise, deleting S3's only shipment (P2) would erase S3's salary, status, and city.

S#  Salary  STATUS  CITY    P#  QTY
S1  40000   20      LONDON  P1  300
S1  40000   20      LONDON  P2  200
S1  40000   20      LONDON  P3  400
S1  40000   20      LONDON  P4  200
S1  40000   20      LONDON  P5  100
S1  40000   20      LONDON  P6  100
S2  30000   10      PARIS   P1  300
S2  30000   10      PARIS   P2  400
S3  30000   10      PARIS   P2  200
S4  40000   20      LONDON  P2  200
S4  40000   20      LONDON  P4  300
S4  40000   20      LONDON  P5  400
S5  60000   30      ATHENS  -
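The two anomalies in this table can be reproduced with a few lines of Python. This is only a sketch: the helper names are made up for illustration, and it assumes that the key of the relation is (S#, P#), which the slide implies but does not state.

```python
# Rows of the single unnormalized relation from the slide:
# (S#, Salary, STATUS, CITY, P#, QTY); None marks a missing value.
ROWS = [
    ("S1", 40000, 20, "LONDON", "P1", 300),
    ("S1", 40000, 20, "LONDON", "P2", 200),
    ("S1", 40000, 20, "LONDON", "P3", 400),
    ("S1", 40000, 20, "LONDON", "P4", 200),
    ("S1", 40000, 20, "LONDON", "P5", 100),
    ("S1", 40000, 20, "LONDON", "P6", 100),
    ("S2", 30000, 10, "PARIS", "P1", 300),
    ("S2", 30000, 10, "PARIS", "P2", 400),
    ("S3", 30000, 10, "PARIS", "P2", 200),
    ("S4", 40000, 20, "LONDON", "P2", 200),
    ("S4", 40000, 20, "LONDON", "P4", 300),
    ("S4", 40000, 20, "LONDON", "P5", 400),
    ("S5", 60000, 30, "ATHENS", None, None),
]


def supplier_facts(rows):
    """Return {S#: (Salary, STATUS, CITY)} as recoverable from the rows."""
    return {r[0]: r[1:4] for r in rows}


def insertable(row):
    """Assumed rule: the key attributes S# and P# must both be non-null."""
    return row[0] is not None and row[4] is not None


# Insert anomaly: the S5 row has no P#, so it breaks the assumed key rule.
s5_ok = insertable(ROWS[-1])

# Delete anomaly: removing the S5 row removes everything known about S5.
after_s5 = [r for r in ROWS if r[0] != "S5"]
s5_known = "S5" in supplier_facts(after_s5)

# The same happens when S3's only shipment (P2) is deleted.
after_s3 = [r for r in ROWS if not (r[0] == "S3" and r[4] == "P2")]
s3_known = "S3" in supplier_facts(after_s3)

print(s5_ok, s5_known, s3_known)  # False False False
```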

8 Update Anomaly
If a database is not normalized, updating a single fact becomes inefficient and can leave the data incorrect.
The same fact may be stored in many tuples, or even in many relations, so changing only one copy is not sufficient.
Every copy must be updated; if any copy is missed, the database no longer reflects the update accurately.

9 Update Anomaly: Example
If one attribute of S1 changes (its city, for example), the same update has to be issued for every S1 tuple.
Why not issue a single change to a single relation?

S#  Salary  STATUS  CITY    P#  QTY
S1  40000   20      LONDON  P1  300
S1  40000   20      LONDON  P2  200
S1  40000   20      LONDON  P3  400
S1  40000   20      LONDON  P4  200
S1  40000   20      LONDON  P5  100
S1  40000   20      LONDON  P6  100
S2  30000   10      PARIS   P1  300
S2  30000   10      PARIS   P2  400
S3  30000   10      PARIS   P2  200
S4  40000   20      LONDON  P2  200
S4  40000   20      LONDON  P4  300
S4  40000   20      LONDON  P5  400
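To make the cost concrete, the sketch below counts how many tuples must change when S1 moves to a new city, and shows what a partial update leaves behind. The function name and the new city "OSLO" are hypothetical and chosen only for illustration.

```python
# The 12 tuples from the slide: (S#, Salary, STATUS, CITY, P#, QTY).
ROWS = [
    ("S1", 40000, 20, "LONDON", "P1", 300),
    ("S1", 40000, 20, "LONDON", "P2", 200),
    ("S1", 40000, 20, "LONDON", "P3", 400),
    ("S1", 40000, 20, "LONDON", "P4", 200),
    ("S1", 40000, 20, "LONDON", "P5", 100),
    ("S1", 40000, 20, "LONDON", "P6", 100),
    ("S2", 30000, 10, "PARIS", "P1", 300),
    ("S2", 30000, 10, "PARIS", "P2", 400),
    ("S3", 30000, 10, "PARIS", "P2", 200),
    ("S4", 40000, 20, "LONDON", "P2", 200),
    ("S4", 40000, 20, "LONDON", "P4", 300),
    ("S4", 40000, 20, "LONDON", "P5", 400),
]


def update_city(rows, supplier, new_city):
    """Rewrite CITY in every tuple of `supplier`.

    Returns (new_rows, number_of_tuples_touched).
    """
    out, touched = [], 0
    for s, salary, status, city, part, qty in rows:
        if s == supplier:
            city = new_city
            touched += 1
        out.append((s, salary, status, city, part, qty))
    return out, touched


# A correct update has to touch every S1 tuple.
new_rows, touched = update_city(ROWS, "S1", "OSLO")
print(touched)  # 6

# A partial update (only the first tuple) leaves S1 with two cities.
partial = [ROWS[0][:3] + ("OSLO",) + ROWS[0][4:]] + ROWS[1:]
s1_cities = sorted({r[3] for r in partial if r[0] == "S1"})
print(s1_cities)  # ['LONDON', 'OSLO']
```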

10 What is Normalization?
Normalization is a formalized process of decomposing relations.
Normalized relations aim to remove redundancy and undesirable dependencies.
By doing this, data anomalies are prevented.
Normalization is also the basis for designing simpler, clearer, and more efficient relational databases.
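As an illustration of decomposition, the sketch below splits the supplier and parts table from the earlier slides into two relations and checks that joining them reproduces the original rows. The split into `suppliers` and `shipments` is an assumption made for this example; the slides do not give the decomposition explicitly. The new city "OSLO" is also hypothetical.

```python
# The 12 tuples from the earlier slides: (S#, Salary, STATUS, CITY, P#, QTY).
ROWS = [
    ("S1", 40000, 20, "LONDON", "P1", 300),
    ("S1", 40000, 20, "LONDON", "P2", 200),
    ("S1", 40000, 20, "LONDON", "P3", 400),
    ("S1", 40000, 20, "LONDON", "P4", 200),
    ("S1", 40000, 20, "LONDON", "P5", 100),
    ("S1", 40000, 20, "LONDON", "P6", 100),
    ("S2", 30000, 10, "PARIS", "P1", 300),
    ("S2", 30000, 10, "PARIS", "P2", 400),
    ("S3", 30000, 10, "PARIS", "P2", 200),
    ("S4", 40000, 20, "LONDON", "P2", 200),
    ("S4", 40000, 20, "LONDON", "P4", 300),
    ("S4", 40000, 20, "LONDON", "P5", 400),
]

# Assumed decomposition: supplier facts keyed by S#, shipments keyed by (S#, P#).
suppliers = {r[0]: r[1:4] for r in ROWS}          # S# -> (Salary, STATUS, CITY)
shipments = [(r[0], r[4], r[5]) for r in ROWS]    # (S#, P#, QTY)

# Joining the two relations on S# gives back exactly the original rows.
rejoined = [(s,) + suppliers[s] + (part, qty) for s, part, qty in shipments]
lossless = sorted(rejoined) == sorted(ROWS)
print(lossless)  # True

# S5 can now be stored without any P#: no insert anomaly.
suppliers["S5"] = (60000, 30, "ATHENS")

# Moving S1 to another city is now a single change: no update anomaly.
suppliers["S1"] = suppliers["S1"][:2] + ("OSLO",)
print(len(suppliers), suppliers["S1"])  # 5 (40000, 20, 'OSLO')
```

After the split, each supplier fact is stored once, so deleting a shipment no longer removes what is known about its supplier.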

11 Normal Forms
Data anomalies were described by Codd in the 1970s.
He and others (Boyce, Fagin, and more) also began defining Normal Forms, which describe how rigorously a relation has been normalized.
A Normal Form is the form a relation is in when it satisfies specific properties.
These properties provide a systematic way of turning non-normalized relations into normalized relations.

12 Many Normal Forms
Relations in higher Normal Forms are more normalized than relations in lower Normal Forms.
A relation in a higher Normal Form also satisfies every Normal Form lower than it.
Example: a relation in 2NF is also in 1NF, and a relation in 3NF is also in 2NF and 1NF.
1NF, 2NF, 3NF, BCNF, and 4NF will be discussed; however, there are more Normal Forms than these five.


14 1NF (First Normal Form)
For a relation to be in 1NF:
- Each set of related values must be placed in its own table
- All rows must be unique (the relation is a set)
- All columns must be unique (no repeating groups)
- Every value in every tuple must be atomic (it cannot be divided)
- A primary key must be defined (usually formally defined as the entire tuple, assuming it is unique and the relation is a set)
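Two of these conditions, unique rows and atomic values, can be checked mechanically. The sketch below is a rough approximation only: it treats any list, tuple, set, or dict inside a row as a non-atomic value, which is a crude stand-in for the real (semantic) notion of atomicity. The function names and the grouped sample data are hypothetical.

```python
def looks_1nf(rows):
    """Crude check: every value is a scalar and no row occurs twice."""
    atomic = all(
        not isinstance(value, (list, tuple, set, dict))
        for row in rows
        for value in row
    )
    # Only test uniqueness once all values are known to be scalar,
    # because rows containing lists cannot be placed in a set.
    return atomic and len(set(rows)) == len(rows)


def flatten(grouped):
    """Turn (S#, [P#, ...]) rows into one (S#, P#) row per part."""
    return [(s, part) for s, parts in grouped for part in parts]


# A repeating group: each supplier carries a list of parts.
grouped = [("S2", ["P1", "P2"]), ("S3", ["P2"])]
flat = flatten(grouped)

print(looks_1nf(grouped))  # False
print(flat)                # [('S2', 'P1'), ('S2', 'P2'), ('S3', 'P2')]
print(looks_1nf(flat))     # True
```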

15 Functional Dependencies (FD)
Given a relation R, attribute Y of R is functionally dependent on attribute X of R if each X-value in R has associated with it precisely one Y-value in R (at any one time). In other words, no X-value is mapped to two different Y-values.
A functional dependency is a special form of integrity constraint: every legal extension (tabulation) of the relation satisfies it.
An attribute Y is said to be fully functionally dependent on X if Y functionally depends on X but not on any proper subset of X.
From now on, by FD, we mean full FD.
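The definition can be tested against a concrete tabulation. The sketch below uses the supplier and parts rows from the earlier slides; the function names `holds` and `fully_holds` are made up for this example. Note the limitation: an FD is a constraint on every legal extension, so a check on one tabulation can refute an FD but cannot prove it.

```python
from itertools import combinations

ATTRS = ("S#", "Salary", "STATUS", "CITY", "P#", "QTY")
ROWS = [
    ("S1", 40000, 20, "LONDON", "P1", 300),
    ("S1", 40000, 20, "LONDON", "P2", 200),
    ("S1", 40000, 20, "LONDON", "P3", 400),
    ("S1", 40000, 20, "LONDON", "P4", 200),
    ("S1", 40000, 20, "LONDON", "P5", 100),
    ("S1", 40000, 20, "LONDON", "P6", 100),
    ("S2", 30000, 10, "PARIS", "P1", 300),
    ("S2", 30000, 10, "PARIS", "P2", 400),
    ("S3", 30000, 10, "PARIS", "P2", 200),
    ("S4", 40000, 20, "LONDON", "P2", 200),
    ("S4", 40000, 20, "LONDON", "P4", 300),
    ("S4", 40000, 20, "LONDON", "P5", 400),
]


def holds(rows, attrs, x, y):
    """True if, in these rows, no X-value is paired with two Y-values."""
    xi = [attrs.index(a) for a in x]
    yi = [attrs.index(a) for a in y]
    seen = {}
    for row in rows:
        x_val = tuple(row[i] for i in xi)
        y_val = tuple(row[i] for i in yi)
        # Remember the first Y-value seen for this X-value; any later
        # mismatch violates the dependency.
        if seen.setdefault(x_val, y_val) != y_val:
            return False
    return True


def fully_holds(rows, attrs, x, y):
    """True if X -> Y holds and no non-empty proper subset of X determines Y."""
    if not holds(rows, attrs, x, y):
        return False
    for size in range(1, len(x)):
        for subset in combinations(x, size):
            if holds(rows, attrs, subset, y):
                return False
    return True


print(holds(ROWS, ATTRS, ("S#",), ("CITY",)))             # True
print(holds(ROWS, ATTRS, ("P#",), ("QTY",)))              # False
print(fully_holds(ROWS, ATTRS, ("S#", "P#"), ("QTY",)))   # True
print(fully_holds(ROWS, ATTRS, ("S#", "P#"), ("CITY",)))  # False
```

The last result is False because CITY already depends on S# alone, so its dependence on (S#, P#) is not full.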

16 2NF (Second Normal Form)
An SQL table is automatically in 1NF, but, in Codd's own words, that is not good enough.

17 3NF (Third Normal Form)
Functional dependencies

18 BCNF (Boyce-Codd Normal Form)
Also known as 3.5NF (3.5 Normal Form)

19 4NF (Fourth Normal Form)

