
Published by Deshaun Halden. Modified over 2 years ago.

1
Normalization
Sridhar Narayan
narayans@uncw.edu

2
EMP_PROJ
SSN  PNUMBER  HOURS  ENAME  PNAME       PLOC
E1   P1       20     Joe    CIS Roof    UNCW
E1   P2       20     Joe    Restaurant  Mayfaire
E2   P1       40     Joe    CIS Roof    UNCW
Something feels wrong about this design
– Try adding a row: insertion anomaly
– Try deleting a row: deletion anomaly
– Try updating a row: update anomaly
We need a formal way to reason about what is wrong with it and how to fix it

3
Functional Dependency
Constraints between attribute sets in a relation
If X and Y are sets of attributes of a relation R, and whenever two tuples in R have the same X-values they also have the same Y-values, we say that X functionally determines Y.
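The definition can be tested directly against a snapshot of a relation. The following is a minimal Python sketch, not part of the original slides: the helper name fd_holds is hypothetical, and the rows are the three EMP_PROJ sample tuples from slide 2.

```python
# EMP_PROJ snapshot, transcribed from the sample table on slide 2.
EMP_PROJ = [
    {"SSN": "E1", "PNUMBER": "P1", "HOURS": 20, "ENAME": "Joe",
     "PNAME": "CIS Roof", "PLOC": "UNCW"},
    {"SSN": "E1", "PNUMBER": "P2", "HOURS": 20, "ENAME": "Joe",
     "PNAME": "Restaurant", "PLOC": "Mayfaire"},
    {"SSN": "E2", "PNUMBER": "P1", "HOURS": 40, "ENAME": "Joe",
     "PNAME": "CIS Roof", "PLOC": "UNCW"},
]


def fd_holds(rows, lhs, rhs):
    """Return True if no two rows agree on lhs but differ on rhs."""
    seen = {}  # X-values -> the Y-values first seen with them
    for row in rows:
        x = tuple(row[a] for a in lhs)
        y = tuple(row[a] for a in rhs)
        if seen.setdefault(x, y) != y:
            return False
    return True


print(fd_holds(EMP_PROJ, ["SSN"], ["ENAME"]))      # True
print(fd_holds(EMP_PROJ, ["PNUMBER"], ["HOURS"]))  # False: P1 appears with 20 and 40
print(fd_holds(EMP_PROJ, ["SSN"], ["HOURS"]))      # True, but only in this snapshot
```

A check like this can refute a dependency but never prove one: SSN -> HOURS happens to hold in these three rows, yet nothing in the design requires it.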

4
Functional Dependency
Written as X -> Y
– X functionally determines Y
– Y is functionally determined by X
– X is the determinant, Y is the dependent
Examples
– SSN -> SSN (trivial dependency)
– PNUMBER -> PNAME
– SSN -> ENAME
– SSN, PNUMBER -> HOURS

5
Functional Dependency
Between sets of attributes, not just single attributes
Holds for all time, not just for a particular instance (snapshot) of a relation
Formally states constraints that exist for the relation
– These constraints are in addition to those imposed by primary keys and foreign keys

6
Functional dependencies and keys
If X functionally determines all attributes of R, then X is a super key
If X is irreducible, i.e. every member of X is essential for the functional dependencies to hold, then X is a candidate key
Attributes that are a part of a candidate key are key attributes

7
Examples
Super key:
– SSN, PNUMBER, PNAME -> SSN, PNUMBER, HOURS, ENAME, PNAME, PLOC
Candidate key:
– SSN, PNUMBER -> SSN, PNUMBER, HOURS, ENAME, PNAME, PLOC
SSN  PNUMBER  HOURS  ENAME  PNAME       PLOC
E1   P1       20     Joe    CIS Roof    UNCW
E1   P2       20     Joe    Restaurant  Mayfaire
E2   P1       40     Joe    CIS Roof    UNCW
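Super keys and candidate keys can be derived mechanically from a set of functional dependencies by computing attribute closures. The sketch below is an illustration under assumptions: the helper names closure and candidate_keys are hypothetical, and the dependency set is the one the slides give for EMP_PROJ (SSN -> ENAME; PNUMBER -> PNAME, PLOC; SSN, PNUMBER -> HOURS). It tries every attribute subset, which is fine for a six-attribute schema but not for large ones.

```python
from itertools import combinations

SCHEMA = frozenset({"SSN", "PNUMBER", "HOURS", "ENAME", "PNAME", "PLOC"})

# Each dependency is a (determinant, dependent) pair of attribute sets.
FDS = [
    (frozenset({"SSN"}), frozenset({"ENAME"})),
    (frozenset({"PNUMBER"}), frozenset({"PNAME", "PLOC"})),
    (frozenset({"SSN", "PNUMBER"}), frozenset({"HOURS"})),
]


def closure(attrs, fds):
    """Return every attribute functionally determined by attrs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


def candidate_keys(schema, fds):
    """Return the irreducible attribute sets whose closure is the whole schema."""
    keys = []
    for size in range(1, len(schema) + 1):
        for combo in combinations(sorted(schema), size):
            attrs = set(combo)
            if any(key <= attrs for key in keys):
                continue  # contains a smaller key, so it is reducible
            if closure(attrs, fds) == schema:
                keys.append(attrs)
    return keys


# {SSN, PNUMBER, PNAME} is a super key; only {SSN, PNUMBER} is irreducible.
print(closure({"SSN", "PNUMBER", "PNAME"}, FDS) == SCHEMA)  # True
print(candidate_keys(SCHEMA, FDS))
```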

8
Redundancy
If in a relation R, A -> B and A is not a candidate key for R, then R will involve some redundancy.
(SSN, PNUMBER, HOURS, ENAME, PNAME, PLOC)
Intuitively, every functional dependency in a relation should have a candidate key as its determinant to eliminate redundancy

9
Normalization
A process that uses functional dependencies to identify relation schemas that have an undesirable form (redundancy) and decomposes them into smaller schemas in which the redundancy has been eliminated.

10
Decomposition
Decomposition should be
– Lossless join: allows exact recovery of the original relation (without spurious tuples)
– Dependency preserving: allows dependencies to be checked without requiring a join

11
Lossy decomposition
SSN  PNUMBER  HOURS  ENAME
E1   P1       20     Joe
E1   P2       20     Joe
E2   P1       40     Joe
ENAME  PNAME       PLOC
Joe    CIS Roof    UNCW
Joe    Restaurant  Mayfaire
Joe    CIS Roof    UNCW

12
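Natural join to recover original
SSN  PNUMBER  HOURS  ENAME  PNAME       PLOC
E1   P1       20     Joe    CIS Roof    UNCW
E1   P1       20     Joe    Restaurant  Mayfaire  (spurious)
E1   P2       20     Joe    CIS Roof    UNCW      (spurious)
E1   P2       20     Joe    Restaurant  Mayfaire
E2   P1       40     Joe    CIS Roof    UNCW
E2   P1       40     Joe    Restaurant  Mayfaire  (spurious)
The join does not recover the original: joining on ENAME pairs every Joe with every project, so spurious tuples appear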
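The lossy split can be reproduced in a few lines. This is a minimal Python sketch with hypothetical helper names (project, natural_join), treating relations as duplicate-free sets of tuples and using the EMP_PROJ sample rows. It projects the relation onto the two schemas, joins the projections on their common attribute, and counts the joined tuples that were never in the original.

```python
EMP_PROJ = [
    {"SSN": "E1", "PNUMBER": "P1", "HOURS": 20, "ENAME": "Joe",
     "PNAME": "CIS Roof", "PLOC": "UNCW"},
    {"SSN": "E1", "PNUMBER": "P2", "HOURS": 20, "ENAME": "Joe",
     "PNAME": "Restaurant", "PLOC": "Mayfaire"},
    {"SSN": "E2", "PNUMBER": "P1", "HOURS": 40, "ENAME": "Joe",
     "PNAME": "CIS Roof", "PLOC": "UNCW"},
]


def project(rows, attrs):
    """Project rows onto attrs, dropping duplicate tuples."""
    distinct = {tuple(row[a] for a in attrs) for row in rows}
    return [dict(zip(attrs, values)) for values in sorted(distinct)]


def natural_join(left, right):
    """Join two relations on all attributes they have in common."""
    if not left or not right:
        return []
    common = [a for a in left[0] if a in right[0]]
    return [{**l, **r} for l in left for r in right
            if all(l[a] == r[a] for a in common)]


r1 = project(EMP_PROJ, ["SSN", "PNUMBER", "HOURS", "ENAME"])
r2 = project(EMP_PROJ, ["ENAME", "PNAME", "PLOC"])
joined = natural_join(r1, r2)
spurious = [t for t in joined if t not in EMP_PROJ]

print(len(joined), len(spurious))  # 6 3
for t in spurious:
    print(t)
```

Every original tuple is still present in the join, so "lossy" refers to lost information rather than lost rows: it is no longer possible to tell which employee works on which project.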

13
Heath’s Theorem
If relation R = {A, B, C}, where A, B, C are attribute sets, and A -> B, then R1 = {A, B} and R2 = {A, C} represent a lossless decomposition
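The theorem suggests a mechanical check for a two-way split: if the attributes shared by the two schemas functionally determine all of one schema, the split has the shape the theorem describes. The following Python sketch makes that check under assumptions: the helper names closure and heath_lossless are hypothetical, and the dependencies are the EMP_PROJ ones from the slides.

```python
FDS = [
    (frozenset({"SSN"}), frozenset({"ENAME"})),
    (frozenset({"PNUMBER"}), frozenset({"PNAME", "PLOC"})),
    (frozenset({"SSN", "PNUMBER"}), frozenset({"HOURS"})),
]


def closure(attrs, fds):
    """Return every attribute functionally determined by attrs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


def heath_lossless(r1, r2, fds):
    """True if the shared attributes determine all of r1 or all of r2."""
    reachable = closure(r1 & r2, fds)
    return r1 <= reachable or r2 <= reachable


# Split on ENAME: ENAME determines nothing else, so there is no guarantee.
print(heath_lossless({"SSN", "PNUMBER", "HOURS", "ENAME"},
                     {"ENAME", "PNAME", "PLOC"}, FDS))   # False

# Split on PNUMBER: PNUMBER -> PNAME, PLOC, so the split is lossless.
print(heath_lossless({"PNUMBER", "PNAME", "PLOC"},
                     {"SSN", "PNUMBER", "HOURS", "ENAME"}, FDS))  # True
```

A True result means the theorem guarantees a lossless join; a False result only means the theorem gives no such guarantee for that split.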

14
Levels of normalization
First normal form – 1NF
Second normal form – 2NF
Third normal form – 3NF
Boyce-Codd Normal Form – BCNF
Increasingly stringent requirements

15
Normal Forms
1NF, 2NF, 3NF, BCNF

16
First normal form
A relation is in 1NF if all attribute values are atomic (by definition, all relations are in 1NF)
Assume that a department can have multiple locations, like {Lumberton, Red Springs, Raeford}
D_NAME    D_NUM  MGR_SSN    D_LOCATIONS
RESEARCH  5      334619276  {Lumberton, Red Springs, Raeford}
Relation not in 1NF

17
Resolution?
D_NAME    D_NUM  MGR_SSN    D_LOCATIONS
RESEARCH  5      334619276  Lumberton
RESEARCH  5      334619276  Red Springs
RESEARCH  5      334619276  Raeford

18
Decomposition
(D_NAME, D_NUM, MGR_SSN, D_LOCATIONS)
decomposes into
(D_NAME, D_NUM, MGR_SSN)
(D_NUM, D_LOCATIONS)
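The move from a set-valued attribute to two relations with atomic values can be sketched as follows. This is an illustrative Python sketch only: the helper name split_multivalued is hypothetical, and the row is the RESEARCH department example with D_LOCATIONS held as a set.

```python
dept = {
    "D_NAME": "RESEARCH",
    "D_NUM": 5,
    "MGR_SSN": "334619276",
    "D_LOCATIONS": {"Lumberton", "Red Springs", "Raeford"},  # not atomic
}


def split_multivalued(row, key, multi):
    """Split one row into a base row and one (key, value) row per set member."""
    base = {a: v for a, v in row.items() if a != multi}
    pairs = [{key: row[key], multi: value} for value in sorted(row[multi])]
    return base, pairs


department, dept_locations = split_multivalued(dept, "D_NUM", "D_LOCATIONS")
print(department)
for loc in dept_locations:
    print(loc)
```

Unlike the flattened single table, this form stores D_NAME and MGR_SSN once per department rather than once per location.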

19
Second Normal Form: 2NF
A relation is in 2NF if
– It is in 1NF, and
– Every non-key attribute is fully (irreducibly) dependent on the primary key

20
Example: EMP_PROJ
(SSN, PNUMBER, HOURS, ENAME, PNAME, PLOC)
Functional Dependencies?
SSN -> ENAME
PNUMBER -> PNAME, PLOC
{SSN, PNUMBER} -> HOURS
Relation not in 2NF
Non-key attributes ENAME, PNAME, and PLOC are not fully dependent on the primary key
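Partial dependencies like these can be found by checking what each proper subset of the primary key determines on its own. The Python sketch below is a hedged illustration: the helper names closure and partial_dependencies are hypothetical, and it assumes {SSN, PNUMBER} is the only candidate key, so every other attribute is non-key.

```python
from itertools import combinations

SCHEMA = frozenset({"SSN", "PNUMBER", "HOURS", "ENAME", "PNAME", "PLOC"})
KEY = frozenset({"SSN", "PNUMBER"})  # assumed to be the only candidate key
FDS = [
    (frozenset({"SSN"}), frozenset({"ENAME"})),
    (frozenset({"PNUMBER"}), frozenset({"PNAME", "PLOC"})),
    (frozenset({"SSN", "PNUMBER"}), frozenset({"HOURS"})),
]


def closure(attrs, fds):
    """Return every attribute functionally determined by attrs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


def partial_dependencies(schema, key, fds):
    """Map each proper subset of the key to the non-key attributes it determines."""
    non_key = schema - key
    found = {}
    for size in range(1, len(key)):
        for subset in combinations(sorted(key), size):
            determined = closure(subset, fds) & non_key
            if determined:
                found[subset] = determined
    return found


for part, attrs in partial_dependencies(SCHEMA, KEY, FDS).items():
    print(part, "->", sorted(attrs))
```

An empty result would mean the relation passes this 2NF check; here both parts of the key determine non-key attributes by themselves, while HOURS depends on the full key.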

21
Solution? Decompose
(SSN, PNUMBER, ENAME, PNAME, PLOC)
(SSN, PNUMBER, HOURS)

22
Decompose further…
(SSN, PNUMBER, PNAME, PLOC)
(SSN, ENAME)

23
And a little more…
(SSN, PNUMBER): 3b is a part of 1a, so drop it.
(PNUMBER, PNAME, PLOC)

24
2NF Normalization
(SSN, PNUMBER, HOURS)
(SSN, ENAME)
(PNUMBER, PNAME, PLOC)

25
More than one way to get here
(SSN, PNUMBER, HOURS, ENAME, PNAME, PLOC)
decomposes into
(PNUMBER, PNAME, PLOC)
(SSN, PNUMBER, HOURS, ENAME)

26
Decompose further…
(SSN, PNUMBER, HOURS)
(SSN, PNUMBER, ENAME)

27
And a little bit more
(SSN, PNUMBER)
(SSN, ENAME)

28
3NF Normalization
A relation is in 3NF if
– It is in 2NF, and
– The non-key attributes are mutually independent. That is, no functional dependencies exist between non-key attributes.

29
Example: EMP_DEPT
(SSN, ENAME, DOB, ADDRESS, DNUM, DNAME, DMGRSSN)
Functional Dependencies?
SSN -> {ENAME, DOB, ADDRESS, DNUM}
DNUM -> {DNAME, DMGRSSN}
Redundancy?
Relation in 1NF? 2NF? 3NF?

30
3NF Normalization
(DNUM, DNAME, DMGRSSN)
(SSN, ENAME, DOB, ADDRESS, DNUM)
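This decomposition can be derived by finding a dependency between non-key attributes and splitting the schema on it, in the manner of Heath's theorem. The following is a minimal Python sketch under assumptions: the helper names non_key_fds and split_on_fd are hypothetical, and SSN is taken to be the only candidate key of EMP_DEPT.

```python
SCHEMA = frozenset({"SSN", "ENAME", "DOB", "ADDRESS", "DNUM", "DNAME", "DMGRSSN"})
KEY_ATTRS = frozenset({"SSN"})  # assumed to be the only candidate key
FDS = [
    (frozenset({"SSN"}), frozenset({"ENAME", "DOB", "ADDRESS", "DNUM"})),
    (frozenset({"DNUM"}), frozenset({"DNAME", "DMGRSSN"})),
]


def non_key_fds(fds, key_attrs):
    """Return dependencies whose determinant and dependents are all-non-key / non-key."""
    return [(lhs, rhs) for lhs, rhs in fds
            if not (lhs & key_attrs) and (rhs - key_attrs)]


def split_on_fd(schema, lhs, rhs):
    """Split into (lhs + rhs) and (schema without rhs, keeping lhs)."""
    return lhs | rhs, (schema - rhs) | lhs


for lhs, rhs in non_key_fds(FDS, KEY_ATTRS):
    r1, r2 = split_on_fd(SCHEMA, lhs, rhs)
    print(sorted(r1))  # ['DMGRSSN', 'DNAME', 'DNUM']
    print(sorted(r2))  # ['ADDRESS', 'DNUM', 'DOB', 'ENAME', 'SSN']
```

DNUM stays in both schemas, so it can serve as the join attribute (and as a foreign key) when the original relation is rebuilt.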

31
BCNF Normalization
S# and SNAME: Supplier# and Supplier Name are unique
FDs
– S# -> SNAME
– SNAME -> S#
– S#, P# -> QTY
– SNAME, P# -> QTY
Candidate keys
– S#, P# and SNAME, P#
S#  SNAME        P#  QTY
S1  Acme Supply  P1  100
S2  Gem Mfg      P1  200
S1  Acme Supply  P2  400

32
BCNF Normalization
Redundancy?
1NF? 2NF? 3NF?
S#  SNAME        P#  QTY
S1  Acme Supply  P1  100
S2  Gem Mfg      P1  200
S1  Acme Supply  P2  400

33
BCNF
A relation is in BCNF if and only if the only determinants are candidate keys
FDs
– S# -> SNAME
– SNAME -> S#
– S#, P# -> QTY
– SNAME, P# -> QTY
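The BCNF condition can be checked by testing, for each non-trivial dependency, whether its determinant determines the whole schema. This Python sketch is an illustration with hypothetical helper names (closure, bcnf_violations); it uses the supplier dependencies listed on the slide and tests for super keys, which is equivalent to the candidate-key wording when determinants are irreducible.

```python
SCHEMA = frozenset({"S#", "SNAME", "P#", "QTY"})
FDS = [
    (frozenset({"S#"}), frozenset({"SNAME"})),
    (frozenset({"SNAME"}), frozenset({"S#"})),
    (frozenset({"S#", "P#"}), frozenset({"QTY"})),
    (frozenset({"SNAME", "P#"}), frozenset({"QTY"})),
]


def closure(attrs, fds):
    """Return every attribute functionally determined by attrs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


def bcnf_violations(schema, fds):
    """Return non-trivial dependencies whose determinant is not a super key."""
    return [(lhs, rhs) for lhs, rhs in fds
            if not rhs <= lhs and closure(lhs, fds) != schema]


for lhs, rhs in bcnf_violations(SCHEMA, FDS):
    print(sorted(lhs), "->", sorted(rhs))
# ['S#'] -> ['SNAME']
# ['SNAME'] -> ['S#']
```

Both violations involve only key attributes, which is why the relation passes the 3NF test as defined earlier and still stores each supplier name redundantly.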

34
BCNF Normalization
S#  P#  QTY
S1  P1  100
S2  P1  200
S1  P2  400
S#  SNAME
S1  Acme Supply
S2  Gem Mfg
Two candidate keys for the second relation: S# and SNAME


