Sridhar Narayan narayans@uncw.edu Normalization Sridhar Narayan narayans@uncw.edu.


1 Normalization
Sridhar Narayan, narayans@uncw.edu

2 EMP_PROJ
SSN PNUMBER HOURS ENAME PNAME PLOC
E1 P1 20 Joe CIS Roof UNCW P2 Restaurant Mayfaire E2 40
Something feels wrong about this design
Try adding a row – Insertion anomaly
Try deleting a row – Deletion anomaly
Try updating a row – Update anomaly
Need a formal way to reason about what is wrong with it and how to fix it

3 Functional Dependency
Constraints between attribute sets in a relation.
If X and Y are sets of attributes of a relation R, and whenever two tuples in R have the same X-values they also have the same Y-values, we say that X functionally determines Y.

4 Functional Dependency
Written as X -> Y
X functionally determines Y
Y is functionally determined by X
X is the determinant, Y is the dependent
Examples:
SSN -> SSN (trivial dependency)
PNUMBER -> PNAME
SSN -> ENAME
SSN, PNUMBER -> HOURS

5 Functional Dependency
Between sets of attributes, not just single attributes
Holds for all time, not just for a particular instance (snapshot) of a relation
Formally states constraints that exist for the relation
These constraints are in addition to those imposed by primary keys and foreign keys
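As a rough illustration of the definition, the sketch below tests whether one snapshot of a relation violates a dependency. The function name `holds` and the three sample rows are assumptions made for this example (the HOURS value of the second row in particular is invented). Because a functional dependency must hold for all time, a snapshot can only refute a dependency, never prove it.

```python
def holds(rows, lhs, rhs):
    """Return True if no two rows agree on lhs but differ on rhs."""
    seen = {}
    for row in rows:
        x = tuple(row[a] for a in lhs)
        y = tuple(row[a] for a in rhs)
        # Remember the first Y-value seen for each X-value; any later
        # row with the same X-value must carry the same Y-value.
        if seen.setdefault(x, y) != y:
            return False
    return True


# Hypothetical EMP_PROJ snapshot (illustrative rows, not the slide's exact table).
emp_proj = [
    {"SSN": "E1", "PNUMBER": "P1", "HOURS": 20, "ENAME": "Joe",
     "PNAME": "CIS Roof", "PLOC": "UNCW"},
    {"SSN": "E1", "PNUMBER": "P2", "HOURS": 20, "ENAME": "Joe",
     "PNAME": "Restaurant", "PLOC": "Mayfaire"},
    {"SSN": "E2", "PNUMBER": "P1", "HOURS": 40, "ENAME": "Joe",
     "PNAME": "CIS Roof", "PLOC": "UNCW"},
]

print(holds(emp_proj, ["SSN"], ["ENAME"]))            # True
print(holds(emp_proj, ["PNUMBER"], ["PNAME"]))        # True
print(holds(emp_proj, ["SSN", "PNUMBER"], ["HOURS"])) # True
print(holds(emp_proj, ["ENAME"], ["SSN"]))            # False: two employees named Joe
```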

6 Functional dependencies and keys
If X functionally determines all attributes of R, then X is a super key.
If, in addition, X is irreducible, i.e. every member of X is essential for the functional dependencies to hold, then X is a candidate key.
Attributes that are part of a candidate key are key attributes.

7 Examples
SSN PNUMBER HOURS ENAME PNAME PLOC
E1 P1 20 Joe CIS Roof UNCW P2 Restaurant Mayfaire E2 40
Super key: SSN, PNUMBER, PNAME -> SSN, PNUMBER, HOURS, ENAME, PNAME, PLOC
Candidate key: SSN, PNUMBER -> SSN, PNUMBER, HOURS, ENAME, PNAME, PLOC
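One way to check these two examples mechanically is to compute the closure of an attribute set under the functional dependencies. The sketch below is a minimal version of that idea; the helper names (`closure`, `is_superkey`, `is_candidate_key`) are hypothetical, and the dependency list assumes EMP_PROJ is governed by SSN -> ENAME, PNUMBER -> {PNAME, PLOC} and {SSN, PNUMBER} -> HOURS.

```python
def closure(attrs, fds):
    """All attributes functionally determined by attrs under fds."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def is_superkey(attrs, all_attrs, fds):
    return closure(attrs, fds) >= set(all_attrs)


def is_candidate_key(attrs, all_attrs, fds):
    """A super key is irreducible if dropping any one attribute breaks it."""
    attrs = set(attrs)
    return is_superkey(attrs, all_attrs, fds) and not any(
        is_superkey(attrs - {a}, all_attrs, fds) for a in attrs
    )


R = {"SSN", "PNUMBER", "HOURS", "ENAME", "PNAME", "PLOC"}
fds = [  # assumed dependencies for EMP_PROJ
    (("SSN",), ("ENAME",)),
    (("PNUMBER",), ("PNAME", "PLOC")),
    (("SSN", "PNUMBER"), ("HOURS",)),
]

print(is_superkey({"SSN", "PNUMBER", "PNAME"}, R, fds))       # True
print(is_candidate_key({"SSN", "PNUMBER", "PNAME"}, R, fds))  # False: PNAME is not essential
print(is_candidate_key({"SSN", "PNUMBER"}, R, fds))           # True
```

Testing only single-attribute removals is enough here, because any superset of a super key is itself a super key.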

8 Redundancy
If a nontrivial dependency A -> B holds in a relation R and A is not a candidate key for R, then R will involve some redundancy.
SSN PNUMBER HOURS ENAME PNAME PLOC
Intuitively, to eliminate redundancy, every functional dependency in a relation should have a candidate key as its determinant.

9 Normalization
A process that uses functional dependencies to identify relation schemas that have an undesirable form (redundancy) and decomposes them into smaller schemas in which the redundancy has been eliminated.

10 Decomposition
Decomposition should be:
Lossless join: allows exact recovery of the original relation (without spurious tuples)
Dependency preserving: allows dependencies to be checked without requiring a join

11 Lossy decomposition
Original:
SSN PNUMBER HOURS ENAME PNAME PLOC
E1 P1 20 Joe CIS Roof UNCW P2 Restaurant Mayfaire E2 40
Projection 1:
SSN PNUMBER HOURS ENAME
E1 P1 20 Joe P2 E2 40
Projection 2:
ENAME PNAME PLOC
Joe CIS Roof UNCW Restaurant Mayfaire

12 Natural join to recover original
SSN PNUMBER HOURS ENAME PNAME PLOC
E1 P1 20 Joe CIS Roof UNCW P2 Restaurant Mayfaire E2 40
E2 P1 40 Joe Restaurant Mayfaire
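The effect of splitting on ENAME can be reproduced with a small sketch. The helpers `project`, `natural_join` and `as_set` are hypothetical names, and the three rows are an assumed snapshot in the spirit of the slide (the second row's HOURS value is invented). Joining the two projections on ENAME yields tuples that were never in the original, including E2 P1 40 Joe Restaurant Mayfaire.

```python
def project(rows, attrs):
    """Relational projection: keep attrs and drop duplicate tuples."""
    seen = {tuple(r[a] for a in attrs) for r in rows}
    return [dict(zip(attrs, t)) for t in sorted(seen)]


def natural_join(r1, r2):
    """Pair up rows that agree on every attribute they share."""
    out = []
    for a in r1:
        for b in r2:
            if all(a[c] == b[c] for c in a.keys() & b.keys()):
                out.append({**a, **b})
    return out


def as_set(rows):
    return {tuple(sorted(r.items())) for r in rows}


# Hypothetical EMP_PROJ snapshot (illustrative rows).
emp_proj = [
    {"SSN": "E1", "PNUMBER": "P1", "HOURS": 20, "ENAME": "Joe",
     "PNAME": "CIS Roof", "PLOC": "UNCW"},
    {"SSN": "E1", "PNUMBER": "P2", "HOURS": 20, "ENAME": "Joe",
     "PNAME": "Restaurant", "PLOC": "Mayfaire"},
    {"SSN": "E2", "PNUMBER": "P1", "HOURS": 40, "ENAME": "Joe",
     "PNAME": "CIS Roof", "PLOC": "UNCW"},
]

r1 = project(emp_proj, ["SSN", "PNUMBER", "HOURS", "ENAME"])
r2 = project(emp_proj, ["ENAME", "PNAME", "PLOC"])
joined = natural_join(r1, r2)

spurious = as_set(joined) - as_set(emp_proj)
print(len(joined), len(spurious))  # 6 3
```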

13 Heath’s Theorem
If relation R = {A, B, C}, where A, B, C are attribute sets, and A -> B, then R1 = {A, B} and R2 = {A, C} represent a lossless join decomposition.
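A small sketch can make the theorem concrete. It takes A = {PNUMBER} and B = {PNAME, PLOC}, splits an assumed EMP_PROJ snapshot accordingly, and confirms that the natural join returns exactly the original rows. The helper names (`heath_split`, `project`, `natural_join`, `as_set`) and the sample rows are illustrative assumptions, and one instance is a demonstration, not a proof.

```python
def project(rows, attrs):
    seen = {tuple(r[a] for a in attrs) for r in rows}
    return [dict(zip(attrs, t)) for t in sorted(seen)]


def natural_join(r1, r2):
    return [{**a, **b} for a in r1 for b in r2
            if all(a[c] == b[c] for c in a.keys() & b.keys())]


def as_set(rows):
    return {tuple(sorted(r.items())) for r in rows}


def heath_split(all_attrs, a, b):
    """Given A -> B, return the schemas R1 = A + B and R2 = A + C."""
    a, b = list(a), list(b)
    c = [x for x in all_attrs if x not in a and x not in b]
    return a + b, a + c


attrs = ["SSN", "PNUMBER", "HOURS", "ENAME", "PNAME", "PLOC"]
emp_proj = [  # hypothetical snapshot in which PNUMBER -> PNAME, PLOC holds
    dict(zip(attrs, ["E1", "P1", 20, "Joe", "CIS Roof", "UNCW"])),
    dict(zip(attrs, ["E1", "P2", 20, "Joe", "Restaurant", "Mayfaire"])),
    dict(zip(attrs, ["E2", "P1", 40, "Joe", "CIS Roof", "UNCW"])),
]

s1, s2 = heath_split(attrs, ["PNUMBER"], ["PNAME", "PLOC"])
rejoined = natural_join(project(emp_proj, s1), project(emp_proj, s2))
print(s1)                                      # ['PNUMBER', 'PNAME', 'PLOC']
print(s2)                                      # ['PNUMBER', 'SSN', 'HOURS', 'ENAME']
print(as_set(rejoined) == as_set(emp_proj))    # True
```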

14 Levels of normalization
First normal form – 1NF
Second normal form – 2NF
Third normal form – 3NF
Boyce-Codd Normal Form – BCNF
Increasingly stringent requirements

15 Normal Forms
(Diagram labels: 1NF, 2NF, 3NF, BCNF)

16 First normal form
Relation is in 1NF if all attribute values are atomic
(By definition, all relations are in 1NF)
D_NAME D_NUM MGR_SSN D_LOCATIONS
RESEARCH 5 {Lumberton, Red Springs, Raeford}
Assume that a department can have multiple locations, like {Lumberton, Red Springs, Raeford}
Relation not in 1NF

17 Is this a resolution?
D_NAME D_NUM MGR_SSN D_LOCATIONS
RESEARCH 5 Lumberton Red Springs Raeford

18 Decomposition
D_NAME D_NUM MGR_SSN D_LOCATIONS
D_NAME D_NUM MGR_SSN
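One plausible reading of this decomposition is to move the set-valued attribute into its own relation keyed by the department number. The sketch below assumes that reading; the function name `split_multivalued` and the shape of the second relation (D_NUM, D_LOCATIONS) are assumptions, and MGR_SSN is left as None because no value is shown for it.

```python
def split_multivalued(rows, key, multi):
    """Split a set-valued attribute into a separate relation with atomic values."""
    base, detail = [], []
    for row in rows:
        # The base relation keeps everything except the set-valued attribute.
        base.append({k: v for k, v in row.items() if k != multi})
        # The detail relation gets one (key, value) row per element of the set.
        for value in sorted(row[multi]):
            detail.append({key: row[key], multi: value})
    return base, detail


department = [
    {"D_NAME": "RESEARCH", "D_NUM": 5, "MGR_SSN": None,  # MGR_SSN not shown
     "D_LOCATIONS": {"Lumberton", "Red Springs", "Raeford"}},
]

base, locations = split_multivalued(department, "D_NUM", "D_LOCATIONS")
print(base)       # [{'D_NAME': 'RESEARCH', 'D_NUM': 5, 'MGR_SSN': None}]
for row in locations:
    print(row)    # one row per location, each with an atomic value
```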

19 Second Normal Form: 2NF
A relation is in 2NF if
It is in 1NF, and
The non-key attributes are fully (irreducibly) dependent on the primary key

20 Example: EMP_PROJ
SSN PNUMBER HOURS ENAME PNAME PLOC
Functional Dependencies?
SSN -> ENAME
PNUMBER -> PNAME, PLOC
{SSN, PNUMBER} -> HOURS
Relation not in 2NF
Non-key attributes ENAME, PNAME, and PLOC are not fully dependent on the primary key
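The 2NF test can be sketched as a search for non-key attributes that are determined by a proper subset of the primary key. The function names `closure` and `partial_dependencies` are hypothetical, and the sketch assumes {SSN, PNUMBER} is the only candidate key.

```python
from itertools import combinations


def closure(attrs, fds):
    """All attributes functionally determined by attrs under fds."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def partial_dependencies(key, all_attrs, fds):
    """Map each proper subset of the key to the non-key attributes it determines."""
    key = set(key)
    non_key = set(all_attrs) - key
    found = {}
    for size in range(1, len(key)):
        for subset in combinations(sorted(key), size):
            dependent = closure(subset, fds) & non_key
            if dependent:
                found[subset] = dependent
    return found


R = ["SSN", "PNUMBER", "HOURS", "ENAME", "PNAME", "PLOC"]
fds = [
    (("SSN",), ("ENAME",)),
    (("PNUMBER",), ("PNAME", "PLOC")),
    (("SSN", "PNUMBER"), ("HOURS",)),
]

violations = partial_dependencies(["SSN", "PNUMBER"], R, fds)
for subset, dependent in violations.items():
    print(subset, "->", sorted(dependent))
# ('PNUMBER',) -> ['PLOC', 'PNAME']
# ('SSN',) -> ['ENAME']
```

An empty result would mean every non-key attribute depends on the whole key, which is the 2NF condition; HOURS is the only attribute here that passes.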

21 Solution? Decompose
1a: SSN PNUMBER HOURS (2NF)
1b: SSN PNUMBER ENAME PNAME PLOC (2NF?)

22 Decompose further…
2a: SSN ENAME (2NF)
2b: SSN PNUMBER PNAME PLOC (2NF?)

23 And a little more…
3a: PNUMBER PNAME PLOC (2NF)
3b: SSN PNUMBER
3b is a part of 1a, so drop it.

24 2NF Normalization
1a: SSN PNUMBER HOURS (2NF)
2a: SSN ENAME (2NF)
3a: PNUMBER PNAME PLOC (2NF)

25 More than one way to get here
SSN PNUMBER HOURS ENAME PNAME PLOC
1a: PNUMBER PNAME PLOC (2NF)
1b: SSN PNUMBER HOURS ENAME (Not 2NF)

26 Decompose further…
2a: SSN PNUMBER HOURS (2NF)
2b: SSN PNUMBER ENAME (Not 2NF)

27 And a little bit more
3a: SSN ENAME (2NF)
3b: SSN PNUMBER (Redundant)

28 3NF Normalization
A relation is in 3NF if
It is in 2NF, and
The non-key attributes are mutually independent. That is, no functional dependencies exist between non-key attributes.

29 Example: EMP_DEPT
SSN ENAME DOB ADDRESS DNUM DNAME DMGRSSN
Functional Dependencies?
SSN -> {ENAME, DOB, ADDRESS, DNUM}
DNUM -> {DNAME, DMGRSSN}
Redundancy?
Relation in 1NF? 2NF? 3NF?

30 3NF Normalization
1a: SSN ENAME DOB ADDRESS DNUM
1b: DNUM DNAME DMGRSSN
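The step from EMP_DEPT to 1a and 1b can be sketched as follows: find a dependency whose determinant consists only of non-key attributes, then split on it. The function names `transitive_violations` and `split_on` are hypothetical, and the sketch assumes SSN is the only candidate key of EMP_DEPT.

```python
def transitive_violations(key, fds):
    """Dependencies whose determinant is made of non-key attributes only
    and which determine at least one non-key attribute."""
    key = set(key)
    return [(lhs, rhs) for lhs, rhs in fds
            if not set(lhs) & key and set(rhs) - key]


def split_on(all_attrs, lhs, rhs):
    """Move the dependents of lhs -> rhs into their own relation."""
    moved = [a for a in rhs if a not in lhs]
    r_a = [a for a in all_attrs if a not in moved]  # keeps lhs as the link
    r_b = list(lhs) + moved
    return r_a, r_b


emp_dept = ["SSN", "ENAME", "DOB", "ADDRESS", "DNUM", "DNAME", "DMGRSSN"]
fds = [
    (("SSN",), ("ENAME", "DOB", "ADDRESS", "DNUM")),
    (("DNUM",), ("DNAME", "DMGRSSN")),
]

bad = transitive_violations(["SSN"], fds)
print(bad)   # [(('DNUM',), ('DNAME', 'DMGRSSN'))]

lhs, rhs = bad[0]
r_1a, r_1b = split_on(emp_dept, lhs, rhs)
print(r_1a)  # ['SSN', 'ENAME', 'DOB', 'ADDRESS', 'DNUM']
print(r_1b)  # ['DNUM', 'DNAME', 'DMGRSSN']
```

Because DNUM -> {DNAME, DMGRSSN}, this split has the shape R1 = {A, B}, R2 = {A, C} required by Heath's theorem, so the join is lossless.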

31 BCNF Normalization
S# SNAME P# QTY
S1 Acme Supply P1 100 S2 Gem Mfg 200 P2 400
S# and SNAME (Supplier# and Supplier Name) are unique
FDs:
S# -> SNAME
SNAME -> S#
S#, P# -> QTY
SNAME, P# -> QTY
Candidate keys: {S#, P#} and {SNAME, P#}

32 BCNF Normalization
S# SNAME P# QTY
S1 Acme Supply P1 100 S2 Gem Mfg 200 P2 400
Redundancy? 1NF? 2NF? 3NF?

33 BCNF
Relation is in BCNF if and only if the only determinants are candidate keys
FDs:
S# -> SNAME
SNAME -> S#
S#, P# -> QTY
SNAME, P# -> QTY

34 BCNF Normalization
S# P# QTY
S1 P1 100 S2 200 P2 400
S# SNAME
S1 Acme Supply
S2 Gem Mfg
Two candidate keys: S#, SNAME
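The BCNF condition can be checked with an attribute-closure sketch: a nontrivial dependency violates BCNF when its determinant does not determine every attribute of the relation. The function names are hypothetical, and the dependency lists for the two decomposed relations are assumed to be the projections of the original four FDs.

```python
def closure(attrs, fds):
    """All attributes functionally determined by attrs under fds."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def bcnf_violations(attrs, fds):
    """Nontrivial dependencies whose determinant is not a super key."""
    attrs = set(attrs)
    return [(lhs, rhs) for lhs, rhs in fds
            if not set(rhs) <= set(lhs)           # ignore trivial dependencies
            and not closure(lhs, fds) >= attrs]   # determinant must reach everything


original = ["S#", "SNAME", "P#", "QTY"]
original_fds = [
    (("S#",), ("SNAME",)),
    (("SNAME",), ("S#",)),
    (("S#", "P#"), ("QTY",)),
    (("SNAME", "P#"), ("QTY",)),
]
print(bcnf_violations(original, original_fds))
# [(('S#',), ('SNAME',)), (('SNAME',), ('S#',))]

# Decomposed relations with their assumed projected dependencies.
shipments = bcnf_violations(["S#", "P#", "QTY"], [(("S#", "P#"), ("QTY",))])
suppliers = bcnf_violations(["S#", "SNAME"],
                            [(("S#",), ("SNAME",)), (("SNAME",), ("S#",))])
print(shipments, suppliers)  # [] []
```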

