Presentation is loading. Please wait.

Presentation is loading. Please wait.

Cs3431 Normalization Part II. cs3431 Attribute Closure : Example Consider R (A, B, C, D, E) with FDs A  B, B  C, CD  E Does A  E hold ? (Is A  E.

Similar presentations


Presentation on theme: "Cs3431 Normalization Part II. cs3431 Attribute Closure : Example Consider R (A, B, C, D, E) with FDs A  B, B  C, CD  E Does A  E hold ? (Is A  E."— Presentation transcript:

1 cs3431 Normalization Part II

2 cs3431 Attribute Closure : Example Consider R (A, B, C, D, E) with FDs A  B, B  C, CD  E Does A  E hold ? (Is A  E in F+ ?) Rephrase as : Is E in A+ ? Let us compute {A} + {A} + = {A, B, C} Therefore, A  E is false

3 cs3431 Decomposition Decomposition: Must be Lossless (no spurious tuples!)

4 cs3431 Decomposing Relations sNumbersNamepNumberpName s1Davep1MM s2Gregp2MM StudentProf FDs: pNumber  pName sNumbersNamepNumber s1Davep1 s2Gregp2 Student pNumberpName p1MM p2MM Professor

5 cs3431 Decomposition: Lossless Join Property sNumbersNamepName S1DaveMM S2GregMM Student pNumberpName p1MM p2MM Professor sNumbersNamepNumberpName s1Davep1MM s1Davep2MM s2Gregp1MM s2Gregp2MM StudentProf Spurious Tuples

6 cs3431 Normalization What is the algorithm for correct (lossless) decomposition ?

7 cs3431 Normalization Step : Decompose Consider relation R with set of attributes A R. Consider a FD : A  B (such that no other attribute in (A R – A – B) is functionally determined by A). If A is not a superkey for R, we may decompose R as: Create R’ (A R – B) Create R’’ with attributes A  B Key for R’’ = A Foreign key : R’ (A) references R’’ (A)

8 cs3431 Example Decomposition Revisited sNumbersNamepNumberpName s1Davep1MM s2Gregp2MM StudentProf FDs: pNumber  pName sNumbersNamepNumber s1Davep1 s2Gregp2 Student pNumberpName p1MM p2MM Professor FOREIGN KEY: Student (PNum) references Professor (PNum)

9 cs3431 Schema Refinement : Normal Forms Question : How decide if any refinement of schema is needed ? If a relation is in a certain normal form, then it is known that certain kinds of problems are avoided or minimized.

10 cs3431 Normal Form: BCNF Boyce Codd Normal Form (BCNF): For every non-trivial FD X  B in R, X is a superkey of R.

11 cs3431 BCNF Example Relation: SCI (student, course, instructor) FDs: student, course  instructor instructor  course Decomposition: SI (student, instructor) Instructor (instructor, course)

12 cs3431 Decomposition Algo into BCNF Repeated application of decomposition will result in: relations that are in BCNF; lossless join decomposition, and guaranteed to terminate.

13 cs3431 Decomposition : Dependency Preserving ? Intuition: Can we locally in each decomposed relation check the functional dependencies ? Consider relation CSJDPQV, C is key, JP C and SD P. Decomposition: CSJDQV and SDP Is it lossless ? Yes ! Is it in BCNF ? Yes ! Problem: Checking JP C requires a join!

14 cs3431 Dependency Preserving Decomposition Intuition : If R is decomposed into X, Y and Z, and we enforce FDs that hold on X, on Y and on Z, then all FDs that were given to hold on R must also hold. Formal Definition : Decomposition of R into X and Y is dependency preserving if (F X union F Y ) + = F + Projection of set of FDs F: If R is decomposed into X, Y,..., then projection of F onto X (denoted F X ) is the set of FDs U -> V in F + (closure of F ) such that U, V are in X.

15 cs3431 Dependency Preserving Decompositions Decomposition of R into X and Y is dependency preserving if (F X union F Y ) + = F + Important to consider F +, not F, in this definition: ABC, A B, B C, C A, decomposed into AB and BC. Is this dependency preserving? Is C A preserved ?

16 cs3431 BCNF and Dependency Preservation In general, a dependency preserving decomposition into BCNF may not exist ! Example : CSZ, CS Z, Z C Not in BCNF. Can’t decompose while preserving 1st FD.

17 cs3431 Dependency Preservation BCNF does not necessarily preserve FDs. But 3NF is guaranteed to be able to preserve FDs.

18 cs3431 Normal Form : 3NF Third Normal Form (3NF): For every non-trivial FD X  B in R, either X is a superkey of R, or B is a prime attribute (B is part of a key).

19 cs3431 3NF vs BCNF ? If R is in BCNF, obviously R is in 3NF. If R is in 3NF, R may not be in BCNF. If R is in 3NF, some redundancy is possible. 3NF is a compromise used when BCNF not achievable, i.e., when no ``good’’ decomp exists Important: Lossless-join, dependency-preserving decomposition of R into a collection of 3NF relations always possible !

20 cs3431 Algorithm : Decomposition into 3NF Decomposition algorithm again used, but typically can stop earlier). But how to ensure dependency preservation? Idea 1: If X Y is not preserved, add relation XY. Problem is that XY may violate 3NF! Idea 2 : Instead of the given set of FDs F, use a minimal cover for F.

21 cs3431 Minimal Cover for a Set of FDs Minimal cover G for a set of FDs F: Closure of F = closure of G. Right hand side of each FD in G is single attribute. If we modify G by deleting a FD or by deleting attributes from an FD in G, the closure changes. Intuition: every FD in G is needed, and ``as small as possible’’ in order to get the same closure as F. Example : If both J  C and JP  C, then only keep the first one.

22 cs3431 Minimal Cover for a Set of FDs Theorem : Use minimum cover of FD+ in decomposition guarantees that the decomposition is lossless-join and dependency-preserving. Example : Given : A  B, ABCD  E, EF  GH, ACDF  EG Then the minimal cover is: A  B, ACD  E, EF  G and EF  H

23 cs3431 Algorithm for Minimal Cover Decompose FD into one attribute on RHS Minimize left side of each FD Check each attribute on LHS to see if deleted while still preserving the equivalence to F+. Delete redundant FDs. Note: Several minimal covers may exist.

24 cs3431 3NF Decomposition Algorithm Compute minimal cover G of F Decompose R using minimal cover G of FD into lossless decomposition of R. Each Ri is in 3NF Fi is projection of F onto Ri Identify dependencies in F not preserved now, X  A Create relation XA : New relation XA preserves X  A X is key of XA, because G is minimal cover. Hence no Y subset X exists, with Y  A If another dependency exists in XA; can only imply attribute of X.

25 cs3431 Summary Step 1: BCNF is a good form for relation If a relation is in BCNF, it is free of redundancies that can be detected using FDs. Step 2 : If a relation is not in BCNF, we can try to decompose it into a collection of BCNF relations. Step 3: If a lossless-join dependency-preserving decomposition into BCNF is not possible (or unsuitable given typical queries), consider decomposition into 3NF. Note: Decompositions should be carried out while keeping performance requirements in mind.


Download ppt "Cs3431 Normalization Part II. cs3431 Attribute Closure : Example Consider R (A, B, C, D, E) with FDs A  B, B  C, CD  E Does A  E hold ? (Is A  E."

Similar presentations


Ads by Google