Published by Lilly Benjamin. Modified over 2 years ago.

1
Schema Refinement: Normal Forms

2
Given a relation schema R together with a set F of FD’s, we want to determine whether R is in “good” shape. If it is not, we need to decompose R into smaller “good” relations. How do we measure this goodness, and how do we achieve it? To address these issues, we study normal forms. If a relation schema is in some normal form, we know it is in “good” shape in the sense that it does not suffer from certain kinds of redundancy problems.

3
Normal Forms
The normal forms based on FD’s are:
First normal form (1NF)
Second normal form (2NF)
Third normal form (3NF)
Boyce-Codd normal form (BCNF)
These normal forms have increasingly restrictive requirements: every BCNF schema is in 3NF, every 3NF schema is in 2NF, and every 2NF schema is in 1NF.

4
First & Second Normal Forms
A relation scheme is in first normal form (1NF) if the values in the domain of each attribute of the relation are atomic. In other words, each attribute holds exactly one value per tuple, and that value is not a set or a list of values. A database scheme is in first normal form if every relation scheme it includes is in 1NF.
A relation scheme R is in second normal form (2NF) if it is in 1NF and every nonprime attribute (an attribute that belongs to no candidate key) is fully functionally dependent on every candidate key of R. A database scheme is in second normal form if every relation scheme it includes is in 2NF.

5
Third Normal Form
Let R be a relation schema, F a set of FD’s on R, X ⊆ R, and A ∈ R. We say that R is in third normal form (3NF) w.r.t. F if, for each FD X → A in F, at least one of the following conditions holds:
(1) A ∈ X (that is, X → A is a trivial FD), or
(2) X is a superkey, or
(3) A is part of some key of R.
To determine whether ≺ R, F ≻ is in 3NF: for every non-trivial FD X → A in F, we check whether X is a superkey. If not, we then check whether its RHS, A, is part of any key of R. If both conditions fail, we conclude that R is not in 3NF w.r.t. F.
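The 3NF test above can be sketched in Python. This is a minimal illustration, not part of the original slides: it assumes single-letter attribute names, represents an FD as a pair of frozensets, and finds candidate keys by brute force, so it is only suitable for small schemas. The helper names `fd`, `closure`, `candidate_keys`, and `is_3nf` are hypothetical.

```python
from itertools import combinations

def fd(lhs, rhs):
    """Build an FD lhs -> rhs from two strings of single-letter attributes."""
    return frozenset(lhs), frozenset(rhs)

def closure(attrs, fds):
    """Return the attribute closure of attrs under the given FDs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return frozenset(result)

def candidate_keys(schema, fds):
    """Brute force: all minimal attribute sets whose closure covers the schema."""
    schema = frozenset(schema)
    keys = []
    for size in range(1, len(schema) + 1):
        for combo in combinations(sorted(schema), size):
            cand = frozenset(combo)
            if any(key <= cand for key in keys):
                continue  # contains a smaller key, so it is not minimal
            if closure(cand, fds) >= schema:
                keys.append(cand)
    return keys

def is_3nf(schema, fds):
    """Check each FD in F: trivial, or LHS is a superkey, or RHS is prime."""
    schema = frozenset(schema)
    prime = frozenset().union(*candidate_keys(schema, fds))
    for lhs, rhs in fds:
        if closure(lhs, fds) >= schema:
            continue  # LHS is a superkey
        if not (rhs - lhs) <= prime:
            return False  # a non-trivial RHS attribute is not part of any key
    return True

# AB -> C, C -> B: the keys are AB and AC, so B is prime and the schema is in 3NF.
print(is_3nf("ABC", [fd("AB", "C"), fd("C", "B")]))
# A -> B, B -> C: the only key is A, and B -> C violates 3NF.
print(is_3nf("ABC", [fd("A", "B"), fd("B", "C")]))
```

Enumerating all attribute subsets is exponential in the number of attributes; the sketch favors clarity over efficiency.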

6
Boyce-Codd Normal Form
Let R be a relation schema, F a set of FD’s on R, X ⊆ R, and A ∈ R. We say that R is in Boyce-Codd normal form (BCNF) w.r.t. F if, for each FD X → A in F, at least one of the following holds:
(1) A ∈ X (that is, the FD is trivial), or
(2) X is a superkey.
To determine whether ≺ R, F ≻ is in BCNF, we check every non-trivial FD in F. If there exists a non-trivial FD X → A in F such that X⁺ ≠ R, then R is not in BCNF. Otherwise, R is in BCNF w.r.t. F.
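A short Python sketch of this BCNF test follows. It is an illustration under assumptions that are not in the slides: attributes are single letters, an FD is a pair of frozensets, and the names `fd`, `closure`, and `bcnf_violations` are hypothetical.

```python
def fd(lhs, rhs):
    """Build an FD lhs -> rhs from two strings of single-letter attributes."""
    return frozenset(lhs), frozenset(rhs)

def closure(attrs, fds):
    """Return the attribute closure of attrs under the given FDs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return frozenset(result)

def bcnf_violations(schema, fds):
    """Return the non-trivial FDs in F whose LHS closure is not the whole schema."""
    schema = frozenset(schema)
    return [
        (lhs, rhs)
        for lhs, rhs in fds
        if not rhs <= lhs and not closure(lhs, fds) >= schema
    ]

# AB -> C, C -> B: C is not a superkey, so C -> B violates BCNF.
violations = bcnf_violations("ABC", [fd("AB", "C"), fd("C", "B")])
print(violations)
```

An empty result means the schema is in BCNF with respect to F; returning the violating FDs rather than a boolean makes the next decomposition step easier to drive.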

7
Decomposition into BCNF
Consider ≺ R, F ≻, where R is in 1NF. If R is not in BCNF, we can always obtain a lossless-join decomposition of R into a collection of BCNF relations. However, the decomposition may not always be dependency preserving.
The basic step of a BCNF algorithm:
Suppose X → A ∈ F is an FD violating the BCNF requirement, where X ⊆ R and A ∈ R.
Decompose R into XA and R – A.
If either R – A or XA is not in BCNF, decompose it further.

8
Example
R = ABCDE, F = { A → B, C → D }
Decompose R on A → B:
    R1 = AB, F1 = { A → B }
    R2 = ACDE, F2 = { C → D }
Decompose R2 on C → D:
    R21 = CD, F21 = { C → D }
    R22 = ACE, F22 = { }
The resulting BCNF decomposition is { AB, CD, ACE }.
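The binary decomposition step and the example above can be sketched in Python. This is a hedged illustration rather than the slides' own algorithm text: instead of projecting F explicitly, it looks for a violation on a subschema by computing closures under the original F and intersecting them with the subschema, which is exponential but fine for small examples. Attribute names are assumed to be single letters, and `fd`, `closure`, `find_violation`, and `bcnf_decompose` are hypothetical names.

```python
from itertools import combinations

def fd(lhs, rhs):
    """Build an FD lhs -> rhs from two strings of single-letter attributes."""
    return frozenset(lhs), frozenset(rhs)

def closure(attrs, fds):
    """Return the attribute closure of attrs under the given FDs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return frozenset(result)

def find_violation(sub, fds):
    """Return (X, A) where X -> A holds on sub, is non-trivial, and X is not
    a superkey of sub; return None if sub is in BCNF."""
    attrs = sorted(sub)
    for size in range(1, len(attrs)):
        for combo in combinations(attrs, size):
            x = frozenset(combo)
            implied = closure(x, fds) & sub  # attributes of sub determined by X
            extra = implied - x
            if extra and implied != sub:
                return x, min(extra)
    return None

def bcnf_decompose(schema, fds):
    """Repeatedly split a violating schema S into XA and S - A."""
    todo = [frozenset(schema)]
    done = []
    while todo:
        sub = todo.pop()
        violation = find_violation(sub, fds)
        if violation is None:
            done.append(sub)
        else:
            x, a = violation
            todo.append(x | {a})
            todo.append(sub - {a})
    return set(done)

result = bcnf_decompose("ABCDE", [fd("A", "B"), fd("C", "D")])
print(sorted("".join(sorted(s)) for s in result))  # ['AB', 'ACE', 'CD']
```

Which violating FD is picked first is an arbitrary choice in this sketch (smallest LHS in alphabetical order), and different choices can lead to different decompositions.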

9
Decomposition into 3NF
We can always obtain a lossless-join, dependency-preserving decomposition of a relation into 3NF relations. How? We discuss two approaches.
Approach 1: Follow the binary decomposition method for BCNF. Let { R1, R2, ..., Rn } be the resulting decomposition of R. Recall that this decomposition is always lossless-join but may not preserve the FD’s, so we may need to fix it:
Identify the set N of FD’s in F that are lost (i.e., not preserved).
For each FD X → A in N, create a relation schema XA and add it to the decomposition.
A refinement step: if several FD’s in N have the same LHS, e.g., X → A1, X → A2, ..., X → Ak, we replace these k FD’s with the single equivalent FD X → A1…Ak and create just one relation with schema XA1…Ak instead of the k relation schemas XA1, …, XAk.

10
Example (3NF Decomposition)
R = ABCDE, F = { BD → E, C → B, CE → A }
Decompose R on BD → E:
    R1 = BDE, F1 = { BD → E }
    R2 = ABCD, F2 = { C → B, CD → A }
Decompose R2 on C → B:
    R21 = CB, F21 = { C → B }
    R22 = ACD, F22 = { CD → A }
CE → A is not preserved, since A ∉ { CE }⁺ w.r.t. F1 ∪ F21 ∪ F22.
We add to the decomposition a new relation R3 = CEA with F3 = { CE → A }.
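The repair step of Approach 1 (find the lost FD’s, group them by LHS, and add one schema per group) can be sketched as follows. This is only an illustration: it assumes the projected FD sets are supplied by the caller, exactly as they are listed in the example, and the names `fd`, `closure`, and `add_lost_fds` are hypothetical.

```python
def fd(lhs, rhs):
    """Build an FD lhs -> rhs from two strings of single-letter attributes."""
    return frozenset(lhs), frozenset(rhs)

def closure(attrs, fds):
    """Return the attribute closure of attrs under the given FDs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return frozenset(result)

def add_lost_fds(decomposition, projected_fds, fds):
    """Append one schema X A1...Ak for each LHS X that has lost FDs.

    decomposition: list of schemas (frozensets)
    projected_fds: the union F1 ∪ ... ∪ Fn of the projected FD sets (assumed given)
    fds:           the original set F
    """
    lost_by_lhs = {}
    for lhs, rhs in fds:
        if not rhs <= closure(lhs, projected_fds):
            # The FD is not implied by the projections, so it is lost.
            lost_by_lhs.setdefault(lhs, set()).update(rhs)
    extra = [lhs | rhs for lhs, rhs in lost_by_lhs.items()]
    return list(decomposition) + extra

F = [fd("BD", "E"), fd("C", "B"), fd("CE", "A")]
schemas = [frozenset("BDE"), frozenset("CB"), frozenset("ACD")]
projections = [fd("BD", "E"), fd("C", "B"), fd("CD", "A")]

repaired = add_lost_fds(schemas, projections, F)
print(["".join(sorted(s)) for s in repaired])  # ['BDE', 'BC', 'ACD', 'ACE']
```

Grouping by LHS implements the refinement step: several lost FD’s with the same LHS produce a single added schema.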

11
Example (using a different order)
R = ABCDE, F = { BD → E, C → B, CE → A }
Decompose R on CE → A:
    R1 = CEA, F1 = { CE → A }
    R2 = BCED, F2 = { C → B, BD → E }
Decompose R2 on BD → E:
    R21 = BDE, F21 = { BD → E }
    R22 = BCD, F22 = { C → B }
Decompose R22 on C → B:
    R221 = BC, F221 = { C → B }
    R222 = CD, F222 = { }
This decomposition is dependency preserving and, of course, lossless-join.

12
Decomposition into 3NF
Previous (binary decomposition) approach:
Lossless-join √
May not be dependency preserving. If so, add extra relations XA, one for each FD X → A that was lost.
Now, the synthesis approach:
Dependency preservation √
May not be lossless-join. If so, add to the decomposition just one extra relation schema, consisting of the attributes of any one key of R.
What would be the FD’s on this newly added relation?

13
Decomposition into 3NF (synthesis)
Consider relation schema ≺ R, F ≻. The synthesis approach:
Get a canonical cover Fc of F.
For each FD X → A in Fc, add the schema XA to the decomposition.
If the resulting decomposition is not lossless, fix it by adding an extra relation schema containing just the attributes that form any one key of R.

14
Example
R = ( A, B, C ), F = { A → B, C → B }
Decompose R into R1 = ( A, B ) and R2 = ( B, C ).
This decomposition is not lossless.
Add R3 = ( A, C ), since AC is a key of R.
The decomposition { R1, R2, R3 } is both lossless and dependency-preserving.
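The synthesis approach and the example above can be sketched in Python under two stated assumptions. First, the FD set passed in is assumed to already be a canonical cover; computing the cover is not shown. Second, instead of running a full lossless-join test, the sketch uses a standard sufficient condition: if some generated schema is a superkey of R, the decomposition is lossless, and otherwise a key schema is added. The names `fd`, `closure`, `find_key`, and `synthesize_3nf` are hypothetical.

```python
def fd(lhs, rhs):
    """Build an FD lhs -> rhs from two strings of single-letter attributes."""
    return frozenset(lhs), frozenset(rhs)

def closure(attrs, fds):
    """Return the attribute closure of attrs under the given FDs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return frozenset(result)

def find_key(schema, fds):
    """Shrink the full schema to one candidate key by dropping redundant attributes."""
    key = set(schema)
    for attr in sorted(schema):
        if closure(key - {attr}, fds) >= schema:
            key.remove(attr)
    return frozenset(key)

def synthesize_3nf(schema, cover):
    """Build one schema per LHS of the (assumed canonical) cover, then add a
    key schema if no generated schema is a superkey of R."""
    schema = frozenset(schema)
    by_lhs = {}
    for lhs, rhs in cover:
        by_lhs.setdefault(lhs, set()).update(rhs)
    schemas = []
    for lhs, rhs in by_lhs.items():
        candidate = lhs | rhs
        if candidate not in schemas:
            schemas.append(candidate)
    if not any(closure(s, cover) >= schema for s in schemas):
        schemas.append(find_key(schema, cover))
    return schemas

result = synthesize_3nf("ABC", [fd("A", "B"), fd("C", "B")])
print(["".join(sorted(s)) for s in result])  # ['AB', 'BC', 'AC']
```

`find_key` returns just one key, which is all the fix-up step requires; a schema that is contained in another generated schema could additionally be dropped, which this sketch does not do.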

15
An Algorithm to Check Lossless Join
Suppose relation R = { A1, ..., Ak } is decomposed into R1, ..., Rn. To determine whether this decomposition is lossless, we use a table L[1 … n][1 … k].
Initializing the table:
for each relation Ri do
    for each attribute Aj do
        if Aj is an attribute in Ri
            then L[i][j] ← a_{Aj}
            else L[i][j] ← b_{i,Aj}

16
Algorithm to Check Lossless (cont’d)
repeat
    for each FD X → Y in F do
        if ∃ rows i and j such that L[i][s] == L[j][s] for each column s corresponding to an attribute in X then
            for each column t corresponding to an attribute At in Y do
                if L[i][t] == a_{At} then L[j][t] ← a_{At}
                else if L[j][t] == a_{At} then L[i][t] ← a_{At}
                else L[j][t] ← L[i][t]
until no change
The decomposition is lossless if, after performing this algorithm, L contains a row of all a’s, that is, if there exists a row i in L such that L[i][t] == a_{At} for every column t corresponding to an attribute At in R.
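The table-based test can be sketched in Python as follows. This is an illustrative version, not a transcription of the slide: when two symbols are equated, the sketch replaces the dropped symbol everywhere in that column (the usual chase step), which is a slightly stronger update than the pairwise assignment in the pseudocode. The names `fd` and `is_lossless` and the tuple encoding of the a and b symbols are assumptions made for this example.

```python
def fd(lhs, rhs):
    """Build an FD lhs -> rhs from two strings of single-letter attributes."""
    return frozenset(lhs), frozenset(rhs)

def is_lossless(schema, decomposition, fds):
    """Return True if the table ends up with a row consisting only of a's."""
    attrs = sorted(schema)
    A = ("a",)  # the distinguished symbol; its column is implied by the dict key
    # Row i gets "a" in the columns of R_i and a row-specific "b" elsewhere.
    table = [
        {attr: A if attr in rel else ("b", i) for attr in attrs}
        for i, rel in enumerate(decomposition)
    ]
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            for i in range(len(table)):
                for j in range(i + 1, len(table)):
                    if not all(table[i][x] == table[j][x] for x in lhs):
                        continue  # the two rows do not agree on X
                    for t in rhs:
                        u, v = table[i][t], table[j][t]
                        if u == v:
                            continue
                        # Prefer "a"; otherwise keep the smaller b symbol.
                        keep = A if A in (u, v) else min(u, v)
                        drop = v if keep == u else u
                        for row in table:
                            if row[t] == drop:
                                row[t] = keep
                        changed = True
    return any(all(row[attr] == A for attr in attrs) for row in table)

F = [fd("A", "B"), fd("C", "B")]
print(is_lossless("ABC", ["AB", "BC"], F))        # False
print(is_lossless("ABC", ["AB", "BC", "AC"], F))  # True
```

The loop terminates because every update reduces the number of distinct symbols in some column.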

17
Examples
Given ≺ R, F ≻, where R = ( A, B, C, D ) and F = { A → B, A → C, C → D } is a set of FD’s on R: is the decomposition { R1, R2 } lossless, where R1 = ( A, B, C ) and R2 = ( C, D )? To be discussed in class.
Now consider S = ( A, B, C, D, E ) and the set G of FD’s on S, where G = { AB → CD, A → E, C → D }: is the decomposition { S1, S2, S3 } of S lossless, where S1 = ( A, B, C ), S2 = ( B, C, D ), and S3 = ( C, D, E )? To be discussed in class.

18
Dependency-Preserving Checking
Consider ≺ R, F ≻, where F = { X1 → Y1, …, Xn → Yn }. Let { R1, …, Rk } be a decomposition of R, and let Fi be the projection of F on Ri. The algorithm below decides whether the decomposition is dependency preserving.
preserved ← TRUE
for each FD X → Y in F and while preserved == TRUE do
begin
    compute X⁺ under F1 ∪ ... ∪ Fk;
    if Y ⊈ X⁺ then preserved ← FALSE;
end

19
Example
Consider R = ( A, B, C, D ), F = { A → B, B → C, C → D }. Is the decomposition { R1, R2 } dependency-preserving, where R1 = ( A, B ), F1 = { A → B }, R2 = ( A, C, D ), and F2 = { C → D, A → D, A → C }?
Check if A → B is preserved:
    Compute A⁺ under { A → B } ∪ { C → D, A → D, A → C }: A⁺ = { A, B, C, D }
    Check if B ∈ A⁺: yes, so A → B is preserved.
Check if B → C is preserved:
    Compute B⁺ under { A → B } ∪ { C → D, A → D, A → C }: B⁺ = { B }
    Check if C ∈ B⁺: no, so B → C is not preserved.
The decomposition is not dependency-preserving.
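The dependency-preservation check and the example above can be sketched in Python. As in the slides, the projected FD sets are assumed to be given; the sketch stops at the first FD that is not preserved, mirroring the early exit of the loop. The names `fd`, `closure`, and `first_lost_fd` are hypothetical.

```python
def fd(lhs, rhs):
    """Build an FD lhs -> rhs from two strings of single-letter attributes."""
    return frozenset(lhs), frozenset(rhs)

def closure(attrs, fds):
    """Return the attribute closure of attrs under the given FDs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return frozenset(result)

def first_lost_fd(fds, projected_fd_sets):
    """Return the first FD X -> Y of F with Y not contained in X+ computed
    under the union of the projections, or None if every FD is preserved."""
    union = [f for fd_set in projected_fd_sets for f in fd_set]
    for lhs, rhs in fds:
        if not rhs <= closure(lhs, union):
            return lhs, rhs
    return None

F = [fd("A", "B"), fd("B", "C"), fd("C", "D")]
F1 = [fd("A", "B")]
F2 = [fd("C", "D"), fd("A", "D"), fd("A", "C")]

lost = first_lost_fd(F, [F1, F2])
print(lost is None)  # False: the decomposition is not dependency-preserving
print(lost == fd("B", "C"))  # True: B -> C is the first FD that is lost
```

Returning the offending FD instead of a bare boolean is a design choice of this sketch; it makes it easy to see which dependency would need an extra relation.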
