1
CSC 453 Database Systems Lecture
Tanu Malik College of CDM DePaul University
2
Today Quiz Errata Normalization Functional Dependencies Review Closure
Minimal Cover Equivalence
3
Why Normalization? Databases are published in bulk.
City of Chicago datasets Property Tax Bill data Scientific data
4
Transaction Data
TransID | Amount | # Shares | Investment | Risk Category | YTD Yield
1 | 100 | 10 | Gold | Low | 3.0%
2 | 200 | | Pirates+ | High | 50.2%
3 | 202 | 20 | | |
4 | 505 | 25 | | |
5 | 900 | 40 | US Bonds | | 0.1%
6 | 650 | | Treasure Hunt | Medium | 12.17%
7 | 350 | 14 | | |
8 | 420 | | | |
5
Redundancy in Data (same transaction table as on the previous slide)
6
Repeating Combinations (same transaction table as on the previous slide)
Leads to:
Wasted space
Insert/Update/Delete Anomalies
7
Anomalies
Update Anomaly: The hourly_wages value in the first tuple could be updated without making a similar change in the second tuple.
Insertion Anomaly: We cannot insert a tuple for an hourly_wages value unless we have an employee with that wage.
Deletion Anomaly: If we delete all tuples with a given rating value (e.g., we delete the tuples for Charlie and Don), we lose the association between that rating value and its hourly_wages value.
8
Functional Dependencies
9
Functional Dependencies
A set of attributes Y is functionally dependent on a set of attributes X if and only if the values of X uniquely determine the values of Y.
Also written “X determines Y” or “X → Y”.
X is called the determinant.
A functional dependency must hold for all possible relation states to be valid.
10
Do Not Trust the Data
FDs cannot be derived from data.
It appears that City → State:
AddressID | City | State | ZipCode
1 | Chicago | IL | 60604
2 | Portland | ME | 04104
3 | | | 60611
4 | Boston | MA | 02468
11
False Functional Dependency
New data may violate this “FD”. We need other ways to validate it.
AddressID | City | State | ZipCode
1 | Chicago | IL | 60604
2 | Portland | ME | 04104
3 | | | 60611
4 | Boston | MA | 02468
5 | | OR | 97232
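The point of the two slides above can be sketched in code: an FD can be checked against one relation state, but passing that check says nothing about future states. The helper name `fd_holds` and the completed rows (in particular the City values the slide leaves blank) are assumptions made for illustration only.

```python
def fd_holds(rows, lhs, rhs):
    """Return True if lhs -> rhs holds in this particular list of row dicts."""
    seen = {}
    for row in rows:
        key = tuple(row[a] for a in lhs)
        val = tuple(row[a] for a in rhs)
        # The first row with a given lhs value fixes the rhs value it must map to.
        if seen.setdefault(key, val) != val:
            return False
    return True

# Illustrative rows (assumed values, not the slide's exact table).
addresses = [
    {"City": "Chicago", "State": "IL", "ZipCode": "60604"},
    {"City": "Portland", "State": "ME", "ZipCode": "04104"},
    {"City": "Chicago", "State": "IL", "ZipCode": "60611"},
    {"City": "Boston", "State": "MA", "ZipCode": "02468"},
]
print(fd_holds(addresses, ["City"], ["State"]))  # True, but only for this state

# A new row arrives and the apparent FD breaks.
addresses.append({"City": "Portland", "State": "OR", "ZipCode": "97232"})
print(fd_holds(addresses, ["City"], ["State"]))  # False
```

A check like this can refute an FD, but it cannot establish one.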
12
Where do FDs Come From?
Given by domain experts
Deduced by the database designer
Laws of physics: X, Y, Z → Location in 3D space
Not based on data
13
Inference Rules
Reflexivity: If X includes Y, then X → Y
Augmentation: If X → Y, then XZ → YZ
Transitivity: If X → Y and Y → Z, then X → Z
[Decomposition: If X → YZ, then X → Y]
[Union: If X → Y and X → Z, then X → YZ]
[Pseudo-transitivity: If X → Y and WY → Z, then WX → Z]
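The bracketed rules follow from the first three. As one worked case, here is a short derivation of the Union rule using only augmentation and transitivity (the step layout is ours, not the lecture's):

```latex
\begin{align*}
1.\;& X \to Y    && \text{given} \\
2.\;& X \to XY   && \text{augmentation of (1) with } X \\
3.\;& X \to Z    && \text{given} \\
4.\;& XY \to YZ  && \text{augmentation of (3) with } Y \\
5.\;& X \to YZ   && \text{transitivity, (2) and (4)}
\end{align*}
```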
14
Search notion of a f.d Accessing ‘x’ requires a column name and rowid
B C 1 a x 2 b y 3 4 c A BC BC A
15
Closures
For a set of functional dependencies F, the closure of F (F+) is the set of all functional dependencies that can be derived from F.
F+ can be constructed from the closures under F of all possible sets of attributes X.
For F and a set of attributes X, the closure of X under F (X+) is the set of all attributes that can be determined from X.
16
Finding a Closure of X under F
To find the closure of X under F:
set X+ = X
repeat
  set oldX+ = X+
  for each Y → Z in F do
    if X+ includes Y, then set X+ = X+ ∪ Z
until oldX+ = X+
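The loop above translates almost line for line into Python. This is a minimal sketch, assuming single-letter attribute names and FDs encoded as `(lhs, rhs)` string pairs; the function name `closure` and that encoding are our choices, not the lecture's.

```python
def closure(attrs, fds):
    """Closure of the attribute set `attrs` under `fds`.

    `fds` is a list of (lhs, rhs) pairs; each side is a string of
    single-letter attribute names, e.g. ("AB", "C") for AB -> C.
    """
    result = set(attrs)
    changed = True
    while changed:                      # repeat ... until oldX+ = X+
        changed = False
        for lhs, rhs in fds:
            # If X+ includes Y and Z adds something new, grow X+.
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

# FD set taken from the "Find Candidate Keys" slide.
F = [("AB", "C"), ("A", "DE"), ("B", "F"), ("F", "GH")]
print("".join(sorted(closure("A", F))))    # ADE
print("".join(sorted(closure("AB", F))))   # ABCDEFGH
```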
17
Types of Keys: Super Key, Candidate Key, Primary Key
Super Key: For each set X, find the closure of X under F. If X+ contains all attributes, then X is a superkey.
Candidate Key: If X is a superkey, but no proper subset Y of X is a superkey, then X is a candidate key.
Primary Key: Choose a candidate key to be the primary key.
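These definitions suggest a brute-force search: try attribute sets from smallest to largest and keep every superkey that does not contain an already found key. The sketch below assumes single-letter attributes and the `(lhs, rhs)` string encoding; `candidate_keys` is an illustrative name, and the search is exponential, so it only suits classroom-sized schemas.

```python
from itertools import combinations

def closure(attrs, fds):
    """Attribute closure; fds is a list of (lhs, rhs) strings."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

def candidate_keys(schema, fds):
    """All minimal attribute sets whose closure is the whole schema."""
    all_attrs = set(schema)
    keys = []
    for size in range(1, len(all_attrs) + 1):
        for combo in combinations(sorted(all_attrs), size):
            cand = set(combo)
            if any(k <= cand for k in keys):
                continue                 # contains a smaller key: not minimal
            if closure(cand, fds) == all_attrs:
                keys.append(cand)        # superkey with no smaller key inside
    return keys

# First exercise from the "Find Candidate Keys" slide.
F = [("AB", "C"), ("A", "DE"), ("B", "F"), ("F", "GH")]
print(["".join(sorted(k)) for k in candidate_keys("ABCDEFGH", F)])  # ['AB']
```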
18
Find Candidate Keys
R(ABCDEFGH) F = {AB → C, A → DE, B → F, F → GH}
F = {AB → C, BD → EF, AD → G, A → H}
R(ABCDE) F = {BC → ACE, D → B}
F = {AB → CD, D → A, BC → DE}
19
More examples
R(WXYZ): Z → W, Y → XZ, WX → Y
20
Equivalent Sets of F.D.s
Given a relation R and two sets of functional dependencies F and G on R, the equivalence question asks whether F ⊆ G, G ⊆ F, F = G, or F ≠ G, where the comparison is between what the two sets imply (their closures F+ and G+).
21
Example F={A→B,AB→C,D→AC,D→E} G={A→BC,D→AE}
22
Algorithm
We must bring F and G to the same starting point. How?
Compute the closure of each left-hand side in F, but using the f.d.s in G.
Compute the closure of each left-hand side in G, but using the f.d.s in F.
Compare: if every f.d. of F follows from G and every f.d. of G follows from F, then the search power of F is equivalent to the search power of G.
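The comparison above can be sketched with two small helpers built on attribute closure. The names `covers` and `equivalent` are illustrative, and the `(lhs, rhs)` string encoding with single-letter attributes is an assumption. The F and G used are the ones from the example slide.

```python
def closure(attrs, fds):
    """Attribute closure; fds is a list of (lhs, rhs) strings."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

def covers(f, g):
    """True if every FD in g can be derived from f."""
    # For each X -> Y in g, compute X+ using f and check that it contains Y.
    return all(set(rhs) <= closure(lhs, f) for lhs, rhs in g)

def equivalent(f, g):
    """True if f and g imply exactly the same dependencies."""
    return covers(f, g) and covers(g, f)

F = [("A", "B"), ("AB", "C"), ("D", "AC"), ("D", "E")]
G = [("A", "BC"), ("D", "AE")]
print(equivalent(F, G))  # True
```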
23
Minimal Cover
F is a minimal cover if no dependency can be removed from F without changing what F implies. In other words, F is an irreducible functional dependency set.
24
Finding Minimal Cover
Find a minimal cover by eliminating the redundant (extraneous) attributes and dependencies in the set F.
Given an f.d. α → β, there are 3 kinds of redundancy: the redundancy is on the r.h.s., the redundancy is on the l.h.s., or the entire f.d. is redundant.
25
Eliminate redundancy on R.H.S
Use the decomposition rule to decompose all f.d.s in F: α → βγ becomes α → β and α → γ.
Once there is only one attribute on the r.h.s., either the entire f.d. is redundant or the f.d. is not redundant at all.
26
Eliminate entire f.d
Eliminate an f.d. if it is exactly repeated or can be inferred transitively.
Given α → β in F, first find α+ using F, then compute α+ again using F − {α → β}. If α+ does not change, the f.d. is redundant.
27
Eliminate redundancy on l.h.s
Given an f.d. such as αγ → β, determine whether α → γ holds in F′ = F − {αγ → β}. If γ can be searched by just α in F′, without the given f.d., then γ is redundant in the given f.d. when searching for β.
28
Finding Minimal Cover Minimal/Canonical Cover Algorithm (CCA)
Given FD set, F, CCA finds minimal FD set equivalent to F minimal: can’t find another equivalent FD set with fewer FD’s
29
Minimal Cover Algorithm
ALGORITHM minimal-cover (F: FD Set)
BEGIN
REPEAT UNTIL STABLE
1. Remove all trivial FDs.
2. Where possible, apply the DECOMPOSITION rule (Armstrong’s Axioms)
3. Remove all extraneous attributes:
a. Test if B extraneous in A → BC (B extraneous if (A → B) ∈ (F – {A → BC} ∪ {A → C})+)
b. Test if B extraneous in AB → C (B extraneous if (A → C) ∈ F+)
END
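One way to make these steps concrete is a single-pass variant: decompose right-hand sides, shrink left-hand sides with test (b), then drop FDs that the rest already imply. This is a sketch under assumptions (single-letter attributes, `(lhs, rhs)` string input, our own function names), not necessarily the exact procedure used in the lecture, and a different processing order can yield a different but equivalent cover.

```python
def closure(attrs, fds):
    """Attribute closure; each FD is (lhs, rhs) with iterable sides."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

def minimal_cover(fds):
    # Steps 1-2: decompose right-hand sides and drop trivial FDs.
    work = []
    for lhs, rhs in fds:
        for a in rhs:
            fd = (frozenset(lhs), a)
            if a not in lhs and fd not in work:
                work.append(fd)
    # Step 3b: B is extraneous in AB -> C when (A -> C) is in F+.
    reduced = []
    for lhs, a in work:
        lhs = set(lhs)
        for b in sorted(lhs):
            if len(lhs) > 1 and a in closure(lhs - {b}, work):
                lhs.discard(b)
        fd = (frozenset(lhs), a)
        if fd not in reduced:
            reduced.append(fd)
    # Step 3a, single-attribute form: drop X -> A if the other FDs imply it.
    result = list(reduced)
    for fd in reduced:
        rest = [g for g in result if g != fd]
        if fd[1] in closure(fd[0], rest):
            result = rest
    return result

F = [("A", "BC"), ("B", "CE"), ("A", "E")]
print(["".join(sorted(l)) + "->" + r for l, r in minimal_cover(F)])
# ['A->B', 'B->C', 'B->E']
```

Merging FDs with the same left-hand side afterwards gives the compact form {A → B, B → CE}.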
30
Extraneous Attributes
1. Extraneous in RHS? e.g.: Can we replace A → BC with A → C? (i.e.: Is B extraneous in A → BC?)
2. Extraneous in LHS? e.g.: Can we replace AB → C with A → C? (i.e.: Is B extraneous in AB → C?)
Simple (but expensive) test:
1. Replace A → BC (or AB → C) with A → C in F. Define F2 = F – {A → BC} ∪ {A → C} OR F2 = F – {AB → C} ∪ {A → C}
2. Test: Is F2+ = F+? If yes, then B was extraneous.
31
Extraneous Attributes
A. RHS: Is B extraneous in A → BC?
Step 1: F2 = F – {A → BC} ∪ {A → C}
Step 2: F+ = F2+?
To simplify step 2, observe that F2+ ⊆ F+ (i.e.: no new FD’s in F2+). Why? We have effectively removed A → B from F.
When is F+ = F2+? A: When (A → B) ∈ F2+
Idea: If F2+ includes A → B and A → C, then it includes A → BC.
32
Extraneous Attributes
A. RHS: Given F = {A → BC, B → C}, is C extraneous in A → BC? Why or why not?
A: Yes, because (A → C) ∈ {A → B, B → C}+
Proof:
1. A → B Given
2. B → C Given
3. A → C Transitivity, (1) and (2)
Use Armstrong’s axioms in the proof.
33
Extraneous Attributes
B. LHS: Is B extraneous in AB → C?
Step 1: F2 = F – {AB → C} ∪ {A → C}
Step 2: F+ = F2+?
To simplify step 2, observe that F2+ ⊇ F+ (i.e.: there may be new FD’s in F2+). Why? A → C “implies” AB → C. Thus, all FD’s in F+ are also in F2+. But AB → C does not “imply” A → C. Thus, the FD’s in F2+ are not necessarily in F+.
When is F+ = F2+? A: When (A → C) ∈ F+
Idea: If (A → C) ∈ F+, then F+ will include all FD’s of F2+.
34
Minimal Cover Algorithm
Example: Determine the minimal cover of F = {A → BC, B → CE, A → E}
Iteration 1:
a. F = {A → B, A → C, A → E, B → E, B → C}
Must check for up to 5 extraneous attributes.
Is A → B extraneous? No
Is A → C extraneous? Yes: (A → C) ∈ {A → BE, B → CE}+
1. A → BE Given
2. A → B Decomposition (1)
3. B → CE Given
4. B → C Decomposition (3)
5. A → C Trans (2,4)
Is B → E extraneous? …
35
Minimal Cover Algorithm
Example (cont.): F = {A → BC, B → CE, A → E}
Iteration 1:
a. F = {A → B, A → C, A → E, B → E, B → C}
After removing A → C: F = {A → B, A → E, B → E, B → C}
Is A → E extraneous? Yes: (A → E) ∈ {A → B, B → E, B → C}+
1. A → B Given
2. B → CE Given
3. B → E Decomposition (2)
4. A → E Trans (1,3)
F = {A → B, B → E, B → C}
E extraneous in B → E? No
C extraneous in B → C? No
36
Minimal Cover Algorithm
Example (cont.): F = {A → BC, B → CE, A → E}
Iteration 1:
a. F = {A → BCE, B → CE}
b. Extraneous atts:
B extraneous in A → BCE? No
C extraneous in A → BCE? Yes…
E extraneous in A → BE? Yes…
E extraneous in B → CE? No
C extraneous in B → CE? No
Iteration 2:
a. F = {A → B, B → CE}
b. Extraneous atts:
E extraneous in B → CE? No
C extraneous in B → CE? No
DONE!
37
Minimal Cover of a set of FDs
F = {A → BC, B → CE, A → E, AC → H, D → B} F = {A → BC, B → C, A → B, AB → C, AC → D} A-> B, B -> CE, AC-> H, D->B
38
Minimal Cover Algorithm
Given: Determine canonical cover of F:
F = {A → BC, B → CE, A → E, AC → H, D → B}
Fc = {A → BH, B → CE, D → B}
Another Example:
F = {A → BC, B → C, A → B, AB → C, AC → D}
Applying the cc algorithm: Fc = {A → BD, B → C}
39
Minimal Cover Exercise
Relation R (A, B, C, D) Set F ABCD BC AB ABC BD Compute canonical cover
40
Normalization
41
Relational Table
A = TransID, B = Amount, C = # Shares, D = Investment, E = Risk Category, H = YTD Yield, I = Transaction Date
TransID | Amount | # Shares | Investment | Risk Category | YTD Yield | Transaction Date
1 | 100 | 10 | Gold | Low | 3.0% | 10/01/2016
2 | 200 | | Pirates+ | High | 50.2% |
3 | 202 | 20 | | | | 10/05/2016
4 | 505 | 25 | | | | 10/06/2017
5 | 900 | 40 | US Bonds | | 0.1% | 10/08/2017
6 | 650 | | Treasure | Medium | 12.17% |
7 | 350 | 14 | | | |
8 | 420 | | | | |
A → B, C, D; D → E, H; A, D → I
A → B, C, D; D → E, H; A → I
TransID → Amount, # of Shares, Investment
Investment → Risk Category, YTD Yield
TransID, Investment → Transaction Date
42
Option 1
TransID | Amount | # Shares | Transaction Date
1 | 100 | 10 | 10/01/2016
2 | 200 | |
3 | 202 | 20 | 10/05/2016
4 | 505 | 25 | 10/06/2017
5 | 900 | 40 | 10/08/2017
6 | 650 | |
7 | 350 | 14 |
8 | 420 | |
Investment | Risk Category | YTD Yield
Gold | Low | 3.0%
Pirates+ | High | 50.2%
US Bonds | | 0.1%
Treasure | Medium | 12.17%
A → B, C, I; D → E, H
TransID → Amount, # of Shares, Investment
Investment → Risk Category, YTD Yield
TransID, Investment → Transaction Date
43
Option 2
TransID | Amount | # Shares | Transaction Date
1 | 100 | 10 | 10/01/2016
2 | 200 | |
3 | 202 | 20 | 10/05/2016
4 | 505 | 25 | 10/06/2017
5 | 900 | 40 | 10/08/2017
6 | 650 | |
7 | 350 | 14 |
8 | 420 | |
Investment | Risk Category | YTD Yield | Transaction Date
Gold | Low | 3.0% | 10/01/2016
Pirates+ | High | 50.2% | 10/05/2016
 | | | 10/06/2017
US Bonds | | 0.1% | 10/08/2017
Treasure | Medium | 12.17% |
A → B, C, I; D → E, H
TransID → Amount, # of Shares, Investment
Investment → Risk Category, YTD Yield
TransID, Investment → Transaction Date
44
Issues with Options
Option 1: This option cannot regenerate the original table, as the two tables have no common attribute.
Option 2: This option has a common attribute but generates spurious tuples.
45
Goals of Decomposition
Redundancy avoidance: avoid any unnecessary data duplication
Lossless join: we want to be able to reconstruct what we started with (avoid Option 1 and Option 2)
Dependency preservation: minimize the cost of checking integrity constraints
46
Nonadditive OR Lossless Join
If we decompose a relation state and combine the resulting projections with a natural join, we would expect to get the original relation state.
There should not be any spurious tuples that were not present in the original relation state.
For the nonadditive join property to hold, this must be the case for every relation state.
47
Ensuring Lossless Joins
A decomposition of R, R = R1 ∪ R2, is lossless iff R1 ∩ R2 → R1 or R1 ∩ R2 → R2 or both (i.e.: the intersecting attributes must form a super key for one of the relations).
Intuition: The original relation R has n tuples, and R1 and R2 share the attributes A.
In R2, A is a key, so |R2| ≤ n and the relationship with R1 is n:1.
In R1, A is not a key, so |R1| = n.
The join therefore has n tuples in the result.
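The binary test above is a one-line closure check. This sketch assumes single-letter attributes and the `(lhs, rhs)` string encoding; the function name `is_lossless_binary` is illustrative.

```python
def closure(attrs, fds):
    """Attribute closure; fds is a list of (lhs, rhs) strings."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

def is_lossless_binary(r1, r2, fds):
    """True if R1 ∩ R2 -> R1 or R1 ∩ R2 -> R2 follows from fds."""
    common = set(r1) & set(r2)
    if not common:
        return False                 # no shared attribute to join on
    reachable = closure(common, fds)
    return set(r1) <= reachable or set(r2) <= reachable

F = [("A", "B"), ("B", "C")]
print(is_lossless_binary("AB", "BC", F))  # True: B -> BC
print(is_lossless_binary("AC", "BC", F))  # False: C determines nothing else
```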
48
Dependency Preservation
The union over all i in {1,…,m} of the projection of F on Ri is equivalent to F.
We want the set of all projections to be equivalent to F: the decomposition neither destroys any functional dependencies in R nor introduces any new ones.
We can always ensure that this property is preserved, but it is not automatic.
49
Projections Suppose we have a set of functional dependencies F over R
For each Ri, consider all X → Y in F such that both X and Y are subsets of Ri.
This set is called the projection of F on Ri.
The projection represents the set of all constraints that F puts on the attributes of Ri.
50
Projections: Intuition
R = (A, B, C, D), set of FDs - F R1 = (A, B, C), R2 = (C, D) Projection of F on R1 is all “local” FDs for R1 All FDs in F that use attributes (A, B, C) Projection of F on R2 is all “local” FDs for R2 All FDs in F that use attributes (C, D)
51
Dependency Preservation Test
The union over all i in {1,…,m} of the projection of F on Ri is equivalent to F.
We want the set of all projections to be equivalent to F: the decomposition does not destroy any functional dependencies in R.
We can always ensure that this property is preserved, but it is not automatic.
52
Dependency Preservation
The projection of F on Ri is the set of all X → Y in F+ such that both X and Y are subsets of Ri.
The projection represents the full set of constraints that F puts on the attributes of Ri.
For the dependency preservation property to hold, the union of the projections must be equivalent to F.
53
Projections and Dep. Preservation
R = {A, B, C, D}
F = {A → BD, B → DC, C → D}
Let R be decomposed into R1, R2 where R1 = {A,B,C}, R2 = {C,D}
Compute the projection of F on R1
Compute the projection of F on R2
Is the decomposition dependency preserving?
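The exercise above can be checked mechanically. The sketch below computes each projection from attribute closures (so it reflects F+, not just F) and then tests whether the union of the projections still implies every FD in F. Function names and the string encoding are assumptions, and the subset enumeration is exponential in the size of Ri.

```python
from itertools import combinations

def closure(attrs, fds):
    """Attribute closure; fds is a list of (lhs, rhs) strings."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

def project(fds, ri):
    """Non-trivial FDs X -> Y of F+ with X and Y inside Ri."""
    ri = set(ri)
    projected = []
    for size in range(1, len(ri) + 1):
        for combo in combinations(sorted(ri), size):
            rhs = (closure(combo, fds) & ri) - set(combo)
            if rhs:
                projected.append(("".join(combo), "".join(sorted(rhs))))
    return projected

def preserves_dependencies(fds, decomposition):
    """True if the union of the projections implies every FD in fds."""
    union = [fd for ri in decomposition for fd in project(fds, ri)]
    return all(set(rhs) <= closure(lhs, union) for lhs, rhs in fds)

F = [("A", "BD"), ("B", "DC"), ("C", "D")]
print(project(F, "ABC"))  # [('A', 'BC'), ('B', 'C'), ('AB', 'C'), ('AC', 'B')]
print(project(F, "CD"))   # [('C', 'D')]
print(preserves_dependencies(F, ["ABC", "CD"]))  # True
```

Here A → D and B → D are not local to either relation, but they are recovered through B → C in R1 and C → D in R2.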
54
Normalization Database normalization, or simply normalization, is the process of organizing the columns (attributes) and tables (relations) of a relational database to reduce data redundancy and improve data integrity.
55
Kinds of Normal Forms
First Normal Form
Second Normal Form
Third Normal Form
Boyce-Codd Normal Form
Fourth Normal Form
56
Prime Vs Non-Prime Attributes
Prime Attribute: Any attribute that is part of some candidate key
Non-Prime Attribute: Any attribute that is not part of any candidate key
57
First Normal Form (1NF)
First Normal Form: Each domain of a column must contain only atomic values, and each column in a record must have at most one value from its domain.
To get 1NF, remove composite attributes and multi-valued attributes.
Composite attributes are easy: just subdivide them into multiple attributes.
58
Removing Multi-Valued Attributes
Remove the multi-valued attribute from the relation.
Create a new relation with the primary key of the original relation and the multi-valued attribute.
For each tuple in the original relation with k values, add k tuples to the new relation.
The primary key of the new relation contains all attributes; the primary key of the original relation becomes a foreign key in the new relation referencing the original relation.
59
Second Normal Form (2NF)
Second Normal Form: 1NF, plus every non-prime attribute in the relation is determined by the entire primary key (but not by any proper subset).
To get 2NF, eliminate partial dependencies on the primary key.
X → Y is a partial dependency if Z → Y for some proper subset Z of X.
60
Removing Partial Dependencies
Find all dependencies where a subset of the primary key determines some non-prime attribute(s).
Starting with the smallest subset, do the following:
1. Remove all attributes on the right-hand side from the relation and put them in a new relation
2. Add the attributes in the determinant (l.h.s) to the new relation; make them the primary key, and make them a foreign key in the original relation referencing the new table
3. Remove from any remaining partial dependencies any attributes removed from the original relation
61
Converting 1NF to 2NF
(EmpID, EmpLName, EmpFName, Dept, ProjCode, Hours)
Functional Dependencies:
EmpID → EmpLName, EmpFName, Dept
EmpID, ProjCode → Hours
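Partial dependencies like the one in this schema can be found by taking the closure of every proper subset of the key. The sketch below assumes a single candidate key, (EmpID, ProjCode), so that "non-prime" simply means "not in the key". The function name `partial_dependencies` and the encoding of FDs as pairs of sets are ours.

```python
from itertools import combinations

def closure(attrs, fds):
    """Attribute closure; fds is a list of (lhs_set, rhs_set) pairs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result

def partial_dependencies(key, fds):
    """Map each proper subset Z of `key` to the non-key attributes it determines."""
    found = {}
    for size in range(1, len(key)):
        for combo in combinations(sorted(key), size):
            determined = closure(combo, fds) - set(key)
            if determined:
                found[combo] = determined
    return found

fds = [
    ({"EmpID"}, {"EmpLName", "EmpFName", "Dept"}),
    ({"EmpID", "ProjCode"}, {"Hours"}),
]
# EmpID alone determines the name and department attributes.
print(partial_dependencies({"EmpID", "ProjCode"}, fds))
```

Each entry in the result names a determinant and the attributes that would move to a new relation under the removal procedure.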
62
1NF vs 2NF Data EmpID EmpLName EmpFName Dept ProjCode Hours 550 Smith
Winston Accounting 101 20 252 10 601 Barney Finance 5 390 Hammond Evey Personnel 995 25 001 Preston Bill Special Events 100 Logan Ted 007 Bond James 505 Lane Lois Media Relations
63
Converting 1NF to 2NF
Decompose (EmpID, EmpLName, EmpFName, Dept, ProjCode, Hours)
Remove the offending FD: EmpID → EmpLName, EmpFName, Dept
Two tables:
(EmpID, EmpLName, EmpFName, Dept)
(EmpID, ProjCode, Hours)
64
Towards 2NF EmpID EmpLName EmpFName Dept 550 Smith Winston Accounting
601 Barney Finance 390 Hammond Evey Personnel 001 Preston Bill Special Events 100 Logan Ted 007 Bond James 505 Lane Lois Media Relations EmpID ProjCode Hours 550 101 20 252 10 601 5 390 995 25 001 100 007 505
65
Towards 2NF (same two tables as on the previous slide)
(EmpID, EmpLName, EmpFName, Dept): EmpID → EmpLName, EmpFName, Dept
(EmpID, ProjCode, Hours): EmpID, ProjCode → Hours
66
Third Normal Form (3NF)
Third Normal Form: 2NF, plus every non-prime attribute in the relation is determined only by the primary key of the relation.
To get 3NF, eliminate transitive dependencies on the primary key.
X → Y is a transitive dependency if X → Z and Z → Y for some Z that is disjoint from X.
67
Removing Transitive Dependencies
Find all dependencies where a set of attributes disjoint from the primary key determines some non-prime attribute(s).
Starting with the smallest such set, do the following:
1. Remove all attributes on the right-hand side from the relation and put them in a new relation
2. Add the attributes in the determinant to the new relation; make them the primary key, and make them a foreign key in the original relation referencing the new relation
3. Remove from any remaining transitive dependencies any attributes removed from the original relation
68
Converting 2NF to 3NF
Schema: (First, Last, Address, City, State, Zip)
First, Last → Address, City, State, Zip
Transitive functional dependency: Zip → City, State
69
2NF Data Table First Last Address City State ZipCode Henry Bienen
2145 Sheridan Rd Evanston IL 60202 Helmut Epp 55 E. Jackson St Chicago 60604 Denis Stein 55 E. Jackson St. David Miller 243 S. Wabash Av. Gary 5000 Forbes Av. Pittsburgh PA 15213 Leighton 77 Beacon St Cambridge MA 02139
70
Decompose the Tables City State ZipCode Evanston IL 60202 Chicago
First Last Address ZipCode Henry Bienen 2145 Sheridan Rd 60202 Helmut Epp 55 E. Jackson St 60604 Denis Stein 55 E. Jackson St. David Miller 243 S. Wabash Av. Gary 5000 Forbes Av. 15213 Leighton 77 Beacon St 02139 City State ZipCode Evanston IL 60202 Chicago 60604 Pittsburgh PA 15213 Cambridge MA 02139
71
Decompose the Tables (same two tables as on the previous slide)
(City, State, ZipCode): ZipCode → City, State
(First, Last, Address, ZipCode): First, Last → First, Last, Address, ZipCode
72
Third Normal Form
Third Normal Form: Every non-prime attribute is fully functionally dependent on every candidate key, and no non-prime attribute is transitively dependent on any candidate key.
“For every non-trivial functional dependency X → A, either X is a superkey or A is a prime attribute.”
73
Third Normal Form
A relation schema is in third normal form (3NF) if for all X → Y in F+ at least one of the following holds:
X → Y is trivial (i.e. Y ⊆ X)
X is a superkey for R
Each attribute A in Y – X is contained in a candidate key for R, i.e. A is a prime attribute.
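The three conditions can be turned into a checker that reports the offending FDs. This sketch tests only the FDs listed in F, which is enough to decide whether the whole relation R is in 3NF. Names, the string encoding, and the brute-force key search are assumptions. The example uses the R and F from the later "3NF Decomposition" slide.

```python
from itertools import combinations

def closure(attrs, fds):
    """Attribute closure; fds is a list of (lhs, rhs) strings."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

def candidate_keys(schema, fds):
    """All minimal superkeys, found by trying smaller sets first."""
    all_attrs, keys = set(schema), []
    for size in range(1, len(all_attrs) + 1):
        for combo in combinations(sorted(all_attrs), size):
            cand = set(combo)
            if not any(k <= cand for k in keys) and closure(cand, fds) == all_attrs:
                keys.append(cand)
    return keys

def violations_3nf(schema, fds):
    """List (X, A) pairs where X -> A breaks 3NF."""
    all_attrs = set(schema)
    prime = set().union(*candidate_keys(schema, fds))
    bad = []
    for lhs, rhs in fds:
        if closure(lhs, fds) == all_attrs:
            continue                          # X is a superkey
        for a in sorted(set(rhs) - set(lhs)): # ignore the trivial part
            if a not in prime:                # A is not a prime attribute
                bad.append((lhs, a))
    return bad

F = [("AB", "CD"), ("C", "EF"), ("D", "A")]
print(violations_3nf("ABCDEF", F))  # [('C', 'E'), ('C', 'F')]
```

D → A is not reported because A belongs to the candidate key AB, which is exactly the case 3NF tolerates and BCNF does not.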
74
Example
R = (A,B,C,D)
A → C D
BA → C
The candidate key is ?
3NF decomposition is ?
R1 = (C, A, D) R2 = (B, A, C)
75
3NF Decomposition
R(A, B, C, D) FDs: A → B, C → D
R(A,B,C,D,E) FDs: C → E, B → C
R(A, B, C, D, E, F) F = {AB → CD, D → A, C → EF}
(AB) (AC) (AB) (DB)
76
3NF-Checking Order What is the candidate key? Which FDs violate 3NF?
Decompose to 3NF. Is the decomposition lossless? Is it dependency preserving?
77
3NF Decomposition Input: A universal relation R and a set of functional dependencies F on R Output: A decomposition D of R into 3NF schemas with dependency preservation and nonadditive join
78
3NF-Normalization Algorithm (3NF Normalization):
Input: Relation R with FDs Fc
Output: 3NF decomposition D of R
D = {}
For every X → Y in Fc, add sub-relation Q = (XY) to D, unless:
some sub-relation in D already contains all of XY: don’t add Q
some sub-relation S in D is contained in XY: replace S with Q(XY)
If no relation in D contains a key of R, then add a new relation Q(X) on some key X of R
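The synthesis steps above can be sketched as follows. It assumes the input is already a canonical cover in which FDs with the same left-hand side have been merged, and it uses single-letter attributes. The name `synthesize_3nf` is illustrative. "Contains a key of R" is tested as "is a superkey of R", which is equivalent.

```python
def closure(attrs, fds):
    """Attribute closure; fds is a list of (lhs, rhs) strings."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

def synthesize_3nf(schema, fc):
    all_attrs = set(schema)
    d = []
    for lhs, rhs in fc:
        q = set(lhs) | set(rhs)
        if any(q <= s for s in d):
            continue                        # already contained: don't add Q
        d = [s for s in d if not s <= q]    # sub-relations inside XY are replaced
        d.append(q)
    # Add a key relation if no sub-relation is a superkey of R.
    if not any(closure(s, fc) == all_attrs for s in d):
        key = set(all_attrs)
        for a in sorted(all_attrs):         # shrink R down to some key
            if closure(key - {a}, fc) == all_attrs:
                key.discard(a)
        d.append(key)
    return d

Fc = [("AB", "CD"), ("C", "EF"), ("D", "A")]
print(["".join(sorted(s)) for s in synthesize_3nf("ABCDEF", Fc)])
# ['ABCD', 'CEF']
```

For R(A, B, C, D) with A → B and C → D, neither (AB) nor (CD) contains a key, so the sketch adds the key relation (AC).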
79
The End Result
A collection of relations, each in 3NF
Each relation has a primary key (we are assuming that there is only one candidate key…)
Every non-prime attribute in a relation is determined by its entire primary key
No non-prime attribute in a relation is determined by any attributes other than its entire primary key
Information can be reconstructed using joins, and stored in views if desired
80
3NF Decomposition
R = {A, B, C, D, E, F}
F = {AB → CD, C → EF, D → A}
Non-prime determines a prime attribute
81
Boyce-Codd Normal Form
Boyce-Codd Normal Form (BCNF): For every non-trivial functional dependency X → A, it must be the case that X is a superkey.
“Every determinant must contain a candidate key”
X must be a superkey even if A is a prime attribute.
82
BCNF Decomposition
Input: A universal relation R and a set of functional dependencies F on R
Output: A decomposition D of R into BCNF schemas with nonadditive join
The algorithm is on the next slide. It does not guarantee dependency preservation.
83
BCNF Decomposition Algorithm
ALGORITHM BCNF (R: Relation, F: FD set)
BEGIN
1. Result ← {R}
3. While some X → Y holds in some Ri(A1,…,An) in Result, where (X → Y) is not trivial and X is not a superkey of Ri:
Ri1 ← X+ ∩ {A1,…,An}
Ri2 ← X ∪ ({A1,…,An} − X+)
Result ← Result – {Ri} ∪ {Ri1, Ri2}
4. Return Result
END
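A runnable sketch of this loop is given below. It finds a violating X by brute force over the subsets of each Ri, using X+ ∩ Ri so that projected dependencies are taken into account. Names and the string encoding are assumptions, and the result depends on the order in which violations are found.

```python
from itertools import combinations

def closure(attrs, fds):
    """Attribute closure; fds is a list of (lhs, rhs) strings."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

def find_violation(ri, fds):
    """Return (X, X+ ∩ Ri) for some X violating BCNF in Ri, else None."""
    ri = set(ri)
    for size in range(1, len(ri)):
        for combo in combinations(sorted(ri), size):
            x = set(combo)
            x_plus = closure(x, fds) & ri
            if x_plus != x and x_plus != ri:  # non-trivial, X not a superkey of Ri
                return x, x_plus
    return None

def bcnf_decompose(schema, fds):
    pending, done = [set(schema)], []
    while pending:
        ri = pending.pop()
        violation = find_violation(ri, fds)
        if violation is None:
            done.append(ri)                   # Ri is already in BCNF
        else:
            x, x_plus = violation
            pending.append(x_plus)            # Ri1 = X+ ∩ Ri
            pending.append(x | (ri - x_plus)) # Ri2 = X ∪ (Ri − X+)
    return done

F = [("A", "BC"), ("E", "HA")]
result = bcnf_decompose("ABCDEH", F)
print(sorted("".join(sorted(s)) for s in result))  # ['ABC', 'AEH', 'DE']
```

With this search order the sketch splits on A first and then on E, giving (A, B, C), (A, E, H), and (D, E).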
84
BCNF Example:
R = (A, B, C), F = {A → B, B → C}
Is R in BCNF?
A: Consider the nontrivial dependencies in F:
1. A → B, and A → R (A is a key)
2. B → C, but B does not determine A (B is not a key)
Therefore, R is not in BCNF.
85
BCNF Example:
R = R1 ∪ R2, R1 = (A, B); R2 = (B, C)
F = {A → B, B → C}
Are R1, R2 in BCNF?
A:
1. Test R1: A → B covered, A → R1 (all other FD’s covered are trivial)
2. Test R2: B → C covered, B → R2 (all other FD’s covered are trivial)
R1, R2 are in BCNF.
Q: Is the decomposition lossless?
86
BCNF
Decompose R into BCNF:
R = (A, B, C, D, E, H)
F = {A → BC, E → HA}
87
BCNF Decomposition
R = (A, B, C, D, E, H), F = {A → BC, E → HA} (Note: Fc = F)
Decomposition #1: R = R1 ∪ R3 ∪ R4
R = (A, B, C, D, E, H): decompose on A → BC, giving R1 = (A, B, C) and R2 = (A, D, E, H)
R2 = (A, D, E, H): decompose on E → HA, giving R3 = (A, E, H) and R4 = (D, E)
Q: Is this DP?
A: Yes. All of Fc is covered by R1, R3, R4. Therefore F+ is covered.
88
BCNF Decomposition (cont.)
R = (A, B, C, D, E, H), F = {A → BC, E → HA} (Note: Fc = F)
Decomposition #2: R = R1 ∪ R3 ∪ R5 ∪ R6
R = (A, B, C, D, E, H): decompose on A → B, giving R1 = (A, B) and R2 = (A, C, D, E, H)
R2 = (A, C, D, E, H): decompose on E → HA, giving R3 = (A, E, H) and R4 = (C, D, E)
R4 = (C, D, E): decompose on E → C, giving R5 = (C, E) and R6 = (E, D)
Q: Not DP. Why?
A: A → C is not covered by R1, R3, R5, R6.
89
More BCNF (cont.)
Q: Can we decompose on FD’s in Fc to get a DP BCNF decomposition?
A: Sometimes, BCNF + DP not possible
Example: R = (J, K, L), F = {JK → L, L → K} (Fc = F)
Decompose on JK → L, then L → K. Or: on L → K, then JK → L.
90
More BCNF (cont.)
Q: Can we decompose on FD’s in Fc to get a DP BCNF decomposition?
A: Sometimes, BCNF + DP not possible
R = (J, K, L), F = {JK → L, L → K}
Decomposition #1: R = (J, K, L), decompose on L → K: R1 = (L, K), R2 = (J, L). Not DP: JK → L not covered.
Decomposition #2: R = (J, K, L), decompose on JK → L: R1 = (J, K, L), R2 = (J, K).
91
BCNF Decomposition
R (S, P, Q, X, Y, N, C)
F = {S → NC, P → XY, SP → Q, Q → P}
Decompose to BCNF.
Is it dependency preserving?
92
BCNF Decomposition
R(A, B, C, D) FDs: A → B, C → D
FDs: AC → B, C → D
R(A, B, C, D, E, F) F = {AB → CD, C → EF, D → A}
(AB) (AC) (AB) (DB) (CB)
93
Properties of Decompositions
When we work with BCNF, we must look at properties involving multiple relations:
Nonadditive (Lossless) Join: No tuples that are not in the original relation (spurious tuples) are generated when the decomposed relations are joined.
Dependency Preservation: Every functional dependency in the original relation is represented somewhere in the decomposition.
94
BCNF vs. 3NF
Every relation in BCNF is in 3NF.
Not every relation in 3NF is in BCNF.
3NF relations that are not in BCNF fail because some prime attribute is determined by something that is not a superkey. This is allowed by 3NF but not by BCNF.
Decomposing tables into BCNF can be tricky: functional dependencies can be lost!
95
Example 3NF vs BCNF
Schema: (Client, Office, Account)
Client, Office → Client, Office, Account
Account → Office
Sample data: Joe 1 B Mary John C 2
96
Remarks on Algorithms
Different runs may yield different results, depending on the order in which attributes and functional dependencies are considered.
We must know all functional dependencies.
We can’t always guarantee dependency preservation for BCNF, but we can generate a 3NF decomposition and then consider the individual relations in the result.