CSC 453 Database Systems Lecture

Tanu Malik, College of CDM, DePaul University

Today
Quiz errata
Normalization
Functional dependencies review
Closure
Minimal cover
Equivalence

Why Normalization?
Databases are published in bulk:
City of Chicago datasets
Property tax bill data
Scientific data

Transaction Data
TransID | Amount | # Shares | Investment    | Risk Category | YTD Yield
1       | 100    | 10       | Gold          | Low           | 3.0%
2       | 200    |          | Pirates+      | High          | 50.2%
3       | 202    | 20       |               |               |
4       | 505    | 25       |               |               |
5       | 900    | 40       | US Bonds      |               | 0.1%
6       | 650    |          | Treasure Hunt | Medium        | 12.17%
7       | 350    | 14       |               |               |
8       | 420    |          |               |               |

Redundancy in Data
The transaction table (TransID, Amount, # Shares, Investment, Risk Category, YTD Yield) contains repeating combinations of values.

Repeating Combinations
Leads to:
Wasted space
Insert/Update/Delete anomalies

Anomalies
Update anomaly: the hourly_wages value in the first tuple could be updated without making a similar change in the second tuple.
Insertion anomaly: we cannot insert a tuple for an hourly_wages value unless we have an employee with that wage.
Deletion anomaly: if we delete all tuples with a given rating value (e.g., the tuples for Charlie and Don), we lose the association between that rating value and its hourly_wages value.

Functional Dependencies

Functional Dependencies
A set of attributes Y is functionally dependent on a set of attributes X if and only if the values of X uniquely determine the values of Y.
Also written “X determines Y” or “X → Y”.
X is called the determinant.
A functional dependency must hold for all possible relation states to be valid.

Do Not Trust the Data
FDs cannot be derived from data.
It appears that City → State:
AddressID | City     | State | ZipCode
1         | Chicago  | IL    | 60604
2         | Portland | ME    | 04104
3         | Chicago  | IL    | 60611
4         | Boston   | MA    | 02468

False Functional Dependency
New data may violate this “FD”, so we need other ways to validate it.
AddressID | City     | State | ZipCode
1         | Chicago  | IL    | 60604
2         | Portland | ME    | 04104
3         | Chicago  | IL    | 60611
4         | Boston   | MA    | 02468
5         | Portland | OR    | 97232

Where do FDs Come From?
Given by domain experts
Deduced by the database designer
Laws of physics: X, Y, Z → location in 3D space
Not based on data

Inference Rules
Reflexivity: If X includes Y, then X → Y
Augmentation: If X → Y, then XZ → YZ
Transitivity: If X → Y and Y → Z, then X → Z
[Decomposition: If X → YZ, then X → Y]
[Union: If X → Y and X → Z, then X → YZ]
[Pseudo-transitivity: If X → Y and WY → Z, then WX → Z]

Search Notion of an FD
Accessing ‘x’ requires a column name and a row ID.
A B C
1 a x
2 b y
3
4 c
A → BC
BC → A

Closures
For a set of functional dependencies F, the closure of F (F+) is the set of all functional dependencies that can be derived from F.
F+ can be constructed from the closures under F of all possible sets of attributes X.
For F and a set of attributes X, the closure of X under F (X+) is the set of all attributes that can be determined from X.

Finding a Closure of X under F
To find the closure of X under F:
  set X+ = X
  repeat
    set oldX+ = X+
    for each Y → Z in F do
      if X+ includes Y, then set X+ = X+ ∪ Z
  until oldX+ = X+
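The loop above can be sketched in Python. This is a minimal illustration, not code from the lecture: the function name `closure` and the encoding of FDs as pairs of strings of single-letter attributes are assumptions made here. The sample FD set is the first one from the "Find Candidate Keys" slide.

```python
def closure(attrs, fds):
    """Return X+ for the attribute set `attrs` under the FDs in `fds`.

    `fds` is a list of (lhs, rhs) pairs; each side is a string of
    single-letter attribute names (an assumption of this sketch).
    """
    result = set(attrs)
    changed = True
    while changed:                      # repeat ... until oldX+ = X+
        changed = False
        for lhs, rhs in fds:            # for each Y -> Z in F
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)      # X+ = X+ U Z
                changed = True
    return result


F = [("AB", "C"), ("A", "DE"), ("B", "F"), ("F", "GH")]
print("".join(sorted(closure("A", F))))   # ADE
print("".join(sorted(closure("AB", F))))  # ABCDEFGH
```

Because each pass either adds at least one attribute or stops, the loop runs at most once per attribute of the relation.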

Types of Keys
Superkey: for each set X, find the closure of X under F. If X+ contains all attributes, then X is a superkey.
Candidate key: if X is a superkey, but no proper subset Y of X is a superkey, then X is a candidate key.
Primary key: choose a candidate key to be the primary key.

Find Candidate Keys
R(ABCDEFGH)
  F = {AB → C, A → DE, B → F, F → GH}
  F = {AB → C, BD → EF, AD → G, A → H}
R(ABCDE)
  F = {BC → ACE, D → B}
  F = {AB → CD, D → A, BC → DE}

More examples
R(WXYZ)
Z → W
Y → XZ
WX → Y
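One way to check answers to exercises like these is a brute-force search: try attribute sets from smallest to largest and keep those whose closure covers the relation. The sketch below assumes single-letter attributes; `closure` and `candidate_keys` are names chosen here for illustration. The search is exponential in the number of attributes, so it only suits classroom-sized schemas.

```python
from itertools import combinations


def closure(attrs, fds):
    """Attribute closure of `attrs` under `fds` (list of (lhs, rhs) pairs)."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def candidate_keys(attributes, fds):
    """Return all minimal attribute sets whose closure is the whole relation."""
    attrs = sorted(set(attributes))
    keys = []
    for size in range(1, len(attrs) + 1):
        for combo in combinations(attrs, size):
            candidate = set(combo)
            if any(key <= candidate for key in keys):
                continue  # contains a smaller key, so it is not minimal
            if closure(candidate, fds) >= set(attrs):
                keys.append(candidate)
    return keys


fds = [("Z", "W"), ("Y", "XZ"), ("WX", "Y")]
keys = candidate_keys("WXYZ", fds)
print(sorted("".join(sorted(k)) for k in keys))  # ['WX', 'XZ', 'Y']
```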

Equivalent Sets of FDs
Given a relation R and two sets of functional dependencies F and G on R, the equivalence question asks how their closures compare: whether F+ ⊆ G+, G+ ⊆ F+, F+ = G+ (F and G are equivalent), or F+ ≠ G+.

Example
F = {A → B, AB → C, D → AC, D → E}
G = {A → BC, D → AE}

Algorithm
F and G must be brought to the same starting point. How?
Compute the closure of each left-hand side in F, but using the FDs in G.
Compute the closure of each left-hand side in G, but using the FDs in F.
If every FD of F can be recovered from G and every FD of G can be recovered from F, then the search power of F is equivalent to the search power of G.
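The two-way closure comparison can be sketched as follows, using the F and G from the example slide. The helper names `covers` and `equivalent` are assumptions made for this illustration, and attributes are again assumed to be single letters.

```python
def closure(attrs, fds):
    """Attribute closure of `attrs` under `fds` (list of (lhs, rhs) pairs)."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def covers(f, g):
    """True if every FD in g follows from f (closure of each lhs taken under f)."""
    return all(set(rhs) <= closure(lhs, f) for lhs, rhs in g)


def equivalent(f, g):
    """F and G are equivalent when each one covers the other."""
    return covers(f, g) and covers(g, f)


F = [("A", "B"), ("AB", "C"), ("D", "AC"), ("D", "E")]
G = [("A", "BC"), ("D", "AE")]
print(equivalent(F, G))  # True
```

Dropping an FD from one side shows the asymmetric cases: `covers(F, [("A", "B")])` holds, while `covers([("A", "B")], F)` does not.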

Minimal Cover
F is a minimal cover if no dependency can be removed from F without changing F+.
In other words, F is an irreducible functional dependency set.

Finding Minimal Cover
Find a minimal cover by eliminating the redundant (extraneous) attributes and dependencies in the set F.
Given an FD α → β, there are 3 kinds of redundancy: the redundancy is on the r.h.s., the redundancy is on the l.h.s., or the entire FD is redundant.

Eliminate redundancy on r.h.s.
Use the decomposition rule to decompose all FDs in F: α → βγ becomes α → β and α → γ.
Once every FD has only one attribute on the r.h.s., each FD is either entirely redundant or not redundant at all.

Eliminate entire FD
Eliminate an FD if it is repeated exactly or can be inferred from the other FDs (for example, transitively).
Given α → β in F, first find α+ using F, then compute α+ again using F − {α → β}. If α+ does not change, the FD is redundant.

Eliminate redundancy on l.h.s.
Given an FD αγ → β, determine whether α → γ holds in F′ = F − {αγ → β}.
If γ can be searched by just α in F′, without the given FD, then γ is redundant in the given FD when searching for β.

Finding Minimal Cover
Minimal/Canonical Cover Algorithm (CCA)
Given an FD set F, CCA finds a minimal FD set equivalent to F.
Minimal: we can't find another equivalent FD set with fewer FDs.

Minimal Cover Algorithm
ALGORITHM minimal-cover (F: FD set)
BEGIN
  REPEAT UNTIL STABLE
    1. Remove all trivial FDs.
    2. Where possible, apply the DECOMPOSITION rule (Armstrong's axioms).
    3. Remove all extraneous attributes:
       a. Test if B is extraneous in A → BC
          (B is extraneous if (A → B) ∈ (F − {A → BC} ∪ {A → C})+)
       b. Test if B is extraneous in AB → C
          (B is extraneous if (A → C) ∈ F+)
END
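A minimal sketch of a minimal-cover computation follows. It uses the common three-step textbook variant (split right-hand sides, shrink left-hand sides, drop redundant FDs) rather than the exact repeat-until-stable loop above, and it assumes single-letter attributes. The names `closure` and `minimal_cover` are chosen here for illustration. The result can depend on the order in which FDs are examined.

```python
def closure(attrs, fds):
    """Attribute closure of `attrs` under `fds` (list of (lhs, rhs) pairs)."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def minimal_cover(fds):
    """Return a minimal cover as a list of (frozenset lhs, single attribute)."""
    # Step 1: decomposition rule; skip trivial FDs and exact repeats.
    work = []
    for lhs, rhs in fds:
        for attr in rhs:
            fd = (frozenset(lhs), attr)
            if attr not in lhs and fd not in work:
                work.append(fd)
    # Step 2: drop extraneous left-hand-side attributes.
    for i, (lhs, attr) in enumerate(work):
        for b in sorted(lhs):
            reduced = work[i][0] - {b}
            if reduced and attr in closure(reduced, work):
                work[i] = (reduced, attr)
    work = list(dict.fromkeys(work))  # shrinking may create duplicates
    # Step 3: drop FDs that follow from the remaining ones.
    result = list(work)
    for fd in work:
        rest = [other for other in result if other != fd]
        if fd[1] in closure(fd[0], rest):
            result = rest
    return result


F = [("A", "BC"), ("B", "CE"), ("A", "E")]
cover = {("".join(sorted(lhs)), rhs) for lhs, rhs in minimal_cover(F)}
print(sorted(cover))  # [('A', 'B'), ('B', 'C'), ('B', 'E')]
```

Recombining FDs with the same left-hand side (the union rule) turns this output into {A → B, B → CE}.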

Extraneous Attributes
1. Extraneous in RHS? e.g., can we replace A → BC with A → C? (i.e., is B extraneous in A → BC?)
2. Extraneous in LHS? e.g., can we replace AB → C with A → C? (i.e., is B extraneous in AB → C?)
Simple (but expensive) test:
1. Replace A → BC (or AB → C) with A → C in F. Define
   F2 = F − {A → BC} ∪ {A → C} or F2 = F − {AB → C} ∪ {A → C}
2. Test: is F2+ = F+? If yes, then B was extraneous.

A. RHS: Is B extraneous in A → BC?
Step 1: F2 = F − {A → BC} ∪ {A → C}
Step 2: F+ = F2+?
To simplify step 2, observe that F2+ ⊆ F+ (i.e., no new FDs in F2+).
Why? We have effectively removed A → B from F.
When is F+ = F2+? A: When (A → B) ∈ F2+.
Idea: if F2+ includes A → B and A → C, then it includes A → BC.

A. RHS: Given F = {A → BC, B → C}, is C extraneous in A → BC? Why or why not?
A: Yes, because (A → C) ∈ {A → B, B → C}+
Proof:
1. A → B   Given
2. B → C   Given
3. A → C   Transitivity, (1) and (2)
Use Armstrong's axioms in the proof.

B. LHS: Is B extraneous in AB → C?
Step 1: F2 = F − {AB → C} ∪ {A → C}
Step 2: F+ = F2+?
To simplify step 2, observe that F2+ ⊇ F+ (i.e., there may be new FDs in F2+).
Why? A → C “implies” AB → C. Thus, all FDs in F+ are also in F2+. But AB → C does not “imply” A → C. Thus, the FDs in F2+ are not necessarily in F+.
When is F+ = F2+? A: When (A → C) ∈ F+.
Idea: if (A → C) ∈ F+, then F+ will include all FDs of F2+.

Example: Determine the minimal cover of F = {A → BC, B → CE, A → E}
Iteration 1:
a. F = {A → B, A → C, A → E, B → E, B → C}
Must check for up to 5 extraneous attributes.
Is A → B extraneous? No
Is A → C extraneous? Yes: (A → C) ∈ {A → BE, B → CE}+
  1. A → BE   Given
  2. A → B    Decomposition (1)
  3. B → CE   Given
  4. B → C    Decomposition (3)
  5. A → C    Transitivity (2, 4)
Is B → E extraneous? …

Example (cont.): F = {A → BC, B → CE, A → E}
Iteration 1:
a. F = {A → B, A → C, A → E, B → E, B → C}
With A → C removed: F = {A → B, A → E, B → E, B → C}
Is A → E extraneous? Yes: (A → E) ∈ {A → B, B → E, B → C}+
  1. A → B    Given
  2. B → CE   Given
  3. B → E    Decomposition (2)
  4. A → E    Transitivity (1, 3)
F = {A → B, B → E, B → C}
E extraneous in B → E? No
C extraneous in B → C? No

Example (cont.): F = {A → BC, B → CE, A → E}
Iteration 1:
a. F = {A → BCE, B → CE}
b. Extraneous attributes:
   B extraneous in A → BCE? No
   C extraneous in A → BCE? Yes…
   E extraneous in A → BE? Yes…
   E extraneous in B → CE? No
   C extraneous in B → CE? No
Iteration 2:
a. F = {A → B, B → CE}
b. Extraneous attributes:
   E extraneous in B → CE? No
   C extraneous in B → CE? No
DONE!

Minimal Cover of a Set of FDs
F = {A → BC, B → CE, A → E, AC → H, D → B}
F = {A → BC, B → C, A → B, AB → C, AC → D}
A → B, B → CE, AC → H, D → B

Given: Determine the canonical cover of F:
F = {A → BC, B → CE, A → E, AC → H, D → B}
Fc = {A → BH, B → CE, D → B}
Another example:
F = {A → BC, B → C, A → B, AB → C, AC → D}
Applying the cc algorithm: Fc = {A → BD, B → C}

Minimal Cover Exercise
Relation R (A, B, C, D) Set F ABCD BC AB ABC BD Compute canonical cover

Normalization

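Relational Table
A: TransID, B: Amount, C: # Shares, D: Investment, E: Risk Category, H: YTD Yield, I: Transaction Date
TransID | Amount | # Shares | Investment | Risk Category | YTD Yield | Transaction Date
1       | 100    | 10       | Gold       | Low           | 3.0%      | 10/01/2016
2       | 200    |          | Pirates+   | High          | 50.2%     |
3       | 202    | 20       |            |               |           | 10/05/2016
4       | 505    | 25       |            |               |           | 10/06/2017
5       | 900    | 40       | US Bonds   |               | 0.1%      | 10/08/2017
6       | 650    |          | Treasure   | Medium        | 12.17%    |
7       | 350    | 14       |            |               |           |
8       | 420    |          |            |               |           |
FDs:
TransID → Amount, # of Shares, Investment   (A → B, C, D)
Investment → Risk Category, YTD Yield       (D → E, H)
TransID, Investment → Transaction Date      (A, D → I)
Simplified: A → B, C, D; D → E, H; A → I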

Option 1
TransID | Amount | # Shares | Transaction Date
1       | 100    | 10       | 10/01/2016
2       | 200    |          |
3       | 202    | 20       | 10/05/2016
4       | 505    | 25       | 10/06/2017
5       | 900    | 40       | 10/08/2017
6       | 650    |          |
7       | 350    | 14       |
8       | 420    |          |
A → B, C, I

Investment | Risk Category | YTD Yield
Gold       | Low           | 3.0%
Pirates+   | High          | 50.2%
US Bonds   |               | 0.1%
Treasure   | Medium        | 12.17%
D → E, H

Original FDs:
TransID → Amount, # of Shares, Investment
Investment → Risk Category, YTD Yield
TransID, Investment → Transaction Date

Option 2
TransID | Amount | # Shares | Transaction Date
1       | 100    | 10       | 10/01/2016
2       | 200    |          |
3       | 202    | 20       | 10/05/2016
4       | 505    | 25       | 10/06/2017
5       | 900    | 40       | 10/08/2017
6       | 650    |          |
7       | 350    | 14       |
8       | 420    |          |
A → B, C, I

Investment | Risk Category | YTD Yield | Transaction Date
Gold       | Low           | 3.0%      | 10/01/2016
Pirates+   | High          | 50.2%     | 10/05/2016
           |               |           | 10/06/2017
US Bonds   |               | 0.1%      | 10/08/2017
Treasure   | Medium        | 12.17%    |
D → E, H

Original FDs:
TransID → Amount, # of Shares, Investment
Investment → Risk Category, YTD Yield
TransID, Investment → Transaction Date

Issues with Options
Option 1: this option does not regenerate the original table, as there is no common attribute.
Option 2: this option has a common attribute but generates spurious tuples.

Goals of Decomposition
Redundancy avoidance: avoid any unnecessary data duplication.
Lossless: we want to be able to reconstruct what we started with (avoid Option 1 and Option 2).
Dependency preservation: minimize the cost of checking integrity constraints.

Nonadditive OR Lossless Join
If we decompose a relation state and combine the resulting projections with a natural join, we would expect to get the original relation state.
There should not be any spurious tuples that were not present in the original relation state.
For the nonadditive join property to hold, this must be the case for every relation state.

Ensuring Lossless Joins
A decomposition of R, R = R1 ∪ R2, is lossless iff R1 ∩ R2 → R1 or R1 ∩ R2 → R2 (or both); i.e., the intersecting attributes must form a superkey for one of the relations.
Intuition: the original relation R has n tuples, and R1 and R2 share attribute A.
A a key ⇒ |R2| ≤ n, and the relationship with R1 is n:1
A not a key ⇒ |R1| = n ⇒ n tuples in result
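The binary lossless-join condition can be checked with one closure computation: take the closure of the shared attributes and see whether it covers either side. The sketch below assumes single-letter attributes, and the function name `is_lossless` is hypothetical. It only handles a decomposition into two relations.

```python
def closure(attrs, fds):
    """Attribute closure of `attrs` under `fds` (list of (lhs, rhs) pairs)."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def is_lossless(r1, r2, fds):
    """True if (R1 ∩ R2) -> R1 or (R1 ∩ R2) -> R2 follows from fds."""
    common_closure = closure(set(r1) & set(r2), fds)
    return set(r1) <= common_closure or set(r2) <= common_closure


F = [("A", "B"), ("B", "C")]
print(is_lossless("AB", "BC", F))  # True: B -> BC, so B is a key of R2
print(is_lossless("AC", "BC", F))  # False: C determines nothing else
print(is_lossless("AB", "CD", F))  # False: no common attribute at all
```

The last call mirrors Option 1 above: with an empty intersection, nothing ties the two tables together.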

Dependency Preservation
The union over all i in {1,…,m} of the projection of F on Ri is equivalent to F.
We want the set of all projections to be equivalent to F: the decomposition neither destroys any functional dependencies in R nor introduces any new ones.
We can always ensure that this property is preserved, but it is not automatic.

Projections
Suppose we have a set of functional dependencies F over R.
For each Ri, consider all X → Y in F such that both X and Y are subsets of Ri.
This set is called the projection of F on Ri.
The projection represents the set of all constraints that F puts on the attributes of Ri.

Projections: Intuition
R = (A, B, C, D), set of FDs F
R1 = (A, B, C), R2 = (C, D)
Projection of F on R1 is all “local” FDs for R1: all FDs in F that use only attributes (A, B, C).
Projection of F on R2 is all “local” FDs for R2: all FDs in F that use only attributes (C, D).

Dependency Preservation Test
The union over all i in {1,…,m} of the projection of F on Ri is equivalent to F.
We want the set of all projections to be equivalent to F: the decomposition does not destroy any functional dependencies in R.
We can always ensure that this property is preserved, but it is not automatic.

Dependency Preservation
The projection of F on Ri is the set of all X → Y in F+ such that both X and Y are subsets of Ri.
The projection represents the full set of constraints that F puts on the attributes of Ri.
For the dependency preservation property to hold, the union of the projections must be equivalent to F.

Projections and Dep. Preservation
R = {A, B, C, D}
F = {A → BD, B → DC, C → D}
Let R be decomposed into R1, R2 where R1 = {A, B, C}, R2 = {C, D}.
Compute the projection of F on R1.
Compute the projection of F on R2.
Is the decomposition dependency preserving?
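The exercise can be checked mechanically. The sketch below computes each projection by brute force (closing every subset of Ri under F and keeping what stays inside Ri), then tests whether the union of the projections still implies every FD in F. The names `project` and `preserves_dependencies` are assumptions of this sketch, attributes are assumed to be single letters, and the subset enumeration is exponential, so it is only meant for small schemas.

```python
from itertools import combinations


def closure(attrs, fds):
    """Attribute closure of `attrs` under `fds` (list of (lhs, rhs) pairs)."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def project(fds, relation):
    """Nontrivial FDs X -> Y in F+ with X and Y inside `relation`."""
    rel = set(relation)
    projected = []
    for size in range(1, len(rel) + 1):
        for combo in combinations(sorted(rel), size):
            rhs = (closure(combo, fds) & rel) - set(combo)
            if rhs:
                projected.append((frozenset(combo), frozenset(rhs)))
    return projected


def preserves_dependencies(fds, decomposition):
    """True if the union of the projections implies every FD in fds."""
    union = [fd for rel in decomposition for fd in project(fds, rel)]
    return all(set(rhs) <= closure(lhs, union) for lhs, rhs in fds)


F = [("A", "BD"), ("B", "DC"), ("C", "D")]
print(preserves_dependencies(F, ["ABC", "CD"]))  # True
```

Here A → BD survives even though D is not in R1, because A → B, B → C (on R1) and C → D (on R2) together imply it.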

Normalization Database normalization, or simply normalization, is the process of organizing the columns (attributes) and tables (relations) of a relational database to reduce data redundancy and improve data integrity.

Kinds of Normal Forms
First Normal Form
Second Normal Form
Third Normal Form
Boyce-Codd Normal Form
Fourth Normal Form

Prime vs. Non-Prime Attributes
Prime attribute: any attribute that is part of some candidate key.
Non-prime attribute: any attribute that is not part of any candidate key.

First Normal Form (1NF) First Normal Form: Each domain of a column must contain only atomic values, and each column in a record must have at most one value from its domain To get 1NF, remove composite attributes and multi-valued attributes Composite attributes are easy – just subdivide them into multiple attributes…

Removing Multi-Valued Attributes
Remove the multi-valued attribute from the relation.
Create a new relation with the primary key of the original relation and the multi-valued attribute.
For each tuple in the original relation with k values, add k tuples to the new relation.
The primary key of the new relation contains all attributes; the primary key of the original relation becomes a foreign key in the new relation, referencing the original relation.

Second Normal Form (2NF)
Second Normal Form: 1NF, plus every non-prime attribute in the relation is determined by the entire primary key (but not by any proper subset).
To get 2NF, eliminate partial dependencies on the primary key.
X → Y is a partial dependency if Z → Y for some proper subset Z of X.

Removing Partial Dependencies
Find all dependencies where a subset of the primary key determines some non-prime attribute(s).
Starting with the smallest subset, do the following:
1. Remove all attributes on the right-hand side from the relation and put them in a new relation.
2. Add the attributes in the determinant (l.h.s.) to the new relation; make them the primary key, and make them a foreign key in the original relation referencing the new table.
3. Remove from any remaining partial dependencies any attributes removed from the original relation.

Converting 1NF to 2NF
Schema: (EmpID, EmpLName, EmpFName, Dept, ProjCode, Hours)
Functional dependencies:
EmpID → EmpLName, EmpFName, Dept
EmpID, ProjCode → Hours

1NF vs 2NF Data
EmpID EmpLName EmpFName Dept ProjCode Hours
550 Smith Winston Accounting 101 20
252 10
601 Barney Finance 5
390 Hammond Evey Personnel 995 25
001 Preston Bill Special Events
100 Logan Ted
007 Bond James
505 Lane Lois Media Relations

Converting 1NF to 2NF
Decompose (EmpID, EmpLName, EmpFName, Dept, ProjCode, Hours).
Remove the offending FD: EmpID → EmpLName, EmpFName, Dept
Two tables:
(EmpID, EmpLName, EmpFName, Dept)
(EmpID, ProjCode, Hours)

Towards 2NF
EmpID EmpLName EmpFName Dept
550 Smith Winston Accounting
601 Barney Finance
390 Hammond Evey Personnel
001 Preston Bill Special Events
100 Logan Ted
007 Bond James
505 Lane Lois Media Relations
EmpID → EmpLName, EmpFName, Dept

EmpID ProjCode Hours
550 101 20
252 10
601 5
390 995 25
001
100
007
505
EmpID, ProjCode → Hours

Third Normal Form (3NF)
Third Normal Form: 2NF, plus every non-prime attribute in the relation is determined only by the primary key of the relation.
To get 3NF, eliminate transitive dependencies on the primary key.
X → Y is a transitive dependency if X → Z and Z → Y for some Z that is disjoint from X.

Removing Transitive Dependencies
Find all dependencies where a set of attributes disjoint from the primary key determines some non-prime attribute(s).
Starting with the smallest such set, do the following:
1. Remove all attributes on the right-hand side from the relation and put them in a new relation.
2. Add the attributes in the determinant to the new relation; make them the primary key, and make them a foreign key in the original relation referencing the new relation.
3. Remove from any remaining transitive dependencies any attributes removed from the original relation.

Converting 2NF to 3NF
Schema: (First, Last, Address, City, State, Zip)
First, Last → Address, City, State, Zip
Transitive functional dependency: Zip → City, State

2NF Data Table
First Last Address City State ZipCode
Henry Bienen 2145 Sheridan Rd Evanston IL 60202
Helmut Epp 55 E. Jackson St Chicago 60604
Denis Stein 55 E. Jackson St.
David Miller 243 S. Wabash Av.
Gary 5000 Forbes Av. Pittsburgh PA 15213
Leighton 77 Beacon St Cambridge MA 02139

Decompose the Tables
First Last Address ZipCode
Henry Bienen 2145 Sheridan Rd 60202
Helmut Epp 55 E. Jackson St 60604
Denis Stein 55 E. Jackson St.
David Miller 243 S. Wabash Av.
Gary 5000 Forbes Av. 15213
Leighton 77 Beacon St 02139
First, Last → First, Last, Address, ZipCode

City State ZipCode
Evanston IL 60202
Chicago 60604
Pittsburgh PA 15213
Cambridge MA 02139
ZipCode → City, State

Third Normal Form
Third Normal Form: every non-prime attribute is fully functionally dependent on every candidate key, and no non-prime attribute is transitively dependent on any candidate key.
“For every non-trivial functional dependency X → A, either X is a superkey or A is a prime attribute.”

Third Normal Form
A relation schema is in third normal form (3NF) if, for all X → Y in F+, at least one of the following holds:
X → Y is trivial (i.e., Y ⊆ X)
X is a superkey for R
each attribute A in Y − X is contained in a candidate key for R, i.e., A is a prime attribute.

Example
R = (A, B, C, D)
A → CD
BA → C
The candidate key is ?
3NF decomposition is ?
R1 = (C, A, D)
R2 = (B, A, C)

3NF Decomposition
R(A, B, C, D), FDs: A → B, C → D
R(A, B, C, D, E), FDs: C → E, B → C
R(A, B, C, D, E, F), F = {AB → CD, D → A, C → EF}
(AB) (AC) (AB) (DB)

3NF-Checking Order
What is the candidate key?
Which FDs violate 3NF?
Decompose to 3NF.
Is the decomposition lossless?
Is it dependency preserving?

3NF Decomposition Input: A universal relation R and a set of functional dependencies F on R Output: A decomposition D of R into 3NF schemas with dependency preservation and nonadditive join

3NF Normalization Algorithm
Input: relation R with a minimal cover Fc of its FDs
Output: 3NF decomposition D of R
D = {}
For every X → Y in Fc, add sub-relation Q = (XY) to D, unless:
  some sub-relation in D already contains all of XY: don't add Q
  some sub-relation S in D is contained in XY: replace S with Q(XY)
If no relation in D contains a key of R, then add a new relation Q(X) on some key X of R
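A minimal sketch of this synthesis step in Python, assuming the input is already a minimal cover and that attributes are single letters. The names `synthesize_3nf` and `candidate_key` are chosen here for illustration. The key check uses the fact that a sub-relation contains a key of R exactly when its closure is all of R.

```python
def closure(attrs, fds):
    """Attribute closure of `attrs` under `fds` (list of (lhs, rhs) pairs)."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def candidate_key(attributes, fds):
    """Find one candidate key by dropping attributes that are not needed."""
    key = set(attributes)
    for attr in sorted(attributes):
        if closure(key - {attr}, fds) >= set(attributes):
            key.discard(attr)
    return key


def synthesize_3nf(attributes, cover):
    """Build sub-relations from a minimal cover; add a key relation if needed."""
    schemas = []
    for lhs, rhs in cover:
        q = frozenset(lhs) | frozenset(rhs)
        if any(q <= s for s in schemas):
            continue                                   # don't add Q
        schemas = [s for s in schemas if not s <= q]   # replace contained S
        schemas.append(q)
    all_attrs = set(attributes)
    if not any(closure(s, cover) >= all_attrs for s in schemas):
        schemas.append(frozenset(candidate_key(attributes, cover)))
    return schemas


cover = [("AB", "CD"), ("C", "EF"), ("D", "A")]
result = synthesize_3nf("ABCDEF", cover)
print(sorted("".join(sorted(s)) for s in result))  # ['ABCD', 'CEF']
```

For this cover, (D, A) is absorbed by (A, B, C, D), which also contains the key AB, so no extra key relation is added.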

The End Result
A collection of relations, each in 3NF.
Each relation has a primary key. (We are assuming that there is only one candidate key.)
Every non-prime attribute in a relation is determined by its entire primary key.
No non-prime attribute in a relation is determined by any attributes other than its entire primary key.
Information can be reconstructed using joins, and stored in views if desired.

3NF Decomposition
R = {A, B, C, D, E, F}
F = {AB → CD, C → EF, D → A}
Non-prime determines a prime attribute

Boyce-Codd Normal Form
Boyce-Codd Normal Form (BCNF): for every non-trivial functional dependency X → A, it must be the case that X is a superkey.
“Every determinant must contain a candidate key”
X must be a superkey even if A is a prime attribute.

BCNF Decomposition Input: A universal relation R and a set of functional dependencies F on R Output: A decomposition D of R into BCNF schemas with nonadditive join Algorithm on next page Algorithm does not guarantee dependency preservation

BCNF Decomposition Algorithm
ALGORITHM BCNF (R: Relation, F: FD set)
BEGIN
  1. Result ← {R}
  2. While some X → Y holds in some Ri(A1,…,An) in Result, where X → Y is not trivial and X is not a superkey of Ri:
       Ri1 ← X+ ∩ {A1,…,An}
       Ri2 ← X ∪ ({A1,…,An} − X+)
       Result ← Result − {Ri} ∪ {Ri1, Ri2}
  3. Return Result
END
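The loop can be sketched in Python as below. To find a violating X inside a sub-relation, this sketch closes every proper subset of its attributes under F, which is exponential and only meant for small examples. The names `find_violation` and `bcnf_decompose` are assumptions, and attributes are assumed to be single letters. Different search orders can give different, equally valid decompositions.

```python
from itertools import combinations


def closure(attrs, fds):
    """Attribute closure of `attrs` under `fds` (list of (lhs, rhs) pairs)."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def find_violation(relation, fds):
    """Return some X that determines more than itself but is not a superkey."""
    rel = set(relation)
    for size in range(1, len(rel)):
        for combo in combinations(sorted(rel), size):
            x = set(combo)
            determined = closure(x, fds) & rel
            if determined != x and determined != rel:
                return x
    return None


def bcnf_decompose(attributes, fds):
    """Split relations on violating FDs until every piece is in BCNF."""
    result, todo = [], [frozenset(attributes)]
    while todo:
        rel = todo.pop()
        x = find_violation(rel, fds)
        if x is None:
            result.append(rel)
            continue
        x_plus = closure(x, fds)
        todo.append(frozenset(x_plus & rel))        # Ri1 = X+ ∩ Ri
        todo.append(frozenset(x | (rel - x_plus)))  # Ri2 = X ∪ (Ri − X+)
    return result


pieces = bcnf_decompose("ABC", [("A", "B"), ("B", "C")])
print(sorted("".join(sorted(p)) for p in pieces))  # ['AB', 'BC']
```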

BCNF Example
R = (A, B, C)
F = {A → B, B → C}
Is R in BCNF?
A: Consider the nontrivial dependencies in F:
1. A → B: A → R (A is a key)
2. B → C: B does not determine A (B is not a key)
Therefore, R is not in BCNF.

BCNF Example
R = R1 ∪ R2
R1 = (A, B); R2 = (B, C)
F = {A → B, B → C}
Are R1, R2 in BCNF?
A:
1. Test R1: A → B covered, A → R1 (all other FDs covered are trivial)
2. Test R2: B → C covered, B → R2 (all other FDs covered are trivial)
⇒ R1, R2 are in BCNF
Q: Is the decomposition lossless?

BCNF
Decompose R into BCNF:
R = (A, B, C, D, E, H)
F = {A → BC, E → HA}

BCNF Decomposition
R = (A, B, C, D, E, H)
F = {A → BC, E → HA} (Note: Fc = F)
Decomposition #1: R = R1 ∪ R3 ∪ R4
Decompose R on A → BC: R1 = (A, B, C), R2 = (A, D, E, H)
Decompose R2 on E → HA: R3 = (A, E, H), R4 = (D, E)
Q: Is this DP (dependency preserving)?
A: Yes. All of Fc is covered by R1, R3, R4. Therefore F+ is covered.

BCNF Decomposition (cont.)
R = (A, B, C, D, E, H)
F = {A → BC, E → HA} (Note: Fc = F)
Decomposition #2: R = R1 ∪ R3 ∪ R5 ∪ R6
Decompose R on A → B: R1 = (A, B), R2 = (A, C, D, E, H)
Decompose R2 on E → HA: R3 = (A, E, H), R4 = (C, D, E)
Decompose R4 on E → C: R5 = (C, E), R6 = (E, D)
Q: Not DP. Why?
A: A → C is not covered by R1, R3, R5, R6.

More BCNF (cont.)
Q: Can we decompose on FDs in Fc to get a DP BCNF decomposition?
A: Sometimes BCNF + DP is not possible.
Example: R = (J, K, L), F = {JK → L, L → K} (Fc = F)
Decomposition #1: decompose R on L → K
  R1 = (L, K), R2 = (J, L)
  Not DP: JK → L not covered
Decomposition #2: decompose R on JK → L
  R1 = (J, K, L), R2 = (J, K)

BCNF Decomposition
R(S, P, Q, X, Y, N, C)
F = {S → NC, P → XY, SP → Q, Q → P}
Decompose to BCNF.
Is it dependency preserving?

BCNF Decomposition
R(A, B, C, D), FDs: A → B, C → D
R(A, B, C, D), FDs: AC → B, C → D
R(A, B, C, D, E, F), F = {AB → CD, C → EF, D → A}
(AB) (AC) (AB) (DB) (CB)

Properties of Decompositions
When we work with BCNF, we must look at properties involving multiple relations:
Nonadditive (lossless) join: no tuples that are not in the original relation (spurious tuples) are generated when the decomposed relations are joined.
Dependency preservation: every functional dependency in the original relation is represented somewhere in the decomposition.

BCNF vs. 3NF
Every relation in BCNF is in 3NF.
Not every relation in 3NF is in BCNF.
3NF relations that are not in BCNF fail because some prime attribute is determined by something that is not a superkey; this is allowed by 3NF but not by BCNF.
Decomposing tables into BCNF can be tricky: functional dependencies can be lost!

Example 3NF vs BCNF
Client, Office → (Client, Office, Account)
Account → Office
Joe 1 B Mary John C 2
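The difference between the two normal forms on this schema can be checked with a short sketch. It tests only the FDs listed in F, which is sufficient when checking the whole relation for 3NF or BCNF. The function names and the `normal_form` parameter are assumptions of this sketch, and the FDs are encoded as pairs of Python sets so that multi-letter attribute names work.

```python
from itertools import combinations


def closure(attrs, fds):
    """Attribute closure of `attrs` under `fds` (list of (lhs, rhs) set pairs)."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def candidate_keys(attributes, fds):
    """All minimal attribute sets whose closure is the whole relation."""
    attrs = sorted(set(attributes))
    keys = []
    for size in range(1, len(attrs) + 1):
        for combo in combinations(attrs, size):
            candidate = set(combo)
            if not any(k <= candidate for k in keys) and closure(candidate, fds) >= set(attrs):
                keys.append(candidate)
    return keys


def violations(attributes, fds, normal_form):
    """FDs in `fds` that violate `normal_form` ("3NF" or "BCNF")."""
    all_attrs = set(attributes)
    prime = set().union(*candidate_keys(attributes, fds))
    bad = []
    for lhs, rhs in fds:
        lhs, rhs = set(lhs), set(rhs)
        if rhs <= lhs or closure(lhs, fds) >= all_attrs:
            continue  # trivial FD, or the left-hand side is a superkey
        if normal_form == "3NF" and (rhs - lhs) <= prime:
            continue  # 3NF also accepts prime attributes on the right
        bad.append((lhs, rhs))
    return bad


R = ["Client", "Office", "Account"]
F = [({"Client", "Office"}, {"Account"}), ({"Account"}, {"Office"})]
print(violations(R, F, "3NF"))   # []
print(violations(R, F, "BCNF"))  # [({'Account'}, {'Office'})]
```

Account → Office passes the 3NF test only because Office is part of the candidate key (Client, Office); BCNF has no such exception.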

Remarks on Algorithms
Different runs may yield different results, depending on the order in which attributes and functional dependencies are considered.
We must know all functional dependencies.
We can't always guarantee dependency preservation for BCNF, but we can generate a 3NF decomposition and then consider the individual relations in the result.
