Relational Database Design (Discussion Session)
Jianguo Wang
Office hour: Friday 5pm – 6pm at CSE 3217

Functional dependency
A functional dependency (FD) has the form X → Y (X determines Y), where X and Y are sets of attributes.
Whenever two tuples agree on all the attributes in X, they must also agree on all the attributes in Y.
Examples:
  sid → sname
  {sid, cid} → gpa

Closure of an attribute set
The closure X+ of a set X of attributes, with respect to a set F of FDs, is the set of all attributes A such that X → A can be deduced from F.
Example:
  R = ABCDEF
  F = {A → C, BC → D, AD → E}
  X = AB
Computing X+:
  Initially: X(0) = AB
  Applying A → C: X(1) = ABC
  Applying BC → D: X(2) = ABCD
  Applying AD → E: X(3) = ABCDE
  No more changes: X(4) = X(3)
  X+ = ABCDE

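The fixpoint computation above can be sketched in a few lines of Python. This is a minimal illustration, not part of the original slides: the function name `closure` and the encoding of each FD as a pair of strings of single-letter attributes are assumptions, and the slide's FDs are read as A → C, BC → D, AD → E.

```python
def closure(attrs, fds):
    """Return the closure of attrs under fds.

    fds is a list of (lhs, rhs) pairs; each side is a string of
    single-letter attribute names (an assumption of this sketch).
    """
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            # Fire the FD when its whole left side is already in the closure
            # and it still contributes at least one new attribute.
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


F = [("A", "C"), ("BC", "D"), ("AD", "E")]
print("".join(sorted(closure("AB", F))))  # prints ABCDE
```

The loop stops as soon as a full pass over F adds nothing, which corresponds to the "no more changes" step X(4) = X(3).
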
Candidate key
A candidate key is a set X of attributes in R such that:
  X+ includes all the attributes in R.
  There is no proper subset Y of X such that Y+ includes all the attributes in R.
A super key is a superset of a candidate key.
Example: R(A, B, C, D), F = {A → B, B → C}
  A is not a candidate key, because A+ = {A, B, C}
  AD is a candidate key, because AD+ = {A, B, C, D}
  ABD is a super key

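The two conditions of the definition translate directly into a check based on attribute closure. The sketch below is only an illustration: the helper names (`closure`, `is_superkey`, `is_candidate_key`, `candidate_keys`) and the string encoding of attribute sets are assumptions, and the brute-force enumeration is exponential in the number of attributes, so it is only suitable for small examples like the one on this slide.

```python
from itertools import combinations


def closure(attrs, fds):
    """Attribute closure; fds is a list of (lhs, rhs) strings."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def is_superkey(attrs, schema, fds):
    return closure(attrs, fds) >= set(schema)


def is_candidate_key(attrs, schema, fds):
    if not is_superkey(attrs, schema, fds):
        return False
    # Dropping one attribute at a time is enough: if any proper subset were
    # a super key, so would be some subset that misses exactly one attribute.
    return all(not is_superkey(set(attrs) - {a}, schema, fds) for a in attrs)


def candidate_keys(schema, fds):
    """Enumerate all candidate keys by increasing size (brute force)."""
    keys = []
    for size in range(1, len(schema) + 1):
        for combo in combinations(sorted(schema), size):
            if any(k <= set(combo) for k in keys):
                continue  # contains a smaller key, so it is not minimal
            if is_superkey(combo, schema, fds):
                keys.append(set(combo))
    return keys


F = [("A", "B"), ("B", "C")]
print(is_candidate_key("A", "ABCD", F))    # False
print(is_candidate_key("AD", "ABCD", F))   # True
print(is_candidate_key("ABD", "ABCD", F))  # False: a super key, not minimal
```
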
Normal forms – BCNF
For every functional dependency X → Y (Y ⊈ X):
  X has to be a super key, i.e., X has to contain a candidate key.
Example: 4 attributes A, B, C, D, and F = {A → B, B → C}
  R(A, B, C) does not satisfy BCNF
    A is the key
    The left side of B → C does not include A, so B is not a super key
  R(A, B) satisfies BCNF

BCNF decomposition with lossless join
For each FD X → A that holds in a schema S, if it does not satisfy BCNF (i.e., X is not a super key of S), decompose S into:
  S1 = XA
  S2 = X ∪ (S − A)

Example
FDs: C → T, CS → G, HR → C, HS → R, TH → R

CTHRSG
  violation: CS → G; (CS)+ = CSTG, i.e., CS is not a super key
  split into CSG and CTHRS
CSG
  no violation: within CSG, (CS)+ = CSG
CTHRS
  violation: C → T; C+ = CT
  split into CT and CHRS
CT
  no violation: 2 attributes
CHRS
  violation: CH → R; (CH)+ = CHTR
  split into CHR and CHS
CHR, CHS
  no violation: C+ = CT, H+ = H, R+ = R, S+ = S

Resulting decomposition: CSG, CT, CHR, CHS

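The recursive splitting in this example can be sketched as follows. This is an illustrative sketch under stated assumptions, not the course's reference implementation: the names `find_violation` and `bcnf_decompose` are hypothetical, the FDs are read as C → T, CS → G, HR → C, HS → R, TH → R, and violations are found by brute force over attribute subsets (exponential, fine for six attributes). It may pick violations in a different order than the slide, but on this input it ends with the same four schemas.

```python
from itertools import combinations


def closure(attrs, fds):
    """Attribute closure; fds is a list of (lhs, rhs) strings."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def find_violation(schema, fds):
    """Return (X, A) where X -> A violates BCNF inside schema, or None."""
    attrs = sorted(schema)
    for size in range(1, len(attrs)):
        for combo in combinations(attrs, size):
            x = set(combo)
            # Closure under the full F, restricted to this sub-schema.
            implied = closure(x, fds) & set(schema)
            # X determines something new but is not a super key of schema.
            if x < implied < set(schema):
                return x, min(implied - x)
    return None


def bcnf_decompose(schema, fds):
    violation = find_violation(schema, fds)
    if violation is None:
        return [frozenset(schema)]
    x, a = violation
    s1 = x | {a}            # S1 = XA
    s2 = set(schema) - {a}  # S2 = S - A (still contains X)
    return bcnf_decompose(s1, fds) + bcnf_decompose(s2, fds)


F = [("C", "T"), ("CS", "G"), ("HR", "C"), ("HS", "R"), ("TH", "R")]
result = bcnf_decompose("CTHRSG", F)
print(sorted("".join(sorted(s)) for s in result))
# prints ['CGS', 'CHR', 'CHS', 'CT']
```

In general the outcome of BCNF decomposition depends on which violation is picked first, so a different enumeration order can yield a different (still valid) result on other inputs.
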
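Normal forms – 3NF
For every functional dependency X → Y (Y ⊈ X), at least one of the following holds:
  X is a super key, i.e., X contains a candidate key
  Y belongs to a candidate key
Example: R(A, B, C, D), F = {A → B, B → C}
  Not in 3NF: AD is the only candidate key, and B → C violates 3NF (B is not a super key and C does not belong to a candidate key)
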
3NF decomposition with dependency preservation
Step 1: Simplify the set of FDs (eliminate redundancies); the result is called the minimal cover.
  Rewrite the FDs with single attributes on the RHS
    Replace AB → CD with AB → C and AB → D
  Eliminate redundant FDs
    F = {A → C, A → B, B → C}
    A → C is redundant (it is implied by A → B and B → C)
  Eliminate redundant attributes from the LHS of FDs
    F = {A → B, AB → C}
    B is redundant in AB → C because A by itself determines C (A → C is implied by F)

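The three simplifications can be sketched with the attribute-closure test as the only tool. This is a minimal sketch under assumptions: the names `minimal_cover` and `show` are hypothetical, FDs are encoded as pairs of strings of single-letter attributes, and the sketch trims left-hand sides before dropping redundant FDs, which is an ordering choice of this sketch rather than something the slide prescribes.

```python
def closure(attrs, fds):
    """Attribute closure; each FD is an (lhs, rhs) pair of attribute collections."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def minimal_cover(fds):
    # 1. One attribute on each right-hand side (duplicates dropped).
    cover = list(dict.fromkeys(
        (frozenset(lhs), a) for lhs, rhs in fds for a in rhs))
    # 2. Remove redundant attributes from left-hand sides:
    #    B is redundant in X -> A if (X - B)+ already contains A.
    for i in range(len(cover)):
        lhs, a = cover[i]
        for b in sorted(lhs):
            smaller = lhs - {b}
            if smaller and a in closure(smaller, cover):
                lhs = smaller
                cover[i] = (lhs, a)
    cover = list(dict.fromkeys(cover))
    # 3. Remove FDs that are implied by the remaining ones.
    for fd in list(cover):
        rest = [g for g in cover if g != fd]
        if fd[1] in closure(fd[0], rest):
            cover = rest
    return cover


def show(cover):
    return [("".join(sorted(lhs)), a) for lhs, a in cover]


print(show(minimal_cover([("A", "C"), ("A", "B"), ("B", "C")])))
# prints [('A', 'B'), ('B', 'C')]
print(show(minimal_cover([("A", "B"), ("AB", "C")])))
# prints [('A', 'B'), ('A', 'C')]
```
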
3NF decomposition with dependency preservation
Step 2: the decomposition is defined as
  {X A1 … Am | X → Ai ∈ F} ∪ {B ∈ R | B does not occur in F}
Example:
  R = CTHRSG
  F = {C → T, CS → G, HR → C, HS → R, HT → R}
  Then the decomposition is {CT, CSG, HRC, HSR, HTR}

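Step 2 amounts to grouping the FDs of the minimal cover by left-hand side. The sketch below illustrates that grouping; the function name `synthesize_3nf` is hypothetical, the FDs of the example are read as C → T, CS → G, HR → C, HS → R, HT → R, and placing every attribute that occurs in no FD into its own one-attribute schema is this sketch's reading of the second set in the formula.

```python
def synthesize_3nf(schema, cover):
    """Build one schema X A1 ... Am per distinct left-hand side X.

    cover is assumed to already be a minimal cover, given as a list of
    (lhs, rhs) strings of single-letter attributes.
    """
    groups = {}
    for lhs, rhs in cover:
        groups.setdefault(frozenset(lhs), set()).update(rhs)
    schemas = [lhs | rhs for lhs, rhs in groups.items()]
    # Attributes that occur in no FD each get a schema of their own.
    used = set().union(*schemas) if schemas else set()
    schemas += [frozenset(b) for b in sorted(set(schema) - used)]
    return schemas


F = [("C", "T"), ("CS", "G"), ("HR", "C"), ("HS", "R"), ("HT", "R")]
result = synthesize_3nf("CTHRSG", F)
print(["".join(sorted(s)) for s in result])
# prints ['CT', 'CGS', 'CHR', 'HRS', 'HRT']
```

Up to the order of the letters, these are the schemas CT, CSG, HRC, HSR, HTR from the example.
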
Decomposition with lossless join and dependency preservation
Step 1: compute the 3NF decomposition with dependency preservation.
Step 2: check whether any schema in the result already contains a candidate key of R; if none does, add a candidate key as an extra schema.

Example
  R: ABCDE
  F = {A → C, BC → D, AD → E}
  Decomposition after Step 1: {AC, BCD, ADE}
  None of AC, BCD, ADE is a super key: AC+ = AC, BCD+ = BCD, ADE+ = ADEC
  AB is a key, so add it to the decomposition: {AB, AC, BCD, ADE}

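The key check in this example can be sketched as below. The function name `add_key_if_missing` is hypothetical, the FDs are read as A → C, BC → D, AD → E, and the way a candidate key is found (start from all attributes and drop any attribute whose removal still leaves a super key) is one simple option chosen for this sketch; it returns some candidate key, not necessarily the one a person would pick by hand.

```python
def closure(attrs, fds):
    """Attribute closure; fds is a list of (lhs, rhs) strings."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def add_key_if_missing(schema, fds, decomposition):
    parts = [frozenset(p) for p in decomposition]
    # Some schema already contains a key exactly when it is a super key of R.
    if any(closure(p, fds) >= set(schema) for p in parts):
        return parts
    # Shrink the full attribute set down to a candidate key.
    key = set(schema)
    for a in sorted(schema, reverse=True):
        if closure(key - {a}, fds) >= set(schema):
            key -= {a}
    return [frozenset(key)] + parts


F = [("A", "C"), ("BC", "D"), ("AD", "E")]
result = add_key_if_missing("ABCDE", F, ["AC", "BCD", "ADE"])
print(["".join(sorted(s)) for s in result])
# prints ['AB', 'AC', 'BCD', 'ADE']
```
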
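Testing lossless join
Consider 5 attributes A, B, C, D, E with FDs {AB → C, C → D, D → E}.
Consider the decomposition R1(A, B, C), R2(C, D), R3(D, E). Is this a lossless-join decomposition?

Initial tableau (ai in the columns that each schema contains, "-" elsewhere):

        A    B    C    D    E
  R1    a1   a2   a3   -    -
  R2    -    -    a3   a4   -
  R3    -    -    -    a4   a5

R1 and R2 agree on C, so C → D puts a4 into row R1. All rows then agree on D, so D → E puts a5 into rows R1 and R2. Row R1 becomes a1 a2 a3 a4 a5, so the decomposition is lossless.
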
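The tableau test can be sketched in code as follows. This is an illustrative sketch, not the slide's own procedure: the function name `is_lossless` and the symbol encoding ("a" for the distinguished symbol of a column, "b0", "b1", ... for row-specific symbols) are assumptions, and the FDs are read as AB → C, C → D, D → E.

```python
def is_lossless(schema, fds, decomposition):
    attrs = sorted(schema)
    # Row i has "a" in the columns of schema i and "b<i>" elsewhere.
    rows = [{c: "a" if c in part else f"b{i}" for c in attrs}
            for i, part in enumerate(decomposition)]
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            for r1 in rows:
                for r2 in rows:
                    if r1 is r2 or any(r1[c] != r2[c] for c in lhs):
                        continue
                    # r1 and r2 agree on lhs, so they must agree on rhs.
                    for c in rhs:
                        if r1[c] != r2[c]:
                            # "a" sorts before every "b...", so it is kept.
                            keep, drop = sorted((r1[c], r2[c]))
                            for row in rows:
                                if row[c] == drop:
                                    row[c] = keep
                            changed = True
    # Lossless exactly when some row holds only distinguished symbols.
    return any(all(row[c] == "a" for c in attrs) for row in rows)


F = [("AB", "C"), ("C", "D"), ("D", "E")]
print(is_lossless("ABCDE", F, ["ABC", "CD", "DE"]))  # True
print(is_lossless("ABC", [], ["AB", "BC"]))          # False
```

The second call shows a lossy case: without any FD, splitting ABC into AB and BC never fills a row with distinguished symbols.
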