# Loss-Less Joins


1 Loss-Less Joins

2 Decompositions

- Dependency-preservation property: enforce constraints on the original relation by enforcing some constraints on the resulting relations
- Lossless-join property: get the original relation back by joining the resulting relations
- Boyce-Codd normal form (BCNF): lossless join
- Third normal form (3NF): lossless join and dependency preservation

3 Testing for Dependency Preservation

- Suppose we project R, with FD set $F$, onto $R_1$ with FD set $F_1$, $R_2$ with FD set $F_2$, …, $R_k$ with FD set $F_k$.
- We say that dependencies are preserved if and only if $F^+ = (F_1 \cup F_2 \cup \dots \cup F_k)^+$.
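The test above can be sketched in Python. This is a minimal sketch, not the slides' own procedure: it assumes FD's are given as (left side, right side) pairs of attribute sets, and the names `closure` and `preserves_dependencies` are made up for illustration. Instead of computing every projected FD set explicitly, it uses the common shortcut of repeatedly closing the part of the attribute set that lies inside each decomposed schema.

```python
def closure(attrs, fds):
    """Closure of an attribute set under FD's given as (lhs, rhs) pairs of sets."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


def preserves_dependencies(fds, schemas):
    """True if every FD in `fds` follows from the FD's projected onto `schemas`."""
    for lhs, rhs in fds:
        reach = set(lhs)
        changed = True
        while changed:
            changed = False
            for schema in schemas:
                # Attributes derivable using only FD's that live inside this schema.
                gained = closure(reach & schema, fds) & schema
                if not gained <= reach:
                    reach |= gained
                    changed = True
        if not rhs <= reach:
            return False
    return True


# R = ABCD split into AB, BC, CD with C->D and B->A: both FD's survive.
print(preserves_dependencies(
    [({"C"}, {"D"}), ({"B"}, {"A"})],
    [{"A", "B"}, {"B", "C"}, {"C", "D"}],
))  # True

# R = ABC split into AC, BC with AB->C and C->B: AB->C is lost.
print(preserves_dependencies(
    [({"A", "B"}, {"C"}), ({"C"}, {"B"})],
    [{"A", "C"}, {"B", "C"}],
))  # False
```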

4 Testing for a Lossless Join

- If we project R onto R1, R2, …, Rk, can we recover R by rejoining?
- Any tuple in R can be recovered from its projected fragments.
- So the only question is: when we rejoin, do we ever get back something we didn’t have originally?

5 Lossy Decomposition

- R ⊆ R1 ⋈ R2 ⋈ ... ⋈ Rn: always true!
- R ⊇ R1 ⋈ R2 ⋈ ... ⋈ Rn: have to check.

R:

| SSN | Name | Address |
|---|---|---|
| 1111 | Joe | 1 Pine |
| 2222 | Alice | 2 Oak |
| 3333 | Alice | 3 Pine |

R1:

| SSN | Name |
|---|---|
| 1111 | Joe |
| 2222 | Alice |
| 3333 | Alice |

R2:

| Name | Address |
|---|---|
| Joe | 1 Pine |
| Alice | 2 Oak |
| Alice | 3 Pine |

Problem: Name is not a key.
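To see the spurious tuples concretely, here is a small Python sketch that projects the SSN/Name/Address relation from this slide and rejoins it on Name. The variable names are illustrative only.

```python
# The relation R(SSN, Name, Address) from the slide.
R = {
    (1111, "Joe", "1 Pine"),
    (2222, "Alice", "2 Oak"),
    (3333, "Alice", "3 Pine"),
}

# Project onto R1(SSN, Name) and R2(Name, Address).
R1 = {(ssn, name) for ssn, name, _ in R}
R2 = {(name, addr) for _, name, addr in R}

# Natural join of R1 and R2 on Name.
joined = {(ssn, n1, addr) for ssn, n1 in R1 for n2, addr in R2 if n1 == n2}

spurious = joined - R
print(R <= joined)       # True: every original tuple comes back
print(sorted(spurious))  # the tuples that were never in R
```

The two extra tuples pair each Alice's SSN with the other Alice's address, which is exactly the damage caused by joining on a non-key attribute.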

6 The Chase Test

- Suppose tuple t comes back in the join.
- Then t is the join of projections of some tuples of R, one for each Ri of the decomposition.
- Can we use the given FD’s to show that one of these tuples in R must be t?

7 The Chase – (2)

- Start by assuming t = abc….
- For each i, there is a tuple si of R that has a, b, c,… in the attributes of Ri.
- si can have any values in other attributes.
- We’ll use the same letter as in t, but with a subscript, for these components.

8 Example: The Chase

- Let R = ABCD, and the decomposition be AB, BC, and CD.
- Let the given FD’s be C -> D and B -> A.
- Suppose the tuple t = abcd is the join of tuples projected onto AB, BC, CD.

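9 The Tableau

The tuples of R projected onto AB, BC, CD:

| A | B | C | D |
|---|---|---|---|
| a | b | c1 | d1 |
| a2 | b | c | d2 |
| a3 | b3 | c | d |

- Use C -> D: the second and third rows agree on C, so d2 becomes d.
- Use B -> A: the first and second rows agree on B, so a2 becomes a.

We’ve proved the second tuple must be t.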

10 Summary of the Chase

1. If two rows agree in the left side of a FD, make their right sides agree too.
2. Always replace a subscripted symbol by the corresponding unsubscripted one, if possible.
3. If we ever get an unsubscripted row, we know any tuple in the project-join is in the original (the join is lossless).
4. Otherwise, the final tableau is a counterexample.
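The four steps above can be sketched in Python as follows. This is an illustrative sketch under stated assumptions: attributes are single letters, FD's are (left side, right side) pairs of strings, and the function name `chase_lossless` is hypothetical. In each column, 0 stands for the unsubscripted symbol and i + 1 for the symbol subscripted by row i.

```python
def chase_lossless(attrs, schemas, fds):
    """Return True if the decomposition of `attrs` into `schemas` is lossless."""
    # One row per decomposed schema: unsubscripted (0) inside the schema,
    # subscripted by the row number outside it.
    tableau = [
        {a: 0 if a in schema else i + 1 for a in attrs}
        for i, schema in enumerate(schemas)
    ]
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            for r1 in tableau:
                for r2 in tableau:
                    if all(r1[a] == r2[a] for a in lhs):
                        for a in rhs:
                            if r1[a] != r2[a]:
                                # Prefer the smaller symbol, so 0 (unsubscripted) wins.
                                keep, drop = sorted((r1[a], r2[a]))
                                for row in tableau:
                                    if row[a] == drop:
                                        row[a] = keep
                                changed = True
    # Lossless exactly when some row has become entirely unsubscripted.
    return any(all(v == 0 for v in row.values()) for row in tableau)


print(chase_lossless("ABCD", ["AB", "BC", "CD"], [("C", "D"), ("B", "A")]))  # True
print(chase_lossless("ABCD", ["AB", "BC", "CD"], [("C", "D")]))              # False
```

Each merge removes one distinct symbol from a column, so the loop always terminates.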

11 Example: Lossy Join

- Same relation R = ABCD and same decomposition.
- But with only the FD C -> D.

12 The Tableau

| A | B | C | D |
|---|---|---|---|
| a | b | c1 | d1 |
| a2 | b | c | d2 |
| a3 | b3 | c | d |

- Use C -> D: d2 becomes d.

These three tuples form an example relation R showing that the join is lossy: abcd is not in R, but we can project and rejoin to get abcd. The projections ab, bc, and cd rejoin to form abcd.

13 Attribute Closure as Chase

R = ABCDEF, with FD’s AB -> C, BC -> AD, D -> E, CF -> B. Compute (AB)+.

| A | B | C | D | E | F |
|---|---|---|---|---|---|
| a | b | c1 | d1 | e1 | f1 |
| a | b | c2 | d2 | e2 | f2 |
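A possible way to run this exercise mechanically is the Python sketch below. It assumes single-letter attributes and FD's written as pairs of strings; the name `closure_by_chase` is illustrative. The two rows start out agreeing only on the attributes whose closure we want, and the closure is the set of columns on which the rows agree once no FD can change the tableau.

```python
def closure_by_chase(attrs, start, fds):
    """Compute the closure of `start` by chasing a two-row tableau."""
    # Rows agree (unsubscripted symbol) on `start`, differ (subscripts 1 and 2) elsewhere.
    row1 = {a: a.lower() if a in start else a.lower() + "1" for a in attrs}
    row2 = {a: a.lower() if a in start else a.lower() + "2" for a in attrs}
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if all(row1[a] == row2[a] for a in lhs):
                for a in rhs:
                    if row1[a] != row2[a]:
                        # The FD forces the two symbols to be equal.
                        row1[a] = row2[a] = a.lower()
                        changed = True
    return {a for a in attrs if row1[a] == row2[a]}


fds = [("AB", "C"), ("BC", "AD"), ("D", "E"), ("CF", "B")]
print(sorted(closure_by_chase("ABCDEF", "AB", fds)))  # ['A', 'B', 'C', 'D', 'E']
```

F never joins the closure because no FD has F on its right side, so CF -> B is never applicable.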

14 Multivalued Dependencies

- Fourth Normal Form
- Reasoning About FD’s + MVD’s

15 Definition of MVD

- A multivalued dependency (MVD) on R, X ->-> Y, says that if two tuples of R agree on all the attributes of X, then their components in Y may be swapped, and the result will be two tuples that are also in the relation.
- Let Z = R - (X+Y); then for each value of X, the values of Y are independent of the values of Z.

16 Multi-valued Dependencies?

| COURSE | PROFESSOR | BOOKS |
|---|---|---|
| DBSys | Zaki | LBK |
| DBSys | Zaki | O’Neil |
| DBSys | Zaki | Date |
| DBSys | Adali | LBK |
| DBSys | Adali | O’Neil |
| DBSys | Adali | Date |
| Comp Algo | Musser | CLR |
| Comp Algo | Musser | Baase |
| Comp Algo | Goldberg | CLR |
| Comp Algo | Goldberg | Baase |

Example: The MVD Course →→ Prof holds.

MVD: if X →→ Y holds over R, then for any value x of the attribute set X, the following holds (let Z = R − XY):

$$\pi_{YZ}(\sigma_{X=x}(R)) = \pi_{Y}(\sigma_{X=x}(R)) \times \pi_{Z}(\sigma_{X=x}(R))$$

That is, Y and Z = R − XY are independent of each other given X.
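The product condition can be checked directly on the course table. The Python sketch below is one way to do it, assuming a non-empty relation stored as a set of tuples and attribute sets given as lists of column positions; `mvd_holds` is a hypothetical helper name.

```python
from itertools import product


def mvd_holds(rows, x_idx, y_idx):
    """Check X ->-> Y by comparing each X-group with the product of its Y and Z parts."""
    width = len(next(iter(rows)))
    z_idx = [i for i in range(width) if i not in x_idx and i not in y_idx]

    def pick(row, idx):
        return tuple(row[i] for i in idx)

    for x in {pick(r, x_idx) for r in rows}:
        group = [r for r in rows if pick(r, x_idx) == x]   # sigma_{X=x}(R)
        yz = {(pick(r, y_idx), pick(r, z_idx)) for r in group}
        ys = {pick(r, y_idx) for r in group}
        zs = {pick(r, z_idx) for r in group}
        if yz != set(product(ys, zs)):
            return False
    return True


courses = {
    ("DBSys", "Zaki", "LBK"), ("DBSys", "Zaki", "O'Neil"), ("DBSys", "Zaki", "Date"),
    ("DBSys", "Adali", "LBK"), ("DBSys", "Adali", "O'Neil"), ("DBSys", "Adali", "Date"),
    ("Comp Algo", "Musser", "CLR"), ("Comp Algo", "Musser", "Baase"),
    ("Comp Algo", "Goldberg", "CLR"), ("Comp Algo", "Goldberg", "Baase"),
}

print(mvd_holds(courses, [0], [1]))                                 # True
print(mvd_holds(courses - {("DBSys", "Adali", "Date")}, [0], [1]))  # False
```

Removing a single row breaks the MVD, because one professor/book combination for DBSys is then missing from the product.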

17 Example: MVD

Drinkers(name, addr, phones, beersLiked)

- A drinker’s phones are independent of the beers they like.
  - name ->-> phones and name ->-> beersLiked.
- Thus, each of a drinker’s phones appears with each of the beers they like in all combinations.
- This repetition is unlike FD redundancy.
  - name -> addr is the only FD.

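18 Tuples Implied by name ->-> phones

If we have tuples:

| name | addr | phones | beersLiked |
|---|---|---|---|
| sue | a | p1 | b1 |
| sue | a | p2 | b2 |

Then these tuples must also be in the relation:

| name | addr | phones | beersLiked |
|---|---|---|---|
| sue | a | p2 | b1 |
| sue | a | p1 | b2 |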
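The implied tuples can also be generated mechanically. The Python sketch below, with the illustrative name `close_under_mvd`, keeps swapping Y-components between tuples that agree on X until nothing new appears. Columns are identified by position, which is an assumption of this sketch.

```python
def close_under_mvd(rows, x, y):
    """Add every tuple that X ->-> Y forces into the relation."""
    rows = set(rows)
    while True:
        implied = {
            # Take the Y-components from u and everything else from t.
            tuple(u[i] if i in y else t[i] for i in range(len(t)))
            for t in rows
            for u in rows
            if all(t[i] == u[i] for i in x)
        }
        if implied <= rows:
            return rows
        rows |= implied


start = {("sue", "a", "p1", "b1"), ("sue", "a", "p2", "b2")}
# Column 0 is name and column 2 is phones, so this applies name ->-> phones.
for row in sorted(close_under_mvd(start, [0], [2])):
    print(row)
```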

19 Picture of MVD X ->-> Y

[Figure: two tuples with columns X, Y, and others. The tuples are equal on X, and their Y components are exchanged.]

20 MVD Rules

- Every FD is an MVD (promotion).
  - If X -> Y, then swapping Y’s between two tuples that agree on X doesn’t change the tuples (same X values imply same Y values!).
  - Therefore, the “new” tuples are surely in the relation, and we know X ->-> Y.
- Complementation: If X ->-> Y, and Z is all the other attributes, then X ->-> Z.

21 Splitting Doesn’t Hold

- Unlike FD’s, we cannot split the right side of an MVD; sometimes you have to leave several attributes on the right side.

22 Example: Multiattribute Right Sides

Drinkers(name, areaCode, phone, beersLiked, manf)

- A drinker can have several phones, with the number divided between areaCode and phone (last 7 digits).
- A drinker can like several beers, each with its own manufacturer.

23 Example Continued

- Since the areaCode-phone combinations for a drinker are independent of the beersLiked-manf combinations, we expect that the following MVD’s hold:
  - name ->-> areaCode phone
  - name ->-> beersLiked manf

24 Example Data

Here is possible data satisfying these MVD’s:

| name | areaCode | phone | beersLiked | manf |
|---|---|---|---|---|
| Sue | 650 | 555-1111 | Bud | A.B. |
| Sue | 650 | 555-1111 | WickedAle | Pete’s |
| Sue | 415 | 555-9999 | Bud | A.B. |
| Sue | 415 | 555-9999 | WickedAle | Pete’s |

But we cannot swap area codes or phones by themselves. That is, neither name ->-> areaCode nor name ->-> phone holds for this relation.
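These claims can be verified with a direct check of the swap definition. The following Python sketch uses a hypothetical helper `mvd_holds_by_swapping` and refers to columns by position (0 = name, 1 = areaCode, 2 = phone, 3 = beersLiked, 4 = manf).

```python
def mvd_holds_by_swapping(rows, x, y):
    """X ->-> Y holds if swapping Y between any two tuples agreeing on X stays in the relation."""
    rows = set(rows)
    for t in rows:
        for u in rows:
            if all(t[i] == u[i] for i in x):
                swapped = tuple(u[i] if i in y else t[i] for i in range(len(t)))
                if swapped not in rows:
                    return False
    return True


drinkers = {
    ("Sue", "650", "555-1111", "Bud", "A.B."),
    ("Sue", "650", "555-1111", "WickedAle", "Pete's"),
    ("Sue", "415", "555-9999", "Bud", "A.B."),
    ("Sue", "415", "555-9999", "WickedAle", "Pete's"),
}

print(mvd_holds_by_swapping(drinkers, [0], [1, 2]))  # name ->-> areaCode phone: True
print(mvd_holds_by_swapping(drinkers, [0], [3, 4]))  # name ->-> beersLiked manf: True
print(mvd_holds_by_swapping(drinkers, [0], [1]))     # name ->-> areaCode: False
print(mvd_holds_by_swapping(drinkers, [0], [2]))     # name ->-> phone: False
```

Swapping only the area code would produce a number such as 415 555-1111, which Sue does not have.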

25 Fourth Normal Form

- The redundancy that comes from MVD’s is not removable by putting the database schema in BCNF.
- There is a stronger normal form, called 4NF, that (intuitively) treats MVD’s as FD’s when it comes to decomposition, but not when determining keys of the relation.

26 4NF Definition

- A relation R is in 4NF if: whenever X ->-> Y is a nontrivial MVD, then X is a superkey.
  - Nontrivial MVD means that:
    1. Y is not a subset of X, and
    2. X and Y are not, together, all the attributes.
  - Note that the definition of “superkey” still depends on FD’s only.

27 BCNF Versus 4NF

- Remember that every FD X -> Y is also an MVD, X ->-> Y.
- Thus, if R is in 4NF, it is certainly in BCNF.
  - Because any BCNF violation is a 4NF violation (after conversion to an MVD).
- But R could be in BCNF and not 4NF, because MVD’s are “invisible” to BCNF.

28 Decomposition and 4NF

- If X ->-> Y is a 4NF violation for relation R, we can decompose R using the same technique as for BCNF.
  1. XY is one of the decomposed relations.
  2. (R − Y) ∪ X is the other.

29 Example: 4NF Decomposition

Drinkers(name, addr, phones, beersLiked)

- FD: name -> addr
- MVD’s: name ->-> phones and name ->-> beersLiked
- Key is {name, phones, beersLiked}.
- All dependencies violate 4NF.

30 Example Continued

- Decompose using name -> addr:
  1. Drinkers1(name, addr)
     - In 4NF; only dependency is name -> addr.
  2. Drinkers2(name, phones, beersLiked)
     - Not in 4NF. MVD’s name ->-> phones and name ->-> beersLiked apply. No FD’s, so all three attributes form the key.

31 Example: Decompose Drinkers2

- Either MVD name ->-> phones or name ->-> beersLiked tells us to decompose to:
  - Drinkers3(name, phones)
  - Drinkers4(name, beersLiked)
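The two decomposition steps of this example can be replayed with a few lines of Python. This is only a sketch of the schema arithmetic, with schemas as sets of attribute names and a hypothetical helper `decompose`; it does not detect violations or project dependencies.

```python
def decompose(schema, x, y):
    """Split `schema` on a violating X ->-> Y into XY and (R - Y) U X."""
    return x | y, (schema - y) | x


drinkers = {"name", "addr", "phones", "beersLiked"}

# Step 1: use the FD name -> addr (treated as the MVD name ->-> addr).
drinkers1, drinkers2 = decompose(drinkers, {"name"}, {"addr"})

# Step 2: use name ->-> phones on what is left.
drinkers3, drinkers4 = decompose(drinkers2, {"name"}, {"phones"})

print(sorted(drinkers1))  # ['addr', 'name']
print(sorted(drinkers3))  # ['name', 'phones']
print(sorted(drinkers4))  # ['beersLiked', 'name']
```

Using name ->-> beersLiked in step 2 instead yields the same two schemas in the opposite order.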

32 Reasoning About MVD’s + FD’s

- Problem: given a set of MVD’s and/or FD’s that hold for a relation R, does a certain FD or MVD also hold in R?
- Solution: Use a tableau to explore all inferences from the given set, to see if you can prove the target dependency.

33 Why Do We Care?

1. 4NF technically requires an MVD violation.
   - Need to infer MVD’s from given FD’s and MVD’s that may not be violations themselves.
2. When we decompose, we need to project FD’s + MVD’s.

34 Example: Chasing a Tableau With MVD’s and FD’s

- To apply a FD, equate symbols, as before.
- To apply an MVD, generate one or both of the tuples we know must also be in the relation represented by the tableau.
- We’ll prove: if A ->-> BC and D -> C, then A -> C.

35 The Tableau for A -> C

Given: A ->-> BC and D -> C. Goal: prove that c1 = c2.

| A | B | C | D |
|---|---|---|---|
| a | b1 | c1 | d1 |
| a | b2 | c2 | d2 |
| a | b2 | c2 | d1 |

- Use A ->-> BC (first row’s D with second row’s BC) to add the third row.
- Use D -> C (first and third row agree on D, therefore agree on C): c1 becomes c2.

36 Example: Transitive Law for MVD’s

- If A ->-> B and B ->-> C, then A ->-> C.
  - Obvious from the complementation rule if the schema is ABC.
  - But it holds no matter what the schema; we’ll assume ABCD.

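37 The Tableau for A ->-> C

Given: A ->-> B and B ->-> C. Goal: derive tuple (a, b1, c2, d1).

| A | B | C | D |
|---|---|---|---|
| a | b1 | c1 | d1 |
| a | b2 | c2 | d2 |
| a | b1 | c2 | d2 |
| a | b1 | c2 | d1 |

- Use A ->-> B to swap B from the first row into the second, giving the third row.
- Use B ->-> C to swap C from the third row into the first, giving the fourth row, which is the goal.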

38 Rules for Inferring MVD’s + FD’s

- Start with a tableau of two rows.
  - These rows agree on the attributes of the left side of the dependency to be inferred.
  - And they disagree on all other attributes.
  - Use unsubscripted variables where they agree, subscripts where they disagree.

39 Inference: Applying a FD

- Apply a FD X -> Y by finding rows that agree on all attributes of X. Force the rows to agree on all attributes of Y.
  - Replace one variable by the other.
  - If the replaced variable is part of the goal tuple, replace it there too.

40 Inference: Applying a MVD

- Apply a MVD X ->-> Y by finding two rows that agree in X.
  - Add to the tableau one or both rows that are formed by swapping the Y-components of these two rows.

41 Inference: Goals

- To test whether U -> V holds, we succeed by inferring that the two variables in each column of V are actually the same.
- If we are testing U ->-> V, we succeed if we infer in the tableau a row that is the original two rows with the components of V swapped.

42 Inference: Endgame

- Apply all the given FD’s and MVD’s until we cannot change the tableau.
- If we meet the goal, then the dependency is inferred.
- If not, then the final tableau is a counterexample relation.
  - Satisfies all given dependencies.
  - Original two rows violate target dependency.
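The whole inference procedure can be sketched in Python as below. This is a simplified illustration under several assumptions: attributes are single letters, dependencies are pairs of strings, and the function names are hypothetical. Because the tableau starts with only two rows, each column holds at most two symbols, written here as 1 (first row's symbol) and 2 (second row's symbol); applying a FD merges a column so that 2 becomes 1 everywhere.

```python
def chase_two_rows(attrs, fds, mvds, lhs):
    """Chase a two-row tableau whose rows agree exactly on the attributes in `lhs`."""
    col = {a: i for i, a in enumerate(attrs)}
    rows = {tuple(1 for _ in attrs), tuple(1 if a in lhs else 2 for a in attrs)}
    merged = set()  # columns in which a FD has forced the two symbols to be equal

    def agree(r, s, names):
        return all(r[col[a]] == s[col[a]] for a in names)

    changed = True
    while changed:
        changed = False
        for x, y in fds:
            newly = {a for r in rows for s in rows if agree(r, s, x)
                     for a in y if r[col[a]] != s[col[a]]}
            if newly:
                merged |= newly
                rows = {tuple(1 if a in merged else v for a, v in zip(attrs, t))
                        for t in rows}
                changed = True
        for x, y in mvds:
            swapped = {tuple(s[col[a]] if a in y else r[col[a]] for a in attrs)
                       for r in rows for s in rows if agree(r, s, x)}
            if not swapped <= rows:
                rows |= swapped
                changed = True
    # The second original row, after all symbol replacements.
    second = tuple(1 if a in lhs or a in merged else 2 for a in attrs)
    return rows, second


def fd_follows(attrs, fds, mvds, u, v):
    """Does U -> V follow? The two original rows must agree on every column of V."""
    _, second = chase_two_rows(attrs, fds, mvds, u)
    return all(second[attrs.index(a)] == 1 for a in v)


def mvd_follows(attrs, fds, mvds, u, v):
    """Does U ->-> V follow? Look for the first row with the second row's V-components."""
    rows, second = chase_two_rows(attrs, fds, mvds, u)
    goal = tuple(second[i] if a in v else 1 for i, a in enumerate(attrs))
    return goal in rows


# If A ->-> BC and D -> C, then A -> C.
print(fd_follows("ABCD", [("D", "C")], [("A", "BC")], "A", "C"))   # True
# Transitive law: if A ->-> B and B ->-> C, then A ->-> C.
print(mvd_follows("ABCD", [], [("A", "B"), ("B", "C")], "A", "C"))  # True
# An MVD alone does not give the FD: A ->-> B does not imply A -> B.
print(fd_follows("ABC", [], [("A", "B")], "A", "B"))                # False
```

When a test fails, the `rows` returned by `chase_two_rows` play the role of the counterexample relation described above.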
