Design Theory for Relational Databases: Functional Dependencies, Decompositions, Normal Forms


2 Functional Dependencies
- X -> Y is an assertion about a relation R: whenever two tuples of R agree on all the attributes of X, they must also agree on all the attributes of Y.
  - Say “X -> Y holds in R.”
  - Convention: …, X, Y, Z represent sets of attributes; A, B, C, … represent single attributes.
  - Convention: no set formers in sets of attributes; write ABC rather than {A,B,C}.

3 Splitting Right Sides of FD’s
- X -> A1A2…An holds for R exactly when each of X -> A1, X -> A2, …, X -> An holds for R.
- Example: A -> BC is equivalent to A -> B and A -> C.
- There is no splitting rule for left sides.
- We’ll generally express FD’s with singleton right sides.

4 Example: FD’s
Drinkers(name, addr, beersLiked, manf, favBeer)
- Reasonable FD’s to assert:
  1. name -> addr favBeer
     - Note this FD is the same as name -> addr and name -> favBeer.
  2. beersLiked -> manf

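5 Example: Possible Data

name     addr        beersLiked  manf    favBeer
Janeway  Voyager     Bud         A.B.    WickedAle
Janeway  Voyager     WickedAle   Pete’s  WickedAle
Spock    Enterprise  Bud         A.B.    Bud

- The two Janeway tuples agree on addr because name -> addr.
- The two Janeway tuples agree on favBeer because name -> favBeer.
- The two Bud tuples agree on manf because beersLiked -> manf.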

6 Keys of Relations
- K is a superkey for relation R if K functionally determines all of R.
- K is a key for R if K is a superkey, but no proper subset of K is a superkey.

7 Example: Superkey
Drinkers(name, addr, beersLiked, manf, favBeer)
- {name, beersLiked} is a superkey because together these attributes determine all the other attributes.
  - name -> addr favBeer
  - beersLiked -> manf

8 Example: Key
- {name, beersLiked} is a key because neither {name} nor {beersLiked} is a superkey.
  - name does not determine manf; beersLiked does not determine addr.
- There are no other keys, but lots of superkeys.
  - Any superset of {name, beersLiked} is a superkey.
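
The superkey and key definitions above can be checked mechanically. The following is a brute-force sketch, not something taken from the slides: it assumes FD’s are given as (left side, right side) pairs of attribute sets, and the helper names `closure` and `candidate_keys` are hypothetical. It relies on attribute closure, which the slides introduce under the closure test.

```python
from itertools import combinations

def closure(attrs, fds):
    """Attributes functionally determined by `attrs` under `fds`."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

def candidate_keys(schema, fds):
    """All keys: superkeys with no proper subset that is a superkey."""
    schema = set(schema)
    keys = []
    for size in range(1, len(schema) + 1):           # smallest sets first
        for combo in combinations(sorted(schema), size):
            candidate = set(combo)
            if any(key <= candidate for key in keys):
                continue                              # contains a key: superkey only
            if closure(candidate, fds) == schema:
                keys.append(candidate)
    return keys

drinkers = {"name", "addr", "beersLiked", "manf", "favBeer"}
drinkers_fds = [({"name"}, {"addr", "favBeer"}), ({"beersLiked"}, {"manf"})]
print(candidate_keys(drinkers, drinkers_fds))
```

The search is exponential in the number of attributes, which is acceptable for slide-sized schemas but not for wide tables.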

9 More FD’s From “Physics”
- Example: “no two courses can meet in the same room at the same time” tells us: hour room -> course.

10 Inferring FD’s
- We are given FD’s X1 -> A1, X2 -> A2, …, Xn -> An, and we want to know whether an FD Y -> B must hold in any relation that satisfies the given FD’s.
  - Example: If A -> B and B -> C hold, surely A -> C holds, even if we don’t say so.
- Important for design of good relation schemas.

11 Inference Test
- To test if Y -> B, start by assuming two tuples agree in all attributes of Y.

Y            other attributes
0 0 0 0 0    0 0 … 0
0 0 0 0 0    ? ? … ?

12 Inference Test (cont.)
- Use the given FD’s to infer that these tuples must also agree in certain other attributes.
  - If B is one of these attributes, then Y -> B is true.
  - Otherwise, the two tuples, with any forced equalities, form a two-tuple relation that proves Y -> B does not follow from the given FD’s.

13 Closure Test
- An easier way to test is to compute the closure of Y, denoted Y+.
- Basis: Y+ = Y.
- Induction: Look for an FD’s left side X that is a subset of the current Y+. If the FD is X -> A, add A to Y+.
- Stop when no FD adds a new attribute; Y -> B follows from the given FD’s exactly when B is in Y+.
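
The basis and induction steps translate almost line for line into code. This is a minimal sketch under the assumption that FD’s are stored as (left side, right side) pairs of attribute sets; the function name `closure` is illustrative.

```python
def closure(attrs, fds):
    """Return Y+ for Y = attrs under the FD's in fds."""
    result = set(attrs)                  # Basis: Y+ = Y
    changed = True
    while changed:                       # Induction: repeat until nothing is added
        changed = False
        for lhs, rhs in fds:
            # Left side inside the current Y+ and right side not yet included
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

# FD's from the Drinkers example
drinkers_fds = [({"name"}, {"addr", "favBeer"}), ({"beersLiked"}, {"manf"})]
print(sorted(closure({"name"}, drinkers_fds)))
print(sorted(closure({"name", "beersLiked"}, drinkers_fds)))
```

To test whether Y -> B follows from the given FD’s, check whether B is in `closure(Y, fds)`.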

14 (Figure: an FD X -> A whose left side X lies inside the current Y+; adding A gives the new Y+.)

15 Finding All Implied FD’s
- Motivation: “normalization,” the process where we break a relation schema into two or more schemas.
- Example: ABCD with FD’s AB -> C, C -> D, and D -> A.
  - Decompose into ABC, AD. What FD’s hold in ABC?
  - Not only AB -> C, but also C -> A!

16 Why?

Two tuples of ABCD whose projections onto ABC have the same C:

ABCD              ABC
a1 b1 c d1        a1 b1 c
a2 b2 c d2        a2 b2 c

- d1 = d2 because C -> D.
- a1 = a2 because D -> A.
- Thus, tuples in the projection with equal C’s have equal A’s; C -> A.

17 Basic Idea
1. Start with given FD’s and find all nontrivial FD’s that follow from the given FD’s.
   - Nontrivial = right side not contained in the left.
2. Restrict to those FD’s that involve only attributes of the projected schema.

18 Simple, Exponential Algorithm
1. For each set of attributes X, compute X+.
2. Add X -> A for all A in X+ - X.
3. However, drop XY -> A whenever we discover X -> A.
   - Because XY -> A follows from X -> A in any projection.
4. Finally, use only FD’s involving projected attributes.

19 Example: Projecting FD’s
- ABC with FD’s A -> B and B -> C. Project onto AC.
  - A+ = ABC; yields A -> B, A -> C. Since A+ is already all attributes, we do not need to compute AB+ or AC+.
  - B+ = BC; yields B -> C.
  - C+ = C; yields nothing.
  - BC+ = BC; yields nothing.

20 Example -- Continued
- Resulting FD’s: A -> B, A -> C, and B -> C.
- Projection onto AC: A -> C.
  - Only FD that involves a subset of {A,C}.
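
The exponential algorithm can be sketched as follows. One simplification is assumed here that the slides do not state: because the final step keeps only FD’s over the projected attributes, the sketch enumerates left sides drawn from the projected attributes only. The names `closure` and `project_fds` are hypothetical, and FD’s are (left side, right side) pairs of attribute sets.

```python
from itertools import combinations

def closure(attrs, fds):
    """Attributes functionally determined by `attrs` under `fds`."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

def project_fds(fds, target):
    """Nontrivial FD's X -> A that hold in the projection onto `target`."""
    target = set(target)
    projected = []                                    # (left side, attribute) pairs
    for size in range(1, len(target) + 1):            # small left sides first
        for combo in combinations(sorted(target), size):
            lhs = frozenset(combo)
            for attr in sorted((closure(lhs, fds) & target) - lhs):
                # Drop XY -> A when some X -> A was already discovered.
                if not any(x < lhs and a == attr for x, a in projected):
                    projected.append((lhs, attr))
    return projected

# ABC with A -> B and B -> C, projected onto AC
print(project_fds([({"A"}, {"B"}), ({"B"}, {"C"})], {"A", "C"}))
```

Running the same function on ABCD with AB -> C, C -> D, D -> A and target ABC recovers both AB -> C and the less obvious C -> A.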

21 Armstrong’s Axioms for FDs
This is the syntactic way of computing/testing the various properties of FDs.
- Reflexivity: If $Y \subseteq X$ then X -> Y (trivial FD).
  - Example: Name, Address -> Name
- Augmentation: If X -> Y then XZ -> YZ.
  - Example: If Town -> Zip then Town, Name -> Zip, Name
- Transitivity: If X -> Y and Y -> Z then X -> Z.

22 Relational Schema Design
- Goal of relational schema design is to avoid anomalies and redundancy.
  - Update anomaly: one occurrence of a fact is changed, but not all occurrences.
  - Deletion anomaly: valid fact is lost when a tuple is deleted.

23 Example of Bad Design
Drinkers(name, addr, beersLiked, manf, favBeer)

name     addr        beersLiked  manf    favBeer
Janeway  Voyager     Bud         A.B.    WickedAle
Janeway  ???         WickedAle   Pete’s  ???
Spock    Enterprise  Bud         ???     Bud

Data is redundant, because each of the ???’s can be figured out by using the FD’s name -> addr favBeer and beersLiked -> manf.

24 This Bad Design Also Exhibits Anomalies

name     addr        beersLiked  manf    favBeer
Janeway  Voyager     Bud         A.B.    WickedAle
Janeway  Voyager     WickedAle   Pete’s  WickedAle
Spock    Enterprise  Bud         A.B.    Bud

- Update anomaly: if Janeway is transferred to Intrepid, will we remember to change each of her tuples?
- Deletion anomaly: if nobody likes Bud, we lose track of the fact that Anheuser-Busch manufactures Bud.

25 What is Database Normalization?
- Database normalization is a stepwise formal process that decomposes database tables in such a way that both data redundancy and update anomalies are minimised.
- It analyses tables using the functional dependencies that hold in them together with their primary key or candidate keys.

26 Boyce-Codd Normal Form
- We say a relation R is in BCNF if whenever X -> Y is a nontrivial FD that holds in R, X is a superkey.
  - Remember: nontrivial means Y is not contained in X.
  - Remember, a superkey is any superset of a key (not necessarily a proper superset).

27 Example
Drinkers(name, addr, beersLiked, manf, favBeer)
FD’s: name -> addr favBeer, beersLiked -> manf
- Only key is {name, beersLiked}.
- In each FD, the left side is not a superkey.
- Any one of these FD’s shows Drinkers is not in BCNF.

28 Another Example
Beers(name, manf, manfAddr)
FD’s: name -> manf, manf -> manfAddr
- Only key is {name}.
- name -> manf does not violate BCNF, but manf -> manfAddr does.

29 Decomposition into BCNF
- Given: relation R with FD’s F.
- Look among the given FD’s for a BCNF violation X -> Y.
  - If any FD following from F violates BCNF, then there will surely be an FD in F itself that violates BCNF.
- Compute X+.
  - X+ cannot be all the attributes, or else X would be a superkey.

30 Decompose R Using X -> Y
- Replace R by relations with schemas:
  1. R1 = X+.
  2. R2 = R - (X+ - X).
- Project given FD’s F onto the two new relations.

31 Decomposition Picture
(Figure: R drawn as three regions, R - X+, X, and X+ - X. R1 covers X and X+ - X; R2 covers R - X+ and X.)

32 Example: BCNF Decomposition
Drinkers(name, addr, beersLiked, manf, favBeer)
F = name -> addr, name -> favBeer, beersLiked -> manf
- Pick BCNF violation name -> addr.
- Close the left side: {name}+ = {name, addr, favBeer}.
- Decomposed relations:
  1. Drinkers1(name, addr, favBeer)
  2. Drinkers2(name, beersLiked, manf)

33 Example -- Continued
- We are not done; we need to check Drinkers1 and Drinkers2 for BCNF.
- Projecting FD’s is easy here.
- For Drinkers1(name, addr, favBeer), relevant FD’s are name -> addr and name -> favBeer.
  - Thus, {name} is the only key and Drinkers1 is in BCNF.

34 Example -- Continued
- For Drinkers2(name, beersLiked, manf), the only FD is beersLiked -> manf, and the only key is {name, beersLiked}.
  - Violation of BCNF.
- beersLiked+ = {beersLiked, manf}, so we decompose Drinkers2 into:
  1. Drinkers3(beersLiked, manf)
  2. Drinkers4(name, beersLiked)

35 Example -- Concluded
- The resulting decomposition of Drinkers:
  1. Drinkers1(name, addr, favBeer)
  2. Drinkers3(beersLiked, manf)
  3. Drinkers4(name, beersLiked)
- Notice: Drinkers1 tells us about drinkers, Drinkers3 tells us about beers, and Drinkers4 tells us the relationship between drinkers and the beers they like.
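
The whole decomposition procedure can be sketched recursively. The sketch below takes one liberty that should be read as an assumption: rather than projecting FD’s explicitly onto each sub-relation, it looks for a violation by computing closures under the original FD’s and intersecting with the sub-relation’s attributes, which yields the same projected dependencies. The names `closure` and `bcnf_decompose` are hypothetical.

```python
from itertools import combinations

def closure(attrs, fds):
    """Attributes functionally determined by `attrs` under `fds`."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

def bcnf_decompose(schema, fds):
    """Split `schema` until no sub-relation has a BCNF violation."""
    schema = frozenset(schema)
    for size in range(1, len(schema)):                # proper subsets only
        for combo in combinations(sorted(schema), size):
            x = set(combo)
            x_plus = closure(x, fds) & schema         # X+ within this relation
            if x_plus != x and x_plus != schema:      # nontrivial FD, X not a superkey
                r1 = x_plus                           # R1 = X+
                r2 = schema - (x_plus - x)            # R2 = R - (X+ - X)
                return bcnf_decompose(r1, fds) + bcnf_decompose(r2, fds)
    return [schema]                                   # no violation found

drinkers = {"name", "addr", "beersLiked", "manf", "favBeer"}
drinkers_fds = [({"name"}, {"addr", "favBeer"}), ({"beersLiked"}, {"manf"})]
for relation in bcnf_decompose(drinkers, drinkers_fds):
    print(sorted(relation))
```

The sketch picks violations in a different order than the slides (it meets beersLiked -> manf first), yet it ends with the same three relations. In general, different choices of violation can produce different BCNF decompositions.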

36 Third Normal Form -- Motivation
- There is one structure of FD’s that causes trouble when we decompose.
- AB -> C and C -> B.
  - Example: A = street address, B = city, C = zip code.
- There are two keys, {A,B} and {A,C}.
- C -> B is a BCNF violation, so we must decompose into AC, BC.

37 We Cannot Enforce FD’s
- The problem is that if we use AC and BC as our database schema, we cannot enforce the FD AB -> C by checking FD’s in these decomposed relations.
- Example with A = street, B = city, and C = zip on the next slide.

38 An Unenforceable FD

street        zip          city       zip
545 Tech Sq.  02138        Cambridge  02138
545 Tech Sq.  02139        Cambridge  02139

Join tuples with equal zip codes:

street        city       zip
545 Tech Sq.  Cambridge  02138
545 Tech Sq.  Cambridge  02139

Although no FD’s were violated in the decomposed relations, the FD street city -> zip is violated by the database as a whole.

39 3NF Lets Us Avoid This Problem
- 3rd Normal Form (3NF) modifies the BCNF condition so we do not have to decompose in this problem situation.
- An attribute is prime if it is a member of any key.
- X -> A violates 3NF if and only if X is not a superkey, and also A is not prime.

40 Third Normal Form
- A relational schema R is in 3NF if for every FD X -> Y associated with R, either:
  - $Y \subseteq X$ (i.e., the FD is trivial); or
  - X is a superkey of R; or
  - every $A \in Y$ is part of some key of R.
- The first two conditions are the BCNF conditions.
- 3NF is weaker than BCNF (every schema that is in BCNF is also in 3NF).

41 Example: 3NF
- In our problem situation with FD’s AB -> C and C -> B, we have keys AB and AC.
- Thus A, B, and C are each prime.
- Although C -> B violates BCNF, it does not violate 3NF.
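
The difference between the two conditions shows up clearly when both are checked on the same FD’s. The sketch below is illustrative only: it assumes it is enough to inspect the given FD’s, uses brute-force key search, and the names `closure`, `candidate_keys`, and `normal_form_violations` are hypothetical.

```python
from itertools import combinations

def closure(attrs, fds):
    """Attributes functionally determined by `attrs` under `fds`."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

def candidate_keys(schema, fds):
    """All minimal superkeys, found by trying small attribute sets first."""
    schema, keys = set(schema), []
    for size in range(1, len(schema) + 1):
        for combo in combinations(sorted(schema), size):
            candidate = set(combo)
            if not any(k <= candidate for k in keys) and closure(candidate, fds) == schema:
                keys.append(candidate)
    return keys

def normal_form_violations(schema, fds):
    """Return (BCNF violations, 3NF violations) as (left side, attribute) pairs."""
    schema = set(schema)
    prime = set().union(*candidate_keys(schema, fds))  # members of some key
    bcnf, third = [], []
    for lhs, rhs in fds:
        if closure(lhs, fds) >= schema:
            continue                                   # left side is a superkey
        for attr in sorted(set(rhs) - set(lhs)):       # nontrivial part only
            bcnf.append((frozenset(lhs), attr))
            if attr not in prime:                      # 3NF excuses prime attributes
                third.append((frozenset(lhs), attr))
    return bcnf, third

addr_fds = [({"street", "city"}, {"zip"}), ({"zip"}, {"city"})]
print(normal_form_violations({"street", "city", "zip"}, addr_fds))
```

For street-city-zip the first list contains zip -> city and the second is empty, matching the slide: a BCNF violation that is not a 3NF violation.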

42 What 3NF and BCNF Give You
- There are two important properties of a decomposition:
  1. Lossless Join: it should be possible to project the original relations onto the decomposed schema, and then reconstruct the original.
  2. Dependency Preservation: it should be possible to check in the projected relations whether all the given FD’s are satisfied.

43 3NF and BCNF -- Continued
- We can get (1) with a BCNF decomposition.
- We can get both (1) and (2) with a 3NF decomposition.
- But we can’t always get both (1) and (2) with a BCNF decomposition.
  - street-city-zip is an example.

44 Lossless Schema Decomposition
- A decomposition should not lose information.
- A decomposition (R1, …, Rn) of a schema R is lossless if every valid instance r of R can be reconstructed from its components:
  $r = r_1 \bowtie r_2 \bowtie \cdots \bowtie r_n$, where each $r_i = \pi_{R_i}(r)$.

45 Lossy Decomposition
- The following is always the case: $r \subseteq r_1 \bowtie r_2 \bowtie \cdots \bowtie r_n$.
- But the following is not always true: $r \supseteq r_1 \bowtie r_2 \bowtie \cdots \bowtie r_n$.
- Example, where $r \not\supseteq r_1 \bowtie r_2$:

r:   SSN   Name   Address
     1111  Joe    1 Pine
     2222  Alice  2 Oak
     3333  Alice  3 Pine

r1:  SSN   Name
     1111  Joe
     2222  Alice
     3333  Alice

r2:  Name   Address
     Joe    1 Pine
     Alice  2 Oak
     Alice  3 Pine

- The tuples (2222, Alice, 3 Pine) and (3333, Alice, 2 Oak) are in the join, but not in the original.

46 Lossy Decompositions: What is Actually Lost?
- In the previous example, the tuples (2222, Alice, 3 Pine) and (3333, Alice, 2 Oak) were gained, not lost!
  - Why do we say that the decomposition was lossy?
- What was lost is information:
  - That 2222 lives at 2 Oak: in the decomposition, 2222 can live at either 2 Oak or 3 Pine.
  - That 3333 lives at 3 Pine: in the decomposition, 3333 can live at either 2 Oak or 3 Pine.
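
The spurious tuples are easy to reproduce. The following sketch models the SSN example with plain Python sets of tuples; the variable names are illustrative, and the natural join is written out by hand for this one pair of schemas.

```python
# The original instance r(SSN, Name, Address) from the example
r = {
    (1111, "Joe", "1 Pine"),
    (2222, "Alice", "2 Oak"),
    (3333, "Alice", "3 Pine"),
}

# Projections onto (SSN, Name) and (Name, Address)
r1 = {(ssn, name) for ssn, name, _ in r}
r2 = {(name, addr) for _, name, addr in r}

# Natural join of r1 and r2 on the shared attribute Name
joined = {
    (ssn, name, addr)
    for ssn, name in r1
    for other_name, addr in r2
    if name == other_name
}

spurious = joined - r        # tuples gained by projecting and rejoining
print(r <= joined)           # every original tuple survives
print(sorted(spurious))
```

The join contains five tuples where the original had three; the two extra ones are exactly the tuples named on the slide.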

47 Testing for Losslessness
- A (binary) decomposition of R = (R, F) into R1 = (R1, F1) and R2 = (R2, F2) is lossless if and only if:
  - either the FD $(R_1 \cap R_2) \to R_1$ is in F+
  - or the FD $(R_1 \cap R_2) \to R_2$ is in F+

48 Example
Schema (R, F) where
  R = {SSN, Name, Address, Hobby}
  F = {SSN -> Name, Address}
can be decomposed into
  R1 = {SSN, Name, Address}, F1 = {SSN -> Name, Address}
and
  R2 = {SSN, Hobby}, F2 = { }
Since $R_1 \cap R_2 = \{SSN\}$ and SSN -> R1, the decomposition is lossless.
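
The binary test reduces to one closure computation: the shared attributes must determine all of R1 or all of R2. A minimal sketch, assuming FD’s as (left side, right side) pairs of attribute sets and hypothetical function names:

```python
def closure(attrs, fds):
    """Attributes functionally determined by `attrs` under `fds`."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

def is_lossless_binary(r1, r2, fds):
    """True if (R1 ∩ R2) -> R1 or (R1 ∩ R2) -> R2 follows from fds."""
    common_plus = closure(set(r1) & set(r2), fds)
    return set(r1) <= common_plus or set(r2) <= common_plus

fds = [({"SSN"}, {"Name", "Address"})]
print(is_lossless_binary({"SSN", "Name", "Address"}, {"SSN", "Hobby"}, fds))
print(is_lossless_binary({"SSN", "Name"}, {"Name", "Address"}, fds))
```

The first call is the lossless example from the slide; the second is the earlier SSN/Name/Address split, where the shared attribute Name determines neither side.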

49 Intuition Behind the Test for Losslessness
- Suppose $R_1 \cap R_2 \to R_2$. Then a row of r1 can combine with exactly one row of r2 in the natural join (since in r2 a particular set of values for the attributes in $R_1 \cap R_2$ defines a unique row).
(Figure: r1 and r2 side by side, joined on the columns $R_1 \cap R_2$. In r1 these columns hold the values a, a, b, c; in r2 they hold a, b, c, each exactly once.)

50 Testing for a Lossless Join
- If we project R onto R1, R2, …, Rk, can we recover R by rejoining?
- Any tuple in R can be recovered from its projected fragments.
- So the only question is: when we rejoin, do we ever get back something we didn’t have originally?

51 The Chase Test
- Suppose tuple t comes back in the join.
- Then t is the join of projections of some tuples of R, one for each Ri of the decomposition.
- Can we use the given FD’s to show that one of these tuples must be t?

52 The Chase (cont.)
- Start by assuming t = abc….
- For each i, there is a tuple si of R that has a, b, c, … in the attributes of Ri.
- si can have any values in other attributes.
- We’ll use the same letter as in t, but with a subscript, for these components.

53 Example: The Chase
- Let R = ABCD, and the decomposition be AB, BC, and CD.
- Let the given FD’s be C -> D and B -> A.
- Suppose the tuple t = abcd is the join of tuples projected onto AB, BC, CD.

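54 The Tableau

The tuples of R projected onto AB, BC, CD:

A   B   C   D
a   b   c1  d1
a2  b   c   d2
a3  b3  c   d

- Use C -> D: the second and third rows agree on C, so d2 becomes d.
- Use B -> A: the first and second rows agree on B, so a2 becomes a.
- The second row is now abcd: we’ve proved the second tuple must be t.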

55 Summary of the Chase
1. If two rows agree in the left side of a FD, make their right sides agree too.
2. Always replace a subscripted symbol by the corresponding unsubscripted one, if possible.
3. If we ever get an unsubscripted row, we know any tuple in the project-join is in the original (the join is lossless).
4. Otherwise, the final tableau is a counterexample.

56 Example: Lossy Join
- Same relation R = ABCD and same decomposition.
- But with only the FD C -> D.

57 The Tableau

A   B   C   D
a   b   c1  d1
a2  b   c   d2
a3  b3  c   d

- Use C -> D: d2 becomes d. No further changes are possible.
- These three tuples are an example R that shows the join is lossy: abcd is not in R, but we can project and rejoin to get abcd.
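
Both tableau examples can be replayed with a small chase routine. In this sketch a symbol is a pair (attribute, subscript), with subscript 0 standing for the unsubscripted letter; that encoding and the name `chase_lossless` are assumptions made for illustration.

```python
def chase_lossless(schema, decomposition, fds):
    """Chase test: True if the decomposition has a lossless join under fds."""
    schema = sorted(schema)
    # One row per decomposed relation: unsubscripted (a, 0) on its own
    # attributes, subscripted (a, i) everywhere else.
    rows = [
        {a: (a, 0) if a in rel else (a, i) for a in schema}
        for i, rel in enumerate(decomposition, start=1)
    ]
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            for r in rows:
                for s in rows:
                    if all(r[a] == s[a] for a in lhs):
                        for a in rhs:
                            if r[a] != s[a]:
                                # Prefer the unsubscripted (smaller) symbol
                                keep, drop = min(r[a], s[a]), max(r[a], s[a])
                                for row in rows:
                                    if row[a] == drop:
                                        row[a] = keep
                                changed = True
    # Lossless exactly when some row has become fully unsubscripted
    return any(all(row[a] == (a, 0) for a in schema) for row in rows)

parts = [{"A", "B"}, {"B", "C"}, {"C", "D"}]
print(chase_lossless("ABCD", parts, [({"C"}, {"D"}), ({"B"}, {"A"})]))
print(chase_lossless("ABCD", parts, [({"C"}, {"D"})]))
```

With both FD’s the second row becomes abcd and the test succeeds; with only C -> D no row is ever fully unsubscripted, so the join is lossy.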

58 3NF Synthesis Algorithm
- We can always construct a decomposition into 3NF relations with a lossless join and dependency preservation.
- Need minimal basis (minimal cover) for the FD’s:
  1. Right sides are single attributes.
  2. No FD can be removed.
  3. No attribute can be removed from a left side.

59 Constructing a Minimal Basis
1. Split right sides.
2. Repeatedly try to remove an FD and see if the remaining FD’s are equivalent to the original.
3. Repeatedly try to remove an attribute from a left side and see if the resulting FD’s are equivalent to the original.

60 3NF Synthesis (cont.)
- One relation for each FD in the minimal basis.
  - Schema is the union of the left and right sides.
- If no key is contained in the decomposed relations, then add one relation whose schema is some key.

61 Example: 3NF Synthesis
- Relation R = ABCD.
- FD’s A -> B and A -> C.
- Decomposition: AB and AC from the FD’s, plus AD for a key.

62 Example: Minimal Cover
- T = {ABH -> CK, A -> D, C -> E, BGH -> F, F -> AD, E -> F, BH -> E}
- Step 1: Make the RHS of each FD into a single attribute.
  - Algorithm: Use the decomposition inference rule for FDs.
  - Example: F -> AD is replaced by F -> A, F -> D; ABH -> CK by ABH -> C, ABH -> K.
- Step 2: Eliminate redundant attributes from the LHS.
  - Algorithm: If the FD XB -> A is in T (where B is a single attribute) and X -> A is entailed by T, then B was unnecessary.
  - Example: Can an attribute be deleted from ABH -> C? Compute (AB)+, (AH)+, (BH)+ with respect to T. Since C is in (BH)+, BH -> C is entailed by T and A is redundant in ABH -> C.

63 Computing Minimal Cover (cont.)
- Step 3: Delete redundant FDs from T.
  - Algorithm: If T - {f} entails f, then f is redundant. If f is X -> A, check whether A is in X+ computed with respect to T - {f}.
  - Example: BGH -> F is entailed by E -> F, BH -> E, so it is redundant.
- Note: The order of steps 2 and 3 cannot be interchanged!
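
The three steps can be run on the example set T. This is a sketch under stated assumptions: attributes are single letters so that a string such as "ABH" can stand for an attribute set, FD’s are (left side, right side) pairs, and the names `closure` and `minimal_cover` are hypothetical. It follows the step order of these two slides (left sides first, then redundant FD’s).

```python
def closure(attrs, fds):
    """Attributes functionally determined by `attrs` under `fds`."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

def minimal_cover(fds):
    # Step 1: split right sides into single attributes.
    split = [(frozenset(lhs), frozenset({a})) for lhs, rhs in fds for a in sorted(rhs)]
    # Step 2: remove left-side attributes B where X -> A is already entailed.
    reduced = []
    for lhs, rhs in split:
        lhs = set(lhs)
        for b in sorted(lhs):
            if len(lhs) > 1 and rhs <= closure(lhs - {b}, split):
                lhs.remove(b)
        if (frozenset(lhs), rhs) not in reduced:
            reduced.append((frozenset(lhs), rhs))
    # Step 3: remove each FD that the remaining FD's entail.
    cover = list(reduced)
    for fd in reduced:
        rest = [f for f in cover if f != fd]
        if fd[1] <= closure(fd[0], rest):
            cover = rest
    return cover

T = [("ABH", "CK"), ("A", "D"), ("C", "E"), ("BGH", "F"),
     ("F", "AD"), ("E", "F"), ("BH", "E")]
for lhs, rhs in minimal_cover(T):
    print("".join(sorted(lhs)), "->", "".join(rhs))
```

A minimal cover is not unique in general; the result depends on the order in which attributes and FD’s are tried. For this input and this order the sketch yields BH -> C, BH -> K, A -> D, C -> E, F -> A, E -> F.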

64 Synthesizing a 3NF Schema
Starting with a schema R = (R, T):
- Step 1: Compute a minimal cover, U, of T. The decomposition is based on U, but since U+ = T+ the same functional dependencies will hold.
  - A minimal cover for T = {ABH -> CK, A -> D, C -> E, BGH -> F, F -> AD, E -> F, BH -> E} is U = {BH -> C, BH -> K, A -> D, C -> E, F -> A, E -> F}.

65 Synthesizing a 3NF Schema (cont.)
- Step 2: Partition U into sets U1, U2, …, Un such that the LHS of all elements of Ui are the same.
  - U1 = {BH -> C, BH -> K}, U2 = {A -> D}, U3 = {C -> E}, U4 = {F -> A}, U5 = {E -> F}

66 Synthesizing a 3NF Schema (cont.)
- Step 3: For each Ui form schema Ri = (Ri, Ui), where Ri is the set of all attributes mentioned in Ui.
  - Each FD of U will be in some Ri. Hence the decomposition is dependency preserving.
  - R1 = (BHCK; BH -> C, BH -> K), R2 = (AD; A -> D), R3 = (CE; C -> E), R4 = (FA; F -> A), R5 = (EF; E -> F)

67 Synthesizing a 3NF Schema (cont.)
- Step 4: If no Ri is a superkey of R, add schema R0 = (R0, {}) where R0 is a key of R.
  - R0 = (BGH, {})
- R0 might be needed when not all attributes are contained in $R_1 \cup R_2 \cup \cdots \cup R_n$.
  - A missing attribute, A, must be part of all keys (since it’s not in any FD of U, deriving a key constraint from U involves the augmentation axiom).
- R0 might be needed even if all attributes are accounted for in $R_1 \cup R_2 \cup \cdots \cup R_n$.
  - Example: (ABCD; {A -> B, C -> D}). Step 3 decomposition: R1 = (AB; {A -> B}), R2 = (CD; {C -> D}). Lossy! Need to add (AC; { }) for losslessness.
- Step 4 guarantees lossless decomposition.
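
Steps 2 through 4 can be sketched as below, starting from a minimal cover that is assumed to be already computed (the cover U from the slides is typed in directly). Attributes are single letters so strings can stand for attribute sets, and the names `closure`, `find_key`, and `synthesize_3nf` are hypothetical. The key is found by shrinking the full attribute set, which returns one key rather than all of them.

```python
def closure(attrs, fds):
    """Attributes functionally determined by `attrs` under `fds`."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

def find_key(schema, fds):
    """Shrink the full attribute set until no attribute can be dropped."""
    key = set(schema)
    for a in sorted(schema):
        if closure(key - {a}, fds) >= set(schema):
            key.discard(a)
    return key

def synthesize_3nf(schema, cover):
    # Steps 2 and 3: group FD's by left side; each group becomes one relation.
    groups = {}
    for lhs, rhs in cover:
        groups.setdefault(frozenset(lhs), set(lhs)).update(rhs)
    relations = [frozenset(attrs) for attrs in groups.values()]
    # Step 4: add a key relation if no relation is already a superkey of R.
    if not any(closure(rel, cover) >= set(schema) for rel in relations):
        relations.append(frozenset(find_key(schema, cover)))
    return relations

U = [("BH", "C"), ("BH", "K"), ("A", "D"), ("C", "E"), ("F", "A"), ("E", "F")]
for rel in synthesize_3nf("ABCDEFGHK", U):
    print("".join(sorted(rel)))
```

For the running example this produces BCHK, AD, CE, AF, EF and the extra key relation BGH. For (ABCD; {A -> B, C -> D}) it produces AB, CD and the added relation AC.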

68 BCNF Design Strategy (Spring 2006, CENG 553 Database Management Systems)
- The resulting decomposition, R0, R1, …, Rn, is:
  - Dependency preserving (since every FD in U is a FD of some schema)
  - Lossless (although this is not obvious)
  - In 3NF (although this is not obvious)
- Strategy for decomposing a relation:
  - Use 3NF decomposition first to get a lossless, dependency-preserving decomposition.
  - If any resulting schema is not in BCNF, split it using the BCNF algorithm (but this may yield a non-dependency-preserving result).

69 Multivalued Dependencies
- Fourth Normal Form
- Reasoning About FD’s + MVD’s

70 Definition of MVD
- A multivalued dependency (MVD) on R, X ->-> Y, says that if two tuples of R agree on all the attributes of X, then their components in Y may be swapped, and the result will be two tuples that are also in the relation.
- I.e., for each value of X, the values of Y are independent of the values of R - X - Y.

71 Example: MVD
Drinkers(name, addr, phones, beersLiked)
- A drinker’s phones are independent of the beers they like.
  - name ->-> phones and name ->-> beersLiked.
- Thus, each of a drinker’s phones appears with each of the beers they like in all combinations.
- This repetition is unlike FD redundancy.
  - name -> addr is the only FD.

72 Tuples Implied by name ->-> phones

If we have tuples:

name  addr  phones  beersLiked
sue   a     p1      b1
sue   a     p2      b2

Then these tuples must also be in the relation:

name  addr  phones  beersLiked
sue   a     p2      b1
sue   a     p1      b2

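73 Picture of MVD X ->-> Y
(Figure: two tuples with columns X, Y, and others. The tuples are equal on X; exchanging their Y components gives tuples that are also in the relation.)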

74 MVD Rules
- Every FD is an MVD (promotion).
  - If X -> Y, then swapping Y’s between two tuples that agree on X doesn’t change the tuples.
  - Therefore, the “new” tuples are surely in the relation, and we know X ->-> Y.
- Complementation: If X ->-> Y, and Z is all the other attributes, then X ->-> Z.

75 Splitting Doesn’t Hold
- Like FD’s, we cannot generally split the left side of an MVD.
- But unlike FD’s, we cannot split the right side either; sometimes you have to leave several attributes on the right side.

76 Example: Multiattribute Right Sides
Drinkers(name, areaCode, phone, beersLiked, manf)
- A drinker can have several phones, with the number divided between areaCode and phone (last 7 digits).
- A drinker can like several beers, each with its own manufacturer.

77 Example Continued
- Since the areaCode-phone combinations for a drinker are independent of the beersLiked-manf combinations, we expect that the following MVD’s hold:
  - name ->-> areaCode phone
  - name ->-> beersLiked manf

78 Example Data

Here is possible data satisfying these MVD’s:

name  areaCode  phone     beersLiked  manf
Sue   650       555-1111  Bud         A.B.
Sue   650       555-1111  WickedAle   Pete’s
Sue   415       555-9999  Bud         A.B.
Sue   415       555-9999  WickedAle   Pete’s

But we cannot swap area codes or phones by themselves. That is, neither name ->-> areaCode nor name ->-> phone holds for this relation.
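
The swap definition of an MVD can be tested directly on this data. The sketch below checks one relation instance only; it says nothing about whether the MVD holds for the schema in general. Rows are modelled as dictionaries, and the name `satisfies_mvd` is hypothetical.

```python
def satisfies_mvd(rows, x, y):
    """True if this instance satisfies X ->-> Y under the swap definition."""
    present = {tuple(sorted(row.items())) for row in rows}
    for t in rows:
        for u in rows:
            if all(t[a] == u[a] for a in x):
                # t with its Y-components replaced by u's must also be present
                swapped = dict(t)
                swapped.update({a: u[a] for a in y})
                if tuple(sorted(swapped.items())) not in present:
                    return False
    return True

columns = ["name", "areaCode", "phone", "beersLiked", "manf"]
data = [
    ("Sue", "650", "555-1111", "Bud", "A.B."),
    ("Sue", "650", "555-1111", "WickedAle", "Pete's"),
    ("Sue", "415", "555-9999", "Bud", "A.B."),
    ("Sue", "415", "555-9999", "WickedAle", "Pete's"),
]
rows = [dict(zip(columns, values)) for values in data]

print(satisfies_mvd(rows, {"name"}, {"areaCode", "phone"}))
print(satisfies_mvd(rows, {"name"}, {"areaCode"}))
```

Swapping areaCode alone would pair 415 with 555-1111, a tuple that is not in the relation, which is why the second check fails.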

79 Fourth Normal Form
- The redundancy that comes from MVD’s is not removable by putting the database schema in BCNF.
- There is a stronger normal form, called 4NF, that (intuitively) treats MVD’s as FD’s when it comes to decomposition, but not when determining keys of the relation.

80 4NF Definition
- A relation R is in 4NF if: whenever X ->-> Y is a nontrivial MVD, then X is a superkey.
  - Nontrivial MVD means that:
    1. Y is not a subset of X, and
    2. X and Y are not, together, all the attributes.
  - Note that the definition of “superkey” still depends on FD’s only.

81 BCNF Versus 4NF
- Remember that every FD X -> Y is also an MVD, X ->-> Y.
- Thus, if R is in 4NF, it is certainly in BCNF.
  - Because any BCNF violation is a 4NF violation (after conversion to an MVD).
- But R could be in BCNF and not 4NF, because MVD’s are “invisible” to BCNF.

82 Decomposition and 4NF
- If X ->-> Y is a 4NF violation for relation R, we can decompose R using the same technique as for BCNF.
  1. XY is one of the decomposed relations.
  2. The other consists of all attributes except those in Y - X.

83 Example: 4NF Decomposition
Drinkers(name, addr, phones, beersLiked)
FD: name -> addr
MVD’s: name ->-> phones, name ->-> beersLiked
- Key is {name, phones, beersLiked}.
- All dependencies violate 4NF.

84 Example Continued
- Decompose using name -> addr:
  1. Drinkers1(name, addr)
     - In 4NF; only dependency is name -> addr.
  2. Drinkers2(name, phones, beersLiked)
     - Not in 4NF. MVD’s name ->-> phones and name ->-> beersLiked apply. No FD’s, so all three attributes form the key.

85 Example: Decompose Drinkers2
- Either MVD name ->-> phones or name ->-> beersLiked tells us to decompose to:
  - Drinkers3(name, phones)
  - Drinkers4(name, beersLiked)

86 Reasoning About MVD’s + FD’s
- Problem: given a set of MVD’s and/or FD’s that hold for a relation R, does a certain FD or MVD also hold in R?
- Solution: Use a tableau to explore all inferences from the given set, to see if you can prove the target dependency.

87 Example: Chasing a Tableau With MVD’s and FD’s
- To apply a FD, equate symbols, as before.
- To apply an MVD, generate one or both of the tuples we know must also be in the relation represented by the tableau.
- We’ll prove: if A ->-> BC and D -> C, then A -> C.

88 The Tableau for A -> C

Goal: prove that c1 = c2.

A  B   C   D
a  b1  c1  d1
a  b2  c2  d2

- Use A ->-> BC (first row’s D with second row’s BC) to add the row a b2 c2 d1.
- Use D -> C (first and third rows agree on D, therefore agree on C): c1 = c2.

89 Example: Transitive Law for MVD’s
- If A ->-> B and B ->-> C, then A ->-> C.
  - Obvious from the complementation rule if the schema is ABC.
  - But it holds no matter what the schema; we’ll assume ABCD.

90 The Tableau for A ->-> C

Goal: derive tuple (a, b1, c2, d1).

A  B   C   D
a  b1  c1  d1
a  b2  c2  d2

- Use A ->-> B to swap B from the first row into the second, adding the row a b1 c2 d2.
- Use B ->-> C to swap C from the third row into the first, adding the row a b1 c2 d1.

91 Rules for Inferring MVD’s + FD’s
- Start with a tableau of two rows.
  - These rows agree on the attributes of the left side of the dependency to be inferred.
  - And they disagree on all other attributes.
  - Use unsubscripted variables where they agree, subscripts where they disagree.

92 Inference: Applying a FD
- Apply a FD X -> Y by finding rows that agree on all attributes of X. Force the rows to agree on all attributes of Y.
  - Replace one variable by the other.
  - If the replaced variable is part of the goal tuple, replace it there too.

93 Inference: Applying a MVD
- Apply a MVD X ->-> Y by finding two rows that agree in X.
  - Add to the tableau one or both rows that are formed by swapping the Y-components of these two rows.

94 Inference: Goals
- To test whether U -> V holds, we succeed by inferring that the two variables in each column of V are actually the same.
- If we are testing U ->-> V, we succeed if we infer in the tableau a row that is the original two rows with the components of V swapped.

95 Inference: Endgame
- Apply all the given FD’s and MVD’s until we cannot change the tableau.
- If we meet the goal, then the dependency is inferred.
- If not, then the final tableau is a counterexample relation.
  - Satisfies all given dependencies.
  - Original two rows violate target dependency.
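
The rules on the last few slides combine into one procedure. The following is a sketch rather than a reference implementation: symbols are encoded as (attribute, subscript) pairs with 0 for the unsubscripted variable, rows are tuples held in a set, and the name `chase_implies` and its parameters are hypothetical.

```python
def chase_implies(schema, fds, mvds, lhs, rhs, target_is_mvd):
    """True if fds and mvds imply lhs -> rhs (or lhs ->-> rhs if target_is_mvd)."""
    schema = sorted(schema)
    col = {a: i for i, a in enumerate(schema)}
    # Two rows that agree (subscript 0) on lhs and disagree (1 vs 2) elsewhere.
    row1 = tuple((a, 0 if a in lhs else 1) for a in schema)
    row2 = tuple((a, 0 if a in lhs else 2) for a in schema)
    rows = {row1, row2}

    def agree(r, s, attrs):
        return all(r[col[a]] == s[col[a]] for a in attrs)

    def find_unequal():
        """Two different symbols that some FD forces to be equal, if any."""
        for x, y in fds:
            for r in rows:
                for s in rows:
                    if agree(r, s, x):
                        for a in y:
                            if r[col[a]] != s[col[a]]:
                                return r[col[a]], s[col[a]]
        return None

    while True:
        pair = find_unequal()
        if pair:                                  # apply a FD: equate symbols
            keep, drop = min(pair), max(pair)
            rename = lambda row: tuple(keep if v == drop else v for v in row)
            rows = {rename(r) for r in rows}
            row1, row2 = rename(row1), rename(row2)
            continue
        # Apply every MVD: r with s's Y-components, for rows agreeing on X.
        new_rows = {
            tuple(s[col[a]] if a in y else r[col[a]] for a in schema)
            for x, y in mvds for r in rows for s in rows if agree(r, s, x)
        } - rows
        if not new_rows:
            break                                 # tableau can no longer change
        rows |= new_rows

    if target_is_mvd:                             # goal: row1 with row2's V swapped in
        goal = tuple(row2[col[a]] if a in rhs else row1[col[a]] for a in schema)
        return goal in rows
    return agree(row1, row2, rhs)                 # goal: the V columns are equal

print(chase_implies("ABCD", [({"D"}, {"C"})], [({"A"}, {"B", "C"})], {"A"}, {"C"}, False))
print(chase_implies("ABCD", [], [({"A"}, {"B"}), ({"B"}, {"C"})], {"A"}, {"C"}, True))
```

The two calls replay the tableaus for A -> C and A ->-> C. The loop terminates because there are finitely many symbols, so only finitely many distinct rows can ever be added.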

96 Normalization Drawbacks
- By limiting redundancy, normalization helps maintain consistency and saves space.
- But performance of querying can suffer because related information that was stored in a single relation is now distributed among several.
- Example: A join of Student and Transcript is required to get the names and grades of all students taking CS305 in S2002.

    SELECT S.Name, T.Grade
    FROM Student S, Transcript T
    WHERE S.Id = T.StudId AND T.CrsCode = 'CS305' AND T.Semester = 'S2002'

97 Denormalization
- Tradeoff: Judiciously introduce redundancy to improve performance of certain queries.
- Example: Add attribute Name to Transcript, giving Transcript'.
  - Join is avoided.
  - If queries are asked more frequently than Transcript is modified, added redundancy might improve average performance.
  - But Transcript' is no longer in BCNF, since the key is (StudId, CrsCode, Semester) and StudId -> Name.

    SELECT T.Name, T.Grade
    FROM Transcript' T
    WHERE T.CrsCode = 'CS305' AND T.Semester = 'S2002'

98 Exercises
1. A table ABC has attributes A, B, C and a functional dependency A -> BC. Write an SQL assertion that prevents a violation of this functional dependency.

    CREATE ASSERTION FD CHECK (
      1 >= ALL (
        SELECT COUNT(*)
        FROM (SELECT DISTINCT A, B, C FROM ABC) AS D
        GROUP BY A
      )
    )

99 Exercises (cont.)
2. Assume we have a relation schema R = (player, salary, team, city). An example relation instance:

player       salary      team     city
Jeter        15,600,000  Yankees  New York
Garciaparra  10,500,000  Red Sox  Boston

We expect the following functional dependencies to hold: player → salary, player → team, team → city.
- Argue that R is currently not in BCNF.
- Decompose R into BCNF. Argue that the decomposition is lossless-join, and that it preserves dependencies.
- Find an alternative decomposition of R into BCNF which is still lossless-join, but which does not preserve dependencies. (State which dependency it does not preserve.)
- Show, by means of an example, that a decomposition into (player, salary, city) and (team, city) is not lossless-join.

