Presentation is loading. Please wait.

Presentation is loading. Please wait.

Dr. A.I. Cristea CS 319: Theory of Databases.

Similar presentations


Presentation on theme: "Dr. A.I. Cristea CS 319: Theory of Databases."— Presentation transcript:

1 Dr. A.I. Cristea CS 319: Theory of Databases

2 2 … previous Armstrong axioms

3 3 Content 1.Generalities DB 2.Integrity constraints (FD revisited) 3.LLJ, DP and applications 4.Relational Algebra (revisited) 5.Query optimisation 6.Temporal Data 7.The Askew Wall 8.Tuple calculus 9.Domain calculus 10.Query equivalence

4 4 Lossless Join Decomposition Lossless Join Definition: –Let { R 1, R 2 } be a decomposition of R (meaning that R 1 R 2 = R); the decomposition is lossless if for every legal instance r of R: r = R1 (r) R2 (r) What is wrong with the following decomposition? –R = {A,B,C} and F = { A B, C B} and we replace R by { R 1, R 2 } where R 1 = {A,B} and R 2 = {C,B}.

5 5 Sufficient Condition for Lossless Join Lossless Join means: –Let { R 1, R 2 } be a decomposition of R (meaning that R 1 R 2 = R); Prove that for all legal instances r: r R1 (r) R2 (r) Prove that this decomposition is lossless if R 1 R 2 R 1 or R 1 R 2 R 2 –Can you give an example of a lossless join decomposition (instance) when neither R 1 R 2 R 1 nor R 1 R 2 R 2 hold?

6 6 Boyce-Codd Normal Form (BCNF) A relation scheme R is in BCNF if (and only if) for every non-trivial fd X Y F +, X is a superkey (for R). A database scheme D = {R 1,..., R n } is in BCNF if (and only if) i {1,...,n}: R i is in BCNF. Let R = {A,B,C} and F = { A B, C B} and let us decompose R into by { R 1, R 2 } where R 1 = {A,B} and R 2 = {C,B}. Is this decomposition in BCNF? Is this the best decomposition in BCNF? (Can you find a better one?)

7 7 BCNF Decomposition Algorithm result := {R}; done := false; compute F + ; while (not done) do if (there is a schema R i in result that is not in BCNF) then begin let α β be a nontrivial functional dependency that holds on R i such that α R i is not in F +, and α β= ; result := (result – R i ) (R i – β) (α, β); end else done:= true;

8 8 Dependencies in a decomposition Which dependencies hold in R 1 and R 2 ? –R = {A,B,C} and F = { A B, B C} and we replace R by { R 1, R 2 } where R 1 = {A,B} and R 2 = {B,C}. –R = {A,B,C} and F = { A B, C B} and we replace R by { R 1, R 2 } where R 1 = {A,B} and R 2 = {A,C}. –R = {A,B,C} and F = { A B, B C} and we replace R by { R 1, R 2 } where R 1 = {A,B} and R 2 = {A,C}

9 9 Third Normal Form (3NF) Third Normal Form –Informal Presentation –Example and Discussion –Formal Definition 3NF Decomposition Algorithm –Principle and Properties Lossless-join, dependency-preserving decomposition into 3NF –Proof of Correctness –Example of 3NF Decomposition Third Normal Form and Boyce-Codd Normal Form

10 10 Informal Presentation Motivation –There are some situations where BCNF decomposition is not dependency preserving, and Efficient checking for FD violation on updates is important –Solution Define a weaker normal form, called Third Normal Form –FDs can be checked on individual relations without computing a join –There is always a lossless-join, dependency-preserving decomposition into 3NF

11 11 Informal Presentation Motivation –Sometimes a relational schema and its FDs are not in BCNF but one does not want to decompose it further –Example: Relation Bookings with attributes: –title, the name of the performance –theater, the name of the theater where the performance is being shown –city, where the theater is located FDs are: theater city, title city theater Is there a BCNF violation? Bookings(title, theater, city) theater city title city theater Yes: (theater city) because theater is not a superkey Note: keys here are: (title, city) and (theater, title) BCNF Decomposition: Bookings1(theater, city) Bookings2(title, theater)

12 12 Informal Presentation Motivation –Decomposition to get to BCNF may not always be desirable BCNF decomposition is not dependency preserving, and Efficient checking for FD violation on updates is important –3NF relaxes BCNF to allow relations that cannot be decomposed into BCNF relations without losing ability to check each FD Informal Definition of 3NF –A relation R is in third normal form if: Whenever A B is a nontrivial FD: either A is a superkey or B is a member of some candidate key Bookings(title, theater, city) theater city title city theater As for BCNF

13 13 Informal Presentation Informal Definition of 3NF –A relation R is in third normal form if: The difference between BCNF and 3NF: –B is a member of some candidate key –Previous example schema is in 3NF Candidate keys here are: (title, city) and (theater, title) Theater is not a superkey but city is a member of a candidate key What is the problem with this schema? Whenever A B is a nontrivial FD: either A is a superkey or B is a member of some candidate key Bookings(title, theater, city) theater city title city theater

14 14 Informal Presentation Informal Definition of 3NF –Previous example schema is in 3NF What is the problem with this schema? –The schema contains redundant information Bookings(title, theater, city) theater city title city theater LondonImperialBeethovens 5 th Symphony LondonNew TheaterCats LondonImperialPhantom of the Opera New YorkBroadwayCats CityTheaterTitle

15 15 Formal Definition 3NF Definition –A relation schema R is in third normal form (3NF) if for all functional dependencies in F + of the form, where R and R, at least one of the following holds: is a trivial functional dependency ( ) contains a key for R –every B is part of some candidate key of R BCNF and 3NF –A BCNF relation is in 3NF –A 3NF relation is not necessary in BCNF BCNF Conditions

16 16 Formal Presentation Example –Consider the two relational schemas R1 = (cust-num, name, house-num, street, city, state) cust-num name, house-num, street, city, state R2 = (house-num, street, city, state, zip) house-num, street, city, state zip zip state –Are these relations in 3NF?

17 17 Formal Presentation Example in 3NF? –For R1 The only nontrivial functional dependencies in F + are those with cust-num as a member of the left-side of the FD As cust-num is a superkey of R1, these functional dependencies satisfy the second condition for 3NF R1= (cust-num, name, house-num, street, city, state) cust-num name, house-num, street, city, state Three conditions for 3NF: is a trivial functional dependency ( ) contains a key for R Every B is part of some candidate key of R

18 18 Formal Presentation Example in 3NF? –For R2 There are two kinds of nontrivial functional dependencies in F+: –Those with (house-num, street, city, state) as a subset of the left hand side of the FD: As (house-num, street, city, state) is a superkey for R2, these functional dependencies satisfy the second condition for 3NF –Those of the form {zip} {state} where For any such functional dependency: ( {state}) – ( {zip}) = {state} (or = ) Because state is part of a candidate key of R2, such functional dependencies satisfy the third condition for 3NF R2 = (house-num, street, city, state, zip) house-num, street, city, state zip zip state Three conditions for 3NF: is a trivial functional dependency contains a key for R Every B is part of some candidate key of R

19 19 Decomposition into 3NF Principles –Input/Output Input –A set of functional dependencies F –A relation schema R Output –A lossless-join, dependency-preserving decomposition in 3NF –Canonical Cover The set of dependencies Fc in the algorithm is a canonical cover of the functional dependencies

20 20 Fc definition a canonical cover Fc for F is a set of dependencies Fc for which: Fc F no fd in Fc is superfluous no fd in Fc contains extraneous attrs each left side of fd in Fc is unique

21 21 Extraneous attribute A in αβ in R A α; F => F – {αβ} α-A)β A β; F – {αβ} α(β -A) => F Computed via attribute closures

22 22 Fc computation algorithm Fc = F Repeat apply union rule (right side of fd) find fd with extraneous attrs (left/right side) & delete these Until Fc doesnt change

23 23 Decomposition into 3NF Principles –The algorithm takes a set of dependencies and adds one schema at a time, instead of decomposing the initial schema repeatedly –The result is not uniquely defined since A set of functional dependencies can have more than one canonical cover In some cases, the result of the algorithm depends on the order which it considers the dependencies in Fc (minor bug in the algorithm, see later)

24 24 Decomposition into 3NF Decomposition –Given: relation R, set F of functional dependencies –Find: decomposition of R into a set of 3NF relation Ri –Algorithm (sketch, real algorithm on next slides): –Decomposition produces a lossless join and preserves dependencies –Prove ! (1)Eliminate redundant fd, resulting in a canonical cover Fc of F (2)Create a relation Ri = XY for each FD X Y in Fc (3)If the key K of R does not occur in any relation Ri, create one more relation Ri=K

25 25 Let Fc be the canonical cover of F; j = 0; for each dependency α β in Fc if none of schemes in Ri (i=1, 2, …, j) contains αβ then j = j+1; Rj = αβ; end-if if any of the schemes in Ri (i=1, 2, …, j-1) is contained in Rj remove Ri end-if end-for if none of the schemes Ri (i=1, 2, …, j) contains a candidate key for R then j = j + 1; Rj = any candidate key for R; end-if return (R1, R2, …, Rj) Decomposition Algorithm into 3NF

26 26 Decomposition into 3NF Example –Semester database of a university –Relational schema R=(L, I, T, R, S, G) –Attributes L: LectureR: Room I: InstructorS: Student G: GradeT: Time –Functional Dependencies L I, TR L, TI R, LS G, TS R, TRI LR R=(L, I, T, R, S, G) L I, TR L, TI R, LS G, TS R, TRI LR

27 27 Decomposition into 3NF Example –R=(L, I, T, R, S, G) –F: {L I, TR L, TI R, LS G, TS R, TRI LR} –Are all FDs necessary? No ! TR L, TI R then TRI LR –Canonical cover of F Fc= {L I, TR L, TI R, TS R, LS G} –Key: (ST) –Key attributes: S, T (1)Eliminate redundant FD, resulting in a canonical cover Fc of F

28 28 Decomposition into 3NF Example –R = (L, I, T, R, S, G) –Fc = {L I, TR L, TI R, TS R, LS G} –Key attributes: S, T –Decomposition in 3NF R1 = (L, I)R2 = (T, R, L) R3 = (T, I, R)R4 = (L, S, G) R5 = (S, T, R) (3) If the key K of R does not occur in any relation Ri, create one more relation Ri=K, but it does. (2) Create a relation Ri = XA for each FD X A in Fc

29 29 Decomposition into 3NF 3NF Decomposition Algorithm –Proof of Correctness 3NF decomposition algorithm is lossless join, dependency preserving decomposition into 3NF 1.Dependency preserving 2.Lossless join 3.3NF

30 30 Proof: Decomposition into 3NF is dependency preserving 3NF Decomposition Algorithm –Decomposition is dependency preserving 3NF decomposition algorithm is dependency preserving since there is a relation for every FD in Fc.

31 31 Proof: Decomposition into 3NF is a lossless join 3NF Decomposition Algorithm –Decomposition is lossless join Lossless join decomposition –A decomposition {R1, R2} is a lossless-join decomposition if R1 R2 R1 or R1 R2 R2 Idea: –A candidate key (K) is in one of the relations R i in decomposition (last step of algorithm guarantees this) –Closure of candidate key under Fc must contain all attributes in R (definition of candidate key) –Follow the steps of attribute closure algorithm (Fig. 7.9) to show that the sufficient lossless join condition is satisfied for K +.

32 32 Proof: Decomposition into 3NF is actually 3NF! 3NF Decomposition Algorithm –Decomposition into 3NF Claim –If a relation Ri is in the decomposition generated by the synthesis algorithm, then Ri is in 3NF Idea –To test for 3NF, it is sufficient to consider the functional dependencies whose right-hand side is a single attribute –Therefore to see that Ri is in 3NF, we must show that any functional dependency that holds in Ri, satisfies the definition of 3NF

33 33 Proof: Decomposition into 3NF is actually 3NF! 3NF Decomposition Algorithm –Decomposition into 3NF Demonstration –Assume is the dependency that generated Ri in the algorithm –B must be in or, since B is in Ri and generated Ri –Let us consider two possible cases »B is in but not

34 34 Proof: 3NF Decomposition is 3NF! 3NF Decomposition Algorithm –Decomposition into 3NF Demonstration –B is in but not in : must be superkey (why?) »The second condition of 3NF is satisfied –B is in but not in is a candidate key »The third alternative in the definition of 3NF is satisfied »Note: we cannot show that is a superkey. This shows exactly why the third alternative is present in the definition of 3NF Three conditions for 3NF: is a trivial functional dependency contains a key for R Every B is part of some candidate key of R

35 35 Decomposition into 3NF B is in Assume is not a superkey must contain some attribute not in 1.Since B is in F + it must be derivable from F c, by using attribute closure on 2.Attribute closure cannot have used if it had been used, must be contained in the attribute closure of, which is not possible, since we assumed is not a superkey 3.Now, using ( - {B}) and B, we can derive B (since, and since B is non-trivial) 4.Then, B is extraneous in the right-hand side of ; which is not possible since is in F c (contradiction!) 5.Thus, if B is in then must be a superkey : FD in R : FD that was used to generated Ri

36 36 Comparison of BCNF and 3NF BCNF or 3NF? –Relations in BCNF and 3NF Relations in BCNF: no repetition of information Relations in 3NF: problem of repetition of information –Decomposition in BCNF and in 3NF It is always possible to decompose a relation into relations in 3NF and –the decomposition is lossless –dependencies are preserved It is always possible to decompose a relation into relations in BCNF and –the decomposition is lossless –the information is not repeated

37 37 Compare BCNF and 3NF To summarize –Design Goals Goal for a relational database design is: –BCNF (no redundant information) –Lossless join –Dependency preservation If we cannot achieve this, we accept: –3NF (possible repetition of information) –Lossless join –Dependency preservation

38 38 Summary We have learned: –LLJ –DP –BCNF + algorithm –3 rd NF + algorithm

39 39 … to follow Relational Algebra, revisited

40 40 Questions?


Download ppt "Dr. A.I. Cristea CS 319: Theory of Databases."

Similar presentations


Ads by Google