Relational Data Base Design in Practice


1 Relational Data Base Design in Practice
Normal Forms; More about Anomalies; Algorithms for Database Design. 12/30/2018 CS319 Theory of Databases

2 Normal Forms for Relational Schemes 0
Database design is essentially an informal activity. It relies upon observation and analysis of external state. Some patterns of dependency between attributes recur. Normal forms (NFs) for relation schemes encode rules for design that have been developed from experience. They provide a basis for database “design patterns”.

3 Normal Forms for Relational Schemes 1
Boyce-Codd Normal Form (BCNF)
Use <R,F> to denote the relation scheme R(A1, A2, ..., An) with the set of dependencies generated by F.
Definition: <R,F> is in BCNF if, for all sets of attributes X ∪ {A} ⊆ {A1, A2, ..., An}: X → A holds in R and A ∉ X ⇒ X contains a key for R.
Terminology: if X contains a key for R, call X a superkey for R.

4 Normal Forms for Relational Schemes 2
Examples relating to Boyce-Codd Normal Form:
R is SCAIP(S, C, A, I, P), where S is SNAME, C is CITY, A is AGENT, I is ITEM, P is PRICE.
F is { S → C, C → A, SI → P }.
R is not in BCNF, since C → A but C is not a superkey.
SCAIP = SIP ⋈ SC ⋈ CA is a decomposition that is lossless and dependency preserving, and all of its sub-schemes are in BCNF.
This is the desirable kind of decomposition, but it can't always be achieved, so weaker NFs than BCNF are needed.
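
The BCNF test on the SCAIP example can be checked mechanically. The following is a minimal Python sketch, assuming single-letter attributes and FDs written as (lhs, rhs) string pairs; the helper names closure and bcnf_violations are illustrative and do not come from the slides. It tries every attribute subset, so it only suits small schemes.

```python
from itertools import combinations


def closure(attrs, fds):
    """Return the set of attributes determined by attrs under fds."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def bcnf_violations(scheme, fds):
    """List pairs (X, A): X -> A holds on scheme, A not in X, X not a superkey."""
    scheme = set(scheme)
    found = []
    for size in range(1, len(scheme)):
        for combo in combinations(sorted(scheme), size):
            x = set(combo)
            # Restricting the closure to the scheme gives the projected FDs.
            determined = closure(x, fds) & scheme
            if determined != scheme:  # X is not a superkey of the scheme
                for a in sorted(determined - x):
                    found.append(("".join(combo), a))
    return found


F = [("S", "C"), ("C", "A"), ("SI", "P")]
print(bcnf_violations("SCAIP", F))                           # includes ('C', 'A')
print([bcnf_violations(s, F) for s in ("SIP", "SC", "CA")])  # three empty lists
```

An empty list for a sub-scheme means no violating dependency was found, which is the BCNF condition stated in the definition.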

5 Normal Forms for Relational Schemes 3
Third Normal Form (3NF)
Relation scheme <R ≡ R(A1, A2, ..., An), F> is in 3NF if, for all sets of attributes X ∪ {A} ⊆ {A1, A2, ..., An}: X → A holds in R and A ∉ X ⇒ either X is a superkey for R or A is an attribute contained in a key for R.
Definition: if A is contained in a key for R, then A is a prime attribute for R.

6 Normal Forms for Relational Schemes 4
Example of Third Normal Form:
Consider the relation scheme STD, where S is local student id, T is tutor, D is department.
Assuming that "students in different departments can have same id", we have the dependencies F = { T → D, SD → T }.
STD is not in BCNF: T → D, but T is not a superkey.
STD is in 3NF: D is a prime attribute, contained in the key SD.
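
The STD example can be verified with a brute-force 3NF test. This is a sketch under the same assumptions as the definition above (single-letter attributes, FDs as (lhs, rhs) string pairs); candidate_keys and is_3nf are hypothetical helper names, not part of the course material.

```python
from itertools import combinations


def closure(attrs, fds):
    """Return the set of attributes determined by attrs under fds."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def candidate_keys(scheme, fds):
    """Enumerate all minimal attribute sets whose closure covers the scheme."""
    scheme = set(scheme)
    keys = []
    for size in range(1, len(scheme) + 1):
        for combo in combinations(sorted(scheme), size):
            x = set(combo)
            if any(k <= x for k in keys):
                continue  # contains a smaller key, so it is not minimal
            if closure(x, fds) >= scheme:
                keys.append(x)
    return keys


def is_3nf(scheme, fds):
    """Check: X -> A with A not in X implies X is a superkey or A is prime."""
    scheme = set(scheme)
    prime = set().union(*candidate_keys(scheme, fds))
    for size in range(1, len(scheme)):
        for combo in combinations(sorted(scheme), size):
            x = set(combo)
            determined = closure(x, fds) & scheme
            if determined != scheme and not (determined - x) <= prime:
                return False
    return True


F_STD = [("T", "D"), ("SD", "T")]
print(candidate_keys("STD", F_STD))
print(is_3nf("STD", F_STD))
```

For STD the keys found are SD and ST, so every attribute is prime and the violating dependency T → D is tolerated by 3NF even though it breaks BCNF.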

7 Desirable properties of decompositions review 1
Lossless decompositions
A decomposition of the relation scheme R into subschemes R1, R2, ..., Rn is lossless if, given tuples r1, r2, ..., rn in R1, R2, ..., Rn respectively, such that ri and rj agree on all common attributes for all pairs of indices (i,j), the uniquely defined tuple derived by joining r1, r2, ..., rn is in R.
Terminology: "lossless join" decomposition.

8 Desirable properties of decompositions review 2
Dependency preserving decompositions
A decomposition of the relation scheme R into subschemes R1, R2, ..., Rn is dependency preserving if all the FDs within R can be derived from those within the relations R1, R2, ..., Rn.
If F is the set of dependencies defined on R, then the requirement is that the set G of dependencies that can be obtained as projections of dependencies in F+ onto R1, R2, ..., Rn together generate F+.
Note carefully that it is not enough to check whether projections of dependencies in F onto R1, R2, ..., Rn together generate F+.

9 Minimal cover review Representing the set of dependencies (cont.)
Definition: G is a minimal cover for F if
a) G+ = F+
b) every RHS in G is a single attribute
c) every LHS is minimal subject to determining its RHS, i.e. for no X → Y ∈ G is there a proper subset Z of X such that (G \ { X → Y }) ∪ { Z → Y } generates F+
d) no proper subset of G also generates F+.
A minimal cover isn't necessarily unique.
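
Conditions (b) to (d) suggest a three-pass procedure for computing one minimal cover. The sketch below assumes single-letter attributes and FDs given as (lhs, rhs) string pairs; minimal_cover is an illustrative name, and the sort order used is only a device to make the result deterministic, since other orders may yield a different, equally valid cover.

```python
def closure(attrs, fds):
    """Return the set of attributes determined by attrs under fds."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def minimal_cover(fds):
    """Return one minimal cover as a list of (frozenset lhs, attribute) pairs."""
    # (b) split every right-hand side into single attributes
    g = sorted({(frozenset(lhs), a) for lhs, rhs in fds for a in rhs},
               key=lambda fd: (sorted(fd[0]), fd[1]))
    # (c) drop left-hand-side attributes that are not needed
    reduced = []
    for lhs, a in g:
        lhs = set(lhs)
        for attr in sorted(lhs):
            if len(lhs) > 1 and a in closure(lhs - {attr}, g):
                lhs.discard(attr)
        reduced.append((frozenset(lhs), a))
    # (d) drop dependencies implied by the remaining ones
    result = list(dict.fromkeys(reduced))  # remove duplicates, keep order
    for fd in list(result):
        rest = [other for other in result if other != fd]
        if fd[1] in closure(fd[0], rest):
            result = rest
    return result


cover = minimal_cover([("A", "BC"), ("B", "C"), ("AB", "C")])
print([("".join(sorted(lhs)), a) for lhs, a in cover])
```

On this small input the dependencies A → C and AB → C turn out to be redundant, leaving A → B and B → C.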

10 Lossless Join Decomposition review
Theorem: If ρ = {S, T} is a decomposition of R, and F is the set of FDs for R, then ρ is a lossless join decomposition with respect to F if and only if either T\S ⊆ (S ∩ T)+ or S\T ⊆ (S ∩ T)+.
Corollary to the theorem: If R is a relation scheme, and X → A is a functional dependency in R, where A is an attribute, X is a set of attributes not containing A, and XA is a proper subset of R, then R1 = XA, R2 = R\A is a lossless join decomposition of R.
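
The theorem translates directly into a closure computation. A minimal sketch, assuming single-letter attributes and FDs as (lhs, rhs) string pairs; is_lossless_pair is a hypothetical name chosen for this illustration, and the STD scheme with F = { T → D, SD → T } is reused as test data.

```python
def closure(attrs, fds):
    """Return the set of attributes determined by attrs under fds."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def is_lossless_pair(s, t, fds):
    """Two-scheme test: T\\S or S\\T must lie inside the closure of S ∩ T."""
    s, t = set(s), set(t)
    common_closure = closure(s & t, fds)
    return (t - s) <= common_closure or (s - t) <= common_closure


F_STD = [("T", "D"), ("SD", "T")]
# Corollary with X -> A taken as T -> D: R1 = TD, R2 = STD \ D = ST.
print(is_lossless_pair("TD", "ST", F_STD))
# Splitting on the shared attribute D instead gives no such guarantee.
print(is_lossless_pair("SD", "TD", F_STD))
```

The first split shares T, whose closure TD covers TD\ST = D; the second shares only D, whose closure is D itself.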

11 Some decomposition algorithms 1
Algorithm 1: Decomposing a relation scheme as a lossless join of BCNF subschemes
Suppose R is a relation scheme that is not in BCNF.
Take X → A, where X is not a superkey and A ∉ X.
Decompose R as S ∪ T, where S = AX and T = R\A.
X not a superkey ⇒ S and T are proper subsets of R where S ∩ T = X, and X → A with {A} = S\T, so the decomposition of R as S ∪ T is a lossless join.
∴ R = lossless join of arbitrarily small subschemes.
Every scheme with at most 2 attributes is in BCNF ...

12 Some decomposition algorithms 2
Algorithm 1: Decomposing a relation scheme as a lossless join of BCNF subschemes (cont.)
… R = lossless join of arbitrarily small subschemes.
Every scheme with at most 2 attributes is in BCNF ...
Theorem 1: Every relation scheme can be expressed as a lossless join of BCNF relation schemes.
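
Algorithm 1 can be sketched as a short recursion. The code below is an illustration rather than the course's own implementation: it assumes single-letter attributes and FDs as (lhs, rhs) string pairs, the names find_violation and bcnf_decompose are invented here, and the violation is found by exhaustive search over attribute subsets, which is only feasible for small schemes. It is run on the earlier SCAIP example.

```python
from itertools import combinations


def closure(attrs, fds):
    """Return the set of attributes determined by attrs under fds."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def find_violation(scheme, fds):
    """Return (X, A) with X -> A on scheme, A not in X, X not a superkey."""
    scheme = set(scheme)
    for size in range(1, len(scheme)):
        for combo in combinations(sorted(scheme), size):
            x = set(combo)
            determined = closure(x, fds) & scheme
            if determined != scheme and determined - x:
                return x, min(determined - x)
    return None  # no violation found: the scheme is in BCNF


def bcnf_decompose(scheme, fds):
    """Split on a violating X -> A into S = XA and T = scheme \\ A, recursively."""
    scheme = set(scheme)
    violation = find_violation(scheme, fds)
    if violation is None:
        return [scheme]
    x, a = violation
    return bcnf_decompose(x | {a}, fds) + bcnf_decompose(scheme - {a}, fds)


F = [("S", "C"), ("C", "A"), ("SI", "P")]
print(["".join(sorted(s)) for s in bcnf_decompose("SCAIP", F)])
```

Both parts of each split are proper subsets of the scheme being split, so the recursion terminates; the result depends on which violation is picked first.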

13 Some decomposition algorithms 3
Example: A lossless decomposition into BCNF
Consider the relation scheme CTHRSG, where C=course, T=teacher, H=hour, R=room, S=student, G=grade, subject to the dependencies
C → T: course determines teacher
HR → C: hour and room determine the course
HT → R: hour and teacher determine the room
CS → G: course and student determine the grade
HS → R: hour and student determine the room

14 Some decomposition algorithms 4
Example: A lossless decomposition into BCNF (cont.)
Consider the relation scheme CTHRSG with dependencies C → T, HR → C, HT → R, CS → G, HS → R.
Using Algorithm 1, we arrive at a decomposition into the BCNF subschemes CSG, CT, CHR, CHS [NB CH → R].
This decomposition doesn't preserve dependencies: HT → R is not a consequence of the dependencies on the subschemes CSG, CT, CHR and CHS.
In general, a decomposition into BCNF that is both lossless join and dependency preserving need not exist.
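
The loss of HT → R can be confirmed without computing the projected dependency sets explicitly. The sketch below uses a standard iterative test: starting from the left-hand side, it repeatedly adds whatever each subscheme can derive from the attributes gathered so far. It assumes single-letter attributes and FDs as (lhs, rhs) string pairs, and the name preserved is illustrative.

```python
def closure(attrs, fds):
    """Return the set of attributes determined by attrs under fds."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def preserved(lhs, rhs, schemes, fds):
    """Check whether lhs -> rhs follows from the FDs projected onto schemes."""
    z = set(lhs)
    changed = True
    while changed:
        changed = False
        for scheme in schemes:
            scheme = set(scheme)
            # Attributes derivable inside this subscheme from what z already holds.
            gained = (closure(z & scheme, fds) & scheme) - z
            if gained:
                z |= gained
                changed = True
    return set(rhs) <= z


F = [("C", "T"), ("HR", "C"), ("HT", "R"), ("CS", "G"), ("HS", "R")]
schemes = ["CSG", "CT", "CHR", "CHS"]
lost = [(lhs, rhs) for lhs, rhs in F if not preserved(lhs, rhs, schemes, F)]
print(lost)
```

Checking each dependency in F this way is sufficient, because if every member of F follows from the projections then so does all of F+.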

15 Some decomposition algorithms 5
Algorithm 2: Dependency preserving decomposition into 3NF subschemes
Given (R, F) and G a minimal cover for F.
If an attribute is not involved in any dependency in G, it can form a relation scheme by itself.
Suppose R is the set of attributes involved in G.
If a dependency in G involves all the attributes in R, then R is in 3NF; else form relational sub-schemes Y1B1, Y2B2, ..., YnBn for each of the dependencies Y1 → B1, Y2 → B2, ..., Yn → Bn in the minimal cover G.
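
The three cases of Algorithm 2 map onto a few lines of code. This is a sketch that takes the minimal cover as given; it assumes single-letter attributes and FDs as (lhs, rhs) string pairs, and synthesize_3nf is a name made up for the illustration. The test data is the cover A → B, B → C, DE → C, CE → A on ABCDE used later in the slides.

```python
def synthesize_3nf(attributes, cover):
    """Build one subscheme per dependency of a minimal cover (Algorithm 2)."""
    attributes = set(attributes)
    involved = set()
    for lhs, rhs in cover:
        involved |= set(lhs) | set(rhs)
    # Attributes that appear in no dependency each form a scheme by themselves.
    schemes = [{a} for a in sorted(attributes - involved)]
    if any(set(lhs) | set(rhs) == involved for lhs, rhs in cover):
        # One dependency involves every remaining attribute: keep them together.
        schemes.append(involved)
    else:
        schemes.extend(set(lhs) | set(rhs) for lhs, rhs in cover)
    return schemes


G = [("A", "B"), ("B", "C"), ("DE", "C"), ("CE", "A")]
print(["".join(sorted(s)) for s in synthesize_3nf("ABCDE", G)])
```

Because every dependency of the cover sits wholly inside one subscheme, the projected dependencies include a cover for F.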

16 Some decomposition algorithms 6
Theorem 2: Algorithm 2 generates a dependency preserving decomposition into 3NF subschemes.
Proof of Theorem 2: Projected dependencies involve a cover for F, so the decomposition is dependency-preserving.
Claim that each sub-scheme YB, where Y → B belongs to the minimal cover G for F, is in 3NF.
Must show: if XA ⊆ YB, where X → A is a non-trivial dependency, then either X is a superkey of YB or A is a prime attribute of YB.

17 Some decomposition algorithms 7
Proof of Theorem 2 (cont.): if XA ⊆ YB, where X → A is a non-trivial dependency, then either X is a superkey of YB or A is a prime attribute of YB.
If A ≠ B, then A ∈ Y. But Y is a key for YB, since Y → B and no proper subset of Y determines B, as G is a minimal cover. Hence A ≠ B ⇒ A is prime.
If A = B, then X is a subset of Y such that X → B, and since G is a minimal cover, X = Y. Hence A = B ⇒ X is a superkey for YB.

18 Some decomposition algorithms 8
Algorithm 3: Dependency preserving, lossless join decomposition into 3NF subschemes
Apply Algorithm 2 to obtain a dependency-preserving decomposition σ of R with dependencies F.
Now let Z be a key for R, and introduce Z as an additional subscheme to derive the decomposition τ = σ ∪ {Z}.
Theorem 3: The decomposition τ of R is both lossless join and dependency preserving into 3NF subschemes.
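
Algorithm 3 only adds a key to the output of Algorithm 2. The sketch below illustrates this on the CTHRSG dependencies, which already form a minimal cover. It assumes single-letter attributes and FDs as (lhs, rhs) string pairs; find_key and lossless_3nf are hypothetical names, the key is found by greedily discarding attributes, and only the general case of Algorithm 2 (one subscheme per dependency) is reproduced.

```python
def closure(attrs, fds):
    """Return the set of attributes determined by attrs under fds."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def find_key(attributes, fds):
    """Shrink the full attribute set to one key by dropping redundant attributes."""
    everything = set(attributes)
    key = set(attributes)
    for a in sorted(everything):
        if closure(key - {a}, fds) >= everything:
            key.discard(a)
    return key


def lossless_3nf(attributes, cover):
    """Algorithm 3: subschemes of Algorithm 2 (general case) plus a key Z."""
    sigma = [set(lhs) | set(rhs) for lhs, rhs in cover]
    z = find_key(attributes, cover)
    return sigma + [z]


G = [("C", "T"), ("HR", "C"), ("HT", "R"), ("CS", "G"), ("HS", "R")]
tau = lossless_3nf("CTHRSG", G)
print(["".join(sorted(s)) for s in tau])
```

Here the key found is HS, which is already contained in the subscheme for HS → R; the algorithm as stated on the slide adds Z regardless, and the sketch follows that.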

19 Some decomposition algorithms 9
Sketch proof of Theorem 3: The decomposition τ of R is both lossless join and dependency preserving into 3NF subschemes.
… to prove lossless join: apply the lossless join algorithm.
Construct a table with rows for Y1B1, Y2B2, ..., YnBn, where the FDs Y1 → B1, Y2 → B2, ..., Yn → Bn are a minimal cover G, together with a row for Z, where Z is a key for R.
Call these rows row 1, row 2, ..., row n, and row n+1.

20 Some decomposition algorithms 10
Sketch proof of Theorem 3 (cont.):
Construct a table with rows for Y1B1, Y2B2, ..., YnBn, where the FDs Y1 → B1, Y2 → B2, ..., Yn → Bn are a minimal cover G, together with a row for Z, where Z is a key for R.
Call these rows row 1, row 2, ..., row n, and row n+1.
In row n+1, we have a’s in all columns associated with the key Z.
Suppose that there’s a b in the kth column of row n+1.
Z is a key, so there must be a way of inferring the value at location (n+1, k) from the a values in row n+1, using the FDs in the minimal cover G in some sequence.
Tracing this sequence via matchings on the table will then convert the entry at location (n+1, k) to an a.

21 Some decomposition algorithms 12
Split R = ABCDE into R1=AB, R2=BC, R3=CDE, R4=ACE, Z=BDE with the FDs A → B, B → C, DE → C, CE → A in minimal cover G.

22 Some decomposition algorithms 13
Split R = ABCDE into AB, BC, CDE, ACE, BDE, where A → B, B → C, DE → C, CE → A are a minimal cover (G) and BDE is a superkey (Z); the key it contains is DE, since DE → C, CE → A and A → B.
There is a b in column 1 (k = 3); we can infer that BDE → A, since B → C and CE → A.
These inferences can be traced by matching rows 2 and 5, then matching rows 4 and 5, to replace b51 by a1, etc.

23 Some decomposition algorithms 14
Split R = ABCDE into R1=AB, R2=BC, R3=CDE, R4=ACE, Z=BDE with the FDs A → B, B → C, DE → C, CE → A in minimal cover G.
Result: lossless.
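
The table-based lossless join test used in this example can be sketched in Python. The code assumes single-letter attributes and FDs as (lhs, rhs) string pairs; chase_lossless is an illustrative name, and the encoding of table entries as tuples ("a", column) and ("b", row, column) is a choice made for this sketch rather than something taken from the slides.

```python
def chase_lossless(attributes, fds, schemes):
    """Tableau test: the join is lossless iff some row ends up with only a's."""
    cols = sorted(set(attributes))
    # Row i has an 'a' in the columns of scheme i and a distinct 'b' elsewhere.
    table = [
        {c: ("a", c) if c in set(s) else ("b", i, c) for c in cols}
        for i, s in enumerate(schemes)
    ]
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            for r1 in table:
                for r2 in table:
                    if r1 is r2 or any(r1[c] != r2[c] for c in lhs):
                        continue
                    # The rows match on lhs, so they must also match on rhs.
                    for c in rhs:
                        if r1[c] != r2[c]:
                            # An 'a' sorts before any 'b', so it is the one kept.
                            keep, drop = sorted([r1[c], r2[c]])
                            for row in table:
                                if row[c] == drop:
                                    row[c] = keep
                            changed = True
    return any(all(row[c] == ("a", c) for c in cols) for row in table)


G = [("A", "B"), ("B", "C"), ("DE", "C"), ("CE", "A")]
print(chase_lossless("ABCDE", G, ["AB", "BC", "CDE", "ACE", "BDE"]))
```

For the split above, the BDE row first gains an a in column C by matching the BC row on B, then an a in column A by matching the ACE row on C and E, which mirrors the matchings on rows 2 and 5 and then rows 4 and 5.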

