1 CS 430 Database Theory Winter 2005 Lecture 9: Fourth and Fifth Normal Forms.

2 Decompositions
Given a relation schema R = {A1, …, An} (all of the Ai are distinct), a set of relation schemas D = {R1, …, Rm} is a decomposition of R if R is the union of the Ri, i.e. R = R1 ∪ … ∪ Rm. That is, all the attributes of R appear in the Ri.

3 Goodness of Decomposition
When is a decomposition “good”? Two standards:
• Dependency Preservation
• Lossless (Nonadditive) Join

4 Dependency Preservation
Suppose we have a set of FDs F on R and a decomposition D = {R1, …, Rm}. The projection of F on Ri is the set
π_Ri(F) = { X → Y ∈ F+ | X ∪ Y ⊆ Ri }
That is, π_Ri(F) consists of all the FDs in the closure of F which are FDs on Ri.

5 Dependency Preservation
D is Dependency Preserving with respect to F if the closure of the union of the projections of F onto the Ri is the closure of F. Or,
(π_R1(F) ∪ … ∪ π_Rm(F))+ = F+
Or, if we project F onto the individual Ri, union the projections together, and compute the closure, we get the original closure of F.
• Or, no information contained in F is lost by projecting F onto the individual Ri.
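The definition above can be checked mechanically on small schemas. The following is a minimal sketch, not the textbook's algorithm: it computes each projection by brute force over subsets of Ri (exponential in the size of Ri), then tests whether every FD in F follows from the union of the projections. The function names and the example schema R(A, B, C) with F = {A → B, B → C} are assumptions made for illustration.

```python
from itertools import combinations

def closure(attrs, fds):
    """Attribute closure of attrs under fds, a list of (lhs, rhs) frozensets."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return frozenset(result)

def project(fds, ri):
    """A cover of the projection of F on Ri: for each nonempty X within Ri,
    emit X -> (closure(X) restricted to Ri) minus X. Brute force."""
    ri = frozenset(ri)
    out = []
    for k in range(1, len(ri) + 1):
        for x in combinations(sorted(ri), k):
            x = frozenset(x)
            rhs = (closure(x, fds) & ri) - x
            if rhs:
                out.append((x, rhs))
    return out

def preserves_dependencies(fds, decomposition):
    """True if every FD in F is implied by the union of the projections."""
    union = [fd for ri in decomposition for fd in project(fds, ri)]
    return all(rhs <= closure(lhs, union) for lhs, rhs in fds)

# Hypothetical example: R(A, B, C) with F = {A -> B, B -> C}.
F = [(frozenset("A"), frozenset("B")), (frozenset("B"), frozenset("C"))]
print(preserves_dependencies(F, [{"A", "B"}, {"B", "C"}]))  # True
print(preserves_dependencies(F, [{"A", "B"}, {"A", "C"}]))  # False: B -> C is lost
```

Because the union of the projections is always contained in F+, it is enough to check that each FD of F is implied by that union; the two closures are then equal.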

6 Dependency Preservation Notes
Claim: It is possible to find a 3NF decomposition of R (each Ri is in 3NF) which is dependency preserving
• See Algorithm 11.2, page 340. (No proof.)
Why do we want this?
• When we update the database, we want to be able to verify FDs by verifying them on the individual relations
• The alternative is having to do joins to verify that our update is good, which slows the system.

7 Lossless (Nonadditive) Join Property
D has the Lossless (Nonadditive) Join property with respect to a set of FDs F if for every relation state r of R that satisfies F:
π_R1(r) ⋈ … ⋈ π_Rm(r) = r (⋈ is the natural join)
Lossless means no loss of information.
Nonadditive means that the natural join doesn’t add any information.
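To see what "additive" looks like on a concrete relation state, here is a small sketch that projects a state r onto the schemas of a decomposition and joins the projections back together. The tuple representation (a frozenset of attribute/value pairs), the helper names, and the sample data are all assumptions chosen for illustration.

```python
def relation(attrs, tuples):
    """Build a relation state: a set of tuples, each a frozenset of (attr, value)."""
    return {frozenset(zip(attrs, t)) for t in tuples}

def project(rows, attrs):
    """Projection: keep only attrs, with duplicate tuples collapsing in the set."""
    return {frozenset((a, v) for a, v in t if a in attrs) for t in rows}

def natural_join(r1, r2):
    """Natural join: combine tuples that agree on all shared attributes."""
    out = set()
    for t1 in r1:
        d1 = dict(t1)
        for t2 in r2:
            d2 = dict(t2)
            if all(d1[a] == d2[a] for a in d1.keys() & d2.keys()):
                out.add(frozenset({**d1, **d2}.items()))
    return out

# Hypothetical state of R(A, B, C); it satisfies A -> B but not B -> C.
r = relation(("A", "B", "C"), [("a1", "b1", "c1"), ("a2", "b1", "c2")])

lossy = natural_join(project(r, {"A", "B"}), project(r, {"B", "C"}))
good = natural_join(project(r, {"A", "B"}), project(r, {"A", "C"}))

print(len(lossy), lossy == r)  # 4 False: two spurious tuples were added
print(good == r)               # True
```

Note that the lossy join never drops tuples of r; it only adds spurious ones, which is why the property is also called nonadditive.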

8 Lossless (Nonadditive) Join Notes
Algorithm 11.1, page 337, provides a way to test for this property.
If D is a binary decomposition, D = {R1, R2}, D is nonadditive if and only if:
(R1 ∩ R2) → (R1 - R2) is in F+, or
(R1 ∩ R2) → (R2 - R1) is in F+
That is, R1 ∩ R2 is a superkey for (at least) one of R1 or R2.
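The binary test reduces to one attribute-closure computation: X → Y is in F+ exactly when Y is contained in the closure of X. A minimal sketch follows, assuming FDs only (no MVDs); the function names and the example F = {A → B} on R(A, B, C) are hypothetical.

```python
def closure(attrs, fds):
    """Attribute closure of attrs under fds, a list of (lhs, rhs) frozensets."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return frozenset(result)

def binary_lossless(r1, r2, fds):
    """True if (R1 n R2) -> (R1 - R2) or (R1 n R2) -> (R2 - R1) is in F+."""
    r1, r2 = frozenset(r1), frozenset(r2)
    common_closure = closure(r1 & r2, fds)
    return (r1 - r2) <= common_closure or (r2 - r1) <= common_closure

# Hypothetical example: R(A, B, C) with F = {A -> B}.
F = [(frozenset("A"), frozenset("B"))]
print(binary_lossless({"A", "B"}, {"A", "C"}, F))  # True: A -> B, so A is a superkey of R1
print(binary_lossless({"A", "B"}, {"B", "C"}, F))  # False: B determines neither A nor C
```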

9 Aside: Problems with Nulls
See Figures 11.2 and 11.3 in the Text Book.
Bottom line: If nulls are present, especially nulls in foreign keys, then
• May have to use outer joins instead of ordinary (inner) joins
• Have to be careful if using aggregation (e.g. sum or average)
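Both bullet points can be reproduced with Python's built-in sqlite3 module. The sketch below uses two hypothetical tables, emp and dept, which are not the textbook's figures: one employee has a null foreign key and a null salary, so the inner join drops that row and the aggregates skip the null.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE dept (dno INTEGER PRIMARY KEY, dname TEXT);
    CREATE TABLE emp  (eno INTEGER PRIMARY KEY, ename TEXT, salary REAL,
                       dno INTEGER REFERENCES dept(dno));
    INSERT INTO dept VALUES (1, 'Research');
    INSERT INTO emp VALUES (1, 'Ann', 100.0, 1);
    INSERT INTO emp VALUES (2, 'Bob', NULL, NULL);  -- null salary, null foreign key
""")

# Inner join: Bob disappears because his dno is NULL.
inner = conn.execute(
    "SELECT ename FROM emp JOIN dept USING (dno) ORDER BY eno").fetchall()

# Left outer join: Bob is kept, with NULL for the department name.
outer = conn.execute(
    "SELECT ename, dname FROM emp LEFT JOIN dept USING (dno) ORDER BY eno").fetchall()

# Aggregates ignore NULLs: AVG is taken over one salary, not two rows.
avg_salary, n_rows, n_salaries = conn.execute(
    "SELECT AVG(salary), COUNT(*), COUNT(salary) FROM emp").fetchone()

print(inner)                            # [('Ann',)]
print(outer)                            # [('Ann', 'Research'), ('Bob', None)]
print(avg_salary, n_rows, n_salaries)   # 100.0 2 1
```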

10 Multi-Valued Dependencies
If X and Y are sets of attributes of R, there is a Multi-Valued Dependency (MVD) X →→ Y (we let Z = R - (X ∪ Y)) if for all states r of R, and all tuples t1, t2 of r such that t1[X] = t2[X], there exist tuples t3, t4 of r such that:
t3[X] = t4[X] = t1[X] = t2[X]
t3[Y] = t1[Y], t4[Y] = t2[Y]
t4[Z] = t1[Z], t3[Z] = t2[Z]
An MVD X →→ Y is trivial if Y ⊆ X, or X ∪ Y = R.
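The definition can be tested directly on a single relation state: for every pair t1, t2 that agrees on X, the tuple taking its Y-values from t1 and everything else from t2 must also be present (t4 is covered by visiting the pair in the opposite order). This is a sketch with assumed names and made-up Salesman/Product/Territory data; passing on one state does not prove that the MVD holds for all legal states.

```python
def satisfies_mvd(rows, x, y):
    """Check X ->> Y on one relation state; rows is a list of dicts."""
    x, y = set(x), set(y)
    present = {frozenset(row.items()) for row in rows}
    for t1 in rows:
        for t2 in rows:
            if all(t1[a] == t2[a] for a in x):
                t3 = dict(t2)                      # X- and Z-values from t2
                t3.update({a: t1[a] for a in y})   # Y-values from t1
                if frozenset(t3.items()) not in present:
                    return False
    return True

def row(s, p, t):
    return {"Salesman": s, "Product": p, "Territory": t}

# Hypothetical state: Smith sells two products and works two territories,
# but only two of the four combinations are recorded.
partial = [row("Smith", "Widget", "East"), row("Smith", "Gadget", "West")]
full = partial + [row("Smith", "Widget", "West"), row("Smith", "Gadget", "East")]

print(satisfies_mvd(partial, {"Salesman"}, {"Product"}))  # False
print(satisfies_mvd(full, {"Salesman"}, {"Product"}))     # True
```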

11 Fourth Normal Form
R is in 4NF with respect to a set of FDs and MVDs F if for every non-trivial MVD X →→ Y in F+, X is a superkey of R.
See Figure 11.4 (a, b) in the Text Book.

12 Fourth Normal Form Notes
If a relation is not in 4NF then there are update anomalies:
• If you add a tuple you must also add the corresponding tuples required by the MVD
D = {R1, R2} is a lossless (nonadditive) decomposition of R with respect to a set of FDs and MVDs F if and only if:
(R1 ∩ R2) →→ (R1 - R2), which is the same as (R1 ∩ R2) →→ (R2 - R1)

13 Fifth Normal Form
JD(R1, …, Rm) is a Join Dependency (JD) for a decomposition {R1, …, Rm} of R if for every legal state r of R:
π_R1(r) ⋈ … ⋈ π_Rm(r) = r
A JD is trivial if some Ri = R.
A relation R is in Fifth Normal Form (5NF) if for every non-trivial JD of R, every Ri is a superkey of R.
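A join dependency can be checked on one relation state by projecting onto each Ri and joining the projections back. The sketch below uses assumed helper names and a made-up Salesman/Product/Territory state; it shows a state that satisfies a three-way JD but not the binary one obtained by dropping a component.

```python
from functools import reduce

def relation(attrs, tuples):
    """Build a relation state: a set of tuples, each a frozenset of (attr, value)."""
    return {frozenset(zip(attrs, t)) for t in tuples}

def project(rows, attrs):
    """Projection onto attrs; duplicates collapse in the set."""
    return {frozenset((a, v) for a, v in t if a in attrs) for t in rows}

def natural_join(r1, r2):
    """Natural join: combine tuples that agree on all shared attributes."""
    out = set()
    for t1 in r1:
        d1 = dict(t1)
        for t2 in r2:
            d2 = dict(t2)
            if all(d1[a] == d2[a] for a in d1.keys() & d2.keys()):
                out.add(frozenset({**d1, **d2}.items()))
    return out

def satisfies_jd(rows, components):
    """True if joining the projections onto each component gives back rows."""
    return reduce(natural_join, (project(rows, c) for c in components)) == rows

S, P, T = "Salesman", "Product", "Territory"
# Hypothetical state (values are placeholders).
r = relation((S, P, T), [("s1", "p1", "t2"), ("s1", "p2", "t1"),
                         ("s2", "p1", "t1"), ("s1", "p1", "t1")])

print(satisfies_jd(r, [{S, P}, {P, T}, {S, T}]))  # True
print(satisfies_jd(r, [{S, P}, {P, T}]))          # False: adds (s2, p1, t2)
```

As with the MVD check, this only tests one state; a JD is a constraint on every legal state of R.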

14 Notes on Fifth Normal Form
An MVD is a JD with m = 2.
Finding all the JDs of a database of any size is probably not feasible.
Example: See Figure 11.4 (c, d) of the Text Book.

15 Products, Salesmen, Territories: A Data Design Problem
Salesman
• Sells specific products
• Has specific territories
• Has a quota: How much he is supposed to sell
Product
• Sold by salesmen
• Has a price
Territory
• Worked by salesmen

16 ER Model Version 1
[ER diagram; labels: Product, Salesman, Territory, Sells Product, Works Territory, Quota, Price]
A Salesman can sell any Product he sells in any Territory he works.
A Product has one Price for all Salesmen and all Territories.
A Salesman has one Quota for all his sales.
Note: Each Entity and Relationship becomes a relation in our database.

17 ER Model Version 2
[ER diagram; labels: Product, Salesman, Territory, Sells Product, Works Territory, Quota, Price]
A Salesman has a Quota for each product he sells.

18 ER Model Version 3
[ER diagram; labels: Product, Salesman, Territory, Sells Product, Works Territory, Sold In, Quota, Price]
Products are only sold in specific Territories.
A Product has a Price set for each Territory where it is sold.
A Salesman can sell any Product he sells in any Territory he works where that Product is sold.
Note the JD between “Sells Product”, “Sold In”, and “Works Territory”.

19 ER Model Version 4
[ER diagram; labels: Product, Salesman, Territory, Sells Product, Sells Product in Territory, Sold In, Quota, Price]
A Salesman is assigned to sell specific Products in specific Territories.
A Salesman has a Quota for each Product he sells in each Territory.
Possible Integrity Constraint: Keys of “Sells Product” and “Sold In” are projections of “Sells Product in Territory”.

20 ER Model Version 4A
[ER diagram; labels: Product, Salesman, Territory, Sells Product in Territory, Sold In, Quota, Price]
Possible Integrity Constraint: Key of “Sold In” is a projection of “Sells Product in Territory”. (But I might want to assign a Price even though no Salesmen have yet been assigned that Product in that Territory.)

21 Sample Fields
Employee
• Employee ID Number
• Employee Name
• Work Location
• Manager
  - Manager ID Number
  - Manager Name
Territory
• Territory Number
• Territory Name
• Territory Bonus
Product
• Product Number
• Product Name
• Price
• Actual_Sales
• Target_Sales
Other
• Quota
• Commission Rate
• Commission
• Manager Commission

22 Possible Functional Dependencies
{Employee ID Number} → {Employee Name, Work Location, Manager ID Number, Manager Commission(?)}
{Manager ID Number} → {Manager Name, Manager Commission(?)}
{Territory Number} → {Territory Name, Territory Bonus(?)}
{Product Number} → {Product Name, Price(?), Actual Sales(?), Target Sales(?)}

23 More Possible FDs
{Employee ID Number, Territory Number} → {Territory Bonus(?), Quota(?), Commission Rate(?)}
{Employee ID Number, Product Number} → {Quota(?), Commission Rate(?)}
{Territory Number, Product Number} → {Price(?), Actual Sales(?), Target Sales(?), Territory Bonus(?), Commission Rate(?), Commission(?), Manager Commission(?)}

24 More Possible FDs
{Employee ID Number, Product Number, Territory Number} → {Quota(?), Actual Sales(?), Target Sales(?), Commission Rate(?), Commission(?), Manager Commission(?)}
{Actual Sales, Commission Rate} → {Commission}
{Actual Sales, Manager Commission Rate} → {Manager Commission}

25 Proposed Solution
Employee(Employee ID Number, Employee Name, Work Location, Manager ID Number)
Manager(Manager ID Number, Manager Name, Manager Commission)
Territory(Territory Number, Territory Name)
Product(Product Number, Product Name)

26 More Proposed Solution
Product_Territory(Product Number, Territory Number, Price)
Employee_Territory(Employee ID Number, Territory Number, Territory Bonus)
Employee_Product(Employee ID Number, Product Number, Commission Rate)
Employee_Product_Territory(Employee ID Number, Product Number, Territory Number, Quota, Actual Sales, Target Sales, Commission, Manager Commission)
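One way to try out the proposed relations is to declare them as tables in an in-memory SQLite database. This is only a sketch: the slides do not show which attributes are keys, so the primary keys below are assumed from the left-hand sides of the possible FDs, and the column names, column types, and the build_schema helper are likewise assumptions.

```python
import sqlite3

# Keys and column types are assumptions; the slides list attribute names only.
SCHEMA = """
CREATE TABLE manager (manager_id INTEGER PRIMARY KEY, manager_name TEXT,
                      manager_commission REAL);
CREATE TABLE employee (employee_id INTEGER PRIMARY KEY, employee_name TEXT,
                       work_location TEXT,
                       manager_id INTEGER REFERENCES manager(manager_id));
CREATE TABLE territory (territory_number INTEGER PRIMARY KEY, territory_name TEXT);
CREATE TABLE product (product_number INTEGER PRIMARY KEY, product_name TEXT);
CREATE TABLE product_territory (
    product_number INTEGER REFERENCES product(product_number),
    territory_number INTEGER REFERENCES territory(territory_number),
    price REAL,
    PRIMARY KEY (product_number, territory_number));
CREATE TABLE employee_territory (
    employee_id INTEGER REFERENCES employee(employee_id),
    territory_number INTEGER REFERENCES territory(territory_number),
    territory_bonus REAL,
    PRIMARY KEY (employee_id, territory_number));
CREATE TABLE employee_product (
    employee_id INTEGER REFERENCES employee(employee_id),
    product_number INTEGER REFERENCES product(product_number),
    commission_rate REAL,
    PRIMARY KEY (employee_id, product_number));
CREATE TABLE employee_product_territory (
    employee_id INTEGER, product_number INTEGER, territory_number INTEGER,
    quota REAL, actual_sales REAL, target_sales REAL,
    commission REAL, manager_commission REAL,
    PRIMARY KEY (employee_id, product_number, territory_number));
"""

def build_schema():
    """Create the eight proposed relations in a fresh in-memory database."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    return conn

conn = build_schema()
tables = [name for (name,) in conn.execute(
    "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name")]
print(len(tables))  # 8
```

With these keys, each composite primary key enforces the corresponding FD: for example, a second Price for the same Product and Territory is rejected as a duplicate key.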
