1 COP 4710 Databases Fall, 2000 Today’s Topic Chapter 5: Improving the Quality of Relational Schemas David A. Gaitros September 18, 2000 Department of.

COP 4710 Databases
Fall, 2000
Today’s Topic
Chapter 5: Improving the Quality of Relational Schemas
David A. Gaitros
September 18, 2000
Department of Computer Science
Copyright by Dr. Greg Riccardi

Representing Weak Entity Classes
• Create a relation schema
  – Add foreign key for each defining relationship type
  – Key is partial key plus defining foreign keys
• Consider Fig. 2.5, weak class Rental
• Schema: Rental (videoId, dateDue, dateRented, cost)
  – key videoId (foreign key)

Solving a 1 to Many Relationship
[Diagram: CustomerTable (1) related to Video Inventory (M)]

Interpreting ER Diagrams
• Model of a partial representation of real objects
• Cardinalities should reflect meaning of representation, not meaning of reality!
• Consider Fig. 2.6, class Person and the IsChildOf relationship type

Solving a 1 to Many Relationship
[Diagram labels: CustomerTable, PersonTable (1), Children Match up Table (M)]
• The relationship with the spouse is taken care of in the person table with a single column.
• Note the difference between the schema and the table.

Functional Dependencies and Normalization
• Begin by discussing good and bad relation schemas
• Informal measures of the quality of relation schema design
  – Semantics of the attributes
  – Reducing the redundant values in tuples
  – Reducing the null values in tuples
  – Disallowing spurious tuples
• Define Normal Forms as formal measures of the quality of schemas
  – Restrictions on the form of relation schemas

Semantics of the Relation Attributes
• How to interpret the attribute values stored in a tuple?
• Guideline 1: Design a schema so that it is easy to explain its meaning.
• Keep attributes from different entities and relationships distinct.
• Example of mixing:
  – OwnerCar: (Oname, DLNum, CarId, Make, Manuf)
  – Oname is an attribute of the owner, Make is an attribute of the car!

Redundant Information in Tuples
• Previous example of OwnerCar
  – OwnerCar: (Oname, DLNum, CarId, Make, Manuf)
• Consider a table of OwnerCar:
  – (Joe, 123456789, 106, Plymouth, Chrysler)
  – (Moe, 223456789, 107, Plymouth, Chrysler)
• The Manuf attribute is redundant!
• This leads to difficulty in updates, also called update anomalies
  – E.g. changing the Manuf for Joe requires also changing it for Moe.

Update Anomalies
• Insertion Anomalies
  – When inserting a new owner, we must correctly insert the Manuf field, or we will create inconsistencies
  – Cannot create a car without an owner
  – Cannot create a make without a car and an owner
• Deletion Anomalies
  – Deletion of the owner of a car also deletes the make and manufacturer of the car
  – Deletion of the owner of the last Plymouth deletes the relationship between Plymouth and Chrysler
• Modification Anomalies
  – Changing the make of a car requires a consistency check
  – Cannot change so that a Plymouth is made by Ford
• Guideline 2: no insertion, deletion, or modification anomalies allowed!
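
The modification and deletion anomalies can be reproduced on the two OwnerCar tuples from the earlier slide. This is only an illustrative sketch: the helper name `manufacturers_by_make` and the "Ford" update are assumptions made for the example, not part of the lecture.

```python
# OwnerCar rows: (Oname, DLNum, CarId, Make, Manuf), taken from the slide example.
owner_car = [
    ("Joe", "123456789", 106, "Plymouth", "Chrysler"),
    ("Moe", "223456789", 107, "Plymouth", "Chrysler"),
]

def manufacturers_by_make(rows):
    """Map each Make to the set of Manuf values stored for it (hypothetical helper)."""
    result = {}
    for _oname, _dlnum, _car_id, make, manuf in rows:
        result.setdefault(make, set()).add(manuf)
    return result

# Modification anomaly: change Manuf in Joe's tuple only.
updated = [
    (o, d, c, make, "Ford") if o == "Joe" else (o, d, c, make, manuf)
    for o, d, c, make, manuf in owner_car
]
inconsistent = {
    make: manufs
    for make, manufs in manufacturers_by_make(updated).items()
    if len(manufs) > 1
}
print(inconsistent)  # Plymouth is now recorded with two different manufacturers

# Deletion anomaly: once both owners are deleted, the fact that
# Plymouth is made by Chrysler is no longer stored anywhere.
after_delete = [row for row in owner_car if row[0] not in ("Joe", "Moe")]
print(manufacturers_by_make(after_delete))  # {}
```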

Null Values in Tuples
• May have many attributes (fat relation) which do not apply to many tuples
  – Hence, many null values in many tuples
  – Takes lots of space
  – Not sure how to treat these in Sum, Count
• Nulls can have many interpretations
  – Attribute does not apply
  – Attribute value is unknown
  – Value is known but absent
• Guideline 3: Avoid placing attributes whose values may be null in a base relation.
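
The uncertainty about Sum and Count can be seen in a small experiment with Python's built-in sqlite3 module. The table and the TowCapacity column are hypothetical, chosen as an attribute that does not apply to most tuples; the behavior shown is that of SQL aggregates in SQLite, which skip nulls.

```python
import sqlite3

# In-memory database; the Car table and TowCapacity column are made up
# for this illustration and do not come from the lecture.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Car (CarId INTEGER PRIMARY KEY, Make TEXT, TowCapacity INTEGER)"
)
conn.executemany(
    "INSERT INTO Car VALUES (?, ?, ?)",
    [(106, "Plymouth", None), (107, "Plymouth", None), (108, "Ford", 5000)],
)

# COUNT(*) counts rows, while COUNT(column) and SUM(column) ignore nulls.
count_all, count_col, total = conn.execute(
    "SELECT COUNT(*), COUNT(TowCapacity), SUM(TowCapacity) FROM Car"
).fetchone()
print(count_all, count_col, total)  # 3 1 5000
conn.close()
```

Two of the three tuples silently drop out of COUNT(TowCapacity) and SUM(TowCapacity), which is one reason the guideline steers nullable attributes out of base relations.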

First Normal Form (1NF)
• 3 normal forms proposed by Codd in 1972
• All attribute values are atomic (or indivisible).
• This rule is now part of the definition of relation.
• Hence, the translation from ERD to relational schema requires that multi-valued attributes be transformed into tables. See Step 6, p. 174.

Normal Forms based on Primary Keys
• Normalization includes testing and modifying a schema until it satisfies a set of rules
• Hope to ensure that update anomalies do not occur.
• Unsatisfactory schemas are decomposed by splitting their attributes into smaller relations.
• For each rule, if a particular relation violates the rule, that relation must be broken into smaller relations

Some definitions
• superkey: a set of attributes of a relation whose values are unique within the relation.
• key: a superkey in which removal of any attribute makes it no longer a superkey. If there is more than one key, they are called candidate keys.
• primary key: an arbitrarily designated candidate key; all other candidate keys are secondary keys.
• prime attribute: one which is a member of some candidate key.
• nonprime attribute: one which is not prime.

Definition of Functional Dependency
• A functional dependency is a constraint between 2 sets of attributes from the database
  – For each value of the first set there is a unique value of the second set
• X --> Y restricts the tuples that can be instances of R
• If t1 and t2 are tuples in an instance of R
  – t1(X) = t2(X) implies t1(Y) = t2(Y)
• For example,
  – {DLNum} --> {Oname}
  – {CarId} --> {Make, Manuf}
  – {Make} --> {Manuf}
• Candidate keys are left hand sides of functional dependencies
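
The tuple-based definition translates directly into a check over a table instance. The sketch below uses the two OwnerCar tuples from the earlier slide; `fd_holds` is a hypothetical helper name. A result of True only means the sample does not contradict the dependency; it does not establish that the dependency is a constraint of the schema.

```python
def fd_holds(rows, lhs, rhs):
    """Return True if no two rows agree on lhs but differ on rhs."""
    seen = {}
    for row in rows:
        x = tuple(row[a] for a in lhs)
        y = tuple(row[a] for a in rhs)
        # setdefault stores y the first time x is seen, then returns the stored value.
        if seen.setdefault(x, y) != y:
            return False
    return True

rows = [
    {"Oname": "Joe", "DLNum": "123456789", "CarId": 106, "Make": "Plymouth", "Manuf": "Chrysler"},
    {"Oname": "Moe", "DLNum": "223456789", "CarId": 107, "Make": "Plymouth", "Manuf": "Chrysler"},
]

print(fd_holds(rows, ["Make"], ["Manuf"]))   # True
print(fd_holds(rows, ["DLNum"], ["Oname"]))  # True
print(fd_holds(rows, ["Make"], ["Oname"]))   # False: same Make, different owners
```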

Second Normal Form (2NF)
• X --> Y is a full functional dependency if the removal of any attribute A from X removes the dependency
  – not X-{A} --> Y
• X --> Y is a partial dependency if some attribute A may be removed without removing the dependency
  – X-{A} --> Y
• A relation schema R is in 2NF if every nonprime attribute is fully functionally dependent on the primary key of R

Consider the Car Registration Document
• Fig. 5.9 Sample car registration form

Example of Car Registration Schema
• This is a different car registration example from Fig. 5.9
• Relation Owner
  – DLNum, Name, Address, City, State, Zip
• Relation Car
  – CarId, DLNum, Make, Model, Manuf, Year, Color, Owner, PurchDate, TagNum, RegisDate
• R is the set of all attributes of the schema
• F is the set of all functional dependencies
  – {DLNum} --> {Name, Address, City, State, Zip}
  – {CarId} --> {Make, Model, Manuf, Year, Color}
  – {TagNum} --> {RegisDate}
  – {CarId, DLNum} --> {PurchDate, TagNum,...}
  – and more!

Putting the CarReg Schema into 2NF
• Consider the Owner relation schema
  – {DLNum} is the primary key
  – Hence Owner is in 2NF
• Consider the Car relation schema
  – {CarId, DLNum} is the primary key (multiple owners)
  – {CarId} --> {Make, Model,...}
  – Hence Car is not in 2NF
• Create new relations
  – CarOwner = {CarId, Owner, PurchDate, TagNum, RegisDate}
  – Car = {CarId, Make, Model, Manuf, Year, Color}
• Is it 2NF?
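
The decomposition into CarOwner and Car can be pictured as two projections of the original table. In the sketch below the attribute lists follow the slide, while the sample rows and the `project` helper are assumptions made up for illustration; car 106 is given two owners so that the redundancy is visible.

```python
ATTRS = ["CarId", "Owner", "PurchDate", "TagNum", "RegisDate",
         "Make", "Model", "Manuf", "Year", "Color"]

# Hypothetical rows: car 106 has two owners, so its description is stored twice.
raw = [
    (106, "Joe", "1999-05-01", "ABC123", "1999-05-03", "Plymouth", "Neon", "Chrysler", 1998, "red"),
    (106, "Ann", "1999-05-01", "ABC123", "1999-05-03", "Plymouth", "Neon", "Chrysler", 1998, "red"),
    (107, "Moe", "2000-01-15", "XYZ789", "2000-01-20", "Plymouth", "Voyager", "Chrysler", 1999, "blue"),
]
rows = [dict(zip(ATTRS, r)) for r in raw]

def project(rows, attrs):
    """Project rows onto attrs, dropping duplicate tuples."""
    seen = []
    for row in rows:
        t = tuple(row[a] for a in attrs)
        if t not in seen:
            seen.append(t)
    return seen

car_owner = project(rows, ["CarId", "Owner", "PurchDate", "TagNum", "RegisDate"])
car = project(rows, ["CarId", "Make", "Model", "Manuf", "Year", "Color"])

# Three ownership tuples remain, but each car is now described only once.
print(len(rows), len(car_owner), len(car))  # 3 3 2
```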

Rules for Functional Dependencies
• Given a particular set of functional dependencies, we can find others using inference rules
  – Splitting/combining rules: A -> B1 B2 is equivalent to A -> B1 and A -> B2
  – Trivial rules: A B -> B, for all A, B
  – Transitive rule: A -> B and B -> C => A -> C
• We are interested in the closure of the set of functional dependencies under these (and other) rules

Inference Rules for Functional Dependency
• There are semantically obvious functional dependencies, usually specified by the schema designer
• Other functional dependencies can be inferred from those
• Inference rules
  – Reflexive: if X includes Y, then X --> Y
  – Augmentation: if X --> Y, then XZ --> YZ
  – Transitive: if X --> Y and Y --> Z, then X --> Z
  – Decomposition: if X --> YZ, then X --> Y
  – Union: if X --> Y and X --> Z, then X --> YZ
  – Pseudotransitive: if X --> Y and WY --> Z, then WX --> Z
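
In practice the inference rules are usually applied through the attribute closure of a set X: the set of all attributes that X determines. The sketch below is one common way to compute it, not a method stated in the lecture. As an assumption for the example, {CarId} is declared to determine only {Make}, so that Manuf has to be inferred through {Make} --> {Manuf}.

```python
def closure(attrs, fds):
    """Return every attribute functionally determined by attrs under fds."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            # If the left side is already determined, so is the right side.
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

fds = [
    ({"DLNum"}, {"Oname"}),
    ({"CarId"}, {"Make"}),   # assumption: Manuf left out here on purpose
    ({"Make"}, {"Manuf"}),
]

print(sorted(closure({"CarId"}, fds)))  # ['CarId', 'Make', 'Manuf']
print(sorted(closure({"Make"}, fds)))   # ['Make', 'Manuf']
```

X --> Y can be inferred from the given dependencies exactly when Y is contained in the closure of X, so {CarId} --> {Manuf} follows here by the transitive rule.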

Definition of Key
• A set of one or more attributes {A1,...Ak} is a key for a relation R
  – Those attributes functionally determine all other attributes of R
    – no 2 distinct tuples can agree on the key
  – no proper subset of {A1,...Ak} is a key of R
    – a key must be minimal
• There can be more than one key in a relation
  – Department (DeptName, DeptNo,...): since both are unique, both are keys
• A superkey (superset of a key) is a set of attributes that functionally determines all other attributes of the relation.
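
Both conditions of the definition (determines everything, and is minimal) can be tested mechanically once the functional dependencies are written down. The following brute-force sketch is an illustration only: the Budget attribute and the dependencies for Department are assumptions, since the slide elides the remaining attributes.

```python
from itertools import combinations

def closure(attrs, fds):
    """Return every attribute functionally determined by attrs under fds."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

def candidate_keys(attributes, fds):
    """Enumerate minimal attribute sets whose closure is the whole relation."""
    keys = []
    for size in range(1, len(attributes) + 1):
        for combo in combinations(sorted(attributes), size):
            candidate = set(combo)
            if any(key <= candidate for key in keys):
                continue  # contains a smaller key, so it is a non-minimal superkey
            if closure(candidate, fds) == set(attributes):
                keys.append(candidate)
    return keys

# Hypothetical Department schema: both DeptName and DeptNo are unique.
department = {"DeptName", "DeptNo", "Budget"}
dept_fds = [
    ({"DeptName"}, {"DeptNo", "Budget"}),
    ({"DeptNo"}, {"DeptName"}),
]
print(candidate_keys(department, dept_fds))  # two keys: {'DeptName'} and {'DeptNo'}
```

The enumeration is exponential in the number of attributes, which is acceptable for classroom-sized schemas only.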

Third Normal Form (3NF)
• Based on transitive dependency, or non-key dependency
• A functional dependency X --> Y is a transitive dependency if there is a set Z which is not a subset of any key, and for which X --> Z and Z --> Y
• A relation schema is in 3NF if there is no nonprime attribute which is functionally dependent on a non-key set of attributes.
• Example: {Make} --> {Manuf} violates 3NF since Make is not a key.

Transforming Car into 3NF
• Car = {CarId, Make, Model, Manuf, Year, Color}
• {CarId} --> {Make, Model, Manuf, Year, Color}
  {Make} --> {Manuf}
  Not 3NF
• Car = {CarId, Make, Model, Year, Color}
  MakeManuf = {Make, Manuf}
• What about {Model} --> {Make}?
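
One way to convince yourself that splitting Car into Car and MakeManuf loses no information is to project a sample table and join the pieces back together. The rows below are invented for the example, and `project` and `natural_join` are hypothetical helpers, not library functions.

```python
# Hypothetical Car tuples over the attributes listed on the slide.
car = [
    {"CarId": 106, "Make": "Plymouth", "Model": "Neon", "Manuf": "Chrysler", "Year": 1998, "Color": "red"},
    {"CarId": 107, "Make": "Plymouth", "Model": "Voyager", "Manuf": "Chrysler", "Year": 1999, "Color": "blue"},
    {"CarId": 108, "Make": "Ford", "Model": "Taurus", "Manuf": "Ford", "Year": 1997, "Color": "white"},
]

def project(rows, attrs):
    """Project rows onto attrs, dropping duplicate tuples."""
    out = []
    for row in rows:
        projected = {a: row[a] for a in attrs}
        if projected not in out:
            out.append(projected)
    return out

def natural_join(left, right, on):
    """Join two lists of dict rows on the shared attributes in `on`."""
    return [{**l, **r} for l in left for r in right if all(l[a] == r[a] for a in on)]

def as_set(rows):
    """Order-independent view of a table, for comparing two tables."""
    return {tuple(sorted(r.items())) for r in rows}

car_3nf = project(car, ["CarId", "Make", "Model", "Year", "Color"])
make_manuf = project(car, ["Make", "Manuf"])
rejoined = natural_join(car_3nf, make_manuf, on=["Make"])

print(len(make_manuf))                  # 2: each Make/Manuf pair is stored once
print(as_set(rejoined) == as_set(car))  # True
```

The join reproduces the original table because Make is the key of MakeManuf, so each Car tuple matches exactly one manufacturer.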

Boyce Codd Normal Form (BCNF)
• A relation R is BCNF iff for each non-trivial dependency {A1,…Ak} -> B for R,
  – A1…Ak is a superkey
• Alternatively, collect all similar violations
  – if A1…Ak -> B1…Bn then {A1,…Ak} is a superkey
• A 3NF relation is not BCNF only if there is
  – X -> A such that X is not a superkey and A is a prime attribute
• Any 2-attribute relation is BCNF: e.g. R(a,b)
  – either a->b but not b->a, {a} is key but not {b}
  – a->b and b->a, both {a} and {b} are keys
  – neither a->b nor b->a, {a,b} is key
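
The BCNF condition can be checked by computing the closure of each left-hand side and testing whether it covers the whole relation. The sketch below inspects only the dependencies that are explicitly listed, and the function names are assumptions made for this illustration.

```python
def closure(attrs, fds):
    """Return every attribute functionally determined by attrs under fds."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

def bcnf_violations(attributes, fds):
    """Return the listed non-trivial dependencies whose left side is not a superkey."""
    violations = []
    for lhs, rhs in fds:
        nontrivial = not set(rhs) <= set(lhs)
        if nontrivial and closure(lhs, fds) != set(attributes):
            violations.append((lhs, rhs))
    return violations

car = {"CarId", "Make", "Model", "Manuf", "Year", "Color"}
car_fds = [
    ({"CarId"}, {"Make", "Model", "Manuf", "Year", "Color"}),
    ({"Make"}, {"Manuf"}),
]
print(bcnf_violations(car, car_fds))  # [({'Make'}, {'Manuf'})]

# A 2-attribute relation R(a, b) with a -> b has no violation: {a} is a key.
print(bcnf_violations({"a", "b"}, [({"a"}, {"b"})]))  # []
```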

Why BCNF?
• BCNF schemas do not exhibit anomalies
  – only redundancy is foreign key
  – each non-key attribute appears only once
  – only update and delete problems are
    – update of key attribute must be propagated to foreign keys
    – deletion of tuple must be propagated to foreign keys, either null or delete
• All functional dependencies are key dependencies
  – Functional dependency constraints have been turned into key constraints
  – Database system can enforce key constraints

Conversion of DB Schema into BCNF
• Consider a single relation schema
  – Identify a BCNF violation
  – Decompose the relation to remove the violation
  – Repeat until no violations occur
• Repeat for every relation in the DB schema, including the new relations created by decomposition

Decomposition into BCNF
• Suppose R has a BCNF violation
  – A1…An -> B1…Bm and {A1,…An} is not a superkey
  – Bs include all attributes that are dependent
  – let {C1,…Ck} be all other attributes (not As or Bs)
• Create 2 new relations
  – R1 = {A1,…An, B1,…Bm} and R2 = {A1,…An, C1,…Ck}
  – keys must be determined by considering resulting functional dependencies
• Consider other examples in class
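
A single decomposition step can be sketched as follows, applied to the Car relation and the violation {Make} --> {Manuf} from the earlier slides. The function name `bcnf_split` is hypothetical. The Bs are taken to be the closure of the As minus the As themselves, which is one reading of "all attributes that are dependent".

```python
def closure(attrs, fds):
    """Return every attribute functionally determined by attrs under fds."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

def bcnf_split(attributes, fds, lhs):
    """Split a relation on a violating left side: returns (R1, R2)."""
    a_set = set(lhs)
    b_set = (closure(a_set, fds) & set(attributes)) - a_set  # Bs: all dependent attributes
    c_set = set(attributes) - a_set - b_set                  # Cs: everything else
    return a_set | b_set, a_set | c_set

car = {"CarId", "Make", "Model", "Manuf", "Year", "Color"}
car_fds = [
    ({"CarId"}, {"Make", "Model", "Manuf", "Year", "Color"}),
    ({"Make"}, {"Manuf"}),
]

r1, r2 = bcnf_split(car, car_fds, {"Make"})
print(sorted(r1))  # ['Make', 'Manuf']
print(sorted(r2))  # ['CarId', 'Color', 'Make', 'Model', 'Year']
```

The two results match the MakeManuf and Car relations obtained in the 3NF example. A full conversion would repeat the step on R1 and R2 with their projected dependencies, which this sketch does not attempt.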

Second Normal Form (2NF)
• X --> Y is a full functional dependency if the removal of any attribute A from X removes the dependency
  – not X-{A} --> Y
• X --> Y is a partial dependency if some attribute A may be removed without removing the dependency
  – X-{A} --> Y
• A relation schema R is in 2NF if every nonprime attribute is fully functionally dependent on every key of R

Homework 4
1. What are the differences between an E-R model and a relational model of an information system?
  – No representation for relationships
  – Restrictions on domains
  – Specific representation as tables
2. Why must keys be declared? Why is it not always possible to infer a key constraint from the contents of a table?
  – Key constraints are based on meaning, not on state
  – Example from Fig. 4.1
3. Why is there no such thing as a weak relation schema?
  – Every schema has a superkey
4. List the differences between attributes in an E-R diagram and attributes in the relational model. What restrictions are placed on attributes in the relational model?
  – Suggestions?

Homework 4, problem 5
• Translate the E-R diagram of Fig. 4.4 into a database schema.
  – Step 1: Entity class Customer to relation Customer
  – Step 2: Add simple attributes to Customer
  – Step 3: Add composite attributes to Customer
  – Step 4: Weak entity class OtherUser to relation OtherUser
  – Step 5: Identifying relationship type to attribute
  – Step 6: Add otherUser partial key as attribute and define key of relation OtherUser

Homework 5
1. Give examples of three reasons why redundancy in schemas creates problems.
  – Different update anomalies
2. Give an example (not from the book) of each type (deletion, insertion, modification) of anomaly for the schema and table of Fig. 5.1.
  – Suggestions?
3. Is it necessary to declare functional dependencies, or is it possible to infer them from sample tables?
• Are there any apparent functional dependencies that can be inferred from the table of Fig. 5.1 that are not functional dependencies?

Homework 5, problem 5
5. Suppose a student registration database has a table for student grades:
  – Grades: (studentId, lastName, firstName, courseId, courseTitle, sectionNumber, semester, numHours, meetingTime, meetingRoom, grade)
  a. Give a sample table for the Grades schema that shows the redundancy inherent in the meaning of the information.
  b. Define appropriate functional dependencies for the Grades schema.
  c. List all of the non-trivial dependencies that can be inferred from the dependencies of part b.
  d. Identify and remove any 2NF violations in the Grades schema. Show the resulting schemas and tables.
  e. Identify and remove any 3NF violations in the result of part d. Show the resulting schemas and tables.

Homework 5, problems 6 and 7
6. With no functional dependencies defined, what is the key of R?
  – {A, B, C, D, E, F, G, H}
7. Suppose {A, B} is the key of R and A -> {C, D} and B -> {E, F, H}
  a. List all of the non-trivial functional dependencies of R.
    – {A,B}->{C, D, E, F, G, H}, …
  b. What dependencies represent 2NF violations?
    – {A}->{C, D}, {B}->{E, F, H}
  c. Eliminate the 2NF violations by decomposition
    – S1: (A, B, G), S2: (A, C, D), S3: (B, E, F, H)
  d, e. No 3NF violations
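
The answer to problem 7c can be cross-checked with a short script. It treats the key declaration as the dependency {A, B} -> {C, D, E, F, G, H}, finds the nonprime attributes determined by a proper subset of the key, and moves each group into its own schema. This is a sketch of the 2NF step under those assumptions, with variable names chosen for the example.

```python
from itertools import combinations

def closure(attrs, fds):
    """Return every attribute functionally determined by attrs under fds."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

R = set("ABCDEFGH")
key = {"A", "B"}
fds = [
    (key, R - key),            # {A, B} is the key, so it determines everything else
    ({"A"}, {"C", "D"}),
    ({"B"}, {"E", "F", "H"}),
]

schemas = []
moved = set()
for size in range(1, len(key)):
    for subset in combinations(sorted(key), size):
        # Nonprime attributes determined by only part of the key: 2NF violations.
        dependents = closure(subset, fds) - key
        if dependents:
            schemas.append(set(subset) | dependents)
            moved |= dependents

# What stays with the full key is everything that was not moved out.
schemas.insert(0, R - moved)
print([sorted(s) for s in schemas])
# [['A', 'B', 'G'], ['A', 'C', 'D'], ['B', 'E', 'F', 'H']]
```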

Homework 5, problem 7, revised
• Suppose {E} -> {F}
  d. What dependencies represent 3NF violations?
    – {E}->{F}
  e. Eliminate the 3NF violations
    – S1: (A, B, G), S2: (A, C, D),
    – S4: (B, E, H), S5: (E, F)
  f. Suppose {E} -> {F, G}
    – Then {B} -> {E, F, G, H}
    – Hence, decomposition is different

