Presentation is loading. Please wait.

Presentation is loading. Please wait.

CSc-340 6b1 Relational Database Design Chapter 8 [1 of 2] Features of Good Relational Designs Atomic Domains and First Normal Form Decomposition Using.

Similar presentations


Presentation on theme: "CSc-340 6b1 Relational Database Design Chapter 8 [1 of 2] Features of Good Relational Designs Atomic Domains and First Normal Form Decomposition Using."— Presentation transcript:

1 CSc-340 6b1 Relational Database Design Chapter 8 [1 of 2] Features of Good Relational Designs Atomic Domains and First Normal Form Decomposition Using Functional Dependencies Functional-Dependency Theory

2 Approaches to Designing a Set of Relations The 1 st approach is (Chapter 7) a Top-Down Design (Relational Design by Analysis): 1. Designing a conceptual schema in a high-level data model, such as the EER model 2. Mapping the conceptual schema into a set of relations using mapping procedures.

3 Approaches to Designing a Set of Relations The 2 nd approach is (Chapter 8) a Bottom-Up Design (Relational Design by Synthesis): 1. First construct a minimal set of FDs Assumes that all possible functional dependencies are known. 2. a normalization algorithm is applied to construct a target set of 3NF or BCNF relations. start by one large relation schema, called the universal relation, which includes all the database attributes. then repeatedly perform decomposition until it is no longer feasible or no longer desirable, based on the functional and other dependencies specified by the database designer.

4 Well-Structured Relations A relation that contains minimal data redundancy A relation that contains minimal data redundancy Allows users to insert, delete, and update rows without causing data inconsistencies Allows users to insert, delete, and update rows without causing data inconsistencies Goal is to avoid "anomalies" Goal is to avoid "anomalies" General rule of thumb: A table should not pertain to more than one entity type

5 Anomalies An anomaly is an inconsistent, incomplete, or contradictory state of the database Insertion anomaly – user is unable to insert a new record when it should be possible to do so Deletion anomaly – when a record is deleted, other information that is tied to it is also deleted Update anomaly –a record is updated, but other appearances of the same items are not updated

6 Example Question–Is this a relation? Answer–Yes: Unique rows Question–What’s the primary key? Answer–Composite: Emp_ID, Course_Title

7 Anomalies in this Table Insertion–we need to specify a class when adding a new person, need to add person when adding a class Insertion–we need to specify a class when adding a new person, need to add person when adding a class Deletion–if we remove employee 140, we lose information about the existence of a Tax Acc class Deletion–if we remove employee 140, we lose information about the existence of a Tax Acc class Modification–giving a salary increase to employee 100 forces us to update multiple records Modification–giving a salary increase to employee 100 forces us to update multiple records Why do these anomalies exist? Because there are two themes (entity types) in this one relation. This results in data duplication and an unnecessary dependency between the entities

8 Example of Bad Design DRINKER(name, addr, beersLiked, manf, favBeer) nameaddrbeersLikedmanffavBeer JanewayVoyagerBudA.B.WickedAle Janeway???WickedAlePete’s??? SpockEnterpriseBud???Bud Data is redundant, because each of the ???’s can be figured out since name implies addr & favBeer and beersLiked implies manf.

9 This Bad Design Also Exhibits Anomalies nameaddrbeersLikedmanffavBeer JanewayVoyagerBudA.B.WickedAle JanewayVoyagerWickedAlePete’sWickedAle SpockEnterpriseBudA.B.Bud Update anomaly: if Janeway is transferred to Intrepid, will we remember to change each of her tuples? Deletion anomaly: If nobody likes Bud, we lose track of the fact that Anheuser-Busch manufactures Bud.

10 Objectives of Normalization Develop a "good" description of the data, its relationships and constraints Produce a stable set of relations that Is a faithful model of the enterprise Is highly flexible Reduces redundancy saves space and reduces inconsistency in data Is free of update, insertion and deletion anomalies

11 Normal Forms First Normal Form - 1NF Second Normal Form - 2NF Third Normal Form - 3NF Boyce-Codd Normal Form - BCNF Fourth Normal Form - 4NF Fifth Normal Form - 5NF Domain/Key Normal Form – DKNF Others…

12 12 Steps in normalization

13 Data Normalization Background Info Needed: all the attributes that must be incorporated into the database a list of all the defining associations between the attributes (i.e., the functional dependencies). a means of expressing that the value of one particular attribute is associated with a single, specific value of another attribute. If we know that one of these attributes has a particular value, then the other attribute must have some other value.

14 First Normal Form A relation is in 1NF if every attribute is single-valued for each tuple Each cell of the table has only one value in it Domains of attributes are atomic: no sets, lists, repeating fields or groups allowed in domains

15 15 Table with multivalued attributes, not in 1 st normal form

16 16 Table with no multivalued attributes and unique rows, in 1 st normal form Note: this is a relation, but not a well-structured one

17 Anomalies in this Table Insertion–if new product is ordered for order 1007 of existing customer, customer data must be re-entered, causing duplication Insertion–if new product is ordered for order 1007 of existing customer, customer data must be re-entered, causing duplication Deletion–if we delete the Dining Table from Order 1006, we lose information concerning this item's finish and price Deletion–if we delete the Dining Table from Order 1006, we lose information concerning this item's finish and price Update–changing the price of product ID 4 requires update in several records Update–changing the price of product ID 4 requires update in several records Why do these anomalies exist? Because there are multiple themes (entity types) in one relation. This results in duplication and an unnecessary dependency between the entities

18 Ensuring 1NF [1 of 3] Best solution: For each multi-valued attribute, create a new table, in which you place the key of the original table and the multi-valued attribute. Keep the original table, with its key Example: STUDENT(stuId, lastName, {major}, credits, status, socSecNo) becomes NEWSTUDENT(stuId, lastName, credits,status, socSecNo) MAJORS(stuId, major)

19 Another method for 1NF [2 of 3] “Flatten” the original table by making the multi-valued attribute part of the key Student(stuId, lastName, major, credits, status, socSecNo)

20 A Third Method for 1NF [3 of 3] If the number of repeats is limited, make additional columns for multiple values Student(stuId, lastName, major1, major2, credits, status, socSecNo) Hannay Reels: 3 Product Types, 2 Address Lines

21 Just Getting Started First normal form is merely a starting point in the normalization process. First normal form contains a great deal of data redundancy.

22 Functional Dependency A functional dependency (FD) is a type of relationship between attributes If A and B are sets of attributes of relation R, we say B is functionally dependent on A if each A value in R has associated with it exactly one value of B in R. Recall the concept of "single-valued" function Alternatively, if two tuples have the same A values, they must also have the same B values Write A → B, read A functionally determines B, or B functionally dependent on A

23 Example of FDs Let R be NewStudent(stuId, lastName, major, credits, status, socSecNo ) FDs in R include stuId → lastName, but not the reverse stuId → lastName, major, credits, status, socSecNo, stuId socSecNo → {stuId, lastName, major, credits, status, socSecNo} credits → status, but not status → credits A functional dependency is a generalization of the notion of a key.

24 "General Hardware": SALESPERSON and PRODUCT Functional Dependencies:

25 Beer Example Drinkers(name, addr, beersLiked, manf, favBeer). Reasonable FD’s to assert: 1. name  addr 2. name  favBeer 3. beersLiked  manf

26 Related Attributes nameaddr beersLiked manffavBeer JanewayVoyager Bud A.B.WickedAle JanewayVoyager WickedAle Pete’sWickedAle SpockEnterprise Bud A.B.Bud Because name  addr Because name  favBeer Because beersLiked  manf

27 FD’s With Multiple Attributes No need for FD’s with > 1 attribute on right. But sometimes convenient to combine FD’s as a shorthand. Example: name → addr and name → favBeer become name → {addr, favBeer} > 1 attribute on left may be essential. Example: {bar, beer} → price

28 FD’s From “Laws of Physics” While most FD’s come from E-R keyness and many-one relationships, some are really physical laws. Example: "no two courses can meet in the same room at the same time" tells us: {hour, room} → course.

29 Trivial Functional Dependency The FD X → Y is trivial if set {Y} is a subset of set {X} Examples: If A and B are attributes of R, {A} → {A} {A,B} → {A} {A,B} → {B} {A,B} → {A,B} are all trivial FDs

30 Full Functional Dependency In relation R, set of attributes B is fully functionally dependent on set of attributes A of R, if B is functionally dependent on A but not functionally dependent on any proper subset of A This means every attribute in A is needed to functionally determine B

31 Partial Functional Dependency NewClass(courseNo, stuId, stuLastName, facId, schedule, room, grade) FDs: {courseNo,stuId} → grade {courseNo,stuId} → lastName ** partial FD stuId → lastName {courseNo,stuId} → facId ** partial FD courseNo → facId {courseNo,stuId} → schedule ** partial FD courseNo → schedule {courseNo,stuId} → room ** partial FD courseNo → room …plus trivial FDs that are partial…

32 Functional Dependencies and Keys Functional Dependency: The value of one attribute (the determinant) determines the value of another attribute Functional Dependency: The value of one attribute (the determinant) determines the value of another attribute Candidate Key: Candidate Key: A unique identifier. One of the candidate keys will become the primary key A unique identifier. One of the candidate keys will become the primary key E.g. perhaps there is both credit card number and SS# in a table…in this case both are candidate keys E.g. perhaps there is both credit card number and SS# in a table…in this case both are candidate keys Each non-key field is functionally dependent on every candidate key Each non-key field is functionally dependent on every candidate key

33 33 Candidate Key Which candidate key to pick depends on the application using the database. Alternate key is a candidate key that was not chosen to be the primary key of the relation.

34 Second Normal Form A relation is in second normal form (2NF) if it is in first normal form and all the non-key attributes are fully functionally dependent on the key. No non-key attribute is FD on just part of the key If key has only one attribute and R is 1NF, then R is automatically 2NF

35 Converting to 2NF Identify each partial FD Remove the attributes that depend on each of the determinants so identified Place these determinants in separate relations along with their dependent attributes In original relation keep the composite key and any attributes that are fully functionally dependent on all of it Even if the composite key has no dependent attributes, keep that relation to connect the others

36 2NF Example NEWCLASS(courseNo, stuId, stuLastName, facId, schedule, room, grade ) FDs grouped by determinant: {courseNo} → {courseNo, facId, schedule, room} {stuId} → {stuId, lastName} {courseNo, stuId} → {courseNo, stuId, facId, schedule, room, lastName, grade} Create tables grouped by determinants: COURSE(courseNo, facId, schedule, room) STUD(stuId, lastName) Keep relation with original composite key, with attributes FD on it, if any NEWSTUD2( courseNo, stuId, grade)

37 37 General Hardware Company: Second Normal Form In SALESPERSON, Salesperson Number is the sole primary key attribute. Every nonkey attribute of the table is fully defined just by Salesperson Number. Similar logic for PRODUCT and QUANTITY tables.

38 38 General Hardware Company: Second Normal Form

39 Transitive Dependency If A, B, and C are attributes of relation R, such that A → B, and B → C, then C is transitively dependent on A Example: NewStudent (stuId, lastName, major, credits, status) FD: credits → status By transitivity: stuId → credits AND credits → status implies stuId → status Transitive dependencies cause update, insertion, deletion anomalies.

40 Third Normal Form To be 3NF, relation must be 2NF and have no transitive dependencies No non-key attribute determines another non-key attribute. Here key includes “candidate key”

41 Making a relation 3NF For example, NEWSTUDENT(stuId, lastName, major, credits, status) with FD: credits → status 1)Remove the dependent attribute, status, from the relation 2)Create a new table with the dependent attribute and its determinant, credits 3)Keep the determinant, but not the dependent attribute, in the original table NEWSTUD2 (stuId, lastName, major, credits) STATS (credits, status)

42 Order_ID  Order_Date, Customer_ID, Customer_Name, Customer_Address Therefore, NOT in 2 nd Normal Form Customer_ID  Customer_Name, Customer_Address Product_ID  Product_Description, Product_Finish, Unit_Price Order_ID, Product_ID  Order_Quantity Functional dependency diagram for INVOICE

43 Partial dependencies are removed, but there are still transitive dependencies Getting it into Second Normal Form Removing partial dependencies

44 44 Transitive dependencies are removed Removing Transitive dependencies Getting it into Third Normal Form

45 45 General Hardware Company: Second Normal Form

46 46 General Hardware Company: Third Normal Form

47 Third Normal Form Important points about the third normal form structure are: It is completely free of data redundancy. All foreign keys appear where needed to logically tie together related tables.

48 48 Violating 3NF? Zip Codes determine City & State AND Company (therefore CustID) determines Zip Hannay Reels CustomerMaster does not split out zip/city/state into another table Why might this be? I screwed up??? ease/speed of access get full address with one data fetch (one table) lack of repetition (not many customers at same zip code in same city) foreign customers (no zip)

49 Closure of a Set of Functional Dependencies Given a set F of functional dependencies, there are certain other functional dependencies that are logically implied by F. For example: If A  B and B  C, then we can infer that A  C The set of all functional dependencies logically implied by F is the closure of F. We denote the closure of F by F +. F + is a superset of F.

50 Closure of a Set of Functional Dependencies We can find F +, the closure of F, by repeatedly applying Armstrong’s Axioms: if   , then    (reflexivity) if   , then      (augmentation) if   , and   , then    (transitivity) These rules are sound (generate only functional dependencies that actually hold), and complete (generate all functional dependencies that hold).

51 Example R = (A, B, C, G, H, I) F = { A  B A  C CG  H CG  I B  H} some members of F + A  H by transitivity from A  B and B  H AG  I by augmenting A  C with G, to get AG  CG and then transitivity with CG  I CG  HI by augmenting CG  I to infer CG  CGI, and augmenting of CG  H to infer CGI  HI, and then transitivity

52 Procedure for Computing F + To compute the closure of a set of functional dependencies F: F + = F repeat for each functional dependency f in F + apply reflexivity and augmentation rules on f add the resulting functional dependencies to F + for each pair of functional dependencies f 1 and f 2 in F + if f 1 and f 2 can be combined using transitivity then add the resulting functional dependency to F + until F + does not change any further

53 Closure of Functional Dependencies (Cont.) Additional rules: If    holds and    holds, then     holds (union) If     holds, then    holds and    holds (decomposition) If    holds and     holds, then     holds (pseudotransitivity) The above rules can be inferred from Armstrong’s axioms.

54 Canonical Cover Sets of functional dependencies may have redundant dependencies that can be inferred from the others For example: A  C is redundant in: {A  B, B  C, A  C} Parts of a functional dependency may be redundant E.g.: on RHS: {A  B, B  C, A  CD} can be simplified to {A  B, B  C, A  D} E.g.: on LHS: {A  B, B  C, AC  D} can be simplified to {A  B, B  C, A  D} Intuitively, a canonical cover of F is a “minimal” set of functional dependencies equivalent to F, having no redundant dependencies or redundant parts of dependencies Every set of FDs has an equivalent minimal set There can be several equivalent minimal sets

55 Equivalent FDs Two sets of FDs F and G are equivalent if: Every FD in F can be inferred from G, and Every FD in G can be inferred from F Hence, F and G are equivalent if F + = G + Definition (Covers): F covers G if every FD in G can be inferred from F (i.e., if G + is subset-of F + ) F and G are equivalent if F covers G and G covers F

56 Finding a Minimal Cover Finding a minimal cover G for F 1.Set G := F. 2.Replace each FD X  {A 1, A 2,..., A k } in G by the n functional dependencies X  A 1, X  A 2, …, X  A k. 3.For each FD X  A in G For each attribute B that is an element of X if ((G -{X  A})  {(X-{B})  A}) is equivalent to G, then replace X  A with (X-{B})  A in G. 4.For each remaining FD X  A in G, if (G-{X  A}) is equivalent to G, then remove X  A from G.

57 Finding a Minimal Cover Example 1 Example: {A → B, ABCD → E, EF → GH, ACDF → EG} Make RHS a single attribute: {A → B, ABCD → E, EF → G, EF → H, ACDF → E, ACDF → G} Minimize LHS: ACD → E instead of ABCD → E Eliminate redundant FDs Can ACDF → G be removed? Can ACDF → E be removed? Final answer: {A → B, ACD → E, EF → G, EF → H}

58 Canonical Cover Example 2 Compute the minimal cover of the following set of functional dependencies: {ABC  DE, BD  DE, E  CF, EG  F} ABC  D ABC  E BD  D// reflexive BD  E E  C E  F EG  F// augmentation The minimal cover is: {ABC  D, BD  E, E  C, E  F}

59 CSc-340 6b59 Homework/Project Homework due Next Class: 7.1, 7.3, 7.5, 7.6, 7.7, 7.8, 7.9, 7.10, 7.11, 7.12 No Homework due in One Week (Test) Project Verify (or convert to) 1NF, 2NF, 3NF

60 CSc340 6b60 Test in One Week Chapters 6-8 Give/Interpret Domain Relational Calculus Expressions Design an EER Diagram; Convert to Relational Schema (tuples, primary keys, foreign keys) Convert a relation to 1NF using each of 3 methods Test for and/or Convert to 2NF, 3NF, BCNF, 4NF Find a 3NF lossless join decomposition that preserves dependencies List all non-trivial Functional Dependencies for a relation Give Closure Set for Functional Dependencies List Candidate Keys; Give Canonical Cover Open Book 4 Sheets of Notes

61 CSc-340 6b61 In-Class Exercise (You may work together. You are free to go once you show me your work.) List the Functional Dependences in: (EmpID, CourseTitle, Name, Phone, Salary, DateCompleted) 8.6 Given the relation R and Functional Dependencies F, compute F + : R = (A, B, C, D, E); F: A  BC, B  D, CD  E, E  A Find Candidate Keys (more than 1) Find Canonical Cover


Download ppt "CSc-340 6b1 Relational Database Design Chapter 8 [1 of 2] Features of Good Relational Designs Atomic Domains and First Normal Form Decomposition Using."

Similar presentations


Ads by Google