Chapter 4 Logical Database Design and the Relational Model


1 Chapter 4 Logical Database Design and the Relational Model
Jason C. H. Chen, Ph.D. Professor of MIS School of Business Administration Gonzaga University Spokane, WA 99258

2 Objectives
Define terms
List five properties of relations
State two properties of candidate keys
Define first, second, and third normal form
Describe problems from merging relations
Transform E-R and EER diagrams to relations
Create tables with entity and relational integrity constraints
Use normalization to convert anomalous tables to well-structured relations

3 Problem Solving for Modeling a Database Project
[Flow diagram. Labels: Business Problem; Study and Analyze w/Team; User interview & Integrated Model; ER or EER or OO; ?????; ?????; ?????; Normalization (3NF)]

4 NOTE: all relations are in 1st Normal form
Definition: A relation is a ______, ______-dimensional table of data
Table is made up of rows (records) and columns (attributes or fields)
Not all tables qualify as relations
Requirements:
  Every relation has a unique name
  Every attribute value is atomic (not multivalued, not composite)
  Every row is unique (can't have two rows with exactly the same values for all their fields)
  Attributes (columns) in tables have unique names
  The order of the columns is irrelevant
  The order of the rows is irrelevant

5 Correspondence with ER Model
Relations (tables) correspond with entity types and with many-to-many relationship types
Rows correspond with entity instances and with many-to-many relationship instances
Columns correspond with attributes
NOTE: The word relation (in relational database) is NOT the same as the word relationship (in ER model)

6 Key Fields
Keys are special fields that serve two main purposes:
  _______ keys are unique identifiers of the relation in question. Examples include employee numbers, social security numbers, etc. This is how we can guarantee that all rows are unique.
  _______ keys are identifiers that enable a dependent relation (on the many side of a relationship) to refer to its parent relation (on the one side of the relationship).
Keys can be simple (a single field) or composite (more than one field)
Surrogate key
Keys are usually used as indexes to speed up the response to user queries (more on this in Ch. 5)

7 Sample E-R Diagram (Figure 2-1)

8 Foreign Key (implements 1:N relationship between customer and order)
Figure 4-3 Schema for four relations (Pine Valley Furniture Company)
Figure annotations:
  Primary Key
  Foreign Key (implements 1:N relationship between customer and order)
  Combined, these are a composite primary key (uniquely identifies the order line); individually they are foreign keys (implement M:N relationship between order and product)

9 Fig. 4-3: Schema for four relations (Pine Valley Furniture)
Graphical and Text Representations
CUSTOMER(Customer_ID, Customer_name, Address, City, State, Zip)
ORDER(Order_ID, Order_Date, Customer_ID)
ORDER_LINE(Order_ID, Product_ID, Quantity)
PRODUCT(Product_ID, Product_Description, Product_Finish, Unit_Price, On_Hand)

10 Fig. 4-1: EMPLOYEE1 Relation with sample data
EmpID  Name              DeptName      Salary
100    Margaret Simpson  Marketing     48,000
140    Allen Beeton      Accounting    52,000
110    Chris Lucero      Info. System  43,000
190    Lorenzo Davis     Finance       55,000
150    Susan Martin      Marketing     42,000

11 Fig. 4-2: Eliminating multi-valued attributes
(a) Table with repeating groups or multi-valued attributes (Un-Normalized)
EMPLOYEE
EmpID  Name              DeptName      Salary  CourseTitle  DateCompleted
100    Margaret Simpson  Marketing     48,000  SPSS         /19/200X
                                               Surveys      /7/200X
140    Allen Beeton      Accounting    52,000  Tax Acc      12/8/200X
110    Chris Lucero      Info. System  43,000  SPSS         /12/200X
                                               C            /22/200X
190    Lorenzo Davis     Finance       55,000
150    Susan Martin      Marketing     42,000  SPSS         /16/200X
                                               Java         /12/200X

12 Figure 4-2 (b) EMPLOYEE2 relation
Figure 4-2 (a) Table with repeating groups: how to "remove" them (and solve the problem)
[Un-normalized EMPLOYEE table, as on slide 11]
Figure 4-2 (b) EMPLOYEE2 relation
[EMPLOYEE2 table, as on slide 13: one row per employee and course, with the employee data repeated in each row]

13 Fig. 4-2: Eliminating multi-valued attributes
(b) EMPLOYEE2 Relation (Normalized)
EMPLOYEE2
EmpID  Name              DeptName      Salary  CourseTitle  DateCompleted
100    Margaret Simpson  Marketing     48,000  SPSS         /19/200X
100    Margaret Simpson  Marketing     48,000  Surveys      /7/200X
140    Allen Beeton      Accounting    52,000  Tax Acc      12/8/200X
110    Chris Lucero      Info. System  43,000  SPSS         /12/200X
110    Chris Lucero      Info. System  43,000  C            /22/200X
190    Lorenzo Davis     Finance       55,000
150    Susan Martin      Marketing     42,000  SPSS         /16/200X
150    Susan Martin      Marketing     42,000  Java         /12/200X

14 Constraints
Domain Constraints: Allowable values for an attribute. A domain definition contains: domain name, data type, size, meaning, and allowable values/range (if applicable).
Entity Integrity: No primary key attribute may be null.
Referential Integrity: A relationship between primary key and foreign key.
Operational Constraints: Business rules (see Chapter 3)

15 Integrity Constraints
_________ Integrity: rule that states that any foreign key value (on the relation of the many side) MUST match a primary key value in the relation of the one side (or the foreign key can be null).
For example, delete rules:
  Restrict: don't allow delete of "parent" side if related rows exist in "dependent" side
  Cascade: automatically delete "dependent" side rows that correspond with the "parent" side row to be deleted
    e.g., DROP TABLE tablename CASCADE CONSTRAINTS; (p.110 of Oracle 11g). Note that this statement drops a whole table together with the foreign key constraints that reference it; cascading of row deletes is declared with ON DELETE CASCADE on the foreign key.
  Set-to-Null: set the foreign key in the dependent side to null if deleting from the parent side → not allowed for weak entities
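The three delete rules can be tried out with a small sketch. It uses Python's built-in sqlite3 module rather than Oracle, and the table names (customer, order_t) and the sample row are assumptions made for illustration only.

```python
import sqlite3

def demo_delete_rule(rule):
    """Delete a parent row under the given ON DELETE rule; return what is left in the child table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled
    conn.execute("CREATE TABLE customer (customer_id INTEGER PRIMARY KEY, name TEXT)")
    conn.execute(
        "CREATE TABLE order_t (order_id INTEGER PRIMARY KEY, "
        "customer_id INTEGER REFERENCES customer(customer_id) ON DELETE " + rule + ")"
    )
    conn.execute("INSERT INTO customer VALUES (1, 'Hypothetical Customer')")
    conn.execute("INSERT INTO order_t VALUES (1001, 1)")
    try:
        conn.execute("DELETE FROM customer WHERE customer_id = 1")
    except sqlite3.IntegrityError:
        return "blocked"  # the parent row could not be deleted
    return conn.execute("SELECT order_id, customer_id FROM order_t").fetchall()

for rule in ("RESTRICT", "CASCADE", "SET NULL"):
    print(rule, demo_delete_rule(rule))
```

Under RESTRICT the delete is refused, under CASCADE the dependent order disappears with its customer, and under SET NULL the order survives with an empty foreign key.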

16 Fig. 4-5: Referential integrity constraints (Pine Valley Furniture)
Referential integrity constraints are drawn via arrows from dependent to parent table
[Figure labels: pk, fk, cpk/pk]

17 Figure 4-6 SQL table definitions
Referential integrity constraints are implemented with foreign key to primary key references
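As a rough sketch of what such table definitions look like, the four Pine Valley relations from Fig. 4-3 can be declared as follows. This uses Python's sqlite3 module instead of Oracle; the column types, the NOT NULL choices, and the name order_t (ORDER is a reserved word in SQL) are assumptions, not the textbook's Figure 4-6.

```python
import sqlite3

SCHEMA = """
CREATE TABLE customer (
    customer_id   INTEGER PRIMARY KEY,
    customer_name TEXT NOT NULL,
    address TEXT, city TEXT, state TEXT, zip TEXT
);
CREATE TABLE order_t (
    order_id    INTEGER PRIMARY KEY,
    order_date  TEXT,
    customer_id INTEGER NOT NULL REFERENCES customer(customer_id)  -- FK, implements 1:N
);
CREATE TABLE product (
    product_id INTEGER PRIMARY KEY,
    product_description TEXT, product_finish TEXT,
    unit_price REAL, on_hand INTEGER
);
CREATE TABLE order_line (
    order_id   INTEGER NOT NULL REFERENCES order_t(order_id),
    product_id INTEGER NOT NULL REFERENCES product(product_id),
    quantity   INTEGER,
    PRIMARY KEY (order_id, product_id)  -- composite primary key, implements M:N
);
"""

def build():
    """Create an in-memory database with foreign key enforcement switched on."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript(SCHEMA)
    return conn

def try_insert(conn, sql, params):
    """Return True if the row is accepted, False if an integrity constraint rejects it."""
    try:
        conn.execute(sql, params)
        return True
    except sqlite3.IntegrityError:
        return False
```

With this in place, an order that names a non-existent customer, or an order line that names a non-existent product, is rejected by the database itself rather than by application code.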

18 Referential Integrity (Addition and Deletion)
PK FK

19

20 Well-Structured Relations
A well-structured relation contains minimal redundancy and allows users to insert, modify, and delete the rows in a table without errors or inconsistencies.
The following anomalies should be removed for a well-structured relation:
  Insertion Anomaly
  Deletion Anomaly
  Modification Anomaly

21 Is EMPLOYEE2 a Well-Structured relation?
[EMPLOYEE2 table, as on slide 13]
NO, since there are anomalies on insertion, deletion, and modification

22 EMPLOYEE2
[EMPLOYEE2 table, as on slide 13]

23 Is EMPLOYEE2 a Well-Structured relation?
No/Yes? WHY?
NO, since there are anomalies on insertion, deletion, and modification

24 __________ Anomaly
When inserting a new row, the user must supply values for both EmpID (PK) and CourseTitle (CPK and FK). This is an (insertion) anomaly, since the user should be able to enter employee data without knowing (supplying) course (title) data.
[EMPLOYEE2 table, as on slide 13]

25 Deletion Anomaly
Deleting employee number 140 loses not only the employee's information but also the fact that the course had an offering that was completed on that date.
[EMPLOYEE2 table, as on slide 13]

26 ________ Anomaly
If employee number 100 gets a salary increase, we must record the increase in each of the rows for that employee (two occurrences); otherwise the data will be inconsistent.
[EMPLOYEE2 table, as on slide 13]

27 Fig. 4-7: EMP_COURSE: Normalized Relations from EMPLOYEE2
EMPLOYEE2 [table as on slide 13, with dates shown as 201X] is split into two relations:

EMPLOYEE1
EmpID  Name              DeptName      Salary
100    Margaret Simpson  Marketing     48,000
140    Allen Beeton      Accounting    52,000
110    Chris Lucero      Info. System  43,000
190    Lorenzo Davis     Finance       55,000
150    Susan Martin      Marketing     42,000

EMP_COURSE
EmpID  CourseTitle  DateCompleted
100    SPSS         /19/201X
100    Surveys      /7/201X
140    Tax Acc      12/8/201X
110    SPSS         /12/201X
110    C            /22/201X
150    SPSS         /19/201X
150    Java         /12/201X

Is there any anomaly? YES: a deletion anomaly, since department information may be lost.

28 Problem Solving for Modeling a Database Project
[Flow diagram. Labels: Business Problem; Study and Analyze w/Team; User interview & Integrated Model; ER or EER or OO; Transformation to Relations; Relations Transformation (Seven Cases); ?????; Normalization (3NF)]

29 Seven Cases of Transforming EE-R Diagrams into Relations
1. Map Regular Entities 2. Map Weak Entities 3. Map Binary Relationships 4. Map Associative Entities 5. Map Unary Relationships 6. Map Ternary (and n-ary) Relationships 7. Map Supertype/Subtype Relationships

30 Transforming EE-R Diagrams into Relations
1. Map Regular Entities to Relations
  E-R attributes map directly onto the relation (Fig. 4-8).
  Composite attributes: Use only their simple, component attributes (Fig. 4-9).
  Multi-valued attribute: Becomes a separate relation with a foreign key taken from the superior entity (Fig. 4-10).

31 [Same name on relation and entity type]
Fig. 4-8: Mapping the regular entity CUSTOMER (a) CUSTOMER entity type (b) CUSTOMER relation [Same name on relation and entity type]

32 [Use only their simple, component attributes]
Figure 4-9 Mapping a composite attribute (a) CUSTOMER entity type with composite attribute (b) CUSTOMER relation with address detail [Use only their simple, component attributes]

33 (b) Mapping a multivalued attribute
Fig. 4-10: Mapping an entity with a multivalued attribute
(a) Employee entity type with multivalued attribute
(b) Mapping a multivalued attribute
Multivalued attribute becomes a separate relation with foreign key
[Two relations are created: one contains all of the attributes except the multi-valued attribute; the second contains the pk of the first and the multi-valued attribute]
One-to-many relationship between original entity and new relation
[Figure labels: EMPLOYEE; EmployeeID; EmployeeName; EmployeeAddress; Skill]

34 Break ! (Ch. 4 - Part I) In class exercise
#1-I (a) (p.193), apply Figure 2-8.
HW #1-I (c), apply Figure 2-11.a
[Figure 2-8]

35 Problem Solving for Modeling a Database Project
[Flow diagram. Labels: Business Problem; Study and Analyze w/Team; User interview & Integrated Model; ER or EER or OO; Transformation to Relations; Relations Transformation (Seven Cases); ?????; Normalization (3NF)]

36 Seven Cases of Transforming EE-R Diagrams into Relations
1. Map Regular Entities 2. Map Weak Entities 3. Map Binary Relationships 4. Map Associative Entities 5. Map Unary Relationships 6. Map Ternary (and n-ary) Relationships 7. Map Supertype/Subtype Relationships

37 Transforming EER Diagrams into Relations
2. Mapping Weak Entities
  Becomes a separate relation with a foreign key taken from the superior entity
  Primary key composed of:
    Partial identifier of weak entity
    Primary key of identifying relation (strong entity) (Fig. 4-11)

38 Question: Why need all FOUR attributes to be a CK?
Fig. 4-11: Example of mapping a weak entity
(a) Weak entity DEPENDENT
(b) Relations resulting from weak entity [Becomes a separate relation with a foreign key taken from the superior entity]
Question: Why are all FOUR attributes needed to form a CK?
One employee (e.g., 100) might have more than one dependent; therefore, all FOUR attributes are needed to form a CK.
NOTE: the domain constraint for the foreign key should NOT allow null values if DEPENDENT is a weak entity
[Figure labels: PK, CPK, FK]

39 Transforming EE-R Diagrams Into Relations
3. Map Binary Relationships
  One-to-Many: Primary key on the one side becomes a foreign key on the many side (Fig. 4-12).
  Many-to-Many: Create a new relation with the primary keys of the two entities as its primary key (Fig. 4-13).
  One-to-One: Primary key on the mandatory side becomes a foreign key on the optional side (Fig. 4-14).

40 [Primary key on the one side becomes a foreign key on the many side]
Fig. 4-12: Example of mapping a 1:M relationship
(a) Relationship between customers and orders. Note the mandatory one.
[Primary key on the one side becomes a foreign key on the many side]
(b) Mapping the relationship. Again, no null value in the foreign key; this is because of the mandatory minimum cardinality.
[Figure label: Foreign key]

41 Figure 4-13 Example of mapping an M:N relationship
a) Completes relationship (M:N) The Completes relationship will need to become a separate relation

42 Figure 4-13 Example of mapping an M:N relationship (cont.)
b) Three resulting relations Composite primary key (cpk) New intersection relation Foreign key

43 Figure 4-14 Example of mapping a binary 1:1 relationship
a) In_charge relationship (1:1) mandatory optional Often in 1:1 relationships, one direction is optional.

44 Fig. 4-14: (b) Resulting relations
[Primary key on the mandatory side becomes a foreign key on the optional side]
The foreign key has the same domain as Nurse_ID
[Figure labels: PK (mandatory side); FK (optional side)]

45 Transforming EE-R Diagrams Into Relations
4. Map Associative Entities
  Identifier Not Assigned
    Default primary key for the association relation is composed of the primary keys of the two entities (as in M:N relationship) (Fig. 4-15)
  Identifier Assigned
    It is natural and familiar to end-users.
    Default identifier may not be unique. (Fig. 4-16)

46 Figure 4-15 Example of mapping an associative entity
a) An associative entity
[Figure labels: A, B]

47 Figure 4-15 Example of mapping an associative entity (cont.)
b) Three resulting relations
[Default primary key for the association relation is the primary keys of the two entities]
Composite primary key (cpk) formed from the two foreign keys (fk)
[Figure labels: PK, cpk, fk]

48 [Default primary key for the association relation is assigned]
Figure 4-16: Mapping an associative entity (a) Associative entity (SHIPMENT) (b) Three resulting relations Primary key differs from foreign keys

49 Transforming EE-R Diagrams Into Relations
5. Map Unary Relationships
  One-to-Many: Recursive foreign key in the same relation (Fig. 4-17). A recursive FK is a FK in a relation that references the PK values of that same relation. It must have the same domain as the PK.
  Many-to-Many (bill-of-materials): Two relations (Fig. 4-18):
    One for the entity type.
    One for an associative relation in which the primary key has two attributes, both taken from the primary key of the entity.

50 Figure 4-17 Mapping a unary 1:N relationship
A recursive FK is a FK in a relation that references the PK values of that same relation. It must have the same domain as the PK.
(a) EMPLOYEE entity with unary relationship
(b) EMPLOYEE relation with recursive foreign key
EMPLOYEE(EmployeeID, EmployeeName, EmployeeDateOfBirth, ManagerID)
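A recursive foreign key can be sketched as follows, assuming Python's sqlite3 module in place of Oracle. The column names follow the EMPLOYEE relation above in lower case; the three sample employees are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.execute("""
    CREATE TABLE employee (
        employee_id            INTEGER PRIMARY KEY,
        employee_name          TEXT,
        employee_date_of_birth TEXT,
        -- recursive FK: same domain as the PK, null for an employee with no manager
        manager_id             INTEGER REFERENCES employee(employee_id)
    )
""")
conn.executemany(
    "INSERT INTO employee (employee_id, employee_name, manager_id) VALUES (?, ?, ?)",
    [(1, "Manager A", None), (2, "Worker B", 1), (3, "Worker C", 1)],
)

def staff_with_managers(conn):
    """Self-join: pair each employee's name with the name of that employee's manager."""
    return conn.execute("""
        SELECT e.employee_name, m.employee_name
        FROM employee AS e
        LEFT JOIN employee AS m ON e.manager_id = m.employee_id
        ORDER BY e.employee_id
    """).fetchall()

print(staff_with_managers(conn))
```

Because the table refers to itself, reading the relationship back requires joining the table to itself under two aliases.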

51 Figure 4-18: Mapping a unary M:N relationship
One relation for the entity type. One for an associative relation in which the primary key has two attributes, both taken from the primary key of the entity.
(a) Bill-of-materials relationships (M:N)
(b) ITEM and COMPONENT relations
ITEM(ItemNo, ItemDescription, ItemUnitCost)
COMPONENT(ItemNo, ComponentNo, Quantity)
[Figure labels: PK, cpk, fk]

52 Transforming EE-R Diagrams Into Relations
6. Map Ternary (and n-ary) Relationships One relation for each entity and one for the associative entity. Associative entity has foreign keys to each entity in the relationship (Fig. 4-19).

53 Figure 4-19 Mapping a ternary relationship
a) PATIENT TREATMENT Ternary relationship with associative entity

54 Figure 4-19 Mapping a ternary relationship (cont.)
b) Mapping the ternary relationship PATIENT TREATMENT
A patient may receive a treatment once in the morning, then the same treatment in the afternoon.

55 Figure 4-19 Mapping a ternary relationship (cont.)
b) Mapping the ternary relationship PATIENT TREATMENT
Remember that the primary key MUST be unique. This is why treatment date and time are included in the composite primary key.
But this makes a very cumbersome key…
It would be better to create a surrogate key like Treatment#. How to create it with Oracle?
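One way to picture the surrogate key is the sketch below. It uses SQLite's AUTOINCREMENT through Python's sqlite3 module; in Oracle the same idea is commonly expressed with a sequence (CREATE SEQUENCE and NEXTVAL), which is not shown here. All column names are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE patient_treatment (
        treatment_no   INTEGER PRIMARY KEY AUTOINCREMENT,  -- surrogate key
        patient_id     INTEGER NOT NULL,
        physician_id   INTEGER NOT NULL,
        treatment_code TEXT NOT NULL,
        treatment_date TEXT NOT NULL,
        treatment_time TEXT NOT NULL,
        -- the cumbersome natural key is kept, but only as a uniqueness rule
        UNIQUE (patient_id, physician_id, treatment_code, treatment_date, treatment_time)
    )
""")

def record_treatment(conn, patient, physician, code, date, time):
    """Insert one treatment and return the surrogate key the database generated."""
    cur = conn.execute(
        "INSERT INTO patient_treatment "
        "(patient_id, physician_id, treatment_code, treatment_date, treatment_time) "
        "VALUES (?, ?, ?, ?, ?)",
        (patient, physician, code, date, time),
    )
    return cur.lastrowid

morning = record_treatment(conn, 1, 7, "T1", "2024-01-01", "09:00")
afternoon = record_treatment(conn, 1, 7, "T1", "2024-01-01", "15:00")
```

Other tables can then reference the single treatment_no column instead of carrying all five natural-key columns as a foreign key.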

56 Transforming EER Diagrams into Relations
7. Mapping Supertype/Subtype Relationships
  One relation for supertype and for each subtype
  Supertype attributes (including identifier and subtype discriminator) go into supertype relation
  Subtype attributes go into each subtype; primary key of supertype relation also becomes primary key of subtype relation
  1:1 relationship established between supertype and each subtype, with supertype as primary table (Fig. 4-20)

57 Figure 4-20 Supertype/subtype relationships

58 Mapping Supertype/subtype relationships to relations
Figure 4-21 Mapping Supertype/subtype relationships to relations
These are implemented as one-to-one relationships
Display a table that contains all the attributes for SALARIED_EMPLOYEE:
SELECT * FROM EMPLOYEE, SALARIED_EMPLOYEE WHERE Employee_Number = S_Employee_Number;
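The supertype/subtype join can be sketched end to end as follows, using Python's sqlite3 module. Only Employee_Number and S_Employee_Number come from the slide; the other columns, the discriminator codes, and the sample rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE employee (
    employee_number INTEGER PRIMARY KEY,
    employee_name   TEXT,
    employee_type   TEXT               -- subtype discriminator (hypothetical codes)
);
CREATE TABLE salaried_employee (
    s_employee_number INTEGER PRIMARY KEY
        REFERENCES employee(employee_number),  -- same key as the supertype (1:1)
    annual_salary     INTEGER
);
INSERT INTO employee VALUES (1, 'Employee One', 'S'), (2, 'Employee Two', 'H');
INSERT INTO salaried_employee VALUES (1, 60000);
""")

def salaried_rows(conn):
    """Join supertype and subtype to see all attributes of salaried employees."""
    return conn.execute("""
        SELECT e.employee_number, e.employee_name, s.annual_salary
        FROM employee AS e
        JOIN salaried_employee AS s
          ON e.employee_number = s.s_employee_number
    """).fetchall()

print(salaried_rows(conn))
```

The explicit JOIN ... ON form is equivalent to the comma-and-WHERE form on the slide; only employees that have a matching subtype row appear in the result.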

59

60 Break ! (Ch. 4 - Part II)
In class exercise: Transform it to relations (NOT 3NF)
#2-III-a (p.193), apply Figure 3-6.b (read step 7, on p.176)
HW #2-III-d (p.193), apply Figure 3-10
[Figure 3-6(b): Partial Specialization]

61 Break ! (Ch. 4 - Part II) In class exercise (another set) (p.193)
Transform it to relations (NOT 3NF) #2-III (b) (Figure 3-7a) (see next slide) HW #2-III (c) (Figure 3-9)

62 Fig. 3-7: Examples of disjointness constraints
(a) Disjoint rule

63 Fig. 3-9: Subtype discriminator (overlap rule)

64 Problem Solving for Modeling a Database Project
[Flow diagram. Labels: Business Problem; Study and Analyze w/Team; User interview & Integrated Model; ER or EER or OO; Transformation to Relations; Relations Transformation (Seven Cases); Normalization; Normalization (3NF); IMPLEMENTATION]

65 Seven Cases of Transforming EE-R Diagrams into Relations
1. Map Regular Entities 2. Map Weak Entities 3. Map Binary Relationships 4. Map Associative Entities 5. Map Unary Relationships 6. Map Ternary (and n-ary) Relationships 7. Map Supertype/Subtype Relationships

66 Next Topic
Next topic is the most important topic (theory) in this database management class. What is it? Normalization

67 Data Normalization
The process of decomposing relations with anomalies to produce smaller, well-structured and stable relations
Primarily a tool to validate and improve a logical design so that it satisfies certain constraints that avoid unnecessary duplication of data

68 Well-Structured Relations
A relation that contains minimal data redundancy and allows users to insert, delete, and update rows without causing data inconsistencies
Goal is to avoid (minimize) anomalies:
  Insertion Anomaly: adding new rows forces user to create duplicate data
  Deletion Anomaly: deleting rows may cause a loss of data that would be needed for other future rows
  Modification Anomaly: changing data in a row forces changes to other rows because of duplication
General rule of thumb: a table should not pertain to more than one entity type

69 Example: Figure 4.2b
Question: Is this a relation?
Answer: Yes, unique rows and no multivalued attributes
Question: What's the primary key?
Answer: EmpID and CourseTitle (a composite key)

70 Anomalies in this Table
Insertion: can't enter a new employee without having the employee take a class
Deletion: if we remove employee 140, we lose information about the existence of a Tax Acc class
Modification: giving a salary increase to employee 100 forces us to update multiple records
Why do these anomalies exist? Because there are two themes (entity types; what are they?) combined in this one relation. This results in duplication and an unnecessary dependency between the entities.

71 Functional Dependencies and Keys
Functional Dependency: The value of one attribute (the determinant) determines the value of another attribute.
Candidate Key:
  A unique identifier. One of the candidate keys will become the primary key.
  E.g., perhaps there is both credit card number and SS# in a table; in this case both are candidate keys.
  Each non-key field is functionally dependent on every candidate key.
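A functional dependency can be checked against sample rows with a few lines of Python. This is only a sketch: the helper name fd_holds is made up, the rows are the EMPLOYEE2 sample data without the dates, and sample data can refute a dependency but never prove that it holds for all future rows.

```python
def fd_holds(rows, determinant, dependent):
    """Return False if two rows agree on the determinant but differ on the dependent attributes."""
    seen = {}
    for row in rows:
        key = tuple(row[a] for a in determinant)
        value = tuple(row[a] for a in dependent)
        # setdefault stores the first value seen for this key and returns it afterwards
        if seen.setdefault(key, value) != value:
            return False
    return True

COLUMNS = ("EmpID", "Name", "DeptName", "Salary", "CourseTitle")
EMPLOYEE2 = [dict(zip(COLUMNS, r)) for r in [
    (100, "Margaret Simpson", "Marketing", 48000, "SPSS"),
    (100, "Margaret Simpson", "Marketing", 48000, "Surveys"),
    (140, "Allen Beeton", "Accounting", 52000, "Tax Acc"),
    (110, "Chris Lucero", "Info. System", 43000, "SPSS"),
    (110, "Chris Lucero", "Info. System", 43000, "C"),
    (150, "Susan Martin", "Marketing", 42000, "SPSS"),
    (150, "Susan Martin", "Marketing", 42000, "Java"),
]]

print(fd_holds(EMPLOYEE2, ["EmpID"], ["Name", "DeptName", "Salary"]))  # EmpID determines employee data
print(fd_holds(EMPLOYEE2, ["EmpID"], ["CourseTitle"]))                 # EmpID alone does not determine the course
```

The first check passes and the second fails, which matches the observation that EmpID alone identifies an employee but not a row of EMPLOYEE2.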

72 Figure 4-23: Representing Functional Dependencies (cont.)
(a) Functional dependencies in EMPLOYEE1 (Fig. 4-1)
(b) Functional dependencies in EMPLOYEE2 (Fig. 4-2b)

73 First Normal Form
No multivalued attributes
Every attribute value is atomic (single-valued)
Fig. 4-2a is not in 1st Normal Form (multivalued attributes) → it is not a relation
Fig. 4-2b is in 1st Normal Form (but is not a well-structured relation)
All relations are in 1st Normal Form
The following example, which is not from the text, illustrates the normalization process.

74 Figure 4-2 (a) and Figure 4-2 (b)
Figure 4-2 (a): [Un-normalized EMPLOYEE table, as on slide 11]
Figure 4-2 (b): [EMPLOYEE2 table, as on slide 13]

75 Second Normal Form
1NF and every non-key attribute is fully functionally dependent on the primary key.
Every non-key attribute must be defined by the entire key (either a single PK or a CPK), not by only part of the key.
No partial functional dependencies.
Fig. 4-2b is NOT in 2nd Normal Form

76 Figure 4-22: Steps in normalization
Table with multivalued attributes
  Remove multivalued attributes
First normal form (1NF)
  Remove _____ dependencies
Second normal form (2NF)
  Remove …
Third normal form (3NF)
  Remove remaining anomalies resulting from multiple candidate keys
Boyce-Codd normal form (BCNF)
  Remove multivalued dependencies
Fourth normal form (4NF)
  Remove remaining anomalies
Fifth normal form (5NF)

77 A Process of 1NF to 2NF
(b) Functional Dependencies in EMPLOYEE2
EMPLOYEE2(EmpID, CourseTitle, Name, DeptName, Salary, DateCompleted)
Dependency on entire primary key: DateCompleted
Dependency on only part of the key: Name, DeptName, Salary
[EMPLOYEE2 table, as on slide 13]

78 Functional Dependencies in EMPLOYEE2
EMPLOYEE2(EmpID, CourseTitle, DateCompleted, Salary, DeptName, Name)
Dependency on entire primary key: EmpID, CourseTitle → DateCompleted
Dependency on only part of the key (partial dep.): EmpID → Name, DeptName, Salary
Therefore, NOT in 2nd Normal Form!!

79 Summary on Normalization: from 1NF to 2NF
EMPLOYEE2(EmpID, CourseTitle, Name, DeptName, Salary, DateCompleted): partial dependency
Decomposed into:
  EMPLOYEE1(EmpID, Name, DeptName, Salary): 2NF; 3NF?
  EMP_COURSE(EmpID, CourseTitle, DateCompleted)
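The decomposition can be checked mechanically. The sketch below, which assumes Python's sqlite3 module and lower-case column names, loads the EMPLOYEE2 sample rows (dates omitted), splits them into EMPLOYEE1 and EMP_COURSE, and compares how many rows a salary change touches before and after.

```python
import sqlite3

ROWS = [
    (100, "Margaret Simpson", "Marketing", 48000, "SPSS"),
    (100, "Margaret Simpson", "Marketing", 48000, "Surveys"),
    (140, "Allen Beeton", "Accounting", 52000, "Tax Acc"),
    (110, "Chris Lucero", "Info. System", 43000, "SPSS"),
    (110, "Chris Lucero", "Info. System", 43000, "C"),
    (150, "Susan Martin", "Marketing", 42000, "SPSS"),
    (150, "Susan Martin", "Marketing", 42000, "Java"),
]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE employee2 (emp_id INTEGER, name TEXT, dept_name TEXT, "
    "salary INTEGER, course_title TEXT, PRIMARY KEY (emp_id, course_title))"
)
conn.executemany("INSERT INTO employee2 VALUES (?, ?, ?, ?, ?)", ROWS)

# Split along the partial dependency EmpID -> Name, DeptName, Salary.
conn.executescript("""
CREATE TABLE employee1  AS SELECT DISTINCT emp_id, name, dept_name, salary FROM employee2;
CREATE TABLE emp_course AS SELECT emp_id, course_title FROM employee2;
""")

def rejoined(conn):
    """Join the two smaller relations back together."""
    return conn.execute("""
        SELECT e.emp_id, e.name, e.dept_name, e.salary, c.course_title
        FROM employee1 AS e JOIN emp_course AS c ON e.emp_id = c.emp_id
    """).fetchall()

# The join reproduces the original rows, so no information was lost.
lossless = sorted(rejoined(conn)) == sorted(ROWS)

# A raise for employee 100 touches two rows before the split, one row after it.
rows_before = conn.execute("UPDATE employee2 SET salary = 50000 WHERE emp_id = 100").rowcount
rows_after = conn.execute("UPDATE employee1 SET salary = 50000 WHERE emp_id = 100").rowcount

# An employee with no course (Lorenzo Davis, 190) can now be stored on his own.
conn.execute("INSERT INTO employee1 VALUES (190, 'Lorenzo Davis', 'Finance', 55000)")
```

The same sketch therefore touches the modification anomaly (fewer rows to update) and the insertion anomaly (an employee without a course).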

80 Summary on Normalization
EMPLOYEE2 (1NF): [EMPLOYEE2 table, as on slide 13]

EMPLOYEE1 (3NF)
EmpID  Name              DeptName      Salary
100    Margaret Simpson  Marketing     48,000
140    Allen Beeton      Accounting    52,000
110    Chris Lucero      Info. System  43,000
190    Lorenzo Davis     Finance       55,000
150    Susan Martin      Marketing     42,000

EMP_COURSE
EmpID  CourseTitle  DateCompleted
100    SPSS         /19/200X
100    Surveys      /7/200X
140    Tax Acc      12/8/200X
110    SPSS         /12/200X
110    C            /22/200X
150    SPSS         /19/200X
150    Java         /12/200X

Is there any anomaly (or anomalies)?

81 Third Normal Form
2NF and no transitive dependencies (functional dependency between non-key attributes)
Other examples

82 Figure 4-22: Steps in normalization
Table with multivalued attributes
  Remove multivalued attributes
First normal form (1NF)
  Remove partial dependencies
Second normal form (2NF)
  Remove _________ dependencies
Third normal form (3NF)
  Remove remaining anomalies resulting from multiple candidate keys
Boyce-Codd normal form (BCNF)
  Remove multivalued dependencies
Fourth normal form (4NF)
  Remove remaining anomalies
Fifth normal form (5NF)

83 Extra example-1: Relation with transitive dependency
(a) SALES relation with simple data
SALES
Cust_ID  Name       Salesperson  Region
8023     Anderson   101          South
9167     Bancroft   102          West
7924     Hobbs
6837     Tucker     103          East
8596     Eckersley
7018     Arnold     104          North

84 What Anomalies might be in SALES relation?
WHY? Because it is not in 3NF (why? transitive dependency)
[SALES relation, as on slide 83]
1. Insertion anomaly: a new salesperson (Robinson) assigned to the North region cannot be entered until a customer has been assigned to that salesperson (since a value for Cust_ID must be provided to insert a row in the table).
2. Deletion anomaly: if customer number 6837 is deleted from the table, we lose the information that salesperson Hernandez is assigned to the East region.
3. Modification anomaly: if salesperson Smith is reassigned to the East region, several rows must be changed to reflect that fact.

85 Extra example-1: Relation with transitive dependency
CustID → Name
CustID → Salesperson
CustID → Region
All this is OK (2nd NF)

86 Extra example-1: Relation with transitive dependency
CustID → Name
CustID → Salesperson
CustID → Region
All this is OK (2nd NF)
BUT CustID → Salesperson and Salesperson → Region, so CustID → Salesperson → Region implies CustID → Region
Transitive dependency (not in 3rd NF)

87 Remove a transitive dependency
Extra example-1: (b) Relations in 3NF Remove a transitive dependency

88 Extra example-1: Removing a transitive dependency
(a) Decomposing the SALES relation

SALES1
Cust_ID  Name       Salesperson
8023     Anderson   101
9167     Bancroft   102
7924     Hobbs
6837     Tucker     103
8596     Eckersley
7018     Arnold     104

S_PERSON
Salesperson  Region
101          South
102          West
103          East
104          North
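The effect of removing the transitive dependency can be sketched with Python's sqlite3 module. Only the four customer rows whose salesperson is legible above are loaded, and salesperson 105 is a hypothetical new hire used to show the insertion case.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE s_person (
    salesperson INTEGER PRIMARY KEY,
    region      TEXT NOT NULL
);
CREATE TABLE sales1 (
    cust_id     INTEGER PRIMARY KEY,
    name        TEXT,
    salesperson INTEGER REFERENCES s_person(salesperson)
);
INSERT INTO s_person VALUES (101, 'South'), (102, 'West'), (103, 'East'), (104, 'North');
INSERT INTO sales1 VALUES (8023, 'Anderson', 101), (9167, 'Bancroft', 102),
                          (6837, 'Tucker', 103), (7018, 'Arnold', 104);
""")

def region_of(conn, salesperson):
    """Look up a salesperson's region; returns None if the salesperson is unknown."""
    row = conn.execute(
        "SELECT region FROM s_person WHERE salesperson = ?", (salesperson,)
    ).fetchone()
    return row[0] if row else None

def customer_region(conn, cust_id):
    """Recover the original SALES view of a customer by joining the two relations."""
    row = conn.execute("""
        SELECT p.region FROM sales1 AS s
        JOIN s_person AS p ON s.salesperson = p.salesperson
        WHERE s.cust_id = ?
    """, (cust_id,)).fetchone()
    return row[0] if row else None

region_before = customer_region(conn, 6837)

# Deletion: removing customer 6837 no longer erases the fact that 103 covers East.
conn.execute("DELETE FROM sales1 WHERE cust_id = 6837")

# Insertion: a new salesperson can be recorded before any customer is assigned.
conn.execute("INSERT INTO s_person VALUES (105, 'North')")
```

Region now lives in exactly one row per salesperson, so reassigning a salesperson is a single-row update in S_PERSON.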

89 Extra example-2: Relation transitive dependencies
SHIPMENT (?NF)
Snum  Origin       Destination  Distance
      Seattle      Denver       ,537
      Chicago      Dallas       ,058
      Boston       Atlanta      ,214
      Denver       Los Angeles  1,150
      Minneapolis  St. Louis
Insertion anomaly? Deletion anomaly? Modification anomaly?

90 Extra example-2: Relation transitive dependencies
SHIPMENT(Snum, Origin, Destination, Distance): ?NF, transitive dependency
Decomposed into two relations:
  (Snum, Origin, Destination), with the same five origin/destination rows as on slide 89
  (Origin, Destination, Distance), with the same five distances as on slide 89

91 Summary on Normalization: from 1NF to 2NF
EMPLOYEE2(EmpID, CourseTitle, Name, DeptName, Salary, DateCompleted): partial dependency
Decomposed into:
  EMPLOYEE1(EmpID, Name, DeptName, Salary): 2NF; 3NF?
  EMP_COURSE(EmpID, CourseTitle, DateCompleted)

92 Figure: 4-22 Steps in normalization
Table with multivalued attributes
  Remove ________ attributes
First normal form (1NF)
  Remove _______ dependencies
Second normal form (2NF)
  Remove ______ dependencies
Third normal form (3NF)
  Remove remaining anomalies resulting from multiple candidate keys
Boyce-Codd normal form (BCNF)
  Remove multivalued dependencies
Fourth normal form (4NF)
  Remove remaining anomalies
Fifth normal form (5NF)

93 Steps of Database Development
User view-1, User view-2, User view-3, User view-N
User interview & Integrated Model
Conceptual Schema (Model)
Logical Model (ERD or E/ERD)
(Seven) Relations ________ (more relations produced)
_____________ (up to 3NF) (more tables created)
Implementation (w/Physical Model)

94 END of CHAPTER 4
In class exercise (p.193): #3 - a, b, c, d
HW (using Visio) (p ): #7 - a, b, c, d, e, f
Bonus: #8 - a, b, c, d
In-class Quiz next class

95 MVC_Hospital HW Logical Design Phase
Draw an entity-relationship diagram (enterprise model) for Mountain View Community Hospital, based on the narrative description of the case and this handout (but the entities are from the five (5) figures shown above). You should create a file (called MVC_Hospital_DD.docx) and turn it in with a hardcopy. It contains the following materials:
1. Read and employ materials from chapters 2, 3 and 4.
2. Include entities, associations (with detailed cardinality), and attributes.
3. Determine and draw the order of entering data.
Next phase (implementation): create a SQL script file for the table structure and database (values).

96 Hint: You need to create VIEW (one or more) to help you create SQL efficiently and effectively
See sample on the Bb: version 1 for charge_view that includes Patient Name
CREATE OR REPLACE VIEW charge_view (Patient_No, Patient_Name, Item_Code, Charge) AS
SELECT patient.patient_no,
       patient.p_first || ' ' || patient.p_last,
       pt_charg.item_code,
       (charge * num_times_admitted)
FROM patient, pt_charg, item
WHERE item.item_code = pt_charg.item_code
  AND patient.patient_no = pt_charg.patient_no
ORDER BY patient.p_last;

97 MVC_Hospital Create two script files:
1. A script file (MVC_Hospital_Lastname_Firstname.SQL) containing a set of DROP, CREATE, and INSERT commands that performs the same functions as the script file Northwoods.sql.
2. A second script file (MVC_Hospital_QUERIES_Lastname_Firstname.SQL) containing a set of SQL commands that answer the questions. Test each query successfully, one at a time. Note that you may need other SQL commands, and you may need to create database views (see the pptx file introducing VIEWS) to answer the questions easily. You may also need to read other SQL references from the textbook (e.g., Chapter 7 of McFadden).
3. Spool (2) and save the output in the file MVC_Hospital_Spool_Lastname_Firstname.txt.
Finally, create a new file (*.docx) containing all work done from Part I and save it as MVC_Hospital_Complete_Lastname_Firstname.docx. The file should contain your class information and personal information.
4. UPLOAD the .docx file to Bb by the deadline.

98 Normalized vs. De-normalized
We will study the concept and technique of “normalization and de-normalization” as well as OLTP and OLAP.

99 More on OLTP vs. OLAP
The figure depicts a relational database environment with two tables. The first table contains information about pet owners; the second, information about pets. The tables are related by the single column they have in common: Owner_ID. By relating tables to one another, we can reduce ____________ of data and improve database performance. The process of breaking tables apart and thereby reducing data redundancy is called _______________. (pk: primary key; fk: foreign key) Fig. Extra-a: A simple database with a relation between two tables. (For those who have a database background.)
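The two-table design in Fig. Extra-a can be tried out with Python's built-in sqlite3 module. This is a minimal sketch, not the figure's actual data: only the shared Owner_ID column comes from the slide, while the other column names and the sample rows are assumptions made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces fks only when asked
conn.executescript("""
CREATE TABLE owner (
    owner_id   INTEGER PRIMARY KEY,           -- pk
    owner_name TEXT NOT NULL
);
CREATE TABLE pet (
    pet_id   INTEGER PRIMARY KEY,             -- pk
    pet_name TEXT NOT NULL,
    owner_id INTEGER NOT NULL
             REFERENCES owner(owner_id)       -- fk to the parent table
);
INSERT INTO owner VALUES (1, 'Ana'), (2, 'Ben');
INSERT INTO pet   VALUES (10, 'Rex', 1), (11, 'Tom', 1), (12, 'Kiwi', 2);
""")

# Each owner's name is stored once; pets refer to it through owner_id.
pets_with_owners = conn.execute("""
    SELECT pet.pet_name, owner.owner_name
    FROM pet JOIN owner ON owner.owner_id = pet.owner_id
    ORDER BY pet.pet_id
""").fetchall()
```

Because the owner's name lives in one row only, changing it means updating a single record, no matter how many pets that owner has.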

100 OLTP vs. OLAP (cont.)
Most relational databases which are designed to handle a high number of reads and writes (retrievals and updates of information) are referred to as ________ (OnLine Transaction Processing) systems. OLTP systems are very efficient for high-volume activities such as cashiering, where many items are being recorded via bar code scanners in a very short period of time. However, using OLTP databases for analysis is generally not very efficient, because in order to retrieve data from multiple tables at the same time, a query containing ________ must be used.

101 Fig. Extra-b: A combination of the tables into a single dataset.
OLTP vs. OLAP (cont.) In order to keep our transactional databases running quickly and smoothly, we may wish to create a data warehouse. A data warehouse is a type of large database (including both current and historical data) that has been _____________ and archived. Denormalization is the process of intentionally combining some tables into a single table in spite of the fact that this may introduce duplicate data in some columns. Fig. Extra-b: A combination of the tables into a single dataset. The figure depicts what our simple example data might look like if it were in a data warehouse. When we design databases in this way, we reduce the number of joins necessary to query related data, thereby speeding up the process of analyzing our data. Databases designed in this manner are called __________ (OnLine Analytical Processing) systems.
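The combined dataset of Fig. Extra-b can be sketched the same way. The example below assumes a hypothetical owner/pet schema and sample rows (only the Owner_ID link comes from the slides) and builds one flat table from a join, using sqlite3 from the Python standard library.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE owner (owner_id INTEGER PRIMARY KEY, owner_name TEXT);
CREATE TABLE pet (
    pet_id   INTEGER PRIMARY KEY,
    pet_name TEXT,
    owner_id INTEGER REFERENCES owner(owner_id)
);
INSERT INTO owner VALUES (1, 'Ana'), (2, 'Ben');
INSERT INTO pet   VALUES (10, 'Rex', 1), (11, 'Tom', 1), (12, 'Kiwi', 2);

-- Denormalize: copy the owner columns onto every pet row.
CREATE TABLE pet_flat AS
    SELECT pet.pet_id, pet.pet_name, owner.owner_id, owner.owner_name
    FROM pet JOIN owner ON owner.owner_id = pet.owner_id;
""")

# Analysis queries now read a single table, with no join.
pets_per_owner = conn.execute("""
    SELECT owner_name, COUNT(*) FROM pet_flat
    GROUP BY owner_name ORDER BY owner_name
""").fetchall()

# The price: the owner's name is repeated once per pet.
copies_of_ana = conn.execute(
    "SELECT COUNT(*) FROM pet_flat WHERE owner_name = 'Ana'"
).fetchone()[0]
```

The duplicated owner_name values are the intentional redundancy that denormalization accepts in exchange for simpler, faster analytical queries.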

102 Merging Relations (View Integration)
In a project development process, there may be a number of separate E-R diagrams and user views created and some of them may be redundant. Therefore, some relations should be merged to remove the redundancy.

103 Merging Relations (View Integration - An example)
EMPLOYEE1(EmployeeID, Name, Address, Phone)
EMPLOYEE2(EmployeeID, Name, Address, Jobcode, No_Years)
Merged: EMPLOYEE(EmployeeID, Name, Address, Phone, Jobcode, No_Years)
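The merge in this example amounts to taking the union of the attribute lists while keeping each attribute once. A minimal sketch, assuming both relations describe the same entity and share the same key (the helper merge_relations is hypothetical, not a textbook procedure):

```python
def merge_relations(*attribute_lists):
    """Union of attribute lists, keeping first-seen order and no repeats."""
    merged = []
    for attributes in attribute_lists:
        for attribute in attributes:
            if attribute not in merged:
                merged.append(attribute)
    return merged


employee1 = ["EmployeeID", "Name", "Address", "Phone"]
employee2 = ["EmployeeID", "Name", "Address", "Jobcode", "No_Years"]

employee = merge_relations(employee1, employee2)
```

This only handles the mechanical part; a purely name-based merge like this one cannot detect attributes that are named differently but mean the same thing, or named the same but mean different things.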

104 Merging Relations (Problems on View Integration)
Issues to watch out for when merging entities from different ER models:
Synonyms: different names, same meaning.
Homonyms: same name, different meanings.
Transitive dependencies: even if relations are in 3NF prior to merging, they may not be after merging.
Supertype/subtype relationships: may be hidden prior to merging.

105 Problems on View Integration
Synonyms: different names, same meaning.
STUDENT1(StudentID, Name)
STUDENT2(MatriculationNo, Name, Address)
Merged: STUDENT(SSN, Name, Address)
Homonyms: same name, different meanings.
STUDENT1(StudentID, Name, Address)
STUDENT2(StudentID, Name, Phone_No, Address)
Merged: STUDENT(StudentID, Name, Phone_No, Campus_Address, Permanent_Address)
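For the synonym case, one way to picture the fix is to map each synonym to an agreed name before merging. The sketch below assumes the analysts have already decided that StudentID and MatriculationNo both mean SSN, as in the merged STUDENT relation; the SYNONYMS table and the helper canonical_attributes are hypothetical.

```python
# Agreed standard names, decided by people, not by the program
SYNONYMS = {"StudentID": "SSN", "MatriculationNo": "SSN"}


def canonical_attributes(attributes, synonyms):
    """Rename synonyms to their standard name and drop repeats."""
    result = []
    for attribute in attributes:
        name = synonyms.get(attribute, attribute)
        if name not in result:
            result.append(name)
    return result


student1 = ["StudentID", "Name"]
student2 = ["MatriculationNo", "Name", "Address"]

student = canonical_attributes(student1 + student2, SYNONYMS)
```

Homonyms need the opposite treatment: one name (Address) must be split into two distinct attributes, which requires asking the users what each view actually means.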

106 Problems on View Integration
Transitive dependencies
STUDENT1(StudentID, Major)
STUDENT2(StudentID, Advisor)
The result is ... STUDENT(StudentID, Major, Advisor) ??NF
After removing the transitive dependency:
STUDENT_Major(StudentID, Major)
Major_Advisor(Major, Advisor)
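The decomposition above can be checked on sample rows. The sketch assumes that each major has exactly one advisor (the Major -> Advisor dependency implied by the Major_Advisor relation); the student rows themselves are invented for illustration.

```python
# Merged relation STUDENT(StudentID, Major, Advisor), hypothetical rows
student = [
    (1, "MIS", "Smith"),
    (2, "MIS", "Smith"),   # advisor repeated for every MIS student
    (3, "ACCT", "Lee"),
]

# Project onto the two smaller relations (sets remove duplicate rows)
student_major = sorted({(sid, major) for sid, major, _ in student})
major_advisor = sorted({(major, advisor) for _, major, advisor in student})

# Joining them back on Major must reproduce the original rows
rejoined = sorted(
    (sid, major, advisor)
    for sid, major in student_major
    for m, advisor in major_advisor
    if m == major
)
lossless = rejoined == sorted(student)
```

After the split, the advisor for MIS is stored once in Major_Advisor instead of once per student, so changing an advisor is a single update.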

107 Problems on View Integration
Supertype/subtype
PATIENT1(PatientID, Name, Address)
PATIENT2(PatientID, Room_No)
Merged supertype: PATIENT(PatientID, Name, Address)
Two subtypes are hidden prior to merging:
RESIDENT_PATIENT(PatientID, Room_No)
OUTPATIENT(PatientID, DateTreated)

