Presentation is loading. Please wait.

Presentation is loading. Please wait.

Database Design Process

Similar presentations


Presentation on theme: "Database Design Process"— Presentation transcript:

1 Database Design: Logical Models: Normalization and The Relational Model

2 Database Design Process
Application 1 Application 2 Application 3 Application 4 External Model External Model External Model External Model Application 1 Conceptual requirements Application 2 Conceptual requirements Conceptual Model Logical Model Internal Model Application 3 Conceptual requirements Application 4 Conceptual requirements

3 Draw ER Diagram A Bus Company owns a number of busses. Each bus is allocated to a particular route, although some routes may have several busses. Each route passes through a number of towns. One or more drivers are allocated to each stage of a route, which corresponds to a journey through some or all of the towns on a route. Some of the towns have a garage where busses are kept and each of the busses are identified by the registration number and can carry different numbers of passengers, since the vehicles vary in size and can be single or double-decked. Each route is identified by a route number and information is available on the average number of passengers carried per day for each route. Drivers have an employee number, name, address, and sometimes a telephone number.

4 Draw ER Diagram A lecturer, identified by his or her number, name and room number, is responsible for organizing a number of course modules. Each module has a unique code and also a name and each module can involve a number of lecturers who deliver part of it. A module is composed of a series of lectures and because of economic constraints and common sense, sometimes lectures on a given topic can be part of more than one module. A lecture has a time, room and date and is delivered by a lecturer and a lecturer may deliver more than one lecture. Students, identified by number and name, can attend lectures and a student must be registered for a number of modules. We also store the date on which the student first registered for that module. Finally, a lecturer acts as a tutor for a number of students and each student has only one tutor.

5 Requirements Collection
Database Design Steps Proj.Scope Requirements Collection and Analysis Internal Schema DB Requirements Conceptual Design Physical Design Conceptual Schema (in high-level data model E.g. ER model) Conceptual Schema (in DBMS specific data model e.g. relational model) Logical Design DBMS-independent DBMS specific

6 I. Requirements Analysis
Purpose: identify/describe data required by users Input: Functional and Data requirements Output: User specifications II. Conceptual Design Purpose: synthesize diff. users' views to global database design Input: User requirements from (I) & functional requirements Output: high-level data model

7 III. Logical Design (implementation design)
Purpose: to map conceptual design into specific DBMS IV. Physical Design Purpose: Concerned with factors relating to performance

8 The Physical Design Stage of SDLC (Figures 2-4, 2-5 revisited)
Project Identification and Selection Project Initiation and Planning Analysis Physical Design Implementation Maintenance Logical Design Purpose – information requirements structure Deliverable – detailed design specifications Logical Design Database activity – logical database design Maintenance

9 NOTE: all relations are in 1st Normal form
Definition: A relation is a named, two-dimensional table of data Table consists of rows (records), and columns (attribute or field) Requirements for a table to qualify as a relation: It must have a unique name. Every attribute value must be atomic (not multivalued, not composite) Every row must be unique (can’t have two rows with exactly the same values for all their fields) Attributes (columns) in tables must have unique names The order of the columns must be irrelevant The order of the rows must be irrelevant NOTE: all relations are in 1st Normal form

10 Correspondence with E-R Model
Relations (tables) correspond with entity types and with many-to-many relationship types Rows correspond with entity instances and with many-to-many relationship instances Columns correspond with attributes NOTE: The word relation (in relational database) is NOT the same as the word relationship (in E-R model)

11 Key Fields Keys are special fields that serve two main purposes:
Primary keys are unique identifiers of the relation in question. Examples include employee numbers, social security numbers, etc. This is how we can guarantee that all rows are unique Foreign keys are identifiers that enable a dependent relation (on the many side of a relationship) to refer to its parent relation (on the one side of the relationship) Keys can be simple (a single field) or composite (more than one field) Keys usually are used as indexes to speed up the response to user queries

12 Foreign Key (implements 1:N relationship between customer and order)
Primary Key Foreign Key (implements 1:N relationship between customer and order) Combined, these are a composite primary key (uniquely identifies the order line)…individually they are foreign keys (implement M:N relationship between order and product)

13 Integrity Constraints
Domain Constraints Allowable values for an attribute. Entity Integrity No primary key attribute may be null. All primary key fields MUST have data

14 Integrity Constraints
Referential Integrity – rule that states that any foreign key value (on the relation of the many side) MUST match a primary key value in the relation of the one side. (Or the foreign key can be null) For example: Delete Rules Restrict – don’t allow delete of “parent” side if related rows exist in “dependent” side Cascade – automatically delete “dependent” side rows that correspond with the “parent” side row to be deleted

15 Referential integrity constraints (Pine Valley Furniture)
Referential integrity constraints are drawn via arrows from dependent to parent table

16 Transforming ER Diagrams into Relations
Mapping Regular Entities to Relations Simple attributes: E-R attributes map directly onto the relation Composite attributes: Use only their simple, component attributes Multivalued Attribute - Becomes a separate relation with a foreign key taken from the superior entity

17 Mapping a regular entity
(a) CUSTOMER entity type with simple attributes (b) CUSTOMER relation

18 Mapping a composite attribute
(a) CUSTOMER entity type with composite attribute (b) CUSTOMER relation with address detail

19 Mapping a multivalued attribute
Multivalued attribute becomes a separate relation with foreign key (b) 1–to–many relationship between original entity and new relation

20 Transforming ER Diagrams into Relations (cont.)
Mapping Weak Entities Becomes a separate relation with a foreign key taken from the superior entity Primary key composed of: Partial identifier of weak entity Primary key of identifying relation (strong entity)

21

22 NOTE: the domain constraint for the foreign key should NOT allow null value if DEPENDENT is a weak entity Foreign key Composite primary key

23 Transforming ER Diagrams into Relations (cont.)
Mapping Binary Relationships One-to-Many - Primary key on the one side becomes a foreign key on the many side Many-to-Many - Create a new relation with the primary keys of the two entities as its primary key One-to-One - Primary key on the mandatory side becomes a foreign key on the optional side

24 Example of mapping a 1:M relationship
Relationship between customers and orders Note the mandatory one

25 Mapping the relationship
Again, no null value in the foreign key…this is because of the mandatory minimum cardinality Foreign key

26 Example of mapping an M:N relationship
E-R diagram (M:N) The Supplies relationship will need to become a separate relation

27 New intersection relation
Three resulting relations Composite primary key New intersection relation Foreign key

28 Mapping a binary 1:1 relationship
In_charge relationship

29 Resulting relations

30 Transforming ER Diagrams into Relations (cont.)
Mapping Unary Relationships One-to-Many - Recursive foreign key in the same relation Many-to-Many - Two relations: One for the entity type One for an associative relation in which the primary key has two attributes, both taken from the primary key of the entity

31 Mapping a unary 1:N relationship
(a) EMPLOYEE entity with Manages relationship (b) EMPLOYEE relation with recursive foreign key

32 Mapping a unary M:N relationship
(a) Bill-of-materials relationships (M:N) (b) ITEM and COMPONENT relations

33 Transforming ER Diagrams into Relations (cont.)
Mapping Ternary (and n-ary) Relationships One relation for each entity and one for the associative entity Associative entity has foreign keys to each entity in the relationship

34 Mapping a ternary relationship
Ternary relationship with associative entity

35 Remember that the primary key MUST be unique
Mapping the ternary relationship Remember that the primary key MUST be unique

36 Transforming EER Diagrams into Relations (cont.)
Mapping Supertype/Subtype Relationships One relation for supertype and for each subtype Supertype attributes (including identifier and subtype discriminator) go into supertype relation Subtype attributes go into each subtype; primary key of supertype relation also becomes primary key of subtype relation 1:1 relationship established between supertype and each subtype, with supertype as primary table

37 Supertype/subtype relationships

38 These are implemented as one-to-one relationships
Mapping Supertype/subtype relationships to relations These are implemented as one-to-one relationships

39 Data Normalization Primarily a tool to validate and improve a logical design so that it satisfies certain constraints that avoid unnecessary duplication of data The process of decomposing relations with anomalies to produce smaller, well-structured relations

40 Normalization Guidelines
Normalization guidelines are a set of data design standards called the normal forms. Five such normal forms are widely accepted. Matching these standards is known as normalization. We progress in order from first normal form through to fifth normal form. Meeting the guidelines means splitting tables into two or more tables with fewer columns. We can use the Primary-key/Foreign-key relationship to join data. Normalization reduces data redundancy although some duplication is unavoidable.

41 Definition of a Relation
Relation Definition: My-Relation(ID, Status, City). Primary Key My-Relation ID Status City London Paris New York Los Angeles Tuples Rows A Cell ( row / column intersection ) Attributes (Columns)

42 Normalization

43 Normalization Normalization theory is based on the observation that relations with certain properties are more effective in inserting, updating and deleting data than other sets of relations containing the same data Normalization is a multi-step process beginning with an “unnormalized” relation

44 Well-Structured Relations
A relation that contains minimal data redundancy and allows users to insert, delete, and update rows without causing data inconsistencies Goal is to avoid anomalies Insertion Anomaly – adding new rows forces user to create duplicate data Deletion Anomaly – deleting rows may cause a loss of data that would be needed for other future rows Modification Anomaly – changing data in a row forces changes to other rows because of duplication

45 Example – Figure Question – Is this a relation?
Answer – Yes: unique rows and no multivalued attributes Question – What’s the primary key? Answer – Composite: Emp_ID, Course_Title

46 Anomalies in this Table
Insertion – can’t enter a new employee without having the employee take a class Deletion – if we remove employee 140, we lose information about the existence of a Tax Acc class Modification – giving a salary increase to employee 100 forces us to update multiple records Why do these anomalies exist? Because there are two themes (entity types) into one relation. This results in duplication, and an unnecessary dependency between the entities

47 Figure Steps in normalization

48 First Normal Form No multivalued attributes
Every attribute value is atomic Fig is not in 1st Normal Form (multivalued attributes)  it is not a relation Fig is in 1st Normal form All relations are in 1st Normal Form

49 Table with multivalued attributes, not in 1st normal form
Note: this is NOT a relation

50 Table with no multivalued attributes and unique rows, in 1st normal form
Note: this is relation, but not a well-structured one

51 Anomalies in this Table
Insertion – if new product is ordered for order 1007 of existing customer, customer data must be re-entered, causing duplication Deletion – if we delete the Dining Table from Order 1006, we lose information concerning this item's finish and price Update – changing the price of product ID 4 requires update in several records Why do these anomalies exist? Because there are multiple themes (entity types) into one relation. This results in duplication, and an unnecessary dependency between the entities

52 Second Normal Form 1NF PLUS every non-key attribute is fully functionally dependent on the ENTIRE primary key Every non-key attribute must be defined by the entire key, not by only part of the key No partial functional dependencies

53 Therefore, NOT in 2nd Normal Form
Order_ID  Order_Date, Customer_ID, Customer_Name, Customer_Address Customer_ID  Customer_Name, Customer_Address Product_ID  Product_Description, Product_Finish, Unit_Price Order_ID, Product_ID  Order_Quantity Therefore, NOT in 2nd Normal Form

54 Getting it into Second Normal Form
Partial Dependencies are removed, but there are still transitive dependencies

55 Third Normal Form 2NF PLUS no transitive dependencies (functional dependencies on non-primary-key attributes) Note: this is called transitive, because the primary key is a determinant for another attribute, which in turn is a determinant for a third Solution: non-key determinant with transitive dependencies go into a new table; non-key determinant becomes primary key in the new table and stays as foreign key in the old table

56 Getting it into Third Normal Form
Transitive dependencies are removed

57 EXAMPLE # 2

58 Unnormalized Relations
First step in normalization is to convert the data into a two-dimensional table In unnormalized relations data can repeat within a column

59 Unnormalized Relation

60 First Normal Form To move to First Normal Form a relation must contain only atomic values at each row and column. No repeating groups A column or set of columns is called a Candidate Key when its values can uniquely identify the row in the relation.

61 First Normal Form

62 1NF Storage Anomalies Insertion: A new patient has not yet undergone surgery -- hence no surgeon # -- Since surgeon # is part of the key we can’t insert. Insertion: If a surgeon is newly hired and hasn’t operated yet -- there will be no way to include that person in the database. Update: If a patient comes in for a new procedure, and has moved, we need to change multiple address entries. Deletion (type 1): Deleting a patient record may also delete all info about a surgeon. Deletion (type 2): When there are functional dependencies (like side effects and drug) changing one item eliminates other information.

63 Second Normal Form A relation is said to be in Second Normal Form when every nonkey attribute is fully functionally dependent on the primary key. That is, every nonkey attribute needs the full primary key for unique identification

64 Second Normal Form

65 Second Normal Form

66 Second Normal Form

67 1NF Storage Anomalies Removed
Insertion: Can now enter new patients without surgery. Insertion: Can now enter Surgeons who haven’t operated. Deletion (type 1): If Charles Brown dies the corresponding tuples from Patient and Surgery tables can be deleted without losing information on David Rosen. Update: If John White comes in for third time, and has moved, we only need to change the Patient table

68 2NF Storage Anomalies Insertion: Cannot enter the fact that a particular drug has a particular side effect unless it is given to a patient. Deletion: If John White receives some other drug because of the penicillin rash, and a new drug and side effect are entered, we lose the information that penicillin can cause a rash Update: If drug side effects change (a new formula) we have to update multiple occurrences of side effects.

69 Third Normal Form A relation is said to be in Third Normal Form if there is no transitive functional dependency between nonkey attributes When one nonkey attribute can be determined with one or more nonkey attributes there is said to be a transitive functional dependency. The side effect column in the Surgery table is determined by the drug administered Side effect is transitively functionally dependent on drug so Surgery is not 3NF

70 Third Normal Form

71 Third Normal Form

72 2NF Storage Anomalies Removed
Insertion: We can now enter the fact that a particular drug has a particular side effect in the Drug relation. Deletion: If John White recieves some other drug as a result of the rash from penicillin, but the information on penicillin and rash is maintained. Update: The side effects for each drug appear only once.

73 Boyce-Codd Normal Form
Most 3NF relations are also BCNF relations. A 3NF relation is NOT in BCNF if: Candidate keys in the relation are composite keys (they are not single attributes) There is more than one candidate key in the relation, and The keys are not disjoint, that is, some attributes in the keys are common

74 Most 3NF Relations are also BCNF – Is this one?

75 BCNF Relations

76 Case Study (Table of grade report)
Student Id Student name Campus Address Major Course Title Instructor Name Location grade 101 Hassan Railway campus CS Cs 01 Cs 02 Cs 03 DBMS/ JAVA/ C++ Zeshan Umer Ali Nawaz BWP Multan Lodhran A B C 205 Akbar Abbasia IT It 201 It 202 Networks DBMS Administration Munwar Ali S. Town/ BWP/ Model Town


Download ppt "Database Design Process"

Similar presentations


Ads by Google