The Relational Model and Normalization R. Nakatsu.

1 The Relational Model and Normalization R. Nakatsu

2 The Relational Model Data is represented in two-dimensional tables –Each of the tables is a matrix consisting of a series of row/column intersections –Tables are also called relations –Columns of the tables are attributes Information in more than one table can be easily extracted and combined E.F. Codd defined well-structured “normal forms” of relations

3 Functional Dependency Notation: X → Y. Each value of X determines one and only one value of Y. Examples: SID → Major, LastName, FirstName; ComputerSerialNumber → MemorySize; (SID, CourseNumber) → Grade
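The definition above can be tested against sample data. The following is a minimal sketch, assuming rows are stored as Python dictionaries; the `holds` helper and the sample rows are hypothetical, not part of the presentation. Note that a sample can only refute a dependency: passing the check does not prove the dependency holds for all possible data.

```python
def holds(rows, lhs, rhs):
    """Return True if no two rows agree on lhs but differ on rhs."""
    seen = {}
    for row in rows:
        x = tuple(row[a] for a in lhs)
        y = tuple(row[a] for a in rhs)
        # setdefault stores y the first time x is seen, then returns the stored value
        if seen.setdefault(x, y) != y:
            return False
    return True

# Hypothetical sample rows (illustration only)
students = [
    {"SID": 100, "Major": "Music", "CourseNumber": "BA200", "Grade": "A"},
    {"SID": 100, "Major": "Music", "CourseNumber": "BA300", "Grade": "B"},
    {"SID": 150, "Major": "Math", "CourseNumber": "BA200", "Grade": "B"},
]

sid_major = holds(students, ["SID"], ["Major"])
sid_grade = holds(students, ["SID"], ["Grade"])
pair_grade = holds(students, ["SID", "CourseNumber"], ["Grade"])
```

Here SID alone does not determine Grade because SID 100 appears with two different grades, while the pair (SID, CourseNumber) does.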

4 Functional Dependency What are the functional dependencies in the relation below?

5 A → B relationships
A → B and B → A: one-to-one
A → B but B not → A: many-to-one
A not → B and B not → A: many-to-many
Another way to write not → is →→. For example: A →→ B and B →→ A (A multi-determines B and B multi-determines A)
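The three cases can be told apart by checking the dependency in each direction. This is a minimal sketch over a list of (A, B) value pairs; the `determines` and `classify` helpers and the sample pairs are assumptions made for illustration. The mirror case (B → A only) is reported as one-to-many.

```python
def determines(pairs, reverse=False):
    """True if the first element of each pair determines the second (or vice versa)."""
    mapping = {}
    for a, b in pairs:
        key, value = (b, a) if reverse else (a, b)
        if mapping.setdefault(key, value) != value:
            return False
    return True

def classify(pairs):
    a_to_b = determines(pairs)
    b_to_a = determines(pairs, reverse=True)
    if a_to_b and b_to_a:
        return "one-to-one"
    if a_to_b:
        return "many-to-one"
    if b_to_a:
        return "one-to-many"
    return "many-to-many"

# Hypothetical sample pairs (illustration only)
sid_email = [(100, "a@x.edu"), (150, "b@x.edu")]
sid_major = [(100, "Music"), (150, "Music"), (175, "Math")]
sid_activity = [(100, "Skiing"), (100, "Golf"), (150, "Swimming"), (175, "Swimming")]

kinds = [classify(sid_email), classify(sid_major), classify(sid_activity)]
```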

6 Key A group of one or more attributes that uniquely identifies a row. A relation may have several such keys, called candidate keys; one of them is chosen as the primary key.

7 Composite Key is a key that contains two or more attributes
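As a rough illustration of keys and composite keys, the sketch below searches for minimal attribute sets whose values are unique across the SID/Activity/Fee sample rows used in this presentation's anomaly example. The `candidate_keys` helper is hypothetical, and uniqueness in a sample only suggests a key; it does not prove one.

```python
from itertools import combinations

def candidate_keys(rows, attributes):
    """Return the minimal attribute combinations whose values are unique in rows."""
    keys = []
    for size in range(1, len(attributes) + 1):
        for combo in combinations(attributes, size):
            # Skip supersets of a key already found: they are not minimal
            if any(set(k) <= set(combo) for k in keys):
                continue
            projected = [tuple(r[a] for a in combo) for r in rows]
            if len(set(projected)) == len(projected):
                keys.append(combo)
    return keys

data = [
    (100, "Skiing", 200), (100, "Golf", 65), (150, "Swimming", 50),
    (175, "Squash", 50), (175, "Swimming", 50), (200, "Swimming", 50),
    (200, "Golf", 65),
]
rows = [{"SID": s, "Activity": a, "Fee": f} for s, a, f in data]

keys = candidate_keys(rows, ["SID", "Activity", "Fee"])
```

For this sample the only minimal key found is the composite (SID, Activity).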

8 Normalization Normalization is a process that assigns attributes (fields) to tables such that data redundancies are eliminated or reduced, thereby reducing the likelihood of data anomalies. Stages of Normalization (Normal Forms): 1NF, 2NF, 3NF, BCNF, 4NF (Ensure that tables are at least 3NF; higher forms are far less likely to be encountered).

9 Normalization Process Objective: Ensure that each table conforms to the concept of well-formed relations. –Each table represents a single subject –No data item will be unnecessarily stored in more than one table –All nonkey attributes in a table are dependent on the primary key –Each table is void of insertion, update, and deletion anomalies

10 Anomaly An undesirable consequence of a data modification. Insertion anomaly: facts about two or more different themes must be entered in a single row. Deletion anomaly: facts about two or more themes are lost when a single row is deleted.

11 Example State the deletion and insertion anomalies.
SID  Activity  Fee
100  Skiing    200
100  Golf       65
150  Swimming   50
175  Squash     50
175  Swimming   50
200  Swimming   50
200  Golf       65
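One way to see both anomalies is to simulate them on these rows. The sketch below is an illustration under stated assumptions: the `known_fees` and `insert` helpers are hypothetical, and the new activity "Scuba" with its fee is invented for the example.

```python
rows = [
    (100, "Skiing", 200), (100, "Golf", 65), (150, "Swimming", 50),
    (175, "Squash", 50), (175, "Swimming", 50), (200, "Swimming", 50),
    (200, "Golf", 65),
]

def known_fees(table):
    """Map each activity that appears in the table to its fee."""
    return {activity: fee for _sid, activity, fee in table}

# Deletion anomaly: removing student 100's skiing enrollment
# also removes the only record of the skiing fee.
remaining = [r for r in rows if r != (100, "Skiing", 200)]
lost_activities = set(known_fees(rows)) - set(known_fees(remaining))

# Insertion anomaly: a fee for a new activity cannot be stored
# until some student enrolls, because SID is part of the key.
def insert(table, sid, activity, fee):
    if sid is None:
        raise ValueError("SID is part of the key and cannot be missing")
    return table + [(sid, activity, fee)]

try:
    insert(rows, None, "Scuba", 175)  # hypothetical new activity
    insertion_blocked = False
except ValueError:
    insertion_blocked = True
```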

12 First Normal Form (1NF) Any table of data that meets the definition of a relation: No multi-valued attributes allowed. No repeating groups. No two rows can be identical (need a primary key). Order of the rows is insignificant. All entries in a column are of the same kind. Each column must have a unique name.

13 Table Not in 1NF

15 Table Not in 1NF Order Table Attributes:
1. Order ID
2. Order Date
3. Shipping Date
4. Customer ID
5. Customer Name
6. Shipping Address
7. Book 1 Title
8. Book 1 Price
9. Book 1 Qty
10. Book 2 Title
11. Book 2 Price
12. Book 2 Qty
13. Book 3 Title
14. Book 3 Price
15. Book 3 Qty
16. Book 4 Title
17. Book 4 Price
18. Book 4 Qty
19. Book 5 Title
20. Book 5 Price
21. Book 5 Qty
(Title, Price, Qty) is a repeating group.
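A repeating group like this is usually removed by moving each (Title, Price, Qty) set into its own row of a separate order-line table. The following is a minimal sketch of that reshaping; the `to_order_lines` helper, the `Line` attribute, and the sample order are assumptions made for illustration.

```python
def to_order_lines(order, max_books=5):
    """Turn the Book 1..Book N columns of one wide order row into one row per book."""
    lines = []
    for n in range(1, max_books + 1):
        title = order.get(f"Book {n} Title")
        if title is None:
            continue  # unused slot in the repeating group
        lines.append({
            "Order ID": order["Order ID"],
            "Line": n,
            "Title": title,
            "Price": order[f"Book {n} Price"],
            "Qty": order[f"Book {n} Qty"],
        })
    return lines

# Hypothetical wide order row with two of the five book slots filled
wide_order = {
    "Order ID": 1, "Customer ID": 7,
    "Book 1 Title": "SQL Basics", "Book 1 Price": 30.0, "Book 1 Qty": 2,
    "Book 2 Title": "Data Modeling", "Book 2 Price": 45.0, "Book 2 Qty": 1,
}

order_lines = to_order_lines(wide_order)
```

With this layout an order can hold any number of books, and no columns sit empty when fewer than five are ordered.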

16 Second Normal Form (2NF) If it is in 1NF and every nonkey attribute depends on the whole key. No partial dependencies are allowed. Partial dependency: A functional dependency in which the determinant is only part of the primary key.

17 Not in 2NF. Why?
SID  Activity  Fee
100  Skiing    200
100  Golf       65
150  Swimming   50
175  Squash     50
175  Swimming   50
200  Swimming   50
200  Golf       65

18 Tables in 2NF
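The slide's figure is not included in this transcript. One plausible decomposition, sketched below as an assumption rather than the author's exact tables, splits the SID/Activity/Fee rows into an enrollment table keyed on (SID, Activity) and a fee table keyed on Activity, then checks that joining them reproduces the original rows.

```python
rows = [
    (100, "Skiing", 200), (100, "Golf", 65), (150, "Swimming", 50),
    (175, "Squash", 50), (175, "Swimming", 50), (200, "Swimming", 50),
    (200, "Golf", 65),
]

# Projection 1: which student takes which activity (key: SID, Activity)
student_activity = sorted({(sid, activity) for sid, activity, _fee in rows})

# Projection 2: what each activity costs (key: Activity)
activity_fee = dict(sorted({(activity, fee) for _sid, activity, fee in rows}))

# Joining on Activity rebuilds the original table, so no information was lost
rejoined = [(sid, activity, activity_fee[activity]) for sid, activity in student_activity]
lossless = rejoined == sorted(rows)
```

Each fee is now stored once, so changing the Swimming fee touches a single row.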

19 Third Normal Form (3NF) If it is in 2NF and has no transitive dependencies. Transitive Dependency: One nonkey attribute functionally depends on another nonkey attribute. What is the transitive dependency in this example?

20 Tables in 3NF

21 Boyce-Codd Normal Form (BCNF) If it is in 3NF and every determinant is a candidate key. © 2000 Prentice Hall

22 Database Systems, 9th Edition

23 Fourth Normal Form (4NF) If it is in BCNF and has no multi-valued dependencies. A multi-valued dependency occurs when one key determines multiple values of two other attributes, and those attributes are independent of one another. Given two independent attributes A and B: Key →→ A and Key →→ B

24 Not in 4NF. Why?
SID  Major       Activity
100  Music       Swimming
100  Accounting  Swimming
100  Music       Tennis
100  Accounting  Tennis
150  Math        Jogging

25 Tables in 4NF
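The slide's figure is not included in this transcript. A plausible 4NF decomposition of the SID/Major/Activity rows, sketched here as an assumption, stores the two independent facts in separate tables and confirms that joining them on SID yields the original rows.

```python
rows = [
    (100, "Music", "Swimming"), (100, "Accounting", "Swimming"),
    (100, "Music", "Tennis"), (100, "Accounting", "Tennis"),
    (150, "Math", "Jogging"),
]

# SID multi-determines Major and, independently, Activity
student_major = {(sid, major) for sid, major, _activity in rows}
student_activity = {(sid, activity) for sid, _major, activity in rows}

# Natural join on SID pairs every major of a student with every activity
rejoined = {
    (sid, major, activity)
    for sid, major in student_major
    for sid2, activity in student_activity
    if sid == sid2
}
lossless = rejoined == set(rows)
```

Student 100 needs four rows in the original table but only two rows in each of the decomposed tables.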

26 Summary of Normal Forms
1NF: Must meet the definition of a relation
2NF: No partial dependencies
3NF: No transitive dependencies
BCNF: Every determinant is a candidate key
4NF: No multi-valued dependencies
5NF and DKNF: Not covered (of theoretical interest only)
These normal forms are nested.

27 Dependency Diagram A dependency diagram depicts all dependencies found within a given table structure –Helps to get an overview of all relationships among the table’s attributes –Makes it less likely that an important dependency will be overlooked –The arrows on the top indicate that the relation is in 1NF; that is, the primary key determines all other attributes.

29 Solution

30 Example: Using ER Diagramming and Normalization Together Employee (Employee Number, Last Name, First Name, Job Class, Hourly Rate) In this example, Hourly Rate is dependent on Job Class. What is the problem with this table?
Employee Number  Last Name  First Name  Job Class   Hourly Rate
11               Smith      John        Mechanic    20
12               Jones      Susan       Technician  18
13               McKay      Bob         Mechanic    20
14               Owens      Paula       Clerk       15
15               Chang      Steve       Mechanic    20
16               Sarandon   Sarah       Mechanic    20

31 Solution: Create Two Tables Employee (Employee Number, Last Name, First Name, Job Class ID) Job Class ID is the link to the Job Class table.
Employee Number  Last Name  First Name  Job Class ID
11               Smith      John        2
12               Jones      Susan       3
13               McKay      Bob         2
14               Owens      Paula       1
15               Chang      Steve       2
16               Sarandon   Sarah       2

32 Job Class (Job Class ID, Job Class, Hourly Rate) There are no more dependencies among nonkey fields: Hourly Rate now depends only on the key of the Job Class table.
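The two-table design can be tried out with Python's built-in `sqlite3` module. This is a minimal sketch, assuming snake_case table and column names chosen for the example; the ID-to-class mapping (1 Clerk, 2 Mechanic, 3 Technician) is inferred from the two employee tables in this example, and the new rate of 22 is hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE job_class (
    job_class_id INTEGER PRIMARY KEY,
    job_class    TEXT NOT NULL,
    hourly_rate  REAL NOT NULL
);
CREATE TABLE employee (
    employee_number INTEGER PRIMARY KEY,
    last_name       TEXT,
    first_name      TEXT,
    job_class_id    INTEGER REFERENCES job_class(job_class_id)
);
""")
conn.executemany(
    "INSERT INTO job_class VALUES (?, ?, ?)",
    [(1, "Clerk", 15), (2, "Mechanic", 20), (3, "Technician", 18)],
)
conn.executemany(
    "INSERT INTO employee VALUES (?, ?, ?, ?)",
    [(11, "Smith", "John", 2), (12, "Jones", "Susan", 3), (13, "McKay", "Bob", 2),
     (14, "Owens", "Paula", 1), (15, "Chang", "Steve", 2), (16, "Sarandon", "Sarah", 2)],
)

# A rate change is now a single-row update instead of one update per mechanic
conn.execute("UPDATE job_class SET hourly_rate = 22 WHERE job_class = 'Mechanic'")

# The join recovers each employee's current rate
rates = conn.execute("""
    SELECT e.last_name, j.hourly_rate
    FROM employee AS e
    JOIN job_class AS j ON e.job_class_id = j.job_class_id
    ORDER BY e.employee_number
""").fetchall()
```

The join in the final query is the extra processing that normalization introduces in exchange for storing each rate once.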

33 De-Normalization Sometimes normalization is not worth it: when a table is split into two or more tables, the extra processing (i.e., joins) may cost more than the redundancy it removes. Controlled Redundancy: For performance reasons, it is sometimes appropriate to duplicate data intentionally.

