DAVID M. KROENKE’S DATABASE PROCESSING, 10th Edition © 2006 Pearson Prentice Hall 3-1 David M. Kroenke’s Chapter Three: The Relational Model and Normalization.


1 David M. Kroenke’s Database Processing: Fundamentals, Design, and Implementation, 10th Edition
Chapter Three: The Relational Model and Normalization
© 2006 Pearson Prentice Hall

2 Chapter Premise
We have received one or more tables of existing data
The data is to be stored in a NEW database
Should the table of data
–be stored as is, or
–restructured?

3 Example 1

4 Example 2

5 Data Redundancy
Data redundancy results in data inconsistency
–Different and conflicting versions of the same data appear in different places
–Errors are more likely to occur when the same data must be entered in several different places
Data anomalies develop when required changes in redundant data are not made successfully

6 Modification Anomalies
Update anomalies
–Occur when changes must be made to existing records
Insertion anomalies
–Occur when entering new records
Deletion anomalies
–Occur when deleting records

7 Modification Anomalies
The EQUIPMENT_REPAIR table before and after an incorrect update operation on AcquisitionCost for Type = Drill Press

8 The Relational Model
Introduced in 1970
Created by E.F. Codd
–IBM engineer
–The model used the mathematical system known as “relational algebra”
Today it is the standard for commercial DBMS products

9 Important Relational Model Terms
Entity
Relation (table)
Functional Dependency
Determinant
Candidate Key
Composite Key
Primary Key
Surrogate Key
Foreign Key
Referential integrity constraint
Normal Form
Multivalued Dependency

10 Entity
An entity is some identifiable thing that users want to track:
–Customer
–Computer
–Sale
–Student
–Invoice
–Department
–Course
–Policy

11 Relation
Data about entities is stored in relations
A relation is a two-dimensional table with these characteristics:
–Rows contain data about an entity
–No two rows may contain identical data
–Columns contain data about attributes of the entity
–All entries in a column are the same data type
–Each column has a unique name
–Cells of the table hold a single value

12 Employee Relation

13 Normalization
This is the process of organizing data into relations (tables) that are structurally sound from the perspective of the relational model
–SQL works well on data organized this way
–Data quality is higher; i.e., the data is more reliable

14 Objectives of Normalization
Develop a good description of the data, its relationships and constraints
Produce a stable set of relations that
–Is a faithful model of the enterprise
–Is highly flexible
–Reduces redundancy, which saves space and reduces data inconsistency
–Is free of all anomalies

15 Anomalies are very bad!

16 Anomalies
An anomaly is an inconsistent, incomplete, or contradictory state of the database
–Insertion anomaly – user cannot insert a new record when it should be possible to do so
–Deletion anomaly – when a record is deleted, other information that is tied to it is also deleted (not by design)
–Update anomaly – a record is updated, but other appearances of the same data are not
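A minimal Python sketch of an update anomaly, in the spirit of the EQUIPMENT_REPAIR example. Only the table name and the Type and AcquisitionCost columns come from the slides; the ItemNumber and RepairNumber columns, the rows, and the cost figures are hypothetical.

```python
# Hypothetical EQUIPMENT_REPAIR rows: Type and AcquisitionCost are repeated
# on every repair row, so the same fact is stored more than once.
equipment_repair = [
    {"ItemNumber": 100, "Type": "Drill Press", "AcquisitionCost": 3500, "RepairNumber": 2000},
    {"ItemNumber": 200, "Type": "Lathe", "AcquisitionCost": 4750, "RepairNumber": 2100},
    {"ItemNumber": 100, "Type": "Drill Press", "AcquisitionCost": 3500, "RepairNumber": 2200},
]


def conflicting_items(rows):
    """Return ItemNumbers that appear with more than one AcquisitionCost."""
    costs = {}
    for row in rows:
        costs.setdefault(row["ItemNumber"], set()).add(row["AcquisitionCost"])
    return sorted(item for item, seen in costs.items() if len(seen) > 1)


before = conflicting_items(equipment_repair)

# Incorrect update: the cost is changed on one row only.
equipment_repair[0]["AcquisitionCost"] = 5500
after = conflicting_items(equipment_repair)
```

Before the update no item has conflicting costs; afterwards item 100 appears with two different values, which is exactly the inconsistent state the slide describes.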

17 Data redundancy leads to anomalies
Find examples of insertion, deletion & update anomalies

18 Normalization
Normalization plays an important role in database design. Through it, we decompose relations (tables) in stages from lower to higher normal forms
–1NF, 2NF, 3NF, BCNF
–Other normal forms are 4NF, 5NF, DKNF
We use normalization and E-R modeling together for good database design
It all starts with identifying functional dependencies (FDs)

19 Functional Dependency
If A and B are (sets of) attributes of relation R, B is functionally dependent on A if…
–a particular value of A determines a unique value of B.
Emp_Name is functionally dependent on Emp_Num: given a particular value of Emp_Num, I can find the name of the employee (Emp_Name) with that Emp_Num
A→B says A determines B
–or B is dependent on A

20 Example of FDs
R = NewStudent(stuId, lastName, major, credits, status, ssn)
Some FDs in R:
stuId → lastName
stuId → (lastName, major, credits, status, ssn, stuId)
ssn → (stuId, lastName, major, credits, status, ssn)
credits → status
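One way to get a feel for these FDs is to test them against sample rows. The sketch below is illustrative only: the helper name fd_holds is hypothetical, and the rows are the five students used in the 1NF slides later in this chapter. An FD is a rule about every legal state of the relation, so sample data can refute a proposed FD but can never prove one.

```python
def fd_holds(rows, lhs, rhs):
    """True if no two rows agree on lhs but differ on rhs."""
    seen = {}
    for row in rows:
        key = tuple(row[a] for a in lhs)
        value = tuple(row[a] for a in rhs)
        if seen.setdefault(key, value) != value:
            return False
    return True


students = [
    {"stuId": "S1001", "lastName": "Smith", "credits": 90, "status": "Sr"},
    {"stuId": "S1003", "lastName": "Jones", "credits": 95, "status": "Sr"},
    {"stuId": "S1006", "lastName": "Lee", "credits": 15, "status": "Fr"},
    {"stuId": "S1010", "lastName": "Burns", "credits": 63, "status": "Jr"},
    {"stuId": "S1060", "lastName": "Jones", "credits": 25, "status": "Fr"},
]

# Consistent with the sample data.
stu_to_name = fd_holds(students, ["stuId"], ["lastName"])
credits_to_status = fd_holds(students, ["credits"], ["status"])
# Refuted by the sample data: two different students are named Jones.
name_to_stu = fd_holds(students, ["lastName"], ["stuId"])
```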

21 Functional Dependency
A functional dependency occurs when the value of one attribute (or set of attributes) determines the value of a second attribute (or set of attributes):
StudentID → StudentName
StudentID → (DormName, DormRoom, Fee)
The attribute on the left side of the functional dependency is called the determinant
Functional dependencies may be based on equations:
ExtendedPrice = Quantity × UnitPrice
(Quantity, UnitPrice) → ExtendedPrice
But functional dependencies are not equations!

22 Composite Determinants
A determinant of a functional dependency may itself consist of more than one attribute:
(StudentName, ClassName) → (Grade)
Note that neither attribute determines Grade on its own:
StudentName ↛ Grade
ClassName ↛ Grade

23 Functional Dependencies in the SKU_DATA Table
Can you find three FDs?

24 Functional Dependencies in the ORDER_ITEM Table
Can you find two?

25 Keys are determinants
A key is a combination of one or more attributes that is used to identify rows in a relation
A composite key is a key that consists of two or more attributes

26 Keys & FDs
Superkey – functionally determines all attributes in a relation
Candidate key – a superkey that is a minimal identifier
Primary key – chosen candidate key
–Must always be filled (non-null)
–Must be unique
–May be composite
–Ideally, it is short, numeric and never changes
Entity integrity
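The superkey and candidate-key definitions can be checked mechanically with attribute closure. The following is a brute-force sketch, not something taken from the slides: the function names closure and candidate_keys are hypothetical, and the FDs are the ones listed for NewStudent in the "Example of FDs" slide.

```python
from itertools import combinations


def closure(attrs, fds):
    """Attribute closure: everything functionally determined by attrs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def candidate_keys(relation, fds):
    """Minimal attribute sets whose closure is the whole relation."""
    keys = []
    for size in range(1, len(relation) + 1):
        for combo in combinations(sorted(relation), size):
            if any(set(k) <= set(combo) for k in keys):
                continue  # contains a smaller key, so it is not minimal
            if closure(combo, fds) == set(relation):
                keys.append(combo)
    return keys


new_student = {"stuId", "lastName", "major", "credits", "status", "ssn"}
fds = [
    (("stuId",), ("lastName", "major", "credits", "status", "ssn")),
    (("ssn",), ("stuId", "lastName", "major", "credits", "status")),
    (("credits",), ("status",)),
]

keys = candidate_keys(new_student, fds)
# (stuId, major) determines every attribute, but it is not minimal.
is_superkey = closure({"stuId", "major"}, fds) == new_student
```

Here stuId and ssn each come out as candidate keys, while (stuId, major) is a superkey only. The search is exponential in the number of attributes, which is acceptable for classroom-sized relations.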

27 Surrogate Keys
A surrogate key is an artificial column added to a relation to serve as a primary key:
–DBMS supplied
–Short, numeric, never changes, never reused
–Has artificial values that may be meaningless to users
–See next slide for example

28 Utility of Surrogate Keys
RENTAL_PROPERTY with no surrogate key:
RENTAL_PROPERTY (Street, City, State/Province, Zip/PostalCode, Country, Rental_Rate)
RENTAL_PROPERTY with a surrogate key:
RENTAL_PROPERTY (PropertyID, Street, City, State/Province, Zip/PostalCode, Country, Rental_Rate)
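As an illustration of a DBMS-supplied surrogate key, here is a sketch using SQLite through Python's standard sqlite3 module. The slides do not name a DBMS, so SQLite is an assumption; the column names StateProvince and ZipPostalCode are simplified stand-ins for the slide's State/Province and Zip/PostalCode, and the addresses are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE RENTAL_PROPERTY (
        PropertyID    INTEGER PRIMARY KEY AUTOINCREMENT,  -- surrogate key
        Street        TEXT NOT NULL,
        City          TEXT NOT NULL,
        StateProvince TEXT NOT NULL,
        ZipPostalCode TEXT NOT NULL,
        Country       TEXT NOT NULL,
        Rental_Rate   REAL NOT NULL,
        -- the natural five-column key is still declared unique
        UNIQUE (Street, City, StateProvince, ZipPostalCode, Country)
    )
    """
)

insert = (
    "INSERT INTO RENTAL_PROPERTY "
    "(Street, City, StateProvince, ZipPostalCode, Country, Rental_Rate) "
    "VALUES (?, ?, ?, ?, ?, ?)"
)
# The application never supplies PropertyID; the DBMS assigns it.
first_id = conn.execute(
    insert, ("12 Elm St", "Seattle", "WA", "98101", "USA", 1500.0)
).lastrowid
second_id = conn.execute(
    insert, ("9 Oak Ave", "Vancouver", "BC", "V5K 0A1", "Canada", 1800.0)
).lastrowid
conn.commit()
```

Other tables can now reference a property through the single short PropertyID column instead of carrying all five address columns as a composite foreign key.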

29 Foreign Keys
A foreign key is the primary key of one relation that is placed in another relation
–It forms a link between the two relations
–A foreign key can be atomic (single column) or composite
DEPARTMENT (DepartmentName, BudgetCode, ManagerName)
EMPLOYEE (EmployeeNumber, EmployeeName, DepartmentName)

30 Referential Integrity
Relations must exhibit integrity in their links to other relations
–A foreign key field must contain a value that equals a primary key value in the corresponding relation, or is NULL
SKU_DATA (SKU, SKU_Description, Department, Buyer)
ORDER_ITEM (OrderNumber, SKU, Quantity, Price, ExtendedPrice)
Where ORDER_ITEM.SKU must exist in SKU_DATA.SKU

31 When might a foreign key value be NULL?
A CUSTOMER may have no AGENT
An EMPLOYEE may be assigned to no DEPARTMENT (yet)
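The referential integrity rule, including the NULL case, can be tried out with the DEPARTMENT and EMPLOYEE relations from the Foreign Keys slide. This sketch assumes SQLite via Python's sqlite3 module (the slides do not name a DBMS); the column types and all data values are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves enforcement off by default
conn.execute(
    "CREATE TABLE DEPARTMENT ("
    " DepartmentName TEXT PRIMARY KEY, BudgetCode TEXT, ManagerName TEXT)"
)
conn.execute(
    "CREATE TABLE EMPLOYEE ("
    " EmployeeNumber INTEGER PRIMARY KEY,"
    " EmployeeName TEXT NOT NULL,"
    " DepartmentName TEXT REFERENCES DEPARTMENT (DepartmentName))"
)

conn.execute("INSERT INTO DEPARTMENT VALUES ('Accounting', 'BC-100', 'M. Chen')")
# Foreign key matches an existing primary key value: accepted.
conn.execute("INSERT INTO EMPLOYEE VALUES (1, 'A. Rivera', 'Accounting')")
# Foreign key is NULL (employee not yet assigned): accepted.
conn.execute("INSERT INTO EMPLOYEE VALUES (2, 'B. Osei', NULL)")

try:
    # 'Marketing' does not exist in DEPARTMENT: rejected.
    conn.execute("INSERT INTO EMPLOYEE VALUES (3, 'C. Ito', 'Marketing')")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

employee_count = conn.execute("SELECT COUNT(*) FROM EMPLOYEE").fetchone()[0]
```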

32 Normal Forms
1NF – A table that qualifies as a relation is in 1NF
2NF – A relation is in 2NF if all of its nonkey attributes are dependent on every attribute in the primary key
3NF – A relation is in 3NF if it is in 2NF and has no determinants except the primary key
Boyce-Codd Normal Form (BCNF) – A relation is in BCNF if every determinant is a candidate key

33 First Normal Form (1NF)
A relation is in 1NF if every attribute is single-valued for each tuple
–each cell of the table contains only one value
Domains of attributes are atomic
–No sets
–No lists
–No repeating fields or groups allowed

34 This relation is not in 1NF
stuId  lastName  major         credits  status  ssn
S1001  Smith     History       90       Sr      100429500
S1003  Jones     Math          95       Sr      010124567
S1006  Lee       CSC, Math     15       Fr      088520876
S1010  Burns     Art, English  63       Jr      099320985
S1060  Jones     CSC           25       Fr      064624738
(Assume students can have double majors)

35 Decomposing into 1NF
Create a new table for each multi-valued attribute
–Put the PK of the original table and the multi-valued attribute in this table
–The PK of this new table is composite
–Will have additional rows for each value of the attribute
Remove the multi-valued attribute from the original table
NewStu2(stuId, lastName, credits, status, ssn)
Majors(stuId, major)

36 Two new tables
NewStu2
stuId  lastName  credits  status  ssn
S1001  Smith     90       Sr      100429500
S1003  Jones     95       Sr      010124567
S1006  Lee       15       Fr      088520876
S1010  Burns     63       Jr      099320985
S1060  Jones     25       Fr      064624738
Majors
stuId  major
S1001  History
S1003  Math
S1006  CSC
S1006  Math
S1010  Art
S1010  English
S1060  CSC
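The decomposition into NewStu2 and Majors can be sketched in a few lines of Python. The data is the student table from the slides; representing the multi-valued major cell as a Python list is an assumption made for illustration.

```python
# The non-1NF table: the major cell may hold several values.
new_student = [
    ("S1001", "Smith", ["History"], 90, "Sr", "100429500"),
    ("S1003", "Jones", ["Math"], 95, "Sr", "010124567"),
    ("S1006", "Lee", ["CSC", "Math"], 15, "Fr", "088520876"),
    ("S1010", "Burns", ["Art", "English"], 63, "Jr", "099320985"),
    ("S1060", "Jones", ["CSC"], 25, "Fr", "064624738"),
]

# NewStu2(stuId, lastName, credits, status, ssn):
# the multi-valued attribute is removed from the original table.
new_stu2 = [
    (sid, name, credits, status, ssn)
    for sid, name, _majors, credits, status, ssn in new_student
]

# Majors(stuId, major): one row per value, composite key (stuId, major).
majors = [
    (sid, major)
    for sid, _name, major_list, *_rest in new_student
    for major in major_list
]
```

Every cell now holds a single value, and each student's non-major data is stored exactly once.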

37 Another method for 1NF
“Flatten” the original table by making the multi-valued attribute part of a new composite key
Student(stuId, lastName, major, credits, status, ssn)
–See next slide…

38 Flattened and in 1NF
NewStu table with PK (stuId, major)
stuId  lastName  major    credits  status  ssn
S1001  Smith     History  90       Sr      100429500
S1003  Jones     Math     95       Sr      010124567
S1006  Lee       CSC      15       Fr      088520876
S1006  Lee       Math     15       Fr      088520876
S1010  Burns     Art      63       Jr      099320985
S1010  Burns     English  63       Jr      099320985
S1060  Jones     CSC      25       Fr      064624738
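A short Python check on the flattened table, using the rows from the slide. It confirms that (stuId, major) is unique in this data, and it shows the cost of flattening: the non-key data of double majors is now stored twice. The variable names are hypothetical.

```python
from collections import Counter

# Flattened NewStu rows: (stuId, lastName, major, credits, status, ssn).
new_stu = [
    ("S1001", "Smith", "History", 90, "Sr", "100429500"),
    ("S1003", "Jones", "Math", 95, "Sr", "010124567"),
    ("S1006", "Lee", "CSC", 15, "Fr", "088520876"),
    ("S1006", "Lee", "Math", 15, "Fr", "088520876"),
    ("S1010", "Burns", "Art", 63, "Jr", "099320985"),
    ("S1010", "Burns", "English", 63, "Jr", "099320985"),
    ("S1060", "Jones", "CSC", 25, "Fr", "064624738"),
]

# (stuId, major) is unique, so it can serve as the composite primary key.
pk_values = [(row[0], row[2]) for row in new_stu]
pk_is_unique = len(set(pk_values)) == len(pk_values)

# stuId alone is no longer unique: these students appear on several rows,
# so their lastName, credits, status and ssn are stored redundantly.
rows_per_student = Counter(row[0] for row in new_stu)
repeated = sorted(sid for sid, n in rows_per_student.items() if n > 1)
```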

39 Another (generally not-so-good) method for 1NF
If the number of repeats is specified or limited, we can make additional columns for the multiple values
Student(stuId, lastName, major1, major2, credits, status, ssn)

40 This relation is also not in 1NF

41 The “flattening” approach works best here
But we are still prone to all forms of anomalies, and so we must go on to transform this into 2NF and above.

42 Second Normal Form (2NF)
A relation is in second normal form (2NF) if it is in first normal form and all the non-key attributes are fully functionally dependent on the key.
–No non-key attribute is FD on just part of the key
–If R’s key has only one attribute (i.e., is not composite), and R is in 1NF, R is automatically in 2NF

43 Full Functional Dependency
Transforming to 2NF requires that all dependencies within a relation are full functional dependencies
–i.e., no partial dependencies on the key
In relation R, a set of attributes B is fully functionally dependent on a set of attributes A if B is functionally dependent on A…
–but not functionally dependent on any proper subset of A
This means every attribute in A is needed to functionally determine B

44 NewClass is not in 2NF
NewClass(courseNo, stuId, lastName, facId, schedule, room, grade)
FDs:
(courseNo,stuId) → (lastName)
(courseNo,stuId) → (facId)
(courseNo,stuId) → (schedule)
(courseNo,stuId) → (room)
(courseNo,stuId) → (grade)
Looks like we’ve found a primary key: (courseNo, stuId)
courseNo → facId
courseNo → schedule
courseNo → room
stuId → lastName
…plus trivial FDs that are partial…
But… facId, schedule, room and lastName are all only partially dependent on the key
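The partial dependencies in NewClass can be picked out mechanically: an FD is partial when its determinant is a proper subset of the key and its dependent attribute is not part of the key. A small Python sketch follows; the function name is hypothetical and the FDs are the non-trivial ones listed on the slide, written with one dependent attribute each.

```python
def partial_dependencies(key, fds):
    """FDs whose determinant is a proper subset of the key and whose
    dependent attribute is not itself part of the key."""
    key = set(key)
    return [(lhs, rhs) for lhs, rhs in fds if set(lhs) < key and rhs not in key]


key = ("courseNo", "stuId")
fds = [
    (("courseNo", "stuId"), "lastName"),
    (("courseNo", "stuId"), "facId"),
    (("courseNo", "stuId"), "schedule"),
    (("courseNo", "stuId"), "room"),
    (("courseNo", "stuId"), "grade"),
    (("courseNo",), "facId"),
    (("courseNo",), "schedule"),
    (("courseNo",), "room"),
    (("stuId",), "lastName"),
]

partial = partial_dependencies(key, fds)
partially_dependent = {rhs for _lhs, rhs in partial}
# Attributes determined by the whole key and by no proper part of it.
fully_dependent = {
    rhs for lhs, rhs in fds if set(lhs) == set(key)
} - partially_dependent
```

Only grade is fully functionally dependent on (courseNo, stuId), which is why NewClass is not in 2NF.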

45 Decomposing into 2NF
Identify each partial FD.
Create a new relation for each part of the PK that determines other attributes.
–Remove attributes that depend on each of these determinants from the original relation & put them in the new relation
–I.e., place all determinants in separate relations along with their dependent attributes
In the original relation keep the composite key and any attributes that are fully functionally dependent on all of it.
Even if the composite key has no dependent attributes, keep that relation to connect logically to the other relations.

46 Putting NewClass into 2NF
NewClass(courseNo, stuId, lastName, facId, schedule, room, grade)
FDs grouped by determinant:
courseNo → (courseNo, facId, schedule, room)
stuId → (stuId, lastName)
(courseNo,stuId) → (courseNo, stuId, facId, schedule, room, lastName, grade)
Create tables grouped by determinants:
Course(courseNo, facId, schedule, room)
Stu(stuId, lastName)
Keep relation with original composite key, with attributes FD on it:
NewStu2(courseNo, stuId, grade)
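The grouping-by-determinant step can be written as a short Python sketch. This is one possible way to mechanize the procedure, not the textbook's algorithm; the function name decompose_2nf and the dictionary representation of the partial FDs are assumptions.

```python
def decompose_2nf(attributes, key, fds):
    """Split a 1NF relation into 2NF relations, one per determinant.

    fds maps each determinant (a tuple of attributes) to the set of
    non-key attributes it determines. The result maps each determinant
    to the attribute list of its relation, in the original column order.
    """
    key = set(key)
    relations = {}
    moved = set()
    for lhs, rhs in fds.items():
        if set(lhs) < key:  # partial dependency: determinant is part of the key
            relations[lhs] = list(lhs) + [a for a in attributes if a in rhs]
            moved |= set(rhs)
    # The original relation keeps the whole key plus what depends on all of it.
    whole_key = tuple(a for a in attributes if a in key)
    relations[whole_key] = [a for a in attributes if a in key or a not in moved]
    return relations


new_class = ["courseNo", "stuId", "lastName", "facId", "schedule", "room", "grade"]
fds = {
    ("courseNo",): {"facId", "schedule", "room"},
    ("stuId",): {"lastName"},
}
result = decompose_2nf(new_class, ("courseNo", "stuId"), fds)
```

The three attribute lists in result correspond to Course, Stu and the remaining relation keyed on (courseNo, stuId) from the slide.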

47 2NF - Putting it all together
We started with:
–NewClass(courseNo, stuId, lastName, facId, schedule, room, grade)
It was already in 1NF
We decomposed it into 2NF:
–Course(courseNo, facId, schedule, room)
–Stu(stuId, lastName)
–NewStu2(courseNo, stuId, grade)

48 Third Normal Form
A relation is in 3NF if whenever a non-trivial functional dependency X→A exists,
–either X is a superkey or
–A is a member of some candidate key
To be in 3NF, a relation must be in 2NF and have no transitive dependencies
–I.e., no non-key attribute may determine another non-key attribute.
–Here key includes “candidate key”

49 Transitive Dependency
If A, B, and C are attributes of relation R, such that A → B, and B → C, then C is transitively dependent on A.
NewStudent (stuId, lastName, major, credits, status)
FD: credits→status
…but credits is not a key
By transitivity: stuId→credits AND credits→status implies stuId→status

50 Decomposing into 3NF
NewStudent (stuId, lastName, major, credits, status)
FD: credits→status
Remove the dependent attribute, status, from the relation
Create a new table with the dependent attribute and its determinant, credits
Keep the determinant in the original table
NewStu2 (stuId, lastName, major, credits)
Status (credits, status)
Both are in 3NF
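The 3NF test from the definition (X is a superkey, or A belongs to a candidate key) and the split described above can be sketched in Python. The helper names are hypothetical, and the set of key attributes is passed in by hand rather than computed, which is a simplifying assumption.

```python
def closure(attrs, fds):
    """Attribute closure: everything functionally determined by attrs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def violations_3nf(attributes, fds, prime):
    """FDs X -> A that break 3NF: X is not a superkey and A is not
    a member of any candidate key (prime holds the key attributes)."""
    bad = []
    for lhs, rhs in fds:
        if closure(lhs, fds) != set(attributes):
            extra = tuple(a for a in rhs if a not in lhs and a not in prime)
            if extra:
                bad.append((lhs, extra))
    return bad


new_student = ["stuId", "lastName", "major", "credits", "status"]
fds = [
    (("stuId",), ("lastName", "major", "credits", "status")),
    (("credits",), ("status",)),
]
violations = violations_3nf(new_student, fds, prime={"stuId"})

# Split off the offending FD; the determinant stays in the original table.
determinant, dependents = violations[0]
new_stu2 = [a for a in new_student if a not in dependents]
status_rel = list(determinant) + list(dependents)
```

The only violation found is credits → status, and removing it yields the NewStu2 and Status schemas shown on the slide.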

51 1NF/2NF/3NF Process
Before moving on to BCNF…
–Put relation in 1NF: remove all multi-valued attributes, repeating groups, etc.
–List all FDs and find a key
–Put relation in 2NF: remove all partial dependencies on the key
–Put relation in 3NF: remove all transitive dependencies

52 Comprehensive example
Work(projName, projMgr, empID, hours, empName, budget, startDate, salary, empMgr, empDept, rating)
–If not in 1NF, fix it
–List all FDs
–Find a key
–Remove all partial dependencies (2NF)
–Remove all transitive dependencies (3NF)

