Presentation is loading. Please wait.

Presentation is loading. Please wait.

Unit 5: Normal Forms Based on Functional Dependencies

Similar presentations


Presentation on theme: "Unit 5: Normal Forms Based on Functional Dependencies"— Presentation transcript:

1 Unit 5: Normal Forms Based on Functional Dependencies
IT238: Data Modeling and Database Design Unit 5: Normal Forms Based on Functional Dependencies Instructor: Qing Yan, M.D., Ph.D.

2 Week 3 Review: Assignment
Chapter 8 – Normal Forms Based on Functional Dependencies

3 Notation Figure 3.2 Summary of Presentation Layer ER diagram notation
From: Umanath, N.S., & Scamell, R. (2007). Data Modeling and Database Design. Boston: Thomson Course Technology. Chapter 3 – Entity-Relationship Modeling

4 Notation (continued) Figure 3.2 Summary of Presentation Layer ER diagram notation (continued) Chapter 3 – Entity-Relationship Modeling

5 The Synthesis Approach to Generate an Initial Set of Entity Types and Attributes
List all discernable data elements and treat them as attributes Group these attributes based on apparent commonalities Designate each cluster of attributes as an entity type Review leftover data elements and investigate the possibility that some of them serve as links among the entity types previously identified Designate these links as relationship types From: Umanath, N.S., & Scamell, R. (2007). Data Modeling and Database Design. Boston: Thomson Course Technology. Chapter 3 – Entity-Relationship Modeling

6 The Analysis Approach to Generate an Initial Set of Entity Types and Attributes
Begin by searching for things that can be labeled by singular nouns and call these things entity types Gather properties that appear to belong to individual entity types and label them as attributes of a particular entity type Be sensitive throughout the process to the identification of relationships among the various entity types From: Umanath, N.S., & Scamell, R. (2007). Data Modeling and Database Design. Boston: Thomson Course Technology. Chapter 3 – Entity-Relationship Modeling

7 Week 3 Assignment Rubric
Chapter 8 – Normal Forms Based on Functional Dependencies

8 Week 3 Assignment Answer
Chapter 8 – Normal Forms Based on Functional Dependencies

9 Week 4 Key Concepts Relational schema A set of relation schemas in a relational data model. Relation state A specific occurrence of a relation (or should we say relation schema). Superkey A set of one or more attributes, which taken collectively, uniquely identifies a tuple of a relation. Superset A set S1 is a superset of another set S2 if every element in S2 is in S1. S1 may have elements that are not in S2. From: Umanath, N.S., & Scamell, R. (2007). Data Modeling and Database Design. Boston: Thomson Course Technology. Chapter 8 – Normal Forms Based on Functional Dependencies

10 Week 4 Key Concepts Candidate key A superkey with no proper subsets that are superkeys. Uniqueness property The prohibition of the presence of identical values for the collection of attribute(s) that constitute a unique identifier. Primary key Chosen from among the candidate keys to serve the role of uniquely identifying tuples of a relation schema. From: Umanath, N.S., & Scamell, R. (2007). Data Modeling and Database Design. Boston: Thomson Course Technology. Chapter 8 – Normal Forms Based on Functional Dependencies

11 Week 4 Assignment 1. What is a relation schema? What is the difference between a relation, a relation schema, and a relational schema? A relation schema: the basic structure of a relation. It consists of a name, and a set of related attributes. When sets of values are assigned to the attributes that make up a relation schema, a relation is created. A relation is a relation schema with data. A relational schema is a set of relation schemas. A relational schema is also called a relational data model. From: Umanath, N.S., & Scamell, R. (2007). Data Modeling and Database Design. Boston: Thomson Course Technology. Chapter 8 – Normal Forms Based on Functional Dependencies

12 Unit 5 Objectives Apply the process of normalization to resolve data redundancy problems in a relation schema Demonstrate the need for dependency preservation during the normalization process Chapter 1 – Database Systems: Architecture and Components

13 Unit 5 To-Do List Complete the reading Textbook and Web
Participate in the discussion board 30 points Attend the introductory seminar or complete FLA quiz 20 points Complete the unit assignment 50 points Chapter 1 – Database Systems: Architecture and Components

14 Unit 5 Assignment Rubric
Chapter 8 – Normal Forms Based on Functional Dependencies

15 Key Concepts Normalization
The process of rendering undesirable functional dependencies desirable. A technique to facilitate systematic validation of the participation of attributes in a relation schema from a perspective of data redundancy. First, second, third and Boyce-Codd normal forms. From: Umanath, N.S., & Scamell, R. (2007). Data Modeling and Database Design. Boston: Thomson Course Technology. Chapter 1 – Database Systems: Architecture and Components

16 Key Concepts Normalization in the database design process
To render undesirable functional dependencies desirable. From: Umanath, N.S., & Scamell, R. (2007). Data Modeling and Database Design. Boston: Thomson Course Technology. Chapter 8 – Normal Forms Based on Functional Dependencies

17 Key Concepts Functional dependencies – the source
technical translations of user-specified business rules expressed as constraints in a relation schema. From: Umanath, N.S., & Scamell, R. (2007). Data Modeling and Database Design. Boston: Thomson Course Technology. Chapter 8 – Normal Forms Based on Functional Dependencies

18 Key Concepts First normal form A relation schema R where the attributes comprising the schema are atomic and single-valued. Second normal form A relation schema R where every non-prime attribute in R is fully functionally dependent on the primary key of R. Third normal form A relation schema R where no non-prime attribute is functionally dependent on another non-prime attribute in R. From: Umanath, N.S., & Scamell, R. (2007). Data Modeling and Database Design. Boston: Thomson Course Technology. Chapter 1 – Database Systems: Architecture and Components

19 Key Concepts Boyce-Codd normal form A relation schema R such that for every non-trivial functional dependency in R, the determinant is a superkey of R. From: Umanath, N.S., & Scamell, R. (2007). Data Modeling and Database Design. Boston: Thomson Course Technology. Chapter 1 – Database Systems: Architecture and Components

20 Key Concepts Reverse engineering The process of going from a lower level of abstraction (e.g., a normalized relational schema) to a higher level of abstraction (e.g., a Design-specific ER diagram). From: Umanath, N.S., & Scamell, R. (2007). Data Modeling and Database Design. Boston: Thomson Course Technology. Chapter 8 – Normal Forms Based on Functional Dependencies

21 Key Concepts The difference between a desirable and an undesirable functional dependency A desirable functional dependency is any functional dependency in a relation schema R where the determinant is a candidate key of R. Undesirable functional dependencies in a relation schema occur when the determinant of the functional dependency is not a candidate key of R. Functional dependencies of this nature are the seeds of data redundancies that lead to modification anomalies. From: Umanath, N.S., & Scamell, R. (2007). Data Modeling and Database Design. Boston: Thomson Course Technology. Chapter 8 – Normal Forms Based on Functional Dependencies

22 Normalization Normalization is a technique that facilitates systematic validation of the participation of attributes in a relation schema from a perspective of data redundancy. Normal Forms (NFs) provide a stepwise progression towards attaining the goal of a fully normalized relation schema that is guaranteed to be free of data redundancies that cause update anomalies from a functional dependency perspective. From: Umanath, N.S., & Scamell, R. (2007). Data Modeling and Database Design. Boston: Thomson Course Technology. Chapter 8 – Normal Forms Based on Functional Dependencies

23 Desirable Versus Undesirable FDs
Desirable FDs in a relation schema R are those where the determinant is a candidate key of R. Undesirable FDs in a relation schema R are those where the determinant is not a candidate key of R. That is, the FDs will cause data redundancy and the consequent modification anomalies in R. From: Umanath, N.S., & Scamell, R. (2007). Data Modeling and Database Design. Boston: Thomson Course Technology. Chapter 8 – Normal Forms Based on Functional Dependencies

24 Normal Forms: An Overview
A relation schema is said to be in a particular normal form if it satisfies certain prescribed criteria for that normal form. First normal form (1NF) reflects one of the properties of a relation schema – i.e., by definition a relation schema is in 1NF. The normal forms associated with functional dependencies are second (2NF), third (3NF), and Boyce-Codd (BCNF) normal forms. From: Umanath, N.S., & Scamell, R. (2007). Data Modeling and Database Design. Boston: Thomson Course Technology. Chapter 8 – Normal Forms Based on Functional Dependencies

25 Normal Forms: An Overview (continued)
The violations of each of these normal forms signal the presence of a specific type of ‘undesirable’ FD. violation of a normal form, can be interpreted as equivalent to an inadvertent mixing up of entity types belonging to two different entity classes in a single entity type. From: Umanath, N.S., & Scamell, R. (2007). Data Modeling and Database Design. Boston: Thomson Course Technology.

26 First Normal Form (1NF) First normal form (1NF) imposes conditions so that a base relation that is physically stored as a file does not contain records with a variable number of fields. This is accomplished by prohibiting multi-valued attributes, composite attributes, and combinations thereof in a relation schema. Such a constraint, in effect, prevents relations from containing other relations. In essence, 1NF, by definition, requires that the domain of an attribute must include only atomic values and that the value of an attribute in a relation’s tuple must be a single value from the domain of that attribute. From: Umanath, N.S., & Scamell, R. (2007). Data Modeling and Database Design. Boston: Thomson Course Technology. Chapter 8 – Normal Forms Based on Functional Dependencies

27 1NF Violation – An Example
Note 1: Album_no is the primary key of ALBUM Note 2: Artist_nm is a multi-valued attribute causing a first normal form violation. From: Umanath, N.S., & Scamell, R. (2007). Data Modeling and Database Design. Boston: Thomson Course Technology. Chapter 8 – Normal Forms Based on Functional Dependencies

28 Resolution of 1NF Violation
From: Umanath, N.S., & Scamell, R. (2007). Data Modeling and Database Design. Boston: Thomson Course Technology. Chapter 8 – Normal Forms Based on Functional Dependencies

29 Second Normal Form (2NF)
A relation schema R is in 2NF if every non-prime attribute in R is fully functionally dependent on the primary key of R. This means a non-prime attribute is not functionally dependent on a proper subset of the primary key of R. The Second Normal Form (2NF) is based on a concept known as full functional dependency. A functional dependency of the form Z  A is a ‘full functional dependency’ if and only if no proper subset of Z functionally determines A. In other words, if Z  A and X  A, and X is a proper subset of Z, then Z does not fully functionally determine A, i.e., Z  A is not a full functional dependency; it is a partial dependency. From: Umanath, N.S., & Scamell, R. (2007). Data Modeling and Database Design. Boston: Thomson Course Technology. Chapter 8 – Normal Forms Based on Functional Dependencies

30 Violation of 2NF Note: The primary key and the attributes A, Y, and Z can be atomic or composite. A is a prime attribute, whereas Y and Z are non-prime attributes. In order for a partial dependency to exist here, Attribute A must be a proper subset of the primary key of R1. From: Umanath, N.S., & Scamell, R. (2007). Data Modeling and Database Design. Boston: Thomson Course Technology. Chapter 8 – Normal Forms Based on Functional Dependencies

31 2NF Violation - An Example
What is the cause of the 2NF violation in NEW_ALBUM? From: Umanath, N.S., & Scamell, R. (2007). Data Modeling and Database Design. Boston: Thomson Course Technology. Chapter 8 – Normal Forms Based on Functional Dependencies

32 2NF Violation Explained
F: fd1: Album_no  Price; fd2: Album_no  Stock F+: F; fd12: Album_no  {Price, Stock}; fd12x: {Album_no, Artist_nm}  {Price, Stock} Fc = F Candidate Key of NEW_ALBUM: (Album_no, Artist_nm); Primary Key: (Album_no, Artist_nm) fd1 and fd2 (fd12) violate 2NF in NEW_ALBUM From: Umanath, N.S., & Scamell, R. (2007). Data Modeling and Database Design. Boston: Thomson Course Technology. Chapter 8 – Normal Forms Based on Functional Dependencies

33 Modification Anomalies Resulting from 2NF Violation
Suppose we want to change the value of Price or Stock of Album_no BTL007 in NEW_ALBUM Multipe tuples require update and failure to update some will change the semantics of the scenario  update anomaly Suppose we want to add a new tuple (Album_no: XY111, Price: 17.95, and Stock: 100) to NEW_ALBUM Unless we also know the artist(s) the insert is not possible since Artist_nm is part of the primary key of NEW_ALBUM  insertion anomaly Suppose we want to delete Album_no BTL007 from NEW_ALBUM Entails deletion of multiple tuples  deletion anomaly From: Umanath, N.S., & Scamell, R. (2007). Data Modeling and Database Design. Boston: Thomson Course Technology. Chapter 8 – Normal Forms Based on Functional Dependencies

34 Resolution of 2NF Violation
The resolution of 2NF violation is a two-step process that decomposes the target relation schema with the undesirable FDs into multiple relation schemas such that the undesirable FDs are rendered desirable. Pull out the undesirable FD(s) from the target relation schema as separate relation schema(s). Retain the determinant of the pulled-out relation schema as an attribute(s) in the leftover target relation schema to facilitate reconstruction of the original target relation schema. From: Umanath, N.S., & Scamell, R. (2007). Data Modeling and Database Design. Boston: Thomson Course Technology. Chapter 8 – Normal Forms Based on Functional Dependencies

35 Resolution of 2NF Violation Demonstrated
Note: Album_no  Price Album_no  Stock Note: No non-trivial FD present From: Umanath, N.S., & Scamell, R. (2007). Data Modeling and Database Design. Boston: Thomson Course Technology. Chapter 8 – Normal Forms Based on Functional Dependencies

36 Third Normal Form – 3NF A relation schema R is in 3NF if it is in 2NF and no non-prime attribute is functionally dependent on another non-prime attribute in R. The Third Normal Form (3NF) is based on the concept of transitive dependency. Given a relation schema R (X, A, B) where X, A, and B are pair-wise disjoint atomic or composite attributes, X is the primary key of R, and A and B are non-prime attributes If A  B (or B  A) in R, then B (or A) is said to be ‘transitively dependent’ on X, the primary key of R. From: Umanath, N.S., & Scamell, R. (2007). Data Modeling and Database Design. Boston: Thomson Course Technology. Chapter 8 – Normal Forms Based on Functional Dependencies

37 BCNF Defined A relation schema R is in BCNF if for every non-trivial functional dependency in R, the determinant is a superkey of R Remember: An FD in R is trivial if and only if the dependent is a subset of the determinant. Note: By this definition, violation of 2NF or 3NF also imply violation of BCNF From: Umanath, N.S., & Scamell, R. (2007). Data Modeling and Database Design. Boston: Thomson Course Technology. Chapter 8 – Normal Forms Based on Functional Dependencies

38 Possible Side Effects of Normalization
Attribute Preservation i.e., no attribute should be lost in the decomposition process. Dependency Preservation i.e., any decomposition should continue to preserve the minimal cover of F. Lossless-Join (Non-additive or Non-loss Join) Property i.e., the decomposition should be strictly reversible in that the reversal should strictly yield the original target relation in tact with no loss of tuples or addition of spurious tuples. From: Umanath, N.S., & Scamell, R. (2007). Data Modeling and Database Design. Boston: Thomson Course Technology. Chapter 8 – Normal Forms Based on Functional Dependencies

39 Lossless-Join Explained
A decomposition of a relation schema R should be strictly (losslessly) reversible. The term “loss” in “lossless” implies loss of information – not necessarily loss of tuples generation of spurious tuples resulting from the natural join of the projections of R entails loss of information Lossless-join property is always predicated on: a set of FDs (F) prevailing over R the premise that the join attributes in the decomposition are non-null values Lossless Join is also referred to as Non-additive or Non-loss Join From: Umanath, N.S., & Scamell, R. (2007). Data Modeling and Database Design. Boston: Thomson Course Technology.

40 Normal Forms in a Nutshell
1NF 2NF 3NF BCNF Modification Anomalies Yes Yes Yes No (due to FDs) Lossless Join N/A Yes Yes Yes Decomposition Dependency Yes Yes Yes Yes when* Preservation on decomposition * BCNF design that result from 2NF and 3NF resolutions requiring no further normalization. From: Umanath, N.S., & Scamell, R. (2007). Data Modeling and Database Design. Boston: Thomson Course Technology.

41 Summary Notes on Normalization
Normalization eliminates data redundancies that cause modification anomalies by appropriately decomposing the target relation schema. Undesirable FDs in the target relation schema are not discarded, but are rendered “desirable” in the decomposed relational schema. There is no specific merit in resolving 2NF violation before 3NF violation – the ordering is only historical. BCNF subsume 2NF and 3NF in that 2NF and 3NF violations are also BCNF violations. Likewise, a relation schema in BCNF is also in 2NF and 3NF. A BCNF solution that also has the lossless join property and is dependency preserving may not be achievable at all times. In this case a lossless join is preferred and Dependency preservation is accomplished via application programs or materialized views From: Umanath, N.S., & Scamell, R. (2007). Data Modeling and Database Design. Boston: Thomson Course Technology. Chapter 8 – Normal Forms Based on Functional Dependencies

42 Normalization Step 1: Identify the candidate keys of URS1, given F
 (Store, Branch, Customer, Vendor) and (Manager, Customer, Vendor) Step 2: Choose a primary key  (Manager, Customer, Vendor) Step 3: Identify violations of normal forms for every FD in F Step 4: Resolve 2NF and 3NF violations and decompose Step 5: Resolve BCNF violations and decompose From: Umanath, N.S., & Scamell, R. (2007). Data Modeling and Database Design. Boston: Thomson Course Technology. Chapter 8 – Normal Forms Based on Functional Dependencies

43 Normalization (continued)
Step 6: Is the decomposition attribute preserving? Is the decomposition dependency preserving? Does the decomposition exhibit lossless-join property? Cross-check the decomposition! From: Umanath, N.S., & Scamell, R. (2007). Data Modeling and Database Design. Boston: Thomson Course Technology. Chapter 8 – Normal Forms Based on Functional Dependencies

44 Summary The process of normalization to resolve data redundancy problems in a relation schema The need for dependency preservation during the normalization process

45 Q & A Questions?


Download ppt "Unit 5: Normal Forms Based on Functional Dependencies"

Similar presentations


Ads by Google