Functional Dependency and Normalization


Functional Dependency and Normalization
- Informal design guidelines for relation schemas.
- Functional dependencies.
- Normal forms.
- Normalization.

Informal Design Guidelines
Semantics of relations and attributes.
Guideline 1: Design a relation schema so that it is easy to explain its meaning. (Fig. 14.1, 14.2)
- Do not combine attributes from multiple entity types and relationship types into a single relation. (Fig. 14.3)
Reducing redundant values in tuples saves storage space and avoids update anomalies. (Fig. 14.4)
- Insertion anomalies.
- Deletion anomalies.
- Modification anomalies.
Guideline 2: Design the base relation schemas so that no insertion, deletion, or modification anomalies occur.

Insertion anomalies
- To insert a department that has no employees yet, the employee attributes must be filled with null values, which creates problems.
- Inserting a new employee tuple requires department values that are consistent with the existing tuples; otherwise the relation becomes inconsistent.
Deletion anomalies
- If we delete the last employee of a department, the department information is deleted as well.
Modification anomalies
- If we change the manager of department 5, we must update the tuples of all employees who work in that department.

Figure 14.1 Simplified version of the COMPANY relational database schema.

Figure 14.2 Example relations for the schema of Figure 14.1

Figure 14.3 Two relation schemas and their functional dependencies. Both suffer from update anomalies. (a) The EMP_DEPT relation schema. (b) The EMP_PROJ relation schema.

Figure 14.4 Example relations for the schemas in Figure 14.3 that result from applying NATURAL JOIN to the relations in Figure 14.2.

Figure 14.5 Alternative (bad) representation of the EMP_PROJ relation. (a) Representing EMP_PROJ of Figure 14.3(b) by two relation schemas: EMP_LOCS and EMP_PROJ1. (b) Result of projecting the populated relation EMP_PROJ of Figure 14.4 on the attributes of EMP_LOCS and EMP_PROJ1.

Figure 14.5 (continued)

Figure 14.6 Result of applying the NATURAL JOIN operation to the tuples above dotted lines in EMP_PROJ1 and EMP_LOCS, with generated spurious tuples marked by an asterisk.

Informal Design Guidelines
Reducing the null values in tuples.
- e.g., if only 10% of employees have offices, it is better to have a separate relation, EMP_OFFICE, rather than an attribute OFFICE_NUMBER in EMPLOYEE.
Guideline 3: Avoid placing attributes in a base relation whose values are mostly null.
Disallowing spurious tuples.
- Spurious tuples: tuples that are not in the original relation but are generated by the natural join of decomposed subrelations.
- Example: decompose EMP_PROJ into EMP_LOCS and EMP_PROJ1. (Fig. 14.5)
- The natural join of EMP_LOCS and EMP_PROJ1 results in spurious tuples. (Fig. 14.6)
Guideline 4: Design relation schemas so that they can be naturally JOINed on primary keys or foreign keys in a way that guarantees no spurious tuples are generated.
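A minimal Python sketch of how spurious tuples arise, assuming a tiny made-up EMP_PROJ state (the two rows below are hypothetical, not the data of Figure 14.4). It projects the relation onto EMP_LOCS and EMP_PROJ1 and joins the projections back on the shared attribute Plocation.

```python
def project(rows, attrs):
    """Project a list of dict-tuples onto attrs, removing duplicates."""
    return {tuple((a, row[a]) for a in attrs) for row in rows}

def natural_join(r1, r2):
    """Natural join of two sets of tuples made of (attribute, value) pairs."""
    result = set()
    for t1 in r1:
        d1 = dict(t1)
        for t2 in r2:
            d2 = dict(t2)
            common = d1.keys() & d2.keys()
            if all(d1[a] == d2[a] for a in common):
                merged = {**d1, **d2}
                result.add(tuple(sorted(merged.items())))
    return result

# Hypothetical EMP_PROJ state: two employees on projects at the same location.
emp_proj = [
    {"Ssn": "111", "Pnumber": 1, "Hours": 10, "Ename": "Smith",
     "Pname": "ProductX", "Plocation": "Bellaire"},
    {"Ssn": "222", "Pnumber": 2, "Hours": 20, "Ename": "Wong",
     "Pname": "ProductY", "Plocation": "Bellaire"},
]

emp_locs = project(emp_proj, ["Ename", "Plocation"])
emp_proj1 = project(emp_proj, ["Ssn", "Pnumber", "Hours", "Pname", "Plocation"])

original = {tuple(sorted(row.items())) for row in emp_proj}
joined = natural_join(emp_locs, emp_proj1)
spurious = joined - original  # tuples the join invents

print(len(original), len(joined), len(spurious))  # 2 4 2
```

Because Plocation is neither a key of EMP_LOCS nor of EMP_PROJ1, every employee name is paired with every project at that location, which is the situation Guideline 4 warns against.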

Functional Dependencies
A functional dependency, denoted by X → Y, between two sets of attributes X and Y (X and Y are subsets of R) specifies a constraint on the possible tuples that can form a relation instance r of R: for any two tuples t1 and t2 in r such that t1[X] = t2[X], we must have t1[Y] = t2[Y].
- If X → Y, we say X functionally determines Y, or Y is functionally dependent on X.
- We abbreviate functional dependency by FD. X is called the left-hand side of the FD. Y is called the right-hand side of the FD.
- A functional dependency is a property of the meaning or semantics of the attributes, i.e., a property of the relation schema. It must hold on all relation states (extensions) of R.
- Relation extensions r(R) that satisfy the FD are called legal extensions.
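The definition can be checked mechanically on a single relation state. The sketch below is illustrative only: the function name `fd_holds` and the sample rows are assumptions, not part of the slides. A `False` result disproves the FD, while a `True` result only says that this particular state does not violate it.

```python
def fd_holds(rows, lhs, rhs):
    """Return True if no two rows agree on lhs but differ on rhs."""
    seen = {}
    for row in rows:
        x = tuple(row[a] for a in lhs)
        y = tuple(row[a] for a in rhs)
        # Remember the first rhs value seen for each lhs value.
        if seen.setdefault(x, y) != y:
            return False
    return True

# Hypothetical relation state.
rows = [
    {"Ssn": "111", "Ename": "Smith", "Pnumber": 1, "Hours": 10},
    {"Ssn": "111", "Ename": "Smith", "Pnumber": 2, "Hours": 5},
    {"Ssn": "222", "Ename": "Wong", "Pnumber": 1, "Hours": 10},
]

print(fd_holds(rows, ["Ssn"], ["Ename"]))             # True
print(fd_holds(rows, ["Ssn"], ["Hours"]))             # False
print(fd_holds(rows, ["Ssn", "Pnumber"], ["Hours"]))  # True
```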

Figure 14.7 The TEACH relation state with an apparent functional dependency TEXT → COURSE. However, COURSE → TEXT is ruled out.

Functional Dependencies (Cont.)
Examples.
1. SSN → ENAME
2. PNUMBER → {PNAME, PLOCATION}
3. {SSN, PNUMBER} → HOURS
4. Others?
Diagrammatic notation for displaying FDs. (Fig. 14.3)
An FD is a property of the relation schema R, not of a particular relation state/instance r(R).
FDs cannot be inferred from a given relation extension r, but must be defined explicitly by someone who knows the semantics of the attributes of R. (Fig. 14.7)

Figure 14.3 Two relation schemas and their functional dependencies. Both suffer from update anomalies. (a) The EMP_DEPT relation schema. (b) The EMP_PROJ relation schema.

Functional Dependencies (Cont.)
From the FDs
    F = {SSN → {ENAME, BDATE, ADDRESS, DNUMBER}, DNUMBER → {DNAME, DMGRSSN}}
we can infer the following FDs:
    SSN → {ENAME, DMGRSSN}, SSN → SSN, DNUMBER → DNAME
An FD X → Y is inferred from a set of dependencies F specified on R if X → Y holds in every relation state r that is a legal extension of R.
F |= X → Y denotes that X → Y is inferred from F.
The closure of F, denoted by F+, is the set of all FDs that can be inferred from F.

Functional Dependencies (Cont.)
Inference rules for FDs. Abbreviated notation: XYZ → UV for {X, Y, Z} → {U, V}.
- Reflexive: if Y ⊆ X, then X → Y
- Augmentation: {X → Y} |= XZ → YZ
- Transitive: {X → Y, Y → Z} |= X → Z
- Decomposition (projective): {X → YZ} |= X → Y
- Union (additive): {X → Y, X → Z} |= X → YZ
- Pseudotransitive: {X → Y, WY → Z} |= WX → Z
The first three rules, called Armstrong's inference rules, are sound and complete.

Functional Dependencies (Cont.)
Closure of X under F, denoted by X+, is the set of all attributes that are functionally determined by X under F.
Algorithm for determining X+:
    X+ := X;
    repeat
        oldX+ := X+;
        for each FD Y → Z in F do
            if Y ⊆ X+ then X+ := X+ ∪ Z;
    until oldX+ = X+;
Example: F = {SSN → ENAME, PNUMBER → {PNAME, PLOCATION}, {SSN, PNUMBER} → HOURS}
- {SSN}+ = {SSN, ENAME}
- {PNUMBER}+ = ?
- {SSN, PNUMBER}+ = ?
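The closure algorithm translates almost line for line into Python. In this sketch each FD is assumed to be a pair of Python sets `(lhs, rhs)`; that representation and the function name `closure` are choices made here for illustration. Running it also answers the two open exercise items.

```python
def closure(attrs, fds):
    """Compute X+ for the attribute set attrs under fds (list of (lhs, rhs) sets)."""
    result = set(attrs)
    changed = True
    while changed:                      # repeat ... until oldX+ = X+
        changed = False
        for lhs, rhs in fds:            # for each FD Y -> Z in F
            if lhs <= result and not rhs <= result:
                result |= rhs           # X+ := X+ union Z
                changed = True
    return result

# The FD set F from the example above.
F = [
    ({"SSN"}, {"ENAME"}),
    ({"PNUMBER"}, {"PNAME", "PLOCATION"}),
    ({"SSN", "PNUMBER"}, {"HOURS"}),
]

print(sorted(closure({"SSN"}, F)))  # ['ENAME', 'SSN']
print(sorted(closure({"PNUMBER"}, F)))
print(sorted(closure({"SSN", "PNUMBER"}, F)))
```

An FD X → Y can be inferred from F exactly when Y is a subset of X+, so the same function doubles as a membership test for F+.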

Functional Dependencies (Cont.)
Equivalence of sets of FDs.
- E is covered by F if every FD in E is also in F+, i.e., every FD in E can be inferred from F.
- E and F are equivalent if E+ = F+, i.e., E covers F and F covers E.
F is minimal if
- every dependency in F has a single attribute for its right-hand side;
- we cannot remove any FD from F and still have a set of FDs equivalent to F;
- we cannot replace any FD X → A in F with an FD Y → A, where Y ⊂ X, and still have a set of FDs equivalent to F.
Minimal set: a standard or canonical form with no redundancies.
A minimal cover of F is a minimal set of dependencies, Fmin, that is equivalent to F.
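Covering and equivalence can be tested without enumerating F+: F covers E when, for every X → Y in E, Y is contained in X+ computed under F. The sketch below assumes FDs are pairs of Python sets; the attribute names A, C, D, E, H and the three FD sets are made up for illustration.

```python
def closure(attrs, fds):
    """Compute X+ for attrs under fds (list of (lhs, rhs) sets)."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result

def covers(f, e):
    """True if f covers e, i.e. every FD in e can be inferred from f."""
    return all(rhs <= closure(lhs, f) for lhs, rhs in e)

def equivalent(e, f):
    """True if e and f cover each other."""
    return covers(e, f) and covers(f, e)

# Hypothetical FD sets over attributes A, C, D, E, H.
F1 = [({"A"}, {"C"}), ({"A", "C"}, {"D"}), ({"E"}, {"A", "D"}), ({"E"}, {"H"})]
F2 = [({"A"}, {"C", "D"}), ({"E"}, {"A", "H"})]
F3 = [({"A"}, {"C"})]

print(equivalent(F1, F2))  # True
print(covers(F1, F3))      # True
print(covers(F3, F1))      # False
```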

Functional Dependencies (Cont.)
Compute a minimal cover.
Algorithm 14.2 Find a minimal cover G for F.
1. G := F;
2. Replace each FD X → {A1, A2, …, Ak} in G by the k FDs X → A1, X → A2, …, X → Ak;
3. For each FD X → A in G
       for each attribute B ∈ X
           if (X - {B})+ with respect to G contains A then replace X → A with (X - {B}) → A in G;
4. For each FD X → A in G
       if X+ with respect to G - {X → A} contains A then remove X → A from G;
There is at least one minimal cover for any F, and there may be several.
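A sketch of Algorithm 14.2 in Python, assuming input FDs are pairs of sets and representing G internally as pairs `(frozenset lhs, single attribute)`. The function names and the three-FD example over attributes A, B, D are illustrative assumptions. Attributes are visited in sorted order, so the sketch returns one particular minimal cover; another visiting order may give a different one.

```python
def closure(attrs, fds):
    """X+ under fds, where each FD is (lhs_set, single_rhs_attribute)."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, a in fds:
            if lhs <= result and a not in result:
                result.add(a)
                changed = True
    return result

def minimal_cover(fds):
    """fds: list of (lhs_set, rhs_set). Returns a list of (frozenset, attribute)."""
    # Steps 1-2: G := F, with every right-hand side split into single attributes.
    g = list(dict.fromkeys(
        (frozenset(lhs), a) for lhs, rhs in fds for a in sorted(rhs)))
    # Step 3: drop B from X when (X - {B})+ with respect to G still contains A.
    for i in range(len(g)):
        a = g[i][1]
        for b in sorted(g[i][0]):
            reduced = g[i][0] - {b}
            if a in closure(reduced, g):
                g[i] = (reduced, a)
    g = list(dict.fromkeys(g))  # step 3 may have produced duplicates
    # Step 4: remove X -> A when the remaining FDs still imply it.
    for fd in list(g):
        rest = [other for other in g if other != fd]
        if fd[1] in closure(fd[0], rest):
            g = rest
    return g

# Hypothetical input: B -> A, D -> A, AB -> D.
F = [({"B"}, {"A"}), ({"D"}, {"A"}), ({"A", "B"}, {"D"})]
for lhs, a in minimal_cover(F):
    print(sorted(lhs), "->", a)
```

For this input the sketch first shortens AB → D to B → D (because B alone determines A), then drops B → A, which follows from B → D and D → A.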

Normal Forms
Superkey, candidate key or key, primary key.
- An FD X → Y is a full functional dependency if removal of any attribute from X means that the dependency does not hold any more; otherwise, it is a partial functional dependency.
- An attribute is prime if it is a member of any key (primary or candidate).
- A relation R is in first normal form if the domains of its attributes include only atomic values. (Fig. 14.8, 14.9)
- A relation R is in second normal form if every non-prime attribute A in R is not partially dependent on any key of R. Alternatively, R is in 2NF if every non-prime attribute A in R is fully dependent on every key of R.
Examples. (Fig. 14.10 a, b)

Figure 14.8 Normalization into 1NF. (a) Relational schema that is not in 1NF. (b) Example relation instance. (c) 1NF relation with redundancy.

Figure 14.9 Normalizing nested relations into 1NF. (a) Schema of the EMP_PROJ relation with a “nested relation” PROJS. (b) Example extension of the EMP_PROJ relation showing nested relations within each tuple.
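Whether a dependency is full or partial can be decided with attribute closures. The sketch below assumes FDs are pairs of Python sets and uses the EMP_PROJ dependencies listed earlier in the deck; the function name `is_full_dependency` is hypothetical.

```python
def closure(attrs, fds):
    """Compute X+ for attrs under fds (list of (lhs, rhs) sets)."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result

def is_full_dependency(lhs, rhs, fds):
    """True if lhs -> rhs holds and stops holding once any attribute leaves lhs."""
    if not rhs <= closure(lhs, fds):
        return False  # the FD does not hold at all
    return all(not rhs <= closure(lhs - {a}, fds) for a in lhs)

# EMP_PROJ dependencies from the earlier example.
F = [
    ({"SSN"}, {"ENAME"}),
    ({"PNUMBER"}, {"PNAME", "PLOCATION"}),
    ({"SSN", "PNUMBER"}, {"HOURS"}),
]

print(is_full_dependency({"SSN", "PNUMBER"}, {"HOURS"}, F))  # True
print(is_full_dependency({"SSN", "PNUMBER"}, {"ENAME"}, F))  # False
```

The second call reports a partial dependency because SSN alone already determines ENAME; ENAME is non-prime, so EMP_PROJ is not in 2NF.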

Figure 14.9 (continued) (c) Decomposing EMP_PROJ into 1NF relations EMP_PROJ1 and EMP_PROJ2 by propagating the primary key.

Figure 14.10 The normalization process. (a) Normalizing EMP_PROJ into 2NF relations. (b) Normalizing EMP_DEPT into 3NF relations.

Normal Forms
A relation R is in third normal form if for every nontrivial FD X → A that holds on R, either
- X is a superkey of R, or
- A is a prime attribute of R.
Alternative definition: R is in 3NF if it is in 2NF and no non-prime attribute of R is transitively dependent on a key. An FD X → Y is transitive if there is a set of attributes Z that is neither a candidate key nor a subset of any key (primary or candidate) of R such that X → Z and Z → Y hold.
- SSN → DMGRSSN is transitive in EMP_DEPT because SSN → DNUMBER and DNUMBER → DMGRSSN, and DNUMBER is neither a key nor a subset of a key.
Example. (Fig. 14.10 c)
A relation R is in Boyce-Codd normal form if for every nontrivial FD X → A that holds on R, X is a superkey of R. Example. (Fig. 14.12)
Increasing order of restrictiveness: 1NF, 2NF, 3NF, BCNF. For example, if a relation schema R is in BCNF, it is also in 3NF.

Figure 14.11 Normalization to 2NF and 3NF. (a) The LOTS relation schema and its functional dependencies FD1 through FD4. (b) Decomposing LOTS into the 2NF relations LOTS1 and LOTS2.
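The two definitions can be turned into a small checker. This is a sketch under stated assumptions: FDs are pairs of Python sets, candidate keys are found by brute force (fine only for small schemas), and only the FDs listed in F are tested. The EMP_DEPT dependencies come from the earlier slide; the TEACH dependencies are an assumption about what Figure 14.13 shows.

```python
from itertools import combinations

def closure(attrs, fds):
    """Compute X+ for attrs under fds (list of (lhs, rhs) sets)."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result

def candidate_keys(schema, fds):
    """All minimal attribute sets whose closure is the whole schema."""
    keys = []
    for size in range(1, len(schema) + 1):
        for combo in combinations(sorted(schema), size):
            candidate = set(combo)
            if any(k <= candidate for k in keys):
                continue  # contains a smaller key, so it is not minimal
            if closure(candidate, fds) >= schema:
                keys.append(candidate)
    return keys

def violations(schema, fds, form):
    """List the (lhs, attribute) pairs in fds that violate '3NF' or 'BCNF'."""
    prime = set().union(*candidate_keys(schema, fds))
    bad = []
    for lhs, rhs in fds:
        for a in sorted(rhs - lhs):  # ignore the trivial part of each FD
            if closure(lhs, fds) >= schema:
                continue  # lhs is a superkey
            if form == "3NF" and a in prime:
                continue  # allowed by 3NF only
            bad.append((lhs, a))
    return bad

EMP_DEPT = {"SSN", "ENAME", "BDATE", "ADDRESS", "DNUMBER", "DNAME", "DMGRSSN"}
F_EMP_DEPT = [
    ({"SSN"}, {"ENAME", "BDATE", "ADDRESS", "DNUMBER"}),
    ({"DNUMBER"}, {"DNAME", "DMGRSSN"}),
]
# Assumed FDs for TEACH (not stated in the transcript).
TEACH = {"STUDENT", "COURSE", "INSTRUCTOR"}
F_TEACH = [({"STUDENT", "COURSE"}, {"INSTRUCTOR"}), ({"INSTRUCTOR"}, {"COURSE"})]

print(violations(EMP_DEPT, F_EMP_DEPT, "3NF"))
print(violations(TEACH, F_TEACH, "3NF"))
print(violations(TEACH, F_TEACH, "BCNF"))
```

Under these assumptions EMP_DEPT fails 3NF because of the dependencies on DNUMBER, while TEACH passes 3NF (every attribute is prime) but fails BCNF because INSTRUCTOR is not a superkey.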

Figure 14.11 (continued) (c) Decomposing LOTS1 into the 3NF relations LOTS1A and LOTS1B. (d) Summary of normalization of lots.

Figure 14.12 Boyce-Codd normal form. (a) BCNF normalization with the dependency of FD2 being “lost” in the decomposition. (b) A relation R in 3NF but not in BCNF.

Figure 14.13 A relation TEACH that is in 3NF but not in BCNF.

Normalization
Database design revisited.
- Top-down approach: conceptual design.
- A more purist way: decomposition.
Normalization: a process in which unsatisfactory relation schemas are decomposed into smaller relation schemas that possess desirable properties.
- Start with a single universal relation schema R = {A1, A2, …, An} that includes all the attributes of the database.
- Decompose R into a set of relation schemas D = {R1, R2, …, Rm} using the FDs specified by the database designers. D is called a decomposition of R.
Guidelines for normalization: normal forms, attribute preservation, dependency preservation, lossless join.

Normalization (Cont.)
Attribute preservation: no attributes are lost.
    R1 ∪ R2 ∪ … ∪ Rm = R
Dependency preservation:
    (F(R1) ∪ F(R2) ∪ … ∪ F(Rm))+ = F+
where F(Ri) is the set of FDs X → Y in F+ such that X ∪ Y ⊆ Ri.
Lossless join: a decomposition D = {R1, R2, …, Rm} of R has the lossless join property with respect to the set of dependencies F on R if, for every relation state r of R that satisfies F,
    *(π<R1>(r), …, π<Rm>(r)) = r
where π<Ri>(r) is the projection of r on the attributes in Ri and * is the NATURAL JOIN of all the projections.
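For a decomposition into exactly two schemas there is a well-known shortcut that the slides do not state: {R1, R2} is lossless with respect to F if the common attributes R1 ∩ R2 functionally determine R1 - R2 or R2 - R1. The sketch below applies that test; the function name `lossless_binary` is an assumption, and the schemas and FDs are the EMP_DEPT and EMP_PROJ examples used earlier in the deck (ED1 and ED2 are hypothetical names for the two EMP_DEPT parts).

```python
def closure(attrs, fds):
    """Compute X+ for attrs under fds (list of (lhs, rhs) sets)."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result

def lossless_binary(r1, r2, fds):
    """Binary lossless-join test: (R1 n R2)+ must contain R1 - R2 or R2 - R1."""
    common_closure = closure(r1 & r2, fds)
    return (r1 - r2) <= common_closure or (r2 - r1) <= common_closure

# EMP_DEPT split on DNUMBER.
F_EMP_DEPT = [
    ({"SSN"}, {"ENAME", "BDATE", "ADDRESS", "DNUMBER"}),
    ({"DNUMBER"}, {"DNAME", "DMGRSSN"}),
]
ED1 = {"SSN", "ENAME", "BDATE", "ADDRESS", "DNUMBER"}
ED2 = {"DNUMBER", "DNAME", "DMGRSSN"}

# EMP_PROJ split into EMP_LOCS and EMP_PROJ1 (the "bad" decomposition).
F_EMP_PROJ = [
    ({"SSN"}, {"ENAME"}),
    ({"PNUMBER"}, {"PNAME", "PLOCATION"}),
    ({"SSN", "PNUMBER"}, {"HOURS"}),
]
EMP_LOCS = {"ENAME", "PLOCATION"}
EMP_PROJ1 = {"SSN", "PNUMBER", "HOURS", "PNAME", "PLOCATION"}

print(lossless_binary(ED1, ED2, F_EMP_DEPT))             # True
print(lossless_binary(EMP_LOCS, EMP_PROJ1, F_EMP_PROJ))  # False
```

The second result agrees with the spurious tuples reported for EMP_LOCS and EMP_PROJ1: PLOCATION determines nothing beyond itself, so the join cannot reconstruct the original relation.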

Normalization (Cont.)
Decomposition into 3NF relation schemas.
Algorithm 15.1 Dependency-preserving and lossless decomposition into 3NF relation schemas.
1. Find a minimal cover G for F (Algorithm 14.2);
2. For each left-hand side X of an FD in G, create a relation schema {X ∪ {A1} ∪ {A2} ∪ … ∪ {Ak}} in D, where X → A1, X → A2, …, X → Ak are the only dependencies in G with X as left-hand side;
3. Place any remaining (unplaced) attributes in a single relation schema;
4. If none of the relation schemas contains a key of R, create one more relation schema that contains attributes that form a key for R.
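Steps 2 to 4 of the synthesis algorithm can be sketched as follows, assuming the caller already supplies a minimal cover G as pairs of Python sets (step 1 is not repeated here). The function names are hypothetical, and the example is the EMP_PROJ dependency set from earlier in the deck, which is already minimal once PNUMBER → {PNAME, PLOCATION} is split.

```python
def closure(attrs, fds):
    """Compute X+ for attrs under fds (list of (lhs, rhs) sets)."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result

def find_key(schema, fds):
    """Shrink the full schema to a key by dropping redundant attributes."""
    key = set(schema)
    for a in sorted(schema):
        if a in closure(key - {a}, fds):
            key.discard(a)
    return key

def synthesize_3nf(schema, g):
    """Steps 2-4 of the 3NF synthesis; g is assumed to be a minimal cover."""
    # Step 2: one relation schema per distinct left-hand side.
    groups = {}
    for lhs, rhs in g:
        groups.setdefault(frozenset(lhs), set(lhs)).update(rhs)
    d = list(groups.values())
    # Step 3: attributes not placed anywhere go into one extra schema.
    leftover = set(schema) - set().union(*d)
    if leftover:
        d.append(leftover)
    # Step 4: add a key schema if no schema in D contains a key of R.
    if not any(closure(ri, g) >= set(schema) for ri in d):
        d.append(find_key(schema, g))
    return d

EMP_PROJ = {"SSN", "PNUMBER", "HOURS", "ENAME", "PNAME", "PLOCATION"}
G = [
    ({"SSN"}, {"ENAME"}),
    ({"PNUMBER"}, {"PNAME"}),
    ({"PNUMBER"}, {"PLOCATION"}),
    ({"SSN", "PNUMBER"}, {"HOURS"}),
]
for ri in synthesize_3nf(EMP_PROJ, G):
    print(sorted(ri))
```

Here the schema built from {SSN, PNUMBER} → HOURS already contains a key of EMP_PROJ, so step 4 adds nothing and the result has three relation schemas.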

Normalization (Cont.)
Determine a key.
Algorithm 15.4a Find a key K for R.
1. K := R;
2. For each attribute A in K
       if (K - {A})+ with respect to F contains A then remove A from K;
Example. (Fig. 14.11)
It is not always possible to find a decomposition that both preserves dependencies and is in BCNF. (Fig. 14.12)
The lossless join decomposition is based on the assumption that no null values are allowed for the join attributes.
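Algorithm 15.4a is short enough to sketch directly. FDs are again assumed to be pairs of Python sets, and the function name `find_key` is a choice made here. The algorithm returns a single key; when a relation has several candidate keys, which one is found depends on the order in which attributes are visited (sorted order in this sketch).

```python
def closure(attrs, fds):
    """Compute X+ for attrs under fds (list of (lhs, rhs) sets)."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result

def find_key(schema, fds):
    """Start with K := R and drop every attribute that the rest still determines."""
    key = set(schema)
    for a in sorted(schema):
        if a in closure(key - {a}, fds):
            key.discard(a)
    return key

# EMP_PROJ and its dependencies from the earlier examples.
EMP_PROJ = {"SSN", "PNUMBER", "HOURS", "ENAME", "PNAME", "PLOCATION"}
F = [
    ({"SSN"}, {"ENAME"}),
    ({"PNUMBER"}, {"PNAME", "PLOCATION"}),
    ({"SSN", "PNUMBER"}, {"HOURS"}),
]

print(sorted(find_key(EMP_PROJ, F)))  # ['PNUMBER', 'SSN']
```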