Download presentation
Presentation is loading. Please wait.
Published byOliver Turner Modified over 8 years ago
1
CSC 411/511: DBMS Design Dr. Nan Wang 1 Schema Refinement and Normal Forms Chapter 19
2
Dr. Nan Wang The Evils of Redundancy Redundancy causes several problems associated with relational schemas: – Redundant storage – Insert anomalies – Delete anomalies – Update anomalies – What are the problems in the following schema? Hourly_Emps (ssn, name, lot, rating, hourly_wages, hours_worked)
3
Dr. Nan Wang Example Hourly_Emps (ssn, name, lot, rating, hourly_wages, hours_worked) Abbreviate an attribute name to a single letter Refer to a relation schema by a string of letters, one per attribute. In the example, hourly_wages is determined by rating. This IC is an example of a functional dependency (R W). What does it lead to?
4
Dr. Nan Wang Example Problems: – Redundant storage – Update anomaly: Can we change W in just the 1st tuple of SNLRWH? – Solutions to the anomalies?
5
Dr. Nan Wang Example Problems: – Insertion anomaly: What if we want to insert an employee and don’t know the hourly wage for his rating? – Solution? Null values What if the s is unknow?
6
Dr. Nan Wang Example Problems: – Deletion anomaly: If we delete all employees with rating 5, we lose the information about the wage for rating 5! – Solutions?
7
Dr. Nan Wang Fighting the Evils of Redundancy Redundancy is at the root of evil - true or not? Integrity constraints, in particular functional dependencies, can be used to identify schemas with such problems and to suggest refinements. R W Decomposition(replace SNLRWH with SNLRH and RW) Hourly_Emps2 Wages
8
Dr. Nan Wang Fighting the Evils of Redundancy R W Decomposition(replace SNLRWH with SNLRH and RW) Do we solve all this problems? – Redundant storage – Insert anomalies – Delete anomalies – Update anomalies Hourly_Emps2 Wages
9
Dr. Nan Wang Fighting the Evils of Redundancy Main refinement technique: decomposition (replacing ABCD with, say, AB and BCD, or ACD and ABD). Problems related to decomposition: – Is there reason to decompose a relation? Normal forms help us to decide whether or not to decompose a relation in further. How? – What problems (if any) does the decomposition cause? Lossless join decomposition Dependency-preservation decomposition
10
Dr. Nan Wang Properties of decomposition Lossless join decomposition – Enable us to recover any instance of the decomposed relation from corresponding instances of the smaller relations. Dependency-preservation decomposition – Enable us to enforce any constraints on the original relation by simple enforcing some constraints on each of smaller relations. – Key constrains Hourly_Emps2 Wages
11
Dr. Nan Wang Good? 11 – List N, W, H – How?
12
Dr. Nan Wang Good DB designer Normal forms and what problems they do or do not alleviate? Decomposition and what problems with decompositions. Questions: –Is a relation in a given normal form? –Is a decomposition dependency preserving? 12
13
Dr. Nan Wang Functional Dependencies (FDs) A functional dependency X Y holds over relation R if, for every allowable instance r of R: – t1 r, t2 r, t1.X =t2.X implies t1.Y =t2.Y – i.e., given two tuples in r, if the X values agree, then the Y values must also agree. (X and Y are sets of attributes.)
14
Dr. Nan Wang Functional Dependencies (FDs) An FD is a statement about all allowable relations. – Must be identified based on semantics of application. – Given some allowable instance r1 of R, we can check if it violates some FD f, but we cannot tell if f holds over R! K is a candidate key for R means that K R – However, K R does not require K to be minimal!
15
Dr. Nan Wang FDs- example 15 FD: AB C Question: Do all tuples in the relation satisfy the Functional Dependents AB C Can we add a tuple to the relation? Why?
16
Dr. Nan Wang exercise List all the functional dependencies that this relation instance satisfies. Z → Y, X → Y, and XZ → Y 16
17
Dr. Nan Wang Example: Constraints on Entity Set Consider relation obtained from Hourly_Emps: – Hourly_Emps (ssn, name, lot, rating, hrly_wages, hrs_worked) Notation: We will denote this relation schema by listing the attributes: SNLRWH – This is really the set of attributes {S,N,L,R,W,H}. – Sometimes, we will refer to all attributes of a relation by using the relation name. (e.g., Hourly_Emps for SNLRWH) Some FDs on Hourly_Emps: – ssn is the key: S SNLRWH – rating determines hrly_wages: R W
18
Dr. Nan Wang Example (Contd.) Problems due to R W : – Update anomaly: Can we change W in just the 1st tuple of SNLRWH? – Insertion anomaly: What if we want to insert an employee and don’t know the hourly wage for his rating? – Deletion anomaly: If we delete all employees with rating 5, we lose the information about the wage for rating 5! Hourly_Emps2 Wages Will 2 smaller tables be better?
19
Dr. Nan Wang Reasoning About FDs Given some FDs, we can usually infer additional FDs: –For example: –Workers(ssn,name,lot,did,since) – Ssn did holds, – did lot holds, implies ssn lot An FD f is implied by a set of FDs F if f holds whenever all FDs in F hold. –F + = closure of F is the set of all FDs that are implied by a given set F of FDs.
20
Dr. Nan Wang Closure of a set of FDs Infer or compute the closure of a given set of F of FDs –Following three rules (Armstrong’s Axioms) Armstrong’s Axioms (X, Y, Z are sets of attributes): – Reflexivity: If X Y, then Y X – Augmentation: If X Y, then XZ YZ for any Z – Transitivity: If X Y and Y Z, then X Z These are sound and complete inference rules for FDs!
21
Dr. Nan Wang Soundness and Completeness Completeness: given F, the rules allows us to determine all dependencies in Soundness: we cannot generate FDs not in Addition rules: –Union: if X Y, X Z then X YZ –Decomposition: if X YZ, then X Y and X Z
22
Dr. Nan Wang Soundness and Completeness R=(A,B,C) and F={A B, B C} –Reflexivity: if Y is subset of X, then X Y –Transitivity : if X Y and Y Z then X Z A C –Augmentation : if X Y then XZ YZ for any Z AC BC AB BC AB AC –A BC F + ={A B, B C, A C, AC BC,AB AC. …}
23
Dr. Nan Wang Reasoning About FDs FD is a statement about the world as we understand it – Ssn did, did lot implies ssn lot Is it correct to state that “Since X functionally determines Y, if we know X, we know Y.” Or “if X Y, then X identifies Y”
24
Dr. Nan Wang Reasoning About FDs (Contd.) Couple of additional rules (that follow from AA): – Union: If X Y and X Z, then X YZ – Decomposition: If X YZ, then X Y and X Z Example: Contracts(cid,sid,jid,did,pid,qty,value), and: – C is the key: C CSJDPQV – Project purchases each part using single contract: JP C – Dept purchases at most one part from a supplier: SD P
25
Dr. Nan Wang Reasoning About FDs (Contd.) C is the key: C CSJDPQV Project purchases each part using single contract: JP C Dept purchases at most one part from a supplier: SD P JP C, C CSJDPQV imply JP CSJDPQV SD P implies SDJ JP SDJ JP, JP CSJDPQV imply SDJ CSJDPQV
26
Dr. Nan Wang Exercise Consider a relation R with five attributes ABCDE. You are given the following dependencies: A→B, BC→E, and ED→A. List all keys for R. CDE, ACD, BCD 26
27
Dr. Nan Wang Normal Forms Returning to the issue of schema refinement, the first question to ask is whether any refinement is needed! If a relation is in a certain normal form (BCNF, 3NF etc.), it is known that certain kinds of problems are avoided/minimized. This can be used to help us decide whether decomposing the relation will help.
28
Dr. Nan Wang Normalization Often performed as a series of tests on a relation to determine whether it satisfies or violates the requirements of a given normal form. Four most commonly used normal forms are first (1NF), second (2NF), third (3NF) and Boyce- Codd (BCNF) normal forms. Based on functional dependencies among the attributes of a relation. A relation can be normalized to a specific form to prevent the possible occurrence of update anomalies 28
29
Dr. Nan Wang First Norm Forms(1NF) There are no repeating or duplication fields. Each cell contains only a single value. Each record is unique. –Identified by primary key 29
30
Dr. Nan Wang 30
31
Dr. Nan Wang Second Normal Forms (2NF) All non-key fields depend on all components of the primary key. –Guaranteed when primary key is a single field. 31
32
Dr. Nan Wang 32
33
Dr. Nan Wang Third Normal Forms (3NF) No non-key field depends on another. –All non-key fields depend only on primary key. 33
34
Dr. Nan Wang 34
35
Dr. Nan Wang 35
36
Dr. Nan Wang 36
37
Dr. Nan Wang R(ABCD) 37
Similar presentations
© 2025 SlidePlayer.com Inc.
All rights reserved.