
Presentation transcript:

Normalization
Decomposition techniques for ensuring:
  Lossless joins
  Dependency preservation
  Redundancy avoidance
We will look at some normal forms:
  Boyce-Codd Normal Form (BCNF)
  3rd Normal Form (3NF)

Boyce-Codd Normal Form (BCNF)
What is a normal form? A characterization of a schema decomposition in terms of the properties it satisfies.
BCNF: guarantees no redundancy that can be detected from functional dependencies.
Defined: a relation schema R with FD set F is in BCNF if, for every nontrivial X → Y in F+:
  X → R (i.e. X is a superkey)

BCNF Example
R = (A, B, C), F = {A → B, B → C}
Is R in BCNF?
Ans: Consider the nontrivial dependencies in F+:
  A → B: A → R (A is a key)
  A → C: same reasoning
  B → C: B -/-> R (B is not a superkey)
Therefore R is not in BCNF.
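The superkey checks in this example can be reproduced mechanically with an attribute closure. The following is a minimal Python sketch; the helper names `fd`, `closure` and `is_superkey` are illustrative assumptions and do not come from the slides.

```python
def fd(lhs, rhs):
    """Build an FD from space-separated attribute names (hypothetical helper)."""
    return (frozenset(lhs.split()), frozenset(rhs.split()))

def closure(attrs, fds):
    """Return the closure of attrs: every attribute determined by attrs under fds."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            # Fire an FD when its left side is already in the closure.
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return frozenset(result)

def is_superkey(attrs, schema, fds):
    """X is a superkey of R exactly when the closure of X contains all of R."""
    return closure(attrs, fds) >= frozenset(schema)

R = {"A", "B", "C"}
F = [fd("A", "B"), fd("B", "C")]
print(is_superkey({"A"}, R, F))  # True: A -> R
print(is_superkey({"B"}, R, F))  # False: B -> C violates BCNF
```

Working with closures avoids materializing F+, which can be exponentially large.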

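BCNF Example
R = R1 U R2, R1 = (A, B), R2 = (B, C), F = {A → B, B → C}
Are R1, R2 in BCNF?
Ans: Yes. In each of R1 and R2, the left side of the only nontrivial FD is a key (A → B in R1, B → C in R2).
Is the decomposition lossless? Dependency preserving (DP)?
Ans: Lossless: yes (R1 ∩ R2 = B, and B → R2). DP: yes (each FD in F is covered by R1 or R2).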
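For a decomposition into two relations, the standard lossless-join test is that the common attributes determine all of R1 or all of R2. A short sketch of that test, under the assumption that FDs are given as pairs of attribute sets (the names `fd`, `closure` and `is_lossless_binary` are made up for illustration):

```python
def fd(lhs, rhs):
    """Build an FD from space-separated attribute names (hypothetical helper)."""
    return (frozenset(lhs.split()), frozenset(rhs.split()))

def closure(attrs, fds):
    """Attribute closure of attrs under fds."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return frozenset(result)

def is_lossless_binary(r1, r2, fds):
    """Lossless iff (R1 n R2) -> R1 or (R1 n R2) -> R2 holds under fds."""
    r1, r2 = frozenset(r1), frozenset(r2)
    common_closure = closure(r1 & r2, fds)
    return common_closure >= r1 or common_closure >= r2

F = [fd("A", "B"), fd("B", "C")]
print(is_lossless_binary({"A", "B"}, {"B", "C"}, F))  # True: B -> R2
print(is_lossless_binary({"A", "C"}, {"B", "C"}, F))  # False: C determines nothing
```

This test applies only to two-way splits. Since every step of the BCNF algorithm splits one relation in two, it can be applied at each step.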

BCNF Decomposition Algorithm
Algorithm BCNF(R: relation, F: FD set)
Begin
  1. Compute F+
  2. Result ← {R}
  3. While some Ri in Result is not in BCNF Do
     a. Choose a nontrivial (X → Y) in F+ s.t. (X → Y) is covered by Ri and X -/-> Ri (X is not a superkey for Ri)
     b. Decompose Ri on (X → Y):
        Ri1 ← X U Y
        Ri2 ← Ri - Y
     c. Result ← (Result - {Ri}) U {Ri1, Ri2}
  4. return Result
End
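The algorithm above can be sketched in Python as follows. This is one possible reading, not the slides' own code: instead of computing F+, it looks for a violating X by trying every subset of Ri and taking Y = (X+ ∩ Ri) - X, so it always splits on the largest possible right side. The names `fd`, `closure`, `find_violation` and `bcnf_decompose` are assumptions.

```python
from itertools import combinations

def fd(lhs, rhs):
    """Build an FD from space-separated attribute names (hypothetical helper)."""
    return (frozenset(lhs.split()), frozenset(rhs.split()))

def closure(attrs, fds):
    """Attribute closure of attrs under fds."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return frozenset(result)

def find_violation(ri, fds):
    """Return (X, Y) with X -> Y nontrivial, covered by ri, X not a superkey of ri."""
    for size in range(1, len(ri)):
        for xs in combinations(sorted(ri), size):
            x = frozenset(xs)
            x_plus = closure(x, fds)
            y = (x_plus & ri) - x
            if y and not x_plus >= ri:
                return x, y
    return None

def bcnf_decompose(schema, fds):
    """Split relations on violating FDs until every relation is in BCNF."""
    pending = [frozenset(schema)]
    result = set()
    while pending:
        ri = pending.pop()
        violation = find_violation(ri, fds)
        if violation is None:
            result.add(ri)
        else:
            x, y = violation
            pending.append(x | y)   # Ri1 = X U Y
            pending.append(ri - y)  # Ri2 = Ri - Y
    return result

F = [fd("A", "B"), fd("A B", "D"), fd("B", "C")]
parts = bcnf_decompose({"A", "B", "C", "D"}, F)
print(sorted("".join(sorted(r)) for r in parts))  # ['ABD', 'BC']
```

The subset search is exponential in the number of attributes, which is acceptable for classroom-sized schemas. A different choice of violating FD can produce a different, equally valid BCNF decomposition.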

BCNF Decomposition Example
R = (A, B, C, D), F = {A → B, AB → D, B → C}
Decompose R into BCNF.
Ans: Fc = {A → BD, B → C}
R = (A, B, C, D): B → C is covered by R and B is not a superkey
  R1 = (B, C). In BCNF: B → C and B is a key
  R2 = (A, B, D). In BCNF: A → B, A → D, A → BD and A is a key

BCNF Decomposition Example
R = (bname, bcity, assets, cname, lno, amt)
F = {bname → bcity assets, lno → amt bname}
Decompose R into BCNF.
Ans: Fc = F
R: bname → bcity is covered by R, and bname is not a key
  R1 = (bname, bcity). In BCNF
  R2 = (bname, assets, cname, lno, amt): lno → amt bname is covered by R2 and lno is not a key
    R3 = (lno, amt, bname). In BCNF
    R4 = (assets, cname, lno): lno → assets is in F+ (via lno → bname and bname → assets), is covered by R4, and lno is not a key
      R5 = (lno, assets)
      R6 = (lno, cname)
(key = superkey here)
Result: R1, R3, R5, R6.
Not DP! bname → assets is not covered by any relation AND cannot be implied by the covered FDs.
Covered FDs: G = {bname → bcity, lno → amt bname, lno → assets}

BCNF Decomposition
Can there be more than one BCNF decomposition?
Ans: Yes. The last example was not DP. But...
Given Fc = {bname → bcity assets, lno → amt bname}
R = (bname, bcity, assets, cname, lno, amt): bname → bcity assets and bname -/-> R
  R1 = (bname, bcity, assets). BCNF: bname → R1
  R2 = (bname, cname, lno, amt): lno → amt bname and lno -/-> R2
    R3 = (lno, amt, bname). BCNF: lno → R3
    R4 = (lno, cname)
Is R = R1 U R3 U R4 DP? Yes!!
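Whether a decomposition is dependency preserving can be checked without computing F+. The sketch below uses the usual iterative test: for each X → Y in F, grow X by repeatedly taking closures restricted to each Ri, then check that Y was reached. It is applied to the two bank decompositions discussed above. The names `fd`, `rel`, `closure` and `preserves` are assumptions for illustration.

```python
def fd(lhs, rhs):
    """Build an FD from space-separated attribute names (hypothetical helper)."""
    return (frozenset(lhs.split()), frozenset(rhs.split()))

def rel(attrs):
    """Build a relation schema from space-separated attribute names."""
    return frozenset(attrs.split())

def closure(attrs, fds):
    """Attribute closure of attrs under fds."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return frozenset(result)

def preserves(fds, decomposition):
    """True if every FD in fds is implied by the FDs covered by the relations."""
    for lhs, rhs in fds:
        reached = set(lhs)
        changed = True
        while changed:
            changed = False
            for ri in decomposition:
                # Attributes derivable using only FDs that live inside ri.
                gained = closure(reached & ri, fds) & ri
                if not gained <= reached:
                    reached |= gained
                    changed = True
        if not rhs <= reached:
            return False
    return True

F = [fd("bname", "bcity assets"), fd("lno", "amt bname")]
first = [rel("bname bcity"), rel("lno amt bname"), rel("lno assets"), rel("lno cname")]
second = [rel("bname bcity assets"), rel("lno amt bname"), rel("lno cname")]
print(preserves(F, first))   # False: bname -> assets is lost
print(preserves(F, second))  # True
```

The first call fails on bname → bcity assets, because no relation in that decomposition holds both bname and assets.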

BCNF Decomposition
Can we decompose on FDs in Fc to get a DP, BCNF decomposition? Usually, yes, but...
Consider: R = (J, K, L), F = {JK → L, L → K} (Fc = F)
We can try decomposing on either FD: L → K or JK → L.
Dec. #1, using L → K:
  R1 = (L, K)
  R2 = (J, L)
  Not DP: JK → L is not covered by R1 or R2 and is not implied by the covered FDs.
Dec. #2, using JK → L:
  R1 = (J, K, L), which is R itself and still not in BCNF (L → K holds and L is not a superkey)
  R2 = (J, K)
So a decomposition that is both BCNF and DP may not be possible.

Aside
Is the example realistic? Consider (with J = CustomerName, K = BranchName, L = BankerName):
  BankerName → BranchName
  BranchName CustomerName → BankerName

3NF: An alternative to BCNF
Motivation: sometimes, BCNF is not what you want.
E.g.: street city → zip and zip → city
BCNF: R1 = {zip, city}, R2 = {zip, street}
No redundancy, but preserving the first FD (street city → zip) requires an assertion over the join of R1 and R2.
Alternative: 3rd Normal Form, designed to say that decomposition can stop at {street, city, zip}.

3NF: An alternative to BCNF
BCNF test: given R with FD set F, for every nontrivial FD X → Y in F+ that is covered by R:
  X → R
3NF test: given R with FD set F, for every nontrivial FD X → Y in F+ that is covered by R:
  X → R, or every attribute of Y - X is part of some candidate key of R
Thus 3NF is a weaker normal form than BCNF:
  R in BCNF => R in 3NF
  R in 3NF =/=> R in BCNF (a relation in 3NF is not guaranteed to be in BCNF)
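The two tests can be compared side by side in code. The sketch below is an illustrative brute-force check, not an efficient one: it enumerates candidate keys, then examines every subset X of the schema. The function names `candidate_keys` and `normal_form` are assumptions. It is applied to the street/city/zip relation and to the loan relations used as examples in this deck.

```python
from itertools import combinations

def fd(lhs, rhs):
    """Build an FD from space-separated attribute names (hypothetical helper)."""
    return (frozenset(lhs.split()), frozenset(rhs.split()))

def closure(attrs, fds):
    """Attribute closure of attrs under fds."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return frozenset(result)

def candidate_keys(schema, fds):
    """All minimal superkeys of schema, found smallest first."""
    schema = frozenset(schema)
    keys = []
    for size in range(1, len(schema) + 1):
        for xs in combinations(sorted(schema), size):
            x = frozenset(xs)
            if any(key <= x for key in keys):
                continue  # contains a smaller key, so it is not minimal
            if closure(x, fds) >= schema:
                keys.append(x)
    return keys

def normal_form(schema, fds):
    """Classify schema as 'BCNF', '3NF' or 'neither' under fds."""
    schema = frozenset(schema)
    prime = frozenset().union(*candidate_keys(schema, fds))
    in_bcnf, in_3nf = True, True
    for size in range(1, len(schema)):
        for xs in combinations(sorted(schema), size):
            x = frozenset(xs)
            x_plus = closure(x, fds)
            y = (x_plus & schema) - x
            if y and not x_plus >= schema:  # nontrivial FD, X not a superkey
                in_bcnf = False
                if not y <= prime:          # some attribute of Y - X is in no key
                    in_3nf = False
    return "BCNF" if in_bcnf else "3NF" if in_3nf else "neither"

address = [fd("street city", "zip"), fd("zip", "city")]
loans = [fd("lno", "amt bname"), fd("cname bname", "lno")]
print(normal_form({"street", "city", "zip"}, address))        # 3NF
print(normal_form({"bname", "cname", "lno", "amt"}, loans))   # neither
print(normal_form({"lno", "amt", "bname"}, loans))            # BCNF
```

Intersecting each closure with the schema makes the same function usable on a sub-relation of a decomposition, since it then considers only the FDs covered by that relation.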

3NF: An alternative to BCNF
Example: R = (J, K, L), F = {JK → L, L → K}. Then R is in 3NF!
Candidate keys for R: JK and JL
  JK → L is covered by R, and JK → R
  L → K: K is part of the candidate key JK

3NF Example
R = (bname, cname, lno, amt)
F = Fc = {lno → amt bname, cname bname → lno}
Q: Is R in BCNF, 3NF or neither?
Ans:
R is not in BCNF: lno → amt is covered by R and lno -/-> R
R is not in 3NF:
  Candidate keys of R: lno cname or cname bname
  lno → amt bname is covered by R, and {amt, bname} is not a subset of a candidate key (amt belongs to no candidate key)

3NF Example
R = R1 U R2, R1 = (lno, amt, bname), R2 = (lno, cname, bname)
F = Fc = {lno → amt bname, cname bname → lno}
Q: Are R1, R2 in BCNF, 3NF or neither?
Ans:
R1 is in BCNF: lno → amt bname is covered by R1 and lno → R1
R2 is not in BCNF: lno → bname and lno -/-> R2
R1 is in 3NF (since it is in BCNF)
R2 is in 3NF. R2's candidate keys: cname bname and lno cname
  lno → bname: bname is a subset of a candidate key
  cname bname → lno: lno is a subset of a candidate key

3NF Decomposition Algorithm
Algorithm 3NF(R: relation, F: FD set)
  1. Compute Fc
  2. i ← 0
  3. For each X → Y in Fc do
       if no Rj (1 <= j <= i) contains X U Y
         i ← i + 1
         Ri ← X U Y
  4. If no Rj (1 <= j <= i) contains a candidate key for R
       i ← i + 1
       Ri ← any candidate key for R
  5. return (R1, R2, ..., Ri)
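Steps 2 to 5 of the algorithm can be sketched in Python as shown here. The sketch assumes the canonical cover Fc has already been computed and is passed in directly, so step 1 is not implemented. The names `fd`, `closure`, `minimal_key` and `synthesize_3nf` are hypothetical. Step 4 is tested through closures: a relation contains a candidate key of R exactly when it is a superkey of R.

```python
def fd(lhs, rhs):
    """Build an FD from space-separated attribute names (hypothetical helper)."""
    return (frozenset(lhs.split()), frozenset(rhs.split()))

def closure(attrs, fds):
    """Attribute closure of attrs under fds."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return frozenset(result)

def minimal_key(schema, fds):
    """Shrink the full schema to one candidate key by dropping redundant attributes."""
    key = set(schema)
    for attr in sorted(schema):
        if closure(key - {attr}, fds) >= schema:
            key.discard(attr)
    return frozenset(key)

def synthesize_3nf(schema, fc):
    """3NF synthesis from a canonical cover fc (assumed precomputed)."""
    schema = frozenset(schema)
    relations = []
    # Step 3: one relation per FD, unless an existing relation already covers it.
    for lhs, rhs in fc:
        xy = lhs | rhs
        if not any(xy <= r for r in relations):
            relations.append(xy)
    # Step 4: add a candidate key if no relation is a superkey of the schema.
    if not any(closure(r, fc) >= schema for r in relations):
        relations.append(minimal_key(schema, fc))
    return relations

Fc = [fd("banker", "bname office"), fd("cname bname", "banker")]
for r in synthesize_3nf({"bname", "cname", "banker", "office"}, Fc):
    print(sorted(r))
# ['banker', 'bname', 'office']
# ['banker', 'bname', 'cname']
```

On the banker relation no third relation is added, because (cname, bname, banker) already contains the candidate key cname bname.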

3NF Decomposition Example
R = (bname, cname, banker, office)
Fc = {banker → bname office, cname bname → banker}
Q1: Candidate keys of R: cname bname or cname banker
Q2: Decompose R into 3NF.
Ans: R is not in 3NF: banker → bname office, and {bname, office} is not a subset of a candidate key
3NF decomposition:
  R1 = (banker, bname, office)
  R2 = (cname, bname, banker)
  R3 = ? Empty (done): R2 already contains the candidate key cname bname

Theory and practice
Performance tuning: redundancy is not the sole guide to decomposition. Workload matters too!!
  nature of queries run
  mix of updates, queries
  .....
Workload can influence:
  BCNF vs 3NF
  may further decompose a BCNF schema (into 4NF)
  may denormalize (i.e., undo a decomposition or add new columns)