Some slides are from Dr. Sara Cohen


Design Theory

Overview
Starting point: a set of functional dependencies that describe real-world constraints.
Goal: create tables that do not contain redundancies, so that less space is wasted and there is less chance of introducing errors into the database.

Design Theory
Armstrong's axioms were defined so that we can derive functional dependencies.
We need to identify a key: either find a single key or find all keys.
Both algorithms use, as a subroutine, an algorithm that computes the closure. In class a polynomial algorithm was given. A linear algorithm will be shown.

Compute Closure in Linear Time

Closure of a Set of Attributes
Let U be a set of attributes and F be a set of functional dependencies on U. Suppose that X ⊆ U is a set of attributes.
Definition: X+ = { A | F ⊨ X → A }
We would like to compute X+.

Algorithm From Class
Compute Closure(X, F)
    1. C := X
    2. While there is a V → W in F such that (V ⊆ C) and (W ⊄ C) do
           C := C ∪ W
    3. Return C
Complexity: O(|U||F|)
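The algorithm from class can be sketched in Python as follows. This is a minimal illustration, not the course's reference code: the function name `closure`, the encoding of each FD as a pair of strings, and the sample dependencies are all assumptions made for the example.

```python
def closure(attrs, fds):
    """Naive closure: rescan the FDs until no dependency adds anything new."""
    closed = set(attrs)                      # C := X
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            # V -> W applies when V is contained in C and W is not yet in C.
            if set(lhs) <= closed and not set(rhs) <= closed:
                closed |= set(rhs)           # C := C ∪ W
                changed = True
    return closed


# Illustrative FDs (an assumption for this sketch): A -> C, B -> D, AD -> E.
sample_fds = [("A", "C"), ("B", "D"), ("AD", "E")]
print(sorted(closure("AB", sample_fds)))
```

Each pass over F either adds at least one attribute or stops the loop, which is where the polynomial bound comes from.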

Example R=ABCDE F={ABC, CEB, DA, BCE} {A}+ = {A,B}+ = {B,D}+=

A More Efficient Algorithm
We start by creating a table, with a row for each FD and a column for each attribute. The table will have 2 additional columns called size and tail.
In the row for a dependency X → Y, there will be the value true in each column corresponding to an attribute in X. The size column will contain the size of the set X. The tail column will contain Y.

Example Table
F = {A → C, B → D, AD → E}

FD       A     B     C     D     E     Size  Tail
A → C    true                          1     C
B → D          true                    1     D
AD → E   true              true        2     E

Compute Closure(X, F, T)
    /* T is the table, n is the number of FDs in F */
    C := X
    Q := X
    While Q is not empty
        A := Q.dequeue()
        for i = 1..n
            if T[i, A] = true then
                T[i, size] := T[i, size] − 1
                if T[i, size] = 0 then
                    Q := Q ∪ (T[i, tail] \ C)
                    C := C ∪ T[i, tail]
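A possible Python rendering of the table-based algorithm is sketched below. It is an illustration under assumptions: the name `closure_linear` is made up, FDs are encoded as pairs of strings with non-empty left-hand sides, and the per-attribute index `uses` stands in for following the "true" entries of a column instead of scanning all n rows.

```python
from collections import deque


def closure_linear(x, fds):
    """Closure of x using one counter per FD (the 'size' column)."""
    size = [len(set(lhs)) for lhs, _ in fds]
    # uses[a] lists the rows whose left-hand side contains attribute a.
    uses = {}
    for i, (lhs, _) in enumerate(fds):
        for a in set(lhs):
            uses.setdefault(a, []).append(i)

    closed = set(x)            # C := X
    queue = deque(closed)      # Q := X
    while queue:
        a = queue.popleft()
        for i in uses.get(a, []):
            size[i] -= 1
            if size[i] == 0:   # the whole left-hand side is now in C
                for b in fds[i][1]:
                    if b not in closed:
                        closed.add(b)
                        queue.append(b)
    return closed


fds = [("A", "C"), ("B", "D"), ("AD", "E")]
print(sorted(closure_linear("AB", fds)))
```

Every attribute enters the queue at most once, and each counter is decremented at most once per attribute occurrence on a left-hand side, so the work is proportional to the total size of F.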

Computing AB+
Trace of the algorithm on X = AB with F = {A → C, B → D, AD → E}. The last three columns show the size entries of the rows A → C, B → D and AD → E after each step.

Step            X+            Q       A → C  B → D  AD → E
Start           {A,B}         {A,B}   1      1      2
Iteration of A  {A,B,C}       {B,C}   0      1      1
Iteration of B  {A,B,C,D}     {C,D}   0      0      1
Iteration of C  {A,B,C,D}     {D}     0      0      1
Iteration of D  {A,B,C,D,E}   {E}     0      0      0
Iteration of E  {A,B,C,D,E}   {}      0      0      0

Complexity?
To get an efficient algorithm, we assume that there are pointers from each “true” box in the table to the next “true” box in the same column.

Complexity
Complexity: O(|F|)
Each attribute of X+ is dequeued at most once; apart from X itself, these are attributes appearing on the right-hand side of FDs in F.
The number of changes to the size column is at most the number of attribute occurrences on the left-hand side of FDs in F.

Decomposition Characteristics

Characteristics of a Decomposition
Two important characteristics of a decomposition:
Lossless join: necessary, since otherwise the original relation cannot be recreated, even if the tables are not modified.
Dependency preserving: allows us to check that inserts and updates are correct without joining the sub-relations.

Lossless Join

T      C    S
Smith  DB   Cohen
Jones  OS   Levy

C    S
DB   Cohen
OS   Levy

Checking
Check for a lossless join using the algorithm from class (with the a-s and b-s).
Check for dependency preservation using an algorithm shown today.

Dependency Preservation
R = ABC
Decomposition: {AB, AC}
Dependencies: {A → B, B → C}
Is it lossless? Does this decomposition preserve B → C?

Dependency Preservation (cont’d)
[Example tables over the attributes A, B and C; the cell layout is not recoverable from this transcript.]

Definitions
We define π_S(F) to be the set of dependencies X → Y in F+ such that X and Y are contained in S.
We say that a decomposition R1, ..., Rn of R is dependency preserving if (π_R1(F) ∪ ... ∪ π_Rn(F))+ = F+.
Note that one inclusion, (π_R1(F) ∪ ... ∪ π_Rn(F))+ ⊆ F+, always holds.
This definition implies an exponential algorithm for checking whether a decomposition is dependency preserving. We give a polynomial algorithm.

Algorithm
Let R be a relation, decomposed into R1, R2, ..., Rn, and let F be a set of functional dependencies.
To check whether R1, ..., Rn preserves all the functional dependencies in F, run the algorithm on the next slide for each X → Y in F.
If the answer is “Yes” for all FDs, then the decomposition preserves F.
If the answer is “No” for at least one FD, then the decomposition does not preserve F.

Testing Dependency Preservation
To check if the decomposition preserves X → Y:
    Z := X
    while changes to Z occur do
        for i = 1 to n do
            Z := Z ∪ ((Z ∩ Ri)+ ∩ Ri)
    if Y ⊆ Z return “yes” else return “no”

Example (1)
R = ABCD
F = {A → B, B → C, C → D, D → A}
R1 = AB, R2 = BC, R3 = CD
Is this decomposition dependency preserving?

Example (2)
R = ABCDE
F = {A → ABCDE, BC → A, DE → C}
Suppose we decompose R into ABDE and DEC. Is the decomposition dependency preserving?
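The dependency-preservation test can be tried out on Examples (1) and (2) with a short Python sketch. The function names and the encoding of FDs as pairs of strings are assumptions made for illustration; the closure helper is the simple polynomial version, with all closures taken with respect to F.

```python
def closure(attrs, fds):
    """Attribute closure with respect to fds (naive fixpoint)."""
    closed = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= closed and not set(rhs) <= closed:
                closed |= set(rhs)
                changed = True
    return closed


def preserves(x, y, schemas, fds):
    """Is X -> Y preserved by the decomposition given by schemas?"""
    z = set(x)
    changed = True
    while changed:                       # while changes to Z occur
        changed = False
        for ri in map(set, schemas):
            gained = closure(z & ri, fds) & ri
            if not gained <= z:
                z |= gained
                changed = True
    return set(y) <= z


def is_dependency_preserving(schemas, fds):
    return all(preserves(x, y, schemas, fds) for x, y in fds)


example1 = [("A", "B"), ("B", "C"), ("C", "D"), ("D", "A")]
example2 = [("A", "ABCDE"), ("BC", "A"), ("DE", "C")]
print(is_dependency_preserving(["AB", "BC", "CD"], example1))
print(is_dependency_preserving(["ABDE", "DEC"], example2))
```

In Example (1) the dependency D → A is recovered by passing through CD, BC and AB in turn. In Example (2) the set Z for BC → A never grows beyond BC, so that dependency is lost.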

Normal Forms

Non-Redundant Cover
The algorithm for decomposition into 3NF that has a lossless join and is dependency preserving uses a non-redundant cover.

Finding a Non-Redundant Cover
3 Steps:
    1. Define G as the result of putting F in standard form by decomposing each FD so that it has a single attribute on the right side.
    2. For each X → A in G and for each B in X, check whether G ⊨ (X − B) → A. If so, remove B from X.
    3. For each X → A in G, check whether G − {X → A} ⊨ X → A. If so, remove X → A.
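The three steps can be sketched in Python as below. This is an illustrative reading, not the course's code: the function names are made up, an FD is encoded as a pair (left side, right side), and the input set is this transcript's 3NF example read as A → B, ABCD → E, EF → GH, ACDF → EG, which is an assumption about where the arrows fall.

```python
def closure(attrs, fds):
    """Attribute closure with respect to fds (naive fixpoint)."""
    closed = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= closed and not set(rhs) <= closed:
                closed |= set(rhs)
                changed = True
    return closed


def non_redundant_cover(fds):
    # Step 1: standard form, one attribute on each right side.
    g = [(frozenset(lhs), a) for lhs, rhs in fds for a in rhs]
    # Step 2: drop B from X whenever G still implies (X - B) -> A.
    for i, (lhs, a) in enumerate(g):
        for b in sorted(lhs):
            reduced = g[i][0] - {b}
            if a in closure(reduced, g):
                g[i] = (reduced, a)
    g = list(dict.fromkeys(g))           # remove duplicates, keep order
    # Step 3: drop X -> A whenever the remaining FDs still imply it.
    for fd in list(g):
        rest = [f for f in g if f != fd]
        if fd[1] in closure(fd[0], rest):
            g = rest
    return g


f = [("A", "B"), ("ABCD", "E"), ("EF", "GH"), ("ACDF", "EG")]
for lhs, a in non_redundant_cover(f):
    print("".join(sorted(lhs)), "->", a)
```

The result can depend on the order in which attributes and dependencies are examined; a set of FDs may have more than one non-redundant cover.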

Normal Forms
The basic idea: if a relation is in one of these forms, then it avoids certain problems (e.g., redundancy).
BCNF: for every dependency X → A in F+, either (1) X → A is trivial, or (2) X is a super-key.
3NF: for every dependency X → A in F+, either (1) X → A is trivial, (2) X is a super-key, or (3) A is an attribute of a key.

Example
Reminder: F+ = {X → X+ | there exists Y → Z in F such that Y ⊆ X and Z ⊄ X}
Suppose that R = ABC. For each of the following values of F, decide whether R is in BCNF/3NF:
F = {}
F = {A → B}
F = {A → B, A → C}
F = {A → B, B → C}
F = {A → B, BC → A}
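For small schemas such as R = ABC, the two definitions can be checked by brute force. The sketch below enumerates every X ⊆ R and every non-trivial X → A with A in X+; it is exponential in |R| and meant only as a way to verify answers by machine. The function names and the FD encoding are assumptions made for the example.

```python
from itertools import combinations


def closure(attrs, fds):
    """Attribute closure with respect to fds (naive fixpoint)."""
    closed = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= closed and not set(rhs) <= closed:
                closed |= set(rhs)
                changed = True
    return closed


def subsets(attrs):
    """All subsets of attrs, smallest first."""
    items = sorted(attrs)
    for k in range(len(items) + 1):
        for combo in combinations(items, k):
            yield set(combo)


def candidate_keys(r, fds):
    keys = []
    for x in subsets(r):                 # smallest first, so keys are minimal
        if closure(x, fds) >= r and not any(k <= x for k in keys):
            keys.append(x)
    return keys


def normal_form(r, fds):
    r = set(r)
    prime = set().union(*candidate_keys(r, fds))   # attributes of some key
    bcnf = third = True
    for x in subsets(r):
        cx = closure(x, fds)
        if cx >= r:
            continue                     # X is a super-key
        for a in cx - x:                 # non-trivial X -> A, X not a super-key
            bcnf = False
            if a not in prime:
                third = False
    return "BCNF" if bcnf else "3NF" if third else "neither"


print(normal_form("ABC", [("A", "B"), ("BC", "A")]))
```

For F = {A → B, BC → A} the keys are AC and BC, so every attribute belongs to some key; A → B breaks the BCNF condition but not the 3NF condition.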

Decomposition into 3NF
Given a relation R with functional dependencies F:
Step 1: Find a non-redundant cover G of F.
Step 2: For each FD X → A in G, create a schema XA.
Step 3: If no schema created so far contains a key, add a key as a schema.
Step 4: Remove schemas that are contained in other schemas.
The result is a decomposition into 3NF that is dependency preserving and has a lossless join.

Example
Find a decomposition into 3NF for the relation R = ABCDEFGH, with the functional dependencies F = {A → B, ABCD → E, EF → GH, ACDF → EG}.

Example
Non-redundant cover: G = {A → B, ACD → E, EF → G, EF → H}
Key: ACDF
Schemas: AB, ACDE, EFG, EFH, ACDF
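Steps 2 to 4 of the 3NF decomposition can be sketched in Python, starting from a cover that has already been computed. The function names, the FD encoding and the greedy way of shrinking R down to one key are assumptions made for this illustration; the cover is read as A → B, ACD → E, EF → G, EF → H.

```python
def closure(attrs, fds):
    """Attribute closure with respect to fds (naive fixpoint)."""
    closed = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= closed and not set(rhs) <= closed:
                closed |= set(rhs)
                changed = True
    return closed


def find_key(r, fds):
    """Find a single key by dropping attributes that are not needed."""
    key = set(r)
    for a in sorted(r):
        if closure(key - {a}, fds) >= set(r):
            key.discard(a)
    return key


def synthesize_3nf(r, cover):
    r = set(r)
    # Step 2: one schema XA per FD X -> A of the cover.
    schemas = []
    for lhs, rhs in cover:
        s = frozenset(lhs) | frozenset(rhs)
        if s not in schemas:
            schemas.append(s)
    # Step 3: add a key if no schema contains one.
    if not any(closure(s, cover) >= r for s in schemas):
        schemas.append(frozenset(find_key(r, cover)))
    # Step 4: remove schemas strictly contained in other schemas.
    return [s for s in schemas if not any(s < t for t in schemas)]


cover = [("A", "B"), ("ACD", "E"), ("EF", "G"), ("EF", "H")]
result = synthesize_3nf("ABCDEFGH", cover)
print(["".join(sorted(s)) for s in result])
```

A schema contains a key exactly when its closure is all of R, which is how step 3 is tested here.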

Decomposition into BCNF
There always exists a decomposition into BCNF that has a lossless join, and there is a polynomial algorithm for finding such a decomposition.
There does not always exist a decomposition into BCNF that is dependency preserving.
Example: consider the relation SBD (sailor, boat, date) with the FDs SB → D and D → B.

Algorithm from Class
Suppose R is not in BCNF, and that X → A violates the BCNF condition for R.
Decompose R into R − A and XA.
Continue recursively with R − A and XA.
Note: we must find violations of the BCNF condition with respect to F_{R−A} and F_{XA}, the dependencies that hold on R − A and on XA.

Polynomial Algorithm for Decomposition into BCNF

Lemmas
Lemma 1: Every 2-attribute scheme is in BCNF.
Lemma 2: If a schema R is not in BCNF, then we can find attributes A and B in R such that (R − AB) → A. It may or may not be the case that (R − AB) → B as well.

Algorithm
Check whether the schema R is in BCNF.
If not, decompose R into R − A and XA, such that X → A holds and there is no Y ⊂ X such that Y → A.

Algorithm
    Z := R    // at all times, Z is the one scheme of the decomposition that may not be in BCNF
    repeat
        decompose Z into Z − A and XA, where XA is in BCNF and X → A    // use the decomposition procedure
        add XA to the decomposition
        Z := Z − A
    until Z cannot be decomposed    // by Lemma 2
    add Z to the decomposition

Decomposition Procedure
    // all closures are taken with respect to F
    if Z contains no A and B such that A is in (Z − AB)+ then
        return that Z is in BCNF and cannot be decomposed
    else begin
        find one such A and B
        Y := Z − B
        while Y contains A and B such that A is in (Y − AB)+ do
            Y := Y − B
        return the decomposition Z − A and Y    // Y is of the form XA, with X → A
    end

Example
Schema R = CTHRSG
C = course, T = teacher, H = hour, R = room, S = student, G = grade

FDs
C → T: Each course has one teacher
HR → C: Only one course can meet in a room at one time
HT → R: A teacher can be in only one room at one time
CS → G: Each student has one grade in each course
HS → R: A student can be in only one room at one time

Running the Algorithm (1)
Z = CTHRSG
Check A=C, B=T: C is in (HRSG)+, so Y = CHRSG
A=R, B=C: Y = HRSG
A=R, B=G: Y = HRS
Add HRS to the decomposition; Z = CTHSG

Running the Algorithm (2)
Z = CTHSG
Check A=T, B=H: Y = CTSG
A=T, B=S: Y = CTG
A=T, B=G: Y = CT
Add CT to the decomposition; Z = CHSG

Running the Algorithm (3)
Z = CHSG
Check A=G, B=H: Y = CSG
Add CSG to the decomposition; Z = CHS
CHS is in BCNF

Decomposition into BCNF
Result: HRS, CT, CSG, CHS
Is it a lossless join? Is it dependency preserving?
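The polynomial algorithm and its decomposition procedure can be sketched in Python and run on the CTHRSG schema. All names here are illustrative assumptions. The sketch picks the first qualifying pair (A, B) in alphabetical order, which is not necessarily the pair chosen on the slides, so the schemas it returns may differ from HRS, CT, CSG, CHS. The helper `is_bcnf` is a brute-force check added only to validate the output.

```python
from itertools import combinations, permutations


def closure(attrs, fds):
    """Attribute closure with respect to fds (naive fixpoint)."""
    closed = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= closed and not set(rhs) <= closed:
                closed |= set(rhs)
                changed = True
    return closed


def find_pair(schema, fds):
    """Return some (A, B) in schema with A in (schema - AB)+, else None."""
    for a, b in permutations(sorted(schema), 2):
        if a in closure(schema - {a, b}, fds):
            return a, b
    return None


def split(z, fds):
    """Decomposition procedure: return (Z - A, Y), or None if no pair exists."""
    pair = find_pair(z, fds)
    if pair is None:
        return None
    y = set(z)
    while pair is not None:
        a, b = pair
        y = y - {b}
        pair = find_pair(y, fds)
    return z - {a}, y          # a is the last A found, so Y has the form XA


def bcnf_decompose(r, fds):
    z, result = set(r), []
    parts = split(z, fds)
    while parts is not None:
        z, y = parts
        result.append(y)
        parts = split(z, fds)
    result.append(z)
    return result


def is_bcnf(schema, fds):
    """Brute force: each X inside schema determines nothing new or everything."""
    for k in range(1, len(schema)):
        for combo in combinations(sorted(schema), k):
            cx = closure(combo, fds) & schema
            if cx != set(combo) and cx != schema:
                return False
    return True


fds = [("C", "T"), ("HR", "C"), ("HT", "R"), ("CS", "G"), ("HS", "R")]
result = bcnf_decompose("CTHRSG", fds)
print(["".join(sorted(s)) for s in result])
```

Each call to `split` removes one attribute from Z, so the outer loop runs at most |R| times, and each pair test is a single closure computation.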