Some slides are from Dr. Sara Cohen

Design Theory

Overview
Starting point: a set of functional dependencies that describe real-world constraints.
Goal: create tables that do not contain redundancies, so that
  there is less wasted space
  there is less of a chance to introduce errors in the database

Design Theory
Armstrong's axioms were defined so that we can derive functional dependencies.
We need to identify a key:
  find a single key
  find all keys
Both algorithms use, as a subroutine, an algorithm that computes the closure.
In class a polynomial algorithm was given. A linear algorithm will be shown.

Compute Closure in Linear Time

Closure of a Set of Attributes
Let U be a set of attributes and F be a set of functional dependencies on U.
Suppose that X ⊆ U is a set of attributes.
Definition: X+ = { A | F |= X → A }
We would like to compute X+.

Algorithm From Class
Compute Closure(X, F)
1. C := X
2. While there is a V → W in F such that (V ⊆ C) and (W ⊄ C) do C := C ∪ W
3. Return C
Complexity: O(|U||F|)
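The loop above can be sketched in Python as follows. This is a minimal illustration, not the code used in class: representing each FD as a pair of attribute strings such as ("AD", "E") and the function name closure are assumptions made here for the example.

```python
def closure(attrs, fds):
    """Closure of attrs under fds; fds is a list of (lhs, rhs) attribute strings."""
    c = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            # Apply V -> W when V is inside C and W adds something new.
            if set(lhs) <= c and not set(rhs) <= c:
                c |= set(rhs)
                changed = True
    return c


fds = [("A", "C"), ("B", "D"), ("AD", "E")]
print(sorted(closure("AB", fds)))  # ['A', 'B', 'C', 'D', 'E']
```

Each pass over F either adds at least one attribute or stops the loop, which is where the O(|U||F|) bound on the number of FD checks comes from.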

Example
R = ABCDE
F = {ABC, CEB, DA, BCE}
{A}+ =
{A,B}+ =
{B,D}+ =

A More Efficient Algorithm
We start by creating a table, with a row for each FD and a column for each attribute. The table will have 2 additional columns called size and tail.
In the row for a dependency X → Y, there will be the value true in each column corresponding to an attribute in X.
The size column will contain the size of the set X.
The tail column will contain Y.

Example Table
F = {A → C, B → D, AD → E}

          A      B      C      D      E      Size   Tail
A → C     true                               1      C
B → D            true                        1      D
AD → E    true                 true          2      E

Compute Closure(X, F, T) /* T is the table, n is the number of FDs in F */
C := X
Q := X
While Q is not empty
  A := Q.dequeue()
  for i = 1..n
    if T[i, A] = true then
      T[i, size] := T[i, size] − 1
      if T[i, size] = 0 then
        Q := Q ∪ (T[i, tail] \ C)
        C := C ∪ T[i, tail]
Return C
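A possible Python rendering of the table-based algorithm is sketched below. Instead of a full table it keeps, for every attribute, the list of FD rows in which that attribute is marked true, and a size counter per row. The names closure_linear and rows_with and the encoding of FDs as pairs of attribute strings are assumptions of this sketch, and it assumes every FD has a non-empty left side.

```python
from collections import deque


def closure_linear(attrs, fds):
    """Closure of attrs; fds is a list of (lhs, rhs) attribute strings."""
    size = [len(set(lhs)) for lhs, _ in fds]   # the "size" column
    rows_with = {}                             # attribute -> rows where it is true
    for i, (lhs, _) in enumerate(fds):
        for a in set(lhs):
            rows_with.setdefault(a, []).append(i)

    c = set(attrs)
    q = deque(c)
    while q:
        a = q.popleft()
        for i in rows_with.get(a, ()):
            size[i] -= 1
            if size[i] == 0:                   # whole left side is in C
                for b in fds[i][1]:            # the "tail" column
                    if b not in c:
                        c.add(b)
                        q.append(b)
    return c


fds = [("A", "C"), ("B", "D"), ("AD", "E")]
print(sorted(closure_linear("AB", fds)))  # ['A', 'B', 'C', 'D', 'E']
```

An attribute enters the queue only when it is first added to C, so each attribute is dequeued at most once and each size counter is decremented at most once per attribute on its left side.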

Computing AB+
F = {A → C, B → D, AD → E}. The last three columns give the value of size in each row of the table after the step.

Step             X+             Q        A → C   B → D   AD → E
Start            {A,B}          {A,B}    1       1       2
Iteration of A   {A,B,C}        {B,C}    0       1       1
Iteration of B   {A,B,C,D}      {C,D}    0       0       1
Iteration of C   {A,B,C,D}      {D}      0       0       1
Iteration of D   {A,B,C,D,E}    {E}      0       0       0
Iteration of E   {A,B,C,D,E}    {}       0       0       0

Complexity?
To get an efficient algorithm, we assume that there are pointers from each "true" box in the table to the next "true" box in the same column.

Complexity
Complexity: O(|F|)
  Each attribute of X+ is dequeued at most once (an attribute of X, or an attribute appearing on the right side of an FD in F).
  The number of changes to size in the table is at most the number of attributes appearing on the left sides of FDs in F.

Decomposition Characteristics

Characteristics of a Decomposition
Two important characteristics of a decomposition:
  lossless join: necessary, otherwise the original relation cannot be recreated, even if the tables are not modified
  dependency preserving: allows us to check that inserts/updates are correct without joining the sub-relations

Lossless Join

  T      C    S
  Smith  DB   Cohen
  Jones  OS   Levy

  C    S
  DB   Cohen
  OS   Levy

Checking
Check for a lossless join using the algorithm from class (with the a-s and b-s).
Check for dependency preservation using an algorithm shown today.

Dependency Preservation
R = ABC
Decomposition: {AB, AC}
Dependencies: {A → B, B → C}
Is it lossless?
Does this decomposition preserve B → C?

Dependency Preservation (cont’d)
[Figure: example tables for the decomposition {AB, AC}, with columns (B, A) and (C, A)]

Definitions
We define π_S(F) to be the set of dependencies X → Y in F+ such that X and Y are contained in S.
We say that a decomposition R1, ..., Rn of R is dependency preserving if for all instances r of R that satisfy the FDs of R:
  (π_R1(F) ∪ ... ∪ π_Rn(F))+ = F+
Note that one inclusion clearly always holds.
This definition implies an exponential algorithm to check whether a decomposition is dependency preserving.
We give a polynomial algorithm.

Algorithm
Let R be a relation, decomposed into R1, R2, …, Rn.
Let F be a set of functional dependencies.
To check whether R1, …, Rn preserves all the functional dependencies in F, run the algorithm on the next slide for each X -> Y in F.
  If the answer is "Yes" for all FDs, then the decomposition preserves F.
  If the answer is "No" for at least one FD, then the decomposition does not preserve F.

Testing Dependency Preservation
To check if the decomposition preserves X → Y:
Z := X
while changes to Z occur do
  for i = 1 to n do
    Z := Z ∪ ((Z ∩ Ri)+ ∩ Ri)
if Y ⊆ Z return "yes" else return "no"
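The test can be sketched in Python as below. The helper closure is the simple polynomial closure, and the function names, the encoding of FDs as pairs of attribute strings, and the encoding of schemas as attribute strings are assumptions of this sketch.

```python
def closure(attrs, fds):
    """Closure of attrs under fds; fds is a list of (lhs, rhs) attribute strings."""
    c = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= c and not set(rhs) <= c:
                c |= set(rhs)
                changed = True
    return c


def preserves(lhs, rhs, schemas, fds):
    """Does the decomposition `schemas` preserve the single FD lhs -> rhs?"""
    z = set(lhs)
    changed = True
    while changed:
        changed = False
        for ri in schemas:
            ri = set(ri)
            # Z := Z ∪ ((Z ∩ Ri)+ ∩ Ri)
            gained = (closure(z & ri, fds) & ri) - z
            if gained:
                z |= gained
                changed = True
    return set(rhs) <= z


def is_dependency_preserving(schemas, fds):
    """Run the single-FD test for every FD in fds."""
    return all(preserves(lhs, rhs, schemas, fds) for lhs, rhs in fds)


# R = ABC decomposed into AB and AC, with A -> B and B -> C.
print(preserves("B", "C", ["AB", "AC"], [("A", "B"), ("B", "C")]))  # False
```

Only closures with respect to F are computed, so the projections of F onto the sub-schemas never have to be materialized.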

Example (1) R=ABCD F = {A -> B, B -> C, C -> D, D -> A}
R1=AB, R2=BC, R3=CD Is this decomposition dependency preserving?

Example (2) R = ABCDE F = {A -> ABCDE, BC -> A, DE -> C}
Suppose we decompose R into ABDE and DEC. Is the decomposition dependency preserving?

Normal Forms

Non-Redundant Cover
The algorithm for decomposition into 3NF that has a lossless join and is dependency preserving uses a non-redundant cover.

Finding a Non-Redundant Cover
3 steps:
1. Define G as the result of putting F in standard form by decomposing each FD so that it has a single attribute on the right side.
2. For each X → A in G and for each B in X, check whether G |= (X − B) → A. If so, remove B from X.
3. For each X → A in G, check whether G − {X → A} |= X → A. If so, remove X → A from G.
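The three steps can be sketched in Python as follows. This is an illustrative sketch under stated assumptions: FDs are pairs of attribute strings, the result is a list of (frozenset, attribute) pairs, and the names closure and non_redundant_cover are chosen here. The cover found can depend on the order in which attributes and FDs are examined.

```python
def closure(attrs, fds):
    """Closure of attrs under fds given as (lhs, rhs) pairs of attribute collections."""
    c = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= c and not set(rhs) <= c:
                c |= set(rhs)
                changed = True
    return c


def non_redundant_cover(fds):
    # Step 1: standard form, a single attribute on the right side.
    g = []
    for lhs, rhs in fds:
        for a in rhs:
            fd = (frozenset(lhs), a)
            if fd not in g:
                g.append(fd)

    # Step 2: drop B from X when G |= (X - B) -> A.
    for i in range(len(g)):
        lhs, a = g[i]
        for b in sorted(lhs):
            smaller = lhs - {b}
            if a in closure(smaller, g):
                lhs = smaller
                g[i] = (lhs, a)

    # Step 3: drop X -> A when the remaining FDs already imply it.
    result = list(dict.fromkeys(g))  # step 2 may have produced duplicates
    for fd in list(result):
        rest = [other for other in result if other != fd]
        if fd[1] in closure(fd[0], rest):
            result = rest
    return result


cover = non_redundant_cover([("A", "B"), ("ABCD", "E"), ("EF", "GH"), ("ACDF", "EG")])
for lhs, a in cover:
    print("".join(sorted(lhs)), "->", a)
```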

Normal Forms
The basic idea: if a relation is in one of these forms, then it avoids certain problems (e.g., redundancy).
Normal forms:
  BCNF: every dependency X -> A in F+ must be (1) trivial, or (2) X is a super-key.
  3NF: every dependency X -> A in F+ must be (1) trivial, (2) X is a super-key, or (3) A is an attribute of a key.

Example
Reminder: F+ = {X -> X+ | there exists Y -> Z in F such that Y is contained in X and Z is not contained in X}
Suppose that R = ABC. For each of the following values of F, decide whether R is in BCNF/3NF:
  F = {}
  F = {A -> B}
  F = {A -> B, A -> C}
  F = {A -> B, B -> C}
  F = {A -> B, BC -> A}
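For small schemas such as R = ABC, the two definitions can be checked by brute force over all attribute subsets, which is one way to verify answers to the exercise. The sketch below is exponential in the number of attributes and is meant only for illustration; the function names and the encoding of FDs as pairs of attribute strings are assumptions made here, and F is assumed to mention only attributes of R.

```python
from itertools import combinations


def closure(attrs, fds):
    """Closure of attrs under fds; fds is a list of (lhs, rhs) attribute strings."""
    c = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= c and not set(rhs) <= c:
                c |= set(rhs)
                changed = True
    return c


def subsets(attrs):
    """All subsets of attrs, smallest first."""
    attrs = sorted(attrs)
    for k in range(len(attrs) + 1):
        for combo in combinations(attrs, k):
            yield set(combo)


def keys(r, fds):
    """All keys of r: the minimal sets whose closure is r."""
    r = set(r)
    superkeys = [x for x in subsets(r) if closure(x, fds) == r]
    return [k for k in superkeys if not any(s < k for s in superkeys)]


def normal_form(r, fds):
    """Return 'BCNF', '3NF' or 'neither' for schema r with FDs fds."""
    r = set(r)
    prime = set().union(*keys(r, fds))  # attributes that belong to some key
    bcnf = third = True
    for x in subsets(r):
        cx = closure(x, fds)
        if cx == r:
            continue                    # X is a super-key
        for a in cx - x:                # non-trivial X -> A with X not a super-key
            bcnf = False
            if a not in prime:
                third = False
    return "BCNF" if bcnf else "3NF" if third else "neither"


for f in ([], [("A", "B")], [("A", "B"), ("A", "C")]):
    print(f, normal_form("ABC", f))
```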

Decomposition into 3NF
Given a relation R with functional dependencies F:
Step 1: Find a non-redundant cover G of F.
Step 2: For each FD X → A in G, create a schema XA.
Step 3: If no schema created so far contains a key, add a key as a schema.
Step 4: Remove schemas that are contained in other schemas.
The result is a decomposition into 3NF that is dependency preserving and has a lossless join.
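Steps 2 to 4 can be sketched in Python as below, taking a non-redundant cover G as input (step 1 is assumed to have been done already). The function names, the encoding of FDs as pairs of attribute strings, and the greedy way of finding one key are assumptions of this sketch.

```python
def closure(attrs, fds):
    """Closure of attrs under fds; fds is a list of (lhs, rhs) attribute strings."""
    c = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= c and not set(rhs) <= c:
                c |= set(rhs)
                changed = True
    return c


def find_key(r, fds):
    """Find one key by dropping attributes while the rest still determines r."""
    r = set(r)
    key = set(r)
    for a in sorted(r):
        if closure(key - {a}, fds) >= r:
            key.discard(a)
    return key


def synthesize_3nf(r, cover):
    r = set(r)
    # Step 2: one schema XA per FD X -> A of the cover.
    schemas = [set(lhs) | set(rhs) for lhs, rhs in cover]
    # Step 3: a schema contains a key exactly when its closure is all of r.
    if not any(closure(s, cover) >= r for s in schemas):
        schemas.append(find_key(r, cover))
    # Step 4: drop duplicates and schemas contained in other schemas.
    result = []
    for s in schemas:
        if not any(s < t for t in schemas) and s not in result:
            result.append(s)
    return result


g = [("A", "B"), ("ACD", "E"), ("EF", "G"), ("EF", "H")]
for schema in synthesize_3nf("ABCDEFGH", g):
    print("".join(sorted(schema)))
```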

Example
Find a decomposition into 3NF for the relation R = ABCDEFGH, with the functional dependencies F = {A → B, ABCD → E, EF → GH, ACDF → EG}

Example
Non-redundant cover: G = {A → B, ACD → E, EF → G, EF → H}
Key: ACDF
Schema: AB, ACDE, EFG, EFH, ACDF

Decomposition into BCNF
There always exists a decomposition into BCNF that has a lossless join.
There does not always exist a decomposition into BCNF that is dependency preserving.
  Example: consider the relation SBD (sailor, boat, date) with the FDs {SB → D} and {D → B}.
There is a polynomial algorithm for finding a decomposition into BCNF that has a lossless join.

Algorithm from Class
Suppose R is not in BCNF.
Suppose that X → A violates the BCNF condition for R.
Decompose R into R − A and XA.
Continue recursively with R − A and XA.
Note: we must find violations of the BCNF condition in F_{R−A} and F_{XA}, the projections of F onto R − A and XA.

Polynomial Algorithm for Decomposition into BCNF

Lemmas
Lemma 1: Every two-attribute schema is in BCNF.
Lemma 2: If a schema R is not in BCNF, then we can find attributes A and B in R such that (R − AB) → A. It may or may not be the case that (R − AB) → B as well.

Algorithm
Check whether the schema R is in BCNF. If not, decompose R into
  R − A and XA
such that X → A and there is no Y ⊂ X such that Y → A.

Algorithm
Z := R; // at all times, Z is the one schema of the decomposition that may not be in BCNF
repeat
  decompose Z into Z − A and XA, where XA is in BCNF and X → A; // use the decomposition procedure
  add XA to the decomposition;
  Z := Z − A;
until Z cannot be decomposed // by Lemma 2
add Z to the decomposition

Decomposition Procedure
if Z contains no A and B such that A is in (Z − AB)+ // all closures are taken with respect to F
then return that Z is in BCNF and cannot be decomposed
else begin
  find one such A and B;
  Y := Z − B;
  while Y contains A and B such that A is in (Y − AB)+ do
    Y := Y − B;
  return the decomposition Z − A and Y; // Y is of the form XA, where X → A
end
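The main loop and the decomposition procedure can be sketched together in Python as follows. The function names, the encoding of FDs as pairs of attribute strings, and the rule of taking the first suitable pair (A, B) in alphabetical order are assumptions of this sketch. Because the procedure leaves the choice of A and B open, a different choice order can produce a different, equally valid, decomposition.

```python
from itertools import permutations


def closure(attrs, fds):
    """Closure of attrs under fds; fds is a list of (lhs, rhs) attribute strings."""
    c = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= c and not set(rhs) <= c:
                c |= set(rhs)
                changed = True
    return c


def find_pair(schema, fds):
    """Return some (A, B) in schema with A in (schema - AB)+, or None."""
    for a, b in permutations(sorted(schema), 2):
        if a in closure(schema - {a, b}, fds):
            return a, b
    return None


def split(z, fds):
    """Decomposition procedure: return (Z - A, Y), or None if no pair exists."""
    pair = find_pair(z, fds)
    if pair is None:
        return None
    a, b = pair
    y = z - {b}
    while True:
        pair = find_pair(y, fds)
        if pair is None:
            break
        a, b = pair          # the last A found is the one removed from Z
        y = y - {b}
    return z - {a}, y


def bcnf_decompose(r, fds):
    z, result = set(r), []
    while True:
        parts = split(z, fds)
        if parts is None:
            break
        z, y = parts
        result.append(y)
    result.append(z)
    return result


fds = [("C", "T"), ("HR", "C"), ("HT", "R"), ("CS", "G"), ("HS", "R")]
for schema in bcnf_decompose("CTHRSG", fds):
    print("".join(sorted(schema)))
```

Each call to find_pair tries a quadratic number of pairs and computes one closure per pair, so the whole decomposition stays polynomial in the size of R and F.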

Example
Schema R = CTHRSG
  C = course
  T = teacher
  H = hour
  R = room
  S = student
  G = grade

FDs
  C → T    Each course has one teacher
  HR → C   Only one course can meet in a room at one time
  HT → R   A teacher can be in only one room at one time
  CS → G   Each student has one grade in each course
  HS → R   A student can be in only one room at one time

Running the Algorithm (1)
Z = CTHRSG
  Check A=C, B=T: C is in (HRSG)+, so Y = CHRSG
  A=R, B=C: Y = HRSG
  A=R, B=G: Y = HRS
  Add HRS to the decomposition and Z = CTHSG

Z = CTHSG
  Check A=T, B=H: Y = CTSG
  A=T, B=S: Y = CTG
  A=T, B=G: Y = CT
  Add CT to the decomposition and Z = CHSG

Z = CHSG
  Check A=G, B=H: Y = CSG
  Add CSG to the decomposition and Z = CHS
  CHS is in BCNF

Decomposition into BCNF
HRS, CT, CSG, CHS
Is it a lossless join?
Is it dependency preserving?
