Lossless Decomposition Elias Aseged SE 157B - DB 2.



What is Decomposition?
Decomposition is the process of breaking something down into parts or elements. In a database, decomposition means breaking a table down into multiple tables. From a database perspective, it is how a schema is moved to a higher normal form.

Decomposition
It is important that decompositions are “good”. Two characteristics of good decompositions:
1) They are lossless
2) They preserve dependencies

What is lossless?
Lossless means functioning without a loss; in other words, everything is retained. It is important for database decompositions to have this property, so that no information in the original relation is lost.

Formal Definition
Let R be a relation schema. Let F be a set of functional dependencies on R. Let R1 and R2 form a decomposition of R. The decomposition is a lossless-join decomposition of R if at least one of the following functional dependencies is in F+ (the closure of F):
1) R1 ∩ R2 → R1
2) R1 ∩ R2 → R2

In Simpler Terms…
R1 ∩ R2 → R1
R1 ∩ R2 → R2
If R is split into R1 and R2, at least one of these two dependencies must hold for the decomposition to be lossless. Projecting R onto R1 and R2 and joining the projections back then results in the relation you started with.
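The test above can be sketched in Python by computing the attribute closure of R1 ∩ R2 and checking whether it covers R1 or R2. This is a minimal sketch, not part of the original slides: the function names `closure` and `is_lossless_binary` are hypothetical, attributes are assumed to be single letters so that a string such as "AB" stands for the set {A, B}, and the small schema R(A, B, C) with A → B is made up for illustration.

```python
def closure(attrs, fds):
    """Closure of the attribute set `attrs` under `fds`.

    `fds` is a list of (lhs, rhs) pairs; each side is a string of
    single-letter attribute names (an assumption of this sketch).
    """
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            # Apply lhs -> rhs whenever lhs is covered and rhs adds something new.
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def is_lossless_binary(r1, r2, fds):
    """True if R1 ∩ R2 → R1 or R1 ∩ R2 → R2 follows from `fds`."""
    common_closure = closure(set(r1) & set(r2), fds)
    return set(r1) <= common_closure or set(r2) <= common_closure


# Hypothetical schema R(A, B, C) with the single dependency A -> B.
fds = [("A", "B")]
print(is_lossless_binary("AB", "AC", fds))  # True: A -> AB, so A covers R1
print(is_lossless_binary("AB", "BC", fds))  # False: B determines neither side
```

The check works on the closure rather than on F directly because the definition asks for membership in F+, not in F.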

Why lossless?
The condition ensures that the attributes involved in the natural join (R1 ∩ R2) form a superkey, that is, contain a candidate key, for at least one of the two relations. This ensures that false tuples can never be generated: for any value of the join attributes there is exactly one matching tuple in that relation.

Lossless Decomposition
A decomposition is lossless if we can recover the original relation:
Decompose: R(A,B,C) is split into R1(A,B) and R2(A,C)
Recover: R1 and R2 are joined into R’(A,B,C)
R’(A,B,C) should be the same as R(A,B,C). We must ensure R’ = R.

Sometimes the same set of data is reproduced:

Original relation:
Name    Price  Category
Word    100    WP
Oracle  1000   DB
Access  100    DB

Decomposed relations:
Name    Price
Word    100
Oracle  1000
Access  100

Name    Category
Word    WP
Oracle  DB
Access  DB

Joining on Name:
(Word, 100) + (Word, WP) → (Word, 100, WP)
(Oracle, 1000) + (Oracle, DB) → (Oracle, 1000, DB)
(Access, 100) + (Access, DB) → (Access, 100, DB)

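Lossy Decomposition
Sometimes it’s not:

Original relation:
Name    Price  Category
Word    100    WP
Oracle  1000   DB
Access  100    DB

Decomposed relations:
Category  Name
WP        Word
DB        Oracle
DB        Access

Category  Price
WP        100
DB        1000
DB        100

Joining on Category:
(Word, WP) + (100, WP) → (Word, 100, WP)
(Oracle, DB) + (1000, DB) → (Oracle, 1000, DB)
(Oracle, DB) + (100, DB) → (Oracle, 100, DB)
(Access, DB) + (1000, DB) → (Access, 1000, DB)
(Access, DB) + (100, DB) → (Access, 100, DB)

What’s wrong? The join produces five tuples, and two of them, (Oracle, 100, DB) and (Access, 1000, DB), were never in the original relation.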
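The two decompositions of the Name/Price/Category relation can be replayed in a few lines of Python. This is an illustrative sketch only: the helper names `project`, `natural_join` and `as_set` are hypothetical, and relations are modelled as lists of dictionaries rather than as real database tables.

```python
def project(rows, attrs):
    """Project a relation (list of dicts) onto `attrs`, dropping duplicates."""
    seen = {tuple(row[a] for a in attrs) for row in rows}
    return [dict(zip(attrs, values)) for values in seen]


def natural_join(left, right):
    """Join two relations on every attribute name they share."""
    joined = []
    for l in left:
        for r in right:
            shared = l.keys() & r.keys()
            if all(l[a] == r[a] for a in shared):
                joined.append({**l, **r})
    return joined


def as_set(rows, attrs):
    """Relation as a set of tuples in a fixed attribute order, for comparison."""
    return {tuple(row[a] for a in attrs) for row in rows}


ATTRS = ("Name", "Price", "Category")
R = [
    {"Name": "Word", "Price": 100, "Category": "WP"},
    {"Name": "Oracle", "Price": 1000, "Category": "DB"},
    {"Name": "Access", "Price": 100, "Category": "DB"},
]
original = as_set(R, ATTRS)

# Split on Name: the join gives back exactly the original tuples.
good = natural_join(project(R, ("Name", "Price")), project(R, ("Name", "Category")))
print(as_set(good, ATTRS) == original)  # True

# Split on Category: the join invents tuples that were never in R.
bad = natural_join(project(R, ("Category", "Name")), project(R, ("Category", "Price")))
print(sorted(as_set(bad, ATTRS) - original))
# [('Access', 1000, 'DB'), ('Oracle', 100, 'DB')]
```

In this data Name is unique per row, while Category "DB" appears in two rows, which is why only the second join produces spurious tuples.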

Ensuring lossless decomposition
R(A1, ..., An, B1, ..., Bm, C1, ..., Cp)
is decomposed into
R1(A1, ..., An, B1, ..., Bm)
R2(A1, ..., An, C1, ..., Cp)
If A1, ..., An → B1, ..., Bm or A1, ..., An → C1, ..., Cp, then the decomposition is lossless.
Note: we don’t need both.

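Identifying a Lossy Decomposition
Make a table with one row for each sub-schema of R and one column for each attribute of R.
Fill in each row with distinguished variables in the columns of the attributes that belong to that sub-schema.
– If one row is full of distinguished variables, the decomposition is lossless
– If no row is full, use the functional dependencies to add distinguished variables
To add distinguished variables with a functional dependency, look for:
1) 2 or more rows with distinguished variables in every LHS column
2) Among those rows, 1 or more rows with distinguished variables in the RHS columns
3) Among those rows, 1 or more rows with non-distinguished variables in the RHS columns; these rows receive distinguished variables there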
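The table procedure can be sketched in Python as follows. This is a hedged sketch rather than the exact method of the slides: it implements the general form of the test (often called the chase), which also equates non-distinguished variables, and the function name `chase_is_lossless` and the single-letter attribute encoding are assumptions made here. The usage lines reuse the schemas and functional dependencies of Example 1 and Example 2 from these slides.

```python
def chase_is_lossless(schema, parts, fds):
    """Tableau test for a lossless-join decomposition.

    schema: string of single-letter attributes, e.g. "ABCDE" (assumption)
    parts:  list of sub-schemas, e.g. ["AB", "ACDE"]
    fds:    list of (lhs, rhs) pairs of attribute strings
    """
    # Row i holds the distinguished symbol "a" for attributes of parts[i]
    # and a row-specific non-distinguished symbol ("b", i) elsewhere.
    rows = [
        {attr: "a" if attr in part else ("b", i) for attr in schema}
        for i, part in enumerate(parts)
    ]
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            for r1 in rows:
                for r2 in rows:
                    # The dependency applies only to two rows that agree on lhs.
                    if r1 is r2 or any(r1[x] != r2[x] for x in lhs):
                        continue
                    for y in rhs:
                        if r1[y] == r2[y]:
                            continue
                        # Equate the two symbols, keeping "a" if either has it.
                        keep, drop = (r1[y], r2[y]) if r1[y] == "a" else (r2[y], r1[y])
                        for row in rows:
                            if row[y] == drop:
                                row[y] = keep
                        changed = True
    # Lossless exactly when some row is full of distinguished symbols.
    return any(all(row[attr] == "a" for attr in schema) for row in rows)


# Example 1: R(ABCDE), R1 = (AB), R2 = (ACDE)
print(chase_is_lossless("ABCDE", ["AB", "ACDE"],
                        [("A", "B"), ("BC", "E"), ("ED", "A")]))  # True

# Example 2: R(ABCDE), R1 = (BCD), R2 = (ACE)
print(chase_is_lossless("ABCDE", ["BCD", "ACE"],
                        [("AB", "C"), ("C", "E"), ("B", "D"), ("E", "A")]))  # True
```

Unlike the two-relation test on R1 ∩ R2, this version accepts any number of sub-schemas in `parts`.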

Example 1 (From Class)
R(A B C D E)
FD1: A → B
FD2: BC → E
FD3: ED → A
R1 = (AB); R2 = (ACDE)

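Answer

      A  B  C  D  E
R1    a  a
R2    a  a  a  a  a

The a under B in row R2 is added by FD1 (A → B): both rows have a distinguished variable under A, and R1 has one under B. Row R2 is then full of distinguished variables.
*This decomposition is lossless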

Example 2
Is this decomposition lossless?
R(A B C D E)
FD1: AB → C
FD2: C → E
FD3: B → D
FD4: E → A
R1 = (BCD); R2 = (ACE)

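Answer
If you do this procedure and you don’t have one row full of distinguished variables, then the decomposition is lossy.

      A  B  C  D  E
R1    a  a  a  a  a
R2    a     a     a

Row R1 starts with distinguished variables under B, C and D only. FD2 (C → E) adds one under E, because both rows agree on C and R2 has a distinguished variable under E. FD4 (E → A) then adds one under A. Row R1 is now full.
*This decomposition is lossless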

R(A B C D E)
FD1: A → BC
FD2: BD → CE
FD3: E → AD
FD4: CE → A
R1 = (ABC); R2 = (BCDE)

Conclusion
Decomposing is the act of breaking tables down in order to achieve a higher normal form. Decompositions should always be lossless; this guarantees that the information in the original relation can be accurately reconstructed from the decomposed relations. Remember that for a decomposition to be considered “good”, it must also preserve functional dependencies.

Questions?
