CS422 Principles of Database Systems Normalization


CS422 Principles of Database Systems: Normalization
Chengyu Sun, California State University, Los Angeles

Schema Design
Problem Description → ER Design → ER Diagram → ER to Relational Conversion → Relational Schema → Good Schema?
- N: Transform Bad into Good
- Y: Good Relational Schema

Bad Schema
class_records:

id  name  address        assignment  due         grade
1   John  123 Main St.   HW1         2009-06-22  A-
1   John  123 Main St.   HW2         2009-07-10  B
2   Jane  456 State St.                          A

Problems: update anomaly, delete anomaly

Normalization
students:

id  name  address
1   John  123 Main St.
2   Jane  456 State St.

assignments:

name  due
HW1   2009-06-22
HW2   2009-07-10

grades:

student  assignment  grades
1        HW1         A-
1        HW2         B
2                    A

Questions To Be Answered
- How do we decide whether a schema is bad?
- How do we decompose a table to turn a bad schema into a good one?

Functional Dependency (FD)
A functional dependency on table R is the assertion that two records having the same values for attributes {A1,…,An} must also have the same value for attribute B.
Notation: {A1,…,An} → B, read as "{A1,…,An} functionally determine B".

FD and Data
An FD is an assertion based on assumptions about all possible data, not just the existing data.

id  name
1   John
2   Jane

{id} → name
{name} → id
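A small check makes this point concrete. The sketch below tests whether a set of rows is consistent with an FD; the helper name `satisfies_fd` and the extra third row are illustrative assumptions, not part of the slides. Passing the check only means the current rows do not contradict the FD. It does not establish the FD for all possible data.

```python
def satisfies_fd(rows, lhs, rhs):
    """Return True if no two rows agree on `lhs` but differ on `rhs`."""
    seen = {}
    for row in rows:
        key = tuple(row[a] for a in lhs)
        value = tuple(row[a] for a in rhs)
        # setdefault stores the first value seen for this key and returns it
        if seen.setdefault(key, value) != value:
            return False
    return True


# The two rows from the slide: both FDs are consistent with this data.
students = [{"id": 1, "name": "John"}, {"id": 2, "name": "Jane"}]
print(satisfies_fd(students, ["id"], ["name"]))   # True
print(satisfies_fd(students, ["name"], ["id"]))   # True

# A hypothetical second John shows that {name} -> id was never safe to assert.
more_students = students + [{"id": 3, "name": "John"}]
print(satisfies_fd(more_students, ["name"], ["id"]))  # False
print(satisfies_fd(more_students, ["id"], ["name"]))  # True
```

Whether an FD belongs in the design is a modeling decision about the domain; a check like this can only refute a candidate FD.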

FD and Uniqueness
Note that the definition does NOT say "{A1,…,An} uniquely determines a record".

id  name  address        assignment  due         grade
1   John  123 Main St.   HW1         2009-06-22  A-
1   John  123 Main St.   HW2         2009-07-10  B
2   Jane  456 State St.                          A

FD with Multiple Attributes
{A1, A2, A3, …, An} → B1
{A1, A2, A3, …, An} → B2
…
{A1, A2, A3, …, An} → Bm
These can be combined into {A1, A2, A3, …, An} → {B1, B2, B3, …, Bm}, abbreviated A → B.

Trivial Functional Dependency
FD: {A1, A2, A3, …, An} → {B1, B2, B3, …, Bm}
- The FD is trivial if all B's are in A
- The FD is nontrivial if at least one B is not in A
- The FD is completely nontrivial if no B is in A
From now on, when we talk about an FD, we mean a completely nontrivial FD unless otherwise noted.
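The three cases reduce to set comparisons between the two sides of the FD. A minimal sketch follows; the function name `classify_fd` is an illustrative assumption, and it reports the most specific label (a completely nontrivial FD is also nontrivial by the definition above).

```python
def classify_fd(lhs, rhs):
    """Classify the FD lhs -> rhs, where both sides are sets of attribute names."""
    if rhs <= lhs:
        return "trivial"                 # every B is in A
    if rhs & lhs:
        return "nontrivial"              # some B is in A, at least one is not
    return "completely nontrivial"       # no B is in A


print(classify_fd({"id", "name"}, {"name"}))             # trivial
print(classify_fd({"id", "name"}, {"name", "address"}))  # nontrivial
print(classify_fd({"id"}, {"name", "address"}))          # completely nontrivial
```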

FD Example 1
- Musicians ( id, name, address )
- Bands ( id, name )
- Band_Members ( band_id, musician_id )

FD Example 2
- Books ( id, title )
- Authors ( id, name )
- Book_Authors ( book_id, author_id, author_order )

FD Example 3
class_records:

id  name  address        assignment  due         grade
1   John  123 Main St.   HW1         2009-06-22  A-
1   John  123 Main St.   HW2         2009-07-10  B
2   Jane  456 State St.                          A

Functional dependencies??

Key
A is a key of table R if
- A functionally determines all attributes of R, and
- no proper subset of A functionally determines all attributes of R

A Few Things about Keys
- A table may have multiple keys
- A key may consist of multiple attributes
- A superset of a key is called a super key
- The definition doesn’t say anything about uniqueness
- A key has to be minimal, but not necessarily minimum

Key Examples
- Musicians and bands
- Books and authors
- Class_records

Boyce-Codd Normal Form (BCNF)
A table R is in BCNF if for every nontrivial FD A → B in R, A is a super key of R.
Or: "The key, the whole key, and nothing but the key, so help me Codd."

BCNF or Not?
- Musicians and bands
- Books and authors
- Class_records

Determine If a Table is BCNF
- Step 1: identify all FDs
- Step 2: find all keys
- Step 3: check the LHS of every nontrivial FD and see if it is a superset of a key (i.e. a super key)
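The check in Step 3 can be sketched in a few lines. This version assumes the FDs from Step 1 are already given as (LHS, RHS) pairs of sets, and it tests "is a super key" by asking whether the closure of the LHS covers every attribute, which is equivalent to containing a key. The names `closure` and `bcnf_violations` are illustrative, and the sample relation R(A, B, C) with A → B and B → C is a small made-up input.

```python
def closure(attrs, fds):
    """Attributes functionally determined by `attrs` under the FDs in `fds`."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


def bcnf_violations(relation, fds):
    """Return the nontrivial FDs whose left-hand side is not a super key."""
    violations = []
    for lhs, rhs in fds:
        if rhs <= lhs:
            continue  # trivial FDs never violate BCNF
        if not closure(lhs, fds) >= relation:
            violations.append((lhs, rhs))
    return violations


R = {"A", "B", "C"}
S = [({"A"}, {"B"}), ({"B"}, {"C"})]
print(bcnf_violations(R, S))  # [({'B'}, {'C'})]
```

An empty result means the table is in BCNF with respect to the given FDs.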

Decompose into BCNF
Given table R with functional dependencies S:
- Look among S for a BCNF violation A → B
- Compute A+
- Decompose R into R1 = A+ and R2 = (R – A+) ∪ A
- Continue the decomposition with R1 and R2 until all resulting tables are in BCNF
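The procedure above can be sketched recursively. One detail the slide leaves implicit is how to find violations inside R1 and R2, whose FDs are projections of the original set. This sketch takes a brute-force approach under that assumption: it tries every attribute subset X of the current table and computes X+ restricted to that table, which is exponential and only suitable for small examples. The function names are illustrative, and the input is the address schema with {street_address, city} → zip and {zip} → city.

```python
from itertools import combinations


def closure(attrs, fds):
    """Attributes functionally determined by `attrs` under the FDs in `fds`."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return frozenset(result)


def find_violation(relation, fds):
    """Return (X, X+ within relation) for some BCNF violation, or None."""
    for size in range(1, len(relation)):
        for combo in combinations(sorted(relation), size):
            x = frozenset(combo)
            x_plus = closure(x, fds) & relation
            # Violation: X determines something new but not the whole table.
            if x_plus != x and x_plus != relation:
                return x, x_plus
    return None


def bcnf_decompose(relation, fds):
    """Split `relation` until no table has a BCNF violation."""
    relation = frozenset(relation)
    violation = find_violation(relation, fds)
    if violation is None:
        return [relation]
    x, x_plus = violation
    r1 = x_plus                      # R1 = A+
    r2 = (relation - x_plus) | x     # R2 = (R - A+) U A
    return bcnf_decompose(r1, fds) + bcnf_decompose(r2, fds)


fds = [({"street_address", "city"}, {"zip"}), ({"zip"}, {"city"})]
tables = bcnf_decompose({"street_address", "city", "zip"}, fds)
print([sorted(t) for t in tables])  # [['city', 'zip'], ['street_address', 'zip']]
```

Different choices of violation can lead to different, equally valid BCNF decompositions; this sketch simply takes the first one it finds.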

BCNF Violation Example
Class_records:
- Functional dependencies
- Key(s)
- BCNF violation(s)??

Closure of Attributes A+
Given a set of attributes A and a set of functional dependencies S, the closure of A under S, written A+, is the set of all attributes that are functionally determined by A based on the functional dependencies inferable from S.

Simple Closure Example
R: {A, B, C}
S: {A → B, B → C}
{A}+ ?? {B}+ ?? {C}+ ??

Armstrong’s Axioms
- Reflexivity: if B ⊆ A, then A → B
- Transitivity: if A → B and B → C, then A → C
- Augmentation: if A → B, then AC → BC for any C

Two More FD Rules
- Union: if A → B and A → C, then A → BC
- Decomposition: if A → BC, then A → B and A → C
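Both rules follow from Armstrong's axioms (reflexivity, transitivity, augmentation), so they add convenience rather than new power. One possible derivation, writing XY for the union of attribute sets X and Y (so AA = A):

```latex
\begin{align*}
\textbf{Union.}\quad
  & A \to B            && \text{(given)} \\
  & A \to AB           && \text{(augment } A \to B \text{ with } A\text{)} \\
  & A \to C            && \text{(given)} \\
  & AB \to BC          && \text{(augment } A \to C \text{ with } B\text{)} \\
  & A \to BC           && \text{(transitivity of } A \to AB \text{ and } AB \to BC\text{)} \\[1ex]
\textbf{Decomposition.}\quad
  & A \to BC           && \text{(given)} \\
  & BC \to B           && \text{(reflexivity, since } B \subseteq BC\text{)} \\
  & A \to B            && \text{(transitivity)} \\
  & BC \to C           && \text{(reflexivity, since } C \subseteq BC\text{)} \\
  & A \to C            && \text{(transitivity)}
\end{align*}
```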

Computing A+
- Initialize A+ = A
- Search in S for an FD B → C where B ⊆ A+ and C ⊄ A+
- Add C to A+
- Repeat until nothing can be added to A+
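The loop above translates almost directly into code. In this sketch, FDs are represented as (LHS, RHS) pairs of Python sets, which is a representation chosen for illustration, and the input is the small relation R(A, B, C) with A → B and B → C.

```python
def closure(attrs, fds):
    """Compute the closure of `attrs` under the FDs in `fds`."""
    result = set(attrs)                 # initialize A+ = A
    changed = True
    while changed:                      # repeat until nothing can be added
        changed = False
        for lhs, rhs in fds:
            # FD applies if its LHS is inside A+ and it adds something new
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


S = [({"A"}, {"B"}), ({"B"}, {"C"})]
print(sorted(closure({"A"}, S)))  # ['A', 'B', 'C']
print(sorted(closure({"B"}, S)))  # ['B', 'C']
print(sorted(closure({"C"}, S)))  # ['C']
```

Each pass either adds at least one attribute or stops, so the loop runs at most once per attribute of the table.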

Computing A+ Example
R( A, B, C, D, E, F )
S: AB → C, BC → AD, D → E, CF → B
Is {A,B} a key ??
How do we find out the key(s) from R??
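Both questions can be explored with attribute closure. The sketch below reads the FD set as AB → C, BC → AD, D → E, CF → B (an assumption about the slide's intended arrows) and finds keys by brute force: it tries attribute subsets from smallest to largest and keeps those whose closure is all of R, skipping supersets of keys already found. The function names are illustrative, and the search is exponential, so it only suits small schemas.

```python
from itertools import combinations


def closure(attrs, fds):
    """Attributes functionally determined by `attrs` under the FDs in `fds`."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


def candidate_keys(relation, fds):
    """Return all minimal attribute sets whose closure covers `relation`."""
    keys = []
    for size in range(1, len(relation) + 1):
        for combo in combinations(sorted(relation), size):
            candidate = set(combo)
            if any(key <= candidate for key in keys):
                continue  # a superset of a known key is not minimal
            if closure(candidate, fds) >= relation:
                keys.append(candidate)
    return keys


R = set("ABCDEF")
S = [(set("AB"), set("C")), (set("BC"), set("AD")),
     (set("D"), set("E")), (set("CF"), set("B"))]

print(sorted(closure(set("AB"), S)))              # ['A', 'B', 'C', 'D', 'E']
print([sorted(k) for k in candidate_keys(R, S)])  # [['C', 'F'], ['A', 'B', 'F']]
```

Under this reading, {A,B}+ does not contain F, so {A,B} is not a key. F appears on no right-hand side, so every key must include it.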

Example: BCNF Decomposition
class_records:

id  name  address        assignment  due         grade
1   John  123 Main St.   HW1         2009-06-22  A-
1   John  123 Main St.   HW2         2009-07-10  B
2   Jane  456 State St.                          A

Decomposition ??

Motivation for 3NF
FDs: {street_address, city} → zip and {zip} → city
Decomposing (street_address, city, zip) into BCNF gives (street_address, zip) and (city, zip).
We lose the FD {street_address, city} → zip after decomposition; in other words, it becomes unenforceable.

An Unenforceable FD
Before decomposition:

street        city       zip
545 Tech Sq.  Cambridge  02138
545 Tech Sq.  Cambridge  02139

A data error like this can be detected.

After decomposition:

street        zip
545 Tech Sq.  02138
545 Tech Sq.  02139

city       zip
Cambridge  02138
Cambridge  02139

The same data error can no longer be detected.
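The effect can be reproduced with a few lines of code. This sketch uses the two rows from the example; the helper names `satisfies_fd` and `project` are illustrative assumptions. It checks {street, city} → zip on the original table, then checks the only FD that still fits within a single table after decomposition, {zip} → city.

```python
def satisfies_fd(rows, lhs, rhs):
    """Return True if no two rows agree on `lhs` but differ on `rhs`."""
    seen = {}
    for row in rows:
        key = tuple(row[a] for a in lhs)
        value = tuple(row[a] for a in rhs)
        if seen.setdefault(key, value) != value:
            return False
    return True


def project(rows, attrs):
    """Project rows onto `attrs`, dropping duplicate rows."""
    unique = {tuple(row[a] for a in attrs) for row in rows}
    return [dict(zip(attrs, values)) for values in sorted(unique)]


before = [
    {"street": "545 Tech Sq.", "city": "Cambridge", "zip": "02138"},
    {"street": "545 Tech Sq.", "city": "Cambridge", "zip": "02139"},
]

# Before decomposition the error violates {street, city} -> zip.
print(satisfies_fd(before, ["street", "city"], ["zip"]))  # False

# After decomposition neither table can express that FD.
street_zip = project(before, ["street", "zip"])
city_zip = project(before, ["city", "zip"])
print(satisfies_fd(city_zip, ["zip"], ["city"]))          # True
print(len(street_zip), len(city_zip))                     # 2 2
```

The (street, zip) table has no FD of its own to check, so both rows are accepted and the inconsistency only reappears when the tables are joined.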

Third Normal Form (3NF)
A table R is in 3NF if for every nontrivial FD A → B in R, A is a super key of R or B is part of a key of R.
Every table in BCNF is also in 3NF.
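The 3NF test differs from the BCNF test only in the extra escape clause for attributes that belong to some key. A minimal sketch, assuming FDs are given as (LHS, RHS) pairs of sets and using a brute-force key search that is only practical for small schemas; all function names are illustrative. The input is the address schema with {street_address, city} → zip and {zip} → city.

```python
from itertools import combinations


def closure(attrs, fds):
    """Attributes functionally determined by `attrs` under the FDs in `fds`."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


def candidate_keys(relation, fds):
    """Return all minimal attribute sets whose closure covers `relation`."""
    keys = []
    for size in range(1, len(relation) + 1):
        for combo in combinations(sorted(relation), size):
            candidate = set(combo)
            if any(key <= candidate for key in keys):
                continue
            if closure(candidate, fds) >= relation:
                keys.append(candidate)
    return keys


def is_bcnf(relation, fds):
    """Every nontrivial FD must have a super key on its left-hand side."""
    return all(rhs <= lhs or closure(lhs, fds) >= relation for lhs, rhs in fds)


def is_3nf(relation, fds):
    """Like BCNF, but an FD is also fine if it only adds key attributes."""
    prime = set().union(*candidate_keys(relation, fds))
    for lhs, rhs in fds:
        if closure(lhs, fds) >= relation:
            continue                      # LHS is a super key
        if not (rhs - lhs) <= prime:
            return False                  # determines a non-key attribute
    return True


R = {"street_address", "city", "zip"}
S = [({"street_address", "city"}, {"zip"}), ({"zip"}, {"city"})]
print(is_bcnf(R, S))  # False
print(is_3nf(R, S))   # True
```

Here {zip} → city breaks BCNF because zip alone is not a super key, but city belongs to the key {street_address, city}, so the table is still in 3NF.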

Schema Design
Problem Description → ER Design → ER Diagram → ER to Relational Conversion → Relational Schema → BCNF or 3NF?
- N: Normalization (i.e. Decomposition)
- Y: Good Relational Schema