Design Theory for Relational Databases 2015, Fall Pusan National University Ki-Joune Li.



Properties of Table
When we design a relational DB
o It is a set of relations.
o Relations can be derived from a UML diagram.
But NOT all relations are correct.
o We should carefully observe the properties of a table:
  - Functional Dependency
  - Key
  - Decomposition of Table

Definition of Functional Dependency
FD (Functional Dependency) on a Relation R
o A1 A2 A3 … An → B holds on R iff any two tuples of R that agree on all of A1, A2, A3, …, An also agree on B, where A1, A2, A3, …, An, B are attributes of R
o The set of attributes A1 A2 A3 … An functionally determines B
o More than one B: the FDs
  - A1 A2 A3 … An → B1
  - A1 A2 A3 … An → B2
  - …
  - A1 A2 A3 … An → Bk
  are written together as A1 A2 A3 … An → B1 B2 … Bk

Functional Dependency: Example
A Relation
o Movies (title, year, length, filmType, studioName, starName)
  - (title, year) → length
  - (title, year) → filmType
  - (title, year) → studioName
  - combined: (title, year) → length filmType studioName
  - (title, year) → starName ? This does not hold: there is more than one star in a film
It is important to discover the FDs in a relation
o They help to decide the correctness of the relation design.
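The definition can be tried out on a small relation instance. The sketch below is only an illustration: the helper name `fd_holds` and the dictionary layout are assumptions, and the rows are a subset of the Movies data shown later in the slides. Note that a check on sample data can refute an FD but cannot prove that it holds for every legal instance.

```python
def fd_holds(rows, lhs, rhs):
    """Return True if all rows that agree on lhs also agree on rhs."""
    seen = {}
    for row in rows:
        key = tuple(row[a] for a in lhs)
        val = tuple(row[b] for b in rhs)
        # setdefault stores the first rhs value seen for this lhs value
        if seen.setdefault(key, val) != val:
            return False
    return True

# Hypothetical sample instance (subset of the Movies table in the slides)
movies = [
    {"title": "Star Wars", "year": 1977, "studioName": "Fox", "starName": "Carrie Fisher"},
    {"title": "Star Wars", "year": 1977, "studioName": "Fox", "starName": "Mark Hamill"},
    {"title": "Wayne's World", "year": 1992, "studioName": "Paramount", "starName": "Dana Carvey"},
]

print(fd_holds(movies, ["title", "year"], ["studioName"]))  # True
print(fd_holds(movies, ["title", "year"], ["starName"]))    # False
```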

Key
Given a relation R
o A set of one or more attributes {A1, A2, A3, …, An} is a KEY iff
  - the set functionally determines all other attributes of R, and
  - no proper subset of {A1, A2, A3, …, An} functionally determines all other attributes (minimality)
o Primary Key
  - If a relation has more than one key, one of them is chosen as the primary key
o Super Key
  - A set of attributes containing a key
  - No minimality condition
Example
o Movies (title, year, length, filmType, studioName, starName)
o What are the keys?

How to discover keys
From E-R Diagram: Underlined Attributes
o Keys are defined based on our understanding of the real world
o Example: Movies (title, year, length, filmType, studioName, starName)
  - (year, starName) is not a key if a star can make more than one film per year
  - (year, starName) is a key if a star is allowed to make only one film per year
Relation (A1, A2, B) for a relationship between R1 and R2
o One-One
o One-Many
o Many-One
o Many-Many

Rules about Functional Dependencies
Functional Dependency
o An important property of a Relation (or Table)
o Some interesting properties or rules of FDs
Transitive Rule
o If A → B and B → C, then A → C
Splitting/Combining Rule
o A1 A2 A3 … An → B1, A1 A2 A3 … An → B2, …, A1 A2 A3 … An → Bk iff A1 A2 A3 … An → B1 B2 … Bk
Trivial FD Rule: Given an FD A1 A2 A3 … An → B
o The FD is trivial if B is one of {A1, A2, A3, …, An}
o The FD is completely non-trivial if B is not in {A1, A2, A3, …, An}

Rules about Functional Dependencies
Trivial Dependency Rule
o A1 A2 … An → B1 B2 … Bm is equivalent to A1 A2 … An → C1 C2 … Ck, where {C1, C2, …, Ck} ⊆ {B1, B2, …, Bm} consists of exactly those B's that are not in {A1, A2, …, An}
o Example: (year, title) → (studioName, year) is equivalent to (year, title) → studioName; year on the right side is unnecessary

Armstrong's Axioms
Reflexivity (Trivial FD)
o If {C1, C2, …, Ck} ⊆ {B1, B2, …, Bm}, then B1 B2 … Bm → C1 C2 … Ck
Augmentation
o If A1 A2 … An → B1 B2 … Bm, then A1 A2 … An C1 C2 … Ck → B1 B2 … Bm C1 C2 … Ck
Transitivity
o If A1 A2 … An → B1 B2 … Bm and B1 B2 … Bm → C1 C2 … Ck, then A1 A2 … An → C1 C2 … Ck
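As a worked illustration that is not part of the original slides, the splitting/combining rule stated earlier can be derived from these three axioms. Here X abbreviates A1 A2 … An, and two right-side attributes B1, B2 are used for brevity.

```latex
\begin{align*}
&\text{Combining: given } X \to B_1 \text{ and } X \to B_2 \\
&X \to X B_1 && \text{augment } X \to B_1 \text{ with } X \\
&X B_1 \to B_1 B_2 && \text{augment } X \to B_2 \text{ with } B_1 \\
&X \to B_1 B_2 && \text{transitivity} \\[4pt]
&\text{Splitting: given } X \to B_1 B_2 \\
&B_1 B_2 \to B_1 && \text{reflexivity} \\
&X \to B_1 && \text{transitivity}
\end{align*}
```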

Closure of Attributes
Closure: {A1, A2, …, An}+
o {A1, A2, …, An} is a set of attributes and S is a set of FDs
o Closure of {A1, A2, …, An} under the FDs in S: the set of attributes B such that A1 A2 … An → B follows from S
o That is, if B1, B2, …, Bk are all the attributes for which we can derive
  - A1 A2 … An → B1
  - A1 A2 … An → B2
  - ...
  - A1 A2 … An → Bk
  then {A1, A2, …, An}+ = {B1, B2, …, Bk}

Algorithm to Find Closure
Input: set of attributes {A1, A2, …, An} and set S of FDs
Output: {A1, A2, …, An}+
Process
1. Split the FDs so that each FD has a single attribute on the right, e.g. split A1 A2 → B C into A1 A2 → B and A1 A2 → C
2. Initialize X = {A1, A2, …, An}
3. Search for some FD B1 B2 ... Bm → C such that B1, B2, ..., Bm are in X but C is not in X, and add C to X
4. Repeat 3 until there is no more attribute to add to X; the final X is the closure
Example
o Given attributes A, B, C, D, E, and F
o S: A B → C, B C → A D, D → E, and C F → B
o What is {A, B}+ ?
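The steps above can be sketched in a few lines of Python. This is a minimal sketch, not the lecture's own code: the function name `closure` and the representation of an FD as a (left side, right side) pair of attribute strings are assumptions. The explicit splitting step is skipped because the right side is handled as a set.

```python
def closure(attrs, fds):
    """Closure of attrs under fds; each FD is a (lhs, rhs) pair of attribute collections."""
    result = set(attrs)                 # step 2: X starts as the given attributes
    changed = True
    while changed:                      # step 4: repeat until nothing can be added
        changed = False
        for lhs, rhs in fds:
            # step 3: left side inside X, right side not yet fully inside X
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

# The FD set S from the example; single-letter attribute names
S = [("AB", "C"), ("BC", "AD"), ("D", "E"), ("CF", "B")]
print(sorted(closure("AB", S)))  # ['A', 'B', 'C', 'D', 'E']
```

F is never added because no FD has F on its right side, so C F → B never fires.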

Closure and Key
If {A1, A2, …, An}+ is the set of all attributes of relation R, then {A1, A2, …, An} is a super key
o Example: R (A, B, C, D, E) and S: A B → C, B C → A D, D → E; then {A, B}+ = {A, B, C, D, E}, which is all attributes of R, so {A, B} is a super key of R.
If no attribute can be removed from the set without losing this property, then it is a key.
o Example: {A}+ = {A} and {B}+ = {B}, and neither is {A, B, C, D, E}. No attribute can be removed from {A, B}, therefore {A, B} is a key.
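The closure test suggests a brute-force way to list every key of a small relation: try attribute sets in order of size and keep those whose closure is all of R and that contain no smaller key. The sketch below assumes hypothetical helper names (`closure`, `candidate_keys`) and is exponential in the number of attributes, so it is only meant for classroom-sized examples.

```python
from itertools import combinations

def closure(attrs, fds):
    """Closure of attrs under fds; each FD is a (lhs, rhs) pair."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

def candidate_keys(attributes, fds):
    """All minimal attribute sets whose closure is the whole relation."""
    attributes = set(attributes)
    keys = []
    for size in range(1, len(attributes) + 1):
        for combo in combinations(sorted(attributes), size):
            candidate = set(combo)
            if any(k <= candidate for k in keys):
                continue  # contains a smaller key: a super key, not minimal
            if closure(candidate, fds) == attributes:
                keys.append(candidate)
    return keys

S = [("AB", "C"), ("BC", "AD"), ("D", "E")]
print([sorted(k) for k in candidate_keys("ABCDE", S)])  # [['A', 'B'], ['B', 'C']]
```

For this R the search reports {B, C} as a second key besides {A, B}, since B C → A D already brings in A.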

Closing Set of Functional Dependencies
Closing Set of FD set S:
o Basis T of S: if we can derive S from T (and every FD in T follows from S), then T is a basis of S.
o Remove all duplicated FDs
o A minimal basis B satisfies three conditions
  - All the FDs in B have one attribute on the right side
  - If any FD is removed from B, the result is no longer a basis
  - If, for any FD in B, we remove one or more attributes from the left side, the result is no longer a basis
Example
o For S = {A → B, A → C, B → A, B → C, C → A, C → B}, what is a minimal basis of S? Is it {AB → C, AC → B, BC → A}?
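Whether a proposed set T is a basis of S can be checked with attribute closures: every FD of S must follow from T and vice versa. The following is a sketch under assumptions (the helper names `implies` and `is_basis` are made up for illustration), applied to the example S above.

```python
def closure(attrs, fds):
    """Closure of attrs under fds; each FD is a (lhs, rhs) pair."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

def implies(fds, lhs, rhs):
    """True if lhs -> rhs follows from fds."""
    return set(rhs) <= closure(lhs, fds)

def is_basis(T, S):
    """T is a basis of S if each set implies every FD of the other."""
    return (all(implies(T, l, r) for l, r in S)
            and all(implies(S, l, r) for l, r in T))

S = [("A", "B"), ("A", "C"), ("B", "A"), ("B", "C"), ("C", "A"), ("C", "B")]
proposed = [("AB", "C"), ("AC", "B"), ("BC", "A")]
cycle = [("A", "B"), ("B", "C"), ("C", "A")]

print(is_basis(proposed, S))  # False: A -> B cannot be derived from it
print(is_basis(cycle, S))     # True
# Dropping any single FD from the cycle breaks it, so it is minimal
print(all(not is_basis(cycle[:i] + cycle[i + 1:], S) for i in range(len(cycle))))  # True
```

Under these checks the proposed set is not a basis at all, while the cycle A → B, B → C, C → A is one minimal basis; other minimal bases exist.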

Bad Design: Anomalies
Bad Design: Example

Title          Year  Length  Film Type  StudioName  StarName
Star Wars      1977          Color      Fox         Carrie Fisher
Star Wars      1977          Color      Fox         Mark Hamill
Star Wars      1977          Color      Fox         Harrison Ford
Mighty Ducks   1991          Color      Disney      Emilio Estevez
Wayne’s World  1992  95      Color      Paramount   Dana Carvey
Wayne’s World  1992  95      Color      Paramount   Mike Meyers

Problems of this design
o Redundancy
o Update Anomaly
o Deletion Anomaly

Decomposing Relations
Decomposition of a Bad Relation
o A good way to remove the problems of bad relations
Decomposition: Lossless Decomposition
o {A1, A2, …, An} → {B1, B2, …, Bm}, {C1, C2, …, Ck} such that
  - {B1, B2, …, Bm} ∪ {C1, C2, …, Ck} = {A1, A2, …, An} and
  - {B1, B2, …, Bm} ∩ {C1, C2, …, Ck} ≠ {}

Decomposing Relations: Example
R = {title, year, length, filmType, studioName, starName}
→ R1 = {title, year, length, filmType, studioName}, R2 = {title, year, starName}

R1
Title          Year  Length  Film Type  StudioName
Star Wars      1977          Color      Fox
Mighty Ducks   1991          Color      Disney
Wayne’s World  1992  95      Color      Paramount

R2
Title          Year  StarName
Star Wars      1977  Carrie Fisher
Star Wars      1977  Mark Hamill
Star Wars      1977  Harrison Ford
Mighty Ducks   1991  Emilio Estevez
Wayne’s World  1992  Dana Carvey
Wayne’s World  1992  Mike Meyers

Problems addressed by the decomposition
o Redundancy
o Update Anomaly
o Deletion Anomaly
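That this decomposition loses no information can be illustrated by projecting the original rows onto R1 and R2 and joining them back. The sketch below is an assumption-laden illustration: `project`, `natural_join`, and `canon` are hypothetical helpers, rows are plain dictionaries, and the length column is left out to keep the rows short.

```python
def project(rows, attrs):
    """Projection onto attrs with duplicate elimination."""
    out = []
    for row in rows:
        p = {a: row[a] for a in attrs}
        if p not in out:
            out.append(p)
    return out

def natural_join(r, s):
    """Join rows of r and s that agree on all shared attributes."""
    out = []
    for x in r:
        for y in s:
            if all(x[a] == y[a] for a in set(x) & set(y)):
                merged = {**x, **y}
                if merged not in out:
                    out.append(merged)
    return out

def canon(rows):
    """Order-independent form of a relation, for comparison."""
    return sorted(tuple(sorted(r.items())) for r in rows)

data = [
    ("Star Wars", 1977, "Color", "Fox", "Carrie Fisher"),
    ("Star Wars", 1977, "Color", "Fox", "Mark Hamill"),
    ("Star Wars", 1977, "Color", "Fox", "Harrison Ford"),
    ("Mighty Ducks", 1991, "Color", "Disney", "Emilio Estevez"),
    ("Wayne's World", 1992, "Color", "Paramount", "Dana Carvey"),
    ("Wayne's World", 1992, "Color", "Paramount", "Mike Meyers"),
]
cols = ["title", "year", "filmType", "studioName", "starName"]
movies = [dict(zip(cols, row)) for row in data]

r1 = project(movies, ["title", "year", "filmType", "studioName"])
r2 = project(movies, ["title", "year", "starName"])
print(len(r1), len(r2))                              # 3 6
print(canon(natural_join(r1, r2)) == canon(movies))  # True
```

Each film's type and studio are now stored once in R1, and the natural join on (title, year) rebuilds the six original rows.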

Normal Form: Conditions for Good Relation
o 1st Normal Form (1NF)
o 2nd Normal Form (2NF)
o 3rd Normal Form (3NF)
o Boyce-Codd Normal Form (BCNF)

1st Normal Form
1NF: Every component of a relation should be ATOMIC
o No Table in a component
o No Set
o No List, etc.

2nd Normal Form
2NF
o 1NF and
o No non-prime attribute of the relation is functionally dependent on a part (a proper subset) of a candidate key, i.e. there is no partial dependency of a non-prime attribute
Example
o Player (Team, Number, TeamAddress, Name, Position)
o 1NF but not 2NF

Example
Player (Team, Number, TeamAddress, Name, Position)
o FD1: Team, Number → Name, Position
o FD2: Team → TeamAddress
o Key: {Team, Number}, since {Team, Number}+ = {Team, Number, TeamAddress, Name, Position}
o In FD2, TeamAddress (a non-prime attribute) is dependent on {Team}, which is a proper subset of the key: 2NF violation
Should be decomposed
o R1(Team, Number, Name, Position) and R2(Team, TeamAddress)
o R1 ⋈ R2 = R

Example

Employee  Skill           Current Work Location
Jones     Typing          114 Main Street
Jones     Shorthand       114 Main Street
Jones     Whittling       114 Main Street
Roberts   Light Cleaning  73 Industrial Way
Ellis     Alchemy         73 Industrial Way
Ellis     Juggling        73 Industrial Way
Harrison  Light Cleaning  73 Industrial Way

Candidate Key: {Employee, Skill}
Not 2NF
o Partial FD: Employee → Current Work Location
o Should be decomposed into (Employee, Skill), (Employee, Current Work Location)
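A 2NF check of this kind can be sketched with closures: for every proper subset of a candidate key, look for non-prime attributes in its closure. The helper names and the short attribute name `Location` (for Current Work Location) are assumptions made for this illustration, and the candidate keys are supplied by hand.

```python
from itertools import combinations

def closure(attrs, fds):
    """Closure of attrs under fds; each FD is a (lhs, rhs) pair of sets."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

def partial_dependencies(keys, fds):
    """(part of a key, non-prime attribute) pairs that violate 2NF."""
    prime = set().union(*keys)            # attributes that occur in some key
    found = []
    for key in keys:
        for size in range(1, len(key)):   # proper, non-empty subsets only
            for part in combinations(sorted(key), size):
                determined = closure(part, fds) - set(part) - prime
                for attr in sorted(determined):
                    found.append((set(part), attr))
    return found

keys = [{"Employee", "Skill"}]
fds = [({"Employee"}, {"Location"})]
print(partial_dependencies(keys, fds))  # [({'Employee'}, 'Location')]
```

Applied to the two decomposed relations, (Employee, Skill) with no FDs and (Employee, Location) with key {Employee}, the same function returns an empty list.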

3rd Normal Form
3NF: 2NF and every non-prime attribute of the relation must be non-transitively dependent on every candidate key
Example
o Team (TeamName, Address, ManagerID, ManagerHireDate)
o FD:
  - TeamName → Address, TeamName → ManagerID
  - (TeamName →) ManagerID → ManagerHireDate
o Key: {TeamName}
o 2NF but not 3NF
o To be decomposed into (TeamName, Address, ManagerID), (ManagerID, ManagerHireDate)

Example: 2NF but NOT 3NF

Tournament            Year  Winner          Winner Date of Birth
Indiana Invitational  1998  Al Fredrickson  21 July 1975
Cleveland Open        1999  Bob Albertson   28 September 1968
Des Moines Masters    1999  Al Fredrickson  21 July 1975
Indiana Invitational  1999  Chip Masterson  14 March 1977

Candidate Key: {Tournament, Year}
2NF: No Partial Dependency
Not 3NF
o Transitive Functional Dependency: {Tournament, Year} → Winner → Winner Date of Birth
o Should be decomposed into (Tournament, Year, Winner), (Winner, Winner Date of Birth)
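One common way to test 3NF mechanically is to look for an FD whose left side is not a super key and whose right side contains a non-prime attribute. The sketch below uses that formulation; the function names and the short attribute name `WinnerDOB` are assumptions, and the candidate keys are passed in by hand.

```python
def closure(attrs, fds):
    """Closure of attrs under fds; each FD is a (lhs, rhs) pair of sets."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

def third_nf_violations(attributes, keys, fds):
    """(lhs, attribute) pairs where lhs is no super key and attribute is non-prime."""
    attributes = set(attributes)
    prime = set().union(*keys)
    bad = []
    for lhs, rhs in fds:
        lhs = set(lhs)
        if closure(lhs, fds) == attributes:
            continue                      # left side is a super key: fine
        for attr in sorted(set(rhs) - lhs):
            if attr not in prime:
                bad.append((lhs, attr))
    return bad

R = {"Tournament", "Year", "Winner", "WinnerDOB"}
keys = [{"Tournament", "Year"}]
fds = [({"Tournament", "Year"}, {"Winner"}), ({"Winner"}, {"WinnerDOB"})]
print(third_nf_violations(R, keys, fds))  # [({'Winner'}, 'WinnerDOB')]
```

After the decomposition, (Winner, WinnerDOB) has key {Winner}, so Winner → WinnerDOB no longer counts as a violation there.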

Boyce-Codd Normal Form (BCNF)
BCNF: For every one of its non-trivial functional dependencies X → Y, X is a super key
o Remember: non-trivial means Y is not a subset of X.
o Remember: a super key is any superset of a key (not necessarily a proper superset)
BCNF is slightly stronger than 3NF

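Relationship between 1NF, 2NF, 3NF and BCNF
o BCNF ⊂ 3NF ⊂ 2NF ⊂ 1NF
o Every relation in BCNF is in 3NF, every relation in 3NF is in 2NF, and every relation in 2NF is in 1NF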

Example: 3NF but NOT BCNF
A table to show the assignment of students, with columns (Prof. ID, Prof. SS ID, Student ID)
Candidate Keys
o {Prof. ID, Student ID}
o {Prof. SS ID, Student ID}
1NF
2NF: no partial FD of a non-prime attribute on a candidate key
3NF: no transitive FD
NOT BCNF
o Prof. ID → Prof. SS ID is a functional dependency, but Prof. ID is not a super key
o Should be decomposed into (Prof. ID, Student ID), (Prof. ID, Prof. SS ID)
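The BCNF condition can be checked directly from the definition: a non-trivial FD violates BCNF when the closure of its left side is not the whole relation. In the sketch below the function names and the attribute spellings are assumptions, and the reverse FD ProfSSID → ProfID is also an assumption, added so that both candidate keys listed above hold.

```python
def closure(attrs, fds):
    """Closure of attrs under fds; each FD is a (lhs, rhs) pair of sets."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

def bcnf_violations(attributes, fds):
    """Non-trivial FDs whose left side is not a super key."""
    attributes = set(attributes)
    bad = []
    for lhs, rhs in fds:
        lhs, rhs = set(lhs), set(rhs)
        if rhs <= lhs:
            continue                          # trivial FD
        if closure(lhs, fds) != attributes:   # lhs is not a super key
            bad.append((lhs, rhs))
    return bad

R = {"ProfID", "ProfSSID", "StudentID"}
fds = [({"ProfID"}, {"ProfSSID"}),
       ({"ProfSSID"}, {"ProfID"})]   # assumed reverse FD, see lead-in
print(len(bcnf_violations(R, fds)))                       # 2
print(bcnf_violations({"ProfID", "ProfSSID"}, fds))       # []
print(bcnf_violations({"ProfID", "StudentID"}, []))       # []
```

Both decomposed relations pass the check, which matches the decomposition proposed in the slide.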

Decomposition
Three Conditions
o Elimination of Anomalies
  - Update
  - Redundancy
  - Deletion
o Lossless Decomposition
  - The original relation can be recovered by Natural Join
o Preservation of Dependencies
Relation with two attributes: Always in BCNF (why?)

BCNF Decomposition Algorithm
Algorithm
o Input: Relation R0 and set S0 of FDs
o Output: R1, R2, …, Rn such that R0 = R1 ⋈ R2 ⋈ … ⋈ Rn
o Process
  1. Check whether R0 is in BCNF; if so, return R0
  2. If there is any BCNF violation X → Y, then compute X+. Then R1 = X+ and R2 = X together with the rest of the attributes (those not in X+)
  3. Decompose FD set S0 into S1 and S2
  4. Repeat 1-3 until there is no more BCNF violation.
Example
o Team (TeamName, Address, ManagerID, ManagerHireDate)
o FD:
  - TeamName → Address, TeamName → ManagerID
  - ManagerID → ManagerHireDate
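The algorithm can be sketched recursively. This is one possible reading, not the lecture's own code: instead of explicitly projecting the FD set onto each sub-relation (step 3), the sketch searches every attribute subset X of the current relation and uses closures under the original FDs, which is exponential and only suitable for small examples. All function names are hypothetical.

```python
from itertools import combinations

def closure(attrs, fds):
    """Closure of attrs under fds; each FD is a (lhs, rhs) pair of sets."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

def find_violation(rel, fds):
    """Return (X, X+ within rel) for some X that breaks BCNF in rel, else None."""
    for size in range(1, len(rel)):
        for combo in combinations(sorted(rel), size):
            x = set(combo)
            x_plus = closure(x, fds) & rel
            # X determines something new inside rel but is not a super key of rel
            if x_plus != x and x_plus != rel:
                return x, x_plus
    return None

def bcnf_decompose(rel, fds):
    rel = set(rel)
    hit = find_violation(rel, fds)
    if hit is None:
        return [rel]                      # step 1: already in BCNF
    x, x_plus = hit
    r1 = x_plus                           # step 2: R1 = X+
    r2 = x | (rel - x_plus)               #         R2 = X and the remaining attributes
    return bcnf_decompose(r1, fds) + bcnf_decompose(r2, fds)   # step 4

team = {"TeamName", "Address", "ManagerID", "ManagerHireDate"}
fds = [({"TeamName"}, {"Address"}),
       ({"TeamName"}, {"ManagerID"}),
       ({"ManagerID"}, {"ManagerHireDate"})]
for r in bcnf_decompose(team, fds):
    print(sorted(r))
# ['ManagerHireDate', 'ManagerID']
# ['Address', 'ManagerID', 'TeamName']
```

For the Team example the violation ManagerID → ManagerHireDate is found first, and the result matches the decomposition given for the 3NF example.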