Introduction to Schema Refinement


Normal Forms

Types of normal form:
- First Normal Form (1NF)
- Second Normal Form (2NF)
- Third Normal Form (3NF)
- Boyce-Codd Normal Form (BCNF)
- Fourth Normal Form (4NF)
- Fifth Normal Form (5NF)

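First Normal Form (1NF)

A row of data cannot contain a repeating group of data; that is, every value must be atomic. The table below is not in 1NF because the Subject column holds more than one value for a student:

Cid  Name  Subject
101  Jeet  PHY, CHE
102  Seet  PHY
103  Swet  SOCIAL

Giving each subject its own row makes the values atomic, but the student Jeet is then used twice in the table and the subject PHY is repeated. Another method is to divide the relation into two:

Subid  Cid  Subject
1      101  PHY
2      101  CHE
3      102  PHY
4      103  SOCIAL

Cid  Cname
101  Jeet
102  Seet
103  Swet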
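The two repairs described for 1NF can be sketched in a few lines of Python. This is only an illustration: the function names are made up for this example, and the subject PHY for Cid 102 is an assumption inferred from the remark that PHY is repeated.

```python
# Unnormalised rows: the third field is a repeating group of subjects.
# Assumption: Cid 102 takes PHY (the slide text only says that PHY repeats).
unnormalised = [
    (101, "Jeet", ["PHY", "CHE"]),
    (102, "Seet", ["PHY"]),
    (103, "Swet", ["SOCIAL"]),
]

def to_1nf(rows):
    """Emit one row per (Cid, Name, Subject) so that every value is atomic."""
    return [(cid, name, subject)
            for cid, name, subjects in rows
            for subject in subjects]

def split_in_two(rows):
    """Alternative repair: a student table plus a (Subid, Cid, Subject) table."""
    students = sorted({(cid, name) for cid, name, _ in rows})
    subjects = [(subid, cid, subject)
                for subid, (cid, _, subject) in enumerate(to_1nf(rows), start=1)]
    return students, subjects

flat = to_1nf(unnormalised)
students, subjects = split_in_two(unnormalised)
```

In `flat` the name Jeet appears on two rows, which is the redundancy the second repair removes by keeping names in their own table.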

Second Normal Form (2NF)

A relation is in 2NF if it is in 1NF and every non-primary-key attribute is fully functionally dependent on the primary key.
- 2NF does not permit partial dependencies: no non-key attribute may depend on only part of the primary key.
- If the primary key consists of a single attribute, a relation that is in 1NF is automatically in 2NF.

Second Normal Form (2NF): example

Consider the following relation, whose primary key is (Cid, Order_id):

Cid  Cname  Order_id  Order_name  Sale_details
101  Adam   10        Order1      Sale1
101  Adam   11        Order2      Sale2
102  Alex   12        Order3      Sale3
103  Sumo   13        Order4      Sale4

It is in 1NF but not in 2NF, because some columns depend on only part of the primary key:
- Cname depends only on Cid.
- Order_name depends only on Order_id.
- There is no link between Cname and Sale_details.

To reduce this table to 2NF, break it into three tables:

Cid  Cname
101  Adam
102  Alex
103  Sumo

Order_id  Order_name
10        Order1
11        Order2
12        Order3
13        Order4

Cid  Order_id  Sale_details
101  10        Sale1
101  11        Sale2
102  12        Sale3
103  13        Sale4
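Partial dependencies can be found mechanically from the declared functional dependencies. The sketch below is a minimal illustration, assuming (Cid, Order_id) is the only candidate key and writing the three dependencies of the example as Python sets; `closure` and `partial_dependencies` are names chosen for this example.

```python
from itertools import combinations

def closure(attrs, fds):
    """Attribute closure: everything determined by attrs under the given FDs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result

def partial_dependencies(key, fds):
    """Map each proper subset of the key to the non-key attributes it determines."""
    found = {}
    for size in range(1, len(key)):
        for subset in combinations(sorted(key), size):
            dependents = closure(subset, fds) - set(key)
            if dependents:
                found[subset] = dependents
    return found

# Dependencies taken from the example; the set encoding is an assumption.
fds = [
    ({"Cid"}, {"Cname"}),
    ({"Order_id"}, {"Order_name"}),
    ({"Cid", "Order_id"}, {"Sale_details"}),
]
key = {"Cid", "Order_id"}
partials = partial_dependencies(key, fds)
```

Each entry in `partials` corresponds to one of the smaller tables in the decomposition, while Sale_details stays with the full key.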

Second Normal Form (2NF): converting 1NF to 2NF

Order  Product  Customer  Address   Qty  Unit Price
s1     p1       Anish     bhopal    300  500
s1     p2       Anish     bhopal    100  600
s2     p3       Ammu      kollam    500  300
s3     p5       Sumo      kottayam  200  450

This relation is in 1NF because every value is atomic. To convert it to 2NF:
- First find the attributes that make up the primary key. No single attribute forms a primary key on its own.
- 2NF does not allow partial dependencies.
- Divide the relation above into three.

The three relations are R1, R2 and R3:

R1
Order  Customer  Address
s1     Anish     bhopal
s2     Ammu      kollam
s3     Sumo      kottayam

R2
Product  Unit Price
p1       500
p2       600
p3       300
p5       450

R3
Order  Product  Qty
s1     p1       300
s1     p2       100
s2     p3       500
s3     p5       200

Advantages:
- Insert: we can insert a row into R1 without supplying a product attribute.
- Delete: we can delete the pair (s1, p2) by deleting a row from R3 without losing the information held in R1.
- Update: the address of a given customer is written once in R1 rather than repeated many times, so it needs to be updated only once.

Third Normal Form (3NF)

3NF is based on the concept of transitive dependency; transitive dependencies are not allowed in 3NF. A transitive dependency arises when X → Y and Y → Z hold in a relation R, so that X → Z is also a functional dependency that holds on R. Here X, Y and Z are attributes of the table, and Y is neither a candidate key nor a subset of any key of R (that is, Y is not a prime attribute). In other words, every non-prime attribute of the table must depend directly on the primary key.

Example: Student3

Student3 has three FDs:
- FD1: stdid → grade
- FD2: stdid → marks
- FD3: marks → grade

Marks is not a prime attribute of Student3, so stdid → grade is a transitive dependency that follows from FD2 and FD3. This is not allowed in 3NF.

A relation R is in 3NF if it is in 2NF and no non-prime attribute of R is transitively dependent on the key of R. The relation schema Student3 is in 2NF, since there are no partial dependencies on a key, but it is not in 3NF because of the transitive dependency stdid → grade via marks. We can normalize Student3 by decomposing it into two 3NF relation schemas, Student3A and Student3B:

Student3A (stdid, branch, sem, rn, name, marks)
Student3B (marks, grade)
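The transitive dependency in Student3 can be detected from the three FDs alone. The following is a rough sketch, not a full 3NF test: it restricts the schema to stdid, marks and grade, assumes stdid is the only key, and uses helper names invented for this example.

```python
def closure(attrs, fds):
    """Attribute closure: everything determined by attrs under the given FDs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result

def transitive_dependencies(key, attributes, fds):
    """List (Y, Z) pairs where Y -> Z, Y is not a superkey, and Z is non-prime.

    Assumes `key` is the only candidate key, so every attribute outside it
    is non-prime.
    """
    found = []
    for lhs, rhs in fds:
        if lhs <= key:
            continue  # dependency directly on (part of) the key
        if closure(lhs, fds) >= attributes:
            continue  # lhs is a superkey, so this FD is harmless
        for z in sorted(rhs - key - lhs):
            found.append((tuple(sorted(lhs)), z))
    return found

attributes = {"stdid", "marks", "grade"}
key = {"stdid"}
fds = [
    ({"stdid"}, {"grade"}),   # FD1
    ({"stdid"}, {"marks"}),   # FD2
    ({"marks"}, {"grade"}),   # FD3
]
violations = transitive_dependencies(key, attributes, fds)
```

The single pair in `violations`, marks determining grade, is exactly the dependency that is moved into Student3B.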

Boyce-Codd Normal Form (BCNF)

BCNF is a stricter form of 3NF: every relation in BCNF is also in 3NF, but a relation in 3NF may not be in BCNF. A relation schema R is in BCNF if, whenever a nontrivial functional dependency X → A holds in R, X is a superkey of R. The only difference between BCNF and 3NF is that condition (b) of 3NF, which allows X → A when A is a prime attribute, is absent from BCNF.

In this example the relation Lots1A is in 3NF but not in BCNF. FD5 violates BCNF because its left-hand side, area, is not a superkey. FD1 and FD2 satisfy BCNF because their left-hand sides are superkeys. To reach BCNF, remove the attribute county name from Lots1A and place it in another relation.
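The difference between the 3NF and BCNF tests can be sketched as follows. The attribute names and the exact form of FD1, FD2 and FD5 are assumptions, since the transcript does not list them; the functions are illustrative helpers, not a standard API.

```python
from itertools import combinations

def closure(attrs, fds):
    """Attribute closure: everything determined by attrs under the given FDs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result

def candidate_keys(attributes, fds):
    """Brute-force search for minimal attribute sets whose closure is everything."""
    keys = []
    for size in range(1, len(attributes) + 1):
        for combo in combinations(sorted(attributes), size):
            candidate = set(combo)
            if closure(candidate, fds) == attributes and not any(k <= candidate for k in keys):
                keys.append(candidate)
    return keys

def violations(attributes, fds, allow_prime_rhs):
    """FDs X -> A where X is not a superkey.

    allow_prime_rhs=True gives the 3NF test (A may be prime);
    allow_prime_rhs=False gives the stricter BCNF test.
    """
    prime = set().union(*candidate_keys(attributes, fds))
    bad = []
    for lhs, rhs in fds:
        if closure(lhs, fds) == attributes:
            continue
        for a in sorted(rhs - lhs):
            if allow_prime_rhs and a in prime:
                continue
            bad.append((tuple(sorted(lhs)), a))
    return bad

# Assumed schema and dependencies for Lots1A (not spelled out in the slides).
attributes = {"property_id", "county_name", "lot", "area"}
fds = [
    ({"property_id"}, {"county_name", "lot", "area"}),   # FD1
    ({"county_name", "lot"}, {"property_id", "area"}),   # FD2
    ({"area"}, {"county_name"}),                         # FD5
]
bcnf_violations = violations(attributes, fds, allow_prime_rhs=False)
third_nf_violations = violations(attributes, fds, allow_prime_rhs=True)
```

Under these assumptions the BCNF test reports only the dependency from area to county_name, while the 3NF test reports nothing because county_name is a prime attribute.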

Fourth Normal Form (4NF)

A relation is in 4NF if it is in BCNF and it has no nontrivial multivalued dependency.

Subject   Trainer             Textbooks
MCA       Aji, Jinson, Lisha  DBMS, JAVA
Computer  JK                  OS, C++

In this relation:
- Each subject has a well-defined set of trainers. For example, the MCA subject has 3 trainers.
- Each subject has a well-defined set of textbooks. For example, the MCA subject has 2 textbooks.
- The textbooks used for a given subject are independent of the trainers.

The table has been converted to a relation by filling in all of the empty rows:

Subject   Trainer  Textbooks
MCA       AJI      DBMS
MCA       AJI      JAVA
MCA       Jinson   DBMS
MCA       Jinson   JAVA
MCA       Lisha    DBMS
MCA       Lisha    JAVA
Computer  JK       OS
Computer  JK       C++

- The relation is in 1NF.
- The primary key of this relation consists of all three attributes.
- Since there are no determinants other than the primary key, this relation is actually in BCNF.

Suppose a new trainer comes to teach MCA: it is necessary to create 2 new tuples, one for each of the 2 textbooks. Note that it is not necessary to include all faculty. The decomposition cannot be made on the basis of functional dependencies, because there are no functional dependencies in the relation. So we introduce multivalued dependencies (MVDs) in the relation.

Multivalued dependencies and 4NF

subject →→ trainer
subject →→ textbook

Double arrows are used here; the first is read as "subject multidetermines trainer" or "trainer is multidependent on subject". A subject does not have a single corresponding trainer, so the functional dependency subject → trainer does not hold. But each subject has a well-defined set of corresponding trainers. Well defined here means that, for a given subject (MCA) and a given textbook (DBMS), the set of trainers (AJI, Jinson, Lisha) matching the pair (MCA, DBMS) in the relation depends on the value MCA alone. It makes no difference which particular textbook we choose. The second MVD can be interpreted in the same way.

Definition of multivalued dependency

Let R be a table, and let A, B, C be arbitrary subsets of the set of attributes of R. Then B is multidependent on A, written A →→ B, if and only if the set of B values matching a given (A value, C value) pair in R depends only on the A value and is independent of the C value. MVDs always go together in pairs: given the table R(A, B, C), the MVD A →→ B holds if and only if A →→ C also holds.
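The definition can be checked directly on a relation instance. The sketch below handles only the simple case where A, B and C are single columns; `mvd_holds` is a name invented for this example, and the rows assume that every MCA trainer is paired with every MCA textbook.

```python
from itertools import product

def mvd_holds(rows, a_idx, b_idx, c_idx):
    """Check A ->> B on rows of the form (A, B, C) given by column index.

    For each A value, the (B, C) pairs present must be the full cross
    product of its B values and its C values, so the set of B values
    does not depend on which C value is chosen.
    """
    groups = {}
    for row in rows:
        groups.setdefault(row[a_idx], set()).add((row[b_idx], row[c_idx]))
    for pairs in groups.values():
        b_values = {b for b, _ in pairs}
        c_values = {c for _, c in pairs}
        if pairs != set(product(b_values, c_values)):
            return False
    return True

# Assumed contents of the subject / trainer / textbook relation.
rows = [
    ("MCA", "AJI", "DBMS"), ("MCA", "AJI", "JAVA"),
    ("MCA", "Jinson", "DBMS"), ("MCA", "Jinson", "JAVA"),
    ("MCA", "Lisha", "DBMS"), ("MCA", "Lisha", "JAVA"),
    ("Computer", "JK", "OS"), ("Computer", "JK", "C++"),
]
subject_trainer = mvd_holds(rows, 0, 1, 2)    # subject ->> trainer
subject_textbook = mvd_holds(rows, 0, 2, 1)   # subject ->> textbook
```

Both checks succeed together, which matches the remark that MVDs come in pairs; dropping any single MCA row makes both fail.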

Fourth normal form

4NF is based on multivalued dependencies. A relation schema R is in 4NF with respect to a set of dependencies F (which includes FDs and MVDs) if, for every nontrivial multivalued dependency X →→ Y, X is a superkey of R. Consider the table:

Subject   Trainer  Textbooks
MCA       AJI      DBMS
MCA       AJI      JAVA
MCA       Jinson   DBMS
MCA       Jinson   JAVA
MCA       Lisha    DBMS
MCA       Lisha    JAVA
Computer  JK       OS
Computer  JK       C++

The relation CFX is not in fourth normal form, because the MVDs subject →→ textbook and subject →→ trainer satisfy neither of the two conditions of 4NF: they are not trivial, and subject is not a superkey. So we decompose it into two tables:

Subject   Trainer
MCA       AJI
MCA       Jinson
MCA       Lisha
Computer  JK

Subject   Textbooks
MCA       DBMS
MCA       JAVA
Computer  OS
Computer  C++
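A small Python sketch of this decomposition shows that joining the two tables gives back the original relation, and that a new trainer becomes a single row. The row contents are assumed to be the full subject, trainer and textbook combinations, and the trainer name "Maya" is hypothetical.

```python
# Assumed contents of the three-column relation (every trainer of a subject
# is paired with every textbook of that subject).
cfx = {
    ("MCA", "AJI", "DBMS"), ("MCA", "AJI", "JAVA"),
    ("MCA", "Jinson", "DBMS"), ("MCA", "Jinson", "JAVA"),
    ("MCA", "Lisha", "DBMS"), ("MCA", "Lisha", "JAVA"),
    ("Computer", "JK", "OS"), ("Computer", "JK", "C++"),
}

def decompose(rows):
    """Project onto (subject, trainer) and (subject, textbook)."""
    trainers = {(subject, trainer) for subject, trainer, _ in rows}
    textbooks = {(subject, book) for subject, _, book in rows}
    return trainers, textbooks

def rejoin(trainers, textbooks):
    """Natural join of the two projections on the shared subject column."""
    return {(subject, trainer, book)
            for subject, trainer in trainers
            for other_subject, book in textbooks
            if subject == other_subject}

trainers, textbooks = decompose(cfx)
round_trip = rejoin(trainers, textbooks)

# Adding a hypothetical new MCA trainer is one row in the decomposed design,
# but corresponds to one row per textbook in the original relation.
trainers_after = trainers | {("MCA", "Maya")}
new_rows = rejoin(trainers_after, textbooks) - cfx
```

Here `round_trip` equals `cfx`, and `new_rows` holds the two tuples that would have been needed in the undecomposed table.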

Lossless join decomposition

Consider the example table EMP:

ename  Pname  Dname
Smith  X      John
Smith  Y      Anna
Smith  X      Anna
Smith  Y      John

Suppose we decompose the EMP table into Emp_projects and Emp_dependents:

Emp_projects
ename  Pname
Smith  X
Smith  Y

Emp_dependents
ename  Dname
Smith  John
Smith  Anna

If we join these tables again, we get the original EMP table. So the decomposition of EMP into Emp_projects and Emp_dependents is a lossless join decomposition, because nothing is lost by the decomposition.
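The lossless join test can be written as "project, join, compare". The sketch below is a minimal illustration with helper names chosen for this example; it assumes EMP contains every project and dependent combination for Smith.

```python
def project(rows, attrs):
    """Projection onto attrs with duplicate rows removed; rows are dicts."""
    unique = {tuple((a, row[a]) for a in attrs) for row in rows}
    return [dict(items) for items in unique]

def natural_join(left, right):
    """Combine every pair of rows that agree on all shared attributes."""
    joined = []
    for l in left:
        for r in right:
            if all(l[a] == r[a] for a in l.keys() & r.keys()):
                joined.append({**l, **r})
    return joined

def as_set(rows):
    """Order-independent form of a relation, for comparison."""
    return {frozenset(row.items()) for row in rows}

def is_lossless(rows, attrs1, attrs2):
    """True if joining the two projections reproduces the original rows."""
    rejoined = natural_join(project(rows, attrs1), project(rows, attrs2))
    return as_set(rejoined) == as_set(rows)

# Assumed EMP contents.
ATTRS = ("ename", "pname", "dname")
emp = [dict(zip(ATTRS, t)) for t in [
    ("Smith", "X", "John"),
    ("Smith", "Y", "Anna"),
    ("Smith", "X", "Anna"),
    ("Smith", "Y", "John"),
]]

lossless = is_lossless(emp, ("ename", "pname"), ("ename", "dname"))
# With one combination missing, the same split would add a spurious row.
lossy_case = is_lossless(emp[:3], ("ename", "pname"), ("ename", "dname"))
```

The second call illustrates why the same split is not safe for every table: when a combination is absent from the original, the join invents it.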

Suppose we decompose the Supply table into two tables, R1 and R2.

If we join R1 and R2 again, the result is not the original Supply table. So this is a lossy join decomposition: after decomposing the Supply table, information has been lost.

Join dependencies and fifth normal form

In some cases there may be no lossless join decomposition of a table R into 2 tables, but there may be a lossless join decomposition into more than 2 tables. The Supply table is an example.

Suppose we decompose the Supply table into 3 tables, R1, R2 and R3.

If we join the tables R1, R2 and R3 again, we get the original table. Joining just R1 and R2 does not give the Supply table, but joining all 3 tables does.

This leads to another type of dependency, called a join dependency. If a join dependency is present in a table, we decompose the table to fifth normal form (5NF). For the Supply table the join dependency is JD(R1, R2, R3), because joining R1, R2 and R3 gives back the original Supply table. JD(R1, R2, R3) can also be written as JD((sname, partname), (sname, projname), (partname, projname)). JD(R1, R2) is not valid for the Supply table, because joining R1 and R2 alone does not give back Supply.

Trivial join dependency

For a table R, a join dependency JD(R1, R2, R3, ...) is trivial if any of the Ri is the table R itself.
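Whether a join dependency holds on a given instance can be tested by joining all of the projections and comparing with the original. The sketch below uses hypothetical sample rows for Supply, since the transcript does not include the table contents, and helper names chosen for this example.

```python
from functools import reduce

def project(rows, attrs):
    """Projection onto attrs with duplicate rows removed; rows are dicts."""
    unique = {tuple((a, row[a]) for a in attrs) for row in rows}
    return [dict(items) for items in unique]

def natural_join(left, right):
    """Combine every pair of rows that agree on all shared attributes."""
    joined = []
    for l in left:
        for r in right:
            if all(l[a] == r[a] for a in l.keys() & r.keys()):
                joined.append({**l, **r})
    return joined

def jd_holds(rows, components):
    """True if joining the projections onto every component gives back rows."""
    joined = reduce(natural_join, [project(rows, c) for c in components])
    as_set = lambda rel: {frozenset(row.items()) for row in rel}
    return as_set(joined) == as_set(rows)

# Hypothetical Supply rows, chosen so that the 3-way JD holds.
ATTRS = ("sname", "partname", "projname")
supply = [dict(zip(ATTRS, t)) for t in [
    ("Smith", "Bolt", "ProjX"),
    ("Smith", "Nut", "ProjY"),
    ("Adamsky", "Bolt", "ProjY"),
    ("Walton", "Nut", "ProjZ"),
    ("Adamsky", "Nail", "ProjX"),
    ("Adamsky", "Bolt", "ProjX"),
    ("Smith", "Bolt", "ProjY"),
]]

R1 = ("sname", "partname")
R2 = ("sname", "projname")
R3 = ("partname", "projname")

three_way = jd_holds(supply, [R1, R2, R3])   # JD(R1, R2, R3)
two_way = jd_holds(supply, [R1, R2])         # JD(R1, R2)
```

On this sample, joining R1 and R2 alone yields extra tuples, and joining with R3 filters them out again, so only the three-way dependency holds.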

Fifth Normal Form (5NF)

5NF is also called project-join normal form. A relation schema is in fifth normal form (5NF) if, for every nontrivial join dependency JD(R1, R2, R3, ...), every Ri is a superkey of R.

Example: the Supply table.

The key of the Supply table is (sname, partname, projname). We have seen that it has the join dependency JD((sname, partname), (sname, projname), (partname, projname)). The projections here are (sname, partname), (sname, projname) and (partname, projname).

The Supply table is not in 5NF because of this join dependency: none of the projections forms a superkey of Supply. The superkey of Supply is (sname, projname, partname).
- (sname, partname) is not a superkey.
- (sname, projname) is not a superkey.
- (partname, projname) is not a superkey.
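The superkey condition of 5NF can be checked with an attribute closure. This is a small sketch under the assumption that Supply has no nontrivial functional dependencies (its only key is all three attributes); the function names are illustrative.

```python
def closure(attrs, fds):
    """Attribute closure: everything determined by attrs under the given FDs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result

def is_superkey(attrs, all_attrs, fds):
    """A superkey determines every attribute of the relation."""
    return closure(attrs, fds) == set(all_attrs)

SUPPLY = {"sname", "partname", "projname"}
fds = []  # assumption: Supply is an all-key table with no nontrivial FDs

jd_components = [
    {"sname", "partname"},
    {"sname", "projname"},
    {"partname", "projname"},
]
not_superkeys = [c for c in jd_components if not is_superkey(c, SUPPLY, fds)]
```

All three components end up in `not_superkeys`, which is the reason the join dependency violates 5NF.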

So we have to normalise the Supply table into tables that satisfy 5NF. We decompose Supply according to the JD, taking each of the projections in the JD to form the tables R1, R2 and R3.

Each of R1, R2 and R3 is in fifth normal form, because none of these tables has a nontrivial join dependency. Join dependencies are very difficult to detect in practice, so normalization to 5NF is not normally applied to a database.
