Introduction to Schema Refinement

1 Introduction to Schema Refinement

2 Normal Forms Types of Normal Form
First Normal Form (1NF)
Second Normal Form (2NF)
Third Normal Form (3NF)
Boyce-Codd Normal Form (BCNF)
Fourth Normal Form (4NF)
Fifth Normal Form (5NF)

3 Normal Forms First Normal Form (1NF)
A row of data cannot contain a repeating group of data, i.e. every value must be atomic.

Cid   Name   Subject
101   Jeet   PHY
101   Jeet   CHE
102   Seet   PHY
103   Swet   SOCIAL

Here the student Jeet appears twice in the table and the subject PHY is repeated. Another method is to divide the relation into 2:

Subid   Cid   Subject
1       101   PHY
2       101   CHE
3       102   PHY
4       103   SOCIAL

Cid   Cname
101   Jeet
102   Seet
103   Swet

4 Normal Forms Second normal form (2NF):
A relation is in 2NF if it is in 1NF and every non-key attribute is fully functionally dependent on the primary key. 2NF does not permit partial dependency: no non-key attribute may depend on only a part of the primary key. If the primary key consists of a single attribute, a relation in 1NF is automatically in 2NF.

5 Normal Forms Second normal form (2NF):
Consider the following relation. Here (Cid, Order_id) is the primary key.

Cid   Cname   Order_id   Order_name   Sale_details
101   Adam    10         Order1       Sale1
101   Adam    11         order2       sale2
102   Alex    12         Order3       Sale3
103   Sumo    13         Order4       Sale4

It is in 1NF but not in 2NF, because some columns depend on only a part of the primary key: Cname depends only on Cid, and Order_name depends only on Order_id. There is no link between Cname and Sale_details. To reduce this table to 2NF, break it into 3 different tables:

Cid   Cname
101   Adam
102   Alex
103   Sumo

Order_id   Order_name
10         Order1
11         order2
12         Order3
13         Order4

Cid   Order_id   Sale_details
101   10         Sale1
101   11         sale2
102   12         Sale3
103   13         Sale4
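The decomposition on this slide can be tried out in a few lines of Python. This is only a minimal sketch: the relation is held as a list of tuples, `project` is a hypothetical helper written for this illustration, and the repeated "101 Adam" in the second row is assumed from the merged cells of the slide's table.

```python
# Original relation with the composite primary key (Cid, Order_id).
# Assumption: the second row repeats "101 Adam" (merged cell on the slide).
ATTRS = ("Cid", "Cname", "Order_id", "Order_name", "Sale_details")
orders = [
    (101, "Adam", 10, "Order1", "Sale1"),
    (101, "Adam", 11, "order2", "sale2"),
    (102, "Alex", 12, "Order3", "Sale3"),
    (103, "Sumo", 13, "Order4", "Sale4"),
]


def project(rows, attrs, keep):
    """Relational projection: keep the named columns and drop duplicate tuples."""
    idx = [attrs.index(name) for name in keep]
    return sorted({tuple(row[i] for i in idx) for row in rows})


# Each non-key attribute moves to the table whose key it fully depends on.
customers = project(orders, ATTRS, ["Cid", "Cname"])
order_names = project(orders, ATTRS, ["Order_id", "Order_name"])
sales = project(orders, ATTRS, ["Cid", "Order_id", "Sale_details"])

print(customers)    # Adam is stored once although he placed two orders
print(order_names)
print(sales)
```

Because Cname now lives in a table keyed by Cid alone, changing a customer's name touches a single row.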

6 Normal Forms Second normal form (2NF):
1NF → 2NF

Order   Product   Customer   Address    Qty   Unit Price
s1      p1        Anish      bhopal     300   500
s1      p2        Anish      bhopal     100   600
s2      p3        Ammu       kollam
s3      p5        Sumo       kottayam   200   450

This relation is in 1NF because the values of each domain are atomic. To convert it into 2NF, first find the attributes that make up the primary key; no single attribute alone forms a primary key. 2NF does not allow partial dependencies, so divide the above relation into 3.

7 Normal Forms
R1 (Order, Customer, Address): s1 Anish bhopal; s2 Ammu kollam; s3 Sumo kottayam
R2 (Product, Unit Price): p1 500; p2 600; p3 300; p5 200
R3 (Order, Product, Qty): s1 p1 300; s1 p2 100; s2 p3 500; s3 p5 200 450
Advantages
Insert: a row can be inserted into relation R1 without supplying a product attribute.
Delete: the pair S1 and P2 can be deleted by removing a row from relation R3, without losing the information held in R1.
Update: the address for a given customer is written once in relation R1 rather than repeated many times, so an address change needs only one update.

8 Normal Forms Third normal form (3NF)
3NF is based on the concept of transitive dependency. Transitive dependencies are not allowed in 3NF. A transitive dependency means that if X → Y and Y → Z hold in a relation R, then X → Z is also a functional dependency that holds on R. Here X, Y, Z are attributes of the table, and Y should not be a candidate key or a subset of any key (prime attribute) of the table R. That is, every non-prime attribute of the table must depend on the primary key directly, not through another non-prime attribute. Example: Student3

9 Normal Forms Third normal form (3NF)
There are 3 FDs here:
Fd1: stdid → grade
Fd2: stdid → marks
Fd3: marks → grade
We can see that marks is not a prime attribute of student3. stdid → grade is a transitive dependency because of Fd2 and Fd3. This is not allowed in 3NF.

10 Normal Forms Third normal form (3NF)
A relation R is said to be in 3NF if R is in 2NF and no non-prime attribute of R is transitively dependent on the key of R. The above relation schema student3 is in 2NF, since no partial dependencies on a key exist. But it is not in 3NF because of the transitive dependency stdid → grade via ‘marks’. We can normalize student3 by decomposing it into two 3NF relation schemas, Student3A and Student3B, as follows:
Student3A (stdid, branch, sem, rn, name, marks)
Student3B (marks, grade)
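The 3NF test described here can be checked mechanically with an attribute closure. The sketch below is an illustration, not part of the original slides: `closure` and `three_nf_violations` are hypothetical helper names, and only the attributes stdid, marks and grade are modelled (branch, sem, rn and name are left out for brevity).

```python
def closure(attrs, fds):
    """Attributes functionally determined by attrs under the FDs (lhs, rhs)."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


def three_nf_violations(attrs, fds, candidate_keys):
    """List (lhs, attribute) pairs where lhs is not a superkey and the
    determined attribute is non-prime, i.e. the FD breaks 3NF."""
    prime = set().union(*candidate_keys)
    violations = []
    for lhs, rhs in fds:
        if closure(lhs, fds) >= attrs:
            continue  # left side is a superkey, so this FD is fine
        violations += [(sorted(lhs), a) for a in sorted(rhs - lhs) if a not in prime]
    return violations


fd1 = (frozenset({"stdid"}), frozenset({"grade"}))
fd2 = (frozenset({"stdid"}), frozenset({"marks"}))
fd3 = (frozenset({"marks"}), frozenset({"grade"}))

# student3 (reduced to three attributes): Fd3 makes grade depend on marks.
student3 = three_nf_violations({"stdid", "marks", "grade"}, [fd1, fd2, fd3],
                               [frozenset({"stdid"})])
# After the decomposition each schema keeps only its own FD.
student3a = three_nf_violations({"stdid", "marks"}, [fd2], [frozenset({"stdid"})])
student3b = three_nf_violations({"marks", "grade"}, [fd3], [frozenset({"marks"})])

print(student3)   # the offending dependency marks -> grade
print(student3a, student3b)
```

In Student3B the attribute marks becomes the key, so marks → grade no longer counts as a violation.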

11 Normal Forms Third normal form (3NF)
The decomposed schemas Student3A and Student3B are as follows:
Student3A (stdid, branch, sem, rn, name, marks)
Student3B (marks, grade)

12 Normal Forms Boyce Codd Normal form (BCNF)
BCNF is a stricter form of 3NF, because every relation in BCNF is also in 3NF. However, a relation in 3NF may not be in BCNF. A relation schema R is in BCNF if, whenever a nontrivial functional dependency X → A holds in R, X is a superkey of R. The only difference between BCNF and 3NF is that condition (b) of the general 3NF definition, which allows A to be a prime attribute, is absent from BCNF.
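The BCNF condition can be tested with the same closure idea: every nontrivial FD must have a superkey on its left side. The following is a minimal sketch on a hypothetical schema R(A, B, C) with the FDs AB → C and C → B, chosen only because it is in 3NF but not in BCNF; the schema and the helper names are not taken from the slides.

```python
def closure(attrs, fds):
    """Attributes functionally determined by attrs under the FDs (lhs, rhs)."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


def bcnf_violations(attrs, fds):
    """Nontrivial FDs X -> Y whose left side X is not a superkey of attrs."""
    return [(lhs, rhs) for lhs, rhs in fds
            if not rhs <= lhs and closure(lhs, fds) != attrs]


# Hypothetical schema: candidate keys are {A, B} and {A, C}, so every
# attribute is prime and R is in 3NF, yet C -> B has a non-superkey left side.
R = {"A", "B", "C"}
fds = [
    (frozenset({"A", "B"}), frozenset({"C"})),
    (frozenset({"C"}), frozenset({"B"})),
]

violations = bcnf_violations(R, fds)
print(violations)  # only C -> B is reported
```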

13 Normal Forms Boyce Codd Normal form (BCNF)
Here we can see that the relation Lots1A is not in BCNF, but it is in 3NF. FD5 violates BCNF because area is not a superkey. Fd1 and Fd2 satisfy BCNF because their left-hand sides are superkeys. So remove the attribute (county name) and place it in another relation.

14 Normal Forms Boyce Codd Normal form (BCNF)

15 Normal Forms Fourth Normal Form (4NF)
A relation is in 4NF if it is in BCNF and it has no multivalued dependency.

Subject    Trainer              Textbooks
MCA        Aji, Jinson, Lisha   DBMS, JAVA
Computer   JK                   OS, C++

In this relation:
Each subject has a well-defined set of trainers, e.g. the MCA subject has 3 trainers.
Each subject has a well-defined set of textbooks, e.g. the MCA subject has 2 textbooks.
The textbooks that are used for a given subject are independent of the trainers.

16 Normal Forms Fourth Normal Form (4NF)
The table has been converted to a relation by filling in all of the empty rows.

Subject    Trainer   Textbooks
MCA        AJI       DBMS
MCA        AJI       JAVA
MCA        Jinson    DBMS
MCA        Jinson    JAVA
MCA        Lisha     DBMS
MCA        Lisha     JAVA
Computer   JK        OS
Computer   JK        C++

The relation is in 1NF. The primary key of this relation consists of all three attributes. Since there are no determinants other than the primary key, this relation is actually in BCNF.

17 Normal Forms Fourth Normal Form (4NF)
Suppose a new trainer comes to teach MCA: it is necessary to create 2 new tuples, one for each of the 2 textbooks. See that it is not necessary to include all faculty. Decomposition cannot be made on the basis of functional dependencies, because there are no functional dependencies in the relation. So we introduce multivalued dependencies (MVDs) in the relation.

18 Normal Forms Multivalued dependencies and 4NF
subject →→ trainer
subject →→ textbook
Double arrows are used here. Read the first as “subject multidetermines trainer” or “trainer is multidependent on subject”. We know that a subject does not have a single corresponding trainer, i.e. the functional dependency subject → trainer does not hold. But each subject has a well-defined set of corresponding trainers. Well defined here means that, for a given subject (MCA) and a given textbook (DBMS), the set of trainers (AJI, Jinson, Lisha) matching the pair (MCA, DBMS) in the relation depends on the value MCA alone. It makes no difference which particular textbook value we choose. The second MVD can be interpreted in the same way.

19 Normal Forms Definition of multi valued dependency
Let R be a table, and let A, B, C be arbitrary subsets of the set of attributes of R. Then we say that B is multidependent on A, written A →→ B, if and only if the set of B values matching a given (A value, C value) pair in R depends only on the A value and is independent of the C value. MVDs always go together in pairs: given the table R (A, B, C), the MVD A →→ B holds if and only if A →→ C also holds.
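This definition translates almost directly into a check on a relation instance: group the rows by their A value, and within each group compare the set of B values seen for every C value. The sketch below assumes the subject/trainer/textbook relation from the earlier slides; `mvd_holds` is a hypothetical helper, and the trainer "Nita" is an invented value used only to show the update anomaly.

```python
from collections import defaultdict


def mvd_holds(rows, attrs, a, b):
    """Check A ->> B on an instance: for each A value, the set of B values
    must be the same for every C value (C = all remaining attributes)."""
    c = [x for x in attrs if x not in a and x not in b]
    pos = {name: i for i, name in enumerate(attrs)}

    def pick(row, names):
        return tuple(row[pos[n]] for n in names)

    b_sets = defaultdict(lambda: defaultdict(set))
    for row in rows:
        b_sets[pick(row, a)][pick(row, c)].add(pick(row, b))
    return all(len({frozenset(s) for s in per_c.values()}) == 1
               for per_c in b_sets.values())


ATTRS = ("Subject", "Trainer", "Textbook")
relation = {
    ("MCA", "AJI", "DBMS"), ("MCA", "AJI", "JAVA"),
    ("MCA", "Jinson", "DBMS"), ("MCA", "Jinson", "JAVA"),
    ("MCA", "Lisha", "DBMS"), ("MCA", "Lisha", "JAVA"),
    ("Computer", "JK", "OS"), ("Computer", "JK", "C++"),
}

print(mvd_holds(relation, ATTRS, ["Subject"], ["Trainer"]))   # both MVDs hold
print(mvd_holds(relation, ATTRS, ["Subject"], ["Textbook"]))

# Hypothetical new trainer inserted with only one of the two textbooks:
partial = relation | {("MCA", "Nita", "DBMS")}
print(mvd_holds(partial, ATTRS, ["Subject"], ["Trainer"]))    # MVD is broken
```

Adding the second tuple ("MCA", "Nita", "JAVA") restores the dependency, which is exactly the "2 new tuples" requirement mentioned for a new MCA trainer.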

20 Normal Forms Fourth normal form
This is based on multivalued dependencies. A relation schema R is in 4NF with respect to a set of dependencies F (that includes FDs and MVDs) if, for every nontrivial multivalued dependency X →→ Y, X is a superkey of R. Consider the table:

Subject    Trainer   Textbooks
MCA        AJI       DBMS
MCA        AJI       JAVA
MCA        Jinson    DBMS
MCA        Jinson    JAVA
MCA        Lisha     DBMS
MCA        Lisha     JAVA
Computer   JK        OS
Computer   JK        C++

21 Normal Forms Fourth normal form
The table or relation CFX is not in fourth normal form because the MVDs subject →→ textbook and subject →→ trainer do not satisfy either of the 2 conditions of fourth normal form: they are nontrivial, and subject is not a superkey. So we decompose it into two tables:

Subject    Trainer
MCA        AJI
MCA        Jinson
MCA        Lisha
Computer   JK

Subject    Textbooks
MCA        DBMS
MCA        JAVA
Computer   OS
Computer   C++

22 Normal Forms Lossless join decomposition
Consider the example EMP:

ename   Pname   Dname
Smith   X       John
Smith   Y       Anna
Smith   X       Anna
Smith   Y       John

Suppose we decompose the EMP table into Emp_projects and Emp_dependents.

Emp_projects
ename   Pname
Smith   X
Smith   Y

Emp_dependents
ename   Dname
Smith   John
Smith   Anna

23 Normal Forms Lossless join decomposition
If we join these tables again, we get the original EMP table. So the decomposition of the EMP table into Emp_projects and Emp_dependents is a lossless join decomposition, because nothing is lost by the decomposition.
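The lossless-join property can be verified by projecting, joining, and comparing with the original. This is a minimal sketch: `project` and `natural_join` are hypothetical helpers, the four-row EMP instance is assumed (it is the instance for which the join of the two projections returns EMP), and `emp_small` is an invented two-row instance added to show a lossy case.

```python
def project(rows, attrs, keep):
    """Projection onto the attributes in keep, as a set of tuples."""
    idx = [attrs.index(name) for name in keep]
    return {tuple(row[i] for i in idx) for row in rows}


def natural_join(r1, attrs1, r2, attrs2):
    """Natural join; result columns are attrs1 followed by the new attrs of r2."""
    common = [a for a in attrs1 if a in attrs2]
    extra = [a for a in attrs2 if a not in attrs1]
    joined = set()
    for t1 in r1:
        for t2 in r2:
            if all(t1[attrs1.index(a)] == t2[attrs2.index(a)] for a in common):
                joined.add(t1 + tuple(t2[attrs2.index(a)] for a in extra))
    return joined


EMP_ATTRS = ["ename", "pname", "dname"]
PROJ_ATTRS = ["ename", "pname"]
DEP_ATTRS = ["ename", "dname"]


def rejoin(emp_rows):
    """Decompose into Emp_projects and Emp_dependents, then join them back."""
    projects = project(emp_rows, EMP_ATTRS, PROJ_ATTRS)
    dependents = project(emp_rows, EMP_ATTRS, DEP_ATTRS)
    return natural_join(projects, PROJ_ATTRS, dependents, DEP_ATTRS)


# Assumed EMP instance: every project of Smith paired with every dependent.
emp = {("Smith", "X", "John"), ("Smith", "Y", "Anna"),
       ("Smith", "X", "Anna"), ("Smith", "Y", "John")}
print(rejoin(emp) == emp)              # lossless: the join returns EMP

# Hypothetical smaller instance: the same decomposition adds spurious rows.
emp_small = {("Smith", "X", "John"), ("Smith", "Y", "Anna")}
print(rejoin(emp_small) == emp_small)  # lossy for this instance
```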

24 Normal Forms Lossless join decomposition
Suppose we decompose the supply table into two tables, R1 and R2. We get

25 Normal Forms Lossless join decomposition
If we join the two tables R1 and R2 again, the result is not our original supply table. So this is a lossy join decomposition, because after decomposing the supply table we have lost some information.

26 Normal Forms Join dependencies and fifth normal form
In some cases there may be no lossless join decomposition of a table R into 2 tables but there may be a lossless join decomposition into more than 2 tables. For example in the supply table

27 Normal Forms Join dependencies and fifth normal form
If we decompose the supply table into 3

28 Normal Forms Join dependencies and fifth normal form
Here we can see that if we join the tables R1, R2 and R3 again, we get the original table. Joining just R1 and R2 does not give the supply table, but joining all 3 tables does.

29 Normal Forms Join dependencies and fifth normal form
So we move to another type of dependency, called join dependency. If a join dependency is present in a table, we decompose the table to fifth normal form (5NF). Here, for the supply table, the join dependency is specified by JD (R1, R2, R3), because joining the R1, R2 and R3 tables gives the original table ‘Supply’. JD (R1, R2, R3) can also be written as JD( (sname, partname), (sname, projname), (partname, projname) ). We can see that JD (R1, R2) is not valid for the supply table, because joining R1 and R2 does not give the Supply table.
Trivial join dependency: for a table R, a join dependency specified as JD(R1, R2, R3…) is trivial if any of these Ri’s is the table R.
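A join dependency can be checked on an instance by joining all of its projections and comparing the result with the original table. The slide's Supply data is not reproduced in this transcript, so the four rows below (S1, P1, J1 and so on) are hypothetical values chosen so that JD (R1, R2, R3) holds while JD (R1, R2) does not; `project`, `natural_join` and `make_row` are likewise illustrative helpers.

```python
from functools import reduce


def make_row(sname, partname, projname):
    """A row as a hashable set of (attribute, value) pairs."""
    return frozenset({"sname": sname, "partname": partname,
                      "projname": projname}.items())


def project(rows, keep):
    """Projection onto the attribute names in keep."""
    return {frozenset((a, v) for a, v in t if a in keep) for t in rows}


def natural_join(r1, r2):
    """Natural join: merge every pair of rows that agree on shared attributes."""
    out = set()
    for t1 in r1:
        d1 = dict(t1)
        for t2 in r2:
            d2 = dict(t2)
            if all(d1[a] == d2[a] for a in d1.keys() & d2.keys()):
                out.add(frozenset({**d1, **d2}.items()))
    return out


# Hypothetical Supply instance (not the slide's data).
supply = {
    make_row("S1", "P1", "J2"),
    make_row("S1", "P2", "J1"),
    make_row("S2", "P1", "J1"),
    make_row("S1", "P1", "J1"),
}

r1 = project(supply, {"sname", "partname"})
r2 = project(supply, {"sname", "projname"})
r3 = project(supply, {"partname", "projname"})

two_way = natural_join(r1, r2)                    # R1 joined with R2 only
three_way = reduce(natural_join, [r1, r2, r3])    # R1, R2 and R3 joined

print(two_way == supply)    # the 2-table join contains a spurious row
print(three_way == supply)  # the 3-table join returns Supply
```

In this instance the two-table join produces the extra row (S1, P2, J2); joining with R3 removes it because (P2, J2) does not occur there.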

30 Normal Forms Fifth normal form
It is also called project-join normal form. A relation schema is in fifth normal form (5NF) if, for every nontrivial join dependency JD(R1, R2, R3…), every Ri is a superkey of R. Example:

31 Normal Forms Fifth normal form
The key of this table is (sname, partname, projname). We have seen that it has the join dependency JD { (sname, partname), (sname, projname), (partname, projname) }. Here the projections are (sname, partname), (sname, projname) and (partname, projname).

32 Normal Forms Fifth normal form
We can say that the table supply is not in 5NF because of this join dependency: none of these projections forms a superkey of supply. The superkey of supply is (sname, projname, partname). (sname, partname) is not a superkey. (sname, projname) is not a superkey. (partname, projname) is not a superkey.

33 Normal Forms Fifth normal form
So we have to normalise the table supply into tables that satisfy 5NF. We decompose the table supply according to the JD: take each of the projections in the JD and form tables as

34 Normal Forms Fifth normal form

35 Normal Forms Fifth normal form
See that each of R1, R2 and R3 is in fifth normal form, because there are no nontrivial join dependencies in any of these tables. A join dependency is very difficult to detect in practice, so decomposition to 5NF is not normally applied in a database.

36 Normal Forms Fifth normal form

37 Normal Forms

