Design Process - Where are we?

Slides:



Advertisements
Similar presentations
Functional Dependencies and Normalization for Relational Databases
Advertisements

PMIT-6102 Advanced Database Systems By- Jesmin Akhter Assistant Professor, IIT, Jahangirnagar University.
Chapter 3 Notes. 3.1 Functional Dependencies A functional dependency is a statement that – two tuples of a relation that agree on some particular set.
Ch 10, Functional Dependencies and Normal forms
Functional Dependencies and Normalization for Relational Databases.
Functional Dependencies and Normalization for Relational Databases
Copyright © 2011 Pearson Education, Inc. Publishing as Pearson Addison-Wesley Chapter 15 Basics of Functional Dependencies and Normalization for Relational.
Part 6 Chapter 15 Normalization of Relational Database Csci455 r 1.
Copyright © 2007 Ramez Elmasri and Shamkant B. Navathe Slide
METU Department of Computer Eng Ceng 302 Introduction to DBMS Functional Dependencies and Normalization for Relational Databases by Pinar Senkul resources:
Copyright © 2007 Pearson Education, Inc. Publishing as Pearson Addison-Wesley Slide
Copyright © 2007 Ramez Elmasri and Shamkant B. Navathe Slide
1 Functional Dependency and Normalization Informal design guidelines for relation schemas. Functional dependencies. Normal forms. Normalization.
Copyright © 2007 Pearson Education, Inc. Publishing as Pearson Addison-Wesley Slide
Copyright © 2011 Pearson Education, Inc. Publishing as Pearson Addison-Wesley Chapter 15 Basics of Functional Dependencies and Normalization for Relational.
Chapter 8 Normalization for Relational Databases Copyright © 2004 Pearson Education, Inc.
Ch 7: Normalization-Part 2 Much of the material presented in these slides was developed by Dr. Ramon Lawrence at the University of Iowa.
©Silberschatz, Korth and Sudarshan7.1Database System Concepts Chapter 7: Relational Database Design First Normal Form Pitfalls in Relational Database Design.
Chapter 10 Functional Dependencies and Normalization for Relational Databases.
Chapter 10 Functional Dependencies and Normalization for Relational Databases Copyright © 2004 Pearson Education, Inc.
Copyright © 2007 Pearson Education, Inc. Publishing as Pearson Addison-Wesley Chapter 10 Functional Dependencies and Normalization for Relational Databases.
AL-MAAREFA COLLEGE FOR SCIENCE AND TECHNOLOGY INFO 232: DATABASE SYSTEMS CHAPTER 6 NORMALIZATION FOR RELATIONAL DATABASES Instructor Ms. Arwa Binsaleh.
King Saud University College of Computer & Information Sciences Computer Science Department CS 380 Introduction to Database Systems Functional Dependencies.
Copyright © 2007 Pearson Education, Inc. Publishing as Pearson Addison-Wesley Normalization for Relational Databases.
DatabaseIM ISU1 Chapter 10 Functional Dependencies and Normalization for RDBs Fundamentals of Database Systems.
Copyright © 2007 Ramez Elmasri and Shamkant B. Navathe Slide
Topic 10 Functional Dependencies and Normalization for Relational Databases Faculty of Information Science and Technology Mahanakorn University of Technology.
Instructor: Churee Techawut Functional Dependencies and Normalization for Relational Databases Chapter 4 CS (204)321 Database System I.
Top-Down Database Design Mini-world Requirements Conceptual schema E1 E2 R Relation schemas ?
BCNF & Lossless Decomposition Prof. Sin-Min Lee Department of Computer Science.
Functional Dependencies and Normalization for Relational Databases.
Chapter 10 Functional Dependencies and Normalization for Relational Databases Copyright © 2004 Pearson Education, Inc.
Chapter 10 Functional Dependencies and Normalization for Relational Databases Copyright © 2004 Pearson Education, Inc.
Chapter 10 Functional Dependencies and Normalization for Relational Databases Copyright © 2004 Pearson Education, Inc.
By Abdul Rashid Ahmad. E.F. Codd proposed three normal forms: The first, second, and third normal forms 1NF, 2NF and 3NF are based on the functional dependencies.
Copyright © 2011 Pearson Education, Inc. Publishing as Pearson Addison-Wesley Chapter 15 Basics of Functional Dependencies and Normalization for Relational.
Chapter Functional Dependencies and Normalization for Relational Databases.
CSE314 Database Systems Basics of Functional Dependencies and Normalization for Relational Databases Doç. Dr. Mehmet Göktürk src: Elmasri & Navanthe 6E.
Functional Dependencies and Normalization Jose M. Peña
COMP1212 COMP1212 Anomalies and Dependencies Dr. Mabruk Ali.
1 Functional Dependencies and Normalization Chapter 15.
Copyright © 2007 Pearson Education, Inc. Publishing as Pearson Addison-Wesley Slide
Lecture 8: Database Concepts May 4, Outline From last lecture: creating views Normalization.
1 CSE 480: Database Systems Lecture 18: Normal Forms and Normalization.
Dr. Mohamed Osman Hegaz1 Logical data base design (2) Normalization.
Normalization. 2 u Main objective in developing a logical data model for relational database systems is to create an accurate representation of the data,
PMIT-6102 Advanced Database Systems
Normalization.
Database Design FUNCTIONAL DEPENDENCES NORMAL FORMS D. Christozov / G.Tuparov INF 280 Database Systems: DB design: Normal Forms 1.
Chapter 7 Functional Dependencies Copyright © 2004 Pearson Education, Inc.
Riyadh Philanthropic Society For Science Prince Sultan College For Woman Dept. of Computer & Information Sciences CS 340 Introduction to Database Systems.
CS 338Database Design and Normal Forms9-1 Database Design and Normal Forms Lecture Topics Measuring the quality of a schema Schema design with normalization.
Ch 7: Normalization-Part 1
Deanship of Distance Learning Avicenna Center for E-Learning 1 Session - 7 Sequence - 1 Normalization DB Design Guidelines Presented by: Dr. Samir Tartir.
PMIT-6102 Advanced Database Systems By- Jesmin Akhter Assistant Professor, IIT, Jahangirnagar University.
Al-Imam University Girls Education Center Collage of Computer Science 1 st Semester, 1432/1433H Chapter 10_part 1 Functional Dependencies and Normalization.
Chapter 8 Relational Database Design. 2 Relational Database Design: Goals n Reduce data redundancy (undesirable replication of data values) n Minimize.
11/06/97J-1 Principles of Relational Design Chapter 12.
Normalization and FUNctional Dependencies. Redundancy: root of several problems with relational schemas: –redundant storage, insert/delete/update anomalies.
Chapter 10 Functional Dependencies and Normalization for Relational Databases Copyright © 2004 Pearson Education, Inc.
Copyright © 2016 Ramez Elmasri and Shamkant B. Navathe.
Copyright © 2007 Pearson Education, Inc. Publishing as Pearson Addison-Wesley Slide
Chapter 14 Functional Dependencies and Normalization Informal Design Guidelines for Relational Databases –Semantics of the Relation Attributes –Redundant.
10/3/2017.
Relational Normalization Theory
Functional Dependency and Normalization
Chapter 15 Basics of Functional Dependencies and Normalization for Relational Databases.
Chapter 15 Basics of Functional Dependencies and Normalization for Relational Databases.
Database Management systems Subject Code: 10CS54 Prepared By:
Chapter 15 Basics of Functional Dependencies and Normalization for Relational Databases.
Presentation transcript:

Design Process - Where are we? Conceptual Design Conceptual Schema (ER Model) Step 1: ER-to-Relational Mapping Step 2: Normalization: “Improving” the design Logical Design Logical Schema (Relational Model)

Relational Design Principles Relations should have semantic unity Information repetition should be avoided Anomalies: insertion, deletion, modification Avoid null values as much as possible Difficulties with interpretation don’t know, don’t care, known but unavailable, does not apply Specification of joins Avoid spurious joins

Information Repetition The TITLE, SALARY attribute values are repeated for each project that the engineer is involved in. Waste of space Complicates updates EMPLOYEE ENO ENAME TITLE SALARY J. Doe Elect. Eng. 40000 M. Smith 34000 Analyst A. Lee Mech. Eng. 27000 J. Miller Programmer 24000 B. Casey Syst. Anal. L. Chu R. Davis E1 E2 E3 E4 E5 E6 E7 E8 J. Jones 24 PJNO RESP DURATION P1 Manager 12 P2 6 P3 Consultant 10 P4 Engineer 48 18 36 40 » This example instance of EMPLOYEE relation violates one of the constraints in our earlier design. Which one?

Insertion Anomaly If a new employee joins the company, his name, title and salary cannot be inserted into the database unless he is assigned to a project. OR we live with nulls ENO EMPLOYEE ENAME TITLE SALARY J. Doe Elect. Eng. 40000 M. Smith 34000 Analyst A. Lee Mech. Eng. 27000 J. Miller Programmer 24000 B. Casey Syst. Anal. L. Chu R. Davis E1 E2 E3 E4 E5 E6 E7 E8 J. Jones 24 PJNO RESP DURATION P1 Manager 12 P2 6 P3 Consultant 10 P4 Engineer 48 18 36 40 » »

Insertion Anomaly Assume we had the following EMP-PROJ relation instead of the two separate EMPLOYEE and PROJECT relations It is difficult (impossible?) to store information about a new project until an employee is assigned to it Previous problem still applies ENO EMP-PROJ ENAME TITLE SALARY J. Doe Elect. Eng. 40000 M. Smith 34000 Analyst A. Lee Mech. Eng. 27000 J. Miller Programmer 24000 B. Casey Syst. Anal. L. Chu R. Davis E1 E2 E3 E4 E5 E6 E7 E8 J. Jones 24 PJNO RESP DURATION P1 Manager 12 P2 6 P3 Consultant 10 P4 Engineer 48 18 36 40 » PNAME BUDGET Instrumentation 150000 Database Develop. 135000 CAD/CAM 250000 Maintenance 310000 »

Deletion Anomaly If an engineer, who is the only employee on a project, leaves the company, his personal information cannot be deleted. OR the information about that project is lost. ENO EMP-PROJ ENAME TITLE SALARY J. Doe Elect. Eng. 40000 M. Smith 34000 Analyst A. Lee Mech. Eng. 27000 J. Miller Programmer 24000 B. Casey Syst. Anal. L. Chu R. Davis E1 E2 E3 E4 E5 E6 E7 E8 J. Jones 24 PJNO RESP DURATION P1 Manager 12 P2 6 P3 Consultant 10 P4 Engineer 48 18 36 40 » PNAME BUDGET Instrumentation 150000 Database Develop. 135000 CAD/CAM 250000 Maintenance 310000 MGR

Modification Anomaly If any attribute of project (say BUDGET of P1) is modified, all the tuples for all employees who work on that project need to be modified. EMP-PROJ » ENO ENAME TITLE SALARY PJNO PNAME BUDGET DURATION RESP MGR » E1 J. Doe Elect. Eng. 40000 P1 Instrumentation 150000 12 Manager E1 E2 M. Smith Analyst 34000 P1 Instrumentation 150000 24 Analyst E1 E2 M. Smith Analyst 34000 P2 Database Develop. 135000 6 Analyst E5 E3 A. Lee Mech. Eng. 27000 P3 CAD/CAM 250000 10 Consultant E8 E3 A. Lee Mech. Eng. 27000 P4 Maintenance 310000 48 Engineer E6 E4 J. Miller Programmer 24000 P2 Database Develop. 135000 18 Programmer E5 E5 B. Casey Syst. Anal. 34000 P2 Database Develop. 135000 24 Manager E5 E6 L. Chu Elect. Eng. 40000 P4 Maintenance 310000 48 Manager E6 E7 R. Davis Mech. Eng. 27000 P3 CAD/CAM 250000 36 Engineer E8 E8 J. Jones Syst. Anal. 34000 » P3 CAD/CAM 250000 40 Manager E8

What to do? We take each relation individually and “improve” them in terms of the desired characteristics Normalization Normal forms

Normalization Normalization is a process of concept separation which applies a top-down methodology for producing a schema by subsequent refinements and decompositions. Universal relation assumption 1NF to 3NF; 1NF to BCNF Questions: How do we decompose a schema into a desirable normal form? What criteria the decomposed schemas should follow in order to preserve the semantics of the original schema? Lossless decomposition: no information loss Can we guarantee the quality of the decomposition Reconstructability: recover the original relation  no spurious joins What happens to queries? Processing time may increase due to joins Denormalization

Normal Forms All relations 1NF 2NF 3NF BCNF 4NF 5NF

Normal Forms Normal Forms can be defined according to keys and dependencies Functional Dependencies Primary keys (1NF, 2NF, 3NF) All candidate keys ( 2NF, 3NF, BCNF) Multivalued dependencies (4NF) Join dependencies (5NF) – we won’t talk about these

Dependencies Functional dependency (FD) Multivalued dependency (MVD) X ® Y given a value of X, there is one value of Y Multivalued dependency (MVD) X ®® Y given a value of X, there is a set of values of Y.

Functional Dependency Given relation R defined over U = {A1, A2, ..., An} where X  U, Y  U. If, for all pairs of tuples t1 and t2 in any legal instance of relation scheme R, t1[X] = t2[X] fi t1[Y] = t2[Y], then the functional dependency X ® Y holds in R. Example In relation PROJ PJNO ® (PNAME, BUDGET, MGR) In relation EMPLOYEE (ENO, PJNO) ® (ENAME, TITLE, SALARY, APT#, STREET, CITY, DURATION, RESP) ENO ® (ENAME, TITLE, SALARY,APT#,STREET,CITY) TITLE ® SAL

Inferencing over FDs We would like to be able to infer from a given set of FDs. Example: ENO ® (ENAME, TITLE, SALARY,APT#,STREET,CITY) fi ENO ® ENAME This requires a set of inference rules Armstrong’s rules (axioms) Additional ones

Some Basics Attributes Full functional dependency Prime attribute is a member of any key Non-prime attribute is any attribute which is not prime Full functional dependency A FD X® Y is a full functional dependency if X is minimal, i.e., removal of any attribute A from X means the dependency does not hold anymore. Formally - iff for any A Î X, (X-{A})® Y. Partial functional dependency /

Normal Forms Based on FDs 1NF eliminates the relations within relations or relations as attributes of tuples. First Normal Form (1NF) eliminate the partial functional dependencies of non-prime attributes to key attributes Second Normal Form (2NF) eliminate the transitive functional dependencies of non-prime attributes to key attributes Third Normal Form (3NF) eliminate the partial and transitive functional dependencies of prime (key) attributes to key. Boyce-Codd Normal Form (BCNF)

First Normal Form All attribute values are atomic 1NF relation cannot have an attribute value that is: a set of values (set-value) a tuple of values (nested relation) This is a standard assumption in relational DBMSs In object-oriented DBMSs this assumption is relaxed.

Second Normal Form Two possible definitions: A relation R Î 2NF iff all non-prime attributes in R are fully functionally dependent on primary key. A relation R Î 2NF iff the attributes are either a candidate key, or fully dependent on every key. Partial functional dependencies cause problems.

2NF – Example EMP-PROJ ENO ENAME TITLE SALARY APT# STREET CITY PJNO PNAME BUDGET DURATION RESP MGR fd1 fd2 fd3 fd4 EMP-PROJ is in 1NF, but not in 2NF due to fd1, fd3 and fd4 Decompose it into EMP (ENO, ENAME, TITLE, SALARY,APT#,STREET,CITY) PROJECT (PJNO,PNAME,BUDGET,MGR) ASSIGN(ENO,PJNO,DURATION,RESP)

2NF – Example fd1 fd2 fd4 fd3 EMP PROJECT ASSIGN ENO ENAME TITLE SALARY APT# STREET CITY fd1 fd2 PJNO PNAME BUDGET MGR PROJECT fd4 ENO DURATION RESP PJNO ASSIGN fd3

Third Normal Form Intuitively: A relation R Î 3NF iff R Î 2NF (i.e., every non-prime attribute is fully functionally dependent on every key) No non-prime attribute of R is transitively dependent on the prime key. The issues is to remove the transitive dependencies

Third Normal Form Formally: A relation scheme R defined over U = {A1, A2, …, An} is in 3NF if for all functional dependencies that hold on R of the form X®Y, where X  U and X  U, at least one of the following holds: X® Y is a trivial functional dependency (i.e., Y  X or XÈ Y=U) X is a superkey for R Y is contained in a candidate key for R (Y is a prime attribute) The last condition deals with transitive dependencies.

3NF – Example EMP is not in 3NF because of fd2 Solution: fd1 fd2 ENO ENAME TITLE SALARY APT# STREET CITY fd1 fd2 EMP is not in 3NF because of fd2 TITLE ® SALARY but TITLE is not a superkey and SALARY is not prime Problem is that ENO transitively determines SALARY (as well as directly determining it) Solution: EMP PAY ENO ENAME TITLE APT# STREET CITY TITLE SALARY fd1 fd2

Boyce-Codd Normal Form You can still have transitive dependencies in 3NF if the dependent attribute(s) are prime. A 1NF relation R is in BCNF if for every functional dependency X®Y, X is a superkey. Properties of BCNF All non-prime attributes are fully dependent on every key. All prime attributes are fully dependent on the keys that they do not belong to. No attribute is fully dependent on any set of non-prime attributes.

Boyce-Codd Normal Form Formally: A relation scheme R defined over U = {A1, A2, …, An} is in BCNF if for all functional dependencies that hold on R of the form X® A, where X  U and AÎ U, at least one of the following holds : X® A is a trivial functional dependency X is a superkey for R No transitive dependencies.

BCNF – Example Our relations are all in BCNF Assume, however, that in the PROJECT relation there was another attribute HEADQTR, and the headquarters of a project was the location of the manager, and project had to be headquartered at a different city FDs would be PROJECT PJNO PNAME BUDGET MGR HEADQTR which makes PROJECT in 3NF but not in BCNF

Multivalued Dependency Relation R defined over U = {A1, A1, ..., An}; XÍ U, Y Í U, ZÍ U. If for each value of X in R, there is a set of Y values and this set is only dependent on the value of X and independent of the value of Z, then Y has a multivalued dependency (MVD) on X (X ®® Y). w1 w2 p1 p2 p3 s1 s2 s3 s4 P S W p4 p5

Multivalued Dependency – Formal Relation R defined over U = {A1, A1, ..., An}; XÍ U, Y Í U. Multivalued dependency X ®® Y holds on R if in any legal relation r of R, for all pairs of tuples t1 and t2 in r such that t1[X]= t2[X], there exists tuples t3 and t4 in r such that: 1. t1[X]= t2[X]= t3[X]= t4[X] 2. t3[Y]= t1[Y] 3. t3[U–Y]= t2[U–Y] 4. t4[Y]= t2[Y] 5. t4[U–Y]= t1[U–Y]

MVD Example Assume relation SKILL(ENO, PNO, PLACE) MVDs Each engineer can work on each project; Each engineer can work at any of the branch offices; Each project can be carried out at any of the branch offices. MVDs ENO ®® PNO ENO ®® PLACE

MVD Example Note: Information repetition SKILL ENO PNO PLACE E1 P1 Toronto E1 P1 New York E1 P1 London Note: Information repetition E1 P2 Toronto E1 P2 New York E1 P2 London E2 P1 Toronto E2 P1 New York E2 P1 London E2 P2 Toronto E2 P2 New York E2 P2 London

Fourth Normal Form (4NF) For each MVD of type X  Y in a relation R, X also functionally determines all the attributes of R. If a relation is in BCNF and all MVDs are also FDs, the relation is in 4NF. Formally: A relational scheme R defined over U={A1, A2, ..., An} is in 4NF with respect to a set D of functional and multivalued dependencies if for all multivalued dependencies in D+ of the form X ®® Y, where X Í U and YÍ U , at least one of the following hold : X ®® Y is a trivial multivalued dependency X is a superkey for scheme R

4NF – Formal A relational scheme R defined over U = {A1, A2, ..., An} is in 4NF with respect to a set D of functional and multivalued dependencies if for all multivalued dependencies in D+ of the form X  Y, where XÍ U and YÍ U, at least one of the following hold : X ®® Y is a trivial multivalued dependency X ®® Y is a trivial MVD if Y Í X or X È Y = R X is a superkey for scheme R