PMIT-6102 Advanced Database Systems By- Jesmin Akhter Assistant Professor, IIT, Jahangirnagar University.

Presentation transcript:

PMIT-6102 Advanced Database Systems By- Jesmin Akhter Assistant Professor, IIT, Jahangirnagar University

Lecture 03 Relational Database Design Normalization

Outline
- Overview of Relational DBMS
- Normalization (1st lecture)

Normalization
The aim of normalization is to eliminate various anomalies (or undesirable aspects) of a relation in order to obtain "better" relations. The following four problems might exist in a relation scheme:
- Repetition anomaly
- Update anomaly
- Insertion anomaly
- Deletion anomaly

Repetition Anomaly
The ENAME, TITLE, SAL attribute values are repeated for each project that the employee is involved in.
- Waste of space
- Complicates updates
- Contrary to the spirit of databases

EMP
ENO  ENAME      TITLE       SAL    PNO  RESP        DUR
E1   J. Doe     Elect. Eng  ...    P1   Manager     12
E2   M. Smith   Analyst     34000  P1   Analyst     ...
E2   M. Smith   Analyst     34000  P2   Analyst     6
E3   A. Lee     Mech. Eng   ...    P3   Consultant  10
E3   A. Lee     Mech. Eng   ...    P4   Engineer    48
E4   J. Miller  Programmer  24000  P2   Programmer  18
E5   B. Casey   Syst. Anal  ...    P2   Manager     24
E6   L. Chu     Elect. Eng  ...    P4   Manager     48
E7   R. Davis   Mech. Eng   ...    P3   Engineer    36
E8   J. Jones   Syst. Anal  ...    P3   Manager     40

Update Anomaly
If an attribute of an employee (say SAL) is updated, every tuple that repeats that employee's data has to be updated to reflect the change.
(EMP relation as shown under Repetition Anomaly.)

Insertion Anomaly
It may not be possible to store information about a new project until an employee is assigned to it.
(EMP relation as shown under Repetition Anomaly.)

Deletion Anomaly
If an engineer who is the only employee on a project leaves the company, either his personal information cannot be deleted or the information about that project is lost. Many tuples may have to be deleted.
(EMP relation as shown under Repetition Anomaly.)

What to do?
Take each relation individually and "improve" it in terms of the desired characteristics.
- Normal forms
  - Atomic values (1NF)
  - Can be defined according to keys and dependencies
  - Functional dependencies (2NF, 3NF, BCNF)
  - Multivalued dependencies (4NF)
- Normalization
  - Normalization is a process of concept separation which applies a top-down methodology for producing a schema by subsequent refinements and decompositions.
  - Do not combine unrelated sets of facts in one table; each relation should contain an independent set of facts.
  - Universal relation assumption

Normalization Issues
How do we decompose a schema into a desirable normal form?
What criteria should the decomposed schemas follow in order to preserve the semantics of the original schema?
- Reconstructability: recover the original relation → no spurious joins
- Lossless decomposition: no information loss
- Dependency preservation: the constraints (i.e., dependencies) that hold on the original relation should be enforceable by means of the constraints (i.e., dependencies) defined on the decomposed relations.

A Combined Schema Without Repetition
Consider combining relations
- sec_class(sec_id, building, room_number) and
- section(course_id, sec_id, semester, year)
into one relation
- section(course_id, sec_id, semester, year, building, room_number)
No repetition in this case.

What About Smaller Schemas?
Suppose we had started with inst_dept. How would we know to split it up (decompose it) into instructor and department?
Write a rule: "if there were a schema (dept_name, building, budget), then dept_name would be a candidate key".
Denote it as a functional dependency: dept_name → building, budget
In inst_dept, because dept_name is not a candidate key, the building and budget of a department may have to be repeated.
- This indicates the need to decompose inst_dept.
Not all decompositions are good. Suppose we decompose
  employee(ID, name, street, city, salary)
into
  employee1(ID, name)
  employee2(name, street, city, salary)
The next slide shows how we lose information: we cannot reconstruct the original employee relation, and so this is a lossy decomposition.

A Lossy Decomposition
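
The loss can be illustrated with a small sketch. The two employee rows below are hypothetical (they are not taken from the slide), chosen so that two employees share a name; `project` and `natural_join` are illustrative helpers, not part of any library.

```python
def project(rows, attrs):
    """Relational projection: keep only attrs and drop duplicate tuples."""
    seen, out = set(), []
    for row in rows:
        key = tuple(row[a] for a in attrs)
        if key not in seen:
            seen.add(key)
            out.append(dict(zip(attrs, key)))
    return out


def natural_join(left, right):
    """Natural join: combine rows that agree on every shared attribute."""
    out = []
    for l in left:
        for r in right:
            shared = set(l) & set(r)
            if all(l[a] == r[a] for a in shared):
                out.append({**l, **r})
    return out


def as_set(rows):
    """Order-independent view of a relation, used for comparison."""
    return {tuple(sorted(row.items())) for row in rows}


# Hypothetical instance: two different employees who share a name.
employee = [
    {"ID": 1, "name": "Kim", "street": "Main", "city": "Perryridge", "salary": 75000},
    {"ID": 2, "name": "Kim", "street": "North", "city": "Hampton", "salary": 67000},
]

employee1 = project(employee, ["ID", "name"])
employee2 = project(employee, ["name", "street", "city", "salary"])

rejoined = natural_join(employee1, employee2)
spurious = as_set(rejoined) - as_set(employee)

print(len(employee), len(rejoined), len(spurious))
```

Joining on name pairs each ID with both addresses, so the rejoined relation contains tuples that were never in employee. Having more tuples means having less information: it is no longer known which address belongs to which ID.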

Example of Lossless-Join Decomposition
Lossless join decomposition
Decomposition of R = (A, B, C): R1 = (A, B), R2 = (B, C)

r
A  B  C
α  1  A
β  2  B

Π_{A,B}(r)
A  B
α  1
β  2

Π_{B,C}(r)
B  C
1  A
2  B

Π_{A,B}(r) ⋈ Π_{B,C}(r)
A  B  C
α  1  A
β  2  B

The join of the two projections returns exactly r, so the decomposition is lossless.
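
As a rough check of this example, the sketch below projects r onto R1 and R2 and joins the projections back. Relations are modelled as sets of tuples with an explicit attribute list; the helper names and the A values "α" and "β" are assumptions made for illustration.

```python
def project(relation, schema, attrs):
    """Projection of a set of tuples onto attrs (duplicates vanish in the set)."""
    idx = [schema.index(a) for a in attrs]
    return {tuple(t[i] for i in idx) for t in relation}


def natural_join(r1, s1, r2, s2):
    """Natural join of two tuple sets with schemas s1 and s2."""
    common = [a for a in s1 if a in s2]
    extra = [a for a in s2 if a not in s1]
    result = set()
    for t1 in r1:
        for t2 in r2:
            if all(t1[s1.index(a)] == t2[s2.index(a)] for a in common):
                result.add(t1 + tuple(t2[s2.index(a)] for a in extra))
    return result, s1 + extra


R = ["A", "B", "C"]
r = {("α", 1, "A"), ("β", 2, "B")}  # assumed instance of R

r1 = project(r, R, ["A", "B"])
r2 = project(r, R, ["B", "C"])
joined, joined_schema = natural_join(r1, ["A", "B"], r2, ["B", "C"])

print(joined_schema, joined == r)
```

The comparison holds for this instance because each B value appears only once in r. A single instance can only illustrate the property; a decomposition is lossless when the equality holds for every legal instance of R.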

Stages of Normalization
1. Unnormalized (UDF)
2. First normal form (1NF): reached by removing repeating groups
3. Second normal form (2NF): reached by removing partial dependencies
4. Third normal form (3NF): reached by removing transitive dependencies
5. Boyce-Codd normal form (BCNF): reached by removing remaining functional dependency anomalies
6. Fourth normal form (4NF): reached by removing multivalued dependencies
7. Fifth normal form (5NF): reached by removing remaining anomalies

Repeating Groups
A repeating group is an attribute (or set of attributes) that can have more than one value for a primary key value.
Example: we have the following relation that contains staff and department details and a list of telephone contact numbers for each member of staff.

staffNo  job       dept  dname       city       contactNumber
SL10     Salesman  10    Sales       Stratford  ..., ..., ...
SA51     Manager   20    Accounts    Barking    ...
DS40     Clerk     20    Accounts    Barking    Null
OS45     Clerk     30    Operations  Barking    ...

Repeating groups are not allowed in a relational design, since all attributes have to be "atomic", i.e., there can only be one value per cell in a table!

Multivalued Attributes (or repeating groups): non-key attributes, or groups of non-key attributes, whose values are not uniquely identified (directly or indirectly) by the value of the primary key or a part of it; that is, they are not functionally dependent on the primary key.

STUDENT
Stud_ID  Name     Course_ID         Units
101      Lennon   MSI 250, MSI ...  ...
...      Johnson  MSI ...           ...

Course_ID and Units are the repeating groups.
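
One way to remove a repeating group is to emit one row per value. The sketch below does this for a staff-like relation; the phone numbers are invented placeholders (the originals are not available), and `to_1nf` is a hypothetical helper name.

```python
def to_1nf(rows, group_attr, atomic_attr):
    """Flatten a list-valued attribute into one row per value.

    A row whose list is empty keeps a single row with None,
    mirroring the Null contact number in the staff example.
    """
    flat = []
    for row in rows:
        rest = {k: v for k, v in row.items() if k != group_attr}
        for value in row[group_attr] or [None]:
            flat.append({**rest, atomic_attr: value})
    return flat


# Placeholder data: the contact numbers are made up for illustration.
unnormalized = [
    {"staffNo": "SL10", "job": "Salesman", "contactNumbers": ["555-0101", "555-0102"]},
    {"staffNo": "SA51", "job": "Manager", "contactNumbers": ["555-0103"]},
    {"staffNo": "DS40", "job": "Clerk", "contactNumbers": []},
]

staff_1nf = to_1nf(unnormalized, "contactNumbers", "contactNumber")
for row in staff_1nf:
    print(row)
```

Every cell now holds a single value, at the cost of repeating staffNo and job for each number. That repetition is what the later normal forms address, typically by moving the numbers into a relation of their own.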

Functional Dependency
Formal Definition: Attribute B is functionally dependent upon attribute A (or a collection of attributes) if a value of A determines a single value of attribute B at any one time.
Formal Notation: A → B
This should be read as "A determines B" or "B is functionally dependent on A". A is called the determinant and B is called the object of the determinant.

Example:
staffNo  job       dept  dname
SL10     Salesman  10    Sales
SA51     Manager   20    Accounts
DS40     Clerk     20    Accounts
OS45     Clerk     30    Operations

Functional Dependencies
  staffNo → job
  staffNo → dept
  staffNo → dname
  dept → dname
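
The definition can be tested against a relation instance. The sketch below uses the four staff rows from the example; `fd_holds` is an illustrative helper, not a library function.

```python
def fd_holds(rows, lhs, rhs):
    """Return True if no two rows agree on lhs but differ on rhs."""
    seen = {}
    for row in rows:
        determinant = tuple(row[a] for a in lhs)
        dependent = tuple(row[a] for a in rhs)
        # setdefault stores the first value seen for this determinant
        if seen.setdefault(determinant, dependent) != dependent:
            return False
    return True


staff = [
    {"staffNo": "SL10", "job": "Salesman", "dept": 10, "dname": "Sales"},
    {"staffNo": "SA51", "job": "Manager", "dept": 20, "dname": "Accounts"},
    {"staffNo": "DS40", "job": "Clerk", "dept": 20, "dname": "Accounts"},
    {"staffNo": "OS45", "job": "Clerk", "dept": 30, "dname": "Operations"},
]

print(fd_holds(staff, ["staffNo"], ["job"]))   # staffNo -> job
print(fd_holds(staff, ["dept"], ["dname"]))    # dept -> dname
print(fd_holds(staff, ["job"], ["dept"]))      # job -> dept (Clerk maps to 20 and 30)
```

A check like this can only refute a dependency. A functional dependency is a constraint on every legal instance, so sample data that happens to satisfy it does not prove that it holds in general.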

Functional Dependency
Full Functional Dependency: only of relevance with composite determinants. This is the situation in which all the attributes of the composite determinant are needed to identify its object uniquely.
Composite Determinants: if more than one attribute is necessary to determine another attribute in an entity, then such a determinant is termed a composite determinant.
Example, full functional dependencies in a relation with attributes order#, line#, qty, price:
  (order#, line#) → qty
  (order#, line#) → price

Functional Dependency
Partial Functional Dependency: this is the situation that exists when only a subset of the attributes of the composite determinant is needed to identify its object uniquely.
Example, in a relation with attributes student#, unit#, room, grade:
  (student#, unit#) → grade   (full functional dependency)
  unit# → room                (partial functional dependency)
Because room depends on unit# alone, the same room value is stored again for every student taking that unit. Repetition of data!

Partial Dependency: when a non-key attribute is determined by a part, but not the whole, of a COMPOSITE primary key.

Transitive Dependency
Definition: A transitive dependency exists when there is an intermediate functional dependency.
Formal Notation: If A → B and B → C, then it can be stated that the following transitive dependency exists: A → B → C

Example:
staffNo  job       dept  dname
SL10     Salesman  10    Sales
SA51     Manager   20    Accounts
DS40     Clerk     20    Accounts
OS45     Clerk     30    Operations

Transitive Dependencies
  staffNo → dept
  dept → dname
  staffNo → dept → dname
Repetition of data!

Transitive Dependency: when a non-key attribute determines another non-key attribute.

Normal Forms: Review
- Unnormalized: there are multivalued attributes or repeating groups
- 1NF: no multivalued attributes or repeating groups
- 2NF: 1NF plus no partial dependencies
- 3NF: 2NF plus no transitive dependencies

Example 1: Determine NF
  ISBN → Title
  ISBN → Publisher
  Publisher → Address
All attributes are directly or indirectly determined by the primary key; therefore, the relation is at least in 1NF.

Example 1: Determine NF
  ISBN → Title
  ISBN → Publisher
  Publisher → Address
The relation is at least in 1NF. There is no COMPOSITE primary key, therefore there can't be partial dependencies. Therefore, the relation is at least in 2NF.

Example 1: Determine NF
  ISBN → Title
  ISBN → Publisher
  Publisher → Address
Publisher is a non-key attribute, and it determines Address, another non-key attribute. Therefore, there is a transitive dependency, which means that the relation is NOT in 3NF.

Example 1: Determine NF
  ISBN → Title
  ISBN → Publisher
  Publisher → Address
We know that the relation is at least in 2NF, and it is not in 3NF. Therefore, we conclude that the relation is in 2NF.

Example 1: Determine NF
  ISBN → Title
  ISBN → Publisher
  Publisher → Address
In your solution you will write the following justification:
1) No M/V attributes, therefore at least 1NF
2) No partial dependencies, therefore at least 2NF
3) There is a transitive dependency (Publisher → Address), therefore not 3NF
Conclusion: The relation is in 2NF

Example 2: Determine NF
  Product_ID → Description
All attributes are directly or indirectly determined by the primary key; therefore, the relation is at least in 1NF.

Example 2: Determine NF
  Product_ID → Description
The relation is at least in 1NF. There is a COMPOSITE Primary Key (PK) (Order_No, Product_ID), therefore there can be partial dependencies. Product_ID, which is a part of the PK, determines Description; hence, there is a partial dependency. Therefore, the relation is not in 2NF. There is no point in checking for transitive dependencies!

Example 2: Determine NF
  Product_ID → Description
We know that the relation is at least in 1NF, and it is not in 2NF. Therefore, we conclude that the relation is in 1NF.

Example 2: Determine NF
  Product_ID → Description
In your solution you will write the following justification:
1) No M/V attributes, therefore at least 1NF
2) There is a partial dependency (Product_ID → Description), therefore not in 2NF
Conclusion: The relation is in 1NF
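
The reasoning used in both examples can be sketched as a small procedure. It follows the simplified definitions from the review slide (partial dependency: part of a composite key determines a non-key attribute; transitive dependency: a non-key attribute determines another non-key attribute) and assumes the relation is already in 1NF. The function names are illustrative only.

```python
def partial_dependencies(key, fds):
    """FDs whose determinant is a proper part of a composite key."""
    key = set(key)
    return [
        (lhs, rhs)
        for lhs, rhs in fds
        if len(key) > 1 and set(lhs) < key and not set(rhs) <= key
    ]


def transitive_dependencies(key, fds):
    """FDs in which non-key attributes determine other non-key attributes."""
    key = set(key)
    return [
        (lhs, rhs)
        for lhs, rhs in fds
        if set(lhs).isdisjoint(key) and set(rhs).isdisjoint(key)
    ]


def normal_form(key, fds):
    """Highest of 1NF/2NF/3NF satisfied, assuming 1NF already holds."""
    if partial_dependencies(key, fds):
        return "1NF"
    if transitive_dependencies(key, fds):
        return "2NF"
    return "3NF"


# Example 1: primary key ISBN
example1 = [
    (("ISBN",), ("Title",)),
    (("ISBN",), ("Publisher",)),
    (("Publisher",), ("Address",)),
]
# Example 2: composite primary key (Order_No, Product_ID)
example2 = [(("Product_ID",), ("Description",))]

print(normal_form(("ISBN",), example1))
print(normal_form(("Order_No", "Product_ID"), example2))
```

A result of "3NF" here only means that neither check fired; the sketch does not test BCNF or the higher forms, and it relies on the caller to list the dependencies and the key correctly.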

Thank You