Database Normalization

Slides:



Advertisements
Similar presentations
The Relational Model System Development Life Cycle Normalisation
Advertisements

1 Database Systems: A Practical Approach to Design, Implementation and Management International Computer Science S. Carolyn Begg, Thomas Connolly Lecture.
Normalization I.
Chapter 5 Normalization Transparencies © Pearson Education Limited 1995, 2005.
Chapter 14 Advanced Normalization Transparencies © Pearson Education Limited 1995, 2005.
Introduction to Schema Refinement. Different problems may arise when converting a relation into standard form They are Data redundancy Update Anomalies.
Normalization. Introduction Badly structured tables, that contains redundant data, may suffer from Update anomalies : Insertions Deletions Modification.
FUNCTIONAL DEPENDENCIES
Lecture 12 Inst: Haya Sammaneh
Normalization. 2 Objectives u Purpose of normalization. u Problems associated with redundant data. u Identification of various types of update anomalies.
Chapter 13 Normalization Transparencies. 2 Last Class u Access Lab.
Normalization. Learners Support Publications 2 Objectives u The purpose of normalization. u The problems associated with redundant data.
1 Pertemuan 23 Normalisasi Matakuliah: >/ > Tahun: > Versi: >
Lecture 6 Normalization: Advanced forms. Objectives How inference rules can identify a set of all functional dependencies for a relation. How Inference.
Normalization Transparencies
Normalization Fundamentals of Database Systems. Lilac Safadi Normalization 2 Database Design Steps in building a database for an application: Real-world.
Chapter 13 Normalization Transparencies. 2 Chapter 13 - Objectives u Purpose of normalization. u Problems associated with redundant data. u Identification.
11/07/2003Akbar Mokhtarani (LBNL)1 Normalization of Relational Tables Akbar Mokhtarani LBNL (HENPC group) November 7, 2003.
Chapter 13 Normalization © Pearson Education Limited 1995, 2005.
Lecture 5 Normalization. Objectives The purpose of normalization. How normalization can be used when designing a relational database. The potential problems.
Chapter 13 Normalization Transparencies Last Updated: 08 th May 2011 By M. Arief
Chapter 10 Normalization Pearson Education © 2009.
Chapter 7 Normalization Chapter 14 & 15 in Textbook.
© Pearson Education Limited, Normalization Bayu Adhi Tama, M.T.I. Faculty of Computer Science University of Sriwijaya.
9/23/2012ISC329 Isabelle Bichindaritz1 Normalization.
Normalization. 2 u Main objective in developing a logical data model for relational database systems is to create an accurate representation of the data,
Lecture 3 Functional Dependency and Normal Forms Prof. Sin-Min Lee Department of Computer Science.
11/10/2009GAK1 Normalization. 11/10/2009GAK2 Learning Objectives Definition of normalization and its purpose in database design Types of normal forms.
Normalization. Overview Earliest  formalized database design technique and at one time was the starting point for logical database design. Today  is.
ITD1312 Database Principles Chapter 4C: Normalization.
Chapter 7 Normalization Chapter 14 & 15 in Textbook.
SLIDE 1IS 257 – Fall 2006 Normalization Normalization theory is based on the observation that relations with certain properties are more effective.
1 CS490 Database Management Systems. 2 CS490 Database Normalization.
Chapter 9 Normalization Chapter 14 & 15 in Textbook.
Chapter 8 Relational Database Design Topic 1: Normalization Chuan Li 1 © Pearson Education Limited 1995, 2005.
SAN DIEGO SUPERCOMPUTER CENTER Introduction to Database Design July 2006 Ken Nunes sdsc.edu.
Normalization.
Advanced Normalization
Normalization Karolina muszyńska
Functional Dependency
Normalization DBMS.
Normalization Dongsheng Lu Feb 21, 2003.
Chapter 15 Basics of Functional Dependencies and Normalization for Relational Databases.
Chapter 7 Normalization Chapter 14 & 15 in Textbook.
Advanced Normalization
Normalization Lecture 7 May Aldoayan.
Chapter 14 Normalization
Chapter 15 Basics of Functional Dependencies and Normalization for Relational Databases.
Functional Dependencies and Normalization for Relational Databases
Normalization.
Chapter 9 Normalizing Database Designs
Database Management systems Subject Code: 10CS54 Prepared By:
Chapter 14 & Chapter 15 Normalization Pearson Education © 2009.
Normalization.
Normalization and FD.
Normalization Dongsheng Lu Feb 21, 2003.
Chapter 7 Normalization Chapter 13 in Textbook.
Chapter 14 Normalization – Part I Pearson Education © 2009.
Normalization Dale-Marie Wilson, Ph.D..
Normalization.
Database Design Agenda
Normalization Normalization theory is based on the observation that relations with certain properties are more effective in inserting, updating and deleting.
Chapter 14 Normalization.
Minggu 9, Pertemuan 18 Normalization
Chapter 14 Normalization.
Normalization February 28, 2019 DB:Normalization.
國立臺北科技大學 課程:資料庫系統 2015 fall Chapter 14 Normalization.
Relational Database Design
Introduction to Database Design
Chapter 7 Normalization Chapter 14 & 15 in Textbook.
Presentation transcript:

Database Normalization Dr. M Krishnamurthy Professor / CSE / SVCE Name and level of experience with topic.

Normalization A logical design method which minimizes data redundancy and reduces design flaws. Consists of applying various “normal” forms to the database design. The normal forms break down large tables into smaller subsets. Accomplish normalization by analyzing the interdependencies among attributes in tables and taking subsets of larger tables to form smaller ones. The subsets are created from examining the interdependencies among the table attributes.

Anomalies Relations that have redundant data may have problems called update anomalies, which are classified as: Insertion anomalies Deletion anomalies Modification anomalies

Example of Anomalies To insert a new staff with branchNo B007 into the StaffBranch relation; To delete a tuple that represents the last member of staff located at a branch B007; To change the address of branch B003. StaffBranch staffNo sName position salary branchNo bAddress SL21 John White Manager 30000 B005 22 Deer Rd, London SG37 Ann Beech Assistant 12000 B003 163 Main St,Glasgow SG14 David Ford Supervisor 18000 SA9 Mary Howe 9000 B007 16 Argyll St, Aberdeen SG5 Susan Brand 24000 SL41 Julie Lee

Example of Anomalies (Contd ...) Staff staffNo sName position salary branceNo SL21 John White Manager 30000 B005 SG37 Ann Beech Assistant 12000 B003 SG14 David Ford Supervisor 18000 SA9 Mary Howe 9000 B007 SG5 Susan Brand 24000 SL41 Julie Lee Branch branceNo bAddress B005 22 Deer Rd, London B007 16 Argyll St, Aberdeen B003 163 Main St,Glasgow

Relationship of Normal Forms

Stages of Normalisation First normal form (1NF) Remove repeating groups Second normal form (2NF) Remove partial dependencies Third normal form (3NF) Remove transitive dependencies Boyce-Codd normal form (BCNF) Remove remaining functional dependency anomalies Fourth normal form (4NF) Remove multivalued dependencies Fifth normal form (5NF) Remove remaining anomalies

Unnormalized Relations Normalization Unnormalized Relations First normal form Second normal form Boyce- Codd and Higher Third normal form Functional dependencyof nonkey attributes on the primary key - Atomic values only Full Functional dependencyof nonkey attributes on the primary key No transitive dependency between nonkey attributes All determinants are candidate keys - Single multivalued dependency 8

First Normal Form (1NF) Each attribute must be atomic: First Normal Form is a relation in which the intersection of each row and column contains one and only one value. No repeating columns within a row. No multi-valued columns.

Employee (unnormalized) 1NF Employee (unnormalized) Employee (1NF)

Second Normal Form (2NF) Each attribute must be functionally dependent on the primary key. Functional dependence - the property of one or more attributes that uniquely determines the value of other attributes. Any non-dependent attributes are moved into a smaller (subset) table. 2NF improves data integrity. Prevents update, insert, and delete anomalies.

Functional Dependencies Functional dependency describes the relationship between attributes in a relation. For example, if A and B are attributes of relation R, and B is functionally dependent on A ( denoted A B), if each value of A is associated with exactly one value of B. ( A and B may each consist of one or more attributes.) B is functionally A B dependent on A Determinant Refers to the attribute or group of attributes on the left-hand side of the arrow of a functional dependency

Functional dependency Full functional dependency indicates that if A and B are attributes of a relation, B is fully functionally dependent on A if B is functionally dependent on A, but not on any proper subset of A. A functional dependency AB is partially dependent if there is some attributes that can be removed from A and the dependency still holds.

Functional Dependencies (Contd ...) Trival functional dependency means that the right-hand side is a subset ( not necessarily a proper subset) of the left- hand side. For example: staffNo, sName  sName staffNo, sName  staffNo They do not provide any additional information about possible integrity constraints on the values held by these attributes. We are normally more interested in nontrivial dependencies because they represent integrity constraints for the relation.

Identifying the primary key Functional dependency is a property of the meaning or semantics of the attributes in a relation. When a functional dependency is present, the dependency is specified as a constraint between the attributes. An important integrity constraint to consider first is the identification of candidate keys, one of which is selected to be the primary key for the relation using functional dependency.

Identifying the primary key (Contd ...) Examples STORE(SNAME, ADDR, ZIP, ITEM, PRICE) FDs: SNAME  ADDR ADDR  ZIP SNAME, ITEM  PRICE Finding a key: SNAME does not appear in RHS, so SNAME must be a part of the key. Since SNAME  ADDR  ZIP, we know SNAME  ADDR, ZIP But SNAME alone cannot determine any more. How can we determine ITEM and PRICE? If we have ITEM, then we can determine PRICE So, SNAME, ITEM  SNAME, ADDR, ZIP, ITEM, PRICE. So it satisfies the definition of the key

Inference Rules Armstrong’s axioms Relfexivity: If B is a subset of A, them A  B Augmentation: If A  B, then A, C  B Transitivity: If A  B and B  C, then A C Self-determination: A  A Decomposition: If A  B,C then A  B and A C Union: If A  B and A  C, then A B,C Composition: If A  B and C  D, then A,C B,D

Functional Dependence Employee (1NF) Name, dept_no, and dept_name are functionally dependent on emp_no. (emp_no -> name, dept_no, dept_name) Skills is not functionally dependent on emp_no since it is not unique to each emp_no.

2NF Employee (1NF) Employee (2NF) Skills (2NF)

Data Integrity Employee (1NF) Insert Anomaly - adding null values. eg, inserting a new department does not require the primary key of emp_no to be added. Update Anomaly - multiple updates for a single name change, causes performance degradation. eg, changing IT dept_name to IS Delete Anomaly - deleting wanted information. eg, deleting the IT department removes employee Barbara Jones from the database

Third Normal Form (3NF) Remove transitive dependencies. Transitive dependence - two separate entities exist within one table. Any transitive dependencies are moved into a smaller (subset) table. 3NF further improves data integrity. Prevents update, insert, and delete anomalies.

Transitive Dependence Employee (2NF) dept_name is functionally dependent on dept_no. dept_no is functionally dependent on emp_no, so via the middle step of dept_no, dept_name is functionally dependent on emp_no. (emp_no -> dept_no , dept_no -> dept_name, thus emp_no -> dept_name) Note, dept_name is functionally dependent on dept_no. Dept_no is functionally dependent on emp_no, so via the middle step of dept_no, dept_name is functionally dependent on emp_no. (emp_no -> dept_no , dept_no -> dept_name, thus emp_no -> dept_name)

3NF Employee (2NF) Employee (3NF) Department (3NF)

Other Normal Forms Boyce-Codd Normal Form (BCNF) Strengthens 3NF by requiring the keys in the functional dependencies to be superkeys (a column or columns that uniquely identify a row) Fourth Normal Form (4NF) Eliminate trivial multivalued dependencies. Fifth Normal Form (5NF) Eliminate dependencies not determined by keys.