Ahsan Abdullah 1 Data Warehousing Lecture-6Normalization Virtual University of Pakistan Ahsan Abdullah Assoc. Prof. & Head Center for Agro-Informatics.

Slides:



Advertisements
Similar presentations
 Definition  Components  Advantages  Limitations Contents  Definition Definition  Normal Forms Normal Forms  First Normal Form First Normal Form.
Advertisements

Normalisation The theory of Relational Database Design.
Fundamentals, Design, and Implementation, 9/e Chapter 4 The Relational Model and Normalization.
Database Design Conceptual –identify important entities and relationships –determine attribute domains and candidate keys –draw the E-R diagram Logical.
Boyce-Codd Normal Form Kelvin Nishikawa SE157a-03 Fall 2006 Kelvin Nishikawa SE157a-03 Fall 2006.
1 5 Concepts of Database Management, 4 th Edition, Pratt & Adamski Chapter 5 Database Design: Normalization.
Chapter 5 Normalization of Database Tables
Lecture-33 DWH Implementation: Goal Driven Approach (1)
1 5 Concepts of Database Management, 4 th Edition, Pratt & Adamski Chapter 5 Database Design 1: Normalization.
NORMALIZATION N. HARIKA (CSC).
DWH-Ahsan Abdullah 1 Data Warehousing Lab Lect-2 Lab Data Set Virtual University of Pakistan Ahsan Abdullah Assoc. Prof. & Head Center for Agro-Informatics.
Chapter 3 The Relational Model and Normalization
XP Chapter 1 Succeeding in Business with Microsoft Office Access 2003: A Problem-Solving Approach 1 Level 3 Objectives: Identifying and Eliminating Database.
Functional Dependence An attribute A is functionally dependent on attribute(s) B if: given a value b for B there is one and only one corresponding value.
Concepts and Terminology Introduction to Database.
CBAD2103 Data Analysis and Modeling. Chapter 7 Conceptual Design Methodology.
1 Data Warehousing Lecture-13 Dimensional Modeling (DM) Virtual University of Pakistan Ahsan Abdullah Assoc. Prof. & Head Center for Agro-Informatics Research.
Avoiding Database Anomalies
Normalization. 2 Objectives u Purpose of normalization. u Problems associated with redundant data. u Identification of various types of update anomalies.
NormalizationNormalization Chapter 4. Purpose of Normalization Normalization  A technique for producing a set of relations with desirable properties,
Database Systems: Design, Implementation, and Management Tenth Edition
RDBMS Concepts/ Session 3 / 1 of 22 Objectives  In this lesson, you will learn to:  Describe data redundancy  Describe the first, second, and third.
Chapter 4 The Relational Model and Normalization.
Concepts of Database Management, Fifth Edition
A Normalisation Example Mark Kelly McKinnon Secondary College Vceit.com Based on work by Robert Timmer-Arends.
Database Systems: Design, Implementation, and Management Ninth Edition Chapter 6 Normalization of Database Tables.
Ahsan Abdullah 1 Data Warehousing Lecture-7De-normalization Virtual University of Pakistan Ahsan Abdullah Assoc. Prof. & Head Center for Agro-Informatics.
Concepts of Relational Databases. Fundamental Concepts Relational data model – A data model representing data in the form of tables Relations – A 2-dimensional.
Normalization. We will take a look at –First Normal Form –Second Normal Form –Third Normal Form There are also –Boyce-Codd, Fourth and Fifth normal forms.
SALINI SUDESH. Primarily a tool to validate and improve a logical design so that it satisfies certain constraints that avoid unnecessary duplication of.
M Taimoor Khan Course Objectives 1) Basic Concepts 2) Tools 3) Database architecture and design 4) Flow of data (DFDs)
In this chapter, you learn about the following: ❑ Anomalies ❑ Dependency and determinants ❑ Normalization ❑ A layman’s method of understanding normalization.
Chapter 7 1 Database Principles Data Normalization Primarily a tool to validate and improve a logical design so that it satisfies certain constraints that.
DAVID M. KROENKE’S DATABASE PROCESSING, 10th Edition © 2006 Pearson Prentice Hall, Modified by Dr. Mathis 3-1 David M. Kroenke’s Chapter Three: The Relational.
MS Access: Creating Relational Databases Instructor: Vicki Weidler Assistant: Joaquin Obieta.
Ahsan Abdullah 1 Data Warehousing Lecture-9 Issues of De-normalization Virtual University of Pakistan Ahsan Abdullah Assoc. Prof. & Head Center for Agro-Informatics.
Normalization Well structured relations and anomalies Normalization First normal form (1NF) Functional dependence Partial functional dependency Second.
Data Warehousing 1 Lecture-28 Need for Speed: Join Techniques Virtual University of Pakistan Ahsan Abdullah Assoc. Prof. & Head Center for Agro-Informatics.
11/07/2003Akbar Mokhtarani (LBNL)1 Normalization of Relational Tables Akbar Mokhtarani LBNL (HENPC group) November 7, 2003.
1 Data Warehousing Lecture-15 Issues of Dimensional Modeling Virtual University of Pakistan Ahsan Abdullah Assoc. Prof. & Head Center for Agro-Informatics.
Data Warehousing Lecture-30 What can Data Mining do? Virtual University of Pakistan Ahsan Abdullah Assoc. Prof. & Head Center for Agro-Informatics Research.
Normalization Transparencies 1. ©Pearson Education 2009 Objectives How the technique of normalization is used in database design. How tables that contain.
In this session, you will learn to: Describe data redundancy Describe the first, second, and third normal forms Describe the Boyce-Codd Normal Form Appreciate.
Programming Logic and Design Fourth Edition, Comprehensive Chapter 16 Using Relational Databases.
A337 - Reed Smith1 Structure What is a database? –Table of information Rows are referred to as records Columns are referred to as fields Record identifier.
Lecture Nine: Normalization
Database Processing: Fundamentals, Design and Implementation, 9/e by David M. KroenkeChapter 4/1 Copyright © 2004 Please……. No Food Or Drink in the class.
Normalization. 2 u Main objective in developing a logical data model for relational database systems is to create an accurate representation of the data,
Normalisation RELATIONAL DATABASES.  Last week we looked at elements of designing a database and the generation of an ERD  As part of the design and.
NORMALIZATION. What is Normalization  The process of effectively organizing data in a database  Two goals  To eliminate redundant data  Ensure data.
Brian Thoms.  Databases normalization The systematic way of ensuring that a database structure is suitable for general-purpose querying and free of certain.
Logical Database Design and the Relational Model.
April 20022CS3X1 Database Design Normalisation (1) John Wordsworth Department of Computer Science The University of Reading Room.
RELATIONAL TABLE NORMALIZATION. Key Concepts Guidelines for Primary Keys Deletion anomaly Update anomaly Insertion anomaly Functional dependency Transitive.
Lecture 4: Logical Database Design and the Relational Model 1.
NormalisationNormalisation Normalization is the technique of organizing data elements into records. Normalization is the technique of organizing data elements.
Normalization ACSC 425 Database Management Systems.
Objectives of Normalization  To create a formal framework for analyzing relation schemas based on their keys and on the functional dependencies among.
Advanced Database System
Ahsan Abdullah 1 Data Warehousing Lecture-8 De-normalization Techniques Virtual University of Pakistan Ahsan Abdullah Assoc. Prof. & Head Center for Agro-Informatics.
NORMALIZATION Handout - 4 DBMS. What is Normalization? The process of grouping data elements into tables in a way that simplifies retrieval, reduces data.
Decomposition and Normalization Fan Qi
Logical Database Design and Relational Data Model Muhammad Nasir
SLIDE 1IS 257 – Fall 2006 Normalization Normalization theory is based on the observation that relations with certain properties are more effective.
Lecture # 17 Chapter # 10 Normalization Database Systems.
Database Normalization. What is Normalization Normalization allows us to organize data so that it: Normalization allows us to organize data so that it:
Lecture-32 DWH Lifecycle: Methodologies
Lecture-35 DWH Implementation: Pitfalls, Mistakes, Keys
Database Normalization.
Database.
Presentation transcript:

Ahsan Abdullah 1 Data Warehousing Lecture-6Normalization Virtual University of Pakistan Ahsan Abdullah Assoc. Prof. & Head Center for Agro-Informatics Research National University of Computers & Emerging Sciences, Islamabad

Ahsan Abdullah 2 Normalization

3Normalization What is normalization? What are the goals of normalization?  Eliminate redundant data.  Ensure data dependencies make sense. What is the result of normalization? What are the levels of normalization? Always follow purists approach of normalization?NO

Ahsan Abdullah 4Normalization SID: Student ID Degree: Registered as BS or MS student Campus: City where campus is located Course: Course taken Marks: Score out of max of 50 Consider a student database system to be developed for a multi-campus university, such that it specializes in one degree program at a campus i.e. BS, MS or PhD. SIDDegreeCampusCourseMarks 1BSIslamabadCS BSIslamabadCS BSIslamabadCS BSIslamabadCS BSIslamabadCS BSIslamabadCS MSLahoreCS MSLahoreCS MSLahoreCS BSIslamabadCS BSIslamabadCS BSIslamabadCS-10540

Ahsan Abdullah 5 Normalization: 1NF Only contains atomic values, BUT also contains redundant data. 40CS-105IslamabadBS4 30CS-104IslamabadBS4 20CS-102IslamabadBS4 20CS-102LahoreMS3 40CS-102LahoreMS2 30CS-101LahoreMS2 10CS-106IslamabadBS1 10CS-105IslamabadBS1 20CS-104IslamabadBS1 40CS-103IslamabadBS1 20CS-102IslamabadBS1 30CS-101IslamabadBS1 MarksCourseCampusDegreeSID FIRST

Ahsan Abdullah 6 Normalization: 1NF Update anomalies INSERT. Certain student with SID 5 got admission in a different campus (say) Karachi cannot be added until the student registers for a course. DELETE. If student graduates and his/her corresponding record is deleted, then all information about that student is lost. UPDATE. If student migrates from Islamabad campus to Lahore campus (say) SID = 1, then six rows would have to be updated with this new information.

Ahsan Abdullah 7 Normalization: 2NF Every non-key column is fully dependent on the PK FIRST is in 1NF but not in 2NF because degree and campus are functionally dependent upon only on the column SID of the composite key (SID, course). This can be illustrated by listing the functional dependencies in the table: SID —> campus, degree campus —> degree (SID, Course) —> Marks To transform the table FIRST into 2NF we move the columns SID, Degree and Campus to a new table called REGISTRATION. The column SID becomes the primary key of this new table. SID & Campus are NOT unique

Ahsan Abdullah 8 Normalization: 2NF SIDDegreeCampus 1BSIslamabad 2MSLahore 3MSLahore 4BSIslamabad 5PhDPeshawar SIDCourseMarks 1CS CS CS CS CS CS CS CS CS CS CS CS REGISTRATION PERFORMANCE SID is now a PK PERFORMANCE in 2NF as (SID, Course) uniquely identify Marks

Ahsan Abdullah 9 Normalization: 2NF Presence of modification anomalies for tables in 2NF. For the table REGISTRATION, they are:  INSERT: Until a student gets registered in a degree program, that program cannot be offered!  DELETE: Deleting any row from REGISTRATION destroys all other facts in the table. Why there are anomalies? The table is in 2NF but NOT in 3NF

Ahsan Abdullah 10 Normalization: 3NF All columns must be dependent only on the primary key. Table PERFORMANCE is already in 3NF. The non-key column, marks, is fully dependent upon the primary key (SID, degree). REGISTRATION is in 2NF but not in 3NF because it contains a transitive dependency. A transitive dependency occurs when a non-key column that is a determinant of the primary key is the determinate of other columns. The concept of a transitive dependency can be illustrated by showing the functional dependencies in REGISTRATION: REGISTRATION.SID —> REGISTRATION.Degree REGISTRATION.SID —> REGISTRATION.Campus REGISTRATION.Campus —> REGISTRATION.Degree Note that REGISTRATION.Degree is determined both by the primary key SID and the non-key column campus.

Ahsan Abdullah 11 Normalization: 3NF To transform REGISTRATION into 3NF, we create a new table called CAMPUS_DEGREE and move the columns campus and degree into it. Degree is deleted from the original table, campus is left behind to serve as a foreign key to CAMPUS_DEGREE, and the original table is renamed to STUDENT_CAMPUS to reflect its semantic meaning.

Ahsan Abdullah 12 Normalization: 3NF PeshawarPhD5 IslamabadBS4 LahoreMS3 LahoreMS2 IslamabadBS1 CampusDegreeSID REGISTRATION Peshawar5 Islamabad4 Lahore3 2 Islamabad1 CampusSID STUDENT_CAMPUS PhDPeshawar MSLahore BSIslamabad DegreeCampus CAMPUS_DEGREE

Ahsan Abdullah 13 Normalization: 3NF Removal of anomalies and improvement in queries as follows:  INSERT: Able to first offer a degree program, and then students registering in it.  UPDATE: Migrating students between campuses by changing a single row.  DELETE: Deleting information about a course, without deleting facts about all columns in the record.

Ahsan Abdullah 14Normalization Conclusions:  Normalization guidelines are cumulative.  Generally a good idea to only ensure 2NF.  3NF is at the cost of simplicity and performance.  There is a 4NF with no multi-valued dependencies.  There is also a 5NF.