A SIMPLE GUIDE TO FIVE NORMAL FORMS (See the next slide for required reading) Prof. Ghandeharizadeh 2018/11/14.



READING

This lecture is based on a seminal paper by William Kent: William Kent, A simple guide to five normal forms in relational database theory, Communications of the ACM, Volume 26, Number 2, pages 120-125, February 1983. All USC students should download this paper from the ACM Digital Library Portal.

Given an application, how should we structure the tables that support it? Solution: use the five normal forms as guidelines.

Why is this important? If one is not careful, then:
- Information might be duplicated, resulting in update anomalies and data inconsistencies.
- Information might be lost.

Terminology: attribute = field.

First normal form: all occurrences of a record type must contain the same number of fields. Example: Emp(SS#, name, age, salary, dno) is a relation with five attributes. All tuples/records of this table have five attributes.

Second and third normal forms: every non-key attribute must be a fact about the key, the whole key, and nothing but the key. A key is a set of one or more attributes which, taken collectively, allow us to uniquely identify one record from the others in a table.

The second normal form is violated when a non-key field is a fact about a subset of a key. This circumstance arises when the key is composite. In the record below the key is (PART, WAREHOUSE):

PART    WAREHOUSE    QUANTITY    WAREHOUSE-ADDRESS
......key........

Limitations of the above record:
- The warehouse address is repeated in every record (once for every part stored in the warehouse).
- If the address of a warehouse changes, many records must be updated.
- The data may become inconsistent if the update is not applied to every one of those records.
- If a warehouse becomes empty, the database management system might lose track of it, because no record remains to reflect the existence of the warehouse in the database.

Solution: satisfy the second normal form by decomposing the above relation into two relations:

PART    WAREHOUSE    QUANTITY
......key........

WAREHOUSE    WAREHOUSE-ADDRESS
...key...
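
The decomposition above can be sketched with Python's built-in sqlite3 module. This is only an illustration under assumptions: the table names (inventory, warehouse), the column names, and all sample values are hypothetical and do not come from the lecture.

```python
import sqlite3

# Hypothetical un-normalized records:
# (part, warehouse, quantity, warehouse_address)
unnormalized = [
    ("bolt", "W1", 100, "12 Main St"),
    ("nut", "W1", 250, "12 Main St"),
    ("bolt", "W2", 40, "9 Oak Ave"),
]

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE inventory (
        part TEXT,
        warehouse TEXT,
        quantity INTEGER,
        PRIMARY KEY (part, warehouse)   -- composite key
    );
    CREATE TABLE warehouse (
        warehouse TEXT PRIMARY KEY,     -- the address depends on this alone
        address TEXT
    );
""")

# Split every un-normalized record across the two relations.
conn.executemany(
    "INSERT INTO inventory VALUES (?, ?, ?)",
    [(p, w, q) for p, w, q, _ in unnormalized],
)
conn.executemany(
    "INSERT OR IGNORE INTO warehouse VALUES (?, ?)",
    [(w, a) for _, w, _, a in unnormalized],
)

# An address change now touches exactly one record.
conn.execute(
    "UPDATE warehouse SET address = ? WHERE warehouse = ?",
    ("77 Elm Rd", "W1"),
)

# Emptying warehouse W2 no longer erases the fact that W2 exists.
conn.execute("DELETE FROM inventory WHERE warehouse = ?", ("W2",))

warehouse_rows = conn.execute(
    "SELECT warehouse, address FROM warehouse ORDER BY warehouse"
).fetchall()
print(warehouse_rows)  # [('W1', '77 Elm Rd'), ('W2', '9 Oak Ave')]
```

Note that W2 still appears with its address after its last part is removed, which is the lost-information problem the decomposition is meant to avoid.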

The process of replacing un-normalized records with normalized records is termed normalization.

Advantage of normalization: it enhances the integrity of data by minimizing redundancy and inconsistency.

Disadvantage of normalization: performance might be lost. For example, the execution of the following query now requires the execution of a join operator: retrieve the address of all warehouses that contain part-id=5.

The third normal form is violated when a non-key attribute is a fact about another non-key attribute:

EMPLOYEE    DEPARTMENT    LOCATION
...key...
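
Returning to the warehouse query used to illustrate the cost of normalization, the sketch below shows the join it requires once the address lives in its own relation. The schema, names, and rows are assumptions made for illustration, using Python's built-in sqlite3 module.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE inventory (
        part_id INTEGER,
        warehouse TEXT,
        quantity INTEGER,
        PRIMARY KEY (part_id, warehouse)
    );
    CREATE TABLE warehouse (
        warehouse TEXT PRIMARY KEY,
        address TEXT
    );
    -- Hypothetical sample data.
    INSERT INTO warehouse VALUES
        ('W1', '12 Main St'), ('W2', '9 Oak Ave'), ('W3', '5 Pine Ct');
    INSERT INTO inventory VALUES
        (5, 'W1', 100), (7, 'W1', 30), (5, 'W2', 40), (7, 'W3', 15);
""")

# "Retrieve the address of all warehouses that contain part-id=5."
# The address is no longer stored with the part, so a join is needed.
addresses = conn.execute("""
    SELECT w.address
    FROM inventory AS i
    JOIN warehouse AS w ON w.warehouse = i.warehouse
    WHERE i.part_id = 5
    ORDER BY w.address
""").fetchall()
print(addresses)  # [('12 Main St',), ('9 Oak Ave',)]
```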

Location (address) is a fact about the department and not about the employee (here, a faculty member). In addition to the limitations listed for violations of the second normal form, a record that violates the third normal form suffers from update ambiguities. Logically speaking, an update that changes the address of a faculty member (say Shahram) from SAL to HNB has two alternative meanings:
- Shahram is moving to a different department, and his department attribute value will be updated later; or
- the computer science department is changing address, and all the faculty in this department will observe an update to their address attribute later.

It is difficult to design algorithms that maintain consistency in the presence of such ambiguities.

Solution: decompose the above record into two records:

EMPLOYEE    DEPARTMENT
...key...

DEPARTMENT    LOCATION
....key...
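
Once the record is decomposed, the two meanings of the update become two different statements against two different relations. The sketch below assumes hypothetical table names, two extra employees (Alice, Bob), a second department (EE), and a made-up location code (EEB); only Shahram, SAL, and HNB come from the example above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE employee (
        employee TEXT PRIMARY KEY,
        department TEXT
    );
    CREATE TABLE department (
        department TEXT PRIMARY KEY,
        location TEXT
    );
    INSERT INTO department VALUES ('CS', 'SAL'), ('EE', 'EEB');
    INSERT INTO employee VALUES
        ('Shahram', 'CS'), ('Alice', 'CS'), ('Bob', 'EE');
""")

# Meaning 2: the computer science department changes address.
# One record changes, and every CS employee sees the new location.
conn.execute("UPDATE department SET location = 'HNB' WHERE department = 'CS'")

# Meaning 1 would instead be an update to the employee relation, e.g.
#   UPDATE employee SET department = 'EE' WHERE employee = 'Shahram'
# The two intentions can no longer be confused with each other.

locations = conn.execute("""
    SELECT e.employee, d.location
    FROM employee AS e
    JOIN department AS d ON d.department = e.department
    ORDER BY e.employee
""").fetchall()
print(locations)  # [('Alice', 'HNB'), ('Bob', 'EEB'), ('Shahram', 'HNB')]
```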

Functional dependency: an attribute Y is functionally dependent on an attribute (or set of attributes) X if it is invalid to have two records with the same X value but different Y values. That is, a given X value must always occur with the same Y value. If X is the key of a relation, then all other attributes of the relation are by definition functionally dependent on X.

The fourth and fifth normal forms minimize the number of attributes involved in a composite key. They deal with multi-valued facts. A multi-valued fact represents either a many-to-many relationship (e.g., employees having skills) or a many-to-one relationship (e.g., children of an employee, assuming only one parent is employed).
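
The definition of functional dependency can be checked mechanically against a set of records. The helper below is a minimal sketch: the function name fd_holds and the sample rows are hypothetical. Note that a check like this can only show that a given set of records does not contradict a dependency; whether the dependency holds in general is a statement about the application.

```python
def fd_holds(rows, x, y):
    """Return True if no two rows agree on the attributes in x
    while disagreeing on attribute y."""
    seen = {}
    for row in rows:
        x_value = tuple(row[attr] for attr in x)
        # Remember the first y value seen for this x value and
        # compare every later occurrence against it.
        if seen.setdefault(x_value, row[y]) != row[y]:
            return False
    return True


# Hypothetical records of the un-normalized warehouse relation.
rows = [
    {"part": "bolt", "warehouse": "W1", "quantity": 100, "address": "12 Main St"},
    {"part": "nut", "warehouse": "W1", "quantity": 250, "address": "12 Main St"},
    {"part": "bolt", "warehouse": "W2", "quantity": 40, "address": "9 Oak Ave"},
]

print(fd_holds(rows, ["warehouse"], "address"))            # True
print(fd_holds(rows, ["part"], "address"))                 # False
print(fd_holds(rows, ["part", "warehouse"], "quantity"))   # True
```

The first result reflects the second normal form problem discussed earlier: the address depends on the warehouse alone, which is only a subset of the composite key.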

With the fourth normal form, a record type should not contain two or more independent multi-valued facts about an entity. In addition, the record must satisfy third normal form.

Example: bilingual employees with multiple skills. If language is independent of skill, then we have two many-to-many relationships: (1) between employees and skills; (2) between employees and languages.

Fourth normal form is violated when these two relationships are represented in a single record:

EMPLOYEE    SKILL    LANGUAGE
............key..............

To satisfy fourth normal form, they should be represented as two records:

EMPLOYEE    SKILL
......key........

EMPLOYEE    LANGUAGE
........key.........
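
When skills and languages really are independent, the two smaller relations carry exactly the same information as the single record type. The sketch below illustrates this with plain Python sets. The sample data is hypothetical, and storing every skill/language pair is only one of the possible ways of filling the single record type.

```python
# Hypothetical single-record representation: every skill is paired
# with every language for the employee.
records = {
    ("Smith", skill, language)
    for skill in ("cook", "type")
    for language in ("French", "German", "Greek")
}

# Project onto the two fourth-normal-form relations.
employee_skill = {(e, s) for e, s, _ in records}
employee_language = {(e, l) for e, _, l in records}

# Joining the projections on EMPLOYEE gives back the original records.
rejoined = {
    (e, s, l)
    for e, s in employee_skill
    for e2, l in employee_language
    if e == e2
}

print(rejoined == records)                                        # True
print(len(records), len(employee_skill) + len(employee_language))  # 6 5
```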

Limitations of the single-record representation:
- With repetitions, an update must be applied to multiple records, and the records could become inconsistent.
- Insertion of a new skill may involve looking for a record with a blank skill, inserting a new record with a possibly blank language, or inserting multiple records pairing the new skill with some or all of the languages.
- Deletion of a skill may involve blanking out the skill field in one or more records (perhaps with a check that this does not leave two records with the same language and a blank skill), or deleting one or more records, coupled with a check that the last mention of some language has not also been deleted.

Note that the fourth normal form is not violated if there is a dependence between these multi-valued fields, e.g., if an employee can exercise certain skills only in certain languages. If Smith can cook French cuisine only, but can type in French, German, and Greek, then the pairing of skills and languages is meaningful, and there is no longer an ambiguity of maintenance policies.

Fifth normal form deals with those cases where information can be reconstructed from smaller pieces of information that can be maintained with less redundancy. It deals with semantics and constraints!

Consider the example represented in the first column of Page 70. Although the normalized form involves more record types, there may be fewer total record occurrences. This is because the normalized relations grow in an additive fashion while the unnormalized relation grows in a multiplicative fashion. For example, if we add a new agent who sells x products for y companies, where each of these companies makes each of these products, we have to add x+y new records to the normalized set of relations, but x×y new records to the unnormalized relation.
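
The additive versus multiplicative growth can be seen by counting the records a new agent adds. The sketch below assumes x = 3 and y = 4, and the agent, product, and company names are placeholders. It also assumes the company-product relation already records that each company makes each product, so the new agent adds nothing there.

```python
from itertools import product

agent = "Jones"                         # hypothetical new agent
products = ["p1", "p2", "p3"]           # x = 3
companies = ["c1", "c2", "c3", "c4"]    # y = 4

# Unnormalized relation (AGENT, COMPANY, PRODUCT):
# one record per company/product combination.
unnormalized_new = [(agent, c, p) for c, p in product(companies, products)]

# Normalized relations (AGENT, COMPANY) and (AGENT, PRODUCT).
agent_company_new = [(agent, c) for c in companies]
agent_product_new = [(agent, p) for p in products]

print(len(unnormalized_new))                             # 12, i.e. x * y
print(len(agent_company_new) + len(agent_product_new))   # 7, i.e. x + y
```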