Normalization Information Systems II Ioan Despi. Informal approach Building a database structure : A process of examining the data which is useful & necessary.

Slides:



Advertisements
Similar presentations
Copyright: ©2005 by Elsevier Inc. All rights reserved. 1 Author: Graeme C. Simsion and Graham C. Witt Chapter 3 The Entity-Relationship Approach.
Advertisements

Normalisation.
Relational Terminology. Normalization A method where data items are grouped together to better accommodate business changes Provides a method for representing.
First Normal Form Second Normal Form Third Normal Form
1 Class Agenda (04/03 and 04/08)  Review and discuss HW #8 answers  Present normalization process Enhance conceptual knowledge of database design. Improve.
Athabasca University Under Development for COMP 200 Gary Novokowsky
Jump to first page Normalization Jump to first page Topics n Why normalization is needed n What causes anomalies n What the 4 normal forms are n How.
Database table design Single table vs. multiple tables Sen Zhang.
Database Design Conceptual –identify important entities and relationships –determine attribute domains and candidate keys –draw the E-R diagram Logical.
Understand Normalization
Relational Data Analysis II. Plan Introduction Structured Methods –Data Flow Modelling –Data Modelling –Relational Data Analysis Feasibility Maintenance.
Chapter 4 Relational Databases Copyright © 2012 Pearson Education, Inc. publishing as Prentice Hall 4-1.
Chapter 4 Relational Databases Copyright © 2012 Pearson Education 4-1.
Database Design.  Define a table for each entity  Give the table the same name as the entity  Make the primary key the same as the identifier of the.
Advanced Tables Lesson 9. Objectives Creating a Custom Table When a table template doesn’t suit your needs, you can create a custom table in Design view.
Normalization Rules for Database Tables Northern Arizona University College of Business Administration.
Chapter 4 Relational Databases Copyright © 2012 Pearson Education, Inc. publishing as Prentice Hall 4-1.
Normalization. Introduction Badly structured tables, that contains redundant data, may suffer from Update anomalies : Insertions Deletions Modification.
Week 6 Lecture Normalization
Lecture 12 Inst: Haya Sammaneh
Modelling Techniques - Normalisation Description and exemplification of normalisation.Description and exemplification of normalisation. Creation of un-normalised.
1 Class Agenda (04/09 through 04/16)  Review HW #8  Present normalization process Enhance conceptual knowledge of database design. Improve practical.
A Guide to SQL, Eighth Edition Chapter Two Database Design Fundamentals.
Chapter 1 Overview of Database Concepts Oracle 10g: SQL
1 Class Agenda (11/07 and 11/12)  Review HW #8 answers  Present normalization process Enhance conceptual knowledge of database design. Improve practical.
FILE VS. DATABASES Let’s examine some basic principles about how data are stored in computer systems. – An entity is anything about which the organization.
MIS 301 Information Systems in Organizations Dave Salisbury ( )
RDBMS Concepts/ Session 3 / 1 of 22 Objectives  In this lesson, you will learn to:  Describe data redundancy  Describe the first, second, and third.
Concepts of Database Management Sixth Edition Chapter 5 Database Design 1: Normalization.
Concepts of Database Management, Fifth Edition
A Normalisation Example Mark Kelly McKinnon Secondary College Vceit.com Based on work by Robert Timmer-Arends.
1 A Guide to MySQL 2 Database Design Fundamentals.
1 Database Design and Development: A Visual Approach © 2006 Prentice Hall Chapter 4 DATABASE DESIGN AND DEVELOPMENT: A VISUAL APPROACH Chapter 4 Normalization.
Normalization (Codd, 1972) Practical Information For Real World Database Design.
資料庫正規化 Database Normalization 取材自 AIS, 6 th edition By Gelinas et al.
Database Normalization Lynne Weldon July 17, 2000.
M Taimoor Khan Course Objectives 1) Basic Concepts 2) Tools 3) Database architecture and design 4) Flow of data (DFDs)
1 CMPT 275 Phase: Design. Janice Regan, Map of design phase DESIGN HIGH LEVEL DESIGN Modularization User Interface Module Interfaces Data Persistance.
CORE 2: Information systems and Databases NORMALISING DATABASES.
Copyright © 2005 Ed Lance Fundamentals of Relational Database Design By Ed Lance.
Normalization Well structured relations and anomalies Normalization First normal form (1NF) Functional dependence Partial functional dependency Second.
1 A Guide to MySQL 2 Database Design Fundamentals.
Chapter 1Introduction to Oracle9i: SQL1 Chapter 1 Overview of Database Concepts.
Chapter 10 Normalization Pearson Education © 2009.
Component 4/Unit 6d Topic IV: Design a simple relational database using data modeling and normalization Description and Information Gathering Data Model.
Normalization Transparencies 1. ©Pearson Education 2009 Objectives How the technique of normalization is used in database design. How tables that contain.
In this session, you will learn to: Describe data redundancy Describe the first, second, and third normal forms Describe the Boyce-Codd Normal Form Appreciate.
Programming Logic and Design Fourth Edition, Comprehensive Chapter 16 Using Relational Databases.
Normalization Data Design - Mr. Ahmad Al-Ghoul
A337 - Reed Smith1 Structure What is a database? –Table of information Rows are referred to as records Columns are referred to as fields Record identifier.
Creating Databases Data normalization. Integrity and Robustness. Work session. Homework: Prepare short presentation on enhancement projects. Continue working.
Concepts of Database Management Seventh Edition Chapter 5 Database Design 1: Normalization.
1 Class Agenda (04/06/2006 and 04/11/2006)  Discuss use of Visio for ERDs  Learn concepts and ERD notation for data generalization  Introduce concepts.
IST Database Normalization Todd Bacastow IST 210.
Database Design. Database Design Process Data Model Requirements Application 1 Database Requirements Application 2 Requirements Application 4 Requirements.
Normalisation 1NF to 3NF Ashima Wadhwa. In This Lecture Normalisation to 3NF Data redundancy Functional dependencies Normal forms First, Second, and Third.
Databases Database Normalisation. Learning Objectives Design simple relational databases to the third normal form (3NF).
NormalisationNormalisation Normalization is the technique of organizing data elements into records. Normalization is the technique of organizing data elements.
NORMALIZATION Handout - 4 DBMS. What is Normalization? The process of grouping data elements into tables in a way that simplifies retrieval, reduces data.
Logical Database Design and Relational Data Model Muhammad Nasir
What Is Normalization  In relational database design, the process of organizing data to minimize redundancy  Usually involves dividing a database into.
Database Planning Database Design Normalization.
MS Access. Most A2 projects use MS Access Has sufficient depth to support a significant project. Relational Databases. Fairly easy to develop a good user.
Normalisation FORM RULES 1NF 2NF 3NF. What is normalisation of data? The process of Normalisation organises your database to: Reduce or minimise redundant.
Database Normalization. What is Normalization Normalization allows us to organize data so that it: Normalization allows us to organize data so that it:
NORMALISATION OF DATABASES. WHAT IS NORMALISATION? Normalisation is used because Databases need to avoid have redundant data, which makes it inefficient.
Relational Databases Chapter 4.
Database Normalization
Chapter 4 Relational Databases
Relational Database Model
Presentation transcript:

Normalization Information Systems II Ioan Despi

Informal approach Building a database structure : A process of examining the data which is useful & necessary for an application Then breaking it down into a relative simple row and column format

There are two points to understand about tables and columns that are the essence of any database: 1. Tables store data about an entity An entity may be a person, a part in a machine, a book, or any other tangible or intangible object, but the primary consideration is that a table only contain data about one thing 2. Columns contain the attributes of an entity Just as a table contains data about a single entity, each column should only contain one item of data about that entity

Personal tricks: 1. If (for example) you’re creating a table of addresses, there is no point in having a single column contain the city, state and postal code when it is just as easy to create three columns and record each attribute separately. 2. I use a plural form of a noun for table names (Authors, Books, Orders,aso) and a noun or a noun and adjective for column names (FirstName, City) 3. If I’m coming up with names that require the use of the word “and” or the use of two nouns, it’s an indication I haven’t gone enough in breaking down data

An ugly table

Problems with this structure: 1. Repeating Groups The CourseID, Description and Instructor are repeated for each class. If a student need a second or a third class, you need to go back and modify the table design in order to record it. Additionally, adding all those fields when most students would never use them is a waste of storage

2. Delete anomalies If you no longer wish to track Joe Killy’s Intro to DAO class, you would need to delete a student, an adviser and an instructor in order to do it. 3. Insert anomalies Perhaps the department head wishes to add a new class, “Intro to C++”, but hasn’t yet set up a schedule or even an instructor. What would you enter for the student, advisor and instructor names?

4. Inconsistent data If after entering these rows you’ll discover that Bruce Lee’s course is actually “Intro to Advanced Visual Basic”, you would need to examine all the rows and change each individually, in order to reflect this change. This introduces the potential for errors if one the changes is omitted or done incorrectly. As you can see, this single simple flat table introduced a number of problems- all of which can be solved by normalizing the table design

Normalization = the process of taking a wide table with lots of columns but few rows and redesigning it as several narrow tables with fewer columns but more rows. A properly normalized design allows: 1. To use storage space efficiently 2. To eliminate redundant data 3. To reduce or eliminate inconsistent data 4. To ease the data maintenance burden The rule: you must be able to reconstruct the original flat view of the data

Relational db theorists have divided normalization into several rules, called normal forms : First normal form ( 1NF ) : No repeating groups Second normal form ( 2NF ) : 1NF + No nonkey attributes depend on a portion of the primary key Third normal form (3NF ) : 2NF + No attributes depend on other nonkey attributes

1NF A repeating group : StudentName AdvisorName CourseID1 CourseDescription1 CourseInstructorName1 CourseID2 CourseDescription2 CourseInstructorName2 CourseID3 CourseDescription3 CourseInstructorName3 Columns for course information have been duplicated to allow the student to take 3 courses. The problem occurs when the student wants to take 4 courses or more. The proper solution is to remove the repeating group of columns to another table

Ugly( StudentName, AdvisorName, CourseID1, CourseDescription1, CourseInstructorName1) Students ( StudentID, StudentName, AdvisorName) StudentCourses ( SCStudentID, SCCourseID, SCCourseDescription, SCCourseInstructorName) The primary keys are shown in italics. The new field SCStudentID is a foreign key to the Students table. We’ve divided the table so that the student can now take as many courses he wants by removing the course information from the original table and creating two tables: one for the student information and one for the course list. The repeating group of columns in the original table is gone. We can still reconstruct the original table using StudentID and SCStudentID columns from the new two tables.

2NF : No nonkey attributes depend on a portion of the primary key 2NF really only apply to tables where the primary key is defined by two or more columns. The essence is that if there are columns which can be identified by only part of the primary key, they need to be in their own table.

StudentCourses ( SCStudentID, SCCourseID, SCCourseDescription, SCCourseInstructorName) The primary key is the combination: SCStudentID, SCCourseID The columns SCCourseDescription, SCCourseInstructorName are only dependent on the SCCourseID column. In other words, the description and instructor’s name will be the same regardless of the student. To solve the problem, we split the table StudentCourses, obtaining three tables from the original one: Students ( StudentID, StudentName, AdvisorName) StudentCourses ( SCStudentID, SCCourseID) Courses (CourseID, CourseDescription, CourseInstructorName)

What we’ve done is to remove the details of the course information to their own table Courses. The relationship between students and courses revealed to be a many -to many relationship: each student can take many courses and each course can have many students The StudentCourses table now contains only the two foreign keys to Students and Courses. It is also called a intersection entity. Let us add a little more detail to the sample tables to make them look something more like the real world

Students StudentID StudentName StudentPhone StudentAddress StudentCity StudentState StudentZIP AdvisorName AdvisorPhone StudentCourses SCStudentID SCCourseID Courses CourseID CourseDescription CourseInstructorName CourseInstructorPhone

3NF: No attributes depend on other nonkey attributes All the columns in the table containd data about the entity that is defined by the primary key. The columns in the table must contain data about only one thing. This is really a extension of 2NF : both are used to remove columns that belong in their own table. To complete the normalization we need to look for columns that are not dependent on the primary key of the table.

Students table: the advisor information is not dependent on the student: if the student leaves the school, the advisor’s name & phone number will remain the same Courses table: the same logic applies to the instructor information: the data for the instructor is not dependent on the primary key CourseID since the instructor will be unaffected if the course is dropped from the curriculum

Students StudentID StudentName StudentPhone StudentAddress StudentCity StudentState StudentZIP StudentAdvisorID StudentCourses SCStudentID SCCourseID Courses CourseID CourseDescription CourseInstructorID Advisors AdvisorID AdvisorName AdvisorPhone Instructors InstructorID InstructorName InstructorPhone