M1G505190 Introduction to Database Development 4. Improving the database design.

Slides:



Advertisements
Similar presentations
Build a database I: Design tables for a new Access database
Advertisements

Normalization What is it?
Topic Database Normalisation S McKeever Advanced Databases 1.
Normalization Lite Pepper. Golden Rule Every attribute must depend upon the key, --- > 1NF the whole key, --- > 2NF and nothing but the key. -  3NF and.
Normalisation Ensuring data integrity in database design 1.
Database Design Conceptual –identify important entities and relationships –determine attribute domains and candidate keys –draw the E-R diagram Logical.
Boyce-Codd Normal Form Kelvin Nishikawa SE157a-03 Fall 2006 Kelvin Nishikawa SE157a-03 Fall 2006.
Normalization of Database Tables
Chapter 5 Normalization of Database Tables
Why Normalization? To Reduce Redundancy to 1.avoid modification, insertion, deletion anomolies 2.save space Goal: One Fact in One Place.
Michael F. Price College of Business Chapter 6: Logical database design and the relational model.
Introduction to Schema Refinement. Different problems may arise when converting a relation into standard form They are Data redundancy Update Anomalies.
Week 6 Lecture Normalization
CREATE THE DIFFERENCE Normalisation (special thanks to Janet Francis for this presentation)
MIS 385/MBA 664 Systems Implementation with DBMS/ Database Management Dave Salisbury ( )
Concepts and Terminology Introduction to Database.
Relational databases and third normal form As always click on speaker notes under view when executing to get more information!
Microsoft ® Office Access ® 2007 Training Build a database I: Design tables for a new Access database ICT Staff Development presents:
Component 4: Introduction to Information and Computer Science Unit 6: Databases and SQL Lecture 4 This material was developed by Oregon Health & Science.
Database Systems: Design, Implementation, and Management Tenth Edition
RDBMS Concepts/ Session 3 / 1 of 22 Objectives  In this lesson, you will learn to:  Describe data redundancy  Describe the first, second, and third.
Database Systems: Design, Implementation, and Management Ninth Edition Chapter 6 Normalization of Database Tables.
Normalization (Codd, 1972) Practical Information For Real World Database Design.
Concepts of Relational Databases. Fundamental Concepts Relational data model – A data model representing data in the form of tables Relations – A 2-dimensional.
BIS Database Systems School of Management, Business Information Systems, Assumption University A.Thanop Somprasong Chapter # 5 Normalization of Database.
SALINI SUDESH. Primarily a tool to validate and improve a logical design so that it satisfies certain constraints that avoid unnecessary duplication of.
Schema Refinement and Normal Forms 20131CS3754 Class Notes #7, John Shieh.
CORE 2: Information systems and Databases NORMALISING DATABASES.
XP New Perspectives on Microsoft Office Access 2003 Tutorial 9 1 Microsoft Office Access 2003 Tutorial 9 – Using Action Queries, and Defining Table Relationships.
1 What is database 2? What is normalization? What is SQL? What is transaction?
Introduction to Databases Trisha Cummings. What is a database? A database is a tool for collecting and organizing information. Databases can store information.
Copyright © 2005 Ed Lance Fundamentals of Relational Database Design By Ed Lance.
Normalization Well structured relations and anomalies Normalization First normal form (1NF) Functional dependence Partial functional dependency Second.
M1G Introduction to Database Development 2. Creating a Database.
What's a Database A Database Primer Let’s discuss databases n Why they are hard n Why we need them.
M1G Introduction to Database Development 5. Doing more with queries.
INFORMATION TECHNOLOGY DATABASE MANAGEMENT. Adding a new field 1Right click the table name and select design view 2Type the field information at the end.
Description and exemplification of entity-relationship modelling.
Component 4/Unit 6d Topic IV: Design a simple relational database using data modeling and normalization Description and Information Gathering Data Model.
Normalization of Database Tables
In this session, you will learn to: Describe data redundancy Describe the first, second, and third normal forms Describe the Boyce-Codd Normal Form Appreciate.
Programming Logic and Design Fourth Edition, Comprehensive Chapter 16 Using Relational Databases.
A337 - Reed Smith1 Structure What is a database? –Table of information Rows are referred to as records Columns are referred to as fields Record identifier.
Lection №4 Development of the Relational Databases.
Normalisation RELATIONAL DATABASES.  Last week we looked at elements of designing a database and the generation of an ERD  As part of the design and.
NORMALIZATION. What is Normalization  The process of effectively organizing data in a database  Two goals  To eliminate redundant data  Ensure data.
Data modeling Process. Copyright © CIST 2 Definition What is data modeling? –Identify the real world data that must be stored on the database –Design.
Database Design Process For many businesses, the database is the most important set of computer files they have. For some, like EBay or Facebook, the database.
Understand Relational Database Management Systems Software Development Fundamentals LESSON 6.1.
Logical Database Design and the Relational Model.
Ch 7: Normalization-Part 1
CS411 Database Systems Kazuhiro Minami 04: Relational Schema Design.
RELATIONAL TABLE NORMALIZATION. Key Concepts Guidelines for Primary Keys Deletion anomaly Update anomaly Insertion anomaly Functional dependency Transitive.
Normalization. Overview Earliest  formalized database design technique and at one time was the starting point for logical database design. Today  is.
NormalisationNormalisation Normalization is the technique of organizing data elements into records. Normalization is the technique of organizing data elements.
Copyright © Curt Hill Schema Refinement II 2 nd NF to 3 rd NF to BCNF.
What Is Normalization  In relational database design, the process of organizing data to minimize redundancy  Usually involves dividing a database into.
Database Planning Database Design Normalization.
MS Access. Most A2 projects use MS Access Has sufficient depth to support a significant project. Relational Databases. Fairly easy to develop a good user.
1 The Relational Data Model David J. Stucki. Relational Model Concepts 2 Fundamental concept: the relation  The Relational Model represents an entire.
Normalisation Unit 6: Databases. Just to recap  What is an Entity  What is an Attribute?
Systems Analysis & Design Methods III Classic normalization rules for relational databases III Classic normalization rules for relational databases.
1 CS490 Database Management Systems. 2 CS490 Database Normalization.
Relational Databases – Further Study I think we’ve covered all you need to know for GCSE about relational databases I’m not aware of any practical coursework.
Dr Gordon Russell, Napier University Normalisation 1 - V2.0 1 Normalisation 1 Unit 3.1.
Databases: What they are and how they work
Normalization of Databases
Chapter 4.1 V3.0 Napier University Dr Gordon Russell
Normalization Organized by Farrokh Alemi, Ph.D.
Normalisation 1 Unit 3.1 Dr Gordon Russell, Napier University
Presentation transcript:

M1G Introduction to Database Development 4. Improving the database design

Representing the data model You’ve now seen how to take a data model and represent it in a relational database. To recap: each class is represented by a table each object becomes a row in the table each table has a primary key which is a field or set of fields which uniquely identifies each row a relationship between two tables is represented by the foreign key field(s) which refer to the primary key of the related table Introduction to Database Development 4. Improving the database design 2

Database normalisation However, there may still be problems which can result in data in the database becoming inaccurate or difficult to retrieve We need to do some further checking of the database design to help prevent such problems The process of formally checking and modifying a relational database design is called normalisation Introduction to Database Development 4. Improving the database design 3

Rules of normalisation Normalisation uses a set of rules to check: that all fields are in the right tables whether we need to restructure or add tables to the schema These rules were first proposed by E.F. Codd in about 1970 Codd’s rules have become a key part of relational database design Introduction to Database Development 4. Improving the database design 4

Using normalisation to design a database Start with the “no-design” database Put everything you can think of into one enormous table Following the rules of normalisation can fix the database Data will be separated out into more workable tables Not a recommended approach! Introduction to Database Development 4. Improving the database design 5

Data model and normalisation As a general rule, a well thought-out data model tends to lead to a pretty well normalised database schema However, any flaws in the data model design will have been translated in the database schema Also, there may be problems which arise because of the way we have chosen to represent some features of the data model Introduction to Database Development 4. Improving the database design 6

Problems with un-normalised data Problems with databases are usually a result of having attributes in the wrong tables The solution usually involves moving attributes to different tables and creating additional tables The GCUTours database is fairly well normalised, so we’ll look at some other examples, based on an IT Consultancy company Introduction to Database Development 4. Improving the database design 7

Repeated data Assignments contains repeated data Repeated data can often become inaccurate as values need to be entered repeatedly Which is correct contact no. for Acme Ltd.? Would we have spotted this problem by looking at first 3 rows only? Introduction to Database Development 4. Improving the database design 8

Update anomalies These are problems which arise when you try to add or remove data from a database Update anomalies can make it impossible to get data into the database Can cause important data to be lost from the database Introduction to Database Development 4. Improving the database design 9

Problems with inserting data Possible primary key: (consultantID, clientnumber) What if we add a new client, but don’t want to assign a consultant right away? Can’t do this as primary key fields cannot be NULL Introduction to Database Development 4. Improving the database design 10

Problems with deleting data Consultant 1001 is no longer to work for SuperPrint We want to delete that assignment Whoops! We have just deleted all the information we have in the database about SuperPrint Introduction to Database Development 4. Improving the database design 11

A better design All of these problems arise because some of the fields in the Assignments table should really not be in that table What we need is an additional table to store information about clients Introduction to Database Development 4. Improving the database design 12

Design flaws In this example, the problem has arisen because the data model was flawed The purpose of the Assignment entity simply wasn’t clear enough Had we identified the need for a Client entity when designing the data model, and then the database problems would not have arisen However, normalisation will pick up problems like this too Introduction to Database Development 4. Improving the database design 13

Functional dependencies Normalisation is based on the idea of a functional dependency The following statement is a functional dependency in the Consultants table If we know the value of a consultant’s consultantID we can tell the value of his or her last name Introduction to Database Development 4. Improving the database design 14

Functional dependencies Does it work the other way round? No If we know a consultant’s last name is Jones, can we say for certain what her ID is? It looks like it from the data shown. However, the ID is unique and the last name is not, so we could add another row later for a consultant called Bob Jones with ID 1006 Given the last name Jones the ID could be either 1002 or 1006 Introduction to Database Development 4. Improving the database design 15

Think about these examples Introduction to Database Development 4. Improving the database design 16

Functional dependencies and keys Keys are closely related to functional dependencies as follows: The key fields of a table should functionally determine all the other fields in the table lastname does not functionally determine consultantID, so it cannot be a key field However: consultantID → (firstname, lastname) consultantID is a key Introduction to Database Development 4. Improving the database design 17

Primary keys If we know that the ID is 1001 and the last name is Smith, can we say for sure what the first name is? Yes we can – we can write this functional dependency as: (consultantID, lastname) → firstname So the combination of these two fields is a key But... Introduction to Database Development 4. Improving the database design 18

Primary keys..we don’t actually need lastname in order to know firstname The ID is sufficient as it is itself a key Therefore, this isn’t a primary key – a primary key must have no unnecessary fields. The rule is: A primary key has no subset of its fields that is also a key Introduction to Database Development 4. Improving the database design 19

Foreign keys Remember that foreign keys must match primary keys If we defined (consultantID, lastname) as the primary key of Consultants would need an additional lastname field in Assignments This is an example of redundant data Introduction to Database Development 4. Improving the database design 20 need both fields as foreign key need both fields as foreign key

Normalisation and normal forms Now that we know all about functional dependencies and primary keys, we are ready to do some normalisation There are several levels of normalisation, called normal forms We proceed through the forms, refining the tables and addressing additional problems each time Introduction to Database Development 4. Improving the database design 21

First Normal Form (1NF) Ensures that we are not trying to cram several pieces of data into a single field the data in a table should be atomic The following example is not in 1NF How would we query for all consultants with ‘databases’ skills? Introduction to Database Development 4. Improving the database design 22

Not the solution Each field only contains one piece of information now What if Jane Lee adds web design to her skills? Would have to add a new field to the table To be properly atomic, a table can’t have multiple fields with the same kind of data Introduction to Database Development 4. Improving the database design 23

1NF rule General rule for checking for 1NF: A table is not in first normal form if it contains data which is not atomic – that is, it keeps multiple values for a piece of information Introduction to Database Development 4. Improving the database design 24

1NF fix Normalisation gives us rules – it also gives us ways to fix tables which don’t obey the rules The fix for a table not in 1NF is: Remove the multivalued information from the table. Create a new table with that information and the primary key of the original table. Introduction to Database Development 4. Improving the database design 25

1NF solution Now have two tables: This is a one-to-many relationship Introduction to Database Development 4. Improving the database design 26

Second Normal Form (2NF) Getting to first normal form is a good start, but there can still be plenty of problems lurking The following table is in 1NF: Introduction to Database Development 4. Improving the database design 27 Primary key is (consultantID, skill) Table is prone to update anomalies Primary key is (consultantID, skill) Table is prone to update anomalies

2NF rule General rule for checking for 2NF: A table is in second normal form if it is in first normal form AND we need ALL the fields in the key to determine to values of the non-key fields Introduction to Database Development 4. Improving the database design 28

Why is the example not in 2NF? Value of hourlyrate is functionally dependent only on the value of skill only part of the primary key. skill → hourlyrate hourlyrate does not depend on consultantID database work is charged at £31 no matter who is doing the work So, we don’t need all the fields in the key to determine the value of the non-key field, hourlyrate Introduction to Database Development 4. Improving the database design 29

2NF fix The fix for a table not in 2NF is: Remove the non-key fields that are not dependent on the whole of the primary key. Create another table with those fields and the part of the primary key they do depend on. Introduction to Database Development 4. Improving the database design 30

2NF solution Create extra table Skills The Consultants table is shown here also to illustrate the relationships between the full set of tables Introduction to Database Development 4. Improving the database design 31

Many-to-many relationships Careful design of the data model would probably have led to the same set of tables This situation might have been modelled in the design process as a many-to-many relationship between Consultant and Skill Solution is a typical representation in the database of this type relationship Introduction to Database Development 4. Improving the database design 32

Third Normal Form (3NF) Tables in 2NFcan still cause problems Now, the company has two offices and each consultant is based in one office 2NF, as the primary key is the single field consultantID Still a problem with repeated data Introduction to Database Development 4. Improving the database design 33 inaccurate repeated data

Functional dependencies and 3NF The problem arises because the values of address and phone are dependent on the values of more than one field given the consultantID is 1001, you know that the address is “Cowcaddens Road” given the office is Glasgow, you also know that the address is “Cowcaddens Road” So: consultantID → address, phone office → address, phone Introduction to Database Development 4. Improving the database design 34 office is NOT part of the primary key

3NF rule General rule for checking for 3NF: A table is in third normal form if it is in second normal form AND no non-key fields depend on any fields that are not the primary key Introduction to Database Development 4. Improving the database design 35

3NF fix The fix for a table not in 3NF is: Remove the non-key fields that are dependent on a field (or fields) that is not the primary key. Create another table with those fields and the field(s) that they do depend on Introduction to Database Development 4. Improving the database design 36

3NF solution The values of address and phone for each office are stored once only It’s not possible to enter an invalid value for office in Consultants because each value must match a value in the Offices table Introduction to Database Development 4. Improving the database design 37

Summing up A table is based on: the key; the whole key; and nothing but the key (so help me Codd) Introduction to Database Development 4. Improving the database design 38

Higher normal forms For most cases, normalising to third normal form will take care of the likely problems There are two more normal forms, 4NF and 5NF, which deal with more subtle problems There is also Boyce-Codd normal form, which privdes a single statement which approximately encapsulates the first three normal forms Introduction to Database Development 4. Improving the database design 39

Why normalise? To check the database schema and highlight any flaws in the data model design or the way it has been represented or To fix a database which has already been created without the aid of a suitable data modelling process Introduction to Database Development 4. Improving the database design 40