Presentation is loading. Please wait.

Presentation is loading. Please wait.

Normalization Part 1: The Concept.

Similar presentations


Presentation on theme: "Normalization Part 1: The Concept."— Presentation transcript:

1 Normalization Part 1: The Concept

2 Objectives How to undertake the normalization process.
How normalization uses functional dependencies to group attributes into relations that are in a known normal form. How to identify the most commonly used normal forms: First Normal Form (1NF) Second Normal Form (2NF) Third Normal Form (3NF) The problems associated with relations that break the rules of 1NF, 2NF, or 3NF. How to represent attributes shown on a form as 3NF relations using normalization.

3 Introduction 2 strategies in creating data model:
We have adopted a top-down approach to database design that begins by identifying the entities and relationship. Normalization is a bottom-up approach to database design that begins by examining the relationships between attributes.

4 Database Design Strategies
Two classical approaches/strategies to dB design: Top-down Conceptual Model Entity Attribute Bottom-up { Normalization }

5 Table & Relational Schema
staff branch staffNO sName position salary branchNO S21 Johan Manager 3000 B005 S37 Ana Assistant 1200 B003 S14 Daud Supervisor 1800 S9 Mary 900 B007 S5 Siti 2400 S41 Jani branchNO bAddress B005 123, Kepong B007 456, Nilai B003 789, PTP branch (branchNO, bAddress) staff (staffNO, sName, position, salary, *branchNO)

6 Normalization When we design a database, the main objective is to create an accurate representation of data, relationship between the data, and constraints on the data that is relevant. To achieve this objective, we have to identify suitable set of relations (table) by creating good table structure/process of assigning attributes to entities. The process is known as Normalization. Process for evaluating and correcting table structures to minimize/control data redundancies {reduces data anomalies}. Works through a series of stages called normal forms.

7 Normalization The most commonly used normal forms:
First Normal Form (1NF) Second Normal Form (2NF) Third Normal Form (3NF) 1NF < 2NF < 3NF Highest normalization is not always desirable More JOINS are required Affect data retrieval performance/high response time For most business database design purposes, 3NF is as high as we need to go in normalization process

8 Normalization Every normal form is based on functional dependencies between attributes in a relationship. Each relationship can be normalized into a specific form to avoid anomalies. Anomalies? Anomaly = abnormality Ideally a field value change, should be made only in a single place. Data redundancy, promotes an abnormal condition by forcing field value changes in many different locations. Insertion anomalies Deletion anomalies Modification/Update anomalies

9 Functional Dependencies
An important concept associated with normalization is functional dependency which describes the relationship between attributes. In this section, you will learn about functional dependency and then focus on the particular characteristics of functional dependency that are useful for normalization.

10 Functional Dependencies
Functional dependency can be divided into two types: Full functional dependency/Partial dependency (PD) Will be used to transform 1NF  2NF Transitive dependency (TD) Will be used to transform 2NF  3NF

11 Functional Dependencies
Multivalued Attributes (or repeating groups): non-key attributes or groups of non-key attributes the values of which are not uniquely identified by (directly or indirectly) (not functionally dependent on) the value of the Primary Key (or its part). 1st row 2nd row Relational Schema STUDENT(Stud_ID, Name, (Course_ID, Units))

12 Functional Dependencies
Partial Dependency – when an non-key attribute is determined by a part, but not the whole, of a COMPOSITE primary key (The Primary Key must be a Composite Key). Cust_ID → Name

13 Functional Dependencies
Transitive Dependency – when a non-key attribute determines another non-key attribute. Dept_ID → Dept_Name

14 Functional Dependencies
Consider a relation with attributes A and B, where attribute B is functionally depends on attribute A. Let say an A is a PK of R. To describe the relationship between attributes A and B is to say that “A functionally determines B”. R(A,B) A  B A B B is functionally depends on A

15 Functional Dependencies
When a functional dependency exist, the attribute or group of attributes on the left-handed side of the arrow is called determinant. Determinant: Refers to the attributes, or a group of attributes, on the left handed side of the arrow of a functional dependency. A B A functionally determines B

16 Functional Dependencies
staff staffNO sName position salary branchNo S21 Johan Manager 3000 B005 S37 Ana Assistant 1200 B003 S14 Daud Supervisor 1800 S9 Mary 900 B007 S5 Siti 2400 S41 Jani branch branchNO bAddress B005 123, Kepong B007 456, Nilai B003 789, PTP Determinant

17 Functional Dependencies
Consider the attributes staffNO and position of the staff relation. For a specific staffNO (S21), we can determine the position of that member of staff as Manager. staffNO functionally determines position. staffNO position position is functionally depends on staffNO Staff number (S21) Position (manager)

18 Functional Dependencies
However the next figure illustrate that the opposite is not true, as position does not functionally determines staffNO. A member of staff holds one position; however, they maybe several members of staff with the same position. position staffNO staffNO does not functionally depends on position staff number (S21) Position(manager) staff number (S5)

19 Functional Dependencies
Partial Dependencies: Full functional dependency indicates that if A and B are attributes of a relation, B is fully functionally dependent on A, if B is functionally dependent on A, but not on any proper subset of A. staff(staffNO,sName,position,salary,branchNO) staffNO, staffName  branchNO True!!! each value of (staffNO, sName) is associated with a single value of branchNO. however, branchNO is also functionally dependent on staffNO.

20 Functional Dependencies
Transitive Dependencies: staff(staffNO,sName,position,salary,*branchNO) branch(branchNO,bAddress) staffNO  sName,position,salary,branchNO,bAddress branchNO  bAddress True for transitive dependency!!!  branchNO → bAddress exists on staffNO via branchNO

21 Normalization Process
Formal technique for analyzing relations based on their Primary Key (or candidate keys) and functional dependencies. The technique executed as a series of steps (stage). Each step corresponds to a specific normal form, that have specific characteristic. As normalization proceeds, the relations become progressively more restricted (stronger) in format and also less vulnerable to anomalies. Data Redundancies 0NF/UNF 1NF 2NF 3NF

22 Normalization Process
Relationship between Normalize Form Denormalization Figure 1: Diagrammatic illustration of the relationship between the normal forms Normalization

23 Normalization Process
Data Sources Users Users’ requirements specification Forms/reports that are used or generated by the enterprise Sources describing the enterprise such as data dictionary and corporate data model Transfer attributes into table format Unnormalized Form (UNF) Remove repeating group First Normal Form (1NF) Remove partial dependencies Second Normal Form (2NF) Remove transitive dependencies Third Normal Form (3NF)

24 Normalization Process
Relation/Table Format Have repeating group -PK not defined UNF 1)Repeat Group 2)PK is not defined (1 Table) 1NF 1)Remove Repeat Group 2)Defined PK  composite PK consist of attributes (1 or 2 Tables) No repeating group -PK defined -Test partial dependency Test for partial dependency If (exist) (2 or 3 Tables) (more then 1 table) (a b …. TD) 1 (a  ……. TD) 2 (b  ….… TD) 3 2NF No repeating group -PK defined -No partial dependency -Test transitive dependency Test for transitive dependency If (exist) (a, b  x, y) (a c, d) (b  z) (c  d) No repeating group -PK defined -No partial dependency -No transitive dependency 3NF (3 or 4 Tables)

25 Normalization Process
Remember this!!! Unnormalized (UNF): There are multivalued attributes or repeating groups, not CK (or PKs) 1NF: NO multivalued attributes or repeating groups, has CK (or PKs) 2NF: 1NF with NO Partial Dependencies (PD) 3NF: 2NF with NO Transitive Dependencies (TD)

26 Unnormalized Form (UNF)
To create an unnormalized table Transform the data from the information source (e.g. form) into table format with columns and rows. Unnormalized Form = A table that contains one or more repeating UNF/0NF groups

27 First Normal Form (1NF) First Normal Form = A relation in which the intersection of each row 1NF and column contains one and only one value. A relation is in 1NF if every attribute for every tuple have a value and domain for each attribute can not be simplified anymore.

28 Transforming UNF to 1NF UNF 1NF 1 2 To-do list:
Remove repeating groups Identify Composite Key 1 1NF 2 3

29 Example 1: Determine UNF

30 Example 1: Determine UNF

31 Example 1: Determine UNF

32 Example 1: Determine UNF

33 Example 1: Determine UNF

34 Example 2: Determine UNF

35 Example 2: Determine UNF

36 Example 2: Determine UNF

37 Example 2: Determine UNF

38 Example 3: Determine UNF

39 Example 3: Determine UNF

40 UNF to 1NF Nominate an attribute or group of attributes to act as the key for the unnormalized table. Identify the repeating group(s) in the unnormalized table which repeats for the key attribute(s). Step 1: Eliminate the Repeating Groups Eliminate nulls: each repeating group attribute contains an appropriate data value Step 2: Identify the Primary Key Must uniquely identify attribute value New key must be composed Step 3: Identify All Dependencies Dependencies are depicted with a diagram

41 UNF to 1NF Relational Schema
STUDENT(Stud_ID, Name, (Course_ID, Units))

42 UNF to 1NF Key Attribute: Stud_ID, Course_ID
Repeating Group: (Course_ID, Units)

43 UNF to 1NF Ensure a single value at the intersection of each row and column Enter appropriate student data in each row Relational Schema STUDENT(Stud_ID, Name, Course_ID, Units)

44 UNF to 1NF Key Attribute: Stud_ID, Course_ID
Repeating Group: (Course_ID, Units)

45 UNF to 1NF

46 UNF to 1NF Dependency diagram:
Depicts all dependencies found within given table structure Helpful in getting bird’s-eye view of all relationships among table’s attributes Makes it less likely that you will overlook an important dependency

47 UNF to 1NF Remove the repeating group by
Entering appropriate data into the empty columns of rows containing the repeating data Fill the blanks by duplicating the non repeating data, where required. This approach is commonly referred to as ”flattening table”. This approach will produce redundancy in a relationship, but it can be eliminated in higher normalization process.

48 UNF to 1NF Example: DreamHome Case Study
A collection of DreamHome leases (rent) form is shown in Figure 2. The lease on top is for a client called Rannia who is leasing a property in Skudai, Johor, which is owned by Dollah. For this worked example, we assume that a client rents a given property only once and cannot rent more than one property at any one time. Sample data is taken from two leases for two different clients called Rannia and Ahmad and is transformed into table format with rows and columns, as shown in Figure 3. This is an example of unnormalized table.

49 Figure 2 : Collection of Dream Home leases (rent) form
UNF to 1NF Alamat Rumah Mula Sewa Tamat Harga No Pemilik Nama PG04 PG16 Subang Jaya, Selangor. Pasir Gudang, Johor. 1/7/93 1/9/00 31/8/00 1/9/01 750 850 C040 C093 Karim Fendi Kasim Selamat House Page DREAMHOUSE LEASE Date : 28/02/2007 Client Rental Information Client Name : Ahmad Client Number : CR56 Address Rent Start Finish Monthly Owner Name Skudai, Johor Ampang, Selangor Dolah Abdullah Page DREAMHOUSE LEASE Date : 28/02/2007 Client Name : Rannia Client Number : CR76 Figure 2 : Collection of Dream Home leases (rent) form

50 Figure 3 : ClientRental UNF
UNF to 1NF clientNo CR76 CR56 cName houseNo houseAdd rentStart rentFinish rent ownerNo oName Rannia PG04 PG16 Skudai, Johor. Ampang, Selangor 1/7/93 1/9/00 31/8/00 1/9/01 750 850 C040 C093 Dolah Abdullah Ahmad PG 36 PG 16 Kuantan, Pahang. Ampang, Selangor. 20/3/90 21/6/93 25/1/00 19/6/93 3/1/00 30/8/00 1000 Figure 3 : ClientRental UNF Key attribute: clientNo Repeating group in the unnormalized table as the property rented details, which repeat for each client: Repeating Group: ( houseNo, houseAdd, rentStart, rentFinish, rent, ownerNo, oName)

51 UNF to 1NF Key attribute: clientNo Repeating group in the unnormalized table as the property rented details, which repeat for each client: Repeating Group: (houseNo, houseAdd, rentStart, rentFinish, rent, ownerNo, oName) As a consequences, there are multiple value at the intersection of certain rows and columns. For examples, there are two value for houseNo (PG4 and PG16) for the client Rannia. To transform an unnormalized table into 1NF, we ensure that there is a single value at the intersection of each row and column. This is achieved by removing the repeating group. With this approach, remove the repeating group (house rented details) by entering the appropriate client data into each row. The resulting First Normalize Form (1NF) ClientRental relation is shown in Figure 4.

52 Figure 4 : 1NF ClientRental relation
UNF to 1NF clientNo CR76 CR56 cName houseNo houseAdd rentStart rentFinish rent ownerNo oName Rannia PG04 PG16 Skudai, Johor. Ampang, Selangor 1/7/93 1/9/00 31/8/00 1/9/01 750 850 C040 C093 Dolah Abdullah Ahmad PG 36 PG 16 Kuantan, Pahang. Ampang, Selangor. 20/3/90 21/6/93 25/1/00 19/6/93 3/1/00 30/8/00 1000 Figure 4 : 1NF ClientRental relation ClientRental (clientNo, houseNo, cName, houseAdd, rentStart, rentFinish, rent, ownerNo, oName) Primary Key for the ClientRent relation is a composite key that are clientNo and houseNo ClientRent relation is in 1NF as there is a single value at the intersection of each row and columns.

53 UNF to 1NF Relationship in Figure 4 contains data that describes client, house for rent and owner of the house, which is repeated for several times. As a consequences, the ClientRental relation contains significant data redundancy. If implemented, the 1NF relation would be subject to the update anomalies. To remove some of these, transform 1NF  2NF

54 Borrow(bID, bDATE,mID,mNAME,mADD,ISBN,title,rDATE) Functional dependencies: bID, ISBN  bDATE (PK) bID  mID,mNAME,mADD,rDATE (PD) ISBN  title (PD) mID  mNAME,mADD (TD)

55 INSURANCE(cNO,cNAME,cADD,pNO,pTYPE, pDATE,purchaseDATE,pPRICE,aNO,aNAME,aTEL) Functional Dependencies: cNO,pNO,aNO  purcahseDATE (PK) cNO  cNAME,cADD (PD) pNO  pTYPE,pDATE,pPRICE (PD) aNO  aNAME,aTEL (PD)

56 RECEIPT(rNO,date,tNO,sID,sNAME,iNO,mNO,mDES,qty,price,amt,total,pay,balance) Functional Dependencies: rNO,mNO  qty, iNO, amt (PK) rNO  date,tNO,sID,sNAME,total,pay,balance (PD) mNO  mDES,price (PD) sID  sNAME (TD)

57 Second Normal Form (2NF)
Second Normal Form = A relation that is in 1NF and every non-PK NF attribute is fully functionally depends on the PK. Based on the concept of partial dependency. Dependencies based on only a part of composite primary key 2NF applies to relations with composite keys, that is, relations with PK composed of two or more attributes. A relation with a single-attribute PK is automatically in at least 2NF.

58 CONTINUE TO NEXT PRESENTATION TRANSFORMING UNF TO 2NF


Download ppt "Normalization Part 1: The Concept."

Similar presentations


Ads by Google