Presentation on theme: "Normalization Rules for Database Tables"— Presentation transcript:
1 Normalization Rules for Database Tables. Northern Arizona University, College of Business Administration.
2 Normalization - Some Definitions. A relation is a two-dimensional array with a single-valued entry in each cell, no duplicate rows, and columns whose meaning is the same across all rows. All tables used in the relational model must be relations. Normalization is a process for evaluating table structures and reorganizing them as needed to produce a set of stable, well-structured relations. An anomaly is a condition that interferes with the storage or retention of data, or creates the potential for inconsistent data. There are insertion, modification, and deletion anomalies. The normalization process should eliminate anomalies.
3 Unnormalized Tables. An unnormalized table is a table that does not meet the definition of a relation: it contains rows with multiple values for an attribute (repeating groups), or it contains duplicate rows. A table is said to be in first normal form if it meets the definition of a relation. Generally this means it contains no repeating groups of attributes. The next slide shows an example of an unnormalized table.
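As a rough illustration of the definition above, the following Python sketch tests whether a list of rows qualifies as a relation: every row has the same columns, every cell is single-valued, and no row is duplicated. The function name `is_relation` and the sample rows are assumptions made for this example, not material from the slides.

```python
def is_relation(rows):
    """Return True if rows share the same columns, hold single values, and are unique."""
    # All rows must have the same set of columns.
    if len({frozenset(row) for row in rows}) > 1:
        return False
    # Every cell must hold a single value (no repeating groups).
    for row in rows:
        if any(isinstance(v, (list, tuple, set, dict)) for v in row.values()):
            return False
    # No two rows may be identical.
    as_tuples = [tuple(sorted(row.items())) for row in rows]
    return len(as_tuples) == len(set(as_tuples))

# Hypothetical sample data.
repeating = [{"emp_id": "E1", "skill": ["Typing", "Filing"]}]
duplicated = [{"emp_id": "E1", "skill": "Typing"}, {"emp_id": "E1", "skill": "Typing"}]
clean = [{"emp_id": "E1", "skill": "Typing"}, {"emp_id": "E1", "skill": "Filing"}]
```

Here `repeating` and `duplicated` fail the check, while `clean` passes it.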
4 EMPLOYEE. This EMPLOYEE table is unnormalized - it has cells that do not contain single-valued entries. As shown, this table has no logical primary key. The E ID# does not functionally determine the value of Skill.
5 EMPLOYEE. The above employee table shows the same set of data as the previous slide. It has been reorganized into a form that could be implemented under some file processing systems, using COBOL, for instance. However, it is still not in a form that can be used by the relational model. The Skills are a multi-valued (repeating) group of attributes which cannot be identified by the primary key.
6 Eliminating Repeating Groups. Original table: EMPLOYEE. In most cases, unnormalized tables can be converted to sets of tables that are in at least first normal form by placing any repeating groups of fields in a separate table, which includes the primary key attribute from the original table along with a single occurrence of the repeating attribute (Skill in our example). A table is in first normal form if it contains no multi-valued attributes.
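A minimal sketch of the step described above: the repeating Skill group is moved to a separate table that carries the original primary key plus one skill per row. The sample rows, the column names `emp_id`, `name`, and `skills`, and the function name `split_repeating_group` are illustrative assumptions; the slide's actual table contents are not in the transcript.

```python
# Hypothetical unnormalized rows: each employee carries a list of skills.
unnormalized = [
    {"emp_id": "E1", "name": "Adams", "skills": ["Typing", "Filing"]},
    {"emp_id": "E2", "name": "Baker", "skills": ["Welding"]},
]

def split_repeating_group(rows, key, group):
    """Return (parent_rows, child_rows) with the repeating group moved out."""
    parents, children = [], []
    for row in rows:
        # The parent keeps every single-valued attribute.
        parents.append({k: v for k, v in row.items() if k != group})
        # The child gets the parent's key plus one occurrence of the repeating value.
        for value in row[group]:
            children.append({key: row[key], group: value})
    return parents, children

employee, employee_skill = split_repeating_group(unnormalized, "emp_id", "skills")
```

After the split, `employee` holds one row per employee and `employee_skill` holds one row per employee and skill pair.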
7 Eliminating Repeating Groups. Original: EMPLOYEE. Normalized: EMPLOYEE_SKILL and EMPLOYEE.
8 Logical ER Diagram in ER Studio Notation: 1st Normal Form Example
9 This Table is in First Normal Form. Schedule of Classes. There are no repeating groups of attributes. NOTE: The primary key of this table is a concatenated key - no single attribute uniquely identifies a row of the table, but the combination of Course # and Section # does uniquely identify a row. If I know that the Course is CIS 120 and the Section is section 1, I can identify a unique schedule occurrence.
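One way to see a concatenated key in action is to declare it in SQLite through Python's built-in `sqlite3` module. This is only a sketch: the table and column names, the course description "Intro to CIS", the credit hours, and the helper `try_insert` are assumptions, since the slide's table contents are not in the transcript.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """CREATE TABLE schedule (
           course_no   TEXT NOT NULL,
           section_no  INTEGER NOT NULL,
           description TEXT,
           cr_hrs      INTEGER,
           PRIMARY KEY (course_no, section_no)  -- concatenated key
       )"""
)
# Hypothetical rows: two sections of the same course.
conn.execute("INSERT INTO schedule VALUES ('CIS 120', 1, 'Intro to CIS', 3)")
conn.execute("INSERT INTO schedule VALUES ('CIS 120', 2, 'Intro to CIS', 3)")

def try_insert(row):
    """Return True if the row is accepted, False if it violates the key."""
    try:
        conn.execute("INSERT INTO schedule VALUES (?, ?, ?, ?)", row)
        return True
    except sqlite3.IntegrityError:
        return False

# Same Course # and Section # as an existing row: rejected.
duplicate_accepted = try_insert(("CIS 120", 1, "Intro to CIS", 3))
# Same Course #, new Section #: accepted.
new_section_accepted = try_insert(("CIS 120", 3, "Intro to CIS", 3))
```

Neither Course # nor Section # is unique alone here; only the pair is.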
10 Although this Table is in First Normal Form, it Contains Anomalies. Schedule of Classes. If the description of CIS 220 changes from VB Prog. to Visual C#, I must record the new value in two places (as shown) - this is a modification anomaly.
11 Although this Table is in First Normal Form, it Contains Anomalies. Schedule of Classes. If a new course has been designed and I know its description and credit hours (ACC 266, Pers. Acc., 2 hrs), I still cannot record this data until at least one section of the course is offered - an insertion anomaly.
12 A Table in First Normal Form Containing Anomalies. Schedule of Classes. If no section of ACC 255 is offered this semester, I will lose the information about the description and credit hours of this course - a deletion anomaly.
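The modification and deletion anomalies described on the last few slides can be reproduced with a few rows of Python. This is a sketch under assumptions: the credit hours for CIS 220 and the description of ACC 255 are made up, and `descriptions_for` is a hypothetical helper.

```python
# Hypothetical 1NF rows: [course_no, section_no, description, cr_hrs]
schedule = [
    ["CIS 220", 1, "VB Prog.", 3],
    ["CIS 220", 2, "VB Prog.", 3],
    ["ACC 255", 1, "Accounting", 3],
]

def descriptions_for(rows, course_no):
    """Collect the distinct descriptions stored for one course."""
    return {row[2] for row in rows if row[0] == course_no}

# Modification anomaly: updating only one of the two CIS 220 rows
# leaves the table holding two different descriptions for the same course.
schedule[0][2] = "Visual C#"
inconsistent = descriptions_for(schedule, "CIS 220")

# Deletion anomaly: removing the only ACC 255 section also removes
# everything the table knew about the course itself.
schedule = [row for row in schedule if row[0] != "ACC 255"]
acc_255_known = bool(descriptions_for(schedule, "ACC 255"))
```

After the partial update, `inconsistent` contains both descriptions, and after the deletion `acc_255_known` is false.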
13 Schedule of Classes. This table has anomalies because it contains partial dependencies. A partial dependency occurs when one or more attributes in a table depend upon (are functionally determined by) only a portion of a concatenated primary key. In this case the Description and Cr. Hrs. attributes depend only on Course #. To correct this problem, those attributes determined by only a part of the key should be placed in a separate table. Its primary key will be the portion of the original primary key required to identify them.
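A functional dependency can be checked against sample data with a small helper. The sketch below uses a hypothetical function `determines` and made-up rows, including an assumed `room` column that is not on the slide. Note that passing the check only shows the data does not contradict the dependency; whether the dependency really holds is a question about the business rules.

```python
def determines(rows, lhs, rhs):
    """Return True if equal lhs values always come with equal rhs values in rows."""
    seen = {}
    for row in rows:
        left = tuple(row[c] for c in lhs)
        right = tuple(row[c] for c in rhs)
        # The first occurrence records the rhs; later occurrences must match it.
        if seen.setdefault(left, right) != right:
            return False
    return True

# Hypothetical Schedule of Classes rows.
schedule = [
    {"course_no": "CIS 120", "section_no": 1, "description": "Intro to CIS", "cr_hrs": 3, "room": "101"},
    {"course_no": "CIS 120", "section_no": 2, "description": "Intro to CIS", "cr_hrs": 3, "room": "204"},
    {"course_no": "CIS 220", "section_no": 1, "description": "VB Prog.", "cr_hrs": 3, "room": "101"},
]

# Description and Cr. Hrs. follow from Course # alone: a partial dependency.
partial = determines(schedule, ["course_no"], ["description", "cr_hrs"])
# The room needs the whole concatenated key.
room_needs_full_key = (
    not determines(schedule, ["course_no"], ["room"])
    and determines(schedule, ["course_no", "section_no"], ["room"])
)
```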
14 Schedule of Classes. Original: Schedule of Classes. Revised: Schedule of Classes and COURSE. Notice how this structure eliminates the anomalies we found.
15 Logical ER Diagram in ER Studio Notation: 2nd Normal Form Example
16 Second Normal Form. Partial dependencies occur when non-key attributes are functionally determined by only a portion of a concatenated primary key. Partial dependencies can occur only in tables with a concatenated key. Partial dependencies can be corrected by removing those attributes to a separate table whose primary key is just the portion of the key from the original table needed to functionally determine them. A table is in second normal form if it is in first normal form and it contains no partial dependencies.
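The correction described above can be sketched in SQLite via Python's `sqlite3` module: Description and Cr. Hrs. move to a COURSE table keyed by Course # alone. Table and column names are assumptions, as is the credit-hour value for CIS 220; the ACC 266 values come from the earlier insertion-anomaly slide.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE course (
        course_no   TEXT PRIMARY KEY,
        description TEXT,
        cr_hrs      INTEGER
    );
    CREATE TABLE schedule (
        course_no  TEXT NOT NULL REFERENCES course(course_no),
        section_no INTEGER NOT NULL,
        PRIMARY KEY (course_no, section_no)
    );
    INSERT INTO course VALUES ('CIS 220', 'VB Prog.', 3);
    INSERT INTO schedule VALUES ('CIS 220', 1);
    INSERT INTO schedule VALUES ('CIS 220', 2);
    -- A course with no sections yet can now be recorded.
    INSERT INTO course VALUES ('ACC 266', 'Pers. Acc.', 2);
    """
)

# A single UPDATE changes the description seen by every section.
conn.execute("UPDATE course SET description = 'Visual C#' WHERE course_no = 'CIS 220'")
sections = conn.execute(
    """SELECT s.section_no, c.description
       FROM schedule s JOIN course c ON c.course_no = s.course_no
       WHERE s.course_no = 'CIS 220'
       ORDER BY s.section_no"""
).fetchall()
```

With this structure the description is stored once per course, so the modification and insertion anomalies from the earlier slides no longer arise.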
17 A Table in Second Normal Form Which Has Anomalies. PROFESSOR. This table is in 2nd normal form since it has no repeating groups of attributes (first normal form) and its primary key is not concatenated. However, the table above still has anomalies.
18 Anomalies in the Example Professor Table. PROFESSOR. Modification anomalies - if the Dept Aide serving the ECO department changes, or if the Fax # of the ECO department changes, this change would need to be made in several records. Insertion anomalies - I want to start a new department and have a Dept Code, a Dept Aide, and a Dept Fax # (e.g., MKT, T. Taylor, ). I can't add this data to the table until at least one professor is hired to teach in this new department.
19 Anomalies in the Example Professor Table. PROFESSOR. Deletion anomalies - if Prof # L29 (the only professor in the CIS department in our example table) is deleted, we would lose the information about the name of the Dept Aide for CIS and the Dept Fax # for CIS.
20 This Professor Table has Transitive Dependencies. PROFESSOR. The anomalies we have found occur because the Professor table has transitive dependencies. Dept Code, Dept Aide, and Dept Fax # are all attributes of a DEPARTMENT entity which is uniquely identified by Dept Code - if I know Dept Code, I can uniquely identify Dept Aide and Dept Fax #. Knowing Prof # allows me to identify these attributes, but only through a chain of inferences - Prof # uniquely identifies Dept Code which, in turn, uniquely identifies the other DEPARTMENT attributes. The anomalies can be resolved by removing the attributes determined by a non-key attribute to a separate table.
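The decomposition suggested above amounts to two projections of the original table. The sketch below uses made-up rows (only Prof # L29 and the CIS and ECO department codes come from the slides), an assumed `prof_name` column, and a hypothetical helper `project`.

```python
# Hypothetical PROFESSOR rows; names and fax numbers are invented.
professor_rows = [
    {"prof_no": "L29", "prof_name": "Lee", "dept_code": "CIS", "dept_aide": "A. Adams", "dept_fax": "555-0100"},
    {"prof_no": "E10", "prof_name": "Evans", "dept_code": "ECO", "dept_aide": "B. Brown", "dept_fax": "555-0101"},
    {"prof_no": "E11", "prof_name": "Ellis", "dept_code": "ECO", "dept_aide": "B. Brown", "dept_fax": "555-0101"},
]

def project(rows, columns):
    """Project rows onto columns, dropping duplicate rows (first-seen order kept)."""
    seen, out = set(), []
    for row in rows:
        values = tuple(row[c] for c in columns)
        if values not in seen:
            seen.add(values)
            out.append(dict(zip(columns, values)))
    return out

# Attributes determined by the non-key attribute Dept Code move to DEPARTMENT.
department = project(professor_rows, ["dept_code", "dept_aide", "dept_fax"])
# PROFESSOR keeps Dept Code so each professor can still be linked to a department.
professor = project(professor_rows, ["prof_no", "prof_name", "dept_code"])
```

The ECO aide and fax number now appear once in `department` instead of once per ECO professor.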
22 Logical ER Diagram in ER Studio Notation: 3rd Normal Form Example
23 Third Normal Form. Transitive dependencies occur when non-key attributes are functionally determined by other non-key attributes. Transitive dependencies can be corrected by removing the attributes to a separate table whose primary key is the attribute of the original table which functionally determines them. The functionally determining attribute serves as a foreign key in the original table. A table is in third normal form if it is in second normal form and it contains no transitive dependencies.
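A possible SQLite rendering of this structure, using Python's `sqlite3` module, keeps Dept Code in PROFESSOR as a foreign key to a separate DEPARTMENT table. Table and column names, the CIS aide name, and the CIS fax number are assumptions; the MKT fax number is left NULL because the transcript does not give it. The helper `insert_professor` is hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled
conn.executescript(
    """
    CREATE TABLE department (
        dept_code TEXT PRIMARY KEY,
        dept_aide TEXT,
        dept_fax  TEXT
    );
    CREATE TABLE professor (
        prof_no   TEXT PRIMARY KEY,
        dept_code TEXT NOT NULL REFERENCES department(dept_code)
    );
    INSERT INTO department VALUES ('CIS', 'A. Adams', '555-0100');
    INSERT INTO professor VALUES ('L29', 'CIS');
    -- A new department can be recorded before any professor is hired.
    INSERT INTO department VALUES ('MKT', 'T. Taylor', NULL);
    -- Deleting the only CIS professor no longer deletes the CIS department data.
    DELETE FROM professor WHERE prof_no = 'L29';
    """
)

def insert_professor(prof_no, dept_code):
    """Return True if the row is accepted, False if a constraint rejects it."""
    try:
        conn.execute("INSERT INTO professor VALUES (?, ?)", (prof_no, dept_code))
        return True
    except sqlite3.IntegrityError:
        return False

cis_aide = conn.execute(
    "SELECT dept_aide FROM department WHERE dept_code = 'CIS'"
).fetchone()
# A professor pointing at a department that does not exist is rejected.
orphan_rejected = not insert_professor("X1", "ZZZ")
```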
24 A Proposed Normalization Process for Database Designers. Examine each table of the proposed structure and perform the following operations: (1) Remove any repeating groups of attributes (multi-valued attributes) to a separate table. If there are independent sets of multi-valued attributes, place each set in a separate table. (2) Remove any attributes that are functionally determined by only a portion of a concatenated key to a separate table. (3) Remove any attributes that are functionally determined by a non-key attribute to a separate table.
25 Review Question: What normalization rule(s) are violated by the table below? How would you revise the table structure? Write out your answer on a piece of scratch paper.
26 Review Question Solution. This table violates both 2nd and 3rd normal forms. Emp. Name and Emp. Class both depend only on Emp No, which is part of the concatenated key - this violates 2nd normal form. Wage Rate is actually determined by Emp Class, a non-key attribute, which violates 3rd normal form. Original table. Normalized tables: Employee Class, Employee Hours, Employee.
27 Logical ER Diagram in ER Studio Notation: Review Question Example
28 Merging Relations. View integration - combining entities from multiple ER models into common relations. Issues to watch out for when merging entities from different ER models: Synonyms - two or more attributes with different names but the same meaning. Homonyms - attributes with the same name but different meanings. Transitive dependencies - even if relations are in 3NF prior to merging, they may not be after merging. Supertype/subtype relationships - may be hidden prior to merging.
29 Enterprise Keys. Primary keys that are unique in the whole database, not just within a single relation. Corresponds with the concept of an object ID in object-oriented systems.
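One simple way to picture an enterprise key is a single counter that every relation draws its key values from, so no key value is ever reused anywhere in the database. The sketch below is an assumption-laden illustration, not the scheme shown in the slides' figure: the relation names EMPLOYEE and CUSTOMER, the `object_registry` dictionary, and the helper `new_row` are all hypothetical.

```python
import itertools

_next_oid = itertools.count(1)  # one counter shared by all relations

object_registry = {}  # oid -> name of the relation that owns it

def new_row(relation, table, **attrs):
    """Store attrs in table under a database-wide unique key and return the key."""
    oid = next(_next_oid)
    object_registry[oid] = relation
    table[oid] = attrs
    return oid

employee, customer = {}, {}
e1 = new_row("EMPLOYEE", employee, name="Adams")
c1 = new_row("CUSTOMER", customer, name="Baker")
e2 = new_row("EMPLOYEE", employee, name="Clark")
```

Because the keys never collide across tables, a key value alone is enough to find the owning relation through `object_registry`, much like an object ID.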
30 Figure 4-31 Enterprise keys: (a) Relations with enterprise key, (b) Sample data with enterprise key
31 Mapping Unary Relationships. One-to-Many - recursive foreign key in the same relation. Many-to-Many - two relations: one for the entity type, and one for an associative relation in which the primary key has two attributes, both taken from the primary key of the entity.
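Both mappings above can be sketched in SQLite through Python's `sqlite3` module. The table names follow the EMPLOYEE, ITEM, and COMPONENT relations named in the figure captions; the column names, the `manager_id` foreign key, and all sample rows are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled
conn.executescript(
    """
    -- Unary 1:N: recursive foreign key in the same relation.
    CREATE TABLE employee (
        emp_id     INTEGER PRIMARY KEY,
        name       TEXT,
        manager_id INTEGER REFERENCES employee(emp_id)
    );
    -- Unary M:N: one relation for the entity, one associative relation
    -- whose two-part primary key is taken twice from the entity's key.
    CREATE TABLE item (
        item_no TEXT PRIMARY KEY,
        descrip TEXT
    );
    CREATE TABLE component (
        item_no      TEXT NOT NULL REFERENCES item(item_no),
        component_no TEXT NOT NULL REFERENCES item(item_no),
        quantity     INTEGER,
        PRIMARY KEY (item_no, component_no)
    );
    INSERT INTO employee VALUES (1, 'Adams', NULL);
    INSERT INTO employee VALUES (2, 'Baker', 1);
    INSERT INTO employee VALUES (3, 'Clark', 1);
    INSERT INTO item VALUES ('PC1', 'Desktop PC');
    INSERT INTO item VALUES ('PC2', 'Workstation PC');
    INSERT INTO item VALUES ('MB1', 'Motherboard');
    INSERT INTO item VALUES ('RAM1', 'RAM chip');
    INSERT INTO component VALUES ('PC1', 'MB1', 1);
    INSERT INTO component VALUES ('PC1', 'RAM1', 4);
    INSERT INTO component VALUES ('PC2', 'RAM1', 8);
    """
)

# Employees managed by employee 1.
reports = [r[0] for r in conn.execute(
    "SELECT name FROM employee WHERE manager_id = 1 ORDER BY emp_id")]
# Components that make up item PC1.
parts = conn.execute(
    "SELECT component_no, quantity FROM component WHERE item_no = 'PC1' ORDER BY component_no"
).fetchall()
# Items that use RAM1 as a component (the other direction of the M:N relationship).
used_in = [r[0] for r in conn.execute(
    "SELECT item_no FROM component WHERE component_no = 'RAM1' ORDER BY item_no")]
```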
32 Figure 4-17 Mapping a unary 1:N relationship: (a) EMPLOYEE entity with unary relationship, (b) EMPLOYEE relation with recursive foreign key. ER Studio notation.
33 Figure 4-17 Mapping a unary 1:N relationship: (a) EMPLOYEE entity with unary relationship, (b) EMPLOYEE relation with recursive foreign key
34 Figure 4-18 Mapping a unary M:N relationship: (a) Bill-of-materials relationships (M:N), (b) ITEM and COMPONENT relations. ER Studio notation with sample data for the ITEM and COMPONENT tables (column headings: ItemNo, ItemDescrip, ComponentNo, Quantity).
35 Figure 4-18 Mapping a unary M:N relationship: (a) Bill-of-materials relationships (M:N), (b) ITEM and COMPONENT relations