Presentation on theme: "Normalization What is it?"— Presentation transcript:
1 Normalization What is it? It is the process of assigning attributes to entities. Normalization reduces data redundancies and, by extension, helps eliminate the data anomalies that result from those redundancies.
2 Goal of Normalization: Organize data elements so that each is stored in one place and one place only (with the exception of foreign keys, which are shared).
3 Unnormalized Data (no normalization): Puppy Number, Puppy Name, Kennel Code, Kennel Name, Kennel Location, Breeder, Breed, Trick ID 1…n, Trick Name 1…n, Trick Where Learned 1…n, Skill Level 1…n, Costume 1…n. Trick ID, Trick Name, Trick Where Learned, Skill Level, and Costume all repeat multiple times.
4 First Normal Form: A relation R is in 1NF if and only if all underlying domains contain atomic values only.
5 First Normal Form: Eliminate repeating groups. Make a separate table for each set of related attributes, and give each table a primary key.
6 First Normal Form: Trick (along with skill and costume, assuming that skill and costume relate to trick) is a repeating group. Form a new table to hold trick information.
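As a rough illustration of removing the repeating group, the sketch below splits one unnormalized puppy record into a puppy row and one trick row per repeated item. The field names and sample values (such as "Rex" and "roll over") are invented for the example and do not come from the slides.

```python
# Hypothetical unnormalized record: the trick fields repeat 1…n times.
unnormalized = {
    "puppy_number": 52,
    "puppy_name": "Rex",
    "kennel_code": "K1",
    "tricks": [
        {"trick_id": 27, "trick_name": "roll over",
         "where_learned": "home", "skill_level": 9, "costume": "clown"},
        {"trick_id": 16, "trick_name": "nose stand",
         "where_learned": "kennel", "skill_level": 5, "costume": "wizard"},
    ],
}


def to_first_normal_form(record):
    """Split one record into a puppy row and one trick row per repeated item."""
    # Everything except the repeating group stays with the puppy.
    puppy_row = {k: v for k, v in record.items() if k != "tricks"}
    # Each repeated item becomes its own row, keyed by (puppy_number, trick_id).
    trick_rows = [
        {"puppy_number": record["puppy_number"], **trick}
        for trick in record["tricks"]
    ]
    return puppy_row, trick_rows


puppy_row, trick_rows = to_first_normal_form(unnormalized)
```

Every value in the resulting rows is atomic, and each trick row carries the puppy number so it can be matched back to its puppy.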
7 Second Normal Form: A relation R is in 2NF if it is in 1NF and every non-key attribute is fully dependent on the primary key.
8 Second Normal Form: Eliminate redundant data. If an attribute depends on only part of a multi-attribute (composite) key, remove it to a separate table.
9 Second Normal Form: Trick Name is only partially dependent on the composite key (Puppy Number, Trick ID); it is fully dependent on Trick ID alone. Change the Trick table so that it holds only information dependent on Trick ID. Form a new table to hold information about the combination of Puppy and Trick.
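One way to picture the split is the in-memory SQLite sketch below. The table and column names (trick, puppy_trick, where_learned, and so on) and the sample rows are assumptions made for the example; Costume is left out for brevity.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE trick (
    trick_id   INTEGER PRIMARY KEY,
    trick_name TEXT NOT NULL            -- depends on trick_id alone
);
CREATE TABLE puppy_trick (
    puppy_number  INTEGER NOT NULL,
    trick_id      INTEGER NOT NULL REFERENCES trick(trick_id),
    where_learned TEXT,                 -- depends on the whole key
    skill_level   INTEGER,              -- depends on the whole key
    PRIMARY KEY (puppy_number, trick_id)
);
""")
conn.executemany("INSERT INTO trick VALUES (?, ?)",
                 [(27, "roll over"), (16, "nose stand")])
conn.executemany("INSERT INTO puppy_trick VALUES (?, ?, ?, ?)",
                 [(52, 27, "home", 9), (53, 27, "kennel", 7), (52, 16, "home", 5)])

# Renaming a trick now touches exactly one row, however many puppies know it.
conn.execute("UPDATE trick SET trick_name = 'barrel roll' WHERE trick_id = 27")

rows = conn.execute("""
    SELECT pt.puppy_number, t.trick_name
    FROM puppy_trick AS pt
    JOIN trick AS t ON t.trick_id = pt.trick_id
    WHERE pt.trick_id = 27
    ORDER BY pt.puppy_number
""").fetchall()
```

Because the trick name lives in one place, both puppies that know trick 27 see the new name after a single update.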
10 Third Normal Form: A relation R is in 3NF if it is in 2NF and every non-key attribute is non-transitively dependent on the primary key. A stronger statement often given alongside it: R is in 2NF and every determinant is a candidate key. Strictly, that second condition defines Boyce-Codd normal form (BCNF), which implies 3NF.
11 Third Normal Form: Eliminate columns not dependent on the primary key. If attributes do not contribute to a description of the key, remove them to a separate table.
12 Third Normal Form: Kennel information is not dependent on the Puppy Number. Kennel Name, Kennel Location, and Breeder are dependent on Kennel Code. Form a Kennel table, with Kennel Code as key.
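The dependency on Kennel Code can be checked and factored out with a small sketch like the one below. The helper names holds and project, the column names, and the sample rows (including the kennel locations) are all hypothetical. A dependency that holds in sample data is only evidence, not proof, that it is a business rule.

```python
def holds(rows, determinant, dependent):
    """Return True if the dependency determinant -> dependent holds in rows."""
    seen = {}
    for row in rows:
        key = tuple(row[c] for c in determinant)
        value = tuple(row[c] for c in dependent)
        # The first value seen for a key must match every later one.
        if seen.setdefault(key, value) != value:
            return False
    return True


def project(rows, columns):
    """Project rows onto columns, dropping duplicates but keeping order."""
    result = []
    for row in rows:
        projected = {c: row[c] for c in columns}
        if projected not in result:
            result.append(projected)
    return result


# Invented sample data for illustration only.
puppies = [
    {"puppy_number": 52, "puppy_name": "Rex", "kennel_code": "K1",
     "kennel_name": "Khabul Khennels", "kennel_location": "Springfield"},
    {"puppy_number": 53, "puppy_name": "Fido", "kennel_code": "K1",
     "kennel_name": "Khabul Khennels", "kennel_location": "Springfield"},
    {"puppy_number": 54, "puppy_name": "Spot", "kennel_code": "K2",
     "kennel_name": "Second Kennel", "kennel_location": "Shelbyville"},
]

kennel_determined = holds(puppies, ["kennel_code"],
                          ["kennel_name", "kennel_location"])
puppy_table = project(puppies, ["puppy_number", "puppy_name", "kennel_code"])
kennel_table = project(puppies, ["kennel_code", "kennel_name", "kennel_location"])
```

The puppy table keeps Kennel Code as a foreign key, while each kennel's name and location are stored once in the kennel table.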
13 Fourth Normal Form: A relation R is in 4NF if and only if all multi-valued dependencies are functional dependencies.
14 Fourth Normal Form: Isolate independent multiple relationships. No table may contain two or more 1:n or n:n relationships that are not directly related.
15 Fourth Normal Form Applied: Trick and Costume are currently in the same table. Are Trick and Costume directly related? Does the Costume dictate the Trick the puppy does? Does the Trick dictate the Costume the puppy wears? If not, separate them.
16 Fourth Normal Form: Trick and Costume are two different 1:n relationships that are not directly related to each other. Separate them into two tables.
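To see why the separation helps, the sketch below assumes a puppy's tricks and costumes really are independent. The puppy number, trick names, and costume names are invented. A single combined table then has to hold every trick-costume pairing, while the two separate tables hold each fact once and still rebuild the combined table by a join.

```python
from itertools import product

PUPPY = 52  # hypothetical puppy number
tricks = {"roll over", "nose stand"}
costumes = {"clown", "wizard", "pirate"}

# Combined table: with independent facts, every pairing has to be stored.
combined = {(PUPPY, trick, costume) for trick, costume in product(tricks, costumes)}

# 4NF decomposition: one table per independent 1:n relationship.
puppy_trick = {(p, trick) for p, trick, _ in combined}
puppy_costume = {(p, costume) for p, _, costume in combined}

# Joining the two tables on the puppy number rebuilds the combined table.
rejoined = {
    (p, trick, costume)
    for (p, trick) in puppy_trick
    for (p2, costume) in puppy_costume
    if p == p2
}
```

Here the combined table needs 2 x 3 = 6 rows, while the separate tables need 2 + 3 = 5, and the gap widens as either list grows.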
17 Fifth Normal Form: A relation R is in 5NF if and only if every join dependency in R is implied by the candidate keys.
18 Fifth Normal Form: Isolate semantically related multiple relationships. There may be practical constraints on information that justify separating logically related many-to-many relationships.
19 Why Fifth Normal Form: Suppose the database must record which breeds are available at each kennel and which breeders supply those breeds. We could satisfy this with a Kennel-Breeder-Breed table.
21 What’s the Problem? Now suppose a kennel selling any breed must offer that breed from all breeders it deals with. In other words, if Khabul Khennels sells Afghans and wants to sell any Daisy Hill puppies, it must sell Daisy Hill Afghans. The need for fifth normal form becomes clear when we consider inserts and deletes. Suppose that a kennel (whose number in the database happens to be 5) decides to offer three new breeds: Spaniels, Dachshunds, and West Indian Banana-Biters. Suppose further that this kennel already deals with three breeders that can supply those breeds. This will require nine new rows in the database, one for each breeder-and-breed combination. Breaking up the table reduces the number of inserts to six. Here are the tables necessary for fifth normal form, shown with the six newly inserted rows.
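The row counts can be explored with the sketch below. The breeder names are placeholders, and the code assumes each of the three breeders can supply all three breeds. The slide's own tables did not survive in this transcript, so the sketch does not claim to reproduce them. It only shows that the nine-row table equals the join of its three two-column projections, which is the join dependency that fifth normal form exploits.

```python
from itertools import product

KENNEL = 5
breeds = ["Spaniel", "Dachshund", "West Indian Banana-Biter"]
breeders = ["Breeder A", "Breeder B", "Breeder C"]  # hypothetical names

# Single Kennel-Breeder-Breed table: one row per breeder-and-breed combination.
single_table = {
    (KENNEL, breeder, breed) for breeder, breed in product(breeders, breeds)
}

# The three pairwise projections of that table.
kennel_breed = {(k, breed) for k, _, breed in single_table}
kennel_breeder = {(k, breeder) for k, breeder, _ in single_table}
breeder_breed = {(breeder, breed) for _, breeder, breed in single_table}

# Three-way join: keep a combination only if all three pairings exist.
rejoined = {
    (k, breeder, breed)
    for (k, breed) in kennel_breed
    for (k2, breeder) in kennel_breeder
    if k == k2 and (breeder, breed) in breeder_breed
}
```

In this sketch the single table holds nine rows for kennel 5, while the two kennel-specific projections hold three rows each. Joining all three projections gives back exactly the original nine rows.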
23 Fifth Normal Form: If significant updating is involved, fifth normal form can mean significant savings. It is possible to lose information with fifth normal form.
24 Normalization (summary): Take projections of the original 1NF relation to eliminate non-full functional dependencies. Take projections of these 2NF relations to eliminate transitive functional dependencies. Take projections of these 3NF relations to eliminate any remaining functional dependencies that do not arise from candidate keys.
25 Normalization (summary): Take projections of these 3NF relations to eliminate multivalued dependencies that are not also functional dependencies. Take projections of these 4NF relations to eliminate any remaining join dependencies that are not also multivalued dependencies.
26 Normalization Guide: Single membership of an instance in a set is recognized by a stable, unique identifier (key). All the attributes in an entity depend on all the key attributes of that entity. None of the attributes depend on any attributes other than the keys. Any attributes that can be recognized as a separate set have their own entity and key.
27 Normalization (simplified) The key, the whole key, and nothing but the key, so help me Codd.
28 Denormalization: Derived columns. Deliberate duplication. Removal or disabling of constraints.
29 Derived Columns: Calculated fields, such as Total Amount (Qty x Unit Price). While useful, they are not part of a fully normalized model. They may be added back into the physical database.
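For contrast, the sketch below keeps the derived value out of the base table and computes it in a view. Storing total_amount as a physical column would be the denormalized alternative. The table, view, and column names and the sample rows are invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE order_line (
        line_id    INTEGER PRIMARY KEY,
        qty        INTEGER NOT NULL,
        unit_price REAL NOT NULL
    )
""")
conn.executemany("INSERT INTO order_line VALUES (?, ?, ?)",
                 [(1, 3, 2.50), (2, 10, 1.25)])

# The total is derived on demand, so it can never disagree with qty or price.
conn.execute("""
    CREATE VIEW order_line_total AS
    SELECT line_id, qty, unit_price, qty * unit_price AS total_amount
    FROM order_line
""")

totals = conn.execute(
    "SELECT line_id, total_amount FROM order_line_total ORDER BY line_id"
).fetchall()
```

The trade-off is that the multiplication runs on every read. A stored column avoids that cost but must be kept in step whenever quantity or price changes.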
30 Deliberate Duplication: Duplicating the same column in two or more tables. It might seem desirable to duplicate one or more columns to avoid joins, such as duplicating an employee name where the employee number is a foreign key. This would require updating multiple tables if that employee changed their name.
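The cost of the duplicated name can be sketched as follows. The employee and assignment tables, the stale_rows helper, and the sample names are hypothetical. The example only shows that a name change must reach every copy, or the copies drift apart.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE employee (
    emp_no   INTEGER PRIMARY KEY,
    emp_name TEXT NOT NULL
);
CREATE TABLE assignment (
    assignment_id INTEGER PRIMARY KEY,
    emp_no        INTEGER NOT NULL REFERENCES employee(emp_no),
    emp_name      TEXT NOT NULL   -- deliberate duplicate, kept to avoid a join
);
INSERT INTO employee VALUES (7, 'Pat Smith');
INSERT INTO assignment VALUES (1, 7, 'Pat Smith');
""")


def stale_rows(connection):
    """Return assignment ids whose copied name no longer matches employee."""
    return connection.execute("""
        SELECT a.assignment_id
        FROM assignment AS a
        JOIN employee AS e ON e.emp_no = a.emp_no
        WHERE a.emp_name != e.emp_name
    """).fetchall()


# Updating only the employee table leaves the duplicate out of date.
conn.execute("UPDATE employee SET emp_name = 'Pat Jones' WHERE emp_no = 7")
stale_after_one_update = stale_rows(conn)

# A second update is needed to bring the copy back in line.
conn.execute("UPDATE assignment SET emp_name = 'Pat Jones' WHERE emp_no = 7")
stale_after_both_updates = stale_rows(conn)
```

After the first update one assignment row is stale; only the second update clears it.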
31 Removal of Constraints: The removal of referential integrity (relationship) constraints to speed up update processes. The goal of the logical data model (LDM) is to translate the business model (CDM) into a fully normalized database design, and the relationships are part of that design. Constraints may be removed from the physical database, but not from the LDM.
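What disabling a constraint gives up can be sketched with SQLite, where foreign key enforcement is switched per connection with PRAGMA foreign_keys. The kennel and puppy tables and the kennel code NO_SUCH_KENNEL are invented. The connection is opened in autocommit mode (isolation_level=None) because the pragma has no effect inside an open transaction.

```python
import sqlite3

# Autocommit mode, so the PRAGMA statements below are not inside a transaction.
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE kennel (
    kennel_code TEXT PRIMARY KEY
);
CREATE TABLE puppy (
    puppy_number INTEGER PRIMARY KEY,
    kennel_code  TEXT REFERENCES kennel(kennel_code)
);
""")

# With enforcement on, a puppy pointing at a missing kennel is rejected.
try:
    conn.execute("INSERT INTO puppy VALUES (1, 'NO_SUCH_KENNEL')")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

# With enforcement off, the same insert succeeds and leaves an orphan row.
conn.execute("PRAGMA foreign_keys = OFF")
conn.execute("INSERT INTO puppy VALUES (1, 'NO_SUCH_KENNEL')")

orphans = conn.execute("""
    SELECT COUNT(*)
    FROM puppy AS p
    LEFT JOIN kennel AS k ON k.kennel_code = p.kennel_code
    WHERE k.kennel_code IS NULL
""").fetchone()[0]
```

The insert is cheaper without the check, but nothing in the physical database then prevents orphaned rows, so the rule has to be upheld elsewhere.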
32 Denormalization: Denormalization may be done to the physical database design. Any denormalization should be deliberate and made for rational and supportable reasons.
33 DBA’s dirty little secret: Normalization is over-valued by those who do it. Normalization is under-valued by those who don’t.