
Dr. Mohamed Osman Hegaz1 Logical data base design (2) Normalization.




1 Dr. Mohamed Osman Hegaz1 Logical data base design (2) Normalization

2 Normalization: The process of decomposing unsatisfactory ("bad") relations by breaking up their attributes into smaller relations. A bad relation contains redundant or duplicated values, which cause update anomalies. Normal form: A condition, stated in terms of the keys and FDs (functional dependencies) of a relation, that certifies whether a relation schema is in a particular normal form.

3 The Process of Normalization: A formal technique for analyzing a relation based on its primary key and the functional dependencies between its attributes. It is often executed as a series of steps. Each step corresponds to a specific normal form, which has known properties. As normalization proceeds, relations become progressively more restricted (stronger) in format and less vulnerable to update anomalies.

4 Relationship Between Normal Forms

5 “Key” Concepts. Superkey: a set of attributes such that no two tuples have the same values for these attributes. Candidate key: a minimal superkey, that is, one from which no attribute can be removed without losing the superkey property. Primary key: a selected candidate key.

6 Unnormalized Form (UNF): A table that contains one or more repeating groups. To create an unnormalized table, transform the data from the information source (e.g. a form) into table format with columns and rows.

7 First Normal Form (1NF): A relation in which the intersection of each row and column contains one and only one value. A relation schema is in 1NF if the domains of its attributes include only atomic (simple, indivisible) values and the value of each attribute in a tuple is a single value from the domain of that attribute. 1NF disallows having a set of values, a tuple of values, or a combination of both as an attribute value for a single tuple. It also disallows “relations within relations” and “relations as attributes of tuples”.

8 UNF to 1NF: Nominate an attribute or group of attributes to act as the key for the unnormalized table. Identify the repeating group(s) in the unnormalized table, that is, the columns that repeat for each value of the key attribute(s).

9 UNF to 1NF (cont.): Remove each repeating group either by entering the appropriate data into the empty columns of the rows containing the repeating data (“flattening” the table), or by placing the repeating data, along with a copy of the original key attribute(s), into a separate relation.
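
The two ways of removing a repeating group can be sketched in Python. This is only an illustration: the client/property data and the helper names `flatten` and `split` are hypothetical and do not come from the slides.

```python
# Hypothetical unnormalized rows: each client carries a repeating group of properties.
unf = [
    {"client_no": "C1", "name": "Ann", "properties": [("P4", 350), ("P16", 450)]},
    {"client_no": "C2", "name": "Bob", "properties": [("P4", 350)]},
]


def flatten(rows, group, group_cols):
    """Option 1: repeat the non-repeating columns on every row of the group."""
    flat = []
    for row in rows:
        base = {k: v for k, v in row.items() if k != group}
        for item in row[group]:
            flat.append({**base, **dict(zip(group_cols, item))})
    return flat


def split(rows, key, group, group_cols):
    """Option 2: move the repeating group into its own relation with a copy of the key."""
    parent = [{k: v for k, v in row.items() if k != group} for row in rows]
    child = [
        {key: row[key], **dict(zip(group_cols, item))}
        for row in rows
        for item in row[group]
    ]
    return parent, child


flat = flatten(unf, "properties", ["property_no", "rent"])
parent, child = split(unf, "client_no", "properties", ["property_no", "rent"])
print(len(flat), len(parent), len(child))  # 3 2 3
```

Flattening keeps a single relation but repeats `name` on every row, which is the redundancy the later normal forms remove. Splitting avoids that repetition at the cost of a join.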

10 Non-1NF Relation

11 Relations in 1NF

12 Relations in 1NF

13 (a) Relation schema that is not in 1NF. (b) Example relation instance. (c) 1NF relation with redundancy.

14 (a) Schema of the EMP_PROJ relation with a “nested relation” PROJS. (b) Example extension of the EMP_PROJ relation showing nested relations within each tuple.

15 Decomposing EMP_PROJ into 1NF relations EMP_PROJ1 and EMP_PROJ2 by propagating the primary key.

16 Second Normal Form (2NF): A relation schema is in 2NF if it is in 1NF and every non-prime attribute is fully functionally dependent on the primary key. An FD X -> Y is termed “full” if removing any attribute from X means that the FD no longer holds. An FD X -> Y is termed “partial” if some attribute can be removed from X and the dependency still holds.
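
The full/partial distinction can be tested mechanically with attribute closure. The sketch below assumes a set of FDs for an EMP_PROJ-style relation; the slides show that relation only as a figure, so the attribute names and FDs here are illustrative assumptions.

```python
def closure(attrs, fds):
    """Return every attribute determined by attrs under fds ((lhs, rhs) frozenset pairs)."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


def holds(x, y, fds):
    """True if X -> Y follows from fds."""
    return set(y) <= closure(x, fds)


def is_full(x, y, fds):
    """X -> Y is full if it holds and removing any single attribute of X breaks it."""
    x = set(x)
    return holds(x, y, fds) and all(not holds(x - {a}, y, fds) for a in x)


# Assumed FDs (illustrative, not taken from the slides).
fds = [
    (frozenset({"ssn", "pnumber"}), frozenset({"hours"})),
    (frozenset({"ssn"}), frozenset({"ename"})),
    (frozenset({"pnumber"}), frozenset({"pname", "plocation"})),
]
key = {"ssn", "pnumber"}

hours_full = is_full(key, {"hours"}, fds)
ename_full = is_full(key, {"ename"}, fds)
print(hours_full, ename_full)  # True False
```

Here `ename` depends on `ssn` alone, so its dependency on the whole key is partial, and under these assumed FDs the relation would not be in 2NF.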

17 1NF to 2NF: Identify the primary key for the 1NF relation. Identify the functional dependencies in the relation. If partial dependencies on the primary key exist, remove them by placing the partially dependent attributes in a new relation along with a copy of their determinant.
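
These steps can be sketched as a small schema-level procedure. It is a simplified illustration that assumes the primary key is the only candidate key, and the function name `to_2nf`, the attribute names, and the FDs are all hypothetical.

```python
from itertools import combinations


def closure(attrs, fds):
    """Return every attribute determined by attrs under fds ((lhs, rhs) frozenset pairs)."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


def to_2nf(attrs, key, fds):
    """Split off attributes that depend on a proper subset of the primary key."""
    key = frozenset(key)
    nonkey = set(attrs) - key
    relations = {}
    moved = set()
    # Try smaller subsets first so each attribute goes with its smallest determinant.
    for size in range(1, len(key)):
        for subset in combinations(sorted(key), size):
            dependent = (closure(subset, fds) & nonkey) - moved
            if dependent:
                relations[frozenset(subset)] = set(subset) | dependent
                moved |= dependent
    # Whatever is left depends on the whole key.
    relations[key] = set(key) | (nonkey - moved)
    return relations


# Assumed schema and FDs (illustrative, not taken from the slides).
attrs = {"ssn", "pnumber", "hours", "ename", "pname", "plocation"}
fds = [
    (frozenset({"ssn", "pnumber"}), frozenset({"hours"})),
    (frozenset({"ssn"}), frozenset({"ename"})),
    (frozenset({"pnumber"}), frozenset({"pname", "plocation"})),
]
result = to_2nf(attrs, {"ssn", "pnumber"}, fds)
for determinant, relation in result.items():
    print(sorted(determinant), "->", sorted(relation))
```

Each new relation is keyed by the determinant that was copied into it, and the original relation keeps only the attributes that depend on the full key.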

18 Normalizing EMP_PROJ into 2NF relations

19 Third Normal Form (3NF): Based on the concept of transitive dependency: if A, B and C are attributes of a relation such that A -> B and B -> C, then C is transitively dependent on A through B (provided that A is not functionally dependent on B or C). 3NF: A relation that is in 1NF and 2NF and in which no non-primary-key attribute is transitively dependent on the primary key.
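
A transitive dependency on the primary key can be detected from the FDs alone. The sketch below assumes FDs for an EMP_DEPT-style relation; the attribute names, the FDs, and the helper `transitive_dependencies` are illustrative assumptions rather than content from the slides.

```python
def closure(attrs, fds):
    """Return every attribute determined by attrs under fds ((lhs, rhs) frozenset pairs)."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


def transitive_dependencies(key, fds):
    """List (via, attr) pairs where key -> via -> attr and via does not determine the key."""
    key = set(key)
    key_closure = closure(key, fds)
    found = []
    for lhs, rhs in fds:
        if not lhs <= key_closure:
            continue  # the key does not determine this left-hand side
        if key <= closure(lhs, fds):
            continue  # lhs determines the key, so the dependency is direct
        for attr in sorted(rhs - lhs - key):
            found.append((tuple(sorted(lhs)), attr))
    return found


# Assumed FDs (illustrative, not taken from the slides).
fds = [
    (frozenset({"ssn"}), frozenset({"ename", "bdate", "dnumber"})),
    (frozenset({"dnumber"}), frozenset({"dname", "dmgr_ssn"})),
]
transitive = transitive_dependencies({"ssn"}, fds)
print(transitive)
```

Under these assumptions `dname` and `dmgr_ssn` depend on `ssn` only through `dnumber`, so the relation would violate 3NF.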

20 2NF to 3NF: Identify the primary key in the 2NF relation. Identify the functional dependencies in the relation. If transitive dependencies on the primary key exist, remove them by placing the transitively dependent attributes in a new relation along with a copy of their determinant.
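
On a concrete instance, this step amounts to two projections. The sketch below uses made-up rows for an EMP_DEPT-style relation (the data and helper names are hypothetical) and checks that joining the two projections on the copied determinant gives back the original rows.

```python
def project(rows, cols):
    """Project rows onto cols, dropping duplicate tuples (set semantics)."""
    seen, out = set(), []
    for row in rows:
        values = tuple(row[c] for c in cols)
        if values not in seen:
            seen.add(values)
            out.append(dict(zip(cols, values)))
    return out


def natural_join(left, right, on):
    """Join two lists of rows on a single shared column."""
    return [{**l, **r} for l in left for r in right if l[on] == r[on]]


# Hypothetical 2NF instance in which dname depends on ssn only through dnumber.
emp_dept = [
    {"ssn": 1, "ename": "Smith", "dnumber": 5, "dname": "Research"},
    {"ssn": 2, "ename": "Wong", "dnumber": 5, "dname": "Research"},
    {"ssn": 3, "ename": "Zelaya", "dnumber": 4, "dname": "Administration"},
]

# The transitively dependent attribute moves out with a copy of its determinant.
employee = project(emp_dept, ["ssn", "ename", "dnumber"])
department = project(emp_dept, ["dnumber", "dname"])

rejoined = natural_join(employee, department, "dnumber")
print(len(department), rejoined == emp_dept)  # 2 True
```

The department name is now stored once per department instead of once per employee, and nothing is lost because the join recovers the original rows.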

21 Normalizing EMP_DEPT into 3NF relations

22 General Definitions of 2NF and 3NF. Second normal form (2NF): A relation that is in 1NF and in which every non-primary-key attribute is fully functionally dependent on any candidate key. Third normal form (3NF): A relation that is in 1NF and 2NF and in which no non-primary-key attribute is transitively dependent on any candidate key. In these general definitions, a non-primary-key attribute means a nonprime attribute, one that is not a member of any candidate key.

23 Boyce-Codd Normal Form (BCNF): Based on functional dependencies that take into account all candidate keys in a relation; however, BCNF imposes additional constraints compared with the general definition of 3NF. BCNF: A relation is in BCNF if and only if every determinant is a candidate key.

24 Boyce-Codd Normal Form (BCNF), continued: The difference between 3NF and BCNF is that, for a functional dependency A -> B, 3NF allows the dependency to remain in a relation if B is a primary-key attribute and A is not a candidate key, whereas BCNF insists that A must be a candidate key. Every relation in BCNF is also in 3NF. However, a relation in 3NF may not be in BCNF.
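
The difference can be seen on a small schema that is in 3NF but not in BCNF. The sketch below is an illustration under assumptions: the student/course/instructor schema and its FDs are not from the slides, the set of prime attributes is supplied by hand, and the check tests whether each determinant is a superkey, which is the usual way of testing the BCNF condition.

```python
def closure(attrs, fds):
    """Return every attribute determined by attrs under fds ((lhs, rhs) frozenset pairs)."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


def bcnf_violations(attrs, fds):
    """Nontrivial FDs whose determinant is not a superkey."""
    attrs = set(attrs)
    return [
        (lhs, rhs)
        for lhs, rhs in fds
        if not rhs <= lhs and not attrs <= closure(lhs, fds)
    ]


def three_nf_violations(attrs, fds, prime):
    """As above, but an FD is tolerated when its right-hand side is entirely prime."""
    return [
        (lhs, rhs)
        for lhs, rhs in bcnf_violations(attrs, fds)
        if not (rhs - lhs) <= set(prime)
    ]


# Assumed schema: each instructor teaches one course, and a student takes a
# given course from one instructor.
attrs = {"student", "course", "instructor"}
fds = [
    (frozenset({"student", "course"}), frozenset({"instructor"})),
    (frozenset({"instructor"}), frozenset({"course"})),
]
# Candidate keys are {student, course} and {student, instructor}, so every attribute is prime.
prime = {"student", "course", "instructor"}

bcnf = bcnf_violations(attrs, fds)
third = three_nf_violations(attrs, fds, prime)
print(len(bcnf), len(third))  # 1 0
```

The dependency `instructor -> course` is acceptable under 3NF because `course` is prime, but it breaks BCNF because `instructor` alone is not a key.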

25 Summary: 2NF, 3NF and BCNF are based on the keys and FDs of a relation schema. 4NF is based on keys and multi-valued dependencies (MVDs). 5NF is based on keys and join dependencies (JDs). Additional properties may be needed to ensure a good relational design (lossless join, dependency preservation).

26 Summary (cont.): Normalization is carried out in practice so that the resulting designs are of high quality and meet the desirable properties. The practical utility of these normal forms becomes questionable when the constraints on which they are based are hard to understand or to detect. Database designers need not normalize to the highest possible normal form (usually they normalize up to 3NF, BCNF or 4NF). Denormalization: the process of storing the join of higher normal form relations as a base relation, which is in a lower normal form.

27 Summary (cont.): Definitions of Keys and Attributes Participating in Keys (1). A superkey of a relation schema R = {A1, A2, ..., An} is a set of attributes S, a subset of R, with the property that no two distinct tuples t1 and t2 in any legal relation state r of R will have t1[S] = t2[S]. A key K is a superkey with the additional property that removing any attribute from K causes K to no longer be a superkey.
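
Both definitions can be tried out on a single relation state. The sketch below uses hypothetical rows and helper names. Because the definition ranges over every legal state, a check on one instance can refute a superkey claim but cannot prove it.

```python
def is_superkey(rows, attrs):
    """True if no two rows of this instance agree on all of attrs."""
    seen = set()
    for row in rows:
        values = tuple(row[a] for a in attrs)
        if values in seen:
            return False
        seen.add(values)
    return True


def is_key(rows, attrs):
    """A superkey from which no single attribute can be removed."""
    attrs = list(attrs)
    return is_superkey(rows, attrs) and not any(
        is_superkey(rows, [a for a in attrs if a != dropped]) for dropped in attrs
    )


# Hypothetical relation state.
works_on = [
    {"ssn": 1, "pnumber": 10, "hours": 20},
    {"ssn": 1, "pnumber": 20, "hours": 20},
    {"ssn": 2, "pnumber": 10, "hours": 5},
]

print(is_superkey(works_on, ["ssn"]))                  # False
print(is_superkey(works_on, ["ssn", "pnumber", "hours"]))  # True
print(is_key(works_on, ["ssn", "pnumber", "hours"]))   # False
print(is_key(works_on, ["ssn", "pnumber"]))            # True
```

The full attribute set is always a superkey of a relation with no duplicate tuples, but it is a key only if nothing can be dropped from it.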

28 Summary (cont.): Definitions of Keys and Attributes Participating in Keys (2). If a relation schema has more than one key, each is called a candidate key. One of the candidate keys is arbitrarily designated to be the primary key, and the others are called secondary keys. A prime attribute must be a member of some candidate key. A nonprime attribute is not a prime attribute, that is, it is not a member of any candidate key.
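
Candidate keys, and from them the prime and nonprime attributes, can be derived from a set of FDs by brute force. The sketch below is exponential in the number of attributes and is meant only for small classroom examples; the schema, the FDs, and the function name `candidate_keys` are assumptions.

```python
from itertools import combinations


def closure(attrs, fds):
    """Return every attribute determined by attrs under fds ((lhs, rhs) frozenset pairs)."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


def candidate_keys(attrs, fds):
    """Enumerate attribute sets by size and keep the minimal ones that determine everything."""
    all_attrs = set(attrs)
    keys = []
    for size in range(1, len(all_attrs) + 1):
        for combo in combinations(sorted(all_attrs), size):
            subset = set(combo)
            if any(k <= subset for k in keys):
                continue  # contains a smaller key, so it is not minimal
            if all_attrs <= closure(subset, fds):
                keys.append(frozenset(subset))
    return keys


# Assumed schema and FDs (illustrative only).
attrs = {"student", "course", "instructor", "grade"}
fds = [
    (frozenset({"student", "course"}), frozenset({"instructor", "grade"})),
    (frozenset({"instructor"}), frozenset({"course"})),
]

keys = candidate_keys(attrs, fds)
prime = set().union(*keys)
nonprime = attrs - prime
print([sorted(k) for k in keys])
print(sorted(prime), sorted(nonprime))
```

With these FDs there are two candidate keys, so `grade` is the only nonprime attribute even though `course` and `instructor` would each be outside one of the keys if it were chosen as the primary key.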

