Normalization Lecture 7 May Aldoayan.


1 Normalization Lecture 7 May Aldoayan

2 13.1: The purpose of normalization.
Normalization: a technique for producing a set of relations with desirable properties, given the data requirements of an enterprise. Topics: how normalization can be used when designing a relational database; the potential problems associated with redundant data in base relations; the concept of functional dependency, which describes the relationship between attributes.

3 13.1: The purpose of normalization.
How to identify functional dependencies for a given relation. How functional dependencies identify the primary key for a relation. How to undertake the process of normalization. How normalization uses functional dependencies to group attributes into relations that are in a known normal form.

4 13.1: The purpose of normalization.
The characteristics of functional dependencies used in normalization. How to identify the most commonly used normal forms, namely First Normal Form (1NF), Second Normal Form (2NF), and Third Normal Form (3NF). The problems associated with relations that break the rules of 1NF, 2NF, or 3NF. How to represent attributes shown on a form as 3NF relations using normalization.

5 Data Redundancy and Update Anomalies
A major aim of relational database design is to group attributes into relations so as to minimize data redundancy. The problems associated with data redundancy are illustrated by comparing the Staff and Branch relations shown in Figure 13.1 with the StaffBranch relation shown in Figure 13.2.

6 Figure 13.1: Staff and Branch relations
Figure 13.2: StaffBranch relation

7 Data Redundancy and Update Anomalies
Staff(staffNo, sName, position, salary, branchNo)
Branch(branchNo, bAddress)
StaffBranch(staffNo, sName, position, salary, branchNo, bAddress)
The primary key of each relation is underlined on the slide: staffNo for Staff and StaffBranch, branchNo for Branch. The StaffBranch relation has redundant data: the details of a branch are repeated for every member of staff located at that branch. In contrast, the branch information appears only once for each branch in the Branch relation, and only the branch number (branchNo) is repeated in the Staff relation, to represent where each member of staff is located.

8 Data Redundancy and Update Anomalies
Relations that contain redundant information may suffer from update anomalies. Types of update anomalies: insertion, deletion, modification. Example (insertion): to insert a new staff tuple into StaffBranch, we must also enter the correct details of the branch where that member of staff works. To insert a new branch that has no staff yet, the only option is to place NULLs in the staff attributes, including staffNo. This violates entity integrity for StaffBranch, because staffNo is its primary key.

9 Lossless-join and Dependency Preservation Properties
Two important properties of decomposition: The lossless-join property enables us to find any instance of the original relation from corresponding instances in the smaller relations. The dependency preservation property enables us to enforce a constraint on the original relation by enforcing some constraint on each of the smaller relations.
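The lossless-join property can be tried out on a small instance. The following is a minimal sketch, not part of the lecture: the helper names (`project`, `natural_join`, `is_lossless`) are hypothetical, and the sample rows are illustrative values assumed for this example. It only tests one instance, so a True result does not prove the decomposition is lossless in general.

```python
def project(rows, attrs):
    """Project rows onto attrs, dropping duplicate tuples."""
    return {tuple((a, r[a]) for a in attrs) for r in rows}

def natural_join(left, right):
    """Join two projections on whatever attributes they share."""
    joined = set()
    for lt in left:
        for rt in right:
            ld, rd = dict(lt), dict(rt)
            if all(ld[a] == rd[a] for a in ld.keys() & rd.keys()):
                joined.add(tuple(sorted({**ld, **rd}.items())))
    return joined

def is_lossless(rows, attrs1, attrs2):
    """True if joining the two projections gives back exactly the original rows."""
    original = {tuple(sorted(r.items())) for r in rows}
    return natural_join(project(rows, attrs1), project(rows, attrs2)) == original

# Assumed sample instance of StaffBranch (only some attributes, illustrative values)
staff_branch = [
    {"staffNo": "SL21", "position": "Manager", "branchNo": "B005", "bAddress": "22 Deer Rd, London"},
    {"staffNo": "SG37", "position": "Assistant", "branchNo": "B003", "bAddress": "163 Main St, Glasgow"},
    {"staffNo": "SL41", "position": "Assistant", "branchNo": "B005", "bAddress": "22 Deer Rd, London"},
]

# Splitting on branchNo (Staff / Branch) reproduces the original rows.
good = is_lossless(staff_branch, ["staffNo", "position", "branchNo"], ["branchNo", "bAddress"])
# Splitting on position creates spurious tuples, because two staff share a position.
bad = is_lossless(staff_branch, ["staffNo", "position"], ["position", "branchNo", "bAddress"])
print(good, bad)
```

The second split fails because position does not determine either side of the decomposition, so the join pairs each Assistant with every branch that has an Assistant.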

10 Functional Dependencies
Important concept associated with normalization. A functional dependency describes the relationship between attributes. For example, if A and B are attributes of relation R, B is functionally dependent on A (denoted A → B) if each value of A in R is associated with exactly one value of B in R.
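The definition translates directly into a check on a relation instance. This is a minimal sketch under stated assumptions: `fd_holds` is a hypothetical helper name, and apart from SL21 (Manager), which the lecture mentions, the rows are made up for illustration.

```python
def fd_holds(rows, lhs, rhs):
    """Return True if lhs -> rhs holds in this relation instance."""
    seen = {}
    for row in rows:
        key = tuple(row[a] for a in lhs)
        val = tuple(row[a] for a in rhs)
        # setdefault remembers the first rhs value seen for each lhs value
        if seen.setdefault(key, val) != val:
            return False
    return True

# Assumed sample rows of the Staff relation
staff = [
    {"staffNo": "SL21", "position": "Manager"},
    {"staffNo": "SG37", "position": "Assistant"},
    {"staffNo": "SG14", "position": "Supervisor"},
    {"staffNo": "SA9", "position": "Assistant"},
]

print(fd_holds(staff, ["staffNo"], ["position"]))  # each staffNo has one position
print(fd_holds(staff, ["position"], ["staffNo"]))  # "Assistant" maps to two staffNo values
```

A check like this can only refute a dependency. An FD that happens to hold in one instance is not guaranteed to hold for all time, which depends on the meaning of the attributes.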

11 Functional Dependencies
A functional dependency is a property of the meaning (semantics) of the attributes in a relation. Diagrammatic representation: A → B, where A is the determinant and B is the dependent. The determinant of a functional dependency is the attribute or group of attributes on the left-hand side of the arrow.

12 Figure 13.1: Staff and Branch relations

13 Example 13.1: Identifying a functional dependency
Consider the attributes staffNo and position of the Staff relation in Figure 13.1. For a specific staffNo, for example SL21, we can determine the position of that member of staff as Manager. In other words, the position attribute is functionally dependent on staffNo, as shown in Figure 13.4(a). However, Figure 13.4(b) illustrates that the opposite is not true: staffNo is not functionally dependent on position. A member of staff holds one position; however, there may be several members of staff with the same position. The relationship between staffNo and position is one-to-one (1:1): for each staff number there is only one position. On the other hand, the relationship between position and staffNo is one-to-many (1:*): there are several staff numbers associated with a given position. In this example, staffNo is the determinant of this functional dependency. For the purposes of normalization we are interested in identifying functional dependencies between attributes of a relation that have a one-to-one relationship.

14 Figure 13.4: (a) position is functionally dependent on staffNo (staffNo → position);
(b) staffNo is not functionally dependent on position (position ↛ staffNo)

15 Example 13.2: Identifying a functional dependency that holds for all time.
Consider the values shown in the staffNo and sName attributes of the Staff relation in Figure 13.1. We see that for a specific staffNo, for example SL21, we can determine the name of that member of staff as John White. Furthermore, it appears that for a specific sName, for example John White, we can determine the staff number for that member of staff as SL21. Can we therefore conclude that the staffNo attribute is functionally dependent on the sName attribute and/or that the sName attribute is functionally dependent on staffNo? If the values shown in the Staff relation of Figure 13.1 represent the set of all possible values for the staffNo and sName attributes, then the following functional dependencies hold:

16 Example 13.2: Identifying a functional dependency that holds for all time.
staffNo → sName
sName → staffNo
However, the only functional dependency that remains true for all possible values of the staffNo and sName attributes of the Staff relation is: staffNo → sName

17 Example 13.3: Trivial functional dependencies
Examples of trivial dependencies for the Staff relation include:
staffNo, sName → sName
staffNo, sName → staffNo
Although these functional dependencies are true for the staffNo and sName attributes of the Staff relation, they do not provide any additional information about possible integrity constraints on the values held by these attributes. As the name implies, trivial dependencies are not very interesting in practical terms. We are normally more interested in nontrivial dependencies because they represent integrity constraints for the relation.

18 The main characteristics of functional dependencies that we use in normalization:
They have a one-to-one relationship between the attributes on the left- and right-hand side of the dependency.
They hold for all time.
They are nontrivial.
We demonstrate the process of identifying a set of useful functional dependencies for a given relation in the following example.

19 Figure 13.2: StaffBranch relation

20 Example 13.4: Identifying a set of functional dependencies for the StaffBranch relation
We identify the functional dependencies for the StaffBranch relation as:
staffNo → sName, position, salary, branchNo, bAddress
branchNo → bAddress
bAddress → branchNo
branchNo, position → salary
bAddress, position → salary
These are five functional dependencies, with staffNo, branchNo, bAddress, (branchNo, position), and (bAddress, position) as determinants. For each functional dependency, we ensure that all the attributes on the right-hand side are functionally dependent on the determinant on the left-hand side.

21 Identifying the PK for a relation using functional dependencies
The main purpose of identifying a set of functional dependencies for a relation is to specify the set of integrity constraints that must hold on that relation. The determinant attribute(s) are a candidate key of the relation when: there is a 1:1 relationship between determinant and dependent; and no subset of the determinant attribute(s) is itself a determinant (nontrivial). If (A, B) → C, then NOT A → B and NOT B → A.

22 Identifying the PK for a relation using functional dependencies
All attributes that are not part of the candidate key (CK) should be functionally dependent on the key: CK → all attributes of R. The dependency must hold for all time. The PK is the candidate key with the minimal set of functional dependencies.

23 13.3.3 Inference Rules for Functional Dependencies
Closure (X⁺): the set of all functional dependencies that are implied by a given set of functional dependencies X.
Armstrong's axioms (inference rules): a set of inference rules that specifies how new functional dependencies can be inferred from given ones.
Inference rules:
Reflexivity: if B ⊆ A, then A → B
Augmentation: if A → B, then A,C → B,C
Transitivity: if A → B and B → C, then A → C
Self-determination: A → A
Decomposition: if A → B,C, then A → B and A → C
Union: if A → B and A → C, then A → B,C
Composition: if A → B and C → D, then A,C → B,D
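In practice the rules are usually applied through the attribute closure: the set of all attributes determined by a given set of attributes. The sketch below is an illustration, not the lecture's own algorithm. The function names `closure` and `candidate_keys` are hypothetical, and the brute-force key search is only sensible for small relations. The FDs are those identified for StaffBranch in Example 13.4.

```python
from itertools import combinations

def closure(attrs, fds):
    """Attribute closure: every attribute determined by attrs under fds."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            # transitivity/augmentation: lhs already known, so rhs follows
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

def candidate_keys(all_attrs, fds):
    """Minimal attribute sets whose closure is the whole relation (brute force)."""
    keys = []
    for size in range(1, len(all_attrs) + 1):
        for combo in combinations(sorted(all_attrs), size):
            if any(set(k) <= set(combo) for k in keys):
                continue  # contains a smaller key, so it is not minimal
            if closure(combo, fds) == set(all_attrs):
                keys.append(combo)
    return keys

# Functional dependencies of StaffBranch (Example 13.4)
fds = [
    (("staffNo",), ("sName", "position", "salary", "branchNo", "bAddress")),
    (("branchNo",), ("bAddress",)),
    (("bAddress",), ("branchNo",)),
    (("branchNo", "position"), ("salary",)),
    (("bAddress", "position"), ("salary",)),
]
attrs = ["staffNo", "sName", "position", "salary", "branchNo", "bAddress"]

print(sorted(closure(["branchNo", "position"], fds)))
print(candidate_keys(attrs, fds))
```

Only staffNo determines every attribute of StaffBranch, which is why it is chosen as the primary key.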

24 Minimal Sets of Functional Dependencies
The complete set of functional dependencies for a relation can be very large; it is what we obtain by applying the inference rules repeatedly until they stop producing new FDs. We need to reduce the set to a manageable size. Assume S1 and S2 are sets of dependencies. If S1 ⊆ S2⁺ (every dependency in S1 can be inferred from S2), then S2 is a cover for S1, or S1 is covered by S2. If S2 is a cover for S1 and S1 is a cover for S2, then S1 is equivalent to S2.

25 Minimal Sets of Functional Dependencies
A set of functional dependencies X is minimal if it satisfies the following conditions:
Every dependency in X has a single attribute on its right-hand side.
We cannot replace any dependency A → B in X with a dependency C → B, where C ⊂ A, and still have a set of dependencies equivalent to X.
We cannot remove any dependency from X and still have a set of dependencies that is equivalent to X.

26 Example 13.6: Identifying the minimal set of functional dependencies of the StaffBranch relation
Produce the following functional dependencies:
staffNo → sName
staffNo → position
staffNo → salary
staffNo → branchNo
staffNo → bAddress
branchNo → bAddress
bAddress → branchNo
branchNo, position → salary
bAddress, position → salary
These functional dependencies satisfy the three conditions for producing a minimal set of functional dependencies for the StaffBranch relation.
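The notion of one set of dependencies covering another can be checked mechanically. This sketch (hypothetical helper names `covers` and `equivalent`; attribute closure is used as the inference engine, which is an assumption of this example rather than something the slides prescribe) confirms that the single-attribute set above is equivalent to the five dependencies of Example 13.4. It tests equivalence only, not the other minimality conditions.

```python
def closure(attrs, fds):
    """Attribute closure of attrs under fds."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

def covers(s2, s1):
    """True if every dependency in s1 can be inferred from s2."""
    return all(set(rhs) <= closure(lhs, s2) for lhs, rhs in s1)

def equivalent(s1, s2):
    """Two sets are equivalent when each covers the other."""
    return covers(s1, s2) and covers(s2, s1)

example_13_4 = [
    (("staffNo",), ("sName", "position", "salary", "branchNo", "bAddress")),
    (("branchNo",), ("bAddress",)),
    (("bAddress",), ("branchNo",)),
    (("branchNo", "position"), ("salary",)),
    (("bAddress", "position"), ("salary",)),
]

# Decomposition rule: split every right-hand side into single attributes.
example_13_6 = [(lhs, (a,)) for lhs, rhs in example_13_4 for a in rhs]

# Dropping staffNo -> sName loses information: nothing else determines sName.
without_sname = [fd for fd in example_13_6 if fd != (("staffNo",), ("sName",))]

print(len(example_13_6), equivalent(example_13_4, example_13_6))
print(equivalent(example_13_4, without_sname))
```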

27 The Process of Normalization
As normalization proceeds, the relations become progressively more restricted (stronger) in format and also less vulnerable to update anomalies. May Aldoayan

28 The Process of Normalization
Normalization is a bottom-up approach to database design that begins by examining the relationships between attributes. It is performed as a series of tests on a relation to determine whether it satisfies or violates the requirements of a given normal form. Purpose: guarantees no redundancy due to FDs; guarantees no update anomalies. Normal forms: First Normal Form (1NF), Second Normal Form (2NF), Third Normal Form (3NF), Boyce-Codd Normal Form (BCNF), Fourth Normal Form (4NF), Fifth Normal Form (5NF).

29 First Normal Form (1NF)
Unnormalized Form (UNF): a table that contains one or more repeating groups. To create an unnormalized table, transform the data from the information source (e.g. a form) into table format with columns and rows. First Normal Form (1NF): a relation in which the intersection of each row and column contains one and only one value.

30 First Normal Form (1NF)
UNF to 1NF: to transform the unnormalized table to First Normal Form, we identify and remove repeating groups within the table. There are two common approaches to removing repeating groups from unnormalized tables. In the first approach, we remove the repeating group by entering appropriate data into the empty columns of rows containing the repeating data ("flattening" the table). In the second approach, we remove the repeating group by placing the repeating data, along with a copy of the original key attribute(s), into a separate relation.

31 ClientRental
Figure 13.7: ClientRental unnormalized table (multiple values in one cell are separated by semicolons)
clientNo | cName | propertyNo | pAddress | rentStart | rentFinish | rent | ownerNo | oName
CR76 | John Kay | PG4; PG16 | 6 Lawrence St, Glasgow; 5 Novar Dr, Glasgow | 1-Jul-00; 1-Sep-01 | 31-Jul-01; 1-Sep-02 | 350; 450 | CO40; CO93 | Tina Murphy; Tony Shaw
CR56 | Aline Stewart | PG4; PG36; PG16 | 6 Lawrence St, Glasgow; 2 Manor Rd, Glasgow; 5 Novar Dr, Glasgow | 1-Sep-99; 10-Oct-00; 1-Nov-02 | 10-Jun-00; 1-Dec-01; 10-Aug-03 | 350; 375; 450 | CO40; CO93; CO93 | Tina Murphy; Tony Shaw; Tony Shaw

32 UNF to 1NF
We identify the key attribute for the ClientRental unnormalized table as clientNo. Next, we identify the repeating group in the unnormalized table as the property rented details, which repeat for each client. The structure of the repeating group is: Repeating Group = (propertyNo, pAddress, rentStart, rentFinish, rent, ownerNo, oName). There are multiple values at the intersection of certain rows and columns. For example, there are two values for propertyNo (PG4 and PG16) for the client named John Kay. To transform an unnormalized table into 1NF, we ensure that there is a single value at the intersection of each row and column.

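33 UNF → 1NF Approach 1
Expand the key so that there is a separate tuple in the original relation for each occurrence of the repeated attribute(s). The primary key becomes the combination of the original key and the repeating value (here clientNo, propertyNo).
clientNo | propertyNo | cName | pAddress | rentStart | rentFinish | rent | ownerNo | oName
CR76 | PG4 | John Kay | 6 Lawrence St, Glasgow | 1-Jul-00 | 31-Jul-01 | 350 | CO40 | Tina Murphy
CR76 | PG16 | John Kay | 5 Novar Dr, Glasgow | 1-Sep-01 | 1-Sep-02 | 450 | CO93 | Tony Shaw
CR56 | PG4 | Aline Stewart | 6 Lawrence St, Glasgow | 1-Sep-99 | 10-Jun-00 | 350 | CO40 | Tina Murphy
CR56 | PG36 | Aline Stewart | 2 Manor Rd, Glasgow | 10-Oct-00 | 1-Dec-01 | 375 | CO93 | Tony Shaw
CR56 | PG16 | Aline Stewart | 5 Novar Dr, Glasgow | 1-Nov-02 | 10-Aug-03 | 450 | CO93 | Tony Shaw
1NF relation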

34 UNF → 1NF Approach 1
ClientRental(clientNo, propertyNo, cName, pAddress, rentStart, rentFinish, rent, ownerNo, oName)
Disadvantage: this introduces redundancy into the relation.
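Both approaches to removing the repeating group are short enough to sketch in code. This is an illustration only: the nested-dictionary representation, the `rentals` field name, and the helpers `flatten` and `split` are assumptions made for the example, and only three attributes of the repeating group are kept to save space.

```python
# Unnormalized ClientRental: each client row carries a repeating group of rentals.
unf = [
    {"clientNo": "CR76", "cName": "John Kay", "rentals": [
        {"propertyNo": "PG4", "rentStart": "1-Jul-00", "rent": 350},
        {"propertyNo": "PG16", "rentStart": "1-Sep-01", "rent": 450},
    ]},
    {"clientNo": "CR56", "cName": "Aline Stewart", "rentals": [
        {"propertyNo": "PG4", "rentStart": "1-Sep-99", "rent": 350},
        {"propertyNo": "PG36", "rentStart": "10-Oct-00", "rent": 375},
        {"propertyNo": "PG16", "rentStart": "1-Nov-02", "rent": 450},
    ]},
]

def flatten(rows, group):
    """Approach 1: one flat row per item of the repeating group."""
    flat = []
    for row in rows:
        base = {k: v for k, v in row.items() if k != group}
        for item in row[group]:
            flat.append({**base, **item})  # client data repeated on every row
    return flat

def split(rows, key, group):
    """Approach 2: move the repeating group to its own relation with a copy of the key."""
    parent = [{k: v for k, v in row.items() if k != group} for row in rows]
    child = [{key: row[key], **item} for row in rows for item in row[group]]
    return parent, child

client_rental = flatten(unf, "rentals")
client, property_rental = split(unf, "clientNo", "rentals")
print(len(client_rental), len(client), len(property_rental))
```

Flattening repeats cName on every rental row, which is the redundancy noted as the disadvantage of approach 1. Splitting stores each client name once.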

35 UNF → 1NF Approach 2
In the second approach, we remove the repeating group (property rented details) by placing the repeating data, along with a copy of the original key attribute (clientNo), in a separate relation. We then identify a primary key for the new relation. The resulting 1NF relations are as follows:
Client(clientNo, cName)
PropertyRentalOwner(clientNo, propertyNo, pAddress, rentStart, rentFinish, rent, ownerNo, oName)

36 Client, PropertyRentalOwner
Client
clientNo | cName
CR76 | John Kay
CR56 | Aline Stewart
PropertyRentalOwner
clientNo | propertyNo | pAddress | rentStart | rentFinish | rent | ownerNo | oName
CR76 | PG4 | 6 Lawrence St, Glasgow | 1-Jul-00 | 31-Jul-01 | 350 | CO40 | Tina Murphy
CR76 | PG16 | 5 Novar Dr, Glasgow | 1-Sep-01 | 1-Sep-02 | 450 | CO93 | Tony Shaw
CR56 | PG4 | 6 Lawrence St, Glasgow | 1-Sep-99 | 10-Jun-00 | 350 | CO40 | Tina Murphy
CR56 | PG36 | 2 Manor Rd, Glasgow | 10-Oct-00 | 1-Dec-01 | 375 | CO93 | Tony Shaw
CR56 | PG16 | 5 Novar Dr, Glasgow | 1-Nov-02 | 10-Aug-03 | 450 | CO93 | Tony Shaw

37 Second Normal Form (2NF)
Second Normal Form (2NF) is based on the concept of full functional dependency. If A and B are attributes of a relation, B is fully functionally dependent on A if B is functionally dependent on A but not on any proper subset of A. B is partially functionally dependent on A if some attributes can be removed from A and the dependency still holds.
staffNo, sName → branchNo (partial dependency)
clientNo, propertyNo → rentDate (full dependency)

38 Second Normal Form (2NF)
Second normal form (2NF): a 1NF relation in which every non-prime attribute is fully functionally dependent on the PK. It applies to relations with composite primary keys, where partial dependencies can occur.

39 1NF → 2NF
1. Identify the primary key for the 1NF relation.
2. Identify the functional dependencies in the relation.
3. If partial dependencies exist on the primary key, remove them by placing them in a new relation along with a copy of their determinant.

40 Example
Functional dependencies of the ClientRental relation (clientNo, propertyNo, cName, pAddress, rentStart, rentFinish, rent, ownerNo, oName):
fd1: clientNo, propertyNo → rentStart, rentFinish (primary key)
fd2: clientNo → cName (partial dependency)
fd3: propertyNo → pAddress, rent, ownerNo, oName (partial dependency)
fd4: ownerNo → oName (transitive dependency)
fd5: clientNo, rentStart → propertyNo, pAddress, rentFinish, rent, ownerNo, oName (candidate key)
fd6: propertyNo, rentStart → clientNo, cName, rentFinish (candidate key)
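The labels on these dependencies follow from comparing each determinant with the primary key. The sketch below is a simplified illustration, not a general normalization algorithm: `classify_fd` and `to_2nf` are hypothetical names, and the classification rule (proper subset of the key means partial, no key attribute at all means transitive) is an assumption that fits this example.

```python
def classify_fd(lhs, primary_key):
    """Label a dependency by how its determinant relates to the primary key."""
    lhs, pk = set(lhs), set(primary_key)
    if lhs == pk:
        return "full"
    if lhs < pk:
        return "partial"      # determinant is a proper subset of the key
    if not lhs & pk:
        return "transitive"   # determinant contains no key attribute
    return "other"            # e.g. a determinant that is another candidate key

def to_2nf(attrs, primary_key, fds):
    """Move each partial dependency into its own relation, keyed by its determinant."""
    remaining = list(attrs)
    relations = {}
    for lhs, rhs in fds:
        if classify_fd(lhs, primary_key) == "partial":
            relations[tuple(lhs)] = list(lhs) + list(rhs)
            remaining = [a for a in remaining if a not in rhs]
    relations[tuple(primary_key)] = remaining
    return relations

attrs = ["clientNo", "propertyNo", "cName", "pAddress", "rentStart",
         "rentFinish", "rent", "ownerNo", "oName"]
pk = ("clientNo", "propertyNo")
fds = [
    (("clientNo", "propertyNo"), ("rentStart", "rentFinish")),        # fd1
    (("clientNo",), ("cName",)),                                      # fd2
    (("propertyNo",), ("pAddress", "rent", "ownerNo", "oName")),      # fd3
    (("ownerNo",), ("oName",)),                                       # fd4
]

for lhs, rhs in fds:
    print(lhs, "->", rhs, ":", classify_fd(lhs, pk))
for key, columns in to_2nf(attrs, pk, fds).items():
    print(key, columns)
```

The three relations produced correspond to Client, PropertyOwner and Rental. The transitive dependency fd4 is left in place at this stage and is dealt with when moving to 3NF.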

41 1NF → 2NF
3. Remove partial dependencies by placing the functionally dependent attributes in a new relation along with a copy of their determinants.
RENTAL (2NF relation)
clientNo | propertyNo | rentStart | rentFinish
CR76 | PG4 | 1-Jul-00 | 31-Aug-01
CR76 | PG16 | 1-Sep-01 | 1-Sep-02
CR56 | PG4 | 1-Sep-99 | 10-Jun-00
CR56 | PG36 | 10-Oct-00 | 1-Dec-01
CR56 | PG16 | 1-Nov-02 | 10-Aug-03
CLIENT (2NF relation)
clientNo | cName
CR76 | John Kay
CR56 | Aline Stewart
PROPERTY_OWNER (2NF relation)
propertyNo | pAddress | rent | ownerNo | oName
PG4 | 6 Lawrence St, Glasgow | 350 | CO40 | Tina Murphy
PG16 | 5 Novar Dr, Glasgow | 450 | CO93 | Tony Shaw
PG36 | 2 Manor Rd, Glasgow | 375 | CO93 | Tony Shaw

42 Transitive Dependency
If A, B and C are attributes of a relation such that A → B and B → C, then C is transitively dependent on A via B, provided that A is NOT functionally dependent on B or C (nontrivial FD). Example: StaffNo → BranchNo and BranchNo → Address, so StaffNo → Address.

43 Third Normal Form (3NF)
Third normal form (3NF): a 2NF relation in which no non-prime attribute is transitively dependent on the PK. It is based on the concept of transitive dependency: a condition where A, B and C are attributes of a relation such that if A → B and B → C, then C is transitively dependent on A through B (provided that A is not functionally dependent on B or C).

44 2NF relations
CLIENT(clientNo, cName)
RENTAL(clientNo, propertyNo, rentStart, rentFinish)
PROPERTY_OWNER(propertyNo, pAddress, rent, ownerNo, oName)

45 Third Normal Form (3NF)
A relation that is in 1NF and 2NF and in which no non-primary-key attribute is transitively dependent on the primary key.
2NF to 3NF:
1. Identify the primary key in the 2NF relation.
2. Identify the functional dependencies in the relation.
3. If transitive dependencies exist on the primary key, remove them by placing them in a new relation along with a copy of their determinant.

46 2NF → 3NF
OWNER (3NF relation)
ownerNo | oName
CO40 | Tina Murphy
CO93 | Tony Shaw
PROPERTY_FOR_RENT (3NF relation)
propertyNo | pAddress | rent | ownerNo
PG4 | 6 Lawrence St, Glasgow | 350 | CO40
PG16 | 5 Novar Dr, Glasgow | 450 | CO93
PG36 | 2 Manor Rd, Glasgow | 375 | CO93
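The step from PROPERTY_OWNER to these two relations is a pair of projections, one of them keyed on the determinant of the transitive dependency (ownerNo → oName). A minimal sketch, with `project` as a hypothetical helper and the rows taken from the PROPERTY_OWNER table of the lecture:

```python
property_owner = [
    {"propertyNo": "PG4", "pAddress": "6 Lawrence St, Glasgow", "rent": 350,
     "ownerNo": "CO40", "oName": "Tina Murphy"},
    {"propertyNo": "PG16", "pAddress": "5 Novar Dr, Glasgow", "rent": 450,
     "ownerNo": "CO93", "oName": "Tony Shaw"},
    {"propertyNo": "PG36", "pAddress": "2 Manor Rd, Glasgow", "rent": 375,
     "ownerNo": "CO93", "oName": "Tony Shaw"},
]

def project(rows, attrs):
    """Keep only attrs and drop duplicate rows, preserving first-seen order."""
    seen, out = set(), []
    for r in rows:
        t = tuple(r[a] for a in attrs)
        if t not in seen:
            seen.add(t)
            out.append(dict(zip(attrs, t)))
    return out

# ownerNo stays behind in PROPERTY_FOR_RENT as the link to OWNER
property_for_rent = project(property_owner, ["propertyNo", "pAddress", "rent", "ownerNo"])
owner = project(property_owner, ["ownerNo", "oName"])

# Joining on ownerNo recovers the original relation, so nothing was lost.
rejoined = [{**p, "oName": o["oName"]}
            for p in property_for_rent for o in owner
            if p["ownerNo"] == o["ownerNo"]]
print(len(owner), rejoined == property_owner)
```

After the split, the name Tony Shaw is stored once in OWNER instead of once per property, which removes the modification anomaly on oName.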

47 The decomposition of the ClientRental 1NF relation into 3NF relations
1NF: ClientRental
2NF: CLIENT, RENTAL, PROPERTY_OWNER
3NF: CLIENT, RENTAL, PROPERTY_FOR_RENT, OWNER
Client(clientNo, cName)
Rental(clientNo, propertyNo, rentStart, rentFinish)
PropertyForRent(propertyNo, pAddress, rent, ownerNo)
Owner(ownerNo, oName)

48 General Definition of 2NF & 3NF
Second normal form (2NF) A relation that is in first normal form and every non-primary-key attribute is fully functionally dependent on any candidate key. Third normal form (3NF) A relation that is in first and second normal form and in which no non-primary-key attribute is transitively dependent on any candidate key. May Aldoayan

49 Boyce-Codd Normal Form (BCNF)
BCNF is based on functional dependencies that take into account all candidate keys in a relation; however, BCNF also has additional constraints compared with the general definition of 3NF. Boyce-Codd normal form (BCNF): a 3NF relation in which every determinant in a nontrivial FD is a CK. Difference between 3NF and BCNF, for a dependency A → B: 3NF allows this dependency in a relation if B is a primary-key attribute and A is not a candidate key. BCNF insists that for this dependency to remain in a relation, A must be a candidate key.

50 Boyce-Codd Normal Form (BCNF)
Every relation in BCNF is also in 3NF. However, a relation in 3NF is not necessarily in BCNF. Violation of BCNF is quite rare. The potential to violate BCNF may occur in a relation that: contains two (or more) composite candidate keys; the candidate keys overlap, that is have at least one attribute in common. May Aldoayan

51 3NF → BCNF
1. Examine the FDs for the relation.
2. If a determinant is NOT a CK, decompose the relation into two relations.
CLIENT_INTERVIEW(ClientNo, Int_Date, Int_Time, StaffNo, RoomNo)
fd1: ClientNo, Int_Date → Int_Time, StaffNo, RoomNo (primary key)
fd2: StaffNo, Int_Date, Int_Time → ClientNo (candidate key)
fd3: RoomNo, Int_Date, Int_Time → StaffNo, ClientNo (candidate key)
fd4: StaffNo, Int_Date → RoomNo
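Step 2 amounts to testing whether each determinant is a candidate key (more generally, a superkey). The sketch below uses attribute closure for that test. The helper names `closure` and `bcnf_violations` are hypothetical, and the approach is one common way to run the check rather than the lecture's own procedure.

```python
def closure(attrs, fds):
    """Attribute closure of attrs under fds."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

def bcnf_violations(all_attrs, fds):
    """Nontrivial FDs whose determinant does not determine the whole relation."""
    bad = []
    for lhs, rhs in fds:
        nontrivial = not set(rhs) <= set(lhs)
        if nontrivial and closure(lhs, fds) != set(all_attrs):
            bad.append((lhs, rhs))
    return bad

attrs = ["ClientNo", "Int_Date", "Int_Time", "StaffNo", "RoomNo"]
fds = [
    (("ClientNo", "Int_Date"), ("Int_Time", "StaffNo", "RoomNo")),    # fd1
    (("StaffNo", "Int_Date", "Int_Time"), ("ClientNo",)),             # fd2
    (("RoomNo", "Int_Date", "Int_Time"), ("StaffNo", "ClientNo")),    # fd3
    (("StaffNo", "Int_Date"), ("RoomNo",)),                           # fd4
]

print(bcnf_violations(attrs, fds))
```

Only fd4 is reported: (StaffNo, Int_Date) determines RoomNo but not Int_Time or ClientNo, so it is a determinant that is not a candidate key.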

52 3NF → BCNF
3. Remove non-CK dependencies by placing the functionally dependent attributes in a new relation.
INTERVIEW(ClientNo, Int_Date, Int_Time, StaffNo) BCNF relation
STAFF_ROOM(StaffNo, Int_Date, RoomNo) BCNF relation

53 Review of Normalization (UNF to BCNF)

54 Review of Normalization (UNF to BCNF)
fd1: propertyNo, iDate → iTime, comments, staffNo, sName, carReg (primary key)
fd2: propertyNo → pAddress (partial dependency)
fd3: staffNo → sName (transitive dependency)
fd4: staffNo, iDate → carReg
fd5: carReg, iDate, iTime → propertyNo, pAddress, comments, staffNo, sName (CK)
fd6: staffNo, iDate, iTime → propertyNo, pAddress, comments (CK)

55 Review of Normalization (UNF to BCNF)

56 Review of Normalization (UNF to BCNF)
1NF → 2NF
PROPERTY(Pno, pAddress) 2NF
PROPERTY_INSPECTION(Pno, iDate, iTime, comments, StaffNo, sName, CarReg) 2NF
Pno, iDate → iTime, comments, StaffNo, sName, CarReg
StaffNo → sName (transitive dependency)
iDate, StaffNo → CarReg
iDate, iTime, CarReg → Pno, comments, StaffNo, sName
iDate, iTime, StaffNo → Pno, comments

57 Review of Normalization (UNF to BCNF)
2NF → 3NF
PROPERTY(Pno, pAddress) 3NF
STAFF(StaffNo, sName) 3NF
PROPERTY_INSPECT(Pno, iDate, iTime, comments, StaffNo, CarReg) 3NF

58 Review of Normalization (UNF to BCNF)
3NF → BCNF
PROPERTY_INSPECTION(Pno, iDate, iTime, comments, StaffNo, CarReg) 3NF
Pno, iDate → iTime, comments, StaffNo, CarReg
StaffNo, iDate → CarReg
CarReg, iDate, iTime → Pno, comments, StaffNo
StaffNo, iDate, iTime → Pno, comments
STAFF_CAR(StaffNo, iDate, CarReg) BCNF
PROPERTY_INSPECT(Pno, iDate, iTime, comments, StaffNo) BCNF

59 Fourth Normal Form (4NF)
Multi-Valued Dependency (MVD) Although BCNF removes anomalies due to functional dependencies, another type of dependency called a multi-valued dependency (MVD) can also cause data redundancy. Possible existence of multi-valued dependencies in a relation is due to 1NF and can result in data redundancy. May Aldoayan

60 Multi-Valued Dependency (MVD)
Represents a dependency between attributes A, B and C in a relation, such that for each value of A there is a set of values for B and a set of values for C. However, the sets of values for B and C are independent of each other. Denoted by: A →→ B, A →→ C. Example: BranchNo →→ SName, BranchNo →→ OName.
BRANCH_STAFF_OWNER
BranchNo | SName | OName
B003 | Ann | Carol
B003 | David | Carol
B003 | Ann | Tina
B003 | David | Tina
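Independence of the two sets means that, for each value of A, the relation must contain every combination of the B values with the C values. The following sketch tests that condition on an instance. `mvd_holds` is a hypothetical helper, and the check treats all attributes other than A and B as C.

```python
from itertools import product

def mvd_holds(rows, a, b):
    """True if a ->> b holds in this instance (all other attributes play the role of C)."""
    rest = [x for x in rows[0] if x not in a and x not in b]
    groups = {}
    for r in rows:
        key = tuple(r[x] for x in a)
        pair = (tuple(r[x] for x in b), tuple(r[x] for x in rest))
        groups.setdefault(key, set()).add(pair)
    for pairs in groups.values():
        b_vals = {p[0] for p in pairs}
        c_vals = {p[1] for p in pairs}
        # every B value must appear with every C value
        if pairs != set(product(b_vals, c_vals)):
            return False
    return True

branch_staff_owner = [
    {"BranchNo": "B003", "SName": "Ann", "OName": "Carol"},
    {"BranchNo": "B003", "SName": "David", "OName": "Carol"},
    {"BranchNo": "B003", "SName": "Ann", "OName": "Tina"},
    {"BranchNo": "B003", "SName": "David", "OName": "Tina"},
]

print(mvd_holds(branch_staff_owner, ["BranchNo"], ["SName"]))
# Dropping one combination breaks the independence of staff and owners.
print(mvd_holds(branch_staff_owner[:3], ["BranchNo"], ["SName"]))
```

The need to store all four combinations for a single branch is exactly the redundancy that 4NF removes by splitting the relation into (BranchNo, SName) and (BranchNo, OName).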

61 Trivial MVD: A →→ B is a trivial MVD if B ⊆ A or A ∪ B = R

62 Fourth Normal Form (4NF)
Fourth Normal Form (4NF): a relation that is in Boyce-Codd Normal Form and contains no nontrivial multi-valued dependencies.

63 Fourth Normal Form (4NF)
1. Start with a BCNF relation.
2. Examine the dependencies for the relation.
3. If a nontrivial MVD exists, remove the MVD by placing the attributes in a new relation along with a copy of their determinant. The resulting relations are in 4NF.

64 Fifth Normal Form (5NF) Lossless-Join Dependency
A relation decomposed into two relations must have the lossless-join property, which ensures that no spurious tuples are generated when the relations are reunited through a natural join operation. However, there are cases where a relation has to be decomposed into more than two relations. Although rare, these cases are managed by join dependency and fifth normal form (5NF).

65 Fifth Normal Form (5NF)
Fifth Normal Form (5NF): a relation that has no join dependency. 5NF: example.

66 Fifth Normal Form (5NF)

