Database Design: Normalization Reading: C&B, Chaps 14.


1 Database Design: Normalization Reading: C&B, Chaps 14

2 Dept. of Computer Science, University of Aberdeen
In this lecture you will learn
–Mathematical notions behind relational model
–Normalization

3 Introduction
Relations derived from an ER model may be faulty
–They may cause data redundancy and insert/delete/update anomalies
We use some mathematical (semantic?) properties of relations to
–locate these faults and
–fix them

4 Mathematical notions behind relational model
Set – a collection of objects characterized by some defining property
–E.g. a column in a database table, such as the last names of all staff
Cross product of sets – one of the operations (X) on sets
–E.g. consider two sets, the set of all first names and the set of all last names in the staff table
–fName = {Mary, David}
–lName = {Howe, Ford}
–fName X lName = {(Mary, Howe), (Mary, Ford), (David, Howe), (David, Ford)}
Relation – defined between two sets; it is a subset of the cross product of those two sets
–E.g. FirstNameOf = {(Mary, Howe), (David, Ford)}
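The set notions on this slide can be tried out directly. The following is a minimal Python sketch using the slide's own example values; the variable names (`f_name`, `l_name`, `cross`, `first_name_of`) are illustrative choices, not part of the lecture.

```python
from itertools import product

# Sets taken from the slide's example.
f_name = {"Mary", "David"}
l_name = {"Howe", "Ford"}

# Cross product: every possible (first name, last name) pair.
cross = set(product(f_name, l_name))

# A relation is any subset of the cross product.
first_name_of = {("Mary", "Howe"), ("David", "Ford")}

print(len(cross))              # number of pairs in the cross product
print(first_name_of <= cross)  # subset test: is the relation inside the cross product?
```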

5 Relational model
The name "relational model" comes from this mathematical notion of relation
–A relation is a set (collection) of tuples that hold related objects, such as the first name and last name of the same person
–E.g. (fName, lName) is a relation
We can have relations over any number of sets
–E.g. (staffNo, fName, lName, position)
In general we can denote a relation as (A, B, C, D, …, Z), where A, B, C, …, Z are its attribute sets

6 Function
A function is a special kind of relation
In a relation (X, Y), if every value of X is associated with exactly one value of Y, then we say Y is a function of X.
–E.g. the relation {(1,2),(2,4),(3,6),(4,8)} is a function, Y = 2*X for X = 1, 2, 3, 4
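The definition of a function can be turned into a small check. This is a sketch only: the helper name `is_function` and the counterexample `not_a_function` are assumptions added for illustration, while `doubles` is the relation from the slide.

```python
def is_function(relation):
    """Return True if each x in a set of (x, y) pairs maps to exactly one y."""
    seen = {}
    for x, y in relation:
        if x in seen and seen[x] != y:
            return False  # the same x is paired with two different y values
        seen[x] = y
    return True

doubles = {(1, 2), (2, 4), (3, 6), (4, 8)}  # relation from the slide
not_a_function = {(1, 2), (1, 3)}           # hypothetical counterexample

print(is_function(doubles))
print(is_function(not_a_function))
```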

7 Functional Dependency
If Y is a function of X
–Y is dependent on X,
–there is a relationship of functional dependency between Y and X
In databases, we work with relations in general form (A, B, C, D, …, Z)
Functional Dependency
–Describes the relationship between attributes in a relation.
–If A and B are attributes of relation R, B is functionally dependent on A if each value of A in R is associated with exactly one value of B in R.
We are interested in finding such functional dependencies among database relations
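A functional dependency can be tested against a concrete set of rows. The sketch below assumes rows are stored as Python dicts; the helper `holds_fd` and the sample data are hypothetical (only the attribute names staffNo, fName, lName and position come from the slides). Note that a sample can only refute a dependency: passing the check on a few rows does not prove that the dependency holds for every valid instance.

```python
def holds_fd(rows, lhs, rhs):
    """Check whether lhs -> rhs holds in rows (a list of dicts)."""
    seen = {}
    for row in rows:
        key = tuple(row[a] for a in lhs)
        value = tuple(row[a] for a in rhs)
        # setdefault stores the first value seen for this key and returns it afterwards.
        if seen.setdefault(key, value) != value:
            return False
    return True

# Hypothetical sample rows.
staff = [
    {"staffNo": "S1", "fName": "Mary", "lName": "Howe", "position": "Manager"},
    {"staffNo": "S2", "fName": "David", "lName": "Ford", "position": "Assistant"},
    {"staffNo": "S3", "fName": "Mary", "lName": "Ford", "position": "Assistant"},
]

print(holds_fd(staff, ["staffNo"], ["position"]))  # each staffNo has one position
print(holds_fd(staff, ["fName"], ["lName"]))       # "Mary" appears with two last names
```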

8 Functional Dependency
Is a property of the meaning (or semantics) of the attributes in a relation.
Diagrammatic representation: A -> B (B is functionally dependent on A)
The determinant of a functional dependency is the attribute or group of attributes on the left-hand side of the arrow.
If no attribute can be removed from the determinant without losing the dependency, we call it a full functional dependency

9 Data Redundancy
A major aim of relational database design is
–to group attributes into relations to minimize data redundancy and
–to reduce the file storage space required by base relations.
Data redundancy is undesirable because of the following anomalies
–Insert anomalies
–Delete anomalies
–Update anomalies
We illustrate these anomalies with an example

10 Data Redundancy

11 Anomalies
Insert anomalies
–Try to insert details for a new member of staff into StaffBranch
–You also need to insert branch details that are consistent with the existing details for the same branch
–It is hard to maintain data consistency with StaffBranch
Delete anomalies
–Try to delete details for a member of staff from StaffBranch
–You also lose the branch details in that tuple (row)
Update anomalies
–Try to update the value of one of the attributes of a branch
–You also need to update that information in all the tuples about the same branch
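The update and delete anomalies can be reproduced on a tiny in-memory table. This is a sketch under stated assumptions: the StaffBranch rows, the attribute names `branchNo` and `bAddress`, and the helper `branch_addresses` are all hypothetical, since the lecture's own table is not reproduced in this transcript.

```python
# Hypothetical StaffBranch rows: the branch address is repeated for every staff member.
staff_branch = [
    {"staffNo": "S1", "branchNo": "B1", "bAddress": "1 High St"},
    {"staffNo": "S2", "branchNo": "B1", "bAddress": "1 High St"},
    {"staffNo": "S3", "branchNo": "B2", "bAddress": "5 Low Rd"},
]

def branch_addresses(rows):
    """Map each branch number to the set of addresses recorded for it."""
    result = {}
    for row in rows:
        result.setdefault(row["branchNo"], set()).add(row["bAddress"])
    return result

# Update anomaly: changing the address in only one B1 row leaves two addresses for B1.
updated = [dict(row) for row in staff_branch]
updated[0]["bAddress"] = "9 New Rd"
print(len(branch_addresses(updated)["B1"]))

# Delete anomaly: removing S3, the only member of staff at B2, also removes B2's details.
after_delete = [row for row in staff_branch if row["staffNo"] != "S3"]
print("B2" in branch_addresses(after_delete))
```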

12 Decomposition of Relations
The Staff and Branch relations, which are obtained by decomposing StaffBranch, do not suffer from these anomalies
Two important properties of decomposition
–The lossless-join property enables us to find any instance of the original relation from corresponding instances in the smaller relations.
–The dependency preservation property enables us to enforce a constraint on the original relation by enforcing some constraint on each of the smaller relations.
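The lossless-join property can be checked on sample data by projecting a relation onto two attribute lists and joining the projections back together. The sketch below is illustrative only: the helpers `project`, `natural_join` and `same_rows`, and all sample rows, are assumptions rather than material from the lecture.

```python
def project(rows, attrs):
    """Project a list of dict rows onto attrs, dropping duplicate tuples."""
    unique = {tuple(row[a] for a in attrs) for row in rows}
    return [dict(zip(attrs, values)) for values in sorted(unique)]

def natural_join(left, right):
    """Join two lists of dict rows on every attribute name they share."""
    joined = []
    for l_row in left:
        for r_row in right:
            shared = l_row.keys() & r_row.keys()
            if all(l_row[a] == r_row[a] for a in shared):
                joined.append({**l_row, **r_row})
    return joined

def same_rows(a, b):
    """Compare two lists of dict rows as sets, ignoring order and duplicates."""
    return {frozenset(r.items()) for r in a} == {frozenset(r.items()) for r in b}

# Hypothetical StaffBranch instance in which branchNo determines bAddress.
staff_branch = [
    {"staffNo": "S1", "branchNo": "B1", "bAddress": "1 High St"},
    {"staffNo": "S2", "branchNo": "B1", "bAddress": "1 High St"},
    {"staffNo": "S3", "branchNo": "B2", "bAddress": "5 Low Rd"},
]
staff = project(staff_branch, ["staffNo", "branchNo"])
branch = project(staff_branch, ["branchNo", "bAddress"])
lossless = same_rows(natural_join(staff, branch), staff_branch)

# Contrast: splitting on "position", which determines nothing, creates spurious rows.
staff_pos = [
    {"staffNo": "S1", "position": "Assistant", "branchNo": "B1"},
    {"staffNo": "S2", "position": "Assistant", "branchNo": "B2"},
]
lossy_join = natural_join(project(staff_pos, ["staffNo", "position"]),
                          project(staff_pos, ["position", "branchNo"]))

print(lossless)
print(len(lossy_join), "rows after joining, versus", len(staff_pos), "originally")
```

In the second case the join pairs every assistant with every branch, so the original instance can no longer be recovered from the smaller relations.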

13 The Process of Normalization
Formal technique for analyzing a relation based on its primary key and the functional dependencies between its attributes.
Often executed as a series of steps. Each step corresponds to a specific normal form, which has known properties.
As normalization proceeds, relations become progressively more restricted (stronger) in format and also less vulnerable to update anomalies.
Given a relation, use the following cycle
–Find out what normal form it is in
–Transform the relation to the next higher form by decomposing it into simpler relations
–You may need to refine the relations further if the decomposition resulted in undesirable properties

14 Unnormalized Form (UNF)
A table that contains one or more repeating groups.
To create an unnormalized table:
–transform data from the information source (e.g. a form) into table format with columns and rows.
Example 1 – address and name fields are composite
Name | Address | Phone
Sally Singer | 123 Broadway New York, NY, 11234 | (111)
Jason Jumper | 456 Jolly Jumper St. Trenton NJ, 11547 | (222)

15 Another example of UNF
Example 2 – repeating columns for each client & composite name field
Rep ID | Representative | Client 1 | Time 1 | Client 2 | Time 2 | Client 3 | Time 3
TS-89 | Gilroy Gladstone | US Corp. | 14 hrs | Taggarts | 26 hrs | Kilroy Inc. | 9 hrs
RK-56 | Mary Mayhem | Italiana | 67 hrs | Linkers | 2 hrs | |

16 First Normal Form (1NF)
A relation in which the intersection of each row and column contains one and only one value.
UNF to 1NF
–Nominate an attribute or group of attributes to act as the key for the unnormalized table.
–Identify the repeating group(s) in the unnormalized table that repeat for the key attribute(s).

17 UNF to 1NF
Remove the repeating group by:
–entering appropriate data into the empty columns of rows containing repeating data (flattening the table),
or by
–placing the repeating data, along with a copy of the original key attribute(s), into a separate relation.

18 Example 1
ID | First | Last | Street | City | State | Zip | Phone
564 | Sally | Singer | 123 Broadway | New York | NY | 11234 | (111)
 | Jason | Jumper | 456 Jolly Jumper St. | Trenton | NJ | 11547 | (222)
Address field has been expressed in terms of its constituent parts, such as street, city and postcode
Name field has been expressed in terms of last name and first name

19 Example 2
Rep ID | Rep First Name | Rep Last Name | Client | Time With Client
TS-89 | Gilroy | Gladstone | US Corp | 14 hrs
TS-89 | Gilroy | Gladstone | Taggarts | 26 hrs
TS-89 | Gilroy | Gladstone | Kilroy Inc. | 9 hrs
RK-56 | Mary | Mayhem | Italiana | 67 hrs
RK-56 | Mary | Mayhem | Linkers | 2 hrs
Table structure has been changed
Data related to the representative is repeated
Representative name expressed in terms of last name and first name
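The flattening step from UNF to 1NF in Example 2 can be sketched in Python. The data values are those of the unnormalized Example 2 table; the nested-list representation, the field names and the function `to_1nf` are assumptions made for this illustration.

```python
# Example 2 in unnormalized form: each representative carries a repeating group of clients.
unf = [
    {"rep_id": "TS-89", "representative": "Gilroy Gladstone",
     "clients": [("US Corp.", "14 hrs"), ("Taggarts", "26 hrs"), ("Kilroy Inc.", "9 hrs")]},
    {"rep_id": "RK-56", "representative": "Mary Mayhem",
     "clients": [("Italiana", "67 hrs"), ("Linkers", "2 hrs")]},
]

def to_1nf(rows):
    """Emit one flat row per (representative, client) pair and split the composite name."""
    flat = []
    for row in rows:
        first, last = row["representative"].split(" ", 1)
        for client, time in row["clients"]:
            flat.append({
                "rep_id": row["rep_id"],
                "rep_first_name": first,
                "rep_last_name": last,
                "client": client,
                "time_with_client": time,
            })
    return flat

for flat_row in to_1nf(unf):
    print(flat_row)
```

Every cell of the result holds a single value, at the cost of repeating the representative's details on each row.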

20 Example 2
Rep ID* | Rep First Name | Rep Last Name | Client ID* | Client | Time With Client
TS-89 | Gilroy | Gladstone | 978 | US Corp | 14 hrs
TS-89 | Gilroy | Gladstone | 665 | Taggarts | 26 hrs
TS-89 | Gilroy | Gladstone | 782 | Kilroy Inc. | 9 hrs
RK-56 | Mary | Mayhem | 221 | Italiana | 67 hrs
RK-56 | Mary | Mayhem | 982 | Linkers | 2 hrs
A new field, Client ID, is introduced
The Rep ID and Client ID combination acts as the primary key

21 Second Normal Form (2NF)
Based on the concept of full functional dependency:
–A and B are attributes of a relation R,
–B is fully dependent on A (denoted A->B) if B is functionally dependent on A but not on any proper subset of A.
2NF - A relation that is in 1NF and in which every non-primary-key attribute is fully functionally dependent on the primary key.
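Full functional dependency can be checked mechanically once the dependencies are written down. The sketch below works from declared dependencies rather than from data; the list `fds` is an assumption inferred from the Example 2 table (Rep ID determines the representative's name, Client ID determines the client, and the pair determines the time), and the helpers `closure` and `is_fully_dependent` are illustrative names.

```python
from itertools import combinations

def closure(attrs, fds):
    """Attribute closure: everything determined by attrs under the declared fds."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

def is_fully_dependent(a, b, fds):
    """True if a -> b holds and no non-empty proper subset of a also determines b."""
    if b not in closure(a, fds):
        return False
    for size in range(1, len(a)):
        for subset in combinations(a, size):
            if b in closure(subset, fds):
                return False
    return True

# Assumed dependencies for Example 2.
fds = [
    (("rep_id",), ("rep_first_name", "rep_last_name")),
    (("client_id",), ("client",)),
    (("rep_id", "client_id"), ("time_with_client",)),
]
key = ("rep_id", "client_id")

print(is_fully_dependent(key, "time_with_client", fds))  # needs the whole key
print(is_fully_dependent(key, "client", fds))            # client_id alone is enough
```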

22 1NF to 2NF
Identify the primary key for the 1NF relation.
Identify the functional dependencies in the relation.
If partial dependencies on the primary key exist, remove them by placing the partially dependent attributes in a new relation along with a copy of their determinant.

23 Example 2NF
Rep ID* | Client ID* | Time With Client
TS hrs TS hrs TS hrs RK hrs RK hrs RK hrs
Rep ID* | First Name | Last Name
TS-89 | Gilroy | Gladstone
RK-56 | Mary | Mayhem
Client ID* | Client Name
978 | US Corp
665 | Taggarts
782 | Kilroy Inc.
221 | Italiana
982 | Linkers
Original table decomposed into smaller tables
Each of them is in 2NF
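The 1NF-to-2NF step applied to Example 2 can be sketched as a small schema-level function. This is a simplified illustration, not a general normalization algorithm: the function `to_2nf`, the attribute names and the dependency list are assumptions, and the function only handles dependencies whose determinant is a proper subset of the primary key.

```python
def to_2nf(attributes, key, fds):
    """Split off attributes that depend on only part of the key.

    Returns a dict mapping each relation's key to its attribute list.
    """
    key_set = set(key)
    relations = {}
    moved = set()
    for lhs, rhs in fds:
        if set(lhs) < key_set:  # proper subset of the key: a partial dependency
            relations[tuple(lhs)] = list(lhs) + list(rhs)
            moved |= set(rhs)
    # The original relation keeps the key and the fully dependent attributes.
    relations[tuple(key)] = [a for a in attributes if a not in moved]
    return relations

attributes = ["rep_id", "rep_first_name", "rep_last_name",
              "client_id", "client", "time_with_client"]
key = ("rep_id", "client_id")
fds = [  # assumed dependencies for Example 2
    (("rep_id",), ("rep_first_name", "rep_last_name")),
    (("client_id",), ("client",)),
    (("rep_id", "client_id"), ("time_with_client",)),
]

for rel_key, rel_attrs in to_2nf(attributes, key, fds).items():
    print(rel_key, rel_attrs)
```

The three resulting attribute lists correspond to the three tables shown on the slide.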

24 Third Normal Form (3NF)
Based on the concept of transitive dependency:
–A, B and C are attributes of a relation such that if A -> B and B -> C,
–then C is transitively dependent on A through B (provided that A is not functionally dependent on B or C).
3NF - A relation that is in 1NF and 2NF and in which no non-primary-key attribute is transitively dependent on the primary key.

25 2NF to 3NF
Identify the primary key in the 2NF relation.
Identify the functional dependencies in the relation.
If transitive dependencies on the primary key exist, remove them by placing the transitively dependent attributes in a new relation along with a copy of their determinant.
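The 2NF-to-3NF step can be sketched in the same schema-level style. Everything here is an assumption for illustration: the StaffBranch attributes `branchNo` and `bAddress`, the dependency list, and the function `to_3nf`. The sketch assumes the input is already in 2NF and treats any dependency whose determinant contains no key attribute as the source of a transitive dependency.

```python
def to_3nf(attributes, key, fds):
    """Split off attributes that depend on the key only through a non-key determinant.

    Returns a dict mapping each relation's key to its attribute list.
    """
    key_set = set(key)
    relations = {}
    moved = set()
    for lhs, rhs in fds:
        if not (set(lhs) & key_set):  # determinant made only of non-key attributes
            relations[tuple(lhs)] = list(lhs) + list(rhs)
            moved |= set(rhs)
    # The determinant itself stays behind and links the two relations.
    relations[tuple(key)] = [a for a in attributes if a not in moved]
    return relations

# Hypothetical StaffBranch schema: staffNo -> branchNo and branchNo -> bAddress.
attributes = ["staffNo", "fName", "lName", "position", "branchNo", "bAddress"]
key = ("staffNo",)
fds = [
    (("staffNo",), ("fName", "lName", "position", "branchNo")),
    (("branchNo",), ("bAddress",)),
]

for rel_key, rel_attrs in to_3nf(attributes, key, fds).items():
    print(rel_key, rel_attrs)
```

Under these assumptions the output matches the Staff and Branch split mentioned earlier in the lecture, with branchNo kept in Staff as the link to Branch.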

26 Normalization Flow
UNF -> 1NF: Remove repeating groups
1NF -> 2NF: Remove partial dependencies
2NF -> 3NF: Remove transitive dependencies
3NF -> More normalized forms

27 Conclusion
The quality of the relations derived from ER models is unknown
Normalization is a systematic process of either assessing these relations or converting them into progressively stricter normal forms
Advanced normal forms such as Boyce-Codd normal form (BCNF), 4NF and 5NF exist

