11/07/2003Akbar Mokhtarani (LBNL)1 Normalization of Relational Tables Akbar Mokhtarani LBNL (HENPC group) November 7, 2003.

1 Normalization of Relational Tables
Akbar Mokhtarani, LBNL (HENPC group)
November 7, 2003

2 Overview
Relational Model Basics
Functional Dependencies
Modification Anomalies
Normalization

3 Relational Model
Introduced by E. F. Codd in 1970. It consists of:
  Data structure: data organized in the form of tables
  Data manipulation: operations used to manipulate data in the relations (e.g., SQL)
  Data integrity: facilities to maintain the integrity of data when they are manipulated

4 Data structure (tables)
Each table consists of a set of named columns (attributes corresponding to some real-world entity)
Each row corresponds to a record containing data values for a single entity
Attributes are single-valued and have domains (sets of allowed values)

5 Properties of Relations (Not all tables are relations)
A table is called a relation if:
  The table has a unique name
  Values are atomic (no repeating groups)
  Each row is uniquely determined by a key
  Each attribute has a unique name
  The order of columns is insignificant
  The order of rows is insignificant

6 Keys
Super key: a group of one or more attributes that uniquely identifies a row
Candidate keys: irreducible super keys
Primary key: the candidate key selected to identify the row
Alternate key: a candidate key other than the primary key
Foreign key: a set of attributes of one relation whose values match values of some candidate key of another relation
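
The key definitions above can be sketched in Python against a single table instance. The rows and attribute names below are hypothetical, and an instance can only show that an attribute set fails to be a key; whether it is a key of the relation in general depends on the business rules.

```python
from itertools import combinations

def is_superkey(rows, attrs):
    """True if no two rows agree on every attribute in attrs."""
    seen = set()
    for row in rows:
        value = tuple(row[a] for a in attrs)
        if value in seen:
            return False
        seen.add(value)
    return True

def candidate_keys(rows, attributes):
    """Irreducible superkeys of this instance, smallest first."""
    keys = []
    for size in range(1, len(attributes) + 1):
        for attrs in combinations(attributes, size):
            # A set that contains an already-found key is reducible; skip it.
            if any(set(k) <= set(attrs) for k in keys):
                continue
            if is_superkey(rows, attrs):
                keys.append(attrs)
    return keys

# Hypothetical sample data, not taken from the slides.
students = [
    {"SID": 1, "SSN": "111", "Name": "Ann"},
    {"SID": 2, "SSN": "222", "Name": "Bob"},
    {"SID": 3, "SSN": "333", "Name": "Ann"},
]
keys = candidate_keys(students, ["SID", "SSN", "Name"])
```

In this instance SID and SSN each identify a row on their own, so either could be chosen as the primary key and the other would be an alternate key.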

7 Example
CUSTOMER(Customer_ID, Customer_name, Address, City, State, Zip)
ORDER(Order_ID, Order_Date, Customer_ID)
ORDERLINE(Order_ID, Product_ID, Quantity)
PRODUCT(Product_ID, Product_Description, Product_Finish, Prod_Price, On_Hand)
FK: Customer_ID in ORDER; Order_ID and Product_ID in ORDERLINE

8 Integrity Constraints
Major integrity constraints (business rules):
  Domain constraints: values in a column have the same domain (data type and size)
  Entity integrity: non-null primary key
  Referential integrity: if there is a foreign key, each FK value must either match a primary key value in another relation or be null
  Action assertions: action constraints (e.g., no student can take more than 15 units per term)
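
As an illustration, the first three constraint types can be declared in SQL. The sketch below uses Python's built-in sqlite3 module; the table and column names are simplified assumptions loosely based on the CUSTOMER and ORDER relations (the table is called orders because ORDER is a reserved word), and the two-letter state rule is an invented domain constraint. Action assertions are not covered here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves FK checks off by default
conn.executescript("""
CREATE TABLE customer (
    customer_id   INTEGER PRIMARY KEY NOT NULL,   -- entity integrity
    customer_name TEXT NOT NULL,
    state         TEXT CHECK (length(state) = 2)  -- domain constraint (assumed rule)
);
CREATE TABLE orders (
    order_id    INTEGER PRIMARY KEY NOT NULL,
    order_date  TEXT NOT NULL,
    -- referential integrity: must match a customer or be NULL
    customer_id INTEGER REFERENCES customer(customer_id)
);
""")

conn.execute("INSERT INTO customer VALUES (1, 'Acme', 'CA')")
conn.execute("INSERT INTO orders VALUES (100, '2003-11-07', 1)")     # FK matches
conn.execute("INSERT INTO orders VALUES (101, '2003-11-07', NULL)")  # null FK is allowed

def rejected(sql):
    """Run a statement and report whether a constraint refused it."""
    try:
        conn.execute(sql)
    except sqlite3.IntegrityError:
        return True
    return False

orphan_rejected = rejected("INSERT INTO orders VALUES (102, '2003-11-07', 99)")
bad_state_rejected = rejected("INSERT INTO customer VALUES (2, 'Bolt', 'California')")
```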

9 Functional Dependency (FD) (Relationship Among Attributes)
A functional dependency is a special integrity constraint that states:
  FD: X → Y means that if t1.X = t2.X then t1.Y = t2.Y
where:
  X and Y are subsets of the attributes of a relation R
  t1 and t2 are tuples of any relational instance of R
X is said to functionally determine Y, or Y is functionally dependent on X
X is called the determinant

10 Functional Dependency (Cont’d)
Full FD: X → Y is a full FD if removing any attribute from X destroys the dependency
Partial FD: X → Y is partial if one or more non-key attributes in Y are already determined by a proper subset of X

11 FD Rules
1. Reflexive: If X ⊆ Y, then Y → X
2. Augmentation: If X → Y, then XZ → YZ
3. Transitive: If X → Y and Y → Z, then X → Z
4. Decomposition: If X → YZ, then X → Y and X → Z
5. Union: If X → Y and X → Z, then X → YZ
6. Pseudo-transitive: If X → Y and WY → Z, then WX → Z
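
One common way to apply these rules mechanically is to compute the closure of an attribute set, that is, every attribute it determines under a given list of FDs. The following is a minimal Python sketch; the function names and the sample FDs are assumptions chosen to exercise rule 6.

```python
def closure(attrs, fds):
    """Return the set of attributes functionally determined by attrs.

    fds is a list of (lhs, rhs) pairs, each side a set of attribute names.
    """
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            # If we already have the whole left side, we gain the right side.
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result

def implies(fds, lhs, rhs):
    """True if lhs -> rhs follows from the FDs in fds."""
    return set(rhs) <= closure(lhs, fds)

# Hypothetical FDs: X -> Y and WY -> Z, so WX -> Z should follow (rule 6).
fds = [({"X"}, {"Y"}), ({"W", "Y"}, {"Z"})]
wx_gives_z = implies(fds, {"W", "X"}, {"Z"})
x_gives_z = implies(fds, {"X"}, {"Z"})
```

Here WX → Z is derivable while X → Z is not, because Z needs W as well as Y.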

12 FD Example
A   B   C   D
a1  b1  c1  d1
a1  b2  c2  d1
a2  b1  c1  d2
a1  b1  c1  d2
AB → C holds, but AB → D does not: the first and last rows agree on A and B yet differ on D.
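
The example can be checked directly from the definition of an FD. The sketch below loads the four rows of the table and tests each dependency; the helper name fd_holds is an assumption.

```python
def fd_holds(rows, lhs, rhs):
    """True if every pair of rows that agrees on lhs also agrees on rhs."""
    seen = {}
    for row in rows:
        x = tuple(row[a] for a in lhs)
        y = tuple(row[a] for a in rhs)
        # Remember the first rhs value seen for this lhs value and compare later ones.
        if seen.setdefault(x, y) != y:
            return False
    return True

# The four rows of the FD example table.
rows = [dict(zip("ABCD", r)) for r in [
    ("a1", "b1", "c1", "d1"),
    ("a1", "b2", "c2", "d1"),
    ("a2", "b1", "c1", "d2"),
    ("a1", "b1", "c1", "d2"),
]]

ab_determines_c = fd_holds(rows, "AB", "C")
ab_determines_d = fd_holds(rows, "AB", "D")
```

A check like this can only refute an FD on one instance; that AB → C holds for every instance is a statement about the data's meaning, not about these four rows.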

13 Modification Anomalies
Anomalies are unexpected side effects that occur when modifying the contents of a table with excessive redundancy
Insertion anomaly: extra data must be added in order to add the desired data to the DB
Deletion anomaly: deleting a row causes other data to be deleted
Update anomaly: multiple rows must be changed to modify a single fact

14 Normalization
Normalization is the process of decomposing relations with anomalies to produce smaller, well-structured relations
It is built around the concept of normal forms
A relation is said to be in a particular normal form if it satisfies certain conditions

15 Levels of Normalization
(Figure: nested normal forms, from least to most restrictive) 1NF, 2NF, 3NF, BCNF, 4NF, 5NF, Domain/Key NF

16 A relation is in:
  1NF if it contains no multivalued attributes
  2NF if it is in 1NF and every non-key attribute is fully functionally dependent on the PK
  3NF if it is in 2NF and no transitive dependencies exist
  BCNF if every determinant is a candidate key
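
The BCNF condition can be tested mechanically once the FDs are written down. The sketch below is one possible check in Python. It uses the common textbook formulation that the determinant of every nontrivial FD must be a superkey, which matches the slide's wording when the FDs are full. The relation R(A, B, C) and its FDs are hypothetical.

```python
def closure(attrs, fds):
    """Attributes functionally determined by attrs under fds."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result

def bcnf_violations(attributes, fds):
    """Nontrivial FDs whose determinant does not determine the whole relation."""
    violations = []
    for lhs, rhs in fds:
        if rhs <= lhs:
            continue  # trivial FD, always allowed
        if closure(lhs, fds) != set(attributes):
            violations.append((lhs, rhs))
    return violations

# Hypothetical relation R(A, B, C) with AB -> C and C -> B.
# It is in 3NF (B is part of the key AB) but C is not a key.
r_attrs = {"A", "B", "C"}
r_fds = [({"A", "B"}, {"C"}), ({"C"}, {"B"})]
violations = bcnf_violations(r_attrs, r_fds)
```

The determinant C is reported because its closure is only {B, C}; decomposing on C → B would remove the violation.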

17 Steps in Normalization
Table with multivalued attributes
  Remove multivalued attributes
First normal form
  Remove partial dependencies
Second normal form
  Remove transitive dependencies
Third normal form
  Remove remaining anomalies resulting from FDs
Boyce-Codd normal form

18 First Normal Form
This relation contains:
  Insertion anomaly: adding a new department or class requires a student to sign up for it
  Deletion anomaly: deleting “Lisa Gilmore” causes information about the “Sociology” department and the “Soc 101” class to be lost
  Update anomaly: if the course description for “Math 105” changes, many rows need to be updated
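
The slide's table is not part of this transcript, so the following Python sketch uses hypothetical rows (only "Lisa Gilmore", "Sociology", "Soc 101", and "Math 105" come from the slide; the other names and descriptions are invented) to show the deletion and update anomalies, and how splitting class facts into their own relation avoids them.

```python
# Hypothetical single-table data in the spirit of the slide's example.
enrollment = [
    {"student": "Lisa Gilmore", "major": "Sociology",
     "class": "Soc 101", "desc": "Intro to Sociology"},
    {"student": "Ann Lee", "major": "Math",
     "class": "Math 105", "desc": "Calculus"},
    {"student": "Bob Ray", "major": "Physics",
     "class": "Math 105", "desc": "Calculus"},
]

# Deletion anomaly: removing the only Soc 101 student also removes the class.
after_delete = [r for r in enrollment if r["student"] != "Lisa Gilmore"]
soc_lost = "Soc 101" not in {r["class"] for r in after_delete}

# Update anomaly: one fact (the Math 105 description) lives in several rows.
rows_to_update = sum(1 for r in enrollment if r["class"] == "Math 105")

# Decomposed design: class facts are stored once, enrollments separately.
classes = {r["class"]: r["desc"] for r in enrollment}
enrolled = {(r["student"], r["class"]) for r in enrollment}
enrolled_after = {(s, c) for s, c in enrolled if s != "Lisa Gilmore"}
soc_kept = "Soc 101" in classes  # unaffected by deleting the enrollment
```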

19 FD diagram
(Figure: dependency diagram over the attributes SID, StName, Major, ClassName, Desc., Grade, SSN)

20 2NF and 3NF
We still have insertion and deletion anomalies for “Major”

21 Anomaly-free Form

22 Another example
(Figure: a relation over the attributes A, B, C, D, E, G decomposed in stages labeled 1NF, 2NF, 3NF, and BCNF; the BCNF stage is annotated “Switch keys”)

23 Relational Algebra
The manipulative part of the relational model is called relational algebra. It is a collection of operators that take relations as their operands and return a relation as their result.
Two groups of operators:
  Set operators: union, intersection, difference, and Cartesian product
  Relational operators: restrict (select), project, join, and divide

24 Relational Operators
Restrict: returns a relation containing all tuples from a specified relation that satisfy a specified condition.
Project: returns a relation containing all (sub)tuples that remain in a specified relation after specified attributes have been removed.
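
As a rough Python sketch of restrict and project, a relation can be modeled as a set of tuples, with each tuple stored as a frozenset of (attribute, value) pairs so that duplicates disappear automatically. The representation, the function names, and the sample product data are all assumptions made for illustration.

```python
def relation(*rows):
    """Build a relation (a set of tuples) from dictionaries."""
    return {frozenset(r.items()) for r in rows}

def restrict(rel, predicate):
    """Keep the tuples for which predicate(tuple_as_dict) is true."""
    return {t for t in rel if predicate(dict(t))}

def project(rel, attrs):
    """Keep only the named attributes; duplicate subtuples collapse."""
    return {frozenset((a, v) for a, v in t if a in attrs) for t in rel}

# Hypothetical product data.
products = relation(
    {"product_id": 1, "finish": "Oak", "price": 200},
    {"product_id": 2, "finish": "Oak", "price": 350},
    {"product_id": 3, "finish": "Ash", "price": 120},
)

cheap = restrict(products, lambda t: t["price"] < 300)
finishes = project(products, {"finish"})
```

Projecting three products onto finish yields two tuples, since a relation holds no duplicates.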

25 Set Operators
Product: returns a relation containing all possible tuples that are a combination of two tuples, one from each of two specified relations.
Union: returns a relation containing all tuples that appear in either or both of two specified relations.

26 Set Operators (cont’d)
Intersect: returns a relation containing all tuples that appear in both of two specified relations.
Difference: returns a relation containing all tuples that appear in the first and not in the second of two specified relations.
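
Product, union, intersect, and difference map closely onto Python's built-in set operations when a relation is modeled as a set of tuples. The sketch below assumes each tuple is a frozenset of (attribute, value) pairs, that union, intersect, and difference are applied to relations with the same attributes, and that product is applied to relations with no attributes in common. The sample data is hypothetical.

```python
def relation(*rows):
    """Build a relation (a set of tuples) from dictionaries."""
    return {frozenset(r.items()) for r in rows}

def product(r, s):
    """Every combination of one tuple from r with one tuple from s."""
    return {a | b for a in r for b in s}

# Hypothetical relations over the same attribute ...
r = relation({"id": 1}, {"id": 2})
s = relation({"id": 2}, {"id": 3})
# ... and one with a different attribute, for the product.
colors = relation({"color": "red"}, {"color": "blue"})

union = r | s          # tuples in either or both
intersect = r & s      # tuples in both
difference = r - s     # tuples in r but not in s
pairs = product(r, colors)
```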

27 Relational Operators (cont’d)
Join: returns a relation containing all possible tuples that are a combination of two tuples, one from each of two specified relations, such that the two tuples contributing to any given combination have a common value for the common attributes of the two relations.
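
A natural join following this definition can be sketched in Python as a nested loop that keeps a combined tuple only when the two inputs agree on every shared attribute. The representation (tuples as frozensets of attribute/value pairs) and the customer and order data are assumptions for illustration.

```python
def relation(*rows):
    """Build a relation (a set of tuples) from dictionaries."""
    return {frozenset(r.items()) for r in rows}

def natural_join(r, s):
    """Combine tuples of r and s that agree on all common attributes."""
    result = set()
    for a in r:
        for b in s:
            da, db = dict(a), dict(b)
            common = da.keys() & db.keys()
            if all(da[k] == db[k] for k in common):
                # Shared pairs are identical, so the union is a valid tuple.
                result.add(a | b)
    return result

# Hypothetical data: customer 3 has no orders.
customers = relation(
    {"customer_id": 1, "name": "Acme"},
    {"customer_id": 2, "name": "Bolt"},
    {"customer_id": 3, "name": "Cole"},
)
orders = relation(
    {"order_id": 100, "customer_id": 1},
    {"order_id": 101, "customer_id": 1},
    {"order_id": 102, "customer_id": 2},
)

joined = natural_join(customers, orders)
```

Customers without a matching order do not appear in the result, which is the behavior of an inner join.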

28 Relational Operators (cont’d)
Divide: takes two unary relations and one binary relation and returns a relation containing all tuples from one unary relation that appear in the binary relation matched with all tuples in the other unary relation.
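
The divide operator is the least intuitive of the group, so a small Python sketch may help. Here the unary relations are modeled as plain sets of values and the binary relation as a set of pairs; the function name and the student and class data are hypothetical.

```python
def divide(dividend, divisor, mediator):
    """Values x in dividend paired in mediator with every y in divisor.

    dividend, divisor: sets of single values (unary relations)
    mediator: set of (x, y) pairs (binary relation)
    """
    return {x for x in dividend
            if all((x, y) in mediator for y in divisor)}

# Hypothetical question: which students took every required class?
students = {"s1", "s2", "s3"}
required = {"Math 105", "Soc 101"}
took = {
    ("s1", "Math 105"), ("s1", "Soc 101"),
    ("s2", "Math 105"),
    ("s3", "Soc 101"), ("s3", "Art 110"),
}

took_all = divide(students, required, took)
```

Only s1 is paired with both required classes, so it is the only value returned.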

