Presentation on theme: "FEN 2012-09-011 Introduction to the database field: Quality checking table design: Design Guidelines Normalisation Seminar: Introduction to relational."— Presentation transcript:
FEN 2012-09-011 Introduction to the database field: Quality checking table design: Design Guidelines Normalisation Seminar: Introduction to relational databases Is this OK?
Design of Relational Tables Informal Design Guidelines: Table Semantics A table should hold information about one and only one entity/concept from the real world Dont mix information about more things in one table Avoid Redundant Information Waste of storage Update Anomalies Minimise NULL-values Storage requirements Multiple interpretations (ambiguity) Disallowing the generation of spurious tuples when joining tables. FEN 2012-09-012
Design of Relational Tables Table Semantics: Consider this table, a part of a system to handle loans from a library: Loan:[title, matNo, lno, lname, laddress, date, status] Information about different things in the same table: Loaner Material Loan What will you suggest? FEN 2012-09-013
Design of Relational Tables Redundant information: Loan:[title, matNo, lno, lname, laddress, date, status] A loaner has many loans. Books and other materials may be lend by many loaners. What will you suggest? FEN 2012-09-014
Design of Relational Tables Minimise NULL-values. If NULL is the most common value for an attribute, then that attribute may not belong in the table. FEN 2012-09-015 One out of ten employees has a company car. One out of ten cars are assigned to certain employee. On which side should the foreign key be included?
Design of Relational Tables Spurious Tuples Again consider this table part of system to handle loans from a library: Loaner:[lNo, fname, lname,…….] Copy:[matNo,…, lname, …] The relationship between Loaner and Copy is designed by including the loaners last name in Copy. When Loaner and Copy are joined over lname spurious tuples probably will be generated since lname hardly is unique. For instance 117 Joe Smith will be associated with all copies borrowed by someone with last name Smith, and all other Smiths will be associated with copies borrowed by Joe. The problem arises when relations are represented by anything else than primary – foreign keys FEN 2012-09-016
Design of Relational Tables Normalisation Normal forms are the formal way to state design guidelines. Normalisation is the process. 6 normal forms (NF) are defined: 1st, 2nd, 3rd, and Boyce-Codd (BCNF). 4th and 5th NF BCNF is the one of most practical interest. FEN 2012-09-017
Design of Relational Tables First Normal Form (1NF) A table is on 1NF if All attributes are atomic 1NF has become part of the definition of a relation in the relational model and is achieved trivially. FEN 2012-09-018
Design of Relational Tables Functional Dependencies - the foundation of 2NF, 3NF and BCNF: Y is functional dependent (FD) of X, if there for any given value of X always is the same value for Y ( X and Y being any set of attributes). FD is written X -> Y Y is FD of X or X is determinant for Y If X is a candidate key, then X -> Y for all sets of attributes Y. X -> Y implies nothing about Y -> X. Classic example: in an address city is FD of postalCode (or postalCode determines city). FEN 2012-09-019
Design of Relational Tables A Side: Often in literature functional dependencies and normal forms are described using a lot of math and it may seem quite theoretical and complicated BUT FDs are business rules and normal forms are common sense constraints on table design The theory and the math are very useful building tools FEN 2012-09-0110
Design of Relational Tables Second Normal Form (2NF) Is about partial FDs A FD X->Y is a full functional dependency (FFD), if no attribute can be removed from X without also removing the FD X->Y. A FD that is not FFD is called partial. A table is on 2NF if: It is on 1NF All non-key attributes are FFD of all candidate keys. Example: Loan:[title, matNo, lno, lname, laddress, date, status] FEN 2012-09-0111 Which FDs do we have here?
Design of Relational Tables Partial FDs FEN 2012-09-0112 What can we do here?
Design of Relational Tables Third Normal Form (3NF) Is about transitive FDs A FD (X->Y) is transitive, if there exists a set of attributes Z satisfying X -> Z and Z -> Y. A table is on 3NF if: It is on 2NF No non-key attribute is transitively dependent of a candidate key. postal code - city-problem! FEN 2012-09-0113
Design of Relational Tables Transitive FDs: FEN 2012-09-0114 Ssn Dmgr_ssn since Ssn Dnumber and Dnumber Dmgr_ssn Dnumber is not a key or part of a candidate key for EMP_DEPT Lots of redundancy here. What will you do? Lots of redundancy here. What will you do?
Design of Relational Tables Transitive FDs: FEN 2012-09-0115
Design of Relational Tables Boyce-Codd Normal Form (BCNF) If a table is on BCNF, then it is also on 1., 2. and 3. NF. A table is on BCNF, if all determinants are candidate keys. That is: only candidate key may determine the value of other attributes. Only a few tables (tables with two or more overlapping candidate keys) can be one 3NF and not on BCNF. We will not investigate that further in this brief introduction. FEN 2012-09-0116
Design of Relational Tables FEN 2012-09-0117 Guideline for Normalisation All attributes are to depend on the key, the whole key, and nothing but the key. So help me Codd. And remember: FDs are business rules Normalisation fights redundancy and other maladies in table design
Example/Exercise FEN 2012-09-0118 Show all FDs. Normalise the design to BCNF