Fundamentals/ICY: Databases 2013/14 WEEK 9 –Friday John Barnden Professor of Artificial Intelligence School of Computer Science University of Birmingham,


1 Fundamentals/ICY: Databases 2013/14 WEEK 9 –Friday John Barnden Professor of Artificial Intelligence School of Computer Science University of Birmingham, UK

2 Reminder

3 Relation from a Table
The relation at the moment is
{ < ‘9568876A’, ‘Chopples’, 37 >, < ‘2544799Z’, ‘Blurp’, NULL >, < ‘1698674F’, ‘Rumpel’, 88 > }

People
PERS-ID    NAME      AGE
9568876A   Chopples  37
2544799Z   Blurp
1698674F   Rumpel    88

4 A Table as a Relation?
• People loosely talk about tables being relations. This is mathematically inaccurate for several reasons:
  1) The table, properly speaking, includes not just the rows but also the attribute names themselves, their domains, the specification of primary and foreign keys, etc.
  2) It’s only the rows at any given moment that form a relation. When a value in the table changes, or a row is added or deleted, the mathematical relation is replaced by a different one.
  3) Relations do not cater for tables with repeated rows. ((ASIDE: But see next slide for a way out.))
• But this is OK if you know what you (and those people) mean.
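
Points 2 and 3 can be sketched with Python sets standing in for mathematical relations. This is only an illustration: using `None` for NULL is an assumption of the sketch, and the updated age 41 is made up.

```python
# The rows of the People table at one moment, as a set of tuples.
# None is used as a stand-in for NULL (an assumption of this sketch).
people_now = frozenset({
    ("9568876A", "Chopples", 37),
    ("2544799Z", "Blurp", None),
    ("1698674F", "Rumpel", 88),
})

# Point 2: "updating" a row gives a different relation; the old one is unchanged.
# The new age 41 is invented for illustration.
people_later = (people_now - {("2544799Z", "Blurp", None)}) | {("2544799Z", "Blurp", 41)}
print(people_now == people_later)   # False

# Point 3: a set cannot hold a repeated row, so the duplicate simply disappears.
with_duplicate = set(people_now) | {("1698674F", "Rumpel", 88)}
print(len(with_duplicate))          # 3
```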

5 New (last on maths for now)

6 ((ASIDE: “Bags” in Maths))
• A variant of sets called “bags” (or “multisets”) is used in maths (and CS) and allows repeated members. There are union and other operations that respect the repetitions.
• So bags and their operations are a better fit than sets and their operations to DB tables, and notably to the repetition-respecting table operations (e.g. UNION ALL).
• But bags are non-standard and they’re not normally covered at an introductory level.
• See the databases textbook by Garcia-Molina et al 2009 for bags and their use in the DB area.
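
As a rough illustration of the contrast, the sketch below models a bag with Python's `collections.Counter` and then runs UNION and UNION ALL in an in-memory SQLite database. The table names `a` and `b` and their contents are hypothetical, and the bag operation shown is the additive one (multiplicities are summed), which is the one that matches UNION ALL.

```python
import sqlite3
from collections import Counter

# A bag sketched with Counter: each member carries a multiplicity.
bag_a = Counter({"Chopples": 2, "Blurp": 1})
bag_b = Counter({"Chopples": 1, "Rumpel": 1})

bag_sum = bag_a + bag_b              # additive bag union: multiplicities add up
set_union = set(bag_a) | set(bag_b)  # ordinary set union: repetitions are lost

# The same contrast in SQL (hypothetical tables a and b).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE a (name TEXT)")
conn.execute("CREATE TABLE b (name TEXT)")
conn.executemany("INSERT INTO a VALUES (?)", [("Chopples",), ("Chopples",), ("Blurp",)])
conn.executemany("INSERT INTO b VALUES (?)", [("Chopples",), ("Rumpel",)])

union_rows = conn.execute("SELECT name FROM a UNION SELECT name FROM b").fetchall()
union_all_rows = conn.execute("SELECT name FROM a UNION ALL SELECT name FROM b").fetchall()
conn.close()

print(sum(bag_sum.values()), len(set_union))   # 5 3
print(len(union_all_rows), len(union_rows))    # 5 3
```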

7 — Back to Database Design — NORMALIZATION

8 Normalization
• Normalization is often used within ER modeling, to help produce a good database design. It evaluates entity types and, when appropriate, creates new entity types and adjusts attributes in existing ones, (mainly) to minimize certain types of data redundancy and in some cases to avoid certain types of complexity.
• Some situations require non-normalization or denormalization for efficiency reasons: normalization generally increases the number of tables and makes many queries more elaborate (in straightforward ways, though).

9 Normal Forms
• Normalization can be divided into a series of stages called normal forms, giving more and more protection:
  - First normal form (1NF)
  - Second normal form (2NF)
  - Third normal form (3NF)
  - Boyce-Codd normal form (BCNF)
  - ((Fourth normal form (4NF) ))
  - Yet others!
• 1NF is mandatory and we have implicitly already covered it.

10 First Normal Form (1NF)
• Just insists on some restrictions we have already explicitly or implicitly imposed on entity types and tables:
  - In the entity type there is a candidate key whose attributes never have NULL values, and one such key has been chosen as the primary key.
  - There are no “repeating groups” in the table implementing the entity type.
• A repeating group is a group of related rows that have some empty cells that are to be thought of as copying values from some other row in the group. That’s my definition. It is more usually expressed in terms of having cells with multiple values, but I think this is inaccurate and misleading.

11 A Sample Report Layout with “repeated groups”

12 Another Unusual Feature of that Table
• The table has another feature that departs from DB-style tables.
• What is it?

13 (Partially) Corresponding Attempt at a DB-Style Table

14 The Problem with Repeating Groups
• Q: Why are they a problem?
• A: First reason:
  - Rows in a DB-style table are unordered, so how do you know which row(s) to “copy” PROJ_NUM and PROJ_NAME values from/to? (Previous diagram is deceptive.)
• A: Second reason:
  - Even if you could work out which row(s) to copy from/to, the copying would make many queries much more complex.
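
The first reason can be made concrete with a small sketch. It assumes the blank cells are meant to be copied from the nearest row above, as a printed report suggests. PROJ_NUM and PROJ_NAME come from the slide; the EMP_NUM column, the `fill_down` helper and all the values are hypothetical.

```python
def fill_down(rows, columns):
    """Copy each blank (None) cell in `columns` from the nearest row above it."""
    filled, last_seen = [], {}
    for row in rows:
        row = dict(row)  # work on a copy, leave the input untouched
        for col in columns:
            if row[col] is None:
                row[col] = last_seen.get(col)
            else:
                last_seen[col] = row[col]
        filled.append(row)
    return filled

# Hypothetical report rows with a repeating group: blanks mean "same as above".
report = [
    {"PROJ_NUM": 1, "PROJ_NAME": "Alpha", "EMP_NUM": 101},
    {"PROJ_NUM": None, "PROJ_NAME": None, "EMP_NUM": 102},
    {"PROJ_NUM": 2, "PROJ_NAME": "Beta", "EMP_NUM": 103},
    {"PROJ_NUM": None, "PROJ_NAME": None, "EMP_NUM": 104},
]
# The same rows in another order; a DB-style table promises no particular order.
reordered = [report[0], report[3], report[2], report[1]]

cols = ["PROJ_NUM", "PROJ_NAME"]
in_report_order = {r["EMP_NUM"]: r["PROJ_NUM"] for r in fill_down(report, cols)}
in_other_order = {r["EMP_NUM"]: r["PROJ_NUM"] for r in fill_down(reordered, cols)}

print(in_report_order[104], in_other_order[104])   # 2 1
```

The same set of rows gives employee 104 a different project depending only on the order in which the rows happen to be read, which is why the blanks have to be filled with explicit values before the table can be treated as a DB-style table.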

15 That Table put into 1NF (assuming there is a PK)

16 Dependencies and Determinants
• These concepts are needed for most of the remaining normal forms.
• Any set S of attributes in an entity type “determines” each attribute within it, i.e.: each attribute in S is “functionally dependent” on the whole set S. But in the following discussion of normalization…
• When we say X is functionally dependent on S (i.e. S determines X) we will mainly be talking about non-trivial cases: cases where X is outside S (though still in the same entity type).
• A [non-trivial] “determinant” will be a set of attributes D in a table that determines some attribute X outside D in the same entity type.
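
A minimal sketch of what “S determines X” means for one particular set of rows. The helper `determines` and the sample data are hypothetical. Note that a functional dependency is a claim about every legal state of the table, so a check on sample rows can refute a dependency but cannot prove it.

```python
def determines(rows, S, X):
    """True if, in these rows, equal values on the attributes S always go with equal X values."""
    seen = {}
    for row in rows:
        key = tuple(row[a] for a in S)
        # Remember the first X value seen for this S value; later rows must agree with it.
        if seen.setdefault(key, row[X]) != row[X]:
            return False
    return True

# Hypothetical rows: which employee works how many hours on which project.
rows = [
    {"PROJ_NUM": 1, "EMP_NUM": 101, "PROJ_NAME": "Alpha", "HOURS": 10},
    {"PROJ_NUM": 1, "EMP_NUM": 102, "PROJ_NAME": "Alpha", "HOURS": 5},
    {"PROJ_NUM": 2, "EMP_NUM": 101, "PROJ_NAME": "Beta", "HOURS": 7},
    {"PROJ_NUM": 2, "EMP_NUM": 103, "PROJ_NAME": "Beta", "HOURS": 10},
]

print(determines(rows, ["PROJ_NUM"], "PROJ_NAME"))           # True  (non-trivial)
print(determines(rows, ["PROJ_NUM"], "HOURS"))               # False
print(determines(rows, ["PROJ_NUM", "EMP_NUM"], "HOURS"))    # True
print(determines(rows, ["PROJ_NUM", "EMP_NUM"], "EMP_NUM"))  # True  (trivial: X is inside S)
```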

17 1NF can have Undesirable Dependencies
• 1NF entity types can contain “partial,” “transitive” and other generally undesirable functional dependencies of an attribute X on a determinant D.
• By “undesirable” I will mean mainly that the determinant D is not a superkey, so that at least one attribute Y in the entity type is not determined by D, so Y can have different values in the entity type for equal D values, so redundancy on D → X (repetition of the association between D and X values) can arise.
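
The chain of reasoning can be traced on a few hypothetical rows, with D = PROJ_NUM, X = PROJ_NAME and Y = EMP_NUM (the data and the choice of attributes are assumptions of this sketch). Because D is not a superkey, the same D value appears in several rows, and each of those rows stores the D → X association again.

```python
from collections import Counter

# Hypothetical rows: (PROJ_NUM, EMP_NUM, PROJ_NAME, HOURS)
rows = [
    (1, 101, "Alpha", 10),
    (1, 102, "Alpha", 5),
    (2, 101, "Beta", 7),
    (2, 103, "Beta", 10),
]

# D = PROJ_NUM is not a superkey: its values do not identify a single row,
# because Y = EMP_NUM takes different values for the same PROJ_NUM.
d_values = [proj_num for proj_num, _, _, _ in rows]
d_is_unique = len(set(d_values)) == len(d_values)
print(d_is_unique)   # False

# Redundancy on D -> X: count how often each (PROJ_NUM, PROJ_NAME) pair is stored.
pairs = Counter((proj_num, proj_name) for proj_num, _, proj_name, _ in rows)
print(pairs[(1, "Alpha")], pairs[(2, "Beta")])   # 2 2
```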

18 Partial and Transitive Dependencies

19 1NF can have Partial Dependencies
• Partial dependency: where the determinant is part but not all of the primary key (and NB: is therefore not a superkey).
• The determined attribute X is necessarily outside the whole PK. Exercise: why?

20 Second Normal Form
• An entity type is in second normal form (2NF) if:
  - It is in 1NF, and
  - It includes no partial dependencies.
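
One way to look for partial dependencies in sample data is to try every proper, non-empty part of the primary key as a candidate determinant. The sketch below does this; the helper names and the rows are hypothetical, and since it only inspects one set of rows it can suggest partial dependencies but cannot establish them for the entity type in general.

```python
from itertools import combinations

def determines(rows, S, X):
    """True if equal values on the attributes S always go with equal X values in these rows."""
    seen = {}
    return all(seen.setdefault(tuple(r[a] for a in S), r[X]) == r[X] for r in rows)

def partial_dependencies(rows, primary_key):
    """Pairs (D, X): D is a proper, non-empty part of the PK and determines X outside the PK."""
    non_key = [a for a in rows[0] if a not in primary_key]
    found = []
    for size in range(1, len(primary_key)):       # proper subsets only
        for D in combinations(primary_key, size):
            for X in non_key:
                if determines(rows, D, X):
                    found.append((D, X))
    return found

# Hypothetical 1NF rows with composite primary key (PROJ_NUM, EMP_NUM).
rows = [
    {"PROJ_NUM": 1, "EMP_NUM": 101, "PROJ_NAME": "Alpha", "EMP_NAME": "Ann", "HOURS": 10},
    {"PROJ_NUM": 1, "EMP_NUM": 102, "PROJ_NAME": "Alpha", "EMP_NAME": "Bob", "HOURS": 5},
    {"PROJ_NUM": 2, "EMP_NUM": 101, "PROJ_NAME": "Beta", "EMP_NAME": "Ann", "HOURS": 7},
    {"PROJ_NUM": 2, "EMP_NUM": 103, "PROJ_NAME": "Beta", "EMP_NAME": "Cy", "HOURS": 10},
]

violations = partial_dependencies(rows, ("PROJ_NUM", "EMP_NUM"))
print(violations)
# [(('PROJ_NUM',), 'PROJ_NAME'), (('EMP_NUM',), 'EMP_NAME')]
```

An empty result would be consistent with 2NF for these rows; here two partial dependencies show up, so this hypothetical entity type is in 1NF but not in 2NF.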

21 Conversion to 2NF
• For each determinant D involved in a partial dependency in the original entity type T, also use D as the PK of a new entity type NT(D), and move the attributes X determined by D out of T into NT(D).
• D itself stays in T as well as being copied into NT(D).
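
The conversion rule can be sketched in SQL, run here through Python's `sqlite3` module on an in-memory database. All table names, column names and data are hypothetical; T is `assignment_1nf` with primary key (PROJ_NUM, EMP_NUM), and the two partial dependencies assumed are PROJ_NUM → PROJ_NAME and EMP_NUM → EMP_NAME.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE assignment_1nf (
    PROJ_NUM INTEGER, EMP_NUM INTEGER, PROJ_NAME TEXT, EMP_NAME TEXT, HOURS INTEGER,
    PRIMARY KEY (PROJ_NUM, EMP_NUM)
);
INSERT INTO assignment_1nf VALUES
    (1, 101, 'Alpha', 'Ann', 10), (1, 102, 'Alpha', 'Bob', 5),
    (2, 101, 'Beta', 'Ann', 7), (2, 103, 'Beta', 'Cy', 10);

-- NT(PROJ_NUM): the determinant becomes the PK of a new table.
CREATE TABLE project (PROJ_NUM INTEGER PRIMARY KEY, PROJ_NAME TEXT);
INSERT INTO project SELECT DISTINCT PROJ_NUM, PROJ_NAME FROM assignment_1nf;

-- NT(EMP_NUM)
CREATE TABLE employee (EMP_NUM INTEGER PRIMARY KEY, EMP_NAME TEXT);
INSERT INTO employee SELECT DISTINCT EMP_NUM, EMP_NAME FROM assignment_1nf;

-- T keeps its whole PK (so each D stays in T) plus what depends on all of it.
CREATE TABLE assignment (
    PROJ_NUM INTEGER REFERENCES project(PROJ_NUM),
    EMP_NUM INTEGER REFERENCES employee(EMP_NUM),
    HOURS INTEGER,
    PRIMARY KEY (PROJ_NUM, EMP_NUM)
);
INSERT INTO assignment SELECT PROJ_NUM, EMP_NUM, HOURS FROM assignment_1nf;
""")

original = conn.execute(
    "SELECT PROJ_NUM, EMP_NUM, PROJ_NAME, EMP_NAME, HOURS "
    "FROM assignment_1nf ORDER BY PROJ_NUM, EMP_NUM"
).fetchall()
rejoined = conn.execute(
    "SELECT a.PROJ_NUM, a.EMP_NUM, p.PROJ_NAME, e.EMP_NAME, a.HOURS "
    "FROM assignment a "
    "JOIN project p ON p.PROJ_NUM = a.PROJ_NUM "
    "JOIN employee e ON e.EMP_NUM = a.EMP_NUM "
    "ORDER BY a.PROJ_NUM, a.EMP_NUM"
).fetchall()
project_count = conn.execute("SELECT COUNT(*) FROM project").fetchone()[0]
conn.close()

print(original == rejoined)   # True: joining the new tables recovers the original rows
print(project_count)          # 2: each project name is now stored once
```

Keeping D in T is what makes the join back possible. It also shows the cost mentioned earlier: there are now three tables, and a query that wants project names next to hours needs a join.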

22 Reminder: Partial and Transitive Dependencies

23 Second Normal Form (2NF) Conversion: results on the example on the previous slide

