Fundamentals/ICY: Databases 2010/11 WEEK 11 John Barnden Professor of Artificial Intelligence School of Computer Science University of Birmingham, UK.

1 Fundamentals/ICY: Databases 2010/11 WEEK 11 John Barnden Professor of Artificial Intelligence School of Computer Science University of Birmingham, UK

2 Today
• Maths
• Structure of Exam
• Lecture by Funmi on his industrial experience

3 Remember: ((Items in double round brackets are optional material))

4 Reminder of Week 10 on Normalization and Relational Operators

5 Summary: Normalization and Database Design
• Normalization helps eliminate data redundancies and some other aspects of poor structure.
• Normalization focusses on problems in individual entity types.
• Difficult to separate normalization from overall ER modelling process.
• Normalization cannot, by itself, guarantee good designs.
• 1NF, 2NF, and 3NF are the most commonly encountered, and 3NF is often enough, but BCNF, 4NF etc. may also need to be considered.
• Non-normalized tables may be desirable in some cases, to increase processing speed and/or reduce conceptual complexity of operations.

6 Natural Join (continued)
• SQL:
  - SELECT … all the attributes, but including only one version of each shared one … FROM T1, T2 WHERE … explicit condition of equalities for ALL the shared attributes …
  - SELECT * FROM T1 NATURAL JOIN T2;
• Relational algebra notation:
  - Result table is T1 ⋈ T2, where T1 and T2 are the given tables. ⋈ is the “bow tie” symbol.
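The two SQL forms on this slide can be compared with a small sketch. The table contents and the column names K, A and B are made up for illustration, and SQLite (via Python's sqlite3 module) is assumed only as a convenient engine; it is not the system used in the module.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE T1 (K INTEGER, A TEXT);
    CREATE TABLE T2 (K INTEGER, B TEXT);
    INSERT INTO T1 VALUES (1, 'a1'), (2, 'a2'), (3, 'a3');
    INSERT INTO T2 VALUES (1, 'b1'), (3, 'b3'), (4, 'b4');
""")

# Form 1: explicit equality on the shared attribute K, with K listed only once.
explicit = sorted(conn.execute(
    "SELECT T1.K, A, B FROM T1, T2 WHERE T1.K = T2.K"
).fetchall())

# Form 2: NATURAL JOIN works out the shared attribute names by itself.
natural = sorted(conn.execute(
    "SELECT * FROM T1 NATURAL JOIN T2"
).fetchall())

print(explicit)
print(natural)
```

Both queries should give the same rows, since K is the only attribute name the two tables share.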

7 Outer Join
Of CUSTOMER and AGENT, using equal AGENT_CODE
• Left outer
  - Uses all the rows in the CUSTOMER table, by doing equijoin on AGENT_CODE but also including non-matching CUSTOMER rows.
• Right outer
  - Uses all the rows in the AGENT table, doing equijoin on AGENT_CODE but also including non-matching AGENT rows.
• Full outer
  - Uses all the rows in the AGENT and CUSTOMER tables, doing equijoin on AGENT_CODE but also including non-matching rows from each table.
  - Union of Left Outer Join result and Right Outer Join result.

8 Outer Joins (continued)
• SQL:
  - SELECT * FROM T1, T2 WHERE … explicit join condition … UNION … a SELECT expression that gets the extra LEFT rows UNION … a SELECT expression that gets the extra RIGHT rows
  - SELECT * FROM T1 LEFT/RIGHT/FULL JOIN T2 USING (… some shared attribs …) / ON … explicit join cond …
• Relational algebra notation:
  - Variants of the bow tie symbol ⋈. See R,C&Crockett sec. 4.2.3 (though their symbols need a subscript stating the join condition unless natural).
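A rough sketch of both SQL styles, using the CUSTOMER and AGENT tables from the previous slide. The rows and the CUS_NAME and AGENT_NAME columns are invented, and SQLite is assumed only for convenience. The full outer join is built in the UNION style, because older SQLite versions lack RIGHT and FULL JOIN.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE AGENT (AGENT_CODE INTEGER, AGENT_NAME TEXT);
    CREATE TABLE CUSTOMER (CUS_NAME TEXT, AGENT_CODE INTEGER);
    INSERT INTO AGENT VALUES (1, 'Ann'), (2, 'Bob');
    INSERT INTO CUSTOMER VALUES ('Xu', 1), ('Yo', 9);
""")

# Left outer join: every CUSTOMER row, matched or not.
left_outer = set(conn.execute("""
    SELECT CUS_NAME, C.AGENT_CODE, A.AGENT_CODE, AGENT_NAME
    FROM CUSTOMER AS C LEFT JOIN AGENT AS A
      ON C.AGENT_CODE = A.AGENT_CODE
""").fetchall())

# Full outer join in the UNION style: the equijoin, plus the extra
# LEFT (CUSTOMER) rows, plus the extra RIGHT (AGENT) rows.
full_outer = set(conn.execute("""
    SELECT CUS_NAME, C.AGENT_CODE, A.AGENT_CODE, AGENT_NAME
    FROM CUSTOMER AS C, AGENT AS A
    WHERE C.AGENT_CODE = A.AGENT_CODE
    UNION
    SELECT CUS_NAME, AGENT_CODE, NULL, NULL FROM CUSTOMER
    WHERE AGENT_CODE NOT IN (SELECT AGENT_CODE FROM AGENT)
    UNION
    SELECT NULL, NULL, AGENT_CODE, AGENT_NAME FROM AGENT
    WHERE AGENT_CODE NOT IN (SELECT AGENT_CODE FROM CUSTOMER)
""").fetchall())

print(left_outer)
print(full_outer)
```

The NOT IN subqueries are adequate here only because the sample data has no NULL agent codes; with NULLs present a NOT EXISTS formulation would be safer.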

9 DIVIDE operation: optional

10 Reminder of Week 9 on Mathematical Background

11 Relation from a Table
The relation at the moment is
{ <‘9568876A’, ‘Chopples’, 37>, <‘2544799Z’, ‘Blurp’, NULL>, <‘1698674F’, ‘Rumpel’, 88> }

People
PERS-ID   NAME      AGE
9568876A  Chopples  37
2544799Z  Blurp
1698674F  Rumpel    88

12 A Table as a Relation?
• People loosely talk about tables being relations. This is mathematically inaccurate for several reasons:
  1) The table properly speaking includes not just the rows but also the attribute names themselves, their domains, specification of primary and foreign keys, etc.
  2) It’s only the rows at any given moment that form a relation. When a value in the table changes or a row is added or deleted, the mathematical relation is replaced by a different one.
  3) Relations do not cater for tables with repeated rows. ((But there is a more advanced notion of relation, based on “bags” rather than sets, that does cater for repeated rows.))
• But OK if you know what you (and those people) mean.
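Point 3 can be illustrated with a small sketch. The three People rows are the ones from the previous slide; the repeated fourth row is a hypothetical addition, and Python's Counter stands in for a “bag” purely for illustration.

```python
from collections import Counter

rows = [
    ("9568876A", "Chopples", 37),
    ("2544799Z", "Blurp", None),   # None plays the role of NULL
    ("1698674F", "Rumpel", 88),
    ("1698674F", "Rumpel", 88),    # hypothetical repeated row
]

relation = set(rows)   # a set of tuples: the repeated row collapses
bag = Counter(rows)    # a bag (multiset): repetition counts are kept

print(len(rows), len(relation))
print(bag[("1698674F", "Rumpel", 88)])
```

Adding or deleting a row would give a different set, which matches point 2: the table persists while the relation it holds is replaced.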

13 New for Week 10 on Mathematical Background

14 Some “Relational Operations”: Set Operations Applied to Relations
• Union of relations R and S: R ∪ S = the set of tuples that are in R or S (or both). NB: no repetitions created!
• Intersection of relations R and S: R ∩ S = the set of tuples that are in both R and S.
• Difference of relations R and S: R − S = the set of tuples that are in R but not S.
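A minimal sketch of the three operations, with relations modelled as Python sets of tuples. The contents of R and S are invented for illustration.

```python
R = {(1, "a"), (2, "b")}
S = {(2, "b"), (3, "c")}

union = R | S          # tuples in R or S (or both); (2, "b") appears once
intersection = R & S   # tuples in both R and S
difference = R - S     # tuples in R but not in S

print(union)
print(intersection)
print(difference)
```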

15 Relational Operations: contrast to SQL
• Those operations do NOT themselves require R and S to have similar tuples in order to be well-defined.
  - E.g., R could be binary and on integer sets, S could be ternary and on character-string sets.
• But the corresponding DB table operations (which are usually called “relational operators”) do require the tables to have the same shape (same number of columns, same domains for corresponding columns).
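The contrast can be seen in a short sketch. The data is made up, and SQLite is assumed only as an example engine. It rejects a UNION of different column counts but, unlike stricter systems, does not check column domains, so only the column-count half of the “same shape” rule is shown.

```python
import sqlite3

R = {(1, 2), (3, 4)}      # binary, on integers
S = {("a", "b", "c")}     # ternary, on character strings
mixed = R | S             # perfectly well-defined as a set of tuples

conn = sqlite3.connect(":memory:")
try:
    # Two columns on the left, three on the right: not the same shape.
    conn.execute("SELECT 1, 2 UNION SELECT 'a', 'b', 'c'")
    sql_union_ok = True
except sqlite3.OperationalError:
    sql_union_ok = False

print(len(mixed), sql_union_ok)
```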

16 ((“Relations don’t remember where they came from”))
• Consider a relation R on A, B, C, D, E, …
  - i.e., R ⊆ A × B × C × D × E × ….
• Suppose A ⊆ AA, B ⊆ BB, C ⊆ CC, etc.
  - Then: a tuple formed from sets A, B, … is also automatically a tuple formed from AA, BB, …
  - That is, R ⊆ AA × BB × CC × DD × EE × ….
• So R is also a relation on AA, BB, CC, DD, EE, ….
• So a relation has no very close connection to the original sets it might have been defined from, unlike the case of tables, where the attribute domains are part of the nature of the table.
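A small check of this point, assuming tiny made-up sets A, B and supersets AA, BB. The helper name is_relation_on is hypothetical.

```python
from itertools import product

def is_relation_on(rel, *sets):
    """True if rel is a subset of the Cartesian product of the given sets."""
    return rel <= set(product(*sets))

A, B = {1, 2}, {"x"}
AA, BB = {1, 2, 3}, {"x", "y"}   # A is a subset of AA, B of BB
R = {(1, "x")}

print(is_relation_on(R, A, B))     # R is a relation on A, B
print(is_relation_on(R, AA, BB))   # and, automatically, on AA, BB
```

Nothing inside R itself records whether it was defined from A, B or from AA, BB.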

17 “Arity” of Relations
• A relation on two sets is binary, on three sets is ternary, … even when not all the sets are different.
• So a relation on A and A is still binary and NOT “unary.” The members of the relation are two-element tuples.
• A relation on, say, A, B and A is ternary and not binary. The members of the relation are three-element tuples.

18 “Arity” of Relations, contd.
• A “unary relation” on A is a set of singleton tuples formed from A elements.
  - Unusual (though not inconceivable) to want a single-attribute table in a finalized ER model.
  - But one-attribute tables often arise dynamically from table operations, as you know.

19 Relations from Somewhere to Somewhere
• A relation R “from” set A “to” set B is the same thing as a relation “on” A “and” B: just different terminology.
• Similarly, a relation from A, B, C to D, E is the same thing as a relation on A, B, C, D, E.

20 Changing the Sets in a Relation Around
• A relation R on A, B, C, D, E, say, obviously “induces” (i.e., gives rise to, in a natural way) a relation on any reordering of the sets, such as D, A, B, E, C, just by reordering each tuple in the same way.
• Thus, R induces a relation from, say, D, A to B, E, C.
• ((When there are just two sets A and B, the (only possible) reordering of the sets gives the inverse of R.))

21 Removing some of the Sets in a Relation (Projection)
• And we can remove some of the sets and the corresponding items from each tuple.
• Given the relation on D, A, B, E, C, we can get a relation on, say, D, B, C, just by removing the second and fourth item from each tuple.
• This is the mathematical operation underlying the PROJECT relational operator on tables (what I would prefer to call Select-Columns or Select-Attributes).
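Reordering and projection can both be sketched with one helper. The function name project and the sample tuples are made up; positions are counted from 0.

```python
def project(rel, keep):
    """Keep only the tuple positions listed in `keep`, in that order."""
    return {tuple(t[i] for i in keep) for t in rel}

# A hypothetical relation on A, B, C, D, E (positions 0..4).
R = {
    ("a1", "b1", "c1", "d1", "e1"),
    ("a2", "b1", "c1", "d1", "e2"),
    ("a1", "b2", "c2", "d2", "e1"),
}

# Reordering to D, A, B, E, C: the same positions, permuted.
on_dabec = project(R, [3, 0, 1, 4, 2])

# Projection onto D, B, C: drop the second and fourth item of each tuple.
on_dbc = project(on_dabec, [0, 2, 4])

print(on_dabec)
print(on_dbc)
```

Because the result is a set, tuples that become identical after items are removed collapse into one, so a projection can have fewer tuples than the relation it came from.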

22 Functional Relations (Partial Functions)
• A relation R from A to B is functional if, for any a in A, there is AT MOST one (but perhaps no) b in B such that <a, b> is in R.
• So several things in A can be related to the same thing in B.
• But you can’t have several things in B related to the same thing in A.
• A functional relation from A to B is also called a partial function from A to B.

23 Functional Relations, contd.
• Can generalize: a relation R from A1, A2, A3, … to B1, B2, B3, … is functional if, for each combination of things a1, a2, a3, … in A1, A2, A3, … respectively, there is at most one b1, b2, b3, … in B1, B2, B3, … respectively such that <a1, a2, a3, …, b1, b2, b3, …> is in R.
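A sketch of a functionality check covering both the binary and the generalized case. The name is_functional and the parameter n_from (how many leading tuple items belong to the “from” sets) are hypothetical.

```python
def is_functional(rel, n_from=1):
    """True if each combination of the first n_from items determines the rest."""
    seen = {}
    for t in rel:
        key, value = t[:n_from], t[n_from:]
        # setdefault stores value for a new key, else returns the stored one.
        if seen.setdefault(key, value) != value:
            return False
    return True

print(is_functional({(1, "x"), (2, "x")}))    # several a's, same b: fine
print(is_functional({(1, "x"), (1, "y")}))    # one a, two b's: not functional
print(is_functional({(1, 2, "p"), (1, 3, "p")}, n_from=2))
```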

24 Functional Relations arising from Functional Dependencies
• Suppose attribute X is functionally dependent on (= determined by) attributes A, B, … in a table.
• Then, at any moment, the induced relation from A, B, … to X is a partial function from the A, B, … value domains to the X value domain.
• Special case: Consider any superkey (e.g., the primary key) of a table. Then the relation in the table at a given moment is a partial function from the superkey’s domains to the remaining attribute domains.

25 Caution
• The word “partial” in the phrase “partial function” has nothing to do with the word “partial” in “partial dependency” as discussed under Normalization.
• Any dependency relationship in a table gives us a partial function, irrespective of whether the dependency is also “partial” in the special sense of involving only a part of the PK.

26 Remaining material on relations is optional

27 ((Restriction of a Relation))
• Consider a relation R from A to B, and a subset AA of A.
• Then the restriction of R to AA is the relation derived from R by restricting attention to AA, i.e., including only tuples whose first element is in AA.
• The new relation is notated R|AA.

28 ((Restriction More Generally))
• Consider a relation R from A, B, …, C to D, E, …, F and subsets AA of A, BB of B, …, CC of C.
• Then the restriction of R to AA, BB, …, CC is the relation derived from R by restricting attention to AA, BB, …, CC, i.e., including only tuples whose first few elements are in AA, BB, …, CC respectively.
• The new relation is notated R|AA, BB, …, CC.
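Restriction, in both the binary and the general form, might be sketched as follows. The function name restrict and the sample relations are invented.

```python
def restrict(rel, *subsets):
    """Keep tuples whose first len(subsets) items lie in the given subsets."""
    return {t for t in rel
            if all(t[i] in s for i, s in enumerate(subsets))}

# Binary case: R from A to B, restricted to AA = {1, 3}.
R = {(1, "x"), (2, "y"), (3, "x")}
print(restrict(R, {1, 3}))

# General case: the first two items are constrained, the third is not.
R3 = {(1, "p", True), (2, "p", False), (1, "q", True)}
print(restrict(R3, {1}, {"p"}))
```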

29 ((Totality of Relations))
• A relation R from A to B is total (on A) if it relates everything in A to at least one thing in B.
  - I.e., for every member a of A, there is at least one b in B such that <a, b> is in R.
• A relation may be merely partial (on A above) in not being total. However, technically all relations are “partial”, with total being a special case.

30 ((Totality, contd.))
• Can generalize: A relation R from A, B, C, … to D, E, … is total (on A, B, C, …) if for every member a of A, b of B, c of C, etc. there is at least one d in D, e in E, etc. such that <a, b, c, …, d, e, …> is in R.
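A sketch of a totality check that follows the generalized definition. The name is_total and the sample sets are hypothetical, and the check enumerates the full Cartesian product, so it is only practical for small finite sets.

```python
from itertools import product

def is_total(rel, *from_sets):
    """True if every combination from from_sets starts at least one tuple."""
    n = len(from_sets)
    covered = {t[:n] for t in rel}
    return set(product(*from_sets)) <= covered

A = {1, 2}
print(is_total({(1, "x"), (2, "x")}, A))   # every member of A is related
print(is_total({(1, "x")}, A))             # 2 is related to nothing

# Generalized: total on A and C needs every <a, c> combination covered.
C = {"p", "q"}
R = {(1, "p", 10), (1, "q", 20), (2, "p", 30)}
print(is_total(R, A, C))                   # <2, "q"> is missing
```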

31 ((Partiality of Table Relations))
• The relation in a table (at a given moment), considered as a relation from any of its attribute value domains to the remaining value domains, will almost always be merely partial. This is simply because it’s highly unlikely that all possible combinations of values from the former collection of value domains will appear in the table!!

32 ((Functions))
• A total functional relation from A to B is called a function from A to B. Each thing in A is related to exactly one thing in B. (But two different things in A can be related to the same thing in B, and not everything in B needs to be related to anything in A. So the inverse relation is not necessarily either functional or total.)
• Caution: every function is also a partial function.

33 ((From Partiality to Totality by Restriction))
• We can always turn a merely-partial R from A to B into a total one by slimming A down enough! Just remove the members of A that aren’t related to anything by R, to get a new set AA. We don’t remove any tuples from R.
• R (as a relation from AA to B) is total on AA. And note that R|AA = R.
• AA is called the domain of R, notated dom(R).
  - Not to be confused with “value domains” of DB entity attributes.
• Can generalize the above to non-binary relations.

34 ((Totality contd. and “Onto”))
• A relation R from A to B is onto if for everything in B there is at least one thing in A that is related by R to it. I.e.:
  - For every member b of B, there is at least one a in A such that <a, b> is in R.
• Onto-ness is just totality in the other direction. You can also say that R is total on B, or that the inverse of R is total.
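For binary relations, the domain, the inverse, and onto-ness as “totality in the other direction” can be sketched like this. All function names and sample data here are made up.

```python
def dom(rel):
    """The members of A that are related to something by rel."""
    return {a for a, _ in rel}

def inverse(rel):
    """Swap the two items of every tuple."""
    return {(b, a) for a, b in rel}

def is_total(rel, A):
    return A <= dom(rel)

def is_onto(rel, B):
    # Onto B is the same as the inverse being total on B.
    return is_total(inverse(rel), B)

A, B = {1, 2, 3}, {"x", "y"}
R = {(1, "x"), (2, "x")}

print(dom(R))
print(is_total(R, A), is_total(R, dom(R)))
print(is_onto(R, B), is_onto(R, {"x"}))
```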

35 ((Other Categories of Relation))
• A relation R from A to B is one-to-one (1-1) if, for any a in A, there is at most one b in B such that <a, b> is in R, AND for any b in B, there is at most one a in A such that <a, b> is in R.
  - That is, both the relation and its inverse from B to A are functional. (But they don’t need to be total.)
  - To put it another way: it is functional and different members of A map to (= are related to) different members of B.
  - Or again: Different members of A map to different members of B and different members of B map to different members of A.

36 ((Other Categories of Relation, contd.))
• A relation R from A to B is many-to-one if it is functional but not one-to-one: i.e., there are different members of A that map to the same member of B, in at least one case.
• A relation R from A to B is one-to-many if it is not functional but its inverse from B to A is functional. That is, there’s a member of A that maps to more than one member of B; but each member of B maps to at most one member of A.
• A relation R from A to B is many-to-many if neither it nor its inverse is functional: i.e., there’s a case of a member of A mapping to more than one member of B, and a case of a member of B mapping to more than one member of A.
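The four categories follow from two yes/no questions: is the relation functional, and is its inverse functional? A sketch for binary relations, with hypothetical function names and sample data:

```python
def is_functional(rel):
    """No first item appears in two different pairs of the set."""
    firsts = [a for a, _ in rel]
    return len(firsts) == len(set(firsts))

def inverse(rel):
    return {(b, a) for a, b in rel}

def classify(rel):
    forward, backward = is_functional(rel), is_functional(inverse(rel))
    if forward and backward:
        return "one-to-one"
    if forward:
        return "many-to-one"
    if backward:
        return "one-to-many"
    return "many-to-many"

print(classify({(1, "x"), (2, "y")}))
print(classify({(1, "x"), (2, "x")}))
print(classify({(1, "x"), (1, "y")}))
print(classify({(1, "x"), (1, "y"), (2, "x")}))
```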

37 ((Relations from Entity Relationships))
• Surprisingly … the concentration on mathematical relations in introductory accounts of “relational” DBs is on a relation as arising from each single table (entity type), despite … the importance of “relationships” between entity types in Entity-Relationship modelling!
• However, between-entity-type relationships also correspond to mathematical relations, distinct from the ones within individual tables.

38 ((Intuitively...))
• Recall that for each entity type there is the set of possible entities of that type (the entity set).
• A “relationship” between two (or more) entity types/sets is a description of the fact that at any given moment the database stores a particular mathematical relation on the entity sets.
• E.g., the EMPLOYED-BY relationship from the People entity type to the Organizations entity type says that the database (at any moment) stores a relation on the People entity set and Organizations entity set.

39 ((Example Continued))
• So at any given moment the relation might be
  { <Person1, Org1>, <Person2, Org1>, <Person3, Org1>, <Person4, Org2>, <Person3, Org2> }
• Each Person… and Org… is an entity represented as a row of the corresponding table … therefore itself mathematically represented as a tuple of attribute values. So <Person1, Org1> could be, in more detail,
  < <E156, ‘Sam’, ‘Finks’, I678>, <I678, ‘IBM’, ‘USA’> >
  Note the nested tuples.

40 ((Bridging Entity Types))
• Recall that bridging entity types are brought in to represent M:N relationships (and similarly M:N:P relationships, etc.)
• People/Organizations again: the relation within the bridging table would look like { <E156, I678>, <E257, I996>, <E714, I678>, … }.
• This relation can also be said to correspond to the original People-Organization relationship, but is abstracted from the above relation by replacing tuples representing entities, such as <E156, ‘Sam’, ‘Finks’, I678>, by the PK values in them, such as E156.
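The abstraction step, from nested entity tuples to pairs of PK values, might be sketched as follows. The first pair uses the Sam Finks / IBM values from the example; the second person and organization rows are invented placeholders, and the PK is assumed to be the first item of each entity tuple.

```python
# Relation on the entity sets: each member pairs a Person tuple with an
# Organization tuple (nested tuples).
employed_by = {
    (("E156", "Sam", "Finks", "I678"), ("I678", "IBM", "USA")),
    # Hypothetical second pair, for illustration only.
    (("E257", "Ann", "Other", "I996"), ("I996", "SomeOrg", "UK")),
}

# Bridging-style relation: replace each entity tuple by its PK value,
# assumed here to be the first item of the tuple.
bridging = {(person[0], org[0]) for person, org in employed_by}

print(bridging)
```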

41 ((Bridging Entity Types (contd.)))
• But now … what about the relationship between the People and Organization entity types and the bridging entity type!! (Exercise)
• And note: We could have chosen to use the bridging-entity-style relation to begin with as our mathematical formulation of the People/Organization relationship.
• A mathematical formulation is not objectively given by the world … it is chosen by us, on the basis of convenience for whatever purposes we have.

42 ((Connectivities))
• If a relationship from an entity type to another is “1:1” then at any moment the actual relation is one-to-one (1-1).
• If the relationship is “1:M” then the relation at any moment may be one-to-many (but may by chance be one-to-one).
• If the relationship is “M:N” then the relation at any moment may be many-to-many (but may by chance be one-to-many, many-to-one or one-to-one).

43 ((Optionality/Mandatoriness))
• If a relationship from an entity type E to another type F is mandatory then the relation at any moment, after restriction to the set of entities currently in E, is total.
• If a relationship from an entity type E to another type F is optional then the relation at any moment is not required to be total in the above restricted sense (but may happen to be).

44 ((Another Caution))
• A one-to-one correspondence between a set A and B is a SPECIAL one-to-one relation from A to B (or B to A): it is not only one-to-one but also TOTAL (on A) and ONTO (B). (Or we can say: total on both A and B.)
• But any 1-1 relation from A to B is a 1-1 correspondence between the subsets of A, B consisting of those members that do happen to feature in the relation!
• A 1-1 relation induced by a single table will almost certainly NOT be a one-to-one correspondence between whole attribute domains!

45 EXAM (May/June; fine detail for Resit may differ) See notes about past exams on the module website.

46 Structure of Exam
• One and a half hours.
• Four questions.
• DO THREE.
• Material on mathematical relations and relational algebra is only used in one of the four Questions. (That Question may also involve other material.)

47 The Remaining Three Questions
• They will range from precise technical things to more general considerations.
• SQL query expressions required in several parts of the three questions. Amount to about 28% of the marks for those three questions. Extra credit of 8% is available in one of the questions for providing SQL create expressions.
• Extra credit of 8% is available in one of the questions for providing creative ERD notation suggestions.

48 Material Needed for Exam
• Anything in the required textbook reading may be useful in the exam, except of course that a detailed memory of specific, data-full examples is not expected, and except for some SQL detail (see next slide).
• You need to study all Additional Notes, except that:
  - The exam will not rely on the treatment of functional dependencies and normalization there (in the 1st of the three parts in the Week 9 batch).
  - The exam will not rely on material on physical design (in the 3rd of the three parts in the Week 9 batch).
• You need to study all Exercise Answer Notes.
• The content of the demonstrators’ lectures on experiences in industry will not be relied upon (but of course may help you in overall understanding of some issues).

49 Textbook Parts (R&C 2009)
• See my module website (top page).
  - On SQL: the exam doesn’t rely on fine detail beyond what’s in the handouts (and occasional lectures).

50 Textbook Chapters (7th Ed)
• Chapters 1-9, except that:
• On SQL (Chs 7, 8): the exam doesn’t rest on fine detail beyond what’s in the handouts (and occasional lecture).
• In Chapter 8: only up to section 8.4 inclusive.
• Chapter 9: note the important concepts of the Systems Development Life Cycle and the Database Life Cycle.

51 Talk by demonstrator on experiences in industry Funmi Faniyi (about 30 mins incl. questions)

