Presentation is loading. Please wait.

Presentation is loading. Please wait.

M.P. Johnson, DBMS, Stern/NYU, Sp20041 C20.0046: Database Management Systems Lecture #5 Matthew P. Johnson Stern School of Business, NYU Spring, 2004.

Similar presentations


Presentation on theme: "M.P. Johnson, DBMS, Stern/NYU, Sp20041 C20.0046: Database Management Systems Lecture #5 Matthew P. Johnson Stern School of Business, NYU Spring, 2004."— Presentation transcript:

1 M.P. Johnson, DBMS, Stern/NYU, Sp20041 C20.0046: Database Management Systems Lecture #5 Matthew P. Johnson Stern School of Business, NYU Spring, 2004

2 M.P. Johnson, DBMS, Stern/NYU, Sp2004 2 Agenda Last time: relational model Homework 1 is up, due next Tuesday This time: 1. Functional dependencies  Keys and superkeys in terms of FDs  Finding keys for relations 2. Rules for combining FDs Next time: anomalies & normalization

3 M.P. Johnson, DBMS, Stern/NYU, Sp2004 3 Review: converting inheritance Movies Cartoons Murder- Mysteries isa Voices Weapon stars length titleyear Lion King Component

4 M.P. Johnson, DBMS, Stern/NYU, Sp2004 4 Review: converting inheritance OO-style: Nulls-style: TitleYearlength Star Wars1980120 TitleYearlengthMurder-Weapon Scream1988110Knife TitleYearlength Lion King1990115 TitleYearlengthMurder-Weapon Roger Rabbit1988110Knife 1. Plain movies3. Murder mysteries 2. Cartoons4. Cartoon murder-mysteries TitleYearlengthMurder-Weapon Star Wars1980120NULL Lion King1993130NULL Scream1988110Knife Roger Rabbit1990115Knife

5 M.P. Johnson, DBMS, Stern/NYU, Sp2004 5 Review: converting inheritance E/R-style Root entity set: Movies(title, year, length) 1301993Lion King 1988 1990 1980 Year 110 115 120 length Roger Rabbit Scream Star Wars Title Knife1990R. Rabbit 1988 Year Knife murderWeapon Scream Title Subclass: MurderMysteries(title, year, murderWeapon) Subclass: Cartoons(title, year) 1993Lion King 1990 Year Roger Rabbit Title

6 M.P. Johnson, DBMS, Stern/NYU, Sp2004 6 E/R-style & quasi-redundancy Name and year of Roger Rabbit were listed in three different rows (in different tables) Suppose title changes (“Roger”  “Roget”)   must change all three places Q: Is this redundancy? A: No!  name and year are independent  multiple movies may have same name Real redundancy reqs. depenency two rows agree on SSN  must agree on rest  conflicting hair colors in these rows is an error two rows agree on movie title  may still disagree  conflicting years may be correct – or may not be Better: introduce “movie-id” key att

7 M.P. Johnson, DBMS, Stern/NYU, Sp2004 7 Combined isa/weak example Exercise 3.3.1  Convert from E/R to R, by E/R, OO and nulls courses Lab- courses Depts Computer- allocation room number givenBy name chair isa

8 M.P. Johnson, DBMS, Stern/NYU, Sp2004 8 Next topic: Functional dependencies (3.4) FDs are constraints  part of the schema  can’t tell from particular relation instances  FD may hold for some instances “accidentally” Finding all FDs is part of DB design  Used in normalization

9 M.P. Johnson, DBMS, Stern/NYU, Sp2004 9 Functional dependencies Definition: Notation: Read: A i functionally determines B j If two tuples agree on the attributes A 1, A 2, …, A n then they must also agree on the attributes B 1, B 2, …, B m A 1, A 2, …, A n  B 1, B 2, …, B m

10 M.P. Johnson, DBMS, Stern/NYU, Sp2004 10 Typical Examples of FDs Product  name  price, manufacturer Person  ssn  name, age  father’s/husband’s-name  last-name  zipcode  state  phone  state (notwithstanding inter-state area codes) Company  name  stockprice, president  symbol  name  name  symbol

11 M.P. Johnson, DBMS, Stern/NYU, Sp2004 11 To check A  B, erase all other columns; for each rows t 1, t 2 i.e., check if remaining relation is many-one  no “divergences”  i.e., if A  B is a well-defined function  thus, functional dependency Functional dependencies BmBm...B1B1 AmAm A1A1 t1t1 t2t2 if t 1, t 2 agree here then t 1, t 2 agree here

12 M.P. Johnson, DBMS, Stern/NYU, Sp2004 12 FDs Example Product(name, category, color, department, price) name  color category  department color, category  price name  color category  department color, category  price Consider these FDs: What do they say ?

13 M.P. Johnson, DBMS, Stern/NYU, Sp2004 13 FDs Example FDs are constraints: On some instances they hold On others they don’t namecategorycolordepartmentprice GizmoGadgetGreenToys49 TweakerGadgetGreenToys99 Does this instance satisfy all the FDs ? name  color category  department color, category  price name  color category  department color, category  price

14 M.P. Johnson, DBMS, Stern/NYU, Sp2004 14 FDs Example namecategorycolordepartmentprice GizmoGadgetGreenToys49 TweakerGadgetBlackToys99 GizmoStationaryGreen Office- supp. 59 What about this one ? name  color category  department color, category  price name  color category  department color, category  price

15 M.P. Johnson, DBMS, Stern/NYU, Sp2004 15 Q: Is Position  Phone an FD here? A: It is for this instance, but no, presumably not in general Others FDs? EmpID  Name, Phone, Position but Phone  Position Recognizing FDs EmpIDNamePhonePosition E0045Smith1234Clerk E1847John9876Salesrep E1111Smith9876Salesrep E9999Mary1234Lawyer

16 M.P. Johnson, DBMS, Stern/NYU, Sp2004 16 Keys of relations {A 1 A 2 A 3 …A n } is a key for relation R if 1. A 1 A 2 A 3 …A n functionally determine all other attributes Usual notation: A 1 A 2 A 3 …A n  B 1 B 2 …B k rels = sets  distinct rows can’t agree on all A i 2. A 1 A 2 A 3 …A n is minimal No proper subset of A 1 A 2 A 3 …A n functionally determines all other attributes of R Primary key: chosen if there are several possible keys

17 M.P. Johnson, DBMS, Stern/NYU, Sp2004 17 Keys example Relation: Student(Name, Address, DoB, Email, Credits) Which (/why) of the following are keys?  Name, Address  Name, SSN  Email, SSN  Email  SSN NB: minimal != smallest

18 M.P. Johnson, DBMS, Stern/NYU, Sp2004 18 Superkeys A set of attributes that contains a key Satisfies first condition:  functionally determines every other attribute in the relation Might not satisfy the second condition: minimality  may be possible to peel away some attributes from the superkey  keys are superkeys key are special case of superkey  superkey set is superset of key set name;ssn is a superkey but not a key

19 M.P. Johnson, DBMS, Stern/NYU, Sp2004 19 Discovering keys for relations Relation  entity set  Key of relation = (minimized) key of entity set Relation  binary relationship  Many-many: union of keys of both entity sets  Many(M)-one(O): only key of M (Why?)  One-one: key of either entity set (but not both!)

20 M.P. Johnson, DBMS, Stern/NYU, Sp2004 20 Example – entity sets Key of entity set = (minimized) key of relation Student(Name, Address, DoB, SSN, Email, Credits) Student Name Address DoB SSN Email Credits

21 M.P. Johnson, DBMS, Stern/NYU, Sp2004 21 Example – many-many Many-many key: union of both ES keys StudentEnrolls Course SSNCredits CourseID Name Enrolls(SSN,CourseID)

22 M.P. Johnson, DBMS, Stern/NYU, Sp2004 22 Example – many-one Key of many ES but not of one ES  keys from both would be non-minimal CourseMeetsIn Room CourseIDName RoomNo Capacity MeetsIn(CourseID,RoomNo)

23 M.P. Johnson, DBMS, Stern/NYU, Sp2004 23 Example – one-one Keys of both ESs included in relation Key is key of either ES (but not both!) HusbandsMarriedWives SSNName SSN Name Married(HSSN, WSSN) or Married(HSSN, WSSN)

24 M.P. Johnson, DBMS, Stern/NYU, Sp2004 24 Discovering keys: multiway Multiway relationships:  Multiple ways – may not be obvious  R:F,G,H  E is many-one  E’s key is included but not part of key Recall that relship atts are implicitly many-one CourseEnrolls Student CourseID Name SSN NameSection RoomNo Capacity Enrolls(CourseID,SSN,RoomNo)

25 M.P. Johnson, DBMS, Stern/NYU, Sp2004 25 Example Exercise 3.4.2 Relation relating particles in a box to locations and velocities InPosition(id,x,y,z,vx,vy,vz) Q: What FDs hold? Q: What are the keys?

26 M.P. Johnson, DBMS, Stern/NYU, Sp2004 26 Combining FDs (3.5) If some FDs are satisfied, then others are satisfied too If all these FDs are true: name  color category  department color, category  price name  color category  department color, category  price Then this FD also holds: name, category  price Why?

27 M.P. Johnson, DBMS, Stern/NYU, Sp2004 27 Rules for FDs Reasoning about FDs: given a set of FDs, infer other FDs – useful E.g. A  B, B  C  A  C Definitions: for FD-sets S and T  T follows from S if all relation-instances satisfying S also satisfy T.  S and T are equivalent if the sets of relation- instances satisfying S and T are the same.  I.e., S and T are equivalent if S follows from T, and T follows from S.

28 M.P. Johnson, DBMS, Stern/NYU, Sp2004 28 Splitting & combining FDs Splitting rule: Combining rule: Note: doesn’t apply to the left side Q: Does it apply to the left side? A 1 A 2 …A n  B 1 B 2 …B m A 1, A 2, …, A n  B 1 A 1, A 2, …, A n  B 2..... A 1, A 2, …, A n  B m A 1, A 2, …, A n  B 1 A 1, A 2, …, A n  B 2..... A 1, A 2, …, A n  B m Bm...B1Am...A1 t1 t2

29 M.P. Johnson, DBMS, Stern/NYU, Sp2004 29 Reflexive rule: trivial FDs FD A 1 A 2 …A n  B 1 B 2 …B k may be  Trivial: Bs are a subset of As  Nontrivial: >=1 of the Bs is not among the As  Completely nontrivial: none of the Bs is among the As Trivial elimination rule:  Eliminate common attributes from Bs, to get an equivalent completely nontrivial FD A 1, A 2, …, A n  A i with i in 1..n is a trivial FD A1A1 …AnAn t t’

30 M.P. Johnson, DBMS, Stern/NYU, Sp2004 30 Transitive rule If and then A 1, A 2, …, A n  B 1, B 2, …, B m B 1, B 2, …, B m  C 1, C 2, …, C p A 1, A 2, …, A n  C 1, C 2, …, C p A1A1 …AmAm B1B1 …BmBm C1C1...CpCp t t’

31 M.P. Johnson, DBMS, Stern/NYU, Sp2004 31 Augmentation rule If then A 1, A 2, …, A n  B A 1, A 2, …, A n, C  B, C, for any C A1A1 …AmAm B1B1 …BmBm C1C1...CpCp t t’

32 M.P. Johnson, DBMS, Stern/NYU, Sp2004 32 Rules summary 1. A  B  AC  B (by definition) 2. Separation/Combination 3. Reflexive 4. Augmentation 5. Transitivity Last 3 called Armstrong’s Axioms Complete: entire closure follows from these Sound: no other FDs follow from these

33 M.P. Johnson, DBMS, Stern/NYU, Sp2004 33 Inferring FDs example Start from the following FDs: Infer the following FDs: 1. name  color 2. category  department 3. color, category  price 1. name  color 2. category  department 3. color, category  price Inferred FD Which Rule did we apply? 4. name, category  name 5. name, category  color 6. name, category  category 7. name, category  color, category 8. name, category  price Reflexive rule Transitivity(4,1) Reflexive rule combine(5,6) or Aug(1) Transitivity(3,7)

34 M.P. Johnson, DBMS, Stern/NYU, Sp2004 34 Problem: infer ALL FDs Given a set of FDs, infer all possible FDs How to proceed? Try all possible FDs, apply all rules  E.g. R(A, B, C, D): how many FDs are possible? Drop trivial FDs, drop augmented FDs  Still way too many Better: use the Closure Algorithm (next)

35 M.P. Johnson, DBMS, Stern/NYU, Sp2004 35 Closure of a set of Attributes Given a set of attributes A 1, …, A n The closure, {A 1, …, A n } + = {B in Atts: A 1, …, A n  B} Given a set of attributes A 1, …, A n The closure, {A 1, …, A n } + = {B in Atts: A 1, …, A n  B} name  color category  department color, category  price name  color category  department color, category  price Example: Closures: {name} + = {name, color} {name, category} + = {name, category, color, department, price} {color} + = {color}

36 M.P. Johnson, DBMS, Stern/NYU, Sp2004 36 Closure Algorithm Start with X={A 1, …, A n }. Repeat: if B 1, …, B n  C is a FD and B 1, …, B n are all in X then add C to X. until X didn’t change Start with X={A 1, …, A n }. Repeat: if B 1, …, B n  C is a FD and B 1, …, B n are all in X then add C to X. until X didn’t change {name, category} + = {name, category, color, department, price} name  color category  department color, category  price name  color category  department color, category  price Example:

37 M.P. Johnson, DBMS, Stern/NYU, Sp2004 37 Example Compute {A,B} + X = {A, B, } Compute {A, F} + X = {A, F, } R(A,B,C,D,E,F) A, B  C A, D  E B  D A, F  B A, B  C A, D  E B  D A, F  B In class:

38 M.P. Johnson, DBMS, Stern/NYU, Sp2004 38 More definitions Given FDs versus Derived FDs  Given FDs: stated initially for the relation  Derived FDs: those that are inferred using the rules or through closure Basis: A set of FDs from which we can infer all other FDs for the relation Minimal basis: a minimal set of FDs (Can’t discard any of them)  No proper subset of the minimal basis can derive the complete set of FDs

39 M.P. Johnson, DBMS, Stern/NYU, Sp2004 39 Problem: find all FDs Given a database instance Find all FD’s satisfied by that instance Useful if we don’t get enough information from our users: need to reverse engineer a data instance Q: How long does this take? A: Some time for each subset of atts Q: How many subsets?  powerset  exponential time in worst-case But can often be smarter…

40 M.P. Johnson, DBMS, Stern/NYU, Sp2004 40 Using Closure to Infer ALL FDs A, B  C A, D  B B  D A, B  C A, D  B B  D Example: Step 1: Compute X +, for every X: A + = A, B + = BD, C + = C, D + = D AB + = ABCD, AC + = AC, AD + = ABCD ABC + = ABD + = ACD + = ABCD (no need to compute– why?) BCD + = BCD, ABCD + = ABCD A + = A, B + = BD, C + = C, D + = D AB + = ABCD, AC + = AC, AD + = ABCD ABC + = ABD + = ACD + = ABCD (no need to compute– why?) BCD + = BCD, ABCD + = ABCD Step 2: Enumerate all FD’s X  Y, s.t. Y  X + and X  Y =  : AB  CD, AD  BC, ABC  D, ABD  C, ACD  B

41 M.P. Johnson, DBMS, Stern/NYU, Sp2004 41 Example R(A,B,C) Each of three determines other two Q: What are the FDs?  Closure of singleton sets  Closure of doubletons Q: What are the keys? Q: What are the minimal bases?

42 M.P. Johnson, DBMS, Stern/NYU, Sp2004 42 Computing Keys Rephrasing of definition of key: X is a key if  X + = all attributes  X is minimal Note: there can be many minimal keys ! Example: R(A,B,C), AB  C, BC  A Minimal keys: AB and BC

43 M.P. Johnson, DBMS, Stern/NYU, Sp2004 43 Examples of Keys Product(name, price, category, color) name, category  price category  color Keys are: {name, category} Enrollment(student, address, course, room, time) student  address room, time  course student, course  room, time Keys are: [in class]


Download ppt "M.P. Johnson, DBMS, Stern/NYU, Sp20041 C20.0046: Database Management Systems Lecture #5 Matthew P. Johnson Stern School of Business, NYU Spring, 2004."

Similar presentations


Ads by Google