Handout 4 Functional Dependencies


1 Handout 4 Functional Dependencies
CIS 550 Handout 4 Functional Dependencies CIS550 Handout 4

2 Why we need relational design theory
We don’t need it to design databases: ER diagrams and related tools are much more understandable and effective. The theory is useful as a check on our designs, to understand certain things that ER diagrams cannot do, and to help us understand the consequences of redundancy (which we may use for efficiency).

3 Not all designs are equally good
Why is this design bad?
Data(Id#, Name, Address, C#, Description, Grade)
And why is this one preferable?
Student(Id#, Name, Address)
Course(C#, Description)
Enrolled(Id#, C#, Grade)

4 An example of “bad” design
Id#: 124, 456, 789
Name: Jones, Smith, Brown
Address: Phila, NYC, Boston
C#: Phil7, Math8, Eng12
Description: Plato, Topology, Chaucer
Grade: A, B, C
Information is given redundantly, e.g. Name and Address. Some information, e.g. course information, depends on the existence of some student.

5 Functional Dependencies
Recall that a key is a set of attribute names. If two tuples agree on a key they agree everywhere, i.e. they are the same. In our “bad” design, if two tuples agree on Id#, they agree on Address, even though they are not the same. We can say “Id# determines Address”, written Id# → Address. This is a functional dependency.

6 Here are some functional dependencies that we expect to hold in our student-course database
Id# → Name, Address
C# → Description
Id#, C# → Grade
Note that any relation (good or bad design) should be constrained by these dependencies. A functional dependency X → Y is simply a pair of sets. Notice the “sloppy” notation A,B → C,D or AB → CD rather than {A, B} → {C, D}.

7 The Meaning of fd’s
Defn. Given a relation scheme R (a set of attributes) and subsets X, Y of R, an instance r of R satisfies X → Y if, for any two tuples t1, t2 in r, t1[X] = t2[X] implies t1[Y] = t2[Y].
N.B. We cannot look at a relation instance to determine which fd’s hold (though we can tell if it doesn’t satisfy an fd).
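The definition can be checked mechanically on a single instance. The sketch below is illustrative only: the function name `satisfies` and the sample rows are assumptions, not part of the handout. It records, for each projection onto X, the projection onto Y seen so far, and reports a violation when two tuples agree on X but differ on Y.

```python
def satisfies(instance, lhs, rhs):
    """Return True if every pair of tuples agreeing on lhs also agrees on rhs."""
    seen = {}  # maps the lhs-projection of a tuple to its rhs-projection
    for t in instance:
        key = tuple(t[a] for a in lhs)
        val = tuple(t[a] for a in rhs)
        # setdefault stores val the first time key is seen, else returns the stored one
        if seen.setdefault(key, val) != val:
            return False
    return True


# Hypothetical instance of Data(Id#, Name, Address, C#, Description, Grade)
rows = [
    {"Id#": 124, "Name": "Jones", "Address": "Phila", "C#": "Phil7",
     "Description": "Plato", "Grade": "A"},
    {"Id#": 124, "Name": "Jones", "Address": "Phila", "C#": "Math8",
     "Description": "Topology", "Grade": "B"},
    {"Id#": 789, "Name": "Brown", "Address": "Boston", "C#": "Math8",
     "Description": "Topology", "Grade": "C"},
]

print(satisfies(rows, ["Id#"], ["Address"]))  # True
print(satisfies(rows, ["C#"], ["Grade"]))     # False
```

A `True` result only says this particular instance does not violate the fd; as the N.B. notes, it cannot establish that the fd holds for the scheme.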

8 Basic Intuition in Relational Design
A database scheme is “good” if all fd’s are of the form K → R where K is a key for R. Example: Our “bad” design is bad because, for example, Id# → Address holds but Id# is not a key for the relation scheme in which these attributes occur. However, it isn’t as simple as this. A → A is a functional dependency for any attribute A. Are all attributes keys??

9 Armstrong’s Axioms
Some fd’s occur as consequences of others. These can be deduced by Armstrong’s axioms:
Reflexivity. If Y ⊆ X then X → Y. (These are called trivial dependencies.) Example: Name, Address → Address
Augmentation. If X → Y then XW → YW. Example: From C# → Description we deduce C#, Id# → Description, Id#
Transitivity. If X → Y and Y → Z then X → Z. Example: From Id#, C# → C# and C# → Description, we deduce Id#, C# → Description

10 Consequences of Armstrong’s Axioms
Union. If X → Y and X → Z then X → YZ.
Pseudotransitivity. If X → Y and WY → Z then XW → Z.
Decomposition. If X → Y and Z ⊆ Y then X → Z.
Prove these from Armstrong’s Axioms.

11 Closure of a set of fd’s
Defn. Let F be a set of fd’s. The closure of F, F+, is the set of fd’s {X → Y | X → Y can be deduced from F by Armstrong’s Axioms}.
Which of the following are in the closure of our Student-Course fd’s?
Address → Address
C# → Description
C# → Description, Name
C#, Id# → Description, Name

12 Equivalence of fd sets
Defn. Two sets of fd’s, F and G, are equivalent if F+ = G+.
Example: {AB → C, A → B} and {A → C, A → B} are equivalent.
F+ contains a huge number of fd’s (exponential in the size of the scheme). One naturally looks for small equivalent fd sets.

13 Minimal Cover
Defn. A fd set F is minimal if
1. Every fd in F is of the form X → A, where A is a (single) attribute,
2. For no X → A in F is F \ {X → A} equivalent to F,
3. For no X → A in F and Z ⊂ X is (F \ {X → A}) ∪ {Z → A} equivalent to F.
Example (from previous slide): {A → C, A → B} is a minimal cover for {AB → C, A → B}.

14 More on closures
Fact. If F is a set of fd’s and X → Y ∉ F+ then there exists an attribute A in Y s.t. X → A ∉ F+.
Proof. Assume otherwise. Let Y = {A1, ..., An}. Then X → A1, ..., X → An are in F+. Therefore, by the union rule, X → A1 ... An is in F+, i.e., X → Y is in F+, a contradiction.
Notation: F(X) for ∪{Y | X → Y ∈ F+}
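F(X) can be computed without enumerating F+, and it gives a practical membership test: X → Y is in F+ exactly when Y ⊆ F(X). The sketch below is one common way to do this, not something the handout prescribes; the names `closure`, `in_closure` and `equivalent` are hypothetical, and fd’s are assumed to be represented as (lhs, rhs) pairs of Python sets.

```python
def closure(attrs, fds):
    """Return F(X): every attribute determined by attrs under fds.

    fds is a list of (lhs, rhs) pairs of attribute-name sets.
    """
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            # Fire an fd once its whole left side is already determined.
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


def in_closure(fds, lhs, rhs):
    """X -> Y is in F+ exactly when Y is a subset of F(X)."""
    return set(rhs) <= closure(lhs, fds)


def equivalent(f, g):
    """F and G are equivalent when each implies every fd of the other."""
    return (all(in_closure(g, l, r) for l, r in f)
            and all(in_closure(f, l, r) for l, r in g))


student_course = [
    ({"Id#"}, {"Name", "Address"}),
    ({"C#"}, {"Description"}),
    ({"Id#", "C#"}, {"Grade"}),
]
print(in_closure(student_course, {"C#"}, {"Description", "Name"}))         # False
print(in_closure(student_course, {"C#", "Id#"}, {"Description", "Name"}))  # True

f = [({"A", "B"}, {"C"}), ({"A"}, {"B"})]
g = [({"A"}, {"C"}), ({"A"}, {"B"})]
print(equivalent(f, g))  # True
```

The loop adds at least one attribute per pass, so it stops after at most as many passes as there are attributes, in contrast to F+ itself, which can be exponentially large.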

15 Why Armstrong’s Axioms?
Why are Armstrong’s axioms (or an equivalent rule set) appropriate for fd’s? They are consistent and complete. “Consistent” means that any relation that satisfies the fd’s in F will satisfy the fd’s in F+. “Complete” means that if an fd X → Y cannot be derived by Armstrong’s axioms from F, then there’s a relational instance satisfying F but not X → Y. In other words, Armstrong’s axioms derive all the fd’s that should hold.

16 Proof of consistency
This comes directly from the definition. Consider augmentation, for example. This says that if X → Y then XW → YW. If a relation instance r satisfies X → Y then for any tuples t1, t2 ∈ r, if t1[X] = t2[X] then t1[Y] = t2[Y]. If, in addition, t1[W] = t2[W] then t1[YW] = t2[YW] (remember that we are using “sloppy” notation: YW for Y ∪ W).

17 Proof of Completeness
To prove completeness we suppose X → Y ∉ F+ and construct a relation instance that satisfies F+ but not X → Y. By our previous result, we know there is an attribute A ∉ X such that X → A ∉ F+. Our relation has 2 tuples. They agree on F(X) but disagree everywhere else.
X         | A    | F(X) \ X  | rest of R
x1 ... xn | a1,1 | v1 ... vm | w1,1 w2,1 ...
x1 ... xn | a1,2 | v1 ... vm | w1,2 w2,2 ...

18 Proof of Completeness cont’d
It is immediate that this relation fails to satisfy X → A and hence X → Y. We also have to check that it does satisfy any fd in F+. The tuples agree on only F(X). Thus the only fd’s that might be violated are of the form X’ → Y’ where X’ ⊆ F(X). But if X’ → Y’ ∈ F+ and X’ ⊆ F(X) then Y’ ⊆ F(X) (reflexivity and augmentation). Therefore X’ → Y’ is satisfied.

19 Decomposition
Consider our attribute set
Data(Id#, Name, Address, C#, Description, Grade)
We could decompose it into
R1(Id#, Name, Address)
R2(C#, Description, Grade)
But this decomposition loses information about the relationship between students and courses. Why?

20 Lossless Join Decomposition
R1, …, Rk is a lossless join decomposition of R with respect to a fd set F if for every instance r of R that satisfies F, π_R1(r) ⋈ … ⋈ π_Rk(r) = r.
Consider the instance with these column values:
Id#: 124, 789
Name: Jones, Brown
Address: Phila, Boston
C#: Phil7, Math8
Description: Plato, Topology
Grade: A, C
What happens if we decompose on (Id#, Name, Address) and (C#, Description, Grade)?
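The effect of a lossy decomposition can be seen by projecting an instance and joining the projections back together. The sketch below assumes a two-tuple instance whose rows pair up the slide’s column values in order (that row alignment is an assumption); the helpers `project`, `natural_join` and `as_set` are hypothetical names written for this illustration.

```python
def project(rows, attrs):
    """Project dict-tuples onto attrs, dropping duplicate tuples."""
    distinct = {tuple((a, row[a]) for a in attrs) for row in rows}
    return [dict(t) for t in distinct]


def natural_join(r1, r2):
    """Join tuples that agree on every attribute the two sides share."""
    joined = []
    for t1 in r1:
        for t2 in r2:
            if all(t1[a] == t2[a] for a in t1.keys() & t2.keys()):
                joined.append({**t1, **t2})
    return joined


def as_set(rows):
    """Order-independent view of a relation, for comparing instances."""
    return {frozenset(row.items()) for row in rows}


# Assumed instance r of Data(Id#, Name, Address, C#, Description, Grade)
r = [
    {"Id#": 124, "Name": "Jones", "Address": "Phila", "C#": "Phil7",
     "Description": "Plato", "Grade": "A"},
    {"Id#": 789, "Name": "Brown", "Address": "Boston", "C#": "Math8",
     "Description": "Topology", "Grade": "C"},
]

student = ["Id#", "Name", "Address"]
bad = natural_join(project(r, student),
                   project(r, ["C#", "Description", "Grade"]))
good = natural_join(project(r, student),
                    project(r, ["Id#", "C#", "Description", "Grade"]))

print(len(bad), as_set(bad) == as_set(r))    # 4 False
print(len(good), as_set(good) == as_set(r))  # 2 True
```

With no shared attribute the join degenerates to a cross product, so every student appears to have taken every course; keeping Id# on both sides restores the original instance.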

21 Testing for lossless join
Fact. R1, R2 is a lossless join decomposition of R with respect to F iff at least one of the following dependencies is in F+:
(R1 ∩ R2) → R1 \ R2
(R1 ∩ R2) → R2 \ R1
Example: WRT the fd set
Id# → Name, Address
C# → Description
Id#, C# → Grade
is (Id#, Name, Address) and (Id#, C#, Description, Grade) a lossless decomposition?
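For a two-way decomposition the test reduces to one attribute-closure computation: the shared attributes must determine everything that is private to one of the two schemes. A minimal sketch, assuming fd’s are (lhs, rhs) pairs of sets; `closure` and `is_lossless` are hypothetical helper names.

```python
def closure(attrs, fds):
    """Attributes determined by attrs under fds (list of (lhs, rhs) set pairs)."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


def is_lossless(r1, r2, fds):
    """True if (r1 & r2) determines r1 - r2 or determines r2 - r1."""
    determined = closure(r1 & r2, fds)
    return (r1 - r2) <= determined or (r2 - r1) <= determined


fds = [
    ({"Id#"}, {"Name", "Address"}),
    ({"C#"}, {"Description"}),
    ({"Id#", "C#"}, {"Grade"}),
]

# Shared attribute Id# determines Name and Address.
print(is_lossless({"Id#", "Name", "Address"},
                  {"Id#", "C#", "Description", "Grade"}, fds))  # True

# No shared attributes at all.
print(is_lossless({"Id#", "Name", "Address"},
                  {"C#", "Description", "Grade"}, fds))         # False
```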

22 Dependency preservation
Suppose we update a relation in a database. Can we easily check whether a fd X → Y is violated? We can if X ∪ Y is contained within the set of attributes of that relation. The projection of an fd set F onto a set of attributes Z, F_Z, is {X → Y | X → Y ∈ F+ and X ∪ Y ⊆ Z}. A decomposition R1, …, Rk is dependency preserving if F+ = (F_R1 ∪ ... ∪ F_Rk)+. This means that the decomposition hasn’t “lost” any essential fd’s.

23 An example
A relation scheme: {Sname, Sadd, City, Zip, Item, Price}
A fd set:
Sname → Sadd, City
Sadd, City → Zip
Sname, Item → Price
Consider the decomposition {Sname, Sadd, City, Zip} and {Sname, Item, Price}. Is it lossless? Is it dependency preserving? What if we replaced the first fd by Sname, Sadd → City?

24 Another example
The scheme: {Student, Teacher, Subject}
The fd set:
Teacher → Subject
Student, Subject → Teacher
The decomposition: {Student, Teacher} and {Teacher, Subject}
Is it lossless? Is it dependency preserving?
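Dependency preservation can be tested without building the projected fd sets explicitly. The sketch below uses a standard technique that the handout does not spell out: to check one fd X → Y, grow X by repeatedly taking, for each scheme Ri, the closure under F of the part of the current set that lies in Ri and keeping only the attributes of Ri. The names `closure`, `preserved` and `preserves_all` are hypothetical.

```python
def closure(attrs, fds):
    """Attributes determined by attrs under fds (list of (lhs, rhs) set pairs)."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


def preserved(lhs, rhs, schemes, fds):
    """True if lhs -> rhs follows from the fd's projected onto the schemes."""
    z = set(lhs)
    changed = True
    while changed:
        changed = False
        for s in schemes:
            # What the fd's visible inside scheme s let us derive from z.
            gained = closure(z & s, fds) & s
            if not gained <= z:
                z |= gained
                changed = True
    return set(rhs) <= z


def preserves_all(schemes, fds):
    return all(preserved(l, r, schemes, fds) for l, r in fds)


teaching = [({"Teacher"}, {"Subject"}), ({"Student", "Subject"}, {"Teacher"})]
print(preserves_all([{"Student", "Teacher"}, {"Teacher", "Subject"}], teaching))  # False

supplier = [
    ({"Sname"}, {"Sadd", "City"}),
    ({"Sadd", "City"}, {"Zip"}),
    ({"Sname", "Item"}, {"Price"}),
]
print(preserves_all([{"Sname", "Sadd", "City", "Zip"},
                     {"Sname", "Item", "Price"}], supplier))  # True
```

In the Student/Teacher/Subject case the fd Student, Subject → Teacher is the one that is lost: neither scheme contains all three attributes, and nothing derivable inside either scheme recovers it.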

25 Fd’s and keys
Earlier we stated that the idea in relational database design (from fd’s) is to obtain a design such that for each nontrivial dependency X → Y, X is a super-key for some relation scheme in R. The last example shows that this cannot always be achieved in a way that preserves dependencies. This leads to two notions of normal forms.

26 Normal forms
Boyce-Codd Normal Form (BCNF). For every relation scheme R and for every X → A that holds over R, either A ∈ X (it is trivial), or X is a superkey for R.
Third Normal Form (3NF). For every relation scheme R and for every X → A that holds over R, either A ∈ X (it is trivial), or X is a superkey for R, or A is a member of some key for R.

27 Normal Forms contd.
BCNF is clearly desirable, but the teacher/student/subject example shows that it is not always obtainable while preserving dependencies. BCNF is stronger than 3NF. There are algorithms to obtain a BCNF lossless join decomposition, and a 3NF lossless join, dependency preserving decomposition. The 3NF algorithm uses a minimal cover.

28 BCNF Decomposition Algorithm
RES := {R}   // R = set of all attributes
while there is a scheme S in RES that is not in BCNF do
begin
    let A → B be a nontrivial functional dependency that holds on S
        such that A → S is not in F+ and A and B are disjoint
    RES := (RES − {S}) ∪ {S − B} ∪ {AB}
end
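The loop above can be sketched in Python. This is a brute-force illustration under stated assumptions, not an efficient or canonical implementation: it searches all attribute subsets of a scheme for a violating left side (exponential in the scheme size), always takes B as large as possible, and the names `closure`, `find_violation` and `bcnf_decompose` are hypothetical.

```python
from itertools import combinations


def closure(attrs, fds):
    """Attributes determined by attrs under fds (list of (lhs, rhs) set pairs)."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


def find_violation(scheme, fds):
    """Return (A, B) with A -> B nontrivial on scheme and A not a superkey, or None."""
    for size in range(1, len(scheme)):
        for combo in combinations(sorted(scheme), size):
            a = set(combo)
            determined = closure(a, fds) & scheme
            b = determined - a  # disjoint from A by construction
            if b and determined != scheme:
                return a, b
    return None


def bcnf_decompose(attrs, fds):
    pending = [frozenset(attrs)]  # RES := {R}
    done = set()
    while pending:
        s = pending.pop()
        violation = find_violation(s, fds)
        if violation is None:
            done.add(s)  # s is already in BCNF
        else:
            a, b = violation
            pending.append(frozenset(s - b))  # S - B
            pending.append(frozenset(a | b))  # AB
    return done


teaching = [({"Teacher"}, {"Subject"}), ({"Student", "Subject"}, {"Teacher"})]
for scheme in bcnf_decompose({"Student", "Teacher", "Subject"}, teaching):
    print(sorted(scheme))
```

On the Student/Teacher/Subject scheme this yields {Student, Teacher} and {Teacher, Subject}. Both pieces of every split are strictly smaller than the scheme they replace, so the loop terminates.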

29 3NF Decomposition Algorithm
let F be a minimal cover
RES := {}
for each A → B in F do
    if none of the schemes in RES contains AB then RES := RES ∪ {AB}
if none of the schemes in RES contains a candidate key for R then
    RES := RES ∪ {any candidate key for R}
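A direct transcription of these steps into Python might look as follows. It is a sketch under assumptions: the input is taken to be a minimal cover already, "contains a candidate key" is tested as "is a superkey" (the closure of the scheme covers all attributes), and `closure`, `candidate_key` and `synthesize_3nf` are hypothetical names.

```python
def closure(attrs, fds):
    """Attributes determined by attrs under fds (list of (lhs, rhs) set pairs)."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


def candidate_key(attrs, fds):
    """Shrink the full attribute set to one candidate key."""
    key = set(attrs)
    for a in sorted(attrs):
        # Drop a if the remaining attributes still determine everything.
        if closure(key - {a}, fds) >= set(attrs):
            key.discard(a)
    return key


def synthesize_3nf(attrs, cover):
    res = []
    for lhs, rhs in cover:
        scheme = lhs | rhs
        if not any(scheme <= s for s in res):
            res.append(scheme)
    # A scheme contains a candidate key exactly when it is a superkey.
    if not any(closure(s, cover) >= set(attrs) for s in res):
        res.append(candidate_key(attrs, cover))
    return res


attrs = {"Sname", "Sadd", "City", "Zip", "Item", "Price"}
cover = [  # the supplier fd's with right-hand sides split into single attributes
    ({"Sname"}, {"Sadd"}),
    ({"Sname"}, {"City"}),
    ({"Sadd", "City"}, {"Zip"}),
    ({"Sname", "Item"}, {"Price"}),
]
for scheme in synthesize_3nf(attrs, cover):
    print(sorted(scheme))
```

Here {Sname, Item, Price} is already a superkey, so no extra key scheme is added. Like the pseudocode, the sketch does not go back and remove a scheme that a later, larger scheme happens to contain.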

