
1 Today’s Topics
– Graded HW1 in Moodle (testbeds used for grading are linked to the class home page)
– HW2 due at 11:55pm tonight (but you can still use your 5 late days)
– HW3 out tomorrow, but not due until THURSDAY, Nov 5 (no later than 11/12); do Problem 1 of HW3 before the midterm
– Midterm next THURSDAY (10/22); reviews 10/20 (me) and 10/21 (Dmitry); Lecture 14 is the end of the midterm material
– Basics of Probability (we’re covering probabilistic reasoning next so that the final programming HW can be undertaken right after the midterm)
  – Random events and random variables
  – Probability distributions
  – Conditional probabilities
  – Some rules of probability
  – Independence of random events
  – Joint probability distributions (our focus)

2 Random Events
Prob(event) or P(event) – the probability that the event will happen
– Eg, prob(fair coin comes up heads) = 0.5
– A random variable represents a random event
Probability Distributions
– Assume an event has k possible disjoint outcomes
– Assign a prob to each outcome, with ∑ p(outcome_i) = 1
– In cs540 we’ll ignore real-valued outcomes (and probability density functions), so we’ll SUM rather than INTEGRATE
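A minimal Python sketch of such a distribution; the color outcomes and numbers below are illustrative assumptions, not class data:

```python
# A discrete distribution as a dict mapping outcome -> probability.
color_dist = {"red": 0.5, "green": 0.3, "blue": 0.2}

# Discrete outcomes, so we SUM (rather than integrate) and expect 1.
assert abs(sum(color_dist.values()) - 1.0) < 1e-9
print(color_dist["red"])  # prob(color = red) -> 0.5
```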

3 Sample Probability Distribution
[Bar chart: probability (0 to 1.0) for each value of COLOR: RED, GREEN, BLUE]
Values of COLOR are assumed disjoint and complete (ie, all objects have one and only one color)

4 What if Values of a Random Variable are NOT Disjoint and Complete?
Assume some var has values x, y, and z, but these are neither disjoint nor complete. Create new values:
a = x ∧ ¬y ∧ ¬z      e = x ∧ ¬y ∧ z
b = ¬x ∧ y ∧ ¬z      f = ¬x ∧ y ∧ z
c = ¬x ∧ ¬y ∧ z      g = ¬x ∧ ¬y ∧ ¬z
d = x ∧ y ∧ ¬z       h = x ∧ y ∧ z
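A small Python sketch of the same construction: enumerate every sign pattern over (x, y, z) to get the eight disjoint, complete regions.

```python
from itertools import product

def label(x, y, z):
    # Format one sign pattern as on the slide, e.g. x ∧ ¬y ∧ ¬z
    return " ∧ ".join(v if b else "¬" + v for v, b in zip("xyz", (x, y, z)))

# Every object falls in exactly one of these 2**3 = 8 regions, so the
# new values are disjoint and complete even though x, y, z were not.
for combo in product([True, False], repeat=3):
    print(label(*combo))
```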

5 Conditional Probs
Prob(A | B) – the probability event A happens given that event B occurred
Eg, prob(you take bus to work) = 0.1
    prob(you take bus to work | it is raining) = 0.9
NOTE: P(A ∧ B | C ∧ B) is shorthand for P( (A ∧ B) | (C ∧ B) )

6 Some Rules of Prob
1) 0 ≤ prob(A) ≤ 1
2) prob(false) = 0 and prob(true) = 1
3) prob(A ∨ B) = prob(A) + prob(B) − prob(A ∧ B)
4) prob(A ∧ B) = prob(A | B) prob(B) = prob(B | A) prob(A)
5) prob(¬A) = 1 − prob(A)
[Venn diagram of overlapping events A and B]

7 Prob(A | B) = Prob(A ∧ B) / Prob(B)
[Venn diagram of overlapping events A and B]
Prob(A) can be small, say 0.10, and Prob(B) can be smaller, say 0.05, yet Prob(A | B) can be large, say, 0.75.
Given we’re in B, what fraction is also in A?
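In code, the definition is a one-line ratio; the P(A ∧ B) value below is an assumed number chosen to reproduce the slide’s 0.75:

```python
p_b = 0.05          # P(B), small
p_a_and_b = 0.0375  # P(A ∧ B); assumed so the example works out
p_a_given_b = p_a_and_b / p_b  # definition: P(A | B) = P(A ∧ B) / P(B)
print(p_a_given_b)  # 0.75, even though P(A) = 0.10 and P(B) = 0.05 are small
```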

8 Notational Complexity
We really should state both variables and their values: P(A = value_A | B = value_B)
Eg, P(color = yellow | size = big)
When we mean ‘whatever value A and B have’ we should say something like P(A = ? | B = ?)
However, this gets complicated, so for clarity we’ll often just write P(A | B) for P(A = ? | B = ?)
In a SPECIFIC calculation with Boolean-valued vars, P(A | B) means P(A = true | B = true)

9 Useful Equation for Manipulating Conditional Probs
P(A ∧ B | C) = P(A | B ∧ C) × P(B | C)
Derivation
1) P(A | B ∧ C) ≡ P(A ∧ B ∧ C) / P(B ∧ C)   // using P(x ∧ y) ≡ P(x | y) × P(y)
2) P(B | C) ≡ P(B ∧ C) / P(C)
3) P(A | B ∧ C) × P(B | C) = P(A ∧ B ∧ C) / P(C)   // combine 1 and 2
4) P(A ∧ B ∧ C) / P(C) ≡ P(A ∧ B | C)
5) P(A ∧ B | C) = P(A | B ∧ C) × P(B | C)   // combine 3 and 4
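A quick Python sanity check of the identity on a randomly generated joint table (a sketch, not lecture code):

```python
import random
from itertools import product

# Random normalized joint distribution over Boolean (A, B, C).
worlds = list(product([True, False], repeat=3))
weights = [random.random() for _ in worlds]
joint = {w: wt / sum(weights) for w, wt in zip(worlds, weights)}

def p(pred):
    """Sum the joint-table cells where the predicate holds."""
    return sum(pr for w, pr in joint.items() if pred(*w))

lhs = p(lambda a, b, c: a and b and c) / p(lambda a, b, c: c)          # P(A ∧ B | C)
rhs = (p(lambda a, b, c: a and b and c) / p(lambda a, b, c: b and c)   # P(A | B ∧ C)
       * p(lambda a, b, c: b and c) / p(lambda a, b, c: c))            # × P(B | C)
assert abs(lhs - rhs) < 1e-9
```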

10 Independence
If A and B are independent events, then
  prob(A | B) = prob(A)   // knowing B happened tells us nothing about A
and
  prob(B | A) = prob(B)   // ditto
So if independence holds (or is assumed),
  prob(A ∧ B) = prob(A) prob(B)
since prob(A ∧ B) = prob(A | B) prob(B)
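A tiny numeric illustration, with marginals chosen (assumed) so that independence holds:

```python
p_a, p_b = 0.3, 0.6            # assumed marginals
p_a_and_b = p_a * p_b          # 0.18, using independence
p_a_given_b = p_a_and_b / p_b  # conditioning recovers the marginal
assert abs(p_a_given_b - p_a) < 1e-9  # P(A | B) = P(A)
```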

11 I have two coins. If I flip them, what is the prob that one comes up heads and the other tails?
I never promised you independent coins!

12 Conditional Independence
If A and B are independent events given C, then
  prob(A ∧ B | C) = prob(A | C) × prob(B | C)
We say “A and B are conditionally independent given C”
A variant, “A is independent of B given C”:
  prob(A | B ∧ C) = prob(A | C)   // assuming we know the value of C, knowing B happened tells us nothing more about A

13 Joint Prob Distributions (probs for compound events)
Assume we have three Boolean-valued random variables, so 2^3 = 8 possible combos

Combination                  Prob of this Combination
A=false, B=false, C=false    0.50
A=false, B=false, C=true     0.20
A=false, B=true,  C=false    0.15
A=false, B=true,  C=true     0.01
A=true,  B=false, C=false    0.02
A=true,  B=false, C=true     0.03
A=true,  B=true,  C=false    0.04
A=true,  B=true,  C=true     ?    [hint: probs sum to 1]
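The same table in Python, with the missing cell recovered from the probs-sum-to-1 hint (a sketch of one reasonable representation):

```python
joint = {  # keyed by (A, B, C) truth values
    (False, False, False): 0.50,
    (False, False, True):  0.20,
    (False, True,  False): 0.15,
    (False, True,  True):  0.01,
    (True,  False, False): 0.02,
    (True,  False, True):  0.03,
    (True,  True,  False): 0.04,
}
# The last cell must make the whole table sum to 1.
joint[(True, True, True)] = 1.0 - sum(joint.values())
print(round(joint[(True, True, True)], 2))  # 0.05
```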

14 Visually – There are 8 Distinct Regions
[Venn diagram of three overlapping events A, B, and C, dividing the space into 8 regions]

15 Calculating Prob of Any Expression
Given any (arbitrarily complex) logical expression, we can calculate its probability by summing the probs in the cells of the joint prob table where the expression is true (see the sketch after the examples on the next slide)
This is our key calculation! The bulk of this topic in cs540 will address making this calculation feasible, because these tables get too big

16 Examples
Here, vars = true unless a NOT sign is present

Combination                  Prob of this Combo
A=false, B=false, C=false    0.50
A=false, B=false, C=true     0.20
A=false, B=true,  C=false    0.15
A=false, B=true,  C=true     0.01
A=true,  B=false, C=false    0.02
A=true,  B=false, C=true     0.03
A=true,  B=true,  C=false    0.04
A=true,  B=true,  C=true     0.05

1) P(A) = ?
2) P(A ∧ ¬C) = ?
3) P(A ∨ B) = ?
4) P(¬C) = ?
5) P(B ∨ (A ∧ C)) = ?

Answers: 1) 0.14  2) 0.06  3) 0.30  4) 0.71  5) 0.28
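A Python sketch of the key calculation: represent each expression as a predicate over a world state and sum the matching cells; it reproduces all five answers.

```python
joint = {  # the slide's table, keyed by (A, B, C)
    (False, False, False): 0.50, (False, False, True): 0.20,
    (False, True,  False): 0.15, (False, True,  True): 0.01,
    (True,  False, False): 0.02, (True,  False, True): 0.03,
    (True,  True,  False): 0.04, (True,  True,  True): 0.05,
}

def prob(expr):
    """Sum probs over all worlds where the logical expression is true."""
    return sum(p for (a, b, c), p in joint.items() if expr(a, b, c))

print(round(prob(lambda a, b, c: a), 2))               # 1) 0.14
print(round(prob(lambda a, b, c: a and not c), 2))     # 2) 0.06
print(round(prob(lambda a, b, c: a or b), 2))          # 3) 0.30
print(round(prob(lambda a, b, c: not c), 2))           # 4) 0.71
print(round(prob(lambda a, b, c: b or (a and c)), 2))  # 5) 0.28
```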

17 Then vs. Now
In the 1980s: from where do the probs come?
Today: we have tons of data! We can (conceptually) fill a huge joint prob table simply by counting, then answer any possible question about the ‘random variables’ in our table!
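A sketch of “filling the table by counting”, over made-up Boolean data rows:

```python
from collections import Counter
from itertools import product

# Hypothetical dataset: each row is an observed (A, B) world state.
data = [(True, False), (True, True), (False, False),
        (True, False), (False, False), (True, True)]

counts = Counter(data)
joint = {w: counts[w] / len(data) for w in product([True, False], repeat=2)}
print(joint[(True, False)])  # P(A=true, B=false) = 2/6 by counting
```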

18 Evidence, Hidden, and Query Vars
In general, we ask about some QUERY variables CONDITIONED on some EVIDENCE variables, but don’t mention some HIDDEN variables
Prob(A ∧ ¬C ∧ E | ¬S ∧ P ∧ Y) = ?
Evidence vars: S, P, and Y
Hidden vars: B, D, F, G, …, N, O, Q, R, T, U, …, X, and Z
Query vars: A, C, and E

19 More Experience with Joint Prob Tables
Recall that a Joint Probability Table specifies the probability of each (discrete) “complete world state”
A “complete world state” specifies the value of each random variable used to represent the world we’re modeling
If there are N variables, each with M possible values, there are M^N different states of the world
One petabyte = 10^15 bytes; if N = 50 and M = 2, then M^N ≈ 10^15

20 Marginalizing (“Summing Out”)
A method to answer questions involving partial world states:
P(Y) = ∑_Z P(Y, Z)   // Eq 13.6; the comma means AND
where Z ranges over all possible ‘conjunctive’ settings for the vars not set by Y
Ex: Assume we have vars A, B, C, and D, and Y = A ∧ ¬C
Then Z ∈ { B ∧ D, B ∧ ¬D, ¬B ∧ D, ¬B ∧ ¬D }
We did this earlier when we added up all cells where Y was true
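A Python sketch of Eq 13.6 on an assumed uniform four-variable table, summing out the vars (B and D) not set by Y = A ∧ ¬C:

```python
from itertools import product

# Assumed uniform joint table over (A, B, C, D) for illustration.
joint = {w: 1 / 16 for w in product([True, False], repeat=4)}

# P(Y) for Y = A ∧ ¬C: marginalize ("sum out") B and D.
p_y = sum(p for (a, b, c, d), p in joint.items() if a and not c)
print(p_y)  # 4 matching cells × 1/16 = 0.25
```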

21 Conditional Forms
P(Y) = ∑_Z P(Y | Z) × P(Z)   // Eq 13.8; uses P(A ∧ B) = P(A | B) × P(B) – a bit like a weighted sum
Called ‘conditioning’:
P(Y | X) ≡ P(Y ∧ X) / P(X) = ∑_Z1 P(Y ∧ X ∧ Z1) / ∑_Z2 P(X ∧ Z2)
where Z1 ranges over all vars not set by Y or X, and Z2 over all vars not set by X
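A sketch of conditioning as a ratio of two marginalizations, again on an assumed uniform table:

```python
from itertools import product

joint = {w: 1 / 8 for w in product([True, False], repeat=3)}  # (A, B, C), uniform

# P(A | B) = Σ_Z1 P(A ∧ B ∧ Z1) / Σ_Z2 P(B ∧ Z2), with Z1 over C and Z2 over A, C.
num = sum(p for (a, b, c), p in joint.items() if a and b)
den = sum(p for (a, b, c), p in joint.items() if b)
print(num / den)  # 0.5 on this uniform table
```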

22 A ‘Weighted Sum’ Example
P(takeBus) = P(takeBus | weather=sunny) × P(w=sunny)
           + P(takeBus | weather=cloudy) × P(w=cloudy)
           + P(takeBus | weather=rainy) × P(w=rainy)
           + P(takeBus | weather=snowy) × P(w=snowy)
The weather values are assumed disjoint and complete
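The same weighted sum in Python; all of the numbers below are assumptions for illustration:

```python
p_weather = {"sunny": 0.5, "cloudy": 0.2, "rainy": 0.2, "snowy": 0.1}     # assumed P(w)
p_bus_given = {"sunny": 0.05, "cloudy": 0.1, "rainy": 0.9, "snowy": 0.8}  # assumed P(takeBus | w)

# Weather values are disjoint and complete, so the weights sum to 1.
p_bus = sum(p_bus_given[w] * p_weather[w] for w in p_weather)
print(p_bus)  # 0.305 with these made-up numbers
```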

23 Some Worked Examples
Assume we have a joint prob table for A, …, E. Express the following using table entries (ie, full world states); here, vars = true unless a NOT sign is present.
P(B ∧ D ∧ E) =
P(B | D ∧ E) =
P(A ∨ C) =
Basic idea
1) create probs that only involve AND and NOT
2) “AND in” the remaining vars in all possible (conjunctive) ways
3) look up fully specified ‘world states’
4) do the arithmetic

24 Solutions for Prev Slide
P(B ∧ D ∧ E) = P(B ∧ D ∧ E ∧ A ∧ C) + P(B ∧ D ∧ E ∧ A ∧ ¬C)
             + P(B ∧ D ∧ E ∧ ¬A ∧ C) + P(B ∧ D ∧ E ∧ ¬A ∧ ¬C)
P(B | D ∧ E) = P(B ∧ D ∧ E) / P(D ∧ E) = …
  – numerator done in the first question
  – denominator involves EIGHT terms
P(A ∨ C) = P(A) + P(C) − P(A ∧ C) = …
  – the first TWO terms each involve summing SIXTEEN terms
  – the last term involves EIGHT
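The “AND in the remaining vars” step can be generated mechanically; a sketch for the first solution:

```python
# Expand P(B ∧ D ∧ E) over the unmentioned vars A and C in all four ways.
for a_sign in ("A", "¬A"):
    for c_sign in ("C", "¬C"):
        print(f"P(B ∧ D ∧ E ∧ {a_sign} ∧ {c_sign})")
# Similarly, P(D ∧ E) expands over A, B, C into EIGHT fully specified terms.
```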

25 Next
What if we have too few examples to sufficiently populate all the cells in the joint table?
What if the joint prob table is too large for memory?
Bayesian Networks (Bayes Nets for short) provide a popular and successful answer

