Basic Probability (most slides borrowed, with permission, from Andrew Moore of CMU and Google)

Presentation transcript:

1 Basic Probability (most slides borrowed, with permission, from Andrew Moore of CMU and Google) http://www.cs.cmu.edu/~awm/tutorials


8 [diagram: event A shown as a region of the space of possible worlds]

9 [diagram: events A and B shown as overlapping regions]

10 Notation Digression
P(A) is shorthand for P(A=true); P(~A) is shorthand for P(A=false).
The same notation applies to other binary RVs: P(Gender=M), P(Gender=F).
It also applies to multivalued RVs: P(Major=history), P(Age=19), P(Q=c).
Note: upper-case letters/names for variables, lower-case letters/names for values.
For multivalued RVs, P(Q) is shorthand for P(Q=q) for some unknown q.
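
To make the notation concrete, here is a minimal Python sketch; the random variables and the probability values are invented for illustration, not taken from the slides:

    # A binary RV represented as a map from values to probabilities.
    P_A = {True: 0.3, False: 0.7}   # P(A) = 0.3, P(~A) = 0.7

    # A multivalued RV works the same way.
    P_Major = {"history": 0.2, "cs": 0.5, "math": 0.3}   # P(Major=history) = 0.2, etc.

    # In both cases the probabilities over all values of the RV sum to 1.
    assert abs(sum(P_A.values()) - 1.0) < 1e-9
    assert abs(sum(P_Major.values()) - 1.0) < 1e-9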


12 Multivalued Random Variables
Suppose A takes exactly one value out of {v_1, v_2, ..., v_k}.
Then P(A=v_i ∧ A=v_j) = 0 for i ≠ j, and P(A=v_1 ∨ A=v_2 ∨ ... ∨ A=v_k) = 1.

13 It follows that P(A=v_1 ∨ ... ∨ A=v_i) = Σ_{j=1..i} P(A=v_j), and in particular Σ_{j=1..k} P(A=v_j) = 1.

14 For any event B, P(B) = Σ_{j=1..k} P(B ∧ A=v_j): integration over nuisance variables (marginalization).
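
The marginalization identity on slide 14 is easy to check numerically. A minimal sketch in Python; the joint distribution below is invented for illustration:

    # Hypothetical joint distribution over A in {v1, v2, v3} and a binary B.
    # The numbers are made up and sum to 1.
    joint = {
        ("v1", True): 0.10, ("v1", False): 0.20,
        ("v2", True): 0.05, ("v2", False): 0.25,
        ("v3", True): 0.15, ("v3", False): 0.25,
    }

    # Integrate out the nuisance variable A: P(B=true) = sum_j P(B=true ∧ A=v_j).
    p_b = sum(p for (a, b), p in joint.items() if b)
    print(p_b)  # 0.30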


20 [Venn diagram: events F and H overlap in region R; Q is the part of F outside H, and S is the part of H outside F]
P(H|F) = R/(Q+R); P(F|H) = R/(S+R)
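
Plugging in numbers makes the two conditional probabilities concrete; the region masses below are invented for illustration:

    Q, R, S = 0.3, 0.1, 0.2     # hypothetical probability masses of the three regions
    print(R / (Q + R))          # P(H|F) = 0.25
    print(R / (S + R))          # P(F|H) = 0.333...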


25 Let's-Make-A-Deal Problem
A game show has 3 doors; behind one door is a prize.
The player chooses a door, say door 1.
The host opens door 2 to reveal no prize.
The host offers the player the opportunity to switch from door 1 to door 3.
Should the player stay with door 1, switch to door 3, or does it not matter?

26 Let's-Make-A-Deal Problem
D = door containing prize; C = contestant's choice; H = door revealed by host.
We want P(D=d | C, H).
By Bayes' rule, P(D=d | C, H) ∝ P(H | D=d, C) P(D=d | C).
Since the prize location is independent of the contestant's choice, P(D=d | C) = P(D=d).
E.g., C=1, H=2:
P(H=2 | C=1, D=1) = 0.5
P(H=2 | C=1, D=2) = 0
P(H=2 | C=1, D=3) = 1
With the prior P(D=d) = 1/3 for each door, normalizing gives P(D=1 | C=1, H=2) = 1/3 and P(D=3 | C=1, H=2) = 2/3, so the player should switch.
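
The 1/3 vs. 2/3 answer is easy to confirm by simulation. This is a minimal sketch, not from the slides, assuming the standard rules: the host always opens a door that the contestant did not pick and that hides no prize:

    import random

    def play(switch, trials=100_000):
        """Estimate the win rate of the stay/switch strategy by simulation."""
        wins = 0
        for _ in range(trials):
            prize = random.randrange(3)
            choice = random.randrange(3)
            # Host opens a door that is neither the choice nor the prize.
            opened = next(d for d in range(3) if d != choice and d != prize)
            if switch:
                # Switch to the remaining unopened door.
                choice = next(d for d in range(3) if d != choice and d != opened)
            wins += (choice == prize)
        return wins / trials

    print(play(switch=False))  # ≈ 1/3
    print(play(switch=True))   # ≈ 2/3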

27 Let's-Make-A-Deal Problem: Lessons
The likelihood function is based on a generative model of the environment (the host's behavior).
Griffiths and Tenenbaum made their own generative assumptions: about how the experimenter would pick an observed age t given a total duration t_total.
Solving a problem involves modeling the generative environment, which is equivalent to learning the joint probability distribution.


37 If you have the joint distribution, you can perform any inference in the domain.
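
As an illustration of slide 37: once the full joint table is in hand, any conditional query reduces to two sums over its entries. A minimal sketch with an invented joint over three binary variables:

    from itertools import product

    # Hypothetical joint P(A, B, C) over three binary variables; numbers are made up.
    joint = {(0,0,0): .15, (0,0,1): .05, (0,1,0): .10, (0,1,1): .10,
             (1,0,0): .05, (1,0,1): .20, (1,1,0): .10, (1,1,1): .25}

    def prob(pred):
        """P(event) = sum of joint entries where the event holds."""
        return sum(p for w, p in joint.items() if pred(w))

    def cond(pred_query, pred_given):
        """P(query | given) = P(query AND given) / P(given)."""
        return prob(lambda w: pred_query(w) and pred_given(w)) / prob(pred_given)

    # Example query: P(A=1 | C=1)
    print(cond(lambda w: w[0] == 1, lambda w: w[2] == 1))  # 0.75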


42 e.g., model of the environment


62 Naïve Bayes Model


66 Bayes Classifier
Assume you want to predict some output Y which has arity n and values y_1, y_2, y_3, ..., y_n.
Assume there are m input attributes, X_1, X_2, X_3, ..., X_m.
For each of the n values y_i, build a density estimator D_i that estimates P(X_1, X_2, X_3, ..., X_m | Y=y_i).
Y_predict = argmax_y P(Y=y | X_1, ..., X_m) = argmax_y P(X_1, ..., X_m | Y=y) P(Y=y)
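
Combining this recipe with the naive Bayes model of slide 62 (attributes treated as conditionally independent given Y), the density estimator D_i factorizes into per-attribute counts. A minimal sketch for discrete attributes; the function names, toy data, and Laplace smoothing constant are my own choices, not from the slides:

    from collections import Counter, defaultdict

    def train(X_rows, y, alpha=1.0):
        """Count-based estimates of P(Y=y) and P(X_j=x | Y=y), Laplace-smoothed."""
        class_counts = Counter(y)              # counts of each class label
        cond_counts = defaultdict(Counter)     # (j, label) -> counts of attribute values
        domains = defaultdict(set)             # j -> set of values seen for attribute j
        for row, label in zip(X_rows, y):
            for j, x in enumerate(row):
                cond_counts[(j, label)][x] += 1
                domains[j].add(x)
        return class_counts, cond_counts, domains, alpha

    def predict(row, model):
        """Y_predict = argmax_y P(Y=y) * prod_j P(X_j=x_j | Y=y)."""
        class_counts, cond_counts, domains, alpha = model
        n = sum(class_counts.values())
        def score(label):
            s = class_counts[label] / n        # prior P(Y=y)
            for j, x in enumerate(row):
                num = cond_counts[(j, label)][x] + alpha
                den = class_counts[label] + alpha * len(domains[j])
                s *= num / den                 # smoothed estimate of P(X_j=x | Y=y)
            return s
        return max(class_counts, key=score)

    model = train([[1, 0], [1, 1], [0, 1], [0, 0]], ["a", "a", "b", "b"])
    print(predict([1, 0], model))  # "a"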


68 Machine Learning vs. Cognitive Science
For machine learning, density estimation is required to do classification, prediction, etc.
For cognitive science, density estimation under a particular model is a theory about what is going on in the brain when an individual learns.

