
Slide 1: Directed Graphical Probabilistic Models: inference. William W. Cohen, Machine Learning 10-601, Feb 2008.

Slide 2: REVIEW OF DIRECTED GRAPHICAL MODELS AND D-SEPARATION

Slide 3: Summary of Monday (1): Bayes nets. Many problems can be solved using the joint probability P(X1,…,Xn). Bayes nets describe a way to write the joint compactly: P(X1,…,Xn) = ∏i P(Xi | parents(Xi)). For the Monty Hall Bayes net (A = first guess, B = the money, C = the goat, D = stick or swap?, E = second guess; C depends on A and B, E depends on A, C, and D), the CPTs partly shown on the slide are:
P(A): A=1: 0.33; A=2: 0.33; A=3: 0.33
P(B): likewise uniform over 1, 2, 3
P(C|A,B): (A=1,B=1,C=2): 0.5; (A=1,B=1,C=3): –; (A=1,B=2,C=3): 1.0; (A=1,B=3,C=2): –; …
P(E|A,C,D): …
Conditional independence: (equation shown on slide).
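For concreteness, here is a minimal sketch (not the lecture's code) of what the factorization buys: computing one entry of the joint as a product of local CPT lookups, using the Monty Hall structure above. The deterministic CPTs for C, D, and E below are illustrative assumptions; only P(A) and P(B) are taken from the slide.

```python
# A minimal sketch of the Bayes-net factorization:
# P(A,B,C,D,E) = P(A) P(B) P(C|A,B) P(D) P(E|A,C,D).
# CPT values below are illustrative; only a few entries appear on the slide.

P_A = {1: 1/3, 2: 1/3, 3: 1/3}                 # first guess (uniform, as on the slide)
P_B = {1: 1/3, 2: 1/3, 3: 1/3}                 # where the money is
P_D = {"stick": 0.5, "swap": 0.5}              # assumed: the player flips a coin

def P_C_given_AB(c, a, b):
    """Goat door revealed: never the guessed door a, never the money door b."""
    allowed = [d for d in (1, 2, 3) if d != a and d != b]
    return 1.0 / len(allowed) if c in allowed else 0.0

def P_E_given_ACD(e, a, c, d):
    """Second guess: keep a if sticking, otherwise take the remaining door."""
    if d == "stick":
        return 1.0 if e == a else 0.0
    swapped = next(x for x in (1, 2, 3) if x != a and x != c)
    return 1.0 if e == swapped else 0.0

def joint(a, b, c, d, e):
    # The whole point of the factorization: one local factor per node.
    return P_A[a] * P_B[b] * P_C_given_AB(c, a, b) * P_D[d] * P_E_given_ACD(e, a, c, d)

print(joint(1, 2, 3, "swap", 2))   # one entry of the full joint
```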

Slide 4: Conditional Independence. The slide states the definition of independence, the definition of conditional independence, and, combining it with a "fancy version of the chain rule" (the definition of conditional probability), the claim: if I⟨X,Z,Y⟩ then P(X | Y, Z) = P(X | Z), so "it's really a lot like independence."
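The equations themselves don't survive in this transcript; a hedged reconstruction of what they presumably state, using the standard definitions:

```latex
% Independence:
P(X, Y) \;=\; P(X)\,P(Y)
% Conditional independence, I\langle X, Z, Y\rangle:
P(X, Y \mid Z) \;=\; P(X \mid Z)\,P(Y \mid Z)
% "Fancy version of the chain rule":
P(X, Y \mid Z) \;=\; P(X \mid Y, Z)\,P(Y \mid Z)
% Hence, if I\langle X, Z, Y\rangle holds, equating the two right-hand sides gives
P(X \mid Y, Z) \;=\; P(X \mid Z)
```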

Slide 5: Summary of Monday (2): d-separation. [Figure: the three canonical configurations of a node Z on a path between X and Y, given evidence E.] There are three ways a path from X to Y given evidence E can be blocked. X is d-separated from Y given E iff all paths from X to Y are blocked given E. If X is d-separated from Y given E, then I⟨X,E,Y⟩, i.e., X is conditionally independent of Y given E. (All the ways paths can be blocked are listed on the slide.)
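As a concrete illustration (my own sketch, not from the lecture), the three blocking rules can be checked mechanically for any single path; the node names and graphs below are made up:

```python
# A path is blocked given evidence E if some interior node Z is
# (a) a chain or fork node that is in E, or
# (b) a collider such that neither Z nor any descendant of Z is in E.

def descendants(node, children):
    """All nodes reachable from `node` by following edges forward."""
    seen, stack = set(), [node]
    while stack:
        for c in children.get(stack.pop(), []):
            if c not in seen:
                seen.add(c); stack.append(c)
    return seen

def path_blocked(path, edges, evidence):
    """path: list of nodes; edges: set of directed (u, v) pairs; evidence: set of nodes."""
    children = {}
    for u, v in edges:
        children.setdefault(u, []).append(v)
    for a, z, c in zip(path, path[1:], path[2:]):
        is_collider = (a, z) in edges and (c, z) in edges
        if is_collider:
            if z not in evidence and not (descendants(z, children) & evidence):
                return True            # collider with no observed descendant blocks
        elif z in evidence:
            return True                # observed chain/fork node blocks
    return False

# Example: the chain X -> E -> Y is blocked once E is observed,
# while the collider X -> E <- Y is *unblocked* by observing E.
print(path_blocked(["X", "E", "Y"], {("X", "E"), ("E", "Y")}, {"E"}))   # True
print(path_blocked(["X", "E", "Y"], {("X", "E"), ("Y", "E")}, {"E"}))   # False
```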

Slide 6: d-separation continued… (chain X → E → Y). Question: is X independent of Y? It depends… on the CPTs shown:
P(X): X=0: 0.5; X=1: –
P(E|X): (X=0,E=0): 0.01; (X=0,E=1): 0.99; (X=1,E=0): –; (X=1,E=1): 0.01
P(Y|E), version 1: (E=0,Y=0): 0.5; remaining entries not shown
P(Y|E), version 2: (E=0,Y=0): 0.01; (E=0,Y=1): 0.99; (E=1,Y=0): –; (E=1,Y=1): 0.01
This is why d-separation implies conditional independence but not the converse…
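A brute-force check of the slide's point (a sketch under the stated assumptions): CPT entries the transcript omits, e.g. P(E=0|X=1), are filled in by assuming each row sums to 1.

```python
# Enumeration on the chain X -> E -> Y: whether X and Y end up independent
# depends on the CPT for P(Y|E), even though d-separation says the path is open.

P_E_given_X = {(0, 0): 0.01, (1, 0): 0.99, (0, 1): 0.99, (1, 1): 0.01}  # keys (e, x)

def p_y_given_x(P_Y_given_E, y, x):
    # P(Y=y | X=x) = sum_e P(E=e | X=x) P(Y=y | E=e); already normalized.
    return sum(P_E_given_X[(e, x)] * P_Y_given_E[(y, e)] for e in (0, 1))

uninformative = {(y, e): 0.5 for y in (0, 1) for e in (0, 1)}            # Y ignores E
informative   = {(0, 0): 0.01, (1, 0): 0.99, (0, 1): 0.99, (1, 1): 0.01}

print(p_y_given_x(uninformative, 1, 0), p_y_given_x(uninformative, 1, 1))  # 0.5 0.5 -> independent
print(p_y_given_x(informative, 1, 0), p_y_given_x(informative, 1, 1))      # 0.0198 0.9802 -> dependent
```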

Slide 7: d-separation continued… (X, E, Y). Question: is … ? Yes!

Slide 8: d-separation continued… (X, E, Y). Question: is … ? Yes! (Derivation on slide: Bayes rule, a fancier version of Bayes rule, and the result from the previous slide.)

Slide 9: d-separation. [Figure: the three canonical configurations of Z between X and Y, with evidence E.]

Slide 10: d-separation. [Same figure, with two of the configurations marked "?".]

Slide 11: d-separation continued… (X, E, Y). Question: is … ? Yes!

Slide 12: d-separation continued… (X, E, Y). Question: is … ? No. P(E): E=0: 0.5; E=1: –

Slide 13: d-separation. [Figure: the three canonical configurations of Z between X and Y, with evidence E, two of them marked "?".]

Slide 14: d-separation continued… (X, E, Y). Question: is … ? Yes!

Slide 15: d-separation continued… (X, E, Y). Question: is … ?
X Y E   P(E|X,Y)   P(E,X,Y)
0 0 0     0.96       0.24
0 0 1     0.04       0.01
0 1 0     0.04       0.01
0 1 1     0.96       0.24
1 0 0     0.04       0.01
1 0 1     0.96       0.24
1 1 0     0.04       0.01
1 1 1     0.96       0.24

Slide 16: d-separation continued… (X, E, Y). Question: is … ? No!
X Y E   P(E|X,Y)   P(E,X,Y)
0 0 0     0.96       0.24
0 0 1     0.04       0.01
0 1 0     0.96       0.01
0 1 1     0.96       0.24
1 0 0     0.04       0.01
1 0 1     0.96       0.24
1 1 0     0.04       0.01
1 1 1     0.96       0.24

Slide 17: d-separation continued… (X, E, Y). Question: is … ? No! (Same table as slide 16.)

Slide 18: "Explaining away" (X, Y, E; answers on slide: NO, YES). This is "explaining away": E is a common symptom of two causes, X and Y. After observing E=1, both X and Y become more probable. After observing E=1 and X=1, Y becomes less probable (compared to just E=1), since X alone is enough to "explain" E=1.
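A numeric sketch of explaining away, reusing the P(E|X,Y) table from the earlier slides and assuming uniform priors on X and Y (the priors are an assumption, not given in the transcript):

```python
# Explaining away on the collider X -> E <- Y.

P_X = {0: 0.5, 1: 0.5}                                   # assumed prior
P_Y = {0: 0.5, 1: 0.5}                                   # assumed prior
P_E1_given_XY = {(0, 0): 0.04, (0, 1): 0.96, (1, 0): 0.96, (1, 1): 0.96}

def posterior_Y1(observe_x=None):
    """P(Y=1 | E=1), or P(Y=1 | E=1, X=observe_x) if observe_x is given."""
    xs = [observe_x] if observe_x is not None else [0, 1]
    num = sum(P_X[x] * P_Y[1] * P_E1_given_XY[(x, 1)] for x in xs)
    den = sum(P_X[x] * P_Y[y] * P_E1_given_XY[(x, y)] for x in xs for y in (0, 1))
    return num / den

print(P_Y[1])            # prior:                     0.5
print(posterior_Y1())    # after seeing E=1:          ~0.66  (Y more probable)
print(posterior_Y1(1))   # after also seeing X=1:     0.5    (Y 'explained away')
```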

Slide 19: INFERENCE IN DGMs (from Russell and Norvig)

Slide 20: Great Ideas in ML: Message Passing. Count the soldiers. [Figure: soldiers marching in a line, passing messages "1 behind you", "2 behind you", …, "5 behind you" in one direction and "1 before you", "2 before you", …, "5 before you" in the other, each adding "there's 1 of me".] (Adapted from MacKay (2003) textbook. Thanks Matt Gormley.)

Slide 21: Great Ideas in ML: Message Passing. Count the soldiers. A soldier who hears "2 before you" and "3 behind you", and who only sees their incoming messages, forms the belief: there must be 2 + 1 + 3 = 6 of us. (Adapted from MacKay (2003) textbook. Thanks Matt Gormley.)

Slide 22: Great Ideas in ML: Message Passing. Count the soldiers. The next soldier, hearing "1 before you" and "4 behind you" and only seeing their incoming messages, reaches the same belief: there must be 1 + 1 + 4 = 6 of us. (Adapted from MacKay (2003) textbook. Thanks Matt Gormley.)

Slide 23: Great Ideas in ML: Message Passing. Each soldier receives reports from all branches of the tree: hearing "7 here" and "3 here" and adding "1 of me", the outgoing report is "11 here (= 7+3+1)". (Adapted from MacKay (2003) textbook.)

Slide 24: Great Ideas in ML: Message Passing. Each soldier receives reports from all branches of the tree: reports of "3 here" combine into "7 here (= 3+3+1)". (Adapted from MacKay (2003) textbook.)

Slide 25: Great Ideas in ML: Message Passing. Each soldier receives reports from all branches of the tree: "7 here", "3 here", and "11 here (= 7+3+1)". (Adapted from MacKay (2003) textbook. Thanks Matt Gormley.)

Slide 26: Great Ideas in ML: Message Passing. Each soldier receives reports from all branches of the tree: hearing "7 here" and "3 here", the soldier's belief is that there must be 14 of us. (Adapted from MacKay (2003) textbook. Thanks Matt Gormley.)

Slide 27: Great Ideas in ML: Message Passing. Each soldier receives reports from all branches of the tree ("7 here", "3 here"; belief: must be 14 of us), but this wouldn't work correctly with a "loopy" (cyclic) graph. (Adapted from MacKay (2003) textbook. Thanks Matt Gormley.)

Slide 28: Message Passing and Inference. Message passing:
– handles the "simple" (and tractable) case of polytrees exactly
– handles intractable cases approximately
– is often an important part of exact algorithms for more complex cases

Slide 29: Message Passing and Counting. Message passing is almost counting. Instead of passing counts, we'll be passing probability distributions (beliefs) of various types.
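A tiny sketch of the counting analogy (my own example tree, not from the lecture): the message a node sends across an edge is the size of its side of the tree, and every node recovers the same total from its incoming messages alone.

```python
# Soldier counting as message passing on a tree.
from functools import lru_cache

tree = {           # undirected adjacency list (an assumed example tree, 6 nodes)
    "a": ["b"], "b": ["a", "c", "d"], "c": ["b"],
    "d": ["b", "e", "f"], "e": ["d"], "f": ["d"],
}

@lru_cache(maxsize=None)
def msg(u, v):
    """Count of nodes on u's side when the edge (u, v) is cut, sent from u to v."""
    return 1 + sum(msg(w, u) for w in tree[u] if w != v)

def belief(v):
    """Each node adds itself to its incoming messages; all nodes agree on the total."""
    return 1 + sum(msg(w, v) for w in tree[v])

print([belief(v) for v in tree])   # every entry is 6
```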

Slide 30: Inference in Bayes Nets: the belief propagation algorithm.

Slide 31: Inference in Bayes nets. General problem: given evidence E1,…,Ek, compute P(X|E1,…,Ek) for any X. Big assumption: the graph is a "polytree", i.e., there is at most one undirected path between any two nodes X, Y. Notation (figure): a node X with parents U1, U2, children Y1, Y2, and the children's other parents Z1, Z2.
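A sketch of how the polytree condition can be checked: the undirected skeleton must be acyclic, which union-find detects. The edge list for the Monty Hall network is read off the earlier slides.

```python
# Polytree test: at most one undirected path between any pair of nodes,
# i.e., the undirected skeleton of the DAG contains no cycle.

def is_polytree(nodes, directed_edges):
    parent = {v: v for v in nodes}
    def find(v):
        while parent[v] != v:
            parent[v] = parent[parent[v]]   # path halving
            v = parent[v]
        return v
    for u, v in directed_edges:
        ru, rv = find(u), find(v)
        if ru == rv:
            return False                    # a second undirected path exists
        parent[ru] = rv
    return True

# Monty Hall net from the slides: C <- {A,B}, E <- {A,C,D}.
monty = [("A", "C"), ("B", "C"), ("A", "E"), ("C", "E"), ("D", "E")]
print(is_polytree("ABCDE", monty))                                        # False: A-C-E is a loop
print(is_polytree("ABCDE", [("A", "C"), ("B", "C"), ("C", "E"), ("D", "E")]))  # True
```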

Slide 32: Great Ideas in ML: Message Passing. Each soldier receives reports from all branches of the tree ("7 here", "3 here"; belief: must be 14 of us), but this wouldn't work correctly with a "loopy" (cyclic) graph. (Adapted from MacKay (2003) textbook. Thanks Matt Gormley.)

Slide 33: Hot* or not? [The Monty Hall network again: A = first guess, B = the money, C = the goat, D = stick or swap?, E = second guess.] *Hot = polytree (at most one undirected path between every pair of nodes).

Slide 34: Inference in Bayes nets: P(X|E). E+: causal support (the evidence reaching X from above, through its parents). E−: evidential support (the evidence reaching X from below, through its children).

Slide 35: Inference in Bayes nets: P(X|E+). (Derivation on slide: apply d-separation, apply d-separation again, and write the result as a product.)

Slide 36: Inference in Bayes nets: P(X|E+). (Steps on slide: a CPT table lookup, plus a recursive call to P(·|E+) for each parent Uj, using the evidence for Uj that doesn't go through X.) So far: a simple way of propagating requests for "belief due to causal evidence" up the tree; i.e., information about Pr(X|E+) flows down.

Slide 37: Inference in Bayes nets: P(X|E).

Slide 38: Inference in Bayes nets: P(E−|X), simplified. (Steps on slide: d-separation plus the polytree assumption, then a recursive call to P(E−|·).) So far: a simple way of propagating requests for "belief due to evidential support" down the tree; i.e., information about Pr(E−|X) flows up.

Slide 39: Inference in Bayes nets: P(E−|X), simplified. (Recursive calls to P(E−|·) and to P(·|E+).) The usual implementation is message passing:
– send values for P(E−|X) up the tree (vertex to parent), waiting for all of a vertex's children's messages before sending
– send values for P(X|E+) down the tree (parent to child), waiting for all of a vertex's parents' messages before sending
– compute P(X|E) after all messages to X are received

Slide 40: Inference in Bayes nets: P(E−|X). (Steps on slide: recursion, using our decomposition and d-separation.)

Slide 41: Inference in Bayes nets: P(E−|X).

Slide 42: Inference in Bayes nets: P(E−|X), where… (steps on slide: a recursive call to P(E−|·), a CPT lookup, and a recursive call to P(·|E+)).

Slide 43: Recap: Message Passing for BP. We reduced P(X|E) to a product of two recursively calculated parts: P(X=x|E+), i.e., the CPT for X combined with the product of "forward" messages from its parents; and P(E−|X=x), i.e., a combination of "backward" messages from its children, CPTs, and terms P(Z|E_{Z\Yk}), each a simpler instance of P(X|E). This can also be implemented by message passing (belief propagation). Messages are distributions, i.e., vectors.

Slide 44: Recap: Message Passing for BP. Top-level algorithm:
– pick one vertex as the "root"; any node with only one edge is a "leaf"
– pass messages from the leaves to the root
– pass messages from the root to the leaves
Now every X has received P(X|E+) and P(E−|X) and can compute P(X|E).
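A minimal sketch of the two-pass idea on the simplest possible polytree, a chain X1 → X2 → X3 with evidence on X3; the CPT values are made up, and the pi/lam names follow the usual causal/evidential decomposition rather than any code from the lecture.

```python
import numpy as np

prior = np.array([0.5, 0.5])                 # P(X1)  (assumed values)
T = np.array([[0.9, 0.1],                    # P(X_{i+1} | X_i), shared CPT (assumed)
              [0.2, 0.8]])
evidence = np.array([0.0, 1.0])              # we observe X3 = 1

# Downward ("causal") pass: pi[i] = P(Xi | evidence above Xi); no evidence above here.
pi = [prior, prior @ T, (prior @ T) @ T]

# Upward ("evidential") pass: lam[i] = P(evidence below Xi | Xi).
lam = [None] * 3
lam[2] = evidence
lam[1] = T @ lam[2]
lam[0] = T @ lam[1]

# Combine and normalize at every node: P(Xi | E) is proportional to pi[i] * lam[i].
for i in range(3):
    b = pi[i] * lam[i]
    print(f"P(X{i+1} | X3=1) =", b / b.sum())
```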

Slide 45: (Figure from Russell and Norvig.)

Slide 46: [Figure annotated with Pr(X2=1|X1=1) and Pr(X3=1|X2=2).] Intuition: roughly, count the total weight of paths from E to X. Evidential support; causal support.

Slide 47: More on message passing/BP. BP for other graphical models: Markov networks; factor graphs. BP for non-polytrees: small tree-width graphs; loopy BP.

Slide 48: Markov blanket. The Markov blanket of a random variable A is the set of variables B1,…,Bk that A is not conditionally independent of (i.e., not d-separated from) given everything else. For a DGM this includes A's parents, children, and "co-parents" (the other parents of A's children). Why the co-parents? Explaining away.
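A small sketch of reading the Markov blanket off a DGM's edge list (the edges are the Monty Hall network from the earlier slides):

```python
# Markov blanket in a DGM: parents, children, and the children's other parents.

def markov_blanket(node, edges):
    parents   = {u for (u, v) in edges if v == node}
    children  = {v for (u, v) in edges if u == node}
    coparents = {u for (u, v) in edges if v in children and u != node}
    return parents | children | coparents

edges = [("A", "C"), ("B", "C"), ("A", "E"), ("C", "E"), ("D", "E")]
print(markov_blanket("C", edges))   # {'A', 'B', 'E', 'D'}: parents A,B; child E; co-parents A,D
```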

Slide 49: Markov network. A Markov network is a set of random variables in an undirected graph where each variable is conditionally independent of all other variables given its neighbors. (E.g.: an independence statement I⟨…⟩ is shown on the slide.)

Slide 50: Markov networks. A Markov network is a set of random variables in an undirected graph where each variable is conditionally independent of all other variables given its neighbors. Instead of CPTs there are clique potentials, e.g. φ(C,E): (0,0): 1; (0,1): 10; (1,0): –; (1,1): 1. So the Markov blanket and d-separation machinery is much simpler, but there's no "generative" story.

Slide 51: Markov networks. Instead of CPTs there are clique potentials, which define a joint: P(x) is proportional to the product, over cliques C, of φC(xC), where xC is the assignment of values to the variables in clique C. Example potentials from the slide: φ(C,E): (0,0): 1; (0,1): 10; (1,0): –; (1,1): 1. φ(A,B,D): (0,0,0): 1; (0,0,1): 3.5; (0,1,0): 2.7; …; (1,1,1): 1.
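A sketch of how clique potentials define a joint, with a toy two-clique network; the potential values are illustrative assumptions, not the (partly unreadable) numbers from the slide.

```python
# P(x) = (1/Z) * prod_C phi_C(x_C), with Z summing the product over all assignments.
from itertools import product

phi_AB = {(0, 0): 1.0, (0, 1): 0.5, (1, 0): 0.5, (1, 1): 3.0}   # clique {A, B}
phi_BC = {(0, 0): 1.0, (0, 1): 10.0, (1, 0): 2.0, (1, 1): 1.0}  # clique {B, C}

def unnormalized(a, b, c):
    return phi_AB[(a, b)] * phi_BC[(b, c)]

Z = sum(unnormalized(a, b, c) for a, b, c in product((0, 1), repeat=3))

def joint(a, b, c):
    return unnormalized(a, b, c) / Z

print(Z, joint(1, 1, 0))
```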

Slide 52: More on message passing/BP. BP can be extended to other graphical models: Markov networks, if they are polytrees; and "factor graphs", a generalization of both Markov networks and DGMs (arguably cleaner than BP on DGMs). BP for non-polytrees: small tree-width graphs; loopy BP.

Slide 53: Small tree-width graphs. The Monty Hall network (A = first guess, B = the money, C = the goat, D = stick or swap?, E = second guess) is not a polytree. Merging A, C, E into a single node ("guesses A, E and the goat C"), with parents B (the money) and D (stick or swap?), gives a polytree; its CPT P(A,C,E | B,D) has one row per joint value, e.g. B=0, D=0, (A,C,E)=(1,1,1): …; B=0, D=0, (A,C,E)=(1,1,2): …; …; B=0, D=0, (A,C,E)=(3,3,3): …; B=0, D=1, (A,C,E)=(1,1,1): …; …

Slide 54: Small tree-width graphs. [Figure: the same two networks as the previous slide, the original (not a polytree) and the merged version (a polytree), with their tables sketched.]

Slide 55: More on message passing/BP. BP can be extended to other graphical models: Markov networks, if they are polytrees; "factor graphs". BP for non-polytrees: small tree-width graphs (convert to polytrees and run normal BP); loopy BP.

Slide 56: Great Ideas in ML: Message Passing. Each soldier receives reports from all branches of the tree ("7 here", "3 here"; belief: must be 14 of us), but this wouldn't work correctly with a "loopy" (cyclic) graph. (Adapted from MacKay (2003) textbook. Thanks Matt Gormley.)

Slide 57: Recap: Loopy BP. Top-level algorithm:
– Initialize every X's messages P(X|E+) and P(E−|X) to vectors of all 1's.
– Repeat: have every X send its children/parents messages (which will be incorrect at first); for every X, update P(X|E+) and P(E−|X) based on the last messages (also incorrect at first).
In a tree this will eventually converge to the right values. In a graph with loops it might converge; it's non-trivial to predict when and if, but it's often a good approximation.
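A sketch of the loopy-BP recipe on a pairwise Markov network rather than a DGM (a closely related formulation): initialize all messages to ones, keep re-sending for a fixed number of sweeps, and read off approximate beliefs. All potentials below are made up.

```python
import numpy as np

psi_node = {i: np.array([1.0, 2.0]) for i in range(3)}          # unary potentials
psi_edge = {e: np.array([[2.0, 1.0], [1.0, 2.0]])               # pairwise potentials
            for e in [(0, 1), (1, 2), (0, 2)]}                  # a 3-cycle: loopy!
neighbors = {0: [1, 2], 1: [0, 2], 2: [0, 1]}

def edge_pot(i, j):
    """Potential indexed as [x_i, x_j], whichever way the edge was stored."""
    return psi_edge[(i, j)] if (i, j) in psi_edge else psi_edge[(j, i)].T

msgs = {(i, j): np.ones(2) for i in neighbors for j in neighbors[i]}   # all-ones start

for _ in range(50):                                              # fixed number of sweeps
    new = {}
    for (i, j) in msgs:
        # product of node potential and all incoming messages except the one from j
        prod = psi_node[i] * np.prod([msgs[(k, i)] for k in neighbors[i] if k != j], axis=0)
        m = edge_pot(i, j).T @ prod                              # sum over x_i
        new[(i, j)] = m / m.sum()                                # normalize for stability
    msgs = new

for i in neighbors:
    b = psi_node[i] * np.prod([msgs[(k, i)] for k in neighbors[i]], axis=0)
    print(f"approximate belief at node {i}:", b / b.sum())
```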

