
1 Carnegie Mellon. Focused Belief Propagation for Query-Specific Inference. Anton Chechetka, Carlos Guestrin. 14 May 2010

2 Query-specific inference problem. In a large model, part of the state is known (evidence), a few variables form the query, and the rest is not interesting. Goal: use information about the query to speed up convergence of belief propagation for the query marginals.

3 Graphical models. Talk focus: pairwise Markov random fields (the paper handles arbitrary factor graphs, which are more general). The joint probability is a product of local potentials, $P(X) \propto \prod_{(i,j)} \psi_{ij}(x_i, x_j)$. The graphical structure gives a compact representation, but exact inference is intractable; approximate inference often works well in practice.
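To make the later sketches concrete, here is a toy pairwise MRF in Python; the 3-variable chain, the binary state space, and the potential values are invented for illustration and are not from the talk.

```python
# A tiny pairwise MRF in the layout used by the sketches below: an adjacency
# dict plus one potential table per directed node pair.
import numpy as np

neighbors = {0: [1], 1: [0, 2], 2: [1]}      # a 3-variable chain
psi = np.array([[2.0, 1.0],
                [1.0, 2.0]])                 # favors agreement of neighbors
potentials = {}
for i in neighbors:
    for j in neighbors[i]:
        potentials[(i, j)] = psi             # psi[(i, j)][x_i, x_j]; symmetric here
```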

4 (Loopy) belief propagation. Messages are passed along the edges of the graph. Variable belief: $b_i(x_i) \propto \prod_{j \in \Gamma(i)} m_{j\to i}(x_i)$. Update rule: $m_{j\to i}(x_i) \propto \sum_{x_j} \psi_{ij}(x_i, x_j) \prod_{k \in \Gamma(j) \setminus i} m_{k\to j}(x_j)$. Result: all single-variable beliefs.

5 (Loopy) belief propagation. The update rule for $m_{j\to i}$ depends only on the messages into $j$, so message dependencies are local. Round-robin schedule: fix a message order and apply updates in that order until convergence.
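A minimal Python sketch of slides 4-5, assuming the toy data layout above (`neighbors` adjacency dict, `potentials[(i, j)][x_i, x_j]`); this illustrates standard loopy BP with a round-robin schedule and is not the authors' implementation.

```python
import numpy as np

def bp_round_robin(neighbors, potentials, n_states, max_iters=100, tol=1e-6):
    # One message per directed edge j -> i, initialized uniform.
    msgs = {(j, i): np.full(n_states, 1.0 / n_states)
            for j in neighbors for i in neighbors[j]}
    order = list(msgs)                        # the fixed round-robin order
    for _ in range(max_iters):
        max_change = 0.0
        for (j, i) in order:
            # m_{j->i}(x_i) ∝ sum_{x_j} psi(x_i, x_j) * prod_{k in Γ(j)\i} m_{k->j}(x_j)
            prod = np.ones(n_states)
            for k in neighbors[j]:
                if k != i:
                    prod = prod * msgs[(k, j)]
            new = potentials[(i, j)] @ prod   # sums over x_j
            new = new / new.sum()
            max_change = max(max_change, float(np.abs(new - msgs[(j, i)]).max()))
            msgs[(j, i)] = new
        if max_change < tol:                  # a full sweep changed nothing: done
            break
    # Variable beliefs: b_i(x_i) ∝ prod_{j in Γ(i)} m_{j->i}(x_i)
    beliefs = {}
    for i in neighbors:
        b = np.ones(n_states)
        for j in neighbors[i]:
            b = b * msgs[(j, i)]
        beliefs[i] = b / b.sum()
    return beliefs
```

On the toy chain above, `bp_round_robin(neighbors, potentials, n_states=2)` returns exact marginals at convergence, since a chain is a tree.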

6 Dynamic update prioritization. A fixed update sequence is not the best option; dynamic update scheduling can speed up convergence. Examples: Tree-Reweighted BP [Wainwright et al., AISTATS 2003], Residual BP [Elidan et al., UAI 2006]. Residual BP: apply the largest change first. A large pending change is an informative update; a small one is wasted computation.

7 Residual BP [Elidan et al., UAI 2006]. The residual of an edge is the size of its pending update, $r_{j\to i} = \|m^{new}_{j\to i} - m^{old}_{j\to i}\|$. Repeatedly pick the edge with the largest residual and apply its update. This spends more effort on the difficult parts of the model, but it knows nothing about the query.
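A hedged sketch of the residual BP scheduling loop: a priority queue keyed by residual, with only the dependent messages re-queued after each update. `compute_message(j, i, msgs)` is an assumed helper returning the would-be new message; stale queue entries are tolerated for simplicity (a popped stale update is still a valid, if redundant, BP update).

```python
import heapq
import numpy as np

def residual_bp(edges, neighbors, compute_message, msgs, max_updates=10**6, tol=1e-6):
    # Initialize the queue with every directed edge's residual.
    heap = []
    for (j, i) in edges:
        r = float(np.abs(compute_message(j, i, msgs) - msgs[(j, i)]).max())
        heapq.heappush(heap, (-r, (j, i)))    # negate: heapq is a min-heap
    for _ in range(max_updates):
        if not heap:
            break
        neg_r, (j, i) = heapq.heappop(heap)
        if -neg_r < tol:
            break                             # largest residual is tiny: converged
        msgs[(j, i)] = compute_message(j, i, msgs)   # apply the largest change first
        # Only messages out of i (except back to j) depend on m_{j->i}:
        # recompute their residuals and re-queue them.
        for k in neighbors[i]:
            if k != j:
                r = float(np.abs(compute_message(i, k, msgs) - msgs[(i, k)]).max())
                heapq.heappush(heap, (-r, (i, k)))
    return msgs
```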

8 Our contributions: using weighted residuals to prioritize updates; defining message weights that reflect the importance of each message to the query; computing the importance weights efficiently; experiments showing faster convergence on large relational models.

9 Why edge importance weights? A residual BP update can have no influence on the query, which is wasted computation. Given two edges where residual(a) < residual(b), which one should be updated? We want the one that will influence the query in the future. Residual BP maximizes the immediate residual reduction; our work maximizes the approximate eventual effect on P(query).

10 Query-Specific BP. Pick the edge with the largest weighted residual $A_{j\to i} \cdot \|m^{new}_{j\to i} - m^{old}_{j\to i}\|$ and apply the update. The edge importance weight $A_{j\to i}$ is the only change relative to residual BP. Rest of the talk: defining and computing edge importance.
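Relative to the residual BP sketch above, the only change is the priority function; `A` is an (assumed precomputed) dict of importance weights:

```python
import numpy as np

def weighted_residual(A, j, i, msgs, compute_message):
    """Priority of edge j -> i in query-specific BP: importance times residual."""
    return A[(j, i)] * float(np.abs(compute_message(j, i, msgs) - msgs[(j, i)]).max())
```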

11 Edge importance: base case. We want to pick the edge whose update has the largest approximate eventual effect on P(Q). Base case: an edge $j\to i$ directly connected to the query. The change in the query belief is bounded by the change in the message, $\|P^{new}(Q) - P^{old}(Q)\| \le \|m^{new}_{j\to i} - m^{old}_{j\to i}\|$, and the bound is tight, so $A_{j\to i} = 1$.

12 mjimji sup mrjmrj Edge one step away from the query: A r  j =?? mjimji sup mrjmrj Edge importance one step away query r sj k i h u ||  P(Q)||||  m ||  change in query belief change in message  can compute in closed form looking at only f ji [Mooij, Kappen; 2007] message importance jiji rjrj ||  m ||  over values of all other messages

13 Edge importance: general case. Recap: base case $A_{j\to i} = 1$; one step away, $A_{r\to j} = \sup \|\partial m_{j\to i} / \partial m_{r\to j}\|$. The direct generalization, bounding $\|\Delta P(Q)\| \le \|\Delta m_{s\to h}\| \cdot \sup \|\partial P(Q) / \partial m_{s\to h}\|$, is expensive to compute and the bound may be infinite. Instead, chain the one-step terms along a path: $\mathrm{sensitivity}(\pi)$, a product of per-edge suprema, is the maximal impact along the path $\pi$.

14 Edge importance: general case. For example, $\mathrm{sensitivity}(h\to r\to j\to i) = \sup\|\partial m_{r\to j}/\partial m_{h\to r}\| \cdot \sup\|\partial m_{j\to i}/\partial m_{r\to j}\|$. Define $A_{s\to h} = \max_{\pi} \mathrm{sensitivity}(\pi)$ over all paths $\pi$ from $h$ to the query. But there are a lot of paths in a graph; trying out every one is intractable.

15 Efficient edge importance computation. The sensitivity of a path decomposes into individual edge contributions, e.g. $\mathrm{sensitivity}(h\to r\to j\to i) = \sup\|\partial m_{h\to r}/\partial m_{s\to h}\| \cdot \sup\|\partial m_{r\to j}/\partial m_{h\to r}\| \cdot \sup\|\partial m_{j\to i}/\partial m_{r\to j}\|$. Each factor is always $\le 1$, so sensitivity always decreases as the path grows. Therefore Dijkstra's (shortest paths) algorithm will efficiently find max-sensitivity paths for every edge.
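A minimal sketch of this step, under two simplifying assumptions: the per-edge sensitivities come precomputed in a dict `edge_sensitivity`, and importance is tracked per node rather than per directed message as in the paper. Because every factor is at most 1, extending a path never increases its sensitivity, which is exactly the monotonicity Dijkstra's algorithm needs (equivalently, one could run standard Dijkstra on weights $-\log \mathrm{sensitivity} \ge 0$).

```python
import heapq

def importance_weights(neighbors, edge_sensitivity, query):
    """A[v] = max over paths from v to `query` of the product of edge sensitivities."""
    A = {query: 1.0}
    heap = [(-1.0, query)]                  # max-heap on sensitivity via negation
    done = set()
    while heap:
        neg_s, j = heapq.heappop(heap)
        if j in done:
            continue                        # stale entry for an already-settled node
        done.add(j)
        for i in neighbors[j]:
            # Extend the best path to j backwards across the edge i -> j.
            cand = A[j] * edge_sensitivity[(i, j)]
            if cand > A.get(i, 0.0):
                A[i] = cand
                heapq.heappush(heap, (-cand, i))
    return A
```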

16 Query-Specific BP. $A_{j\to i} = \max_{\pi} \mathrm{sensitivity}(\pi)$ over all paths $\pi$ from $i$ to the query. Algorithm: run Dijkstra's algorithm starting at the query to get the edge weights, then repeatedly pick the edge with the largest weighted residual and update it. More effort goes to the difficult parts of the model, taking into account not only the graphical structure but also the strength of dependencies and relevance to the query.

17 Outline: using weighted residuals to prioritize updates; defining message weights reflecting the importance of the message to the query; computing importance weights efficiently, as an initialization step before residual BP; restoring the anytime behavior of residual BP; experiments: faster convergence on large relational models.

18 Big picture. Before: residual BP refines all marginals from the start, so it is anytime. After: query-specific BP must run Dijkstra's to completion before BP begins refining the query marginals, and this long initialization breaks the anytime property.

19 Interleaving BP and Dijkstra's. Dijkstra's expands the highest-weight edges first, so it can be paused at any time to yield the currently most relevant submodel around the query. Alternate: run Dijkstra's a little, run BP on the expanded submodel, resume Dijkstra's, and so on toward the full model. Question: when should BP stop and Dijkstra's resume?

20 Interleaving. Dijkstra's expands edges in order of decreasing importance, so every not-yet-expanded edge has importance at most $\min_{\text{expanded edges}} A$. Suppose every residual is bounded by $M$; then $M \cdot \min_{\text{expanded}} A$ is an upper bound on the priority of any unexpanded edge. As long as the actual priority of the best expanded edge is at least this bound, there is no need to expand further at this point.

21 Anytime query-specific BP: interleave Dijkstra's algorithm with BP updates, expanding the submodel only when the bound above requires it. Anytime QSBP performs the same BP update sequence as query-specific BP!
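A hedged sketch of the interleaving loop of slides 19-21; the `dijkstra` and `bp_queue` objects and all their method names are assumed interfaces, not the authors' API.

```python
# `dijkstra` yields edges in decreasing importance; `bp_queue` is a
# weighted-residual priority queue over the expanded submodel; M bounds
# every residual.
def anytime_qsbp(dijkstra, bp_queue, M, n_updates):
    for _ in range(n_updates):
        # Any unexpanded edge has importance <= the smallest importance
        # expanded so far (Dijkstra order) and residual <= M, so its
        # priority is at most M * min_expanded_importance. Expand until the
        # best expanded edge provably beats everything outside the submodel.
        while (not dijkstra.exhausted()
               and bp_queue.best_priority() < M * dijkstra.min_expanded_importance()):
            edge, importance = dijkstra.expand_next()   # grow the relevant submodel
            bp_queue.add(edge, importance)
        bp_queue.apply_best_update()   # same update sequence as plain QSBP
```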

22 Experiments: the setup. Relational model: semi-supervised people recognition in a collection of images [with Denver Dash and Matthai Philipose @ Intel]. One variable per person; agreement potentials between people who look alike; disagreement potentials between people in the same image. 2K variables, 900K factors. Error measure: KL divergence from the fixed point of BP.

23 Experiments: convergence speed. Figure: convergence curves (lower is better) for Residual BP, Query-Specific BP, and Anytime Query-Specific BP (with Dijkstra's interleaving).

24 Conclusions. Prioritize updates by weighted residuals, which takes query information into account. Importance weights depend on both the graphical structure and the strength of local dependencies, and can be computed efficiently. Result: much faster convergence on large relational models. Thank you!

