Variable Elimination – Graphical Models 10708 – Carlos Guestrin, Carnegie Mellon University, October 11th, 2006. Readings: K&F: 8.1, 8.2, 8.3, 8.7.1.



General probabilistic inference
Query: P(X | e)
Using the definition of conditional probability, then normalization (the slide's equations are sketched below).
[Figure: BN over Flu, Allergy, Sinus, Headache, Nose]
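A sketch of the query equations the slide refers to (they are not reproduced in the transcript), written for a query variable X, evidence e, and remaining hidden variables y:

```latex
P(X \mid e) \;=\; \frac{P(X, e)}{P(e)}
            \;=\; \frac{P(X, e)}{\sum_{x} P(x, e)},
\qquad
P(X, e) \;=\; \sum_{\mathbf{y}} P(X, \mathbf{y}, e).
```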

Marginalization
[Equation not reproduced in the transcript: summing a variable (here Sinus) out of a factor over Flu, Sinus, Nose=t; see the worked form below.]
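A hedged reconstruction of the marginalization step in the running network (the exact factor on the slide is not in the transcript, but summing out Sinus is the natural step here):

```latex
g(F, N{=}t) \;=\; \sum_{s} f(F, s, N{=}t),
\qquad\text{e.g.}\quad
P(F, N{=}t) \;=\; \sum_{a}\sum_{s} P(F)\,P(a)\,P(s \mid F, a)\,P(N{=}t \mid s).
```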

Probabilistic inference example
[Figure: BN over Flu, Allergy, Sinus, Headache; evidence Nose=t]
Inference seems exponential in the number of variables!

Fast probabilistic inference example – variable elimination
[Figure: BN over Flu, Allergy, Sinus, Headache; evidence Nose=t]
(Potential for) exponential reduction in computation!

Understanding variable elimination – exploiting distributivity
[Equations not reproduced in the transcript: the sum over Sinus is pushed inside the product of the factors over Flu, Sinus, Nose=t; a worked instance follows below.]
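A worked instance of the distributivity step for this network (a hedged reconstruction, since the slide's derivation is not in the transcript):

```latex
P(F, N{=}t)
  \;=\; \sum_{a}\sum_{s} P(F)\,P(a)\,P(s \mid F, a)\,P(N{=}t \mid s)
  \;=\; P(F) \sum_{a} P(a) \underbrace{\sum_{s} P(s \mid F, a)\,P(N{=}t \mid s)}_{g_1(F,\,a)} .
```

Pushing the sum over Sinus inside means the innermost computation only ever touches factors over (Flu, Allergy, Sinus), never the full joint.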

Understanding variable elimination – order can make a HUGE difference
[Figure: BN over Flu, Allergy, Sinus, Headache; evidence Nose=t]

Understanding variable elimination – intermediate results
[Figure: BN over Flu, Allergy, Sinus, Headache; evidence Nose=t]
Intermediate results are probability distributions.

Understanding variable elimination – another example
[Figure: BN over Pharmacy, Sinus, Headache; evidence Nose=t]

Pruning irrelevant variables
[Figure: BN over Flu, Allergy, Sinus, Headache; evidence Nose=t]
Prune all non-ancestors of the query variables.
More generally: prune all nodes not on an active trail between the evidence and the query variables.

Variable elimination algorithm
Given a BN and a query P(X | e) ∝ P(X, e):
- Instantiate the evidence e
- Prune non-active variables for {X, e}
- Choose an ordering on the variables, e.g., X_1, …, X_n
- Initial factors {f_1, …, f_n}: f_i = P(X_i | Pa_{X_i}) (the CPT for X_i)
- For i = 1 to n, if X_i ∉ {X, E}:
  - Collect the factors f_1, …, f_k that include X_i
  - Generate a new factor by eliminating X_i from these factors
  - Variable X_i has been eliminated!
- Normalize P(X, e) to obtain P(X | e) – IMPORTANT!!!
(A runnable sketch of this procedure follows below.)
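A minimal runnable sketch of the algorithm, assuming discrete factors stored as dictionaries from assignment tuples to values. The `Factor` class, its methods, and the variable names are illustrative choices, not the lecture's code:

```python
from itertools import product

class Factor:
    """A discrete factor: a table over a tuple of variable names."""

    def __init__(self, variables, table):
        self.variables = tuple(variables)   # e.g. ("Sinus", "Nose")
        self.table = dict(table)            # {assignment tuple: value}

    def multiply(self, other):
        """Pointwise product over the union of the two scopes."""
        vars_out = self.variables + tuple(v for v in other.variables
                                          if v not in self.variables)
        domains = _domains([self, other])
        table = {}
        for assignment in product(*(domains[v] for v in vars_out)):
            a = dict(zip(vars_out, assignment))
            table[assignment] = self._value(a) * other._value(a)
        return Factor(vars_out, table)

    def marginalize(self, var):
        """Sum `var` out of the factor."""
        vars_out = tuple(v for v in self.variables if v != var)
        table = {}
        for assignment, value in self.table.items():
            key = tuple(x for v, x in zip(self.variables, assignment) if v != var)
            table[key] = table.get(key, 0.0) + value
        return Factor(vars_out, table)

    def restrict(self, evidence):
        """Drop rows inconsistent with the evidence dict (instantiating evidence)."""
        keep = {a: v for a, v in self.table.items()
                if all(evidence.get(var, x) == x
                       for var, x in zip(self.variables, a))}
        return Factor(self.variables, keep)

    def _value(self, assignment):
        key = tuple(assignment[v] for v in self.variables)
        return self.table.get(key, 0.0)     # 0 for rows removed by evidence

def _domains(factors):
    """Collect each variable's observed domain from the factor tables."""
    doms = {}
    for f in factors:
        for a in f.table:
            for v, x in zip(f.variables, a):
                doms.setdefault(v, set()).add(x)
    return {v: sorted(d) for v, d in doms.items()}

def variable_elimination(factors, query_var, evidence, order):
    """Compute P(query_var | evidence) by eliminating hidden variables in `order`."""
    factors = [f.restrict(evidence) for f in factors]    # instantiate evidence
    for var in order:
        if var == query_var or var in evidence:
            continue                                     # only eliminate hidden vars
        involved = [f for f in factors if var in f.variables]
        if not involved:
            continue
        rest = [f for f in factors if var not in f.variables]
        new_factor = involved[0]
        for f in involved[1:]:
            new_factor = new_factor.multiply(f)
        factors = rest + [new_factor.marginalize(var)]   # var has been eliminated
    result = factors[0]
    for f in factors[1:]:
        result = result.multiply(f)
    z = sum(result.table.values())                       # normalize P(X, e) -> P(X | e)
    return {a: v / z for a, v in result.table.items()}
```

With the five CPTs of the running network wrapped as `Factor` objects, a call like `variable_elimination(cpts, "Flu", {"Nose": "t"}, order=["Allergy", "Sinus", "Headache"])` would return P(Flu | Nose=t), normalized over the entries of the final factor; the dictionary representation is chosen for readability rather than speed.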

Operations on factors – multiplication
[Figure: BN over Flu, Allergy, Sinus, Headache; evidence Nose=t. The example factor tables are not reproduced in the transcript.]

Operations on factors – marginalization
[Figure: BN over Flu, Allergy, Sinus, Headache; evidence Nose=t. The example factor tables are not reproduced in the transcript; the two operations are written out below.]
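Written out, the two operations are the standard ones (consistent with the sketch above; the slide's concrete tables are not in the transcript):

```latex
\text{Product:}\qquad (f_1 \cdot f_2)(\mathbf{X}, \mathbf{Y}, \mathbf{Z})
  \;=\; f_1(\mathbf{X}, \mathbf{Y})\, f_2(\mathbf{Y}, \mathbf{Z}),
\qquad
\text{Marginalization:}\qquad g(\mathbf{Y}) \;=\; \sum_{x} f(x, \mathbf{Y}).
```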

Complexity of VE – a first analysis
Number of multiplications: [not reproduced in the transcript]
Number of additions: [not reproduced in the transcript]

Complexity of variable elimination – (poly)tree graphs
Variable elimination order: start from the "leaves" inwards, working from the skeleton:
- Choose a "root" (any node)
- Find a topological order with respect to that root
- Eliminate variables in reverse order
Linear in the CPT sizes!!! (versus exponential)

What you need to know about inference thus far
Types of queries:
- probabilistic inference
- most probable explanation (MPE)
- maximum a posteriori (MAP)
- MPE and MAP are truly different (they don't give the same answer)
Hardness of inference:
- Exact and approximate inference are NP-hard
- MPE is NP-complete
- MAP is much harder (NP^PP-complete)
Variable elimination algorithm:
- Eliminate a variable: combine the factors that include this variable into a single factor, then marginalize the variable out of the new factor
- Efficient algorithm ("only" exponential in the induced width, not in the number of variables)
If you hear: "Exact inference is only efficient in tree graphical models"
You say: "No!!! Any graph with low induced width"
And then you say: "And even some with very large induced width" (next week, with context-specific independence)
Elimination order is important!
- Finding the best order is an NP-complete problem
- Many good heuristics

Announcements
- Recitation tomorrow: Khalid on variable elimination
- Recitation on an advanced topic: Carlos on context-specific independence, Monday Oct 16, 5:30-7:00pm in Wean Hall 4615A

Complexity of variable elimination – graphs with loops
Connect nodes that appear together in an initial factor.
[Figure: the student network with nodes Coherence, Difficulty, Intelligence, Grade, SAT, Letter, Job, Happy]
Moralize the graph: connect the parents of each node into a clique and remove edge directions.
(A small sketch of moralization follows below.)
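A minimal sketch of moralization for a DAG given as a parent map. The `student` structure below is a plausible version of the student network from the figure, included only for illustration:

```python
from itertools import combinations

def moralize(parents):
    """Return the undirected moral graph (a set of edges) of a DAG given as {node: [parents]}."""
    edges = set()
    for child, pars in parents.items():
        for p in pars:                          # keep every edge, dropping its direction
            edges.add(frozenset((p, child)))
        for a, b in combinations(pars, 2):      # "marry" co-parents into a clique
            edges.add(frozenset((a, b)))
    return edges

# Illustrative student-network structure (assumed, not taken from the slide):
student = {
    "Coherence": [], "Intelligence": [],
    "Difficulty": ["Coherence"],
    "Grade": ["Difficulty", "Intelligence"],
    "SAT": ["Intelligence"],
    "Letter": ["Grade"],
    "Job": ["Letter", "SAT"],
    "Happy": ["Grade", "Job"],
}
moral_edges = moralize(student)   # includes e.g. the new edge {Difficulty, Intelligence}
```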

Eliminating a node – fill edges
To eliminate a variable, add fill edges: connect all of its neighbors to each other.
[Figure: student network, as above]

Induced graph
Elimination order: {C, D, S, I, L, H, J, G}
[Figure: student network with the fill edges this order produces]
The induced graph I_{F,φ} for elimination order φ has an edge X_i – X_j if X_i and X_j appear together in a factor generated by VE with elimination order φ on the factors F.
(A sketch that builds the induced graph and reads off the induced width follows below.)
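A sketch that builds the induced graph by simulating elimination on an undirected (moral) graph given as an adjacency map, and reads off the induced width; the names and representation are illustrative:

```python
def induced_graph(adj, order):
    """Simulate elimination on {node: set(neighbors)}; return (edges incl. fill edges, induced width)."""
    adj = {v: set(ns) for v, ns in adj.items()}          # work on a copy
    edges = {frozenset((a, b)) for a in adj for b in adj[a]}
    width = 0
    eliminated = set()
    for var in order:
        nbrs = adj[var] - eliminated                     # scope of the factor created here
        width = max(width, len(nbrs))                    # clique size is len(nbrs) + 1
        for a in nbrs:                                   # fill edges among var's neighbors
            for b in nbrs:
                if a != b:
                    adj[a].add(b)
                    adj[b].add(a)
                    edges.add(frozenset((a, b)))
        eliminated.add(var)
    return edges, width
```

Feeding it the moralized student network (converted to an adjacency map) with the order {C, D, S, I, L, H, J, G} would give the fill edges and the induced width for that order.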

A different elimination order can lead to a different induced graph
Elimination order: {G, C, D, S, I, L, H, J}
[Figure: student network with the fill edges this order produces]

Induced graph and complexity of VE
[Figure: student network; elimination order {C, D, I, S, L, H, J, G}]
The structure of the induced graph encodes the complexity of VE!!!
Theorem:
- The scope of every factor generated by VE is a subset of a maximal clique in I_{F,φ}
- Every maximal clique in I_{F,φ} corresponds to a factor generated by VE
Induced width (or treewidth):
- Size of the largest clique in I_{F,φ} minus 1
- Minimal induced width: the induced width of the best order φ
Read the complexity off the cliques in the induced graph.

Example: large induced width with a small number of parents
Compact representation ✓; easy inference ✗.

Finding the optimal elimination order
[Figure: student network; elimination order {C, D, I, S, L, H, J, G}]
Theorem: finding the best elimination order is NP-complete.
- Decision problem: given a graph, determine whether there exists an elimination order that achieves induced width ≤ K.
Interpretation:
- The hardness of finding the elimination order is in addition to the hardness of inference.
- Actually, an elimination order can be found in time exponential in the size of the largest clique – the same complexity as inference itself.

Induced graphs and chordal graphs
[Figure: student network]
Chordal graph:
- Every cycle X_1 – X_2 – … – X_k – X_1 with k ≥ 3 has a chord
- i.e., an edge X_i – X_j for some non-consecutive i and j
Theorem: every induced graph is chordal.
An "optimal" elimination order is easily obtained for a chordal graph.

Chordal graphs and triangulation
Triangulation: turning a graph into a chordal graph.
Max-cardinality search (a simple heuristic):
- Initialize the unobserved nodes X as unmarked
- For k = |X| down to 1:
  - X ← the unmarked variable with the most marked neighbors
  - φ(X) ← k
  - Mark X
Theorem: obtains an optimal order for chordal graphs.
Often not so good on other graphs!
[Figure: example undirected graph with nodes A–H]
(A short sketch of max-cardinality search follows below.)
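A compact sketch of max-cardinality search over an adjacency map (illustrative, not the lecture's code):

```python
def max_cardinality_search(adj):
    """Number nodes phi = |X| .. 1, always picking the unmarked node with the most marked neighbors.

    `adj` maps each node to the set of its neighbors. Returns an elimination order
    (increasing phi, i.e. the reverse of the marking order).
    """
    marked = set()
    phi = {}
    for k in range(len(adj), 0, -1):
        x = max((v for v in adj if v not in marked),
                key=lambda v: len(adj[v] & marked))
        phi[x] = k
        marked.add(x)
    return sorted(phi, key=phi.get)
```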

Minimum fill/size/weight heuristics
Many more effective heuristics exist – see the reading.
Min (weighted) fill heuristic (often very effective):
- Initialize the unobserved nodes X as unmarked
- For k = 1 to |X|:
  - X ← the unmarked variable whose elimination adds the fewest fill edges
  - φ(X) ← k
  - Mark X
  - Add the fill edges introduced by eliminating X
Weighted version: consider the size of the resulting factor rather than the number of edges.
[Figure: example undirected graph with nodes A–H]
(A sketch of the min-fill heuristic follows below.)
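A sketch of the min-fill heuristic in the same style (the weighted variant would replace the edge count with the size of the resulting factor):

```python
from itertools import combinations

def min_fill_order(adj):
    """Greedy min-fill elimination order for an undirected graph {node: set(neighbors)}."""
    graph = {v: set(ns) for v, ns in adj.items()}     # copy; updated as nodes are eliminated
    order = []
    while graph:
        def fill_edges(v):
            return sum(1 for a, b in combinations(graph[v], 2) if b not in graph[a])
        x = min(graph, key=fill_edges)                # fewest new edges if eliminated next
        for a, b in combinations(graph[x], 2):        # add those fill edges
            graph[a].add(b)
            graph[b].add(a)
        for n in graph[x]:                            # remove x from the remaining graph
            graph[n].discard(x)
        del graph[x]
        order.append(x)
    return order
```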

Choosing an elimination order
- Choosing the best order is NP-complete (reduction from MAX-Clique)
- Many good heuristics (some with guarantees)
- Ultimately, we can't beat the NP-hardness of inference: even the optimal order can lead to exponential variable elimination computation
In practice:
- Variable elimination is often very effective
- Many (many, many) approximate inference approaches are available when variable elimination is too expensive
- Most approximate inference approaches build on ideas from variable elimination

Most likely explanation (MLE)
Query: the maximizing assignment to all non-evidence variables given the evidence.
Using the definition of conditional probability, the normalization is irrelevant: maximize P(x_1, …, x_n, e) instead.
[Figure: BN over Flu, Allergy, Sinus, Headache, Nose]

Max-marginalization
[Equation not reproduced in the transcript: maximizing Sinus out of a factor over Flu, Sinus, Nose=t; see the sketch below.]
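Max-marginalization replaces the sum with a max (a hedged reconstruction of the slide's step in the running network):

```latex
g(F, N{=}t) \;=\; \max_{s} f(F, s, N{=}t),
\qquad\text{versus the sum-marginal}\quad
\sum_{s} f(F, s, N{=}t).
```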

Example of variable elimination for MLE – forward pass
[Figure: BN over Flu, Allergy, Sinus, Headache; evidence Nose=t; the worked factors are not reproduced in the transcript]

Example of variable elimination for MLE – backward pass
[Figure: BN over Flu, Allergy, Sinus, Headache; evidence Nose=t; the worked factors are not reproduced in the transcript]

MLE variable elimination algorithm – forward pass
Given a BN and an MLE query max_{x_1,…,x_n} P(x_1, …, x_n, e):
- Instantiate the evidence E = e
- Choose an ordering on the variables, e.g., X_1, …, X_n
- For i = 1 to n, if X_i ∉ E:
  - Collect the factors f_1, …, f_k that include X_i
  - Generate a new factor by max-eliminating X_i from these factors
  - Variable X_i has been eliminated!

MLE variable elimination algorithm – backward pass
{x_1*, …, x_n*} will store the maximizing assignment.
- For i = n down to 1, if X_i ∉ E:
  - Take the factors f_1, …, f_k that were used when X_i was eliminated
  - Instantiate f_1, …, f_k with {x_{i+1}*, …, x_n*}; now each f_j depends only on X_i
  - Generate the maximizing assignment for X_i: x_i* = arg max_{x_i} ∏_j f_j(x_i)
(A runnable sketch of the forward and backward passes follows below.)
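A compact runnable sketch of the forward and backward passes, with each factor represented as a (scope, function) pair; all names and the representation are illustrative, and inner maxima are recomputed for clarity rather than speed:

```python
def max_product_ve(factors, domains, order, evidence):
    """Forward/backward max-product variable elimination.

    `factors`: list of (scope, fn) pairs, where fn maps a {var: value} dict to a number.
    `domains`: {var: list of values}.  `order` must list every non-evidence variable.
    Returns (max probability, maximizing assignment including the evidence).
    """
    assignment = dict(evidence)
    factors = list(factors)
    used = {}                                   # var -> factors combined to eliminate it
    # Forward pass: eliminate every non-evidence variable with max instead of sum.
    for var in order:
        if var in evidence:
            continue
        involved = [(s, f) for (s, f) in factors if var in s]
        rest = [(s, f) for (s, f) in factors if var not in s]
        used[var] = involved
        new_scope = tuple({v for s, _ in involved for v in s if v != var})

        def new_fn(a, involved=involved, var=var):
            return max(_eval(involved, {**a, **evidence, var: x}) for x in domains[var])

        factors = rest + [(new_scope, new_fn)]
    # Backward pass: recover the maximizing assignment in reverse elimination order.
    for var in reversed([v for v in order if v not in evidence]):
        assignment[var] = max(domains[var],
                              key=lambda x: _eval(used[var], {**assignment, var: x}))
    return _eval(factors, assignment), assignment

def _eval(factors, assignment):
    """Product of the given factors evaluated at the assignment covering their scopes."""
    p = 1.0
    for scope, fn in factors:
        p *= fn({v: assignment[v] for v in scope})
    return p
```

The only difference from the sum-product sketch earlier is the `max` inside `new_fn` plus the backward pass that reads off the arg max, mirroring the two slides.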

What you need to know
Variable elimination algorithm:
- Eliminate a variable: combine the factors that include this variable into a single factor, then marginalize the variable out of the new factor
- Cliques in the induced graph correspond to the factors generated by the algorithm
- Efficient algorithm ("only" exponential in the induced width, not in the number of variables)
If you hear: "Exact inference is only efficient in tree graphical models"
You say: "No!!! Any graph with low induced width"
And then you say: "And even some with very large induced width" (special recitation)
Elimination order is important!
- Finding the best order is an NP-complete problem
- Many good heuristics
Variable elimination for MLE:
- The only difference between probabilistic inference and MLE is "sum" versus "max"