Junction Tree Algorithm
Brookes Vision Reading Group

Outline
- Graphical Models
  - What are Graphical Models?
  - Conditional Independence
  - Inference
- Junction Tree Algorithm
  - Moralizing a graph
  - Junction Tree Property
  - Creating a junction tree
  - Inference using the junction tree algorithm

Don't we all know…
- P(A) = 1 if and only if A is certain.
- P(A or B) = P(A) + P(B) if A and B are mutually exclusive.
- P(A,B) = P(A|B) P(B) = P(B|A) P(A).
- Conditional independence: A is conditionally independent of C given B, i.e. P(A|B,C) = P(A|B).

Graphical Models
Compact graphical representation of a joint probability.
A → B: P(A,B) = P(A) P(B|A). A 'causes' B.

Graphical Models
Compact graphical representation of a joint probability.
A ← B: P(A,B) = P(B) P(A|B). B 'causes' A.

Graphical Models
Compact graphical representation of a joint probability.
A - B (undirected): P(A,B), with no causal direction implied.

A Simple Example
P(A,B,C) = P(A) P(B,C|A) = P(A) P(B|A) P(C|B,A)
If C is conditionally independent of A given B:
P(A,B,C) = P(A) P(B|A) P(C|B)
Graphical representation?

Bayesian Network
Directed graphical model: P(U) = Π_i P(V_i | Pa(V_i))
A → B → C: P(A,B,C) = P(A) P(B|A) P(C|B)
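
As a minimal sketch of this factorization (not from the slides; the CPT numbers are made up for illustration), the joint of the chain A → B → C can be evaluated directly from its conditional probability tables:

```python
# Chain Bayesian network A -> B -> C over binary variables.
# P(A,B,C) = P(A) * P(B|A) * P(C|B); all numbers are illustrative.
P_A = {0: 0.6, 1: 0.4}
P_B_given_A = {0: {0: 0.7, 1: 0.3}, 1: {0: 0.2, 1: 0.8}}  # [a][b]
P_C_given_B = {0: {0: 0.9, 1: 0.1}, 1: {0: 0.5, 1: 0.5}}  # [b][c]

def chain_joint(a, b, c):
    return P_A[a] * P_B_given_A[a][b] * P_C_given_B[b][c]

# Sanity check: the joint sums to 1 over all eight assignments.
total = sum(chain_joint(a, b, c)
            for a in (0, 1) for b in (0, 1) for c in (0, 1))
print(round(total, 10))  # 1.0
```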

Markov Random Fields
Undirected graphical model: A - B - C

Markov Random Fields
Undirected graphical model: P(U) = Π P(clique) / Π P(separator)
Cliques AB and BC with separator B: P(A,B,C) = P(A,B) P(B,C) / P(B)
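
A quick numeric check of this identity (illustrative numbers, as in the previous sketch): because C is independent of A given B on the chain, the clique/separator ratio reproduces the joint exactly:

```python
# Chain A - B - C again (illustrative numbers as before).
P_A = {0: 0.6, 1: 0.4}
P_B_given_A = {0: {0: 0.7, 1: 0.3}, 1: {0: 0.2, 1: 0.8}}
P_C_given_B = {0: {0: 0.9, 1: 0.1}, 1: {0: 0.5, 1: 0.5}}

def P_AB(a, b): return P_A[a] * P_B_given_A[a][b]
def P_B(b):     return sum(P_AB(a, b) for a in (0, 1))
def P_BC(b, c): return P_B(b) * P_C_given_B[b][c]

# The clique/separator ratio reproduces the directed joint everywhere.
for a in (0, 1):
    for b in (0, 1):
        for c in (0, 1):
            lhs = P_AB(a, b) * P_BC(b, c) / P_B(b)
            rhs = P_A[a] * P_B_given_A[a][b] * P_C_given_B[b][c]
            assert abs(lhs - rhs) < 1e-12
print("clique/separator factorization matches the joint")
```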

Bayesian Networks
A is conditionally independent of B given C when the Bayes ball cannot reach A from B.

Markov Random Fields
A, B, C: (sets of) nodes. C is conditionally independent of A given B if all paths from A to C go through B.

Markov Random Fields
A node is conditionally independent of all other nodes given its neighbours.

MAP Estimation
(c*, s*, r*, w*) = argmax_{c,s,r,w} P(C=c, S=s, R=r, W=w)

Computing Marginals
P(W=w) = Σ_{c,s,r} P(C=c, S=s, R=r, W=w)
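
Both this marginal and the MAP estimate from the previous slide can be computed by brute-force enumeration, which is exponential in the number of variables; that cost is exactly what the junction tree algorithm avoids. A sketch (the network is the classic cloudy/sprinkler/rain/wet-grass example; the CPT numbers are the usual textbook values and are only illustrative):

```python
import itertools

# Sprinkler network: C -> S, C -> R, (S, R) -> W.  Illustrative CPTs.
def P_C(c):    return 0.5
def P_S(s, c): return (0.1 if s else 0.9) if c else (0.5 if s else 0.5)
def P_R(r, c): return (0.8 if r else 0.2) if c else (0.2 if r else 0.8)
def P_W(w, s, r):
    p = {(1, 1): 0.99, (1, 0): 0.90, (0, 1): 0.90, (0, 0): 0.0}[(s, r)]
    return p if w else 1.0 - p

def joint(c, s, r, w):
    return P_C(c) * P_S(s, c) * P_R(r, c) * P_W(w, s, r)

# Marginal: P(W=w) = sum over c, s, r of the joint.
def marginal_W(w):
    return sum(joint(c, s, r, w)
               for c, s, r in itertools.product((0, 1), repeat=3))

# MAP estimate: argmax of the joint over all assignments.
map_assignment = max(itertools.product((0, 1), repeat=4),
                     key=lambda x: joint(*x))

print(marginal_W(1))     # 0.6471 with these numbers
print(map_assignment)    # (c, s, r, w) maximizing the joint
```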

Aim
To perform exact inference efficiently:
- Transform the graph into an appropriate data structure.
- Ensure the joint probability remains the same.
- Ensure exact marginals can be computed.

Junction Tree Algorithm
Converts a Bayes net into an undirected tree:
- Joint probability remains unchanged.
- Exact marginals can be computed.
Why?
- Uniform treatment of Bayes nets and MRFs.
- Efficient inference is possible for undirected trees.

Let us recap, shall we?
P(U) = Π_i P(V_i | Pa(V_i)) = Π_i a(V_i, Pa(V_i)), where each factor a(V_i, Pa(V_i)) is called a potential.
Let's convert this to an undirected graphical model.
[Example DAG: A → B, A → C, B → D, C → D]

Wait a second… something is wrong here. The cliques of this graph are inconsistent with the original one: node D just lost a parent.

Solution
- Ensure that a node and its parents are part of the same clique.
- Marry the parents for a happy family.
- Now you can make the graph undirected.

Solution
But we have added extra edges, haven't we? Yes, and so a few conditional independences are lost.

Moralizing a graph
- Marry all unconnected parents.
- Drop the edge directions.
- Ensure the joint probability remains the same.
(A code sketch follows.)
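
A minimal sketch of moralization, assuming the DAG is given as a parents map (the data layout is my choice, not the slides'):

```python
import itertools

def moralize(parents):
    """Moralize a DAG given as {node: set_of_parents}: marry all
    unconnected parents of each node, then drop edge directions.
    Returns an undirected graph as {node: set_of_neighbours}."""
    nodes = set(parents) | set().union(*parents.values())
    undirected = {v: set() for v in nodes}
    for child, pa in parents.items():
        for p in pa:                                  # drop directions
            undirected[child].add(p)
            undirected[p].add(child)
        for u, v in itertools.combinations(pa, 2):    # marry parents
            undirected[u].add(v)
            undirected[v].add(u)
    return undirected

# Diamond example from earlier: A -> B, A -> C, {B, C} -> D.
g = moralize({'A': set(), 'B': {'A'}, 'C': {'A'}, 'D': {'B', 'C'}})
print(sorted(g['B']))  # ['A', 'C', 'D']: B and C are now married
```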

Moralizing a graph

Moralized sprinkler graph (nodes C, S, R, W) with clique tree: cliques CSR and SRW, separator SR.
Clique potentials a(C_i) and separator potentials a(S_i).
Initialize: a(C_i) = 1, a(S_i) = 1.

Moralizing a graph
- Choose a node V_i.
- Find one clique C_i containing V_i and Pa(V_i).
- Multiply a(V_i, Pa(V_i)) into a(C_i).
- Repeat for all V_i.
(A code sketch follows.)
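
A sketch of this assignment step (the data layout, cliques as frozensets and one CPT factor per node, is assumed, not given on the slides):

```python
def initialize_potentials(cliques, factors):
    """Multiply each factor a(V_i, Pa(V_i)) into one clique containing
    {V_i} and Pa(V_i).  cliques: list of frozensets of variable names;
    factors: dict {V_i: (scope_frozenset, table)}.  Returns the tables
    assigned to each clique; an all-ones potential is just the empty
    product, so unassigned cliques stay at 1."""
    assignment = {c: [] for c in cliques}
    for var, (scope, table) in factors.items():
        # Find one clique containing V_i and Pa(V_i) ...
        home = next(c for c in cliques if scope <= c)
        # ... and multiply the factor into its potential.
        assignment[home].append(table)
    return assignment

# Sprinkler example: cliques CSR and SRW absorb the four CPTs
# (tables are stand-in strings here, just to show the routing).
cliques = [frozenset('CSR'), frozenset('SRW')]
factors = {'C': (frozenset('C'), 'P(C)'),
           'S': (frozenset('CS'), 'P(S|C)'),
           'R': (frozenset('CR'), 'P(R|C)'),
           'W': (frozenset('SRW'), 'P(W|S,R)')}
print(initialize_potentials(cliques, factors))
# CSR gets P(C), P(S|C), P(R|C); SRW gets P(W|S,R)
```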

Moralizing a graph
P(U) = Π a(C_i) / Π a(S_i)
Now we can form a tree with all the cliques we chose. That was easy. We're ready to marginalize. OR ARE WE?

A few more examples… [the 4-cycle on A, B, D, C, and the same graph with a chord added]

Chaining the 4-cycle's cliques AB, BD, CD, AC into a tree gives an inconsistency in C. Clearly we're missing something here.

Junction Tree Property
In a junction tree, all cliques on the unique path between cliques C_i and C_j must contain C_i ∩ C_j.
So what we want is a junction tree, right?
Q. Do all graphs have a junction tree? A. NO.
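
This running-intersection property is easy to check mechanically. A sketch (clique tree given as index pairs; the failing example is the 4-cycle's clique chain from before):

```python
import itertools

def has_junction_tree_property(cliques, tree_edges):
    """Check the running intersection property on a clique tree.
    cliques: list of frozensets; tree_edges: (i, j) index pairs forming
    a tree.  For every pair of cliques, walk the unique tree path and
    verify each clique on it contains the endpoints' intersection."""
    adj = {i: set() for i in range(len(cliques))}
    for i, j in tree_edges:
        adj[i].add(j)
        adj[j].add(i)

    def path(src, dst):
        stack = [(src, [src])]          # DFS: unique path in a tree
        while stack:
            node, p = stack.pop()
            if node == dst:
                return p
            stack.extend((n, p + [n]) for n in adj[node] if n not in p)

    for i, j in itertools.combinations(range(len(cliques)), 2):
        sep = cliques[i] & cliques[j]
        if not all(sep <= cliques[k] for k in path(i, j)):
            return False
    return True

# The failing example above: the 4-cycle's cliques AB, BD, CD, AC
# chained into a path lose consistency (e.g. A is in AB and AC only).
cliques = [frozenset('AB'), frozenset('BD'), frozenset('CD'), frozenset('AC')]
print(has_junction_tree_property(cliques, [(0, 1), (1, 2), (2, 3)]))  # False
```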

Decomposable Graphs
For an undirected graph G = (V, E), a decomposition is a triple (A, B, C) such that:
- V = A ∪ B ∪ C
- All paths between A and B go through C
- C is a complete subset of V
[Figure: A - C - B]

Decomposable Graphs
A, B and/or C can be empty. In a proper decomposition, A and B are non-empty.

Decomposable Graphs
G is decomposable if and only if:
- G is complete, OR
- it possesses a proper decomposition (A, B, C) such that both G_{A ∪ C} and G_{B ∪ C} are decomposable.

Decomposable Graphs
The 4-cycle on A, B, D, C is not decomposable; adding a chord makes it decomposable.

Decomposable Graphs
[Two five-node graphs on A, B, C, D, E: one not decomposable, one decomposable]

An Important Theorem Theorem: A graph G has a junction tree if and only if it is decomposable. Proof on white board.

OK. So how do I convert my graph into a decomposable one?

Time for more definitions
- Chord of a cycle: an edge between two non-successive nodes.
- Chordless cycle: a cycle (of length four or more) with no chords.
- Triangulated graph: a graph with no chordless cycles.

Another Important Theorem
Theorem: A graph G is decomposable if and only if it is triangulated. Proof on white board.
Alright, so add edges to triangulate the graph.

Triangulating a Graph
Add the chord B - C to the 4-cycle on A, B, C, D: the cliques become ABC and BCD, with separator BC.
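
One standard way to add those edges is node elimination: eliminating a node connects all of its remaining neighbours, and those fill-in edges make the graph chordal. A sketch (the elimination ordering here is an arbitrary assumption; as a later slide notes, finding the best one is NP-hard):

```python
import itertools

def triangulate(adj, order):
    """Triangulate by eliminating nodes along `order` (a sketch).
    adj: {node: set_of_neighbours}, modified in place into a chordal
    supergraph.  Eliminating a node connects all of its
    not-yet-eliminated neighbours (the fill-in edges)."""
    eliminated = set()
    for v in order:
        live = [u for u in adj[v] if u not in eliminated]
        for u, w in itertools.combinations(live, 2):
            adj[u].add(w)
            adj[w].add(u)
        eliminated.add(v)
    return adj

# The 4-cycle A-B-D-C; eliminating A first adds the chord B-C,
# giving exactly the cliques ABC and BCD shown above.
adj = {'A': {'B', 'C'}, 'B': {'A', 'D'}, 'C': {'A', 'D'}, 'D': {'B', 'C'}}
triangulate(adj, ['A', 'B', 'C', 'D'])
print('C' in adj['B'])  # True: the fill-in edge B-C
```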

Some Notes on Triangulation
Q. Can we ensure the joint probability remains unchanged?
A. Of course. Adding edges preserves the cliques found after moralization; use the previous algorithm for initializing potentials.
Q. Aren't more conditional independences lost?
A. Yes. :-(

Some Notes on Triangulation
Q. Is triangulation unique? A. No.
Q. Okay then, let's find the best triangulation. A. Sadly, that's NP-hard.
Hang on, we still have a graph; we were promised a tree. Alright, let's form a tree then.

Creating a Junction Tree
Triangulated graph on A, B, C, D, E with cliques ABD, BCD, CDE.
The tree ABD - CDE - BCD is not a junction tree; ABD - BCD - CDE is.
Clearly, we're still missing something here.

Yet Another Theorem
Theorem: A junction tree is a maximum-weight spanning tree of the clique graph, where the edge weights are the cardinalities of the separators. Proof on white board.
Alright, so let's form the MST.

Forming the MST
Clique graph over ABD, BCD, CDE with edge weights |ABD ∩ BCD| = 2, |BCD ∩ CDE| = 2, |ABD ∩ CDE| = 1.

Forming the MST
Keep the two weight-2 edges and drop the weight-1 edge: the maximum-weight spanning tree ABD - BCD - CDE is the junction tree.
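
A sketch of the construction: Kruskal's algorithm run on the clique graph, heaviest separators first, with union-find over clique indices (the code layout is my own, not the slides'):

```python
import itertools

def junction_tree(cliques):
    """Maximum-weight spanning tree of the clique graph, where an
    edge's weight is the size of the separator (a sketch)."""
    parent = list(range(len(cliques)))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]   # path halving
            x = parent[x]
        return x

    edges = sorted(((len(cliques[i] & cliques[j]), i, j)
                    for i, j in itertools.combinations(range(len(cliques)), 2)),
                   reverse=True)             # heaviest separators first
    tree = []
    for w, i, j in edges:
        ri, rj = find(i), find(j)
        if w > 0 and ri != rj:               # skip cycles, keep heavy edges
            parent[ri] = rj
            tree.append((i, j, cliques[i] & cliques[j]))
    return tree

# Cliques from the slides: ABD, BCD, CDE.  The weight-1 edge ABD-CDE
# is rejected; the result is the junction tree ABD - BCD - CDE.
cliques = [frozenset('ABD'), frozenset('BCD'), frozenset('CDE')]
for i, j, sep in junction_tree(cliques):
    print(sorted(cliques[i]), '-', sorted(sep), '-', sorted(cliques[j]))
```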

A Quick Recap: the Asia network (nodes A, S, T, L, B, E, X, D).

A Quick Recap: 1. Marry unconnected parents.

A Quick Recap: 2. Drop the directionality of the edges.

A Quick Recap: 3. Triangulate the graph.

A Quick Recap: 4. Find the MST clique tree. Voilà: the junction tree, with cliques SBL, BLE, TLE, AT, DBE, XE. Whew, done! But where are these marginals we were talking about?

Inference using JTA
Modify the potentials so as to:
- keep the joint probability unchanged;
- ensure consistency between neighbouring cliques;
- make clique potentials equal clique marginals;
- make separator potentials equal separator marginals.

Inference using JTA
Neighbouring cliques V and W with separator S:
1. a*(S) = Σ_{V\S} a(V)
2. a*(W) = a(W) a*(S) / a(S)
3. a**(S) = Σ_{W\S} a*(W)
4. a*(V) = a(V) a**(S) / a*(S)
Consistency: Σ_{V\S} a*(V) = a**(S) = Σ_{W\S} a*(W)

Inference using JTA
Steps 1-4 as before; the joint probability remains the same:
a*(V) a*(W) / a**(S) = a(V) a(W) / a(S)
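
A sketch of this two-clique update in code (tables map assignment tuples to potential values; the data layout and helper names are mine, and the separator potential is assumed nonzero, e.g. the all-ones initialization):

```python
def sum_out(table, scope, keep):
    """Marginalize {assignment_tuple: value} from `scope` onto `keep`."""
    idx = [scope.index(v) for v in keep]
    out = {}
    for assign, val in table.items():
        key = tuple(assign[i] for i in idx)
        out[key] = out.get(key, 0.0) + val
    return out

def absorb(target, scope, sep_scope, new_sep, old_sep):
    """Multiply `target` by new_sep/old_sep at each separator assignment."""
    idx = [scope.index(v) for v in sep_scope]
    return {k: v * new_sep[tuple(k[i] for i in idx)]
                 / old_sep[tuple(k[i] for i in idx)]
            for k, v in target.items()}

def hugin_update(aV, aW, aS, scopeV, scopeW, scopeS):
    """Steps 1-4 above for cliques V, W and separator S (a sketch)."""
    aS1 = sum_out(aV, scopeV, scopeS)          # 1. a*(S)
    aW = absorb(aW, scopeW, scopeS, aS1, aS)   # 2. a*(W)
    aS2 = sum_out(aW, scopeW, scopeS)          # 3. a**(S)
    aV = absorb(aV, scopeV, scopeS, aS2, aS1)  # 4. a*(V)
    return aV, aW, aS2

# Two-clique chain A - B - C: aV over (A,B) holds P(A)P(B|A), aW over
# (B,C) holds P(C|B), aS(B) = 1.  Uniform illustrative numbers.
aV = {(a, b): 0.25 for a in (0, 1) for b in (0, 1)}
aW = {(b, c): 0.5 for b in (0, 1) for c in (0, 1)}
aS = {(b,): 1.0 for b in (0, 1)}
aV, aW, aS = hugin_update(aV, aW, aS, ['A', 'B'], ['B', 'C'], ['B'])
print(aW)  # after propagation, aW is the clique marginal P(B,C): 0.25 each
```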

One Last Theorem
Theorem: After JTA propagation, potentials = marginals. Proof on white board. (Then we can all go home.)

Happy Marginalizing