Overview of Inference Algorithms for Bayesian Networks Wei Sun, PhD Assistant Research Professor SEOR Dept. & C4I Center George Mason University, 2009

2 Outline
 Bayesian network and its properties
 Probabilistic inference for Bayesian networks
 Inference algorithm overview
 Junction tree algorithm review
 Current research

3 Definition of BN
A Bayesian network is a directed acyclic graph consisting of nodes and arcs:
 Nodes: variables.
 Arcs: probabilistic dependence relationships.
 Parameters: for each node there is a conditional probability distribution (CPD). The CPD of X_i is P(X_i | Pa(X_i)), where Pa(X_i) denotes the parents of X_i.
 Discrete: the CPD is typically represented as a table, also called a CPT.
 Continuous: the CPD involves a function, such as P(X_i | Pa(X_i)) = f(Pa(X_i), w), where w is a random noise term.
The joint distribution of the variables in a BN factorizes as P(X_1, ..., X_n) = ∏_i P(X_i | Pa(X_i)).
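As a concrete illustration (a hypothetical three-node chain, not one of the slides' examples), the factorization implied by the graph structure reads:

```latex
% Hypothetical three-node chain A -> B -> C, used only to illustrate how the
% BN structure determines the joint-distribution factorization.
\[
  P(A, B, C) = P(A)\, P(B \mid A)\, P(C \mid B),
  \qquad
  P(X_1, \dots, X_n) = \prod_{i=1}^{n} P\bigl(X_i \mid \mathrm{Pa}(X_i)\bigr).
\]
```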

4 Bayesian Network Example Vehicle Identification

5 Probabilistic Inference in BN
Task: find the posterior distributions of the query nodes given evidence.
 Bayes' Rule: P(X | E) = P(E | X) P(X) / P(E).
Both exact and approximate inference in BNs are NP-hard. Tractable inference algorithms exist only for special classes of BNs.
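A minimal sketch of this task on a hypothetical two-node discrete network (Rain → WetGrass, with made-up CPT numbers): the posterior is obtained by applying Bayes' rule to the joint distribution defined by the CPTs.

```python
# Minimal sketch: posterior inference by enumeration on a tiny discrete BN.
# The network (Rain -> WetGrass) and its CPT numbers are hypothetical,
# chosen only to illustrate Bayes' rule on a BN-defined joint distribution.

p_rain = {True: 0.2, False: 0.8}                      # P(Rain)
p_wet_given_rain = {True: {True: 0.9, False: 0.1},    # P(WetGrass | Rain)
                    False: {True: 0.2, False: 0.8}}

def posterior_rain(wet_observed: bool) -> dict:
    """Compute P(Rain | WetGrass = wet_observed) by enumerating the joint."""
    joint = {r: p_rain[r] * p_wet_given_rain[r][wet_observed] for r in (True, False)}
    evidence_prob = sum(joint.values())               # P(WetGrass = wet_observed)
    return {r: joint[r] / evidence_prob for r in joint}

print(posterior_rain(True))   # e.g. {True: 0.529..., False: 0.470...}
```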

6 Classify BNs by Network Structure
Multiply-connected networks vs. singly-connected networks (a.k.a. polytrees).

7 Classify BNs by Node Types
Node types:
 Discrete: the conditional probability distribution is usually represented as a table.
 Continuous: Gaussian or non-Gaussian distribution; the conditional probability distribution is specified by a function, P(X_i | Pa(X_i)) = f(Pa(X_i), w), where w is a random noise term; the function can be linear or nonlinear.
 Hybrid model: mixed discrete and continuous variables.
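As a sketch (hypothetical variables and functional form, not from the slides), a nonlinear continuous CPD of the kind written above can be represented as a deterministic function of the parents plus a noise term:

```python
import numpy as np

# Sketch of a continuous CPD X = f(Pa(X), w): a hypothetical nonlinear function
# of one continuous parent U plus additive Gaussian noise w.
rng = np.random.default_rng(0)

def sample_x_given_u(u: float, noise_std: float = 0.5) -> float:
    """Draw X ~ P(X | U = u) with f(u, w) = sin(u) + 0.1 * u**2 + w."""
    w = rng.normal(0.0, noise_std)
    return np.sin(u) + 0.1 * u**2 + w

samples = [sample_x_given_u(1.5) for _ in range(5)]
```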

8 Conditional Linear Gaussian (CLG)
The Conditional Linear Gaussian (CLG) model is the simplest class of hybrid Bayesian networks:
 All continuous variables are Gaussian.
 The functional relationships between continuous variables and their parents are linear.
 No discrete node has a continuous parent.
Given any assignment to all discrete variables, a CLG represents a multivariate Gaussian distribution.
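In the standard parameterization (notation chosen here for illustration, not copied from the slides), the CPD of a continuous node X with continuous parents U and discrete parents D has the form:

```latex
% CLG local model: for each discrete-parent configuration d, X is a linear
% Gaussian in its continuous parents u.
\[
  p(X \mid U = u,\, D = d) \;=\;
  \mathcal{N}\!\left(X \;\middle|\; a_d + \mathbf{b}_d^{\top} u,\; \sigma_d^2 \right),
\]
% so conditioning on any full assignment of the discrete variables leaves a
% multivariate Gaussian over all continuous variables.
```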

9 Conditional Hybrid Model (CHM)
The conditional hybrid model (CHM) is a special class of hybrid BNs:
 No discrete node has a continuous parent.
 Continuous variables can have arbitrary distributions.
 The functional relationships between variables can be arbitrary and nonlinear.
The only difference between a CHM and a general hybrid BN is the restriction that no discrete node has a continuous parent.

10 Examples of CHM and CLG
(Figure: example networks, Conditional Hybrid Model (CHM) vs. CLG model, shown side by side.)

11 Taxonomy of BNs
(Figure: taxonomy of BN models, with the research focus highlighted.)

12 Inference Algorithms Review - 1
Exact Inference
 Pearl's message passing algorithm (MP) [Pearl88]: messages (probabilities/likelihoods) propagate between variables, and after a finite number of iterations each node has its correct beliefs. It works only for purely discrete or purely Gaussian, singly-connected networks (inference is done in linear time).
 Clique tree (a.k.a. junction tree) [LS88, SS90, HD96] and related algorithms, including variable elimination, arc reversal, and symbolic probabilistic inference (SPI): they work only on purely discrete or purely Gaussian networks, or on simple CLGs. For CLGs, the clique tree algorithm is also called Lauritzen's algorithm [Lau92]; it returns the correct mean and variance of the posterior distributions of the continuous variables even though the true distribution may be a Gaussian mixture. It does not work for general hybrid models and is intractable for complicated CLGs.
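To make the elimination idea behind this family concrete, here is a minimal sketch (factors as dictionaries over variable assignments; the variable names, domains, and numbers are assumptions) of the sum-product step that removes one variable from a set of discrete factors:

```python
from itertools import product

# Minimal sketch of one variable-elimination step on discrete factors.
# A factor is (variables, table) where table maps assignment tuples to values.

def multiply(f1, f2, domain=(0, 1)):
    """Pointwise product of two factors over the union of their variables."""
    vars1, t1 = f1
    vars2, t2 = f2
    out_vars = tuple(dict.fromkeys(vars1 + vars2))        # ordered union
    out = {}
    for assign in product(domain, repeat=len(out_vars)):
        val = dict(zip(out_vars, assign))
        out[assign] = t1[tuple(val[v] for v in vars1)] * t2[tuple(val[v] for v in vars2)]
    return out_vars, out

def sum_out(factor, var):
    """Marginalize (sum) a variable out of a factor."""
    variables, table = factor
    idx = variables.index(var)
    out_vars = variables[:idx] + variables[idx + 1:]
    out = {}
    for assign, value in table.items():
        key = assign[:idx] + assign[idx + 1:]
        out[key] = out.get(key, 0.0) + value
    return out_vars, out

# Eliminate B from P(B|A) P(C|B): multiply the factors containing B, sum B out.
fBA = (("B", "A"), {(0, 0): 0.7, (1, 0): 0.3, (0, 1): 0.2, (1, 1): 0.8})
fCB = (("C", "B"), {(0, 0): 0.9, (1, 0): 0.1, (0, 1): 0.5, (1, 1): 0.5})
tau = sum_out(multiply(fBA, fCB), "B")                     # new factor over (A, C)
```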

13 Inference Algorithms Review - 2
Approximate Inference
 Model simplification: discretization, linearization, arc removal, etc. The performance degradation can be significant.
 Sampling methods: logic sampling [Hen88]; likelihood weighting [FC89]; adaptive importance sampling (AIS-BN) [CD00], EPIS-BN [YD03], and cutset sampling [BD06], which perform well under unlikely evidence but work only for purely discrete networks; Markov chain Monte Carlo. (A likelihood-weighting sketch follows below.)
 Loopy propagation [MWJ99]: apply Pearl's message passing algorithm to networks with loops. This has recently become a popular topic. For purely discrete or purely Gaussian networks with loops, it usually converges to approximate answers within several iterations. For hybrid models, message representation and integration are the main issues.
 Numerical hybrid loopy propagation [YD06]: computationally intensive.
 Conditioned hybrid message passing [SC07]: complexity exponential in the number of interface nodes.
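A minimal likelihood-weighting sketch on the same hypothetical Rain → WetGrass network used earlier (all numbers made up): evidence nodes are clamped, and each sample is weighted by the likelihood of the evidence it was forced to take.

```python
import random

# Likelihood-weighting sketch for the hypothetical Rain -> WetGrass network.
# Evidence (WetGrass = True) is clamped; each sample of Rain is weighted by
# P(evidence | sampled parents) instead of being rejected.
p_rain = {True: 0.2, False: 0.8}
p_wet_given_rain = {True: 0.9, False: 0.2}   # P(WetGrass = True | Rain)

def likelihood_weighting(n_samples: int = 100_000, seed: int = 0) -> float:
    """Estimate P(Rain = True | WetGrass = True)."""
    rng = random.Random(seed)
    weighted_true = total_weight = 0.0
    for _ in range(n_samples):
        rain = rng.random() < p_rain[True]          # sample the non-evidence node
        weight = p_wet_given_rain[rain]             # weight by evidence likelihood
        total_weight += weight
        if rain:
            weighted_true += weight
    return weighted_true / total_weight

print(likelihood_weighting())   # ~0.53, matching the exact answer computed earlier
```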

14 Junction Tree Algorithm
JT is the most popular exact inference algorithm for Bayesian networks.
 v1: JT for discrete networks [LS89].
 v2: JT for CLGs, also called Lauritzen's algorithm [Lau92] - an extension of JT v1.
Junction tree property:
 If node S appears in both cliques U and V, then S is in every clique on the path between U and V.
The junction tree property guarantees the correctness of message propagation.
Restrictions:
 For purely discrete networks or simple CLGs only.
 Complexity depends on the size of the biggest clique.
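As a sketch of what the property means operationally (tree given as an adjacency list over clique ids; the data structures and example cliques are assumptions), the check below verifies that, for every variable, the cliques containing it form a connected subtree:

```python
from collections import deque

# Sketch: verify the junction tree (running intersection) property.
# `cliques` maps a clique id to its set of variables; `tree` is an undirected
# adjacency list over clique ids.

def has_junction_tree_property(cliques: dict, tree: dict) -> bool:
    for var in set().union(*cliques.values()):
        containing = {c for c, vs in cliques.items() if var in vs}
        # BFS restricted to cliques containing `var`; all of them must be reached.
        start = next(iter(containing))
        seen, queue = {start}, deque([start])
        while queue:
            c = queue.popleft()
            for nbr in tree[c]:
                if nbr in containing and nbr not in seen:
                    seen.add(nbr)
                    queue.append(nbr)
        if seen != containing:
            return False
    return True

cliques = {0: {"B", "E", "D"}, 1: {"W", "E", "D"}, 2: {"B", "F", "E"}}
tree = {0: [1, 2], 1: [0], 2: [0]}
print(has_junction_tree_property(cliques, tree))   # True for this small example
```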

15 Junction Tree for CLG
Graph transformation – construct the junction tree from the original DAG:
 DAG -> undirected graph.
 Moralization, triangulation, and decomposition.
 Clique identification and connection to build a tree.
Local message passing to propagate beliefs in the tree:
 Clique potentials and separators.
 Initialization.
 Evidence entering and absorption.
 Marginalization.

16 JT Moralization, Triangulation
 Moralization – "marry the parents": link nodes that have a common child, then drop edge directions.
 Triangulation – add edges so that every chordless cycle has at most 3 nodes.
(Figure: the example graph with nodes T, F, W, B, E, D, C before and after the transformation.)
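A minimal moralization sketch (graph given as parent lists; the node names follow the figure, but the edges are assumptions rather than the actual example network): connect every pair of parents that share a child and drop directionality.

```python
from itertools import combinations

# Sketch of moralization: given parent lists for a DAG, produce the edge set of
# the moral (undirected) graph. The example DAG below is an assumption loosely
# based on the node names in the slide's figure.
parents = {
    "T": [], "F": [], "W": [],
    "B": ["T", "F"],
    "E": ["F", "W"],
    "D": ["B", "E"],
    "C": ["B"],
}

def moralize(parents: dict) -> set:
    edges = set()
    for child, pa in parents.items():
        for p in pa:                                   # drop directions
            edges.add(frozenset((p, child)))
        for p1, p2 in combinations(pa, 2):             # "marry" co-parents
            edges.add(frozenset((p1, p2)))
    return edges

moral_edges = moralize(parents)
print(sorted(tuple(sorted(e)) for e in moral_edges))
```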

17 JT Decomposition (for CLG only)
Any path between two discrete nodes that contains only continuous nodes is forbidden – we have to link those two discrete nodes to make the graph strongly decomposable.
(Figure: the example graph with nodes T, F, W, B, E, D, C.)

18 Clique and Junction Tree
A clique is a maximal and complete cluster of nodes (a subset of variables): if node S is linked to all nodes of clique U, then S belongs to clique U. The clique tree is not unique.
(Figure: the example graph and its cliques, including BFE, WFE, BED, and WED.)

19 Local Message Passing in JT Next time.

20 Current Research on the Direct Message Passing Algorithm

21 Pearl’s Message Passing Algorithm
In a polytree, any node d-separates the sub-network above it from the sub-network below it. For a typical node X in a polytree, the evidence e can therefore be divided into two exclusive sets, e_X^+ (above X) and e_X^- (below X), and processed separately.
Define the λ and π messages as λ(X) = P(e_X^- | X) and π(X) = P(X | e_X^+). Then the belief of node X is BEL(X) = P(X | e) = α λ(X) π(X), where α is a normalizing constant.
A multiply-connected network may not be partitioned into two separate sub-networks by a node.
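For reference, the standard update equations from [Pearl88], for a node X with parents U_1, ..., U_n and children Y_1, ..., Y_m (written here in the common textbook notation rather than copied from the slides), are:

```latex
% Pearl's message passing updates for node X with parents U_1..U_n and
% children Y_1..Y_m (standard form; alpha denotes a normalizing constant).
\begin{align*}
  \pi(x)          &= \sum_{u_1,\dots,u_n} P(x \mid u_1,\dots,u_n)\,\prod_{i} \pi_X(u_i) \\
  \lambda(x)      &= \prod_{j} \lambda_{Y_j}(x) \\
  \mathrm{BEL}(x) &= \alpha\,\lambda(x)\,\pi(x) \\
  \lambda_X(u_i)  &= \sum_{x} \lambda(x) \sum_{u_k:\,k \neq i} P(x \mid u_1,\dots,u_n)\,\prod_{k \neq i}\pi_X(u_k) \\
  \pi_{Y_j}(x)    &= \alpha\,\pi(x)\,\prod_{k \neq j}\lambda_{Y_k}(x)
\end{align*}
```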

22 Pearl’s Message Passing in BNs
In the message passing algorithm, each node maintains its own λ and π values, sends a λ message to each of its parents, and sends a π message to each of its children. After a finite number of message passing iterations, every node obtains its correct belief. For polytrees, MP returns exact beliefs; for networks with loops, MP is called loopy propagation and often gives good approximations to the posterior distributions.

23 Unscented Hybrid Loopy Propagation
(Figure: a local model with parents U and D and child X, together with the message equations.)
 The message is a weighted sum of continuous messages, where f is the function specified in the CPD of X and the weight includes a non-negative constant.
 The reverse message is likewise a weighted sum of continuous messages, obtained using the inverse function.
Complexity is reduced significantly! It depends only on the size of the discrete parents in the local CPD.
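The "unscented" part refers to the unscented transform used to push a Gaussian message through the nonlinear CPD function. A minimal sketch (scalar case, standard sigma-point weights; the example function and parameter values are assumptions):

```python
import numpy as np

# Sketch of the unscented transform for a scalar Gaussian message N(mean, var)
# pushed through a nonlinear function f. Weights follow the common
# (alpha, beta, kappa) convention; the example f is arbitrary.

def unscented_transform_1d(mean, var, f, alpha=1e-3, beta=2.0, kappa=0.0):
    n = 1
    lam = alpha**2 * (n + kappa) - n
    spread = np.sqrt((n + lam) * var)
    sigma_points = np.array([mean, mean + spread, mean - spread])

    wm = np.full(3, 1.0 / (2 * (n + lam)))       # mean weights
    wc = wm.copy()                               # covariance weights
    wm[0] = lam / (n + lam)
    wc[0] = lam / (n + lam) + (1 - alpha**2 + beta)

    y = f(sigma_points)                          # propagate through the nonlinearity
    y_mean = np.dot(wm, y)
    y_var = np.dot(wc, (y - y_mean) ** 2)
    return y_mean, y_var

# Example: approximate the mean/variance of sin(X) + 0.1*X**2 for X ~ N(1.0, 0.25).
m, v = unscented_transform_1d(1.0, 0.25, lambda x: np.sin(x) + 0.1 * x**2)
```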

24 (Figure: example network with nodes A, B, C, U, X, Y, W, Z.)