Markov Networks Alan Ritter

Markov Networks Undirected graphical models. [Slide figure: an undirected graph over Smoking, Cancer, Asthma, and Cough.] Potential functions are defined over cliques, e.g. a potential Ф(S,C) over the Smoking-Cancer clique that assigns a non-negative value to each joint assignment (the slide's table shows entries 4.5 and 2.7).
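To make the idea of a clique potential concrete, here is a minimal Python sketch that stores Ф(Smoking, Cancer) as a lookup table. The variable names come from the slide, but only the numbers 4.5 and 2.7 appear there; which rows they belong to, and the remaining entries, are illustrative placeholders rather than values from the original deck.

```python
# Hypothetical sketch: a clique potential as a lookup table over joint assignments.
# Only 4.5 and 2.7 appear on the slide; the row assignments and the other
# entries are illustrative placeholders.
phi_smoking_cancer = {
    (False, False): 4.5,
    (False, True):  4.5,
    (True,  False): 2.7,
    (True,  True):  4.5,
}

def potential(smoking: bool, cancer: bool) -> float:
    """Non-negative 'compatibility' of a joint assignment (not a probability)."""
    return phi_smoking_cancer[(smoking, cancer)]

print(potential(True, False))  # 2.7
```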

Undirected Graphical Models: Motivation. Terminology: directed graphical models = Bayesian Networks; undirected graphical models = Markov Networks. We just learned about DGMs (Bayes Nets). For some domains, being forced to choose a direction for the edges is awkward. Example: consider modeling an image, under the assumption that neighboring pixels are correlated. We could create a DAG model with a 2D topology.

2D Bayesian Network

Markov Random Field (Markov Network)

UGMs (Markov Nets) vs. DGMs (Bayes Nets). Advantages: symmetric; more natural for certain domains (e.g. spatial or relational data); discriminative UGMs (a.k.a. Conditional Random Fields) tend to work better than discriminative DGMs. Disadvantages: parameters are less interpretable and less modular; parameter estimation is computationally more expensive.

Conditional Independence Properties. Much simpler than in Bayesian Networks: no d-separation, v-structures, etc. UGMs define CI via simple graph separation, e.g.: if we remove all the evidence nodes from the graph, are there any paths connecting A and B?
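As a concrete illustration of CI-by-graph-separation, here is a minimal sketch assuming the networkx library; the graph is just the running Smoking/Cancer/Asthma/Cough example, not one from the slides. It removes the evidence nodes and checks whether any path still connects the two query nodes.

```python
# Minimal sketch, assuming networkx. Conditional independence in a UGM reduces
# to graph separation: delete the evidence nodes, then ask whether any path
# still connects the two query nodes.
import networkx as nx

def separated(graph: nx.Graph, a, b, evidence) -> bool:
    """True if every path from a to b passes through an evidence node."""
    g = graph.copy()
    g.remove_nodes_from(evidence)
    if a not in g or b not in g:
        return True  # conditioning on a or b itself trivially separates them
    return not nx.has_path(g, a, b)

g = nx.Graph([("Smoking", "Cancer"), ("Cancer", "Cough"), ("Asthma", "Cough")])
print(separated(g, "Smoking", "Asthma", {"Cough"}))  # True: Cough blocks the only path
print(separated(g, "Smoking", "Asthma", set()))      # False: Smoking-Cancer-Cough-Asthma
```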

Markov Blanket. Also simple: the Markov blanket of a node is just the set of its immediate neighbors. We don't need to worry about co-parents.

Independence Properties. Pairwise: X_s ⊥ X_t | X_rest for every non-adjacent pair (s, t). Local: X_s ⊥ X_rest | X_ne(s), i.e. a node is independent of all other nodes given its neighbors. Global: X_A ⊥ X_B | X_C whenever C separates A from B in the graph.

Converting a Bayesian Network to a Markov Network. Tempting: just drop the directionality of the edges. But this is clearly incorrect for v-structures: it introduces incorrect CI statements. Solution: add edges between "unmarried" parents. This process is called moralization.
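A minimal sketch of the moralization recipe, assuming networkx (this is the standard procedure, not code from the slides): marry each node's parents, then drop edge directions.

```python
# Minimal sketch, assuming networkx: moralize a DAG by connecting co-parents
# and then forgetting edge directions.
import networkx as nx
from itertools import combinations

def moralize(dag: nx.DiGraph) -> nx.Graph:
    """Marry co-parents, then drop edge directions."""
    moral = dag.to_undirected()
    for node in dag.nodes:
        parents = list(dag.predecessors(node))
        for p1, p2 in combinations(parents, 2):
            moral.add_edge(p1, p2)  # connect "unmarried" parents
    return moral

# v-structure A -> C <- B: moralization adds the edge A - B.
dag = nx.DiGraph([("A", "C"), ("B", "C")])
print(sorted(moralize(dag).edges()))  # [('A', 'B'), ('A', 'C'), ('B', 'C')]
```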

Example: moralization. Unfortunately, this loses some CI information; for instance, moralizing a v-structure adds an edge between the parents, so their marginal independence is no longer represented by the graph.

Directed vs. Undirected GMs. Q: which has more "expressive power"? Recall: G is an I-map of P if I(G) ⊆ I(P). Now define: G is a perfect I-map of P if I(G) = I(P), i.e. the graph can represent all (and only) the CIs in P. Bayesian Networks and Markov Networks are perfect maps for different sets of distributions.

Parameterization. There is no topological ordering on an undirected graph, so we can't use the chain rule of probability to represent P(y). Instead we use potential functions: we associate a potential function with each maximal clique in the graph. A potential can be any non-negative function. The joint distribution is defined to be proportional to the product of the clique potentials.

Parameterization (cont'd). The joint distribution is defined to be proportional to the product of the clique potentials. Any positive distribution whose CI properties can be represented by a UGM can be represented this way.

Hammersley-Clifford Theorem. A positive distribution P(y) > 0 satisfies the CI properties of an undirected graph G iff P can be represented as a product of factors, one per maximal clique: P(y) = (1/Z) ∏_c ψ_c(y_c), where Z = Σ_y ∏_c ψ_c(y_c) is the partition function.
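The factorization can be made concrete with a tiny brute-force sketch; the chain graph and the potential values below are toy assumptions, not an example from the slides. It defines one potential per maximal clique, multiplies them, and sums over all assignments to get the partition function Z.

```python
# Minimal sketch with assumed toy values: a joint defined as a normalized
# product of clique potentials, with Z computed by brute-force enumeration.
from itertools import product

# Chain A - B - C: the maximal cliques are the edges {A, B} and {B, C}.
psi_ab = {(0, 0): 2.0, (0, 1): 1.0, (1, 0): 1.0, (1, 1): 3.0}  # illustrative values
psi_bc = {(0, 0): 1.0, (0, 1): 4.0, (1, 0): 1.0, (1, 1): 1.0}

def unnormalized(a, b, c):
    return psi_ab[(a, b)] * psi_bc[(b, c)]

# Partition function: sum of the unnormalized score over all assignments.
Z = sum(unnormalized(a, b, c) for a, b, c in product([0, 1], repeat=3))

def prob(a, b, c):
    return unnormalized(a, b, c) / Z

print(Z, prob(1, 1, 1))
```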

Example. If P satisfies the conditional independence assumptions of this graph, we can write P as a normalized product of potentials over the graph's maximal cliques.

Pairwise MRF. Potentials don't need to correspond to maximal cliques; we can also restrict the parameterization to edges (or any other cliques). Pairwise MRF: P(y) ∝ ∏_{(s,t) ∈ E} ψ_st(y_s, y_t).
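A small sketch of a pairwise parameterization, with a hypothetical graph and Ising-style potentials chosen for illustration: even though the triangle below is a maximal clique of size 3, the model only uses one potential per edge.

```python
# Minimal sketch (hypothetical graph and values): a pairwise MRF scores an
# assignment with one potential per edge, even when larger cliques exist.
import math

edges = [("A", "B"), ("B", "C"), ("A", "C")]  # a triangle: maximal clique of size 3

def edge_potential(xu, xv):
    # Ising-style potential: reward agreement, penalize disagreement.
    return math.exp(1.0 if xu == xv else -1.0)

def score(assignment):
    """Unnormalized probability of a full assignment under the pairwise MRF."""
    s = 1.0
    for u, v in edges:
        s *= edge_potential(assignment[u], assignment[v])
    return s

print(score({"A": 1, "B": 1, "C": 1}))  # all agree: highest score
print(score({"A": 1, "B": 0, "C": 1}))  # two disagreements: lower score
```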

Representing Potential Functions. We can represent them as tables, like the CPTs we used for Bayesian Networks (DGMs). But potentials are not probabilities; they represent the relative "compatibility" between different assignments.

Representing Potential Functions. A more general approach is to represent the log potentials as a linear function of the parameters: log ψ_c(y_c) = φ_c(y_c)^T θ_c. These are log-linear (maximum entropy) models.

Log-Linear Models. [Slide figure: the Smoking-Cancer-Asthma-Cough graph.] Log-linear model: P(x) = (1/Z) exp(Σ_i w_i f_i(x)), where w_i is the weight of feature i and f_i(x) is feature i.
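A minimal sketch of a log-linear model over the slide's four variables; the feature definitions and weights below are illustrative assumptions, not the features or weights from the slide.

```python
# Minimal sketch with made-up features and weights: P(x) = exp(sum_i w_i f_i(x)) / Z.
import math
from itertools import product

features = [
    (1.5, lambda x: float(x["Smoking"] and x["Cancer"])),  # hypothetical feature/weight
    (0.8, lambda x: float(x["Asthma"] and x["Cough"])),    # hypothetical feature/weight
]
variables = ["Smoking", "Cancer", "Asthma", "Cough"]

def unnormalized(x):
    return math.exp(sum(w * f(x) for w, f in features))

# Partition function over all 2^4 joint assignments.
Z = sum(unnormalized(dict(zip(variables, vals)))
        for vals in product([False, True], repeat=len(variables)))

def prob(x):
    return unnormalized(x) / Z

print(prob({"Smoking": True, "Cancer": True, "Asthma": False, "Cough": False}))
```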

Log-Linear Models Can Represent Tabular Potentials. Consider a pairwise MRF where each edge has an associated potential with K^2 indicator features, one per joint assignment (y_s, y_t). Then we can convert the weights into a potential function, ψ_st(y_s, y_t) = exp(θ_st(y_s, y_t)), using the weight for each feature. But the log-linear model is more general: feature vectors can be designed arbitrarily.
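To see that indicator features recover a tabular potential exactly, here is a small sketch with hypothetical table values: one weight θ_ab = log ψ(a, b) per joint value gives K^2 features whose log-linear potential matches the table.

```python
# Minimal sketch (hypothetical table): one indicator feature per (a, b) pair,
# with weight log psi(a, b), reproduces the tabular potential exactly.
import math

K = 3
psi = {(a, b): 1.0 + a + 2 * b for a in range(K) for b in range(K)}  # illustrative table

# One weight per joint value, i.e. K^2 features for this edge.
theta = {key: math.log(val) for key, val in psi.items()}

def psi_loglinear(ys, yt):
    """Indicator features pick out exactly one weight, recovering the table."""
    return math.exp(sum(theta[(a, b)] * ((ys, yt) == (a, b))
                        for a in range(K) for b in range(K)))

assert all(abs(psi[(a, b)] - psi_loglinear(a, b)) < 1e-9
           for a in range(K) for b in range(K))
print("table potential recovered from K^2 indicator features")
```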