Cumulative Distribution Networks and the Derivative-Sum-Product Algorithm
Jim C. Huang and Brendan J. Frey
Probabilistic and Statistical Inference Group, Department of Electrical and Computer Engineering, University of Toronto, Toronto, ON, Canada
UAI 2008, 12/07/2008

Motivation
Problems where density models may be intractable, e.g.:
– Modelling arbitrary dependencies
– Modelling stochastic orderings
– Predicting game outcomes in Halo 2
Proposal: the cumulative distribution network (CDN)

Cumulative distribution networks (CDNs)
A graphical model of the joint cumulative distribution function (CDF). Example:
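The factorization behind the graph (the general form is from the paper; the chain instance below is an illustration, not the slide's omitted figure):
\[
F(x_1, \dots, x_n) = \prod_{s} \phi_s(\mathbf{x}_s),
\qquad \text{e.g.} \quad
F(x_1, x_2, x_3) = \phi_a(x_1, x_2)\,\phi_b(x_2, x_3),
\]
where each \(\phi_s\) is a function defined over a subset \(\mathbf{x}_s\) of the variables.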

Cumulative distribution functions
Defining properties:
– Positive convergence
– Negative convergence
– Monotonicity
Key correspondences:
– Marginalization ↔ maximization
– Conditioning ↔ differentiation
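For a bivariate CDF, the two correspondences are the standard identities:
\[
F_X(x) = \lim_{y \to \infty} F_{X,Y}(x, y),
\qquad
P(X \le x \mid Y = y) = \frac{\partial_y F_{X,Y}(x, y)}{\partial_y F_{X,Y}(\infty, y)},
\]
so marginalizing pushes arguments to their maximum, and conditioning differentiates with respect to the observed variable.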

Necessary/sufficient conditions on CDN functions
– Negative convergence (necessity and sufficiency): for each X_k, at least one neighboring function → 0
– Positive convergence (sufficiency): all functions → 1
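Written out (our formalization of the slide's wording, with N(X_k) the functions neighboring X_k):
\[
\forall k \ \ \exists s \in N(X_k): \ \lim_{x_k \to -\infty} \phi_s(\mathbf{x}_s) = 0,
\qquad
\lim_{\mathbf{x}_s \to \infty} \phi_s(\mathbf{x}_s) = 1 \ \ \forall s.
\]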

Necessary/sufficient conditions on CDN functions
Monotonicity lemma (sufficiency): all functions monotonically non-decreasing…
Sufficient condition for a valid joint CDF: each CDN function can be a CDF of its arguments
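A minimal numeric sketch of the sufficient condition (ours, not from the talk): each CDN function below is an FGM-copula CDF with standard-normal margins, so their product should behave like a valid joint CDF. The checks are illustrative spot-checks, not a proof.

```python
from scipy.stats import norm

def phi(a, b, theta=0.5):
    # FGM copula with standard-normal margins: a valid bivariate CDF
    # for |theta| <= 1, hence an admissible CDN function per the slide.
    u, v = norm.cdf(a), norm.cdf(b)
    return u * v * (1.0 + theta * (1.0 - u) * (1.0 - v))

def F(x1, x2, x3):
    # Chain-structured CDN from the earlier example slide.
    return phi(x1, x2) * phi(x2, x3)

big = 20.0  # stands in for +infinity (20 sigma out)
print(F(big, big, big))                       # ~1.0 (positive convergence)
print(F(-big, big, big))                      # ~0.0 (negative convergence)
print(F(0.5, 0.0, 1.0) >= F(0.0, 0.0, 1.0))   # True (monotone in x1)
```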

Marginal independence
Marginalization ↔ maximization
– e.g.: X is marginally independent of Y
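Why disconnection implies marginal independence (a one-line derivation, ours): if no CDN function contains both X and Y, the CDF splits as F(x, y) = g(x) h(y), and positive convergence of the functions gives
\[
F_X(x) = \lim_{y \to \infty} g(x)\,h(y) = g(x),
\qquad
F_Y(y) = h(y),
\qquad\text{so}\quad
F(x, y) = F_X(x)\,F_Y(y).
\]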

Conditional independence
Conditioning ↔ differentiation
– e.g.: X and Y are conditionally dependent given Z
– e.g.: X and Y are conditionally independent given Z
Conditional independence ↔ no path contains observed variables
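To see how conditioning can create dependence (a sketch under the factorization above, ours): if F(x, y, z) = g(x, z) h(y, z), differentiating with respect to the observed z gives
\[
\partial_z \big[ g(x, z)\, h(y, z) \big]
= (\partial_z g)\, h + g\, (\partial_z h),
\]
a sum that in general no longer factors into a function of x times a function of y, so X and Y become dependent given Z.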

Check: a toy example
– Markov random fields
– Required “Bayes net”

Inference by message passing
Conditioning ↔ differentiation:
– Replace the sum in sum-product with differentiation
– Recursively apply the product rule via message passing with messages μ, λ
Derivative-Sum-Product (DSP) …
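A hedged numeric sketch of the "conditioning ↔ differentiation" step (ours; it uses finite differences in place of DSP's symbolic product-rule recursion, and the factor is the FGM-copula function from the earlier sketch):

```python
from scipy.stats import norm

def F(x, y, theta=0.5):
    # Two-variable toy CDN: a single FGM-copula function (a valid CDF).
    u, v = norm.cdf(x), norm.cdf(y)
    return u * v * (1.0 + theta * (1.0 - u) * (1.0 - v))

def d_dy(f, x, y, h=1e-5):
    # Central finite difference standing in for the symbolic derivative.
    return (f(x, y + h) - f(x, y - h)) / (2.0 * h)

# Conditional CDF: P(X <= x | Y = y) = dF/dy(x, y) / dF/dy(+inf, y).
x_q, y_obs = 0.3, -0.7
p = d_dy(F, x_q, y_obs) / d_dy(F, 20.0, y_obs)  # 20 sigma ~ +infinity
print("P(X <= %.1f | Y = %.1f) ~ %.4f" % (x_q, y_obs, p))
```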

Derivative-sum-product
In a CDN: …
In a factor graph: …
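The name comes from the product rule: differentiating a product of CDN functions yields a sum of products of derivatives, just as sum-product sums products of factors. For two functions sharing x (our illustration):
\[
\partial_x \big[ \phi_a(x)\, \phi_b(x) \big]
= \phi_a'(x)\, \phi_b(x) + \phi_a(x)\, \phi_b'(x),
\]
and DSP organizes the recursive bookkeeping of such terms as messages on the graph.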

Ranking in multiplayer gaming
e.g.: Halo 2 game with 7 players, 3 teams
– Player skill functions
– Player performance
– Team performance
Given game outcomes, update player skills as a function of all player/team performances

Ranking in multiplayer gaming
= Local cumulative model linking team rank r_n with player performances x_n
e.g.: Team 2 has rank 2

Ranking in multiplayer gaming
Enforce stochastic orderings between teams via h
= Pairwise model of team ranks r_n, r_{n+1}

Ranking in multiplayer gaming
CDN functions = Gaussian CDFs
Skill updates:
Prediction:
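An illustrative sketch (ours; the exact functional forms used in the talk are in the omitted figures): with Gaussian-CDF CDN functions, a hypothetical pairwise ordering factor and a win-probability prediction both reduce to evaluations of the normal CDF.

```python
from scipy.stats import norm

def order_factor(t_lo, t_hi, sigma=1.0):
    # Hypothetical pairwise CDN function on team performances: close to 1
    # when the higher-ranked team's performance t_hi exceeds t_lo,
    # enforcing a stochastic ordering between adjacent ranks.
    return norm.cdf((t_hi - t_lo) / sigma)

# Prediction, assuming Gaussian beliefs over team performances
# t_A ~ N(mu_A, s_A^2) and t_B ~ N(mu_B, s_B^2), independent:
mu_A, s_A, mu_B, s_B = 1.2, 0.8, 0.9, 0.8
p = norm.cdf((mu_A - mu_B) / (s_A**2 + s_B**2) ** 0.5)
print("P(team A outperforms team B) ~ %.3f" % p)
```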

Results
Previous methods for ranking players:
– ELO (Elo, 1978)
– TrueSkill (Graepel, Minka and Herbrich, 2006)
After message-passing…

Summary
– The CDN as a graphical model for CDFs
– Unique conditional independence structure
– Marginalization ↔ maximization: global normalization can be enforced locally
– Conditioning ↔ differentiation: efficient inference with Derivative-Sum-Product
– Application to the Halo 2 Beta dataset

Discussion
– Need to be careful when applying to ordinal discrete variables…
– Principled method for learning CDNs
– Variational principle? (loopy DSP seems to work well)
– Future applications: hypothesis testing, document retrieval, collaborative filtering, biological sequence search, …

Thanks! Questions?

Null-dependence in CDNs
Given that X_s, X_t, X_u are sets of discrete variables over the alphabet A = {0,…,K}: if X_s and X_t are marginally independent and X_u = (0,…,0), then X_s and X_t are conditionally independent given X_u.
Proof:
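The proof itself is in the omitted figure; presumably its first step is the observation (ours) that (0,…,0) is the minimal element of the alphabet, so the conditioning event coincides with a CDF evaluation:
\[
P(X_u = \mathbf{0},\, X_s \le \mathbf{x}_s,\, X_t \le \mathbf{x}_t)
= F(\mathbf{x}_s, \mathbf{x}_t, \mathbf{0}).
\]
Since marginal independence means no CDN function joins X_s and X_t, fixing x_u = 0 leaves a product of a function of x_s and a function of x_t, which normalizes to a factored conditional.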

The CDN as a random field
– Take an undirected graph G = (V, E) (NOT an MRF/CRF!)
– Define CUT(A, B) as a set of nodes which, if removed, cuts G into two parts containing A and B
– If the random field satisfies:
– Necessary and sufficient for potentials to be defined over connected components of G (proof similar to that of the Hammersley-Clifford theorem)

Interpretation of skill updates
For any given player, let … denote the outcomes of games he/she has played previously. Then the skill function corresponds to …

Derivative-Sum-Product
Message from function to variable: …
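The formula is in the omitted figure; a hedged reconstruction from the product rule (consistent with the definitions above, not verbatim from the talk): the function-to-variable message differentiates the local function times the incoming messages with respect to all other neighbors,
\[
\mu_{s \to \alpha}(x)
= \partial_{\mathbf{x}_{N(s)\setminus\alpha}}
\Big[ \phi_s(\mathbf{x}_{N(s)}) \prod_{\beta \in N(s)\setminus\alpha} \mu_{\beta \to s}(x_\beta) \Big]
= \sum_{A \subseteq N(s)\setminus\alpha}
\big(\partial_{\mathbf{x}_A} \phi_s\big)
\prod_{\beta \in A} \mu_{\beta \to s}
\prod_{\beta \in (N(s)\setminus\alpha)\setminus A} \lambda_{\beta \to s},
\]
where \(\lambda_{\beta \to s} = \partial_{x_\beta}\, \mu_{\beta \to s}\): each term of the product rule lets the derivative in \(x_\beta\) hit either \(\phi_s\) or the incoming message.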

Derivative-Sum-Product
Message from variable to function: …
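Similarly hedged (a reconstruction, not the slide's figure): the variable-to-function messages multiply the incoming function messages and differentiate the product, assuming the messages are nonzero,
\[
\mu_{\alpha \to s}(x_\alpha) = \prod_{t \in N(\alpha)\setminus s} \mu_{t \to \alpha}(x_\alpha),
\qquad
\lambda_{\alpha \to s}(x_\alpha)
= \partial_{x_\alpha}\, \mu_{\alpha \to s}(x_\alpha)
= \mu_{\alpha \to s}(x_\alpha) \sum_{t \in N(\alpha)\setminus s}
\frac{\lambda_{t \to \alpha}(x_\alpha)}{\mu_{t \to \alpha}(x_\alpha)}.
\]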