Tractable Nonparametric Bayesian Inference in Poisson Processes with Gaussian Process Intensity, by Ryan P. Adams, Iain Murray, and David J.C. MacKay (ICML 2009)

Presentation transcript:

Tractable Nonparametric Bayesian Inference in Poisson Processes with Gaussian Process Intensity, by Ryan P. Adams, Iain Murray, and David J.C. MacKay (ICML 2009). Presented by Lihan He, ECE, Duke University, July 31, 2009.

Outline
- Introduction
- The model
  - Poisson distribution
  - Poisson process
  - Gaussian process
  - Gaussian Cox process
  - Generating data from the Gaussian Cox process
- Inference by MCMC
- Experimental results
- Conclusion

Introduction: Inhomogeneous Poisson process
- A counting process
- Rate of arrivals varies in time or space
- Intensity function λ(s)
- Applications: astronomy, forestry, birth models, etc.

Introduction: How to model the intensity function λ(s)?
- Use a Gaussian process: a nonparametric approach, known as the Gaussian Cox process
- Difficulty: inference is intractable (a doubly-stochastic process); previous research relies on approximation methods
- This paper: tractable inference
  - Introduces latent variables
  - MCMC inference via the Metropolis-Hastings method
  - No approximation

Model: Poisson distribution
- A discrete random variable X (the number of event arrivals) has p.m.f. P(X = k) = λ^k e^{−λ} / k!  for k = 0, 1, 2, …
- Parameter λ > 0, with E[X] = λ
- Conjugate prior: Gamma distribution
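
For reference, the conjugacy claimed in the last bullet is the standard Gamma-Poisson update (a textbook result, not shown on the slide):

```latex
% Standard Gamma-Poisson conjugate update (not from the slide).
\lambda \sim \mathrm{Gamma}(\alpha, \beta), \quad k \mid \lambda \sim \mathrm{Poisson}(\lambda)
\;\Longrightarrow\;
\lambda \mid k \sim \mathrm{Gamma}(\alpha + k,\; \beta + 1)
```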

Model: Poisson process
- The Poisson process is parameterized by an intensity function λ(s) such that the random number of events N(T) within a subregion T is Poisson distributed with parameter λ_T = ∫_T λ(s) ds, i.e. P(N(T) = k) = λ_T^k e^{−λ_T} / k!  for k = 0, 1, 2, …
- N(0) = 0
- The numbers of events in disjoint subregions are independent
- No events happen simultaneously
- Likelihood function: p({s_k}, k = 1:K | λ(s)) = exp(−∫_T λ(s) ds) ∏_{k=1:K} λ(s_k)

Model: Poisson process
[Figures: examples of a one-dimensional temporal Poisson process and a two-dimensional spatial Poisson process]
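
As a concrete illustration of the simplest (homogeneous) case, the sketch below draws a Poisson process with constant rate on an interval. This example is not from the slides; the function name and settings are placeholders.

```python
# Illustrative sketch (assumed example, not from the slides): sampling a
# homogeneous Poisson process with constant rate `lam` on [0, T].
import numpy as np

def sample_homogeneous_pp(lam, T, rng=None):
    rng = rng or np.random.default_rng(0)
    n = rng.poisson(lam * T)                     # number of events is Poisson(lam * T)
    return np.sort(rng.uniform(0.0, T, size=n))  # given n, event times are i.i.d. Uniform(0, T)

events = sample_homogeneous_pp(lam=2.0, T=10.0)
print(len(events), events[:5])
```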

Model: Gaussian Cox process
Use a Gaussian process prior for the intensity function λ(s) via the transformation λ(s) = λ* σ(g(s)), where
- λ*: upper bound on λ(s)
- σ: the logistic (sigmoid) function, σ(z) = 1 / (1 + e^{−z})
- g(s): a random scalar function drawn from a Gaussian process prior

Model: Gaussian process
- Definition: let g = (g(x_1), g(x_2), …, g(x_N)) be an N-dimensional vector of function values evaluated at N points x_{1:N}. p(g) is a Gaussian process if, for any finite subset {x_1, …, x_N}, the marginal distribution over that finite subset g is a multivariate Gaussian distribution.
- Nonparametric prior (no parametric form is imposed on g, such as g = w^T x)
- An infinite-dimensional prior (the dimension N is flexible), yet we only ever need to work with a finite-dimensional problem
- Fully specified by the mean function and the covariance function; the mean function is usually defined to be zero
- Example covariance function
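
The example covariance function on this slide did not survive the transcript. A common choice, shown purely for illustration, is the squared-exponential kernel; the sketch below draws one finite-dimensional GP sample (the kernel choice and hyperparameters are assumptions, not taken from the slides).

```python
# Illustrative sketch: one draw from a zero-mean GP with a squared-exponential
# kernel (kernel and hyperparameters are assumptions, not from the slides).
import numpy as np

def se_kernel(x, xp, lengthscale=1.0, variance=1.0):
    # k(x, x') = variance * exp(-(x - x')^2 / (2 * lengthscale^2))
    d = x[:, None] - xp[None, :]
    return variance * np.exp(-0.5 * (d / lengthscale) ** 2)

rng = np.random.default_rng(0)
xs = np.linspace(0.0, 10.0, 100)                   # finite set of evaluation points
K = se_kernel(xs, xs) + 1e-8 * np.eye(len(xs))     # jitter for numerical stability
g = rng.multivariate_normal(np.zeros(len(xs)), K)  # g(x) evaluated at xs
```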

Model: Generating data from the Gaussian Cox process
Objective: generate a set of events {s_k}, k = 1:K, on some subregion T, drawn from a Poisson process with intensity function λ(s) = λ* σ(g(s)).
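
The generative algorithm shown on this slide is lost in the transcript. The sketch below follows the usual thinning construction for this model (sample a homogeneous process at rate λ*, then keep each candidate with probability σ(g(s))); the kernel and all settings are illustrative assumptions, not the paper's code.

```python
# Illustrative thinning-based sketch for generating data from the sigmoidal
# Gaussian Cox process on [0, T] (settings are assumptions, not from the slides).
import numpy as np

def se_kernel(x, xp, lengthscale=1.0, variance=1.0):
    d = x[:, None] - xp[None, :]
    return variance * np.exp(-0.5 * (d / lengthscale) ** 2)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def sample_sgcp(lam_star, T, rng=None):
    rng = rng or np.random.default_rng(1)
    # 1. Candidate events from a homogeneous Poisson process with rate lam_star.
    n = rng.poisson(lam_star * T)
    candidates = rng.uniform(0.0, T, size=n)
    # 2. Draw g at the candidate locations from the GP prior.
    K = se_kernel(candidates, candidates) + 1e-8 * np.eye(n)
    g = rng.multivariate_normal(np.zeros(n), K)
    # 3. Keep each candidate with probability sigma(g(s)); the kept points follow
    #    a Poisson process with intensity lam_star * sigma(g(s)).
    keep = rng.uniform(size=n) < sigmoid(g)
    return np.sort(candidates[keep]), np.sort(candidates[~keep])

events, thinned = sample_sgcp(lam_star=5.0, T=10.0)
```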

Inference
Given a set of K events {s_k}, k = 1:K, on some subregion T as observed data, what is the posterior distribution over λ(s)?
- Poisson process likelihood function: p({s_k} | λ(s)) = exp(−∫_T λ(s) ds) ∏_{k=1:K} λ(s_k)
- Posterior: p(λ(s) | {s_k}) ∝ p({s_k} | λ(s)) p(λ(s))
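
Substituting the Gaussian Cox process intensity into this likelihood makes the source of the intractability explicit; this is a reconstruction of the reasoning rather than a formula copied from the slide:

```latex
% Substituting lambda(s) = lambda* sigma(g(s)) into the Poisson-process
% likelihood: the integral of the random function g over T has no closed form,
% which is what makes naive posterior inference intractable.
p\big(\{s_k\}_{k=1}^{K} \mid g, \lambda^{*}\big)
  = \exp\!\Big(-\lambda^{*}\!\int_{T}\sigma\big(g(s)\big)\,ds\Big)
    \prod_{k=1}^{K} \lambda^{*}\,\sigma\big(g(s_k)\big)
```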

Inference
Augment the posterior distribution by introducing latent variables to make the MCMC-based inference tractable.
- Observed data: the event locations {s_k}, k = 1:K
- Introduced latent variables:
  - Total number of thinned events, M
  - Locations of the thinned events, {s̃_m}, m = 1:M
  - Values of the function g(s) at the thinned events
  - Values of the function g(s) at the observed events
- Complete likelihood
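
The complete-likelihood formula on this slide did not survive the transcript. The following reconstruction is what the thinning construction above implies, with μ(T) denoting the measure of T and the GP prior on the function values factored out; it should be checked against the paper:

```latex
% Reconstructed complete-data likelihood implied by the thinning construction
% (check against Adams et al., ICML 2009); mu(T) is the measure of the region T.
p\big(\{s_k\}_{k=1}^{K},\, M,\, \{\tilde{s}_m\}_{m=1}^{M} \mid g, \lambda^{*}\big)
  = (\lambda^{*})^{K+M}\, e^{-\lambda^{*}\mu(T)}
    \prod_{k=1}^{K} \sigma\big(g(s_k)\big)
    \prod_{m=1}^{M} \sigma\big(-g(\tilde{s}_m)\big)
```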

Inference
MCMC inference: sample M, the thinned-event locations {s̃_m}, the function values g_{M+K}, and λ* in turn.
Sample M and {s̃_m}: Metropolis-Hastings method.
Metropolis-Hastings method: draw a new sample x_{t+1} based on the last sample x_t and a proposal distribution q(x′; x_t)
1. Sample x′ from the proposal q(x′; x_t)
2. Compute the acceptance ratio a = min{1, [p(x′) q(x_t; x′)] / [p(x_t) q(x′; x_t)]}
3. Sample r ~ U(0, 1)
4. If r < a, accept x′ as the new sample, i.e., x_{t+1} = x′; otherwise, reject x′ and let x_{t+1} = x_t.
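
A minimal sketch of the generic Metropolis-Hastings step described above; the target density and random-walk proposal here are placeholders for illustration, not the paper's actual kernels (a symmetric proposal is assumed, so the q terms cancel).

```python
# Minimal Metropolis-Hastings sketch for the generic step above.
# Target and proposal are placeholders, not the paper's kernels.
import numpy as np

def mh_step(x_t, log_p, rng, step=0.5):
    x_prop = x_t + step * rng.normal()            # sample x' ~ q(x' ; x_t), symmetric
    log_a = log_p(x_prop) - log_p(x_t)            # acceptance ratio (q terms cancel)
    if np.log(rng.uniform()) < min(0.0, log_a):   # accept with probability min(1, a)
        return x_prop
    return x_t

rng = np.random.default_rng(0)
log_target = lambda x: -0.5 * x ** 2              # standard normal target, for illustration
x, samples = 0.0, []
for _ in range(1000):
    x = mh_step(x, log_target, rng)
    samples.append(x)
```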

Inference
Sample M: Metropolis-Hastings method
- Proposal distribution for inserting one thinned event
- Proposal distribution for deleting one thinned event
- Acceptance ratio for inserting one thinned event
- Acceptance ratio for deleting one thinned event

Inference
Sample the thinned-event locations {s̃_m}: Metropolis-Hastings method
- Acceptance ratio for sampling a thinned event
Sample g_{M+K} (the function values at the M thinned and K observed events): Hamiltonian Monte Carlo method (Duane et al., 1987)
Sample λ*: place a Gamma prior on λ*; the prior is conjugate, so the posterior can be derived analytically.
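
Given the reconstructed complete likelihood above, the conjugate update the slide refers to would take the following Gamma form, assuming a Gamma(α, β) prior on λ* (a reconstruction, not copied from the slide):

```latex
% Conjugate update for lambda* implied by the complete likelihood above,
% assuming a Gamma(alpha, beta) prior (reconstruction, not from the slide).
p(\lambda^{*} \mid K, M) \propto
  (\lambda^{*})^{K+M} e^{-\lambda^{*}\mu(T)} \cdot (\lambda^{*})^{\alpha-1} e^{-\beta\lambda^{*}}
\;\Longrightarrow\;
\lambda^{*} \mid K, M \sim \mathrm{Gamma}\big(\alpha + K + M,\; \beta + \mu(T)\big)
```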

Experimental results: Synthetic data
[Figures: three synthetic data sets with 53, 29, and 235 events]

Experimental results: Coal mining disaster data
191 coal mine explosions in Britain, from 1851 to 1962

Experimental results: Redwoods data
195 redwood locations

Conclusion
- Proposed a novel method of inference for the Gaussian Cox process that avoids the intractability of such models;
- Uses a generative prior that allows exact Poisson data to be generated from a random intensity function drawn from a transformed Gaussian process;
- Uses an MCMC method to infer the posterior distribution of the intensity function;
- Achieves better results than competing methods;
- Has significant computational demands: infeasible for data sets with more than several thousand events.