Characterizing the Function Space for Bayesian Kernel Models. Natesh S. Pillai, Qiang Wu, Feng Liang, Sayan Mukherjee and Robert L. Wolpert. JMLR 2007.


Characterizing the Function Space for Bayesian Kernel Models Natesh S. Pillai, Qiang Wu, Feng Liang, Sayan Mukherjee and Robert L. Wolpert JMLR 2007 Presented by: Mingyuan Zhou, Duke University, January 20, 2012

Outline
Reproducing kernel Hilbert space (RKHS)
Bayesian kernel model
–Gaussian processes
–Lévy processes: Gamma process, Dirichlet process, Stable process
–Computational and modeling considerations
Posterior inference
Discussion

RKHS In functional analysis (a branch of mathematics), a reproducing kernel Hilbert space is a Hilbert space of functions in which pointwise evaluation is a continuous linear functional. Equivalently, these are the spaces of functions that can be defined by reproducing kernels.
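As a small illustration of the definition (a sketch, not from the slides; the Gaussian kernel, centers, and coefficients below are arbitrary choices), a function built from kernel sections f = Σᵢ cᵢ k(·, xᵢ) can be evaluated pointwise, and the reproducing property ⟨f, k(·, x)⟩_H = f(x) makes that evaluation a bounded linear functional:

```python
import numpy as np

def k(x, y, sigma=1.0):
    """Gaussian reproducing kernel k(x, y) = exp(-(x - y)^2 / (2 sigma^2))."""
    return np.exp(-(x - y) ** 2 / (2 * sigma ** 2))

# A function in the span of kernel sections: f = sum_i c_i k(., x_i)
# (centers and coefficients are illustrative).
centers = np.array([-1.0, 0.0, 1.0])
coeffs = np.array([0.5, -1.0, 2.0])

def f(x):
    return float(np.sum(coeffs * k(x, centers)))

# Reproducing property: <f, k(., x)>_H = f(x), so pointwise evaluation
# is continuous: |f(x)| <= ||f||_H * sqrt(k(x, x)).
print(f(0.0))  # 2.5*exp(-0.5) - 1 ≈ 0.5163
```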

A finite kernel-based solution The direct adoption of the finite representation is not a fully Bayesian model, since it depends on the (arbitrary) training-data sample size. In addition, this prior distribution is supported on a finite-dimensional subspace of the RKHS. Our coherent, fully Bayesian approach requires the specification of a prior distribution over the entire space H.
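A minimal sketch of the finite representation the slide criticizes (data, kernel, and variance choices below are illustrative, not the paper's): put a Gaussian prior on the function values at the n training points via the Gram matrix, so the prior lives only on an n-dimensional subspace of the RKHS.

```python
import numpy as np

rng = np.random.default_rng(0)
# Hypothetical training set (sizes and distributions are illustrative only)
X = np.linspace(-2.0, 2.0, 20)
y = np.sin(X) + 0.1 * rng.standard_normal(20)

K = np.exp(-(X[:, None] - X[None, :]) ** 2 / 2.0)  # Gram matrix K_ij = k(x_i, x_j)
tau2, sigma2 = 1.0, 0.01  # prior scale on f and observation-noise variance

# Finite representation: f = (f(x_1), ..., f(x_n)) with prior f ~ N(0, tau2*K).
# Posterior mean given y under Gaussian noise: tau2*K (tau2*K + sigma2*I)^{-1} y.
f_post = tau2 * K @ np.linalg.solve(tau2 * K + sigma2 * np.eye(20), y)
```

Note the dependence on the sample size n = 20 everywhere; change the data set and the prior itself changes, which is the slide's objection.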

Mercer kernel
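The equations on this slide did not survive transcription. As a hedged reconstruction of the standard statement (not necessarily the slide's exact content): Mercer's theorem expands a continuous, symmetric, positive semi-definite kernel on a compact space in the eigensystem of its integral operator, and this expansion describes the induced RKHS.

```latex
% Mercer expansion of K and its integral operator L_K:
K(x, y) = \sum_{j=1}^{\infty} \lambda_j\, \phi_j(x)\, \phi_j(y),
\qquad
L_K f(x) = \int_{\mathcal{X}} K(x, u)\, f(u)\, d\mu(u),
\qquad
L_K \phi_j = \lambda_j \phi_j .
% The induced RKHS and its norm:
\mathcal{H}_K = \Big\{ f = \sum_j a_j \phi_j \;:\;
  \|f\|_K^2 = \sum_j \frac{a_j^2}{\lambda_j} < \infty \Big\}.
```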

Bayesian kernel model
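The slide's formula is missing from the transcript. A hedged reconstruction of the model the paper studies: expand the unknown function against the kernel with a random signed measure Z, and place the prior on Z.

```latex
% Bayesian kernel model: integral representation with a random
% signed measure Z,
f(x) = \int_{\mathcal{X}} K(x, u)\, Z(du), \qquad x \in \mathcal{X},
% e.g. Z(du) = z(u)\,\mu(du) with z a Gaussian process, or Z a
% pure-jump Levy random measure; the prior on Z induces a prior on f.
```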

Properties of the RKHS

Bayesian kernel models and integral operators
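The content of this slide is also lost; as a hedged sketch of the construction its title refers to: when Z has a density z with respect to μ, the model is the image of z under the integral operator, and the paper relates this function class to the RKHS induced by K.

```latex
% With Z(du) = z(u)\,\mu(du), the model class is the image of L_K:
\mathcal{G} = \Big\{ f(x) = \int_{\mathcal{X}} K(x, u)\, z(u)\, \mu(du)
  \;:\; z \in L^2(\mu) \Big\},
% and the paper characterizes when \mathcal{G} recovers the RKHS
% \mathcal{H}_K induced by the kernel K.
```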

Two concrete examples

Bayesian kernel models

Gaussian processes
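As a generic illustration of the Gaussian-process prior (a sketch; the RBF covariance and lengthscale 0.1 are arbitrary choices, not the slide's): draw sample paths by factoring the covariance matrix on a grid.

```python
import numpy as np

rng = np.random.default_rng(1)
# Zero-mean GP prior with an RBF covariance on a grid of 50 points.
x = np.linspace(0.0, 1.0, 50)
K = np.exp(-(x[:, None] - x[None, :]) ** 2 / (2 * 0.1 ** 2))
L = np.linalg.cholesky(K + 1e-6 * np.eye(50))  # jitter for numerical stability
samples = L @ rng.standard_normal((50, 3))     # three prior draws f ~ GP(0, K)
```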

Lévy processes
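A hedged sketch of the pure-jump alternative to the Gaussian case (all distributional choices below are illustrative, not the paper's): a compound-Poisson random measure Z places a Poisson number of weighted atoms, and f is the kernel smoothed against those atoms.

```python
import numpy as np

rng = np.random.default_rng(2)
# Compound-Poisson sketch of a pure-jump Levy random measure Z:
# J ~ Poisson(nu) jumps at locations u_j with weights w_j.
J = int(rng.poisson(10))
u = rng.uniform(-2.0, 2.0, size=J)
w = rng.standard_normal(J)

def f(x):
    """f(x) = int K(x,u) Z(du) = sum_j w_j K(x, u_j) for atomic Z."""
    return float(np.sum(w * np.exp(-(x - u) ** 2 / 2.0)))
```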

Poisson random fields

Dirichlet Process
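As a concrete reminder of the Dirichlet process (a sketch via the stick-breaking construction; the concentration α = 2, base measure N(0, 1), and truncation level 100 are arbitrary illustrative choices):

```python
import numpy as np

rng = np.random.default_rng(3)
alpha, T = 2.0, 100  # concentration parameter and truncation level

# Sethuraman stick-breaking: w_k = v_k * prod_{j<k} (1 - v_j)
v = rng.beta(1.0, alpha, size=T)
w = v * np.cumprod(np.concatenate(([1.0], 1.0 - v[:-1])))
atoms = rng.standard_normal(T)  # theta_k ~ base measure G0 = N(0, 1)
# G = sum_k w_k delta_{theta_k} approximates a draw G ~ DP(alpha, G0)
```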

Symmetric alpha-stable processes
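The slide's formula is not in the transcript; a hedged reconstruction of the standard definition: a symmetric α-stable random measure is specified by its characteristic function, interpolating between Gaussian and heavy-tailed behavior.

```latex
% Symmetric alpha-stable random measure Z with scale \sigma:
\mathbb{E}\big[ e^{\,i \omega Z(A)} \big]
  = \exp\!\big( -\sigma^{\alpha}\, \mu(A)\, |\omega|^{\alpha} \big),
\qquad \alpha \in (0, 2],
% recovering the Gaussian case at \alpha = 2 and the Cauchy case at \alpha = 1.
```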

Computational and modeling considerations Finite approximation for Gaussian processes Discretization for pure jump processes
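A minimal sketch of the discretization idea for the pure-jump case (the grid, kernel, and increment distribution below are illustrative assumptions): replace the integral f(x) = ∫ K(x, u) Z(du) by a finite sum over independent increments on a fixed grid.

```python
import numpy as np

rng = np.random.default_rng(4)
# Fixed grid of u values and independent increments Z(du) on each cell.
grid = np.linspace(-2.0, 2.0, 41)
du = grid[1] - grid[0]
z = rng.standard_normal(grid.size) * np.sqrt(du)  # increments scale with cell size

def f(x):
    """Discretized f(x) = sum_m K(x, u_m) * Z(du_m)."""
    return float(np.sum(np.exp(-(x - grid) ** 2 / 2.0) * z))
```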

Posterior inference Lévy process model –Transition probability proposal –The MCMC algorithm
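The transcript drops the algorithmic details. As a generic illustration of the Metropolis–Hastings step such a sampler is built on (the target below is a stand-in N(0, 1) log posterior, not the paper's posterior over jump locations and weights):

```python
import numpy as np

rng = np.random.default_rng(5)

def log_post(theta):
    # Stand-in log posterior (standard normal). The paper's sampler
    # targets the Levy process's jumps; the accept/reject step is the same.
    return -0.5 * theta ** 2

theta, chain = 0.0, []
for _ in range(5000):
    prop = theta + 0.5 * rng.standard_normal()  # symmetric random-walk proposal
    # Metropolis acceptance: accept with probability min(1, pi(prop)/pi(theta))
    if np.log(rng.uniform()) < log_post(prop) - log_post(theta):
        theta = prop
    chain.append(theta)
```

With a symmetric proposal the Hastings correction cancels, leaving the simple log-ratio test above.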

Classification of gene expression data

Discussion This paper formulates a coherent Bayesian perspective for regression using an RKHS model. The paper states an equivalence, under certain conditions, between the function class G and the RKHS induced by the kernel. This implies: –(a) a theoretical foundation for the use of Gaussian processes, Dirichlet processes, and other jump processes in non-parametric Bayesian kernel models; –(b) an equivalence between regularization approaches and the Bayesian kernel approach; –(c) an illustration of why placing a prior on the distribution is a natural approach in Bayesian non-parametric modelling. A better understanding of this interface may lead to a better understanding of the following research problems: –Posterior consistency –Priors on function spaces –Comparison of process priors for modeling –Numerical stability and robust estimation