Bayesian kernel mixtures for counts


Bayesian kernel mixtures for counts Antonio Canale & David B. Dunson Presented by Yingjian Wang Apr. 29, 2011

Outline Existing models for counts and their drawbacks; univariate rounded kernel mixture priors; simulation study of the univariate model; multivariate rounded kernel mixture priors; experiment with the multivariate model.

Modeling of counts Mixture of Poissons: a) not a nonparametric approach; b) only accounts for overdispersion, i.e. cases where the variance is greater than the mean.
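The overdispersion restriction follows from the law of total variance: for a mixture of Poissons, Var(y) = E[lambda] + Var(lambda) >= E[y]. A minimal check, with illustrative parameters that are not from the paper:

```python
# Two-component mixture of Poissons: y ~ w*Pois(lam1) + (1-w)*Pois(lam2).
# By the law of total variance, Var(y) = E[lam] + Var(lam) >= E[y], so a
# Poisson mixture can capture overdispersion but never underdispersion.
# The parameters below are arbitrary illustrative choices.
w, lam1, lam2 = 0.5, 2.0, 8.0

mean = w * lam1 + (1 - w) * lam2                        # E[y] = E[lam]
var = mean + w * lam1**2 + (1 - w) * lam2**2 - mean**2  # E[lam] + Var(lam)

print(mean, var)  # variance strictly exceeds the mean whenever lam1 != lam2
```

Equality holds only when lam1 = lam2, i.e. when the mixture degenerates to a single Poisson.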

Modeling of counts (2) DP mixture of Poissons / multinomial kernel: a) nonparametric, but still unsuitable for underdispersed cases; b) with a multinomial kernel, the dimension of the probability vector equals the number of support points, which causes overfitting.

Modeling of counts (3) DP with a Poisson base measure: a) no allowance for smooth deviations from the base. Motivation: continuous densities can be accurately approximated using mixtures of Gaussian kernels. Idea: use count kernels induced through rounding of continuous kernels.

Univariate rounded kernel

Univariate rounded kernel (2) Existence and consistency: the rounding map g(.) maintains Kullback-Leibler neighborhoods, so support and consistency results for the continuous mixture transfer to the induced count mixture.

Examples of rounded kernels Rounded Gaussian kernel; other kernels arise from rounding log-normal, gamma, or Weibull densities.
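A sketch of the rounded Gaussian kernel: the count pmf is obtained as differences of the Gaussian CDF at consecutive thresholds. The threshold choice a_0 = -inf, a_j = j - 1 and the (mu, sigma) values below are illustrative assumptions, not the paper's elicited values:

```python
from math import erf, sqrt, inf

def Phi(x):
    """Standard normal CDF."""
    if x == inf:
        return 1.0
    if x == -inf:
        return 0.0
    return 0.5 * (1.0 + erf(x / sqrt(2.0)))

def rounded_gaussian_pmf(j, mu, sigma, a):
    """P(y = j) = Phi((a[j+1] - mu)/sigma) - Phi((a[j] - mu)/sigma)."""
    return Phi((a[j + 1] - mu) / sigma) - Phi((a[j] - mu) / sigma)

# Thresholds a_0 = -inf, a_j = j - 1 for j >= 1 (a simple assumed choice);
# support truncated at 50 for the demo.
a = [-inf] + [float(j - 1) for j in range(1, 52)]

mu, sigma = 3.5, 0.3  # a small sigma yields an underdispersed count kernel
pmf = [rounded_gaussian_pmf(j, mu, sigma, a) for j in range(51)]
mean = sum(j * p for j, p in enumerate(pmf))
var = sum((j - mean) ** 2 * p for j, p in enumerate(pmf))
print(sum(pmf), mean, var)  # pmf sums to ~1; var < mean => underdispersion
```

Unlike the Poisson-based kernels above, this rounded kernel can put its variance well below its mean, which is the point of the construction.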

Eliciting the thresholds

A Gibbs sampling algorithm
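The data-augmentation idea behind such a Gibbs sampler can be sketched in a stripped-down form: a single rounded-Gaussian component with known sigma and a conjugate normal prior on mu (a deliberate simplification of the paper's mixture sampler; all names and values below are illustrative assumptions):

```python
import random
from math import inf
from statistics import NormalDist

# Gibbs sketch: alternate (i) imputing the latent continuous y*_i on the
# rounding interval of the observed count and (ii) a conjugate normal
# update for mu.  Thresholds a_0 = -inf, a_j = j - 1 (an assumed choice).
random.seed(1)
std = NormalDist()

def a(j):
    return -inf if j == 0 else float(j - 1)

def sample_truncnorm(mu, sigma, lo, hi):
    """Inverse-CDF sampling from N(mu, sigma^2) truncated to [lo, hi)."""
    plo = std.cdf((lo - mu) / sigma) if lo != -inf else 0.0
    phi = std.cdf((hi - mu) / sigma)
    u = random.uniform(plo, phi)
    return mu + sigma * std.inv_cdf(u)

y = [2, 3, 3, 4, 3, 2, 4, 3]    # toy counts
sigma, m0, s0 = 1.0, 0.0, 10.0  # known scale, vague N(m0, s0^2) prior on mu
mu = 0.0
for it in range(200):
    # (i) impute latent y*_i on [a(y_i), a(y_i + 1))
    ystar = [sample_truncnorm(mu, sigma, a(yi), a(yi + 1)) for yi in y]
    # (ii) conjugate normal update for mu given the imputed y*
    prec = 1 / s0**2 + len(y) / sigma**2
    mean = (m0 / s0**2 + sum(ystar) / sigma**2) / prec
    mu = random.gauss(mean, (1 / prec) ** 0.5)
print(mu)  # settles near the latent mean consistent with the counts
```

The full sampler adds the usual Dirichlet-process mixture machinery (cluster allocations and stick-breaking or marginal updates) around this same imputation step.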

Experiment with the univariate model Two simulation scenarios and two evaluation criteria; results (figures on the original slides).

Extension to multivariate model
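The multivariate construction rounds each coordinate of a latent Gaussian vector, so dependence among the counts is induced through the latent covariance. A minimal two-dimensional sketch; the correlation, mean, and rounding rule below are illustrative assumptions, not the fitted model:

```python
import random
from math import sqrt

# Round each coordinate of a correlated bivariate latent Gaussian to a
# count; the latent correlation carries over to the counts.
random.seed(0)
rho, mu = 0.9, 3.0  # assumed latent correlation and mean

def draw_pair():
    z1 = random.gauss(0, 1)
    z2 = rho * z1 + sqrt(1 - rho**2) * random.gauss(0, 1)  # 2x2 Cholesky
    # floor each latent coordinate to a count, clamping negatives to 0
    return max(0, int(mu + z1)), max(0, int(mu + z2))

pairs = [draw_pair() for _ in range(2000)]
m1 = sum(p[0] for p in pairs) / len(pairs)
m2 = sum(p[1] for p in pairs) / len(pairs)
cov = sum((p[0] - m1) * (p[1] - m2) for p in pairs) / len(pairs)
print(cov)  # positive: dependence is inherited from the latent Gaussian
```

Mixing over the latent mean and covariance, as in the univariate case, then yields a flexible prior on multivariate count distributions.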

Telecommunication data Data from 2050 SIM cards, with multivariate counts yi = [yi1, yi2, yi3, yi4, yi5]; the RMG (rounded multivariate Gaussian) model is compared with a generalized additive model (GAM).