Dirichlet process tutorial


1 Dirichlet process tutorial
Bryan Russell

2 Goals
Intuitive understanding of Dirichlet processes and their applications.
Minimize math and maximize pictures.
Motivate you to go through the math to understand the implementation.

3 Disclaimers
What I’m about to tell you applies more generally.
We’ll gloss over lots of math (especially measure theory); look at the original papers for details.

4 What’s this good for?
A principled, Bayesian method for fitting a mixture model with an unknown number of clusters.
Because it’s Bayesian, we can build hierarchies (e.g. hierarchical Dirichlet processes, HDPs) and integrate with other random variables in a principled way.

5 Aren’t there other ways to count the number of clusters?

6 Gaussian mixture model, revisited

7 Gaussian mixture model, revisited

8 Gaussian mixture model, revisited

9 Let us generate data points…

10 Multinomial weights: prior probabilities of the mixture components

11 For each data point, choose a cluster (mixture component) h

12 Generate each point x from the Gaussian component h
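
The generative steps on slides 9 to 12 can be sketched in a few lines of NumPy. This is only an illustration under assumed values: the weights, means, covariances, and the helper name `sample_gmm` are made up here and do not come from the slides.

```python
import numpy as np

def sample_gmm(n, weights, means, covs, seed=0):
    """Draw n points from a Gaussian mixture; returns (x, h)."""
    rng = np.random.default_rng(seed)
    weights = np.asarray(weights, dtype=float)
    # For each data point, choose a component index h from the multinomial weights.
    h = rng.choice(len(weights), size=n, p=weights)
    # Generate each point x from the Gaussian component selected by h.
    x = np.stack([rng.multivariate_normal(means[k], covs[k]) for k in h])
    return x, h

# Illustrative (assumed) parameters for three 2-D components.
weights = [0.5, 0.3, 0.2]
means = [np.array([0.0, 0.0]), np.array([5.0, 5.0]), np.array([-5.0, 5.0])]
covs = [np.eye(2), 0.5 * np.eye(2), 2.0 * np.eye(2)]
x, h = sample_gmm(500, weights, means, covs)
```

Here `h` holds the component chosen for each point and `x` the generated points, so a scatter plot of `x` colored by `h` would resemble the pictures on these slides.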

13 Let us be more Bayesian…
Put a prior over the mixture parameters.
For Gaussian mixtures, this is a normal inverse-Wishart density.
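
To make the normal inverse-Wishart prior concrete, here is a rough sketch of drawing one (mean, covariance) pair from it using NumPy only. The hyperparameter names (`mu0`, `kappa`, `nu`, `psi`) and their values are assumptions for illustration, and the sketch requires an integer `nu` no smaller than the dimension.

```python
import numpy as np

def sample_niw(mu0, kappa, nu, psi, rng):
    """Draw (mu, Sigma) from a normal inverse-Wishart prior (integer nu >= dim)."""
    d = len(mu0)
    # A Wishart(nu, psi^{-1}) draw is a sum of nu outer products of N(0, psi^{-1}) vectors.
    z = rng.multivariate_normal(np.zeros(d), np.linalg.inv(psi), size=nu)
    wishart = z.T @ z
    # Inverting a Wishart draw gives an inverse-Wishart(nu, psi) covariance.
    sigma = np.linalg.inv(wishart)
    sigma = (sigma + sigma.T) / 2.0  # remove round-off asymmetry
    # Given Sigma, the mean is Gaussian around mu0 with covariance Sigma / kappa.
    mu = rng.multivariate_normal(mu0, sigma / kappa)
    return mu, sigma

rng = np.random.default_rng(0)
# Assumed hyperparameters for a 2-D example.
mu, sigma = sample_niw(np.zeros(2), kappa=1.0, nu=5, psi=np.eye(2), rng=rng)
```

Each call returns the parameters of one Gaussian, so repeated calls give a collection of candidate cluster shapes drawn from the prior.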

14 Suppose we do not know the number of clusters
We could sample Gaussian parameters for each data point.
However, the parameters may all be unique, i.e. there is one Gaussian component for each data point, which is overfitting.

15 Dirichlet processes to the rescue
Draws from Dirichlet processes have a nice clustering property:
G ~ DP(α, G0)
G0: base density (here, the normal inverse-Wishart density)
α: concentration parameter
G is a distribution over the parameters and is discrete with probability one.

16 Visualizing Dirichlet process draws
Think of these as prior weights over the parameters
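
One way to picture such a draw is the stick-breaking construction, which the slides do not mention; it is swapped in here as a standard way to simulate a Dirichlet process draw. The sketch below truncates the draw to a finite number of atoms and, as a simplifying assumption, uses a 1-D standard normal as the base density instead of the normal inverse-Wishart. The names `stick_breaking`, `alpha`, and `n_atoms` are hypothetical.

```python
import numpy as np

def stick_breaking(alpha, n_atoms, rng):
    """Truncated stick-breaking weights for a Dirichlet process draw."""
    # Break off a Beta(1, alpha) fraction of whatever stick length remains.
    betas = rng.beta(1.0, alpha, size=n_atoms)
    remaining = np.concatenate(([1.0], np.cumprod(1.0 - betas[:-1])))
    return betas * remaining

rng = np.random.default_rng(0)
weights = stick_breaking(alpha=2.0, n_atoms=200, rng=rng)
# Atom locations come from the base density (assumed: standard normal).
atoms = rng.normal(0.0, 1.0, size=200)
# Sampling parameters from the discrete draw repeats atoms, which is the clustering property.
draws = rng.choice(atoms, size=20, p=weights / weights.sum())
```

A stem plot of `weights` against `atoms` gives the kind of picture shown on this slide: a few heavy spikes and many negligible ones.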

17 DP mixture model
This model has a bias to “bunch” parameters together.
The concentration parameter controls this “bunching” property: lower values yield fewer clusters, and higher values yield more.
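
The effect of the concentration parameter can be checked with a small simulation. The sketch below uses the Chinese restaurant process, a standard sequential view of cluster assignments under a Dirichlet process that is not named in the slides; the function name `crp_num_clusters` and the chosen values are assumptions for illustration.

```python
import numpy as np

def crp_num_clusters(n, alpha, rng):
    """Simulate a Chinese restaurant process; return the number of clusters used."""
    counts = []  # counts[k] = number of points already assigned to cluster k
    for i in range(n):
        # Join existing cluster k with probability counts[k] / (i + alpha),
        # or open a new cluster with probability alpha / (i + alpha).
        probs = np.array(counts + [alpha], dtype=float) / (i + alpha)
        k = rng.choice(len(probs), p=probs)
        if k == len(counts):
            counts.append(1)
        else:
            counts[k] += 1
    return len(counts)

rng = np.random.default_rng(0)
# Average cluster count over 20 runs of 200 points for each concentration value.
avg = {a: np.mean([crp_num_clusters(200, a, rng) for _ in range(20)])
       for a in (0.1, 1.0, 10.0)}
```

Printing `avg` should show the average cluster count growing with the concentration parameter, in line with the slide.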

