
1 A Collapsed Variational Bayesian Inference Algorithm for Latent Dirichlet Allocation. Yee W. Teh, David Newman and Max Welling. Published at NIPS 2006. Discussion led by Iulian Pruteanu.

2 Outline Introduction Approximate inference for LDA Collapsed VB inference for LDA Experimental results Conclusions

3 Introduction (1/2) Latent Dirichlet Allocation (LDA) is suitable for many applications, from document modeling to computer vision. Collapsed Gibbs sampling seems to be the preferred choice for large-scale problems; however, it has drawbacks of its own: convergence is hard to assess, and many samples are needed to reduce sampling noise. The CVB algorithm, which makes use of a few simple approximations, is easy to implement and more accurate than standard VB.

4 Introduction (2/2) This paper:
– proposes an improved VB algorithm based on integrating out the model parameters (assumption: the latent variables are mutually independent)
– uses a Gaussian approximation for computational efficiency

5 Approximate inference for LDA (1/3) Notation: $\mathbf{x} = \{x_{ij}\}$ – observed words; $\mathbf{z} = \{z_{ij}\}$ – latent variables (topic indices); $\theta = \{\theta_j\}$ – mixing proportions; $\phi = \{\phi_k\}$ – topic parameters; $D$ – number of documents; $K$ – number of topics
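For reference, the generative process that defines LDA, with symmetric Dirichlet priors governed by scalar hyperparameters $\alpha$ and $\beta$ as in the paper:

$\theta_j \sim \mathrm{Dirichlet}(\alpha) \qquad \phi_k \sim \mathrm{Dirichlet}(\beta) \qquad z_{ij} \sim \mathrm{Multinomial}(\theta_j) \qquad x_{ij} \sim \mathrm{Multinomial}(\phi_{z_{ij}})$

where $i$ indexes word tokens within document $j$, $k$ indexes topics, and $W$ denotes the vocabulary size.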

6 Approximate inference for LDA (2/3) Given the observed words $\mathbf{x}$, the task of Bayesian inference is to compute the posterior distribution over the latent variables $\mathbf{z}$, the mixing proportions $\theta$, and the topic parameters $\phi$. 1. Variational Bayes
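Standard VB assumes a posterior that factorizes over parameters and latent variables, $\hat{q}(\mathbf{z}, \theta, \phi) = \hat{q}(\theta)\,\hat{q}(\phi)\prod_{ij}\hat{q}(z_{ij})$, and maximizes a lower bound on $\log p(\mathbf{x})$. A sketch of the standard coordinate-ascent update for $\gamma_{ijk} = \hat{q}(z_{ij} = k)$, with $\Psi(\cdot)$ the digamma function:

$\gamma_{ijk} \propto \exp\!\Big(\Psi\big(\alpha + \textstyle\sum_{i'} \gamma_{i'jk}\big) + \Psi\big(\beta + \textstyle\sum_{i'j':\, x_{i'j'} = x_{ij}} \gamma_{i'j'k}\big) - \Psi\big(W\beta + \textstyle\sum_{i'j'} \gamma_{i'j'k}\big)\Big)$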

7 Approximate inference for LDA (3/3) 2. Collapsed Gibbs sampling
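Collapsed Gibbs sampling integrates out $\theta$ and $\phi$ analytically and resamples each topic assignment from its conditional given all the others. Writing $n^{\neg ij}_{jk}$, $n^{\neg ij}_{kw}$, and $n^{\neg ij}_{k}$ for the document-topic, topic-word, and topic counts with token $ij$ removed, the conditional used by the sampler is:

$p(z_{ij} = k \mid \mathbf{z}^{\neg ij}, \mathbf{x}) \propto \dfrac{\big(\alpha + n^{\neg ij}_{jk}\big)\big(\beta + n^{\neg ij}_{k x_{ij}}\big)}{W\beta + n^{\neg ij}_{k}}$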

8 Collapsed VB inference for LDA: marginalization of the model parameters. In the variational Bayesian approximation we assume a factorized form for the approximating posterior distribution. However, this is not a good assumption, since changes in the model parameters ($\theta$, $\phi$) have a considerable impact on the latent variables ($\mathbf{z}$). CVB is equivalent to marginalizing out the model parameters before approximating the posterior over the latent variables. The exact CVB update has a closed form but is computationally too expensive to be practical; the authors therefore propose a simple Gaussian approximation, which appears to work very accurately.
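Concretely, writing $\gamma_{ijk} = \hat{q}(z_{ij} = k)$, the Gaussian approximation (a second-order expansion of the exact CVB update, sketched here following the paper) uses the means and variances of the counts under the variational posterior, e.g. $\mathbb{E}_q[n^{\neg ij}_{jk}] = \sum_{i' \neq i} \gamma_{i'jk}$ and $\mathrm{Var}_q[n^{\neg ij}_{jk}] = \sum_{i' \neq i} \gamma_{i'jk}(1 - \gamma_{i'jk})$:

$\gamma_{ijk} \propto \dfrac{\big(\alpha + \mathbb{E}_q[n^{\neg ij}_{jk}]\big)\big(\beta + \mathbb{E}_q[n^{\neg ij}_{k x_{ij}}]\big)}{W\beta + \mathbb{E}_q[n^{\neg ij}_{k}]} \exp\!\Big(\!-\dfrac{\mathrm{Var}_q[n^{\neg ij}_{jk}]}{2(\alpha + \mathbb{E}_q[n^{\neg ij}_{jk}])^2} - \dfrac{\mathrm{Var}_q[n^{\neg ij}_{k x_{ij}}]}{2(\beta + \mathbb{E}_q[n^{\neg ij}_{k x_{ij}}])^2} + \dfrac{\mathrm{Var}_q[n^{\neg ij}_{k}]}{2(W\beta + \mathbb{E}_q[n^{\neg ij}_{k}])^2}\Big)$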

9 Experimental results Left: results for KOS (D = 3,430 documents; W = 6,909 vocabulary words; N = 467,714 word tokens). Right: results for NIPS (D = 1,675 documents; W = 12,419; N = 2,166,029). 10% of the words were held out for testing; results are averaged over 50 random runs. [Figures: variational bounds vs. number of iterations, and per-word test log probabilities vs. number of iterations.]

10 Conclusions Variational approximations are computationally much more efficient than Gibbs sampling, with almost no loss in accuracy. The CVB inference algorithm is easy to implement, computationally efficient (thanks to the Gaussian approximation), and more accurate than standard VB.
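To illustrate the "easy to implement" point, a minimal NumPy sketch of one CVB sweep under the Gaussian approximation (the function cvb_sweep and all variable names are illustrative assumptions, not code from the paper; the expected counts E_* and variances V_* must be kept in sync with gamma):

import numpy as np

def cvb_sweep(tokens, gamma, alpha, beta, W,
              E_njk, V_njk, E_nkw, V_nkw, E_nk, V_nk):
    """One pass of the Gaussian-approximated CVB update.

    tokens : list of (doc_id j, word_id w) pairs, one per word token
    gamma  : (n_tokens, K) array; gamma[t, k] = q(z_t = k)
    E_*/V_*: expected counts and variances under q
             (E_njk: (D, K), E_nkw: (K, W), E_nk: (K,))
    """
    for t, (j, w) in enumerate(tokens):
        g = gamma[t]
        v = g * (1.0 - g)
        # Remove token t's contribution from the count statistics.
        E_njk[j] -= g;      V_njk[j] -= v
        E_nkw[:, w] -= g;   V_nkw[:, w] -= v
        E_nk -= g;          V_nk -= v
        # Mean term times the exponentiated second-order
        # variance correction (the Gaussian approximation).
        a = alpha + E_njk[j]
        b = beta + E_nkw[:, w]
        c = W * beta + E_nk
        g = (a * b / c) * np.exp(-V_njk[j] / (2 * a ** 2)
                                 - V_nkw[:, w] / (2 * b ** 2)
                                 + V_nk / (2 * c ** 2))
        g /= g.sum()
        v = g * (1.0 - g)
        # Add the updated contribution back.
        E_njk[j] += g;      V_njk[j] += v
        E_nkw[:, w] += g;   V_nkw[:, w] += v
        E_nk += g;          V_nk += v
        gamma[t] = g
    return gamma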

