
1 Correlated Topic Models. By Blei and Lafferty (NIPS 2005). Presented by Chunping Wang, ECE, Duke University, August 4th, 2006.

2 Outline
- Introduction
- Latent Dirichlet Allocation (LDA)
- Correlated Topic Models (CTM)
- Experimental Results
- Conclusions

3 Introduction (1)
Topic models: generative probabilistic models that use a small number of distributions over a vocabulary to describe text collections and other discrete data (such as images). Normally, latent variables are introduced to capture abstract notions such as topics.
Applications: document modeling, text classification, image processing, collaborative filtering, etc.
Latent Dirichlet Allocation (LDA): allows each document to exhibit multiple topics, but ignores the correlation between topics.
Correlated Topic Models (CTM): builds on LDA and addresses this limitation.

4 Introduction (2)
Notation and terminology (text collections):
- Word: the basic unit, from a vocabulary of size V (V distinct words). The vth word is represented by a unit-basis vector w with w^v = 1 and w^u = 0 for u != v.
- Document: a sequence of N words.
- Corpus: a collection of M documents.
Assumptions: the words in a document are exchangeable; documents are also exchangeable.
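As a minimal sketch of this representation (the vocabulary below is made up for illustration, not from the paper):

```python
import numpy as np

# Hypothetical vocabulary of size V = 4 (illustration only).
vocab = ["gene", "protein", "model", "data"]
V = len(vocab)

def one_hot(v, V):
    """The vth word as a unit-basis vector: 1 in position v, 0 elsewhere."""
    w = np.zeros(V)
    w[v] = 1.0
    return w

# A document is a sequence of N such vectors (here N = 3).
doc = np.stack([one_hot(vocab.index(t), V) for t in ["gene", "data", "gene"]])
print(doc.shape)  # (3, 4)
```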

5 Latent Dirichlet Allocation (LDA) (1)
α and β are fixed unknown parameters; θ, z, and w are random variables (w are observable); the document length N is fixed and known.
Generative process for each document w in a corpus D:
1. Choose θ ~ Dir(α)
2. For each of the N words w_n:
   (a) Choose a topic index z_n ~ Mult(θ)
   (b) Choose a word w_n ~ p(w_n | z_n, β)
θ is a document-level variable; z and w are word-level variables.
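A minimal simulation of this generative process, assuming made-up values for k, V, N, α, and β (none of these come from the paper):

```python
import numpy as np

rng = np.random.default_rng(0)
k, V, N = 3, 4, 10                         # topics, vocabulary size, words
alpha = np.ones(k)                         # symmetric Dirichlet (assumed)
beta = rng.dirichlet(np.ones(V), size=k)   # one word distribution per topic

theta = rng.dirichlet(alpha)               # 1. theta ~ Dir(alpha)

words = []
for _ in range(N):                         # 2. for each of the N words:
    z = rng.choice(k, p=theta)             #    (a) topic index z_n ~ Mult(theta)
    w = rng.choice(V, p=beta[z])           #    (b) word w_n ~ Mult(beta_{z_n})
    words.append(w)

print(theta, words)
```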

6 Latent Dirichlet Allocation (LDA) (2)
Pros:
- The Dirichlet distribution is in the exponential family and conjugate to the multinomial distribution, so variational inference is tractable.
- θ is document-specific, so the variational parameter of θ can be regarded as the representation of a document: the feature set is reduced.
- The topic indices z_n are sampled repeatedly within a document, so one document can be associated with multiple topics.
Cons:
- Because of the independence assumption implicit in the Dirichlet distribution, LDA is unable to capture the correlation between different topics.
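The conjugacy in the first bullet can be checked in a few lines: observing multinomial counts simply adds them to the Dirichlet parameters (the prior and sample size below are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(0)
alpha = np.array([1.0, 2.0, 3.0])       # illustrative Dirichlet prior
theta = rng.dirichlet(alpha)
counts = rng.multinomial(100, theta)    # multinomial data given theta

# Dirichlet-multinomial conjugacy: the posterior over theta is
# Dir(alpha + counts) -- another Dirichlet, so updates stay closed-form.
posterior_alpha = alpha + counts
print(posterior_alpha)
```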

7 Correlated Topic Models (CTM) (1)
Key point: the topic proportions are drawn from a logistic normal distribution rather than a Dirichlet distribution.
Definition of the logistic normal distribution:
Let R^{k-1} denote (k-1)-dimensional real space and S_{k-1} the (k-1)-dimensional positive simplex, defined by S_{k-1} = {x = (x_1, ..., x_k) : x_i > 0 for all i, x_1 + ... + x_k = 1}. Suppose that y follows a multivariate normal distribution over R^{k-1}. The logistic transformation from R^{k-1} to S_{k-1} can then be used to define a logistic normal distribution over S_{k-1}.

8 Correlated Topic Models (CTM) (2)
Logistic transformation: x_i = exp(y_i) / (1 + Σ_{j=1}^{k-1} exp(y_j)) for i = 1, ..., k-1, and x_k = 1 / (1 + Σ_{j=1}^{k-1} exp(y_j)).
Log ratio transformation (its inverse): y_i = log(x_i / x_k), i = 1, ..., k-1.
The density function: if y ~ N(μ, Σ), the density of x is f(x | μ, Σ) = |2πΣ|^{-1/2} (x_1 ··· x_k)^{-1} exp{-(1/2)(y - μ)^T Σ^{-1} (y - μ)}, where y is the vector of log ratios of x.
Like the Dirichlet distribution, the logistic normal is defined over the simplex, but it allows correlation between components.
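A sketch of these two transformations and of a draw from the logistic normal (the helper names `logistic` and `log_ratio` and the values of μ and Σ are assumptions for illustration; note the off-diagonal entries of Σ, which a Dirichlet cannot express):

```python
import numpy as np

def logistic(y):
    """Map y in R^{k-1} onto the simplex S_{k-1} (append y_k = 0)."""
    e = np.exp(np.append(y, 0.0))
    return e / e.sum()

def log_ratio(x):
    """Inverse map: y_i = log(x_i / x_k), i = 1, ..., k-1."""
    return np.log(x[:-1] / x[-1])

rng = np.random.default_rng(0)
mu = np.array([0.5, -0.5])              # k - 1 = 2, so k = 3 components
Sigma = np.array([[1.0, 0.8],
                  [0.8, 1.0]])          # correlation between components

y = rng.multivariate_normal(mu, Sigma)  # y ~ N(mu, Sigma) on R^{k-1}
x = logistic(y)                         # x is a point on the simplex
print(x, x.sum())                       # positive components summing to 1
print(np.allclose(log_ratio(x), y))     # True: the maps invert each other
```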

9 Correlated Topic Models (CTM) (3)
Generative process for each document w in a corpus D:
1. Choose η ~ N(μ, Σ) and set θ = f(η), where f is the logistic transformation
2. For each of the N words w_n:
   (a) Choose a topic z_n ~ Mult(θ)
   (b) Choose a word w_n ~ Mult(β_{z_n})
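This is the LDA simulation from slide 5 with the Dirichlet draw swapped for a logistic normal draw (again with made-up μ, Σ, and β):

```python
import numpy as np

rng = np.random.default_rng(1)
k, V, N = 3, 4, 10
mu = np.zeros(k - 1)
Sigma = np.array([[1.0, 0.8],
                  [0.8, 1.0]])              # correlated topic proportions
beta = rng.dirichlet(np.ones(V), size=k)

eta = rng.multivariate_normal(mu, Sigma)    # 1. eta ~ N(mu, Sigma)
theta = np.exp(np.append(eta, 0.0))
theta /= theta.sum()                        #    theta = f(eta), on the simplex

words = []
for _ in range(N):                          # 2. for each of the N words:
    z = rng.choice(k, p=theta)              #    (a) z_n ~ Mult(f(eta))
    w = rng.choice(V, p=beta[z])            #    (b) w_n ~ Mult(beta_{z_n})
    words.append(w)

print(theta, words)
```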

10 Correlated Topic Models (CTM) (4)
Posterior inference (for η and z in each document): variational inference with the factorized family q(η, z | λ, ν², φ) = Π_i q(η_i | λ_i, ν_i²) Π_n q(z_n | φ_n).
Difficulty: the logistic normal is not conjugate to the multinomial (not exponential-conjugate).
Solution: upper bound the expected log normalizer with a Taylor expansion, which lower bounds the likelihood. Since log is concave, its tangent at any point ζ > 0 lies above it: log x ≤ x/ζ - 1 + log ζ, with equality at x = ζ.
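A quick numerical check of that tangent-line bound (the grid of test points and the value of ζ are arbitrary):

```python
import numpy as np

def log_upper_bound(x, zeta):
    """Tangent line to log at x = zeta; an upper bound since log is concave."""
    return x / zeta - 1.0 + np.log(zeta)

x = np.linspace(0.1, 5.0, 50)
zeta = 2.0
assert np.all(np.log(x) <= log_upper_bound(x, zeta) + 1e-12)  # holds everywhere
print(np.log(zeta) == log_upper_bound(zeta, zeta))            # True: tight at zeta
```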

11 Correlated Topic Models (CTM) (5)
Parameter estimation (for the model parameters μ, Σ, and β): maximize the lower bound on the likelihood of the entire corpus of documents with variational EM:
1. (E-step) For each document, maximize the lower bound with respect to the variational parameters;
2. (M-step) Maximize the lower bound of the likelihood of the entire corpus with respect to the model parameters μ, Σ, and β.
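A structural sketch of this alternating scheme, with the two updates left as hypothetical stubs (the actual coordinate-ascent and maximum-likelihood updates are given in the paper, not here):

```python
def e_step(doc, model):
    """Stub: maximize the per-document bound over the variational
    parameters (e.g. lambda, nu^2, phi, and the Taylor point zeta)."""
    return {}  # placeholder variational parameters

def m_step(corpus, var_params, model):
    """Stub: maximize the corpus-level bound over mu, Sigma, and beta."""
    return model  # placeholder: return the model unchanged

def variational_em(corpus, model, n_iters=20):
    for _ in range(n_iters):
        var_params = [e_step(doc, model) for doc in corpus]   # E-step
        model = m_step(corpus, var_params, model)             # M-step
    return model
```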

12 Experimental Results (1) Example: Modeling Science

13 Experimental Results (2)
Comparison with LDA: document modeling

14 Experimental Results (3)
Comparison with LDA: collaborative filtering. To evaluate how well the models predict the remaining words of a document after observing a portion of it, a predictive measure is needed; lower numbers denote more predictive power.

15 Conclusions
The main contribution of this paper is that the CTM directly models correlation between topics via the logistic normal distribution. At the same time, the nonconjugacy of the logistic normal adds complexity to variational inference. Like LDA, the CTM allows multiple topics for each document, and its variational parameters can serve as features of the document.

16 References
J. Aitchison and S. M. Shen. Logistic-normal distributions: some properties and uses. Biometrika, vol. 67, no. 2, pp. 261-272, 1980.
D. Blei, A. Ng, and M. Jordan. Latent Dirichlet allocation. Journal of Machine Learning Research, 3:993-1022, 2003.

