Latent Dirichlet Allocation


1 Latent Dirichlet Allocation
David M. Blei, Andrew Y. Ng & Michael I. Jordan; presented by Tilaye Alemu & Anand Ramkissoon

2 Motivation for LDA
In lay terms: document modelling
text classification
collaborative filtering
...all in the context of Information Retrieval
The principal focus in this paper is on document classification within a corpus

3 Structure of this talk
Part 1: Theory
Background
(some) other approaches
Part 2: Experimental results
some details of usage
wider applications

4 LDA: conceptual features
Generative
Probabilistic
Collections of discrete data
3-level hierarchical Bayesian model built from mixture models
Efficient approximate inference techniques: variational methods plus an EM algorithm for empirical Bayes parameter estimation

5 How to classify text documents
Word (term) frequency
tf-idf (a small sketch follows this slide)
term-by-document matrix
discriminative sets of words
fixed-length lists of numbers
little statistical structure
Dimensionality reduction techniques
Latent Semantic Indexing
singular value decomposition
not generative
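As a concrete illustration of the tf-idf representation above, here is a minimal Python sketch; the toy corpus and the exact weighting (raw term frequency times log inverse document frequency) are illustrative assumptions, not from the slides.

```python
import math
from collections import Counter

docs = [
    "the cat sat on the mat".split(),
    "the dog chased the cat".split(),
    "dogs and cats make good pets".split(),
]

vocab = sorted({w for d in docs for w in d})
N = len(docs)

# df[t]: the number of documents in which term t appears
df = Counter(t for d in docs for t in set(d))

def tfidf(doc):
    """Return a fixed-length V-vector of tf-idf weights for one document."""
    tf = Counter(doc)
    return [tf[t] / len(doc) * math.log(N / df[t]) for t in vocab]

# term-by-document matrix: one row of V weights per document
matrix = [tfidf(d) for d in docs]
```

The resulting rows are exactly the "fixed-length lists of numbers" the slide mentions: convenient for retrieval, but with little statistical structure.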

6 How to classify text documents, ct'd
probabilistic LSI (pLSI)
each word generated by one topic
each document generated by a mixture of topics
a document is represented as a list of mixing proportions for topics
no generative model for these mixing proportions
number of parameters grows linearly with the size of the corpus
overfitting
no principled way to classify documents outside the training set

7 A major simplifying assumption
A document is a “bag of words”
A corpus is a “bag of documents”
order is unimportant: exchangeability
de Finetti's representation theorem: any collection of exchangeable random variables has a representation as a (generally infinite) mixture distribution (written out below)
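For a document treated as an exchangeable sequence of words, this is the mixture form the paper builds on: conditioned on a random parameter θ, the words factor.

```latex
p(w_1, \dots, w_N) = \int \left( \prod_{n=1}^{N} p(w_n \mid \theta) \right) p(\theta)\, d\theta
```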

8 A note about exchangeability
Exchangeability does not mean that the random variables are iid
They are iid when conditioned on an underlying latent parameter of a probability distribution
Conditionally, the joint distribution is simple and factored

9 Notation
word: the basic unit of discrete data, an item from a vocabulary indexed by {1,...,V}; each word is a unit-basis V-vector
document: a sequence of N words, w = (w_1,...,w_N)
corpus: a collection of M documents, D = {w_1,...,w_M}
Each document is considered a random mixture over latent topics
Each topic is considered a distribution over words

10 LDA assumes a generative process for each document in the corpus
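The process itself appeared as a figure on the original slide; in the paper's notation, for each document w:

```latex
\begin{aligned}
&\text{1. Choose } N \sim \mathrm{Poisson}(\xi) \\
&\text{2. Choose } \theta \sim \mathrm{Dir}(\alpha) \\
&\text{3. For each of the } N \text{ words } w_n\text{:} \\
&\qquad \text{(a) choose a topic } z_n \sim \mathrm{Multinomial}(\theta) \\
&\qquad \text{(b) choose a word } w_n \text{ from } p(w_n \mid z_n, \beta)
\end{aligned}
```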

11 Probability density for the Dirichlet random variable
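The density itself (an equation image on the original slide) is the standard Dirichlet on the (k-1)-simplex, with parameter α:

```latex
p(\theta \mid \alpha) = \frac{\Gamma\!\left(\sum_{i=1}^{k} \alpha_i\right)}{\prod_{i=1}^{k} \Gamma(\alpha_i)}\, \theta_1^{\alpha_1 - 1} \cdots \theta_k^{\alpha_k - 1}
```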

12 Joint distribution of a topic mixture
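Given the hyperparameters α and β, the joint distribution of a topic mixture θ, a set of N topics z, and a set of N words w is:

```latex
p(\theta, \mathbf{z}, \mathbf{w} \mid \alpha, \beta) = p(\theta \mid \alpha) \prod_{n=1}^{N} p(z_n \mid \theta)\, p(w_n \mid z_n, \beta)
```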

13 Marginal distribution of a document
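Integrating over θ and summing over z gives the marginal distribution of a single document:

```latex
p(\mathbf{w} \mid \alpha, \beta) = \int p(\theta \mid \alpha) \left( \prod_{n=1}^{N} \sum_{z_n} p(z_n \mid \theta)\, p(w_n \mid z_n, \beta) \right) d\theta
```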

14 Probability of a corpus
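Taking the product of the marginals of its M documents gives the probability of the corpus:

```latex
p(D \mid \alpha, \beta) = \prod_{d=1}^{M} \int p(\theta_d \mid \alpha) \left( \prod_{n=1}^{N_d} \sum_{z_{dn}} p(z_{dn} \mid \theta_d)\, p(w_{dn} \mid z_{dn}, \beta) \right) d\theta_d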

15 Marginalize over z: the word distribution and the generative process
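Summing the topic variable out of the generative process gives the word distribution conditioned on θ and β, a random-parameter mixture over topics:

```latex
p(w \mid \theta, \beta) = \sum_{z} p(w \mid z, \beta)\, p(z \mid \theta)
```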

16 A unigram model
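In the unigram baseline, the words of every document are drawn independently from a single multinomial, with no topic structure:

```latex
p(\mathbf{w}) = \prod_{n=1}^{N} p(w_n)
```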

17 probabilistic Latent Semantic Indexing
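pLSI posits that a document label d and a word w_n are conditionally independent given an unobserved topic z:

```latex
p(d, w_n) = p(d) \sum_{z} p(w_n \mid z)\, p(z \mid d)
```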

18 Inference from LDA
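The central inference problem is the posterior over the hidden variables given a document; its normaliser is the intractable marginal from slide 13, which motivates approximate inference:

```latex
p(\theta, \mathbf{z} \mid \mathbf{w}, \alpha, \beta) = \frac{p(\theta, \mathbf{z}, \mathbf{w} \mid \alpha, \beta)}{p(\mathbf{w} \mid \alpha, \beta)}
```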

19 Variational Inference

20 A family of distributions on latent variables
The Dirichlet parameter γ and the multinomial parameters φ are the free variational parameters
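The family (shown graphically on the original slide) drops the edges that couple θ, z, and w, and factorises:

```latex
q(\theta, \mathbf{z} \mid \gamma, \phi) = q(\theta \mid \gamma) \prod_{n=1}^{N} q(z_n \mid \phi_n)
```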

21 The update equations
The free parameters (γ*, φ*) are chosen to minimize the Kullback-Leibler divergence between the variational distribution and the true posterior
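Setting the derivatives of the KL divergence to zero gives the paper's coordinate-ascent updates, where Ψ is the digamma function:

```latex
\begin{aligned}
\phi_{ni} &\propto \beta_{i w_n} \exp\!\left( \Psi(\gamma_i) - \Psi\!\left( \sum_{j=1}^{k} \gamma_j \right) \right) \\
\gamma_i &= \alpha_i + \sum_{n=1}^{N} \phi_{ni}
\end{aligned}
```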

22 Variational Inference Algorithm
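The algorithm box on the original slide is not in the transcript; below is a minimal Python sketch of the per-document coordinate ascent under the updates above, assuming a known topic-word matrix beta (k x V) and a symmetric scalar hyperparameter alpha. The function name, initialisation constants, and convergence test are illustrative assumptions, not from the paper.

```python
import numpy as np
from scipy.special import digamma

def variational_inference(words, alpha, beta, tol=1e-6, max_iter=100):
    """Coordinate-ascent variational inference for one document.

    words: list of word indices w_1..w_N into the vocabulary
    alpha: scalar symmetric Dirichlet hyperparameter (assumption; the paper allows a vector)
    beta:  (k, V) array of topic-word probabilities
    Returns the variational parameters (gamma, phi).
    """
    k = beta.shape[0]
    N = len(words)
    phi = np.full((N, k), 1.0 / k)        # initialise phi_ni = 1/k
    gamma = np.full(k, alpha + N / k)     # initialise gamma_i = alpha + N/k
    for _ in range(max_iter):
        old_gamma = gamma.copy()
        # phi update: phi_ni proportional to beta_{i,w_n} * exp(E_q[log theta_i])
        log_theta = digamma(gamma) - digamma(gamma.sum())
        phi = beta[:, words].T * np.exp(log_theta)
        phi /= phi.sum(axis=1, keepdims=True)  # normalise each row over topics
        # gamma update: gamma_i = alpha + sum_n phi_ni
        gamma = alpha + phi.sum(axis=0)
        if np.abs(gamma - old_gamma).sum() < tol:
            break
    return gamma, phi
```

In the full empirical Bayes procedure this per-document E-step alternates with an M-step that re-estimates β (and α) from the φ statistics across the corpus.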

