
Topic Modeling using Latent Dirichlet Allocation



Presentation on theme: "Topic Modeling using Latent Dirichlet Allocation"— Presentation transcript:

1 Topic Modeling using Latent Dirichlet Allocation

2 Topic Modeling
A process of analyzing large collections of documents to discover the latent topics in them. It can be used to:
Organize and structure the documents
Discover the different topics that a document contains
Measure how similar certain documents are

3

4 Latent Dirichlet Allocation (LDA)
It is an unsupervised learning method
It produces a generative model of the documents

5 Terminology
Word: w ∈ {1,…,V}, an index into a vocabulary of V words
Document: a sequence of N words
Corpus: a set of M documents
Topic: z ∈ {1,…,K}
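
To make the notation concrete, here is a minimal sketch of how raw text might be turned into the word indices, documents, and corpus defined above. The `build_corpus` helper, the whitespace tokenization, and the two sample sentences are illustrative assumptions, not part of the presentation. Ids are 0-based here, so they are the slide's {1,…,V} shifted by one.

```python
def build_corpus(raw_documents):
    """Map each distinct token to an integer id and encode every document.

    Hypothetical helper: lowercasing and whitespace splitting stand in
    for whatever tokenization a real pipeline would use.
    """
    vocab = {}   # token -> word id (0-based)
    corpus = []  # list of documents, each a list of word ids
    for text in raw_documents:
        doc = []
        for token in text.lower().split():
            if token not in vocab:
                vocab[token] = len(vocab)  # next unused id
            doc.append(vocab[token])
        corpus.append(doc)
    return vocab, corpus


raw_documents = ["the cat sat", "the dog sat down"]
vocab, corpus = build_corpus(raw_documents)

V = len(vocab)   # vocabulary size
M = len(corpus)  # number of documents
print(V, M, corpus)
```

Each inner list is one document, and its length is that document's N.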

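6 Topic
A topic is a set of co-occurring terms. In LDA, each topic is represented as a probability distribution over the V words of the vocabulary.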

7 Generative Process
For each document:
Choose N from a Poisson distribution
Choose θ from a Dirichlet distribution (θ is the document's topic weight vector)
For each of the N words:
Choose a topic z from the distribution θ
Choose a word w from the word distribution of topic z
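
The generative process can be sketched in a few lines of standard-library Python. This is an illustration under stated assumptions rather than the presenters' code: the Dirichlet parameter `alpha`, the per-topic word distributions `beta`, and the Poisson mean `mean_length` are made-up values, and topics and words are 0-based indices.

```python
import math
import random


def sample_poisson(rng, lam):
    """Draw from Poisson(lam) using Knuth's multiplication method."""
    threshold = math.exp(-lam)
    k, p = 0, 1.0
    while True:
        p *= rng.random()
        if p <= threshold:
            return k
        k += 1


def sample_dirichlet(rng, alpha):
    """Draw from Dirichlet(alpha) by normalizing independent gamma draws."""
    draws = [rng.gammavariate(a, 1.0) for a in alpha]
    total = sum(draws)
    return [d / total for d in draws]


def sample_categorical(rng, probs):
    """Draw one index with probability proportional to probs."""
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]


def generate_document(rng, alpha, beta, mean_length):
    """Generate one document following the steps on the slide."""
    n_words = sample_poisson(rng, mean_length)   # choose N
    theta = sample_dirichlet(rng, alpha)         # choose θ
    topics, words = [], []
    for _ in range(n_words):
        z = sample_categorical(rng, theta)       # choose z from θ
        w = sample_categorical(rng, beta[z])     # choose w from topic z
        topics.append(z)
        words.append(w)
    return theta, topics, words


rng = random.Random(0)
alpha = [0.5, 0.5]                    # assumed Dirichlet parameter, K = 2
beta = [[0.45, 0.45, 0.05, 0.05],     # assumed word distribution, topic 0
        [0.05, 0.05, 0.45, 0.45]]     # assumed word distribution, topic 1
theta, topics, words = generate_document(rng, alpha, beta, mean_length=8)
print(theta, topics, words)
```

Learning runs this process in reverse: given only the observed words, it infers the hidden θ and z values and the topic word distributions.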

8 Learning
Variational Bayes
Gibbs Sampling
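
As an illustration of the second option, below is a minimal sketch of a collapsed Gibbs sampler, a variant commonly used for LDA. The slide does not say which variant the presenters had in mind, so this is an assumption. The function name `gibbs_lda`, the symmetric hyperparameters `alpha` and `eta`, and the toy corpus are also illustrative. Words and topics are 0-based indices.

```python
import random


def gibbs_lda(docs, n_topics, vocab_size, alpha=0.1, eta=0.01,
              n_iters=200, seed=0):
    """Collapsed Gibbs sampling sketch: resample each word's topic in turn.

    docs is a list of documents, each a list of word ids in
    range(vocab_size). Returns the final count tables.
    """
    rng = random.Random(seed)
    doc_topic = [[0] * n_topics for _ in docs]               # words in doc d with topic k
    topic_word = [[0] * vocab_size for _ in range(n_topics)]  # word w assigned to topic k
    topic_total = [0] * n_topics                              # total words per topic

    # Start from a random topic assignment for every word.
    assignments = []
    for d, doc in enumerate(docs):
        zs = []
        for w in doc:
            z = rng.randrange(n_topics)
            zs.append(z)
            doc_topic[d][z] += 1
            topic_word[z][w] += 1
            topic_total[z] += 1
        assignments.append(zs)

    for _ in range(n_iters):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                # Remove the current assignment from the counts.
                z = assignments[d][i]
                doc_topic[d][z] -= 1
                topic_word[z][w] -= 1
                topic_total[z] -= 1

                # Unnormalized conditional probability of each topic.
                weights = [
                    (doc_topic[d][k] + alpha)
                    * (topic_word[k][w] + eta)
                    / (topic_total[k] + vocab_size * eta)
                    for k in range(n_topics)
                ]
                z = rng.choices(range(n_topics), weights=weights, k=1)[0]

                # Record the new assignment.
                assignments[d][i] = z
                doc_topic[d][z] += 1
                topic_word[z][w] += 1
                topic_total[z] += 1

    return doc_topic, topic_word


docs = [[0, 1, 0, 1], [2, 3, 2, 3], [0, 1, 2, 3]]  # toy corpus, V = 4
doc_topic, topic_word = gibbs_lda(docs, n_topics=2, vocab_size=4, n_iters=50)
print(doc_topic)
print(topic_word)
```

Normalizing each row of `doc_topic` (after adding `alpha`) gives an estimate of a document's θ, and normalizing each row of `topic_word` (after adding `eta`) gives an estimate of a topic's word distribution.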

9 Applications of LDA
Collaborative filtering
Spam detection
Music
Images

10 References
D. M. Blei, A. Y. Ng, M. I. Jordan. (2003). Latent Dirichlet Allocation. The Journal of Machine Learning Research.
D. J. Hu. (2009). Latent Dirichlet Allocation for text, images, and music.

