Multiscale Topic Tomography Ramesh Nallapati, William Cohen, Susan Ditmore, John Lafferty & Kin Ung (Johnson and Johnson Group)



8/13/2007, KDD 2007, San Jose, CA

Introduction
–Explosive growth of electronic document collections
Need for unsupervised techniques for summarization, visualization and analysis
–Many probabilistic graphical models proposed in the recent past:
Latent Dirichlet Allocation
Correlated Topic Models
Pachinko Allocation
Dirichlet Process Mixtures
…
–All of the above ignore an important dimension that reveals a huge amount of information: time!

Introduction
Recent work that models time:
–Topics over Time [Wang and McCallum, KDD’06]
Key ideas:
Each sampled topic generates a word as well as a time stamp
Beta distribution to model the occurrence probability of topics
Collapsed Gibbs sampling for inference
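The key ideas above can be illustrated with a minimal sketch of how one token might be generated: a topic is drawn first, and that topic then emits both a word and a time stamp from a Beta distribution over normalized time. All names and numbers here (`sample_tot_token`, `theta`, `phi`, `psi`, the toy vocabulary) are assumptions made for illustration, not taken from the slides, and inference by collapsed Gibbs sampling is not shown.

```python
import random

def sample_tot_token(theta, phi, psi, rng):
    """Draw one (topic, word, timestamp) triple.

    theta: topic proportions of the document (should sum to 1)
    phi:   phi[z] maps word -> probability under topic z
    psi:   psi[z] is the (a, b) pair of topic z's Beta over normalized time
    """
    # Pick a topic according to the document's topic proportions.
    z = rng.choices(range(len(theta)), weights=theta, k=1)[0]
    # The topic emits a word ...
    words = list(phi[z])
    w = rng.choices(words, weights=[phi[z][v] for v in words], k=1)[0]
    # ... and, independently given the topic, a time stamp in [0, 1].
    a, b = psi[z]
    t = rng.betavariate(a, b)
    return z, w, t

rng = random.Random(0)
theta = [0.7, 0.3]                                   # hypothetical values
phi = [{"atom": 0.6, "electron": 0.4}, {"gene": 0.5, "dna": 0.5}]
psi = [(2.0, 5.0), (5.0, 2.0)]                       # topic 0 early, topic 1 late
tokens = [sample_tot_token(theta, phi, psi, rng) for _ in range(5)]
```

Each topic's Beta parameters control when in the collection's time span that topic tends to occur, which is why the slide describes the Beta as modeling topic occurrence.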

Introduction
Recent work that models time:
–Topics over Time (ToT) [Wang and McCallum, KDD’06]

Introduction
Recent models proposed to address this issue:
–Dynamic Topic Models (DTM) [Blei and Lafferty, ICML’06]
Key ideas:
Models evolution of “topic content”, not just topic occurrence
Evolution of topic multinomials modeled using logistic-normal prior
Approximate variational inference

Introduction
Recent models proposed to address this issue:
–Dynamic Topic Models (DTM) [Blei and Lafferty, ICML’06]

Introduction
Issues with DTM
–Logistic normal is not conjugate to the multinomial
Results in complicated inference procedures
Topic tomography: a new time series topic model
–Uses a Poisson process to model word counts
–A wedding of multiscale wavelet analysis with topic models
Uses conjugate priors
–Efficient inference
Allows visualization of topic evolution at various time-scales

Topic Tomography: A sneak-preview

Topic Tomography (TT): what’s with the name?
LDA models how topics are distributed in each document
Normalization is per document
TT models how each topic is distributed among documents!
Normalization is per topic
From the Greek words "tomos" (to cut or section) and "graphein" (to write)
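The difference between the two normalizations can be seen on a toy document-by-topic table. The sketch below is only an illustration: the matrix `counts` and the helper names are hypothetical, and it shows the direction of normalization rather than either model's actual parameterization.

```python
# Hypothetical expected counts: rows are documents, columns are topics.
counts = [
    [8.0, 2.0],
    [1.0, 9.0],
    [1.0, 4.0],
]

def normalize_rows(m):
    """Divide each row by its sum, so every document's topic shares sum to 1."""
    return [[x / sum(row) for x in row] for row in m]

def normalize_columns(m):
    """Divide each column by its sum, so every topic's document shares sum to 1."""
    col_sums = [sum(col) for col in zip(*m)]
    return [[x / s for x, s in zip(row, col_sums)] for row in m]

per_document = normalize_rows(counts)    # LDA-style view: topics within a document
per_topic = normalize_columns(counts)    # TT-style view: documents within a topic
```

Under the per-topic view, a column describes how one topic is spread across the collection, which is the "slice" that the name tomography alludes to.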

Topic Tomography model

Multiscale parameter generation
Haar multiscale wavelet representation
[Figure: axes labeled “scale” and “epochs”]
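A minimal sketch of one way such a multiscale generation could work, assuming an unnormalized Haar scheme in which each coarse-scale Poisson rate is the sum of its two children and each split fraction is drawn from a Beta distribution. The function names, the Beta parameters and the total rate are hypothetical; the slides do not give these details.

```python
import random

def generate_rates(total, num_scales, a, b, rng):
    """Top-down: split every parent rate into two child rates.

    The split fraction rho is drawn from Beta(a, b); the children are
    rho * parent and (1 - rho) * parent, so each level keeps the same total.
    """
    rates = [total]
    for _ in range(num_scales):
        children = []
        for parent in rates:
            rho = rng.betavariate(a, b)
            children.extend([rho * parent, (1.0 - rho) * parent])
        rates = children
    return rates

def coarsen(rates):
    """Bottom-up: sum adjacent pairs to move one scale coarser."""
    return [rates[i] + rates[i + 1] for i in range(0, len(rates), 2)]

rng = random.Random(0)
# Three splits of a single coarse rate give 2**3 = 8 epochs at the finest scale.
finest = generate_rates(total=100.0, num_scales=3, a=2.0, b=2.0, rng=rng)
coarser = coarsen(finest)   # the same signal viewed at 4 epochs
```

Reading the same rates at different depths of the tree is what gives a view of the signal at different time-scales.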

Multiscale parameter generation

Multiscale Topic Tomography: where is the conjugacy?
Recall: multiscale canonical parameters are generated using Beta distribution
Data likelihood w.r.t. the Poissons can be equivalently expressed in terms of the binomials:
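The slide's own equation is not preserved in this transcript. As a stand-in, the standard identity behind the Poisson-to-binomial rewriting is sketched below for a single pair of sibling counts. The symbols $x_1, x_2, \lambda_1, \lambda_2, \rho, a, b$ are notation assumed here and may differ from the authors'.

```latex
% Two independent counts x_1 ~ Poisson(lambda_1), x_2 ~ Poisson(lambda_2).
% Define n = x_1 + x_2, lambda = lambda_1 + lambda_2, rho = lambda_1 / lambda.
\[
\frac{e^{-\lambda_1}\lambda_1^{x_1}}{x_1!}\cdot
\frac{e^{-\lambda_2}\lambda_2^{x_2}}{x_2!}
\;=\;
\underbrace{\frac{e^{-\lambda}\lambda^{n}}{n!}}_{\text{Poisson}(n;\,\lambda)}
\cdot
\underbrace{\binom{n}{x_1}\rho^{x_1}(1-\rho)^{x_2}}_{\text{Binomial}(x_1;\,n,\,\rho)}
\]
% With a Beta prior on the split fraction, the binomial factor is conjugate:
\[
\rho \sim \mathrm{Beta}(a, b)
\quad\Longrightarrow\quad
\rho \mid x_1, x_2 \sim \mathrm{Beta}(a + x_1,\; b + x_2)
\]
```

Applying the first identity recursively up a binary tree of counts leaves one Poisson term for the total and one binomial term per split, each of which pairs with a Beta-distributed split parameter.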

Multiscale Topic Tomography
Parameter learning using mean-field variational EM

Experiments
Perplexity analysis on Science data
–Spans 120 years: split into 8 epochs each spanning 15 years
–Documents in each epoch split into training and test sets
Trained three different versions of TT
–Basic TT: basic tomography model with no multiscale analysis, applied to the whole training set
–Multiple TT: same as above, but one model for each epoch
–Multiscale TT: full multiscale version
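The slides do not spell out how perplexity is computed. A minimal sketch, assuming the usual definition (the exponential of the negative average per-token log-likelihood on held-out documents), is given below. The function `perplexity`, the toy documents and the uniform model are hypothetical.

```python
import math

def perplexity(test_docs, word_prob):
    """Return exp(-average per-token log-likelihood).

    test_docs: list of documents, each a dict mapping word -> count
    word_prob: function (doc_index, word) -> predictive probability (> 0)
    """
    log_lik = 0.0
    n_tokens = 0
    for d, doc in enumerate(test_docs):
        for word, count in doc.items():
            log_lik += count * math.log(word_prob(d, word))
            n_tokens += count
    return math.exp(-log_lik / n_tokens)

# Toy held-out set and a model that is uniform over a 4-word vocabulary.
docs = [{"atom": 2, "electron": 1}, {"heat": 3}]

def uniform(doc_index, word):
    return 1.0 / 4

ppl = perplexity(docs, uniform)
```

Lower values mean the model assigns higher probability to the held-out text. A uniform model over V words scores exactly V, which gives a quick sanity check for an implementation.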

Experiments
Perplexity results
[Figure legend: Multiscale TT, Multiple TT, Basic TT, LDA]

Experiments: Topic visualization of “Particle physics”

Experiments
Topic visualization: “Particle physics”

Experiments: Evolution of content-bearing words in “particle physics”
[Figure panels: quantum, heat, electron, atom]

Experiments: Topic occurrence distribution
[Figure panels: Agricultural science, Genetics, Climate change, Neuroscience]

Conclusion
Advantages:
–Multiscale tomography has the best features of both DTM and ToT
In addition, it provides a “zoom” feature for time-scales
–A natural model for sequence modeling of count data
Conjugate priors, easier inference
Limitations:
–Cannot generate one document at a time
–Not easily parallelizable
Future work:
–Build a GaP-like model with Gamma weights

Demo
Analysis of 32,000 documents from PubMed containing the word “cancer”, spanning 32 years
Will be shown this evening at poster #9
Also available at: Local copy

Inference: Mean field variational EM
E-step:
M-step:
[Equation labels: Variational multinomial, Variational Dirichlet]

Related Work
Poisson distribution used in 2-Poisson model in IR
–Not successful, but inspired the famous BM25
Gamma-Poisson topic model [Canny, SIGIR’04]
–Poisson to model word counts and Gamma to model topic weights
–Does not follow the semantics of a “pure” generative model
Optimizes the likelihood of complete-data
–Topic tomography model is very similar
We optimize the likelihood of observed-data
Use Dirichlet to model topic weights

Related Work
Multiscale Topic Tomography model originally introduced by Nowak et al. [Nowak and Kolaczyk, IEEE ToIT’00]
–Called “Poisson inverse” problem
–Applied to model gamma ray bursts
–Topic weights assumed to be known
A simple EM algorithm proposed
We cast topic modeling as a Poisson inverse problem
–Topic weights unknown
–Variational EM proposed

Outline
Introduction/Motivation
Related work
Topic Tomography model
–Basic model
–Multiscale analysis
–Learning and Inference
Experiments
–Perplexity analysis
–Topic visualizations
Demo (if time permits)

Experiments: Multiple senses of word “reaction”
[Figure panels: particle physics, chemistry, Blood tests, Total count]