
Slide 1: Exact Maximum Likelihood Estimation for Word Mixtures
Yi Zhang & Jamie Callan, Carnegie Mellon University, {yiz,callan}@cs.cmu.edu
Wei Xu, NEC C&C Research Lab, xw@ccrl.sj.nec.com

Slide 2: Outline
- Introduction
  1. Why this problem? Some retrieval applications
  2. Traditional solution: the EM algorithm
- New algorithm: exact MLE estimation
- Experimental results

Slide 3: Example 1: Model-Based Feedback in the Language Modeling Approach to IR
(Diagram: a query Q is run against the documents D, producing results; the top-ranked results form the feedback documents F = {d_1, d_2, ..., d_n}.)
Based on Zhai & Lafferty's slides at CIKM 2001.

Slide 4: θ_F Estimation Based on a Generative Mixture Model
Each word w in F = {d_1, ..., d_n} is generated either by the topic model θ_F (topic words) or by the collection model C (background words), with interpolation weight λ on the background: P(w) = (1 − λ) P(w | θ_F) + λ P(w | C).
Given: F, P(w | C), and λ. Find: the MLE of θ_F.
Based on Zhai & Lafferty's slides at CIKM 2001.
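Under this model, estimating θ_F means maximizing the log-likelihood of F. Following Zhai & Lafferty's formulation (c(w; F), the count of word w in F, is our notation):

\[ \log P(F \mid \theta_F) \;=\; \sum_{w} c(w; F)\,\log\big((1-\lambda)\,P(w \mid \theta_F) + \lambda\,P(w \mid C)\big) \]

which is exactly the word-mixture MLE problem formalized on slide 6, with q = θ_F, p = the collection model, α = 1 − λ, and β = λ.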

Slide 5: Example 2: A Model-Based Approach to Novelty Detection in Adaptive Information Filtering
Three multinomial models: M_T: θ_Topic, M_E: θ_general English, M_I: θ_new; a document is generated as their mixture (the diagram's mixture weights, rendered here as λ_E, λ_T, λ_new, were partially lost in transcription).
Given: θ_general English, θ_Topic, and λ_E, λ_T, λ_new. Find: the MLE of θ_new.
Based on Zhang & Callan's paper at SIGIR 2002.

Slide 6: Problem Setting and the Traditional Solution Using EM
- Observe: data generated by a mixture multinomial distribution r = (r_1, r_2, r_3, ..., r_k), with r_i = α q_i + β p_i
- Given: the interpolation weights α and β, and another multinomial distribution p = (p_1, p_2, p_3, ..., p_k)
- Find: the maximum likelihood estimate (MLE) of the multinomial distribution q = (q_1, q_2, q_3, ..., q_k)
- Traditional solution: the EM algorithm (sketched below)
  - An iterative process that can be computationally expensive
  - Provides only an approximate solution
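For reference, here is a minimal sketch of that EM baseline in Python/NumPy, using the interpolation weights α and β from the problem setting above; the function name, tolerance, and iteration cap are our own illustration choices, not the authors' code:

```python
import numpy as np

def em_word_mixture(f, p, alpha, beta, tol=1e-10, max_iter=100000):
    """EM for the MLE of q under the mixture r_i = alpha*q_i + beta*p_i
    (illustrative sketch; approaches the exact solution only in the limit)."""
    f = np.asarray(f, dtype=float)
    p = np.asarray(p, dtype=float)
    q = np.full(f.size, 1.0 / f.size)        # uniform initialization
    seen = f > 0                             # only observed words affect the LL
    prev_ll = -np.inf
    for _ in range(max_iter):
        mix = alpha * q + beta * p           # current mixture r
        ll = float(np.sum(f[seen] * np.log(mix[seen])))
        if ll - prev_ll < tol:               # stop when the LL stops improving
            break
        prev_ll = ll
        t = alpha * q / np.where(mix > 0, mix, 1.0)  # E-step: P(topic | word i)
        q = f * t                            # expected topic counts
        q /= q.sum()                         # M-step: renormalize
    return q
```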

Slide 7: Finding q (1)
Maximize the log-likelihood of the observed data

\[ LL(q) = \sum_{i=1}^{k} f_i \log(\alpha q_i + \beta p_i) \]

under the constraints

\[ \sum_{i=1}^{k} q_i = 1, \qquad q_i \ge 0, \]

where f_i is the observed frequency of word i.

Slide 8: Finding q (2)
For all q_i such that q_i > 0, apply the Lagrange multiplier method and set the derivative with respect to q_i to zero:

\[ \frac{\partial}{\partial q_i}\left[\sum_{j} f_j \log(\alpha q_j + \beta p_j) - \mu\Big(\sum_j q_j - 1\Big)\right] = \frac{\alpha f_i}{\alpha q_i + \beta p_i} - \mu = 0 \quad\Longrightarrow\quad q_i = \frac{f_i}{\mu} - \frac{\beta}{\alpha}\,p_i \]

This is a closed-form solution for q_i, provided we know the set of all i with q_i > 0.
Theorem: the q_i greater than 0 are exactly those with the smallest values of p_i / f_i. See the detailed proof in our paper.
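Plugging this closed form into the normalization constraint pins down μ, a step the slide leaves implicit. Writing S = {i : q_i > 0}:

\[ \sum_{i \in S}\Big(\frac{f_i}{\mu} - \frac{\beta}{\alpha}\,p_i\Big) = 1 \quad\Longrightarrow\quad \mu = \frac{\alpha \sum_{i \in S} f_i}{\alpha + \beta \sum_{i \in S} p_i}. \]

So once a candidate set S is fixed, μ and every q_i follow in closed form, which is what the algorithm on the next slide exploits.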

Slide 9: Algorithm for Finding the Exact MLE of q
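The slide body here is an image that did not survive transcription. The sketch below reconstructs the algorithm implied by slides 7-8 and the theorem: sort words by p_i / f_i, compute μ for every candidate cutoff with prefix sums, and keep the largest cutoff whose last word still receives positive probability. Function and variable names are ours; this is a reconstruction under those assumptions, not the authors' published pseudocode:

```python
import numpy as np

def exact_mle_word_mixture(f, p, alpha, beta):
    """Exact MLE of q for observed counts f under the mixture
    r_i = alpha * q_i + beta * p_i (reconstruction; see lead-in)."""
    f = np.asarray(f, dtype=float)
    p = np.asarray(p, dtype=float)
    q = np.zeros_like(f)
    seen = np.flatnonzero(f)                      # if f_i = 0, then q_i = 0 at the optimum
    order = seen[np.argsort(p[seen] / f[seen])]   # ascending p_i / f_i (the theorem)
    F = np.cumsum(f[order])                       # prefix sums over the ranking
    P = np.cumsum(p[order])
    mu = alpha * F / (alpha + beta * P)           # Lagrange multiplier for each cutoff
    # keep the largest cutoff whose last word still gets positive probability
    positive = f[order] / mu - (beta / alpha) * p[order] > 0
    m = np.flatnonzero(positive).max() + 1
    S = order[:m]
    q[S] = f[S] / mu[m - 1] - (beta / alpha) * p[S]
    return q
```

The cost is dominated by the O(k log k) sort over the words that actually occur (at most 2,352 in the experiment on slide 10), with no iteration, which is consistent with the speedup over EM reported on slide 12.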

Slide 10: Experimental Setting for Model-Based Feedback in IR
- 20 relevant documents for a topic (sampled from the AP Wire News and Wall Street Journal datasets, 1988-1990) serve as the observed training data. p is calculated directly, as described in Zhai & Lafferty, from 119,823 documents.
- There are 2,352 unique words in these 20 relevant documents, so at most 2,352 of the q_i are nonzero, while 200,542 of the p_i are nonzero.

Slide 11: The EM result converges to the result calculated directly by our algorithm.

Slide 12: Comparing the Speed of Our Algorithm with EM
- EM is stopped when the change in log-likelihood falls below a 10^-x threshold (the exponent did not survive transcription)
- Timings taken over 50,000 runs on a Pentium III 500 MHz PC

Slide 13: Conclusion
- We developed a new training algorithm that provides the exact MLE for word mixtures
- It works well both theoretically and empirically
- It can be used in several language-model-based IR applications

