
1 Document Language Models, Query Models, and Risk Minimization for Information Retrieval
John Lafferty, Chengxiang Zhai
School of Computer Science, Carnegie Mellon University

2 LM Applied to IR
- First proposed in (Ponte & Croft, 98)
- Also explored in (Miller et al., 99; Hiemstra et al., 99; Berger & Lafferty, 99; Song & Croft, 99; Hiemstra, 00; etc.)
- A very promising approach:
  - Good empirical performance
  - Statistical foundation
  - Re-use of existing LM techniques
- But:
  - Lack of understanding of the approach (lack of "relevance")
  - Conceptual inconsistency in feedback (text + terms)
  - Real empirical advantage still needs to be demonstrated
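As a concrete illustration of the basic query-likelihood idea the slide refers to: rank documents by p(q|d) under a smoothed unigram document model. This is a minimal sketch; the corpus, documents, and smoothing weight are toy assumptions, not from the slides:

```python
import math
from collections import Counter

def query_likelihood(query, doc, collection, lam=0.5):
    """Score log p(q|d) under a unigram document model with
    fixed linear interpolation (Jelinek-Mercer) smoothing
    against the collection model."""
    doc_counts = Counter(doc)
    coll_counts = Counter(collection)
    doc_len, coll_len = len(doc), len(collection)
    score = 0.0
    for w in query:
        p_doc = doc_counts[w] / doc_len        # ML estimate from the document
        p_coll = coll_counts[w] / coll_len     # collection background model
        score += math.log(lam * p_doc + (1 - lam) * p_coll)
    return score

# Toy corpus: two short "documents" (invented for illustration)
d1 = "star wars movie film star".split()
d2 = "strategic defense initiative star wars program".split()
collection = d1 + d2
q = "star wars".split()
print(query_likelihood(q, d1, collection))
print(query_likelihood(q, d2, collection))
```

The interpolation weight `lam` plays the same role as the fixed linear smoothing used in the evaluation later in the talk.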

3 Research Questions
- How can we extend the language modeling approach to:
  - allow modeling of both queries and documents?
  - exploit language modeling to perform natural and effective feedback?
  - estimate translation models without training data (Berger & Lafferty 99)?
- What is the relationship between the language modeling approach and the traditional probabilistic retrieval models?
- Can we go beyond the Probability Ranking Principle?

4 Outline
- A Risk Minimization Retrieval Framework
- Special Cases
- Markov Chain Translation Model
- Query Language Model Estimation
- Evaluation

5 Risk Minimization Framework — Basic Idea
- Utility/risk as the retrieval criterion
- Retrieval as a sequence of presentation decisions
- Application of Bayesian decision theory

6 Risk Minimization Framework — Features
- Modeling utility-based retrieval (beyond binary relevance)
- Modeling interactive retrieval (dynamic user model)
- Covering many existing retrieval models
- Fully probabilistic (language models + estimation):
  - Document language model + query language model
  - Feedback as query model estimation

7 Generative Model
[Diagram: user U (inferred) generates the observed query q; source S (inferred) generates the observed document d; relevance R ∈ {0,1} links q and d.]

8 Actions, Loss Functions, and Bayes Risk
Given a collection C = {d1, d2, ..., dk} from source S and a query q from user U, which list of documents should be presented?
- Action: a = a list of documents
- Loss: L(a, θ), with parameters θ = (θ_Q, θ_D1, ..., θ_Dk, R1, ..., Rk)
- Bayes risk of presenting document di (a = di, with loss L(a, θ_Q, θ_Di, Ri)):
  R(di; q) ≜ R(a = di | U, q, S, C) = ∫ L(a, θ) p(θ | U, q, S, C) dθ
- Bayes optimal decision (risk minimization):
  a* = argmin_a R(a | U, q, S, C), i.e., d* = argmin_d R(d; q)
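The Bayes-optimal decision above can be sketched numerically with a discrete parameter space, where the integral becomes a sum. The posterior and loss values here are invented purely for illustration:

```python
# Toy Bayes-risk computation over a discrete parameter space.
# posterior[theta] = p(theta | U, q, S, C); loss[(doc, theta)] = L(a=doc, theta).
def bayes_risk(doc, posterior, loss):
    """R(a=doc | U,q,S,C) = sum_theta L(doc, theta) * p(theta | U,q,S,C)."""
    return sum(loss[(doc, th)] * p for th, p in posterior.items())

def optimal_doc(docs, posterior, loss):
    """Bayes optimal decision: d* = argmin_d R(d; q)."""
    return min(docs, key=lambda d: bayes_risk(d, posterior, loss))

posterior = {"theta1": 0.7, "theta2": 0.3}   # invented posterior over states
loss = {("d1", "theta1"): 0.0, ("d1", "theta2"): 1.0,
        ("d2", "theta1"): 1.0, ("d2", "theta2"): 0.0}
print(optimal_doc(["d1", "d2"], posterior, loss))  # d1 has lower expected loss
```

Here d1 is optimal because its expected loss (0.3) is below d2's (0.7); the retrieval framework replaces this toy state space with query/document model parameters and relevance variables.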

9 Risk Minimization Ranking Function
[Annotated equation: the risk R(d; q) combines a loss function with a query model and a document model (each a posterior distribution) and a "relevance" model:
  R(d; q) ∝ ∫∫ Σ_R L(θ_Q, θ_D, R) p(R | θ_Q, θ_D) p(θ_Q | q, U) p(θ_D | d, S) dθ_Q dθ_D ]

10 Special Cases
What does the choice of loss function L(θ_Q, θ_D, R) give?
- L(θ_Q, θ_D, R) = L(R): probabilistic relevance model, ranking by p(R=1|q,d)
  - Document generation: classical probabilistic model
  - Query generation: LM approach (Lafferty & Zhai 01)
- L(θ_Q, θ_D, R) = Δ(θ_Q, θ_D): probabilistic similarity model
  - Unigram models + Δ(θ_Q || θ_D): "generalized" LM approach
  - TF-IDF vectors + cosine distance: vector space model

11 A Markov Chain Method for Estimating the Query Model
[Diagram: the user's "browsing" distribution p(w|U) generates the query q = q1, q2, ..., qm. A Markov chain over words w1, ..., wn and the document source acts as a "translation" channel t_{α,D}(q|w). The query model is the posterior probability distribution over the chain's initial terms, with the Markov chain as the prior. Restricting the document set D to top-ranked documents gives pseudo-feedback.]

12 Markov Chain Translation Probabilities
1- stop 1- stop 1- stop w0 p(w1|d0) w1 w2 p(w2|d2) ... d0 p(d0|w0)  d1 p(d1|w1)  A B Matrix Formulation:
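The matrix formulation can be sketched as follows. This is only a sketch under stated assumptions: A row-normalizes word-document counts into p(d|w), B column-normalizes them into p(w|d), and the stopped chain is the geometric mixture (1−α) Σ_k α^k (AB)^k; the paper's estimator additionally incorporates document priors and conditions the chain on the query:

```python
import numpy as np

def translation_matrix(counts, alpha=0.5, steps=20):
    """Word-to-word 'translation' probabilities from a word-document
    Markov chain. counts[w, d] = count of word w in document d
    (every word and document assumed to have nonzero counts).
    Returns T with T[w, q] ~ t(q | w)."""
    # A[w, d] = p(d | w): which document the chain visits from word w
    A = counts / counts.sum(axis=1, keepdims=True)
    # B[d, w] = p(w | d): which word the document generates next
    B = (counts / counts.sum(axis=0, keepdims=True)).T
    M = A @ B                     # one word -> doc -> word step
    # Stop with prob (1 - alpha) at each word: geometric mixture of powers
    T = np.zeros_like(M)
    Mk = np.eye(M.shape[0])
    for k in range(steps + 1):
        T += (1 - alpha) * (alpha ** k) * Mk
        Mk = Mk @ M
    return T

counts = np.array([[2., 0.],      # word 0 ("star") only in d0
                   [1., 1.],      # word 1 ("wars") in both docs
                   [0., 3.]])     # word 2 ("jedi") only in d1
T = translation_matrix(counts)
print(T.round(3))
```

Since A and B are row-stochastic, each row of T sums to (approximately, after truncation) one, so each row is itself a distribution over "translated" words.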

13 Sample Query Probabilities
q=star wars (TREC8) Political sense Movie sense q=star wars (feedback on TREC8) Political sense

14 Evaluation
- KL-divergence unigram retrieval model
- Fixed linear interpolation smoothing for θ_D
- Comparing two methods for estimating θ_Q:
  - Maximum likelihood (= query likelihood, simple LM)
  - Markov chain on top 50 docs
- Three test collections:
  - AP89 (~250MB, 50 queries)
  - TREC8 ad hoc (~2GB, 50 queries)
  - TREC8 web track (~2GB, 50 queries)
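The KL-divergence retrieval model used in the evaluation ranks documents by −KL(θ_Q || θ_D), which for a fixed query model is rank-equivalent to the cross entropy Σ_w p(w|θ_Q) log p(w|θ_D). A minimal sketch with linear interpolation smoothing for θ_D; the documents, collection, and the "expanded" query model standing in for a Markov-chain estimate are all invented here:

```python
import math
from collections import Counter

def kl_score(query_model, doc, collection, lam=0.5):
    """Rank by sum_w p(w|theta_Q) * log p(w|theta_D), rank-equivalent
    to -KL(theta_Q || theta_D). theta_D uses fixed linear interpolation
    with the collection model (query-model words assumed to occur
    somewhere in the collection)."""
    dc, cc = Counter(doc), Counter(collection)
    dl, cl = len(doc), len(collection)
    score = 0.0
    for w, pq in query_model.items():
        pd = lam * dc[w] / dl + (1 - lam) * cc[w] / cl
        score += pq * math.log(pd)
    return score

# Two ways to build theta_Q: maximum likelihood on the query text
# (equivalent to query likelihood), vs. an expanded model that also
# weights related terms (a stand-in for the Markov chain estimate).
q = "star wars".split()
ml_model = {w: c / len(q) for w, c in Counter(q).items()}
expanded = {"star": 0.3, "wars": 0.3, "missile": 0.2, "defense": 0.2}

d1 = "star wars missile defense program".split()
d2 = "star wars movie sequel".split()
collection = d1 + d2
print(kl_score(ml_model, d1, collection), kl_score(ml_model, d2, collection))
print(kl_score(expanded, d1, collection), kl_score(expanded, d2, collection))
```

On this toy data the expanded query model pulls d1 (the "political sense" document) ahead of d2, illustrating why feedback is naturally expressed as query model estimation in this framework.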

15 Results: Simple LM vs. Markov Chain
[Figures: retrieval performance on the AP89, TREC8, and Web collections.]

16 Results: Rocchio vs. Markov Chain
[Figures: retrieval performance on the AP89, TREC8, and Web collections.]

17 Rocchio vs. Markov Chain: TREC8

18 Rocchio vs. Markov Chain : Web

19 Conclusions and Future Work
- Risk minimization as a new general retrieval framework:
  - Goes beyond the Probability Ranking Principle (PRP)
  - Recovers existing models
  - Extends existing work on language modeling
- Markov chain model expansion:
  - Efficient "translation" model
  - Applicable to both query model and document model
  - Empirically effective
- Future work:
  - Explore utility-based ranking criteria (e.g., MMR)
  - Explore new models and new estimation methods

