
1 Document Language Models, Query Models, and Risk Minimization for Information Retrieval
John Lafferty, Chengxiang Zhai
School of Computer Science, Carnegie Mellon University

2 LM Applied to IR
- First proposed in (Ponte & Croft, 98)
- Also explored in (Miller et al., 99; Hiemstra et al., 99; Berger & Lafferty, 99; Song & Croft, 99; Hiemstra, 00; etc.)
- A very promising approach:
  - Good empirical performance
  - Statistical foundation
  - Re-use of existing LM techniques
- But:
  - Lack of understanding of the approach (lack of "relevance")
  - Conceptual inconsistency in feedback (text + terms)
  - Real empirical advantage still needs to be demonstrated
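As a concrete illustration of the basic query-likelihood idea the slide refers to: rank documents by p(q|d) under a smoothed unigram document model. This is a minimal sketch; the corpus, documents, and smoothing weight are toy assumptions, not from the slides:

```python
import math
from collections import Counter

def query_likelihood(query, doc, collection, lam=0.5):
    """Score log p(q|d) under a unigram document model with
    fixed linear interpolation (Jelinek-Mercer) smoothing
    against the collection model."""
    doc_counts = Counter(doc)
    coll_counts = Counter(collection)
    doc_len, coll_len = len(doc), len(collection)
    score = 0.0
    for w in query:
        p_doc = doc_counts[w] / doc_len        # ML estimate from the document
        p_coll = coll_counts[w] / coll_len     # collection background model
        score += math.log(lam * p_doc + (1 - lam) * p_coll)
    return score

# Toy corpus: two short "documents" (invented for illustration)
d1 = "star wars movie film star".split()
d2 = "strategic defense initiative star wars program".split()
collection = d1 + d2
q = "star wars".split()
print(query_likelihood(q, d1, collection))
print(query_likelihood(q, d2, collection))
```

The interpolation weight `lam` plays the same role as the fixed linear smoothing used in the evaluation later in the talk.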

3 Research Questions
- How can we extend the language modeling approach to:
  - allow modeling of both queries and documents?
  - exploit language modeling to perform natural and effective feedback?
  - estimate translation models without training data (Berger & Lafferty 99)?
- What is the relationship between the language modeling approach and the traditional probabilistic retrieval models?
- Can we go beyond the Probability Ranking Principle?

4 Outline
- A Risk Minimization Retrieval Framework
- Special Cases
- Markov Chain Translation Model
- Query Language Model Estimation
- Evaluation

5 Risk Minimization Framework — Basic Idea
- Utility/risk as the retrieval criterion
- Retrieval as a sequence of presentation decisions
- Application of Bayesian decision theory

6 Risk Minimization Framework — Features
- Modeling utility-based retrieval (beyond binary relevance)
- Modeling interactive retrieval (dynamic user model)
- Covering many existing retrieval models
- Fully probabilistic (language models + estimation):
  - Document language model + query language model
  - Feedback as query model estimation

7 Generative Model
[Diagram: user U (inferred) generates the observed query q; source S (inferred) generates the observed document d; relevance R ∈ {0,1} links q and d.]

8 Actions, Loss Functions, and Bayes Risk
Given a collection C = {d1, d2, ..., dk} from source S and a query q from user U, which list of documents should be presented?
- Action: a = a list of documents
- Loss: L(a, θ), with parameters θ = (θ_Q, θ_D1, ..., θ_Dk, R1, ..., Rk)
- Bayes risk of presenting document di (a = di, with loss L(a, θ_Q, θ_Di, Ri)):
  R(di; q) ≜ R(a = di | U, q, S, C) = ∫ L(a, θ) p(θ | U, q, S, C) dθ
- Bayes optimal decision (risk minimization):
  a* = argmin_a R(a | U, q, S, C), i.e., d* = argmin_d R(d; q)
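The Bayes-optimal decision above can be sketched numerically with a discrete parameter space, where the integral becomes a sum. The posterior and loss values here are invented purely for illustration:

```python
# Toy Bayes-risk computation over a discrete parameter space.
# posterior[theta] = p(theta | U, q, S, C); loss[(doc, theta)] = L(a=doc, theta).
def bayes_risk(doc, posterior, loss):
    """R(a=doc | U,q,S,C) = sum_theta L(doc, theta) * p(theta | U,q,S,C)."""
    return sum(loss[(doc, th)] * p for th, p in posterior.items())

def optimal_doc(docs, posterior, loss):
    """Bayes optimal decision: d* = argmin_d R(d; q)."""
    return min(docs, key=lambda d: bayes_risk(d, posterior, loss))

posterior = {"theta1": 0.7, "theta2": 0.3}   # invented posterior over states
loss = {("d1", "theta1"): 0.0, ("d1", "theta2"): 1.0,
        ("d2", "theta1"): 1.0, ("d2", "theta2"): 0.0}
print(optimal_doc(["d1", "d2"], posterior, loss))  # d1 has lower expected loss
```

Here d1 is optimal because its expected loss (0.3) is below d2's (0.7); the retrieval framework replaces this toy state space with query/document model parameters and relevance variables.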

9 Risk Minimization Ranking Function
[Annotated equation: the risk R(d; q) combines a loss function with a query model and a document model (each a posterior distribution) and a "relevance" model:
  R(d; q) ∝ ∫∫ Σ_R L(θ_Q, θ_D, R) p(R | θ_Q, θ_D) p(θ_Q | q, U) p(θ_D | d, S) dθ_Q dθ_D ]

10 Special Cases
What does the choice of loss function L(θ_Q, θ_D, R) give?
- L(θ_Q, θ_D, R) = L(R): probabilistic relevance model, ranking by p(R=1|q,d)
  - Document generation: classical probabilistic model
  - Query generation: LM approach (Lafferty & Zhai 01)
- L(θ_Q, θ_D, R) = Δ(θ_Q, θ_D): probabilistic similarity model
  - Unigram models + Δ(θ_Q || θ_D): "generalized" LM approach
  - TF-IDF vectors + cosine distance: vector space model

11 A Markov Chain Method for Estimating the Query Model
[Diagram: the user's "browsing" distribution p(w|U) generates the query q = q1, q2, ..., qm. A Markov chain over words w1, ..., wn and the document source acts as a "translation" channel t_{α,D}(q|w). The query model is the posterior probability distribution over the chain's initial terms, with the Markov chain as the prior. Restricting the document set D to top-ranked documents gives pseudo-feedback.]

12 Markov Chain Translation Probabilities
1- stop 1- stop 1- stop w0 p(w1|d0) w1 w2 p(w2|d2) ... d0 p(d0|w0)  d1 p(d1|w1)  A B Matrix Formulation:
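The matrix formulation can be sketched as follows. This is only a sketch under stated assumptions: A row-normalizes word-document counts into p(d|w), B column-normalizes them into p(w|d), and the stopped chain is the geometric mixture (1−α) Σ_k α^k (AB)^k; the paper's estimator additionally incorporates document priors and conditions the chain on the query:

```python
import numpy as np

def translation_matrix(counts, alpha=0.5, steps=20):
    """Word-to-word 'translation' probabilities from a word-document
    Markov chain. counts[w, d] = count of word w in document d
    (every word and document assumed to have nonzero counts).
    Returns T with T[w, q] ~ t(q | w)."""
    # A[w, d] = p(d | w): which document the chain visits from word w
    A = counts / counts.sum(axis=1, keepdims=True)
    # B[d, w] = p(w | d): which word the document generates next
    B = (counts / counts.sum(axis=0, keepdims=True)).T
    M = A @ B                     # one word -> doc -> word step
    # Stop with prob (1 - alpha) at each word: geometric mixture of powers
    T = np.zeros_like(M)
    Mk = np.eye(M.shape[0])
    for k in range(steps + 1):
        T += (1 - alpha) * (alpha ** k) * Mk
        Mk = Mk @ M
    return T

counts = np.array([[2., 0.],      # word 0 ("star") only in d0
                   [1., 1.],      # word 1 ("wars") in both docs
                   [0., 3.]])     # word 2 ("jedi") only in d1
T = translation_matrix(counts)
print(T.round(3))
```

Since A and B are row-stochastic, each row of T sums to (approximately, after truncation) one, so each row is itself a distribution over "translated" words.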

13 Sample Query Probabilities
q=star wars (TREC8) Political sense Movie sense q=star wars (feedback on TREC8) Political sense

14 Evaluation
- KL-divergence unigram retrieval model
- Fixed linear interpolation smoothing for θ_D
- Comparing two methods for estimating θ_Q:
  - Maximum likelihood (= query likelihood, simple LM)
  - Markov chain on top 50 docs
- Three test collections:
  - AP89 (~250MB, 50 queries)
  - TREC8 ad hoc (~2GB, 50 queries)
  - TREC8 web track (~2GB, 50 queries)
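The KL-divergence retrieval model used in the evaluation ranks documents by −KL(θ_Q || θ_D), which for a fixed query model is rank-equivalent to the cross entropy Σ_w p(w|θ_Q) log p(w|θ_D). A minimal sketch with linear interpolation smoothing for θ_D; the documents, collection, and the "expanded" query model standing in for a Markov-chain estimate are all invented here:

```python
import math
from collections import Counter

def kl_score(query_model, doc, collection, lam=0.5):
    """Rank by sum_w p(w|theta_Q) * log p(w|theta_D), rank-equivalent
    to -KL(theta_Q || theta_D). theta_D uses fixed linear interpolation
    with the collection model (query-model words assumed to occur
    somewhere in the collection)."""
    dc, cc = Counter(doc), Counter(collection)
    dl, cl = len(doc), len(collection)
    score = 0.0
    for w, pq in query_model.items():
        pd = lam * dc[w] / dl + (1 - lam) * cc[w] / cl
        score += pq * math.log(pd)
    return score

# Two ways to build theta_Q: maximum likelihood on the query text
# (equivalent to query likelihood), vs. an expanded model that also
# weights related terms (a stand-in for the Markov chain estimate).
q = "star wars".split()
ml_model = {w: c / len(q) for w, c in Counter(q).items()}
expanded = {"star": 0.3, "wars": 0.3, "missile": 0.2, "defense": 0.2}

d1 = "star wars missile defense program".split()
d2 = "star wars movie sequel".split()
collection = d1 + d2
print(kl_score(ml_model, d1, collection), kl_score(ml_model, d2, collection))
print(kl_score(expanded, d1, collection), kl_score(expanded, d2, collection))
```

On this toy data the expanded query model pulls d1 (the "political sense" document) ahead of d2, illustrating why feedback is naturally expressed as query model estimation in this framework.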

15 Results: Simple LM vs. Markov Chain
[Figures: retrieval performance on the AP89, TREC8, and Web collections.]

16 Results: Rocchio vs. Markov Chain
[Figures: retrieval performance on the AP89, TREC8, and Web collections.]

17 Rocchio vs. Markov Chain: TREC8

18 Rocchio vs. Markov Chain : Web

19 Conclusions and Future Work
- Risk minimization as a new general retrieval framework:
  - Goes beyond the Probability Ranking Principle (PRP)
  - Recovers existing models
  - Extends existing work on language modeling
- Markov chain model expansion:
  - Efficient "translation" model
  - Applicable to both query model and document model
  - Empirically effective
- Future work:
  - Explore utility-based ranking criteria (e.g., MMR)
  - Explore new models and new estimation methods

