Term Necessity Prediction P(t | R q ) Le Zhao and Jamie Callan Language Technologies Institute School of Computer Science Carnegie Mellon University Oct.

Term Necessity Prediction P(t | R_q)
Le Zhao and Jamie Callan
Language Technologies Institute, School of Computer Science, Carnegie Mellon University
Oct 27, CIKM 2010

Main Points
Necessity is as important as idf (theory)
Explains behavior of IR models (practice)
Can be predicted
Performance gain

Definition of Necessity P(t | R_q)
Directly calculated given relevance judgements for q
[Figure: the Collection, with the sets "Docs that contain t" and "Relevant (q)"; in the example, P(t | R_q) = 0.4]
Necessity == 1 – mismatch == term recall
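The definition can be sketched in a few lines, assuming each relevant document is available as a set of terms. The function name `term_necessity` and the toy judgements are hypothetical and are not taken from the talk.

```python
def term_necessity(term, relevant_docs):
    """Fraction of relevant documents that contain `term` (term recall)."""
    if not relevant_docs:
        raise ValueError("need at least one relevant document")
    containing = sum(1 for doc in relevant_docs if term in doc)
    return containing / len(relevant_docs)

# Toy relevance judgements (made up): each relevant document is a set of terms.
relevant = [
    {"third", "party", "politics"},
    {"party", "election"},
    {"independent", "candidate"},
    {"party", "prognosis"},
    {"green", "movement"},
]

necessity = term_necessity("party", relevant)  # 3 of the 5 relevant docs match
mismatch = 1 - necessity                       # share of relevant docs missed
```

With real TREC judgements, `relevant_docs` would hold the documents judged relevant for query q.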

Why Necessity? Roots in Probabilistic Models
Binary Independence Model
  –[Robertson and Spärck Jones 1976]
  –“Relevance Weight”, “Term Relevance”
P(t | R) is effectively the only part about relevance.
[Equation labels: necessity odds; idf (sufficiency)]
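The equation on this slide did not survive extraction. For reference, the standard Binary Independence Model term weight splits into the two labelled parts as follows. The symbols p_t, q_t, N and df_t are notation introduced here, and the final approximation assumes that almost all documents are non-relevant.

```latex
% p_t = P(t | R): necessity;  q_t = P(t | \bar{R}): occurrence in non-relevant docs
w_t \;=\; \log \frac{p_t \,(1 - q_t)}{q_t \,(1 - p_t)}
    \;=\; \underbrace{\log \frac{p_t}{1 - p_t}}_{\text{necessity odds}}
    \;+\; \underbrace{\log \frac{1 - q_t}{q_t}}_{\text{idf (sufficiency)}},
\qquad
\log \frac{1 - q_t}{q_t} \;\approx\; \log \frac{N - \mathrm{df}_t}{\mathrm{df}_t}
\quad \text{when } q_t \approx \frac{\mathrm{df}_t}{N}.
```

Without relevance information, p_t is typically held constant, so only the idf part varies across terms.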

Without Necessity
The emphasis problem for idf-only term weighting
  –Emphasize high idf terms in query
   “prognosis/viability of a political third party in U.S.” (Topic 206)

Ground Truth (TREC 4 topic 206)

                party    political  third    viability  prognosis
True P(t | R)   0.9796   0.7143     0.5918   0.0408     0.0204
idf             2.402    2.513      2.187    5.017      7.471

[Slide annotation: Emphasis]
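The effect can be checked against the numbers in this table. The sketch below compares idf-only weights with weights that add the log necessity odds, as in the Binary Independence Model term weight. It assumes natural logarithms and treats the slide's idf values as a stand-in for the idf part of that weight; neither choice is confirmed by the talk.

```python
import math

terms = ["party", "political", "third", "viability", "prognosis"]
necessity = [0.9796, 0.7143, 0.5918, 0.0408, 0.0204]   # true P(t | R) from the slide
idf = [2.402, 2.513, 2.187, 5.017, 7.471]              # idf from the slide

def necessity_log_odds(p):
    """log(p / (1 - p)); natural log is an assumption."""
    return math.log(p / (1.0 - p))

idf_only = dict(zip(terms, idf))
combined = {
    t: necessity_log_odds(p) + i
    for t, p, i in zip(terms, necessity, idf)
}

top_idf_only = max(idf_only, key=idf_only.get)   # term emphasized by idf alone
top_combined = max(combined, key=combined.get)   # term emphasized with necessity
```

Under these assumptions the emphasis moves from the rare, rarely matching term to the term that nearly all relevant documents contain.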

Indri Top Results
1. (ZF32-220-147) Recession concerns lead to a discouraging prognosis for 1991
2. (AP880317-0017) Politics … party … Robertson's viability as a candidate
3. (WSJ910703-0174) political parties …
4. (AP880512-0050) there is no viable opposition …
5. (WSJ910815-0072) A third of the votes
6. (WSJ900710-0129) politics, party, two thirds
7. (AP880729-0250) third ranking political movement…
8. (AP881111-0059) political parties
9. (AP880224-0265) prognosis for the Sunday school
10. (ZF32-051-072) third party provider
(Google and Bing still have top-10 false positives. Emphasis is also a problem for large search engines!)

Without Necessity
The emphasis problem for idf-only term weighting
  –Emphasize high idf terms in query
   “prognosis/viability of a political third party in U.S.” (Topic 206)
  –False positives throughout rank list
   especially detrimental at top rank
  –No term recall hurts precision at all recall levels
  –(This is true for BIM, and also BM25, LM that use tf.)
How significant is the emphasis problem?

Failure Analysis of 44 Topics from TREC 6-8
RIA workshop 2003 (7 top research IR systems, >56 expert*weeks)
[Figure annotations: Necessity term weighting; Necessity guided expansion; Bigrams and term restriction using doc fields]
Basis: Term Necessity Prediction

Given True Necessity
+100% over BIM (in precision at all recall levels) [Robertson and Spärck Jones 1976]
+30-80% over Language Model, BM25 (in MAP) [this work]
For a new query without relevance judgements, necessity must be predicted.
  –Predictions don’t need to be very accurate to show performance gain.

How Necessary are Words? (Examples from TREC 3 topics)

Term in Query                                                                P(t | R)
Oil Spills                                                                   0.9914
Term limitations for US Congress members                                     0.9831
Insurance Coverage which pays for Long Term Care                             0.6885
School Choice Voucher System and its effects on the US educational program   0.2821
Vitamin the cure or cause of human ailments                                  0.1071

Mismatch Statistics
Mismatch variation across terms
[Figure panels: TREC 3 title; TREC 9 desc]
  –Not constant, need prediction

Mismatch Statistics (2)
Mismatch variation for the same term in different queries
[Figure: TREC 3 recurring words]
  –Query dependent features needed (1/3 of term occurrences have necessity variation > 0.1)

Prior Prediction Approaches
Croft/Harper combination match (1979)
  –treats P(t | R) as a tuned constant
  –when >0.5, rewards docs that match more query terms
Greiff’s (1998) exploratory data analysis
  –Used idf to predict overall term weighting
  –Improved over BIM
Metzler’s (2008) generalized idf
  –Used idf to predict P(t | R)
  –Improved over BIM
Years of simple idf feature, limited success
  –Missing piece: P(t | R) = term necessity = term recall

Factors that Affect Necessity
What causes a query term to not appear in relevant documents?
Topic Centrality (Concept Necessity)
  –E.g., Laser research related or potentially related to US defense, Welfare laws propounded as reforms
Synonyms
  –E.g., movie == film == …
Abstractness
  –E.g., Ailments in the vitamin query, Dog Maulings, Christian Fundamentalism
  –Worst thing is a rare & abstract term, e.g. prognosis

Features
We need to
  –Identify synonyms/searchonyms of a query term
  –in a query dependent way
Use Thesauri?
  –Biased (not collection dependent)
  –Static (not query dependent)
  –Not promising, Not easy
Term-term similarity in concept space!
  –Local LSI (Latent Semantic Indexing)
   LSI of (e.g. 200) top ranked documents
   keep (e.g. 150) dimensions

Features
Topic Centrality
  –Length of term vector after dimension reduction (local LSI)
Synonymy (Concept Necessity)
  –Average similarity scores of top 5 similar terms
Replaceability
  –Adjust the Synonymy measure by how many new documents the synonyms match
Abstractness
  –Users modify abstract terms with concrete terms
   E.g., “effects on the US educational program”, “prognosis of a political third party”
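The first two features can be sketched with a truncated SVD over a term-by-document matrix built from the top-ranked documents. This is a minimal sketch, not the authors' Lemur LSI code. The function names and the toy matrix are hypothetical, and cosine similarity is an assumption, since the slides only say "similarity scores".

```python
import numpy as np

def local_lsi_term_vectors(term_doc, k):
    """Term vectors in a k-dimensional concept space (truncated SVD)."""
    u, s, _ = np.linalg.svd(term_doc, full_matrices=False)
    k = min(k, len(s))
    return u[:, :k] * s[:k]              # one row per term

def centrality(term_vecs, i):
    """Topic centrality: length of the reduced term vector."""
    return float(np.linalg.norm(term_vecs[i]))

def synonymy(term_vecs, i, top_n=5):
    """Average cosine similarity of the top_n terms most similar to term i."""
    norms = np.linalg.norm(term_vecs, axis=1)
    norms[norms == 0] = 1.0              # avoid division by zero
    unit = term_vecs / norms[:, None]
    sims = np.delete(unit @ unit[i], i)  # drop the term's similarity to itself
    return float(np.sort(sims)[::-1][:top_n].mean())

# Toy term-by-document counts (made up); rows are terms, columns are documents.
terms = ["movie", "film", "laser", "defense"]
term_doc = np.array([
    [2, 0, 1, 0],   # movie
    [0, 2, 1, 0],   # film
    [0, 0, 0, 3],   # laser
    [0, 0, 0, 1],   # defense
], dtype=float)

vecs = local_lsi_term_vectors(term_doc, k=2)
movie_synonymy = synonymy(vecs, terms.index("movie"), top_n=1)
laser_centrality = centrality(vecs, terms.index("laser"))
```

In the setting described on the slides, `term_doc` would come from roughly 200 top-ranked documents with about 150 dimensions kept. The toy matrix only shows that "movie" and "film" end up close together in the reduced space.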

Experiments
Necessity Prediction Error
  –Regression problem
  –Model: RBF kernel regression, M: features → P(t | R)
Necessity for Term Weighting
  –End-to-End retrieval performance
  –How to weight terms by their necessity
In BM25
  –Binary Independence Model
In Language Models
  –Relevance model P_m(t | R), multinomial (Lavrenko and Croft 2001)
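To make the regression setup concrete, here is a minimal sketch that maps a feature vector to a predicted P(t | R) with an RBF kernel. It uses a Nadaraya-Watson kernel-weighted average, which is a swapped-in stand-in: the acknowledgements mention SVM-light, so the authors' actual learner likely differs. The two features, the training rows and `gamma` are made up for illustration.

```python
import math

def rbf(x, y, gamma):
    """RBF kernel exp(-gamma * ||x - y||^2)."""
    return math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(x, y)))

def predict_necessity(train_x, train_y, x, gamma=1.0):
    """Kernel-weighted average of training necessities, clipped to [0, 1]."""
    weights = [rbf(xi, x, gamma) for xi in train_x]
    total = sum(weights)
    if total == 0.0:                     # query point far from all training terms
        return sum(train_y) / len(train_y)
    pred = sum(w * y for w, y in zip(weights, train_y)) / total
    return min(1.0, max(0.0, pred))

# Hypothetical training terms: (centrality, synonymy) -> true P(t | R).
train_x = [(3.0, 0.2), (0.5, 0.9), (1.5, 0.5)]
train_y = [0.95, 0.10, 0.60]

near_first = predict_necessity(train_x, train_y, (3.0, 0.2), gamma=10.0)
in_between = predict_necessity(train_x, train_y, (2.0, 0.4), gamma=1.0)
```

Training on the terms of one TREC collection and predicting for the query terms of another would follow the same pattern.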

Necessity Prediction Example (trained on TREC 3, tested on TREC 4)

                party    political  third    viability  prognosis
True P(t | R)   0.9796   0.7143     0.5918   0.0408     0.0204
Predicted       0.7585   0.6523     0.6236   0.3080     0.2869

[Slide annotation: Emphasis]

Necessity Prediction Error
[Figure: L1 loss of necessity predictions; the lower the better]
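As a worked instance of the L1 loss, the sketch below scores the topic 206 predictions from the prediction example slide against the true values. Averaging the absolute errors over terms is an assumption; the slide does not say whether the reported loss is a sum or a mean.

```python
true_necessity = [0.9796, 0.7143, 0.5918, 0.0408, 0.0204]   # party ... prognosis
predicted      = [0.7585, 0.6523, 0.6236, 0.3080, 0.2869]

def l1_loss(truth, pred):
    """Mean absolute error between true and predicted necessity."""
    if len(truth) != len(pred):
        raise ValueError("length mismatch")
    return sum(abs(t - p) for t, p in zip(truth, pred)) / len(truth)

loss = l1_loss(true_necessity, predicted)   # about 0.17 for this topic
```

Most of the error comes from "viability" and "prognosis", which are predicted too high. Their predictions still rank below the other three terms.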

Predicted Necessity Weighting

TREC train sets        3        3-5      3-7      7
Test/x-validation      4        6        8        8
LM desc – Baseline     0.1789   0.1586   0.1923   0.1923
LM desc – Necessity    0.2261   0.1959   0.2314   0.2333
Improvement            26.38%   23.52%   20.33%   21.32%
P@10 Baseline          0.4160   0.2980   0.3860   0.3860
P@10 Necessity         0.4940   0.3420   0.4220   0.4380
P@20 Baseline          0.3450   0.2440   0.3310   0.3310
P@20 Necessity         0.4180   0.2900   0.3540   0.3610

10-25% gain (necessity weight)
10-20% gain (top Precision)

Predicted Necessity Weighting (ctd.)

TREC train sets        3-9      9        11       13
Test/x-validation      10       10       12       14
LM desc – Baseline     0.1627   0.1627   0.0239   0.1789
LM desc – Necessity    0.1813   0.1810   0.0597   0.2233
Improvement            11.43%   11.25%   149.8%   24.82%
P@10 Baseline          0.3180   0.3180   0.0200   0.4720
P@10 Necessity         0.3280   0.3400   0.0467   0.5360
P@20 Baseline          0.2400   0.2400   0.0211   0.4460
P@20 Necessity         0.2790   0.2810   0.0411   0.5030

vs. Relevance Model

Test/x-validation          4        6        8        8        10       10       12       14
Relevance Model desc       0.2423   0.1799   0.2352   0.2352   0.1888   0.1888   0.0221   0.1774
RM reweight-Only desc      0.2215   0.1705   0.2435   0.2435   0.1700   0.1700   0.0692   0.1945
RM reweight-Trained desc   0.2330   0.1921   0.2542   0.2563   0.1809   0.1793   0.0534   0.2258

Weight Only ≈ Expansion
Supervised > Unsupervised (5-10%)

Relevance Model: #weight( 1-λ #combine( t1 t2 ) λ #weight( w1 t1 w2 t2 w3 t3 … ) )
w1 ~ P(t1 | R), w2 ~ P(t2 | R)
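The query template can be filled in mechanically once per-term weights are available. The helper below is a hypothetical sketch that produces an Indri-style query string from terms and predicted necessities. The function name, the number formatting and the default λ are choices made here, not taken from the talk.

```python
def necessity_weighted_query(terms, weights, lam=0.5):
    """Build '#weight( (1-lam) #combine(terms) lam #weight(w1 t1 w2 t2 ...) )'."""
    if len(terms) != len(weights):
        raise ValueError("need one weight per term")
    if not 0.0 <= lam <= 1.0:
        raise ValueError("lam must be in [0, 1]")
    original = "#combine( " + " ".join(terms) + " )"
    weighted = "#weight( " + " ".join(
        f"{w:.4f} {t}" for t, w in zip(terms, weights)
    ) + " )"
    return f"#weight( {1 - lam:.2f} {original} {lam:.2f} {weighted} )"

# Predicted necessities for two terms of topic 206 (from the prediction example).
query = necessity_weighted_query(["party", "prognosis"], [0.7585, 0.2869], lam=0.8)
```

The "reweight-Only" rows in the table correspond to reweighting just the original query terms in this way, without adding expansion terms.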

Take Home Messages
Necessity is as important as idf (theory)
Explains behavior of IR models (practice)
Effective features can predict necessity
Performance gain

Acknowledgements
Reviewers from multiple venues
Ni Lao, Frank Lin, Yiming Yang, Stephen Robertson, Bruce Croft, Matthew Lease
  –Discussions & references
David Fisher, Mark Hoy
  –Maintaining the Lemur toolkit
Andrea Bastoni and Lorenzo Clemente
  –Maintaining LSI code for Lemur toolkit
SVM-light, Stanford parser
TREC
  –All the data
NSF Grant IIS-0707801 and IIS-0534345

Feedback: Le Zhao (lezhao@cs.cmu.edu)
