Presentation on theme: "Cumulative Progress in Language Models for Information Retrieval Antti Puurula 6/12/2013 Australasian Language Technology Workshop University of Waikato."— Presentation transcript:
Ad-hoc Information Retrieval
Ad-hoc Information Retrieval (IR) forms the basic task in IR: given a query, retrieve and rank documents in a collection
Origins: Cranfield 1 (1958-1960), Cranfield 2 (1962-1966), SMART (1961-1999)
Major evaluations: TREC Ad-hoc (1990-1999), TREC Robust (2003-2005), CLEF (2000-2009), INEX (2009-2010), NTCIR (1999-2013), FIRE (2008-2013)
Illusionary Progress in Ad-hoc IR
TREC ad-hoc evaluations stopped in 1999, as progress plateaued
More diverse tasks became the foci of research
“There is little evidence of improvement in ad-hoc retrieval technology over the past decade” (Armstrong et al. 2009)
Weak baselines, non-cumulative improvements
⟶ “no way of using LSI achieves a worthwhile improvement in retrieval accuracy over BM25” (Atreya & Elkan, 2010)
⟶ “there remains very little room for improvement in ad hoc search” (Trotman & Keeler, 2011)
Progress in Language Models for IR?
Language Models (LM) form one of the main approaches to IR
Many improvements to LMs have not been adopted generally or evaluated systematically:
TF-IDF feature weighting
Pitman-Yor Process smoothing
Feedback models
Are these improvements consistent across standard datasets, are they cumulative, and do they improve on a strong baseline?
Pitman-Yor Process Smoothing
Standard methods for smoothing in IR LMs are Dirichlet Prior (DP) and 2-Stage Smoothing (2SS) (Zhai & Lafferty 2004, Smucker & Allan 2007)
A recently suggested improvement is Pitman-Yor Process smoothing (PYP), an approximation to inference on a Pitman-Yor Process (Momtazi & Klakow 2010, Huang & Renals 2010)
All methods interpolate unsmoothed parameters with a background distribution. PYP additionally discounts the unsmoothed counts
Pitman-Yor Process Smoothing 2
All methods share the same general form, with method-specific definitions for DP, 2SS and PYP [equations not preserved in transcript]
TF-IDF Feature Weighting
Multinomial modelling assumptions of text can be corrected with TF-IDF weighting (Rennie et al. 2003, Frank & Bouckaert 2006)
Traditional view: IDF-weighting unnecessary with IR LMs (Zhai & Lafferty 2004)
Recent view: combination is complementary (Smucker & Allan 2007, Momtazi et al. 2010)
TF-IDF Feature Weighting 3
IDF has an overlapping function with collection smoothing (Hiemstra & Kraaij 1998)
The interaction is taken into account by replacing the collection model with a uniform model in smoothing [equation not preserved in transcript]
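The substitution described on this slide can be written out as below. The notation is an assumption and does not come from the slides: p_u(w | d) is the unsmoothed document model, α_d the document-dependent smoothing weight, p_c(w) the collection model, and |V| the vocabulary size.

```latex
\begin{align*}
p(w \mid d) &= (1 - \alpha_d)\, p_u(w \mid d) + \alpha_d\, p_c(w)
  && \text{(smoothing with the collection model)} \\
p(w \mid d) &= (1 - \alpha_d)\, p_u(w \mid d) + \alpha_d\, \frac{1}{|V|}
  && \text{(smoothing with a uniform model)}
\end{align*}
```

With the uniform background, smoothing no longer penalises frequent collection words, so that role is left to the IDF weights alone.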
Results
Significant differences:
PYP > DP
PYP+TI > 2SS
PYP+TI+FB > PYP+TI
PYP+TI+FB improves on 2SS by 4.07 MAP@50 absolute, a 17.1% relative improvement
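For readers unfamiliar with the metric, the following is a minimal sketch of how MAP@50 is commonly computed. It is not taken from the talk: the normalisation by min(number of relevant documents, k) is one of several conventions in use, and the function names and toy rankings are hypothetical.

```python
def average_precision_at_k(ranked_ids, relevant_ids, k=50):
    """Average precision over the top-k results of one query.

    Assumes document ids in ranked_ids are unique.
    """
    relevant = set(relevant_ids)
    if not relevant:
        return 0.0
    hits = 0
    precision_sum = 0.0
    for rank, doc_id in enumerate(ranked_ids[:k], start=1):
        if doc_id in relevant:
            hits += 1
            precision_sum += hits / rank   # precision at this rank
    # Assumed normalisation: best achievable number of hits within k.
    return precision_sum / min(len(relevant), k)


def mean_average_precision_at_k(runs, k=50):
    """Mean of per-query average precision; runs is a list of
    (ranked_ids, relevant_ids) pairs, one per query."""
    return sum(average_precision_at_k(r, rel, k) for r, rel in runs) / len(runs)


# Two hypothetical queries with their rankings and relevance judgements.
runs = [
    (["d1", "d2", "d3", "d4"], {"d1", "d3"}),
    (["d5", "d6"], {"d6"}),
]
map_at_50 = mean_average_precision_at_k(runs, k=50)
print(round(map_at_50, 4))
```

Scores computed this way lie between 0 and 1; the figures on the slide appear to be reported on a 0-100 scale.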
Discussion
The 3 evaluated improvements in language models for IR:
require little additional computation
can be implemented with small modifications to existing IR systems
are substantial, significant and cumulative across 13 standard datasets, compared to DP and 2SS baselines (4.07 MAP@50 absolute, 17.1% relative)
Improvements requiring more computation are possible: document neighbourhood smoothing, word correlation models, passage-based LMs, bigram LMs, …
More extensive evaluations are needed to confirm progress