Estimating Topical Context by Diverging from External Resources SIGIR’13, July 28–August 1, 2013, Dublin, Ireland. Presenter: SHIH, KAI WUN. Paper by Romain Deveaud, Eric SanJuan, and Patrice Bellot.




Similar presentations
Date : 2013/05/27 Author : Anish Das Sarma, Lujun Fang, Nitin Gupta, Alon Halevy, Hongrae Lee, Fei Wu, Reynold Xin, Gong Yu Source : SIGMOD’12 Speaker.

A Machine Learning Approach for Improved BM25 Retrieval
Query Dependent Pseudo-Relevance Feedback based on Wikipedia SIGIR ‘09 Advisor: Dr. Koh Jia-Ling Speaker: Lin, Yi-Jhen Date: 2010/01/24 1.
1 Entity Ranking Using Wikipedia as a Pivot (CIKM 10’) Rianne Kaptein, Pavel Serdyukov, Arjen de Vries, Jaap Kamps 2010/12/14 Yu-wen,Hsu.
Predicting Text Quality for Scientific Articles AAAI/SIGART-11 Doctoral Consortium Annie Louis : Louis A. and Nenkova A Automatically.
1 Block-based Web Search Deng Cai *1, Shipeng Yu *2, Ji-Rong Wen * and Wei-Ying Ma * * Microsoft Research Asia 1 Tsinghua University 2 University of Munich.
Chapter 5: Query Operations Baeza-Yates, 1999 Modern Information Retrieval.
ReQuest (Validating Semantic Searches) Norman Piedade de Noronha 16 th July, 2004.
An investigation of query expansion terms Gheorghe Muresan Rutgers University, School of Communication, Information and Library Science 4 Huntington St.,
Integrating Multiple Resources for Diversified Query Expansion Arbi Bouchoucha, Xiaohua Liu, and Jian-Yun Nie Dept. of Computer Science and Operations.
Scalable Text Mining with Sparse Generative Models
The Relevance Model  A distribution over terms, given information need I, (Lavrenko and Croft 2001). For term r, P(I) can be dropped w/o affecting the.
JOURNAL OF INFORMATION SCIENCE AND ENGINEERING 30, (2014) BERLIN CHEN, YI-WEN CHEN, KUAN-YU CHEN, HSIN-MIN WANG2 AND KUEN-TYNG YU Department of Computer.
1 Probabilistic Language-Model Based Document Retrieval.
Multi-Style Language Model for Web Scale Information Retrieval Kuansan Wang, Xiaolong Li and Jianfeng Gao SIGIR 2010 Min-Hsuan Lai Department of Computer.
Probabilistic Model for Definitional Question Answering Kyoung-Soo Han, Young-In Song, and Hae-Chang Rim Korea University SIGIR 2006.
Exploiting Wikipedia as External Knowledge for Document Clustering Sakyasingha Dasgupta, Pradeep Ghosh Data Mining and Exploration-Presentation School.
Leveraging Conceptual Lexicon : Query Disambiguation using Proximity Information for Patent Retrieval Date : 2013/10/30 Author : Parvaz Mahdabi, Shima.
Minimal Test Collections for Retrieval Evaluation B. Carterette, J. Allan, R. Sitaraman University of Massachusetts Amherst SIGIR2006.
Philosophy of IR Evaluation Ellen Voorhees. NIST Evaluation: How well does system meet information need? System evaluation: how good are document rankings?
A Comparative Study of Search Result Diversification Methods Wei Zheng and Hui Fang University of Delaware, Newark DE 19716, USA
Reyyan Yeniterzi Weakly-Supervised Discovery of Named Entities Using Web Search Queries Marius Pasca Google CIKM 2007.
1 Retrieval and Feedback Models for Blog Feed Search SIGIR 2008 Advisor : Dr. Koh Jia-Ling Speaker : Chou-Bin Fan Date :
1 Formal Models for Expert Finding on DBLP Bibliography Data Presented by: Hongbo Deng Co-worked with: Irwin King and Michael R. Lyu Department of Computer.
Linking Wikipedia to the Web Antonio Flores Bernal Department of Computer Sciencies San Pablo Catholic University 2010.
A Simple Unsupervised Query Categorizer for Web Search Engines Prashant Ullegaddi and Vasudeva Varma Search and Information Extraction Lab Language Technologies.
1 A Unified Relevance Model for Opinion Retrieval (CIKM 09’) Xuanjing Huang, W. Bruce Croft Date: 2010/02/08 Speaker: Yu-Wen, Hsu.
A Markov Random Field Model for Term Dependencies Donald Metzler W. Bruce Croft Present by Chia-Hao Lee.
Bayesian Extension to the Language Model for Ad Hoc Information Retrieval Hugo Zaragoza, Djoerd Hiemstra, Michael Tipping Presented by Chen Yi-Ting.
Mining the Web to Create Minority Language Corpora Rayid Ghani Accenture Technology Labs - Research Rosie Jones Carnegie Mellon University Dunja Mladenic.
April 14, 2003Hang Cui, Ji-Rong Wen and Tat- Seng Chua 1 Hierarchical Indexing and Flexible Element Retrieval for Structured Document Hang Cui School of.
Probabilistic Query Expansion Using Query Logs Hang Cui Tianjin University, China Ji-Rong Wen Microsoft Research Asia, China Jian-Yun Nie University of.
Retrieval Models for Question and Answer Archives Xiaobing Xue, Jiwoon Jeon, W. Bruce Croft Computer Science Department University of Massachusetts, Google,
Exploring Text: Zipf’s Law and Heaps’ Law. (a) (b) (a) Distribution of sorted word frequencies (Zipf’s law) (b) Distribution of size of the vocabulary.
DISCRIMINATIVE TRAINING OF LANGUAGE MODELS FOR SPEECH RECOGNITION Hong-Kwang Jeff Kuo, Eric Fosler-Lussier, Hui Jiang, Chin-Hui Lee ICASSP 2002 Min-Hsuan.
INTERESTING NUGGETS AND THEIR IMPACT ON DEFINITIONAL QUESTION ANSWERING Kian-Wei Kor, Tat-Seng Chua Department of Computer Science School of Computing.
Context-Sensitive Information Retrieval Using Implicit Feedback Xuehua Shen : department of Computer Science University of Illinois at Urbana-Champaign.
CS 533 Information Retrieval Systems.  Introduction  Connectivity Analysis  Kleinberg’s Algorithm  Problems Encountered  Improved Connectivity Analysis.
Predicting Question Quality Bruce Croft and Stephen Cronen-Townsend University of Massachusetts Amherst.
Probabilistic Models of Novel Document Rankings for Faceted Topic Retrieval Ben Cartrette and Praveen Chandar Dept. of Computer and Information Science.
Enhancing Cluster Labeling Using Wikipedia David Carmel, Haggai Roitman, Naama Zwerdling IBM Research Lab (SIGIR’09) Date: 11/09/2009 Speaker: Cho, Chin.
1 Opinion Retrieval from Blogs Wei Zhang, Clement Yu, and Weiyi Meng (2007 CIKM)
1 Using The Past To Score The Present: Extending Term Weighting Models with Revision History Analysis CIKM’10 Advisor : Jia Ling, Koh Speaker : SHENG HONG,
Iterative Translation Disambiguation for Cross Language Information Retrieval Christof Monz and Bonnie J. Dorr Institute for Advanced Computer Studies.
Positional Relevance Model for Pseudo–Relevance Feedback Yuanhua Lv & ChengXiang Zhai Department of Computer Science, UIUC Presented by Bo Man 2014/11/18.
Semantic v.s. Positions: Utilizing Balanced Proximity in Language Model Smoothing for Information Retrieval Rui Yan†, ♮, Han Jiang†, ♮, Mirella Lapata‡,
Creating Subjective and Objective Sentence Classifier from Unannotated Texts Janyce Wiebe and Ellen Riloff Department of Computer Science University of.
Comparing Document Segmentation for Passage Retrieval in Question Answering Jorg Tiedemann University of Groningen presented by: Moy’awiah Al-Shannaq
Mining Dependency Relations for Query Expansion in Passage Retrieval Renxu Sun, Chai-Huat Ong, Tat-Seng Chua National University of Singapore SIGIR2006.
1 Evaluating High Accuracy Retrieval Techniques Chirag Shah,W. Bruce Croft Center for Intelligent Information Retrieval Department of Computer Science.
NTNU Speech Lab Dirichlet Mixtures for Query Estimation in Information Retrieval Mark D. Smucker, David Kulp, James Allan Center for Intelligent Information.
1 Adaptive Subjective Triggers for Opinionated Document Retrieval (WSDM 09’) Kazuhiro Seki, Kuniaki Uehara Date: 11/02/09 Speaker: Hsu, Yu-Wen Advisor:
Query Suggestions in the Absence of Query Logs Sumit Bhatia, Debapriyo Majumdar,Prasenjit Mitra SIGIR’11, July 24–28, 2011, Beijing, China.
1 What Makes a Query Difficult? David Carmel, Elad YomTov, Adam Darlow, Dan Pelleg IBM Haifa Research Labs SIGIR 2006.
Divided Pretreatment to Targets and Intentions for Query Recommendation Reporter: Yangyang Kang /23.
The Loquacious ( 愛說話 ) User: A Document-Independent Source of Terms for Query Expansion Diane Kelly et al. University of North Carolina at Chapel Hill.
DISTRIBUTED INFORMATION RETRIEVAL Lee Won Hee.
A Supervised Machine Learning Algorithm for Research Articles Leonidas Akritidis, Panayiotis Bozanis Dept. of Computer & Communication Engineering, University.
A Generation Model to Unify Topic Relevance and Lexicon-based Sentiment for Opinion Retrieval Min Zhang, Xinyao Ye Tsinghua University SIGIR
Indri at TREC 2004: UMass Terabyte Track Overview Don Metzler University of Massachusetts, Amherst.
A Framework to Predict the Quality of Answers with Non-Textual Features Jiwoon Jeon, W. Bruce Croft(University of Massachusetts-Amherst) Joon Ho Lee (Soongsil.
SIGIR 2005 Relevance Information: A Loss of Entropy but a Gain for IDF? Arjen P. de Vries Thomas Roelleke,
Usefulness of Quality Click- through Data for Training Craig Macdonald, ladh Ounis Department of Computing Science University of Glasgow, Scotland, UK.
University Of Seoul Ubiquitous Sensor Network Lab Query Dependent Pseudo-Relevance Feedback based on Wikipedia, Department of Electrical and Computer Engineering, USN Lab G
LEARNING IN A PAIRWISE TERM-TERM PROXIMITY FRAMEWORK FOR INFORMATION RETRIEVAL Ronan Cummins, Colm O’Riordan (SIGIR’09) Speaker : Yi-Ling Tai Date : 2010/03/15.
An Effective Statistical Approach to Blog Post Opinion Retrieval Ben He, Craig Macdonald, Jiyin He, Iadh Ounis (CIKM 2008)
and Knowledge Graphs for Query Expansion Saeid Balaneshinkordan
Martin Rajman, Martin Vesely
Compact Query Term Selection Using Topically Related Text
Relevance and Reinforcement in Interactive Browsing
Presentation transcript:

Estimating Topical Context by Diverging from External Resources SIGIR’13, July 28–August 1, 2013, Dublin, Ireland. Presenter: SHIH, KAI WUN. Romain Deveaud, University of Avignon - LIA, Avignon, France; Eric SanJuan, University of Avignon - LIA, Avignon, France; Patrice Bellot, Aix-Marseille University - LSIS, Marseille, France.

Outline: Introduction, Divergence From Resources, Experiments, Conclusion & Future Work

Introduction (1/3) Automatically retrieving documents that are relevant to an initial information need may be challenging without additional information about the topical context of the query. One common approach to tackle this problem is to extract evidence from query-related documents. The basic idea is to expand the query with words or multi-word terms extracted from feedback documents. Words that convey the most information or that are the most relevant to the initial query are then used to reformulate the query. These terms can come from the target collection or from external sources, and several sources can be combined.

Introduction (2/3) Documents are then ranked based, among other factors, on their similarity to the estimated topical context. We explore the opposite direction and carry out experiments with a method that discounts document scores based on their divergences from pseudo-relevant subsets of external resources. We allow the method to take several resources into account and to weight the divergences in order to provide a comprehensive interpretation of the topical context. Moreover, our method considers sequences of 1, 2 or 3 words equally and chooses which terms best describe the topical context without any supervision.

Introduction (3/3) The use of external data sets has been extensively studied in the pseudo-relevance feedback setting, and has proved effective at improving search performance when proper data are chosen. However, studies have mainly concentrated on demonstrating how the use of a single resource could improve performance. Data sources like Wikipedia, WordNet, news corpora or even the web itself were used separately for enhancing search performance.

Divergence From Resources (1/5) In this work, we use a language modeling approach to information retrieval. Our goal is to accurately model the topical context of a query by using external resources.

Divergence From Resources (2/5) We use the Kullback-Leibler divergence to measure the information gain (or drift) between a given resource R and a document D. Formally, the KL divergence between the two language models θ_R and θ_D is written as

\mathrm{KL}(\theta_R \,\|\, \theta_D) = \sum_{t \in V} P(t|\theta_R) \log \frac{P(t|\theta_R)}{P(t|\theta_D)} = \sum_{t \in V} P(t|\theta_R) \log P(t|\theta_R) - \sum_{t \in V} P(t|\theta_R) \log P(t|\theta_D)

where t is a term belonging to the vocabulary V. The first part is the (negative) resource entropy and does not affect the ranking of documents, which allows us to simplify the KL divergence and to obtain equation (1):

\mathcal{D}(R, D) = -\sum_{t \in V} P(t|\theta_R) \log P(t|\theta_D) \qquad (1)
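To make this rank-equivalence concrete, here is a minimal Python sketch with hypothetical toy distributions: it checks that ordering documents by the full KL divergence and by the simplified form of equation (1) yields the same ranking, since the dropped entropy term is constant for a given resource.

```python
import math

def kl(p, q):
    """Full KL divergence KL(p || q) over a shared vocabulary."""
    return sum(p[t] * math.log(p[t] / q[t]) for t in p if p[t] > 0)

def simplified_divergence(p, q):
    """Only the document-dependent part: -sum_t p(t) log q(t), as in equation (1)."""
    return -sum(p[t] * math.log(q[t]) for t in p if p[t] > 0)

# Toy resource model and two toy document models (hypothetical numbers).
resource = {"jaguar": 0.5, "car": 0.3, "speed": 0.2}
docs = {
    "A": {"jaguar": 0.4, "car": 0.4, "speed": 0.2},
    "B": {"jaguar": 0.1, "car": 0.2, "speed": 0.7},
}

rank_full = sorted(docs, key=lambda d: kl(resource, docs[d]))
rank_simplified = sorted(docs, key=lambda d: simplified_divergence(resource, docs[d]))
assert rank_full == rank_simplified  # dropping the resource entropy does not change the ordering
```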

Divergence From Resources (3/5) In order to capture the topical context from the resource, we estimate the θ_R model through pseudo-relevance feedback. Given a ranked list R_Q obtained by retrieving the top N documents of R using query likelihood, the feedback query model θ_{R_Q} is estimated from the terms of R_Q; this estimation is actually equivalent to computing the entropy of the term t in the pseudo-relevant subset R_Q. When forming the set V, we slide a window over the entire textual content of R_Q and consider all sequences of 1, 2 or 3 words.
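The transcript does not reproduce the exact estimator, so the sketch below only illustrates the general recipe under stated assumptions: collect all 1-, 2- and 3-word sequences from the top-N pseudo-relevant documents with a sliding window, weight each term by its entropy contribution in that text (a stand-in consistent with the slide's description, not necessarily the paper's formula), and keep the k best terms as the feedback model. Function names and the weighting choice are illustrative.

```python
import math
from collections import Counter

def ngrams(tokens, max_n=3):
    """Slide a window over the text and emit all 1-, 2- and 3-word sequences."""
    for n in range(1, max_n + 1):
        for i in range(len(tokens) - n + 1):
            yield " ".join(tokens[i:i + n])

def feedback_model(pseudo_relevant_docs, max_n=3, k=20):
    """Estimate a feedback model over the top-N documents R_Q of a resource (sketch)."""
    counts = Counter()
    for doc in pseudo_relevant_docs:
        counts.update(ngrams(doc.lower().split(), max_n))
    total = sum(counts.values())
    # Entropy-style weight -p log p of each term in the pseudo-relevant text (assumption).
    weights = {t: -(c / total) * math.log(c / total) for t, c in counts.items()}
    # Keep the k best terms and renormalize into a probability distribution.
    top = dict(sorted(weights.items(), key=lambda kv: kv[1], reverse=True)[:k])
    z = sum(top.values())
    return {t: w / z for t, w in top.items()}
```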

Divergence From Resources (4/5) Following equation (1), we compute the information divergence between a resource R and a document D as

\mathcal{D}(R, D) = -\sum_{t \in V} P(t|\theta_{R_Q}) \log P(t|\theta_D)

Divergence From Resources (5/5) The final score of a document D with respect to a given user query Q is determined by the linear combination of query word matches and the weighted divergence from general resources. It is formally written as

s(Q, D) = \lambda \, P(Q|\theta_D) - (1 - \lambda) \sum_{R \in S} \varphi_R \, \mathcal{D}(R, D) \qquad (2)

where S is the set of resources, P(Q|\theta_D) is the standard query likelihood with Dirichlet smoothing and \varphi_R represents the weight given to resource R. We use the information divergence here to reduce the score of a document: the greater the divergence, the lower the score of the document will be.
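A hedged sketch of how such a combination could be wired together, assuming equation (2) as reconstructed above: the document's query-likelihood score is interpolated with a penalty that sums the weighted divergences of the document from each resource's feedback model. The smoothing floor, function names and exact scaling are assumptions for illustration, not the paper's implementation.

```python
import math

def divergence(feedback_model, doc_model):
    """Equation-(1)-style divergence of a document from a resource's feedback model."""
    eps = 1e-10  # floor for terms the document model assigns no mass to (assumption)
    return -sum(p * math.log(doc_model.get(t, eps)) for t, p in feedback_model.items())

def dfres_score(query_likelihood, doc_model, feedback_models, weights, lam=0.5):
    """DfRes-style combination: lambda-weighted query likelihood minus the weighted
    divergences from each external resource (signs and scaling are assumptions)."""
    penalty = sum(weights[r] * divergence(feedback_models[r], doc_model)
                  for r in feedback_models)
    return lam * query_likelihood - (1.0 - lam) * penalty
```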

Experiments - Experimental setup (1/7) We performed our evaluation using two main TREC collections which represent two different search contexts. The first one is the WT10g web collection, which consists of 1,692,096 web pages, along with the associated TREC topics and judgments. The second data set is the Robust04 collection, which is composed of news articles coming from various newspapers.

Experiments - Experimental setup (2/7) The test set contains the 250 topics and relevance judgments of the Robust 2004 track. Along with the test collections, we used a set of external resources from which divergences are computed. This set is composed of four general resources: Wikipedia as an encyclopedic source, the New York Times and GigaWord corpora as sources of news data, and category B of the ClueWeb09 collection as a web source.

Experiments - Experimental setup (3/7) The English GigaWord LDC corpus consists of 4,111,240 newswire articles collected from four distinct international sources including the New York Times. The New York Times LDC corpus contains 1,855,658 news articles published between 1987 and 2007. The Wikipedia collection is a recent dump from May 2012 of the online encyclopedia that contains 3,691,092 documents. The web resource, built from category B of the ClueWeb09 collection, contains 29,038,220 web pages.

Experiments - Experimental setup (4/7) Indexing and retrieval were performed using Indri. We employ Dirichlet smoothing and set the smoothing parameter μ to 1,500. Documents are ranked using equation (2). We compare the performance of the approach presented in Section 2 (DfRes) with that of three baselines: Query Likelihood (QL), Relevance Models (RM3) and a Mixture of Relevance Models (MoRM).
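For reference, a small sketch of the Dirichlet-smoothed document language model used by the QL baseline, with μ = 1,500 as in the slides; `doc_counts` and `collection_prob` are hypothetical inputs standing in for the index statistics that Indri would provide.

```python
import math

def dirichlet_lm(doc_counts, doc_len, collection_prob, mu=1500):
    """Dirichlet-smoothed document model P(t | theta_D) with mu = 1,500."""
    def prob(term):
        return (doc_counts.get(term, 0) + mu * collection_prob(term)) / (doc_len + mu)
    return prob

def query_likelihood(query_terms, doc_counts, doc_len, collection_prob, mu=1500):
    """Log query likelihood under the smoothed model (the QL baseline used for ranking)."""
    p = dirichlet_lm(doc_counts, doc_len, collection_prob, mu)
    return sum(math.log(p(t)) for t in query_terms)
```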

Experiments - Experimental setup (5/7) As shown in Table 1, the MoRM and DfRes approaches both perform feedback using all external resources as well as the target collection, while RM3 only performs feedback using the target collection; QL uses no additional information. RM3, MoRM and DfRes depend on three free parameters: λ, which controls the weight given to the original query; k, the number of feedback terms; and N, the number of feedback documents from which terms are extracted.

Experiments - Experimental setup (6/7) We performed leave-one-query-out cross-validation to find the best parameter setting for λ and averaged the performance over all queries. Previous research by He and Ounis showed that doing PRF with the top 10 pseudo-relevant feedback documents is as effective as doing PRF with only the relevant documents present in the top 10, with no statistically significant differences. Following these findings, we set N = 10, and also k = 20, which was found to be a good PRF setting.
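A minimal sketch of the leave-one-query-out procedure for selecting λ, assuming a hypothetical `evaluate(query, lam)` callback that runs retrieval for one query under a given λ and returns its effectiveness (e.g. average precision); the real setup would call Indri rather than a stub.

```python
def loo_cross_validate_lambda(queries, evaluate, candidate_lambdas):
    """Leave-one-query-out selection of lambda, then average performance over queries."""
    per_query_scores = []
    for held_out in queries:
        train = [q for q in queries if q != held_out]
        # Pick the lambda that works best on all other queries...
        best_lam = max(candidate_lambdas,
                       key=lambda lam: sum(evaluate(q, lam) for q in train) / len(train))
        # ...and score the held-out query with it.
        per_query_scores.append(evaluate(held_out, best_lam))
    return sum(per_query_scores) / len(per_query_scores)  # e.g. MAP over all queries
```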

Experiments - Experimental setup (7/7) DfRes depends on an additional parameter φ_R, which controls the weight given to each resource. We also perform leave-one-query-out cross-validation to learn the best setting for each resource. In the following section, when discussing results obtained using single sources of expansion with DfRes, we use the notation DfRes-r, where r ∈ {Web, Wiki, NYT, Gigaword}.

Experiments - Results (1/8) The main observation we can draw from the ad hoc retrieval results presented in Table 1 is that using a combination of external information sources always performs better than using the target collection alone. DfRes significantly outperforms RM3 on both collections, which confirms previous findings that combining external resources improves retrieval.

Experiments - Results (2/8) We see from Figure 1 that DfRes-Gigaword is ineffective on the WT10g collection. Another remarkable result is the ineffectiveness of the WT10g collection as a single source of expansion.

Experiments - Results (3/8) However, we see from Table 2 that the learned weight φ_R of this resource is very low (0.101), which significantly reduces its influence compared to the best-performing resources (such as NYT or Web). Results are more coherent on the Robust collection: DfRes-NYT and DfRes-Gigaword achieve very good results, while the combination of all resources consistently achieves the best results. The very high weights learned for these resources reflect these good performances.

Experiments - Results (4/8) In this specific setting, it seems that the nature of the well-performing resources is correlated with the nature of the target collection. We observed that NYT and Gigaword articles, which are focused contributions produced by professional writers, are smaller on average (in unique words) than Wikipedia or Web documents.

Experiments - Results (5/8) We explored the influence of the number of feedback documents used for the approximation of each resource. Performance remains almost constant for all resources as N varies: changes in MAP are about 2% from N = 1 to N = 20, depending on the resource.

Experiments - Results (6/8) We also explored the influence of the number of terms used to estimate each resource's model. While we could expect that increasing the number of terms would improve the granularity of the model and perhaps capture more contextual evidence, we see from Figure 2 that using 100 terms is not really different from using 20 terms.

Experiments - Results (7/8) We even see in Figure 1 that relying only on the divergence from resources (i.e. setting λ = 0) achieves better results than relying only on the user query (i.e. setting λ = 1). Moreover, setting λ = 0 for DfRes also outperforms MoRM. This suggests that DfRes is actually better at estimating the topical context of the information need than the user's keyword query.

Experiments - Results (8/8) We also observe from Figures 1 and 2 that the NYT is the resource that provides the best estimation of the topical context for the two collections. One original aspect of DfRes is that it can automatically take n-grams into account without any supervision.

Conclusion & Future Work Accurately estimating the topical context of a query is a challenging issue. We experimented with a method that discounts documents based on their average divergence from a set of external resources. Results showed that, while reinforcing previous research, this method performs at least as well as a state-of-the-art resource combination approach, and sometimes achieves significantly better results.