Query Reformulation: User Relevance Feedback

Introduction
Difficulty of formulating user queries
– Users have insufficient knowledge of the collection make-up
– Users have insufficient knowledge of the retrieval environment
Query reformulation to improve the user query – two basic methods:
– query expansion: expanding the original query with new terms
– term reweighting: reweighting the terms in the expanded query

Introduction
Approaches for query reformulation
– user relevance feedback: based on feedback information from the user
– local analysis: based on information derived from the set of documents initially retrieved (the local set)
– global analysis: based on global information derived from the document collection

User Relevance Feedback
User's role in the URF cycle
– is presented with a list of the retrieved documents
– marks the documents that are relevant
Main idea of URF
– select important terms, or expressions, attached to the documents the user has identified as relevant
– enhance the importance of these terms in the new query formulation
– effect: the new query moves towards the relevant documents and away from the non-relevant ones

User Relevance Feedback
Advantages of URF
– it shields the user from the details of the query reformulation process: users only have to provide a relevance judgment on documents
– it breaks down the whole searching task into a sequence of small steps which are easier to grasp
– it provides a controlled process designed to emphasize relevant terms and de-emphasize non-relevant terms

URF for the Vector Model
Assumptions
– the term-weight vectors of the documents identified as relevant to the query have similarities among themselves
– non-relevant documents have term-weight vectors which are dissimilar from those of the relevant documents
Basic idea
– reformulate the query so that it gets closer to the term-weight vector space of the relevant documents
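"Closeness" in the vector model is conventionally measured with cosine similarity between term-weight vectors. A minimal sketch (the function name and the plain-list vector representation are illustrative, not from the slides):

```python
import math

def cosine_sim(d, q):
    """Cosine of the angle between a document term-weight vector d
    and a query term-weight vector q (the vector-model ranking measure)."""
    dot = sum(dw * qw for dw, qw in zip(d, q))
    norm_d = math.sqrt(sum(w * w for w in d))
    norm_q = math.sqrt(sum(w * w for w in q))
    if norm_d == 0.0 or norm_q == 0.0:
        return 0.0  # an empty vector matches nothing
    return dot / (norm_d * norm_q)
```

Moving the query vector towards the relevant documents raises its cosine similarity with them, which is exactly what the feedback formulas below exploit.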

The Perfect (Vector Model) Query
Assume we know which documents are relevant and which are not.
Given:
– a collection of N documents
– C_r : the set of relevant documents
What is the optimal query?
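The equation on the original slide did not survive the transcript; in the standard vector-model treatment, with complete relevance knowledge the optimal query is the one that best separates relevant from non-relevant documents:

```latex
\vec{q}_{opt} \;=\; \frac{1}{|C_r|} \sum_{\vec{d}_j \in C_r} \vec{d}_j
\;-\; \frac{1}{N - |C_r|} \sum_{\vec{d}_j \notin C_r} \vec{d}_j
```

That is, the centroid of the relevant documents minus the centroid of the non-relevant ones.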

Back to Reality
In reality, what we are trying to figure out is which documents are relevant and which are not.
Our ideal query & definitions:
– a collection of N documents
– C_r : the set of relevant documents
– D_r : the set of retrieved documents the user identified as relevant
– D_n : the set of retrieved documents identified as non-relevant
– α, β, γ : tuning constants
Modified query (Rocchio)
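The modified-query equation itself was lost in transcription; the standard Rocchio formulation, using the definitions above, is:

```latex
\vec{q}_m \;=\; \alpha\,\vec{q}
\;+\; \frac{\beta}{|D_r|} \sum_{\vec{d}_j \in D_r} \vec{d}_j
\;-\; \frac{\gamma}{|D_n|} \sum_{\vec{d}_j \in D_n} \vec{d}_j
```

It keeps a weighted copy of the original query and nudges it towards the centroid of the judged-relevant documents and away from the centroid of the judged non-relevant ones.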

Rocchio & Ide Variations
Standard Rocchio
Ide (Regular)
Ide (Dec-Hi)
where max_nonrelevant(d_j) is the highest-ranked non-relevant document
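The three variant equations were dropped from the transcript; reconstructed from the standard textbook forms:

```latex
\text{Standard Rocchio:}\quad
\vec{q}_m = \alpha\,\vec{q}
+ \frac{\beta}{|D_r|} \sum_{\vec{d}_j \in D_r} \vec{d}_j
- \frac{\gamma}{|D_n|} \sum_{\vec{d}_j \in D_n} \vec{d}_j

\text{Ide (Regular):}\quad
\vec{q}_m = \alpha\,\vec{q}
+ \beta \sum_{\vec{d}_j \in D_r} \vec{d}_j
- \gamma \sum_{\vec{d}_j \in D_n} \vec{d}_j

\text{Ide (Dec-Hi):}\quad
\vec{q}_m = \alpha\,\vec{q}
+ \beta \sum_{\vec{d}_j \in D_r} \vec{d}_j
- \gamma\,\max_{\text{non-relevant}}(\vec{d}_j)
```

The Ide variants drop the normalization by |D_r| and |D_n|; Dec-Hi subtracts only the single highest-ranked non-relevant document instead of all of them.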

Tuning the Feedback
Modified query
How do we set the tuning constants α, β, γ?
– Rocchio originally set α = 1
– Ide originally set α = β = γ = 1
Often, positive relevance feedback is more valuable than negative relevance feedback.
– this implies β > γ
– a purely positive feedback mechanism uses γ = 0
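The standard Rocchio update with tunable constants can be sketched in a few lines. This is a minimal illustration, not the slides' own code; the β > γ defaults are one common illustrative choice reflecting that positive feedback usually matters more:

```python
import numpy as np

def rocchio(query, relevant, nonrelevant, alpha=1.0, beta=0.75, gamma=0.15):
    """Standard Rocchio query modification over term-weight vectors.

    query        : sequence of term weights for the original query
    relevant     : list of term-weight vectors the user marked relevant (D_r)
    nonrelevant  : list of term-weight vectors marked non-relevant (D_n)
    Setting gamma = 0 gives a purely positive feedback mechanism.
    """
    q = alpha * np.asarray(query, dtype=float)
    if relevant:                                 # beta / |D_r| * sum(...)
        q += beta * np.mean(relevant, axis=0)
    if nonrelevant:                              # gamma / |D_n| * sum(...)
        q -= gamma * np.mean(nonrelevant, axis=0)
    # negative term weights have no meaning in the vector model; clip to zero
    return np.maximum(q, 0.0)
```

With α = β = γ = 1 this reduces to a normalized form of Ide's original setting; in practice the constants are tuned empirically.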

URF for the Vector Model
Includes both query expansion and term reweighting
Advantages
– simplicity: modified term weights are computed directly from the set of retrieved documents
– good results: the modified query vector does reflect a portion of the intended query semantics
Issue: as with all learning techniques, this assumes the information need is relatively static.

Evaluation of Relevance Feedback Strategies
A simplistic evaluation is to compare the results of the modified query to those of the original query.
– This does not work!
– The results look great, but mostly because documents returned by the original query are ranked higher.
– The user has already seen these documents.

Evaluation of Relevance Feedback Strategies
A more realistic evaluation:
– Compute precision and recall on the residual collection (the documents not returned by the original query).
– Because highly-ranked documents are removed, these results can be worse than for the original query.
– That is okay if we are comparing between relevance feedback approaches.
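Residual-collection evaluation amounts to filtering out the already-seen documents before scoring. A minimal sketch (function name and set-based representation are illustrative assumptions):

```python
def residual_precision_recall(ranked, relevant, seen):
    """Precision and recall on the residual collection.

    ranked   : list of doc ids returned by the modified query, best first
    relevant : set of all relevant doc ids
    seen     : collection of doc ids shown to the user by the original query
    Documents the user has already seen are removed from both the ranking
    and the relevant set before computing the measures.
    """
    residual = [d for d in ranked if d not in seen]
    residual_relevant = relevant - set(seen)
    retrieved_relevant = sum(1 for d in residual if d in residual_relevant)
    precision = retrieved_relevant / len(residual) if residual else 0.0
    recall = (retrieved_relevant / len(residual_relevant)
              if residual_relevant else 0.0)
    return precision, recall
```

Because the easiest-to-retrieve relevant documents are the ones removed, absolute scores drop; the numbers are only meaningful for comparing feedback strategies against each other.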