
Chapter 5: Query Operations Hassan Bashiri April
Presentation transcript:

Automatically obtain a description for a larger cluster of relevant documents
• Identify terms related to the query terms
• Synonyms, stemming variations, terms close to the query terms
Local analysis
• Use correlated terms from the retrieved documents for query expansion

Three types of clusters: association, metric, and scalar
Association clusters
• Stems that co-occur frequently inside documents have a synonymity association

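Correlation between stems S_u and S_v over the retrieved documents d_j, where f_{u,j} is the frequency of S_u in d_j
• C_{u,v} = \sum_j f_{u,j} \times f_{v,j}
Un-normalized correlation factor
• S_{u,v} = C_{u,v}
Normalized correlation factor
• S_{u,v} = C_{u,v} / (C_{u,u} + C_{v,v} - C_{u,v})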

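• Build local association clusters: for each stem S_u, take the n stems S_v with the largest correlation S_{u,v}
• Find the clusters for the query terms and use their stems to expand the query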
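The association-cluster construction can be sketched in a few lines of Python. This is a minimal illustration, not the presenter's implementation: it assumes the co-occurrence definition C_{u,v} = sum over retrieved documents of f_{u,j} × f_{v,j} and the normalization C_{u,v} / (C_{u,u} + C_{v,v} - C_{u,v}). The function names and the toy documents are hypothetical.

```python
from collections import Counter


def association_matrix(docs):
    """docs: the local (retrieved) document set, each a list of stems.

    Returns c with c[u][v] = sum over documents of f(u, doc) * f(v, doc).
    """
    freqs = [Counter(doc) for doc in docs]
    stems = sorted({s for doc in docs for s in doc})
    return {u: {v: sum(f[u] * f[v] for f in freqs) for v in stems}
            for u in stems}


def normalized_association(c, u, v):
    """Assumed normalization: c_uv / (c_uu + c_vv - c_uv)."""
    return c[u][v] / (c[u][u] + c[v][v] - c[u][v])


def local_cluster(c, u, n, normalize=False):
    """The n stems most strongly correlated with stem u (ties broken by name)."""
    def score(v):
        return normalized_association(c, u, v) if normalize else c[u][v]
    others = [v for v in c if v != u]
    return sorted(others, key=lambda v: (-score(v), v))[:n]


# Hypothetical toy data: three retrieved documents, already stemmed.
docs = [["query", "expans", "term"],
        ["query", "term", "term"],
        ["cluster", "stem"]]
c = association_matrix(docs)
neighbors = local_cluster(c, "query", 2)  # candidate expansion stems for "query"
```

For a query, the clusters of its stems would be looked up this way and their members added as expansion terms; how many neighbors to take (n) is a tuning choice.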

Metric clusters
• Consider the distance between two terms when computing their correlation factor

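Correlation between stems S_u and S_v, where V(S) is the set of keywords with stem S and r(k_i, k_j) is the distance in words between keywords k_i and k_j in the same document
• C_{u,v} = \sum_{k_i \in V(S_u)} \sum_{k_j \in V(S_v)} 1 / r(k_i, k_j)
Un-normalized correlation factor
• S_{u,v} = C_{u,v}
Normalized correlation factor
• S_{u,v} = C_{u,v} / (|V(S_u)| \times |V(S_v)|)
• Build local metric clusters by taking, for each stem S_u, the n stems S_v with the largest S_{u,v}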
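A small Python sketch of the metric correlation follows. It assumes that the correlation sums 1/r over every pair of keyword occurrences in the same document, with r the distance in word positions, and that normalization divides by the number of distinct keyword variants of each stem. The `stem` argument and the toy stemmer are hypothetical stand-ins for a real stemmer.

```python
from collections import defaultdict


def metric_correlations(docs, stem):
    """docs: list of documents, each a list of keywords in text order.
    stem: function mapping a keyword to its stem (assumed, supplied by caller).

    Returns (c, variants): c[(s_u, s_v)] is the sum of 1/r over pairs of
    occurrences in the same document; variants[s] is the set V(s) of
    distinct keywords sharing stem s.
    """
    c = defaultdict(float)
    variants = defaultdict(set)
    for doc in docs:
        for i, ki in enumerate(doc):
            variants[stem(ki)].add(ki)
            for j, kj in enumerate(doc):
                if i != j:  # keywords in different documents never contribute
                    c[stem(ki), stem(kj)] += 1.0 / abs(i - j)
    return c, variants


def normalized_metric(c, variants, u, v):
    """Assumed normalization: c_uv / (|V(s_u)| * |V(s_v)|)."""
    return c.get((u, v), 0.0) / (len(variants[u]) * len(variants[v]))


def toy_stem(word):
    """Hypothetical stemmer for the example: strips a trailing 's'."""
    return word[:-1] if word.endswith("s") else word


c, variants = metric_correlations([["term", "weights", "terms"]], toy_stem)
strength = normalized_metric(c, variants, "term", "weight")
```

Because closer occurrences contribute larger values of 1/r, stems that tend to appear next to each other end up more strongly correlated than stems that merely share a document.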

Scalar clusters
• Two stems with similar neighborhoods have some synonymity relationship
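One common way to compare neighborhoods is to treat each stem's row of correlation values as a vector and take the cosine of the angle between two rows. The sketch below assumes that formulation; the input format (a dict of dicts of correlation values) and the example numbers are hypothetical.

```python
import math


def scalar_correlation(s, u, v):
    """Cosine between the correlation vectors of stems u and v.

    s[u][x] is the (association or metric) correlation between u and x;
    missing entries are treated as zero.
    """
    keys = set(s[u]) | set(s[v])
    dot = sum(s[u].get(k, 0.0) * s[v].get(k, 0.0) for k in keys)
    norm_u = math.sqrt(sum(x * x for x in s[u].values()))
    norm_v = math.sqrt(sum(x * x for x in s[v].values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0  # a stem with no neighbors is similar to nothing
    return dot / (norm_u * norm_v)


# Hypothetical correlation rows: "car" and "auto" share the same neighbors.
s = {
    "car": {"engine": 3.0, "wheel": 4.0},
    "auto": {"engine": 3.0, "wheel": 4.0},
    "tree": {"leaf": 5.0},
}
same_neighborhood = scalar_correlation(s, "car", "auto")
no_overlap = scalar_correlation(s, "car", "tree")
```

Note that "car" and "auto" score highly here without ever needing to co-occur with each other; only their neighbors have to match.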

• A stem S_u is a neighbor of S_v if S_u belongs to a cluster (of size n) associated with S_v
• Neighbor stems having a synonymity relationship are not necessarily synonyms in the grammatical sense
• The union of un-normalized and normalized clusters provides a better representation of possible correlations
• Metric clusters seem to perform better than purely association clusters

Global analysis
• Expand the query using information from the whole set of documents in the collection
• Build a thesaurus-like structure
• Select terms for expansion based on their similarity to the whole query
• Previous approaches, which considered individual query terms, failed to yield good results

Query expansion is done in three steps
• Represent the query as a weighted combination of its term vectors: q = \sum_{k_i \in q} w_{i,q} k_i

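• Compute the similarity between the whole query and each term k_v correlated to the query terms: sim(q, k_v) = \sum_{k_u \in q} w_{u,q} \times C_{u,v}, where C_{u,v} is the correlation between terms k_u and k_v in the global thesaurus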

• Expand the query with the top r terms, ranked according to the computed similarity
• Yields improved retrieval performance in the range of 20%
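The three steps can be sketched as follows, assuming the similarity of a candidate term to the whole query is the weighted sum of its thesaurus correlations with each query term. The function name, the thesaurus format (a dict of dicts), and the example weights are hypothetical; building the thesaurus itself is outside this sketch.

```python
def expand_query(query_weights, term_corr, r):
    """Rank candidate terms by similarity to the whole query; keep the top r.

    query_weights: {query term: weight in the query}
    term_corr: {term u: {term v: correlation between u and v}}
    Returns (top_terms, scores), where scores[v] = sum_u w_u * c_uv.
    """
    scores = {}
    for u, w in query_weights.items():
        for v, c_uv in term_corr.get(u, {}).items():
            if v not in query_weights:  # only terms not already in the query
                scores[v] = scores.get(v, 0.0) + w * c_uv
    ranked = sorted(scores, key=lambda v: (-scores[v], v))
    return ranked[:r], scores


# Hypothetical thesaurus fragment and query.
query_weights = {"jaguar": 1.0, "speed": 0.5}
term_corr = {
    "jaguar": {"cat": 0.5, "car": 0.5},
    "speed": {"car": 0.5, "fast": 0.25},
}
top_terms, scores = expand_query(query_weights, term_corr, 2)
```

In this toy case "car" and "cat" are equally correlated with "jaguar" alone, but "car" also correlates with "speed", so it ranks first once the whole query is considered. This is the effect the slides contrast with approaches that expand each query term individually.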