Modern Information Retrieval, Chapter 3: Retrieval Evaluation

The most common measures of system performance are time and space, and there is an inherent tradeoff between them. For data retrieval, time and space (for indexing) are the usual measures. For information retrieval, the precision of the answer set is also important.

Evaluation considerations:
- queries with or without relevance feedback
- query interface design
- real vs. synthetic data
- real-life vs. laboratory environment
- repeatability and scalability

Recall and precision:
- recall: the fraction of the relevant documents that has been retrieved
- precision: the fraction of the retrieved documents that is relevant
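A minimal sketch in Python of these two fractions, assuming the relevant set and the answer set are available as sets of document IDs (the function name is illustrative, not from the chapter):

```python
# Precision and recall for a single query, given the relevant set Rq
# and the answer set returned by the system.
def precision_recall(relevant, retrieved):
    relevant, retrieved = set(relevant), set(retrieved)
    hits = len(relevant & retrieved)                 # relevant docs retrieved
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall

# Example: 3 relevant documents, 5 retrieved, 2 of them relevant.
p, r = precision_recall({"d3", "d56", "d129"}, {"d3", "d56", "d7", "d9", "d11"})
print(p, r)   # 0.4 0.666...
```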

Can we precisely compute precision? Can we precisely compute recall? (Computing recall exactly requires knowing every relevant document in the collection, which is rarely feasible for large collections.)

The precision versus recall curve: a standard evaluation strategy.

Interpolation procedure for generating the 11 standard recall levels (0%, 10%, ..., 100%): let r_j, with j in {0, 1, 2, ..., 10}, denote the j-th standard recall level. The interpolated precision at r_j is

    P(r_j) = max { P(r) : r_j <= r <= r_(j+1) }

where P(r) is a known (observed) precision at recall level r. Example: for Rq = {d3, d56, d129}, a query with three relevant documents, the observed (recall, precision) points are interpolated onto the 11 standard levels.
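A sketch of the interpolation, assuming `ranking` is the system's ranked list of document IDs for the query. The code uses the common convention of taking the maximum precision at any recall at or above r_j, which yields the usual non-increasing interpolated curve:

```python
# 11-point interpolated precision for a single query.
def eleven_point_interpolated(ranking, relevant):
    relevant = set(relevant)
    points, hits = [], 0
    # Record (recall, precision) each time a relevant document is retrieved.
    for i, doc in enumerate(ranking, start=1):
        if doc in relevant:
            hits += 1
            points.append((hits / len(relevant), hits / i))
    # Interpolated precision at recall level r_j = j/10.
    return [max((p for r, p in points if r >= j / 10), default=0.0)
            for j in range(11)]

curve = eleven_point_interpolated(["d123", "d84", "d56", "d6", "d8", "d3"],
                                  {"d3", "d56", "d129"})
```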

To evaluate a retrieval strategy over all test queries, the precision figures at each recall level are averaged:

    P(r_j) = (1/Nq) * sum over i = 1..Nq of P_i(r_j)

where P_i(r_j) is the interpolated precision at recall level r_j for the i-th query and Nq is the number of queries.
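A small sketch of the averaging step, given one 11-point curve per query from the sketch above:

```python
# Average interpolated precision over all Nq test queries.
def average_curve(curves):
    nq = len(curves)
    return [sum(curve[j] for curve in curves) / nq for j in range(11)]
```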

Another approach: compute the average precision at given document cutoff values (e.g., after 5, 10, 20, 50, or 100 documents have been retrieved). Advantages?
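One possible advantage: no interpolation is needed, and the figures reflect what a user actually sees near the top of the ranking. A sketch, with illustrative cutoff values:

```python
# Precision after k documents have been retrieved, for several cutoffs k.
def precision_at_cutoffs(ranking, relevant, cutoffs=(5, 10, 20, 50, 100)):
    relevant = set(relevant)
    return {k: sum(1 for d in ranking[:k] if d in relevant) / k
            for k in cutoffs}
```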

Single-value summaries for each query (both sketched below):
- average precision at seen relevant documents (example in Figure 3.2): favors systems which retrieve relevant documents quickly; such a system can still have poor overall recall performance
- R-precision: the precision at the R-th position in the ranking, where R is the total number of relevant documents (examples in Figures 3.2 and 3.3)
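Sketches of both summaries; here the average is taken over the relevant documents actually seen in the ranking, as the measure's name suggests:

```python
# Average precision at seen relevant documents.
def avg_precision_seen_relevant(ranking, relevant):
    relevant = set(relevant)
    precisions, hits = [], 0
    for i, doc in enumerate(ranking, start=1):
        if doc in relevant:
            hits += 1
            precisions.append(hits / i)   # precision when this relevant doc appears
    return sum(precisions) / len(precisions) if precisions else 0.0

# R-precision: precision at position R, where R = total number of relevant docs.
def r_precision(ranking, relevant):
    relevant = set(relevant)
    R = len(relevant)
    return sum(1 for d in ranking[:R] if d in relevant) / R if R else 0.0
```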

Precision histogram: the difference in R-precision between two algorithms, plotted query by query, to compare their retrieval performance.

Combining recall and precision: the harmonic mean

    F = 2 / (1/r + 1/P)

It assumes a high value only when both recall and precision are high.

The E measure:

    E = 1 - (1 + b^2) / (b^2/r + 1/P)

- b = 1: the complement of the harmonic mean (E = 1 - F)
- b > 1: the user is more interested in precision
- b < 1: the user is more interested in recall

Both measures are sketched in code below.
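A sketch of both measures, taking the recall r and precision P observed at some position in the ranking as inputs:

```python
# Harmonic mean of recall and precision.
def f_measure(r, p):
    return 2 * r * p / (r + p) if (r + p) else 0.0   # same as 2 / (1/r + 1/P)

# E measure; with b = 1 it reduces to 1 - F.
def e_measure(r, p, b=1.0):
    if r == 0 or p == 0:
        return 1.0
    return 1 - (1 + b * b) / (b * b / r + 1 / p)
```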

User-oriented measures

Coverage ratio: the fraction of the documents known to the user to be relevant that has been retrieved. A high coverage ratio means the system finds most of the relevant documents the user expected to see.

Novelty ratio: the fraction of the relevant documents retrieved that was previously unknown to the user. A high novelty ratio means the system reveals many new relevant documents.

Relative recall: the ratio between the number of relevant documents found and the number of relevant documents the user expected to find,

    relative recall = (relevant documents found) / (relevant documents expected)

When relative recall reaches 1, the user has found as many relevant documents as expected and stops searching.

Recall effort: the ratio between the number of relevant documents the user expected to find and the number of documents that had to be examined to find them. A sketch covering all four user-oriented measures follows.
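All parameter names below are illustrative, and the inputs (which relevant documents the user already knew, how many documents were examined) come from observing the user, not from the system:

```python
# found: relevant documents retrieved by the system
# known: relevant documents previously known to the user
# expected: number of relevant documents the user expected to find
# examined: number of documents the user examined while searching
def user_oriented_measures(found, known, expected, examined):
    found, known = set(found), set(known)
    coverage_ratio = len(found & known) / len(known) if known else 0.0
    novelty_ratio = len(found - known) / len(found) if found else 0.0
    relative_recall = len(found) / expected if expected else 0.0
    recall_effort = expected / examined if examined else 0.0
    return coverage_ratio, novelty_ratio, relative_recall, recall_effort
```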

Research in IR has lacked a solid formal framework as well as robust and consistent testbeds and benchmarks; the Text REtrieval Conference (TREC) was created to address this.

Retrieval techniques evaluated at TREC include:
- methods using automatic thesauri
- sophisticated term weighting
- natural language techniques
- relevance feedback
- advanced pattern matching

The document collection contains over 1 million documents (newspapers, patents, etc.). Topics are stated in natural language; the conversion into actual system queries is done by the system itself.

Relevant documents are identified by the pooling method: for each topic, collect the top k documents generated by each participating system, and let human assessors decide their relevance (a sketch follows the task list below).

The benchmark tasks:
- ad hoc task
- filtering task
- Chinese
- cross-language retrieval
- spoken document retrieval
- high precision
- very large collection
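A minimal sketch of pool construction, assuming each run is a ranked list of document IDs (k = 100 is a typical but illustrative pool depth):

```python
# The pool for a topic: union of the top-k documents from every system's run.
# Every pooled document is then judged by a human assessor; documents outside
# the pool are treated as not relevant.
def build_pool(runs, k=100):
    pool = set()
    for ranking in runs.values():       # runs: system name -> ranked doc IDs
        pool.update(ranking[:k])
    return pool
```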

Evaluation measures used at TREC:
- summary table statistics: number of documents retrieved, number of relevant documents retrieved, number of relevant documents not retrieved, etc.
- recall-precision averages
- document level averages: average precision at seen relevant documents
- average precision histogram