Precision and Recall Reminder:

Similar presentations
Introduction to Information Retrieval
Retrieval Evaluation J. H. Wang Mar. 18, Outline Chap. 3, Retrieval Evaluation –Retrieval Performance Evaluation –Reference Collections.
PRES A Score Metric for Evaluating Recall- Oriented IR Applications Walid Magdy Gareth Jones Dublin City University SIGIR, 22 July 2010.
Probabilistic Language Processing Chapter 23. Probabilistic Language Models Goal -- define probability distribution over set of strings Unigram, bigram,
Exercising these ideas  You have a description of each item in a small collection. (30 web sites)  Assume we are looking for information about boxers,
Information Retrieval IR 7. Recap of the last lecture Vector space scoring Efficiency considerations Nearest neighbors and approximations.
Precision and Recall.
Evaluating Search Engine
Information Retrieval Ling573 NLP Systems and Applications April 26, 2011.
Evaluation.  Allan, Ballesteros, Croft, and/or Turtle Types of Evaluation Might evaluate several aspects Evaluation generally comparative –System A vs.
Modern Information Retrieval
1 CS 430 / INFO 430 Information Retrieval Lecture 12 Probabilistic Information Retrieval.
Retrieval Evaluation. Brief Review Evaluation of implementations in computer science often is in terms of time and space complexity. With large document.
SLIDE 1IS 240 – Spring 2007 Prof. Ray Larson University of California, Berkeley School of Information Tuesday and Thursday 10:30 am - 12:00.
Information Access I Measurement and Evaluation GSLT, Göteborg, October 2003 Barbara Gawronska, Högskolan i Skövde.
Retrieval Evaluation: Precision and Recall. Introduction Evaluation of implementations in computer science often is in terms of time and space complexity.
CS 430 / INFO 430 Information Retrieval
Evaluating the Performance of IR Sytems
Indexing and Representation: The Vector Space Model Document represented by a vector of terms Document represented by a vector of terms Words (or word.
1 CS 430 / INFO 430 Information Retrieval Lecture 10 Probabilistic Information Retrieval.
Retrieval Evaluation. Introduction Evaluation of implementations in computer science often is in terms of time and space complexity. With large document.
Evaluation CSC4170 Web Intelligence and Social Computing Tutorial 5 Tutor: Tom Chao Zhou
WXGB6106 INFORMATION RETRIEVAL Week 3 RETRIEVAL EVALUATION.
ISP 433/633 Week 6 IR Evaluation. Why Evaluate? Determine if the system is desirable Make comparative assessments.
The Relevance Model  A distribution over terms, given information need I, (Lavrenko and Croft 2001). For term r, P(I) can be dropped w/o affecting the.
Evaluation of Image Retrieval Results Relevant: images which meet user’s information need Irrelevant: images which don’t meet user’s information need Query:
Chapter 5: Information Retrieval and Web Search
Search and Retrieval: Relevance and Evaluation Prof. Marti Hearst SIMS 202, Lecture 20.
Evaluation David Kauchak cs458 Fall 2012 adapted from:
Evaluation David Kauchak cs160 Fall 2009 adapted from:
IR Evaluation Evaluate what? –user satisfaction on specific task –speed –presentation (interface) issue –etc. My focus today: –comparative performance.
Assessing the Retrieval Chapter 2 considered various ways of breaking text into indexable features Chapter 3 considered various ways of weighting combinations.
Information Retrieval Lecture 7. Recap of the last lecture Vector space scoring Efficiency considerations Nearest neighbors and approximations.
Xiaoying Gao Computer Science Victoria University of Wellington Intelligent Agents COMP 423.
Chapter 6: Information Retrieval and Web Search
IR System Evaluation Farhad Oroumchian. IR System Evaluation System-centered strategy –Given documents, queries, and relevance judgments –Try several.
Assessing The Retrieval A.I Lab 박동훈. Contents 4.1 Personal Assessment of Relevance 4.2 Extending the Dialog with RelFbk 4.3 Aggregated Assessment.
LANGUAGE MODELS FOR RELEVANCE FEEDBACK Lee Won Hee.
Evaluation of (Search) Results How do we know if our results are any good? Evaluating a search engine  Benchmarks  Precision and recall Results summaries:
Chapter 8 Evaluating Search Engine. Evaluation n Evaluation is key to building effective and efficient search engines  Measurement usually carried out.
Lecture 3: Retrieval Evaluation Maya Ramanath. Benchmarking IR Systems Result Quality Data Collection – Ex: Archives of the NYTimes Query set – Provided.
C.Watterscs64031 Evaluation Measures. C.Watterscs64032 Evaluation? Effectiveness? For whom? For what? Efficiency? Time? Computational Cost? Cost of missed.
Performance Measurement. 2 Testing Environment.
1 CS 430 / INFO 430 Information Retrieval Lecture 8 Evaluation of Retrieval Effectiveness 1.
What Does the User Really Want ? Relevance, Precision and Recall.
Information Retrieval Performance Measurement Using Extrapolated Precision William C. Dimm DESI VI June 8, 2015.
Chapter. 3: Retrieval Evaluation 1/2/2016Dr. Almetwally Mostafa 1.
Natural Language Processing Topics in Information Retrieval August, 2002.
Evaluation. The major goal of IR is to search document relevant to a user query. The evaluation of the performance of IR systems relies on the notion.
Information Retrieval Quality of a Search Engine.
Xiaoying Gao Computer Science Victoria University of Wellington COMP307 NLP 4 Information Retrieval.
Introduction to Information Retrieval Introduction to Information Retrieval Lecture Probabilistic Information Retrieval.
Information Retrieval Lecture 3 Introduction to Information Retrieval (Manning et al. 2007) Chapter 8 For the MSc Computer Science Programme Dell Zhang.
1 CS 430 / INFO 430 Information Retrieval Lecture 10 Evaluation of Retrieval Effectiveness 1.
Evaluation of Information Retrieval Systems
7CCSMWAL Algorithmic Issues in the WWW
H. Nottelmann, N. Fuhr Presented by Tao Tao February 12, 2004
Evaluation.
Modern Information Retrieval
CS 430: Information Discovery
Chapter 5: Information Retrieval and Web Search
Evaluation of Information Retrieval Systems
Cumulated Gain-Based Evaluation of IR Techniques
Retrieval Evaluation - Measures
Retrieval Evaluation - Measures
Retrieval Performance Evaluation - Measures
Information Retrieval and Web Design
Precision and Recall.
Presentation transcript:

Precision and Recall Reminder: Precision: the percentage of retrieved documents that are relevant. Recall: the percentage of all relevant documents that are retrieved.
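Restated as formulas (a sketch consistent with the definitions above, writing Rel for the set of relevant documents and Ret for the set of retrieved documents):

```latex
\mathrm{Precision} = \frac{|Rel \cap Ret|}{|Ret|},
\qquad
\mathrm{Recall} = \frac{|Rel \cap Ret|}{|Rel|}
```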

[Figure: the document collection partitioned by Relevant vs. Not Relevant and Retrieved vs. Not Retrieved into four regions: Rel and Ret, Rel but Not Ret, Ret but Not Rel, and Not Rel Not Ret; the regions are labeled A, B, C in the original diagram.]
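A minimal sketch of both measures in terms of these sets (illustrative code, not from the slides; the document IDs are made up):

```python
def precision_recall(retrieved, relevant):
    """Precision and recall of a retrieved set against the set of relevant documents."""
    retrieved, relevant = set(retrieved), set(relevant)
    hits = retrieved & relevant                       # Rel and Ret
    precision = len(hits) / len(retrieved) if retrieved else 0.0
    recall = len(hits) / len(relevant) if relevant else 0.0
    return precision, recall

# Hypothetical document IDs:
p, r = precision_recall(retrieved=[1, 2, 4, 6, 13], relevant=[1, 2, 3, 4, 5])
print(p, r)   # 0.6 0.6 -- 3 of 5 retrieved are relevant, 3 of 5 relevant are retrieved
```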

Computing Precision and Recall Theoretically: a continuous relationship (a precision value at every level of recall). In practice: precision can only be estimated at a resolution proportional to the number of relevant documents in the collection. [Figures: an idealized continuous precision/recall curve versus the discrete points observable in practice; both axes run from 0 to 1.0.]
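One way to obtain those discrete estimates (an illustrative sketch, not code from the course): walk down the ranking and record precision each time another relevant document is found, i.e. at the natural recall levels k/R for k = 1..R:

```python
def precision_at_recall_points(ranking, relevant):
    """Return (recall, precision) at the rank of each relevant document found,
    i.e. at the natural recall levels k/R where R is the number of relevant docs."""
    relevant = set(relevant)
    points, hits = [], 0
    for rank, doc in enumerate(ranking, start=1):
        if doc in relevant:
            hits += 1
            points.append((hits / len(relevant), hits / rank))
    return points
```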

Interpolation of Precision/Recall [Figure: two observed points, P1 = 0.9 at recall R1 = 0.125 and P2 = 0.7 at recall R2 = 0.25; interpolation (ΔP over ΔR between the observed points) estimates precision at an intermediate recall level such as 0.20.]
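A minimal sketch of linear interpolation between two neighbouring observed points, which is what the ΔP/ΔR construction in the figure suggests (the TREC-style alternative of taking the maximum precision at any recall ≥ r is also common; the function name and example values here are assumptions):

```python
def interpolate_precision(r, r1, p1, r2, p2):
    """Linearly interpolate precision at recall r between observed points
    (r1, p1) and (r2, p2), with r1 <= r <= r2."""
    if r2 == r1:
        return max(p1, p2)
    return p1 + (p2 - p1) * (r - r1) / (r2 - r1)

# The figure's example: P1 = 0.9 at R1 = 0.125, P2 = 0.7 at R2 = 0.25, queried at r = 0.20
print(interpolate_precision(0.20, 0.125, 0.9, 0.25, 0.7))   # 0.78
```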

Extrapolation of Precision/Recall [Figure: the same two observed points, P1 = 0.9 at recall 0.125 and P2 = 0.7 at recall 0.25; what precision to report for recall levels below the lowest observed point (recall < 0.125) is marked with question marks.]

Precision/Recall Curves [Figure: empty precision/recall axes; precision from 0.1 to 1.0 on the vertical axis, recall marked at the eight natural levels 1/8 through 8/8 (0.125, 0.25, ..., 1.0) for a query with 8 relevant documents.]

Precision/Recall Curves [Figure: the same axes with an example curve for a query with 8 relevant documents, retrieved at ranks 1, 2, 3, 6, 16, 54, 230, and 2664, giving precision 1.0 (1/1), 1.0 (2/2), 1.0 (3/3), 0.667 (4/6), 0.312 (5/16), 0.111 (6/54), 0.030 (7/230), and 0.003 (8/2664) at recall levels 1/8 through 8/8.]
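These points follow directly from the ranks of the relevant documents, since precision at the k-th relevant document is k divided by its rank (a small illustrative computation):

```python
relevant_ranks = [1, 2, 3, 6, 16, 54, 230, 2664]   # ranks of the 8 relevant documents
R = len(relevant_ranks)
for k, rank in enumerate(relevant_ranks, start=1):
    print(f"recall {k}/{R} = {k/R:.3f}   precision = {k}/{rank} = {k/rank:.3f}")
# e.g. recall 4/8 = 0.500   precision = 4/6 = 0.667, matching the curve above
```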

F-measure: the harmonic mean of Precision and Recall, F = 2PR / (P + R), where R = Recall and P = Precision. It can be computed for any relevance-scored retrieved set as a whole, or at the j-th document in an ordered ranking, F_j = 2 P_j R_j / (P_j + R_j).
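A minimal helper for this (illustrative only; the example values are the curve point P = 4/6, R = 4/8 from above):

```python
def f_measure(precision, recall):
    """Harmonic mean of precision and recall: F = 2PR / (P + R)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

print(f_measure(4/6, 4/8))   # ~0.571
```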

E-measure (van Rijsbergen, 1979): the relative importance of Precision and Recall is weighted by a user-given parameter b. If b = 1, E_j is the complement of the harmonic mean; if b > 1, Precision is more important than Recall; if b < 1, Recall is more important than Precision.
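One common way to write this measure (the form given in Baeza-Yates and Ribeiro-Neto's Modern Information Retrieval; the parameter name b is an assumption, since the original symbol did not survive transcription):

```python
def e_measure(precision, recall, b=1.0):
    """van Rijsbergen's E-measure: E = 1 - (1 + b^2) / (b^2/recall + 1/precision)."""
    if precision == 0.0 or recall == 0.0:
        return 1.0
    return 1.0 - (1.0 + b * b) / (b * b / recall + 1.0 / precision)

# With b = 1, E is the complement of the harmonic mean (F-measure):
p, r = 4/6, 4/8
f = 2 * p * r / (p + r)
print(e_measure(p, r, b=1.0), 1 - f)   # both ~0.429
```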

Normalized Recall: How closely do the ranks at which the relevant documents are retrieved (e.g. 1, 2, 4, 6, 13) match the ideal ranking of the relevant documents (1, 2, 3, 4, 5)? [Figure: step plot of recall (1/5 up to 5/5) against rank for three cases: the ideal ranks (1, 2, 3, 4, 5), the actual ranks (1, 2, 4, 6, 13), and the worst-case ranks (196, 197, 198, 199, 200), in a collection of 200 documents.]
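The usual normalized-recall formula captures this as the normalized gap between the actual and ideal recall curves. A sketch, assuming the standard definition R_norm = 1 - (sum of actual ranks - sum of ideal ranks) / (n (N - n)) and a collection of N = 200 documents as in the figure:

```python
def normalized_recall(actual_ranks, collection_size):
    """R_norm = 1 - (sum(actual ranks) - sum(ideal ranks)) / (n * (N - n))."""
    n, N = len(actual_ranks), collection_size
    ideal_ranks = range(1, n + 1)          # best case: relevant docs occupy the top ranks
    return 1.0 - (sum(actual_ranks) - sum(ideal_ranks)) / (n * (N - n))

print(normalized_recall([1, 2, 4, 6, 13], 200))            # ~0.989, close to the ideal ranking
print(normalized_recall([196, 197, 198, 199, 200], 200))   # 0.0, the worst possible ranking
```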