Lecture 3: Retrieval Evaluation
Maya Ramanath


1 Lecture 3: Retrieval Evaluation Maya Ramanath

2 Benchmarking IR Systems
Result Quality
Data collection
  – Ex: archives of the NYTimes
Query set
  – Provided by experts, identified from real search logs, etc.
Relevance judgements
  – For a given query, is the document relevant?

3 Evaluation for Large Collections
Cranfield/TREC paradigm
  – Pooling of results
A/B testing
  – Possible for search engines
Crowdsourcing
  – Let users decide
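The pooling idea can be sketched as follows: only the union of the top-ranked documents from each participating system is sent to the judges, instead of the whole collection. This is a minimal illustration; the function name `build_pool`, the run names, and the document ids are hypothetical and not taken from the lecture.

```python
def build_pool(runs, depth):
    """Return the union of the top-`depth` documents of every system's ranking.

    `runs` maps a (hypothetical) system name to its ranked list of document ids.
    """
    pool = set()
    for ranking in runs.values():
        pool.update(ranking[:depth])
    return pool


# Two made-up systems answering the same query.
runs = {
    "system_a": ["d3", "d1", "d7", "d2"],
    "system_b": ["d1", "d9", "d3", "d5"],
}
pool = build_pool(runs, depth=2)
print(sorted(pool))  # only these documents would be judged for relevance
```

Documents outside the pool are never judged; under this paradigm they are usually treated as not relevant.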

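4 Precision and Recall
Relevance judgements are binary
  – “relevant” or “not-relevant”.
  – Partition the collection into 2 parts.
Precision: the fraction of retrieved documents that are relevant
Recall: the fraction of relevant documents that are retrieved
Can a search engine guarantee 100% recall?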
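As a small sketch of the two set-based measures, assuming binary judgements are available as a set of relevant document ids (the function name and the ids below are illustrative only):

```python
def precision_recall(retrieved, relevant):
    """Set-based precision and recall for one query with binary judgements."""
    retrieved, relevant = set(retrieved), set(relevant)
    hits = len(retrieved & relevant)  # relevant documents that were retrieved
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


# Made-up example: 4 documents retrieved, 3 documents judged relevant.
p, r = precision_recall(["d1", "d2", "d3", "d4"], ["d2", "d4", "d5"])
print(p, r)

# Returning the whole (made-up) collection gives recall 1.0, but low precision.
collection = ["d1", "d2", "d3", "d4", "d5", "d6", "d7", "d8"]
p_all, r_all = precision_recall(collection, ["d2", "d4", "d5"])
print(p_all, r_all)
```

The second call hints at one answer to the question on the slide: recall alone can be maximised trivially, which is why it is reported together with precision.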

5 F-measure
F-Measure: weighted harmonic mean of Precision and Recall
Why use harmonic mean instead of arithmetic mean?
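The slide's formula did not survive in the transcript. A minimal sketch, assuming the common F-beta parameterisation of the weighted harmonic mean, F = (1 + beta^2) * P * R / (beta^2 * P + R), where beta = 1 weights precision and recall equally:

```python
def f_measure(precision, recall, beta=1.0):
    """Weighted harmonic mean of precision and recall (F-beta form, assumed here)."""
    b2 = beta * beta
    denom = b2 * precision + recall
    if denom == 0.0:
        return 0.0  # both values are zero (or beta and recall are zero)
    return (1 + b2) * precision * recall / denom


# A system that returns everything: perfect recall, terrible precision.
p, r = 0.01, 1.0
print("arithmetic mean:", (p + r) / 2)
print("F1:", f_measure(p, r))
```

The comparison illustrates why the harmonic mean is preferred: the arithmetic mean of 0.01 and 1.0 is about 0.5, while F1 stays close to the smaller of the two values.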

6 Precision-Recall Curves
Using precision and recall to evaluate ranked retrieval
Source: Introduction to Information Retrieval. Manning, Raghavan and Schuetze, 2008
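The figure from the slide is not part of the transcript. As a rough sketch of how such a curve can be obtained, the code below walks down one ranked list and records a (recall, precision) point after each rank, then applies the interpolation described in the cited textbook (interpolated precision at recall r is the highest precision at any recall level >= r). Function names and document ids are illustrative assumptions.

```python
def precision_recall_points(ranking, relevant):
    """(recall, precision) after each rank position of one ranked result list."""
    relevant = set(relevant)
    if not relevant:
        raise ValueError("need at least one relevant document")
    points = []
    hits = 0
    for k, doc in enumerate(ranking, start=1):
        if doc in relevant:
            hits += 1
        points.append((hits / len(relevant), hits / k))
    return points


def interpolate(points):
    """Replace each precision by the maximum precision at equal or higher recall."""
    interpolated = []
    best = 0.0
    for recall, precision in reversed(points):
        best = max(best, precision)
        interpolated.append((recall, best))
    interpolated.reverse()
    return interpolated


# Made-up ranking; "d6" is relevant but never retrieved.
ranking = ["d1", "d2", "d3", "d4", "d5"]
relevant = {"d1", "d3", "d6"}
points = precision_recall_points(ranking, relevant)
for recall, precision in interpolate(points):
    print(f"recall={recall:.2f}  interpolated precision={precision:.2f}")
```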

7 Single measures
Precision at k: P@10, P@100, etc.
and others…
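Precision at k can be sketched as below. The assumption made here is the usual convention of dividing by k even when fewer than k results were returned; the function name and the data are illustrative.

```python
def precision_at_k(ranking, relevant, k):
    """Fraction of the top-k ranked documents that are relevant (divides by k)."""
    if k <= 0:
        raise ValueError("k must be positive")
    relevant = set(relevant)
    hits = sum(1 for doc in ranking[:k] if doc in relevant)
    return hits / k


ranking = ["d1", "d2", "d3", "d4", "d5"]
relevant = {"d1", "d3", "d6"}
print(precision_at_k(ranking, relevant, 3))
print(precision_at_k(ranking, relevant, 5))
```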

8 Graded Relevance – NDCG
Highly relevant documents should have more importance
The higher a relevant document is ranked, the more valuable it is to the user
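The NDCG formula itself is missing from the transcript, and several variants exist. The sketch below assumes one common formulation, DCG = sum over ranks i of grade_i / log2(i + 1), normalised by the DCG of the same grades sorted in decreasing order. It also assumes that the grades of all judged documents appear in the list.

```python
import math


def dcg(grades):
    """Discounted cumulative gain: each grade is discounted by log2(rank + 1)."""
    return sum(g / math.log2(rank + 1) for rank, g in enumerate(grades, start=1))


def ndcg(grades, k=None):
    """DCG of the ranking divided by the DCG of the ideal (sorted) ranking."""
    k = len(grades) if k is None else k
    ideal_dcg = dcg(sorted(grades, reverse=True)[:k])
    return dcg(grades[:k]) / ideal_dcg if ideal_dcg > 0 else 0.0


# Grades (0 = not relevant ... 3 = highly relevant) in ranked order; made-up data.
print(ndcg([3, 2, 1, 0]))  # ideal ordering
print(ndcg([0, 1, 2, 3]))  # same documents, worst ordering
```

Both properties from the slide show up here: higher grades contribute more gain, and the logarithmic discount rewards placing them near the top.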

9 Inter-judge Agreement – Fleiss’ Kappa
N: number of results
n: number of ratings per result
k: number of grades
n_ij: number of judges who agree that the i-th result should have grade j
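The kappa formula is not included in the transcript. A minimal sketch using the standard definition and the notation above: p_j is the share of all ratings that fall into grade j, P_i is the agreement among the judges on result i, and kappa = (mean(P_i) - sum(p_j^2)) / (1 - sum(p_j^2)). The function name and the count tables are illustrative.

```python
def fleiss_kappa(counts):
    """Fleiss' kappa for an N x k table where counts[i][j] is n_ij."""
    N = len(counts)
    n = sum(counts[0])  # ratings per result; must be the same for every row
    k = len(counts[0])
    if n < 2 or any(sum(row) != n for row in counts):
        raise ValueError("every result needs the same number (>= 2) of ratings")
    # p_j: proportion of all ratings assigned to grade j
    p = [sum(row[j] for row in counts) / (N * n) for j in range(k)]
    # P_i: extent of agreement among the judges for result i
    P = [(sum(c * c for c in row) - n) / (n * (n - 1)) for row in counts]
    P_bar = sum(P) / N             # observed agreement
    P_e = sum(x * x for x in p)    # agreement expected by chance
    if P_e == 1.0:
        return float("nan")        # every rating uses one grade: kappa undefined
    return (P_bar - P_e) / (1 - P_e)


# Made-up tables: 2 results, 3 judges each, 2 grades.
print(fleiss_kappa([[3, 0], [0, 3]]))  # judges always agree
print(fleiss_kappa([[2, 1], [1, 2]]))  # judges split on every result
```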

10 Tests of Statistical Significance
Wilcoxon signed rank test
Student’s paired t-test
…and more
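As a sketch of the paired t-test, the code below computes only the test statistic for two systems evaluated on the same queries, using the standard library. The per-query scores are made up, and the function name is illustrative. Turning the statistic into a p-value requires the t distribution with the returned degrees of freedom, for example from a statistics table or a library such as SciPy (`scipy.stats.ttest_rel`, and `scipy.stats.wilcoxon` for the signed rank test).

```python
import math
import statistics


def paired_t_statistic(scores_a, scores_b):
    """t statistic and degrees of freedom for paired per-query scores."""
    if len(scores_a) != len(scores_b) or len(scores_a) < 2:
        raise ValueError("need two equally long lists with at least 2 queries")
    diffs = [a - b for a, b in zip(scores_a, scores_b)]
    mean_diff = statistics.mean(diffs)
    sd_diff = statistics.stdev(diffs)  # sample standard deviation of differences
    if sd_diff == 0:
        raise ValueError("all differences are identical; t is undefined")
    t = mean_diff / (sd_diff / math.sqrt(len(diffs)))
    return t, len(diffs) - 1


# Made-up per-query scores (e.g. P@10) for two systems on the same 4 queries.
system_a = [0.5, 0.6, 0.7, 0.8]
system_b = [0.4, 0.4, 0.6, 0.6]
t, dof = paired_t_statistic(system_a, system_b)
print(f"t = {t:.3f} with {dof} degrees of freedom")
```

The test is paired because each query is scored by both systems, so the per-query differences, not the raw scores, are what gets tested.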

11 END OF MODULE “IR FROM 20000FT”

