Focused Relevance Feedback Timothy Chappell Supervisor: Shlomo Geva.

1 Focused Relevance Feedback Timothy Chappell Supervisor: Shlomo Geva

2 Introduction

3 Query-based Information Retrieval
Standard search engine paradigm
Based on a model of users searching for documents by predicting patterns of text
Requires users to have a fairly good idea of what they are looking for

4 Relevance Feedback
Starts with a normal search query
User examines the results returned and provides the search engine with feedback
Search engine uses the feedback to learn more about what the user is looking for
Search engine modifies the user’s query to return more relevant results

5 Relevance Feedback
Document-level approach
1/6 – User enters a query into the search engine; the search engine produces a ranked list of results that match the query
2/6 – Search engine presents the top-ranking results to the user
3/6 – User looks at the results and marks them as relevant/not relevant
4/6 – Search engine utilises the feedback, finding other documents that are similar to the relevant results
5/6 – Search engine reranks the remaining documents with the relevance information
6/6 – The new top-ranking results are presented to the user
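The six steps above can be sketched as a single feedback round. This is a minimal illustration only: it assumes bag-of-words vectors and cosine similarity, which the slides do not specify, and the names `feedback_round`, `judge` and `top_k` are hypothetical.

```python
from collections import Counter
import math


def term_vector(text):
    """Bag-of-words term counts for a piece of text."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0


def feedback_round(query, docs, judge, top_k=2):
    """One round of document-level relevance feedback.

    docs  : dict mapping doc id -> text (a hypothetical toy collection)
    judge : callable(doc_id) -> bool, standing in for the user's judgement
    """
    q = term_vector(query)
    # Steps 1-2: rank by similarity to the query and show the top results.
    ranked = sorted(docs, key=lambda d: cosine(q, term_vector(docs[d])), reverse=True)
    shown, remaining = ranked[:top_k], ranked[top_k:]
    # Step 3: the user marks each shown result as relevant or not relevant.
    relevant = [d for d in shown if judge(d)]
    # Step 4: fold the text of the relevant results into the query profile.
    profile = Counter(q)
    for d in relevant:
        profile.update(term_vector(docs[d]))
    # Steps 5-6: rerank the unseen documents against the enriched profile.
    reranked = sorted(remaining, key=lambda d: cosine(profile, term_vector(docs[d])),
                      reverse=True)
    return shown, reranked
```

Only positive feedback is used here; documents marked not relevant are simply ignored, which is a simplification.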

6 Focused Relevance Feedback
Research aimed at determining whether relevance feedback approaches are effective when applied at a higher resolution
Users provide feedback in terms of relevant passages, not documents

7 Focused Relevance Feedback
[Slide graphic: sample document made up of placeholder (lorem ipsum) text]

8 Methodology

9 Evaluation
The standard method for evaluating the effectiveness of ranking systems is average precision over recall.
Recall: the % of relevant documents that are returned
Precision: the % of returned documents that are relevant
Ruthven and Lalmas, 2003 [5]
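As a small worked sketch of these measures, the functions below compute precision and recall at a cut-off and the usual non-interpolated average precision. The function names are illustrative, and the slides do not state which exact variant of average precision was used.

```python
def precision_recall(ranked, relevant, k):
    """Precision and recall after the top k results of a ranking."""
    hits = sum(1 for d in ranked[:k] if d in relevant)
    return hits / k, hits / len(relevant)


def average_precision(ranked, relevant):
    """Mean of the precision values at each rank where a relevant document appears.

    Relevant documents that are never returned contribute zero.
    """
    hits, total = 0, 0.0
    for rank, d in enumerate(ranked, start=1):
        if d in relevant:
            hits += 1
            total += hits / rank
    return total / len(relevant) if relevant else 0.0
```

For a ranking a, b, c, d with relevant documents a, c and e, precision after two results is 1/2 and recall is 1/3.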

10 Test Collection
A test collection that approximates ‘real-world data’ was needed
The Wikipedia XML Corpus, a collection of documents from Wikipedia converted to XML, was used
Denoyer and Gallinari, 2006 [1]

11 Relevance Assessments
Assessment data from the INEX 2008 Ad Hoc track was used to provide relevance information
Data consists of segments of text that users have identified as being relevant in relation to particular topics
Used both for simulating user feedback for the focused relevance feedback algorithms and for evaluating their performance
Kamps, Geva, Trotman, Woodley and Koolen, 2009 [3]
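Simulating a user from stored assessments might look roughly like the sketch below. The data layout (character offset and length per relevant segment) and the name `simulate_feedback` are assumptions made for illustration; they are not the INEX assessment file format.

```python
def simulate_feedback(doc_id, text, assessments):
    """Return the passages of `text` that assessors marked as relevant.

    assessments: dict mapping doc id -> list of (offset, length) character spans.
    This layout is hypothetical and chosen only to keep the example small.
    """
    spans = sorted(assessments.get(doc_id, []))
    return [text[start:start + length] for start, length in spans]
```

A document with no assessed segments yields an empty list, which a feedback algorithm can treat as "not relevant".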

12 Evaluation Platform
Need for a consistent platform for evaluating different relevance feedback algorithms
Written in C and Java
Relevance feedback algorithms are developed as plugins for the evaluation platform (as dynamic libraries)
Will be used as the basis for a new track in INEX 2010

13 Evaluation Platform
[Diagram labels: Evaluation Platform, Document Collection, Assessments, Relevance Feedback Algorithm]
The evaluation platform provides a set of documents and a topic/query to the relevance feedback algorithm
Simulates a user interacting with the system

14 Evaluation Platform
The plugins are evaluated side by side over a run of 20 different topics
For each result returned by the relevance feedback algorithm, a line of a TREC or INEX run is output
trec_eval, inex_eval or internal metrics can be used to evaluate algorithm performance
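For the TREC case, each output line follows the six-column run format read by trec_eval: topic, the literal Q0, document id, rank, score and run tag. The sketch below shows one way to produce such lines; the helper names, the four-decimal score format and the example ids are assumptions, and the INEX run format is not covered.

```python
def trec_run_line(topic_id, doc_id, rank, score, run_tag):
    """Format one result in the six-column TREC run format."""
    return f"{topic_id} Q0 {doc_id} {rank} {score:.4f} {run_tag}"


def write_run(topic_id, ranked, run_tag):
    """Build run lines from a list of (doc_id, score) pairs, best first."""
    return [trec_run_line(topic_id, doc_id, rank, score, run_tag)
            for rank, (doc_id, score) in enumerate(ranked, start=1)]
```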

15 Relevance Feedback algorithms
Written in Java using the Apache Lucene search engine as a base
The most effective algorithm tested was based on the Rocchio relevance feedback approach
Tested against BICER, the University of Waterloo's Okapi BM25 run from INEX 2008, which was the best-performing in-context engine
Jakarta, 2004 [2]
Rocchio, 1971 [4]
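The classic Rocchio update moves the query vector towards the centroid of the relevant items and away from the centroid of the non-relevant ones. The sketch below shows that textbook formulation in Python rather than the Java/Lucene implementation described on the slide; the weights alpha, beta and gamma are illustrative defaults, not values taken from the presentation.

```python
from collections import Counter


def rocchio(query, relevant, non_relevant, alpha=1.0, beta=0.75, gamma=0.15):
    """Rocchio query modification over sparse term-weight vectors.

    query, and each item of relevant / non_relevant, is a dict term -> weight.
    alpha, beta and gamma are assumed defaults, not values from the slides.
    """
    new_query = Counter()
    for term, w in query.items():
        new_query[term] += alpha * w
    # Add the centroid of the relevant vectors.
    for doc in relevant:
        for term, w in doc.items():
            new_query[term] += beta * w / len(relevant)
    # Subtract the centroid of the non-relevant vectors.
    for doc in non_relevant:
        for term, w in doc.items():
            new_query[term] -= gamma * w / len(non_relevant)
    # Terms pushed to zero or below are dropped, a common simplification.
    return {t: w for t, w in new_query.items() if w > 0}
```

In a focused setting, the `relevant` vectors could be built from the passages marked relevant rather than from whole documents, though the slides do not say exactly how this was done.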

16 Relevance Feedback algorithms
Recall             Lucene   Rocchio  BICER
0%                 68.32%   75.62%   95.15%
10%                39.48%   70.42%   62.65%
20%                35.95%   70.42%   51.06%
30%                34.29%   69.81%   45.38%
40%                32.52%   61.55%   41.84%
50%                32.13%   54.73%   39.63%
60%                31.72%   51.13%   38.27%
70%                31.24%   47.14%   35.97%
80%                30.38%   41.91%   30.60%
90%                26.98%   35.28%   22.43%
100%               21.43%   21.78%    2.70%
Average precision  30.34%   51.22%   38.96%
R-Precision        24.32%   49.89%   38.82%
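Figures of this shape are commonly produced as interpolated precision at eleven recall levels, plus R-precision. The sketch below shows those two measures under that assumption; the slides do not confirm which exact definitions the evaluation platform used.

```python
def interpolated_precision(ranked, relevant, levels=11):
    """Interpolated precision at evenly spaced recall levels (0%, 10%, ..., 100%).

    The value at a recall level is the highest precision observed at any
    rank whose recall is at least that level.
    """
    points, hits = [], 0
    for rank, d in enumerate(ranked, start=1):
        if d in relevant:
            hits += 1
        points.append((hits / len(relevant), hits / rank))
    result = []
    for i in range(levels):
        level = i / (levels - 1)
        candidates = [p for r, p in points if r >= level]
        result.append(max(candidates, default=0.0))
    return result


def r_precision(ranked, relevant):
    """Precision after R results, where R is the number of relevant documents."""
    r = len(relevant)
    return sum(1 for d in ranked[:r] if d in relevant) / r
```

Interpolation makes the curve non-increasing, which fits rows such as the identical Rocchio values at 10% and 20% recall.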

17 Relevance Feedback algorithms

18 References
[1] L. Denoyer and P. Gallinari. The Wikipedia XML Corpus. In SIGIR Forum, 2006.
[2] A. Jakarta. Apache Lucene - a high-performance, full-featured text search engine library, 2004.
[3] J. Kamps, S. Geva, A. Trotman, A. Woodley and M. Koolen. Overview of the INEX 2008 ad hoc track. In Advances in Focused Retrieval. Springer, 2009.

19 References
[4] J. J. Rocchio. Relevance feedback in information retrieval. In The SMART retrieval system - experiments in automatic document processing. Englewood Cliffs, NJ: Prentice-Hall, 1971.
[5] I. Ruthven and M. Lalmas. A survey on the use of relevance feedback for information access systems. Knowl. Eng. Rev., 18(2):95-145, 2003.

