Section Based Relevance Feedback Student: Nat Young Supervisor: Prof. Mark Sanderson.

Similar presentations
1 Evaluation Rong Jin. 2 Evaluation  Evaluation is key to building effective and efficient search engines usually carried out in controlled experiments.
Evaluation of Relevance Feedback Algorithms for XML Retrieval Silvana Solomon 27 February 2007 Supervisor: Dr. Ralf Schenkel.
Search Results Need to be Diverse Mark Sanderson University of Sheffield.
Language Model based Information Retrieval: University of Saarland 1 A Hidden Markov Model Information Retrieval System Mahboob Alam Khalid.
Distributed Search over the Hidden Web Hierarchical Database Sampling and Selection Panagiotis G. Ipeirotis Luis Gravano Computer Science Department Columbia.
Carnegie Mellon 1 Maximum Likelihood Estimation for Information Thresholding Yi Zhang & Jamie Callan Carnegie Mellon University
Modern Information Retrieval
Information Retrieval February 24, 2004
Retrieval Evaluation. Brief Review Evaluation of implementations in computer science often is in terms of time and space complexity. With large document.
Modern Information Retrieval Chapter 5 Query Operations.
Learning to Advertise. Introduction Advertising on the Internet = $$$ –Especially search advertising and web page advertising Problem: –Selecting ads.
Retrieval Evaluation: Precision and Recall. Introduction Evaluation of implementations in computer science often is in terms of time and space complexity.
MSc Software Engineering Dissertation Finding a Research Problem and Additional Guidance Stewart Green.
Evaluating the Performance of IR Sytems
Retrieval Evaluation. Introduction Evaluation of implementations in computer science often is in terms of time and space complexity. With large document.
INEX 2003, Germany Searching in an XML Corpus Using Content and Structure INEX 2003, Germany Yiftah Ben-Aharon, Sara Cohen, Yael Grumbach, Yaron Kanza,
Cell Biology and Genetics
An Overview of Relevance Feedback, by Priyesh Sudra 1 An Overview of Relevance Feedback PRIYESH SUDRA.
Important Task in Patents Retrieval Recall is an Important Factor Given Query Patent -> the Task is to Search all Related Patents Patents have Complex.
The Relevance Model  A distribution over terms, given information need I, (Lavrenko and Croft 2001). For term r, P(I) can be dropped w/o affecting the.
Modern Retrieval Evaluations Hongning Wang
Minimal Test Collections for Retrieval Evaluation B. Carterette, J. Allan, R. Sitaraman University of Massachusetts Amherst SIGIR2006.
Philosophy of IR Evaluation Ellen Voorhees. NIST Evaluation: How well does system meet information need? System evaluation: how good are document rankings?
IR Evaluation Evaluate what? –user satisfaction on specific task –speed –presentation (interface) issue –etc. My focus today: –comparative performance.
A Comparative Study of Search Result Diversification Methods Wei Zheng and Hui Fang University of Delaware, Newark DE 19716, USA
 An important problem in sponsored search advertising is keyword generation, which bridges the gap between the keywords bidded by advertisers and queried.
©2008 Srikanth Kallurkar, Quantum Leap Innovations, Inc. All rights reserved. Apollo – Automated Content Management System Srikanth Kallurkar Quantum Leap.
Understanding and Predicting Graded Search Satisfaction Tang Yuk Yu 1.
Redeeming Relevance for Subject Search in Citation Indexes Shannon Bradshaw The University of Iowa
A Simple Unsupervised Query Categorizer for Web Search Engines Prashant Ullegaddi and Vasudeva Varma Search and Information Extraction Lab Language Technologies.
Applying the KISS Principle with Prior-Art Patent Search Walid Magdy Gareth Jones Dublin City University CLEF-IP, 22 Sep 2010.
© Paul Buitelaar – November 2007, Busan, South-Korea Evaluating Ontology Search Towards Benchmarking in Ontology Search Paul Buitelaar, Thomas.
The Effect of Collection Organization and Query Locality on IR Performance 2003/07/28 Park,
Giorgos Giannopoulos (IMIS/”Athena” R.C and NTU Athens, Greece) Theodore Dalamagas (IMIS/”Athena” R.C., Greece) Timos Sellis (IMIS/”Athena” R.C and NTU.
Implicit User Feedback Hongning Wang Explicit relevance feedback 2 Updated query Feedback Judgments: d 1 + d 2 - d 3 + … d k -... Query User judgment.
Self Organization of a Massive Document Collection Advisor : Dr. Hsu Graduate : Sheng-Hsuan Wang Author : Teuvo Kohonen et al.
Crawling and Aligning Scholarly Presentations and Documents from the Web By SARAVANAN.S 09/09/2011 Under the guidance of A/P Min-Yen Kan 10/23/
Chapter 6: Information Retrieval and Web Search
Information Retrieval Effectiveness of Folksonomies on the World Wide Web P. Jason Morrison.
Colorado Temperature Database Mark Coleman 1019 Boltz Drive Fort Collins, Colorado
1 FollowMyLink Individual APT Presentation Third Talk February 2006.
Intelligent Database Systems Lab N.Y.U.S.T. I. M. An information-pattern-based approach to novelty detection Presenter : Lin, Shu-Han Authors : Xiaoyan.
Measuring How Good Your Search Engine Is. *. Information System Evaluation l Before 1993 evaluations were done using a few small, well-known corpora of.
Performance Measurement. 2 Testing Environment.
The Cross Language Image Retrieval Track: ImageCLEF Breakout session discussion.
Cs Future Direction : Collaborative Filtering Motivating Observations:  Relevance Feedback is useful, but expensive a)Humans don’t often have time.
Why IR test collections are so bad Mark Sanderson University of Sheffield.
The Loquacious ( 愛說話 ) User: A Document-Independent Source of Terms for Query Expansion Diane Kelly et al. University of North Carolina at Chapel Hill.
DISTRIBUTED INFORMATION RETRIEVAL Lee Won Hee.
Evaluation. The major goal of IR is to search document relevant to a user query. The evaluation of the performance of IR systems relies on the notion.
A Framework to Predict the Quality of Answers with Non-Textual Features Jiwoon Jeon, W. Bruce Croft(University of Massachusetts-Amherst) Joon Ho Lee (Soongsil.
The Effect of Database Size Distribution on Resource Selection Algorithms Luo Si and Jamie Callan School of Computer Science Carnegie Mellon University.
1 Learning to Impress in Sponsored Search Xin Supervisors: Prof. King and Prof. Lyu.
To Personalize or Not to Personalize: Modeling Queries with Variation in User Intent Presented by Jaime Teevan, Susan T. Dumais, Daniel J. Liebling Microsoft.
Date of Presentation Name of Presenter Insert image _________ Toolkit.
Usefulness of Quality Click-through Data for Training Craig Macdonald, Iadh Ounis Department of Computing Science University of Glasgow, Scotland, UK.
Relevant Document Distribution Estimation Method for Resource Selection Luo Si and Jamie Callan School of Computer Science Carnegie Mellon University
All Your Queries are Belong to Us: The Power of File-Injection Attacks on Searchable Encryption Yupeng Zhang, Jonathan Katz, Charalampos Papamanthou University.
Bayesian Extension to the Language Model for Ad Hoc Information Retrieval Hugo Zaragoza, Djoerd Hiemstra, Michael Tipping Microsoft Research Cambridge,
Ricardo EIto Brun Strasbourg, 5 Nov 2015
Future Direction #3: Collaborative Filtering
Walid Magdy Gareth Jones
Multimedia Information Retrieval
Prepared by Dr. Zeinab Abdel Hafez, Assistant Professor, Department of Home Economics
Modern Information Retrieval
Panagiotis G. Ipeirotis Luis Gravano
Cell Biology and Genetics
Information Retrieval and Web Design
Future Direction : Collaborative Filtering
Presentation transcript:

Section Based Relevance Feedback Student: Nat Young Supervisor: Prof. Mark Sanderson

Relevance Feedback
– Search engine user marks document(s) as relevant, e.g. "find more like this"
– Terms are extracted from the full document
– But the whole document may not be relevant
Could marking a sub-section as relevant be better?
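
A minimal sketch of the idea, assuming simple term-frequency weighting and a hypothetical expand_query helper (not from the slides): standard relevance feedback draws expansion terms from the whole marked document, while section-based feedback draws them only from the marked sub-section.

```python
from collections import Counter
import re

def top_terms(text, k=5):
    """Extract the k most frequent terms from a piece of feedback text."""
    terms = re.findall(r"[a-z]+", text.lower())
    return [t for t, _ in Counter(terms).most_common(k)]

def expand_query(query, feedback_text, k=5):
    """Append feedback terms to the original query (naive expansion)."""
    return query + " " + " ".join(top_terms(feedback_text, k))

# Hypothetical document split into sections.
document = {
    "abstract": "relevance feedback improves retrieval effectiveness ...",
    "method":   "we index citations and match titles with patterns ...",
}

# Standard RF: terms come from the whole marked document.
print(expand_query("relevance feedback", " ".join(document.values())))

# Section-based RF: terms come only from the marked sub-section.
print(expand_query("relevance feedback", document["abstract"]))
```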

Test Collections
Simulate a real user's search process:
– Submit queries in batch mode
– Evaluate the result sets
Relevance Judgments:
– QRELs: query–document relevance pairs (1 … n)
– Traditionally produced by human assessors
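
A hedged sketch of batch evaluation against a set of QRELs; the dictionaries and the precision-at-k measure below are illustrative assumptions, not the project's actual formats or metrics.

```python
# Relevance judgments: judged-relevant documents per topic (hypothetical ids).
qrels = {
    "topic-1": {"doc-12", "doc-90"},
    "topic-2": {"doc-07"},
}

# Ranked retrieval output per topic, produced in batch mode.
results = {
    "topic-1": ["doc-90", "doc-33", "doc-12"],
    "topic-2": ["doc-41", "doc-07"],
}

def precision_at_k(ranked, relevant, k=20):
    """Fraction of the top-k retrieved documents that are judged relevant."""
    top = ranked[:k]
    return sum(1 for d in top if d in relevant) / max(len(top), 1)

for topic, ranked in results.items():
    print(topic, precision_at_k(ranked, qrels.get(topic, set())))
```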

Building a Test Collection
Documents:
– 1,388,939 research papers
– Stop words removed
– Porter Stemmer applied
Topics:
– 100 random documents
– Their sub-sections (6 per document)
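
A minimal preprocessing sketch, assuming NLTK's Porter stemmer and a small hand-picked stop-word list; the project's actual tokenizer and stop list are not specified on the slide.

```python
import re
from nltk.stem import PorterStemmer  # pip install nltk

# Illustrative stop-word list only; a real system would use a fuller one.
STOP_WORDS = {"the", "a", "an", "of", "and", "to", "in", "is", "for", "on"}
stemmer = PorterStemmer()

def preprocess(text):
    """Lowercase, tokenize, drop stop words, and Porter-stem each term."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return [stemmer.stem(t) for t in tokens if t not in STOP_WORDS]

print(preprocess("Relevance feedback for the retrieval of research papers"))
# e.g. ['relev', 'feedback', 'retriev', 'research', 'paper']
```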

Building a Test Collection
In-edges:
– Documents that cite paper X
– Found 943 using the CiteSeerX database
Out-edges:
– Documents cited by paper X
– Found 397 using pattern matching on titles
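
A hedged sketch of the out-edge step, assuming reference strings are matched against known paper titles after simple normalisation; the project's actual matching rules are not given on the slide.

```python
import re

def normalise(title):
    """Lowercase and strip punctuation so titles compare loosely."""
    return re.sub(r"[^a-z0-9 ]", "", title.lower()).strip()

def find_out_edges(reference_strings, known_titles):
    """Return collection titles that appear inside a paper's reference list."""
    known = {normalise(t): t for t in known_titles}
    matches = set()
    for ref in reference_strings:
        ref_norm = normalise(ref)
        for norm_title, title in known.items():
            if norm_title and norm_title in ref_norm:
                matches.add(title)
    return matches

# Hypothetical reference string and candidate titles.
refs = ["Young, N. Section Based Relevance Feedback. In Proc. X, 2011."]
titles = ["Section Based Relevance Feedback", "Some Unrelated Paper"]
print(find_out_edges(refs, titles))
```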

QRELs
Total:
– 1,340 QRELs
– Avg. 13.4 QRELs per document
Previous work:
– Anna Ritchie et al. (2006): 82 topics, avg. QRELs; 196 topics, avg. 4.5 QRELs
– Last year: 71 topics, avg. 2.9 QRELs

Section Queries
RQ1: Do the sections return different results?
Table: Pearson's r between the result lists of the All, Abstract, Intro, Method, Results, Conclusion, and References queries (cell values not shown).
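
The slide does not say exactly what the correlation is computed over; the sketch below assumes, purely for illustration, that two section queries are compared via Pearson's r on the retrieval scores of documents returned by both.

```python
import math

def pearson_r(xs, ys):
    """Plain Pearson correlation coefficient for two equal-length lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy) if sx and sy else 0.0

# Hypothetical retrieval scores, keyed by document id, for two section queries.
abstract_run = {"d1": 9.1, "d2": 7.4, "d3": 5.0, "d4": 2.2}
intro_run    = {"d1": 8.7, "d2": 3.1, "d3": 4.9, "d5": 6.0}

shared = sorted(set(abstract_run) & set(intro_run))
print(pearson_r([abstract_run[d] for d in shared],
                [intro_run[d] for d in shared]))
```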

Section Queries
RQ2: Do the sections return different relevant results?
Avg. = the average number of relevant results in the top 20.
E.g. Abstract queries returned 2 QRELs on average.

Section Queries
Table: average intersection sizes of the relevant results returned by each pair of section queries (All, Abstract, Intro, Method, Results, Conclusion, References); Results row: 0.39, Conclusion row: 0.42, remaining values not shown.
E.g. Avg(|Abstract ∩ All|) = 0.63; with an average of 2 relevant results per Abstract query, the relative complement is (1 − 0.63 / 2) × 100 = 68.5%, i.e. a 68.5% difference.
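
A small sketch of the overlap and complement computation for one topic, assuming each query's relevant results are held as a set of document ids (the ids are hypothetical); the slides report these quantities averaged over all topics.

```python
# Judged-relevant documents actually retrieved by two queries from one topic.
all_query      = {"d1", "d2", "d3"}
abstract_query = {"d1", "d4"}

intersection = len(all_query & abstract_query)            # |Abstract ∩ All|
complement_pct = (1 - intersection / len(abstract_query)) * 100

print(intersection)     # 1
print(complement_pct)   # 50.0 -> half of Abstract's relevant results differ
```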

Section Queries
Table: average set complement (%) of relevant results between each pair of section queries (All, Abstract, Intro, Method, Results, Conclusion, References); Results row: 73, Conclusion row: 75, remaining values not shown.
E.g. a value of n in row X, column Y means section X returned n% different relevant results from section Y.

Next
Practical significance:
– Does section-based relevance feedback (SRF) provide benefits over standard relevance feedback (RF)?