Section Based Relevance Feedback Student: Nat Young Supervisor: Prof. Mark Sanderson.


1 Section Based Relevance Feedback
Student: Nat Young
Supervisor: Prof. Mark Sanderson

2 Relevance Feedback
Search engine (SE) user marks document(s) as relevant
– E.g. “find more like this”
– Terms are extracted from the full document
– The whole document may not be relevant
Could marking a sub-section as relevant be better?
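
A minimal sketch of the contrast this slide draws, assuming a plain term-frequency extractor. The function `top_terms`, the tiny stop list, and the two-section example document are hypothetical illustrations, not the system described in the talk.

```python
import re
from collections import Counter

# Illustrative stop list only; a real system would use a much larger one.
STOP_WORDS = {"the", "a", "of", "and", "in", "to", "is", "we", "for", "on"}

def top_terms(text, k=3):
    """Return the k most frequent non-stop-word terms in text."""
    tokens = re.findall(r"[a-z]+", text.lower())
    counts = Counter(t for t in tokens if t not in STOP_WORDS)
    return [term for term, _ in counts.most_common(k)]

# Hypothetical document split into sections.
document = {
    "abstract": "Relevance feedback improves retrieval; relevance feedback uses marked documents.",
    "method": "Section queries use stemmed section terms.",
}

# Standard feedback: terms come from the full document.
full_doc_terms = top_terms(" ".join(document.values()))

# Section-based feedback: terms come only from the section marked relevant.
section_terms = top_terms(document["method"])
```

The two term lists differ, so a feedback query built from one marked section can pull the search in a different direction than a query built from the whole paper.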

3 Test Collections
Simulate a real user’s search process
– Submit queries in batch mode
– Evaluate the result sets
Relevance Judgments
– QREL: pairs (1 … n)
– Traditionally produced by human assessors

4 Building a Test Collection
Documents
– 1,388,939 research papers
– Stop words removed
– Porter Stemmer applied
Topics
– 100 random documents
– Their sub-sections (6 per document)

5 Building a Test Collection
In-edges
– Documents that cite paper X
– Found 943 using the CiteSeerX database
Out-edges
– Documents cited by paper X
– Found 397 using pattern matching on titles
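
The in-edge and out-edge definitions above can be sketched as follows. This assumes the relevance judgments for a topic paper are the union of both edge sets, which the slides suggest but do not state outright. The `citations` mapping and the function `build_qrels` are hypothetical.

```python
def build_qrels(topic_id, citations):
    """Collect citation-based relevance judgments for one topic paper.

    citations maps a paper id to the set of paper ids it cites
    (a hypothetical in-memory layout, not the CiteSeerX schema).
    """
    # Out-edges: documents cited by the topic paper.
    out_edges = set(citations.get(topic_id, set()))
    # In-edges: documents that cite the topic paper.
    in_edges = {paper for paper, cited in citations.items() if topic_id in cited}
    return in_edges | out_edges

# Toy citation graph: X cites A and B; C and D cite X.
citations = {
    "X": {"A", "B"},
    "C": {"X"},
    "D": {"X", "A"},
    "A": set(),
}
qrels = build_qrels("X", citations)
```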

6 QRELs
Total
– 1,340 QRELs
– Avg. 13.4 QRELs per document
Previous work:
– Anna Ritchie et al. (2006)
  82 Topics, Avg. 11.4 QRELs
  196 Topics, Avg. 4.5 QRELs
– Last year
  71 Topics, Avg. 2.9 QRELs

7 Section Queries
RQ1 Do the sections return different results?
Pearson’s r
            All   Abstract  Intro  Method  Results  Conclusion  References
All         1.00  0.06      0.14   0.09    0.05     0.11        0.14
Abstract    0.06  1.00      0.09   0.01    0.07     0.08        0.04
Intro       0.14  0.09      1.00   0.06    0.10     0.12        0.11
Method      0.09  0.01      0.06   1.00    0.09     0.08        0.07
Results     0.05  0.07      0.10   0.09    1.00     0.13        0.09
Conclusion  0.11  0.08      0.12   0.08    0.13     1.00        0.08
References  0.14  0.04      0.11   0.07    0.09     0.08        1.00
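
The slides do not say which vectors were correlated to obtain these values. As one possible reading, the sketch below computes Pearson's r over per-document retrieval scores from two section queries, taking the union of retrieved documents and assuming a score of 0 for a document one query did not return. The names `pearson_r` and `result_correlation` and the sample scores are hypothetical.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation of two equal-length numeric sequences.

    Raises ZeroDivisionError if either sequence is constant.
    """
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    norm_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    norm_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (norm_x * norm_y)

def result_correlation(run_a, run_b):
    """Correlate two result sets given as {doc_id: score} dictionaries."""
    docs = sorted(set(run_a) | set(run_b))
    # Assumption: a document missing from a run gets score 0.
    scores_a = [run_a.get(d, 0.0) for d in docs]
    scores_b = [run_b.get(d, 0.0) for d in docs]
    return pearson_r(scores_a, scores_b)

# Hypothetical scores for an Abstract query and an Intro query.
abstract_run = {"d1": 3.0, "d2": 1.5, "d3": 0.5}
intro_run = {"d2": 2.0, "d4": 1.0}
r = result_correlation(abstract_run, intro_run)
```

Under this reading, values near 0 such as those in the table would indicate that two section queries score largely different documents.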

8 Section Queries
RQ2 Do the sections return different relevant results?
Avg. = The average number of relevant results returned @ 20.
E.g. Abstract queries returned 2 QRELs
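
A short sketch of the measure defined on this slide: count the judged-relevant documents in the top 20 results of each query, then average over topics. The function names and the toy runs are hypothetical.

```python
def relevant_at_k(ranking, qrels, k=20):
    """Number of judged-relevant documents among the top k results."""
    return sum(1 for doc in ranking[:k] if doc in qrels)

def average_relevant_at_k(runs, qrels_by_topic, k=20):
    """Average relevant-at-k over all topics.

    runs maps topic id to a ranked list of doc ids;
    qrels_by_topic maps topic id to a set of relevant doc ids.
    """
    counts = [
        relevant_at_k(ranking, qrels_by_topic.get(topic, set()), k)
        for topic, ranking in runs.items()
    ]
    return sum(counts) / len(counts)

# Toy data: two topics, queried with (say) their abstract sections.
runs = {"t1": ["d1", "d2", "d3"], "t2": ["d4", "d5"]}
qrels_by_topic = {"t1": {"d1", "d3"}, "t2": {"d9"}}
avg = average_relevant_at_k(runs, qrels_by_topic)
```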

9 Section Queries
Average intersection sizes of relevant results
            Abstract  Intro  Method  Results  Conclusion  References
All         0.63      0.64   0.46    0.34     0.5         0.64
Abstract              0.6    0.44    0.43     0.62        0.53
Intro                        0.43    0.39     0.45        0.53
Method                               0.32     0.41        0.38
Results     0.39 (column unclear: Conclusion or References)
Conclusion                                                0.42
References
E.g.
Avg(|Abstract ∩ All|) = 0.63
Avg(|Abstract \ All|) = 1.37
100 - ((0.63 / 2) * 100) = 68.5% difference
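
The worked example on this slide can be reproduced with set operations and one line of arithmetic. The document ids below are made up for illustration. The figures 0.63 and 2 are the slide's own averages, and `percent_different` is a hypothetical helper name.

```python
def percent_different(avg_intersection, avg_relevant):
    """Share of a section's relevant results not shared with the other query, in percent."""
    return 100 - (avg_intersection / avg_relevant) * 100

# Per-topic view with made-up relevant results for two queries.
abstract_relevant = {"d1", "d2"}
all_relevant = {"d2", "d5", "d7"}
shared = abstract_relevant & all_relevant          # Abstract ∩ All
only_abstract = abstract_relevant - all_relevant   # Abstract \ All

# Averaged view, using the slide's numbers: 0.63 shared out of 2 relevant.
difference = percent_different(0.63, 2)  # about 68.5
```

Note that 0.63 + 1.37 = 2, so the intersection and the set difference together account for the average of 2 relevant results that Abstract queries returned.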

10 Section Queries
Average set complement % of relevant results
            Abstract  Intro  Method  Results  Conclusion  References
All         71        -      79      84       77          71
Abstract              70     78      -        69          73
Intro                        79      81       78          74
Method                               79       73          75
Results     73 (column unclear: Conclusion or References)
Conclusion                                                75
References
E.g. Section X returned n% different relevant results than section Y

11 Next
Practical Significance
– Does section-based relevance feedback (SRF) provide benefits over standard relevance feedback (RF)?

