Why IR test collections are so bad Mark Sanderson University of Sheffield.


1 Why IR test collections are so bad Mark Sanderson University of Sheffield

2 What is the problem?
Test collections (as a whole) (by & large)
–Don’t support context
  Have a fixed context
What do I mean by that?
–Don’t support context of different data
–Very restricted notion of user

3 Data
Very few test collections out there
–<100?
Why?
–Perception that test collections are hard to produce
  Implies people will use centrally built ones
  Don’t try to build their own

4 Users
Test collections simulate
–One time search
User
–Wants to enter a verbose query
  No spelling mistakes
–Wants to search only once

5 Only once?
We’ve long known “real search” is interactive
–But generally thought that test collections still help in picking an algorithm for interactive search
–I’m now not sure that this is true

6 If you only search once…
…you want a retrieval system that
–Uses pseudo-relevance feedback
–Allows matching on a subset of query words
–It doesn’t matter how long search takes
–It doesn’t matter if the user understands how the search system did what it did.
  Because they don’t need to re-formulate their query.
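
To make the two retrieval features named on this slide concrete, here is a minimal Python sketch of pseudo-relevance feedback on top of a scorer that accepts matches on a subset of the query words. It is an illustration only, not the method used in any system the talk refers to: the term-count scorer, the function names (`score`, `rank`, `expand_query`), the parameters `top_k` and `n_new_terms`, and the toy documents are all assumptions made for the example.

```python
from collections import Counter

def score(query_terms, doc_terms):
    """Sum query-term occurrences; a document matching only some query words still scores."""
    counts = Counter(doc_terms)
    return sum(counts[t] for t in query_terms)

def rank(query_terms, docs):
    """Return ids of documents with a non-zero score, best first (ties broken by id)."""
    scored = [(score(query_terms, terms), doc_id) for doc_id, terms in docs.items()]
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [doc_id for s, doc_id in scored if s > 0]

def expand_query(query_terms, docs, top_k=2, n_new_terms=2):
    """Pseudo-relevance feedback: assume the top_k results are relevant
    and add their most frequent non-query terms to the query."""
    pool = Counter()
    for doc_id in rank(query_terms, docs)[:top_k]:
        pool.update(t for t in docs[doc_id] if t not in query_terms)
    best = sorted(pool.items(), key=lambda pair: (-pair[1], pair[0]))[:n_new_terms]
    return list(query_terms) + [term for term, _ in best]

# Toy collection (hypothetical data, for illustration only).
docs = {
    "d1": ["jaguar", "car", "speed", "engine"],
    "d2": ["jaguar", "car", "engine", "dealer"],
    "d3": ["jaguar", "cat", "jungle"],
    "d4": ["engine", "repair"],
}
query = ["jaguar", "car"]
expanded = expand_query(query, docs)
print(rank(query, docs))     # initial ranking
print(expanded)              # query after feedback
print(rank(expanded, docs))  # ranking with the expanded query
```

Note that the expansion happens without the user seeing it, which fits the slide's point: a single-search user never needs to understand why the ranking changed.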

7 Why are test collections like this?
Look back 30 years
What was IR like?
–Search intermediaries
  Expectations were that IR systems would replace intermediaries
–Search once, look through lots of documents?

8 Test collections simulate “intermediary situation” rather well
–Long query
–Single search
–User willing to look through long list of documents
  Can’t be bothered to re-formulate?

9 For common searching tasks
If you are willing to search more than once
–Do you want a search system tuned to single search?

10 Do we ditch the collections?
No
–Search engines use them a lot
  MSN Search
  –500 topic test collection per week

11 What do we do? We build more –Rapid building and prototyping of test collections Helps with dynamic User modelling –Think about how to simulate people entering multiple searches in test collections
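
One way to picture "simulating people entering multiple searches" is sketched below in Python. It is only a rough illustration under stated assumptions, not a proposal from the talk. The simulated user tries a fixed list of query reformulations in order, inspects the top `depth` results of each, and stops at the first relevant document. The function `run_session`, the `depth` parameter, the toy index and the relevance judgements are all hypothetical.

```python
def run_session(query_variants, search, relevant, depth=3):
    """Simulate a user who reformulates until a relevant document turns up.

    query_variants: queries the simulated user tries, in order
    search:         function mapping a query string to a ranked list of doc ids
    relevant:       set of doc ids judged relevant for the topic
    depth:          how many results the user inspects per query
    """
    examined = 0
    for n_queries, query in enumerate(query_variants, start=1):
        for doc_id in search(query)[:depth]:
            examined += 1
            if doc_id in relevant:
                return {"queries": n_queries, "docs_examined": examined, "found": True}
    return {"queries": len(query_variants), "docs_examined": examined, "found": False}

# Toy search system and judgements (hypothetical data, for illustration only).
toy_index = {
    "jaguar": ["d3", "d5", "d6"],
    "jaguar car": ["d7", "d1", "d2"],
}

def toy_search(query):
    return toy_index.get(query, [])

session = run_session(["jaguar", "jaguar car"], toy_search, relevant={"d1"})
print(session)
```

A measure such as the number of queries or documents examined before success rewards systems that help with reformulation, which a single-query measure cannot capture.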

12 Conclusion
I hope you agree test collections are bad
But that there are potential ways of improving them.

