
1 Do Batch and User Evaluations Give the Same Results? Authors: William Hersh, Andrew Turpin, Susan Price, Benjamin Chan, Dale Kraemer, Lynetta Sacherek, Daniel Olson Presenters: Buğra M. Yıldız, Emre Varol

2 Introduction-1 There is a continuing debate about whether the results of batch evaluations, which measure recall and precision in a non-interactive environment, can be generalized to the real world. Some IR researchers argue that searching in the real world is much more complex.
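
To make the batch measures concrete, here is a minimal Python sketch (our own illustration, not from the paper) of how recall and precision are computed for a single query from the sets of retrieved and relevant documents:

    def recall_precision(retrieved, relevant):
        # recall: fraction of relevant documents that were retrieved
        # precision: fraction of retrieved documents that are relevant
        hits = len(set(retrieved) & set(relevant))
        recall = hits / len(relevant) if relevant else 0.0
        precision = hits / len(retrieved) if retrieved else 0.0
        return recall, precision

    # Example: 3 of 5 retrieved documents are relevant; 10 relevant exist.
    relevant = ["d1", "d3", "d5"] + ["d%d" % i for i in range(10, 17)]
    r, p = recall_precision(["d1", "d2", "d3", "d4", "d5"], relevant)
    print(r, p)  # 0.3 recall, 0.6 precision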

3 Introduction-2 If batch searching results do not reflect the real-world case, then system design decisions and measurements based on them are misleading. The purpose of this study is to determine whether IR approaches that performed well in laboratory settings (batch environment) can translate that effectiveness to the real world.

4 Experiment 1 Purpose: Establish the best weighting approach for batch searching using previous TREC interactive data. Setup: The MG retrieval system is used. The prior interactive data (from TREC-6 and TREC-7) is converted into a test collection.
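
The two weighting families compared in this experiment have the general form sketched below; this is an assumption about their shape only, since the exact MG variants (identified by codes such as BB-ACB-BAA) differ in normalization details:

    import math

    def tfidf_weight(tf, df, N):
        # classic TF*IDF: term frequency times inverse document frequency
        return tf * math.log(N / df)

    def okapi_weight(tf, df, N, dl, avgdl, k1=1.2, b=0.75):
        # Okapi BM25 term weight with document-length normalization
        idf = math.log((N - df + 0.5) / (df + 0.5))
        return idf * tf * (k1 + 1) / (tf + k1 * (1 - b + b * dl / avgdl))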

5 Experiment 1 Results: The experiment set out to determine a baseline performance and a maximally improved configuration that could be used in the subsequent user experiments.

    Q-Expression   Weighting Type   Average Precision   % Improvement
    BB-ACB-BAA     TF*IDF           0.2129              0%
    AB-BFD-BAA     Okapi            0.3850              81%
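
The average precision figures above come from a standard batch measure; a minimal sketch of non-interpolated average precision over a ranked result list (our own illustration, not the paper's code):

    def average_precision(ranked, relevant):
        # mean of the precision values at each relevant document's rank
        relevant = set(relevant)
        hits, total = 0, 0.0
        for rank, doc in enumerate(ranked, start=1):
            if doc in relevant:
                hits += 1
                total += hits / rank
        return total / len(relevant) if relevant else 0.0

    print(average_precision(["a", "b", "c", "d"], ["a", "c"]))
    # (1/1 + 2/3) / 2 = 0.8333...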

6 Experiment 2 Purpose: Determine whether batch measures give results comparable to those of human searchers on the new TREC interactive data. Setup: The main performance measure used in the TREC-8 interactive track was instance recall, defined as the proportion of true instances identified by a user searching on the topic.
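
A small sketch of instance recall as defined above (the function name and example counts are hypothetical):

    def instance_recall(instances_found, instances_total):
        # fraction of a topic's known instances that the searcher identified
        return len(set(instances_found)) / len(set(instances_total))

    # e.g. a searcher identifying 4 of a topic's 7 known instances
    print(instance_recall({"i1", "i2", "i3", "i4"},
                          {"i1", "i2", "i3", "i4", "i5", "i6", "i7"}))  # 0.571...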

7 Experiment 2 Results: 12 librarians and 12 graduate students participated in the experiment. While there was essentially no difference between searcher types, the Okapi system showed an 18.2% improvement in instance recall and an 8.1% improvement in instance precision, neither of which was statistically significant.

8 Experiment 3 and 4 Experiment 3 verified that the experimental results were not an artifact of the particular data sets used. Experiment 4:

9 Conclusion The experiments in this study showed that batch and user searching experiments do not give the same results. However, since this was a limited study, further research is needed to reach a clearer conclusion.

