Q4 Measuring Effectiveness


1 Q4 Measuring Effectiveness

2 Performance Evaluation
How do you evaluate the performance of an information retrieval system? Or compare two different systems?

3 Evaluation of IR Systems
Using Recall & Precision:
• Conduct query searches
• Try many different queries
• Compare results of Precision & Recall
Recall & Precision need to be considered together.
Results vary depending on test data and queries.
Recall & Precision are only one aspect of system performance.
High recall/high precision is desirable, but not necessarily the most important thing that the user considers.
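The procedure on this slide (run many queries, then compare precision and recall together) can be sketched roughly as follows. This is a minimal illustration, not a method from the slides: the function names `precision_recall` and `evaluate`, the document IDs, the relevance judgments, and the two systems' result lists are all made up, and averaging per-query scores is only one of several possible ways to summarize.

```python
from statistics import mean


def precision_recall(retrieved, relevant):
    """Return (precision, recall) for a single query, using set overlap."""
    retrieved, relevant = set(retrieved), set(relevant)
    hits = len(retrieved & relevant)
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


def evaluate(system_results, judgments):
    """Average precision and recall over every judged query."""
    scores = [
        precision_recall(system_results.get(query, []), relevant)
        for query, relevant in judgments.items()
    ]
    return mean(p for p, _ in scores), mean(r for _, r in scores)


# Hypothetical relevance judgments and result lists (illustrative only).
judgments = {"q1": {"d1", "d2", "d3", "d4"}, "q2": {"d5", "d6"}}
system_a = {"q1": ["d1", "d2", "d9"], "q2": ["d5", "d6", "d7", "d8"]}
system_b = {
    "q1": ["d1", "d2", "d3", "d4", "d7", "d8", "d9", "d10"],
    "q2": ["d5", "d6", "d7", "d8", "d9", "d10"],
}

for name, results in (("A", system_a), ("B", system_b)):
    p, r = evaluate(results, judgments)
    print(f"System {name}: precision={p:.2f} recall={r:.2f}")
```

With this made-up data, system A comes out ahead on precision and system B ahead on recall, which is why the two numbers have to be read together and why a different set of queries could change the picture.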

4 Data quality consideration
• Coverage of database
• Completeness and accuracy of data
• Indexing methods and indexing quality
– indexing types
– currency of indexing (is it updated often?)
– indexing sizes

5 Interface Consideration
• User-friendly interface
• How long does it take for a user to learn advanced features?
• How well can the user explore or interact with the query output?
• How easy is it to customize output displays?

6 User satisfaction
The final test is the user!
User satisfaction is more important than precision and recall.
Measuring user satisfaction:
• Survey
• Use statistics
• User experiments

7 Cleverdon – The Cranfield Experiments 1950s/1960s
• Time lag. The interval between the demand being made and the answer being given.
• Presentation. The physical form of the output.
• User effort. The effort, intellectual or physical, demanded of the user.
• Recall. The ability of the system to present all relevant documents.
• Precision. The ability of the system to withhold non-relevant documents.
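The last two criteria are usually made quantitative with set-based ratios. The formulation below is the standard textbook one rather than something stated on the slide, and the numbers in the worked example are hypothetical.

```latex
\[
\text{Recall} = \frac{|\text{relevant} \cap \text{retrieved}|}{|\text{relevant}|},
\qquad
\text{Precision} = \frac{|\text{relevant} \cap \text{retrieved}|}{|\text{retrieved}|}
\]
% Hypothetical example: 20 relevant documents exist in the collection;
% the system returns 10 documents, 8 of which are relevant.
\[
\text{Recall} = \frac{8}{20} = 0.4,
\qquad
\text{Precision} = \frac{8}{10} = 0.8
\]
```

In this example the system withholds most non-relevant documents (high precision) but presents fewer than half of the relevant ones (low recall).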

8 Cleverdon says
Precision & Recall:
• Measure ability to find the relevant information
• If a system can’t identify relevant information, what use is it?

9 Why not the others?
According to Cleverdon:
• Time lag: a function of hardware
• Presentation: successful if the user can read and understand the list of references returned
• User effort: can be measured with a straightforward examination of a small number of cases

10 In Reality
Need to consider the user task carefully
• Cleverdon was focusing on batch interfaces
• Interactive browsing interfaces are very significant (Turpin & Hersh)
Interactive systems
• User effort & presentation are very important

11 In Spite of That
• Precision & Recall: extensively evaluated
• Usability: not so much

12 Why Not Usability
Usability requires a user study
• Every new feature needs a new study (expensive)
• High variance: many confounding factors
Offline analysis of accuracy
• Once a dataset is found:
– Easy to control factors
– Repeatable
– Automatic
– Free
• If the system isn’t accurate, it isn’t going to be usable

13 Measures
• From IR
– (User) precision, aspectual recall
• From experimental psychology
– Quantitative: time, number of errors, …
– Qualitative: user opinions
• Example evaluation measures:

               System viewpoint    User viewpoint
Effectiveness  recall/precision    quality of solution
Efficiency     retrieval time      task completion time
Satisfaction   preference          confidence

