1 Ranked-Listed or Categorized Results in IR Zheng Zhu, Ingemar J. Cox, Mark Levene Birkbeck College, University of London UCL.

2 Content: Motivation, Methodology, Results, Conclusions

3 The motivation Improve navigational experience for both normal users and users of handheld devices. Intuitively, we would expect grouping documents to reduce search time.

4 Introduction We quantify the benefits of grouping documents based on classification. We study how the benefits of grouping degrade with classification errors. We take into account errors that arise from both the user and the classifier.

5 The methodology Three types of simulated user model: 1. The user knows the class. 2. The user doesn't know the class. 3. The user thinks he knows the class. Two classification scenarios: 1. Correct classification. 2. Misclassification.

6 The methodology To measure the benefits, we define: – class rank, – document rank. For ranked-list results, scroll rank is used.

7 The methodology For categorized results, based on different user models and operation scenarios, we define: – In-Class Rank (ICR), – Scrolled-Classification Rank (SCR), – Out-Class/Scroll-Class Rank (OSCR), – Out-Class/Revert Rank (ORR).

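8 The methodology [Figure: a query returns the ranked list doc1 to doc7; the same documents are also shown grouped under Class1 and Class2.] For the target document in the example: SR = 6; ICR = 1 + 3 = 4; SCR = 1 + 3 = 4; ORR = 2 + 4 + 6 = 12; OSCR = 2 + 4 + 4 = 10.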
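
The arithmetic in the example on this slide can be reproduced with a short Python sketch. The formulas below are inferred from the numbers on the slide rather than taken from a stated definition: ICR is read as class rank plus document rank within the class, and ORR and OSCR as the rank of the wrongly chosen class, plus the number of documents in it, plus SR or SCR respectively. The assignment of documents to Class1 and Class2, the target `doc6`, the function names, and the general-case cost of SCR (one step per class opened, plus every document scanned) are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class Display:
    ranked: list    # ranked list of document ids, best first
    classes: list   # classes in class-rank order; each is a ranked list of ids


def scroll_rank(d, target):
    """SR: position of the target in the plain ranked list (1-based)."""
    return d.ranked.index(target) + 1


def in_class_rank(d, target):
    """ICR: class rank + document rank within that class."""
    for class_rank, docs in enumerate(d.classes, start=1):
        if target in docs:
            return class_rank + docs.index(target) + 1
    raise ValueError("target is not in any class")


def scrolled_class_rank(d, target):
    """SCR (assumed general form): open classes top-down, scanning each."""
    cost = 0
    for docs in d.classes:
        cost += 1                      # one step to open the class
        if target in docs:
            return cost + docs.index(target) + 1
        cost += len(docs)              # whole class scanned without success
    raise ValueError("target is not in any class")


def out_class_revert_rank(d, target, wrong_class_rank):
    """ORR: wrong class rank + its size + scroll rank after reverting."""
    wrong = d.classes[wrong_class_rank - 1]
    return wrong_class_rank + len(wrong) + scroll_rank(d, target)


def out_class_scroll_class_rank(d, target, wrong_class_rank):
    """OSCR: wrong class rank + its size + scrolled-class rank."""
    wrong = d.classes[wrong_class_rank - 1]
    return wrong_class_rank + len(wrong) + scrolled_class_rank(d, target)


# Hypothetical layout chosen only so that it matches the slide's numbers.
example = Display(
    ranked=["doc1", "doc2", "doc3", "doc4", "doc5", "doc6", "doc7"],
    classes=[["doc1", "doc3", "doc6"], ["doc2", "doc4", "doc5", "doc7"]],
)
target = "doc6"

print(scroll_rank(example, target))                     # SR
print(in_class_rank(example, target))                   # ICR
print(scrolled_class_rank(example, target))             # SCR
print(out_class_revert_rank(example, target, 2))        # ORR
print(out_class_scroll_class_rank(example, target, 2))  # OSCR
```

With this assumed layout the five calls give 6, 4, 4, 12 and 10, matching SR, ICR, SCR, ORR and OSCR on the slide.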

9 The methodology
Simulated user / target: correctly classified; misclassified
Knows class: ICR; OSCR or ORR
Does not know class: SCR or SR
Thinks knows class: OSCR or ORR

10 The methodology Known-Item Search (Target Testing), followed by comparison of the ranks. Given a document, we generate a query so that the target document appears within a designated range of scroll rank.

11 The implementation The Open Directory Project provides an oracle for classification, so that we can control both user and machine error. The search engine is based on Lucene, an open-source tool.

12 The ideal case with an Oracle

13 KNN Classifier

14 More realistic scenario

15 Conclusions Classification-based display can improve users' interaction with the search engine. However, this depends on the user strategy: – The hybrid strategy has the best performance. – Using a hybrid strategy, performance degrades gracefully with errors.

16 Thanks!

