1 Recognizing User Interest and Document Value from Reading and Organizing Activities in Document Triage
Rajiv Badi, Soonil Bae, J. Michael Moore, Konstantinos Meintanis, Anna Zacchi, Haowei Hsieh, Frank Shipman and Cathy Marshall
Center for the Study of Digital Libraries & Department of Computer Science, Texas A&M University; Microsoft Corporation

2 What is Document Triage?
A specific form of information collecting, reading and organizing
● People quickly evaluate a large set of documents, selecting documents to read
● People organize them into a personal information collection
● People re-read the documents, progressively refining the organization
● Knowledge forms incrementally as initial understanding becomes more refined over time

3 Prior Document Triage Study (2004)
● Task: acting as a reference librarian, organize the documents to help a teacher prepare a set of lessons on ethnomathematics
● 24 subjects
● 40 documents from NSDL & Google searches
● Organizing tool: Visual Knowledge Builder (VKB)
● Reading tool: Internet Explorer (IE)
● Logged reading & editing events
● Asked subjects to select the five most and the five least useful documents

4 Initial Document List
[Screenshot of the initial document list. Labeled elements: document object; collection; metadata (page title, page URL, summary); NSDL search; Google search; system-generated visualization based on metadata]

5 Document in a Web Browser

6 Final Organization Sample
[Screenshot of a final organization. Labeled elements: categories (collections), background color, border color, border thickness]

7 Proactive Support for Document Triage
Motivations
1. Recognizing user interest and document value
2. Representing user interests
3. Recognizing documents of interest
4. Visualizing interest information

8 Recognizing User Interest (1)
● Explicit and implicit interest indicators
● Correlation between reading activity and user interest: reading time, # of visits, # of scrolls, …
● Correlation between organizing activity and user interest: resize, move, delete, …
● Correlation between document attributes and user interest: # of characters, # of links, # of images, …

9 Recognizing User Interest (2)
● Prior work has focused on a single application as the source of interest indicators
● Document triage occurs in the context of multiple applications
● An interest profile is the basis for determining, sharing and storing implicit interest
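The slides do not show how an interest profile is represented. As a minimal sketch only, one could imagine a shared store that accumulates indicator counts per document from several applications. The class name `InterestProfile`, its methods, the application labels and the URL below are all hypothetical and are not taken from the Interest Profile Manager.

```python
from collections import defaultdict


class InterestProfile:
    """Hypothetical per-document store of interest indicators.

    Each indicator is keyed by (application, indicator name), so events
    from a reading tool and an organizing tool can sit side by side.
    """

    def __init__(self):
        # document URL -> {(application, indicator): accumulated value}
        self._values = defaultdict(lambda: defaultdict(float))

    def record(self, application, doc_url, indicator, amount=1.0):
        """Add `amount` to one indicator of one document."""
        self._values[doc_url][(application, indicator)] += amount

    def indicators(self, doc_url):
        """Return a plain-dict copy of the indicators for a document."""
        return dict(self._values.get(doc_url, {}))


# Illustrative events from two applications about the same document.
profile = InterestProfile()
profile.record("browser", "http://example.org/a", "scrolls")
profile.record("browser", "http://example.org/a", "scrolls")
profile.record("browser", "http://example.org/a", "reading_time_s", 42.5)
profile.record("organizer", "http://example.org/a", "resizes")
```

Keying by application as well as indicator name keeps the source of each signal visible, which matters if the two sources are later weighted differently.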

10 Interest Profile Manager

11 Data Analysis (1)
● Document attributes: # of characters, # of links, # of images
● Reading activity: reading time, # of clicks, # of text selections, # of scrolls, # of scrolling direction changes, time spent in scrolling, scroll offset, # of document accesses
● Organizing activity: # of object moves, # of object resizes, # of object deletions, # of content changes, # of background color changes, # of border color changes, # of border width changes

12 Data Analysis (2)
● Identified correlations between user interest and both user activity and document attributes
● Found meaningful interest indicators in user activity: reading time, # of scrolls, # of resize events, …
● Found meaningful interest indicators in document attributes: # of characters, # of links, # of images, …
● No single indicator can dominantly identify user interest
● Significant differences between individual styles
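The slides do not say which correlation measure was used. As an illustrative sketch, assuming a Pearson correlation between one indicator and the explicit rating, the check could look like the following. The function name and the sample values are made up for the example.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / sqrt(var_x * var_y)


# Made-up data: reading time in seconds and a 0-2 usefulness rating
# for four documents viewed by one subject.
reading_time = [10, 20, 30, 40]
rating = [0, 1, 1, 2]
r = pearson(reading_time, rating)
```

Because individual styles differ significantly, a coefficient computed per subject may tell a different story from one computed over pooled data.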

13 Interest Models
● Models to estimate average interest in documents
● Statistical models:
  ● Reading activity model (data: reading activity)
  ● Organizing activity model (data: organizing activity)
  ● Combined model (data: reading & organizing activity)
● Qualitative model (data: reading & organizing activity)

14 Evaluation (1)
● The same task and topic as in the prior study in 2004
● 16 subjects
● 40 documents from NSDL & Google searches
● Asked subjects to select the five most and the five least useful documents
● Scaled ratings to a continuous value between 0 (least useful) and 2 (most useful)
● Calculated the absolute value of the difference between the explicit user rating and each model's predicted rating
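The error measure in the last bullet can be sketched as below. This assumes both the explicit ratings and the model predictions are already on the 0 to 2 scale and that the per-document errors are averaged. The averaging step, the function name and the sample values are assumptions for illustration, not details given in the slides.

```python
from statistics import mean


def mean_absolute_error(explicit, predicted):
    """Average |explicit - predicted| over documents present in both dicts."""
    shared = explicit.keys() & predicted.keys()
    if not shared:
        raise ValueError("no documents in common")
    return mean(abs(explicit[doc] - predicted[doc]) for doc in shared)


# Hypothetical ratings on the 0 (least useful) to 2 (most useful) scale.
explicit_ratings = {"doc1": 2.0, "doc2": 0.0, "doc3": 1.0}
model_predictions = {"doc1": 1.5, "doc2": 0.5, "doc3": 1.0}
error = mean_absolute_error(explicit_ratings, model_predictions)
```

Computing this value once per model gives a single number per model, where a lower value means predictions closer to the explicit ratings.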

15 Evaluation (2)
● The combined and qualitative models, which use both reading and organizing activity, performed better than the others

16 Conclusion
● Predictive models based on user activity collected from multiple applications have been built
● Using activity from multiple applications rather than a single application can improve prediction accuracy
● A software infrastructure, the Interest Profile Manager, has been developed to support this work

