
1 Retrieval and Evaluation Techniques for Personal Information
Jin Young Kim, 7/26, Ph.D. Dissertation Seminar

2 Personal Information Retrieval (PIR)
- The practice and the study of supporting users to retrieve personal information effectively

3 Personal Information Retrieval in the Wild
- Everyone has unique information & practices
  - Different information and information needs
  - Different preferences and behaviors
- Many existing software solutions
  - Platform-level: desktop search, folder structure
  - Application-level: email, calendar, office suites

4 Previous Work in PIR (Desktop Search)
- Focus
  - User interface issues [Dumais03,06]
  - Desktop-specific features [Solus06] [Cohen08]
- Limitations
  - Each based on a different environment and user group
  - None of them performed comparative evaluation
  - Research findings do not accumulate over the years

5 Our Approach
- Develop general techniques for PIR
  - Start from essential characteristics of PIR
  - Applicable regardless of users and information types
- Make contributions to related areas
  - Structured document retrieval
  - Simulated evaluation for known-item finding
- Build a platform for sustainable progress
  - Develop repeatable evaluation techniques
  - Share the research findings and the data

6 Essential Characteristics of PIR
- Many document types
- Unique metadata for each type
- People combine search and browsing [Teevan04]
- Long-term interactions with a single user
- People mostly find known-items [Elsweiler07]
- Privacy concern for the data set
(Figure on the slide maps these characteristics to the three thesis components: Field-based Search Models, Associative Browsing Model, Simulated Evaluation Methods.)

7 Search and Browsing Retrieval Models
- Challenge: users may remember different things about the document
- How can we present effective results for both cases?
(Figure: the user's memory of a document, e.g. 'Registration' and 'James', feeds the query; lexical memory supports search and associative memory supports browsing over the retrieval results.)

8 Information Seeking Scenario in PIR
- A user initiates a session with a keyword query
- The user switches to browsing by clicking on an email document
- The user switches back to search with a different query
(Figure: alternating user input and system output, e.g. the queries 'Registration James' and 'Registration 2011', across search and browsing steps.)

9 Simulated Evaluation Techniques
- Challenge: the user's query originates from what she remembers
- How can we simulate the user's querying behavior realistically?
(Figure: the user's memory, query, and retrieval results, with lexical memory supporting search and associative memory supporting browsing.)

10 Research Questions
- Field-based Search Models
  - How can we improve the retrieval effectiveness in PIR?
  - How can we improve the type prediction quality?
- Associative Browsing Model
  - How can we enable the browsing support for PIR?
  - How can we improve the suggestions for browsing?
- Simulated Evaluation Methods
  - How can we evaluate a complex PIR system by simulation?
  - How can we establish the validity of simulated evaluation?

11 Field-based Search Models

12 Searching for Personal Information
- An example of desktop search

13 Field-based Search Framework for PIR
- Type-specific Ranking: rank documents in each document collection (type)
- Type Prediction: predict the document type relevant to the user's query
- Final Results Generation: merge into a single ranked list

14 Type-specific Ranking for PIR
- Individual collections have type-specific features
  - Thread-based features for emails
  - Path-based features for documents
- Most of these documents have rich metadata (email, document, and calendar fields)
- We focus on developing general retrieval techniques for structured documents

15 Structured Document Retrieval
- Field Operator / Advanced Search Interface
- User's search terms are found in multiple fields
Reference: Elsweiler, D., Harvey, M., Hacker, M. Understanding Re-finding Behavior in Naturalistic Email Interaction Logs. [SIGIR'11]

16 Structured Document Retrieval: Models
- Document-based Retrieval Model: score each document as a whole
- Field-based Retrieval Model: combine evidence from each field
(Figure: document-based scoring matches query terms q_1 ... q_m against the whole document; field-based scoring matches each term against fields f_1 ... f_n and combines them with field weights w_1 ... w_n.)
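
The contrast can be made concrete with a small sketch. This is not the dissertation's code; it assumes a simple Dirichlet-smoothed query-likelihood model, with doc_fields and coll_fields as hypothetical dictionaries mapping field names to term lists.

```python
import math
from collections import Counter

def term_prob(term, field_terms, coll_terms, mu=1500):
    """Dirichlet-smoothed P(term | field), backed off to collection statistics."""
    tf, cf = Counter(field_terms), Counter(coll_terms)
    p_coll = (cf[term] + 1.0) / (len(coll_terms) + 1.0)   # +1 avoids zero probabilities
    return (tf[term] + mu * p_coll) / (len(field_terms) + mu)

def document_based_score(query, doc_fields, coll_fields):
    """Score the document as a whole: pool all fields into one bag of words."""
    doc = [t for terms in doc_fields.values() for t in terms]
    coll = [t for terms in coll_fields.values() for t in terms]
    return sum(math.log(term_prob(q, doc, coll)) for q in query)

def field_based_score(query, doc_fields, coll_fields, w):
    """Mixture of field language models with fixed field weights w:
    log P(Q|D) = sum_i log sum_j w_j * P(q_i | D_j)."""
    return sum(math.log(sum(w[f] * term_prob(q, doc_fields[f], coll_fields[f])
                            for f in doc_fields))
               for q in query)
```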

17 Field Relevance Model for Structured IR
- Field Relevance: different fields are important for different query terms
- Example: 'james' is relevant when it occurs in the to/from field, while 'registration' is relevant when it occurs in the content field

18 Estimating the Field Relevance: Overview
- If the user provides feedback: a relevant document provides sufficient information
- If no feedback is available: combine field-level term statistics from multiple sources
(Figure: field-level term statistics over content/title/from-to from the collection and the top-k documents, combined to approximate those of relevant documents.)

19 Estimating Field Relevance using Feedback
- Assume a user who marked D_R as relevant
- Estimate field relevance from the field-level term distribution of D_R
  - Example: the 'to' field is relevant for 'james', the content field for 'registration'
- We can personalize the results accordingly: rank higher the documents with a similar field-level term distribution
- This weight is provably optimal under the LM retrieval framework
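
A minimal sketch of this estimate, under the assumption that the per-term field relevance is taken as the normalized relative frequency of the query term across the fields of the marked document (names below are illustrative, not the dissertation's API):

```python
from collections import Counter

def field_relevance_from_feedback(query, relevant_doc_fields, eps=1e-9):
    """P(F_j | q_i) from the field-level term distribution of the relevant doc D_R.
    relevant_doc_fields: {field_name: [terms]}."""
    relevance = {}
    for q in query:
        per_field = {f: (Counter(terms)[q] / len(terms) if terms else 0.0) + eps
                     for f, terms in relevant_doc_fields.items()}
        z = sum(per_field.values())
        relevance[q] = {f: v / z for f, v in per_field.items()}   # normalize over fields
    return relevance

# 'james' concentrates in the 'to' field, 'registration' in the content field
d_r = {"to": ["james", "smith"],
       "content": ["registration", "deadline", "for", "fall", "semester"],
       "title": ["reminder"]}
print(field_relevance_from_feedback(["james", "registration"], d_r))
```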

20 Estimating Field Relevance without Feedback
- Linear combination of multiple sources; weights estimated using training queries
- Features
  - Field-level term distribution of the collection (unigram and bigram LM; the unigram case is the same as PRM-S)
  - Field-level term distribution of the top-k documents (unigram and bigram LM; pseudo-relevance feedback)
  - A priori importance of each field (w_j), estimated using held-out training queries (similar to MFLM and BM25F)
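
A sketch of the combination step, assuming each source exposes a per-term, per-field score and the mixture weights have already been learned on training queries (function and variable names here are illustrative):

```python
def field_relevance_no_feedback(query_terms, feature_fns, weights, fields):
    """P(F_j | q_i) as a weighted linear combination of per-source estimates.
    feature_fns: callables fn(term, field) -> score, e.g. a collection field LM,
    a top-k field LM, or a constant a-priori field importance;
    weights: one learned weight per feature."""
    relevance = {}
    for q in query_terms:
        raw = {f: sum(w * fn(q, f) for w, fn in zip(weights, feature_fns))
               for f in fields}
        z = sum(raw.values()) or 1.0
        relevance[q] = {f: v / z for f, v in raw.items()}   # normalize over fields
    return relevance
```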

21 Retrieval Using the Field Relevance
- Comparison with previous work: prior field-based models use one fixed weight w_j per field
- Ranking in the Field Relevance Model: each query term q_i gets its own field weights P(F_j | q_i); per-term field scores are summed over fields, then multiplied over query terms
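
Reusing term_prob from the earlier sketch, the ranking step might look like this (a sketch of the scoring rule described on the slide, not the dissertation's implementation):

```python
import math

def field_relevance_score(query_terms, doc_fields, coll_fields, field_relevance):
    """log P(Q|D) = sum_i log sum_j P(F_j | q_i) * P(q_i | D_j):
    per-term field weights replace the fixed w_j of the field-based model."""
    return sum(
        math.log(sum(field_relevance[q][f] *
                     term_prob(q, doc_fields[f], coll_fields[f])
                     for f in doc_fields))
        for q in query_terms)
```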

22 Evaluating the Field Relevance Model
- Retrieval effectiveness (metric: Mean Reciprocal Rank)
(Chart: baselines with fixed field weights vs. models with per-term field weights.)

23 Type Prediction Methods
- Field-based collection Query-Likelihood (FQL)
  - Calculate a QL score for each field of a collection
  - Combine field-level scores into a collection score
- Feature-based Method
  - Combine existing type-prediction methods
  - Grid Search / SVM for finding combination weights
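
A rough sketch of FQL under simple assumptions (maximum-likelihood field models with add-one smoothing, and a weighted sum of per-field log scores; the actual method may combine scores differently):

```python
import math
from collections import Counter

def fql_type_score(query_terms, collection_fields, field_weights):
    """Field-based collection Query-Likelihood: a QL score per field of the
    collection, combined into one collection score with per-field weights."""
    score = 0.0
    for f, terms in collection_fields.items():
        tf, n = Counter(terms), len(terms)
        field_ql = sum(math.log((tf[q] + 1.0) / (n + 1.0)) for q in query_terms)
        score += field_weights.get(f, 1.0) * field_ql
    return score

def predict_type(query_terms, collections, field_weights):
    """Pick the document type (email, calendar, files, ...) whose collection scores highest."""
    return max(collections,
               key=lambda c: fql_type_score(query_terms, collections[c],
                                            field_weights.get(c, {})))
```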

24 Type Prediction Performance
- Pseudo-desktop Collections and CS Collection (metric: % of queries with correct prediction)
- FQL improves performance over CQL
- Combining features improves the performance further

25 Summary So Far…
- Field relevance model for structured document retrieval
  - Enables relevance feedback through field weighting
  - Improves performance using linear feature-based estimation
- Type prediction methods for PIR
  - Field-based type prediction method (FQL)
  - Combining features improves the performance further
- We move on to the associative browsing model
  - What happens when users can't recall good search terms?

26 Associative Browsing Model

27 Recap: Retrieval Framework for PIR
(Figure: the query 'Registration James' handled by keyword search and associative browsing.)

28 User Interaction for Associative Browsing
- Users enter a concept or document page by search
- The system provides a list of suggestions for browsing
(Figure: data model and user interface.)

29 How can we build associations?
- Manually? "Participants wouldn't create associations beyond simple tagging operations" (Sauermann et al. 2005)
- Automatically? How would it match the user's preferences?

30 Building the Associative Browsing Model
1. Document Collection
2. Concept Extraction
3. Link Extraction (term similarity, temporal similarity, co-occurrence)
4. Link Refinement (click-based training)

31 Link Extraction and Refinement (example concept: 'Search Engine')
- Link Scoring: combination of link type scores
  S(c_1, c_2) = Σ_i [ w_i × Link_i(c_1, c_2) ]
- Link Presentation
  - Ranked list of suggested items
  - Users click on them for browsing
- Link Refinement (training w_i): maximize click-based relevance
  - Grid Search: maximize retrieval effectiveness (MRR)
  - RankSVM: minimize error in pairwise preference
(Table of link types. Concepts: term vector similarity, temporal similarity, tag similarity, string similarity, co-occurrence. Documents: term vector similarity, temporal similarity, tag similarity, path/type similarity, concept similarity.)
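
The scoring and presentation steps can be sketched directly from the formula on the slide; link_features and weights below are hypothetical containers for the link-type functions and their trained weights w_i:

```python
def link_score(c1, c2, link_features, weights):
    """S(c1, c2) = sum_i w_i * Link_i(c1, c2): a weighted combination of
    link-type scores (term similarity, temporal similarity, co-occurrence, ...)."""
    return sum(weights[name] * fn(c1, c2) for name, fn in link_features.items())

def browsing_suggestions(current, candidates, link_features, weights, k=10):
    """Link presentation: a ranked list of the top-k suggested items."""
    return sorted(candidates,
                  key=lambda c: link_score(current, c, link_features, weights),
                  reverse=True)[:k]
```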

32 Evaluating the Associative Browsing Model
- Data set: CS Collection
  - Collected public documents in the UMass CS department
  - CS dept. people competed in known-item finding tasks
- Value of browsing for known-item finding
  - % of sessions in which browsing was used
  - % of sessions in which browsing was used and led to success
- Quality of browsing suggestions
  - Mean Reciprocal Rank using clicks as judgments
  - 10-fold cross validation over the collected click data

33 Value of Browsing for Known-item Finding
- Comparison with simulation results: roughly matches in terms of overall usage and success ratio
- The value of associative browsing
  - Browsing was used in 30% of all sessions
  - Browsing saved 75% of the sessions in which it was used

Evaluation Type   | Total (#sessions) | Browsing used  | Successful outcome
Simulation        | 63,260            | 9,410 (14.8%)  | 3,957 (42.0%)
User Study (1)    | 290               | 42 (14.5%)     | 15 (35.7%)
User Study (2)    | 142               | 43 (30.2%)     | 32 (74.4%)
(User Study (1): document-only browsing; User Study (2): document + concept browsing.)

34 Quality of Browsing Suggestions
- Concept browsing (MRR)
- Document browsing (MRR)

35 Simulated Evaluation Methods

36 Challenges in PIR Evaluation
- Hard to create a 'test-collection'
  - Each user has different documents and habits
  - People will not donate their documents and queries for research
- Limitations of user studies
  - Experimenting with a working system is costly
  - Experimental control is hard with real users and tasks
  - Data is not reusable by third parties

37 Our Approach: Simulated Evaluation
- Simulate the components of evaluation
  - Collection: user's documents with metadata
  - Task: search topics and relevance judgments
  - Interaction: query and click data

38 Simulated Evaluation Overview
- Simulated document collections
  - Pseudo-desktop Collections: subsets of the W3C mailing list + other document types
  - CS Collection: UMass CS mailing list / calendar items / crawl of homepages
- Evaluation methods

                     | Controlled User Study           | Simulated Interaction
Field-based Search   | DocTrack Search Game            | Query Generation Methods
Associative Browsing | DocTrack Search + Browsing Game | Probabilistic User Modeling

39 Controlled User Study: DocTrack Game
- Procedure
  - Collect public documents in the UMass CS dept. (CS Collection)
  - Build a web interface where participants can find documents
  - People in the CS department participated
- DocTrack search game
  - 20 participants / 66 games played
  - 984 queries collected for 882 target documents
- DocTrack search+browsing game
  - 30 participants / 53 games played
  - 290 + 142 search sessions collected

40 DocTrack Game
(Screenshot of the game interface. *Users can use both search and browsing in the DocTrack search+browsing game.)

41 Query Generation for Evaluating PIR
- Known-item finding for PIR
  - A target document represents an information need
  - Users would take terms from the target document
- Query generation for PIR
  - Randomly select a target document
  - Algorithmically take terms from the document
- Parameters of query generation
  - Choice of extent: Document [Azzopardi07] vs. Field
  - Choice of term: Uniform vs. TF vs. IDF vs. TF-IDF [Azzopardi07]
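
A simplified sketch of such a generator, assuming the document is given as a dict of fields and corpus document frequencies are available (parameter names and the field-selection step are illustrative):

```python
import math
import random
from collections import Counter

def generate_query(doc_fields, doc_freq, n_docs, length=3,
                   extent="field", term_model="tfidf"):
    """Simulated known-item query: select terms from the target document.
    extent: 'document' (all fields pooled) or 'field' (one randomly chosen field);
    term_model: 'uniform' | 'tf' | 'idf' | 'tfidf' term-selection probabilities."""
    if extent == "document":
        pool = [t for terms in doc_fields.values() for t in terms]
    else:
        pool = doc_fields[random.choice(list(doc_fields))]
    tf = Counter(pool)

    def weight(t):
        idf = math.log((n_docs + 1) / (doc_freq.get(t, 0) + 1))
        return {"uniform": 1.0, "tf": tf[t], "idf": idf, "tfidf": tf[t] * idf}[term_model]

    terms = list(tf)
    # tiny epsilon keeps sampling valid if all weights happen to be zero
    return random.choices(terms, weights=[weight(t) + 1e-12 for t in terms], k=length)
```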

42 Validation of Generated Queries
- Basic idea
  - Use the set of human-generated queries for validation
  - Compare at the level of query terms and retrieval scores
- Validation by comparing query terms
  - The generation probability of a manual query q under P_term
- Validation by comparing retrieval scores [Azzopardi07]
  - Two-sided Kolmogorov-Smirnov test
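
For the retrieval-score comparison, the test is the standard two-sample KS test; a sketch using SciPy (the choice of which score to compare per query, e.g. the target document's retrieval score, follows [Azzopardi07]):

```python
from scipy.stats import ks_2samp

def score_distributions_match(manual_query_scores, generated_query_scores, alpha=0.05):
    """Two-sided Kolmogorov-Smirnov test on retrieval scores from manual vs.
    generated queries; failing to reject the null means the two distributions
    are statistically indistinguishable at level alpha."""
    result = ks_2samp(manual_query_scores, generated_query_scores)
    return result.pvalue >= alpha, result.statistic, result.pvalue
```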

43 Validation Results for Generated Queries
- Validation based on query terms
- Validation based on retrieval score distribution

44 Probabilistic User Model for PIR
- Query generation model: term selection from a target document
- State transition model: use browsing when the result looks marginally relevant
- Link selection model: click on browsing suggestions based on perceived relevance

45 A User Model for Link Selection
- User's level of knowledge
  - Random: randomly click on the ranked list
  - Informed: more likely to click on a more relevant item
  - Oracle: always click on the most relevant item
- Relevance is estimated using the position of the target item
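
One way to realize the three knowledge levels, assuming a position-based relevance estimate that decays with distance from the target item in the suggestion list (the decay function is an assumption, not the dissertation's exact model):

```python
import random

def select_link(suggestions, target, knowledge="informed"):
    """Simulated click on a browsing suggestion.
    knowledge: 'random' | 'informed' | 'oracle'."""
    # Estimated relevance: 1.0 at the target's position, decaying with distance from it.
    target_pos = suggestions.index(target) if target in suggestions else len(suggestions)
    relevance = [1.0 / (1 + abs(i - target_pos)) for i in range(len(suggestions))]
    if knowledge == "random":
        return random.choice(suggestions)
    if knowledge == "oracle":
        return suggestions[relevance.index(max(relevance))]
    # informed: sample in proportion to estimated relevance
    return random.choices(suggestions, weights=relevance, k=1)[0]
```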

46 Success Ratio of Browsing
- Varying the level of knowledge and the fan-out of the simulation
- Exploration is valuable for users with a low knowledge level
(Chart: success ratio as fan-out increases, i.e. with more exploration.)

47 Community Efforts using the Data Sets

48 Conclusions & Future Work

49 Major Contributions
- Field-based Search Models
  - Field relevance model for structured document retrieval
  - Field-based and combination-based type prediction method
- Associative Browsing Model
  - An adaptive technique for generating browsing suggestions
  - Evaluation of associative browsing in known-item finding
- Simulated Evaluation Methods for Known-item Finding
  - DocTrack game for controlled user study
  - Probabilistic user model for generating simulated interaction

50 Field Relevance for Complex Structures
- Current work assumes documents with flat structure
- Field relevance for complex structures?
  - XML documents with hierarchical structure
  - Joined database relations with graph structure

51 Cognitive Model of Query Generation
- Current query generation methods assume:
  - Queries are generated from the complete document
  - Query terms are chosen independently from one another
- Relaxing these assumptions
  - Model the user's degradation in memory
  - Model the dependency in query term selection
- Ongoing work
  - Graph-based representation of documents
  - Query terms can be chosen by random walk

52 Thank you for your attention!  Special thanks to my advisor, coauthors, and all of you here!  Are we closer to the superhuman now?

53 One More Slide: What I Learned…
- Start from what's happening in the user's mind
  - Field relevance / query generation, …
- Balance user input and algorithmic support
  - Generating suggestions for associative browsing
- Learn from your peers & make contributions
  - Query generation method / DocTrack game
  - Simulated test collections & workshop

