Finding and Re-Finding Through Personalization
Jaime Teevan, MIT CSAIL
David Karger (advisor), Mark Ackerman, Sue Dumais, Rob Miller (committee), Eytan Adar, Christine Alvarado, Eric Horvitz, Rosie Jones, and Michael Potts
Thesis Overview
Supporting Finding
–How people find
–How individuals find
–Personalized finding tool
Supporting Re-Finding
–How people re-find
–Finding and re-finding conflict
–Personalized finding and re-finding tool
Supporting Re-Finding
How people re-find
–People repeat searches
–Look for old and new
Finding and re-finding conflict
–Result changes cause problems
Personalized finding and re-finding tool
–Identify what is memorable
–Merge in new information
Studies: query log analysis, memorability study, Re:Search Engine
Related Work
How people re-find
–Know a lot of meta-information [Dumais]
–Follow known paths [Capra]
Changes cause problems re-finding
–Dynamic menus [Shneiderman]
–Dynamic search result lists [White]
Relevance relative to expectation [Joachims]
Query Log Analysis
Previous log analysis studies
–People re-visit Web pages [Greenberg]
–Query logs: sessions [Jones]
Yahoo! log analysis
–114 people over the course of a year
–13,060 queries and their clicks
Can we identify re-finding behavior?
What happens when results change?
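Repeat queries and repeat clicks can be flagged with a single pass over each user's log in time order. This is a minimal sketch, assuming an illustrative (user, query, clicked URL) log format and simple lowercase normalization; the real log format and matching rules may differ.

```python
# Sketch: flag re-finding behavior in a query log. A query is a repeat
# if the same user issued the same (normalized) query before; a click is
# a repeat if that user already clicked the same URL. The log format and
# normalization here are illustrative assumptions.
from collections import defaultdict

def label_refinding(log):
    """log: iterable of (user, query, clicked_url), in time order.
    Yields (user, query, url, is_repeat_query, is_repeat_click)."""
    seen_queries = defaultdict(set)  # user -> normalized queries seen
    seen_clicks = defaultdict(set)   # user -> URLs already clicked
    for user, query, url in log:
        q = query.lower().strip()
        yield (user, query, url,
               q in seen_queries[user],
               url in seen_clicks[user])
        seen_queries[user].add(q)
        seen_clicks[user].add(url)
```

Counting the yielded flags gives the fractions of repeat queries and repeat clicks reported on the next slide.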
Re-Finding Common

                of queries   of repeat queries
Repeat query       40%             86%
Repeat click       33%             87%
Unique click       38%             26%
Change Reduces Re-Finding
Results change rank
Change reduces the probability of a repeat click
–No rank change: 88% chance
–Rank change: 53% chance
Why?
–Gone?
–Not seen?
–New results are better?
Change Slows Re-Finding
Time to click as a proxy for ease
Rank changes slow the repeat click
–Compared with initial search-to-click time
–No rank change: re-click is faster
–Rank change: re-click is slower
Changes interfere with re-finding
“Pick a card, any card.”
Case 1  Case 2  Case 3  Case 4  Case 5  Case 6
Your Card is GONE!
People Forget a Lot
Change Blindness
We still need magic!
Memorability Study
Participants issued a self-selected query
After an hour, asked to fill out a survey
129 people remembered something
Memorability a Function of Rank
Remembered Results Ranked High
Re:Search Engine Architecture
(Diagram: the user client's Web browser issues a query; the index of past queries maps it to past queries query 1 … query n with scores score 1 … score n; the result cache supplies their result lists result list 1 … result list n; the Merge component combines these, data from the user interaction cache, and fresh search engine results into the returned result list)
Components of the Re:Search Engine
–Index of past queries: query → (query 1 … query n, score 1 … score n)
–Result cache: result list 1 … result list n
–User interaction cache
–Merge algorithm: produces the merged result list
Index of Past Queries
Studied how queries differ
–Log analysis
–Survey of how people remember queries
Unimportant: case, stop words, word order
Likelihood of re-finding decreases with time
Get the user to tell us if they are re-finding
–Encourage recognition, not recall
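The findings above (case, stop words, and word order are unimportant) suggest indexing queries in a normalized form so that near-duplicates match. This is a minimal sketch; the stop-word list is an illustrative subset, not the thesis's.

```python
# Sketch: normalize queries so near-duplicates match in the index of
# past queries. Case, stop words, and word order are ignored, per the
# study findings; the stop-word list is an illustrative assumption.
STOP_WORDS = {"a", "an", "the", "of", "for", "to", "in", "and"}

def normalize(query: str) -> frozenset:
    """Reduce a query to an order-free set of non-stop-word terms."""
    terms = query.lower().split()
    return frozenset(t for t in terms if t not in STOP_WORDS)

def same_query(q1: str, q2: str) -> bool:
    """Two queries are treated as repeats if they normalize identically."""
    return normalize(q1) == normalize(q2)
```

Under this sketch, "Symptoms of Stomach Flu" and "stomach flu symptoms" match even though case, word order, and the stop word "of" differ.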
Merge Algorithm
Benefit-of-new-information score
–How likely a new result is to be useful…
–…in a particular rank
Memorability score
–How likely an old result is to be remembered…
–…in a particular rank
Choose the list that maximizes memorability and benefit of new information
Benefit of New Information
Ideal: use the search engine score
Approximation: use rank
Results that are ranked higher are more likely to be seen
–Greatest benefit comes from keeping highly ranked new results at high ranks
Memorability Score
How memorable is a result?
How likely is it to be remembered at a particular rank?
Choose Best Possible List
Consider every combination
Include at least three old and three new
Min-cost network flow problem
(Diagram: flow network from source s to sink t assigning old and new results to the slots, with memorability costs m1 … m10 and benefit costs b1 … b10 on the edges)
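The thesis formulates this as a min-cost network flow. As an illustration only, when each group is kept in its original order, a small dynamic program over the slots can pick the best assignment while enforcing the at-least-three-old/three-new constraint. Both scoring functions below are invented for the sketch, not taken from the thesis.

```python
# Sketch: merge "old" (memorable) and "new" (beneficial) results into a
# single 10-slot list containing at least three of each. The thesis
# solves the general case as a min-cost network flow; this dynamic
# program handles the simpler case where each group is placed in its
# original order. Both score functions are illustrative assumptions.
from functools import lru_cache

SLOTS = 10
MIN_OLD = MIN_NEW = 3

def memorability(old_rank: int, slot: int) -> float:
    # Assumed: an old result is most memorable near its remembered rank.
    return 1.0 / (1 + abs(old_rank - slot))

def benefit(new_rank: int, slot: int) -> float:
    # Assumed: highly ranked new results are most useful in high slots.
    return 1.0 / ((1 + new_rank) * (1 + slot))

def merge(old_ranks, new_ranks):
    """Assign ('old'|'new', original_rank) to each slot, maximizing the
    total memorability plus benefit-of-new-information score.
    Assumes each candidate list has at least SLOTS results."""

    @lru_cache(maxsize=None)
    def best(slot, used_old):
        used_new = slot - used_old
        if slot == SLOTS:
            ok = used_old >= MIN_OLD and used_new >= MIN_NEW
            return (0.0 if ok else float("-inf"), ())
        options = []
        if used_old < len(old_ranks):  # place the next old result here
            score, rest = best(slot + 1, used_old + 1)
            options.append((score + memorability(old_ranks[used_old], slot),
                            (("old", old_ranks[used_old]),) + rest))
        if used_new < len(new_ranks):  # place the next new result here
            score, rest = best(slot + 1, used_old)
            options.append((score + benefit(new_ranks[used_new], slot),
                            (("new", new_ranks[used_new]),) + rest))
        return max(options)

    return list(best(0, 0)[1])
```

For example, merge(list(range(10)), list(range(10))) produces a ten-slot mix that honors the three-old/three-new constraint; infeasible assignments are priced at negative infinity so the maximization never selects them.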
Evaluation
Does the merged list look unchanged?
–List recognition study
Does merging make re-finding easier?
–List interaction study
Is the search experience improved overall?
–Longitudinal study
List Interaction Study
42 participants
Two sessions, a day apart
–12 tasks each session
Tasks based on queries
Queries selected based on log analysis
–Session 1: initial search (“stomach flu”)
–Session 2: re-finding (“Symptoms of stomach flu?”) and new-finding (“What to expect at the ER?”)
Experimental Conditions
Six re-finding tasks
–Original result list
–Dumb merging
–Intelligent merging
Six new-finding tasks
–New result list
–Dumb merging
–Intelligent merging
(Example lists for each condition, interleaving old results Old 1 … Old 10 with new results New 1 … New 6)
Measures
Performance
–Correctness
–Time
Subjective
–Task difficulty
–Result quality
Experimental Conditions
Six re-finding tasks
–Original result list
–Dumb merging
–Intelligent merging
Six new-finding tasks
–New result list
–Dumb merging
–Intelligent merging
Faster, fewer clicks, more correct answers, and easier!
Similar to Session 1
Results: Re-Finding

                 Original   Dumb   Intelligent
% correct          99%       88%       96%
Time (seconds)      …         …         …
Task difficulty    1.57       …         …
Result quality      …         …         …
Similarity          …        60%       76%

Intelligent merging better than Dumb
Almost as good as the Original list
Results: New-Finding

                   New     Dumb   Intelligent
% correct          73%      74%       84%
Time (seconds)      …        …         …
Task difficulty     …        …         …
Result quality      …        …         …
Similarity         38%      50%       61%

Knowledge re-use can help
No difference between New and Intelligent
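The similarity rows compare each shown list against the list from Session 1. One simple way such a score could be computed (the thesis's actual measure may differ) is the fraction of slots holding the same result at the same rank:

```python
# Sketch: position-wise similarity between two result lists. This is an
# illustrative measure, not necessarily the one used in the thesis.
def list_similarity(original, shown):
    """Fraction of slots holding the same result at the same rank."""
    matches = sum(1 for a, b in zip(original, shown) if a == b)
    return matches / max(len(original), len(shown))
```

Under this measure an unchanged list scores 1.0, and a list sharing half its slots with the original scores 0.5.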
Results: Summary
Re-finding
–Intelligent merging better than Dumb
–Almost as good as the Original list
New-finding
–Knowledge re-use can help
–No difference between New and Intelligent
Intelligent merging: the best of both worlds
Conclusion
How people re-find
–People repeat searches
–Look for old and new
Finding and re-finding conflict
–Result changes cause problems
Personalized finding and re-finding tool
–Identify what is memorable
–Merge in new information
Future Work
Improve and generalize the model
–More sophisticated measures of memorability
–Other types of lists (inboxes, directory listings)
Effectively use the model
–Highlight change as well as hide it
Present change at the right time
–This talk’s focus: what and how
–What about when to display new information?
Thesis Overview
Supporting Finding
–How people find
–How individuals find
–Personalized finding tool
Supporting Re-Finding
–How people re-find
–Finding and re-finding conflict
–Personalized finding and re-finding tool
Thank You!
Jaime Teevan
David Karger (advisor), Mark Ackerman, Sue Dumais, Rob Miller (committee), Eytan Adar, Christine Alvarado, Eric Horvitz, Rosie Jones, and Michael Potts