
Slide 1: Learning User Interaction Models for Predicting Web Search Result Preferences
Eugene Agichtein, Eric Brill, Susan Dumais, Robert Ragno
Microsoft Research

Slide 2: User Interactions
- Goal: harness rich user interactions with search results to improve the quality of search
- Millions of users submit queries daily and interact with the search results
  – Clicks, query refinement, dwell time
- User interactions with search engines are plentiful, but require careful interpretation
- We will predict user preferences for results

Slide 3: Related Work
- Linking implicit interactions and explicit judgments
  – Fox et al. [TOIS 2005]: predict explicit satisfaction rating
  – Joachims et al. [SIGIR 2005]: predict preference (gaze studies, interpretation strategies)
- A broader overview of analyzing implicit interactions: Kelly & Teevan [SIGIR Forum 2003]

Slide 4: Outline
- Distributional model of user interactions
  – User behavior = relevance + "noise"
- Rich set of user interaction features
- Learning framework to predict user preferences
- Large-scale evaluation

Slide 5: Interpreting User Interactions
- Clickthrough and subsequent browsing behavior of individual users is influenced by many factors
  – Relevance of a result to a query
  – Visual appearance and layout
  – Result presentation order
  – Context, history, etc.
- General idea:
  – Aggregate interactions across all users and queries
  – Compute "expected" behavior for any query/page
  – Recover the relevance signal for a given query

Slide 6: Case Study: Clickthrough
- Clickthrough frequency for all queries in the sample
- Model: Clickthrough(query q, document d, result position p) = expected(p) + relevance(q, d)

Slide 7: Clickthrough for Queries with Known Position of Top Relevant Result
- Relative clickthrough for queries whose top relevant result is known to be at position 1

Slide 8: Clickthrough for Queries with Known Position of Top Relevant Result
- Relative clickthrough for queries with known relevant results at positions 1 and 3, respectively
- Clickthrough is higher at the top non-relevant document than at the top relevant document

Slide 9: Deviation from Expected
- Relevance component: deviation from the "expected" behavior:
  relevance(q, d) = observed clickthrough(q, d, p) - expected(p)
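The expected/observed correction above can be sketched in code. This is an illustrative reconstruction, not the paper's implementation: `position_prior` and `relevance_component` are hypothetical names, and the position prior is estimated naively by pooling click frequencies across all queries.

```python
# Estimate the position prior from aggregate clicks, then subtract it
# from each result's observed clickthrough rate to recover the
# relevance component. Names and estimation details are illustrative.
from collections import defaultdict

def position_prior(click_log):
    """click_log: iterable of (query, doc, position, clicked) tuples.
    Returns expected clickthrough rate per position, pooled over queries."""
    clicks, views = defaultdict(int), defaultdict(int)
    for _, _, pos, clicked in click_log:
        views[pos] += 1
        clicks[pos] += int(clicked)
    return {p: clicks[p] / views[p] for p in views}

def relevance_component(click_log):
    """relevance(q, d) = observed CTR(q, d) - expected(position of d for q)."""
    expected = position_prior(click_log)
    clicks, views, pos_of = defaultdict(int), defaultdict(int), {}
    for q, d, pos, clicked in click_log:
        views[(q, d)] += 1
        clicks[(q, d)] += int(clicked)
        pos_of[(q, d)] = pos
    return {qd: clicks[qd] / views[qd] - expected[pos_of[qd]]
            for qd in views}
```

A result clicked more often than its position alone predicts gets a positive score; a top-ranked result that users skip gets a negative one.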

Slide 10: Beyond Clickthrough: Rich User Interaction Space
- Observed and distributional features
  – Observed features: values aggregated over all user interactions for each query and result pair
  – Distributional features: deviations from the "expected" behavior for the query
- Represent user interactions as vectors in "behavior space"
  – Presentation: what a user sees before a click
  – Clickthrough: frequency and timing of clicks
  – Browsing: what users do after the click

Slide 11: Some User Interaction Features

Presentation
  ResultPosition       Position of the URL in the current ranking
  QueryTitleOverlap    Fraction of query terms in the result title
Clickthrough
  DeliberationTime     Seconds between the query and the first click
  ClickFrequency       Fraction of all clicks landing on the page
  ClickDeviation       Deviation from the expected click frequency
Browsing
  DwellTime            Result page dwell time
  DwellTimeDeviation   Deviation from the expected dwell time for the query
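For illustration, the feature groups in the table above could be collected into one behavior-space vector per query-result pair. The field layout below is an assumed sketch mirroring the slide's table, not the paper's actual schema.

```python
# One behavior-space feature vector per (query, result) pair.
# Field names follow the slide's feature table; this is illustrative.
from dataclasses import dataclass, astuple

@dataclass
class BehaviorFeatures:
    # Presentation
    result_position: int
    query_title_overlap: float
    # Clickthrough
    deliberation_time: float
    click_frequency: float
    click_deviation: float
    # Browsing
    dwell_time: float
    dwell_time_deviation: float

    def as_vector(self):
        """Flatten to a plain list, e.g. for a learner's input layer."""
        return list(astuple(self))
```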

Slide 12: Outline
- Distributional model of user interactions
- Rich set of user interaction features
- Models for predicting user preferences
- Experimental results

Slide 13: Predicting Result Preferences
- Task: predict pairwise preferences
  – A user will prefer result A > result B
- Models for preference prediction
  – Current search engine ranking
  – Clickthrough
  – Full user behavior model

Slide 14: Clickthrough Model
- SA+N: "Skip Above" and "Skip Next"
  – Adapted from Joachims et al. [SIGIR 2005]
  – Motivated by gaze tracking
- Example (results ranked 1-8, clicks on results 2 and 4):
  – Skip Above: 4 > (1, 3), 2 > 1
  – Skip Next: 4 > 5, 2 > 3
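The Skip Above / Skip Next strategies on this slide can be sketched as a small function; the name and interface are illustrative.

```python
# Derive pairwise preferences from one result list with clicks,
# per the Skip Above / Skip Next rules: a clicked result is preferred
# over every unclicked result above it, and over the unclicked result
# immediately below it.
def skip_above_next(clicked_positions, num_results):
    """clicked_positions: 1-based positions that were clicked.
    Returns a set of (preferred, over) position pairs."""
    clicked = set(clicked_positions)
    prefs = set()
    for pos in clicked_positions:
        # Skip Above: preferred over skipped results ranked higher
        for above in range(1, pos):
            if above not in clicked:
                prefs.add((pos, above))
        # Skip Next: preferred over the unclicked result just below
        nxt = pos + 1
        if nxt <= num_results and nxt not in clicked:
            prefs.add((pos, nxt))
    return prefs
```

On the slide's example (clicks on 2 and 4 in a list of 8), this yields exactly the pairs shown: 4 > (1, 3, 5) and 2 > (1, 3).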

Slide 15: Distributional Model
- CD: distributional model, extends SA+N
  – A clickthrough counts only if its frequency exceeds the expected frequency by more than ε
- Example: the click on result 2 was likely "by chance"
  – CD infers 4 > (1, 2, 3, 5), but not 2 > (1, 3)
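A sketch of the CD filtering step, assuming a hypothetical position prior `expected` and threshold `epsilon`; clicks discounted as chance are treated as unclicked, so they can still be skipped over by a trusted click below them.

```python
# CD sketch: keep only clicks whose observed frequency exceeds the
# position's expected frequency by more than epsilon, then apply the
# skip-above / skip-next rules over the trusted clicks only.
def distributional_prefs(click_freq, expected, num_results, epsilon=0.1):
    """click_freq: {position: observed click frequency}.
    expected: {position: expected click frequency}."""
    trusted = {p for p, f in click_freq.items()
               if f - expected.get(p, 0.0) > epsilon}
    prefs = set()
    for pos in sorted(trusted):
        for above in range(1, pos):
            if above not in trusted:
                prefs.add((pos, above))
        nxt = pos + 1
        if nxt <= num_results and nxt not in trusted:
            prefs.add((pos, nxt))
    return prefs
```

In the slide's example, the weak click on result 2 is filtered out, so the trusted click on 4 produces 4 > (1, 2, 3, 5) while no 2 > (1, 3) pairs are emitted.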

Slide 16: User Behavior Model
- Full set of interaction features
  – Presentation, clickthrough, browsing
- Train the model with explicit judgments
  – Input: behavior feature vectors for each query-page pair in the rated results
  – Use RankNet (Burges et al. [ICML 2005]) to discover model weights
  – Output: a neural net that can assign a "relevance" score to a behavior feature vector

Slide 17: RankNet for User Behavior
- RankNet: general, scalable, robust neural net training algorithms and implementation
- Optimized for ranking: predicting an ordering of items, not a score for each
- Trains on pairs, where the first point is to be ranked higher than or equal to the second
  – Extremely efficient
  – Uses a cross-entropy cost (probabilistic model)
  – Uses gradient descent to set weights
  – Restarts to escape local minima
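RankNet's pairwise cross-entropy cost can be illustrated with a simple linear scorer. This is a minimal sketch of the cost and its gradient for one training pair, not the paper's neural net implementation; function and variable names are illustrative.

```python
# For a pair where item i should rank above item j, RankNet models
# P(i > j) = sigmoid(s_i - s_j) and minimizes the cross-entropy cost,
# which for target probability 1 reduces to log(1 + exp(-(s_i - s_j))).
import math

def ranknet_pair_cost(s_i, s_j):
    """Cross-entropy cost for a pair with target P(i > j) = 1."""
    return math.log(1.0 + math.exp(-(s_i - s_j)))

def pair_gradient(x_i, x_j, w):
    """Gradient of the cost w.r.t. linear weights w, for one pair
    of feature vectors x_i (preferred) and x_j."""
    s_i = sum(wk * xk for wk, xk in zip(w, x_i))
    s_j = sum(wk * xk for wk, xk in zip(w, x_j))
    lam = -1.0 / (1.0 + math.exp(s_i - s_j))  # d cost / d (s_i - s_j)
    return [lam * (xi - xj) for xi, xj in zip(x_i, x_j)]
```

Gradient descent steps against this gradient push the preferred item's score up relative to the other; the cost approaches zero as s_i exceeds s_j by a wide margin.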

Slide 18: Outline
- Distributional model of user interactions
- Rich set of user interaction features
- Models for predicting user preferences
- Experimental evaluation

Slide 19: Evaluation Metrics
- Task: predict user preferences
- Pairwise agreement:
  – For comparison with previous work
  – Useful for ranking and other applications
- Precision for a query: fraction of predicted pairs that agree with preferences derived from human ratings
- Recall for a query: fraction of human-rated preferences predicted correctly
- Precision and recall averaged across all queries
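The per-query precision and recall defined above reduce to set overlap between predicted and human-derived preference pairs; a minimal sketch:

```python
# Pairwise precision/recall for one query. `predicted` and `gold` are
# sets of (preferred, over) pairs; gold pairs come from human ratings.
def pair_precision_recall(predicted, gold):
    agree = len(predicted & gold)
    precision = agree / len(predicted) if predicted else 0.0
    recall = agree / len(gold) if gold else 0.0
    return precision, recall
```

Averaging these values over all queries gives the paper's summary numbers.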

Slide 20: Datasets
- Explicit judgments
  – 3,500 queries, top 10 results; relevance ratings converted to pairwise preferences for each query
- User behavior data
  – Opt-in client-side instrumentation
  – Anonymized user ID, time, visited page
  – Detects queries submitted to the MSN Search engine and the subsequently visited pages
- 120,000 instances of these 3,500 queries, each submitted at least 2 times over 21 days

Slide 21: Methods Compared
Preferences inferred by:
- Current search engine ranking (baseline)
  – Result i preferred over result j iff i is ranked above j
- Clickthrough model: SA+N
- Clickthrough distributional model: CD
- Full user behavior model: UserBehavior

Slide 22: Results: Predicting User Preferences
- Baseline < SA+N < CD << UserBehavior
- Rich user behavior features result in a dramatic improvement

Slide 23: Contribution of Feature Types
- Presentation features are not helpful
- Browsing features: higher precision, lower recall
- Clickthrough features > CD, due to learning

Slide 24: Amount of Interaction Data
- Prediction accuracy for varying amounts of user interaction per query
- Slight increase in recall, substantial increase in precision

Slide 25: Learning Curve
- Minimum precision of 0.7
- Recall increases substantially with more days of user interactions

Slide 26: Experiments Summary
- Clickthrough distributional model: more accurate than previously published work
- Rich user behavior features: dramatic accuracy improvement
- Accuracy increases for frequent queries and longer observation periods

Slide 27: Some Applications
- Web search ranking (next talk):
  – Can use preference predictions to re-rank results
  – Can integrate features into ranking algorithms
- Identifying and answering navigational queries
  – Can tune the model to focus on the top result
  – Supports classification or ranking methods
  – Details in Agichtein & Zheng [KDD 2006]
- Automatic evaluation: augment explicit relevance judgments

Slide 28: Conclusions
- General framework for training rich user interaction models
- Robust techniques for inferring user relevance preferences
- High-accuracy preference prediction in a large-scale evaluation

Slide 29: Thank You
- Text Mining, Search, and Navigation group: http://research.microsoft.com/tmsn/
- Adaptive Systems and Interaction group: http://research.microsoft.com/adapt/
Microsoft Research

Slide 30: Presentation Features
- Query terms in title, summary, URL
- Position of result
- Length of URL
- Depth of URL
- …

Slide 31: Clickthrough Features
- Fraction of clicks on the URL
- Deviation from "expected" given the result position
- Time to click
- Time to first click in "session"
- Deviation from the average time for the query

Slide 32: Browsing Features
- Time on URL
- Cumulative time on URL (CuriousBrowser)
- Deviation from the average time on URL
  – Averaged over the "user"
  – Averaged over all results for the query
- Number of subsequent non-result URLs

Slide 33: An Intelligent Baseline

