Alexander Kotov (UIUC); Paul N. Bennett, Ryen W. White, Susan T. Dumais, Jaime Teevan (Microsoft Research)

Slide 2: Simple vs. Cross-Session Tasks

Simple search tasks:
- Composed of one or two continuous queries
- Contain short time intervals between related queries
- Completed within one search session

Cross-session search tasks:
- Composed of multiple non-continuous queries
- Cover several aspects
- Contain long time intervals between related queries
- Continue across several search sessions
- Constitute 10% of all sessions and 25% of all queries [Donato et al. WWW10]

Slide 3: Typical Cross-Session Tasks
- Event planning (e.g. wedding, vacation)
- Shopping research (e.g. electronics, real estate)
- Academic research
- Political/personality research
- How-to (How-do-I) research (fix a car, etc.)
- Medical self-diagnosis and treatment

Slide 4: Example of Cross-Session Tasks

Time | Query
1/22/2011 1:10pm | peanut butter recipes
1/22/2011 1:13pm | peanut butter cookies
1/22/2011 1:25pm | calories peanut butter cookies
1/22/2011 3:10pm | weather nyc
1/22/2011 3:11pm | peanut butter sandwiches
1/22/2011 3:15pm | nyc 10-day weather forecast
1/22/2011 3:16pm | pb&j
1/22/2011 3:18pm | fluffanutter
1/22/2011 3:19pm | fluffernutter
1/22/2011 6:15pm | sigir [...]
1/22/2011 6:17pm | sigir 2010 schedule
1/23/2011 3:17pm | nytimes
1/24/2011 3:00pm | flight status united 123
1/25/2011 3:29pm | foodtv
1/25/2011 3:31pm | famous pb&j drop recipe

Callouts on the example:
- Long time gap between related queries
- Interleaved with other tasks
- Task continuation
- Long periods focused on other tasks
- No shared terms with previous related queries

Slide 5: Supporting Cross-Session Tasks

Motivation:
- Relieve the cognitive burden of maintaining the context of search tasks
- Improve search results by reflecting the long-term intent
- Improve efficiency by pre-fetching the relevant content
- Provide support for task resumption

Prediction problems:
- Same Task: given a query, identify all previous queries on the same cross-session search task. Different models have been proposed for Same Task before [Jones&Klinkner CIKM08; Donato et al. WWW10].
- Task Continuation: given a search task for a user and the last query of the user on the search task, predict whether the user will return to it in the future.

Solution: classification framework
- Classifiers: Logistic Regression (LR) and MART

Slide 6: Labeling
Use query refinement clusters and a query graph with a similarity threshold to produce automatic labels.

Time | Query | Automatic Label | AutoDom | HumDom
1/22/2011 1:10pm | peanut butter recipes | - | x | x
1/22/2011 1:13pm | peanut butter cookies | peanut butter recipes | x | x
1/22/2011 1:25pm | calories peanut butter cookies | peanut butter recipes | x | x
1/22/2011 3:10pm | weather nyc | - | - | -
1/22/2011 3:11pm | peanut butter sandwiches | peanut butter recipes | x | x
1/22/2011 3:15pm | nyc 10-day weather forecast | weather nyc | - | -
1/22/2011 3:16pm | pb&j | - | - | x
1/22/2011 3:18pm | fluffanutter | - | - | x
1/22/2011 3:19pm | fluffernutter | - | - | x
1/22/2011 6:15pm | sigir [...] | - | - | -
1/22/2011 6:17pm | sigir 2010 schedule | sigir [...] | - | -
1/23/2011 3:17pm | nytimes | - | - | -
1/24/2011 3:00pm | flight status united 123 | - | - | -
1/25/2011 3:29pm | foodtv | - | - | x
1/25/2011 3:31pm | famous pb&j drop recipe | - | - | x

Slide 7: Labeling
Focus on early dominant tasks: two distinct queries labeled with the same task in the first two days.

Slide 8: Labeling
Human annotators correct the automatic labels for the dominant task (Cohen's kappa ranges from 0.86 to 0.92).
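The "early dominant task" criterion (two distinct queries labeled with the same task in the first two days) can be sketched in a few lines. This is only an illustration under stated assumptions: the `(timestamp, query, task_label)` record layout, the function name `early_dominant_task`, the choice to count the two-day window from the user's first logged query, and the rule of picking the qualifying task with the most distinct queries are all hypothetical and not taken from the slides.

```python
from collections import defaultdict
from datetime import datetime, timedelta


def early_dominant_task(log, window_days=2):
    """Return the label of the early dominant task, or None.

    `log` is a list of (timestamp, query, task_label) tuples; task_label is
    None for unlabeled queries. A task qualifies if at least two distinct
    queries carry its label within `window_days` of the first logged query.
    Assumption: among qualifying tasks, the one with the most distinct
    queries wins, with ties broken by label for determinism.
    """
    if not log:
        return None
    start = min(ts for ts, _, _ in log)
    cutoff = start + timedelta(days=window_days)

    queries_by_task = defaultdict(set)
    for ts, query, task in log:
        if task is not None and ts < cutoff:
            queries_by_task[task].add(query)

    candidates = {t: qs for t, qs in queries_by_task.items() if len(qs) >= 2}
    if not candidates:
        return None
    return max(candidates, key=lambda t: (len(candidates[t]), t))


# Hypothetical labels for a few rows of the example log.
example_log = [
    (datetime(2011, 1, 22, 13, 10), "peanut butter recipes", "peanut butter recipes"),
    (datetime(2011, 1, 22, 13, 13), "peanut butter cookies", "peanut butter recipes"),
    (datetime(2011, 1, 22, 15, 10), "weather nyc", "weather nyc"),
    (datetime(2011, 1, 22, 15, 11), "peanut butter sandwiches", "peanut butter recipes"),
    (datetime(2011, 1, 22, 15, 15), "nyc 10-day weather forecast", "weather nyc"),
    (datetime(2011, 1, 25, 15, 29), "foodtv", None),
]
print(early_dominant_task(example_log))
```

In this sketch the root query of a task is given its own label so that it counts toward the task; how the authors treated the root query is not stated on the slides.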

Slide 9: Datasets

Dataset | 10k | 3k | Human
Number of users | 10,852 | 3,376 | 1,218
Users returning to dominant | 1,694 | 1,[...] | [...]
Number of queries | 119,814 | 66,219 | 28,474
Query pairs | 1,486,492 | 866,860 | 660,120

- Sample 10k users from one week of browser-based logs of browsing and searching episodes
- 10k: 15% return to dominant
- 3k: 50% return to dominant (negatives downsampled)
- Human: editorial labels for a random sample
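Going from a set where about 15% of users return to the dominant task to a balanced set where 50% do can be done by downsampling negatives. The following is a minimal sketch of that idea, not the authors' procedure: the `(user_id, returned)` tuple layout, the function name, and the fixed random seed are assumptions made for illustration.

```python
import random


def downsample_negatives(examples, target_positive_rate=0.5, seed=0):
    """Keep all positives and a random subset of negatives.

    `examples` is a list of (user_id, returned) tuples, where `returned`
    is True if the user came back to the dominant task. Negatives are
    sampled so that positives make up roughly `target_positive_rate`
    of the result.
    """
    if not 0.0 < target_positive_rate <= 1.0:
        raise ValueError("target_positive_rate must be in (0, 1]")

    positives = [e for e in examples if e[1]]
    negatives = [e for e in examples if not e[1]]

    # Solve positives / (positives + kept) == target_positive_rate for kept.
    kept = round(len(positives) * (1.0 - target_positive_rate) / target_positive_rate)
    kept = min(kept, len(negatives))

    rng = random.Random(seed)
    balanced = positives + rng.sample(negatives, kept)
    rng.shuffle(balanced)
    return balanced


# Toy input: 15 returning users out of 100.
users = [(i, i < 15) for i in range(100)]
balanced = downsample_negatives(users)
print(len(balanced), sum(1 for _, returned in balanced if returned))
```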

Slide 10: Prediction Tasks

Slide 11: Same Task Features

Query-based:
- Descriptiveness: query length (terms/chars)
- Engagement: # clicks on top 10 results
- Examination: min/max position of clicked results

Session-based:
- Activity/Engagement: # queries/clicks/time since the beginning of a session
- Similarity: presence of same/subset/superset query in session

History-based:
- Activity/Engagement: # sessions/queries/clicks in history
- Similarity: presence of same/subset/superset query in history

Pair-wise:
- Similarity: # overlapping terms, Jaccard coefficient, Levenshtein edit distance, equal/subset/superset, co-clicked URLs
- Time: time between two queries, same session
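The lexical pair-wise similarity features listed above (overlapping terms, Jaccard coefficient, Levenshtein edit distance, equal/subset/superset) can be illustrated with a short sketch. The function and key names are hypothetical, and the choices of whitespace tokenization, lowercasing, and computing edit distance over characters rather than terms are assumptions; the slides do not specify them.

```python
def levenshtein(a, b):
    """Character-level edit distance via the standard two-row DP."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(
                prev[j] + 1,               # delete ca
                cur[j - 1] + 1,            # insert cb
                prev[j - 1] + (ca != cb),  # substitute (free if equal)
            ))
        prev = cur
    return prev[-1]


def pairwise_features(q1, q2):
    """Lexical similarity features for a pair of queries."""
    q1, q2 = q1.lower(), q2.lower()
    t1, t2 = set(q1.split()), set(q2.split())
    overlap, union = t1 & t2, t1 | t2
    return {
        "num_terms_overlap": len(overlap),
        "jaccard": len(overlap) / len(union) if union else 0.0,
        "levenshtein": levenshtein(q1, q2),
        "equal": t1 == t2,
        "subset": t1 < t2,    # q1's terms are a proper subset of q2's
        "superset": t1 > t2,  # q1's terms are a proper superset of q2's
    }


print(pairwise_features("peanut butter recipes", "peanut butter cookies"))
print(levenshtein("fluffanutter", "fluffernutter"))
```

Note that a pair such as "pb&j" and "peanut butter sandwiches" scores zero on every term-based feature here, which is why the slides also list non-lexical signals such as co-clicked URLs and time between queries.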

Slide 12: Same Task Results
- Baseline: LR using only Levenshtein distance
- The two classifiers show similar levels of accuracy on auto-labeled data
- Performance decreases on human-labeled data
- LR notably dominates MART on human-labeled data

Results table (values not preserved in this transcript): columns BASE, LR and MART for each of the 3k, 10k and Human datasets; rows Recall, Precision, Accuracy and F1 (macro statistics).
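For reference, the four measures reported in the results table (recall, precision, accuracy, F1) can be computed from binary labels and predictions as sketched below. The function name is hypothetical, and the sketch computes the measures over a single set of examples; how the "macro statistics" on the slide were averaged is not stated and is not reproduced here.

```python
def binary_metrics(y_true, y_pred):
    """Recall, precision, accuracy and F1 for binary labels (truthy = positive)."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")

    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t and p)
    fp = sum(1 for t, p in pairs if not t and p)
    fn = sum(1 for t, p in pairs if t and not p)
    tn = sum(1 for t, p in pairs if not t and not p)

    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    accuracy = (tp + tn) / len(pairs) if pairs else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {"recall": recall, "precision": precision,
            "accuracy": accuracy, "f1": f1}


# Toy labels: 3 positives, 5 negatives.
print(binary_metrics([1, 1, 1, 0, 0, 0, 0, 0],
                     [1, 1, 0, 1, 0, 0, 0, 0]))
```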

Slide 13: Same Task P-R Curves
- LR dominates MART at the low-recall/high-precision end of the curves
- LR outperforms MART on the human-labeled data in the area of optimal F1

Slide 14: Same Task Feature Importance
- Pair-wise features are more prevalent
- Term overlap features are among the strongest signals
- Same-task queries are morphologically similar
- Long, descriptive query terms are indicative of cross-session tasks

Feature | Weight
QueryTermsJac | 1.44
NumQueryChars | [...]
NumTermsOver | 0.93
NumQueryChars | [...]
SameSess | 0.52
HaveCoClickDom | 0.40
NumQueriesSess | [...]
SubQuerySess | [...]
NumQueryTerms | [...]
NumQueriesHist | [...]
NumQueryTerms | [...]
LevenDist | -0.84

Weights marked [...] did not survive in this transcript.
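To make the role of these weights concrete, here is a hedged sketch of how a logistic regression model turns feature values into a same-task probability: a weighted sum passed through the sigmoid function. Only the five weights that are legible on the slide are used, and their pairing with feature names is as read from the garbled transcript. The bias term (set to 0 here), the feature scaling, and the example feature values are assumptions, so the resulting numbers are illustrative and do not reproduce the authors' model.

```python
import math

# Weights as read from the slide; features whose weights were lost are omitted.
SAME_TASK_WEIGHTS = {
    "QueryTermsJac": 1.44,
    "NumTermsOver": 0.93,
    "SameSess": 0.52,
    "HaveCoClickDom": 0.40,
    "LevenDist": -0.84,
}


def same_task_probability(features, weights=SAME_TASK_WEIGHTS, bias=0.0):
    """Logistic regression score: sigmoid(bias + sum of weight * feature value).

    `features` maps feature names to (already scaled) values; missing
    features count as 0. The bias is an assumption, not from the slides.
    """
    z = bias + sum(w * features.get(name, 0.0) for name, w in weights.items())
    # Numerically stable sigmoid for large |z|.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


# Hypothetical, pre-scaled feature values for two query pairs.
similar_pair = {"QueryTermsJac": 0.5, "NumTermsOver": 0.5, "SameSess": 1.0, "LevenDist": 0.2}
distant_pair = {"QueryTermsJac": 0.0, "NumTermsOver": 0.0, "SameSess": 0.0, "LevenDist": 1.0}
print(same_task_probability(similar_pair) > same_task_probability(distant_pair))
```

The signs match the slide's reading: higher term overlap pushes the score toward "same task", while a larger edit distance pushes it away.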

Slide 15: Task Continuation Features
- Same query, session and history features as Same Task
- Session-based and History-based versions of:
  - Engagement: avg. time between pairs of queries
  - Satisfaction: # queries with dwell time of more than 30 secs; # clicked queries / # queries
  - Complexity: avg. number of unique terms per query
  - Task relatedness: # co-clicked URLs with the same domain
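Several of the engagement, satisfaction and complexity features above are simple aggregates over a window of queries (a session or the whole history). The sketch below shows one way to compute them. The `QueryEvent` record, its field names, and the output key names are hypothetical; it also assumes that "time between pairs of queries" means the gap between consecutive queries and that a query counts toward the dwell feature if any of its clicks has a dwell time above 30 seconds.

```python
from dataclasses import dataclass, field
from datetime import datetime
from typing import List


@dataclass
class QueryEvent:
    time: datetime
    query: str
    # Dwell time in seconds for each clicked result (empty if no clicks).
    click_dwell_secs: List[float] = field(default_factory=list)


def continuation_features(events):
    """Aggregate features over a list of QueryEvent (a session or a history)."""
    events = sorted(events, key=lambda e: e.time)
    n = len(events)
    gaps = [(b.time - a.time).total_seconds() for a, b in zip(events, events[1:])]
    return {
        # Engagement: average gap between consecutive queries, in seconds.
        "avg_inter_query_secs": sum(gaps) / len(gaps) if gaps else 0.0,
        # Satisfaction: queries with at least one click dwelled on for > 30 s.
        "num_dwell_30": sum(
            1 for e in events if any(d > 30 for d in e.click_dwell_secs)),
        # Satisfaction: fraction of queries with at least one click.
        "clicked_query_ratio": (
            sum(1 for e in events if e.click_dwell_secs) / n if n else 0.0),
        # Complexity: average number of unique terms per query.
        "avg_unique_terms": (
            sum(len(set(e.query.lower().split())) for e in events) / n if n else 0.0),
    }


# Hypothetical click data attached to three queries from the example log.
session = [
    QueryEvent(datetime(2011, 1, 22, 13, 10), "peanut butter recipes", [45.0]),
    QueryEvent(datetime(2011, 1, 22, 13, 13), "peanut butter cookies"),
    QueryEvent(datetime(2011, 1, 22, 13, 25), "calories peanut butter cookies", [10.0, 31.0]),
]
print(continuation_features(session))
```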

Slide 16: Task Continuation Results
- Baseline: LR using only NumQueriesHist
- The two classifiers perform similarly on all datasets
- Recall and precision decrease substantially when moving from the smaller, balanced auto-labeled dataset (3k) to the larger, unbalanced one (10k)
- Recall improves significantly for both classifiers on manually corrected labels

Results table (values not preserved in this transcript): columns BASE, LR and MART for each of the 3k, 10k and Human datasets; rows Recall, Precision, Accuracy and F1.

Slide 17: Task Continuation P-R Curves
- Performance decreases significantly on the 10k dataset
- For human-labeled data, MART has a slight advantage in the low-recall/high-precision region

Slide 18: Task Continuation Feature Importance
- Re-finding is common
- Features related to the dominant task are important
- Complexity of the information need and close examination of results are indicative of task continuation
- Users who search frequently and deeply are likely to use search for complex tasks

Feature | Weight
SameQueryHist | 1.11
NumSessHist | 0.60
NumDomQueriesHist | 0.39
AvgInterQTimeHist | 0.24
FreqDomQueriesHist | 0.24
NumDwell30Hist | 0.22
NumQueryHist | 0.21
NumTop10Clicks | [...]
NumClicksHist | [...]
NumQueryChars | [...]
SubQueryHist | [...]
SupQuerySess | -0.40

Weights marked [...] did not survive in this transcript.

Slide 19: Summary
- Addressed the important problem of predicting cross-session task continuation
- Designed large feature sets reflecting query descriptiveness, user engagement, examination depth, user activity, query similarity to previous history, time dependency, task complexity, and user satisfaction
- Developed feature representations and learning techniques that can be used to accurately predict:
  - Whether two queries are on the same task
  - Whether a user will resume a task in a future session
- Analyzed feature contributions. Of particular note:
  - Long, descriptive query terms are indicative of cross-session tasks
  - The complexity of the information need and close examination of results are indicative of task resumption