Understanding Query Ambiguity Jaime Teevan, Susan Dumais, Dan Liebling Microsoft Research

“grand copthorne waterfront”

“singapore”

How Do the Two Queries Differ? grand copthorne waterfront v. singapore
Knowing query ambiguity allows us to:
– Personalize or diversify when appropriate
– Suggest more specific queries
– Help people understand diverse result sets

Understanding Ambiguity
Look at measures of query ambiguity
– Explicit
– Implicit
Explore challenges with the measures
– Do implicit measures predict explicit ones?
– What other factors impact the observed variation?
Build a model to predict ambiguity
– Using just the query string, or also the result set
– Using query history, or not

Related Work
Predicting how a query will perform
– Clarity [Cronen-Townsend et al. 2002]
– Jensen-Shannon divergence [Carmel et al. 2006]
– Weighted information gain [Zhou & Croft 2007]
 Performance for individual versus aggregate
Exploring query ambiguity
– Many factors affect relevance [Fidel & Crandall 1997]
– Click entropy [Dou et al. 2007]
 Explicit and implicit data, build predictive models
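
Query clarity, the first of these predictors, scores a query by the KL divergence between a language model estimated from the query's top-ranked results and the background collection model; higher divergence suggests a more focused, less ambiguous query. Below is a minimal sketch under simplifying assumptions: plain token counts stand in for the language models, and `top_docs` / `collection_docs` are hypothetical lists of result texts, not anything from the original talk.

```python
import math
from collections import Counter

def clarity_score(top_docs, collection_docs, smoothing=1e-9):
    """Simplified query clarity: KL divergence between the language model of a
    query's top-ranked results and the whole-collection language model."""
    query_lm = Counter()
    for doc in top_docs:                    # top results approximate P(w | query)
        query_lm.update(doc.lower().split())
    coll_lm = Counter()
    for doc in collection_docs:             # background model P(w | collection)
        coll_lm.update(doc.lower().split())

    q_total = sum(query_lm.values())
    c_total = sum(coll_lm.values())
    kl = 0.0
    for word, count in query_lm.items():
        p_q = count / q_total
        p_c = coll_lm.get(word, 0) / c_total + smoothing
        kl += p_q * math.log2(p_q / p_c)
    return kl                               # higher = clearer, less ambiguous query
```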

Measuring Ambiguity
Inter-rater reliability (Fleiss' kappa)
– Observed agreement (P_a) exceeds expected (P_e)
– κ = (P_a - P_e) / (1 - P_e)
Relevance entropy
– Variability in probability result is relevant (P_r)
– S = -Σ P_r log P_r
Potential for personalization
– Ideal group ranking differs from ideal personal ranking
– P4P = 1 - nDCG_group
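
A minimal sketch of the last two measures, with assumed inputs not taken from the talk: `rel_probs` is a list of per-result relevance probabilities, and `per_user_gains` maps each user to a dict of result → graded gain (e.g. 0/1/2 for irrelevant/relevant/highly relevant). Ordering results by summed gain is used here as an approximation of the ideal group ranking.

```python
import math

def relevance_entropy(rel_probs):
    """S = -sum P_r log P_r over each result's probability of being relevant,
    as defined on the slide."""
    return -sum(p * math.log2(p) for p in rel_probs if p > 0)

def ndcg(ranking, gains, k=10):
    """nDCG of a ranking (list of result ids) against one user's gain dict."""
    dcg = sum(gains.get(r, 0) / math.log2(i + 2) for i, r in enumerate(ranking[:k]))
    ideal = sorted(gains.values(), reverse=True)[:k]
    idcg = sum(g / math.log2(i + 2) for i, g in enumerate(ideal))
    return dcg / idcg if idcg > 0 else 0.0

def potential_for_personalization(per_user_gains, k=10):
    """P4P = 1 - average nDCG of a single group ranking; the group ranking here
    orders results by gain summed across users (an approximation)."""
    totals = {}
    for gains in per_user_gains.values():
        for result, g in gains.items():
            totals[result] = totals.get(result, 0) + g
    group_ranking = sorted(totals, key=totals.get, reverse=True)
    avg_ndcg = sum(ndcg(group_ranking, gains, k)
                   for gains in per_user_gains.values()) / len(per_user_gains)
    return 1.0 - avg_ndcg
```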

Collecting Explicit Relevance Data
Variation in explicit relevance judgments
– Highly relevant, relevant, or irrelevant
– Personal relevance (versus generic relevance)
12 unique queries, 128 users
– Challenge: Need different people, same query
– Solution: Given query list, choose most interesting
292 query result sets evaluated
– 4 to 81 evaluators per query

Collecting Implicit Relevance Data
Variation in clicks
– Proxy (click = relevant, not clicked = irrelevant)
– Other implicit measures possible
– Disadvantage: Can mean lots of things, biased
– Advantage: Real tasks, real situations, lots of data
44k unique queries issued by 1.5M users
– Minimum 10 users/query
2.5 million result sets “evaluated”
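
With clicks as the relevance proxy, the ambiguity measure used later in the talk is click entropy over which results a query's users click. A minimal sketch, assuming the log is simply an iterable of (query, clicked_url) pairs and that queries with fewer than 10 users have already been filtered as in the talk's dataset; function names and the example URLs are illustrative, not from the original work.

```python
import math
from collections import Counter, defaultdict

def click_entropy(log):
    """Click entropy per query: -sum over clicked URLs of P(url|q) log P(url|q),
    where P(url|q) is the fraction of the query's clicks landing on that URL."""
    clicks = defaultdict(Counter)
    for query, url in log:                  # log: iterable of (query, clicked_url)
        clicks[query][url] += 1

    entropies = {}
    for query, url_counts in clicks.items():
        total = sum(url_counts.values())
        entropies[query] = -sum((c / total) * math.log2(c / total)
                                for c in url_counts.values())
    return entropies

# Illustrative log: a hotel-name query concentrates clicks, "singapore" spreads them.
log = [("grand copthorne waterfront", "hotel-homepage.example")] * 9 + \
      [("singapore", "encyclopedia.example/singapore")] * 3 + \
      [("singapore", "tourism.example")] * 3 + \
      [("singapore", "government.example")] * 3
print(click_entropy(log))   # the ambiguous query gets the higher entropy
```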

How Good are Implicit Measures?
Explicit data is expensive
Is implicit data a good substitute?
Compared queries with
– Explicit judgments and
– Implicit judgments
Significantly correlated:
– Correlation coefficient = 0.77 (p<.01)
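
The comparison amounts to correlating, over the queries judged both ways, the explicit ambiguity score with the implicit one. A minimal sketch assuming scipy is available and that `explicit_scores` and `implicit_scores` are parallel per-query lists (the values below are placeholders, not the talk's data).

```python
from scipy.stats import pearsonr

# One ambiguity score per query under each measure (placeholder values).
explicit_scores = [0.9, 0.2, 0.6, 0.8, 0.1]   # e.g. entropy of explicit judgments
implicit_scores = [0.8, 0.3, 0.5, 0.9, 0.2]   # e.g. click entropy for the same queries

r, p_value = pearsonr(explicit_scores, implicit_scores)
print(f"correlation = {r:.2f}, p = {p_value:.3f}")  # the talk reports r = 0.77, p < .01
```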

Which Has Lower Click Entropy?
– v. federal government jobs
– find phone number v. msn live search
– singapore pools v. singaporepools.com
Click entropy = 1.5 v. 2.0
Result entropy = 5.7 v. 10.7
 Results change

Which Has Lower Click Entropy?
– v. federal government jobs
– find phone number v. msn live search
– singapore pools v. singaporepools.com
– tiffany v. tiffany’s
– nytimes v. connecticut newspapers
Click entropy = 2.5 v. 1.0
Click position = 2.6 v. 1.6
 Results change
 Result quality varies

Which Has Lower Click Entropy?
– v. federal government jobs
– find phone number v. msn live search
– singapore pools v. singaporepools.com
– tiffany v. tiffany’s
– nytimes v. connecticut newspapers
– campbells soup recipes v. vegetable soup recipe
– soccer rules v. hockey equipment
Click entropy = 1.7 v. 2.2
Clicks/user = 1.1 v. 2.1
 Results change
 Result quality varies
 Task affects # of clicks
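
The confounds contrasted with click entropy on these slides are themselves simple log statistics. A minimal sketch, assuming each log record carries the query, an anonymous user id, and the rank position of the clicked result; the field layout and function name are assumptions for illustration.

```python
from collections import defaultdict

def confound_stats(log):
    """Per-query average click position and clicks per user, the two confounds
    contrasted with click entropy on the slides."""
    positions = defaultdict(list)                   # query -> list of clicked ranks
    users = defaultdict(lambda: defaultdict(int))   # query -> user -> click count
    for query, user, rank in log:                   # log: iterable of (query, user_id, clicked_rank)
        positions[query].append(rank)
        users[query][user] += 1

    stats = {}
    for query in positions:
        avg_position = sum(positions[query]) / len(positions[query])
        clicks_per_user = sum(users[query].values()) / len(users[query])
        stats[query] = {"avg_click_position": avg_position,
                        "clicks_per_user": clicks_per_user}
    return stats
```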

Challenges with Using Click Data
– Results change at different rates
– Result quality varies
– Task affects the number of clicks
– We don’t know click data for unseen queries
 Can we predict query ambiguity?

Predicting Ambiguity
Information available without history:
– Query: Query length, Contains URL, Contains advanced operator, Time of day issued, Number of results (df), Number of query suggests
– Results: Query clarity, ODP category entropy, Number of ODP categories, Portion of non-HTML results, Portion of results from .com/.edu, Number of distinct domains

Predicting Ambiguity
Information available, without and with query history:
– Query (no history): Query length, Contains URL, Contains advanced operator, Time of day issued, Number of results (df), Number of query suggests
– Query (with history): Reformulation probability, # of times query issued, # of users who issued query, Avg. time of day issued, Avg. number of results, Avg. number of query suggests
– Results (no history): Query clarity, ODP category entropy, Number of ODP categories, Portion of non-HTML results, Portion of results from .com/.edu, Number of distinct domains
– Results (with history): Result entropy, Avg. click position, Avg. seconds to click, Avg. clicks per user, Click entropy, Potential for personalization

Prediction Quality
– All features = good prediction: 81% accuracy (↑ 220%)
– Just query features promising: 40% accuracy (↑ 57%)
– No boost from adding result or history features
[Decision tree figure: splits on Contains URL, ads, and query length; leaves predict ambiguity from Very Low to High]
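
The accuracy figures come from a classifier trained on the feature sets listed on the previous slides, with the query-only model shown as a small decision tree. A minimal sketch of that setup using scikit-learn; the feature values, labels, and feature names below are placeholders standing in for the talk's data, and binned click entropy is assumed as the ambiguity label.

```python
from sklearn.tree import DecisionTreeClassifier, export_text

# Query-only features per query (placeholder values):
# [query length in words, contains a URL fragment (0/1), number of ads shown]
X = [
    [1, 1, 0],
    [3, 0, 4],
    [1, 0, 1],
    [4, 0, 0],
    [2, 1, 2],
    [1, 0, 3],
]
# Ambiguity label per query, e.g. click entropy binned into low / medium / high.
y = ["low", "high", "medium", "medium", "low", "high"]

model = DecisionTreeClassifier(max_depth=3, random_state=0).fit(X, y)
print(export_text(model, feature_names=["length", "contains_url", "num_ads"]))
print(model.predict([[1, 1, 0]]))   # predicted ambiguity for a new URL-like query
```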

Summarizing Ambiguity
Looked at measures of query ambiguity
– Implicit measures approximate explicit
– Confounds: result entropy, result quality, task
Built a model to predict ambiguity
These results can help search engines
– Personalize when appropriate
– Suggest more specific queries
– Help people understand diverse result sets
Looking forward: What about the individual?

THANK YOU Questions?