To Personalize or Not to Personalize: Modeling Queries with Variation in User Intent. Jaime Teevan, Susan T. Dumais, Daniel J. Liebling, Microsoft Research.

To Personalize or Not to Personalize: Modeling Queries with Variation in User Intent
Jaime Teevan, Susan T. Dumais, Daniel J. Liebling. Microsoft Research, Redmond, WA, USA. SIGIR 2008.
Summarized by Jaeseok Myung, Intelligent Database Systems Lab, School of Computer Science & Engineering, Seoul National University, Seoul, Korea.

Introduction
• In most previous work on personalized search, the results for all queries are personalized in the same manner.
• However, there is a lot of variation across queries:
– For some queries, everyone who issues the query is looking for the same thing.
– For other queries, different people want very different results even though they express their need in the same way => query ambiguity.
• As Dou et al. [5] found, personalization improves the results only for some queries, and can actually harm others.

Introduction (2)
• Knowing a query's ambiguity allows us to:
– Understand users' intent more deeply
– Personalize when appropriate
[Figure: example results for the unambiguous query "microsoft earth" and the ambiguous query "street maps"]

Building a Model
• How can we build the model?
– This paper discusses which data can be used to train it.
[Diagram: a classifier (e.g., Bayesian network, logistic regression, SVM) labels each query as ambiguous (e.g., "street maps" => personalize) or unambiguous (e.g., "microsoft earth" => do not personalize)]

Building a Model with Explicit Data
• To build the model, first consider explicit training data (relevance judgments):
– Expensive!
– Lack of training data!
[Diagram: explicit data is used to train the classifier to separate ambiguous from unambiguous queries]

Building a Model with Implicit Data
• Explicit training data is expensive and scarce.
• What if we use implicit data to predict query ambiguity?
– Inexpensive => robust, reliable
– But do implicit measures predict explicit ones well? This still needs to be shown.
[Diagram: implicit data is used to train the classifier to separate ambiguous from unambiguous queries]

Comparing Explicit & Implicit Measures: Method
• If implicit measures are correlated with explicit measures, we can use implicit data instead of explicit data to build the predictive model.
[Diagram: explicit relevance judgments yield measures of query ambiguity using explicit data; large-scale user logs (implicit features) yield measures of query ambiguity using implicit data; the two sets of measures are then correlated]

Collecting Data Sets
• Queries issued to Live Search from October 4 to October 11, 2007
– For each query, the results displayed to users and the results that were clicked were extracted from the logs
– In total, 2,400,645 query instances covering 44,002 distinct queries, issued by 1,532,022 distinct users
• Explicit relevance judgments
– 128 people judged 12 of the distinct queries
– For each query, between 4 and 81 people judged the top 50 results (presented in random order) as highly relevant, relevant, or not relevant
– In total, 292 sets of judgments were collected
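As a minimal sketch of this log-extraction step, the snippet below assumes a hypothetical tab-separated click log with one (user, query, clicked URL) record per click; the format and field names are illustrative, not Live Search's actual schema.

```python
from collections import defaultdict

def aggregate_clicks(log_lines):
    """Build query -> {url: click count} and query -> set of distinct users."""
    clicks = defaultdict(lambda: defaultdict(int))
    users = defaultdict(set)
    for line in log_lines:
        user_id, query, url = line.rstrip("\n").split("\t")
        clicks[query][url] += 1
        users[query].add(user_id)
    return clicks, users

log = [
    "u1\tmicrosoft earth\thttp://a.example.com",
    "u2\tmicrosoft earth\thttp://a.example.com",
    "u1\tstreet maps\thttp://b.example.com",
    "u2\tstreet maps\thttp://c.example.com",
]
clicks, users = aggregate_clicks(log)
print(dict(clicks["street maps"]), len(users["street maps"]))
```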

Query Ambiguity in the Explicit Relevance Judgments
[Figure: per-query judgment tables. For each of the 12 judged queries, a table lists, for each of the top 50 URLs, how many users rated it highly relevant, relevant, and not relevant. Queries where the judges largely agree on the same URLs are unambiguous; queries where the counts are spread across different URLs and rating levels are ambiguous.]

Measures for Explicit Data (1)
• Inter-rater reliability
– The degree of agreement among raters
– Gives a score of how much homogeneity, or consensus, there is in the ratings given by judges
– A number of statistics can be used to determine inter-rater reliability: joint probability of agreement, kappa statistics, correlation coefficients
• Fleiss' kappa [7]
– Kappa measures the extent to which the observed probability of agreement (P) exceeds the probability of agreement expected if all raters were to make their ratings randomly (P_e):
– κ = (P − P_e) / (1 − P_e)
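A minimal sketch of the statistic, assuming each judged result (row) was rated by the same number of judges into the study's three levels [highly relevant, relevant, not relevant]:

```python
def fleiss_kappa(counts):
    """counts: one row per judged result, giving the number of judges
    who assigned each rating category to it."""
    n_items = len(counts)
    n_raters = sum(counts[0])  # assumed constant across items
    n_cats = len(counts[0])
    # p[j]: overall proportion of all ratings falling in category j
    p = [sum(row[j] for row in counts) / (n_items * n_raters)
         for j in range(n_cats)]
    # P[i]: observed agreement among the raters of item i
    P = [(sum(c * c for c in row) - n_raters) / (n_raters * (n_raters - 1))
         for row in counts]
    P_bar = sum(P) / n_items            # observed probability of agreement
    Pe_bar = sum(pj * pj for pj in p)   # agreement expected by chance
    return (P_bar - Pe_bar) / (1 - Pe_bar)

# Ten judges, three results: strong consensus yields kappa well above 0.
print(fleiss_kappa([[9, 1, 0], [8, 2, 0], [0, 1, 9]]))  # ~0.56
```

Strong agreement among judges drives κ toward 1 (an unambiguous query); agreement no better than chance drives it toward 0.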

Measures for Explicit Data (2)
• The potential for personalization (P4P) curve
– For a group of size one, the best list is the one that returns the results that individual considers relevant first => nDCG = 1
– For larger group sizes, a single ranked list can no longer satisfy every individual perfectly => the average quality drops
– The curve plots, against group size, how well a single ranked list can satisfy each member of a group of that size
– P4P = 1 − nDCG: a wide gap below perfect nDCG means a large potential for personalization
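A minimal sketch of the curve's core computation, assuming binary relevance judgments and approximating the group-best ranking by sorting documents on their total relevance within the group (the paper computes the list maximizing the group's average nDCG; this greedy ordering is an illustrative stand-in):

```python
import math
from itertools import combinations

def dcg(gains):
    return sum(g / math.log2(i + 2) for i, g in enumerate(gains))

def ndcg_for_user(ranking, judged):
    """nDCG of a shared ranking from one user's point of view."""
    ideal = dcg(sorted(judged.values(), reverse=True))
    return dcg([judged.get(doc, 0) for doc in ranking]) / ideal if ideal else 0.0

def group_ndcg(group):
    """Average nDCG of a single group-best list over the group's members."""
    docs = {d for judged in group for d in judged}
    ranking = sorted(docs, key=lambda d: -sum(j.get(d, 0) for j in group))
    return sum(ndcg_for_user(ranking, j) for j in group) / len(group)

def p4p(all_judgments, size):
    """Potential for personalization (1 - nDCG) at a given group size."""
    scores = [group_ndcg(g) for g in combinations(all_judgments, size)]
    return 1 - sum(scores) / len(scores)

judgments = [{"a": 1, "b": 1}, {"a": 1, "c": 1}, {"c": 1, "d": 1}]
print(p4p(judgments, 1))  # 0.0: one list per individual is perfect
print(p4p(judgments, 3))  # > 0: one shared list cannot satisfy everyone
```

The implicit version of the curve, introduced below, reuses this computation unchanged, with each user's clicked results standing in for their relevance judgments.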


Measures for Implicit Data (1)
• The implicit potential for personalization curve
– Constructed using clicks as an approximation for relevance, with clicked results treated as results that were judged relevant
– It shows that people clicked on the same results for "microsoft earth", but on different results for "street maps"

Measures for Implicit Data (2)
• Click entropy
– Measures the variability in clicked results across individuals
– ClickEntropy(q) = −Σ_u p(c_u | q) · log₂ p(c_u | q), where p(c_u | q) is the probability that URL u was clicked following query q
– A large click entropy means many pages were clicked for the query; a small click entropy means only a few were
– Example: ClickEntropy("microsoft earth") < ClickEntropy("street maps")
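A minimal sketch of the measure, computed directly from the definition above over the observed clicks for one query:

```python
import math
from collections import Counter

def click_entropy(clicked_urls):
    """clicked_urls: one entry per observed click on the query."""
    total = len(clicked_urls)
    return -sum((n / total) * math.log2(n / total)
                for n in Counter(clicked_urls).values())

# All clicks on one URL -> zero entropy (unambiguous, like "microsoft earth");
# clicks spread evenly over four URLs -> entropy 2.0 (ambiguous, like "street maps").
print(click_entropy(["a"] * 12))
print(click_entropy(["a", "b", "c", "d"] * 3))
```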

Measures of Query Ambiguity
• Summary of how the measures behave for queries that should and should not be personalized:

To Personalize | Measure | Not to Personalize
Low | Kappa | High
High | P4P (using explicit data) | Low
High | P4P (using implicit data) | Low
High | Click entropy | Low

• Example: "street maps" shows a wider P4P gap (P4P = 1 − nDCG) and higher click entropy than "microsoft earth", so it is the better candidate for personalization.

Comparing Explicit & Implicit Measures
• The implicit P4P at a group size of four is plotted against the explicit P4P at the same group size
– Correlation coefficient = 0.77
• Conclusion: we can use implicit measures to predict query ambiguity
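The comparison itself is a correlation between the two scores over the judged queries; a minimal sketch with made-up P4P values (the paper's per-query numbers are not reproduced here):

```python
from statistics import correlation  # Pearson's r; Python 3.10+

# Hypothetical implicit and explicit P4P values for the same five queries.
implicit_p4p = [0.05, 0.12, 0.30, 0.45, 0.50]
explicit_p4p = [0.08, 0.15, 0.25, 0.40, 0.55]
print(correlation(implicit_p4p, explicit_p4p))
```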

Features Used to Predict Ambiguity
• We now have measures of query ambiguity to serve as prediction targets. What kinds of implicit features are available to predict them? They fall into four groups, by whether they describe the query or its results, and by whether they require query history:
– Query, no history: query length, contains URL, contains advanced operator, time of day issued, number of results (df), number of query suggestions, reformulation probability
– Query, with history: # of times query issued, # of users who issued query, avg. time of day issued, avg. number of results, avg. number of query suggestions
– Results, no history: query clarity, ODP category entropy, number of ODP categories, portion of non-HTML results, portion of results from .com/.edu, number of distinct domains
– Results, with history: result entropy, avg. click position, avg. seconds to click, avg. clicks per user, click entropy, potential for personalization
A sketch of a few of these features follows the list.
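A minimal sketch of a handful of the query features, computed from the query string and the per-query user sets built in the earlier aggregation sketch; the implementations are straightforward guesses at what the paper's feature names denote:

```python
import re

def query_features(query, users_by_query):
    """users_by_query: {query: set of user ids who issued it}."""
    return {
        "query_length_words": len(query.split()),
        "query_length_chars": len(query),
        "contains_url": bool(re.search(r"https?://|www\.|\.com\b", query)),
        "contains_advanced_operator": any(
            op in query for op in ("site:", "filetype:", '"')),
        "num_distinct_users": len(users_by_query.get(query, ())),
    }

print(query_features("street maps", {"street maps": {"u1", "u2"}}))
```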

Correlating Features & Implicit Measures
[Table: correlation of each feature with click entropy and with P4P, with cells shaded for small, medium, or large correlation, and rows grouped as no history/no results, history/no results, no history/results, and history/results. Features: query length (words), query length (chars), URL fragment, location mentioned, advanced query (−0.01), # of query suggestions, # of times issued, # of distinct users, avg. # of results (0.03), % issued during work, query clarity, category entropy, # of distinct categories, # of URLs in ODP, top-level domain entropy, # of distinct hosts, click entropy, P4P, result entropy, avg. clicks per user, avg. click position, avg. seconds to click.]

Building a Model
• To model query ambiguity, a Bayesian dependency network is trained over the features
[Diagram: a fragment of the learned network, with nodes such as URL presence (yes/no), word count (e.g., =1 vs. 2+), and number of ads (e.g., <3 vs. 3+) feeding a very low / low / medium / high prediction]
• The resulting classifier separates ambiguous queries (personalize) from unambiguous ones (do not personalize)
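The slides do not detail the training procedure. As a stand-in sketch, the snippet below trains scikit-learn's LogisticRegression (not the paper's Bayesian dependency network) on made-up feature vectors of [query length, click entropy, result entropy], with label 1 = ambiguous and 0 = unambiguous:

```python
from sklearn.linear_model import LogisticRegression

X = [
    [1, 2.9, 2.1],  # short query, high click entropy -> ambiguous
    [2, 2.5, 1.8],
    [3, 0.3, 0.4],  # longer query, low click entropy -> unambiguous
    [4, 0.1, 0.2],
]
y = [1, 1, 0, 0]

clf = LogisticRegression().fit(X, y)
print(clf.predict([[2, 2.7, 2.0]]))  # [1]: ambiguous, so personalize
```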

Prediction Quality
• All features: 81% accuracy
• No history, no results (query features only): 40% accuracy
• Adding result features or history features alone gives no boost: still 40% accuracy; they help only in conjunction

Conclusion
• This paper explored using the variation in search result click-through to identify queries that can benefit from personalization
• It reported that several click-based measures (click entropy and potential for personalization) reliably indicate when different people will find different results relevant for the same query
• It also examined many additional features of the query, including features of the query string, the result set, and history information about the query
– Features of the query string alone were able to help predict variation in clicks
– Additional information about the result set or query history did not add much value, except when taken in conjunction

Paper Evaluation
• Pros
– Interesting topic
– Presents many practical measures
• Cons
– Lacks an explanation of how to build the model
– Low accuracy