Cohort Modeling for Enhanced Personalized Search Jinyun YanWei ChuRyen White Rutgers University Microsoft BingMicrosoft Research.

Slides:

Advertisements

Similar presentations

Predicting User Interests from Contextual Information

Advertisements

Enhancing Personalized Search by Mining and Modeling Task Behavior

A Comparison of Implicit and Explicit Links for Web Page Classification Dou Shen 1 Jian-Tao Sun 2 Qiang Yang 1 Zheng Chen 2 1 Department of Computer Science.

Modelling Relevance and User Behaviour in Sponsored Search using Click-Data Adarsh Prasad, IIT Delhi Advisors: Dinesh Govindaraj SVN Vishwanathan* Group:

Optimizing search engines using clickthrough data

SIGIR 2013 Recap September 25, 2013.

Query Chains: Learning to Rank from Implicit Feedback Paper Authors: Filip Radlinski Thorsten Joachims Presented By: Steven Carr.

Search Engines Information Retrieval in Practice All slides ©Addison Wesley, 2008.

Personalization and Search Jaime Teevan Microsoft Research.

GENERATING AUTOMATIC SEMANTIC ANNOTATIONS FOR RESEARCH DATASETS AYUSH SINGHAL AND JAIDEEP SRIVASTAVA CS DEPT., UNIVERSITY OF MINNESOTA, MN, USA.

1 Learning User Interaction Models for Predicting Web Search Result Preferences Eugene Agichtein Eric Brill Susan Dumais Robert Ragno Microsoft Research.

Context-aware Query Suggestion by Mining Click-through and Session Data Authors: H. Cao et.al KDD 08 Presented by Shize Su 1.

Explorations in Tag Suggestion and Query Expansion Jian Wang and Brian D. Davison Lehigh University, USA SSM 2008 (Workshop on Search in Social Media)

Evaluating Search Engine

Personalizing Search via Automated Analysis of Interests and Activities Jaime Teevan Susan T.Dumains Eric Horvitz MIT,CSAILMicrosoft Researcher Microsoft.

Seesaw Personalized Web Search Jaime Teevan, MIT with Susan T. Dumais and Eric Horvitz, MSR.

Ryen W. White, Microsoft Research Jeff Huang, University of Washington.

Mobile Web Search Personalization Kapil Goenka. Outline Introduction & Background Methodology Evaluation Future Work Conclusion.

WebMiningResearchASurvey Web Mining Research: A Survey Raymond Kosala and Hendrik Blockeel ACM SIGKDD, July 2000 Presented by Shan Huang, 4/24/2007 Revised.

COMP 630L Paper Presentation Javy Hoi Ying Lau. Selected Paper “A Large Scale Evaluation and Analysis of Personalized Search Strategies” By Zhicheng Dou,

Topic-Sensitive PageRank Taher H. Haveliwala. PageRank Importance is propagated A global ranking vector is pre-computed.

University of Kansas Department of Electrical Engineering and Computer Science Dr. Susan Gauch April 2005 I T T C Dr. Susan Gauch Personalized Search Based.

An investigation of query expansion terms Gheorghe Muresan Rutgers University, School of Communication, Information and Library Science 4 Huntington St.,

Maryam Karimzadehgan (U. Illinois Urbana-Champaign)*, Ryen White (MSR), Matthew Richardson (MSR) Presented by Ryen White Microsoft Research * MSR Intern,

12 -1 Lecture 12 User Modeling Topics –Basics –Example User Model –Construction of User Models –Updating of User Models –Applications.

Online Search Evaluation with Interleaving Filip Radlinski Microsoft.

Adapting Deep RankNet for Personalized Search

TwitterSearch : A Comparison of Microblog Search and Web Search

Modern Retrieval Evaluations Hongning Wang

From Devices to People: Attribution of Search Activity in Multi-User Settings Ryen White, Ahmed Hassan, Adish Singla, Eric Horvitz Microsoft Research,

Personalization in Local Search Personalization of Content Ranking in the Context of Local Search Philip O’Brien, Xiao Luo, Tony Abou-Assaleh, Weizheng.

Topics and Transitions: Investigation of User Search Behavior Xuehua Shen, Susan Dumais, Eric Horvitz.

Understanding and Predicting Graded Search Satisfaction Tang Yuk Yu 1.

Searching the Web Dr. Frank McCown Intro to Web Science Harding University This work is licensed under Creative Commons Attribution-NonCommercial 3.0Attribution-NonCommercial.

A Simple Unsupervised Query Categorizer for Web Search Engines Prashant Ullegaddi and Vasudeva Varma Search and Information Extraction Lab Language Technologies.

Improving Web Search Ranking by Incorporating User Behavior Information Eugene Agichtein Eric Brill Susan Dumais Microsoft Research.

Fan Guo 1, Chao Liu 2 and Yi-Min Wang 2 1 Carnegie Mellon University 2 Microsoft Research Feb 11, 2009.

Improving Web Spam Classification using Rank-time Features September 25, 2008 TaeSeob,Yun KAIST DATABASE & MULTIMEDIA LAB.

CIKM’09 Date:2010/8/24 Advisor: Dr. Koh, Jia-Ling Speaker: Lin, Yi-Jhen 1.

Understanding Query Ambiguity Jaime Teevan, Susan Dumais, Dan Liebling Microsoft Research.

Giorgos Giannopoulos (IMIS/”Athena” R.C and NTU Athens, Greece) Theodore Dalamagas (IMIS/”Athena” R.C., Greece) Timos Sellis (IMIS/”Athena” R.C and NTU.

Personalized Search Xiao Liu

Personalizing Search on Shared Devices Ryen White and Ahmed Hassan Awadallah Microsoft Research, USA Contact:

Discovering and Using Groups to Improve Personalized Search Jaime Teevan, Merrie Morris, Steve Bush Microsoft Research.

Deep Learning Powered In- Session Contextual Ranking using Clickthrough Data Xiujun Li 1, Chenlei Guo 2, Wei Chu 2, Ye-Yi Wang 2, Jude Shavlik 1 1 University.

Analysis of Topic Dynamics in Web Search Xuehua Shen (University of Illinois) Susan Dumais (Microsoft Research) Eric Horvitz (Microsoft Research) WWW 2005.

Qi Guo Emory University Ryen White, Susan Dumais, Jue Wang, Blake Anderson Microsoft Presented by Tetsuya Sakai, Microsoft Research.

Adish Singla, Microsoft Bing Ryen W. White, Microsoft Research Jeff Huang, University of Washington.

COLLABORATIVE SEARCH TECHNIQUES Submitted By: Shikha Singla MIT-872-2K11 M.Tech(2 nd Sem) Information Technology.

Named Entity Recognition in Query Jiafeng Guo 1, Gu Xu 2, Xueqi Cheng 1,Hang Li 2 1 Institute of Computing Technology, CAS, China 2 Microsoft Research.

Post-Ranking query suggestion by diversifying search Chao Wang.

Context-Aware Query Classification Huanhuan Cao, Derek Hao Hu, Dou Shen, Daxin Jiang, Jian-Tao Sun, Enhong Chen, Qiang Yang Microsoft Research Asia SIGIR.

ASSOCIATIVE BROWSING Evaluating 1 Jinyoung Kim / W. Bruce Croft / David Smith for Personal Information.

Divided Pretreatment to Targets and Intentions for Query Recommendation Reporter: Yangyang Kang /23.

TO Each His Own: Personalized Content Selection Based on Text Comprehensibility Date: 2013/01/24 Author: Chenhao Tan, Evgeniy Gabrilovich, Bo Pang Source:

Identifying “Best Bet” Web Search Results by Mining Past User Behavior Author: Eugene Agichtein, Zijian Zheng (Microsoft Research) Source: KDD2006 Reporter:

Predicting User Interests from Contextual Information R. W. White, P. Bailey, L. Chen Microsoft (SIGIR 2009) Presenter : Jae-won Lee.

Personalizing Web Search Jaime Teevan, MIT with Susan T. Dumais and Eric Horvitz, MSR.

Predicting Short-Term Interests Using Activity-Based Search Context CIKM’10 Advisor: Jia Ling, Koh Speaker: Yu Cheng, Hsieh.

Potential for Personalization Transactions on Computer-Human Interaction, 17(1), March 2010 Data Mining for Understanding User Needs Jaime Teevan, Susan.

SEARCH AND CONTEXT Susan Dumais, Microsoft Research INFO 320.

1 Cross Market Modeling for Query- Entity Matching Manish Gupta, Prashant Borole, Praful Hebbar, Rupesh Mehta, Niranjan Nayak.

User Characterization in Search Personalization

Learning to Personalize Query Auto-Completion

Adaptive, Personalized Diversity for Visual Discovery

Topics and Transitions: Investigation of User Search Behavior

Personalizing Search on Shared Devices

Author: Kazunari Sugiyama, etc. (WWW2004)

Mining Query Subtopics from Search Log Data

Ryen White, Ahmed Hassan, Adish Singla, Eric Horvitz

Presentation transcript:

Cohort Modeling for Enhanced Personalized Search Jinyun YanWei ChuRyen White Rutgers University Microsoft BingMicrosoft Research

Personalized Search Many queries have multiple intents – e.g., [H2O] can be a beauty product, wireless, water, movie, band, etc. Personalized search – Combines relevance and the searcher’s intent – Relevant to the user’s interpretation of query

Challenge Existing personalized search – Relies on the access to personal history Queries, clicked URLs, locations, etc. Re-finding common, but not common enough – Approx. 1/3 of queries are repeats from same user [Teevan et al 2007, Dou et al 2007] – Similar statistics for [Shen et al 2012] 2/3 queries new in 2 mo. - ‘cold start’ problem

Motivation for Cohorts When encountering new query by a user – Turn to other people who submitted the query – e.g., Utilize global clicks Drawback – No personalization Cohorts – A group of users similar along 1+ dimensions, likely to share search interests or intent – Provide useful cohort search history

Situating Cohorts Global Individual Hard to Handle New Queries Hard to Handle New Documents Sparseness (Low Coverage) Cohort Conjoint Analysis Learning across Users Collaborative Grouping/Clustering Cohorts … Not personalized

Related Work Explicit groups/cohorts – Company employees [Smyth 2007] – Collaborative search tools [Morris & Horvitz 2007] Implicit cohorts – Behavior based, k-nearest neighbors [Dou et al. 2007] – Task-based / trait-based groups [Teevan et al. 2009] Drawbacks – Costly to collect or small n – Uses information unavailable to search engines – Some offer little relevance gain

Problem Given search logs with, can we design a cohort model that can improve the relevance of personalized search results?

Concepts Cohort: A cohort is a group of users with shared characteristics – E.g., a sports fan Cohort cohesion: A cohort has cohesive search and click preferences – E.g., search [fifa]  click fifa.com Cohort membership: A user may belong to multiple cohorts – Both a sports fan and a video game fan

Our Solution Cohort Generation Cohort Membership Cohort Behavior Cohort Preference Cohort Model User Preference Identify particular cohorts of interest Find people who are part of this cohort Mine cohort search behavior (clicks for queries) Identify cohort click preferences Build models of cohort click preferences Apply that cohort model to build richer representation of searchers’ individual preferences

Cohort Generation Proxies – Location (U.S. state) – Topical interests (Top-level categories in Open Directory Project) – Domain preference (Top-level domain, e.g.,.edu,.com,.gov) – Inferred from search engine logs Reverse IP address to estimate location Queries and clicked URLs to estimate search topic interest and domain preference for each user

Cohort Membership Multinomial distribution – Smoothed – Example: SATClicks = [0, 1, 2, 5] (clicks w/ dwell ≥ 30s) Smoothing parameter

Cohort Preference Query q d1 d2

Cohort Model Estimate individual click preference by cohort preference Ranked List Personal features Cohort features Personalized Ranking Search results Universal rank Re-rank Machine -learned model

Experiments Setup – Randomly sampled 3% of users – 2-month search history for cohort profiling: cohort membership, cohort CTR – 1 week for evaluation: 3 days training, 2 days validation, 2 days testing – 5,352,460 query impressions in testing Baseline – Personalized ranker used in production on Bing – With global CTR, and personal model

Experiments Evaluation metric: – Mean Reciprocal Rank of first SAT click (MRR)* ΔMRR = MRR(cohort model) – MRR(baseline) Labels: Implicit, users’ satisfied clicks – Clicks w/ dwell ≥ 30 secs or last click in session – 1 if SAT click, 0 otherwise * ΔMAP was also tried. Similar patterns to MRR.

Results Cohort-enhanced model beats baseline – Positive MRR gain over personalized baseline Average over many queries, with many ΔMRR = 0 Gains are highly significant (p < 0.001) – ALL has lower performance, could be noisier: Re-ranks more often, Combining different signals Group TypeΔMRR ±SEM ODP (Topic interest) ± % TLD (Top level domain) ± % Location (State) ± % ALL (ODP + TLD + Location) ± %

Performance on Query Sets New queries – Unseen queries in training/validation 2× MRR gain vs. all queries Queries with high click-entropy 5× MRR gain vs. all queries Ambiguous queries – 10k acronym queries, all w/ multiple meanings 10× MRR gain vs. all queries

Cohort Generation: Learned Cohorts Distance between user vector and cohort vector

Finding Best K

Summary Cohort model enhanced personalized search – Enrich models of individual intent using cohorts – Automatically learn cohorts from user behavior Future work: – More experiments, e.g., parameter sweeps – More cohorts: Age, gender, domain expertise, political affiliation, etc. – More queries: Long-tail queries, task-based and fuzzy matching rather than exact match

Thanks Questions?