COMP 630L Paper Presentation Javy Hoi Ying Lau

Selected Paper: “A Large Scale Evaluation and Analysis of Personalized Search Strategies” by Zhicheng Dou, Ruihua Song, and Ji-Rong Wen. Published in Proceedings of the 16th International World Wide Web Conference (WWW 2007).

Motivation
Criticisms of personalized search strategies:
- Performance is query dependent: compare “mouse” vs. “Google”
- The search context is neglected: e.g., a sports fan submits the query “Office”
- Short-term vs. long-term profiles: searches aim to satisfy short-term needs
- Current web search ranking is already sufficient for definite queries [1]
- Recent and remote histories are equally important: [recent history, fresh queries] vs. [remote history, recurring queries] [2]
Question: does personalized search give promising performance under varied setups (e.g., queries, users, search context)?
[1] J. Teevan, S. T. Dumais, and E. Horvitz. Beyond the commons: Investigating the value of personalizing web search. In Proceedings of PIA ’05, 2005.
[2] B. Tan, X. Shen, and C. Zhai. Mining long-term search history to improve search accuracy. In Proceedings of KDD ’06, pages 718–723, 2006.

Contributions
- Proposed a methodology for evaluation on a large scale
  - Main idea: use click decisions as implicit relevance judgments to evaluate search accuracy
  - Uses click-through data recorded in 12 days of MSN query logs
  - Strategies studied: 2 click-based and 3 profile-based strategies
- Preliminary conclusions on the performance of the strategies
  - On average, personalized search performs comparably to common web search
  - Performance is query dependent (e.g., on the click entropy of queries)
  - Straightforward click-based strategies outperform profile-based ones
  - Both long-term and short-term contexts are important to profile-based strategies

Evaluation Methodology
- Typical vs. proposed evaluation framework
- Evaluation metrics

Typical Evaluation Method
- Methodology: a group of participants uses the personalized search system over several days
- Profile specification: either specified manually by users, or learned automatically from search histories
- Evaluation: participants judge the relevance of the re-ranked results for some test queries
- Advantage: relevance is explicitly defined by the participants
- Drawbacks: only a limited number of participants; the choice of test queries may bias the reliability and accuracy of the evaluation

Proposed Evaluation Framework
Data stored by the MSN search engine:
- “Cookie GUID”: user identifier
- “Browser GUID”: session identifier
- Logs of the query terms, clicked web pages, and their ranks
Methodology:
- Download the top 50 search results and the ranking list l1 from the MSN search engine for each test query
- Compute the personalized score of each web page and generate a new list l2 sorted by that score
- Combine l1 and l2 with Borda’s rank fusion method, using the original MSN ranks as relevance scores
- Evaluate performance with the measurement metrics
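The combination of l1 and l2 can be sketched with a simple Borda count: each list awards a page points inversely related to its position, and the fused ranking sorts by total points. This is an illustrative version with equal weights; the paper’s exact fusion weights may differ, and all names here are mine.

```python
def borda_fuse(list1, list2, w1=0.5, w2=0.5):
    """Fuse two ranked lists of page IDs by weighted Borda counting.

    Each page earns weight * (n - rank) points from each list,
    where n is the longer list's length; unranked pages earn 0.
    """
    n = max(len(list1), len(list2))
    scores = {}
    for weight, ranking in ((w1, list1), (w2, list2)):
        for rank, page in enumerate(ranking):
            scores[page] = scores.get(page, 0.0) + weight * (n - rank)
    # Sort by fused score, highest first (stable for ties)
    return sorted(scores, key=scores.get, reverse=True)

l1 = ["a", "b", "c", "d"]   # original MSN ranking
l2 = ["c", "a", "d", "b"]   # ranking by personalized score
print(borda_fuse(l1, l2))   # ['a', 'c', 'b', 'd']
```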

Proposed Evaluation Framework – Problems
- It is unfair to evaluate a reordering of the original search results using the original click data
- It fails to evaluate rank changes of documents that are relevant but were not clicked by users

Evaluation Metric A – Rank Scoring
- Originally used to evaluate the effectiveness of collaborative filtering systems
- Expected utility of a ranked list of web pages for test query s:

  R_s = Σ_j δ(s, j) / 2^((j − 1) / (α − 1))

  where j is the rank of a page in the list, α is a half-life parameter set to 5, and δ(s, j) = 1 if the page at rank j is clicked and 0 otherwise; utility is thus weighted by rank
- Utility over all test queries:

  R = 100 · (Σ_s R_s) / (Σ_s R_s^max)

  where R_s^max is the maximum possible utility, obtained when the list returns all clicked web pages at the top
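The rank scoring metric can be sketched directly from these definitions; a minimal implementation (function and variable names are mine, not the paper’s):

```python
def rank_score(ranked, clicked, alpha=5):
    """Expected utility R_s for one query: each clicked page at
    1-based rank j contributes 1 / 2^((j - 1) / (alpha - 1))."""
    return sum(1.0 / 2 ** ((j - 1) / (alpha - 1))
               for j, page in enumerate(ranked, start=1)
               if page in clicked)

def rank_scoring(rankings, clicks, alpha=5):
    """Final score R = 100 * sum_s R_s / sum_s R_s^max, where R_s^max
    is the utility when all clicked pages sit at the top of the list."""
    total = sum(rank_score(r, c, alpha) for r, c in zip(rankings, clicks))
    best = sum(rank_score(sorted(r, key=lambda p: p not in c), c, alpha)
               for r, c in zip(rankings, clicks))
    return 100.0 * total / best
```

A list with its single clicked page at rank 1 scores the full 100; pushing that click down the list decays the score with the half-life α.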

Evaluation Metric B – Average Rank
- Average rank for test query s:

  AvgRank_s = (1 / |P_s|) · Σ_{p ∈ P_s} R(p)

  where P_s is the set of clicked web pages for query s and R(p) is the rank of page p
- Final average rank: the mean of AvgRank_s over the test query set S; lower is better
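The average rank metric is a one-liner per query; a minimal sketch (names are mine):

```python
def average_rank(ranked, clicked):
    """Average 1-based rank of the clicked pages for one query."""
    ranks = [j for j, page in enumerate(ranked, start=1) if page in clicked]
    return sum(ranks) / len(ranks)

def final_average_rank(rankings, clicks):
    """Mean of the per-query average ranks over the test query set;
    lower values mean clicked pages sit higher in the lists."""
    return sum(average_rank(r, c)
               for r, c in zip(rankings, clicks)) / len(rankings)
```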

Personalization Strategies
- Introduction of the personalization strategies

Personalized Search Strategies Under Study
- Profile-based (user interests): L-Profile, S-Profile, LS-Profile
- Click-based (historical clicks): P-Click (personal-level re-ranking), G-Click (group-level re-ranking)

Background – Specifying User Profiles
1) General interests specified by users explicitly
2) Users’ preferences learned automatically:
- Profiles built from users’ search histories
- Hierarchical category tree based on the ODP (Open Directory Project)
- Vectors of distinct terms that accumulate past preferences
User profiles used in this paper: weighted topic categories, learned from users’ past clicked web pages

L-Profile
- Personalized score of web page p for user u: the similarity between the category vector of the user’s interest profile and the category vector of the page
- c(·): weighting vector over 67 predefined topic categories provided by KDD Cup 2005 [1]
- Category vector of web page p: holds the confidences of the top 6 categories to which p belongs [2]
- User u’s interest profile is learned from his past clicked web pages: the category vectors of the set of web pages visited by u in the past are combined, each weighted by the probability that u clicked the page and by an impact weight inversely proportional to the page’s popularity
[1] The KDD Cup 2005 knowledge discovery and data mining competition, held in conjunction with the 11th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining.
[2] D. Shen, R. Pan, J.-T. Sun, J. J. Pan, K. Wu, J. Yin, and Q. Yang. Our winning solution to query classification in KDD Cup 2005. SIGKDD Explor. Newsl., 7(2):100–110, 2005.
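A minimal sketch of the L-Profile scoring, representing category vectors as sparse dicts. The page weighting here is a simplified stand-in (plain inverse popularity) for the paper’s click-probability × impact-weight scheme, and all names and sample data are hypothetical.

```python
import math

def cosine(u, v):
    """Cosine similarity between two sparse category vectors (dicts)."""
    dot = sum(u[k] * v.get(k, 0.0) for k in u)
    nu = math.sqrt(sum(x * x for x in u.values()))
    nv = math.sqrt(sum(x * x for x in v.values()))
    return dot / (nu * nv) if nu and nv else 0.0

def build_profile(clicked_pages, popularity):
    """User profile c(u): weighted sum of the category vectors of past
    clicked pages. Weight 1/popularity[p] is an illustrative stand-in
    for the paper's full weighting."""
    profile = {}
    for page, categories in clicked_pages.items():
        w = 1.0 / popularity.get(page, 1.0)
        for cat, conf in categories.items():
            profile[cat] = profile.get(cat, 0.0) + w * conf
    return profile

# Hypothetical pages with top-category confidence vectors
clicked = {"p1": {"Sports": 0.9, "News": 0.3}, "p2": {"Sports": 0.6}}
profile = build_profile(clicked, {"p1": 2.0, "p2": 1.0})
page = {"Sports": 0.8, "Tech": 0.2}
score = cosine(profile, page)   # L-Profile-style score for `page`
```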

S-Profile and LS-Profile
- S-Profile: the user’s short-term profile, built from the set of web pages visited by user u in the current session; the score of web page p is computed against this session profile
- LS-Profile: combines the long-term (L) and short-term (S) profiles

P-Click
- Personalized score of page p: based on the number of clicks on web page p by user u for query q in the past
- Disadvantage: only works for recurring queries
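A sketch of P-Click scoring, assuming the score is the user’s past clicks on p for q normalized by the user’s total past clicks for q plus a smoothing constant; the exact formula and constant in the paper may differ, and the names here are mine.

```python
def p_click_score(page, query, user_clicks, beta=0.5):
    """P-Click-style score: share of the user's past clicks for `query`
    that landed on `page`, smoothed by `beta` (illustrative constant).

    user_clicks: dict mapping (query, page) -> click count for one user.
    """
    clicks_on_page = user_clicks.get((query, page), 0)
    total_clicks = sum(n for (q, _p), n in user_clicks.items() if q == query)
    return clicks_on_page / (total_clicks + beta)

history = {("q", "a"): 3, ("q", "b"): 1}
p_click_score("a", "q", history)      # 3 / 4.5
p_click_score("a", "fresh", history)  # 0.0 -- no history for new queries
```

The last line shows the stated disadvantage: for a query the user never issued before, every page scores 0 and no re-ranking happens.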

G-Click
1) Similarity between two users is calculated from their long-term profiles
2) The set of K nearest neighbors of the user is selected
3) Score of web page p: aggregated from the neighbors’ past clicks, weighted by user similarity
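A sketch of the group-level scoring, assuming each neighbor contributes its click fraction for the query weighted by its profile similarity to the target user; the aggregation details are illustrative, not the paper’s exact formula.

```python
def g_click_score(page, query, target_user, all_clicks, similarity, k=3):
    """G-Click-style score: similarity-weighted sum of the K nearest
    neighbours' click fractions on `page` for `query`.

    all_clicks: user -> {(query, page): count}
    similarity: user -> long-term-profile similarity to the target user.
    """
    neighbours = sorted((u for u in all_clicks if u != target_user),
                        key=lambda u: similarity[u], reverse=True)[:k]
    score = 0.0
    for u in neighbours:
        clicks = all_clicks[u]
        on_page = clicks.get((query, page), 0)
        total = sum(n for (q, _p), n in clicks.items() if q == query)
        if total:
            score += similarity[u] * on_page / total
    return score
```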

Statistics of Experiments
- Dataset
- Queries
- Test users
- Query repetition
- Query sessions
- Click entropies

Statistics of Experiments – Dataset
- MSN query logs covering 12 days in August 2006
- Distinct users in the US, randomly sampled on 19/08/06

Statistics of Experiments – Queries
- Query statistics are similar to those reported in other papers

Statistics of Experiments Test Users

Statistics of Experiments – Query Repetition and Sessions
- ~46% of the test queries are repeated queries from the training days
- 72% of the repeated queries (33% of all test queries) are repeated by the same user
- Query sessions

Statistics of Experiments – Query Click Entropies
- Click entropy measures the degree of variation of clicks on a query among users:

  ClickEntropy(q) = − Σ_{p ∈ P(q)} P(p|q) · log2 P(p|q)

  where P(q) is the collection of web pages clicked on query q and P(p|q) is the percentage of the clicks on web page p among all the clicks on q
- Reasons for large entropies: informational queries, ambiguous queries
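The click entropy above can be computed directly from per-query click counts; a small sketch (function name and sample data are mine):

```python
import math

def click_entropy(clicks_on_query):
    """Click entropy of one query: sum over clicked pages of
    -P(p|q) * log2 P(p|q), where P(p|q) is page p's share of all
    clicks on the query.

    clicks_on_query: dict mapping page -> click count for the query."""
    total = sum(clicks_on_query.values())
    return sum(-(n / total) * math.log2(n / total)
               for n in clicks_on_query.values() if n)

# Navigational query: everyone clicks the same page -> entropy 0
print(click_entropy({"homepage": 50}))      # 0.0
# Ambiguous query: clicks split over two pages -> entropy 1 bit
print(click_entropy({"p1": 25, "p2": 25}))  # 1.0
```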

Statistics of Experiments – Query Click Entropies
- The majority of popular queries have low click entropies

Experimental Results
- Overall performance of the strategies
- Performance at different click entropies
- Performance on repeated queries
- Performance with varying search histories
- Analysis of profile-based strategies

Results: Overall Performance
1) Click-based strategies outperform the default web search
2) G-Click performs comparably to P-Click; varying the neighborhood size K shows no significant improvement. Reason: high user–query sparsity (similar users have few search histories for the queries submitted by the test user)
3) Profile-based strategies underperform click-based ones


Results: Performance at Different Click Entropies
- For queries with low entropies, the original web search already does a good job
- Click-based strategies show increasing improvement as entropies grow
- Profile-based strategies underperform in general
- Conclusion: for queries with small click entropy, personalization is unnecessary

Results: Performance on Repeated Queries
- 46% of test queries are repeated; 33% of queries are repeated by the same users
- Conclusion: re-finding behavior is common; the high repetition ratio in the real world makes click-based strategies work well
- Suggestion: provide convenient ways for users to review their search histories

Results: Performance with Varying Search Histories
1. For the click-based approaches, users with high search activity do not benefit more than those who search less (their queries show higher variability)
2. The long-term profile improves performance as history accumulates, but it also becomes less stable (more noise)
Suggestion: consider the user’s real information need and select only appropriate search histories to build user profiles

Analysis of Profile-based Strategies
- Reasons for the underperformance of profile-based strategies: rough implementation; a rich history contains noisy information irrelevant to the current search
- LS-Profile is more stable than either of the separate profiles

Comments
√ Statistics of the dataset justify the experimental results (is the set biased?) and provide more information for analyzing the strategies (dataset vs. strategies)
√ Broad coverage of conventional personalization strategies
√ Captures users’ real web search behavior, since there is no predefined set of test queries
× Most of the distinct queries are optimal ones; performance of click-based ≈ profile-based for “N”
× 12 days is too short for building user profiles: most users issue only sparse queries, most of which are optimal and definite, so users’ true interest profiles are not learned; the experimental setup is therefore biased toward the click-based approach
× For some experiments, the performances of the different strategies are close and irregular; it is not convincing to draw conclusions about their relative performance from these results

Questions?