Towards Context-Aware Search by Learning A Very Large Variable Length Hidden Markov Model from Search Logs
Huanhuan Cao (1), Daxin Jiang (2), Jian Pei (3), Enhong Chen (1) and Hang Li (2)
(1) University of Science and Technology of China, (2) Microsoft Research Asia, (3) Simon Fraser University

Context of User Queries
A user usually raises multiple queries and conducts multiple rounds of interaction for a single information need.
[Figure labels: user query, click, one round of interaction, context, current query]

An Example
Suppose Ada plans to buy a new car and needs some car reviews, but she doesn't know how to formulate an effective query. Consequently, she raises a series of queries about different cars. Not surprisingly, for each query the review web sites are ranked low and are not easy to notice.

Why Is Context Useful?
Suppose we have the following search log:
SID   Search session
S1    Ford => Toyota => GMC => Allstate
S2    Ford cars => Toyota cars => GMC cars => Allstate
S3    Ford cars => Toyota cars => Allstate
S4    GMC => GMC dealers

Patterns in the Search Log
Pattern1:
– 50% of users clicked a car review web site after asking about a series of cars.
Ada would have a better experience if the search engine knew Pattern1.

Pattern2:
– 75% of users searched for car insurance after a series of queries about different cars.
The search engine could provide more appropriate query suggestions and URL recommendations if it knew Pattern2.
Idea: learn from the search log to provide context-aware ranking, query suggestion and URL recommendation.
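As a small illustration of how a figure like the 75% in Pattern2 can be read off a log, the sketch below counts the share of sessions that end with a given query. It uses the four example sessions from the search log above; treating "Allstate" as the car insurance query, and the helper name `share_ending_with`, are assumptions made for this example only.

```python
# Toy search log copied from the example sessions S1..S4.
sessions = {
    "S1": ["Ford", "Toyota", "GMC", "Allstate"],
    "S2": ["Ford cars", "Toyota cars", "GMC cars", "Allstate"],
    "S3": ["Ford cars", "Toyota cars", "Allstate"],
    "S4": ["GMC", "GMC dealers"],
}


def share_ending_with(sessions, final_query):
    """Return the fraction of sessions whose last query equals final_query."""
    hits = sum(1 for queries in sessions.values() if queries[-1] == final_query)
    return hits / len(sessions)


# Assumption: "Allstate" stands for the car insurance intent.
insurance_share = share_ending_with(sessions, "Allstate")
print(insurance_share)  # 0.75
```

Three of the four sessions end with the insurance query, which matches the 75% quoted for Pattern2.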

Related Work
Mining the wisdom of the crowds from search logs:
– Improving ranking: using click-through data as implicit feedback
– Query suggestion: mining click-through data; mining session data; mixture of both: CACB
– URL recommendation: mining search trails
Only CACB considers context, but:
1. CACB constrains a query to one search intent
2. CACB doesn't use click information as context
3. CACB can only be used for query suggestion

Modeling Context by vlHMM (variable length Hidden Markov Model)

Overview of Technical Details
– Definition of vlHMM
– Parameter estimation
– Challenges and strategies
– Applications

Formal Definition
Given:
– A set of hidden states {s_1, …, s_{N_s}};
– A set of queries {q_1, …, q_{N_q}};
– A set of URLs {u_1, …, u_{N_u}};
– The maximal length T_max of state sequences.
A vlHMM is a probability model defined as follows:
– The transition probability distribution Δ = {P(s_i | S_j)};
– The initial state distribution Ψ = {P(s_i)};
– The emission probability distribution for each state sequence Λ = {P(q, U | S_j)}.
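One way to picture these parameters is as a handful of lookup tables. The sketch below is only an illustration under stated assumptions: the class name `VlHMM`, the field names, the split of the emission into separate per-state query and URL tables, and the rule of keeping at most `t_max` recent states as the lookup key are all choices made here, not details given in the slides.

```python
from dataclasses import dataclass, field
from typing import Dict, Tuple

StateSeq = Tuple[str, ...]  # a sequence of hidden states, S_j in the definition


@dataclass
class VlHMM:
    """Hypothetical container for the parameters (Psi, Delta, Lambda)."""

    t_max: int  # maximal length of state sequences; assumed >= 1
    initial: Dict[str, float] = field(default_factory=dict)  # Psi: P(s_i)
    transition: Dict[StateSeq, Dict[str, float]] = field(default_factory=dict)  # Delta: P(s_i | S_j)
    # Assumption: the emission is stored as two per-state tables.
    emit_query: Dict[str, Dict[str, float]] = field(default_factory=dict)
    emit_url: Dict[str, Dict[str, float]] = field(default_factory=dict)

    def next_state_dist(self, history: StateSeq) -> Dict[str, float]:
        """Look up P(s | S_j), using at most t_max of the most recent states."""
        key = tuple(history[-self.t_max:])
        return self.transition.get(key, {})


model = VlHMM(
    t_max=2,
    initial={"cars": 1.0},
    transition={("cars",): {"cars": 0.5, "insurance": 0.5}},
)
dist = model.next_state_dist(("cars",))
```

An unseen state sequence simply returns an empty distribution here; how the real model handles such cases is not stated in the slides.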

Parameter Estimation
Let X = {O_1, …, O_N} be the set of training sessions, where:
– O_n is a sequence of pairs (q_{n,1}, U_{n,1}), …, (q_{n,T_n}, U_{n,T_n});
– q_{n,t} and U_{n,t} are the t-th query and the set of clicked URLs, respectively;
– Moreover, we use u_{n,t,k} to denote the k-th URL in U_{n,t}.
The task is to find the parameters Θ* that maximize the likelihood of the training sessions:
Θ* = argmax_Θ ln P(X | Θ) = argmax_Θ Σ_n ln P(O_n | Θ)

EM
The original problem has a complex form that may not have a closed-form solution. Instead, we use an iterative method: EM (Expectation-Maximization).
Objective function:

E-step: M-step:
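For orientation, the textbook form of the EM auxiliary function can be written with the symbols defined in the parameter estimation slide. This is the generic EM statement, assuming S ranges over the candidate hidden state sequences of a session and Θ^{(i-1)} denotes the parameters from the previous iteration; it is not a reconstruction of the exact update formulas used for the vlHMM.

```latex
% E-step: expected complete-data log-likelihood under the previous parameters
Q\bigl(\Theta, \Theta^{(i-1)}\bigr)
  = \sum_{n=1}^{N} \sum_{S} P\bigl(S \mid O_n, \Theta^{(i-1)}\bigr)\,
    \ln P\bigl(O_n, S \mid \Theta\bigr)

% M-step: re-estimate the parameters by maximizing Q
\Theta^{(i)} = \arg\max_{\Theta} Q\bigl(\Theta, \Theta^{(i-1)}\bigr)
```

Each iteration cannot decrease ln P(X | Θ), which is why the procedure can stand in for the direct maximization that has no closed form.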

Challenges for Training A Large vlHMM
Challenge1:
– The EM algorithm needs a user-specified number of hidden states.
– However, in our problem the hidden states correspond to users' search intents, whose number is unknown.
Strategy:
– We apply the mining techniques developed in our previous work as a preprocessing step before parameter learning.

Challenge2:
– Search logs may contain hundreds of millions of training sessions.
– It is impractical to learn a vlHMM from such a huge training data set on a single machine.
Strategy:
– We deploy the learning task on a distributed system under the map-reduce programming model.
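The map-reduce strategy can be pictured as follows: each mapper turns the sessions of one shard into weighted counts, and a reducer sums the counts that share a key before the parameters are re-normalized. The sketch below is a single-process illustration under strong assumptions: the function names, the shard layout, and the `posterior` callback (which stands in for a real E-step and here returns a fixed guess) are all hypothetical and not taken from the slides.

```python
from collections import defaultdict
from itertools import chain


def map_session(session, posterior):
    """Map: emit ((state, query), expected count) pairs for one session.

    `posterior(session, t)` is a hypothetical callback returning
    {state: P(state at step t | session)} under the current parameters.
    """
    for t, query in enumerate(session):
        for state, weight in posterior(session, t).items():
            yield (state, query), weight


def reduce_counts(pairs):
    """Reduce: sum the expected counts that share a (state, query) key."""
    totals = defaultdict(float)
    for key, weight in pairs:
        totals[key] += weight
    return dict(totals)


def normalize(totals):
    """Turn summed counts into P(query | state) by normalizing per state."""
    per_state = defaultdict(float)
    for (state, _query), weight in totals.items():
        per_state[state] += weight
    return {key: weight / per_state[key[0]] for key, weight in totals.items()}


def toy_posterior(session, t):
    """Fixed stand-in for an E-step; a real one would depend on the model."""
    return {"cars": 0.75, "insurance": 0.25}


# Two shards, as if the sessions were stored on two machines.
shards = [[["Ford", "Toyota"]], [["Toyota"]]]
pairs = chain.from_iterable(
    map_session(session, toy_posterior) for shard in shards for session in shard
)
emission = normalize(reduce_counts(pairs))
```

Because the mappers only need read access to the current parameters and the reducers only add numbers, the per-session work can be spread over many machines.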

Challenge3:
– Each machine needs to hold the values of all parameters.
– Since the log data usually contains millions of unique queries and URLs, the parameter space is extremely large.
Strategy:
– We develop a special initialization strategy based on the clusters mined from the click-through bipartite graph.

Applications
Given an observation O consisting of q_1, …, q_t and U_1, …, U_t:
Document re-ranking:
– Rank by P(u | O) = Σ_{s_t} P(u | s_t) P(s_t | O)
Query suggestion & URL recommendation:
– Suggest the top k queries with P(q | O) = Σ_{s_{t+1}} P(q | s_{t+1}) P(s_{t+1} | O)
– Recommend the top k URLs with P(u | O) = Σ_{s_{t+1}} P(u | s_{t+1}) P(s_{t+1} | O)
The advantages of our model: unification and power of prediction.
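The three applications share one scoring rule, which the sketch below makes concrete: a candidate query or URL x is scored by adding up P(x | s) P(s | O) over the candidate hidden states s. Summing over states is my reading of the products on the slide, and the function name `rank_candidates`, the state names, the example hostnames and all probability values are invented for illustration.

```python
def rank_candidates(emission, state_posterior, k):
    """Score each candidate x by sum_s P(x | s) * P(s | O); return the top k."""
    scores = {}
    for state, p_state in state_posterior.items():
        for item, p_item in emission.get(state, {}).items():
            scores[item] = scores.get(item, 0.0) + p_item * p_state
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)[:k]


# Hypothetical P(u | s) tables for two hidden states.
url_emission = {
    "cars": {"review.example": 0.75, "dealer.example": 0.25},
    "insurance": {"insurer.example": 1.0},
}
# Hypothetical P(s_{t+1} | O) after a series of car queries.
posterior = {"cars": 0.25, "insurance": 0.75}

top = rank_candidates(url_emission, posterior, k=2)
print(top)  # [('insurer.example', 0.75), ('review.example', 0.1875)]
```

Passing a query emission table instead of a URL table gives query suggestion, and using the posterior over s_t rather than s_{t+1} gives re-ranking, which is the unification the slide refers to.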

Experiments
A large-scale search log from Live Search:
– Web searches in English from the US market
Training data:
– 1,812,563,301 search queries
– 2,554,683,191 clicks
– 840,356,624 sessions
– 151,869,102 unique queries
– 114,882,486 unique URLs
Test data:
– 100,000 sessions extracted from another search log

Coverage
For each test session, the vlHMM processes each query q_i in turn. When i > 1, the preceding queries and clicks of the session are used as context.
The total coverage is 58.3%.
Denote the set of test cases without context as Test0 and the others as Test1. For the covered cases in Test1, 25.5% of the contexts are recognized.

Re-ranking
Baseline1:
– Boost the URLs with high click counts for the given query.
Evaluation:
– Sample 500 re-ranked URL pairs from Test0 and 500 from the cases in Test1 whose contexts are recognized.
– Each re-ranked URL pair is judged as Improved, Degraded or Unsure by 3 experts.

The effectiveness of re-ranking by the vlHMM and Baseline1 on (a) Test0 and (b) Test1.

Examples of Re-ranking
[Figure annotations:]
– Search for games
– Up the URL about games
– Visit the homepage of Ask Jeeves
– Up the URL which introduces the history of Ask Jeeves

URL Recommendation
Baseline2:
– Recommend the URLs with high click counts following the current query.
Evaluation:
– "Leave-one-out" method: given a test session, we use q_{T-1} as the test query and consider U_T as the ground truth.
– Suppose the set of recommended URLs is R; the precision is |R ∩ U_T| / |R| and the recall is |R ∩ U_T| / |U_T|.
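The two measures can be computed directly from the set of recommended URLs R and the set of clicked URLs U_T. The sketch below is a minimal illustration; the function name `precision_recall`, the example hostnames, and the choice to return 0.0 for an empty set are assumptions made here.

```python
def precision_recall(recommended, clicked):
    """Return (|R ∩ U_T| / |R|, |R ∩ U_T| / |U_T|) for two URL collections."""
    recommended, clicked = set(recommended), set(clicked)
    overlap = len(recommended & clicked)
    # Assumption: an empty set yields 0.0 instead of a division error.
    precision = overlap / len(recommended) if recommended else 0.0
    recall = overlap / len(clicked) if clicked else 0.0
    return precision, recall


# R has four URLs, U_T has two, and they share one URL.
p, r = precision_recall(
    ["a.example", "b.example", "c.example", "d.example"],
    ["a.example", "e.example"],
)
print(p, r)  # 0.25 0.5
```

Recommending more URLs can only keep or raise recall while it usually lowers precision, so the two are best read together.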

The precision and recall of the URLs recommended by the vlHMM and Baseline2.

An Example of URL Recommendation
[Figure annotations:]
– Search for online stores about electronics
– Online store about equipment
– Online store about electronics

Query Suggestion
Baseline:
– CACB, a context-aware concept-based approach to query suggestion.
Evaluation:
– The results of the two approaches are comparable since both consider contexts.
– However, the vlHMM increases the ratio of recognized contexts by 55%.

Summary
– We propose a general approach to context-aware search by learning a vlHMM from log data.
– We tackle the challenges of learning a large vlHMM with millions of states from hundreds of millions of search sessions.
– The experimental results on a large real data set clearly show that our context-aware approach is both effective and efficient.

Our recent work on context-aware search:
– Huanhuan Cao, Derek Hao Hu, Dou Shen, Daxin Jiang, Jian-Tao Sun, Enhong Chen and Qiang Yang. Context-aware query classification. To appear in SIGIR09.
– Huanhuan Cao, Daxin Jiang, Jian Pei, Enhong Chen and Hang Li. Towards context-aware search by learning a very large variable length Hidden Markov Model from search logs. To appear in WWW09.
– Huanhuan Cao, Daxin Jiang, Jian Pei, Qi He, Zhen Liao, Enhong Chen and Hang Li. Context-aware query suggestion by mining click-through and session data. KDD08. (This paper won the Best Application Paper Award of KDD08.)

Thanks