Exploring the Query-Flow Graph with a Mixture Model for Query Recommendation Lu Bai, Jiafeng Guo, Xueqi Cheng, Xiubo Geng, Pan Du Institute of Computing Technology, CAS
Introduction Query recommendation – Generated from web query logs – Different types of information are considered, including search results, clickthrough data, and search sessions.
Introduction Recently, query-flow graph was introduced into query recommendation. [Figure: example query-flow graphs with weighted edges over queries such as 360, Xbox 360, Kinect, Xbox 720, Yahoo, Yahoo mail, Yahoo messenger, apple, and apple tree.]
Introduction Traditionally, personalized random walk over the query-flow graph was used for recommendation. Dangling queries – No out-links – Nearly 9% of all queries. Ambiguous queries – Mixed recommendation: hard to read – Dominant recommendation: cannot satisfy different needs. [Figure: random-walk recommendations for Query = 360 (Xbox 360, Xbox 720, Kinect) and Query = apple (Yahoo, apple tree, Yahoo mail).]
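To make the baseline and its dangling-query problem concrete, here is a minimal sketch of a personalized random walk (random walk with restart) in plain Python. The function name, the restart probability `alpha = 0.15`, the toy edge weights, and the rule that a dangling query sends its mass back to the start query are all assumptions for illustration; the slides do not give these details.

```python
def personalized_random_walk(weights, start, alpha=0.15, iters=100):
    """Random walk with restart on a weighted directed graph.

    weights: dict mapping query -> {next_query: edge weight}
    start:   query we want recommendations for
    alpha:   restart probability (assumed value, not from the slides)
    """
    nodes = set(weights) | {v for nbrs in weights.values() for v in nbrs} | {start}
    score = {q: (1.0 if q == start else 0.0) for q in nodes}
    for _ in range(iters):
        nxt = {q: 0.0 for q in nodes}
        for q, mass in score.items():
            # With probability alpha the walker jumps back to the start query.
            nxt[start] += alpha * mass
            out = weights.get(q, {})
            total = sum(out.values())
            if total == 0:
                # Dangling query: no out-links, so the remaining mass also
                # returns to the start (one common convention, assumed here).
                nxt[start] += (1 - alpha) * mass
            else:
                for v, w in out.items():
                    nxt[v] += (1 - alpha) * mass * w / total
        score = nxt
    return score


# Toy graph with made-up weights, loosely based on the queries in the figure.
toy = {
    "360": {"xbox 360": 2, "yahoo 360": 1},
    "xbox 360": {"kinect": 1, "xbox 720": 1},
}
scores = personalized_random_walk(toy, "360")
ranked = sorted((q for q in scores if q != "360"), key=scores.get, reverse=True)
```

Starting the same walk from a dangling query such as `"kinect"` leaves all of the probability mass on the start query, so the baseline has nothing to recommend in that case.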
Our Work Explore query-flow graph for better recommendation – Apply a novel mixture model over query-flow graph to learn the intents of queries. – Perform an intent-biased random walk on the query-flow graph for recommendation.
Probabilistic model of generating query-flow graph Model the generation of the query-flow graph with a novel mixture model Assumptions – Queries are triggered by query intents. – Consecutive queries in one search session are from the same intent.
Probabilistic model of generating query-flow graph Process of generating a directed edge – Draw an intent indicator from the multinomial distribution. – Draw the two query nodes, each from the multinomial query distribution of that intent. – Draw the directed edge from a binomial distribution. Likelihood function: [equation]
Probabilistic model of generating query-flow graph EM algorithm is used to estimate parameters – E step – M step
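The E-step and M-step formulas did not survive in this transcript, so the following is only a sketch of EM on a simplified version of the model. It assumes each weighted edge (i, j) is generated with probability sum_r pi[r] * theta[r][i] * theta[r][j] and drops the binomial edge term entirely. The function name, the random initialization, and the smoothing constant `eps` are assumptions, not the authors' settings.

```python
import random


def em_edge_mixture(edges, num_intents, iters=50, seed=0, eps=1e-9):
    """EM for a simplified edge mixture (assumed model, see lead-in).

    edges: list of (query_i, query_j, weight) tuples
    Returns (pi, theta): pi[r] is the intent prior, theta[r][q] the
    probability of query q under intent r.
    """
    rng = random.Random(seed)
    queries = sorted({q for i, j, _ in edges for q in (i, j)})
    pi = [1.0 / num_intents] * num_intents
    theta = []
    for _ in range(num_intents):
        # Random initialization breaks the symmetry between intents.
        raw = {q: rng.random() + 0.1 for q in queries}
        z = sum(raw.values())
        theta.append({q: v / z for q, v in raw.items()})

    for _ in range(iters):
        pi_acc = [0.0] * num_intents
        theta_acc = [{q: 0.0 for q in queries} for _ in range(num_intents)]
        for i, j, w in edges:
            # E step: posterior probability of each intent for this edge.
            joint = [pi[r] * theta[r][i] * theta[r][j] for r in range(num_intents)]
            z = sum(joint)
            for r in range(num_intents):
                gamma = joint[r] / z
                pi_acc[r] += w * gamma
                theta_acc[r][i] += w * gamma
                theta_acc[r][j] += w * gamma
        # M step: renormalize expected counts (eps avoids exact zeros).
        pi_total = sum(pi_acc) + eps * num_intents
        pi = [(c + eps) / pi_total for c in pi_acc]
        theta = []
        for acc in theta_acc:
            z = sum(acc.values()) + eps * len(queries)
            theta.append({q: (c + eps) / z for q, c in acc.items()})
    return pi, theta


# Toy edges with made-up weights: one lyrics-like and one cars-like cluster.
toy_edges = [
    ("lyrics", "song lyrics", 3), ("song lyrics", "az lyrics", 2),
    ("bmw", "audi", 3), ("audi", "lexus", 2),
]
pi, theta = em_edge_mixture(toy_edges, num_intents=2)
```

Because both endpoints of an edge share one intent, queries that co-occur in sessions tend to receive high probability under the same `theta[r]`, which is the behavior the model's assumptions ask for.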
Intent-biased random walk Based on the learned query intents, we apply intent-biased random walk for query recommendation. – Dangling queries: back off to their intents – Ambiguous queries: recommend under each intent. [Equation legend: row vector of the query distribution of intent r; transition probability matrix; preference vector (all entries are zero except the i-th, which is 1); row-normalized weight matrix.]
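The transition formula itself is not preserved in this transcript, so the sketch below shows only one plausible way to bias the walk by an intent. It assumes that each step mixes the row-normalized out-link weights with the intent's query distribution `theta_r` using a hypothetical weight `lam`, and that a dangling query backs off fully to `theta_r`. The names `intent_biased_walk`, `alpha`, and `lam` are illustrative assumptions.

```python
def intent_biased_walk(weights, theta_r, start, alpha=0.15, lam=0.3, iters=100):
    """Random walk with restart whose transitions are biased toward one intent.

    weights: dict mapping query -> {next_query: edge weight}
    theta_r: dict mapping query -> probability under intent r (sums to 1)
    start:   query we want recommendations for
    alpha:   restart probability (assumed value)
    lam:     share of each step that follows theta_r instead of out-links
             (assumed mixing form, not taken from the slides)
    """
    nodes = set(weights) | {v for n in weights.values() for v in n}
    nodes |= set(theta_r) | {start}
    score = {q: (1.0 if q == start else 0.0) for q in nodes}
    for _ in range(iters):
        nxt = {q: 0.0 for q in nodes}
        for q, mass in score.items():
            nxt[start] += alpha * mass
            out = weights.get(q, {})
            total = sum(out.values())
            # Dangling queries have no out-links, so they back off fully
            # to the intent's query distribution.
            mix = lam if total > 0 else 1.0
            for v, w in out.items():
                nxt[v] += (1 - alpha) * (1 - mix) * mass * w / total
            for v, p in theta_r.items():
                nxt[v] += (1 - alpha) * mix * mass * p
        score = nxt
    return score


# Made-up intent distributions for an ambiguous query.
graph = {"hilton": {"marriott": 1, "paris hilton": 1}}
hotel = {"hilton": 0.2, "marriott": 0.7, "paris hilton": 0.1}
celebrity = {"hilton": 0.2, "marriott": 0.1, "paris hilton": 0.7}
per_intent = {
    "hotel": intent_biased_walk(graph, hotel, "hilton"),
    "celebrity": intent_biased_walk(graph, celebrity, "hilton"),
}
```

Running the walk once per intent gives a separate ranked list for each sense of an ambiguous query, and a dangling start query still receives scores for other queries through `theta_r`.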
Experiments Data Set – A 3-month query log generated from a commercial search engine. – Sessions are split by 30 minutes. – No stemming and no stop-word removal. – The largest connected graph is extracted for experiments; it consists of 16,980 queries and 51,214 edges.
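As an illustration of the preprocessing, the sketch below builds a query-flow graph from a log. It assumes that "split by 30 minutes" means a session ends after a 30-minute gap between a user's consecutive queries, that an edge weight counts how often one query directly follows another within a session, and that repeated identical queries are skipped. The log layout `(user_id, unix_timestamp, query)` and the function name are hypothetical.

```python
from collections import defaultdict

SESSION_GAP_SECONDS = 30 * 60  # assumed reading of "split by 30 minutes"


def build_query_flow_graph(log):
    """Build edge weights from a query log.

    log: iterable of (user_id, unix_timestamp, query) tuples (assumed layout)
    Returns dict mapping query -> {next_query: count}.
    """
    by_user = defaultdict(list)
    for user, ts, query in log:
        by_user[user].append((ts, query))

    weights = defaultdict(lambda: defaultdict(int))
    for events in by_user.values():
        events.sort()  # order each user's queries by time
        for (t0, q0), (t1, q1) in zip(events, events[1:]):
            if t1 - t0 > SESSION_GAP_SECONDS:
                continue  # gap too long: q1 starts a new session
            if q0 != q1:  # skip immediate repeats of the same query
                weights[q0][q1] += 1
    return {q: dict(nbrs) for q, nbrs in weights.items()}


# Made-up log entries for two users.
toy_log = [
    ("u1", 0, "360"), ("u1", 60, "xbox 360"),
    ("u1", 5000, "yahoo"), ("u1", 5100, "yahoo mail"),
    ("u2", 10, "360"), ("u2", 40, "xbox 360"),
]
graph = build_query_flow_graph(toy_log)
```

Queries are compared as raw strings here, which matches the slide's note that no stemming or stop-word removal is applied.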
Experiments Learning performance for different numbers of intents.
Experiments Learned query intents:
– lyrics: lyrics, song lyrics, lyrics com, a z lyrics, music lyrics, azlyrics, lyric, az lyrics, rap lyrics, country lyrics
– cars: bmw, lexus, audi, toyota, acura, nissan, infiniti, mercedes benz, volvo, mercedes
– poems: poems, love poems, poetry, friendship poems, famous love poems, love quotes, sad poems, quotes, mother s day poems, mothers day poems
Experiments Dangling query suggestion, Query = yamaha motor
– Baseline: mapquest, american idol, yahoo mail, home depot, bank of america, target
– Ours: yamaha, honda, suzuki, kawasaki, yamaha motorcycles, yamaha motorcycle
Ambiguous query suggestion, Query = hilton
– Baseline: marriott, expedia, holiday inn, sheraton hotel, mapquest, hampton inn, sheraton, hilton com, hotels com, embassy suites, residence inn, choice hotels, marriot, hilton honors
– Ours [hotel]: marriott, hyatt, hampton inn, embassy suites, hotels com
– Ours [celebrity]: paris hilton, michelle wie, nicole richie, jessica simpson, pamela anderson, daniel dipiero, richard hatch
Experiments Performance improvement based on user click behaviors
– Average Hit Number: baseline method 4.09, our approach 4.21 (+2.9%)
– Average Hit Score: baseline method 0.598, our approach 0.652 (+9.0%)
– Average Score: baseline method 0.181, our approach 0.194 (+7.1%)
Conclusion and Future Work Conclusion – We explore the query-flow graph with a novel probabilistic mixture model for learning query intents. – An intent-biased random walk is introduced to integrate the learned intents for recommendation. Future work – Learn query intents with more auxiliary information: clicks, URLs, words, etc.