Chen Cheng, Haiqin Yang, Irwin King and Michael R. Lyu

Presentation transcript:

Fused Matrix Factorization with Geographical and Social Influence in Location-based Social Networks
Chen Cheng (1), Haiqin Yang (1), Irwin King (1,2) and Michael R. Lyu (1)
(1) Department of Computer Science and Engineering, The Chinese University of Hong Kong; (2) AT&T Labs Research
ccheng@cse.cuhk.edu.hk
AAAI 2012, Toronto, Canada

Check-in becomes a lifestyle
In recent years, location-based social networks (LBSNs) such as Foursquare, Gowalla and Facebook Places have attracted millions of users. We can easily share our experiences of locations with our friends through apps on our mobile phones. For example, this figure shows that I have just checked in at the engineering building through the Foursquare app.

Check-in becomes a lifestyle
The number of Foursquare users now surpasses 20 million, corresponding to 2 billion check-ins [1], which is quite a large number. This figure shows the growth of Foursquare users up to Jan. 2011; the user base has grown very fast over the past few years.
[1] http://statspotting.com/2012/04/foursquare-statistics-20-million-users-2-billion-check-ins/

Graph illustration of Location-based Social Networks (LBSNs)
Figure labels: checked-in POI (lat, lng); friend link; check-in; candidate check-in (?).
Related tasks: community detection, link prediction, POI recommendation, next place prediction, travel sequence detection, trip recommendation.
This is a graph illustration of LBSNs. An LBSN has millions of users and points of interest (POIs). We can obtain social information between users, and for each POI we have its latitude and longitude, from which we can calculate the distance between two POIs. Users and POIs are connected through check-ins. The problem we want to focus on is this: given a new place, will the user be interested in this POI? Can we provide accurate POI recommendations for users in LBSNs?
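The distance between two POIs mentioned above can be computed from their latitude and longitude. A minimal sketch, assuming the haversine great-circle formula and a mean Earth radius of 6371 km; the talk does not say which distance formula was used, and the function name is illustrative.

```python
import math

EARTH_RADIUS_KM = 6371.0  # assumed mean Earth radius


def haversine_km(lat1, lng1, lat2, lng2):
    """Great-circle distance in km between two (lat, lng) points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lng2 - lng1)
    # Haversine term: squared half-chord length between the two points.
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))
```

For short distances inside one city a planar approximation would also work; the great-circle form is chosen here only because check-ins can span continents.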

Our focus: POI recommendation
Help users explore their surroundings: provide personalized travel recommendations.
Help third-party developers provide personalized services: advertisements, coupons, traffic statistics.
POI recommendation is an important task. First, it helps users explore new places and get to know their city better. For example, it helps when we want to find somewhere new to eat, and Foursquare already offers this kind of service. Second, it helps third-party developers provide personalized services. For example, if we know that a user is likely to check in at a restaurant, an advertiser can show that user an advertisement for the restaurant.

Challenges
Large dataset: crawled from Gowalla from Feb. 2009 to Sep. 2011; 4,128,714 check-ins from 53,944 users on 367,149 locations.
Only positive data is seen.
Sparsity: the density of our dataset is only 0.0208%.
There are several challenges for POI recommendation in LBSNs. First, the dataset is very large; recall that LBSNs have millions of users. Second, only positive data is seen: we can infer that a user likes a location from his check-ins, but we do not know which locations he dislikes. Third, the dataset is very sparse, which makes POI recommendation difficult.
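The density figure can be checked from the dataset sizes on this slide. This sketch assumes density means the number of check-ins divided by the number of possible user-location pairs; that reading is not stated in the talk, but it reproduces the quoted value.

```python
def density(num_checkins, num_users, num_locations):
    """Share of the user-by-location matrix that is filled (assumed definition)."""
    return num_checkins / (num_users * num_locations)


d = density(4_128_714, 53_944, 367_149)
print(f"{d:.4%}")  # prints 0.0208%
```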

POI recommendation in LBSNs
Matrix factorization can be a promising tool. However, geographical influence is ignored!
Matrix factorization is a promising tool because of its success in traditional recommender systems. However, it ignores geographical influence.

POI recommendation in LBSNs
Er… a little far…
For example, Foursquare recommends this restaurant to me, but it is a little far away. Because of the distance, we would sometimes rather choose a nearby place.

Multi-centers and normal distribution
Two centers (home and office) in [Cho et al. 2011]; several centers proposed in our paper.
We further explore users' check-in behavior and find that users tend to check in around several centers. [Cho et al. 2011] assume that there are only two centers, home and office. We found, however, that other centers account for at least 10% of all check-ins. These centers can be branches of large companies or airports.

Multi-centers and normal distribution
Similar to [Brockmann 2006, Gonzalez 2008], we assume that the check-ins around each center follow a normal distribution.
Many previous papers have used normal distributions to model human movement around a particular point. We adopt this approach and assume that the check-ins around each center follow a normal distribution.

Inverse distance rule
We also plot the check-in probability against the distance to the nearest center. We find that, although each user has his own taste in locations, the probability that he visits a location is inversely proportional to the distance between the location and its nearest center.

Social influence
On average, the overlap between a user's check-ins and his friends' check-ins is only about 9.6%.
Almost 90% of users have only 20% of their check-ins in common with their friends.
We plot the CCDF of the fraction of a user's check-in locations that were also visited by his friends. The small overlap indicates that social influence on POI recommendation in LBSNs is limited, which our experiments confirm.
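The overlap statistic can be sketched as follows. This assumes overlap means the fraction of a user's distinct check-in locations that at least one friend has also visited; the data layout and the function name are illustrative, not taken from the talk.

```python
def friend_overlap(user_locs, friends, user):
    """Fraction of `user`'s check-in locations also visited by at least one friend."""
    mine = user_locs.get(user, set())
    if not mine:
        return 0.0
    # Union of all locations visited by any of the user's friends.
    friends_locs = set()
    for friend in friends.get(user, ()):
        friends_locs |= user_locs.get(friend, set())
    return len(mine & friends_locs) / len(mine)


# Toy data: user "a" shares one of five locations with friends "b" and "c".
toy_locs = {"a": {1, 2, 3, 4, 5}, "b": {1, 9}, "c": {7}}
toy_friends = {"a": ["b", "c"]}
print(friend_overlap(toy_locs, toy_friends, "a"))  # prints 0.2
```

Averaging this value over all users gives a dataset-level overlap, and its distribution across users is what a CCDF plot summarises.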

Our proposal
Propose a Multi-center Gaussian Model (MGM) to capture geographical influence.
Propose a generalized fused matrix factorization framework that includes social and geographical influence.
Conduct thorough experiments on a large-scale Gowalla dataset.
These three contributions follow from the observations above: first the MGM, next the fused framework, and finally the experiments.

Multi-center Gaussian model
Recall that check-in locations are located around several centers.
The probability of a user visiting a location is inversely proportional to its distance from the nearest center.
MGM is proposed to model users' check-in behavior based on these two observations.

Multi-center Gaussian model
Notation:
C_u: the multi-center set for user u
f_{c_u}: the total check-in frequency at center c_u for user u
N(l | \mu_{c_u}, \Sigma_{c_u}): the pdf of the Gaussian distribution, where \mu_{c_u} and \Sigma_{c_u} denote the mean and covariance matrix of the region around center c_u
The probability of user u visiting a location l given C_u is defined as a sum over the centers in C_u of a product of three terms.
The first term is the probability that l belongs to center c_u. The second term is the normalized effect of the check-in frequency at center c_u: for a center with a large total frequency, such as home, the probability should be higher than for other centers. The third term is the normalized probability that l will be checked in by the user. The product of the three is the probability that l belongs to center c_u and is also visited by the user. Summing this over all centers gives the probability that user u will visit l.
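A minimal Python sketch of the three-factor sum described on this slide. The exact normalisations are assumptions chosen to match the spoken description: the membership factor is taken as 1 / (1 + distance to the center), the frequency factor as f^alpha divided by the sum of f^alpha over the user's centers (alpha is a hypothetical smoothing parameter), and the Gaussian factor as the pdf at l divided by the sum of the pdfs over the centers. Coordinates are treated as planar, and all names are illustrative.

```python
import math


def gaussian_pdf_2d(x, mean, cov):
    """Density of a 2-D Gaussian at x; cov is ((sxx, sxy), (sxy, syy))."""
    (sxx, sxy), (_, syy) = cov
    det = sxx * syy - sxy * sxy
    dx, dy = x[0] - mean[0], x[1] - mean[1]
    # Quadratic form (x - mean)^T cov^{-1} (x - mean) via the closed-form 2x2 inverse.
    quad = (syy * dx * dx - 2 * sxy * dx * dy + sxx * dy * dy) / det
    return math.exp(-0.5 * quad) / (2 * math.pi * math.sqrt(det))


def mgm_score(loc, centers, alpha=1.0):
    """Sum over centers of membership * frequency share * normalized Gaussian.

    centers: list of dicts with keys 'mean', 'cov' and 'freq' (assumed layout).
    """
    freq_norm = sum(c["freq"] ** alpha for c in centers)
    pdfs = [gaussian_pdf_2d(loc, c["mean"], c["cov"]) for c in centers]
    pdf_norm = sum(pdfs)
    if pdf_norm == 0.0 or freq_norm == 0.0:
        return 0.0
    score = 0.0
    for c, pdf in zip(centers, pdfs):
        dist = math.hypot(loc[0] - c["mean"][0], loc[1] - c["mean"][1])
        membership = 1.0 / (1.0 + dist)  # assumed form of "inversely proportional"
        score += membership * (c["freq"] ** alpha / freq_norm) * (pdf / pdf_norm)
    return score


identity = ((1.0, 0.0), (0.0, 1.0))
toy_centers = [
    {"mean": (0.0, 0.0), "cov": identity, "freq": 80},  # e.g. home
    {"mean": (5.0, 5.0), "cov": identity, "freq": 20},  # e.g. office
]
near = mgm_score((0.1, 0.1), toy_centers)
far = mgm_score((10.0, -10.0), toy_centers)
```

With these toy centers, a location close to the high-frequency center scores higher than a distant one, which is the behavior the slide describes.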

Multi-center discovering algorithm
A greedy clustering algorithm is proposed, motivated by the Pareto principle (top 20 locations cover about 80% check-ins).
Next, we need to find the centers for each user, for which we propose a greedy clustering algorithm. We find that the top 20 locations cover about 80% of check-ins, in line with the Pareto principle. We first rank all locations by check-in frequency and then scan them in that order to find centers. If a location does not yet belong to any center, we gather the locations within d km of it that have not been added to other centers. If the total frequency of this group is larger than a threshold, it forms a new center region.
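The greedy scan described above can be sketched as follows. The parameter names `d_km` and `min_share` are hypothetical, the threshold is assumed to be a share of the user's total check-ins, and locations in a rejected group are assumed to stay available for later seeds; none of these details is given in the talk.

```python
import math


def haversine_km(p, q):
    """Great-circle distance in km between two (lat, lng) pairs in degrees."""
    phi1, phi2 = math.radians(p[0]), math.radians(q[0])
    dphi = phi2 - phi1
    dlmb = math.radians(q[1] - p[1])
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * 6371.0 * math.asin(math.sqrt(a))


def discover_centers(checkins, d_km, min_share):
    """checkins: dict mapping (lat, lng) -> check-in count for one user."""
    total = sum(checkins.values())
    # Rank locations by check-in frequency, most visited first.
    ranked = sorted(checkins, key=checkins.get, reverse=True)
    assigned = set()
    centers = []
    for seed in ranked:
        if seed in assigned:
            continue
        # Gather unassigned locations within d_km of the seed (including the seed).
        region = [loc for loc in ranked
                  if loc not in assigned and haversine_km(seed, loc) <= d_km]
        freq = sum(checkins[loc] for loc in region)
        if total > 0 and freq / total >= min_share:
            assigned.update(region)
            centers.append({"seed": seed, "locations": region, "freq": freq})
    return centers


toy_checkins = {
    (22.420, 114.210): 50,
    (22.421, 114.211): 10,  # about 0.15 km from the first location
    (40.000, -74.000): 30,
    (10.000, 10.000): 1,    # too rare to form a center
}
toy_result = discover_centers(toy_checkins, d_km=5.0, min_share=0.1)
```

On the toy data this yields two centers with total frequencies 60 and 30; the single-visit location stays outside any center.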

Fused framework
Traditional matrix factorization (MF) models only users' preference for locations.
MGM models only geographical influence.
We can fuse both of them: the probability that user u visits location l combines a term that encodes user preference based on MF with a term calculated by MGM.
Traditional MF models only users' preference for locations, which we denote P(F_ul), and our proposed MGM models only geographical influence. In fact, the probability that a user visits a location is determined both by his personal taste for it and by the geographical constraint of whether it is close to his centers. So we can fuse them together to obtain the fused framework.
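One plausible way to fuse the two signals is to multiply the MF preference score by the MGM score and rank locations by the product. The product form is an assumption here, since the transcript only says the two are combined, and the function names and toy numbers are illustrative.

```python
def fused_scores(pref, geo):
    """Multiply per-location preference (MF) and geographical (MGM) scores.

    pref, geo: dicts mapping location id -> non-negative score (assumed inputs).
    """
    return {loc: pref[loc] * geo.get(loc, 0.0) for loc in pref}


def top_n(scores, n):
    """Return the n location ids with the highest scores."""
    return sorted(scores, key=scores.get, reverse=True)[:n]


toy_pref = {"far_favourite": 0.9, "near_ok": 0.6, "near_disliked": 0.1}
toy_geo = {"far_favourite": 0.05, "near_ok": 0.5, "near_disliked": 0.6}
print(top_n(fused_scores(toy_pref, toy_geo), 2))  # prints ['near_ok', 'near_disliked']
```

In the toy example the strongly preferred but distant place drops out of the top two, which mirrors the "a little far" restaurant example earlier in the talk.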

Setup and metric
Split the dataset into two non-overlapping sets: randomly select x% of each user's data as training data and use the remaining (100 - x)% as test data, with x set to 70 and 80.
The split is carried out 5 times independently, and we report the average.
POI recommendation: return the top-N POIs for each user and count how many locations in the test data are recovered.
Metric: the traditional precision and recall.
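A minimal sketch of the per-user split and the top-N metrics. It assumes the usual definitions, precision@N as recovered locations divided by N and recall@N as recovered locations divided by the number of held-out locations; the metric formulas on the slide did not survive in the transcript, and the function names are illustrative.

```python
import random


def split_user_locations(locations, train_fraction, rng):
    """Randomly split one user's locations into disjoint train and test sets."""
    shuffled = list(locations)
    rng.shuffle(shuffled)
    cut = int(round(len(shuffled) * train_fraction))
    return set(shuffled[:cut]), set(shuffled[cut:])


def precision_recall_at_n(recommended, held_out, n):
    """recommended: ranked list of location ids; held_out: set of test locations."""
    hits = sum(1 for loc in recommended[:n] if loc in held_out)
    precision = hits / n if n else 0.0
    recall = hits / len(held_out) if held_out else 0.0
    return precision, recall


train, test = split_user_locations(range(10), 0.7, random.Random(0))
p, r = precision_recall_at_n(["a", "b", "c", "d", "e"], {"b", "e", "z"}, 5)
```

Repeating the split with different seeds and averaging the metrics over users and runs corresponds to the "5 times independently" protocol on the slide.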

Comparison methods
MGM
PMF [Salakhutdinov and Mnih 2007]: assumes a Gaussian distribution on the observed data and a Gaussian prior on the latent feature vectors.
PMF with Social Regularization (PMFSR) [Ma et al. 2011b]: adds a social regularization term to PMF.
Probabilistic Factor Model (PFM) [Ma et al. 2011a]: models frequency data, with a Gamma prior on the latent feature vectors and a Poisson distribution on the frequency data.
Fused MF with MGM (FMFMGM): our proposed method.
Here is the list of comparison methods. The three after MGM are well-known matrix factorization methods that we introduced before, and the last is our fused method.

Results
Figure panels: precision and recall with 70% and 80% of the data used for training.
Here are the comparison results. First, MGM and our fused framework consistently outperform the other MF methods, which do not consider geographical influence. This indicates that geographical influence plays a significant role in POI recommendation. Second, our fused framework performs at least 50% better than MGM, which verifies that the probability of a user visiting a location is determined by both user preference and geographical influence. Last, PMFSR performs only a little better than PMF, which agrees with the conclusion that social influence is limited.

User check-in distribution
One challenge for POI recommendation is that it is difficult to provide recommendations for users with very few check-ins. To compare our methods thoroughly with the others, we group the users into 6 classes according to the number of check-in locations they have in the training dataset. This figure shows how users are distributed across the different ranges of check-in locations.

Performance on different users
These two figures show the results. We can see that we …

Conclusion
Extract characteristics of a large dataset crawled from Gowalla.
Propose a novel Multi-center Gaussian Model (MGM) to model geographical influence.
Propose a fused MF framework which outperforms state-of-the-art methods.

Future work
To better model one-class frequency data.
To include other information: location category, activity, etc.
To incorporate temporal effects.

Thanks
Q&A
Chen Cheng, ccheng@cse.cuhk.edu.hk