Learning Location Correlation From GPS Trajectories Yu Zheng Microsoft Research Asia March 16, 2010
Background
– Locations are correlated in the space of human behavior
– Correlated locations might not belong to the same business category (e.g., a cafe and a cinema)
– They might not be co-located (e.g., jewel shops A, B, and C that are far away from one another)
What We Do
– Mine the correlation between locations from GPS trajectories
– The correlation is the relation between locations in the space of human behavior
– Enable a location recommendation system
Challenges
The correlation between locations depends on:
– The sequence in which the locations are visited: Cor(A, B) ≠ Cor(B, A), e.g., because of one-way streets or accessibility; for a visiting sequence A → B → C → D, Cor(A, B) > Cor(A, C) > Cor(A, D)
– The travel experience (knowledge) of the users accessing these locations: a tourist could access locations at random, so CorExpert(A, B) > CorTourist(A, B)
Methodology
1. Modeling human location history
2. Inferring user experiences
3. Computing location correlation
4. Personalized location recommender
Solution – Step 1: Modeling human location history
– GPS logs P and GPS trajectories
– Stay points S = {s1, s2, …, sn}
  – A stay point stands for a geo-region where a user has stayed for a while
  – It carries a semantic meaning beyond a raw GPS point
– Location history: represented by a sequence of stay points with transition intervals
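The stay-point idea above can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: the function names, the (lat, lon, t) tuple layout, and the two thresholds (200 m and 20 minutes) are assumptions chosen for the example.

```python
import math

def haversine_m(p, q):
    """Great-circle distance in metres between two points (lat, lon, ...)."""
    lat1, lon1, lat2, lon2 = map(math.radians, (p[0], p[1], q[0], q[1]))
    a = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * 6371000.0 * math.asin(math.sqrt(a))

def detect_stay_points(points, dist_m=200.0, min_stay_s=1200.0):
    """Return stay points as (mean_lat, mean_lon, arrive_t, leave_t).

    points: list of (lat, lon, t_seconds), sorted by time.
    dist_m and min_stay_s are illustrative thresholds (assumptions).
    """
    stays = []
    i, n = 0, len(points)
    while i < n:
        # Extend j while consecutive points stay within dist_m of the anchor i.
        j = i + 1
        while j < n and haversine_m(points[i], points[j]) <= dist_m:
            j += 1
        group = points[i:j]
        if group[-1][2] - group[0][2] >= min_stay_s:
            # The user lingered long enough: summarise the group as one stay point.
            lat = sum(p[0] for p in group) / len(group)
            lon = sum(p[1] for p in group) / len(group)
            stays.append((lat, lon, group[0][2], group[-1][2]))
            i = j
        else:
            i += 1
    return stays
```

A location history is then the time-ordered list of these stay points; the gap between one stay's leave time and the next stay's arrival time gives the transition interval.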
1. Stay point detection
2. Hierarchical clustering
3. Graph building
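As a rough illustration of the graph-building step, the sketch below assumes that every stay point has already been assigned a cluster id by the hierarchical clustering step (not shown here). The function name and the input layout are hypothetical; only the idea of linking consecutive stay points and keeping their transition intervals comes from the slides.

```python
from collections import defaultdict

def build_location_graph(histories):
    """Build a directed graph over location clusters.

    histories: dict user -> list of (cluster_id, arrive_t, leave_t),
               sorted by time (cluster ids are assumed to be given).
    Returns: dict (from_cluster, to_cluster) -> list of transition
             intervals in seconds, one entry per observed transition.
    """
    edges = defaultdict(list)
    for stays in histories.values():
        # Pair each stay with the next one in the same user's history.
        for (a, _, leave_a), (b, arrive_b, _) in zip(stays, stays[1:]):
            if a != b:  # ignore consecutive stays inside the same cluster
                edges[(a, b)].append(arrive_b - leave_a)
    return dict(edges)
```

The number of intervals stored on an edge is the number of times that transition was observed, and the edge direction preserves the visiting order.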
Solution – Step 2: Inferring a user's experience
– Mutual reinforcement relationship
  – A user with rich travel knowledge is more likely to visit more interesting locations
  – An interesting location would be accessed by many users with rich travel knowledge
– A HITS-based inference model
  – Users are hub nodes
  – Locations are authority nodes
  – The topic is the geo-region
[Figure: The HITS-based inference model, with users as hub nodes and locations as authority nodes]
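The mutual reinforcement between users and locations can be illustrated with a plain HITS power iteration. The sketch below is an assumption-laden simplification: it treats the whole dataset as a single geo-region, uses visit counts as link weights, and the function name and input layout are made up for the example.

```python
import math

def hits_user_location(visits, iterations=50):
    """Run HITS on a bipartite user-location graph.

    visits: dict user -> dict location -> visit count (assumed link weight).
    Returns (hub, auth): user experience scores and location interest scores,
    each L2-normalised.
    """
    users = list(visits)
    locations = sorted({loc for v in visits.values() for loc in v})
    hub = {u: 1.0 for u in users}
    auth = {loc: 1.0 for loc in locations}

    def normalise(scores):
        norm = math.sqrt(sum(s * s for s in scores.values())) or 1.0
        return {k: s / norm for k, s in scores.items()}

    for _ in range(iterations):
        # A location is interesting if experienced users visit it.
        auth = normalise({loc: sum(visits[u].get(loc, 0) * hub[u] for u in users)
                          for loc in locations})
        # A user is experienced if they visit interesting locations.
        hub = normalise({u: sum(c * auth[loc] for loc, c in visits[u].items())
                         for u in users})
    return hub, auth
```

For example, with `visits = {"expert": {"A": 3, "B": 2, "C": 1}, "tourist": {"A": 1}}`, the hub score of `"expert"` ends up higher than that of `"tourist"`. Running the same iteration separately per geo-region would correspond to treating the region as the topic.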
Solution – Step 3: Mining the location correlation
– The correlation between two locations can be represented by the sum of the experiences of the users who visit them in sequence
– [Figure: example trips, Trip 1 to Trip 3]
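A small sketch may make the summation concrete. It adds each user's experience score to every ordered pair of locations in that user's trip. The decay factor 2^-(gap-1), which makes pairs farther apart in the sequence count less, is an assumption introduced here to reflect Cor(A, B) > Cor(A, C) > Cor(A, D); the function name and input layout are also hypothetical.

```python
from collections import defaultdict

def location_correlation(trips, experience):
    """Accumulate directed correlation scores between locations.

    trips: list of (user, [loc1, loc2, ...]) in visiting order.
    experience: dict user -> experience score (e.g., a HITS hub score).
    Returns: dict (loc_a, loc_b) -> correlation of visiting loc_b after loc_a.
    """
    cor = defaultdict(float)
    for user, seq in trips:
        for i in range(len(seq)):
            for j in range(i + 1, len(seq)):
                if seq[i] == seq[j]:
                    continue  # skip revisits of the same location
                gap = j - i
                # Assumed decay: adjacent visits get full weight, then halve per step.
                cor[(seq[i], seq[j])] += experience[user] * 2.0 ** -(gap - 1)
    return dict(cor)
```

Because only pairs in visiting order are counted, the result is asymmetric: a trip A → B contributes to Cor(A, B) but not to Cor(B, A).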
Personalized Recommendation
– Integrate the location correlation into a collaborative filtering (CF) model
– User-location matrix
– Slope One: an item-based CF model
– [Figure: the Slope One model compared with our method]
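For reference, the standard Weighted Slope One prediction over a user-location matrix can be sketched as follows. The slides do not spell out how the location correlation enters the model, so the optional `pair_weight` hook is purely a hypothetical placeholder showing one place where a correlation score could replace the co-rating count; it is not the authors' formulation.

```python
def slope_one_predict(ratings, user, target, pair_weight=None):
    """Predict `user`'s rating of `target` with Weighted Slope One.

    ratings: dict user -> dict item -> rating (the user-location matrix).
    pair_weight: optional callable (item_j, target) -> weight; hypothetical
                 hook, defaults to the number of users who rated both items.
    Returns the predicted rating, or None if no prediction is possible.
    """
    num = den = 0.0
    for j, r_uj in ratings[user].items():
        if j == target:
            continue
        # Rating differences (target - j) over users who rated both items.
        diffs = [r[target] - r[j] for r in ratings.values()
                 if target in r and j in r]
        if not diffs:
            continue
        dev = sum(diffs) / len(diffs)  # average deviation of target from j
        w = len(diffs) if pair_weight is None else pair_weight(j, target)
        num += (dev + r_uj) * w
        den += w
    return num / den if den else None
```

With `ratings = {"john": {"A": 5, "B": 3, "C": 2}, "mark": {"A": 3, "B": 4}, "lucy": {"B": 2, "C": 5}}`, calling `slope_one_predict(ratings, "lucy", "A")` gives 13/3, about 4.33.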
Experimental Settings
– 60 devices and 136 users
– From May 2007 to present
A large-scale GPS dataset (as of Feb. 18, 2009)
– 10+ million GPS points
– 260+ million kilometers
– 36 cities in China and a few cities in the USA, Korea, and Japan
Results: Effectiveness
– Performed a user study-based evaluation
– Metrics: NDCG and MAP
– Methods compared: ours, the Pearson correlation-based CF model, and the Weighted Slope One algorithm
– More effective than the Slope One-based method
– Same performance as the Pearson correlation-based CF
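The two metrics named above can be computed as in the sketch below. It uses one common variant of each definition, which is an assumption: DCG with a log2 rank discount on raw relevance grades, and average precision normalised by the number of relevant items that appear in the returned list. The function names are illustrative.

```python
import math

def dcg(relevances):
    """Discounted cumulative gain of a ranked list of relevance grades."""
    return sum(rel / math.log2(rank + 2) for rank, rel in enumerate(relevances))

def ndcg_at_k(relevances, k):
    """DCG of the top k divided by the DCG of the ideal ordering."""
    ideal = dcg(sorted(relevances, reverse=True)[:k])
    return dcg(relevances[:k]) / ideal if ideal > 0 else 0.0

def average_precision(flags):
    """Mean of precision@rank over the ranks where a relevant item appears.

    flags: ranked list of 0/1 relevance judgements for one user.
    """
    hits, total = 0, 0.0
    for rank, rel in enumerate(flags, start=1):
        if rel:
            hits += 1
            total += hits / rank
    return total / hits if hits else 0.0

def mean_average_precision(flag_lists):
    """MAP: the average of average_precision over all users' lists."""
    if not flag_lists:
        return 0.0
    return sum(average_precision(f) for f in flag_lists) / len(flag_lists)
```

In a user study, each participant's judgements of a recommended list would supply the relevance grades (for NDCG) or binary flags (for MAP).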
Results: Efficiency
– Faster than the Pearson correlation-based method
– Almost the same efficiency as Slope One
Conclusion
– Mined the correlation between locations in the space of human behavior, considering
  – Sequence property
  – User experience
– Built a personalized location recommender based on the correlation
– The recommender is more efficient than the Pearson correlation-based method and more effective than the Slope One-based approach
Thanks!