Learning Transportation Mode from Raw GPS Data for Geographic Applications on the Web
Yu Zheng, Like Liu, Xing Xie
Microsoft Research Asia
Outline: Introduction, Framework, Methodology, Experiment, Conclusion & future work
Background
Percentage of GPS-enabled handsets among mobile phones (Gartner Dataquest forecast: GPS-enabled devices)
Introduction
What we do: infer transportation modes from users’ GPS logs
(Figure labels: GPS log, Infer model)
Introduction
– Motivation
    Differentiate GPS trajectories of different transportation modes
    Learning knowledge from raw GPS data
      – Enable people to absorb more knowledge from others’ life experience
      – Trigger people’s memories of their past
      – Understand people’s life patterns
    Understanding user behavior
      – Context-aware computing
      – Modeling traffic conditions
      – Discovering social patterns
      – …
– Difficulty
    A trajectory may contain two or more transportation modes
    A purely velocity-based method may fail under traffic congestion
Introduction
Distribution of mean velocity (m/s) of different transportation modes
Distribution of maximum velocity (m/s) of different transportation modes
Introduction
– Contributions: we propose
    A change point-based segmentation method
    An inference model based on supervised learning
    A post-processing algorithm based on conditional probability
– Significance
    A step toward mining knowledge from raw GPS data for geographic applications on the Web
    A step toward understanding user behavior based on GPS data
– Evaluation results
    Large-scale data collected by 45 people over a period of 6 months
    Almost 70 percent accuracy
Framework Preliminary
Framework Inference strategy
Framework: Post-Processing
Segment[i].P(Bike) = Segment[i].P(Bike) * P(Bike|Car)
Segment[i].P(Walk) = Segment[i].P(Walk) * P(Walk|Car)
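The update rule on this slide can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation. It assumes that each segment carries a probability per mode from the inference model, that the rule is applied only when the previous segment's top mode is at least as likely as a hypothetical `threshold`, and that `TRANSITION[prev][mode]` holds P(mode | prev), filled in here with the values from the transition-matrix slide. Modes with no matrix entry (the "/" diagonal) are left unscaled, which is also an assumption.

```python
# Assumed reading of the transition matrix: TRANSITION[prev][mode] = P(mode | prev).
TRANSITION = {
    "Walk": {"Car": 0.534, "Bus": 0.328, "Bike": 0.138},
    "Car": {"Walk": 0.954, "Bus": 0.028, "Bike": 0.018},
    "Bus": {"Walk": 0.952, "Car": 0.032, "Bike": 0.016},
    "Bike": {"Walk": 0.983, "Car": 0.017, "Bus": 0.0},
}


def post_process(segments, threshold=0.9):
    """Rescale each segment's mode probabilities by P(mode | previous mode).

    segments: list of dicts mapping mode name -> probability from the classifier.
    threshold: hypothetical confidence the previous segment must reach
               before its label is trusted enough to rescale its successor.
    Returns the final mode label of every segment.
    """
    updated = [dict(s) for s in segments]  # copy so the inputs stay untouched
    for i in range(1, len(segments)):
        prev = segments[i - 1]
        prev_mode = max(prev, key=prev.get)
        if prev[prev_mode] < threshold:
            continue  # previous label too uncertain to use as evidence
        for mode in updated[i]:
            # Modes without an entry (the "/" diagonal) are left unchanged.
            if mode in TRANSITION[prev_mode]:
                updated[i][mode] *= TRANSITION[prev_mode][mode]
    return [max(s, key=s.get) for s in updated]


segments = [
    {"Car": 0.95, "Walk": 0.03, "Bike": 0.02, "Bus": 0.0},
    {"Walk": 0.40, "Bike": 0.45, "Car": 0.10, "Bus": 0.05},
]
labels = post_process(segments)
```

In this toy input the second segment would be labeled Bike on its own, but a confident Car before it makes a direct Car-to-Bike switch unlikely, so the rescaled label becomes Walk.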
Framework CRF-Based Inference
Methodology
Commonsense knowledge from the real world
– Typically, people need to walk before transferring transportation modes
– Typically, people need to stop and then go when transferring modes

Transition matrix of transportation modes

Transportation modes   Walk     Car      Bus      Bike
Walk                   /        53.4%    32.8%    13.8%
Car                    95.4%    /        2.8%     1.8%
Bus                    95.2%    3.2%     /        1.6%
Bike                   98.3%    1.7%     0%       /
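A matrix of this kind can be estimated by counting mode changes in labeled trips. The sketch below is one plausible way to do that, not a description of how the slide's numbers were produced. The function name `transition_matrix` and the input format (one list of mode labels per trip, in time order) are assumptions made for illustration.

```python
from collections import Counter, defaultdict


def transition_matrix(trips):
    """Estimate P(next mode | current mode) from labeled trips.

    trips: list of mode-label sequences, one per trip, in time order.
    Only changes of mode are counted, so the diagonal stays undefined,
    matching the "/" entries in the table.
    """
    counts = defaultdict(Counter)
    for trip in trips:
        for current, nxt in zip(trip, trip[1:]):
            if current != nxt:
                counts[current][nxt] += 1
    return {
        current: {nxt: c / sum(row.values()) for nxt, c in row.items()}
        for current, row in counts.items()
    }


trips = [
    ["Walk", "Car", "Walk", "Bus"],
    ["Walk", "Car", "Walk"],
]
matrix = transition_matrix(trips)
```

Each row of the result sums to 1, as the rows of the table sum to 100%.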
Methodology
Change point-based segmentation algorithm
– Step 1: Distinguish all possible Walk Points from non-Walk Points.
– Step 2: Merge short segments composed of consecutive Walk Points or non-Walk Points.
– Step 3: Merge consecutive Uncertain Segments into a non-Walk Segment.
– Step 4: The end points of each Walk Segment are potential change points.
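The four steps above can be sketched as follows. This is a simplified illustration under several assumptions that the slide does not state: Walk Points are detected with hypothetical velocity and acceleration bounds (`v_max`, `a_max`), segment size is measured in number of points rather than distance or time, short segments are merged into the preceding segment, and all threshold values are placeholders.

```python
from itertools import groupby


def label_walk_points(speeds, accels, v_max=1.8, a_max=0.6):
    """Step 1: True marks a possible Walk Point, False a non-Walk Point."""
    return [v < v_max and abs(a) < a_max for v, a in zip(speeds, accels)]


def to_runs(labels):
    """Collapse consecutive equal labels into [is_walk, start, end) runs."""
    runs, i = [], 0
    for label, group in groupby(labels):
        n = len(list(group))
        runs.append([label, i, i + n])
        i += n
    return runs


def merge_short_runs(runs, min_points=3):
    """Step 2: absorb runs shorter than min_points into the preceding run."""
    merged = []
    for label, start, end in runs:
        if merged and (end - start < min_points or merged[-1][0] == label):
            merged[-1][2] = end  # extend the preceding run
        else:
            merged.append([label, start, end])
    return merged


def merge_uncertain_runs(runs, certain_points=6, max_uncertain=3):
    """Step 3: max_uncertain or more consecutive uncertain runs
    (shorter than certain_points) become a single non-Walk run."""
    out, i = [], 0
    while i < len(runs):
        j = i
        while j < len(runs) and runs[j][2] - runs[j][1] < certain_points:
            j += 1
        if j - i >= max_uncertain:
            out.append([False, runs[i][1], runs[j - 1][2]])
            i = j
        else:
            out.append(list(runs[i]))
            i += 1
    return out


def change_points(runs):
    """Step 4: first and last point index of every Walk run."""
    points = []
    for is_walk, start, end in runs:
        if is_walk:
            points.extend([start, end - 1])
    return points


def segment(speeds, accels):
    labels = label_walk_points(speeds, accels)
    runs = merge_uncertain_runs(merge_short_runs(to_runs(labels)))
    return change_points(runs)


# Car (with one slow point in a jam), then a walk, then a bus ride.
speeds = [12.0] * 4 + [1.0] + [12.0] * 5 + [1.2] * 8 + [9.0] * 10
accels = [0.0] * len(speeds)
candidates = segment(speeds, accels)
```

The single slow point inside the car ride is absorbed in step 2, so only the two ends of the real walk remain as candidate change points.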
Experiment
Framework of experiment
Feature extraction
– Length
– Mean velocity
– Expectation of velocity
– Variance of velocity
– Top three velocities
– Top three accelerations
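The feature list above could be computed per segment roughly as follows. This is a sketch under stated assumptions: "mean velocity" is taken to be segment length divided by segment duration, "expectation of velocity" is taken to be the average of the point-to-point velocities, and timestamps are strictly increasing. The function name and input layout are hypothetical.

```python
from statistics import mean, pvariance


def segment_features(distances, timestamps):
    """Compute the listed features for one segment.

    distances: metres between consecutive GPS points (n - 1 values).
    timestamps: seconds for each GPS point (n values, strictly increasing).
    """
    dts = [t1 - t0 for t0, t1 in zip(timestamps, timestamps[1:])]
    speeds = [d / dt for d, dt in zip(distances, dts)]
    # Acceleration between consecutive speed samples.
    accels = [(v1 - v0) / dt for v0, v1, dt in zip(speeds, speeds[1:], dts[1:])]
    length = sum(distances)
    duration = timestamps[-1] - timestamps[0]
    return {
        "length": length,
        "mean_velocity": length / duration,        # assumed: length / duration
        "expectation_of_velocity": mean(speeds),   # assumed: average point velocity
        "variance_of_velocity": pvariance(speeds),
        "top3_velocities": sorted(speeds, reverse=True)[:3],
        "top3_accelerations": sorted(accels, reverse=True)[:3],
    }


features = segment_features([2.0, 4.0, 6.0, 8.0], [0.0, 2.0, 4.0, 6.0, 8.0])
```

With unevenly spaced fixes the two velocity averages differ, which is why both can be useful as separate features.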
Experiment Devices Data
Experiment
Evaluation method
– Precision of inferring a segment
    Accuracy by Length
    Accuracy by Duration
– Change point
    Precision of change point
    Recall of change point
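These metrics can be illustrated with a short sketch. The definitions used here are assumptions: accuracy by length (or duration) is the share of total length (or duration) covered by correctly labeled segments, and a change point counts as matched when a counterpart lies within a hypothetical tolerance `tol`, measured in whatever unit the positions use.

```python
def weighted_accuracy(true_modes, pred_modes, weights):
    """Share of total weight (length or duration) with a correct label."""
    correct = sum(w for t, p, w in zip(true_modes, pred_modes, weights) if t == p)
    return correct / sum(weights)


def change_point_precision_recall(true_cps, pred_cps, tol):
    """Precision and recall of detected change points.

    true_cps, pred_cps: positions along the trajectory (same unit as tol).
    """
    hit_pred = sum(any(abs(p - t) <= tol for t in true_cps) for p in pred_cps)
    hit_true = sum(any(abs(p - t) <= tol for p in pred_cps) for t in true_cps)
    precision = hit_pred / len(pred_cps) if pred_cps else 0.0
    recall = hit_true / len(true_cps) if true_cps else 0.0
    return precision, recall


truth = ["Walk", "Car", "Walk"]
guess = ["Walk", "Bus", "Walk"]
acc_by_length = weighted_accuracy(truth, guess, [200, 5000, 300])    # metres
acc_by_duration = weighted_accuracy(truth, guess, [180, 600, 240])   # seconds
precision, recall = change_point_precision_recall(
    [100, 900], [110, 500, 905, 1500], tol=20
)
```

In this toy case the one wrong segment is long but fairly quick, so accuracy by length is much lower than accuracy by duration, which shows why the two are reported separately.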
Experiment: Result
Inference performance: accuracy of inferring transportation modes with the change point-based segmentation method
Experiment
Inference performance of change point
Recall of change point using the change point-based segmentation method
Precision of change point using the change point-based segmentation method
Experiment: Result
Comparison of different segmentation methods using Decision Tree
Segmentation methods: change point, uniform duration (120 s), uniform length (100 m)
Metrics: Accuracy by Length, Accuracy by Duration, Recall of change point, Precision of change point
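The two uniform baselines named on this slide can be sketched with one helper that cuts a trajectory every fixed step, either in elapsed time or in travelled distance. This is an illustrative assumption about how such baselines work, not the authors' code. The function name is hypothetical and the input is assumed to be non-decreasing.

```python
from itertools import groupby


def uniform_segments(values, step):
    """Group point indices into buckets of fixed width.

    values: non-decreasing timestamps (s) or cumulative distances (m).
    step: bucket width, e.g. 120 for seconds or 100 for metres.
    """
    start = values[0]
    return [
        [i for i, _ in group]
        for _, group in groupby(
            enumerate(values), key=lambda iv: int((iv[1] - start) // step)
        )
    ]


times = [0, 30, 60, 90, 120, 150, 240, 250]    # seconds
by_duration = uniform_segments(times, 120)

travelled = [0, 40, 90, 100, 160, 210]         # cumulative metres
by_length = uniform_segments(travelled, 100)
```

Unlike the change point-based method, these cuts ignore where the user actually switched modes, so a single segment may mix two modes.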
Experiment: Result
Comparison of inference results of CRF over different segmentation methods
Segmentation methods: change point, uniform duration (90 s), uniform length (150 m)
Metrics: Accuracy by Length, Accuracy by Duration, Recall of change point, Precision of change point
Conclusion
Segmentation methods: change point-based, uniform duration-based, uniform length-based
Inference methods: SVM, Bayesian Net, Decision Tree, CRF
Future work
– Identify more valuable features
– Location-constrained conditional probability
– Improve the prediction performance of the CRF-based approach
Thanks! Q&A Yu Microsoft