Knowledge Discovery and Delivery Lab (ISTI-CNR & Univ. Pisa), www-kdd.isti.cnr.it
Anna Monreale, Fabio Pinelli, Roberto Trasarti, Fosca Giannotti
A. Monreale, F. Pinelli, R. Trasarti, F. Giannotti. WhereNext: a Location Predictor on Trajectory Pattern Mining. KDD 2009

Wireless networks as mobility data collectors
- Wireless network infrastructures are the nerves of our territory.
- Besides offering their services, they gather highly informative traces of human mobile activities.
- Miniaturization, wearability and pervasiveness will produce traces of increasing positioning accuracy and semantic richness.

- By analysing the traces of our mobile phones, it is possible to reconstruct our mobile behaviour, the way we collectively move.
- This knowledge may help us improve decision-making in many mobility-related issues:
  - planning traffic and public mobility systems in metropolitan areas;
  - planning physical communication networks;
  - forecasting traffic-related phenomena;
  - organizing logistics systems;
  - prediction.

Predicting the next location of a trajectory can improve a large set of services, such as:
- Navigational services.
- Traffic management.
- Location-based advertising.
- Service pre-fetching.
- Simulation.

How to realize this idea:
- Extract patterns from all the available movements in a certain area, rather than from the individual history of an object.
- Use these local movement patterns as predictive rules.
- Build a prediction tree as a global model.
Trajectory dataset -> Local patterns -> Prediction Tree

- Select the set of interesting trajectories
- Extract T-Patterns (a set of local models)
- Merge T-Patterns (global model)
- Use the condensed model as predictor
- Validation
- Evaluation

The local pattern we use is the T-Pattern. It describes the common behavior of a group of users in space and time.
F. Giannotti, M. Nanni, F. Pinelli, and D. Pedreschi. Trajectory pattern mining. KDD 2007.

Generating all rules from each T-pattern and using them to build a classifier is too expensive.
[Figure: a T-Pattern over regions R1, R2, R3, R4 with transition times α1, α2, α3, and the rules derived from it]

To avoid rule generation, the T-Pattern set is organized as a prefix tree.
For each node v:
- Id identifies the node v;
- Region is a spatial component of the T-Pattern;
- Support is the support of the T-Pattern.
For each edge j:
- [a,b] corresponds to the time interval α_n of the T-Pattern.
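
The prefix-tree organization described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the names `Node`, `build_prefix_tree` and `count_nodes`, the string region labels, the dummy interval on the first edge, and the choice to store a support only on the node where a pattern ends are all assumptions made for the example.

```python
from dataclasses import dataclass, field
from itertools import count
from typing import Dict, Optional, Tuple

Interval = Tuple[float, float]  # [a, b] transition-time interval on an edge


@dataclass
class Node:
    node_id: int                  # Id: identifies the node
    region: Optional[str]         # Region: None only for the artificial root
    support: float = 0.0          # Support of the T-pattern ending at this node
    # Children are keyed by (interval on the incoming edge, child region).
    children: Dict[Tuple[Interval, str], "Node"] = field(default_factory=dict)


def build_prefix_tree(t_patterns):
    """t_patterns: iterable of (regions, intervals, support),
    with len(intervals) == len(regions) - 1."""
    ids = count()
    root = Node(next(ids), None)
    for regions, intervals, support in t_patterns:
        node = root
        # Assumption: the first region has no incoming transition,
        # so its edge from the root carries a dummy interval.
        edges = [(0.0, 0.0)] + [tuple(i) for i in intervals]
        for depth, (interval, region) in enumerate(zip(edges, regions)):
            key = (interval, region)
            child = node.children.get(key)
            if child is None:
                child = Node(next(ids), region)
                node.children[key] = child
            if depth == len(regions) - 1:
                child.support = support
            node = child
    return root


def count_nodes(node):
    return 1 + sum(count_nodes(c) for c in node.children.values())


# Hypothetical T-patterns: region sequence, transition intervals, support.
patterns = [
    (["A", "B", "C"], [(5, 10), (2, 4)], 0.3),
    (["A", "B"], [(5, 10)], 0.5),
    (["A", "D"], [(1, 3)], 0.2),
]
tree = build_prefix_tree(patterns)
```

Patterns that share a prefix (here A followed by B within [5,10]) share the same path, so no rule has to be generated explicitly; a prediction can be read from the children of the node where a trajectory's match ends.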

Three steps:
1. Search for the best match
2. Candidate generation
3. Make predictions
[Figure labels: Best Match, Prediction]
How to compute the Best Match?

The spatio-temporal distance is computed between the trajectory segment (bounded in time using the previous transition time) and the current node of the path.
- Case a: the trajectory segment intersects the region of the node.
- Case b: the enlarged trajectory segment intersects the region.
- Case c: the enlarged trajectory segment does not intersect the region.
Here th_t is the time tolerance window defined by the user.
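
The three cases can be illustrated with a small sketch of the spatial part only. The slide does not give the scoring formula, so several things here are assumptions: regions are axis-aligned rectangles, the segment is represented by its sample points (already restricted to the time window), the enlargement is a spatial tolerance `th_s`, and the score in case b decays linearly with distance. The function names are hypothetical.

```python
import math
from typing import Iterable, Tuple

Point = Tuple[float, float]
Rect = Tuple[float, float, float, float]  # (xmin, ymin, xmax, ymax)


def distance_to_rect(p: Point, r: Rect) -> float:
    """Euclidean distance from a point to a rectangle (0 if inside)."""
    dx = max(r[0] - p[0], 0.0, p[0] - r[2])
    dy = max(r[1] - p[1], 0.0, p[1] - r[3])
    return math.hypot(dx, dy)


def punctual_score(points: Iterable[Point], region: Rect, th_s: float) -> float:
    """Score of a time-bounded trajectory segment against one node's region.

    Assumed scoring, not taken from the slides:
      case a: some sample point lies in the region        -> 1.0
      case b: the closest point is within th_s of it      -> 1 - d / th_s
      case c: even the enlarged segment misses the region -> 0.0
    """
    d = min(distance_to_rect(p, region) for p in points)
    if d == 0.0:
        return 1.0
    if d <= th_s:
        return 1.0 - d / th_s
    return 0.0
```

With the region `(0, 0, 10, 10)`, a segment containing the point `(5, 5)` falls in case a, while a segment whose closest point is `(13, 14)` lies at distance 5 from the region and falls in case b or case c depending on `th_s`.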

The path score is the aggregation of all punctual scores along a path.
The Best Match is the path having:
- the maximum path score;
- at least one admissible prediction.
[Figure: example path labelled 10 min, 15 min, 8 min, 10 min, 11 min, 16 min, with punctual scores 1, 0.58 and 0.8 and path score 0.79]

- Average generalizes the distances between the trajectory and each node.
- Sum is based on the concept of depth.
- Max is the optimistic one: the best punctual score is selected as the path score.
- Context-dependent aggregations can take other aspects of the problem into consideration.
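
The three standard aggregations can be written in a few lines. This is only a sketch; the function name `path_score` and the lowercase keys are assumptions. The input uses the punctual scores 1, 0.58 and 0.8 from the example path, for which the average is consistent with the path score of 0.79 shown there.

```python
from statistics import mean

# Aggregation functions selectable through th_agg (Avg, Sum or Max).
AGGREGATIONS = {"avg": mean, "sum": sum, "max": max}


def path_score(punctual_scores, th_agg="avg"):
    """Aggregate the punctual scores collected along one path of the tree."""
    return AGGREGATIONS[th_agg](punctual_scores)


scores = [1.0, 0.58, 0.8]
avg_score = path_score(scores, "avg")   # (1 + 0.58 + 0.8) / 3, about 0.79
sum_score = path_score(scores, "sum")   # grows with the depth of the path
max_score = path_score(scores, "max")   # the single best punctual score
```

Sum rewards longer matches because every additional matched node adds to the score, whereas Avg and Max stay within the range of the individual punctual scores.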

The WhereNext algorithm can be tuned using its parameters:
- th_t: time window tolerance
- th_s: space window tolerance
- th_score: minimum prediction score threshold
- th_agg: the aggregation function used to compute the path score (Avg, Sum or Max)

It is hard to tell which set of T-patterns is best for building our model:
- a big set of T-patterns leads to very slow prediction;
- a small set of T-patterns leads to gaps in coverage.
For this reason we have defined a way to measure the predictive power of a T-Pattern set.

An evaluation function is defined to estimate the predictive power of a T-Pattern set.
- SpatialCoverage: the spatial coverage of the regions contained in the T-Pattern set;
- DatasetCoverage: how well the T-Pattern set represents the trajectories;
- RegionSeparation: the precision of the regions in the T-Pattern set.
[Figure: Model 1 and Model 2, testing the a priori evaluation]


The results are evaluated using the following measures:
- Accuracy: the number of correctly predicted locations (in space and time) divided by the total number of trajectories to be predicted.
- Average Error: the average distance between the real trajectories in the predicted interval and the predicted region.
- Prediction rate: the number of trajectories that have a prediction divided by the total number of trajectories to be predicted.
[Figure: original trajectory, cut point, predicted location and error]
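
Two of these measures follow directly from per-trajectory outcomes, as the sketch below shows. The encoding of an outcome as `None` (no prediction), `True` (correct) or `False` (wrong) and the function name `evaluate` are assumptions for illustration. Average Error is left out because it needs the geometry of the trajectories and regions.

```python
from typing import Dict, Optional, Sequence


def evaluate(outcomes: Sequence[Optional[bool]]) -> Dict[str, float]:
    """Compute accuracy and prediction rate over the test trajectories.

    outcomes[i] is None when no prediction was produced for trajectory i,
    True when the predicted location was correct, False when it was wrong.
    Both measures are divided by the total number of trajectories.
    """
    total = len(outcomes)
    if total == 0:
        raise ValueError("no trajectories to evaluate")
    predicted = sum(o is not None for o in outcomes)
    correct = sum(o is True for o in outcomes)
    return {
        "accuracy": correct / total,
        "prediction_rate": predicted / total,
    }


# Four hypothetical test trajectories: two correct, one wrong, one unpredicted.
result = evaluate([True, False, None, True])
```

Because both measures share the same denominator, accuracy can never exceed the prediction rate; raising th_score typically trades prediction rate for the reliability of the predictions that remain.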

We used a real-life GPS dataset obtained from 17,000 vehicles in the urban area of Milan.
Training set: 4000 trajectories between 7 am and 10 am on Wednesday.
Test set: 500 trajectories between 7 am and 10 am on Thursday.

[Plots: Predicted vs th_score; Average Error vs th_space]

[Plots: Accuracy vs Average Error; Single Users Accuracy and Prediction rate]

A visual example of the application on Milan mobility data. The context is traffic management: we want to predict how the traffic will move in the city center. We built a predictor on a “good” set of T-patterns that includes the city gates of Milan.
Part of the GeoPKDD integrated platform.
F. Giannotti, D. Pedreschi, et al. GeoPKDD: Geographic privacy-aware knowledge discovery and delivery (European project), 2008.

- A new technique to predict the next locations of a trajectory, based on the previous movements of all the objects and without using any information about the users.
- The time information is used not only to order the events: it is intrinsic to the T-Patterns used to build the prediction tree.
- The user can tune the method to obtain good accuracy and prediction rate.
- We are testing the method in real-world applications.

[Figure: Trajectories Dataset, Regions of Interest, T-Patterns]

T-Patterns for trajectories

- The exact same spatial location (x,y) almost never occurs twice.
- The exact same transition times almost never occur twice.
- Solution: allow approximation:
  - a notion of spatial neighborhood;
  - a notion of temporal tolerance.

- Two points match if one falls within a spatial neighborhood N() of the other.
- Two transition times match if their temporal difference is ≤ τ.
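
The two matching rules can be sketched as below. The slides leave the shape of N() open, so a circular neighborhood of a given `radius` is assumed here, and the function names are hypothetical.

```python
import math


def points_match(p, q, radius):
    """True if q falls within the (assumed circular) neighborhood N() of p."""
    return math.dist(p, q) <= radius


def transitions_match(t1, t2, tau):
    """True if two transition times differ by at most the tolerance tau."""
    return abs(t1 - t2) <= tau


def sequences_match(points_a, times_a, points_b, times_b, radius, tau):
    """True if two equally long point sequences match position by position,
    both in space and in the transition times between consecutive points."""
    if len(points_a) != len(points_b) or len(times_a) != len(times_b):
        return False
    return all(
        points_match(p, q, radius) for p, q in zip(points_a, points_b)
    ) and all(
        transitions_match(s, t, tau) for s, t in zip(times_a, times_b)
    )
```

For example, the points (0, 0) and (3, 4) are 5 units apart, so they match for any radius of at least 5; transition times of 10 and 12 minutes match when τ is at least 2.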

- T-pattern mining can be mapped to a density estimation problem over R^(3n-1):
  - 2 dimensions for each (x,y) in the pattern (2n);
  - 1 dimension for each transition (n-1).
- Density is computed by:
  - mapping each sub-sequence of n points of each input trajectory to R^(3n-1);
  - drawing an influence area for each point (composition of N() and τ).
- Too computationally expensive; heuristics are needed.
- Our solution: a combination of sequential pattern mining and density-based clustering.
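
The dimension count 2n + (n-1) = 3n-1 can be made concrete with a small sketch that maps a sub-sequence of n timestamped points to a vector. The function name `embed` and the ordering of the components (all coordinates first, then all transition times) are assumptions; only the dimensionality comes from the slide.

```python
def embed(points, times):
    """Map n timestamped points to a vector in R^(3n-1).

    points: n (x, y) pairs; times: the n corresponding timestamps.
    The result holds 2n coordinates followed by n-1 transition times.
    """
    if not points or len(points) != len(times):
        raise ValueError("need n >= 1 points with one timestamp each")
    coords = [c for p in points for c in p]                   # 2n values
    transitions = [b - a for a, b in zip(times, times[1:])]   # n-1 values
    return coords + transitions


# Three points (n = 3) give a vector with 3 * 3 - 1 = 8 components.
vector = embed([(0, 0), (1, 2), (3, 4)], [0, 5, 12])
```

Every length-n sub-sequence of every trajectory becomes one such point, which is why estimating density directly in this space quickly becomes too expensive as n and the dataset grow.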