Trajectory Pattern Mining

Presentation transcript:

Trajectory Pattern Mining
Fosca Giannotti, Dino Pedreschi, Mirco Nanni, Fabio Pinelli
Chris Andrews, Georgia Institute of Technology, B.S. Computer Science, 5th Year Undergraduate

Concepts
- Analyze the trajectories of moving objects, e.g. A -(3 mins)-> B -(5 mins)-> C -(10 mins)-> D
- Trajectory Patterns: descriptions of frequent behavior in terms of both space and time
- Frequent Sequence Pattern (FSP)
- Determine whether a trajectory sequence matches any trajectory pattern in a given set
- Study different methods of preparing a Temporally Annotated Sequence (TAS) for data mining

Trajectory Patterns (T-Patterns)
- Trajectory: sequence of time-stamped locations
  S = { (x0, y0, t0), …, (xn, yn, tn) }
- Temporal Annotation: set of times relating to trajectories
  A = { a1, a2, …, an }
- Temporally Annotated Sequence
  (S, A) = (x0, y0) a1 (x1, y1) a2 … an (xn, yn)
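A minimal sketch of how a trajectory S could be turned into a temporally annotated sequence (S, A). It assumes that each annotation a_i is the transition time between consecutive points (t_i - t_(i-1)), which matches the "A 3mins B 5mins C 10mins D" example; the function name `to_tas` and the sample data are hypothetical.

```python
from typing import List, Tuple

Point = Tuple[float, float]
Sample = Tuple[float, float, float]  # (x, y, t)


def to_tas(trajectory: List[Sample]) -> Tuple[List[Point], List[float]]:
    """Split a trajectory into its points and its transition times.

    Assumption: annotation a_i is the time elapsed between point i-1 and point i.
    """
    if not trajectory:
        return [], []
    points = [(x, y) for x, y, _ in trajectory]
    times = [t for _, _, t in trajectory]
    # One annotation per consecutive pair of samples.
    annotations = [t1 - t0 for t0, t1 in zip(times, times[1:])]
    return points, annotations


# Hypothetical trajectory with times in minutes.
S = [(0.0, 0.0, 0.0), (1.0, 0.0, 3.0), (1.0, 2.0, 8.0), (4.0, 2.0, 18.0)]
points, annotations = to_tas(S)
print(points)
print(annotations)
```

A trajectory with n + 1 samples yields n annotations, which is why the sequence (S, A) interleaves one annotation between each pair of points.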

Neighborhood Function
- Neighborhood Function N : R^2 -> P(R^2)
- Calculates spatial containment of regions
- Takes a point as input and finds the enclosing Region of Interest
- Defines the proximity necessary to fall into a region
- Parameters: e, the radius or necessary proximity of points
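As an illustration only, the proximity test behind a neighborhood function can be sketched as a circular neighborhood of radius e around a point. The choice of Euclidean distance and the name `in_neighborhood` are assumptions; the slide only states that e is the required proximity.

```python
import math
from typing import Tuple

Point = Tuple[float, float]


def in_neighborhood(center: Point, point: Point, eps: float) -> bool:
    """Return True if `point` lies within distance `eps` of `center`.

    Assumption: the neighborhood is a Euclidean disc of radius eps.
    """
    return math.hypot(point[0] - center[0], point[1] - center[1]) <= eps


print(in_neighborhood((0.0, 0.0), (3.0, 4.0), 5.0))
print(in_neighborhood((0.0, 0.0), (3.0, 4.0), 4.9))
```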

Regions of Interest (RoI)
- Performing these comparisons on points is costly
- A simple preprocessing step can alleviate this
- Utilize the Neighborhood Function NR()
- Translate each set of points into regions
- The timestamp is taken from the moment the trajectory first entered the region
- Now compare sequences of regions and timestamps using the TAS mining algorithm presented in [2]

Static RoI
- Neighborhood Function NR() initially receives a set R of disjoint spatial regions
- The regions in R are predefined based on prior knowledge
- Each region represents a relevant place for processing
- A static NR() simplifies the problem of mining patterns
- Sequences of points become grouped
- Result: a sequence of regions, e.g. (x,y) a1 (x’,y’) becomes X a1 Y
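The translation from points to regions can be sketched as follows. This is a simplified illustration, assuming the predefined regions are axis-aligned rectangles and that a region is recorded each time the trajectory enters it, with the timestamp of the first point inside; the names `region_of` and `to_region_sequence` and the sample regions are hypothetical.

```python
from typing import Dict, List, Optional, Tuple

Rect = Tuple[float, float, float, float]  # (xmin, ymin, xmax, ymax)
Sample = Tuple[float, float, float]       # (x, y, t)


def region_of(x: float, y: float, regions: Dict[str, Rect]) -> Optional[str]:
    """Return the name of the region containing (x, y), or None."""
    for name, (xmin, ymin, xmax, ymax) in regions.items():
        if xmin <= x <= xmax and ymin <= y <= ymax:
            return name
    return None


def to_region_sequence(
    trajectory: List[Sample], regions: Dict[str, Rect]
) -> List[Tuple[str, float]]:
    """Translate a point trajectory into (region, entry_time) pairs."""
    sequence: List[Tuple[str, float]] = []
    current: Optional[str] = None
    for x, y, t in trajectory:
        name = region_of(x, y, regions)
        # Record a region only at the moment the trajectory enters it.
        if name is not None and name != current:
            sequence.append((name, t))
        current = name
    return sequence


# Hypothetical disjoint regions and trajectory.
regions = {"X": (0.0, 0.0, 2.0, 2.0), "Y": (5.0, 5.0, 7.0, 7.0)}
trajectory = [(1.0, 1.0, 0.0), (1.5, 1.0, 2.0), (3.0, 3.0, 4.0),
              (6.0, 6.0, 7.0), (6.5, 6.0, 9.0)]
print(to_region_sequence(trajectory, regions))
```

Points that fall outside every region are dropped, so several raw points collapse into a single region entry, which is what makes the subsequent sequence comparison cheaper.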

Dynamic RoI
- Data sets often do not come with predetermined regions
- Instead, regions need to be formulated from the density of the trajectories
- Preprocessing must now determine the set R of popular regions from the data set
- R is then the set of Regions of Interest used by the Neighborhood Function NR() to translate points into regions

Popular Regions
- Input: grid G of n x m cells, each cell with density G(i,j), and a density threshold d
- Output: set R of popular regions such that:
  - Each region in R forms a rectangular region
  - Sets in R are pairwise disjoint
  - Dense cells are always contained in some region in R
  - All regions in R have average density above d
  - No region in R can be expanded without its average density decreasing below d

Grid Density Preparation
- Split the space into an n x m grid with small cells
- Increment the cells that a trajectory passes through
- Neighborhood Function NR() determines which surrounding cells are also incremented
- Regression: increment cells continuously along the trajectory
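A minimal sketch of the grid preparation step, under simplifying assumptions: only the recorded sample points are counted (no regression along the path and no neighborhood spreading), each trajectory contributes at most once per cell, and cells are square with side `cell_size`. The function name `density_grid` and the sample data are hypothetical.

```python
from typing import List, Tuple

Sample = Tuple[float, float, float]  # (x, y, t)


def density_grid(
    trajectories: List[List[Sample]], n: int, m: int, cell_size: float
) -> List[List[int]]:
    """Count, for each of the n x m cells, how many trajectories touch it."""
    grid = [[0] * m for _ in range(n)]
    for trajectory in trajectories:
        visited = set()
        for x, y, _t in trajectory:
            i, j = int(x // cell_size), int(y // cell_size)
            # Ignore points that fall outside the grid.
            if 0 <= i < n and 0 <= j < m:
                visited.add((i, j))
        # Each trajectory increments a cell at most once.
        for i, j in visited:
            grid[i][j] += 1
    return grid


# Hypothetical trajectories on a 2 x 2 grid of unit cells.
t1 = [(0.5, 0.5, 0.0), (0.7, 0.6, 1.0), (1.5, 0.5, 2.0)]
t2 = [(0.2, 0.2, 0.0)]
print(density_grid([t1, t2], n=2, m=2, cell_size=1.0))
```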

Popular Regions Algorithm
- Algorithm: PopularRegions(G, d)
- Complexity: O(|G| log |G|)
- Iteratively consider each dense cell. For each:
  - Expand in all four directions
  - Select the expansion that maximizes density
  - Repeat until an expansion would decrease the average density below the density threshold
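The greedy expansion can be sketched as below. This is an illustrative reading of the slide, not the authors' implementation: it assumes dense cells are visited in decreasing order of density, that a cell counts as dense when its density is at least d, that regions may not overlap, and that ties between expansions go to the first direction tried. It makes no attempt to reach the stated O(|G| log |G|) bound. The function name `popular_regions` is hypothetical.

```python
from typing import List, Tuple

Region = Tuple[int, int, int, int]  # inclusive bounds (row0, row1, col0, col1)


def popular_regions(grid: List[List[int]], d: float) -> List[Region]:
    """Grow rectangular regions around dense cells while average density >= d."""
    n, m = len(grid), len(grid[0])
    used = [[False] * m for _ in range(n)]
    dense = sorted(
        ((grid[i][j], i, j) for i in range(n) for j in range(m) if grid[i][j] >= d),
        reverse=True,
    )
    regions: List[Region] = []
    for _, i, j in dense:
        if used[i][j]:
            continue  # already absorbed by an earlier region
        r0, r1, c0, c1 = i, i, j, j
        total, area = grid[i][j], 1
        while True:
            # Candidate expansions: one extra row or column on each side.
            candidates = [
                (r0 - 1, r1, c0, c1, [(r0 - 1, c) for c in range(c0, c1 + 1)]),
                (r0, r1 + 1, c0, c1, [(r1 + 1, c) for c in range(c0, c1 + 1)]),
                (r0, r1, c0 - 1, c1, [(r, c0 - 1) for r in range(r0, r1 + 1)]),
                (r0, r1, c0, c1 + 1, [(r, c1 + 1) for r in range(r0, r1 + 1)]),
            ]
            best = None
            for nr0, nr1, nc0, nc1, strip in candidates:
                if nr0 < 0 or nr1 >= n or nc0 < 0 or nc1 >= m:
                    continue  # would leave the grid
                if any(used[r][c] for r, c in strip):
                    continue  # would overlap another region
                new_total = total + sum(grid[r][c] for r, c in strip)
                new_area = area + len(strip)
                avg = new_total / new_area
                if avg >= d and (best is None or avg > best[0]):
                    best = (avg, nr0, nr1, nc0, nc1, new_total, new_area)
            if best is None:
                break  # every expansion would drop the average below d
            _, r0, r1, c0, c1, total, area = best
        for r in range(r0, r1 + 1):
            for c in range(c0, c1 + 1):
                used[r][c] = True
        regions.append((r0, r1, c0, c1))
    return regions


# Hypothetical 4 x 4 density grid with threshold d = 4.
grid = [
    [0, 0, 0, 0],
    [0, 5, 4, 0],
    [0, 0, 0, 0],
    [0, 0, 0, 6],
]
print(popular_regions(grid, 4))
```

In this example the cells with densities 5 and 4 merge into one region because their average (4.5) stays at or above the threshold, while the isolated cell with density 6 cannot grow without its average falling below d.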

Results

Evaluating the T-Patterns
- Compute the density of each cell of the grid
- Compute the set of RoIs by determining the Popular Regions
- Translate the input trajectories into sequences of RoIs and timestamps for the transitions
- Input the trajectories and times into the TAS mining algorithm [2]

Experiments
- GPS Data
  - Fleet of 273 trucks in Athens, Greece
  - 112,203 total points recorded
  - Running both static & dynamic pattern algorithms
  - Various parameter settings
- Performance Analysis
  - Synthetic data generated by the CENTRE synthesizer
  - 50% random & 50% predetermined

Pattern Mining Results
- Static found: A t1 B t2 B
- Dynamic found: A t1 B’ t2 B’’

Execution Time Results
- Execution times increase linearly with the number of input trajectories (both algorithms)
- Execution times grow when the density threshold decreases
- Static performs better at extreme threshold values
- Static does not perform well at intermediate threshold values

Additional Results
- Increasing the radius of the spatial neighborhood gives irregular performance, and large values lead to poor execution times
- Changing the time tolerance (t) gives results similar to those of TAS mining
- Increasing the number of points in each trajectory causes linear growth of execution times

Works Cited
[1] F. Giannotti, M. Nanni, F. Pinelli, and D. Pedreschi. Trajectory Pattern Mining. In Proc. 13th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD). ACM, 2007.
[2] F. Giannotti, M. Nanni, and D. Pedreschi. Efficient Mining of Sequences with Temporal Annotations. In Proc. SIAM Conference on Data Mining, pages 346–357. SIAM, 2006.