A conceptual Data Model for Trajectory Data Mining


1 A conceptual Data Model for Trajectory Data Mining
Universidade Federal de Santa Catarina, Florianopolis, Brazil. Informatics and Statistics Department.
Prof. Vania Bogorny (INE/UFSC - Brazil), Prof. Carlos Alberto Heuser (II/UFRGS - Brazil), Prof. Luis Otavio Alvares (II/UFRGS - Brazil)
2/22/2019 GIScience 2010 – A conceptual data model for trajectory data mining. Vania Bogorny, Universidade Federal de Santa Catarina, Brazil.

2 Outline
Motivation, Objective, Basic concepts, Proposed Model, Evaluation, Conclusion

3 Introduction and Motivation
On the one side (database technology): Since its origin, database design has had the purpose of modeling data for operational purposes only. Database designers do not think about data mining during conceptual database design.
Data mining (DM) and knowledge discovery in databases (KDD) have become very popular in recent years in many fields and application domains. Dozens of papers with new data mining algorithms have been published in the last decade, but very little has been done on automatic data preprocessing for DM, which is the most time-consuming step: it is commonly estimated to take between 60% and 80% of the whole discovery process.

4 Introduction and Motivation
On the other side (artificial intelligence): Data mining (DM) or knowledge discovery in databases (KDD) has become very popular in recent years in many fields and application domains. Dozens of new data mining algorithms have been proposed in the last decade, but very little has been done on automatic data preprocessing, which is the most time-consuming step.

5 Introduction and Motivation
Database modelling normalizes the data into several relations, whereas data mining requires denormalization into one single file.

6 Introduction and Motivation
Another problem for data mining: data have to be preprocessed and transformed into different granularities.
Example: Louvre Museum (instance) → Museum (type) → TouristicPlace (type)

7 Introduction and Motivation
These problems increase when dealing with trajectories of moving objects, which are the focus of this paper. Trajectories are a new kind of spatio-temporal data generated by mobile devices, which have become very popular in daily life in the last few years.

8 Objective
We propose a conceptual framework for trajectory database modeling that supports data mining.

9 Basic Concepts

10 GIScience 2010 – A conceptual data model for trajectory data mining
Trajectories are a new kind of spatio-temporal data. They have attracted intensive research in both the database and data mining communities.

11 Trajectory Raw Data
Trajectory data are spatio-temporal data, represented by a set of points located in space and time.
Form: (tid, x, y, t), where tid is the trajectory identifier and (x, y) represents the spatial location at time t.
[Table of sample points with columns Tid, position (x,y), time (t); the values were lost in extraction.]
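The (tid, x, y, t) form can be made concrete with a small Python sketch. The coordinates and times below are invented for illustration only (the sample table of the slide is not recoverable), and `SamplePoint` and `group_by_trajectory` are hypothetical names, not part of the presented model.

```python
from collections import defaultdict
from typing import NamedTuple


class SamplePoint(NamedTuple):
    """One raw trajectory sample in the form (tid, x, y, t)."""
    tid: int   # trajectory identifier
    x: float   # spatial coordinate
    y: float   # spatial coordinate
    t: str     # time as zero-padded "HH:MM" (an assumption of this sketch)


def group_by_trajectory(points):
    """Group raw samples by tid and order each trajectory by time."""
    trajectories = defaultdict(list)
    for p in points:
        trajectories[p.tid].append(p)
    for samples in trajectories.values():
        samples.sort(key=lambda p: p.t)
    return dict(trajectories)


# Invented sample values, deliberately out of order for trajectory 1.
points = [
    SamplePoint(1, 48.890, 2.246, "08:26"),
    SamplePoint(1, 48.890, 2.245, "08:25"),
    SamplePoint(2, 48.853, 2.349, "09:04"),
]
trajectories = group_by_trajectory(points)
```

Each trajectory is then simply a time-ordered list of points, which is the raw input from which stops and moves are later computed.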

12 The Model of Stops and Moves (Spaccapietra 2008)
STOPS: important parts of trajectories, where the moving object has stayed for a minimal amount of time. Stops are application dependent.
Tourism application: hotels, touristic places, airport, …
Traffic management application: traffic lights, roundabouts, big events, …
MOVES: the parts that are not stops.

13 Semantic Trajectories
A semantic trajectory is a set of stops and moves. Stops are characterized by a place, a start time and an end time. Moves are characterized by two consecutive stops.
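As a minimal sketch of this definition, the Python below derives moves from consecutive stops. The class and function names (`Stop`, `Move`, `moves_between`) and the use of zero-padded "HH:MM" strings for time are assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Stop:
    place: str   # name of the place where the object stayed
    start: str   # start time, zero-padded "HH:MM" (assumption of this sketch)
    end: str     # end time, same format


@dataclass(frozen=True)
class Move:
    origin: Stop        # the stop the move leaves from
    destination: Stop   # the next consecutive stop

    @property
    def start(self) -> str:
        # A move starts when the origin stop ends.
        return self.origin.end

    @property
    def end(self) -> str:
        # A move ends when the destination stop starts.
        return self.destination.start


def moves_between(stops):
    """Build one move for every pair of consecutive stops."""
    ordered = sorted(stops, key=lambda s: s.start)
    return [Move(a, b) for a, b in zip(ordered, ordered[1:])]


stops = [
    Stop("IbisHotel", "10:00", "11:00"),
    Stop("ZurichAirport", "08:00", "09:00"),
]
moves = moves_between(stops)
```

With n stops in a trajectory, this yields n - 1 moves, each named by the two stops that delimit it.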

14 STOPS at Multiple-Granularities
Stop at Ibis Hotel from 6:04PM to 7:42PM, September 16, 2010.
Space: IbisHotel or Hotel or Accommodation
Time: Afternoon or Thursday or 6:00PM – 8:00PM or RUSH-HOUR
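The following sketch shows one way the same stop could be generalized in space and in time. The hierarchy table, the period boundaries, and the function names `space_g` and `time_g` are assumptions for illustration; in practice the granularities are chosen per application.

```python
from datetime import datetime

# Hypothetical concept hierarchy: each name maps to its more general parent.
SPACE_HIERARCHY = {"IbisHotel": "Hotel", "Hotel": "Accommodation"}

DAY_NAMES = ["Monday", "Tuesday", "Wednesday", "Thursday",
             "Friday", "Saturday", "Sunday"]


def space_g(place, levels_up):
    """Climb the concept hierarchy the given number of levels."""
    for _ in range(levels_up):
        place = SPACE_HIERARCHY.get(place, place)  # stay put at the top
    return place


def time_g(moment, granularity):
    """Generalize a timestamp to a day name or a period of the day."""
    if granularity == "weekday":
        return DAY_NAMES[moment.weekday()]
    if granularity == "period":
        # Period boundaries are assumptions of this sketch.
        if moment.hour < 12:
            return "morning"
        if moment.hour < 19:
            return "afternoon"
        return "evening"
    raise ValueError(f"unknown granularity: {granularity}")


stop_start = datetime(2010, 9, 16, 18, 4)  # 6:04PM, September 16, 2010
```

For this stop, `space_g("IbisHotel", 1)` gives "Hotel", `space_g("IbisHotel", 2)` gives "Accommodation", and `time_g(stop_start, "weekday")` gives "Thursday".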

15 ITEMS - the building blocks for semantic pattern discovery
An item is generated either from a stop or a move. An item is a set of complex information (space + time) that can be defined in many formats/types and at different granularities.

16 Building an ITEM for Data Mining
Formats/types for an item:
NameOnly: the name of the stop/move. For stops, the name of the spatial feature instance (IbisHotel); for moves, the names of the two stops that define the move (ZurichAirport-IbisHotel).
NameStart: the name of the stop/move + start time. Examples: IbisHotel [morning] (stop); LouvreMuseum [weekend] (stop); IbisHotel-ZurichAirport [10:00AM-11:00AM] (move).

17 Building an ITEM for Data Mining
NameEnd: the name of a stop/move + end time. Examples: IbisHotel[morning] (stop); IbisHotel-ZurichAirport[10:00AM-11:00AM] (move).
NameStartEnd: the name of a stop/move + start time + end time. Examples: IbisHotel[08:00AM-11:00AM][1:00pm-6:00pm] (stop); LouvreMuseum[morning][afternoon] (stop); ZurichAirport-IbisHotel [10:00AM-11:00PM] [10:00AM-6:00PM] (move).
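The four item formats can be sketched as a single string-building function. This is an illustrative reading of the examples on the slides, not the authors' implementation; `build_item` and its parameters are hypothetical, and the time labels are assumed to be already generalized.

```python
def build_item(name, start_label, end_label, item_type):
    """Render an item in one of the four formats.

    name        -- stop name, or "Origin-Destination" for a move
    start_label -- generalized start time, e.g. "morning"
    end_label   -- generalized end time, e.g. "afternoon"
    item_type   -- "NameOnly", "NameStart", "NameEnd" or "NameStartEnd"
    """
    if item_type == "NameOnly":
        return name
    if item_type == "NameStart":
        return f"{name}[{start_label}]"
    if item_type == "NameEnd":
        return f"{name}[{end_label}]"
    if item_type == "NameStartEnd":
        return f"{name}[{start_label}][{end_label}]"
    raise ValueError(f"unknown item type: {item_type}")


stop_item = build_item("LouvreMuseum", "morning", "afternoon", "NameStartEnd")
move_item = build_item("ZurichAirport-IbisHotel", "morning", "morning", "NameOnly")
```

Keeping the format as a parameter lets the same stop or move produce different items, so the analyst can mine the same data with or without the time dimension.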

18 Semantic Trajectory Patterns
Frequent Patterns, Sequential Patterns and Association Rules

19 Trajectory Frequent Pattern
A trajectory frequent pattern is a set of items that occur together at least a minimal number of times (support s).
Examples: {LouvreMuseum [08:00-10:00]} (s=0.1); {Airport [morning], hotel [morning]} (s=0.2); {Airport-Hotel, Hotel-Museum} (s=0.15)
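A minimal sketch of the support computation follows, assuming support is the fraction of trajectories that contain every item of the set (the usual definition in frequent itemset mining). The five trajectories are invented, and `support` is a hypothetical helper.

```python
def support(itemset, trajectories):
    """Fraction of trajectories that contain every item of `itemset`."""
    if not trajectories:
        return 0.0
    wanted = frozenset(itemset)
    hits = sum(1 for items in trajectories if wanted <= set(items))
    return hits / len(trajectories)


# Invented semantic trajectories, each reduced to its set of items.
trajectories = [
    {"Airport[morning]", "Hotel[morning]", "Museum[afternoon]"},
    {"Airport[morning]", "Hotel[morning]"},
    {"Hotel[morning]"},
    {"Museum[afternoon]"},
    {"Restaurant[evening]"},
]

pattern = {"Airport[morning]", "Hotel[morning]"}
s = support(pattern, trajectories)   # 2 of 5 trajectories contain both items
is_frequent = s >= 0.2               # compare against a minimum support
```

Here the pattern occurs in two of the five trajectories, so it is frequent for any minimum support up to 0.4.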

20 Trajectory Sequential Pattern
A trajectory sequential pattern is an ordered list of items that occur at least a minimal number of times (support s).
Examples: <Airport[morning], Hotel[morning], Museum[afternoon]> (s=0.15); <Airport-Hotel, Hotel-Museum> (s=0.1)
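The difference from a frequent pattern is that order matters. The sketch below counts a trajectory only if the pattern's items appear in it in the same order, though not necessarily next to each other (an assumption; a stricter definition could require contiguity). The sequences are invented and the function names are hypothetical.

```python
def contains_in_order(pattern, sequence):
    """True if the items of `pattern` appear in `sequence` in the same order."""
    remaining = iter(sequence)
    # `item in remaining` advances the iterator until it finds the item,
    # so each later item must be found after the previous one.
    return all(item in remaining for item in pattern)


def sequence_support(pattern, sequences):
    """Fraction of sequences that contain `pattern` as an ordered subsequence."""
    if not sequences:
        return 0.0
    hits = sum(1 for seq in sequences if contains_in_order(pattern, seq))
    return hits / len(sequences)


# Invented semantic trajectories as time-ordered lists of items.
sequences = [
    ["Airport[morning]", "Hotel[morning]", "Museum[afternoon]"],
    ["Hotel[morning]", "Airport[morning]", "Museum[afternoon]"],
    ["Airport[morning]", "Restaurant[afternoon]", "Museum[afternoon]"],
    ["Hotel[morning]"],
]

s_ordered = sequence_support(["Airport[morning]", "Hotel[morning]"], sequences)
```

Only the first sequence has the airport before the hotel, so the support is 0.25, whereas the unordered set {Airport[morning], Hotel[morning]} would be counted in two of the four.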

21 Trajectory Association Rule
A trajectory association rule is a rule whose items occur at least a minimal number of times (support s) and with a minimal confidence (c).
Example: Airport[morning], Hotel[morning] → Museum[afternoon] (s=0.1) (c=0.5)
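Support and confidence of a rule can be sketched as below, assuming the standard definitions: the support of the rule is the support of antecedent and consequent together, and the confidence is that value divided by the support of the antecedent alone. The data and the names `support` and `rule_metrics` are invented for the example.

```python
def support(itemset, trajectories):
    """Fraction of trajectories that contain every item of `itemset`."""
    if not trajectories:
        return 0.0
    wanted = frozenset(itemset)
    return sum(1 for items in trajectories if wanted <= set(items)) / len(trajectories)


def rule_metrics(antecedent, consequent, trajectories):
    """Return (support, confidence) of the rule antecedent -> consequent."""
    both = support(set(antecedent) | set(consequent), trajectories)
    only_antecedent = support(antecedent, trajectories)
    confidence = both / only_antecedent if only_antecedent else 0.0
    return both, confidence


# Invented semantic trajectories, each reduced to its set of items.
trajectories = [
    {"Airport[morning]", "Hotel[morning]", "Museum[afternoon]"},
    {"Airport[morning]", "Hotel[morning]"},
    {"Hotel[morning]"},
    {"Museum[afternoon]"},
    {"Restaurant[evening]"},
]

s, c = rule_metrics(
    {"Airport[morning]", "Hotel[morning]"},
    {"Museum[afternoon]"},
    trajectories,
)
```

In this toy data the antecedent appears in two trajectories and the full rule in one, giving s = 0.2 and c = 0.5: half of the visitors who were at the airport and the hotel in the morning went to a museum in the afternoon.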

22 The Proposed Model

23 The Proposed Model
We extend the model of stops and moves proposed by Spaccapietra with new attributes and methods, and add new classes and relationships, with attributes and methods for automatic data preprocessing and multiple-level mining.

24 The Conceptual Data Model of Stops and Moves (Spaccapietra 2008)

25 Proposed OO Model
The diagram has three parts: Spaccapietra's model, data pre-processing, and computing and storing the patterns.
The data to be mined will be stops or moves; therefore, we build the item over stops and moves. ITEM has the objective of aggregating and disaggregating data for mining. Time in stop and move is a timestamp (specific time), while in BESITEM and MOVEITEM time is an aggregate time.

26 Proposed OO Model
Stops and Moves are extended with new attributes (specific time, e.g. 07:10 – 08:05) and methods to instantiate stops and moves.
Concept hierarchy for the spatial feature type (e.g.: AccommodationPlace → Hotel).

27 Proposed OO Model
Pattern: generic class to represent the 3 kinds of patterns.
Frequent patterns. Attributes: support, setOfItems. Methods: countSupport(), frequentPattern().
Sequential patterns. Attributes: support, listOfItems. Methods: countSupport(), sequentialPattern().
Association rules. Attributes: support, confidence, antecedent (set of items) and consequent (set of items). Methods: countSupport(), associatePattern(), and computeConfidence().
Item. Attributes: startT, endT (generic time, e.g. Morning). Methods: getGenericSpatialFeature() retrieves the hierarchy level; timeG() generalizes time; spaceG() generalizes space based on the hierarchy; buildItem() creates the generalized ITEM.

28 Example of an Instantiated Model

29 Schema of Stops and Moves
STOP (Tid integer, Sid integer, SFTname string, SFid integer, startT timestamp, endT timestamp)
Ex.: stop (1, 1, Hotel, 3, 10AM, 11AM)
MOVE (Tid integer, Mid integer, SFT1name string, SF1id integer, SFT2name string, SF2id integer, startT timestamp, endT timestamp, the_move geometry)

30 Schema of the Patterns
FrequentPattern / SequentialPattern (Pid integer, pattern itemSetType, support real)
itemSetType (SFT1name string, SF1id integer, SFT2name string, SF2id integer, startT string, endT string), stored as a nested relation
AssociatePattern (Pid integer, antecedent itemSetType, consequent itemSetType, support real, confidence real)

31 Instantiating and Querying Patterns
To instantiate the patterns we can use the ST-DMQL proposed in (Bogorny 2009).

32 Instantiating Stops and Moves
SELECT generateS (method, candidateStops, buffer) FROM trajectory
SELECT generateM (method, candidateStops, buffer)
Methods: IB-SMOT, CB-SMOT, DB-SMOT, ...

33 Instantiating Sequential Patterns
Q1 (tourism application): Which are the sequences of moves that occur most frequently in the morning and in the evening?
SELECT sequentialPattern (itemType = NameEnd, timeG = [8:00-12:00 AS morning, 18:00-23:00 AS evening], spaceG = instance, minsup=0.03) FROM move
(sequentialPattern is a method in the ST-DMQL.)
Ans: {IbisHotel - NotreDame[morning], EiffelTower - IbisHotel [evening]} (s=0.04)

34 Example of Pattern Queries
Q: How many moves of sequential patterns cross Pont Neuf bridge?
SELECT count(m.*)
FROM sequentialPattern s, bridge b, move m
WHERE s.pattern.SFT1name=m.SFT1name AND s.pattern.SF1id=m.SF1id AND s.pattern.SFT2name=m.SFT2name AND s.pattern.SF2id=m.SF2id AND b.name='Pont Neuf' AND intersects (m.the_geom,b.the_geom)

35 Conclusions
Data pre-processing is the most time-consuming step for DM and KDD. Thinking about data mining during the conceptual design of a database can significantly reduce this step.

36 Conclusions
The proposed model: reduces the pre-processing tasks; supports mining at multiple granularity levels; automatically prepares the data for data mining; stores the patterns for future queries.

37 Geometric Patterns X Semantic Patterns (Bogorny 2009)
[Figure: trajectories T1 to T4 over places labelled H (Hotel), R (Restaurant), TP (Touristic Place) and CC. Semantic trajectory patterns: (a) Hotel to Restaurant, passing by CC; (b) go to Cinema, passing by CC.]

38 Thank You!

39 More examples for generating stops
SELECT generateS (CB-SMOT, [Hotel,60,TouristicPlace,15,ShoppingCenter,30], 5)
FROM trajectory t, district d
WHERE d.name='Bela Vista' and intersects (t.movingpoint.geometry, d.geometry)

40 Querying Rules
Suppose that the user is interested in association patterns that have weekend as the time dimension in the antecedent of the rule:
SELECT * FROM associatePattern WHERE antecedent.startT='weekend' or antecedent.endT='weekend'

41 Basic Concepts: Support

42 Basic Concepts: Semantic Trajectory Patterns
Sequential patterns: Let T be a set of semantic trajectories. A sequence of items X = {x1; x2; ...; xn}, ordered in time, is a trajectory sequential pattern with respect to T and minSup if s(X) >= minSup.
Example: Work [morning], ShoppingCenter [afternoon], Gym [afternoon] (s=0.08%)

43 Basic Concepts: Semantic Trajectory Patterns
Example: Home [night], Work [afternoon] → Gym [afternoon] (s=0.10%) (c=0.50)

44 Basic Concepts: Semantic Trajectory Patterns
Example: ReligiousPlace [weekend], Restaurant [weekend] (s=0.07)

45 Related Works
Mining trajectory samples (extracting geometric patterns): Laube 2002, 2005; Giannotti 2007; Lee 2007; Cao 2006, 2007; Li 2010.
Mining semantic trajectories or trajectory pre-processing for mining: Alvares 2007; Zhou 2007; Palma 2008; Bogorny 2009; Manso 2010.
Attempts to reduce the gap between databases and data mining: data mining query languages, but not for trajectories (Wang 2003, Malerba 2004, Han 1995).

46 Example of Frequent Pattern Instantiation
Q2: Which are the types of places most frequently visited by tourists on weekdays and weekends?
SELECT frequentPattern (itemType = NameStart, timeG = WEEKEND-WEEKDAY, spaceG = [type, GenericHotel = 1], minsup = 0.15) FROM stop
(frequentPattern is a method in the ST-DMQL.)
Ans: {4StarsHotel[weekend], Museum[weekend], Restaurant[weekend]} (s=0.16)

