August 30, 2004, STDBM 2004 at Toronto. Extracting Mobility Statistics from Indexed Spatio-Temporal Datasets. Yoshiharu Ishikawa, Yuichi Tsukamoto, Hiroyuki.

Presentation transcript:

August 30, 2004, STDBM 2004 at Toronto
Extracting Mobility Statistics from Indexed Spatio-Temporal Datasets
Yoshiharu Ishikawa, Yuichi Tsukamoto, Hiroyuki Kitagawa
University of Tsukuba

Outline
  Background and objectives
  Markov transition probability
  Indexing method for moving trajectories
  Proposed methods
    naïve algorithm
    CSP-based algorithm
  Experimental results
  Conclusions

Background
Moving object databases
  store and manage information on a huge number of moving objects
  support queries on moving trajectories and/or moving status
Research issues
  spatio-temporal indexes
  extraction of statistics (e.g., selectivities)
Statistics in spatio-temporal databases
  used for query optimization
  also useful in mobility analysis

Our Approach
Objective: extracting mobility statistics from spatio-temporal databases
Target: trajectory data indexed using R-trees
Statistics to be extracted: Markov transition probability
  target space is decomposed into cells
  transition probabilities between cells are estimated using the indexed trajectory data
Features
  the search problem is formalized as a constraint satisfaction problem (CSP)
  efficient processing using R-trees

Markov Transition Probability (1)
Assumption: target space is decomposed into cells
Example 1: What is the estimated probability that an object currently in cell c_0 moves to cell c_1 one unit time later?
  First-order Markov transition probability Pr(c_1 | c_0)
[Figure: object A in cell c_0 at t = τ and in cell c_1 at t = τ + 1]

Markov Transition Probability (2)
Example 2: What is the probability that an object which moved from cell c_0 to cell c_1 in one unit time moves to cell c_2 in the next unit time?
  Second-order transition probability Pr(c_2 | c_0, c_1)
Extension to the order-n Markov transition probability Pr(c_n | c_0, …, c_{n-1}) is easy
[Figure: object A in cell c_0 at t = τ, in cell c_1 at t = τ + 1, and in cell c_2 at t = τ + 2]

Markov Transition Probability
Conventional technique in traffic data analysis
  Upton & Fingleton, 1989 [13]
Special kind of association rule
  probability corresponds to the confidence factor
  difference: existence of order
Usage
  trajectory estimation: estimates where a moving object moves to in the next period
  simulation of movement status: given the status of moving objects at t = τ, we can estimate the change of the status at t = τ + 1, τ + 2, …
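The "simulation of movement status" usage can be illustrated with a minimal Python sketch. It assumes first-order probabilities are already available as a dictionary; the function name `step_distribution`, the two-cell layout, and the probability values are hypothetical and chosen only for illustration.

```python
from collections import defaultdict

def step_distribution(counts, trans):
    """Expected number of objects per cell one unit time later.

    counts: dict cell -> number of objects at t = tau
    trans:  dict (src, dst) -> Pr(dst | src), first-order probabilities
    """
    nxt = defaultdict(float)
    for (src, dst), p in trans.items():
        nxt[dst] += counts.get(src, 0) * p
    return dict(nxt)

# Hypothetical probabilities for two cells; each source cell's row sums to 1.
trans = {("c0", "c0"): 0.5, ("c0", "c1"): 0.5,
         ("c1", "c0"): 0.25, ("c1", "c1"): 0.75}

status = {"c0": 8, "c1": 4}                    # status at t = tau
status_1 = step_distribution(status, trans)    # estimate at t = tau + 1
status_2 = step_distribution(status_1, trans)  # estimate at t = tau + 2
```

Applying the same step repeatedly relies on the stationarity assumption: the same probabilities are reused at every unit time.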

Assumptions
Movement patterns obey a stationary process
  movement tendency does not change as time passes
Cell decomposition
  each cell is a rectangle
  cell size is arbitrary: non-uniform decomposition is allowed
  cell decomposition can be specified dynamically
Unit time length
  the unit time can be specified as an arbitrary length (e.g., one minute, 10 minutes, …)
  but the unit time length should be a multiple of the sampling time length
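A small Python sketch of how these assumptions might look in practice: rectangular cells of arbitrary size, and a unit time that is a multiple of the sampling interval. The helper names `cell_of` and `to_cell_sequence`, the half-open cell boundaries, and the sample coordinates are assumptions made for this illustration, not part of the presented method.

```python
def cell_of(x, y, cells):
    """Return the id of the first rectangle containing (x, y), or None.

    cells: dict id -> (xmin, ymin, xmax, ymax); the max edges are treated
    as exclusive so that adjacent cells do not both claim a point.
    """
    for cid, (x0, y0, x1, y1) in cells.items():
        if x0 <= x < x1 and y0 <= y < y1:
            return cid
    return None

def to_cell_sequence(samples, cells, sampling, unit):
    """Keep one sample per unit time and map each kept sample to a cell."""
    if unit % sampling != 0:
        raise ValueError("unit time must be a multiple of the sampling time")
    step = unit // sampling
    return [cell_of(x, y, cells) for (x, y) in samples[::step]]

# Non-uniform decomposition: c1 is twice as wide as c0.
cells = {"c0": (0, 0, 1, 2), "c1": (1, 0, 3, 2)}
samples = [(0.5, 1.0), (0.9, 1.0), (1.5, 1.0), (2.5, 0.5)]  # one per minute
seq = to_cell_sequence(samples, cells, sampling=1, unit=2)   # 2-minute unit
```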

Formalization of Probability (1)
Target data: trajectory data from t = 0 to t = T
Definition of first-order Markov transition probability
  objs(c_i, t): set of objects which were in cell c_i at t
  denominator: no. of objects which were in cell c_0 at an arbitrary t (0 ≤ t ≤ T - 1)
  numerator: no. of objects each of which is counted in the denominator and moved to cell c_1 at t + 1
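Written out from the numerator and denominator described above, the first-order definition can be sketched as follows. This is a reconstruction from the textual description, assuming both counts are accumulated over every t in the stated range.

```latex
\[
\Pr(c_1 \mid c_0) =
\frac{\sum_{t=0}^{T-1} \left| \mathit{objs}(c_0, t) \cap \mathit{objs}(c_1, t+1) \right|}
     {\sum_{t=0}^{T-1} \left| \mathit{objs}(c_0, t) \right|}
\]
```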

Formalization of Probability (2)
Definition of order-n Markov transition probability
  denominator: no. of objects each of which was in cell c_0 at t (0 ≤ t ≤ T - 1), in cell c_1 at t + 1, …, and in cell c_{n-1} at t + n - 1
  numerator: no. of objects each of which is counted in the denominator and moved to cell c_n at t + n
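The order-n definition can be sketched in the same notation. This is a reconstruction from the description above; the summation range 0 ≤ t ≤ T - n is an assumption, chosen so that every counted window fits inside the data, and it reduces to the first-order range when n = 1.

```latex
\[
\Pr(c_n \mid c_0, \ldots, c_{n-1}) =
\frac{\sum_{t=0}^{T-n} \left| \bigcap_{i=0}^{n} \mathit{objs}(c_i, t+i) \right|}
     {\sum_{t=0}^{T-n} \left| \bigcap_{i=0}^{n-1} \mathit{objs}(c_i, t+i) \right|}
\]
```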

Generalized Transition Probability Estimation Problem (1)
Given n + 1 cell sets C_0, …, C_n, output Pr(c_n | c_0, …, c_{n-1}) for each cell combination (c_0, …, c_n) with c_i taken from C_i
Derives the transition probabilities for the specified cell sets at once

Generalized Transition Probability Estimation Problem (2)
Example: given C_0 = {c_0, c_1}, C_1 = {c_1, c_2}, C_2 = {c_1, c_2, c_3}, estimate second-order probabilities
Algorithm outputs 12 probabilities
  Pr(c_1 | c_0, c_1), Pr(c_2 | c_0, c_1), …, Pr(c_3 | c_1, c_2)
[Figure: cells c_0, c_1, c_2, c_3]
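To make the input and output of the generalized problem concrete, here is a brute-force Python sketch that scans trajectories directly instead of using an R-tree. The function name `generalized_estimate`, the representation of a trajectory as a list of cell ids per unit time, and the two sample trajectories are assumptions for illustration only.

```python
from itertools import product

def generalized_estimate(trajs, cell_sets):
    """Estimate Pr(c_n | c_0..c_{n-1}) for every combination in C_0 x ... x C_n.

    trajs:     dict object id -> list of cell ids, one per unit time
    cell_sets: list [C_0, ..., C_n] of lists of cell ids
    Returns dict combo -> probability, or None when the denominator is 0.
    """
    n = len(cell_sets) - 1
    q = {}  # occurrences of each (n + 1)-cell sequence
    s = {}  # occurrences of each n-cell prefix that has a successor
    for cells in trajs.values():
        for t in range(len(cells) - n):
            window = tuple(cells[t:t + n + 1])
            q[window] = q.get(window, 0) + 1
            s[window[:-1]] = s.get(window[:-1], 0) + 1
    result = {}
    for combo in product(*cell_sets):
        denom = s.get(combo[:-1], 0)
        result[combo] = q.get(combo, 0) / denom if denom else None
    return result

trajs = {"A": ["c0", "c1", "c2"], "B": ["c0", "c1", "c1"]}
cell_sets = [["c0", "c1"], ["c1", "c2"], ["c1", "c2", "c3"]]
probs = generalized_estimate(trajs, cell_sets)  # 2 * 2 * 3 = 12 entries
```

Combinations whose prefix never occurs in the data have an undefined probability, which the sketch reports as `None` rather than 0.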

Indexing Methods for Trajectories
R-tree-based approach is assumed
Point-based representation: a trajectory is represented as a set of points
A (d + 1)-dimensional R-tree is used (e.g., 3D R-tree)
  incorporates the temporal dimension
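A minimal Python sketch of the point-based representation, assuming d = 2 so that each sample is a point (x, y, t). A linear scan stands in for the R-tree range query here; the function name `objs_in_cell` and the sample points are hypothetical.

```python
def objs_in_cell(points, cell, t):
    """Set of object ids inside `cell` at time t.

    points: iterable of (obj_id, x, y, t) samples, i.e. the indexed point set
    cell:   (xmin, ymin, xmax, ymax)
    The scan below stands in for a 3-D range query over the box
    [xmin, xmax] x [ymin, ymax] x [t, t] on the R-tree.
    """
    x0, y0, x1, y1 = cell
    return {o for (o, x, y, pt) in points
            if pt == t and x0 <= x <= x1 and y0 <= y <= y1}

points = [("A", 0.5, 0.5, 0), ("A", 1.5, 0.5, 1), ("B", 0.2, 0.8, 0)]
at_t0 = objs_in_cell(points, (0, 0, 1, 1), 0)
at_t1 = objs_in_cell(points, (1, 0, 2, 1), 1)
```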

(d + 1)-D R-tree-based Representation
Sampling-based representation
[Figure: sampled trajectories of objects A and B and the corresponding R-tree with root, nodes a, b, c, and entries 1 to 6; axes x and t]

Naïve Algorithm (1)
Based on the definition of the Markov transition probability
Example: estimating Pr(c_2 | c_0, c_1)
  Determine objs(c_0, τ) and objs(c_1, τ + 1) using the R-tree
    objs(c_i, t): the set of objects which were in cell c_i at time t
  Take the intersection of the two sets; the cardinality of the intersection is added to Scount
  If the intersection is not empty
    objs(c_2, τ + 2) is determined using the R-tree
    Take the intersection of objs(c_0, τ), objs(c_1, τ + 1), objs(c_2, τ + 2); the cardinality of the result is added to Qcount
  This process is repeated for each τ (0 ≤ τ ≤ T - n)
  Calculate Pr(c_2 | c_0, c_1) based on Scount and Qcount
No. of searches on the R-tree is proportional to T
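The steps of the naïve algorithm can be sketched in Python as follows. The callable `objs(cell, t)` is a hypothetical stand-in for one R-tree range query, and the lookup table used in the example is made-up data; only the Scount and Qcount bookkeeping follows the slide.

```python
def naive_estimate(objs, cells, T):
    """Estimate Pr(c_n | c_0, ..., c_{n-1}) by scanning every start time tau.

    objs(cell, t) -> set of object ids in `cell` at time t (stand-in for
    one R-tree range query); cells = [c_0, ..., c_n].
    """
    n = len(cells) - 1
    scount = qcount = 0
    for tau in range(0, T - n + 1):            # 0 <= tau <= T - n
        prefix = set(objs(cells[0], tau))      # copy, so &= never alters the source
        for i in range(1, n):
            if not prefix:
                break
            prefix &= objs(cells[i], tau + i)
        scount += len(prefix)
        if prefix:                             # query c_n only when needed
            qcount += len(prefix & objs(cells[n], tau + n))
    return qcount / scount if scount else None

# Made-up data: A and B move c0 -> c1, then only A continues to c2.
table = {("c0", 0): {"A", "B"}, ("c1", 1): {"A", "B"}, ("c2", 2): {"A"}}
lookup = lambda cell, t: table.get((cell, t), set())
p = naive_estimate(lookup, ["c0", "c1", "c2"], T=2)
```

Each value of τ triggers its own range queries, which is why the number of R-tree searches grows in proportion to T.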

Naïve Algorithm (2)
Example: estimation of Pr(c_2 | c_0, c_1)
  Scount += 1 and Qcount += 1 as matching objects are found
  Output = Qcount / Scount
No. of searches on the R-tree is proportional to T
[Figure: trajectories over cells c_0, c_1, c_2; axes x and t]

Basic Idea (1)
Estimation of Pr(c_n | c_0, …, c_{n-1}) is based on three steps:
  1. Count the no. of objects which were in c_0, …, c_{n-1} at each unit time using an R-tree
  2. Count the no. of objects which were in c_0, …, c_n at each unit time using an R-tree
  3. Compute Pr(c_n | c_0, …, c_{n-1}) by [result of step 2] / [result of step 1]
Benefits
  steps 1 and 2 can be processed using the same algorithm
    the algorithm for step 1 is given by setting n → n - 1
  requires only two searches on the R-tree

Basic Idea (2)
Example: estimation of Pr(c_2 | c_0, c_1)
  Step 1: count objects which moved from c_0 to c_1 within a unit time: Scount = 2
  Step 2: count objects that moved as c_0, c_1, c_2 at each unit time: Qcount = 1
  Step 3: compute probability: Pr(c_2 | c_0, c_1) = Qcount / Scount = 1/2
[Figure: trajectories over cells c_0, c_1, c_2; axes x and t]
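The three steps can be sketched in Python with a single counting routine called twice. A scan over per-object cell sequences stands in for the R-tree search; the names `count_sequences` and `estimate`, the `last_start` bound (taken here as T - n for both counts), and the two trajectories are assumptions chosen to reproduce Scount = 2 and Qcount = 1.

```python
def count_sequences(trajs, cells, last_start):
    """Number of (object, start time) pairs whose consecutive unit-time
    positions equal `cells`, for start times 0..last_start."""
    k = len(cells)
    return sum(1 for seq in trajs.values()
               for t in range(last_start + 1)
               if seq[t:t + k] == cells)

def estimate(trajs, cells, T):
    n = len(cells) - 1
    scount = count_sequences(trajs, cells[:-1], T - n)  # step 1: c_0 .. c_{n-1}
    qcount = count_sequences(trajs, cells, T - n)       # step 2: c_0 .. c_n
    return scount, qcount, (qcount / scount if scount else None)  # step 3

# Both objects move c0 -> c1; only A continues to c2.
trajs = {"A": ["c0", "c1", "c2"], "B": ["c0", "c1", "c1"]}
scount, qcount, p = estimate(trajs, ["c0", "c1", "c2"], T=2)
```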

Counting Using R-tree (1)
How can we compute the no. of objects which were in c_0, …, c_n at each unit time?
Idea: the problem is formalized as a constraint satisfaction problem (CSP)
An object to be counted fulfills the following constraints for some τ
  it was in cell c_0 at t = τ
  it was in cell c_1 at t = τ + 1
  …
  it was in cell c_n at t = τ + n
Search for objects that satisfy all n + 1 constraints
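One way to read this formulation is as a backtracking search over n + 1 variables, one per time step. The Python sketch below works on a flat list of samples and does not use an R-tree at all; the function name `csp_count` and the sample data are hypothetical, and the sketch only illustrates the constraints, not the hierarchical search of the presented method.

```python
def csp_count(samples, cells):
    """Count assignments (s_0, ..., s_n) of samples such that s_i lies in
    cells[i] and s_{i+1} is the same object one unit time after s_i.

    samples: list of (obj_id, cell_id, t) tuples
    """
    def extend(i, prev):
        if i == len(cells):
            return 1                       # all n + 1 constraints satisfied
        total = 0
        for s in samples:
            obj, cell, t = s
            if cell != cells[i]:
                continue                   # constraint: in cell c_i
            if prev is not None and (obj != prev[0] or t != prev[2] + 1):
                continue                   # constraint: same object, next unit time
            total += extend(i + 1, s)      # returning from here is the backtrack
        return total
    return extend(0, None)

samples = [("A", "c0", 0), ("A", "c1", 1), ("A", "c2", 2),
           ("B", "c0", 0), ("B", "c1", 1), ("B", "c1", 2)]
two_step = csp_count(samples, ["c0", "c1"])
three_step = csp_count(samples, ["c0", "c1", "c2"])
```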

Counting Using R-tree (2)
Effective use of the R-tree is necessary
We extend the CSP solution search method using R-trees (Papadias et al., VLDB'98) [7]
  considers spatial constraints
  example: find all spatial objects x, y, z that satisfy overlap(x, y) and north(y, z)
  searches CSP solutions from the root to the leaves
Use of pruning and backtracking
  reduces the search space using constraints
  enumerates all solutions with one R-tree access

Example of Counting (1)
For C_0 = {c_1}, C_1 = {c_1, c_2}, C_2 = {c_2}, derive probabilities for (C_0, C_1, C_2)
Derive two probabilities at once
  Pr(c_2 | c_1, c_1): the probability that an object which has moved as c_1 → c_1 next moves to c_2
  Pr(c_2 | c_1, c_2)
[Figure: R-tree with root, nodes a, b, c, and entries 1 to 6 over cells c_1 and c_2; axes x and t]

Example of Counting (2)
[Figure: R-tree structure (root; nodes a, b, c) shown next to the entries 1 to 6 and cells c_1 and c_2; axes x and t]

Pruning Method (1)
Pruning condition 1: movement between two R-tree nodes which are not temporally consecutive is impossible
  such candidates can be deleted
Example:
  movements such as a → b and b → c are allowed
  movement a → c is impossible
[Figure: nodes a, b, c; axes x and t]

Pruning Method (2)
Pruning condition 2: the trajectory is not contained in the target cell
Example: when we are counting for c_1 → c_1, we should consider only nodes that overlap with c_1
[Figure: cell c_1; axes x and t]

Pruning Method (3)
Pruning condition 3: if [max distance an object can move] < [distance between MBRs], then an object cannot move from a node to the next node
[Figure: nodes 1 and 2 with the distance between their MBRs; axes x and t]
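The three pruning conditions can be expressed as predicates on node MBRs, as in the Python sketch below. The dictionary layout of an MBR, the function names, the interpretation of "temporally consecutive" as "some time in the first node is followed one unit later by a time in the second node", and the `max_move` bound per unit time are all assumptions made for this illustration.

```python
import math

def temporally_consecutive(a, b, unit=1):
    """Condition 1: some time t in a has t + unit inside b's time range."""
    return a["t0"] + unit <= b["t1"] and b["t0"] <= a["t1"] + unit

def overlaps_cell(node, cell):
    """Condition 2: the node's spatial extent intersects the target cell."""
    x0, y0, x1, y1 = cell
    return (node["x0"] <= x1 and x0 <= node["x1"]
            and node["y0"] <= y1 and y0 <= node["y1"])

def min_distance(a, b):
    """Smallest spatial distance between the two MBRs (0 if they touch)."""
    dx = max(b["x0"] - a["x1"], a["x0"] - b["x1"], 0)
    dy = max(b["y0"] - a["y1"], a["y0"] - b["y1"], 0)
    return math.hypot(dx, dy)

def can_pair(a, cell_a, b, cell_b, max_move, unit=1):
    """True if the node pair (a, b) survives all three pruning conditions."""
    return (temporally_consecutive(a, b, unit)
            and overlaps_cell(a, cell_a) and overlaps_cell(b, cell_b)
            and min_distance(a, b) <= max_move)   # condition 3

# Hypothetical nodes: a, b, c follow one another in time; c is far from b.
a = dict(x0=0, y0=0, x1=1, y1=1, t0=0, t1=2)
b = dict(x0=1, y0=0, x1=2, y1=1, t0=3, t1=5)
c = dict(x0=5, y0=0, x1=6, y1=1, t0=6, t1=8)
```

With these nodes, a → b and b → c pass condition 1 while a → c does not, and b → c is still pruned by condition 3 when `max_move` is smaller than the gap of 3 between the two MBRs.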

Query Processing Example
Targets: c_1 → c_1 → c_2 and c_1 → c_2 → c_2
[Figure: step-by-step search of the R-tree (root; nodes a, b, c; entries 1, 2) against cells c_1 and c_2 in x-t space, shown at tree level = 2, tree level = 1, and tree level = 0, with pruning and backtrack steps marked. Annotations: "An object that moved as c_1 → c_1 → c_2 is found and counted"; "There is no objects that moved as c_1 → c_1 → c_2, c_1 → c_2 → c_2"]

Dataset (1)
Generated using the moving object simulator made by Brinkhoff [1]
  simulates car movement on an actual city road network
  Oldenburg city, Germany (about 2.5 km x 2.8 km)
  no. of initial moving objects: 5
  5 objects are created per minute on average
  100 objects are moving in the map at a time
  data is generated for T = 1000 minutes
  120K points are stored in a 3-D R-tree

Dataset (2)
Example for estimating using 3 x 3 cells
[Figure: map region decomposed into 3 x 3 cells labeled c0 to c8]

Experimental Result (1)
Map is decomposed into 30 x 30 cells
First-order Markov transition probabilities
3 x 3 cells are selected at random

Experimental Result (2)
Estimation of second-order transition probabilities
Other parameters are the same as in the former case

Experimental Result (3)
Estimation of third-order transition probabilities
Other parameters are similar to the former case

Experimental Result (4)
A case in which the CSP-based approach is not effective
  target space is decomposed into 20 x 20 cells
  estimation of second-order transition probabilities
Since the cell decomposition is coarse, pruning cannot reduce the candidates

Conclusions and Future Work
Conclusions
  mobility statistics based on Markov transition probability
  proposal of two algorithms
    naïve approach
    CSP-based approach
  the CSP-based approach effectively utilizes the R-tree structure
Future Work
  adaptive cell decompositions
  extension to non-stationary Markov transitions