On Discovering Moving Clusters in Spatio-temporal Data Panos Kalnis National University of Singapore Nikos Mamoulis University of Hong Kong Spiridon Bakiras.


1 On Discovering Moving Clusters in Spatio-temporal Data Panos Kalnis National University of Singapore Nikos Mamoulis University of Hong Kong Spiridon Bakiras Hong Kong University of Science and Technology

2 What is a Moving Cluster?
- Dense clusters of objects that move similarly for a long time period
- Not necessarily the same objects during the lifetime of the cluster
- Examples: migrating animals, convoys of cars, military applications
- Solutions: efficient exact and approximate algorithms

3 Problem Formulation
- Example (figure): a moving cluster

4 Related Work (Static)
- Partition-based clustering (k-medoids)
- Hierarchical clustering (BIRCH, CURE)
- Density-based clustering (DBSCAN); figure: neighborhoods of radius ε with MinPts=3

5 Related Work (Moving Objects)
- Grouping trajectories [Vlachos et al., ICDE 02]
  - Trajectory cluster: constant set of objects through its lifetime
  - Only similar movement; no space proximity
- Dense areas over time [Hadjieleftheriou et al., SSTD 03]
  - Static dense regions
  - No common objects between regions in sequence
- Incremental DBSCAN/OPTICS [Ester et al., VLDB 98]
  - Only a small percentage of objects moves
- Maintaining Data Bubbles [Nassar et al., SIGMOD 04]
  - Redistributes updated objects in existing bubbles

6 MC1: The Straightforward Approach
- G: set of moving clusters
- Apply clustering to the next timeslice S_i
- Expand the moving clusters in G
- Add new moving clusters to G
- Report ending clusters
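The MC1 loop above can be sketched as follows. This is a minimal illustration, not the authors' implementation. The transcript does not spell out the test for extending a moving cluster, so the sketch assumes that two consecutive clusters belong together when their Jaccard ratio |c ∩ c'| / |c ∪ c'| is at least θ. The names `mc1` and `jaccard`, the input layout (one list of object-id sets per timeslice), and the choice to report only sequences spanning two or more timeslices are all assumptions made for the example.

```python
def jaccard(a, b):
    """Fraction of shared objects between two clusters (sets of object ids)."""
    union = len(a | b)
    return len(a & b) / union if union else 0.0


def mc1(timeslices, theta):
    """timeslices: list of lists of sets (clusters already found per timeslice).

    Returns the ended moving clusters, each as a list of consecutive clusters.
    Assumption: a sequence is reported only if it spans >= 2 timeslices.
    """
    active = []    # G: moving clusters that may still be extended
    finished = []  # moving clusters that could not be extended any further
    for clusters in timeslices:
        next_active = []
        used = set()  # indices of clusters that extended some moving cluster
        for g in active:
            grew = False
            # MC1 compares g against every cluster of the new timeslice.
            for idx, c in enumerate(clusters):
                if jaccard(g[-1], c) >= theta:
                    next_active.append(g + [c])
                    used.add(idx)
                    grew = True
            if not grew and len(g) > 1:
                finished.append(g)
        # Clusters that extended nothing start new candidate moving clusters.
        for idx, c in enumerate(clusters):
            if idx not in used:
                next_active.append([c])
        active = next_active
    finished.extend(g for g in active if len(g) > 1)
    return finished
```

For example, `mc1([[{1, 2, 3, 4}], [{2, 3, 4, 5}], [{10, 11}]], 0.5)` links the first two clusters (ratio 3/5) and reports that pair once the third timeslice fails to extend it.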

7 Hash-based DBSCAN
- Memory: 10M objects with 1GB RAM
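The transcript gives no details of the hashing scheme, so the following is only one plausible sketch of the idea: points are hashed into square grid cells of side ε, which limits each neighborhood query to the 3x3 block of cells around a point. The function name `grid_dbscan`, the 2-D tuple input, and the cell size are assumptions for illustration.

```python
from collections import defaultdict, deque
import math


def grid_dbscan(points, eps, min_pts):
    """Label 2-D points with cluster ids (0, 1, ...) or -1 for noise."""
    # Hash every point into a square cell of side eps (assumed scheme).
    grid = defaultdict(list)
    for i, (x, y) in enumerate(points):
        grid[(math.floor(x / eps), math.floor(y / eps))].append(i)

    def neighbours(i):
        # Only the 3x3 block of cells around point i can hold points within eps.
        x, y = points[i]
        cx, cy = math.floor(x / eps), math.floor(y / eps)
        found = []
        for dx in (-1, 0, 1):
            for dy in (-1, 0, 1):
                for j in grid.get((cx + dx, cy + dy), ()):
                    if math.hypot(points[j][0] - x, points[j][1] - y) <= eps:
                        found.append(j)
        return found  # includes i itself

    labels = [None] * len(points)
    cluster_id = 0
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        seeds = neighbours(i)
        if len(seeds) < min_pts:
            labels[i] = -1  # noise for now; may become a border point later
            continue
        labels[i] = cluster_id
        queue = deque(seeds)
        while queue:
            j = queue.popleft()
            if labels[j] == -1:
                labels[j] = cluster_id  # former noise becomes a border point
            if labels[j] is not None:
                continue
            labels[j] = cluster_id
            more = neighbours(j)
            if len(more) >= min_pts:  # j is a core point: keep expanding
                queue.extend(more)
        cluster_id += 1
    return labels
```

With a cell side of ε, two points within distance ε can differ by at most one cell index in each dimension, which is why scanning the nine surrounding cells is sufficient.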

8 MC1 is inefficient!
1. Checks all possible combinations of clusters in consecutive timeslices
2. Performs clustering for every timeslice

9 MC2: Minimizing Redundant Checks
- Clustering in every timeslice
- Select a random object in c_1
- Search for the object in S_2
- Repeat for the remaining objects
- Max: (1-θ)|c_i| objects
- Figure label: c_1 c_2 is a moving cluster
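The probing idea can be sketched as below, again assuming the Jaccard test |c ∩ c'| / |c ∪ c'| ≥ θ, which the transcript does not state explicitly. If c' passes that test, it shares at least θ|c| objects with c, so at most (1-θ)|c| objects of c lie outside c'. Probing one more object than that bound is therefore guaranteed to hit every qualifying cluster. The names `mc2_matches` and `owner`, and the object-to-cluster index, are hypothetical.

```python
import math
import random


def jaccard(a, b):
    """Fraction of shared objects between two clusters (sets of object ids)."""
    union = len(a | b)
    return len(a & b) / union if union else 0.0


def mc2_matches(c, next_clusters, theta, rng=None):
    """Return indices of clusters in the next timeslice that continue c."""
    # Index: object id -> index of the cluster holding it in the next timeslice.
    owner = {o: k for k, cl in enumerate(next_clusters) for o in cl}
    objects = sorted(c)
    if rng is not None:
        rng.shuffle(objects)  # random probing order, as on the slide
    # Probe floor((1 - theta) * |c|) + 1 objects; the small epsilon guards
    # against float rounding and can only cause one extra, harmless probe.
    limit = min(len(objects),
                math.floor((1 - theta) * len(objects) + 1e-9) + 1)
    tested, matches = set(), []
    for o in objects[:limit]:
        k = owner.get(o)
        if k is None or k in tested:
            continue  # object is noise next time, or its cluster was checked
        tested.add(k)
        if jaccard(c, next_clusters[k]) >= theta:
            matches.append(k)
    return matches
```

Each candidate cluster is tested at most once, so the number of set comparisons is bounded by the number of probes rather than by the number of clusters in the next timeslice.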

10 Ambiguity Cases: θ<0.5
- Possible results: {c_0 c_1, c_2} or {c_0 c_2, c_1}

11 MC3: Approximate Moving Clusters
- Intuition: many clusters will remain the same even if objects move
- Avoid performing clustering in every timeslice
- For an object o:
  - If o belongs to cluster c in timeslice S_i
  - Assume that o also belongs to c in the next timeslice (notice: objects may have moved)

12 Refine clusters
- Hash new clusters in a grid
- Legal cluster:
  - Does not meet/intersect with other clusters
  - It is connected (cells meet)
- Objects in legal clusters are not considered further
- For the rest of the objects, perform clustering
- Possible inaccuracies!!!
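A possible reading of the legality test is sketched below. The transcript does not define the grid cell size or what exactly counts as "meeting", so the sketch assumes square cells of a caller-chosen side and treats two cells as meeting when they are identical or share an edge or corner. The names `cells_of`, `touching`, `is_connected`, and `split_legal` are hypothetical.

```python
import math


def cells_of(points, cell):
    """Grid cells (ix, iy) covered by a cluster's points."""
    return {(math.floor(x / cell), math.floor(y / cell)) for x, y in points}


def touching(a, b):
    """True if two grid cells are the same or share an edge or corner."""
    return abs(a[0] - b[0]) <= 1 and abs(a[1] - b[1]) <= 1


def is_connected(cells):
    """True if all cells are reachable from one another via touching cells."""
    cells = set(cells)
    if not cells:
        return False
    stack = [next(iter(cells))]
    seen = set(stack)
    while stack:
        cur = stack.pop()
        for other in cells:
            if other not in seen and touching(cur, other):
                seen.add(other)
                stack.append(other)
    return seen == cells


def split_legal(clusters, cell):
    """clusters: dict label -> list of (x, y) positions in the new timeslice,
    grouped by the label each object carried over from the previous timeslice.

    Returns (legal_labels, points_to_recluster).
    """
    grids = {lab: cells_of(pts, cell) for lab, pts in clusters.items()}
    legal, leftovers = [], []
    for lab, own in grids.items():
        # A clash means some cell of this cluster meets a cell of another one.
        clash = any(
            touching(a, b)
            for other, cells in grids.items() if other != lab
            for a in own for b in cells
        )
        if is_connected(own) and not clash:
            legal.append(lab)
        else:
            leftovers.extend(clusters[lab])
    return legal, leftovers
```

Only the points returned in the second element would be passed to a full clustering step; the legal clusters keep their carried-over labels, which is where the approximation error can enter.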

13 Minimize Error
- Perform exact clustering to absorb (may not eliminate) the accumulated error
- Period for exact clustering: grows linearly, drops exponentially
- Exact clustering: if more than α|G| clusters have been added/removed
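The transcript does not say how the period rule and the α|G| condition interact, so the following is only one assumed reading: the number of clusters added or removed is compared with α|G|, and the period is halved when the threshold is exceeded and increased by one otherwise. The function name `next_period` and the halving factor are assumptions, not taken from the slides.

```python
def next_period(period, changed, num_clusters, alpha):
    """Return how many timeslices to wait before the next exact clustering.

    period       -- current gap between exact clusterings (in timeslices)
    changed      -- clusters added or removed since the last exact clustering
    num_clusters -- |G|, the current number of moving clusters
    alpha        -- tolerated fraction of changed clusters
    """
    if changed > alpha * num_clusters:
        # Too much drift: shrink the period multiplicatively (assumed: halve).
        return max(1, period // 2)
    # The approximation is holding: lengthen the period by one timeslice.
    return period + 1
```

For instance, with α=0.1 and |G|=20 the threshold is 2 changed clusters, so `next_period(8, 5, 20, 0.1)` shrinks the period to 4 while `next_period(8, 1, 20, 0.1)` extends it to 9.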

14 Experimental Evaluation
- 10K-50K objects per timeslice
- 50-100 timeslices, up to 5M objects
- Linux, C++, 1.3GHz CPU, 1.2GB RAM
- Generator: clusters move/rotate, objects appear/disappear

15 Varying data size (10K-50K per timeslice)
- θ=0.9, α=0.1
- Larger dataset: larger clusters, more interactions
- Figure annotation: Avg: 87%

16 Varying number of clusters (100-800 per timeslice)
- 5M objects, θ=0.9, α=0.1
- Many clusters: reaches the error threshold fast
- Figure annotations: 96%, 87%, 73%

17 Varying α
- 5M objects, θ=0.9, 800 clusters
- α small: may not recover!!!

18 Varying α for different agilities
- Low agility: fewer errors → faster

19 MC3 for varying θ
- 5M objects, α=0.1, 800 clusters
- θ large: incorrect clusters are pruned for not satisfying the θ criterion

20 Conclusions
- Moving clusters
  - Objects may move/change
  - Exact and approximate solutions
- Future work
  - Automatic setting of parameter α
  - Better error estimation
  - Constraints (e.g., a moving cluster must span at least k timeslices)

21 Questions?

