Assembler: Efficient Discovery of Spatial Co-evolving Patterns in Massive Geo-sensory Data. Sheng QIAN, 2015-08-01, SIGKDD 2015.


1 Assembler: Efficient Discovery of Spatial Co-evolving Patterns in Massive Geo-sensory Data. Sheng QIAN, 2015-08-01, SIGKDD 2015

2 Content 1. Introduction 2. Problem Description 3. The Assembler Method (Stage I: Detecting Individual Evolutions; Stage II: SCP Generation; Time and space complexity) 4. Experiment

3 Introduction Spatial Co-evolving Patterns (SCP), e.g. AQI sensors in Beijing

4 Introduction Challenge: (1) Interesting evolutions are often flooded by trivial fluctuations. (2) The pattern search space is extremely large.

5 Problem Description Our Interest

6 Problem Description Symbol
S = {s_1, s_2, ..., s_m}: sensors
l_i: location of s_i
T = {t_1, t_2, ..., t_n}: time domain

7 Problem Description Definitions

8 Definitions

9 Definitions

10 Method: I. Detecting Individual Evolutions Haar Wavelet Transformation

11 Method: I. Detecting Individual Evolutions Haar Wavelet Transformation c_ij
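The slides name the Haar wavelet transformation but the transcript keeps none of its formulas. As a minimal sketch, assuming the common average/difference form of the transform (the normalization, the sign convention, and the function name `haar_transform` are assumptions, not taken from the slides), the decomposition of one sensor's readings could look like this. Here `details[i][j]` is only loosely meant to play the role of a coefficient c_ij: a large magnitude marks a strong change at that scale and position.

```python
def haar_transform(signal):
    """Return (overall_average, details) for a power-of-two-length signal.

    details[i] holds the difference coefficients at level i, coarsest first.
    Uses the unnormalized average/difference variant (an assumption).
    """
    n = len(signal)
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a positive power of two")
    approx = [float(v) for v in signal]
    details = []
    while len(approx) > 1:
        pairs = list(zip(approx[0::2], approx[1::2]))
        # Half the difference of each pair captures the local change.
        details.append([(a - b) / 2 for a, b in pairs])
        # Pairwise averages form the next, coarser approximation.
        approx = [(a + b) / 2 for a, b in pairs]
    details.reverse()  # coarsest level first
    return approx[0], details


avg, details = haar_transform([4, 6, 10, 12])
# avg == 8.0, details == [[-3.0], [-1.0, -1.0]]
```

Coefficients near zero correspond to flat stretches, so thresholding them is one plausible way to suppress trivial fluctuations before extracting evolving intervals.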

12 Method: I. Detecting Individual Evolutions Evolving interval extraction

13 Method: I. Detecting Individual Evolutions Mining Frequent Evolutions Segment-and-group approach 1. Segment: bottom-up 2. Mean Shift: divide segments into groups such that the segments in the same group have similar slopes
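The segment-and-group idea can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: the merge criterion (least-squares error below a hypothetical `max_error`), the flat-kernel mean shift, and all function names are choices made here. The `bandwidth` argument plays the role of the mean shift bandwidth ω listed under the parameter settings.

```python
def fit_line(ys, start, end):
    """Least-squares slope and squared error for ys[start..end] (inclusive)."""
    xs = range(start, end + 1)
    n = end - start + 1
    mean_x = sum(xs) / n
    mean_y = sum(ys[start:end + 1]) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (ys[x] - mean_y) for x in xs)
    slope = sxy / sxx if sxx else 0.0
    err = sum((ys[x] - (mean_y + slope * (x - mean_x))) ** 2 for x in xs)
    return slope, err


def bottom_up_segments(ys, max_error):
    """Start from the finest segments and merge the cheapest adjacent pair."""
    segs = [(i, i + 1) for i in range(len(ys) - 1)]
    while len(segs) > 1:
        costs = [fit_line(ys, segs[i][0], segs[i + 1][1])[1]
                 for i in range(len(segs) - 1)]
        best = min(range(len(costs)), key=costs.__getitem__)
        if costs[best] > max_error:
            break  # every remaining merge would distort the fit too much
        segs[best:best + 2] = [(segs[best][0], segs[best + 1][1])]
    return segs


def mean_shift_labels(values, bandwidth, iters=50):
    """Group 1-D values (segment slopes) with a flat-kernel mean shift."""
    modes = list(values)
    for _ in range(iters):
        shifted = []
        for m in modes:
            window = [v for v in values if abs(v - m) <= bandwidth]
            shifted.append(sum(window) / len(window))
        converged = all(abs(a - b) < 1e-9 for a, b in zip(shifted, modes))
        modes = shifted
        if converged:
            break
    labels, centers = [], []
    for m in modes:
        # Modes that ended up close together belong to the same group.
        for k, c in enumerate(centers):
            if abs(m - c) <= bandwidth / 2:
                labels.append(k)
                break
        else:
            centers.append(m)
            labels.append(len(centers) - 1)
    return labels


readings = [0, 1, 2, 3, 4, 4, 4, 4, 4, 5, 6, 7, 8]
segments = bottom_up_segments(readings, max_error=0.1)
slopes = [fit_line(readings, s, e)[0] for s, e in segments]
groups = mean_shift_labels(slopes, bandwidth=0.3)
```

On this toy series the two rising stretches end up in the same group and the flat stretch in another, which is the sense in which segments with similar slopes are treated as the same evolution.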

14 Method: II. SCP Generation The Anti-monotonicity Property

15 Method: II. SCP Generation Find SCP by intersecting matching timestamps
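A minimal sketch of how timestamp intersection and the anti-monotonicity property fit together, under simplifying assumptions: each sensor carries a single frequent evolution with a set of matching timestamps, and patterns grow only along spatially reachable sensors. The names `grow_patterns`, `timestamps`, and `neighbors` are hypothetical, and this breadth-first growth stands in for the search tree of the slides rather than reproducing it.

```python
def grow_patterns(timestamps, neighbors, min_support):
    """Return {frozenset of sensors: shared timestamps} for frequent patterns.

    timestamps: sensor -> set of timestamps where its evolution occurs.
    neighbors:  sensor -> set of spatially reachable sensors.
    """
    frontier = {frozenset([s]): set(ts)
                for s, ts in timestamps.items() if len(ts) >= min_support}
    found = dict(frontier)
    while frontier:
        next_frontier = {}
        for pattern, shared in frontier.items():
            reachable = set().union(*(neighbors.get(s, set()) for s in pattern))
            for s in reachable - pattern:
                bigger = pattern | {s}
                if bigger in found or bigger in next_frontier:
                    continue
                # Support of the larger pattern is the size of the intersection.
                common = shared & timestamps.get(s, set())
                # Anti-monotonicity: the intersection can only shrink, so an
                # infrequent pattern is never extended further.
                if len(common) >= min_support:
                    next_frontier[bigger] = common
        found.update(next_frontier)
        frontier = next_frontier
    return found


timestamps = {"a": {1, 2, 3, 4}, "b": {2, 3, 4, 9}, "c": {3, 4, 7}, "d": {100}}
neighbors = {"a": {"b"}, "b": {"a", "c"}, "c": {"b", "d"}, "d": {"c"}}
patterns = grow_patterns(timestamps, neighbors, min_support=2)
# patterns[frozenset("abc")] == {3, 4}
```

Because every extension intersects with an already computed timestamp set, the support of a candidate never has to be recounted against the raw data.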

16 Method: II. SCP Generation SCP Search Tree

17 Method: II. SCP Generation Neighbor and Parent

18 Method: II. SCP Generation SCP Search Tree

19 Method: II. SCP Generation Algorithm

20 Mining Frequent Evolutions Segment-and-group approach 1. Segment: bottom-up 2. Mean Shift: divide segments into groups such that the segments in the same group have similar slopes

21 Method: Discussion Time Complexity
Segment approach: O(n_e · l_e · l_s) ≈ O(m), since l_s is small and n_e · l_e < m
Mean Shift: O(n_l · k) ≈ O(m), where k is the average number of shifting operations
Second Stage: O(n_G (n|E_G| + n_p^2 n_s))
n_G: the number of connected components in G that have SCPs
|E_G|: the number of edges in G
n_p: the maximum number of SCPs on a connected component
n_s: the maximum support of an SCP

22 Method: Discussion Space Complexity
Segment & Mean Shift: nearly linear
Second Stage: O(n · n_p · n_s)

23 Method: Discussion Parameters Setting
The minimum support θ: how many occurrences can be considered frequent enough
The distance threshold h: what distance makes two sensors reachable
The change threshold δ: how much change in the reading reflects a significant and unusual behavior
The mean shift bandwidth ω
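To make the role of the distance threshold h concrete, the sketch below builds a reachability graph G over sensors and splits it into connected components, the units the second-stage complexity is expressed in. It assumes planar coordinates and Euclidean distance; the slides do not say which distance is used, and the function names are hypothetical.

```python
import math
from collections import deque


def reachability_graph(locations, h):
    """Connect two sensors when their distance is at most h (assumed Euclidean)."""
    ids = list(locations)
    graph = {s: set() for s in ids}
    for i, a in enumerate(ids):
        for b in ids[i + 1:]:
            if math.dist(locations[a], locations[b]) <= h:
                graph[a].add(b)
                graph[b].add(a)
    return graph


def connected_components(graph):
    """Breadth-first search over the graph, one component per unseen sensor."""
    seen, components = set(), []
    for start in graph:
        if start in seen:
            continue
        component, queue = set(), deque([start])
        seen.add(start)
        while queue:
            u = queue.popleft()
            component.add(u)
            for v in graph[u]:
                if v not in seen:
                    seen.add(v)
                    queue.append(v)
        components.append(component)
    return components


locations = {"s1": (0.0, 0.0), "s2": (1.0, 0.0), "s3": (5.0, 5.0)}
graph = reachability_graph(locations, h=1.5)
components = connected_components(graph)
# components == [{"s1", "s2"}, {"s3"}]
```

A larger h merges components and enlarges the pattern search space, which is consistent with h being one of the parameters varied in the efficiency study.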

24 Experiment Dataset
1. Air is an air quality data set. 180 air quality sensors are deployed in 16 cities in northern China (Beijing, Tianjin, and 14 cities in Hebei Province). Each sensor measured the hourly AQI during the period 2013.02.08 – 2014.08.27.
2. Bike is the Citi Bike rental data set for the 332 rental docks in New York. The number of available bikes at each dock is recorded every 30 minutes during 2013.07.01 – 2014.08.30.
3. Syn-Sensor is a collection of 4 synthetic data sets used to evaluate the scalability of Assembler w.r.t. the number of sensors n.

25 Experiment Illumination

26 Illumination

27 Efficiency Study Varying θ and h

28 Experiment Efficiency Study Varying δ and ω

29 Experiments Scalability

30 Thank you

