Mining Statistically Significant Co-location and Segregation Patterns.

1 Mining Statistically Significant Co-location and Segregation Patterns

2 Outline Motivation Key concepts Problem definition Related works Challenges Contribution Validation

3 Motivation Finding co-located events provides useful evidence for decision making and scientific research: –Ecology –Biology –Epidemiology –… Co-location patterns that arise from randomness need attention; they can be caused by: –Presence of spatial autocorrelation –Abundance of feature instances –…

4 Outline Motivation Key concepts Problem definition Related works Challenges Contribution Validation

5 Key Concept (1)

6 Key Concept (2) Null hypothesis –The hypothesis that one tries to reject using the observations in the dataset. Alternative hypothesis –The complement of the null hypothesis, which is accepted when the null hypothesis is rejected.

7 Key Concept (2) Null hypothesis –For a co-location pattern C, a participation index at least as high as the observed one can be obtained under a random feature distribution (with spatial autocorrelation taken into account). –For a segregation pattern C, a participation index at least as low as the observed one can be obtained under a random feature distribution.
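The slides use the participation index without the transcript preserving its definition. The sketch below assumes the definition commonly used in the co-location literature: for a pair pattern {A, B}, the participation ratio of a feature is the share of its instances that have at least one instance of the other feature within a neighbour distance d, and the participation index is the smaller of the two ratios. The function name, the brute-force neighbour search, and the sample coordinates are illustrative assumptions, not taken from the presentation.

```python
from math import dist


def participation_index(points_a, points_b, d):
    """Participation index of the pair pattern {A, B} at neighbour distance d.

    Assumed definition: the minimum, over both features, of the share of
    instances that have at least one instance of the other feature within d.
    """
    if not points_a or not points_b:
        return 0.0
    # Participation ratio of A: share of A instances with a B neighbour.
    pr_a = sum(any(dist(a, b) <= d for b in points_b) for a in points_a) / len(points_a)
    # Participation ratio of B: share of B instances with an A neighbour.
    pr_b = sum(any(dist(b, a) <= d for a in points_a) for b in points_b) / len(points_b)
    return min(pr_a, pr_b)


# Hypothetical coordinates, for illustration only.
a = [(0.0, 0.0), (1.0, 0.0), (5.0, 5.0)]
b = [(0.1, 0.0), (9.0, 9.0)]

# Only one of three A instances and one of two B instances have a neighbour
# of the other feature within 0.5, so the ratios are 1/3 and 1/2 and the
# index is 1/3.
print(participation_index(a, b, d=0.5))
```

The pairwise search costs a number of distance checks proportional to the product of the two instance counts, which is acceptable for a sketch but not for large datasets.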

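8 Key Concept (3) Statistical significance –The significance level α is the tolerated probability of a Type I error, i.e. of rejecting the null hypothesis when it is true. –For each observed pattern, the p-value is the probability, under the null hypothesis, of obtaining a participation index at least as extreme as the observed one; the pattern is significant when its p-value does not exceed α.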
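When no closed-form null distribution is available, a p-value of this kind is usually estimated from simulated datasets. The sketch below assumes the conventional Monte Carlo estimator, (number of simulated values at least as extreme as the observed one + 1) / (number of simulations + 1); the slides do not state which estimator the authors use, and the function name and the simulated values are hypothetical.

```python
def monte_carlo_p_value(observed_pi, simulated_pis, pattern="co-location"):
    """Estimate a p-value from participation indices simulated under the null.

    For a co-location pattern, "extreme" means at least as high as the
    observed index; for a segregation pattern, at least as low.
    """
    if pattern == "co-location":
        extreme = sum(pi >= observed_pi for pi in simulated_pis)
    elif pattern == "segregation":
        extreme = sum(pi <= observed_pi for pi in simulated_pis)
    else:
        raise ValueError("pattern must be 'co-location' or 'segregation'")
    # The +1 terms count the observed dataset as one realisation of the null.
    return (extreme + 1) / (len(simulated_pis) + 1)


# Made-up participation indices from nine null simulations.
simulated = [0.10, 0.20, 0.15, 0.30, 0.25, 0.05, 0.12, 0.18, 0.22]
alpha = 0.05

p = monte_carlo_p_value(0.28, simulated)
# One simulated value (0.30) reaches the observed 0.28, so p = 2/10 and the
# pattern is not significant at alpha = 0.05.
print(p, p <= alpha)
```

With R simulations the smallest attainable estimate is 1 / (R + 1), so a test at α = 0.05 needs at least 19 simulations before any pattern can be declared significant.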

9 Key Concept (4)

10 Outline Motivation Key concepts Problem definition Related works Challenges Contribution Validation

11 Problem Definition

12 Outline Motivation Key concepts Problem definition Related works Challenges Contribution Validation

13 Related work
Approach | Co-location Patterns | Segregation Patterns | Significance Test
Spatial Co-location Patterns Detection | √ | |
Spatial Segregation Patterns Detection | | √ |
Mining Statistically Significant Co-location and Segregation Patterns | √ | √ | √

14 Outline Motivation Key concepts Problem definition Related works Challenges Contribution Validation

15 Challenge –Co-location/segregation patterns determined by a manually set threshold produce false positives and are sensitive to the dataset; –No probability model is available to compute the p-value in closed form; –Testing significance through Monte Carlo simulation is computationally expensive.

16 Outline Motivation Key concepts Problem definition Related works Challenges Contribution Validation

17 Contributions Incorporates a statistical significance test into co-location and segregation pattern detection, which reduces spurious patterns caused by randomness; Proposes three approaches for accelerating the algorithm: –a subset-based filter –a grid-based sampling framework –a spatial-join based pruning technique

18 Subset-based Filter

19 Grid-based Sampling

20 Spatial-join Based Pruning

21 Outline Motivation Key concepts Problem definition Related works Challenges Contribution Validation

22 Quality of Approximation – Grid-based Participation Index

23 Inhibition (synthetic data set)

24 Auto-correlation (synthetic data set)

25 Mixed Spatial Interactions (synthetic data set)

26 Runtime Comparison (1) Fixed total cluster number of each auto-correlated feature

27 Runtime Comparison (2) Various total cluster number of each auto-correlated feature

28 Experiments (real data set) –Ants –Bramble Canes –Lansing Woods –Toronto address repository

29 Ants Data

