
Published by Dashawn Gilham. Modified about 1 year ago.

1
Spatial Dependency Modeling Using Spatial Auto-Regression
Mete Celik (1,3), Baris M. Kazar (4), Shashi Shekhar (1,3), Daniel Boley (1), David J. Lilja (1,2)
(1) CSE Department, University of Minnesota, Twin Cities
(2) ECE Department, University of Minnesota, Twin Cities
(3) Army High Performance Computing Research Center
(4) Oracle USA

2
07/08/2006 Spatial Dependency Modeling Using SAR
Outline of Today's Talk
Motivation & Background
Problem Definition
Related Work & Contributions
Proposed Approach
Experimental Evaluation
Conclusion & Future Work

3
Motivation
Widespread use of spatial databases
Mining spatial patterns
  – The 1855 Asiatic Cholera in London [Griffith]
  – Fair Lending [NYT, R. Nader]: correlation of bank locations with loan activity in poor neighborhoods
  – Retail Outlets [NYT, Walmart, McDonald's, etc.]: determining store locations by relating neighborhood maps with customer databases
  – Crime Hot Spot Analysis [NYT, NIJ CML]: explaining clusters of sexual assaults by locating addresses of sex offenders
  – Ecology [Uygar]: explaining the location of bird nests based on structural environmental variables

4
Spatial Auto-correlation (SA)
Random distributed data (no SA): spatial distribution satisfying the assumptions of classical data
  – Pixel property with independent identical distribution
  – Random nest locations
Cluster distributed data: spatial distribution NOT satisfying the assumptions of classical data
  – Pixel property with spatial auto-correlation
  – Cluster nest locations

5
Execution Trace
Given: spatial framework, attributes
[Figure: space with 4-neighborhood; 6th row of binary W; 6th row of row-normalized W]
W allows other neighborhood definitions
  – distance based
  – 8-neighbors
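The construction of W traced on this slide can be sketched in a few lines of Python. This is a minimal sketch, not the authors' code: it assumes a 4-by-4 grid with sites numbered row by row (the grid size used in the figure is not recoverable from the text), and the function names `binary_w` and `row_normalize` are illustrative.

```python
def binary_w(rows, cols):
    """Binary 4-neighbor adjacency matrix for a rows x cols grid (assumed layout)."""
    n = rows * cols
    w = [[0.0] * n for _ in range(n)]
    for r in range(rows):
        for c in range(cols):
            i = r * cols + c
            # north, south, west, east neighbors that fall inside the grid
            for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):
                rr, cc = r + dr, c + dc
                if 0 <= rr < rows and 0 <= cc < cols:
                    w[i][rr * cols + cc] = 1.0
    return w


def row_normalize(w):
    """Divide each row by its sum so that every non-empty row sums to 1."""
    out = []
    for row in w:
        s = sum(row)
        out.append([v / s if s else 0.0 for v in row])
    return out


w = binary_w(4, 4)
w_norm = row_normalize(w)
print(w[5])       # 6th row of binary W: ones in columns 1, 4, 6, 9 (0-based)
print(w_norm[5])  # 6th row of row-normalized W: 0.25 in those columns
```

A distance-based or 8-neighbor definition would change only the offsets tested inside `binary_w`; the row normalization step stays the same.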

6
Linear Regression → SAR
y = xβ + ε → y = ρWy + xβ + ε
The spatial auto-regression (SAR) model has higher accuracy and removes the IID assumption of linear regression
SDM provides a better model!

7
Data Structures in SAR Model
Vectors: y, β, ε
Matrices: W, x
W is a large matrix
y = ρWy + xβ + ε
Dimensions: y is n-by-1, ρ is 1-by-1, W is n-by-n, x is n-by-k, β is k-by-1, ε is n-by-1
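To make the shapes concrete, the sketch below builds each data structure for a tiny problem (n = 4, k = 2) and recovers y from the SAR equation by solving (I − ρW)y = xβ + ε. All numeric values, the 2-by-2 grid and the helper names `matvec` and `solve` are assumptions chosen for illustration; they are not taken from the talk.

```python
def matvec(a, v):
    """Multiply matrix a (list of rows) by vector v."""
    return [sum(a_ij * v_j for a_ij, v_j in zip(row, v)) for row in a]


def solve(a, b):
    """Solve a z = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b_i] for row, b_i in zip(a, b)]  # augmented matrix
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[piv] = m[piv], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    z = [0.0] * n
    for r in range(n - 1, -1, -1):  # back substitution
        s = sum(m[r][c] * z[c] for c in range(r + 1, n))
        z[r] = (m[r][n] - s) / m[r][r]
    return z


n, k = 4, 2
rho = 0.5                                   # 1-by-1 SAR parameter
w = [[0.0, 0.5, 0.5, 0.0],                  # n-by-n row-normalized W (2x2 grid)
     [0.5, 0.0, 0.0, 0.5],
     [0.5, 0.0, 0.0, 0.5],
     [0.0, 0.5, 0.5, 0.0]]
x = [[1.0, 2.0], [1.0, 0.0], [1.0, 1.0], [1.0, 3.0]]  # n-by-k
beta = [1.0, 0.5]                           # k-by-1
eps = [0.1, -0.2, 0.05, 0.0]                # n-by-1

# y = rho W y + x beta + eps  <=>  (I - rho W) y = x beta + eps
rhs = [xb + e for xb, e in zip(matvec(x, beta), eps)]
a = [[(1.0 if i == j else 0.0) - rho * w[i][j] for j in range(n)]
     for i in range(n)]
y = solve(a, rhs)                           # n-by-1

# Check that y satisfies the SAR equation.
wy = matvec(w, y)
residual = max(abs(y[i] - (rho * wy[i] + rhs[i])) for i in range(n))
print(y, residual)
```

The dense storage of W here only works because n is tiny; at the problem sizes in the talk W is large and sparse.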

8
Computational Challenge
Maximum-likelihood estimation = minimizing the log-likelihood function (Theorem 1), which has a log-det term and an SSE term
Solving the SAR model
  – ρ = 0 → least squares problem
  – β = 0, ε = 0 → eigenvalue problem
  – General case → computationally expensive due to the log-det term in the ML function
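The log-det term ln|I − ρW| can be computed exactly in two equivalent ways: as the sum of ln(1 − ρλ) over all eigenvalues λ of W, or from the pivots of a dense factorization of I − ρW. The sketch below compares the two on a 2-by-2 grid, where the eigenvalues of W are known to be 1, 0, 0 and −1. The function name `exact_logdet` and the tiny test matrix are assumptions for illustration; both routes need all eigenvalues or a dense O(n³) factorization, which is the cost the slide refers to.

```python
import math


def exact_logdet(w, rho):
    """ln|I - rho W| from the pivots of Gaussian elimination (dense, O(n^3))."""
    n = len(w)
    m = [[(1.0 if i == j else 0.0) - rho * w[i][j] for j in range(n)]
         for i in range(n)]
    total = 0.0
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[piv] = m[piv], m[col]
        total += math.log(abs(m[col][col]))  # |det| is the product of |pivots|
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n):
                m[r][c] -= f * m[col][c]
    return total


# Row-normalized 4-neighbor W on a 2x2 grid (illustrative).
w = [[0.0, 0.5, 0.5, 0.0],
     [0.5, 0.0, 0.0, 0.5],
     [0.5, 0.0, 0.0, 0.5],
     [0.0, 0.5, 0.5, 0.0]]
eigenvalues = [1.0, 0.0, 0.0, -1.0]  # known spectrum of this particular W

rho = 0.5
by_elimination = exact_logdet(w, rho)
by_eigenvalues = sum(math.log(1.0 - rho * lam) for lam in eigenvalues)
print(by_elimination, by_eigenvalues)  # both equal ln(0.75)
```

With ρ = 0 the log-det term vanishes, which is why that special case reduces to ordinary least squares.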

9
Outline
Motivation & Background
Problem Definition
Related Work & Contributions
Proposed Approach
Experimental Evaluation
Conclusion & Future Work

10
Problem Statement
Given:
  – A spatial framework S consisting of sites {s_1, …, s_q} for an underlying geographic space G
  – A collection of explanatory functions f_{x_k} : S → ℝ^k, k = 1, …, K, where ℝ^k is the range of possible values for the explanatory functions
  – A dependent function f_y with range ℝ^y
  – A family F (the SAR equation) of learning model functions mapping ℝ^1 × … × ℝ^K → ℝ^y
  – A neighborhood relationship (4- and 8-neighbor) on the spatial framework
Find: the SAR parameter ρ and the regression coefficient vector β with a desired precision to save log-det computations.

11
Problem Statement (Cont'd)
Objective: algebraic error ranking of approximate SAR model solutions.
Constraints:
  – S is a multi-dimensional Euclidean space
  – The values of the explanatory variables x and the dependent function (observed variable) y may not be independent with respect to those of nearby spatial sites, i.e., spatial autocorrelation exists
  – The domains of x and y are the real numbers
  – The SAR parameter ρ varies in the range [0,1)
  – The error ε is normally distributed with zero mean and unit standard deviation, i.e., ε ~ N(0, σ²I) IID
  – The neighborhood matrix W exhibits sparsity

12
Related Work

13
Contributions
A new approximate SAR model solution: the Gauss-Lanczos approximation method
  – Key idea: do not find all of the eigenvalues of W
Error ranking of approximate SAR model solutions

14
Outline
Motivation & Background
Problem Definition
Related Work & Contributions
Proposed Approach
Experimental Evaluation
Conclusion & Future Work

15
Gauss-Lanczos Approximation
The log-det term is approximated by transforming the eigenvalue problem into a quadratic form.
Gauss-type quadrature rules are then applied using the Lanczos procedure.

16
How does the GL Method Work?
GL (Algorithm 3.2) is repeated m times (m = 400 in our experiments).
The parameter r varies between 5 and 8 in our experiments.
For large problem sizes, m and r have little effect on the quality of the solution.
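The idea can be sketched as follows: for a unit start vector u, r Lanczos steps on A = I − ρW produce a small tridiagonal matrix T whose eigenvalues θ and first eigenvector components τ give the Gauss quadrature estimate u'ln(A)u ≈ Σ τ²·ln(θ); averaging n times this estimate over m random start vectors approximates ln|A|. The code below is a minimal sketch under assumptions, not a transcription of Algorithm 3.2: the ±1 start vectors, the Jacobi eigen-solver for T, and every function name are choices made for illustration.

```python
import math
import random


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def lanczos(a, u, r):
    """Up to r Lanczos steps on symmetric a from unit vector u.
    Returns the diagonal and off-diagonal of the tridiagonal matrix T."""
    alphas, betas = [], []
    q_prev, q, beta = [0.0] * len(u), u[:], 0.0
    for _ in range(r):
        v = [dot(row, q) for row in a]
        alpha = dot(q, v)
        v = [vi - alpha * qi - beta * pv for vi, qi, pv in zip(v, q, q_prev)]
        alphas.append(alpha)
        beta = math.sqrt(dot(v, v))
        if beta < 1e-12:  # invariant subspace reached, T is complete
            break
        betas.append(beta)
        q_prev, q = q, [vi / beta for vi in v]
    return alphas, betas[:len(alphas) - 1]


def jacobi_eigh(t, sweeps=60):
    """Eigenvalues and eigenvectors (columns of v) of a small symmetric
    matrix by cyclic Jacobi rotations."""
    n = len(t)
    a = [row[:] for row in t]
    v = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for _ in range(sweeps):
        off = sum(a[i][j] ** 2 for i in range(n) for j in range(i + 1, n))
        if off < 1e-24:
            break
        for p in range(n - 1):
            for q in range(p + 1, n):
                if abs(a[p][q]) < 1e-15:
                    continue
                theta = (a[q][q] - a[p][p]) / (2.0 * a[p][q])
                tn = math.copysign(1.0, theta) / (abs(theta) + math.hypot(theta, 1.0))
                c = 1.0 / math.hypot(tn, 1.0)
                s = tn * c
                for k in range(n):  # rotate columns p and q of a and v
                    a[k][p], a[k][q] = c * a[k][p] - s * a[k][q], s * a[k][p] + c * a[k][q]
                    v[k][p], v[k][q] = c * v[k][p] - s * v[k][q], s * v[k][p] + c * v[k][q]
                for k in range(n):  # rotate rows p and q of a
                    a[p][k], a[q][k] = c * a[p][k] - s * a[q][k], s * a[p][k] + c * a[q][k]
    return [a[i][i] for i in range(n)], v


def gl_logdet(a, r=6, m=400, seed=0):
    """Estimate ln det(a) for a symmetric positive-definite matrix a."""
    rng = random.Random(seed)
    n = len(a)
    total = 0.0
    for _ in range(m):
        # Assumed start vector: random +/-1 entries scaled to unit length.
        u = [rng.choice((-1.0, 1.0)) / math.sqrt(n) for _ in range(n)]
        alphas, betas = lanczos(a, u, r)
        size = len(alphas)
        t = [[0.0] * size for _ in range(size)]
        for i in range(size):
            t[i][i] = alphas[i]
            if i + 1 < size:
                t[i][i + 1] = t[i + 1][i] = betas[i]
        theta, vecs = jacobi_eigh(t)
        # Gauss quadrature: weights are squared first components of eigenvectors.
        total += sum(vecs[0][k] ** 2 * math.log(theta[k]) for k in range(size))
    return n * total / m


rho = 0.5
w = [[0.0, 0.5, 0.5, 0.0], [0.5, 0.0, 0.0, 0.5],
     [0.5, 0.0, 0.0, 0.5], [0.0, 0.5, 0.5, 0.0]]
a = [[(1.0 if i == j else 0.0) - rho * w[i][j] for j in range(4)] for i in range(4)]
estimate = gl_logdet(a, r=4, m=400)
print(estimate, math.log(1.0 - rho * rho))  # stochastic estimate vs. exact value
```

Only matrix-vector products with A are needed, so a sparse W is never factorized and no full eigen-decomposition of W is computed; the eigen-solve is on the r-by-r matrix T only. The estimate depends on the seed and on m, consistent with the sensitivity to the start vector and random number generator noted in the conclusions.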

17
Taylor's Series Approximation
Log-det term expressed as a Taylor series
  – The trace is the sum of the eigenvalues, and W is the symmetrized neighborhood matrix
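The standard series behind this approximation is ln|I − ρW| = −Σ_{k≥1} ρ^k·tr(W^k)/k, truncated after a fixed number of terms. The sketch below applies it to a symmetric W on a 2-by-2 grid, where the exact value is ln(1 − ρ²). The truncation length, the test matrix and the function names are assumptions for illustration; the number of terms used in the talk is not stated in the text.

```python
import math


def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def taylor_logdet(w, rho, terms=30):
    """Truncated series: ln|I - rho W| ~ -sum_{k=1..terms} rho^k tr(W^k) / k."""
    n = len(w)
    power = [row[:] for row in w]  # holds W^k, starting at k = 1
    total = 0.0
    for k in range(1, terms + 1):
        trace = sum(power[i][i] for i in range(n))
        total -= rho ** k * trace / k
        if k < terms:
            power = matmul(power, w)
    return total


# Symmetric 4-neighbor W on a 2x2 grid (illustrative).
w = [[0.0, 0.5, 0.5, 0.0],
     [0.5, 0.0, 0.0, 0.5],
     [0.5, 0.0, 0.0, 0.5],
     [0.0, 0.5, 0.5, 0.0]]

rho = 0.5
print(taylor_logdet(w, rho, terms=2))   # two terms only: -0.25
print(taylor_logdet(w, rho, terms=30))  # close to the exact value below
print(math.log(1.0 - rho * rho))        # exact: ln(0.75)
```

The dense matrix powers here are for readability only. The terms decay like ρ^k, so more terms are needed as ρ approaches 1.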

18
Chebyshev Polynomial Approximation
Log-det term expressed in Chebyshev polynomials
  – The trace is the sum of the eigenvalues, the T's are matrix polynomials, and the c's are Chebyshev polynomial coefficients
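One common way to write this approximation is ln|I − ρW| ≈ Σ_j c_j(ρ)·tr(T_j(W)) − (n/2)·c_0(ρ), with T_0 = I, T_1 = W, T_{j+1} = 2W·T_j − T_{j−1}, and the c_j obtained by sampling ln(1 − ρx) at the Chebyshev nodes. The sketch below follows that textbook form and assumes the eigenvalues of W lie in [−1, 1]; the number of terms, the test matrix and the function names are illustrative, and the exact variant shown on the slide is not recoverable from the text.

```python
import math


def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def trace(a):
    return sum(a[i][i] for i in range(len(a)))


def chebyshev_logdet(w, rho, terms=20):
    """Chebyshev approximation of ln|I - rho W| for symmetric W with
    eigenvalues in [-1, 1]."""
    n = len(w)
    # Coefficients c_j of ln(1 - rho x), sampled at the Chebyshev nodes.
    angles = [math.pi * (k + 0.5) / terms for k in range(terms)]
    fvals = [math.log(1.0 - rho * math.cos(t)) for t in angles]
    coef = [2.0 / terms * sum(f * math.cos(j * t) for f, t in zip(fvals, angles))
            for j in range(terms)]
    # Traces of the matrix polynomials T_0 = I, T_1 = W, T_{j+1} = 2 W T_j - T_{j-1}.
    t_prev = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    t_curr = [row[:] for row in w]
    traces = [float(n), trace(t_curr)]
    for _ in range(2, terms):
        prod = matmul(w, t_curr)
        t_next = [[2.0 * prod[i][j] - t_prev[i][j] for j in range(n)]
                  for i in range(n)]
        traces.append(trace(t_next))
        t_prev, t_curr = t_curr, t_next
    return sum(c * tr for c, tr in zip(coef, traces)) - 0.5 * n * coef[0]


# Symmetric 4-neighbor W on a 2x2 grid (illustrative).
w = [[0.0, 0.5, 0.5, 0.0],
     [0.5, 0.0, 0.0, 0.5],
     [0.5, 0.0, 0.0, 0.5],
     [0.0, 0.5, 0.5, 0.0]]

rho = 0.5
approx = chebyshev_logdet(w, rho)
print(approx, math.log(1.0 - rho * rho))  # approximation vs. exact ln(0.75)
```

As with the Taylor series, only traces of matrix polynomials in W are needed, so no eigenvalues are computed.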

19
Outline
Motivation & Background
Problem Definition
Related Work & Contributions
Proposed Approach
Experimental Evaluation
Conclusion & Future Work

20
Experiment Design
Factor Name: Parameter Domain
Problem Size (n): 400, 1600, 2500 observation points
Neighborhood Structure: 2-D with 4-neighbors
Candidates: Exact Approach (Eigenvalue Based); Taylor's Series Approximation; Chebyshev Polynomial Approximation; Gauss-Lanczos Approximation
Dataset: Synthetic dataset for ρ = 0.1, 0.2, …, 0.9
SAR Parameter ρ: [0,1)
Programming Language: Matlab

21
Exact and Approximate Values of Log-det
GL gives a better approximation as spatial autocorrelation increases

22
Absolute Relative Error of Approximations
The absolute relative error of the approximation goes down as spatial autocorrelation increases (GL mean error 0.9%, GL max error 1.78%)
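The error measure itself is simple: the absolute relative error in percent is |approximate − exact| / |exact| × 100, summarized by its mean and maximum over the tested values of ρ. The sketch below shows the bookkeeping on a toy case and does not reproduce the experiment: the "exact" value is the closed form ln(1 − ρ²) for a 2-by-2 grid, the "approximation" is a four-term truncated series for the same quantity, and all function names are hypothetical.

```python
import math


def abs_rel_error_pct(approx, exact):
    """Absolute relative error of approx with respect to exact, in percent."""
    return abs(approx - exact) / abs(exact) * 100.0


def toy_exact(rho):
    """Exact log-det for a 2x2 grid with W eigenvalues 1, 0, 0, -1 (toy case)."""
    return math.log(1.0 - rho * rho)


def toy_approx(rho, terms=4):
    """Truncated series for the same quantity (stand-in for an approximation)."""
    return -sum(rho ** (2 * j) / j for j in range(1, terms + 1))


rhos = [k / 10 for k in range(1, 10)]  # rho = 0.1, 0.2, ..., 0.9
errors = [abs_rel_error_pct(toy_approx(r), toy_exact(r)) for r in rhos]
mean_err = sum(errors) / len(errors)
max_err = max(errors)
print(mean_err, max_err)
```

Replacing `toy_approx` with any of the candidate methods and `toy_exact` with the eigenvalue-based solution gives the kind of per-ρ comparison summarized on this slide.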

23
Conclusions
GL is slightly more expensive than Taylor series and Chebyshev polynomials.
GL gives better approximations when spatial autocorrelation is high and the problem size is large.
GL quality depends on the number of iterations, the initial Lanczos vector, and the random number generator.
No need to compute all eigenvalues.

24
Acknowledgments
AHPCRC
Minnesota Supercomputing Institute (MSI)
Spatial Database Group Members
ARCTiC Labs Group Members
Dr. Dan Boley, Dr. Sanjay Chawla, Dr. Vipin Kumar, Dr. James LeSage, Dr. Kelley Pace, Dr. Pen-Chung Yew
THANK YOU VERY MUCH
Q/A
