Synthesis.


1 Synthesis

2 Nature of spatial data
Geographical/spatial data
Spatial vs. non-spatial statistical analysis
Properties of spatial data: spatial dependence, spatial heterogeneity

3 Properties of spatial data
Spatial dependence. The first law of geography: all things are related, but nearby things are more related than distant things.

4 Properties of spatial data
Spatial heterogeneity. The second law of geography (a law of spatial heterogeneity): conditions vary (“smoothly”) over the Earth's surface.

5 Properties of spatial data
The properties of geographical data present a fundamental challenge to conventional statistics. They violate the classical assumptions of independence and homogeneity (stationarity), which renders classical methods inefficient or inappropriate.

6 Exploratory Spatial Data Analysis (ESDA)
ESDA techniques
Spatial heterogeneity (homogeneity): linked histogram, linked box plot, scatter plot, conditional plot
Spatial dependence: covariogram, correlogram and variogram

7 What is exploratory spatial data analysis (ESDA)?
Detecting spatial dependence
Detecting spatial heterogeneity (or, conversely, homogeneity/stationarity)

8 GIS data and linking
GIS data: geographical data (maps) and attribute data (tables, graphs, etc.)
Linking: geographical data (maps) in geographical space are linked to attribute data in attribute space

9 Linking: dynamic graphics
Visualizing data in the attribute space and the geographical space simultaneously
Useful for exploring the spatial stationarity (homogeneity) of spatial patterns and processes

10 Conditional scatter plot: example

11 Spatial dependence
Providing a description of how the data are related (correlated) with distance (and direction)
Three methods: covariogram, correlogram, variogram (semi-variogram)
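The idea of describing how values relate with distance can be sketched with an empirical semivariogram. The following is a minimal sketch, not the method used in the slides: the function name `empirical_semivariogram`, the distance bins, and the four sample points are hypothetical, and direction is ignored (an isotropic variogram is assumed).

```python
import numpy as np

def empirical_semivariogram(coords, values, bin_edges):
    """Return (pair counts, semivariances) per distance bin.

    For each bin, gamma(d) = sum((z_i - z_j)^2) / (2 * N(d)), taken over
    the N(d) point pairs whose separation falls in (lower, upper].
    """
    coords = np.asarray(coords, dtype=float)
    values = np.asarray(values, dtype=float)
    i, j = np.triu_indices(len(values), k=1)      # all unique pairs i < j
    dist = np.linalg.norm(coords[i] - coords[j], axis=1)
    sqdiff = (values[i] - values[j]) ** 2
    counts, gammas = [], []
    for lower, upper in zip(bin_edges[:-1], bin_edges[1:]):
        mask = (dist > lower) & (dist <= upper)
        n_pairs = int(mask.sum())
        counts.append(n_pairs)
        gammas.append(sqdiff[mask].sum() / (2 * n_pairs) if n_pairs else float("nan"))
    return counts, gammas

# Hypothetical data: four points on a line, one unit apart
coords = [(0, 0), (1, 0), (2, 0), (3, 0)]
values = [1.0, 2.0, 4.0, 7.0]
counts, gammas = empirical_semivariogram(coords, values, [0, 1, 2, 3])
print(counts, gammas)
```

In this toy data set the semivariance grows with distance, which is the signature of spatial dependence: nearby points are more alike than distant ones.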

12 Covariogram, Correlogram and Variogram
[Figure: covariogram, correlogram and variogram as functions of distance; vertical-axis marks at 1.0 and σ²]

13

14 Spatial weights
Contiguity weights
Distance weights
Higher order contiguity (neighbors)
Properties of spatial weights

15 Spatial weights
Spatial weights define the spatial relationships among spatial objects (e.g., polygons, rasters, points).
Spatial weights are used to identify the spatial contiguity or neighborhood of a given object.
Spatial weights matrix: for n objects, there are n × n pairs of relationships.

16 Contiguity weights
Adjacent cells only (rook contiguity)
Adjacent and diagonal cells (queen contiguity)

17 Contiguity weights: Binary Connectivity Matrix
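$w_{ij} = 1$ if the i-th object is adjacent to the j-th object; $w_{ij} = 0$ otherwise. If $i = j$, then $w_{ij} = 0$.
$W = \begin{pmatrix} w_{11} & w_{12} & \cdots & w_{1n} \\ w_{21} & w_{22} & \cdots & w_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ w_{n1} & w_{n2} & \cdots & w_{nn} \end{pmatrix}$ (rows indexed by i, columns by j)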
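A binary connectivity matrix of this kind can be built for a regular grid as follows. This is a minimal sketch under stated assumptions: the function name `grid_contiguity`, the row-major cell numbering, and the 3 × 3 example grid are all hypothetical choices for illustration.

```python
import numpy as np

def grid_contiguity(n_rows, n_cols, queen=False):
    """Binary contiguity matrix for a regular grid, cells numbered row by row.

    queen=False: only edge-adjacent cells are neighbors.
    queen=True:  diagonal cells are neighbors as well.
    """
    n = n_rows * n_cols
    W = np.zeros((n, n), dtype=int)
    offsets = [(-1, 0), (1, 0), (0, -1), (0, 1)]
    if queen:
        offsets += [(-1, -1), (-1, 1), (1, -1), (1, 1)]
    for r in range(n_rows):
        for c in range(n_cols):
            i = r * n_cols + c
            for dr, dc in offsets:
                rr, cc = r + dr, c + dc
                if 0 <= rr < n_rows and 0 <= cc < n_cols:
                    W[i, rr * n_cols + cc] = 1   # w_ij = 1 for adjacent cells
    return W                                      # diagonal stays 0 (i = j)

W_adjacent = grid_contiguity(3, 3)
W_diagonal = grid_contiguity(3, 3, queen=True)
print(W_adjacent[4].sum(), W_diagonal[4].sum())   # neighbors of the center cell
```

The center cell of the 3 × 3 grid has four neighbors under the first definition and eight under the second, and both matrices are symmetric with a zero diagonal.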

18 Distance weights: Distance functions

19 Higher order contiguity (neighbors)
Pure contiguity: does not include objects that were contiguous of a lower order.

20 Higher order contiguity (neighbors)
Cumulative contiguity: includes all lower order neighbors
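The difference between pure and cumulative higher order contiguity can be sketched from a first-order binary matrix. This is an illustrative sketch only: the function name `contiguity_orders` and the four-object chain are hypothetical, and the order of a neighbor is assumed to be the smallest number of contiguity steps needed to reach it.

```python
import numpy as np

def contiguity_orders(W, max_order):
    """Return {order: (pure, cumulative)} binary matrices.

    pure[k]       : objects first reached in exactly k contiguity steps
    cumulative[k] : all neighbors of order 1..k
    """
    W = (np.asarray(W) > 0).astype(int)
    n = W.shape[0]
    reached = np.eye(n, dtype=int)        # each object has "reached" itself
    frontier = np.eye(n, dtype=int)
    cumulative = np.zeros((n, n), dtype=int)
    result = {}
    for order in range(1, max_order + 1):
        step = ((frontier @ W) > 0).astype(int)
        pure = step * (1 - reached)       # drop lower order neighbors and self
        reached = reached | pure
        cumulative = cumulative | pure
        frontier = pure
        result[order] = (pure, cumulative.copy())
    return result

# Hypothetical chain of four objects: 0 - 1 - 2 - 3
W = np.array([[0, 1, 0, 0],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [0, 0, 1, 0]])
pure2, cum2 = contiguity_orders(W, 2)[2]
print(pure2[0], cum2[0])
```

For object 0, the pure second-order neighbor is object 2 only, while the cumulative second-order set contains objects 1 and 2.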

21 Properties of spatial weights
Connectivity histogram: the distribution of the number of neighbors per object in the spatial weights
The connectivity histogram should be approximately normally distributed
It helps detect unusual features of the spatial weights distribution:
islands (unconnected objects)
bimodal distribution (some objects have very few and others very many neighbors)

22 Global spatial autocorrelation statistics
Univariate spatial autocorrelation Bivariate spatial autocorrelation

23 Spatial autocorrelation
The first law of geography: events (objects) at nearby locations are more correlated than events (objects) located far apart.
Attribute values at one location are in part determined by the values at the neighboring locations (spatial dependence).

24 Univariate Spatial autocorrelation
Univariate spatial autocorrelation analysis measures the similarity or dissimilarity of values of the same variable at locations separated by given distances (lags).
Moran’s I coefficient is the most commonly used measure of spatial autocorrelation.

25 Global spatial autocorrelation: Moran’s I coefficient
$I = \frac{n}{\sum_i \sum_j w_{ij}} \cdot \frac{\sum_i \sum_j w_{ij} (x_i - \bar{x})(x_j - \bar{x})}{\sum_i (x_i - \bar{x})^2}$
$(x_i - \bar{x})(x_j - \bar{x})$: the cross-product of the deviations of the i-th and the j-th observations from the global mean
$\sum_i (x_i - \bar{x})^2$: the variance term of the data set (sum of squared deviations)
$n / \sum_i \sum_j w_{ij}$: the ratio of the number of data points (areas) to the total number of connections between the points (areas)
I near -1.0: strong negative spatial autocorrelation (dissimilar attribute values tend to cluster)
I near -1/(n-1): random pattern (attribute values tend to be randomly scattered)
I near +1.0: strong positive spatial autocorrelation (similar attribute values tend to cluster)
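The three components of Moran's I (cross-products, variance term, and the ratio of points to connections) can be combined in a few lines. This is a minimal sketch, assuming a binary contiguity matrix; the function name `morans_i` and the four-object chain with its attribute values are hypothetical.

```python
import numpy as np

def morans_i(x, W):
    """Global Moran's I for values x and spatial weights matrix W."""
    x = np.asarray(x, dtype=float)
    W = np.asarray(W, dtype=float)
    z = x - x.mean()                      # deviations from the global mean
    n = len(x)
    cross_products = z @ W @ z            # sum_i sum_j w_ij * z_i * z_j
    return (n / W.sum()) * cross_products / (z @ z)

# Hypothetical chain of four areas: 0 - 1 - 2 - 3
W = np.array([[0, 1, 0, 0],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [0, 0, 1, 0]])
i_clustered = morans_i([1, 1, 5, 5], W)     # similar values side by side
i_alternating = morans_i([1, 5, 1, 5], W)   # dissimilar values side by side
print(i_clustered, i_alternating)
```

The clustered arrangement gives a positive coefficient (1/3 here), while the alternating arrangement gives -1.0, the strong negative case.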

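26 Moran’s I coefficient: Test of significance
Under the null hypothesis of no spatial autocorrelation, Moran’s I is approximately normally distributed, so the z-statistic, $z(I) = (I - E(I)) / \sqrt{\mathrm{Var}(I)}$, can be used to test the significance of the coefficient.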

27 Moran’s I coefficient in ArcGIS
Since the value of z(I) = 5.1 is greater than 2.58, the difference between I = 0.27 and E(I) ≈ 0 is significant; therefore, the Moran’s I coefficient is statistically significant at the 0.01 level

28 Significance for Moran’s I: example
[Figure: reference distribution of Moran’s I with its mean marked]

29 Significance for Moran’s I : example
[Figure: envelope slopes]

30 Global bivariate spatial autocorrelation: Moran’s I coefficient
The coefficient combines three terms:
the cross-product of the deviations of the i-th observation of variable x and the j-th observation of variable y from their respective global means
the variance of the data set
the ratio of the number of data points (areas) to the total number of connections between the points (areas)

31 [Figure: diagram with labels x, y and i]

32 Local spatial autocorrelation statistics
Univariate local indicators of spatial association (LISA)
Multivariate local indicators of spatial association

33 Univariate local spatial autocorrelation: Moran’s Ii coefficient
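$I_i = \frac{(x_i - \bar{x})}{s^2} \sum_j w_{ij} (x_j - \bar{x})$
$(x_i - \bar{x})$ and $(x_j - \bar{x})$: the deviations of the i-th and the j-th observations from the global mean
$s^2$: the variance of the data set
$\sum_j w_{ij}$: the sum over the spatial weights representing the strength of the linkage between i and j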

34 Univariate local spatial autocorrelation: Moran’s Ii coefficient
Ii < 0.0: negative local spatial autocorrelation
Ii = 0.0: random pattern
Ii > 0.0: positive local spatial autocorrelation
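The local coefficient can be computed per object with the same ingredients as the global one. The sketch below is illustrative only: the function name `local_morans_i` and the example chain are hypothetical, and the variance is assumed to be the sum of squared deviations divided by n (some software divides by n - 1 instead).

```python
import numpy as np

def local_morans_i(x, W):
    """Local Moran's I_i for each object, given values x and weights W."""
    x = np.asarray(x, dtype=float)
    W = np.asarray(W, dtype=float)
    z = x - x.mean()                 # deviations from the global mean
    s2 = (z @ z) / len(x)            # assumption: variance with divisor n
    return (z / s2) * (W @ z)        # (z_i / s^2) * sum_j w_ij * z_j

# Hypothetical chain of four areas: 0 - 1 - 2 - 3
W = np.array([[0, 1, 0, 0],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [0, 0, 1, 0]])
Ii = local_morans_i([1, 1, 5, 5], W)
print(Ii)                            # one coefficient per area
print(Ii.sum() / W.sum())            # equals the global Moran's I under this variance choice
```

With this variance convention, the local coefficients sum to the global Moran's I multiplied by the total of the weights, which is a convenient consistency check. The two end areas have positive values (each sits next to a similar value), while the two middle areas have one similar and one dissimilar neighbor, which cancel out.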

35 Moran’s Ii coefficient: Example

36 Moran’s Ii coefficient: Cluster map

37 Local bivariate spatial autocorrelation: Moran’s Ii coefficient
The coefficient combines three terms:
the deviations of the i-th observation of variable x and the j-th observation of variable y from their respective global means
the variance of the data set
the sum over the spatial weights representing the strength of the linkage between i and j

38 Bivariate Moran’s Ii coefficient: Example

39 Bivariate Moran’s Ii coefficient: Example

40 Spatial regression
Regression and spatial regression
Spatial lag (SL) model
Spatial error (SE) model

41 Simple regression
A dependent variable, Y, is considered to be a function of a single independent variable, X.
The functional relation between Y and X is linear; that is, Y = a + bX.

42 Multiple regression equation
$y_i = a + b_1 x_1 + b_2 x_2 + \dots + b_n x_n + e_i$
yi = dependent variable
x1, x2 ... xn = independent variables
a = constant (intercept)
b1, b2 ... bn = regression coefficients
ei = error term (residual, or difference between observed and predicted values of yi)

43 Multiple regression equation: assumptions
No multicollinearity: the independent variables are not intercorrelated.
Normality: the residuals are normally distributed.
Homoskedasticity (equal variance): the residuals are dispersed randomly throughout the range of the estimated dependent variable.
Spatial independence: there is no spatial autocorrelation of the residuals.

44 Regression and spatial regression

45 Example

46 Explained variance of the dependent variable
H0: b1 = b2 = 0
HA: at least one of b1 and b2 ≠ 0

47 Sigma-square = sum of squared residuals / degrees of freedom = 6014…
Sigma-square ML = sum of squared residuals / number of observations = …/49 = …
S.E. of regression = √(Sigma-square) = …
S.E. of regression ML = √(Sigma-square ML) = …

48 AIC = -2L + 2K = -2 × (-187.377) + 2 × 3 = 380.754
SC = -2L + K ln(N) = -2 × (-187.377) + 3 × ln(49) = 386.43
where L is the log-likelihood, K is the number of estimated parameters and N is the number of observations
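The two information criteria can be checked numerically from the values given on this slide (L = -187.377, K = 3, N = 49). This is a small sketch; the function names `aic` and `schwarz_criterion` are hypothetical helpers, not part of any particular package.

```python
import math

def aic(log_likelihood, k):
    """Akaike information criterion: -2L + 2K."""
    return -2.0 * log_likelihood + 2.0 * k

def schwarz_criterion(log_likelihood, k, n):
    """Schwarz criterion: -2L + K ln(N)."""
    return -2.0 * log_likelihood + k * math.log(n)

L, K, N = -187.377, 3, 49          # values taken from the slide
aic_value = aic(L, K)
sc_value = schwarz_criterion(L, K, N)
print(round(aic_value, 3), round(sc_value, 2))
```

Both criteria penalize the log-likelihood for the number of parameters; the Schwarz criterion penalizes more heavily once ln(N) exceeds 2, which is the case for N = 49.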

49 Example
CRIMEi = a + b1x1 + b2x2 + ei

50 Example: Regression Diagnostics
Spatial regression model selection

51 Spatial lag (SL) model
$y = a + \rho W y + b_1 x_1 + \dots + b_n x_n + e$, where $W y$ is the spatially lagged dependent variable y
y = dependent variable
x1, x2 ... xn = independent variables
a = constant (intercept)
ρ (rho) = spatial autoregressive coefficient
b1 ... bn = regression coefficients
e = error term (residual, or difference between predicted and observed values of y)
The parameters of the spatial lag model are estimated by the maximum likelihood (ML) method, that is, by maximizing the probability (likelihood) of the sample data.
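The spatially lagged dependent variable in the SL model is, for each object, a weighted combination of the values of y at its neighbors. The sketch below computes it under one common assumption that the slides do not state: the weights are row-standardized, so the lag is the average of the neighboring values. The function names and the example chain are hypothetical, and the sketch does not perform the ML estimation itself.

```python
import numpy as np

def row_standardize(W):
    """Scale each row of W to sum to 1; rows of islands stay all zero."""
    W = np.asarray(W, dtype=float)
    row_sums = W.sum(axis=1, keepdims=True)
    return np.divide(W, row_sums, out=np.zeros_like(W), where=row_sums > 0)

def spatial_lag(W, y):
    """Spatially lagged variable: average of y over each object's neighbors."""
    return row_standardize(W) @ np.asarray(y, dtype=float)

# Hypothetical chain of four areas: 0 - 1 - 2 - 3
W = np.array([[0, 1, 0, 0],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [0, 0, 1, 0]])
y = [10.0, 20.0, 30.0, 40.0]
lagged_y = spatial_lag(W, y)
print(lagged_y)
```

Area 1, for example, has neighbors 0 and 2, so its lagged value is the mean of 10 and 30.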

52 Spatial error (SE) model
$y = a + \lambda W e + b_1 x_1 + b_2 x_2 + \dots + b_n x_n + u$, where $W e$ denotes the spatially autoregressive errors e
y = dependent variable
x1, x2 ... xn = independent variables
a = constant (intercept)
λ = spatial autoregressive coefficient
b1, b2 ... bn = regression coefficients
e = error term (residual, or difference between predicted and observed values of y for the OLS model)
u = error term (residual, or difference between predicted and observed values of y for the SE model)
The parameters of the spatial error model are estimated by the maximum likelihood (ML) method.

53 Spatial regression model selection rules
Diagnostics for spatial autocorrelation are based on the Lagrange Multiplier (LM) tests.
The tests compare the non-spatial regression (OLS) model with the SL and SE models.

54 Example: Spatial regression model selection
The probability (PROB) of each LM test is compared with the 0.05 level:
PROB below 0.05: there is a significant difference between the OLS and the spatial (SL and SE) models
PROB of 0.05 or above: there is an insignificant difference between the OLS and the spatial (SL and SE) models
If the two spatial models are not significantly different, select the one with the higher value of the test statistic.

55 Spatial Interpolation
Classification
Thiessen polygons
Inverse distance weighting
Trend surface analysis
Kriging

56 Definition
Spatial interpolation is a procedure for estimating unknown attribute values using control (or sample) points with known attribute values.

57 Classification
* Given some required assumptions, trend surface analysis can be treated as a special case of regression analysis and thus as a stochastic method.

58 Thiessen polygons
Thiessen polygons are constructed around known (control) points so that any point within a Thiessen polygon is closer to the polygon's known point than to any other control point.
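Because every location inside a Thiessen polygon is closest to that polygon's control point, interpolation with Thiessen polygons amounts to assigning each unknown location the value of its nearest control point. The following is a minimal sketch of that assignment; the function name `thiessen_assign` and the sample coordinates and values are hypothetical, and the polygon boundaries themselves are not constructed.

```python
import numpy as np

def thiessen_assign(control_xy, control_values, query_xy):
    """Assign each query location the value of its nearest control point."""
    control_xy = np.asarray(control_xy, dtype=float)
    query_xy = np.asarray(query_xy, dtype=float)
    # Distance from every query location to every control point
    d = np.linalg.norm(query_xy[:, None, :] - control_xy[None, :, :], axis=2)
    nearest = d.argmin(axis=1)
    return nearest, np.asarray(control_values)[nearest]

# Hypothetical control points and values
control_xy = [(0, 0), (4, 0), (0, 4)]
control_values = [10.0, 20.0, 30.0]
queries = [(1, 1), (3, 1), (1, 3.5)]
nearest, estimates = thiessen_assign(control_xy, control_values, queries)
print(nearest, estimates)
```

The result is a step surface: the estimate changes abruptly at polygon boundaries and is constant within each polygon.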

59 Inverse distance weighted interpolation
The method assumes that the unknown attribute value of a point is influenced more by nearby control points than those farther away
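The inverse distance weighting idea can be sketched as a weighted average in which each control point's weight is the inverse of its distance raised to a power. The power of 2 used below is a common default and an assumption here, as are the function name `idw` and the sample points.

```python
import numpy as np

def idw(control_xy, control_values, query_xy, power=2.0):
    """Inverse distance weighted estimates at the query locations."""
    control_xy = np.asarray(control_xy, dtype=float)
    control_values = np.asarray(control_values, dtype=float)
    query_xy = np.asarray(query_xy, dtype=float)
    d = np.linalg.norm(query_xy[:, None, :] - control_xy[None, :, :], axis=2)
    estimates = np.empty(len(query_xy))
    for q in range(len(query_xy)):
        exact = d[q] == 0
        if exact.any():
            # The query coincides with a control point: return its value
            estimates[q] = control_values[exact][0]
        else:
            w = 1.0 / d[q] ** power          # nearer points get larger weights
            estimates[q] = (w @ control_values) / w.sum()
    return estimates

# Hypothetical control points on a line
control_xy = [(0, 0), (2, 0)]
control_values = [10.0, 20.0]
estimates = idw(control_xy, control_values, [(1, 0), (0, 0), (0.5, 0)])
print(estimates)
```

Halfway between the two control points the estimate is their plain average; closer to the first point the estimate is pulled toward its value.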

60 Regression and trend surface analysis
Regression model: any spatial process has two components, deterministic and stochastic.
Trend surface analysis: represents the deterministic component.

61 Trend surface regression models
A linear trend surface has the following form:
$y = a + b_1 X + b_2 Y$
A quadratic trend surface has the following form:
$y = a + b_1 X + b_2 Y + b_3 X^2 + b_4 Y^2 + b_5 X Y$
where X and Y are the X coordinate and the Y coordinate.
The parameters of the trend surface models are estimated using the ordinary least squares (OLS) procedure.
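Both trend surface models can be fitted by ordinary least squares with a design matrix built from the coordinates. The following is a minimal sketch using NumPy's least-squares solver; the function name `fit_trend_surface` and the synthetic 3 × 3 grid of sample points are hypothetical.

```python
import numpy as np

def fit_trend_surface(x_coord, y_coord, z, order=1):
    """OLS coefficients of a linear (order=1) or quadratic (order=2) trend surface.

    Returned order: a, b1 (X), b2 (Y) and, for order=2, b3 (X^2), b4 (Y^2), b5 (XY).
    """
    x, y, z = (np.asarray(v, dtype=float) for v in (x_coord, y_coord, z))
    columns = [np.ones_like(x), x, y]
    if order == 2:
        columns += [x ** 2, y ** 2, x * y]
    design = np.column_stack(columns)
    coef, *_ = np.linalg.lstsq(design, z, rcond=None)
    return coef

# Synthetic sample: an exact plane z = 2 + 0.5 X - 1.5 Y on a 3 x 3 grid
gx, gy = np.meshgrid(np.arange(3.0), np.arange(3.0))
x_coord, y_coord = gx.ravel(), gy.ravel()
z = 2.0 + 0.5 * x_coord - 1.5 * y_coord
linear_coef = fit_trend_surface(x_coord, y_coord, z, order=1)
quadratic_coef = fit_trend_surface(x_coord, y_coord, z, order=2)
print(linear_coef, quadratic_coef)
```

Because the synthetic data lie exactly on a plane, the linear fit recovers the plane and the quadratic fit returns (numerically) zero for the three extra terms.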

62 Trend surface analysis: Linear model
NOX = a + b1 X + b2 Y

63 Trend surface analysis: quadratic model
NOX = a + b1 X + b2 Y + b3 X² + b4 Y² + b5 XY

64 Comparing the trend surface models

65 Kriging
Spatial variation consists of three elements:
spatial trend (“drift”)
spatial autocorrelation
error term

66 Kriging
Ordinary kriging: assumes the absence of a drift and focuses on the spatially correlated component.
Universal kriging: assumes that the spatial variation has a drift in addition to the spatially correlated component.
Co-kriging: uses one or more secondary variables that are correlated with the primary variable of interest.

67 Kriging procedure involves two steps:
constructing the empirical (and theoretical) semi-variogram based on the sample (control) point data
estimating unknown attribute values

68 Example: Theoretical semivariogram
Empirical semivariance g(d) by distance class d:
Number of pairs    g(d)
24                 1.375
34                 2.147
16                 2.437
2                  2.500
[Figure: theoretical semivariogram; horizontal axis: Distance (km); vertical axis: Gamma (semivariance)]

69 Example: Spherical semivariogram
Spherical model with range a = 2.75 and sill C = 2.5
[Figure: theoretical (spherical) semivariogram; horizontal axis: Distance (km); vertical axis: Gamma (semivariance)]
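The spherical semivariogram with the range and sill shown on this slide can be written as a short function. This is a sketch under stated assumptions: it uses the standard spherical model, and a nugget of zero is assumed because the slide does not mention one; the function name and parameter names are hypothetical.

```python
import numpy as np

def spherical_semivariogram(h, sill=2.5, range_a=2.75, nugget=0.0):
    """Spherical semivariogram model.

    gamma(h) = nugget + (sill - nugget) * (1.5 * h/a - 0.5 * (h/a)^3) for 0 < h <= a
    gamma(h) = sill                                                   for h > a
    gamma(0) = 0
    """
    h = np.asarray(h, dtype=float)
    ratio = np.minimum(h / range_a, 1.0)     # beyond the range the curve is flat
    gamma = nugget + (sill - nugget) * (1.5 * ratio - 0.5 * ratio ** 3)
    return np.where(h > 0, gamma, 0.0)

distances = np.array([0.0, 1.0, 2.75, 4.5])
gamma_values = spherical_semivariogram(distances)
print(gamma_values)
```

The semivariance rises from zero, reaches the sill exactly at the range, and stays at the sill for larger distances, which is the shape the theoretical curve on the slide describes.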

70 Estimating unknown values
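$\hat{x}_0 = \sum_{i=1}^{n} w_i x_i$
$\hat{x}_0$ = estimated value of a variable x at point 0
$x_i$ = value at known point i
$w_i$ = weight associated with known point i; it is determined on the basis of the semivariogram for pairs of points (i and j)
n = number of known points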
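How the weights follow from the semivariogram can be sketched with a small ordinary kriging system. This is an illustrative sketch, not the procedure used to produce the slides' examples: it assumes a spherical semivariogram with sill 2.5, range 2.75 and no nugget, and the function names and the four sample points are hypothetical.

```python
import numpy as np

def spherical(h, sill=2.5, range_a=2.75):
    """Spherical semivariogram without nugget (assumed model)."""
    r = np.minimum(np.asarray(h, dtype=float) / range_a, 1.0)
    return sill * (1.5 * r - 0.5 * r ** 3)

def ordinary_kriging(known_xy, known_values, target_xy, semivariogram=spherical):
    """Estimate the value at target_xy as a weighted sum of the known values."""
    known_xy = np.asarray(known_xy, dtype=float)
    known_values = np.asarray(known_values, dtype=float)
    target_xy = np.asarray(target_xy, dtype=float)
    n = len(known_values)
    # Left-hand side: semivariances between all pairs (i, j) of known points,
    # bordered by ones so that the weights are forced to sum to 1
    d = np.linalg.norm(known_xy[:, None, :] - known_xy[None, :, :], axis=2)
    A = np.ones((n + 1, n + 1))
    A[:n, :n] = semivariogram(d)
    A[n, n] = 0.0
    # Right-hand side: semivariances between each known point and the target
    b = np.ones(n + 1)
    b[:n] = semivariogram(np.linalg.norm(known_xy - target_xy, axis=1))
    weights = np.linalg.solve(A, b)[:n]      # last entry is the Lagrange multiplier
    return float(weights @ known_values), weights

# Hypothetical known points at the corners of a unit square
known_xy = [(0, 0), (1, 0), (0, 1), (1, 1)]
known_values = [1.0, 2.0, 3.0, 4.0]
center_estimate, center_weights = ordinary_kriging(known_xy, known_values, (0.5, 0.5))
corner_estimate, _ = ordinary_kriging(known_xy, known_values, (0, 0))
print(center_estimate, center_weights, corner_estimate)
```

At the center of the square all four known points are equally distant, so they receive equal weights; at a known point the estimate reproduces the known value, since no nugget is assumed.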

71 Example: Ordinary kriging

72 Example: Ordinary co-kriging
primary variable: nitric oxides concentration (parts per 10 million) per town secondary variables: proportions of industrial acres per town

73 Example: Comparing ordinary kriging and co-kriging

