Synthesis.


Nature of spatial data: geographical/spatial data; spatial vs. non-spatial statistical analysis; properties of spatial data (spatial dependence and spatial heterogeneity).

Properties of spatial data: spatial dependence. The first law of geography: all things are related, but nearby things are more related than distant things.

Properties of spatial data: spatial heterogeneity. The second law of geography (a law of spatial heterogeneity): conditions vary (“smoothly”) over the Earth's surface.

Properties of spatial data: the properties of geographical data present a fundamental challenge to conventional statistics. They violate the classical assumptions of independence and homogeneity (stationarity) and render classical methods inefficient or inappropriate.

Exploratory Spatial Data Analysis (ESDA): ESDA techniques. For spatial heterogeneity (homogeneity): linked histogram, linked box plot, scatter plot, conditional plot. For spatial dependence: covariogram, correlogram and variogram.

What is exploratory spatial data analysis (ESDA)? Detecting spatial dependence and detecting spatial heterogeneity (homogeneity or stationarity).

GIS data and linking. GIS data: geographical data (maps) in geographical space; attribute data (tables, graphs, etc.) in attribute space.

Linking: dynamic graphics that visualize data in the attribute space and the geographical space simultaneously; useful for exploring the spatial stationarity (homogeneity) of spatial patterns and processes.

Conditional scatter plot: example

Spatial dependence: providing a description of how the data are related (correlated) with distance (and direction). Three methods: covariogram, correlogram, variogram (semi-variogram).
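The three measures are closely linked. As a sketch, assuming a second-order stationary process Z(s) with lag h (the symbols Z, s and h are notation introduced here, not taken from the slides), the usual definitions are:

```latex
\begin{align}
C(h)      &= \operatorname{Cov}\bigl(Z(s),\, Z(s+h)\bigr), \qquad C(0) = \sigma^2 \\
\rho(h)   &= \frac{C(h)}{C(0)} \\
\gamma(h) &= \tfrac{1}{2}\operatorname{Var}\bigl(Z(s+h) - Z(s)\bigr) = C(0) - C(h)
\end{align}
```

Under that assumption the covariogram C(h) starts at σ² and decays with distance, the correlogram ρ(h) starts at 1.0, and the semivariogram γ(h) rises from 0 towards σ².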

Covariogram, correlogram and variogram. [Figure: the three functions plotted against distance; vertical axis marked at 1.0 and σ².]

Spatial weights: contiguity weights; distance weights; higher order contiguity (neighbors); properties of spatial weights.

Spatial weights: spatial weights define the spatial relationships among spatial objects (e.g., polygons, rasters, points). They are used to identify the spatial contiguity or neighborhood of a given object. Spatial weights matrix: for n objects, there are n × n pairs of relationships.

Contiguity weights: (a) adjacent cells; (b) adjacent and diagonal cells.

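Contiguity weights: binary connectivity matrix. $w_{ij} = 1$ if the i-th object is adjacent to the j-th object; $w_{ij} = 0$ otherwise. If $i = j$, then $w_{ij} = 0$.

$$W = \begin{pmatrix} w_{11} & w_{12} & \cdots & w_{1n} \\ w_{21} & w_{22} & \cdots & w_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ w_{n1} & w_{n2} & \cdots & w_{nn} \end{pmatrix}$$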
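A minimal sketch of how such a binary connectivity matrix could be built for the cells of a regular grid. The function name `grid_contiguity` and the row-major cell numbering are assumptions made for illustration; "rook" and "queen" are the conventional names for the "adjacent" and "adjacent and diagonal" cases.

```python
import numpy as np

def grid_contiguity(nrows, ncols, queen=False):
    """Binary connectivity matrix for grid cells numbered row by row."""
    n = nrows * ncols
    w = np.zeros((n, n), dtype=int)
    # Rook case: cells sharing an edge. Queen case: edges and corners.
    steps = [(-1, 0), (1, 0), (0, -1), (0, 1)]
    if queen:
        steps += [(-1, -1), (-1, 1), (1, -1), (1, 1)]
    for r in range(nrows):
        for c in range(ncols):
            i = r * ncols + c
            for dr, dc in steps:
                rr, cc = r + dr, c + dc
                if 0 <= rr < nrows and 0 <= cc < ncols:
                    w[i, rr * ncols + cc] = 1  # w_ij = 1 for adjacent cells
    return w

w_rook = grid_contiguity(3, 3)
w_queen = grid_contiguity(3, 3, queen=True)
print(w_rook.sum(axis=1))   # number of neighbours of each cell
print(w_queen.sum(axis=1))
```

The diagonal stays at zero because no cell is its own neighbour, and the matrix is symmetric because adjacency is mutual.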

Distance weights: Distance functions

Higher order contiguity (neighbors). Pure contiguity: does not include objects that are already contiguous at a lower order.

Higher order contiguity (neighbors). Cumulative contiguity: includes all lower order neighbors.
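The difference between pure and cumulative contiguity can be sketched for the second order as follows. This is an illustration under the assumption that the first-order weights are a binary matrix; `second_order` is a hypothetical helper name.

```python
import numpy as np

def second_order(w):
    """Return (pure, cumulative) second-order binary contiguity matrices."""
    w = (np.asarray(w) > 0).astype(int)
    two_step = (w @ w > 0).astype(int)   # objects reachable in two steps
    np.fill_diagonal(two_step, 0)        # an object is not its own neighbour
    pure = two_step * (1 - w)            # drop first-order neighbours
    cumulative = ((w + pure) > 0).astype(int)
    return pure, cumulative

# Hypothetical example: four objects in a chain, 0 - 1 - 2 - 3.
w = np.array([[0, 1, 0, 0],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [0, 0, 1, 0]])
pure, cumulative = second_order(w)
print(pure[0])        # second-order neighbours of object 0 only
print(cumulative[0])  # first- and second-order neighbours of object 0
```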

Properties of spatial weights. Connectivity histogram: the distribution of the spatial weights (number of neighbors per object). The connectivity histogram should be approximately normally distributed. It helps detect unusual features of the spatial weights distribution: islands (unconnected objects) and bimodal distributions (some objects have very few and others very many neighbors).

Global spatial autocorrelation statistics: univariate spatial autocorrelation; bivariate spatial autocorrelation.

Spatial autocorrelation. The first law of geography: events (objects) at nearby locations are more correlated than events (objects) located far apart. Attribute values at one location are in part determined by the values at neighboring locations (spatial dependence).

Univariate spatial autocorrelation: a method to analyze the similarity or dissimilarity of values of the same variable at corresponding distances (lags). Moran’s I coefficient is the most often used measure of spatial autocorrelation.

Global spatial autocorrelation: Moran’s I coefficient

$$I = \frac{n}{\sum_i \sum_j w_{ij}} \cdot \frac{\sum_i \sum_j w_{ij}\,(x_i - \bar{x})(x_j - \bar{x})}{\sum_i (x_i - \bar{x})^2}$$

The numerator of the second factor is the cross-product of the deviations of the i-th and the j-th observations from the global mean; its denominator, divided by n, is the variance of the data set; the first factor is the ratio of the number of data points (areas) to the total number of connections between the points (areas).

I near -1.0: strong negative spatial autocorrelation (dissimilar attribute values tend to cluster). I near -1/(n-1), which approaches 0.0 for large n: random pattern (attribute values tend to be randomly scattered). I near +1.0: strong positive spatial autocorrelation (similar attribute values tend to cluster).
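As a rough illustration of the coefficient, the sketch below computes Moran’s I with NumPy for a hypothetical chain of four objects. The function name `morans_i` and the example values are assumptions, not part of the original slides.

```python
import numpy as np

def morans_i(x, w):
    """Global Moran's I for values x and spatial weights matrix w."""
    x = np.asarray(x, dtype=float)
    w = np.asarray(w, dtype=float)
    z = x - x.mean()                   # deviations from the global mean
    cross_products = z @ w @ z         # sum_i sum_j w_ij * z_i * z_j
    return (len(x) / w.sum()) * cross_products / (z @ z)

# Hypothetical example: four objects in a chain, 0 - 1 - 2 - 3.
w = np.array([[0, 1, 0, 0],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [0, 0, 1, 0]])

i_trend = morans_i([1, 2, 3, 4], w)        # similar values are neighbours
i_alternating = morans_i([1, 4, 1, 4], w)  # dissimilar values are neighbours
print(i_trend, i_alternating)
```

The smoothly increasing series gives a positive coefficient and the alternating series gives a coefficient of -1, matching the interpretation of the two extremes.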

Moran’s I coefficient: test of significance. Under the null hypothesis of no spatial autocorrelation, Moran’s I is approximately normally distributed, so the z-statistic can be used to test the significance of the coefficient.

Moran’s I coefficient in ArcGIS Since the value of z(I) = 5.1 is greater than 2.58, the difference between I = 0.27 and E(I) ≈ 0 is significant; therefore, the Moran’s I coefficient is statistically significant at the 0.01 level

Significance for Moran’s I: example. [Figure: reference distribution of Moran’s I with its mean marked.]

Significance for Moran’s I: example. [Figure: envelope slopes.]
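One common way to obtain a reference distribution like the one in the example is random permutation: the observed values are shuffled over the locations many times and Moran’s I is recomputed each time. The sketch below assumes that approach; the number of permutations, the seed, the one-sided pseudo p-value and the chain-shaped data are illustrative choices, not details from the slides.

```python
import numpy as np

def morans_i(x, w):
    z = x - x.mean()
    return (len(x) / w.sum()) * (z @ w @ z) / (z @ z)

def permutation_test(x, w, n_perm=999, seed=0):
    """Return observed I, mean of the reference distribution, pseudo p-value."""
    rng = np.random.default_rng(seed)
    observed = morans_i(x, w)
    reference = np.array([morans_i(rng.permutation(x), w)
                          for _ in range(n_perm)])
    # One-sided pseudo p-value for positive spatial autocorrelation.
    p = (np.sum(reference >= observed) + 1) / (n_perm + 1)
    return observed, reference.mean(), p

# Hypothetical data: 20 objects in a chain, values rising along the chain.
n = 20
w = np.zeros((n, n))
idx = np.arange(n - 1)
w[idx, idx + 1] = 1.0
w[idx + 1, idx] = 1.0
x = np.arange(n, dtype=float)

observed, reference_mean, p_value = permutation_test(x, w)
print(observed, reference_mean, p_value)
```

The mean of the reference distribution should lie close to -1/(n-1), the expected value of I when there is no spatial autocorrelation.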

Global bivariate spatial autocorrelation: Moran’s I coefficient. The coefficient combines: the cross-product of the deviations of the i-th and the j-th observations from the global mean for variables x and y, respectively; the variance of the data set; and the ratio of the number of data points (areas) to the total number of connections between the points (areas).


Local spatial autocorrelation statistics: univariate local indicators of spatial association (LISA); multivariate local indicators of spatial association.

Univariate local spatial autocorrelation: Moran’s $I_i$ coefficient

$$I_i = \frac{x_i - \bar{x}}{s^2} \sum_j w_{ij}\,(x_j - \bar{x})$$

where $(x_i - \bar{x})$ and $(x_j - \bar{x})$ are the deviations of the i-th and the j-th observations from the global mean, $s^2$ is the variance of the data set, and the sum runs over the spatial weights $w_{ij}$ representing the strength of the linkage between i and j.

Univariate local spatial autocorrelation: Moran’s $I_i$ coefficient. $I_i < 0.0$: negative local spatial autocorrelation. $I_i = 0.0$: random pattern. $I_i > 0.0$: positive local spatial autocorrelation.
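A minimal sketch of the local coefficient, assuming the common form in which each deviation is scaled by the variance and multiplied by the weighted sum of its neighbours' deviations. The variance is taken with divisor n here; some software uses n - 1, so treat that choice as an assumption.

```python
import numpy as np

def local_morans_i(x, w):
    """Local Moran's I_i for every object, given values x and weights w."""
    x = np.asarray(x, dtype=float)
    w = np.asarray(w, dtype=float)
    z = x - x.mean()                 # deviations from the global mean
    variance = (z @ z) / len(x)      # divisor n (an assumption)
    return (z / variance) * (w @ z)  # z_i / s^2 * sum_j w_ij z_j

# Hypothetical example: four objects in a chain, 0 - 1 - 2 - 3.
w = np.array([[0, 1, 0, 0],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [0, 0, 1, 0]])
local_i = local_morans_i([1, 2, 3, 4], w)
print(local_i)
```

With this scaling the local values add up to the global Moran’s I multiplied by the sum of all weights, which is a convenient consistency check.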

Moran’s Ii coefficient: Example

Moran’s Ii coefficient: Cluster map

Local bivariate spatial autocorrelation: Moran’s $I_i$ coefficient. The coefficient combines: the deviations of the i-th and the j-th observations from the global mean for variables x and y, respectively; the variance of the data set; and the sum of spatial weights representing the strength of the linkage between i and j.

Bivariate Moran’s Ii coefficient: Example

Bivariate Moran’s Ii coefficient: Example

Spatial regression: regression and spatial regression; spatial lag (SL) model; spatial error (SE) model.

Simple regression: a dependent variable, Y, is considered to be a function of a single independent variable, X. The functional relation between Y and X is linear; that is, $Y = a + bX$.

Multiple regression equation

$$y_i = a + b_1 x_1 + b_2 x_2 + \dots + b_n x_n + e_i$$

where $y_i$ = dependent variable; $x_1, x_2, \dots, x_n$ = independent variables; $a$ = constant (intercept); $b_1, b_2, \dots, b_n$ = regression coefficients; $e_i$ = error term (residual, or difference between observed and predicted values of $y_i$).
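The coefficients of such an equation are typically estimated by ordinary least squares. The sketch below uses `numpy.linalg.lstsq` on made-up, noise-free data so that the known coefficients are recovered; the function name `ols` and the data are assumptions for illustration.

```python
import numpy as np

def ols(X, y):
    """Return (coefficients, residuals); coefficients[0] is the intercept a."""
    design = np.column_stack([np.ones(len(y)), X])  # prepend intercept column
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    residuals = y - design @ coef                   # observed minus predicted
    return coef, residuals

# Hypothetical data generated from y = 1 + 2*x1 - 3*x2 with no error term.
X = np.array([[1.0, 2.0],
              [2.0, 1.0],
              [3.0, 4.0],
              [4.0, 3.0],
              [5.0, 6.0]])
y = 1.0 + 2.0 * X[:, 0] - 3.0 * X[:, 1]

coef, residuals = ols(X, y)
print(coef)       # a, b1, b2
print(residuals)
```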

Multiple regression equation: assumptions. No multicollinearity: the independent variables are not intercorrelated. Normality: the residuals are distributed normally. Homoskedasticity (equal variance): the residuals are dispersed randomly, with equal variance, throughout the range of the estimated dependent variable. Spatial independence: there is no spatial autocorrelation of the residuals.

Regression and spatial regression

Example

Explained variance of the dependent variable. H0: b1 = b2 = 0. HA: at least one of b1 and b2 ≠ 0.

Sigma-square = Sum squared residual / Degrees of freedom = 6014.89/46 ≈ 130.759
Sigma-square ML = Sum squared residual / Number of observations = 6014.89/49 = 122.753
S.E. of regression = √130.759 = 11.435
S.E. of regression ML = √122.753 = 11.0794

AIC (Akaike information criterion) = -2L + 2K = -2 × (-187.377) + 2 × 3 = 380.754
SC (Schwarz criterion) = -2L + K ln(N) = -2 × (-187.377) + 3 × ln(49) = 386.43
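The arithmetic behind these diagnostics can be checked in a few lines. The inputs are the figures quoted on the slides (sum of squared residuals 6014.89, N = 49, K = 3, log likelihood L = -187.377); the degrees of freedom, N - K = 46, are inferred from those values rather than stated in the source.

```python
import math

ssr = 6014.89        # sum of squared residuals
n_obs = 49           # number of observations, N
k = 3                # number of estimated coefficients, K
log_lik = -187.377   # log likelihood, L

sigma2 = ssr / (n_obs - k)        # divided by degrees of freedom (inferred as 46)
sigma2_ml = ssr / n_obs           # maximum likelihood version
se_regression = math.sqrt(sigma2)
se_regression_ml = math.sqrt(sigma2_ml)

aic = -2 * log_lik + 2 * k
sc = -2 * log_lik + k * math.log(n_obs)

print(round(sigma2, 3), round(sigma2_ml, 3))
print(round(se_regression, 3), round(se_regression_ml, 4))
print(round(aic, 3), round(sc, 2))
```

Small differences in the last digit against the slides are possible because the inputs quoted there are already rounded.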

Example: $\mathrm{CRIME}_i = 68.62 - 0.27\,x_1 - 1.60\,x_2 + e_i$

Example: regression diagnostics and spatial regression model selection.

Spatial lag (SL) model

$$y = a + \rho\,W y + b_1 x_1 + \dots + b_n x_n + e$$

where $Wy$ = spatially lagged dependent variable y ($W$ being the spatial weights matrix); $y$ = dependent variable; $x_1, x_2, \dots, x_n$ = independent variables; $a$ = constant (intercept); $\rho$ (rho) = spatial autoregressive coefficient; $b_1, \dots, b_n$ = regression coefficients; $e$ = error term (residual, or difference between predicted and observed values of y). The parameters of the spatial lag model are estimated by means of the maximum likelihood (ML) method; that is, the parameters are estimated by maximizing the probability (likelihood) of the sample data.

Spatial error (SE) model

$$y = a + \lambda\,W e + b_1 x_1 + b_2 x_2 + \dots + b_n x_n + u$$

where $We$ = spatially autoregressive errors e ($W$ being the spatial weights matrix); $y$ = dependent variable; $x_1, x_2, \dots, x_n$ = independent variables; $a$ = constant (intercept); $\lambda$ = spatial autoregressive coefficient; $b_1, b_2, \dots, b_n$ = regression coefficients; $e$ = error term (residual, or difference between predicted and observed values of y for the OLS model); $u$ = error term (residual, or difference between predicted and observed values of y for the SE model). The parameters of the spatial error model are estimated by means of the maximum likelihood (ML) method.
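The spatially lagged dependent variable is usually computed as a weighted average of the neighbouring values of y. The sketch below assumes row-standardized weights (each row of the weights matrix scaled to sum to one), which is a common convention but is not stated on the slide; `spatial_lag` is a hypothetical helper name.

```python
import numpy as np

def spatial_lag(w, y):
    """Spatially lagged variable: average of y over each object's neighbours."""
    w = np.asarray(w, dtype=float)
    row_sums = w.sum(axis=1, keepdims=True)
    # Row-standardize; objects without neighbours (islands) keep a lag of zero.
    w_std = np.divide(w, row_sums, out=np.zeros_like(w), where=row_sums > 0)
    return w_std @ np.asarray(y, dtype=float)

# Hypothetical example: three objects in a chain, 0 - 1 - 2.
w = np.array([[0, 1, 0],
              [1, 0, 1],
              [0, 1, 0]])
y = np.array([1.0, 2.0, 6.0])
lag_y = spatial_lag(w, y)
print(lag_y)
```

Object 1 has two neighbours, so its lag is the mean of their values; the end objects each take the value of their single neighbour.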

Spatial regression model selection rules: the diagnostics for spatial autocorrelation use the Lagrange Multiplier (LM) tests. The tests compare the non-spatial regression (OLS) model with the SL and SE models.

Example: spatial regression model selection (for PROB = 0.05). Depending on the test, there is either a significant or an insignificant difference between the OLS and the spatial (SL and SE) models. If the two spatial models are not significantly different, select the one with the higher value of the statistic.

Spatial interpolation: classification; Thiessen polygons; inverse distance weighting; trend surface analysis; kriging.

Definition: a procedure for estimating unknown attribute values using control (or sample) points with known attribute values.

Classification * Given some required assumptions, trend surface analysis can be treated as a special case of regression analysis and thus a stochastic method.

Thiessen polygons: constructed around known (control) points so that any point within a Thiessen polygon is closer to the polygon's known point than to any other control point.
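In interpolation terms, this means every location takes the value of its nearest control point. A minimal sketch of that rule, assuming planar coordinates and Euclidean distance (the function name and data are illustrative):

```python
import numpy as np

def thiessen_interpolate(ctrl_xy, ctrl_val, query_xy):
    """Assign each query location the value of its nearest control point."""
    ctrl_xy = np.asarray(ctrl_xy, dtype=float)
    query_xy = np.asarray(query_xy, dtype=float)
    # Distance from every query location to every control point.
    dist = np.linalg.norm(query_xy[:, None, :] - ctrl_xy[None, :, :], axis=2)
    return np.asarray(ctrl_val)[dist.argmin(axis=1)]

# Hypothetical control points and query locations.
ctrl_xy = [(0.0, 0.0), (10.0, 0.0)]
ctrl_val = [5.0, 9.0]
estimates = thiessen_interpolate(ctrl_xy, ctrl_val, [(1.0, 1.0), (8.0, -2.0)])
print(estimates)
```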

Inverse distance weighted interpolation: the method assumes that the unknown attribute value of a point is influenced more by nearby control points than by those farther away.

Regression and trend surface analysis. Regression model: any spatial process has two components, deterministic and stochastic. Trend surface analysis: represents the deterministic component.
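A minimal sketch of inverse distance weighting, assuming weights proportional to 1/d^p with power p = 2 (a common default, not specified on the slide) and using all control points rather than a search neighbourhood:

```python
import numpy as np

def idw(ctrl_xy, ctrl_val, query_xy, power=2.0):
    """Inverse distance weighted estimates at the query locations."""
    ctrl_xy = np.asarray(ctrl_xy, dtype=float)
    ctrl_val = np.asarray(ctrl_val, dtype=float)
    query_xy = np.asarray(query_xy, dtype=float)
    dist = np.linalg.norm(query_xy[:, None, :] - ctrl_xy[None, :, :], axis=2)
    estimates = np.empty(len(query_xy))
    for q, row in enumerate(dist):
        at_control = row == 0
        if at_control.any():
            # The query coincides with a control point: return its value.
            estimates[q] = ctrl_val[at_control][0]
        else:
            weights = 1.0 / row ** power   # nearer points weigh more
            estimates[q] = weights @ ctrl_val / weights.sum()
    return estimates

# Hypothetical control points: value 10 at (0, 0) and value 20 at (2, 0).
estimates = idw([(0.0, 0.0), (2.0, 0.0)], [10.0, 20.0],
                [(1.0, 0.0), (0.0, 0.0), (0.5, 0.0)])
print(estimates)
```

The midpoint gets the plain average, while a location nearer to the first control point is pulled towards its value.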

Trend surface regression models. A linear trend surface has the following form:

$$y = a + b_1 X + b_2 Y$$

A quadratic trend surface has the following form:

$$y = a + b_1 X + b_2 Y + b_3 X^2 + b_4 Y^2 + b_5 XY$$

where $y$ is the attribute value and $X$ and $Y$ are the X and Y coordinates. The parameters of the trend surface models are estimated using the ordinary least squares (OLS) procedure.
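A sketch of how both surfaces could be fitted by OLS with NumPy. The helper names and the sample points are assumptions; the quadratic design matrix simply adds the squared and cross-product terms of the second equation.

```python
import numpy as np

def trend_design(x, y, quadratic=False):
    """Design matrix for a linear or quadratic trend surface."""
    columns = [np.ones_like(x), x, y]
    if quadratic:
        columns += [x ** 2, y ** 2, x * y]
    return np.column_stack(columns)

def fit_trend_surface(x, y, z, quadratic=False):
    """OLS estimates: a, b1, b2 (and b3, b4, b5 for the quadratic surface)."""
    coef, *_ = np.linalg.lstsq(trend_design(x, y, quadratic), z, rcond=None)
    return coef

# Hypothetical control points lying exactly on the plane z = 2 + 0.5 X - 0.25 Y.
x = np.array([0.0, 1.0, 0.0, 1.0, 2.0])
y = np.array([0.0, 0.0, 1.0, 1.0, 3.0])
z = 2.0 + 0.5 * x - 0.25 * y

linear_coef = fit_trend_surface(x, y, z)
print(linear_coef)   # a, b1, b2
```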

Trend surface analysis: Linear model NOX = 7.267 + 0.0032 X - 0.0017 Y

Trend surface analysis: quadratic model NOX = -22843.43 + 0.0337 X + 9.772 Y - 0.000 X² - 0.001 Y² + 5.536 XY

Comparing the trend surface models

Kriging: spatial variation consists of three elements: spatial trend (“drift”), spatial autocorrelation, and an error term.

Kriging. Ordinary kriging: assumes the absence of a drift and focuses on the spatially correlated component. Universal kriging: assumes that the spatial variation has a drift in addition to the spatially correlated component. Co-kriging: uses one or more secondary variables, which are correlated with the primary variable of interest.

Kriging procedure involves two steps: (1) constructing the empirical (and theoretical) semi-variogram based on the sample (control) point data; (2) estimating unknown attribute values.
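The first step can be sketched as follows: every pair of control points is assigned to a distance bin, and the semivariance of a bin is half the mean squared difference of the paired values. The binning scheme (lower edge exclusive, upper edge inclusive) and the function name are assumptions made for illustration.

```python
import numpy as np

def empirical_semivariogram(xy, z, bin_edges):
    """Return (lower, upper, number of pairs, semivariance) for each bin."""
    xy = np.asarray(xy, dtype=float)
    z = np.asarray(z, dtype=float)
    i, j = np.triu_indices(len(z), k=1)       # every pair counted once
    dist = np.linalg.norm(xy[i] - xy[j], axis=1)
    sq_diff = (z[i] - z[j]) ** 2
    rows = []
    for lower, upper in zip(bin_edges[:-1], bin_edges[1:]):
        in_bin = (dist > lower) & (dist <= upper)
        n_pairs = int(in_bin.sum())
        gamma = sq_diff[in_bin].sum() / (2 * n_pairs) if n_pairs else float("nan")
        rows.append((lower, upper, n_pairs, gamma))
    return rows

# Hypothetical control points on a line, 1 km apart.
rows = empirical_semivariogram([(0, 0), (1, 0), (2, 0)], [0.0, 1.0, 3.0],
                               bin_edges=[0.0, 1.0, 2.0])
for row in rows:
    print(row)
```

A theoretical model, such as the spherical one, is then fitted to these binned semivariances.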

Example: Theoretical semivariogram

Distance d (km)   Number of pairs   g(d)
0.00-1.00         24                1.375
1.01-2.00         34                2.147
2.01-3.00         16                2.437
4.01-5.00         2                 2.500

[Figure: gamma (semivariance) plotted against distance (km), with the theoretical semivariogram curve.]

Example: Spherical semivariogram. [Figure: gamma (semivariance) plotted against distance (km), showing the theoretical semivariogram with range a = 2.75 and sill C = 2.5.]
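For reference, the spherical model with the range and sill quoted in the example can be written as a short function. The sketch assumes the standard spherical form with no nugget effect, which the slide does not state explicitly.

```python
import numpy as np

def spherical(h, sill=2.5, range_a=2.75):
    """Spherical semivariogram: rises to the sill at the range, then stays flat."""
    h = np.asarray(h, dtype=float)
    ratio = h / range_a
    rising = sill * (1.5 * ratio - 0.5 * ratio ** 3)
    return np.where(h < range_a, rising, sill)

distances = np.array([0.0, 1.375, 2.75, 5.0])
gamma = spherical(distances)
print(gamma)
```

Beyond the range (2.75 km here) pairs of points are treated as spatially uncorrelated, so the semivariance stays at the sill.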

Estimating unknown values. The estimating equation combines: the estimated value of a variable x at point 0; the values at the known points; the weights, each associated with a pair of points (i and j) and determined on the basis of the semivariogram; and the number of known points.
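The estimation step can be illustrated with a small ordinary kriging sketch: the estimate at point 0 is a weighted sum of the known values, and the weights come from solving a linear system built from semivariances. The spherical model without nugget, the default sill and range, and all names and data below are assumptions for illustration, not the procedure used in the original example.

```python
import numpy as np

def spherical(h, sill, range_a):
    ratio = np.asarray(h, dtype=float) / range_a
    return np.where(ratio < 1.0, sill * (1.5 * ratio - 0.5 * ratio ** 3), sill)

def ordinary_kriging(ctrl_xy, ctrl_val, target_xy, sill=2.5, range_a=2.75):
    """Return (estimate at the target point, kriging weights)."""
    ctrl_xy = np.asarray(ctrl_xy, dtype=float)
    ctrl_val = np.asarray(ctrl_val, dtype=float)
    n = len(ctrl_val)
    dist = np.linalg.norm(ctrl_xy[:, None, :] - ctrl_xy[None, :, :], axis=2)
    # Left-hand side: semivariances between known points, bordered by the
    # constraint that the weights sum to one.
    lhs = np.ones((n + 1, n + 1))
    lhs[:n, :n] = spherical(dist, sill, range_a)
    lhs[n, n] = 0.0
    # Right-hand side: semivariances between known points and the target.
    to_target = np.linalg.norm(ctrl_xy - np.asarray(target_xy, dtype=float), axis=1)
    rhs = np.ones(n + 1)
    rhs[:n] = spherical(to_target, sill, range_a)
    weights = np.linalg.solve(lhs, rhs)[:n]   # last entry is the multiplier
    return weights @ ctrl_val, weights

# Hypothetical control points at the corners of a unit square.
ctrl_xy = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
ctrl_val = [1.0, 2.0, 3.0, 4.0]
estimate, weights = ordinary_kriging(ctrl_xy, ctrl_val, (0.5, 0.5))
print(estimate, weights)
```

At the centre of the square all four known points are equally distant, so the weights are equal; at a known point the estimate reproduces the known value.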

Example: Ordinary kriging

Example: Ordinary co-kriging. Primary variable: nitric oxides concentration (parts per 10 million) per town. Secondary variable: proportion of industrial acres per town.

Example: Comparing ordinary kriging and co-kriging