1
Spatial Regression Models
GRAD6104/8104 INES 8090 Spatial Statistics - Spring 2017 Spatial Regression Models
2
Introduction Example: Farm density of Puerto Rico in 2007 (from Lab 2)
3
Introduction Example from Lab 2
4
Introduction Regression models are a confirmatory approach that allows us to examine the relationship between variables based on sampled data: Z(s) = Xβ + ε. Assumption of independent errors. What if the assumption does not hold, which is often the case for spatial data?
5
Introduction Matrix Algebra (AB)′ = B′A′
Identity matrix I (ones on the diagonal, zeros elsewhere). If B is the inverse of A, then AB = I, i.e., B = A⁻¹. (BC)⁻¹ = C⁻¹B⁻¹. A is a nonsingular (invertible) matrix if A⁻¹ exists.
6
Introduction Matrix Algebra W = (w₁, w₂, …, wₙ)′, X = (x₁, x₂, …, xₙ)′, so W′X = Σᵢ₌₁ⁿ wᵢxᵢ
For an n × n weight matrix W = {wᵢⱼ}, X′WX = Σᵢ₌₁ⁿ Σⱼ₌₁ⁿ wᵢⱼxᵢxⱼ (covariance matrix?)
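A minimal numpy sketch of the two quantities above, using hypothetical toy values for w, x, and a weight matrix W:

```python
import numpy as np

# Hypothetical toy vectors and weight matrix to illustrate the two forms above.
w = np.array([0.2, 0.5, 0.3])          # weight vector w_i
x = np.array([1.0, 2.0, 4.0])          # data vector x_i
W = np.array([[0.0, 0.5, 0.5],
              [0.5, 0.0, 0.5],
              [0.5, 0.5, 0.0]])        # weight matrix w_ij

# W'X = sum_i w_i * x_i  (inner product)
print(w @ x)                            # 0.2*1 + 0.5*2 + 0.3*4 = 2.4

# X'WX = sum_i sum_j w_ij * x_i * x_j  (quadratic form)
print(x @ W @ x)
```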
7
Linear Regression Models
Fitting a linear model with uncorrelated, homoscedastic errors: Z(s) = X(s)β + e(s), e(s) ~ N(0, σ²I), where Z(s) is the response (dependent) variable, X(s) the explanatory (independent) variables, and e(s) the uncorrelated, homoscedastic errors.
8
Linear Regression Models
Homoscedasticity vs. Heteroscedasticity (figure: two scatterplots of Y against X, with constant vs. non-constant spread of residuals)
9
Linear Regression Models
Estimation methods: Ordinary least squares (OLS); Maximum likelihood estimation (MLE)
10
Linear Regression Models
Ordinary Least Squares (OLS) Estimation Minimize the residual sum of squares SSR = (Y − Xβ)′(Y − Xβ). For univariate regression, SSR = Σᵢ₌₁ⁿ (yᵢ − βxᵢ)². To solve the optimization problem, set ∂SSR/∂β = 0. (figure: scatterplot of Y against X with fitted line; causality X → Y)
11
Linear Regression Models
Ordinary Least Squares (OLS) Estimation Minimize the residual sum of squares SSR = (Y − Xβ)′(Y − Xβ). OLS estimators (with Z(s) = Y and X(s) = X): β̂_OLS = (X′X)⁻¹X′Y and σ̂²_OLS = SSR(β̂_OLS)/(n − p), where p is the number of parameters.
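A minimal OLS sketch on simulated (hypothetical) data, computing β̂_OLS = (X′X)⁻¹X′Y and σ̂²_OLS directly from the formulas above:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100
x = rng.uniform(0, 10, n)
X = np.column_stack([np.ones(n), x])          # design matrix with intercept
Y = 2.0 + 0.5 * x + rng.normal(0, 1, n)       # simulated response

beta_hat = np.linalg.solve(X.T @ X, X.T @ Y)  # (X'X)^-1 X'Y
resid = Y - X @ beta_hat
p = X.shape[1]
sigma2_hat = resid @ resid / (n - p)          # SSR / (n - p)
print(beta_hat, sigma2_hat)
```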
12
Linear Regression Models
Maximum Likelihood Estimation (MLE) General likelihood function (joint density function): L(x₁, x₂, …, xₙ | θ) = ∏ᵢ₌₁ⁿ f(xᵢ | θ), where f(·) is the probability density function (iid samples), e.g., the normal distribution; θ: parameters to be estimated; xᵢ: sample data.
13
Linear Regression Models
Maximum Likelihood Estimation (MLE) Maximize the joint likelihood, or equivalently the log-likelihood (i.e., minimize the negative log-likelihood). For linear models, the OLS and ML estimators of β are equivalent.
14
Linear Regression Models
Maximum Likelihood Estimation (MLE) Maximize the joint likelihood, or equivalently the log-likelihood ln(L). Besides the log-likelihood, we could also optimize penalized likelihood criteria such as: Akaike information criterion (AIC): AIC = ln(L) − k, where k is the number of estimated parameters; Bayesian information criterion (BIC): BIC = ln(L) − ln(n)·k/2, where n is the sample size.
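A small sketch, on hypothetical simulated data, of the Gaussian log-likelihood and the penalized criteria in the maximization form used on the slide (many references instead report −2 ln L + 2k and −2 ln L + k ln n, which differ only by a factor of −2):

```python
import numpy as np

rng = np.random.default_rng(1)
n = 200
x = rng.normal(size=n)
X = np.column_stack([np.ones(n), x])
Y = 1.0 + 0.8 * x + rng.normal(scale=0.5, size=n)

beta_hat = np.linalg.solve(X.T @ X, X.T @ Y)
resid = Y - X @ beta_hat
sigma2_ml = resid @ resid / n                     # ML variance estimate divides by n
logL = -0.5 * n * np.log(2 * np.pi * sigma2_ml) - 0.5 * n
k = X.shape[1] + 1                                # regression coefficients + variance
print("ln(L):", logL, "AIC:", logL - k, "BIC:", logL - k * np.log(n) / 2)
```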
15
Linear Regression Model
Example from Lab 2
16
Linear Regression Models
Evaluating goodness-of-fit of the model using the coefficient of determination R²: R² = SSR/TSS = Σ(ŷᵢ − ȳ)² / Σ(yᵢ − ȳ)², where TSS is the total sum of squares. It is the ratio of the explained variance to the total variance of the response variable (the proportion of variance that is explained). Adjusted R²: R̄² = 1 − [(n − 1)/(n − p)](1 − R²), where p is the number of explanatory variables.
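A short sketch computing R² and adjusted R² from an OLS fit on hypothetical data, following the slide's formulas:

```python
import numpy as np

rng = np.random.default_rng(2)
n = 150
x = rng.uniform(size=n)
X = np.column_stack([np.ones(n), x])
Y = 3.0 - 1.2 * x + rng.normal(scale=0.3, size=n)

beta_hat = np.linalg.solve(X.T @ X, X.T @ Y)
Y_hat = X @ beta_hat
ss_reg = np.sum((Y_hat - Y.mean()) ** 2)          # explained sum of squares
tss = np.sum((Y - Y.mean()) ** 2)                 # total sum of squares
r2 = ss_reg / tss
p = X.shape[1]
adj_r2 = 1 - (n - 1) / (n - p) * (1 - r2)
print(r2, adj_r2)
```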
17
Linear Regression Model
Example from Lab 2
18
Linear Regression Models
Hypothesis test for all coefficients. H0: β₁ = β₂ = … = βₚ = 0 (all coefficients are 0); H1: at least one βᵢ ≠ 0. F test: F = [SSR/(p − 1)] / [SSE/(n − p)] = variance explained / variance unexplained, where SSR = Σ(ŷᵢ − ȳ)² and SSE = Σ(ŷᵢ − yᵢ)².
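A sketch of the overall F test on hypothetical data, using scipy for the reference F distribution:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(3)
n = 120
x1, x2 = rng.normal(size=n), rng.normal(size=n)
X = np.column_stack([np.ones(n), x1, x2])
Y = 1.0 + 0.6 * x1 + rng.normal(scale=1.0, size=n)   # x2 has no true effect

beta_hat = np.linalg.solve(X.T @ X, X.T @ Y)
Y_hat = X @ beta_hat
p = X.shape[1]
ssr = np.sum((Y_hat - Y.mean()) ** 2)                # variance explained
sse = np.sum((Y - Y_hat) ** 2)                       # variance unexplained
F = (ssr / (p - 1)) / (sse / (n - p))
p_value = stats.f.sf(F, p - 1, n - p)                # upper-tail probability
print(F, p_value)
```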
19
Linear Regression Model
Example from Lab 2
20
Linear Regression Models
Hypothesis test for a specific coefficient. H0: βᵢ = 0; H1: βᵢ ≠ 0. t test statistic: t = β̂ᵢ / std(β̂ᵢ). The (1 − α) × 100% confidence interval for βᵢ is β̂ᵢ ± t_{α/2, n−p} · std(β̂ᵢ).
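A sketch of the t test and confidence interval for a single coefficient on hypothetical data, with std(β̂ᵢ) taken from σ̂²(X′X)⁻¹:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(4)
n = 80
x = rng.normal(size=n)
X = np.column_stack([np.ones(n), x])
Y = 0.5 + 1.5 * x + rng.normal(size=n)

beta_hat = np.linalg.solve(X.T @ X, X.T @ Y)
resid = Y - X @ beta_hat
p = X.shape[1]
sigma2_hat = resid @ resid / (n - p)
cov_beta = sigma2_hat * np.linalg.inv(X.T @ X)       # Var(beta_hat)
se = np.sqrt(np.diag(cov_beta))

i, alpha = 1, 0.05                                    # test the slope coefficient
t_stat = beta_hat[i] / se[i]
t_crit = stats.t.ppf(1 - alpha / 2, n - p)
ci = (beta_hat[i] - t_crit * se[i], beta_hat[i] + t_crit * se[i])
p_value = 2 * stats.t.sf(abs(t_stat), n - p)
print(t_stat, p_value, ci)
```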
21
Linear Regression Models
Working with OLS residuals ê(s) = Z(s) − Ẑ(s): use a QQ-plot or histogram to check the assumption of normality (otherwise, apply a data transformation); use a scatterplot of residuals to test the constant variance assumption.
22
Linear Regression Models
Working with OLS residuals ê(s) = Z(s) − Ẑ(s): test for spatial autocorrelation using semivariograms, Moran's I, or Geary's C. (Figure 6.2 Empirical semivariograms)
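A minimal sketch of Moran's I applied to residuals, assuming an inverse-distance, row-standardized weight matrix built from hypothetical coordinates:

```python
import numpy as np

def morans_i(e, W):
    """Moran's I of values e under spatial weights W."""
    n = len(e)
    e = e - e.mean()
    num = n * np.sum(W * np.outer(e, e))   # n * sum_ij w_ij e_i e_j
    den = W.sum() * np.sum(e ** 2)         # S0 * sum_i e_i^2
    return num / den

rng = np.random.default_rng(5)
coords = rng.uniform(0, 10, size=(50, 2))
d = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=2)
W = np.where(d > 0, 1.0 / d, 0.0)          # inverse-distance weights
W = W / W.sum(axis=1, keepdims=True)        # row-standardize

resid = rng.normal(size=50)                 # stand-in for OLS residuals
print(morans_i(resid, W))                   # near E[I] = -1/(n-1) if no autocorrelation
```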
23
Linear Regression Models
Spatially explicit models Local Polynomial Geographically Weighted Regression
24
Linear Regression Models
Spatially explicit models: Local Polynomial Trend Surface Models (first-order or higher) Z(x,y) = β₁ + β₂x + β₃y + e(x,y); Z(x,y) = β₁ + β₂x + β₃y + β₄x² + β₅xy + β₆y² + e(x,y); … where xᵢ, yᵢ are the X, Y coordinates of observation i and e(x,y) ~ iid(0, σ²). Solve it using OLS.
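A sketch of fitting a second-order trend surface by OLS on hypothetical point data:

```python
import numpy as np

rng = np.random.default_rng(6)
n = 200
x, y = rng.uniform(0, 10, n), rng.uniform(0, 10, n)
z = 5 + 0.4 * x - 0.2 * y + 0.05 * x * y + rng.normal(scale=0.5, size=n)

# Design matrix for the quadratic surface: [1, x, y, x^2, x*y, y^2]
X = np.column_stack([np.ones(n), x, y, x**2, x * y, y**2])
beta_hat, *_ = np.linalg.lstsq(X, z, rcond=None)   # OLS fit of the surface
trend = X @ beta_hat                               # fitted trend at the data sites
print(beta_hat)
```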
25
Linear Regression Models
Spatially explicit models: Local Polynomial Trend Surface Models (first-order or higher). Figure 5.3 Trend surface modeling. a: original data (simulated); b: predicted mean values from a polynomial trend surface with degrees = 14; c: predicted mean values from a polynomial trend surface with degrees = 8.
26
Linear Regression Models
Spatially explicit models: Local Polynomial LOESS: Locally Weighted Least Squares (non-parametric regression). Assumption: a smooth mean function. Estimate the smooth trend in a moving fashion by fitting a site-specific n-th-order polynomial to only the data most proximate to a site. Fit the trend using weighted least squares, with weights inversely related to distance from the site.
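A minimal sketch of locally weighted least squares at a single site, assuming a tricube kernel and a fixed bandwidth h on hypothetical one-dimensional data (real LOESS implementations typically use an adaptive span):

```python
import numpy as np

def loess_at(s0, s, z, h):
    """Locally weighted least squares fit of a local linear polynomial at s0."""
    d = np.abs(s - s0) / h
    w = np.where(d < 1, (1 - d**3) ** 3, 0.0)        # tricube kernel weights
    X = np.column_stack([np.ones_like(s), s - s0])   # local linear basis
    WX = X * w[:, None]
    beta = np.linalg.solve(X.T @ WX, X.T @ (w * z))  # weighted least squares
    return beta[0]                                    # fitted value at s0

rng = np.random.default_rng(7)
s = np.sort(rng.uniform(0, 10, 150))
z = np.sin(s) + rng.normal(scale=0.2, size=150)
print(loess_at(5.0, s, z, h=1.5))
```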
27
Linear Regression Models
Spatially explicit models: Geographically Weighted Regression (Brunsdon, Fotheringham, and Charlton 1996) Z(s) = X(s)β(s₀) + e₀, e₀ ~ (0, σ²W(s₀)⁻¹), where W(s₀), or W(sᵢ, s₀), are weights that control the contribution of Z(sᵢ) to the relationship between Z(s) and X(s). Brunsdon, C., Fotheringham, A. S., & Charlton, M. E. (1996). Geographically weighted regression: a method for exploring spatial nonstationarity. Geographical Analysis, 28(4).
28
Linear Regression Models
Spatially explicit models: Geographically Weighted Regression. First we need to define the weight function; then a weighted least squares approach can be applied to estimate β(s₀), as sketched below. GWR allows for the incorporation of spatial nonstationarity.
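A sketch of GWR at one focal location s₀ on hypothetical data, assuming a Gaussian kernel for W(s₀); the bandwidth h would normally be chosen by cross-validation or AIC:

```python
import numpy as np

def gwr_at(s0, coords, X, z, h):
    """Local coefficients beta(s0) via weighted least squares with Gaussian weights."""
    d = np.linalg.norm(coords - s0, axis=1)
    w = np.exp(-0.5 * (d / h) ** 2)                  # kernel weights W(s0)
    XtW = X.T * w                                    # X' diag(w)
    return np.linalg.solve(XtW @ X, XtW @ z)         # (X'WX)^-1 X'Wz

rng = np.random.default_rng(8)
n = 300
coords = rng.uniform(0, 10, size=(n, 2))
x1 = rng.normal(size=n)
X = np.column_stack([np.ones(n), x1])
slope = 0.2 + 0.1 * coords[:, 0]                     # spatially varying relationship
z = 1.0 + slope * x1 + rng.normal(scale=0.3, size=n)

print(gwr_at(np.array([2.0, 5.0]), coords, X, z, h=2.0))
print(gwr_at(np.array([8.0, 5.0]), coords, X, z, h=2.0))
```

Comparing the two calls shows the estimated slope changing across space, which is the nonstationarity GWR is designed to capture.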
29
Linear Models with Correlated Error
What about a linear model with spatially correlated residuals? Z = Xβ + e, E(e) = 0, Var(e) = V = σ²R, where V = V(θ) is a positive definite covariance matrix, σ² is the constant variance, and R is the correlation matrix.
30
Linear Models with Correlated Error
What about a linear model with spatially correlated residuals? Z = Xβ + e, E(e) = 0, Var(e) = V = σ²R. (figure: five sample locations labeled 1–5) Assuming constant mean β and an isotropic spherical covariance function with range 4, sill 5, and nugget 0.
31
Linear Models with Correlated Error
Generalized least squares (GLS) estimation Z = Xβ + e, E(e) = 0, Var(e) = V = σ²R. Minimize the generalized residual sum of squares (Z − Xβ)′V⁻¹(Z − Xβ). Correspondingly, the GLS estimator is β̂_GLS = (X′V⁻¹X)⁻¹X′V⁻¹Z = (X′R⁻¹X)⁻¹X′R⁻¹Z.
32
Linear Models with Correlated Error
Generalized least squares (GLS) estimation The GLS estimator is β̂_GLS = (X′R⁻¹X)⁻¹X′R⁻¹Z. If the distribution of e is multivariate normal, then the GLS estimator is also the best unbiased estimator and the maximum likelihood estimator (MLE). The standard estimator of σ² is σ̂² = (Z − Xβ̂_GLS)′R⁻¹(Z − Xβ̂_GLS)/(n − p).
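A GLS sketch on hypothetical data, assuming the correlation matrix R is known (here an exponential correlation by distance):

```python
import numpy as np

rng = np.random.default_rng(9)
n = 100
coords = rng.uniform(0, 10, size=(n, 2))
d = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=2)
R = np.exp(-d / 3.0)                                  # assumed correlation model

x = rng.normal(size=n)
X = np.column_stack([np.ones(n), x])
L = np.linalg.cholesky(2.0 * R)                       # simulate errors with sigma^2 = 2
Z = X @ np.array([1.0, 0.5]) + L @ rng.normal(size=n)

Rinv = np.linalg.inv(R)
beta_gls = np.linalg.solve(X.T @ Rinv @ X, X.T @ Rinv @ Z)   # (X'R^-1 X)^-1 X'R^-1 Z
resid = Z - X @ beta_gls
sigma2_hat = resid @ Rinv @ resid / (n - X.shape[1])         # generalized SSR / (n - p)
print(beta_gls, sigma2_hat)
```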
33
Linear Models with Correlated Error
Results of toy example Z = Xβ + e, E(e) = 0, Var(e) = V = σ²R. (figure: the five sample locations labeled 1–5) Consideration of correlation among observations 1–4 affects the estimation of β.
34
Linear Models with Correlated Error
Estimated generalized least squares (EGLS) estimation In practice, we may not know the true value of θ, and then V cannot be specified. So we need to estimate θ, i.e., we use θ̂ and V̂ = V(θ̂). The EGLS estimator is then β̂_EGLS = (X′V̂⁻¹X)⁻¹X′V̂⁻¹Z.
35
Linear Models with Correlated Error
Maximum likelihood estimation Assumption: multivariate normality of the observations. The log-likelihood function is ln L(β, σ², θ) = −(n/2)ln(2π) − (n/2)ln σ² − (1/2)ln|R(θ)| − (1/(2σ²))(Z − Xβ)′R(θ)⁻¹(Z − Xβ). The ML estimators of β and σ² (for fixed θ) are β̂_ML = (X′R⁻¹X)⁻¹X′R⁻¹Z and σ̂²_ML = (Z − Xβ̂_ML)′R⁻¹(Z − Xβ̂_ML)/n.
36
Linear Models with Correlated Error
MLE vs. EGLS estimators: the β estimators are the same; the MLE of σ² is the generalized residual sum of squares of EGLS divided by the sample size n (rather than n − p).
37
Spatial Autoregressive Models
Incorporate neighborhood structure into regression (similar to time series analysis): Simultaneous autoregressive model (SAR); Conditional autoregressive model (CAR). Time series analogy: temporal autocorrelation, the Box–Jenkins approach, AR (autoregressive) and MA (moving average) models.
38
Spatial Autoregressive Models
Simultaneous Autoregressive Model (SAR) Z(s) = X(s)β − ρWX(s)β + ρWZ(s) + v = X(s)β + (I − ρW)⁻¹v, where ρWX(s)β and ρWZ(s) are spatially lagged variables that allow for the incorporation of spatial interaction, v ~ Nₙ(0, σ²I), and ρ is the spatial coefficient. If ρ = 0, the model reduces to the linear regression model.
39
Spatial Autoregressive Models
Simultaneous Autoregression (SAR) Z(s) = X(s)β − ρWX(s)β + ρWZ(s) + v = X(s)β + (I − ρW)⁻¹v. If ρ is known, the GLS estimator is β̂ = (X′Σ_SAR⁻¹X)⁻¹X′Σ_SAR⁻¹Z(s), where Σ_SAR = Var[Z(s)] = σ²[(I − ρW)′(I − ρW)]⁻¹ (the variance-covariance matrix).
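A sketch on a hypothetical toy neighborhood matrix W: build Σ_SAR for a known ρ and apply the GLS estimator above:

```python
import numpy as np

rng = np.random.default_rng(10)
n = 25
W = (rng.uniform(size=(n, n)) < 0.15).astype(float)    # toy neighbor structure
np.fill_diagonal(W, 0.0)
W = W / np.maximum(W.sum(axis=1, keepdims=True), 1)     # row-standardize

rho, sigma2 = 0.4, 1.0
A = np.eye(n) - rho * W
Sigma_sar = sigma2 * np.linalg.inv(A.T @ A)             # Var[Z] = sigma^2 [(I - rho W)'(I - rho W)]^-1

x = rng.normal(size=n)
X = np.column_stack([np.ones(n), x])
Z = X @ np.array([1.0, 0.5]) + np.linalg.solve(A, rng.normal(size=n))   # SAR-simulated data

S_inv = np.linalg.inv(Sigma_sar)
beta_gls = np.linalg.solve(X.T @ S_inv @ X, X.T @ S_inv @ Z)
print(beta_gls)
```

In practice ρ is unknown and is estimated jointly with β (e.g., by maximum likelihood); this sketch only illustrates the known-ρ case stated on the slide.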
40
Spatial Autoregressive Models
Simultaneous Autoregression (SAR) Z(s) = X(s)β − ρWX(s)β + ρWZ(s) + v = X(s)β + (I − ρW)⁻¹v. Spatial Error Model: incorporate spatial structure in the residuals. Spatial Lag Model: incorporate spatial structure in the dependent variable.
41
Spatial Autoregressive Models
Simultaneous Autoregression (SAR) Spatial Error Model: incorporate spatial structure in the errors/residuals. Z(s) = X(s)β + ε, ε = λWε + ξ, where ε is the vector of residuals, λ is the coefficient for the spatial error, and ξ is uncorrelated error.
42
Spatial Autoregressive Models
Simultaneous Autoregression (SAR) Spatial Lag Model: incorporate spatial structure in the dependent variable. Z(s) = ρWZ(s) + X(s)β + ε, where WZ(s) is the spatially lagged dependent variable and ρ is the spatial coefficient.
43
Spatial Autoregressive Models
Conditional Autoregressive Model (CAR) The conditional probability distribution of each observation Z(sᵢ), given the observed values of all of the other observations: f(Z(sᵢ) | Z(s)₋ᵢ), where Z(s)₋ᵢ is the vector of all observations except Z(sᵢ).
44
Spatial Autoregressive Models
Conditional Autoregressive Model (CAR), a.k.a. Markov random fields (MRFs). Equivalently, Zᵢ − µᵢ = Σⱼ Cᵢⱼ(Zⱼ − µⱼ) + εᵢ, εᵢ ~ N(0, σ²), i = 1, …, n, where C = {Cᵢⱼ} is such that Cᵢᵢ = 0 and I − C is symmetric and positive definite.
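A sketch, with a hypothetical dependence matrix C, that checks the positive-definiteness condition and samples from the implied joint distribution Z ~ N(µ, σ²(I − C)⁻¹), a standard CAR result:

```python
import numpy as np

rng = np.random.default_rng(11)
n = 30
A = (rng.uniform(size=(n, n)) < 0.1).astype(float)     # toy adjacency
A = np.triu(A, 1)
A = A + A.T                                             # symmetric, zero diagonal
lam_max = np.max(np.abs(np.linalg.eigvalsh(A)))
C = (0.5 / lam_max) * A if lam_max > 0 else A           # scale so I - C stays positive definite

I = np.eye(n)
assert np.all(np.linalg.eigvalsh(I - C) > 0), "I - C must be positive definite"

sigma2, mu = 1.0, np.zeros(n)
cov = sigma2 * np.linalg.inv(I - C)                     # implied joint covariance
Z = rng.multivariate_normal(mu, cov)
print(Z[:5])
```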
45
Model Comparison CAR vs. SAR First-order CAR vs. second-order CAR
Isotropy vs. anisotropy Different neighborhood definitions Constant mean vs. planar trend
46
Other Regression Modeling
Logistic regression for binary data E.g., presence/absence Poisson regression for count data Nonlinear regression Neural networks (Tang et al. 2009) Genetic programming ( Tang, W., Malanson, G. P., & Entwisle, B. (2009). Simulated village locations in Thailand: A multi-scale model including a neural network approach. Landscape ecology, 24(4),
47
Reading Assignment Tang, W., Malanson, G. P., & Entwisle, B. (2009). Simulated village locations in Thailand: A multi-scale model including a neural network approach. Landscape Ecology, 24(4). Dormann, C. F., McPherson, J. M., Araújo, M. B., Bivand, R., Bolliger, J., Carl, G., ... & Wilson, R. (2007). Methods to account for spatial autocorrelation in the analysis of species distributional data: a review. Ecography, 30(5).