Spatial autoregressive methods


Spatial autoregressive methods Nr245 Austin Troy Based on Spatial Analysis by Fortin and Dale, Chapter 5

Autocorrelation types
- None: independence
- Spatial independence, functional dependence
- True autocorrelation → inherent autoregressive
- Functional autocorrelation → induced autoregressive

Autocorrelation types: double autoregressive. Notice there are now two autocorrelation parameters, ρx and ρz.

Effects? Standard test statistics become "too liberal": more significant results than the data justify. Because observations are not fully independent, the actual degrees of freedom are lower; the "effective sample size" is n′ rather than n. Since the denominator of the t statistic is s/√n, using an n that is too large inflates the t statistic. In simulations, the inherent AR model yields 4× the expected rate of type I errors, the induced AR model 2×, and the double AR model 8×.

What to do? Non-solutions:
- Why not just tighten the significance level, e.g. 99% instead of 95%? Because without further information we don't know by how much to adjust, and we could end up with a test that is far too conservative.
- Why not adjust the sampling to include only "independent" samples? Because it is wasteful of data, and it is easy to misjudge the critical distance to independence.

Best approach: adjust the effective sample size. In the presence of spatial autocorrelation, the variance of the mean of the observations can be adjusted using the covariances Cov(Xi, Xj) of the Xs. For large sample sizes under a first-order autoregressive model this gives n′ = n(1 − ρ)/(1 + ρ). So, for instance, n = 1000 and ρ = 0.4 gives n′ ≈ 429. The problem is that, to be useful, the autoregressive model (the ρ parameter) has to be an effective descriptor of the structure of autocorrelation in the data, but it is a simplification. The next step is therefore to factor in a correlation matrix R based on the lag-distance correlations r(d).
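The adjustment above can be sketched in a few lines; this is a minimal illustration of the large-sample formula n′ = n(1 − ρ)/(1 + ρ), not a general-purpose routine.

```python
def effective_sample_size(n, rho):
    """Approximate effective sample size n' under first-order autocorrelation
    (large-sample formula n' = n * (1 - rho) / (1 + rho))."""
    return n * (1 - rho) / (1 + rho)

# The slide's example: n = 1000 and rho = 0.4 give n' of about 429.
print(round(effective_sample_size(1000, 0.4)))  # 429
```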

Moving average models. At first order the variance-covariance matrix is banded: half of the information for Xi is contained in Xi+1 and half in Xi−1, so only every other observation is needed. This produces ρ = 0.5 for large n, and n′ = n/2. A k-th order model takes an analogous form, translating into a generalized matrix form with its own variance-covariance matrix.
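The banded first-order matrix described above can be sketched as follows. The exact matrix from the slide is not shown, so this assumes the standard MA(1) form x_t = e_t + ρ·e_{t−1}, whose covariance is (1 + ρ²)σ² at lag 0, ρσ² at lag 1, and zero beyond.

```python
import numpy as np

def ma1_covariance(n, rho, sigma2=1.0):
    """Banded variance-covariance matrix of an assumed MA(1) process:
    (1 + rho^2)*sigma2 on the diagonal, rho*sigma2 at lag 1, zero beyond."""
    C = np.zeros((n, n))
    np.fill_diagonal(C, (1 + rho**2) * sigma2)
    i = np.arange(n - 1)
    C[i, i + 1] = rho * sigma2  # first off-diagonal band
    C[i + 1, i] = rho * sigma2  # symmetric lower band
    return C

C = ma1_covariance(5, 0.5)
```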

Moving average (continued). When you increase the order, calculating the effective sample size gets complicated; in a second-order model, for example, there are now two ρ parameters. Important point: if there are several different levels of autocorrelation (rk), each rk must be incorporated even if it is non-significant; using only the significant values can understate n′. Fortin and Dale recommend against the moving average approach because it is very sensitive to irregularities in the data and can produce a wide range of estimates.

Two-dimensional approaches. The problem with the MA approach as just presented is that it assumes one-dimensionality; in 2-D spatial data, xi most likely depends on all of its neighbors. We must now define what a "neighbor" is in 2-D (e.g. w = 1/8 for each of the eight neighbors in a 9-cell grid, and 0 elsewhere). The two best ways of dealing with this are:
- Simultaneous autoregressive models (SAR), whose matrices specify the relationship between lagged residuals at each location and neighboring locations
- Conditional autoregressive models (CAR), whose neighborhood matrices specify the relationship between lagged response values at each location and neighboring locations
Both use an n×n spatial weights matrix W composed of elements wij, which can be based on adjacency, number of neighbors, or distance, with zeros on the diagonal and weights on the off-diagonals. In both SAR and CAR, spatial autocorrelation tends to persist across long distances.
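The 9-cell-grid example above can be made concrete with a small row-standardized weights matrix. This is an illustrative sketch using queen (8-neighbor) contiguity; an interior cell's eight neighbors each receive weight 1/8, and the diagonal stays zero.

```python
import numpy as np

def grid_weights(nrow, ncol):
    """Row-standardized queen-contiguity weights W for an nrow x ncol grid.
    Cells are indexed row-major; each row of W sums to 1."""
    n = nrow * ncol
    W = np.zeros((n, n))
    for r in range(nrow):
        for c in range(ncol):
            i = r * ncol + c
            for dr in (-1, 0, 1):
                for dc in (-1, 0, 1):
                    if dr == 0 and dc == 0:
                        continue  # zero on the diagonal
                    rr, cc = r + dr, c + dc
                    if 0 <= rr < nrow and 0 <= cc < ncol:
                        W[i, rr * ncol + cc] = 1.0
    return W / W.sum(axis=1, keepdims=True)  # row-standardize

W = grid_weights(3, 3)  # center cell (index 4) has eight neighbors at 1/8
```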

CAR. More commonly used in spatial statistics. It is not based on spatial dependence per se; instead, the probability of a certain value is conditional on the neighboring values: E(xi | all other xj) = μi + ρ Σj vij (xj − μj), where ρ is the autocorrelation parameter and V is a symmetrical weight matrix. The symmetry requirement means that directional processes cannot be modeled.
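The conditional mean can be sketched directly; the values of ρ, μ, and the symmetric weight matrix V below are assumed purely for illustration.

```python
import numpy as np

def car_conditional_mean(x, mu, V, rho, i):
    """Conditional expectation at site i under the CAR form:
    E(x_i | rest) = mu_i + rho * sum_j v_ij * (x_j - mu_j).
    V has zeros on its diagonal, so site i drops out of the sum."""
    return mu[i] + rho * V[i] @ (x - mu)

V = np.array([[0.0, 1.0],   # two sites, symmetric weights (assumed)
              [1.0, 0.0]])
x = np.array([2.0, 3.0])
mu = np.array([1.0, 1.0])
m = car_conditional_mean(x, mu, V, rho=0.3, i=0)  # 1.0 + 0.3 * (3.0 - 1.0)
```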

SAR. Based on the concept of a set of simultaneous equations to be solved: xi and xi−1 are each defined by their own equations containing the other x's. Here x is a vector linearly dependent on a vector of underlying variables z1, z2, z3, … given as a matrix Z, so that x = Zβ + u, where u is a vector of non-independent error terms with mean zero and variance-covariance matrix C. Spatial autocorrelation enters via u, where u = ρWu + ε. Here ε is an independent error term and W is a matrix of neighbor weights standardized to row totals of 1. W is not necessarily symmetrical, allowing for the inclusion of anisotropy; wij > 0 if the value at location i is not independent of the value at location j.

SAR. This yields the model x = Zβ + (I − ρW)⁻¹ε, with variance-covariance matrix (from u) C = σ²(I − ρW)⁻¹[(I − ρW)⁻¹]ᵀ. Note how similar this is to the MA form; the difference is that the MA formula contains no inverse. The diagonal elements of C are the variances. From Fortin and Dale, p. 231.
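The error structure just described can be sketched numerically: u = ρWu + ε implies u = (I − ρW)⁻¹ε, which gives the covariance C = σ²(I − ρW)⁻¹[(I − ρW)⁻¹]ᵀ. The small weights matrix below is assumed for illustration only.

```python
import numpy as np

def sar_covariance(W, rho, sigma2=1.0):
    """Variance-covariance matrix of SAR errors u = (I - rho W)^{-1} e:
    C = sigma2 * (I - rho W)^{-1} (I - rho W)^{-T}."""
    n = W.shape[0]
    A_inv = np.linalg.inv(np.eye(n) - rho * W)
    return sigma2 * A_inv @ A_inv.T

# Tiny row-standardized weights matrix for three sites in a line (assumed).
W = np.array([[0.0, 1.0, 0.0],
              [0.5, 0.0, 0.5],
              [0.0, 1.0, 0.0]])
C = sar_covariance(W, rho=0.4)  # symmetric even though W need not be
```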

SAR. Advantage: it doesn't require the weight matrix to be symmetrical, so it can model anisotropic phenomena. SAR can take three forms:
- Lagged response model: the autoregressive process occurs only in the response variable
- Lagged mixed model: spatial autocorrelation affects both the response and the predictors
- Spatial error model: the autoregressive process occurs only in the error term, not in the response or predictors
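As a final sketch, the lagged response form y = ρWy + Xβ + ε can be simulated by solving (I − ρW)y = Xβ + ε; all names and values here are illustrative assumptions, not from the slides.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 3
# Row-standardized neighbor weights for three sites in a line (assumed).
W = np.array([[0.0, 1.0, 0.0],
              [0.5, 0.0, 0.5],
              [0.0, 1.0, 0.0]])
X = rng.normal(size=(n, 2))          # two illustrative predictors
beta = np.array([1.0, -0.5])         # assumed coefficients
e = rng.normal(size=n)               # independent errors
rho = 0.4

# Lagged response model: solve (I - rho W) y = X beta + e for y.
y = np.linalg.solve(np.eye(n) - rho * W, X @ beta + e)
```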