
Published by Zoe Watkinson. Modified about 1 year ago.

1
Lecture #5: MAPS WITH GAPS: Small geographic area estimation, kriging, and kernel smoothing
Spatial statistics in practice
Center for Tropical Ecology and Biodiversity, Tunghai University & Fushan Botanical Garden

2
Topics for today’s lecture
The E-M algorithm
The spatial E-M algorithm
Kriging in ArcGIS
Geographically weighted regression (GWR)
Approaches to map smoothing

3
THEOREM 1
When missing values occur only in a response variable, Y, then the iterative solution to the EM algorithm produces the regression coefficients calculated with only the complete data.
PF: Let b denote the vector of regression coefficients that is converged upon. Then if,
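The claim of Theorem 1 can be checked numerically. The sketch below is an illustration under stated assumptions, not the lecture's own computation: the data are hypothetical, the model is a simple linear regression with one covariate, and the helper name `ols` is invented for this example. It alternates the E step (replace each missing response with its current fitted value) and the M step (refit by least squares), then compares the converged coefficients with the fit to the complete cases only.

```python
def ols(x, y):
    """Closed-form simple linear regression; returns (intercept, slope)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


# Hypothetical data (an assumption); None marks a missing response.
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
y = [2.1, 3.9, None, 8.2, None, 11.8]

observed = [i for i, yi in enumerate(y) if yi is not None]
missing = [i for i, yi in enumerate(y) if yi is None]

# Regression using only the complete observations.
b_complete = ols([x[i] for i in observed], [y[i] for i in observed])

# EM iteration: start the missing responses at the observed mean.
y_filled = list(y)
start = sum(y[i] for i in observed) / len(observed)
for i in missing:
    y_filled[i] = start

for _ in range(200):
    intercept, slope = ols(x, y_filled)        # M step: refit the regression
    for i in missing:                          # E step: update the imputations
        y_filled[i] = intercept + slope * x[i]

b_em = ols(x, y_filled)
print("complete-data fit:", b_complete)
print("EM fit:           ", b_em)
```

At convergence the imputed values lie exactly on the fitted line, so they contribute nothing to the normal equations, which is why the two fits agree.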

4
THEOREM 2
When missing values occur only in a response variable, Y, then by replacing the missing values with zeroes and introducing a binary 0/-1 indicator variable covariate -I_m for each missing value m, such that I_m is 0 for all but missing value observation m and 1 for missing value observation m, the estimated regression coefficient b_m is equivalent to the point estimate for a new observation, and hence furnishes EM algorithm imputations.
PF: Let b_m denote the vector of regression coefficients for the missing values, and partition the data matrices such that

5
The EM algorithm solution
where: the missing values are replaced by 0 in Y, and I_m is an indicator variable for missing value m that contains n-1 0s and a single 1
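The indicator-variable device can be illustrated with a small regression. This is a sketch under assumptions: the data are hypothetical, and `solve` and `ols` are helper names invented here (a plain Gauss-Jordan solver applied to the normal equations). Missing responses are set to 0, one column -I_m is appended per missing value, and the coefficients on those columns are compared with predictions from the complete-data fit.

```python
def solve(A, b):
    """Solve A z = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [bi] for row, bi in zip(A, b)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(n):
            if r != c:
                f = M[r][c] / M[c][c]
                M[r] = [vr - f * vc for vr, vc in zip(M[r], M[c])]
    return [M[i][n] / M[i][i] for i in range(n)]


def ols(X, y):
    """Least squares coefficients from the normal equations X'X b = X'y."""
    k = len(X[0])
    XtX = [[sum(row[i] * row[j] for row in X) for j in range(k)] for i in range(k)]
    Xty = [sum(row[i] * yi for row, yi in zip(X, y)) for i in range(k)]
    return solve(XtX, Xty)


# Hypothetical data (an assumption); None marks a missing response.
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
y = [2.1, 3.9, None, 8.2, None, 11.8]
missing = [i for i, yi in enumerate(y) if yi is None]
observed = [i for i, yi in enumerate(y) if yi is not None]

# Augmented design: intercept, x, and one -I_m column per missing value.
X_aug = [[1.0, xi] + [-1.0 if i == m else 0.0 for m in missing]
         for i, xi in enumerate(x)]
y_zero = [0.0 if yi is None else yi for yi in y]   # missing values replaced by 0

coef = ols(X_aug, y_zero)
imputations = coef[2:]          # the b_m coefficients

# Reference: fit to complete cases, then predict at the missing locations.
b0, b1 = ols([[1.0, x[i]] for i in observed], [y[i] for i in observed])
predictions = [b0 + b1 * x[m] for m in missing]

print("b_m coefficients:", imputations)
print("predictions:     ", predictions)
```

Each indicator column absorbs the residual of its own observation, so the remaining coefficients are determined by the observed data alone and b_m equals the fitted value at the missing location.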

6
THEOREM 3
For imputations computed based upon Theorem 2, each standard error of the estimated regression coefficients b_m is equivalent to the conventional standard deviation used to construct a prediction interval for a new observation, and as such furnishes the corresponding EM algorithm imputation standard error.
PF:
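One way to see why this holds is through the partitioned inverse of the augmented cross-product matrix. The derivation below is a sketch, not the slide's own proof; the symbols X_o (covariate rows for observed responses), X_m (covariate rows for missing responses), n_o, n_m, and p are notation assumed here for illustration.

```latex
% Augmented design with one -I_m column per missing value (assumed notation)
Z = \begin{pmatrix} X_o & 0 \\ X_m & -I \end{pmatrix},
\qquad
Z^{\top}Z = \begin{pmatrix} X_o^{\top}X_o + X_m^{\top}X_m & -X_m^{\top} \\ -X_m & I \end{pmatrix}.

% Lower-right block of (Z'Z)^{-1}, via the Woodbury identity
\left[ I - X_m \left( X_o^{\top}X_o + X_m^{\top}X_m \right)^{-1} X_m^{\top} \right]^{-1}
= I + X_m \left( X_o^{\top}X_o \right)^{-1} X_m^{\top}.

% Hence, for a single missing value with covariate row x_m
\widehat{\operatorname{Var}}(b_m)
= s^{2}\left[ 1 + x_m^{\top}\left( X_o^{\top}X_o \right)^{-1} x_m \right],
\qquad
s^{2} = \frac{\text{SSE}}{n_o - p}.
```

The residual sum of squares and its degrees of freedom, n - p - n_m = n_o - p, are the same as in the complete-data regression, so the expression is the usual squared standard error for predicting a new observation.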

7
What is the set of equations for the following case?
107 7 y_4 = ?

8
Some preliminary assessments

9
simulations

10
simulated imputations

11
EM algorithm solution for aggregated georeferenced data: vandalized turnips plots

12
MTB > regress c4 8 c7-c14
Regression Analysis: C4 versus C7, C8, C9, C10, C11, C12, C13, C14
The regression equation is C4 = C C C C C C C C14
Predictor          Coef  SE Coef  T  P
Constant
C7  [I_1 - I_6]
C8  [I_2 - I_6]
C9  [I_3 - I_6]
C10 [I_4 - I_6]
C11 [I_5 - I_6]
C12 [plot(6,5)]
C13 [plot(5,6)]
C14 [plot(6,6)]

13
Analysis of Variance for C4
Source  DF  SS  MS  F  P
C
Error
Total
Individual 95% CIs For Mean Based on Pooled StDev
Level  N  Mean  StDev
Pooled StDev =

14
Residual spatial autocorrelation
What does this mean?

15

16
SAR-based missing data estimation
where y_m is a missing value (replaced by 0 in Y), I_m is an indicator variable for y_m, and the accompanying term is the mth column of geographic weights matrix W

17
The Jacobian term
NOTE: denominator becomes (n - n_m)

18
What is the set of equations for the following case?
7 Y_2 = ? 10

19
spatial autoregressive (AR) kriging estimate with semivariogram model fit semivariogram model with

20
The pure spatial autocorrelation CAR model
Dispersed missing values:
NOTE: exactly the same algebraic structure as the kriging equation
Imputation = the observed mean plus a weighted average of the surrounding residuals
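The verbal form of the imputation (observed mean plus a weighted average of the surrounding residuals) can be sketched for a regular grid. Everything specific here is an assumption made for illustration: the grid values, the function name `car_impute`, equal weights over rook neighbors, and the autocorrelation parameter `rho`, which in practice would be estimated from the data.

```python
def car_impute(grid, row, col, rho):
    """Impute grid[row][col] as the observed mean plus rho times the
    average residual of its observed rook (edge-sharing) neighbors."""
    observed = [v for r in grid for v in r if v is not None]
    mean = sum(observed) / len(observed)

    neighbors = []
    for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):
        r, c = row + dr, col + dc
        inside = 0 <= r < len(grid) and 0 <= c < len(grid[0])
        if inside and grid[r][c] is not None:
            neighbors.append(grid[r][c])

    if not neighbors:            # no observed neighbors: fall back to the mean
        return mean
    avg_residual = sum(v - mean for v in neighbors) / len(neighbors)
    return mean + rho * avg_residual


# Hypothetical 3-by-3 grid with the center value missing.
grid = [
    [10.0, 12.0, 11.0],
    [14.0, None, 16.0],
    [13.0, 18.0, 15.0],
]
imputed = car_impute(grid, 1, 1, rho=0.8)
print(imputed)
```

With `rho = 0` the imputation collapses to the observed mean; as `rho` grows, the neighboring residuals pull the imputed value toward the local level.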

21
Employing rook’s adjacency and a CAR model, what is the equation for the following imputation?
y_5 = ?4 955

22
The spatial filter EM algorithm solution
where: the missing values are replaced by 0 in Y, and I_m is an indicator variable for missing value m that contains n-1 0s and a single 1

23
Imputation of turnip production in 3 vandalized field plots
Columns: Field plot | Conventional EM estimate | Spatial SAR-EM estimate | Spatial filter: 3 selected eigenvectors
(6,5)
(5,6)
(6,6)

24
Cressie’s PA coal ash
model           estimate
Cressie         10.27%
Spherical       10.62%
Gaussian        10.18%
exponential     10.12%
SAR             10.17%
spatial filter  10.71%
min  mean  max

25

26
Missing 1992 georeferenced density of milk production in Puerto Rico: constrained (total = 1918)
Predicted from 1991 DMILK
Predicted from spatial filter
Predicted from both
,3391,8481, predictions
Moran scatterplot

27
USDA-NASS estimation of Pennsylvania crop production
covariate
total constraints
map gaps

28

29
USDA-NASS estimation of Michigan crop production
If this is 2% milk, how much am I paying for the other 98%?

30
Michigan imputations
different response variable specifications

31
USDA-NASS estimation of Tennessee crop production

32
Tennessee imputations

33
An EM specification when some data for both Y and the Xs are missing

34
Concatenation results:

35
The spatial model
power transformation
spatial autocorrelation
totals constraints
covariate

36

37

38
Imputation of turnip production in 3 vandalized field plots
Field plot  Spatial filter: 3 selected eigenvectors
(6,5)       24.31
(5,6)       13.62
(6,6)       23.93

39
Cross-validation of spatial filter for observed turnip data

40
Kriging: best linear unbiased spatial interpolator (i.e., predictor)
The accompanying table contains a test set of sixteen random samples (#17-32) used to evaluate three maps. The “Actual” column lists the measured values for the test locations identified by “Col, Row” coordinates. The differences between these values and those predicted by the three interpolation techniques form the residuals shown in parentheses. The “Average” column compares the whole-field arithmetic mean of 23 (guess 23 everywhere) with the actual value at each test location.

41
ArcGIS: Geostatistical Wizard
anisotropy check
density of German workers

42
Cross-validation check of kriged values
This is one use of the missing spatial data imputation methods.

43
Unclipped kriged surface
kriged (mean response) surface
prediction error surface
exponential semivariogram model
values increase with darkness of brown
extrapolation

44
Clipped kriged surface
prediction error surface
kriged (mean response) surface
values increase with darkness of brown

45
Detrended population density across China
anisotropy check

46
Cross-validation check of kriged values
This is one use of the missing spatial data imputation methods.

47
Unclipped kriged surface
kriged (mean response) surface
prediction error surface
exponential semivariogram model
values increase with darkness of brown
extrapolation

48
Clipped kriged surface
kriged (mean response) surface
prediction error surface
values increase with darkness of brown

49
THEOREM 4
The maximum likelihood estimate for missing georeferenced values described by a spatial autoregressive model specification is equivalent to the best linear unbiased predictor kriging equation of geostatistics.

50
Geographically weighted regression: GWR
Spatial filtering enables easier implementation of GWR, as well as proper assessment of its degrees of freedom
Step #1: compute the eigenvectors of a geographic connectivity matrix, say C
Step #2: compute all of the interaction terms X_j E_k for the P covariates times the K candidate eigenvectors (e.g., with MC > 0.25)
Step #3: select from the total set, including the individual eigenvectors, with stepwise regression

51
Step #4: the geographically varying intercept term is given by:
Step #5: the geographically varying covariate coefficient is given by factoring X_j out of its appropriate selected interaction terms:
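The expressions for Steps #4 and #5 did not survive in this transcript. One form consistent with the wording of the steps is sketched below; it is an assumed reconstruction, not the slide's own equations. The notation is introduced here for illustration: b_0 and b_j are the global intercept and slope, b_{E_k} is the coefficient on a selected eigenvector E_k, b_{jk} is the coefficient on a selected interaction term X_j E_k, and K_0 and K_j are the index sets of selected eigenvectors and selected interactions for covariate j.

```latex
% Step 4 (assumed form): geographically varying intercept
\mathbf{a} = b_0 \mathbf{1} + \sum_{k \in K_0} b_{E_k} \mathbf{E}_k

% Step 5 (assumed form): factor X_j out of b_j X_j + \sum_k b_{jk} X_j E_k
\boldsymbol{\beta}_j = b_j \mathbf{1} + \sum_{k \in K_j} b_{jk} \mathbf{E}_k
```

Each element of these vectors is the local coefficient for one areal unit, so mapping the vectors displays how the intercept and slope vary over space.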

52
A Puerto Rico DEM example
Mean elevation (Y) is a function of: standard deviation of elevation (X), eigenvectors E_1-E_18, and 18 interaction terms (XE)
Results
intercept: 1, E_2, E_5-E_7, E_9, E_11-E_13, E_15, E_18
slope: 1, E_4, E_6, E_9, E_10
R^2 increases from (with X only) to (with geographically varying coefficients)
P(S-W) = 0.52 for the final model

53
GWR-spatial filter intercept (MC = 0.692)
GWR-spatial filter slope (MC = 0.721)

54
Spatial moving averages
Local smoothing of attribute values
where:
w_ij is an element of the spatial weights matrix
y_i is the attribute value for each areal unit
n is the number of areal units
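The smoothing formula itself is missing from this transcript. A common form of a spatial moving average, assumed here for illustration, replaces each value with the weighted mean of the values in its neighborhood: the sum over j of w_ij y_j, divided by the sum over j of w_ij. The sketch below uses hypothetical data and a hypothetical function name, and it assumes each unit is included in its own neighborhood (w_ii = 1).

```python
def spatial_moving_average(y, W):
    """Smooth y by replacing each value with the w_ij-weighted mean of y."""
    n = len(y)
    smoothed = []
    for i in range(n):
        total_weight = sum(W[i])
        weighted_sum = sum(W[i][j] * y[j] for j in range(n))
        smoothed.append(weighted_sum / total_weight)
    return smoothed


# Four hypothetical areal units arranged in a line; each unit's neighborhood
# is itself plus its adjacent units (an assumption for this sketch).
y = [10.0, 20.0, 30.0, 40.0]
W = [
    [1, 1, 0, 0],
    [1, 1, 1, 0],
    [0, 1, 1, 1],
    [0, 0, 1, 1],
]
smoothed = spatial_moving_average(y, W)
print(smoothed)
```

Setting the diagonal of W to 0 instead would smooth each unit from its neighbors only, which is the variant to use when the unit's own value is missing.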

55

56
A summary: what have we learned during the 5 lectures?
Lecture #1
The nature of data and its information content.
What is spatial autocorrelation?
Visualizing spatial autocorrelation: Moran scatterplots, semivariogram plots, and maps.
Defining and articulating spatial structure: topology and distance perspectives; contagion and hierarchy concepts.
Necessary concepts from multivariate statistics.
An example of the elusive negative spatial autocorrelation.
Some comments about spatial sampling.
Implications about space-time data structure.

57
Lecture #2
Multivariate grouping, and location-allocation modeling.
Going from the global to the local: variability and heterogeneity.
Impacts of spatial autocorrelation on histograms.
The LISA and Getis-Ord statistics.
Cluster analysis: multivariate analysis, cluster detection, and spider diagrams.
An overview of geographic and space-time clusters.
Regression diagnostics and geographic clusters.

58
Lecture #3
Autoregressive specifications and normal curve theory (PROC NLIN).
Auto-binomial and auto-Poisson models: the need for MCMC.
Relationships between spatial autoregressive and geostatistical models.
Spatial filtering specifications and linear and generalized linear models (PROC GENMOD).
Autoregressive specifications and linear mixed models (PROC MIXED).
Implications for space-time datasets (PROC NLMIXED).

59
Lecture #4
Frequentist versus Bayesian perspectives.
Implementing random effects models in GeoBUGS.
Spatially structured and unstructured random effects: the CAR, the ICAR, and the spatial filter specifications.
Lecture #5
The E-M algorithm
The spatial E-M algorithm
Kriging in ArcGIS
Approaches to map smoothing
