Sugar Cane Production in Puerto Rico, 1958/59-1973/74: A Comparison of Four Model Specifications for Describing Small Heterogeneous Space-Time Datasets.

1 Sugar Cane Production in Puerto Rico, 1958/59-1973/74: A Comparison of Four Model Specifications for Describing Small Heterogeneous Space-Time Datasets, by Daniel A. Griffith, Ashbel Smith Professor of Geospatial Information Sciences

2 ABSTRACT Researchers increasingly are accounting for heterogeneity in their empirical analyses. When data form a short time series (too short to utilize an ARIMA model), a random effect term can be employed to account for serial correlation. When data also are georeferenced, forming a space-time dataset, a spatially structured random effect term can be included in order to account for spatial autocorrelation, too. But space-time heterogeneity can be accounted for in various ways, including specifications involving recently developed spatial filtering methodology. This paper summarizes comparisons of four model specifications: simple pooled space-time; sequential, comparative statics; temporally varying coefficients with a spatially unstructured random effect; and temporally varying coefficients with a spatially structured random effect. Implementations are illustrated with annual sugar cane production data for the 73 municipalities of Puerto Rico during 1958/59-1973/74. Covariates whose importance is assessed include elevation and distance from the primate city.

3 Panel data versus space-time data. Panel data are a form of longitudinal data, and can be a cross-section (i.e., the spatial dimension) of individuals (e.g., farms) that are surveyed periodically over a given time horizon. With repeated observations of the same individuals, panel data permit a researcher to study the dynamics of change with short time series. A main advantage of panel data is controlling for unobserved heterogeneity (the fundamental complication of non-experimental data collection). But longitudinal data need not involve the same individuals: if the sample is not the same from period to period, observed changes also may result from sampling error.

4 Spatial filtering. A given random variable can be decomposed into a spatial component and an aspatial component by the impulse-response function approach (based upon the autoregressive model), the Getis approach (based on the K function), or the eigenfunction spatial filtering approach. The spatial component relates to spatial autocorrelation.

5 High Peak district biomass index: ratio of remotely sensed data spectral bands B3 and B4. Map panels: spatially autocorrelated; geographically random.

6 Defining spatial autocorrelation. Auto: self. Correlation: degree of relative correspondence. Positive: similar values cluster together on a map. Negative: dissimilar values cluster together on a map.

7 Spatial autocorrelation: from r to MC

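8 Constructing eigenfunctions for filtering spatial autocorrelation out of georeferenced variables. Moran Coefficient: $MC = \dfrac{n}{\mathbf{1}^T C \mathbf{1}} \times \dfrac{Y^T (I - \mathbf{1}\mathbf{1}^T/n)\, C\, (I - \mathbf{1}\mathbf{1}^T/n)\, Y}{Y^T (I - \mathbf{1}\mathbf{1}^T/n)\, Y}$. The eigenfunctions come from $(I - \mathbf{1}\mathbf{1}^T/n)\, C\, (I - \mathbf{1}\mathbf{1}^T/n)$.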
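The matrix form of the Moran Coefficient on this slide can be sketched numerically. The following is a minimal illustration, not the author's code: the 4-by-4 lattice, the rook-contiguity rule, and the helper names `rook_adjacency` and `moran_coefficient` are all assumptions made for the example.

```python
import numpy as np

def rook_adjacency(rows, cols):
    """Binary connectivity matrix C for a rows x cols lattice (hypothetical
    example geography; cells sharing an edge are neighbors)."""
    n = rows * cols
    C = np.zeros((n, n))
    for r in range(rows):
        for c in range(cols):
            i = r * cols + c
            if c + 1 < cols:                      # neighbor to the right
                C[i, i + 1] = C[i + 1, i] = 1.0
            if r + 1 < rows:                      # neighbor below
                C[i, i + cols] = C[i + cols, i] = 1.0
    return C

def moran_coefficient(y, C):
    """MC = (n / 1'C1) * Y'(I - 11'/n) C (I - 11'/n) Y / Y'(I - 11'/n) Y."""
    y = np.asarray(y, dtype=float)
    n = y.size
    M = np.eye(n) - np.ones((n, n)) / n           # centering projector I - 11'/n
    z = M @ y                                     # mean-centered values
    return (n / C.sum()) * (z @ C @ z) / (z @ z)

C = rook_adjacency(4, 4)
gradient = np.repeat(np.arange(4.0), 4)           # smooth north-south trend
checker = np.array([(r + c) % 2 for r in range(4) for c in range(4)], dtype=float)

print("MC, gradient map:    ", moran_coefficient(gradient, C))  # positive
print("MC, checkerboard map:", moran_coefficient(checker, C))   # negative
```

The gradient map places similar values next to each other and yields a positive MC, whereas the checkerboard places dissimilar values next to each other and yields a negative MC.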

9 Eigenvectors for spatial filter construction. The first eigenvector, say E_1, is the set of real numbers that has the largest MC achievable by any set for the spatial arrangement defined by the geographic connectivity matrix C. The second eigenvector is the set of values that has the largest achievable MC by any set that is uncorrelated with E_1. The third eigenvector is the third such set of values. And so on. This sequential construction of eigenvectors continues through E_n, the set of values that has the largest negative MC achievable by any set that is uncorrelated with the preceding (n-1) eigenvectors.
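The ordering property described on this slide can be checked numerically. The sketch below assumes a hypothetical 3-by-5 lattice with rook contiguity (not the Puerto Rico municipalities) and uses NumPy's symmetric eigensolver; the helper names are illustrative only.

```python
import numpy as np

def rook_adjacency(rows, cols):
    """Binary connectivity matrix C for a small lattice (assumed geography)."""
    n = rows * cols
    C = np.zeros((n, n))
    for i in range(n):
        r, c = divmod(i, cols)
        if c + 1 < cols:
            C[i, i + 1] = C[i + 1, i] = 1.0
        if r + 1 < rows:
            C[i, i + cols] = C[i + cols, i] = 1.0
    return C

def mc(y, C):
    """Moran Coefficient of y for connectivity matrix C."""
    z = y - y.mean()
    return (y.size / C.sum()) * (z @ C @ z) / (z @ z)

C = rook_adjacency(3, 5)
n = C.shape[0]
M = np.eye(n) - np.ones((n, n)) / n               # I - 11'/n

# eigh returns eigenvalues in ascending order; reverse so E_1 comes first.
eigvals, eigvecs = np.linalg.eigh(M @ C @ M)
eigvals, eigvecs = eigvals[::-1], eigvecs[:, ::-1]

scale = n / C.sum()                               # n / 1'C1
E1, E2, En = eigvecs[:, 0], eigvecs[:, 1], eigvecs[:, -1]

print("MC of E_1:", mc(E1, C), " scaled eigenvalue:", scale * eigvals[0])
print("MC of E_n:", mc(En, C), " scaled eigenvalue:", scale * eigvals[-1])
print("E_1 . E_2:", E1 @ E2)                      # approximately zero

# No arbitrary map pattern should exceed the MC of E_1.
rng = np.random.default_rng(0)
random_mcs = [mc(rng.normal(size=n), C) for _ in range(200)]
print("largest MC among random maps:", max(random_mcs))
```

In this sketch the MC of each extreme eigenvector equals its eigenvalue multiplied by n/(1ᵀC1), which is why sorting the eigenvalues sorts the eigenvectors by the degree of spatial autocorrelation they portray.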

10 Useful citation

11 Random effects model. One term is a random observation effect (differences among individual observational units); the other is a time-varying residual error (links to change over time). The composite error term is the sum of the two.

12 Random effects model: a normally distributed intercept term with mean 0 that is uncorrelated with covariates. It supports inference beyond the nonrandom sample analyzed. The simplest case is where the intercept is allowed to vary across areal units (repeated observations are individual time series). The random effect variable is integrated out of the likelihood function (with numerical methods). It accounts for missing variables and within-unit correlation (commonality across time periods).
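How a random intercept induces within-unit correlation can be illustrated with a small simulation. This is a sketch under stated assumptions: the symbols `sigma_u` and `sigma_e`, their values, and the Gaussian linear setting are hypothetical, and many more units than the 73 municipalities are simulated so that the empirical correlation is stable.

```python
import numpy as np

rng = np.random.default_rng(42)

n_units, n_periods = 5000, 16      # 16 periods mirrors 1958/59-1973/74
sigma_u, sigma_e = 1.0, 0.5        # assumed standard deviations

# Random observation effect: one draw per unit, shared by all of its periods.
u = rng.normal(0.0, sigma_u, size=(n_units, 1))
# Time-varying residual error: one draw per unit and period.
e = rng.normal(0.0, sigma_e, size=(n_units, n_periods))
composite = u + e                  # composite error = sum of the two

# The shared term makes any two periods of the same unit correlated.
rho_theory = sigma_u**2 / (sigma_u**2 + sigma_e**2)

corr = np.corrcoef(composite.T)    # period-by-period correlation matrix
off_diagonal = corr[~np.eye(n_periods, dtype=bool)]
rho_hat = off_diagonal.mean()

print("theoretical within-unit correlation:", rho_theory)
print("simulated within-unit correlation:  ", rho_hat)
```

Under these assumptions the correlation between any two periods is the same regardless of how far apart they are, which is the "commonality across time periods" that the random intercept captures.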

13 Sugar cane production in Puerto Rico. Began in the 1530s. Experienced a sharp decline. Introduction of slave labor resulted in considerable expansion. By 1828, sugar exports were sizeable. The Spanish monarchy discouraged expansion throughout much of the 1800s. The United States took possession of the island in 1899, fully developing the long-demanded railroad on the island and channeling considerable investment into sugar cane production, which achieved maximum expansion in the 1920s. Production peaked around 1950.

14 Island-wide time series (figure annotation: US intervention)

15 1924 sugar cane railroad: finally started by the Spanish Crown, but aggressively completed by US investors

16 Covariates of sugar cane production: elevation; distance from San Juan; covariate spatial filters

17 Model specifications. I-A: initial; I-B: with linear time trend; II: with random effect

18 III: with spatial filter; IV: with spatially structured random effect

19 Sugar cane production: 1958/59-1973/74. Map panels: 1958/59; 1963/64; 1968/69; 1973/74. Scale: dark red, high; dark green, low.

20 Table columns: year; covariates; deviance; pseudo-R²; MC for %; residual MC. Covariates: time-based intercept, mean elevation, distance from San Juan. Rows: one per crop year, 1958/59 through 1973/74.

21 Mixed binomial regression: time varying covariate coefficients, spatially unstructured and structured random effects. Table columns: year; spatially unstructured (deviance statistic, pseudo-R², residual MC); spatially structured (selected vectors, deviance statistic, pseudo-R², residual MC). Selected vectors by year (the last entry of each list is truncated):
58/59: E_3, E_4, E_6, E_7, E_8, E_13, E_…
59/60: E_3, E_4, E_6, E_7, E_8, E_13, E_…
60/61: E_1, E_3, E_4, E_6, E_7, E_8, E_13, E_…
61/62: E_3, E_4, E_6, E_7, E_…
62/63: E_4, E_…
63/64: E_1, E_…
64/65: E_3, E_…
65/66: E_3, E_…

22 Table columns: year; spatially unstructured (deviance, pseudo-R², residual MC); spatially structured (selected vectors, deviance, pseudo-R², residual MC). Selected vectors by year (the last entry of each list is truncated):
66/67: E_3, E_6, E_…
67/68: E_1, E_3, E_4, E_6, E_…
68/69: E_1, E_3, E_4, E_5, E_6, E_8, E_12, E_13, E_14, E_…
69/70: E_1, E_2, E_3, E_4, E_6, E_8, E_11, E_16, E_…
70/71: E_1, E_3, E_4, E_6, E_7, E_8, E_11, E_15, E_…
71/72: E_1, E_2, E_3, E_4, E_5, E_6, E_8, E_9, E_10, E_11, E_12, E_16, E_17, E_…
72/73: E_1, E_2, E_3, E_4, E_6, E_8, E_9, E_10, E_11, E_12, E_13, E_16, E_17, E_…
73/74: E_1, E_2, E_3, E_4, E_6, E_8, E_9, E_10, E_11, E_12, E_…

23 Spatial filters for space-time spatially structured random effects. Map panels: 1958/59, MC = 0.77, GR = …; 1963/64, MC = 0.93, GR = …; 1968/69, MC = 0.86, GR = …; 1973/74, MC = 0.94, GR = 0.22.

24 (Normally distributed) random intercept: areal unit specific across all years. Table columns: feature; spatially unstructured; added to spatial structure. Table rows: sample mean; sample variance; Moran Coefficient (MC); Geary Ratio (GR); P(Shapiro-Wilk), with 4 lower tail outliers noted; correlations with covariates.

25 Time series plots: intercept and covariate binomial regression coefficients. Panels: intercept; mean elevation; distance. Legend: ● simple pooled model; ■ comparative static model; ♦ model with a spatially unstructured random effect; ▲ mixed model with spatially structured random effect

26 Time series plots: covariate binomial regression coefficient standard errors. Panels: mean elevation; distance. Legend: ● simple pooled model; ■ comparative static model; ♦ model with a spatially unstructured random effect; ▲ mixed model with spatially structured random effect

27 Residual serial correlation. The random effects estimator approximates the degree of serial correlation (or its importance in the model), and hence allows the computation of corrected estimates. The 73 residual Durbin-Watson statistics have a range of (0.140, 2.513). Determining significance here is complicated because of small T, inclusion of a random effects term, and the variable number of spatial filter (SF) eigenvectors.
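For reference, one Durbin-Watson statistic per areal unit can be computed as sketched below. The residual arrays are made-up illustrations (including the AR(1) coefficient of 0.6), not the Puerto Rico residuals, and `durbin_watson` is a hypothetical helper name.

```python
import numpy as np

def durbin_watson(resid):
    """DW = sum_t (e_t - e_{t-1})^2 / sum_t e_t^2, computed along the last
    axis, so a (units x periods) array gives one statistic per unit."""
    resid = np.asarray(resid, dtype=float)
    diff = np.diff(resid, axis=-1)
    return (diff**2).sum(axis=-1) / (resid**2).sum(axis=-1)

# Two deterministic extremes for a short series:
persistent = [1.0, 1.0, 1.0, 1.0]       # residuals never change sign
alternating = [1.0, -1.0, 1.0, -1.0]    # residuals flip sign every period
print(durbin_watson(persistent), durbin_watson(alternating))

# Hypothetical panel: 73 units, 16 periods, positively autocorrelated errors.
rng = np.random.default_rng(1)
n_units, n_periods, phi = 73, 16, 0.6   # phi is an assumed AR(1) coefficient
resid = np.zeros((n_units, n_periods))
resid[:, 0] = rng.normal(size=n_units)
for t in range(1, n_periods):
    resid[:, t] = phi * resid[:, t - 1] + rng.normal(size=n_units)

dw = durbin_watson(resid)
print("range:", dw.min(), dw.max(), "mean:", dw.mean())
```

Values near 2 suggest no first-order serial correlation, values toward 0 suggest positive serial correlation, and values toward 4 suggest negative serial correlation. With only 16 periods per unit, individual statistics vary widely even when the underlying correlation is the same.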

28 Graphical portrayal of DWs: GLM residuals (heuristic using 4 dfs lost). Map legend classes span 0 to 3.26 and include: undecided; positive serial correlation.

29 Summary of results

30 STAR-binomial specification: time; space; space-time

31 Pseudo- & quasi-likelihood estimation

32 Extra binomial variation remains. Time series plot by crop year, 1958/59 through 1973/74. Legend: ● pineapple production; ■ milk production; ♦ sugar cane production; ▲ tobacco production
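Extra binomial variation is commonly gauged with a Pearson dispersion statistic, which is near 1 when counts are purely binomial. The sketch below is an assumption-laden illustration, not the presentation's estimator: the data are simulated, the fitted model is intercept-only, and `pearson_dispersion` and `intercept_only_fit` are hypothetical helper names.

```python
import numpy as np

def pearson_dispersion(successes, trials, fitted_p, n_params):
    """Pearson chi-square divided by residual degrees of freedom."""
    successes = np.asarray(successes, dtype=float)
    trials = np.asarray(trials, dtype=float)
    fitted_p = np.asarray(fitted_p, dtype=float)
    expected = trials * fitted_p
    variance = trials * fitted_p * (1.0 - fitted_p)   # binomial variance
    chi2 = np.sum((successes - expected) ** 2 / variance)
    return chi2 / (successes.size - n_params)

rng = np.random.default_rng(0)
n_units = 73
trials = np.full(n_units, 1000)                       # assumed denominators

# Case 1: every unit shares the same probability (pure binomial counts).
y_binom = rng.binomial(trials, 0.3)

# Case 2: an unmodeled unit-level random effect perturbs the logit.
logit = np.log(0.3 / 0.7) + rng.normal(0.0, 0.5, size=n_units)
y_over = rng.binomial(trials, 1.0 / (1.0 + np.exp(-logit)))

def intercept_only_fit(y):
    """Fitted probability of an intercept-only binomial model (pooled rate)."""
    return np.full(y.shape, y.sum() / trials.sum())

phi_binom = pearson_dispersion(y_binom, trials, intercept_only_fit(y_binom), 1)
phi_over = pearson_dispersion(y_over, trials, intercept_only_fit(y_over), 1)
print("dispersion, pure binomial:       ", phi_binom)
print("dispersion, omitted random effect:", phi_over)
```

In this simulation the omitted random effect inflates the statistic far above 1, which is the kind of overdispersion a random effect term or a quasi-likelihood scale parameter is meant to absorb.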

33 Implications
1. spatial autocorrelation appears to be a source of part of the overdispersion
2. random effects (e.g., missing covariates) appear to be a source of part of the overdispersion
3. land use competition may be a source of part of the overdispersion
4. spatial filters for mean elevation and distance have six eigenvectors in common; of these, one is shared with most of the annual comparative static spatial filters, and two with most of the spatially structured random effect term spatial filters

34 Implications (continued)
5. the components of spatial autocorrelation in sugar cane production vary over time
6. a spatially unstructured random effect term that seeks to account for serial correlation in multiple short time series can better highlight latent spatial autocorrelation
7. a spatial filter can effectively structure a random effect term
8. failure to include a spatially structured random effect term can result in biased parameter estimates (largely because of the nonlinear nature of the model specification)
9. spatial and temporal autocorrelation interact in a complex way

35 THE END

