
1 Stata in Space: An example for the econometric analysis of spatially explicit raster data --- Daniel Müller --- Institute of Agricultural Economics and Social Sciences, Humboldt University Berlin --- Berlin, August 12th, 2003

2 Outline
- Introduction
- Spatial data analysis
- Data preparation
- The empirical example
- Econometric estimation
- Export of results and geovisualization

3 Introduction
- Socioeconomic data usually exist for (discrete) social entities and are rarely explicitly linked to a location (georeferenced)
- 'Natural' data are often continuous (rainfall, slope, elevation) and georeferenced
- Integrating both data sources can provide additional insights
- It allows us to understand spatial patterns & processes
- Knowing the where can help us infer the why

4 Spatial data analysis
- Spatial analysis is the analysis of data linked to location (spatial data)
- Why analyze spatial data?
  - Variables of interest vary in space
  - Location matters!
- Spatial analysis can provide important insights into:
  - geographical targeting of investments
  - diffusion of technologies
  - causes and consequences of land-use change

5 Spatial data analysis: What's special about spatial data?
=> Location matters!
=> Tobler's 1st law of geography (1970): "Everything is related to everything else, but near things are more related than distant things."
=> Spatial effects:
- spatial autocorrelation
- spatial heterogeneity
1. Location matters, both absolute and relative; physical measurements are often an explicit spatial sample, economic data seldom
2. Dependence = spatial structure

6 Spatial data analysis: Peculiarities in space: spatial effects
1. Spatial autocorrelation
- Coincidence of value similarity with locational similarity
- The second dimension adds mathematical complexity (multiple directions)
2. Spatial heterogeneity
- Each location is unique
- Units of observation are not homogeneous across space
- Structural instability over space, e.g. heteroskedasticity: non-constant error variances due to, e.g., unequal population densities or varying technological development

7 Spatial data analysis: Peculiarities in space: spatial effects [2]
Spatial effects are due to:
- interactions among neighboring agents
- data from different sources
- different sample designs
- varying aggregation rules
"Spatial relationships among observations can result in unreliable estimates and misguided statistical inference of the parameters." (Anselin 1988)
=> Corrections necessary

8 Spatial data analysis: Geographic Information Systems (GIS)
- Compile, store, manipulate, analyse, visualize spatial data
- Consist of hardware, software, data and procedures
- Data models: vector & raster

9 Spatial data analysis: Raster data model
- Arrangement of regularly shaped, contiguous cells
- Continuous data layers; fit together edge-to-edge
- Typically consist of square cells
- Each cell represents a location in a raster GIS
- Cells are arranged in layers
- Values of a cell indicate characteristics of that location
- Data are composed of many layers covering the same geographical area

10 Spatial data analysis: Raster data model --- file structure
Header: contains the spatial information!
(Figure: example cells numbered 1 to 6)

11 Spatial data analysis: Raster data model --- land use map

12 Spatial data analysis: From data layers to resulting map
data layers -> overlays -> analyses -> output

13 Data preparation: Importing grids into Stata
ras2dta , files(filelist) [ idcell(varname) nodata(#) dropmiss xcoord(#) ycoord(#) genxcoor(varname) genycoor(varname) header(filename) saving(filelist) replace clear ]
-infile-s grids (filelist) into Stata:
- generates an ID code for each cell (= observation)
- reads the information from the header (if present)
- sets missing values to a specified number
- drops unnecessary empty cells
- generates X and Y coordinates
- saves the header information in a file
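The steps listed for -ras2dta- can be illustrated with built-in Stata commands on a toy grid. This is only a sketch of the idea, not the ado's actual code: the grid size, the 50 m cell size, the lower-left corner at (0, 0), the row-by-row reading order and the no-data code -9999 are all assumptions made for the example.

```stata
* Sketch: index the cells of a small grid that was read row by row.
* All numbers below are invented for illustration.
clear
set obs 12
local ncols    4
local nrows    3
local cellsize 50

gen long idcell = _n                              // one ID per cell (= observation)
gen col = mod(_n - 1, `ncols') + 1                // 1 = westernmost column
gen row = ceil(_n / `ncols')                      // 1 = northernmost row
gen xcoord = (col - 0.5) * `cellsize'             // X of the cell centre
gen ycoord = (`nrows' - row + 0.5) * `cellsize'   // Y of the cell centre

gen value = cond(mod(_n, 5) == 0, -9999, _n)      // toy cell values, two no-data cells
replace value = . if value == -9999               // assumed handling: no-data code -> missing
drop if missing(value)                            // drop the empty cells
list idcell xcoord ycoord value
```

Keeping `idcell` on every layer is what makes the later cell-by-cell combination of layers possible.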

14 Data preparation: Integration of data layers
- Import of raster grids (-ras2dta-)
- Combination of raster layers in Stata (-joinby-, -merge-) based on a spatial identifier (ID code of cells)
- Socioeconomic (survey, census) data can be joined to grids based on, e.g., administrative boundaries
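The combination of layers described above might look as follows in Stata. This is a minimal sketch with invented toy data: the variable names (`idcell`, `slope`, `landuse`, `village`, `popdens`) are hypothetical, and `merge 1:1` assumes Stata 11 or newer.

```stata
* Sketch: two raster layers keyed by cell ID, plus one village-level table.
clear
input idcell slope village
1  2.5 1
2  7.0 1
3 12.0 2
4  4.5 2
end
tempfile slopelayer
save `slopelayer'

clear
input idcell landuse
1 1
2 3
3 2
4 1
end
merge 1:1 idcell using `slopelayer', nogenerate   // combine the layers cell by cell
tempfile cells
save `cells'

clear
input village popdens
1 35
2 80
end
joinby village using `cells'   // attach village data to every cell of that village
sort idcell
list
```

Here -merge- handles the one-to-one match between layers, while -joinby- spreads each village record over all cells that belong to the village.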

15 Data preparation: Corrections of spatial effects
1. Spatial lag variables with index values for latitude (Y) and longitude (X)
2. Spatially lagged variables
3. Regular sampling from a grid
=> 1. can be done with -ras2dta-
=> 2. we ignore here
=> 3. is easy in Stata, e.g. with -spatsam-

16 Data preparation
spatsam , gap(#) xcoord(varname) ycoord(varname) [ saving(filename) norestore noseed replace ]
Basically that's:
keep if (xcoord / gap) == int(xcoord / gap) & (ycoord / gap) == int(ycoord / gap)
Therefore, only every #-th observation in X and Y direction is kept in the sample.
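The sampling rule can be tried on a small artificial grid. The sketch below assumes integer cell indices from 1 to 10 in both directions and a gap of 5. It uses `mod()`, which is equivalent to the `int()` comparison above for integer coordinates. It does not call -spatsam- itself.

```stata
* Sketch: regular sample from a 10 x 10 grid, keeping every 5th cell in X and Y.
clear
set obs 100
gen xcoord = mod(_n - 1, 10) + 1   // column index 1..10
gen ycoord = ceil(_n / 10)         // row index 1..10
local gap 5
keep if mod(xcoord, `gap') == 0 & mod(ycoord, `gap') == 0
list xcoord ycoord
```

With a gap of 5, one cell in 25 survives, which increases the distance between neighboring observations in the sample.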

17 The empirical example: Land use change in Vietnam
- Land use as an inherently spatial process
- Returns to land use are (spatially) affected by:
  - market accessibility (von Thünen)
  - land rent (Ricardo)
- Possible factors to consider: soil quality, topography, climate, market locations, population density, technology
- Limited dependent variable problem (-mlogit-)

18 The empirical example: Data
- Satellite image interpretation: land cover => land use (change)
- GIS, maps, point measurements: geophysical indicators => topography, soil, climate
- Socioeconomic & policy variables: village survey, secondary statistics => technology, population, education, market access
- Data integration based on spatial identifier and (approximated) village areas

19 Econometric estimation
- Observations: 964,000 pixels (50 x 50 m)
- Spatial sample: every 5th cell in X & Y direction
- Estimation: 35,000 observations
=> Dependent: five land cover classes (1, 2, ..., 5)
=> Independent: a) geophysical b) socioeconomic c) policy d) spatial effects

20 Econometric estimation
1. Estimation of the influence of hypothesized determinants on land use.
2. What is the probability that a certain pixel falls into one of the five land-use categories?
=> -mlogit- (reduced form, clustered for villages)
=> -mlogtest, iia-, -fitstat- (Long & Freese)
Then we take the class with the highest predicted probability as the predicted land use.
Observations within villages are likely not independent -> SE underestimated -> robust SE clustered for villages
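The estimation and prediction steps can be sketched on simulated data. Everything here is assumed for illustration: three land-use classes instead of five, two invented regressors (`slope`, `dist`), 30 artificial villages, and Stata 10.1 or newer for `rnormal()` and `vce()`. The user-written -mlogtest- and -fitstat- are left out so that the example runs with built-in commands only.

```stata
* Sketch: multinomial logit with village-clustered standard errors,
* then the class with the highest predicted probability per cell.
clear
set seed 2003
set obs 1500
gen village = ceil(_n / 50)                 // 30 artificial villages
gen slope = rnormal()
gen dist  = rnormal()
gen landuse = 1 + (slope + rnormal() > 0) + (dist + rnormal() > 0.5)   // classes 1..3

mlogit landuse slope dist, vce(cluster village)
predict p1 p2 p3, pr                        // one probability per class

egen pmax = rowmax(p1 p2 p3)                // maximum predicted probability
gen byte predicted = .
forvalues k = 1/3 {
    replace predicted = `k' if p`k' == pmax & missing(predicted)
}
tabulate landuse predicted
```

`predicted` and `pmax` correspond to the two quantities that are mapped later: the predicted class and the maximum predicted probability.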

21 Export of results: Outputting results from Stata
dta2ras [varlist], xcoord(#) ycoord(#) cellsize(#) [ header(filename) idcell(varname) nodata(#) xllcorner(#) yllcorner(#) saving(varlist) replace ]
- writes a header at the front of the file with the information from xcoord(#) ycoord(#) cellsize(#) or header(); optionally nodata(#) xllcorner(#) yllcorner(#)
- then the results can be mapped in the GIS
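What "writing a header in front of the file" amounts to can be sketched with Stata's built-in -file- command. The slides do not name the grid format, so the header keywords below follow the ESRI ASCII grid layout as an assumption. The 3 x 2 toy grid and its values are invented, and this is not the code of -dta2ras-.

```stata
* Sketch: write a toy grid as text, header first, then the cell values.
clear
set obs 6
gen xcoord = mod(_n - 1, 3) + 1      // column index, 1 = westernmost
gen ycoord = ceil(_n / 3)            // row index, 1 = southernmost
gen predicted = xcoord + ycoord      // stand-in for the predicted land-use class
gsort -ycoord xcoord                 // northernmost row first, west to east

tempfile grid
tempname fh
file open `fh' using "`grid'", write replace text
file write `fh' "ncols 3" _n "nrows 2" _n
file write `fh' "xllcorner 0" _n "yllcorner 0" _n
file write `fh' "cellsize 50" _n "NODATA_value -9999" _n
forvalues i = 1/`=_N' {
    file write `fh' %1.0f (predicted[`i'])
    if xcoord[`i'] == 3 {
        file write `fh' _n           // end of a grid row
    }
    else {
        file write `fh' " "
    }
}
file close `fh'
type "`grid'"
```

The sort order matters: a grid file that stores the top row first needs the observations sorted from north to south before writing.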

22 Geovisualization of results
Prediction map

23 Geovisualization of results
Maximum predicted probabilities

24 Thank you! Questions, comments and critique welcome!
© Daniel Müller, Institute of Agricultural Economics and Social Sciences, Humboldt University Berlin
Stata ados available for download at:

