# Spatial modelling: an introduction

Duncan Lee, Adrian Bowman and Marian Scott. Environmental statistics course, August 2008

Outline
1. Spatial point processes
2. Areal unit data
3. Geostatistics
4. Spatio-temporal modelling

1. Spatial point processes
‘A spatial point process is a set of locations, irregularly distributed within a designated region and presumed to have been generated by some form of stochastic mechanism’ - Diggle (2003). A realisation from a spatial point process is termed a spatial point pattern: a countable collection of events at locations $\{u_i\}$. Here the locations of the events $\{u_i\}$ are random and are themselves the data; no other variable is collected!

Example 1

Example 2

Notation
A spatial point process is defined for a region $A$.
- Sub-regions within $A$ are denoted $A_1, A_2, \ldots$
- Single locations within $A$ are denoted $u_1, u_2, \ldots$
- $N(A)$ denotes the random variable representing the number of events in the region $A$. Similar definitions apply to $N(A_k)$ and $N(u_k)$.

Question of interest
“Does the point pattern have any spatial dependence?” Three general types of structure are possible.
- Complete spatial randomness (CSR): events occur at random.
- Clustered process: events occur close to existing events.
- Regular process: events occur away from existing events.

Complete Spatial Randomness
CSR asserts that:
(i) For any sub-region $A_k$, $N(A_k) \sim \text{Poisson}(\lambda |A_k|)$.
(ii) For disjoint sub-regions $A_1$ and $A_2$, $N(A_1)$ and $N(A_2)$ are independent.

$\lambda$ is termed the intensity and is the expected number of events per unit area, so that $\lambda |A|$ is the expected number of events in $A$. A process satisfying (i) and (ii) is called a homogeneous Poisson process (with intensity $\lambda$).
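Properties (i) and (ii) suggest a simple way to simulate a CSR pattern: draw the number of events from a Poisson distribution, then place each event uniformly at random. The following is a minimal sketch, assuming a rectangular region and using only the Python standard library; the function names and the intensity value are illustrative choices, not part of the course material.

```python
import math
import random

def sample_poisson(mean, rng):
    """Draw one Poisson(mean) variate using Knuth's multiplication method."""
    threshold = math.exp(-mean)
    count, product = 0, rng.random()
    while product > threshold:
        count += 1
        product *= rng.random()
    return count

def simulate_csr(intensity, width, height, rng):
    """Simulate a homogeneous Poisson process on [0, width] x [0, height]."""
    # Property (i): N(A) ~ Poisson(intensity * |A|)
    n_events = sample_poisson(intensity * width * height, rng)
    # Given the count, events are independent and uniform over the region
    return [(rng.uniform(0, width), rng.uniform(0, height)) for _ in range(n_events)]

rng = random.Random(1)
pattern = simulate_csr(intensity=50.0, width=1.0, height=1.0, rng=rng)
print(len(pattern))  # number of simulated events; about 50 on average
```

Knuth's method is only suitable for moderate means, because `math.exp(-mean)` underflows when the expected count is very large.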

Mean and covariance
For a CSR process $N(A) \sim \text{Poisson}(\lambda |A|)$.
- Mean: constant across $A$. Therefore at a single location $u_1$, with area 1, the mean of $N(u_1)$ equals $\lambda$.
- Covariance: the spatial dependence of the process between two points $(u_1, u_2)$ is determined by the second-order intensity function $\lambda_2(u_1, u_2)$. However, the latter is hard to work with.

K function
Instead of working with the second-order intensity function $\lambda_2(u_1, u_2)$ to measure spatial dependence, we work with the K function
$K(t) = E\{N_0(t)\} / \lambda$,
where $N_0(t)$ is the number of further events within a distance $t$ of an arbitrary event.

Why is the K-function useful?
Recall that $K(t) = E(\text{no. of events within } t \text{ of an arbitrary event}) / \lambda$.
- For a CSR process, $K(t) = \pi t^2$.
- For a clustered process we would expect more points close together than under CSR, so for small $t$, $K(t) > \pi t^2$.
- For a regular process we would expect fewer points close together than under CSR, so for small $t$, $K(t) < \pi t^2$.

Determining if CSR holds
- Step 1: estimate the intensity $\lambda$ by $\hat{\lambda} = N(A)/|A|$.
- Step 2: estimate $K(t)$ for a given distance $t$ by calculating the average number of other events within distance $t$ of each event in the pattern, and dividing by $\hat{\lambda}$.
- Step 3: plot the theoretical function for CSR, $K(t) = \pi t^2$, against $t$, and add a second line for the estimated K function of the point pattern. If CSR is reasonable the two lines will be very similar.
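The three steps can be sketched in a few lines of Python. This is a naive estimator with no edge correction (events near the boundary have unobserved neighbours outside the region, which routines such as those in spatstat adjust for), so it is meant only to illustrate the calculation; the function name and the example patterns are made up for the illustration.

```python
import math

def estimate_k(points, area, t):
    """Naive K-function estimate at distance t (no edge correction)."""
    n = len(points)
    intensity_hat = n / area  # Step 1: lambda-hat = N(A) / |A|
    # Step 2: count ordered pairs of distinct events within distance t
    close_pairs = 0
    for i, (xi, yi) in enumerate(points):
        for j, (xj, yj) in enumerate(points):
            if i != j and math.hypot(xi - xj, yi - yj) <= t:
                close_pairs += 1
    mean_neighbours = close_pairs / n  # average per event
    return mean_neighbours / intensity_hat

# A regular 10 x 10 grid on the unit square (spacing 0.1)
grid = [((i + 0.5) / 10, (j + 0.5) / 10) for i in range(10) for j in range(10)]
# A tight cluster of 10 events
cluster = [(0.5 + 0.001 * i, 0.5) for i in range(10)]

t = 0.05
k_csr = math.pi * t ** 2  # Step 3: theoretical value under CSR
print(estimate_k(grid, 1.0, t) < k_csr)     # True: regular pattern
print(estimate_k(cluster, 1.0, t) > k_csr)  # True: clustered pattern
```

In practice the estimate would be computed over a range of values of `t` and plotted against the curve $\pi t^2$.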

Some examples (figure: estimated K(t) plotted against t)

Further models for spatial point processes
If CSR does not hold for the data in question there are other models that can be used, for example:
- Poisson cluster process: models clusters.
- Inhomogeneous Poisson process: spatially varying intensity.
- Cox process: the intensity is itself random (a doubly stochastic Poisson process).
- Inhibition process: models regular processes.

Another example

Implementing point process models
Point process models (including CSR and others) can be implemented in R using the add-on libraries
- spatstat
- splancs

For further details see

2. Areal unit data
The region of interest $A$ is split into $n$ non-overlapping sub-regions $A_1, \ldots, A_n$. The random variable of interest is only available as an aggregated average or total for each sub-region, and is represented by $Z_1, \ldots, Z_n$. The sub-regions are fixed, and it is the variable being measured for each region that is random. In comparison, for point processes no variable $Z$ was measured, as it was the location of each event that was random.

Motivating example
Lip cancer rates for the 56 counties in Scotland. Two possible questions of interest:
- Does any environmental variable affect the number of new cases?
- Is there an outbreak of lip cancer cases in any part of Scotland?

Map taken from a paper by Wakefield in Biostatistics, 2007.

Modelling areal unit data
When modelling areal unit data $z_1, \ldots, z_n$ from sub-regions $A_1, \ldots, A_n$, consider the following:
- Response distribution: normal, Poisson, binomial, etc.
- Regression variables: e.g. sunlight in the lip cancer example.
- Spatial dependence: are areas close together related?
- Method of analysis: frequentist or Bayesian methods.

Spatial dependence
Spatial dependence quantifies how the values of $z_1, \ldots, z_n$ are related to each other. There are three general types of dependence.
- Independence: the values of $z_1, \ldots, z_n$ are not related.
- Negative dependence: if areas $i$ and $j$ are close together then $z_i$ and $z_j$ will have different values.
- Positive dependence: if areas $i$ and $j$ are close together then $z_i$ and $z_j$ will have similar values.

Modelling positive dependence
A common method for modelling positive dependence is based on a neighbourhood or weight matrix $W$. This is a matrix of 1s and 0s, where element $w_{ij}$ is 1 if areas $i$ and $j$ are neighbours and 0 otherwise. Neighbours can be defined in many ways, including:
- Areas sharing a common border.
- Areas less than a distance $d$ apart.
- Area $i$ is one of the closest areas, in terms of distance, to area $j$.
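As an illustration of the second definition (areas less than a distance $d$ apart), the sketch below builds a 0/1 neighbourhood matrix from area centroids. Representing each area by a centroid, and the example coordinates themselves, are assumptions made for the illustration.

```python
import math

def distance_neighbour_matrix(centroids, d):
    """Return the 0/1 matrix W with W[i][j] = 1 if areas i and j are
    distinct and their centroids are less than a distance d apart."""
    n = len(centroids)
    w = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i != j and math.dist(centroids[i], centroids[j]) < d:
                w[i][j] = 1
    return w

# Hypothetical centroids for four areas; the last one is isolated
centroids = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (5.0, 5.0)]
W = distance_neighbour_matrix(centroids, d=1.5)
for row in W:
    print(row)
```

With a distance rule an isolated area can end up with no neighbours at all (the last row here is all zeros), which is one reason border-sharing or nearest-neighbour definitions are sometimes preferred.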

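Conditional autoregressive (CAR) models
For simplicity assume that $Z_1, \ldots, Z_n$ are normally distributed and there are no covariates. Then the CAR model is given by
$Z_i \mid Z_{-i} \sim N\left(\frac{1}{n_i} \sum_{j=1}^{n} w_{ij} z_j,\ \frac{\sigma^2}{n_i}\right)$,
where $Z_{-i}$ denotes all the values except $Z_i$, $w_{ij}$ is element $ij$ of the neighbourhood matrix $W$, and $n_i = \sum_j w_{ij}$ is the number of neighbours of area $i$. So the expected value of $Z_i$ is equal to the mean of its neighbours.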
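The conditional mean in the CAR model is simply the average of the neighbouring values, which can be checked with a small example. The matrix `W` and the values `z` below are invented for the illustration.

```python
def car_conditional_means(w, z):
    """For each area i, return the mean of z over the neighbours of i,
    i.e. the conditional expectation of Z_i given the other values."""
    means = []
    for row in w:
        n_i = sum(row)  # number of neighbours of area i
        if n_i == 0:
            means.append(None)  # undefined for an area with no neighbours
        else:
            means.append(sum(w_ij * z_j for w_ij, z_j in zip(row, z)) / n_i)
    return means

# Three areas in a line: area 1 borders areas 0 and 2
W = [[0, 1, 0],
     [1, 0, 1],
     [0, 1, 0]]
z = [2.0, 4.0, 9.0]
print(car_conditional_means(W, z))  # [4.0, 5.5, 4.0]
```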

3. Geostatistical data
For a fixed region $A$, the variable of interest could be measured at any location. However, due to time and cost constraints it has only been measured at $n$ locations $u_1, \ldots, u_n$, which are typically chosen rather than random. The random variables measured at the $n$ locations are denoted by $Z(u_1), \ldots, Z(u_n)$. This is therefore different from:
- Point processes, where the locations are the random variable.
- Areal data, where the variable can only be measured as $n$ aggregated averages (or totals) for sub-regions $A_1, \ldots, A_n$.

Goals of geostatistics
Given observations $Z(u_1), \ldots, Z(u_n)$, there are three general goals of a geostatistical analysis.
- How best to model the data?
- How to estimate $Z(u_0)$, where $u_0$ is an unobserved location?
- How to draw a map of $Z(u)$ for all points $u$ in the region?

Modelling geostatistical data
When modelling geostatistical data consider the following:
- Response distribution: normal, Poisson, binomial, etc.
- Spatial trend: e.g. regression variables or other trends.
- Spatial dependence: how are locations close to each other related?
- Method of analysis: frequentist or Bayesian methods.

General geostatistical model
A general model for data $Z = (Z(u_1), \ldots, Z(u_n))$ is
$Z = \mu + S$
- The data $Z$ are assumed to be normally distributed.
- $\mu$ is the mean function and models spatial trend.
- $S$ is a stochastic process and models spatial dependence.

Modelling spatial trend
A spatial trend is a systematic change in the mean function $\mu$ over the area of interest. It is generally smooth, although it may change abruptly in response to environmental forcing variables (e.g. bedrock geology). It can be modelled in numerous ways:
- Regression variables such as geology.
- Polynomials in the co-ordinates $u_1, \ldots, u_n$.
- Within the spatial dependence component $S$ (a non-stationary model).

Spatial dependence
For the remainder of this course we assume that any spatial trend has been removed by the mean function $\mu$. We assume positive rather than negative spatial dependence; that is, the closer two points are, the more similar their values of the variable will be.

Modelling spatial dependence
A common model for spatial dependence is
$S \sim N(0, C)$,
which implies the data are normally distributed. Here $C$ is the variance-covariance matrix, which is a rescaled correlation matrix. If all observations have the same variance, then $C = \sigma^2 V$, where $V$ is the correlation matrix and $\sigma^2$ is the common variance of each observation.

Correlation matrix V
The correlation matrix typically has the following characteristics.
- The diagonal elements equal 1, as they represent the correlation of an observation with itself.
- The $ij$th element of $V$ is close to one if locations $u_i$ and $u_j$ are close.
- As locations $u_i$ and $u_j$ get further apart, the $ij$th element gets closer to zero.
- Negative dependence (i.e. negative values in $V$) is rarely seen in geostatistical data.

Simplifying V or C
The covariance / correlation (spatial dependence) structure in the data can have two simplifying properties.
- Stationarity: the covariance (or correlation) between the values at $u_i$ and $u_j$ depends only on the difference $u_i - u_j$, so the absolute locations of the two points do not matter, only their distance and direction from each other.
- Isotropy: the covariance (or correlation) between the values at $u_i$ and $u_j$ depends only on the magnitude of the difference $\|u_i - u_j\|$, so the absolute locations of the two points do not matter, only their distance apart.

Assuming the spatial dependence is stationary and isotropic, the covariance function between two points $Z(u)$ and $Z(u + t)$ simplifies to a function of the scalar distance $t$ between the two points,
$C(t) = \mathrm{Cov}\{Z(u), Z(u + t)\}$.
Similarly the correlation function is given by
$\rho(t) = C(t) / \sigma^2$,
where $\sigma^2$ is the variance, also denoted by $C(0)$.

Semi-variogram modelling
In the geostatistical literature, however, spatial dependence is usually modelled in terms of the semi-variogram
$\gamma(t) = \tfrac{1}{2} \mathrm{Var}\{Z(u + t) - Z(u)\} = C(0) - C(t) = \sigma^2 - C(t)$
rather than the covariance function.
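The link between the semi-variogram and the covariance function follows from expanding the variance of a difference, assuming a stationary process with constant variance $\sigma^2 = C(0)$:

$$
\begin{aligned}
\gamma(t) &= \tfrac{1}{2}\,\mathrm{Var}\{Z(u+t) - Z(u)\} \\
&= \tfrac{1}{2}\left[\mathrm{Var}\{Z(u+t)\} + \mathrm{Var}\{Z(u)\} - 2\,\mathrm{Cov}\{Z(u+t), Z(u)\}\right] \\
&= \tfrac{1}{2}\left[\sigma^2 + \sigma^2 - 2\,C(t)\right] \\
&= \sigma^2 - C(t).
\end{aligned}
$$

So where the covariance decays from $\sigma^2$ towards zero as the distance grows, the semi-variogram rises from zero towards $\sigma^2$.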

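Estimating the semi-variogram
The semi-variogram for data $Z(u_1), \ldots, Z(u_n)$ can be estimated by calculating
$\hat{\gamma}(t) = \frac{1}{2 |N(t)|} \sum_{(u_i, u_j) \in N(t)} \{Z(u_i) - Z(u_j)\}^2$
for any value of $t$. Here $N(t)$ is the set of pairs of points $(u_i, u_j)$ that are a distance $t$ apart, and $|N(t)|$ is the number of such pairs. This function is called the empirical semi-variogram, and it can be plotted against $t$ to see its general shape.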
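With irregularly spaced data very few pairs are exactly a distance $t$ apart, so in practice pairs are usually grouped into distance bins. The sketch below takes that approach; the binning scheme, function name and toy data are assumptions made for the illustration rather than part of the course material.

```python
import math

def empirical_semivariogram(locations, values, bin_width, n_bins):
    """Binned empirical semi-variogram.

    Bin k collects the pairs whose separation lies in
    [k * bin_width, (k + 1) * bin_width) and averages 0.5 * (z_i - z_j)^2.
    Bins containing no pairs are returned as None.
    """
    sums = [0.0] * n_bins
    counts = [0] * n_bins
    n = len(locations)
    for i in range(n):
        for j in range(i + 1, n):  # each unordered pair once
            lag = math.dist(locations[i], locations[j])
            b = int(lag // bin_width)
            if b < n_bins:
                sums[b] += 0.5 * (values[i] - values[j]) ** 2
                counts[b] += 1
    return [s / c if c else None for s, c in zip(sums, counts)]

# Toy data: four locations on a line, values increasing with position
locations = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (3.0, 0.0)]
values = [1.0, 2.0, 4.0, 7.0]
gamma_hat = empirical_semivariogram(locations, values, bin_width=1.0, n_bins=4)
print(gamma_hat)
```

Here the estimate increases with distance, as expected when nearby values are more alike than distant ones.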

Alternatively, you could plot the semi-variogram cloud, which is a plot of $\tfrac{1}{2}\{Z(u_i) - Z(u_j)\}^2$ against $\|u_i - u_j\|$ for all pairs of points. This form gives more than one value for each distance $t$, so it is a scatterplot.

What should a semi-variogram look like?

- The nugget is the limiting value of the semi-variogram as the distance $t$ approaches zero. It quantifies the amount of spatial variability at very small spatial scales (those less than the separation between observations) and also measurement error.
- The sill is the horizontal asymptote of the semi-variogram, if it exists, and represents the overall variance of the random process.
- The range is the distance $t^*$ at which the semi-variogram reaches the sill. Pairs of points that are further apart than the range are uncorrelated.

Sometimes the semi-variogram only approaches the sill asymptotically, and in this case we define the practical range as the lag $t^*$ at which $\gamma(t^*) = 0.95 \times \text{sill} = 0.95 \sigma^2$.

Modelling spatial dependence
Spatial dependence in the data can now be modelled in two stages.
1. Plot the empirical semi-variogram and determine which family of semi-variogram models it resembles.
2. Estimate the parameters (sill, nugget, range) of the chosen semi-variogram model by least squares methods.

Semi-variogram models
A number of semi-variogram models exist that can be used:
- Nugget (random data)
- Spherical
- Exponential

However, these models may not fit the data particularly well.
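The transcript does not preserve the formulas for these models, so the sketch below uses the commonly quoted textbook parameterisations, written in terms of a nugget, a partial sill (sill minus nugget) and a range or scale parameter. The parameter names are choices made for this illustration.

```python
import math

def nugget_model(t, nugget):
    """Pure nugget: no spatial structure, constant for every t > 0."""
    return 0.0 if t == 0 else nugget

def spherical_model(t, nugget, partial_sill, range_):
    """Spherical model: reaches the sill exactly at t = range_."""
    if t == 0:
        return 0.0
    if t >= range_:
        return nugget + partial_sill
    r = t / range_
    return nugget + partial_sill * (1.5 * r - 0.5 * r ** 3)

def exponential_model(t, nugget, partial_sill, scale):
    """Exponential model: approaches the sill only asymptotically."""
    if t == 0:
        return 0.0
    return nugget + partial_sill * (1.0 - math.exp(-t / scale))

# With no nugget, the exponential model reaches 95% of the sill at
# t = scale * ln(20), roughly three times the scale parameter.
scale = 2.0
practical_range = scale * math.log(20.0)
print(exponential_model(practical_range, 0.0, 1.0, scale))
```

This is why the practical range of an exponential model is often quoted as about three times its scale parameter.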

Spatial prediction
Once a trend and spatial dependence model have been fitted, it is of interest to estimate $Z$ at some unobserved location $u_0$. There are many methods for doing this, including:
- Regression modelling using generalised least squares.
- Inverse distance weighted interpolation.
- Kriging.

The majority of these approaches predict $z^*(u_0)$, the variable at location $u_0$, by a weighted average of the form
$z^*(u_0) = \sum_{i=1}^{n} w_i \, z(u_i)$.
The main difference between the methods is how the weights $w_i$ are estimated. A map can then be produced by predicting the surface at a regular grid of points.
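Inverse distance weighted interpolation is the simplest instance of this weighted average: each weight is proportional to an inverse power of the distance to the prediction location, and the weights are normalised to sum to one. A minimal sketch, in which the power of 2 is a conventional default rather than something specified in the course:

```python
import math

def idw_predict(locations, values, target, power=2.0):
    """Predict the value at target as a weighted average of the observed
    values, with weights proportional to 1 / distance^power."""
    raw_weights = []
    for loc, z in zip(locations, values):
        d = math.dist(loc, target)
        if d == 0.0:
            return z  # target coincides with an observed location
        raw_weights.append(1.0 / d ** power)
    total = sum(raw_weights)
    return sum(w / total * z for w, z in zip(raw_weights, values))

locations = [(0.0, 0.0), (2.0, 0.0)]
values = [10.0, 20.0]
print(idw_predict(locations, values, (1.0, 0.0)))  # 15.0 (equidistant)
print(idw_predict(locations, values, (0.5, 0.0)))  # 11.0 (closer to the first point)
```

Unlike kriging, these weights ignore the fitted spatial dependence model and depend only on geometry.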

137Cs deposition maps in SW Scotland prepared by different European teams (ECCOMAGS, 2002)

Kriging 1 Ordinary Kriging
First, the trend is estimated using least squares methods. Then the observed values can be de-trended by subtracting the estimated trend from the data. Finally a model for the variogram is fitted to the de-trended data and used to generate the weights for the prediction.
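For kriging, the weights come from the fitted dependence model. The sketch below computes ordinary kriging weights under several simplifying assumptions that are not taken from the slides: the data are already de-trended with an unknown constant mean, the covariance function is a known exponential model with made-up parameters, and the small linear system is solved by plain Gaussian elimination.

```python
import math

def exp_cov(t, sill=1.0, scale=1.0):
    """Assumed exponential covariance function C(t)."""
    return sill * math.exp(-t / scale)

def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (m[r][n] - sum(m[r][c] * x[c] for c in range(r + 1, n))) / m[r][r]
    return x

def ordinary_kriging(locations, values, target, cov=exp_cov):
    """Return (prediction, weights) at target; the weights sum to one."""
    n = len(locations)
    # Covariances between observations, bordered by the sum-to-one constraint
    a = [[cov(math.dist(locations[i], locations[j])) for j in range(n)] + [1.0]
         for i in range(n)]
    a.append([1.0] * n + [0.0])
    # Covariances between each observation and the target
    b = [cov(math.dist(loc, target)) for loc in locations] + [1.0]
    weights = solve(a, b)[:n]  # last entry is the Lagrange multiplier
    return sum(w * z for w, z in zip(weights, values)), weights

locations = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
values = [1.0, 2.0, 3.0]
prediction, weights = ordinary_kriging(locations, values, (0.5, 0.5))
print(prediction, weights)
```

With no nugget effect the predictor interpolates exactly: predicting at an observed location returns the observed value.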

Kriging 2
There are a number of other kriging methods, such as block kriging, indicator kriging and co-kriging. Some interesting issues concern the uncertainty in the prediction. We can use the kriging procedure to produce uncertainty maps, and recent work has developed approaches that incorporate the uncertainty in the variogram model.

Kriging in R
There are routines to do kriging in the following R libraries: geoR, fields, gstat, sgeostat, spatstat, spatdat

Choosing the locations $u_1, \ldots, u_n$
The desired set of locations depends on the goal of the analysis.
- Point prediction: locate points on a regular grid so that every prediction location will be highly correlated with a few observed data points.
- Average estimation: if the aim is to estimate the average value of $Z$ over the region $A$, then correlated points provide redundant information. Therefore you want the distance between pairs of points to be roughly the variogram range.

4. Spatio-temporal statistical modelling
Spatio-temporal statistical modelling is a real challenge because:
- The data sets are usually very large, and one ‘dimension’ may be richer than the other:
  - lots of stations, with limited measurement in time; or
  - few stations, monitored very frequently in time.
- We need to combine the techniques found in time series and spatial analysis.

Modelling spatial and temporal dependence
One major difficulty concerns how to jointly model
- correlation through time
- correlation over space

Is correlation through space constant over time, and correlation through time constant over space?
- If yes, then we have a ‘separable’ and stationary process.
- If not, then we need to build a space-time correlation structure (hard work).

General approach
The general approach to spatio-temporal models is through stochastic spatio-temporal processes $Z(u, t)$, where $u$ represents space and $t$ represents time, which may be a combination of a spatial process and a time series process.

Simplifying assumptions
- Stationarity: natural extension from time series and spatial models.
- Isotropy: natural extension from spatial models.
- Separability: the covariance function of $Z(u, t)$ can be split into space and time parts, i.e. $\mathrm{Cov}[Z(u_1, t_1), Z(u_2, t_2)] = C_u(u_1, u_2) \, C_T(t_1, t_2)$, which means we can use the tools we have met previously.
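One practical consequence of separability is that the full space-time covariance matrix is the Kronecker product of a spatial covariance matrix and a temporal covariance matrix. The sketch below builds it for two sites and two time points; the numerical values, and the ordering of observations (site by site, with time varying fastest), are assumptions made for the illustration.

```python
def separable_covariance(space_cov, time_cov):
    """Kronecker product of the spatial and temporal covariance matrices.

    Rows and columns are ordered by site, with time varying fastest, so
    the entry for (site i, time k) and (site j, time l) equals
    space_cov[i][j] * time_cov[k][l].
    """
    return [[cs * ct for cs in s_row for ct in t_row]
            for s_row in space_cov for t_row in time_cov]

# Hypothetical covariances for two sites and two time points
space_cov = [[1.0, 0.5],
             [0.5, 1.0]]
time_cov = [[2.0, 1.0],
            [1.0, 2.0]]
full = separable_covariance(space_cov, time_cov)
for row in full:
    print(row)
```

Only the two small matrices need to be modelled and stored, which is what makes the separable case tractable for large data sets.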

Spatial Analysis Across Time
At each time point a plane across space was fitted and Gaussian variograms of the residuals were computed. The averages of the variogram parameter estimates were used to obtain the spatial covariance matrix.

Non-separable processes
A much harder problem, and still the subject of much statistical research.