# Sampling and monitoring the environment Marian Scott Sept 2006.


Sampling and monitoring the environment Marian Scott Sept 2006

Outline
- Variation
- General sampling principles
- Methods of sampling
  - Simple random sampling
  - Stratified sampling
  - Systematic sampling
  - How many samples (power calculations)
- Spatial sampling
  - Grid, transect and cluster sampling

Variation Natural variation in the attribute of interest might be due to, for example, feeding habits if measuring sheep or rainfall patterns if measuring plants. There is also variation (uncertainty) due to analytical measurement techniques. Natural variation may well exceed the analytical uncertainty. Expect, therefore, that a series of replicate samples will vary; with enough replicates you may be able to define the distribution of the attribute of interest.

The normal distribution

The log normal distribution

What is statistical sampling? Statistical sampling is a process that allows inferences about properties of a large collection of things (commonly described as the population) to be made from observations on a relatively small number of individuals belonging to the population (the sample). In conducting statistical sampling, one is attempting to make inferences about the population.

Statistical sampling The use of valid statistical sampling techniques increases the chance that a set of specimens (the sample, in the collective sense) is collected in a manner that is representative of the population. Statistical sampling also allows a quantification of the precision with which inferences or conclusions can be drawn about the population.

Statistical sampling The issue of representativeness is important because of the variability that is characteristic of environmental measurements. Because of variability within the population, its description from an individual sample is imprecise, but this imprecision can be quantified and reduced by the choice of sampling design and sampling intensity (Peterson and Calvin, 1986).
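As a rough illustration of how sampling intensity affects precision, the R sketch below draws simple random samples of increasing size from a simulated population and reports the estimated standard error of the sample mean. The lognormal population, its parameters, the sample sizes and the helper name `srs_se` are all invented for illustration and do not come from the lecture.

```r
# Illustrative only: a simulated lognormal "population" of 10,000 units.
set.seed(1)
population <- rlnorm(10000, meanlog = 0, sdlog = 1)

# Estimated standard error of the mean from a simple random sample of size n,
# including the finite population correction.
srs_se <- function(pop, n) {
  s <- sample(pop, n)              # sampling without replacement
  fpc <- 1 - n / length(pop)       # finite population correction
  sqrt(fpc * var(s) / n)
}

sizes <- c(10, 40, 160, 640)
se <- sapply(sizes, function(n) srs_se(population, n))
print(data.frame(n = sizes, standard_error = se))
```

Under simple random sampling the standard error scales roughly with one over the square root of the sample size, so quadrupling the number of samples should about halve it, although any single run will vary.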

Good books The general sampling textbooks by Cochran (1977) and Thompson (1992), the environmental statistics textbook by Gilbert (1987), and papers by Anderson-Sprecher et al. (1994), Crepin and Johnson (1993), Peterson and Calvin (1986), and Stehman and Overton (1994).

Know what you are setting out to do before you start
- describing a characteristic of interest (usually the average),
- describing the magnitude of variability of a characteristic,
- describing spatial patterns of a characteristic,
- mapping the spatial distribution,
- quantifying contamination above a background or specified intervention level,
- detecting temporal or spatial trends,
- assessing human health or environmental impacts of specific facilities, or of events such as accidental releases,
- assessing compliance with regulations.

Rules Rule 1: specify the objective

Use your scientific knowledge
- the nature of the population, such as the physical or biological material of interest, its spatial extent, its temporal stability, and other important characteristics,
- the expected behaviour and environmental properties of the compound of interest in the population members,
- the sampling unit (i.e., individual sample or specimen),
- the expected pattern and magnitude of variability in the observations.

Rules Rule 1: specify the objective Rule 2: use your knowledge of the environmental context

Some examples

Other approaches
- Nearest neighbour methods, G-function: the empirical distribution function of event-to-event nearest neighbour distances, $G(\cdot)$.
- Nearest neighbour methods, F-function: the empirical distribution function of point-to-event nearest neighbour distances, $F(\cdot)$.
- Tests for CSR: for a homogeneous Poisson process (i.e. complete spatial randomness, CSR) with intensity $\lambda$, the theoretical distribution functions are $G(s) = F(s) = 1 - \exp(-\lambda \pi s^2)$.
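The comparison between an empirical G-function and its CSR reference curve can be sketched in base R. The example below is a minimal illustration, assuming a fixed number of uniformly scattered points on the unit square as a stand-in for CSR; the point count, seed and variable names are invented, and no edge correction is applied.

```r
# Illustrative pattern: n points placed uniformly on the unit square.
set.seed(42)
n <- 200
xy <- cbind(runif(n), runif(n))

# Event-to-event nearest neighbour distances.
d <- as.matrix(dist(xy))
diag(d) <- Inf                     # ignore each point's distance to itself
nn <- apply(d, 1, min)

# Empirical G and the theoretical CSR curve with intensity lambda = n / area.
G_hat <- ecdf(nn)
lambda <- n / 1
G_csr <- function(s) 1 - exp(-lambda * pi * s^2)

s <- seq(0, max(nn), length.out = 50)
plot(s, G_hat(s), type = "s", xlab = "s", ylab = "G(s)")
lines(s, G_csr(s), lty = 2)        # dashed line: CSR reference, no edge correction
```

An empirical curve lying well above the dashed line would suggest clustering, and one well below it would suggest regularity.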

Further models for Spatial point processes Poisson cluster process Inhomogeneous Poisson process Cox process Inhibition process

The problem of geostatistics Given observations at $n$ sites, $Z(s_1), \ldots, Z(s_n)$, what is our estimate of $Z(s_0)$, where $s_0$ is an unobserved site?

The autocovariance function The autocorrelation function
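For reference, under the usual assumption of a second-order stationary process with constant mean $\mu$ (an assumption made here, and the notation may differ from the lecture's), the standard textbook definitions for a separation vector $h$ are:

$$
C(h) = \operatorname{Cov}\bigl(Z(s), Z(s+h)\bigr) = E\bigl[(Z(s)-\mu)(Z(s+h)-\mu)\bigr],
\qquad
\rho(h) = \frac{C(h)}{C(0)} .
$$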

The (semi)variogram In terms of the autocovariance
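In the standard textbook form, assuming a second-order stationary process with autocovariance $C(h)$ (the notation is an assumption and may differ from the lecture's), the semivariogram and its relation to the autocovariance are:

$$
\gamma(h) = \tfrac{1}{2}\operatorname{Var}\bigl(Z(s+h) - Z(s)\bigr) = C(0) - C(h) .
$$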

Kriging Assuming that the mean is zero And the prediction error is …

Kriging in terms of the covariance function Assuming that the mean is zero Meteorologists and oceanographers know this as optimal interpolation (OI) or objective analysis. They usually work relative to a first guess.
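A sketch of the zero-mean (simple kriging) predictor in its usual textbook form, using notation assumed here rather than taken from the slides: let $\mathbf{Z} = (Z(s_1), \ldots, Z(s_n))^\top$, let $\Sigma$ be the $n \times n$ matrix with entries $\Sigma_{ij} = C(s_i - s_j)$, and let $\mathbf{c}_0$ be the vector with entries $C(s_i - s_0)$. Then

$$
\hat{Z}(s_0) = \sum_{i=1}^{n} \lambda_i Z(s_i) = \mathbf{c}_0^\top \Sigma^{-1} \mathbf{Z},
\qquad
E\bigl[(\hat{Z}(s_0) - Z(s_0))^2\bigr] = C(0) - \mathbf{c}_0^\top \Sigma^{-1} \mathbf{c}_0 .
$$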

A taxonomy of kriging
- Simple kriging: mean known
- Ordinary kriging: mean unknown
- Universal kriging: mean a linear function of covariates

There are others.

Kriging in R There are routines to do kriging in the R libraries geoR, fields, gstat, sgeostat, spatstat and spatdat.

Isotropy and Stationarity An isotropic process is one whose properties (in particular the variogram) do not vary with direction. A stationary process is one whose properties do not vary with location. See Richards definition of stationarity in time series.

Steps in a geostatistical analysis
1. Exploration
2. Estimating the variogram
3. Kriging

Estimating the variogram The obvious estimator is An alternative is the robust estimator
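The two estimators usually meant here are the classical method-of-moments estimator and the Cressie-Hawkins robust estimator; this is an assumption about the slide's missing formulas, although they correspond to the `'classical'` and `'modulus'` options of `variog` in geoR. Writing $N(h)$ for the set of pairs of sites separated by (approximately) $h$ and $|N(h)|$ for the number of such pairs:

$$
\hat{\gamma}(h) = \frac{1}{2\,|N(h)|} \sum_{(i,j) \in N(h)} \bigl(Z(s_i) - Z(s_j)\bigr)^2 ,
$$

$$
\bar{\gamma}(h) = \frac{\Bigl\{ \dfrac{1}{|N(h)|} \displaystyle\sum_{(i,j) \in N(h)} \bigl| Z(s_i) - Z(s_j) \bigr|^{1/2} \Bigr\}^{4}}{2\,\bigl(0.457 + 0.494 / |N(h)|\bigr)} .
$$

The square-root differences in the second form are less sensitive to a few outlying observations than the squared differences in the first.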

What does a generic variogram look like

Fit a variogram model Rather than look at the empirical variogram, we can fit a model. See table. The previous examples are a spherical and an exponential variogram.
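The two model shapes can be compared with a short base R sketch. The function and parameter names (`tau2` for the nugget, `sigma2` for the partial sill, `phi` for the range parameter) are illustrative, the parameterisation is one common convention and may differ from the table in the lecture, and the numerical values are arbitrary.

```r
# Exponential semivariogram: approaches the sill tau2 + sigma2 asymptotically.
exp_variogram <- function(h, tau2, sigma2, phi) {
  ifelse(h == 0, 0, tau2 + sigma2 * (1 - exp(-h / phi)))
}

# Spherical semivariogram: reaches the sill exactly at distance phi.
sph_variogram <- function(h, tau2, sigma2, phi) {
  rising <- tau2 + sigma2 * (1.5 * h / phi - 0.5 * (h / phi)^3)
  ifelse(h == 0, 0, ifelse(h < phi, rising, tau2 + sigma2))
}

# Plot both models with arbitrary illustrative parameters.
h <- seq(0, 100, length.out = 201)
plot(h, sph_variogram(h, 0.005, 0.04, 25), type = "l",
     xlab = "distance", ylab = "semivariance")
lines(h, exp_variogram(h, 0.005, 0.04, 25), lty = 2)   # dashed: exponential
```

With the same `phi`, the exponential model rises more slowly and only reaches about 95% of its partial sill at a distance of roughly `3 * phi`.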

Kriging Once we have an estimate of the variogram we can perform kriging.
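To make the step from variogram to prediction concrete, here is a minimal base R sketch of ordinary kriging at a single unobserved site. The five sites, their values, the exponential variogram and its parameters are all made up for illustration; this shows the textbook linear system, not the routine used by any particular R package.

```r
# Toy data: five sites with invented values.
coords <- cbind(x = c(0, 1, 0, 1, 0.5), y = c(0, 0, 1, 1, 0.2))
z <- c(1.2, 1.5, 1.1, 1.7, 1.3)
s0 <- c(0.5, 0.5)                          # unobserved site

# Assumed exponential semivariogram (no nugget) with invented parameters.
gamma_fun <- function(h, sigma2 = 0.05, phi = 1) sigma2 * (1 - exp(-h / phi))

n <- nrow(coords)
Gamma <- gamma_fun(as.matrix(dist(coords)))               # site-to-site semivariances
gamma0 <- gamma_fun(sqrt(colSums((t(coords) - s0)^2)))    # site-to-s0 semivariances

# Ordinary kriging system; the extra row and column hold the Lagrange
# multiplier that forces the weights to sum to one.
A <- rbind(cbind(Gamma, 1), c(rep(1, n), 0))
sol <- solve(A, c(gamma0, 1))
lambda <- sol[1:n]

z_hat <- sum(lambda * z)                         # kriging prediction at s0
krige_var <- sum(lambda * gamma0) + sol[n + 1]   # kriging variance at s0
print(c(prediction = z_hat, variance = krige_var))
```

The constraint that the weights sum to one is what lets ordinary kriging work without knowing the mean.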

An example Wave period in the North Atlantic measured by the radar altimeter on TOPEX/POSEIDON. We will concentrate on the area in red.

Exploratory Phase Either cut and paste from spat0.r or run source('spat0.r', echo=TRUE).

    load('EMS.rda')
    par(mfrow=c(2,2))
    plot(periodsmall$lon, periodsmall$lat, pch='.')
    hist(periodsmall$Tz)
    hist(log(periodsmall$Tz))
    # work with the log of the wave period
    periodsmall$lnTz <- log(periodsmall$Tz)
    # thin the data: keep every 100th observation
    tiny.period <- data.frame(periodsmall[seq(1, length(periodsmall$Tz), 100), ])
    plot(tiny.period$lon, tiny.period$lat, pch='.')

Exploratory phase - 2 (spat1.r)

    library(akima)
    par(mfrow=c(2,1))
    # interpolate the thinned log periods onto a regular grid
    int.Tz <- interp.old(tiny.period$lon, tiny.period$lat, tiny.period$lnTz)
    image(int.Tz, xlim=range(tiny.period$lon), ylim=range(tiny.period$lat))
    contour(int.Tz, add=TRUE)
    persp(int.Tz, xlim=range(tiny.period$lon), ylim=range(tiny.period$lat),
          xlab='lon', ylab='lat', zlab='log period', phi=35)

Estimating the variogram (spat2.r)

    library(geoR)
    tiny.geo <- as.geodata(tiny.period, coords.col=c(3,2), data.col=9)
    # create a variogram
    tiny.var <- variog(tiny.geo, estimator.type='classical')
    # the robust estimator
    tiny.var.robust <- variog(tiny.geo, estimator.type='modulus')
    par(mfrow=c(2,1))
    plot(tiny.var)
    plot(tiny.var.robust)

Fitting a variogram model (spat3.r)

    tiny.var.fit <- variofit(tiny.var.robust, ini.cov.pars=c(0.04, 25.0),
                             cov.model='exponential',
                             fix.nugget=FALSE, nugget=0.005)
    # overlay the fitted model on the last variogram plot
    lines(tiny.var.fit)

Kriging

    par(mfrow=c(1,1))
    # prediction grid: 21 x 21 locations
    loci <- expand.grid(seq(0,20)-50, seq(0,20)+25)
    # ordinary kriging ('ok') using the fitted variogram model
    kc <- krige.conv(tiny.geo, loc=loci,
                     krige=krige.control(type.krige='ok', obj.model=tiny.var.fit))
    image(kc, loc=loci)
    contour(kc, add=TRUE)

    persp(kc, loc=loci, phi=45, xlab='lon', ylab='lat', zlab='log Tz')

If you have time Try different forms for the variogram:
- gaussian
- spherical
- see ?cov.spatial for details