# Variance estimation for systematic designs in spatial surveys

Rachel Fewster, Department of Statistics, University of Auckland


Line transect sampling: a method of estimating the density of objects in a survey region.

Line transect sampling: density, $D$

$$D = \frac{\text{\# detections per unit area}}{p}$$

where $p$ is the probability of detection.

Estimate the variance of the ratio $D = (\text{\# detections per unit area}) / p$ by the Delta method: “squared CVs add”.

ENCOUNTER RATE: easy.

ENCOUNTER RATE VARIANCE: the largest and most difficult component, usually >70% of the total variance.
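As a rough illustration of “squared CVs add”, the sketch below combines the CV of the encounter rate with the CV of $p$. It assumes the two components are independent, and the input values are hypothetical, not figures from the talk.

```python
import math


def cv_density(cv_encounter_rate: float, cv_detection: float) -> float:
    """Delta-method CV of D = (encounter rate) / p.

    Assumes the two components are independent, so squared CVs add.
    """
    return math.sqrt(cv_encounter_rate ** 2 + cv_detection ** 2)


def encounter_rate_share(cv_encounter_rate: float, cv_detection: float) -> float:
    """Fraction of the total squared CV that comes from the encounter rate."""
    total = cv_encounter_rate ** 2 + cv_detection ** 2
    return cv_encounter_rate ** 2 / total


# Hypothetical inputs chosen only for illustration.
print(cv_density(0.15, 0.08))
print(encounter_rate_share(0.15, 0.08))
```

With these made-up inputs the encounter rate contributes well over 70% of the squared CV, which is the situation the slide describes.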

Encounter rate and its variance: the encounter rate estimates the mean number of detections per unit line length.

Inferential framework: which Var(n/L)?

Variance is defined over conceptual survey repeats.

First survey:
1. Animals from spatial p.d.f.
2. Select lines
3. Detect animals
4. Find n/L

This gives a value of n/L from the first survey.

Second survey:
1. Same animals, new positions
2. Select new lines
3. Detect new animals
4. Find new n/L

This gives a value of n/L from the second survey. Overall, the repeats give var(n/L) across the repeated surveys. This is our ENCOUNTER RATE VARIANCE.
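The conceptual repeats can be mimicked by simulation. The sketch below is a toy version under stated assumptions: a unit-square region in which only the x-coordinate matters, a hypothetical density rising in x, randomly placed vertical lines of unit length, and a constant detection probability inside each strip. All names and parameter values are illustrative.

```python
import random
import statistics


def one_survey(rng, n_animals=500, k_lines=10, half_width=0.01, p_detect=0.8):
    """Run one conceptual survey and return the encounter rate n/L."""
    # Same animals, new positions: draw x-coordinates from the spatial p.d.f.
    # (hypothetical choice: triangular density rising from x=0 to x=1).
    animals = [rng.triangular(0.0, 1.0, 1.0) for _ in range(n_animals)]
    # Select lines: vertical lines of unit length at random x-positions.
    lines = [rng.random() for _ in range(k_lines)]
    # Detect animals: an animal inside a strip is seen with probability p_detect.
    n = 0
    for line in lines:
        for x in animals:
            if abs(x - line) <= half_width and rng.random() < p_detect:
                n += 1
    total_length = k_lines * 1.0
    return n / total_length


def encounter_rate_variance(repeats=200, seed=42):
    """Variance of n/L across repeated conceptual surveys."""
    rng = random.Random(seed)
    rates = [one_survey(rng) for _ in range(repeats)]
    return statistics.variance(rates)


print(encounter_rate_variance())
```

In a real survey only one of these repeats is ever observed, which is why the variance has to be estimated from within a single survey.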

How to estimate Var(n/L)?

To estimate a variance, use repeated observations with the same variance.

Random-line estimator: makes no assumptions about the unknown distribution of objects; random variables are IID with respect to the design.
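One common form of random-line estimator treats the k lines as independent replicates and weights each line's deviation by its length. The sketch below implements that form; the slide does not show a formula, so this should be read as an assumed, standard version rather than the exact estimator used in the talk.

```python
def random_line_variance(counts, lengths):
    """Estimate var(n/L) treating the k lines as independent random placements.

    counts[i] is the number of detections n_i on line i; lengths[i] is its
    length l_i. Computes k / (L^2 (k - 1)) * sum_i l_i^2 (n_i / l_i - n / L)^2.
    """
    k = len(counts)
    if k < 2 or k != len(lengths):
        raise ValueError("need at least two lines, with one length per count")
    n, total_length = sum(counts), sum(lengths)
    rate = n / total_length
    spread = sum(l ** 2 * (c / l - rate) ** 2 for c, l in zip(counts, lengths))
    return k * spread / (total_length ** 2 * (k - 1))


# Hypothetical detections on three lines of equal length.
print(random_line_variance([1, 2, 3], [1.0, 1.0, 1.0]))
```

With equal line lengths this reduces to the usual sample variance of the per-line counts divided by k.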

Systematic survey designs: surveys usually use SYSTEMATIC transect lines instead of random lines. The grid has a random start-point.

Systematic lines give LOWER VARIANCE than random lines in trended populations, but the variance is HARD TO ESTIMATE. A systematic sample has NO REPETITION: it is a sample of size 1!
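The variance reduction in a trended population can be seen in a small simulation. The sketch below assumes a hypothetical linear trend across a one-dimensional region and compares the variance of the sample mean under random placement with that under an evenly spaced grid with a random start-point.

```python
import random
import statistics


def trend(x):
    """Hypothetical object density rising linearly across the region [0, 1]."""
    return x


def random_design(k, rng):
    """k line positions placed independently at random."""
    return [rng.random() for _ in range(k)]


def systematic_design(k, rng):
    """k evenly spaced line positions with a random start-point."""
    start = rng.random() / k
    return [start + i / k for i in range(k)]


def design_variance(design, k=10, repeats=2000, seed=1):
    """Variance of the mean sampled density over repeated design draws."""
    rng = random.Random(seed)
    means = [statistics.fmean(trend(x) for x in design(k, rng))
             for _ in range(repeats)]
    return statistics.pvariance(means)


var_random = design_variance(random_design)
var_systematic = design_variance(systematic_design)
print(var_random, var_systematic)
```

The true variance is only visible here because the design is redrawn many times; a single systematic sample offers no such repetition.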

Variance for systematic designs

There is no general design-unbiased variance estimator for data from a single systematic sample. Approaches to systematic variance estimation are:
1. Ignore the problem and use estimators for random lines (approach used to date)
2. Use some form of post-stratification (approach in Fewster et al., Biometrics, 2009)
3. Model the autocorrelation in the systematic sample
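Approach 2 can be sketched as follows: group neighbouring lines into strata, apply a random-line estimator within each stratum, and combine the results with weights based on stratum line length. The version below uses non-overlapping strata of consecutive lines and an assumed standard within-stratum estimator; it is a minimal illustration, not necessarily the estimator of Fewster et al. (2009).

```python
def random_line_variance(counts, lengths):
    """Random-line estimate of var(n/L) from per-line counts and lengths."""
    k = len(counts)
    n, total_length = sum(counts), sum(lengths)
    rate = n / total_length
    spread = sum(l ** 2 * (c / l - rate) ** 2 for c, l in zip(counts, lengths))
    return k * spread / (total_length ** 2 * (k - 1))


def poststratified_variance(counts, lengths, stratum_size=2):
    """Combine within-stratum variances over strata of consecutive lines.

    Uses var(n/L) = sum_h (L_h / L)^2 * var_h, where var_h is the
    random-line estimate within stratum h (non-overlapping strata).
    """
    if stratum_size < 2 or len(counts) % stratum_size != 0:
        raise ValueError("number of lines must be a multiple of stratum_size >= 2")
    total_length = sum(lengths)
    total = 0.0
    for start in range(0, len(counts), stratum_size):
        c = counts[start:start + stratum_size]
        l = lengths[start:start + stratum_size]
        total += (sum(l) / total_length) ** 2 * random_line_variance(c, l)
    return total


# Hypothetical strongly trended counts on eight lines of equal length.
counts = [1, 2, 3, 4, 5, 6, 7, 8]
lengths = [1.0] * 8
print(random_line_variance(counts, lengths))
print(poststratified_variance(counts, lengths))
```

In this made-up trended example the post-stratified estimate is much smaller than the random-line estimate, because neighbouring lines are compared only with each other.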

Stratified variance estimators: results

But the stratified estimators are still biased sometimes, e.g. with a high sampling fraction or population clustering. Can we do better…?

Historical note: many estimators for systematic designs originated in social statistics, where surveys are discrete. Correlation will clearly exist in the responses of neighbours, but modelling the correlation is hard!

But space is continuous! As a strip changes position very slightly, it still covers many of the same objects.

Idea:
1. Divide the region into hundreds of tiny ‘striplets’.
2. Allow the number of objects available in each striplet to be random variables $X_1, X_2, \ldots, X_J$.
3. The number of objects available in any full strip is the sum of the objects in the constituent striplets.

[Figure: number of objects available (0 to 4) against striplet position, showing the expected number of objects per striplet and the random number of objects per striplet, $X_1, X_2, \ldots, X_J \sim \text{Multinomial}$.]

Summing striplets gives the full strips. Full strip at this position: 10 objects. Full strip at the next position: 7 objects. Full strip at the next position: 8 objects, and so on.
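The striplet bookkeeping can be sketched in a few lines: draw striplet counts from a multinomial, then slide a window across them to get the objects available to a full strip at each position. The function names, the number of striplets per strip, and the example counts are all hypothetical; the counts were chosen so that the strip totals match the 10, 7, 8 example.

```python
import random


def draw_striplet_counts(expected, n_objects, rng):
    """Draw X_1..X_J ~ Multinomial(n_objects, expected_j / sum(expected))."""
    counts = [0] * len(expected)
    for j in rng.choices(range(len(expected)), weights=expected, k=n_objects):
        counts[j] += 1
    return counts


def full_strip_counts(striplet_counts, striplets_per_strip):
    """Objects available to the full strip starting at each striplet position."""
    m = striplets_per_strip
    return [sum(striplet_counts[s:s + m])
            for s in range(len(striplet_counts) - m + 1)]


# Hypothetical striplet counts; each full strip spans four striplets.
example = [4, 2, 3, 1, 1, 3]
print(full_strip_counts(example, 4))

# A random draw with hypothetical expected counts per striplet.
rng = random.Random(0)
print(draw_striplet_counts([1.0, 2.0, 3.0, 2.0, 1.0, 1.0], 10, rng))
```

Consecutive strips share most of their striplets, which is how the approach carries the correlation between neighbouring grid placements.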

Recap: we want the variance in the encounter rate, n/L, over:
1. Moving grid
2. Moving objects
3. Detections

Account for:
1. Large-scale trends
2. Small-scale noise

1. Trends in object density across the region

[Figure: observed number of detections per unit search area against x-coordinate; points correspond to observed transects.]

Fit a GAM to give a fitted object density for any search strip in the region. For any striplet $j$, we now have an expected number of objects available, $\mu_j$.
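The step from a fitted trend to an expected count per striplet can be sketched without a GAM library. The example below swaps in a Gaussian kernel smoother as a stand-in for the GAM (an assumption made purely so the code has no dependencies); the function names, bandwidth, and data are hypothetical.

```python
import math


def kernel_smooth(x_obs, y_obs, x_new, bandwidth):
    """Kernel-weighted average of y_obs at x_new (Gaussian kernel).

    Stand-in for the GAM fit described in the talk, not the GAM itself.
    """
    weights = [math.exp(-0.5 * ((x_new - x) / bandwidth) ** 2) for x in x_obs]
    return sum(w * y for w, y in zip(weights, y_obs)) / sum(weights)


def expected_striplet_counts(x_obs, density_obs, striplet_centres,
                             striplet_area, bandwidth):
    """Fitted density at each striplet centre times the striplet area."""
    return [striplet_area * kernel_smooth(x_obs, density_obs, c, bandwidth)
            for c in striplet_centres]


# Hypothetical observed densities on five transects, rising with x.
x_obs = [0.1, 0.3, 0.5, 0.7, 0.9]
density_obs = [10.0, 14.0, 20.0, 31.0, 40.0]
centres = [(j + 0.5) / 20 for j in range(20)]
print(expected_striplet_counts(x_obs, density_obs, centres, 0.05, 0.15))
```

Any smoother that returns a fitted density at an arbitrary x-coordinate could play the same role here.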

Account for:
1. Large-scale trends: the expected number of objects per striplet, $\mu_j$.
2. Small-scale noise: the random number of objects per striplet, $X_1, X_2, \ldots, X_J \sim \text{Multinomial}(N, \mu_j / N)$.

The striplet idea means we correctly model the autocorrelation between systematic grids.

[Figure: number of objects available (0 to 4) against striplet position.]

Recap: we want the variance in the encounter rate, n/L, over:
1. Moving grid
2. Moving objects
3. Detections

Variance in the number of objects available is taken care of (1 & 2). Variance in detections is Binomial given the number of objects available (1 & 2).

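Law of Total Variance, where $b$ is the grid placement:

$$\text{var}(n) = E_b\{\text{var}(n \mid b)\} + \text{var}_b\{E(n \mid b)\}$$

The mean and variance of the number of detections, $n$, given the grid placement, are all that is needed.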
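The conditional mean and variance can be worked out from the model as stated, in a hedged sketch. Assume $N$ objects fall into striplets with probabilities $\mu_j / N$ (so $\sum_j \mu_j = N$), each object in striplet $j$ is detected independently with probability $g_j(b)$ under grid placement $b$, and the $B$ possible placements are equally likely. The symbols $\mu_j$, $\pi_b$, and $\bar{n}$ are notation chosen here, and the result is a derivation under these assumptions rather than a transcription of the formula on the slide.

```latex
% Each object is detected with probability \pi_b, independently of the others:
\pi_b = \frac{1}{N} \sum_{j=1}^{J} \mu_j \, g_j(b),
\qquad n \mid b \sim \text{Binomial}(N, \pi_b).

% Conditional mean and variance given the grid placement b:
E(n \mid b) = N \pi_b = \sum_{j=1}^{J} \mu_j \, g_j(b),
\qquad \text{var}(n \mid b) = N \pi_b (1 - \pi_b).

% Law of total variance over B equally likely placements:
\text{var}(n) = \frac{1}{B} \sum_{b=1}^{B} N \pi_b (1 - \pi_b)
              + \frac{1}{B} \sum_{b=1}^{B} \left( N \pi_b - \bar{n} \right)^2,
\qquad \bar{n} = \frac{1}{B} \sum_{b=1}^{B} N \pi_b.

% For fixed total line length L:
\text{var}(n / L) = \text{var}(n) / L^2.
```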

Striplet variance estimator:

Simulation results

[Figure: two scenarios, one with 3 habitat types but no clustering, one with clustering included. Red lines give the correct answers.]

Ignoring the systematic design: appalling performance!

Post-stratification: improvement, but still clear bias.

Striplet method: huge improvement!

Spotted Hyena in the Serengeti

Short grass plains: prey herds congregate in the wet season. Long grass plains: unattractive in the wet season.

Wet season: non-territorial ‘commuters’ (n=186). Dry season: territorial residents (n=53).

Wet season: highly clustered.

cv(n/L) is:
- 17% ignoring systematic design
- 14% using post-stratification
- 7% using striplets!

Overall cv(D) is:
- 20% ignoring systematic design
- 17% using post-stratification
- 11% using striplets

The estimator matters!

Dry season: not clustered; small n.

cv(n/L) is:
- 15% ignoring systematic design
- 12% using post-stratification
- 13% using striplets

Overall cv(D) is:
- 23% ignoring systematic design
- 20% using post-stratification
- 21% using striplets

Not much difference.

In Revision, Biometrics

Conclusions

1. For a systematic design, variance estimators based on random lines are not adequate for trended or clustered populations. Variance can be highly overestimated.
2. Post-stratification improves estimation for trended populations, but is far from perfect.
3. The new ‘striplet’ method is a huge improvement in all line/strip situations trialled to date.

Striplet variance estimator:
- $B$ is the number of possible grids, in the discrete approximation
- $\mu_j$ is the fitted number of objects in striplet $j$
- $g_j(b)$ is the fitted P(detection) in striplet $j$ for grid placement $b$
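Given these three ingredients, a variance for n/L can be assembled from the law of total variance. The sketch below assumes the B grid placements are equally likely and that, given a placement, the number of detections is Binomial with N equal to the sum of the fitted striplet counts. It is an illustration consistent with the definitions above, not necessarily the exact estimator in the talk, and the inputs are hypothetical.

```python
def striplet_variance(mu, g, total_length):
    """Variance of n/L over B equally likely grid placements.

    mu[j]   : fitted expected number of objects in striplet j.
    g[b][j] : fitted detection probability in striplet j under placement b.
    """
    n_objects = sum(mu)
    n_grids = len(g)
    # E(n | b): expected detections under each grid placement.
    means = [sum(m * gj for m, gj in zip(mu, g_b)) for g_b in g]
    # var(n | b): Binomial variance N * pi_b * (1 - pi_b), with pi_b = mean / N.
    variances = [mean * (1 - mean / n_objects) for mean in means]
    grand_mean = sum(means) / n_grids
    within = sum(variances) / n_grids
    between = sum((m - grand_mean) ** 2 for m in means) / n_grids
    return (within + between) / total_length ** 2


# Hypothetical two-striplet region with two possible grid placements,
# each placement covering one striplet with certain detection.
mu = [1.0, 3.0]
g = [[1.0, 0.0], [0.0, 1.0]]
print(striplet_variance(mu, g, total_length=2.0))
```

In practice g[b][j] would be zero for striplets that no strip covers under placement b and would follow the fitted detection function elsewhere.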

Application: British Columbia multi-species marine survey (Williams & Thomas, JCRM 2008). Select the species with the greatest and least trends in encounter rate for illustration.

Greatest trend: Dall’s Porpoise. Highest encounter rates on short lines. Worst case!

Least trend: floating plastic garbage. No trend in encounter rate with line length.

Results, Dall’s Porpoise:
- Previously reported CV=31%: estimated using a Poisson-based estimator with no adjustment for systematic lines.
- Stratified methods, reported CV=19%: estimated using a design-based estimator with post-stratification and overlapping strata.

Results, floating garbage:
- Previously reported CV=15%.
- Stratified methods, reported CV=14%.

For an untrended population, there is little difference between the estimators.

Trends in object density across the region: fit a GAM to give a fitted object density for any search strip in the region. For any new grid placement, we now have an expected number of objects available for that grid.
