Appropriate Sampling Ann Abbott Rocky Mountain Research Station

Uploaded: 2017-08-24T20:29:20+00:00
Duration: PTM20S7

Appropriate Sampling Ann Abbott Rocky Mountain Research Station
Moscow Forestry Sciences Laboratory

Outline
What is Appropriate Sampling
How do we do it
Questions to Ask
Sampling Designs
Sample Size
Northern Region Protocol

What is appropriate sampling?
Meets the objectives of the research question
Representative of the population
Feasible
Cost effective

Appropriate Sampling is the RESULT of answering a series of questions
The answers to the appropriate questions lead naturally to the appropriate Sampling Design and Data Analysis/Interpretation

Questions to Ask
Objectives of the Research
Population for inferences
Sampling Units
Translation of the objectives
Preliminary Information
Choice of Sampling Design

Questions to Ask
Determination of Sample Size
Auxiliary Variables
Randomization
Recording Results
Analysis

Stating the Objectives
Have the objectives of the investigation been clearly and explicitly stated, along with the reasons for undertaking it? Have the objectives been translated into precise questions that sample determinations can be expected to answer?

Defining the Population
Has the population about which inferences are to be made been carefully defined? What constraints are to be placed on the population? Are the units to be measured or counted representative of the population? If not, what changes must be made to ensure representativeness?

Defining the Population
Is there a logical framework for the choice of sample units from the defined population? If not, what steps can be taken to impose a logical sampling frame?

Sampling Units
A successful sampling scheme involves the selection of an appropriate sampling unit, for example:
Quadrat
Leaves of a plant
Individual organism
Belt transect
Point

Sampling Units
Are the sampling units naturally defined? If not, how will they be defined? Is the number of sampling units finite? If it is finite, is the total number of units in the population large enough to ignore finite sampling considerations? Is the definition of the sampling units appropriate to the objectives?

Choice of Sampling Unit
Must be the unit upon which you wish to make inferences and estimates
Defined to be “nonoverlapping collections of elements from the population that cover the entire population”
Sampled without replacement

Choice of Sampling Unit
Point versus Area
Point samples allow inferences based on the number of observations in the sample; inferences are made on means or percentages from the sample observations
Area samples are generally measured with densities or percent of the area covered; inferences are made by extrapolating the sample density to the entire area

Choice of Sampling Unit: Point vs Area
Point samples are quicker and can potentially give more cost-effective coverage of the area
Area samples can yield more detailed information but may be more time consuming
Area sampling assumes that counts are made without error

Translating the Objectives
What exactly is to be estimated or tested? Are the required estimates proportions, totals, means, totals or means over sub-populations, or something else? Have blank data sheets been constructed? What is the smallest subset of data from which estimates are to be made? What precision is required of the estimates for the various subsets?

Preliminary Information
Is information about the population available that may be helpful in designing the sampling scheme? Are estimates of the likely variability available? Is a pilot study feasible or desirable? Are there any known factors that help stratify the population?

Variability
The variation that is inherent in soils data must be accounted for during the design phase of a soil sampling plan, including:
Sampling design
Data collection procedures
Analytical procedures
Data analysis

“One of the key characteristics of the soil system is its extreme variability.” (Mason 1992)
Researchers have long been cautioned about failing to consider the variability in soil sampling when dealing with any study of the soils system (e.g. Cline 1944).

Accounting for Variability
Ensuring that the sample adequately covers the entire population
Reporting variability estimates along with central tendency estimates
Reporting interval estimates

Use an interactive approach to balance the data quality needs and resources with designs that will either control variation, stratify to reduce variation, or reduce the influence of variation on the decision process

Precision, Bias and Accuracy
Precision is a measure of the reproducibility of measurements of a particular soil condition or constituent
The statistical techniques seen in soil sampling are designed to measure precision and not accuracy
Bias is a systematic error that contributes to the difference between the mean of a large number of test results and an accepted reference value

Precision, Bias and Accuracy
Accuracy is the correctness of the measurement and cannot be directly measured: it is the sum of precision and bias
Red dots are precise but biased
Blue dots are unbiased but imprecise
Yellow dots are biased and imprecise
Green dots are unbiased, precise and therefore accurate
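One common way to make the relationship between precision, bias and accuracy concrete is the decomposition mean squared error = variance + bias squared. That decomposition is an assumption brought in here for illustration and is not stated on the slides. The sketch below computes the three quantities for repeated measurements of a constituent with a known reference value; the function name and the example numbers are hypothetical.

```python
import math
from statistics import mean, pstdev


def precision_bias_rmse(measurements, reference):
    """Return (spread, bias, rmse) for repeated measurements.

    spread: population standard deviation (a measure of precision)
    bias:   mean of the measurements minus the accepted reference value
    rmse:   root mean squared error about the reference (overall accuracy)
    """
    bias = mean(measurements) - reference
    spread = pstdev(measurements)
    rmse = math.sqrt(mean([(m - reference) ** 2 for m in measurements]))
    return spread, bias, rmse


# Hypothetical example: tightly grouped readings that sit above the reference,
# i.e. "precise but biased" in the terms used on the slide.
spread, bias, rmse = precision_bias_rmse([9.0, 10.0, 11.0], reference=8.0)
print(spread, bias, rmse)
```

With these definitions, `rmse ** 2` equals `spread ** 2 + bias ** 2`, so a small spread alone does not guarantee a small error.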

Sampling Designs
Simple Random Sampling
Stratified Random Sampling
Systematic Random Sampling
Cluster Sampling
Other Combinations

Sampling Designs
Can the population as defined be broken into naturally occurring groups, where the grouping variable affects the measured variable(s)?
If it cannot, Simple Random Sampling or Systematic Sampling can be effective
If it can, Stratified Random Sampling or Cluster Sampling can be effective

Simple Random vs Systematic
Simple Random Sampling: used if there is a “list” (sampling frame) of all sampling units in the population; randomly selects from units on the list
Systematic Random Sampling: used if there is no sampling frame available but there is an estimate of the total number of sampling units; randomly selects the starting point

Simple Random Sampling
Used when there is inadequate information for developing a conceptual model for a site or for stratifying a site
Any sample in which the probabilities of selection are known
Sampling units are chosen by a method that uses chance to determine selection

Simple random sampling is the basis for all probability sampling techniques and is the point of reference from which modifications to increase sampling efficiency may be made
Alone, simple random sampling may not give the desired precision

Simple Random Sampling
Advantages: prior information about the population is not necessary; easy to perform, easy to analyze
Disadvantages: may not give desired precision; needs a sampling frame

Computation: Simple Random Sample, continuous variable
Mean, Variance, Confidence Interval, Sample Size
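The formulas from this slide did not survive in the transcript. As a stand-in, the sketch below uses the standard textbook estimators for a simple random sample of a continuous variable. It assumes a normal-approximation (z) interval rather than a t interval, and an optional finite population correction; the function names are hypothetical and this is not necessarily the exact form shown on the original slide.

```python
import math
from statistics import NormalDist, mean, variance


def srs_mean_ci(values, confidence=0.95, population_size=None):
    """Estimate the mean, the variance of the mean, and a confidence interval."""
    n = len(values)
    ybar = mean(values)
    s2 = variance(values)  # sample variance with an n - 1 denominator
    # Finite population correction; 1.0 when the population is treated as infinite.
    fpc = 1.0 if population_size is None else (population_size - n) / population_size
    var_mean = fpc * s2 / n
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    half_width = z * math.sqrt(var_mean)
    return ybar, var_mean, (ybar - half_width, ybar + half_width)


def srs_sample_size(sd, margin, confidence=0.95):
    """Sample size so the interval half-width is at most `margin` (infinite population)."""
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    return math.ceil((z * sd / margin) ** 2)


# Hypothetical forest floor depths (cm) at eight sampled points.
print(srs_mean_ci([2, 4, 4, 4, 5, 5, 7, 9]))
print(srs_sample_size(sd=10, margin=2))  # 97
```

The sample size function needs a standard deviation up front, which is why the slides ask whether variability estimates or a pilot study are available.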

Computation: Simple Random Sample, binomial variable
Proportion, Variance, Confidence Interval, Sample Size
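The slide's formulas are missing from the transcript, so the following is a sketch using the usual estimators for a proportion under simple random sampling. The n - 1 denominator in the variance and the normal-approximation interval are assumptions (some texts divide by n), and `proportion_estimate` is a hypothetical name.

```python
import math
from statistics import NormalDist


def proportion_estimate(indicator, confidence=0.95):
    """Estimate a proportion from 0/1 observations with a normal-approximation interval."""
    n = len(indicator)
    p_hat = sum(indicator) / n
    var_p = p_hat * (1 - p_hat) / (n - 1)  # estimated variance of p_hat
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    half_width = z * math.sqrt(var_p)
    # Clip the interval to the valid range for a proportion.
    return p_hat, var_p, (max(0.0, p_hat - half_width), min(1.0, p_hat + half_width))


# Hypothetical data: 30 of 100 points show the disturbance indicator.
print(proportion_estimate([1] * 30 + [0] * 70))
```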

Systematic Random Sampling
Attempts to provide better coverage of the study area or population than that provided by a simple random sample or a stratified random sample
Is a simple random sample based on spatial distribution over the site
Does not require a complete list of sampling units
Can give better coverage than a simple random sample

Requires some estimate of the total number of sampling units in the population
Required sample size must be calculated
Determine sampling interval between units
Randomly select starting point
Transect sampling is a version of Systematic Random Sampling

Collects samples in a regular pattern over the area in the investigation
Grid Line Transect
Orientation of grid or transect starting point should be randomly selected

Considerations
Sample size and population size estimates
Some knowledge of the population to avoid sampling along periodicities
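The steps listed above (estimate the population size, calculate the sample size, determine the interval, randomly select a starting point) can be sketched as follows. Units are represented by hypothetical integer positions 0 to N - 1 along a transect or list; the function name and the integer-division rule for the interval are assumptions made for illustration.

```python
import random


def systematic_sample(population_size, sample_size, rng=None):
    """Return positions of a 1-in-k systematic sample with a random start."""
    if not 0 < sample_size <= population_size:
        raise ValueError("sample_size must be between 1 and population_size")
    rng = rng or random.Random()
    interval = population_size // sample_size  # sampling interval k
    start = rng.randrange(interval)            # random start in [0, k)
    return [start + i * interval for i in range(sample_size)]


# Hypothetical transect with about 100 possible points and 10 required observations.
print(systematic_sample(100, 10, rng=random.Random(1)))
```

If the measured variable repeats with a period close to the interval (for example, evenly spaced skid trails), every selected unit can land on the same phase of the pattern, which is the periodicity concern raised above.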

Stratified vs Cluster Sampling
Used when the population can be broken into naturally occurring groups or segments
Stratified Random Sampling: when there is more variability among groups than within groups
Cluster Sampling: when there is more variability within groups than among
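Where pilot or historical data exist, the comparison of variability among and within groups can be checked directly. The sketch below splits the total variance into a between-group and a within-group part; the function name and the example data are hypothetical, and the decomposition uses population (divide by n) variances so that the two parts add up to the total.

```python
from statistics import mean, pvariance


def between_within_variance(groups):
    """Split total variance into between-group and within-group components.

    groups: dict mapping a group label to a list of measurements.
    """
    all_values = [v for values in groups.values() for v in values]
    n = len(all_values)
    grand_mean = mean(all_values)
    between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups.values()) / n
    within = sum(len(g) * pvariance(g) for g in groups.values()) / n
    return between, within


# Hypothetical data: two groups with very different means and little internal spread.
between, within = between_within_variance({"ridge": [1, 2, 3], "draw": [11, 12, 13]})
print(between, within)  # between is much larger, which favors stratification
```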

Stratified Random Sampling
Prior knowledge of the sampling area and information obtained from background data may be used to reduce the number of observations necessary to attain specified precision
Goal is to increase precision and control sources of variability in the data

Variability between strata must be larger than variability within strata for any benefit to be seen
Sampling within each stratum is done with a Simple Random Sample

Advantages: gives estimates for subgroups; can be more precise than Simple Random Sampling; can be more convenient to implement
Disadvantages: requires prior information about the population; more complicated computation

Computation: Stratified Random Sample, continuous variable
Mean, Variance, Confidence Interval
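The slide's formulas are not in the transcript. The sketch below uses the standard stratified estimators: the overall mean is the stratum means weighted by W_h = N_h / N, and the variance is the weighted sum of within-stratum variances of the mean with a finite population correction. The normal-approximation interval, the function name and the example strata are assumptions.

```python
import math
from statistics import NormalDist, mean, variance


def stratified_estimate(strata, confidence=0.95):
    """Stratified mean, its variance, and a confidence interval.

    strata: list of (N_h, values) pairs, where N_h is the number of units in
    the stratum and values is the simple random sample drawn from it.
    """
    total_units = sum(n_units for n_units, _ in strata)
    estimate = 0.0
    var_estimate = 0.0
    for n_units, values in strata:
        weight = n_units / total_units          # W_h
        n_h = len(values)
        fpc = 1 - n_h / n_units                 # finite population correction
        estimate += weight * mean(values)
        var_estimate += weight ** 2 * fpc * variance(values) / n_h
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    half_width = z * math.sqrt(var_estimate)
    return estimate, var_estimate, (estimate - half_width, estimate + half_width)


# Hypothetical strata: 100 units sampled 3 times, 300 units sampled 4 times.
print(stratified_estimate([(100, [10, 12, 14]), (300, [20, 22, 24, 26])]))
```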

Stratified Random Sample
Sample Size Calculation
Requires information about the relationship between the individuals among strata
Can be calculated by weighting strata
Can allocate sampling based on minimizing the variance for a fixed cost
Other ways to allocate sampling among strata (optimal, Neyman)
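Two of the allocation rules named above can be sketched briefly. Proportional allocation weights each stratum by its size; Neyman allocation weights by size times standard deviation, and dividing by the square root of a per-unit cost gives the usual cost-based optimal allocation. The function names, the example numbers and the choice to return unrounded values are assumptions for illustration.

```python
import math


def proportional_allocation(total_n, stratum_sizes):
    """Allocate total_n across strata in proportion to stratum size N_h."""
    total_units = sum(stratum_sizes)
    return [total_n * n_units / total_units for n_units in stratum_sizes]


def optimal_allocation(total_n, stratum_sizes, stratum_sds, unit_costs=None):
    """Allocate in proportion to N_h * S_h / sqrt(c_h).

    With unit_costs=None all costs are treated as equal, which is Neyman allocation.
    """
    if unit_costs is None:
        unit_costs = [1.0] * len(stratum_sizes)
    weights = [
        n_units * sd / math.sqrt(cost)
        for n_units, sd, cost in zip(stratum_sizes, stratum_sds, unit_costs)
    ]
    total_weight = sum(weights)
    return [total_n * w / total_weight for w in weights]


# Hypothetical strata: a small variable stratum and a large uniform one.
print(proportional_allocation(100, [100, 300]))      # [25.0, 75.0]
print(optimal_allocation(100, [100, 300], [6, 2]))   # [50.0, 50.0]
```

The results are left unrounded; in practice each stratum needs a whole number of units, and at least two if a within-stratum variance is to be estimated.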

Post Stratification
Can be used when stratification is appropriate for some key variable, but cannot be done until after the sample is selected
Often appropriate when a simple random sample is not properly balanced according to major groupings

Post Stratification: Mean, Variance
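The formulas for this slide are missing from the transcript. For reference, a commonly used textbook form of the post-stratified mean and its approximate variance is given below. The notation is an assumption rather than the presenter's: W_h = N_h / N is the known stratum weight, \bar{y}_h and s_h^2 are the sample mean and variance of the units that fell in stratum h, n is the total sample size, and f = n / N.

```latex
\bar{y}_{\mathrm{post}} = \sum_{h=1}^{L} W_h \, \bar{y}_h ,
\qquad
\widehat{V}\!\left(\bar{y}_{\mathrm{post}}\right) \approx
  \frac{1 - f}{n} \sum_{h=1}^{L} W_h \, s_h^2
  \;+\; \frac{1}{n^2} \sum_{h=1}^{L} (1 - W_h) \, s_h^2
```

The first term matches proportional stratified sampling; the second reflects the extra variability from not controlling the stratum sample sizes in advance.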

Cluster Sampling
Used when there is more variability within groups than among
Groups are randomly sampled
Units within groups are sampled: can sample every element within the group, or can take a second random sample within the group
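The two options described above (measure every element in each selected group, or take a second random sample within it) can be sketched as one selection routine. The data structure, function name and example clusters are hypothetical.

```python
import random


def cluster_sample(clusters, n_clusters, n_within=None, rng=None):
    """Randomly select clusters, then units within each selected cluster.

    clusters: dict mapping a cluster label to a list of its units.
    n_within: None to take every unit in a selected cluster (one-stage),
              or the number of units to draw at random within it (two-stage).
    """
    rng = rng or random.Random()
    chosen = rng.sample(sorted(clusters), n_clusters)  # first stage: clusters
    sample = {}
    for label in chosen:
        units = clusters[label]
        if n_within is None or n_within >= len(units):
            sample[label] = list(units)                # take every element
        else:
            sample[label] = rng.sample(units, n_within)  # second-stage sample
    return sample


# Hypothetical clusters of plot identifiers.
plots = {"A": [1, 2, 3], "B": [4, 5, 6, 7], "C": [8, 9]}
print(cluster_sample(plots, 2, rng=random.Random(0)))
print(cluster_sample(plots, 2, n_within=1, rng=random.Random(0)))
```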

Questions to Ask in Choosing a Sampling Design
If there is no information on population groupings, will simple random or systematic random sampling better meet the objectives? Is Simple Random Sampling likely to be effective? If not, have the reasons for not using simple random sampling been clearly stated?

If Systematic Random Sampling is chosen, what interval will separate units? Is there a likelihood that the interval will coincide with periodicity in the data? If so, what steps will be taken to avoid the resulting bias in the estimates?

If there is a grouping in the population, will stratification improve the precision of the estimates? Has the efficiency of the stratification been calculated? What is the basis of the stratification? How will the sampling units be allocated?

If there is a grouping in the population, is there an advantage to cluster sampling? Has the efficiency been calculated?

Sample Size
Calculated based on variability (standard deviation) within the population and desired precision of the estimate (confidence level)
Simple Random Sample and Systematic Random Sample
Stratified Random Sample (complicated) but still needs variance

Sample Size Specific sampling design considerations
Systematic: is the sample size required to uniformly cover the population consistent with the expected precision? Stratified: has the efficiency of the stratification been tested in reducing the sample size or in obtaining the largest number of observations from the part of the population of greatest interest?

Sample Size Sample design considerations, continued
Multistage: has the efficiency of various combinations of sample units at different stages been tested? Cluster: has the efficiency of various size clusters been tested?

Sample Size Cost considerations
Must the number of observations be modified to account for variation in cost in different parts of the sampling procedure? If so, can the design be improved for better cost efficiency?

Randomization
Have the sampling units been selected by an explicit randomization procedure? Has the randomization procedure been documented? Were any constraints correctly applied?
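One way to make the randomization both explicit and documented is to draw the sample from a seeded generator and keep the seed with the results, so the selection can be reproduced and audited. This is a sketch under that assumption; the record layout and function name are hypothetical.

```python
import random


def select_units(frame, n, seed):
    """Draw n units without replacement and return a record documenting the draw."""
    units = list(frame)
    rng = random.Random(seed)  # the seed fixes the selection so it can be re-created
    chosen = rng.sample(units, n)
    return {
        "seed": seed,
        "frame_size": len(units),
        "sample_size": n,
        "selected": sorted(chosen),
    }


# Hypothetical frame of 100 numbered sampling units.
record = select_units(range(1, 101), 10, seed=2017)
print(record)
```

Re-running the call with the same frame, sample size and seed returns the same units, which lets a reviewer confirm that the documented procedure was the one actually used.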

Sample Design Example: Northern Region Soil Monitoring Protocol
Goal: Develop an easy, cost effective and statistically defensible monitoring protocol for disturbance
Stating the objectives: Characterize the activity area in terms of management related disturbance

Northern Region Protocol
Defining the population: All possible ‘points’ within the Activity Area
Sampling units defined as ‘points’
Infinite number of possible ‘points’ in the population so finite sample correction factors do not need to be used

Sample Design
Stratification may be desirable but variability information is unavailable
Simple Random Sampling may not give the appropriate coverage
Systematic Random Sampling (Transect) was chosen to give the best coverage of the area

Northern Region Protocol: Translating the objectives
What exactly will be measured or tested:
Forest floor depth
Forest floor missing
Topsoil displacement
Mixed topsoil/subsoil
Erosion
Rutting (3 depths)
Burning (light, moderate, severe)
Compaction (3 depths)
Platy/massive structure (3 depths)
5 forest floor variables

Translating the objectives: Blank data tables

Translating the objectives: what exactly is to be estimated or tested? What proportion of points in the sample have the characteristic of the indicator variable? What is the variability associated with the proportion?

Translating the objectives: What is the required precision of the estimates?
Confidence intervals within ± 5% of the estimate
Confidence levels are determined by the line officer, allow choice from 70% to 95%
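To show how the choice of confidence level drives the field effort, the sketch below computes the sample size needed for a proportion at each level. It assumes that ± 5% means an absolute half-width of 0.05, that the worst-case proportion p = 0.5 is used, and that the normal approximation applies; none of these details are confirmed by the slides, and the function name is hypothetical.

```python
import math
from statistics import NormalDist


def sample_size_for_proportion(margin=0.05, confidence=0.95, p=0.5):
    """Observations needed so the interval half-width is at most `margin`."""
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    return math.ceil(z ** 2 * p * (1 - p) / margin ** 2)


for level in (0.70, 0.80, 0.90, 0.95):
    n = sample_size_for_proportion(confidence=level)
    print(f"{level:.0%} confidence: n = {n}")
# 70% confidence: n = 108
# 80% confidence: n = 165
# 90% confidence: n = 271
# 95% confidence: n = 385
```

Moving from 70% to 95% confidence more than triples the number of points under these assumptions, which is a concrete basis for the line officer's choice.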

What preliminary information is available about the activity area?
Approximate size and shape
Harvest history
Variability estimates generally unknown
A pilot would be best
Stratification potential exists

Problem:
Variability estimates are unavailable
Pilot studies are not feasible due to time and cost constraints
Statistically valid sample sizes are required

Sequential Sampling
An alternative approach to sampling in which the sample size is not fixed in advance
Observations are collected individually or in small batches
After each observation or batch, the data are examined to determine whether or not a decision may be made from the accumulated data

Sequential Sampling
Combines data collection and data analysis into a single process or sampling plan
Can considerably reduce the sample size requirements and data processing overheads

Sequential Sampling
Best used in situations where classification of a population is useful and where the emphasis is on decision making
In the simplest and most frequently used form, it is used to make binary classifications but can be extended into other applications

Use a combination of sequential and systematic random sampling to obtain variability information for sample size calculation at the same sampling visit as the full data collection trip
First 30 observations are used to calculate initial sample size, then sample size is continually updated as sampling continues

Indicator variables are binomial (0,1)
The sampling distribution of a binomial proportion is approximately normal when n ≥ 30 (a common rule of thumb)
Attractive for sampling since the maximum variability can be computed: p(1 − p) is largest at p = 0.5

When sampling is complete for the activity area, the estimates and confidence intervals are computed
Protocol allows field crews to sample an activity area with a statistically valid sample size in one visit
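The sequential idea described above (use the first 30 observations for an initial sample size, then keep updating it) can be sketched for a single 0/1 indicator. This is an illustration under stated assumptions, not the protocol's actual stopping rule: it uses the normal-approximation sample size for a proportion with an absolute margin of 0.05, and the function names and default confidence level are hypothetical.

```python
import math
from statistics import NormalDist


def required_n(p_hat, margin=0.05, confidence=0.90):
    """Sample size implied by the current estimate of the proportion."""
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    return math.ceil(z ** 2 * p_hat * (1 - p_hat) / margin ** 2)


def sequential_sample(observations, initial_n=30, margin=0.05, confidence=0.90):
    """Consume 0/1 observations until the running sample size is sufficient.

    Returns (n_used, p_hat, target_reached).
    """
    count = 0
    hits = 0
    for obs in observations:
        count += 1
        hits += obs
        if count >= initial_n:
            p_hat = hits / count
            # Stop once the points collected meet the updated requirement.
            if count >= required_n(p_hat, margin, confidence):
                return count, p_hat, True
    p_hat = hits / count if count else float("nan")
    return count, p_hat, False


# Hypothetical transect where every other point shows the indicator.
print(sequential_sample([0, 1] * 200))
```

One known weakness of this simple rule is that an indicator that never appears in the first 30 points gives an estimated variance of zero and stops immediately, so a real protocol would need a floor on the sample size or a more conservative variance estimate.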
