Western Ecology Division Web Page:


1 Western Ecology Division Web Page: http://www.epa.gov/NHEERL/ARM
Statistical Perspective on the Design and Analysis of Natural Resource Monitoring Programs
Anthony (Tony) R. Olsen
USEPA NHEERL Western Ecology Division, Corvallis, Oregon
(541) Web Page:

2 National Water Quality Monitoring Council: Monitoring Framework
Applies to all natural resource monitoring
Monitoring pieces must be designed and implemented to fit together
View as an information system
National monitoring requires a consistent framework
Reference: Water Resources IMPACT, September 2003 issue

The framework is not perfect, but it is a good one. It may lack clarity on which activities belong to which cog. Note that the cogs were deliberately drawn without sharp boundaries, which reflects the reality of designing a monitoring program.

3 Impact Article Contributors
Framework Overview: Charles A. Peters, Robert C. Ward
The Three C’s: Abby Markowitz, Linda T. Green, James Laine
Monitoring Objectives: Charles S. Spooner, Gail E. Mallard
Monitoring Design: Tony Olsen, Dale M. Robertson
Data Collection: Franceska Wilde, Herbert J. Brass, Jerry Diamond
Data Management: Karen S. Klima, Kenneth J. Lanfear, Ellen McCarron
Assess and Interpret: Dennis R. Helsel, Lindsay M. Griffith
Report Results: Mary Ambrose, Abby Markowitz, Charles Job

Several of the authors are also members of the Monitoring Council. The framework is a product of the council; the IMPACT articles were based on work by the council.

4 Monitoring Program Weaknesses
Monitoring results are not directly tied to management decision making (monitoring objectives)
Results are neither timely nor communicated to key audiences (convey results)
Objectives for monitoring are not clearly and precisely stated and understood (monitoring objectives)
Monitoring program is not viewed or implemented as an information system (data management, overall)
Monitoring measurement protocols, survey design, and statistical analysis become scientifically out of date (field/lab methods, monitoring design, data analysis/assessment)

Why monitor if you are not making a difference? Failure may show up as a monitoring program that is not adequately funded. Monitoring programs must continually improve and so must plan for change. Note that the only cog not represented is field operations! Individuals designing a monitoring program typically spend more time on the details of field/lab methods, monitoring design, and data assessment. These areas are not among the major weaknesses unless they are not kept current. The first three weaknesses are on the list because they are HARD and involve more than science. The following slides comment on the statistical perspective cog by cog.

5 Communicate, Coordinate, Collaborate
Communication: process of conveying information; can be one way or an exchange of thoughts, messages, or ideas
Coordination: process in which two or more participants link, harmonize, or synchronize interaction and activities
Collaboration: process in which two or more participants work collectively to deal with issues that they cannot solve individually; partnerships, alliances, teams

There are two aspects: one internal to a monitoring program and one external.
Internal: multiple disciplines are essential in the development of a monitoring program. Learning how to communicate across these disciplines is neither easy nor straightforward. Terminology differs and cultures differ. Disciplines include decision-makers, management, natural resource specialists, statistical specialists, field operation specialists, laboratory specialists, data management specialists, and communication specialists. In some cases a monitoring program must serve multiple “masters” or decision-makers, and the Three C’s need to be effective with all of them.
External: rarely does a monitoring program exist in a vacuum. Other organizations may monitor the same resource: federal, state, and local groups. Cost-effective monitoring across all these organizations places a premium on the Three C’s.
A statistical perspective is part of each of the six cogs; consequently a statistical specialist has much to contribute to effective communication, coordination, and collaboration.

6 Statistical perspective is key
Framework cogs shown: Convey results and findings; Develop monitoring objectives; Design monitoring program
Kish (1965): “The survey objectives should determine the sample design; but the determination is actually a two-way process…”
Initially objectives are stated as common-sense statements; the challenge is to transform them into quantitative questions that can be conveyed precisely to the intended audience.
Statistical perspective is key
  Know whether a monitoring design can answer the question
  Know when the question is not precise enough (multiple interpretations)

It is useful to think about what tables and graphics will be used to convey results and findings in a report. Do this assuming that all sites can be visited (i.e., ignore the issue of site selection). It is also useful to think about what the monitoring program would present to decision-makers when it has only minutes to make the presentation. This helps focus on high-priority objectives. Data from a monitoring program will be summarized and used in many ways that were not anticipated, and that is fine. What we don’t want to do is lose sight of the key information required by decision-makers and the public.
Developing monitoring objectives must be done in the context of institutional constraints, which is a major reason it is difficult. The major institutional constraint is funding. It makes a difference whether you are designing a $1M, $10M, or $100M national monitoring program.

7 Identify Monitoring Objectives
Objectives determine the monitoring design (yet monitoring design constrains objectives that can be met)
Usual to have multiple objectives
  Precise statements are required
  Objectives must be prioritized
  Objectives compete for samples
Statistical perspective helps identify
  Target population
  Subpopulations that require estimates
  Elements of target population
  Potential sample frames
  Variables to be measured
  Impact of precision required

Abstract concepts from survey design and experimental design are useful when discussing monitoring objectives.

8 Example: From Question to Objective
What is the quality of waters in the United States?
What is the quality of streams with flowing water during summer in the U.S.?
What is the biological quality of streams with flowing water during summer in the U.S.?
How many km of streams with flowing water during the summer are impaired, non-impaired, or marginally-impaired within the U.S.?
  How is impairment determined?
  What is meant by summer?
  Are constructed channels, canals, and effluent-dominated streams included?

We want the objective to require a quantitative answer that can be used by management. The process of making the objectives quantitative and specific identifies the target population, the subpopulations of interest, the variables to be measured, and the types of statistical summaries required to convey results.

9 Key components of monitoring design
Framework cogs shown: Develop monitoring objectives; Design monitoring program; Collect field and lab data
Key components of monitoring design
  What will be monitored? (target population)
  What will be measured? (variables or indicators)
  When and how frequently will the measurements be taken? (temporal design)
  Where will the measurements be taken? (site selection)
Statistical perspective
  Sample frame and target population
  Survey design

“What will be measured” is not considered in this presentation.

10 What is a Target Population?
Target population denotes the ecological resource for which information is wanted
  Requires a clear, precise definition
  Must be understandable to users
  Field crews must be able to determine if a particular site is in the target population
  More difficult to define than most expect
  Includes a definition of the elements that make up the target population

All forests within the United States is a target population for the FIA monitoring program. How do we define a forest? Does it include urban forested areas? Transition zones exist between what is clearly forested land and what is clearly rangeland. When are we no longer in forest?
When is a lake a lake of interest? The National Lake Fish Tissue survey gave what we thought was a very precise definition of a lake. One waterbody selected in the sample was a tertiary treatment pond associated with sewage treatment. It met all requirements of being a lake, including having a permanent fish population. The pond was fenced off from public access. The monitoring objective was to estimate concentrations of contaminants in fish tissue, and the results would be used from a wildlife consumption perspective as well as a human consumption perspective. Since it was known that wildlife ate fish from the pond, the decision was made that it was part of the target population. In the course of discussion, it also became known that humans climbed the fence to catch fish!

11 Target Population, Sample Frame, Sampled Population
We Live in an Imperfect World…
Target Population denotes the ecological resource about which estimates are needed
  Defined conceptually using written text
  Must be sufficiently specific that it is clear whether an aquatic resource is included or not
  Must define the elements of the target population
  Elements may be any location in an estuary, a lake, any point on a stream network, or a 6th-field hydrologic unit
Sample Frame is a physical representation of the target population
  It consists of sample units that are potential members of the sample
  Extent (size) of the frame is obtained by summation
  Sample Frames are almost never exact representations of the target population
  Undercoverage: Sample Frame may not include some Target Population elements
  Overcoverage: Sample Frame may contain non-target elements, e.g., mis-identified sample units
The sample is the subset of Sample Frame sample units selected for sampling
  Probability survey designs are used to select the subset
  One design: Generalized Random Tessellation Stratified (GRTS) designs
  May include stratification, unequal probability selection, and panels for surveys over time
  Sample Frame overcoverage and sample site field access problems are addressed by including an Oversample
  Sampling Units are the Sites selected for sampling
Sampled Population is a conceptual population that is a subset of the intersection of the Target Population and the Sample Frame
  It excludes the portion of the Target Population within the Sample Frame that could not be sampled (conceptually) due to access problems, lost samples, or other reasons a sample could not be collected
  It does not include the part of the Sample Frame that is determined not to be elements of the Target Population
Population Estimates are based on All Sites Evaluated for potential field sampling
Site Evaluation and Field Sampling: categorizing each Sample Site is critical information
  Target Sampled: site information collected
  Landowner Denial: some landowners deny field crew access
  Physical Barrier: site cannot physically be reached within protocols or for safety reasons
  Target Not-Sampled: sample lost, field season ended before site could be sampled, and many other reasons
  Non-Target: site is not an element of the target population
Population Extent estimates are made for each Site Category
  Provides an estimate of the Target Population extent if it is not known
  Provides an estimate of the Sample Frame overcoverage extent, i.e., how much too large the Frame is
  Provides an estimate of the percent of the Target Population that is expected to have landowners deny access
Population Status estimates are based on Target Sampled Sites (e.g., IBI score, non-impairment)
Potential Corrections and Assumptions
  Non-Target Site information can be used to determine if the Sample Frame should be improved (mis-identified units, extent)
  Estimates based on Target Sampled sites apply to the Sampled Population with no additional assumptions
  Estimates based on Target Sampled sites can apply to the portion of the Target Population within the Sample Frame ONLY IF we assume that the Access Denied, Target Not-Sampled, etc., sites occurred randomly and independently of site characteristics
  Estimates for the Target Population require NOT ONLY the assumptions above BUT ALSO that portions of the Target Population not included in the Sample Frame have the same characteristics as the Sampled Population
Ideally, the cyan, yellow, and gray squares would overlap completely
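
The extent estimates by site category described on this slide can be sketched in a few lines. This is a minimal illustration, not the program's actual procedure: the site list, the category labels, and the weights (km of stream each evaluated site represents) are hypothetical, and it assumes that Landowner Denial and Physical Barrier sites are known to be target.

```python
from collections import defaultdict

# Hypothetical evaluated sites: (site category, design weight in km represented).
sites = [
    ("target_sampled", 12.0),
    ("target_sampled", 8.0),
    ("landowner_denial", 10.0),
    ("physical_barrier", 5.0),
    ("target_not_sampled", 5.0),
    ("non_target", 10.0),
]

# Assumption: every category except non_target counts as target population.
TARGET_CATEGORIES = {
    "target_sampled",
    "landowner_denial",
    "physical_barrier",
    "target_not_sampled",
}


def extent_by_category(sites):
    """Sum the design weights within each site category."""
    totals = defaultdict(float)
    for category, weight in sites:
        totals[category] += weight
    return dict(totals)


def summarize(sites):
    """Extent estimates for the frame, the target population, and overcoverage."""
    extent = extent_by_category(sites)
    frame_extent = sum(extent.values())
    target_extent = sum(v for k, v in extent.items() if k in TARGET_CATEGORIES)
    return {
        "extent": extent,
        "frame_extent": frame_extent,
        "target_extent": target_extent,
        "overcoverage": extent.get("non_target", 0.0),
        "pct_target_denied": 100.0 * extent.get("landowner_denial", 0.0) / target_extent,
    }


summary = summarize(sites)
print(summary)
```

Because every evaluated site contributes its weight, the category totals add up to the frame extent, which is why all evaluated sites, not just the sampled ones, need to be recorded.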

12 Basic Spatial Survey Designs
Simple Random Sample
Systematic Sample
  Regular grid
  Regular spacing on linear resource
Spatially Balanced Sample
  Combination of simple random and systematic characteristics
  Guarantees that every possible sample is distributed across the resource (target population)
  Generalized Random Tessellation Stratified (GRTS) design
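
The first two basic designs can be contrasted on a linear resource with a short sketch. It assumes a hypothetical 100 km stream and a sample of 10 points; it illustrates only simple random and systematic selection, not GRTS, and the function names are illustrative.

```python
import random


def simple_random_sample(length_km, n, rng):
    """n independent uniform locations along the resource, sorted for display."""
    return sorted(rng.uniform(0.0, length_km) for _ in range(n))


def systematic_sample(length_km, n, rng):
    """Regular spacing on a linear resource with a single random start."""
    spacing = length_km / n
    start = rng.uniform(0.0, spacing)
    return [start + i * spacing for i in range(n)]


def gaps(points):
    """Distances between consecutive sample locations."""
    return [b - a for a, b in zip(points, points[1:])]


rng = random.Random(2004)  # fixed seed so the sketch is repeatable
srs = simple_random_sample(100.0, 10, rng)
sys_pts = systematic_sample(100.0, 10, rng)

print("simple random gaps:", [round(g, 1) for g in gaps(srs)])
print("systematic gaps:   ", [round(g, 1) for g in gaps(sys_pts)])
```

The systematic gaps are all equal by construction, while the simple random gaps vary from draw to draw, which is the uneven coverage a spatially balanced design aims to avoid while keeping a random selection.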

13 Generalized Random Tessellation Stratified (GRTS) Survey Designs
Probability sample producing design-based estimators and variance estimators
Give another option to simple random sample and systematic sample designs
  Simple random samples tend to “clump”
  Systematic samples are difficult to implement for aquatic resources and do not have a design-based variance estimator
Emphasize spatial balance
  Every replication of the sample exhibits a spatial density pattern that closely mimics the spatial density pattern of the resource

GRTS designs were developed to meet the needs of monitoring programs. This is an example of how long-term associations between statisticians and monitoring professionals can lead to new developments and to improvements in the cost-effectiveness and scientific defensibility of monitoring programs.

14 Spatial Balance: 256 points

15 Why aren’t Basic Designs Sufficient?
Monitoring objectives may include requirements that basic designs can’t address efficiently
  Estimates for particular subpopulations require greater sampling effort
  Administrative restrictions and operational costs
Natural resource in study region makes basic designs inefficient
  Resource may be known to be restricted to particular subregions
Complex designs may be more cost-effective

16 Example of a spatially-balanced design with unequal probability of selection based on lake area

17 Example of a spatially-balanced survey design with (1) unequal probability selection based on overlapping subpopulations that were of interest, and (2) nested subsampling of indicators related to increased cost to acquire some indicators.

18 National Wadeable Stream Assessment 2004
Spatially balanced survey design for streams with (1) stratification by states, (2) unequal probability selection based on Omernik ecoregions and stream Strahler order, and (3) intensive study regions for specific subpopulations.

19 Survey Design & Response Design
Survey design is the process of selecting sites at which a response will be determined
  Which sites will be visited (spatial component)
  Which monitoring season sites will be visited (temporal component, panel design)
Response design is the process of obtaining a response at a site
  When the site is to be visited within a monitoring season
    A single index period visit during a monitoring season
    Multiple visits during a monitoring season: e.g., monthly, quarterly
  Field plot design
  Process of going from basic field measurements to indicators

Monitoring design can be thought of in terms of a survey design and a response design. The split is somewhat artificial but is useful in the design process.

20 Statistical perspective
Framework cogs shown: Collect field and lab data; Design monitoring program; Compile and manage data
Components
  Field methods (response design)
  Laboratory methods
  Measurement quality objectives
  Quality assurance & quality control
  Logistical plan and gaining site access
Statistical perspective
  Experimental designs to determine cost-effective and scientifically-defensible response designs
  Statistical quality control
  Methods for minimizing non-response

Many statistical aspects are involved in obtaining results that require laboratory analyses. Chemical laboratory operations have a long history of using statistics, including interlaboratory comparisons. Laboratories that count biological samples or process other physical samples also incorporate statistics into their operations. The success of these operations bears directly on data quality and data comparability for the monitoring program. The next slides give a few examples concerning response designs and then one example on non-response.

21 Response Design - Fish
[Figure: Species Richness (% of Maximum) versus Stream Length (Channel Width Units)]
1-pass sampling
Spread effort throughout reach
Get “common” species in approx. relative abundance

EMAP conducted species-area studies to determine the length of stream reach that must be sampled to capture fish community information. It is not cost-effective (or even possible) to get 100% detection of all species in a large-scale stream monitoring program. The question to address is what can be done that is scientifically defensible AND still provides the information required by decision-makers.

22 Response Design: Benthos and Periphyton
[Figure: stream reach diagram showing flow direction, lettered transects, the X-site, and sampling points L = Left, C = Center, R = Right]
Distance between transects = 4 times mean wetted width at X-site
Total reach length = 40 times mean wetted width at X-site (minimum = 150 m)
First point (transect B) determined at random
Subsequent points assigned in order L, C, R

EMAP conducted research studies on the field plot design for collecting biological and physical habitat information to determine whether the signal-to-noise ratio (variation across plots divided by repeat measurement variation) was strong enough. One basic principle: many small samples that are composited are better than a single large sample, because they provide better coverage of the variation in habitats.
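
The reach layout arithmetic on this slide can be sketched as follows. The sketch assumes 11 transects labelled A through K (inferred from 40 widths of reach at 4 widths per interval, not stated on the slide), and it assumes that when the 150 m minimum applies the transects are spread evenly over the reach. Function names are illustrative.

```python
import random


def reach_layout(mean_wetted_width_m, transects="ABCDEFGHIJK", min_reach_m=150.0):
    """Return (total reach length, distance between transects) in metres."""
    reach_length = max(40.0 * mean_wetted_width_m, min_reach_m)
    spacing = reach_length / (len(transects) - 1)
    return reach_length, spacing


def assign_points(transects, rng):
    """Random position at the first transect, then cycle L, C, R in order."""
    positions = ["L", "C", "R"]
    start = rng.randrange(len(positions))
    return {t: positions[(start + i) % 3] for i, t in enumerate(transects)}


print(reach_layout(5.0))   # wide stream: 40 x 5 m
print(reach_layout(2.0))   # narrow stream: the 150 m minimum applies
print(assign_points("BCDEFGHIJK", random.Random(1)))
```

For a 5 m wide stream the sketch gives a 200 m reach with 20 m between transects, which matches the 4-times-width spacing stated on the slide.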

23 US Forest Service Forest Inventory and Analysis (FIA) Plot Design
FIA and FHM conducted studies on the cost-efficiency of the field plot design. Although it might be desirable to count and measure all vegetation within the large annular plot, it is not cost-effective to do so.

24 Minimizing Non-Response: Prairie Potholes
Landowner contact procedure
  Obtain owner list from USDA ASCS local office
  Cover letter explained study, random selection, measurements, walking access only, timing/duration of visit, and offer to honor special owner conditions
  Consent form
  Map identifying wetland to be visited
  Telephone contact 2-4 weeks after letter; list of FAQs and answers provided to personnel
  Second letter 5-6 weeks after initial letter
Access rates: private land 42%
25% of access approvals required multiple contacts
From Lesser et al (2001)

Sampling prairie pothole wetlands is a contentious issue since it typically involves gaining access to them through agricultural fields. In aquatic surveys of streams, EMAP has found that how landowners are approached also makes a considerable difference. Survey researchers have considerable experience in how to contact survey participants and increase the probability of their responding. Natural resource monitoring is beginning to take advantage of this knowledge.

25 Components: compile and manage data
Framework cogs shown: Collect field and lab data; Assess and interpret data; Compile and manage data
Components: compile and manage data
  Data entry
  Database development
  Metadata
  Data preservation
  Data discovery and retrieval
Statistical perspective
  Statistical QA checking of data
  Access to auxiliary data used in statistical analyses
  Influence retrieval and database design
  Importance of preserving design information

Checking data can be a time-consuming and complex process.
  The first type of check is applied to each data item individually: typically, confirming that the value is a legitimate response by comparing it with the acceptable responses or the acceptable range.
  The second type of check is across sites for each data item: are outliers present relative to the rest of the sites?
  The third type of check is multivariate, across sites and selected data items: % landcover can’t add up to more than 100%, etc.
Auxiliary data:
  1. A summary of sample frame characteristics may be necessary in the weight adjustment process.
  2. Known values may be used to constrain estimates: total land area by county, etc.
  3. Remote sensing information may be used in regression estimators.
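
The three types of data checks in the notes can be sketched as small functions. The values and thresholds are hypothetical, and the median absolute deviation rule used for the across-site check is one common choice made here for illustration, not a method named in the presentation.

```python
from statistics import median


def range_check(value, low, high):
    """Check 1: a single data item falls within its acceptable range."""
    return low <= value <= high


def outlier_sites(values_by_site, k=3.0):
    """Check 2: flag sites far from the rest, using median absolute deviation."""
    values = list(values_by_site.values())
    med = median(values)
    mad = median([abs(v - med) for v in values])
    if mad == 0:
        return []  # no spread to compare against
    return [s for s, v in values_by_site.items() if abs(v - med) / mad > k]


def landcover_sum_ok(percent_by_class, tol=0.5):
    """Check 3: percent landcover classes must not sum to more than 100%."""
    return sum(percent_by_class.values()) <= 100.0 + tol


# Hypothetical pH values by site; s5 looks like a data entry error.
ph = {"s1": 7.1, "s2": 6.9, "s3": 7.3, "s4": 7.0, "s5": 12.5}

print(range_check(ph["s5"], 0.0, 14.0))   # passes the range check
print(outlier_sites(ph))                  # but stands out across sites
print(landcover_sum_ok({"forest": 70.0, "agriculture": 30.0, "urban": 10.0}))
```

The example shows why the checks are layered: a value can be a legitimate response on its own and still be flagged when compared with the other sites.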

26 Examples
STORET modified to include survey design information
  Which sites are part of the survey design
  Stratification, weights, cluster variables
USGS NWIS and NWISWeb
  NWIS focus on input/site specific (typically time focus)
  NWISWeb focus on retrieval (typically spatial focus)
National Resource Inventory’s analysis database
  Statistical imputation for missing data
  Statistical creation of pseudo points
  Incorporate known information
  Link across years for consistency
  Determination of single weight for each point in database
  Results in a single, consistent database for 1982, 1987, 1992, … that is easy to use for statistical analyses

27 Derived indicator construction Statistical Design-Based estimation
Framework cogs shown: Convey results and findings; Compile and manage data; Assess and interpret data
Derived indicator construction
Statistical design-based estimation
Statistical model-assisted and model-based estimation
  Inference to unsampled locations
  Spatial pattern inference (or where is the map!)
Semi-empirical modeling
  Incorporating physical processes
  Empirical statistical modeling using auxiliary data

Derived indicators: tree volume (FIA), soil erosion (NRCS), Index of Biotic Integrity (EMAP), nutrient loads (NAWQA). Considerable statistical modeling is typically a key part of developing derived indicators.

28 Design-Based Population Estimation
Scientific inference from sample to population
Minimizes assumptions used in the inference process
Relies on principles of statistical survey design and analysis
Natural resource programs that use design-based estimation
  Forest Inventory & Analysis
  National Resource Inventory
  National Wetland Status and Trends Program
  National Agricultural Statistics Service programs
  Environmental Monitoring and Assessment Program (EMAP)
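
As a minimal sketch of design-based estimation, the example below weights each sampled site by the extent it represents (the inverse of its inclusion probability) and estimates the impaired extent and the impaired proportion. The sites, weights, and condition classes are hypothetical, and variance estimation is deliberately left out.

```python
def weighted_estimates(sites, condition):
    """Weighted extent and weighted (ratio) proportion for one condition class.

    sites: list of (weight, condition_class); weight is the stream km a site
    represents, i.e. the inverse of its inclusion probability.
    """
    total_extent = sum(w for w, _ in sites)
    condition_extent = sum(w for w, c in sites if c == condition)
    return condition_extent, condition_extent / total_extent


# Hypothetical target-sampled sites from an unequal probability design.
sites = [
    (10.0, "impaired"),
    (10.0, "non-impaired"),
    (30.0, "non-impaired"),
    (50.0, "impaired"),
]

impaired_km, impaired_prop = weighted_estimates(sites, "impaired")
unweighted_prop = sum(1 for _, c in sites if c == "impaired") / len(sites)

print(impaired_km, impaired_prop, unweighted_prop)
```

Here half of the sites are impaired, but they represent 60 of the 100 km, so ignoring the design weights would misstate the answer. Inference rests on the selection probabilities rather than on a model of the resource.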

29 Estimating Site Occupancy Rates
MacKenzie, D. I., J. D. Nichols, G. B. Lachman, S. Droege, J. A. Royle, and C. A. Langtimm. Estimating site occupancy rates when detection probabilities are less than one. Ecology 83:
Likelihood-based model for estimation
  Assumes simple random sample of sites
  Similar to closed-population, mark-recapture model
  Estimate probability of occupancy and probability of detection
Estimation with complex survey designs
  Maximum likelihood as before
  Likelihood must incorporate survey design
    Stratification
    Unequal probability of selection
    Cluster sample
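
A minimal sketch of the simplest occupancy likelihood may help: constant occupancy probability psi, constant detection probability p, the same number of visits at every site, and a simple random sample of sites. The detection counts are hypothetical, and the coarse grid search stands in for a proper optimizer. It does not cover the complex survey design case listed on the slide.

```python
import math


def log_likelihood(psi, p, detections, visits):
    """Log-likelihood for constant occupancy (psi) and detection (p).

    detections: number of visits with a detection at each site.
    A site with no detections is either occupied but missed on every visit,
    or unoccupied.
    """
    ll = 0.0
    for d in detections:
        if d > 0:
            ll += math.log(psi) + d * math.log(p) + (visits - d) * math.log(1.0 - p)
        else:
            ll += math.log(psi * (1.0 - p) ** visits + (1.0 - psi))
    return ll


def grid_mle(detections, visits, steps=99):
    """Maximize the log-likelihood over a grid of (psi, p) values in (0, 1)."""
    grid = [(i + 1) / (steps + 1) for i in range(steps)]
    return max(
        ((psi, p) for psi in grid for p in grid),
        key=lambda t: log_likelihood(t[0], t[1], detections, visits),
    )


# Hypothetical data: 10 sites, 3 visits each, detections per site.
detections = [0, 0, 0, 0, 1, 2, 1, 3, 0, 2]
psi_hat, p_hat = grid_mle(detections, visits=3)
naive = sum(1 for d in detections if d > 0) / len(detections)

print(psi_hat, p_hat, naive)
```

The estimated occupancy is at least as large as the naive proportion of sites with a detection, because some occupied sites are expected to be missed on every visit.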

30

31 Statistical Model-assisted and Model-based Estimation
Improve estimation based on complete coverage information
Adjustment for non-response at the site level
Small area estimation
Spatially-explicit model of probability of impairment
  Identification of “hot spots” likely to be impaired
Will see increased use of these techniques

32 Semi-parametric Small Area Model: Northeast Lakes ANC prediction for HUCs
J. Breidt, J. Opsomer, G. Ranalli, G. Claeskens, G. Kauermann Colorado State University STARMAP research program sponsored by USEPA STAR grants program

33 Semi-empirical Modeling: USGS NAWQA
Estimated nitrogen export (kg/km2/yr) for watersheds of the conterminous United States. SPARROW relates in-stream water-quality measurements to spatially referenced characteristics of watersheds, including contaminant sources and factors influencing terrestrial and stream transport. The model empirically estimates the origin and fate of contaminants in streams, and quantifies uncertainties in these estimates based on model coefficient error and unexplained variability in the observed data.

34 Questions to ask when planning reporting
Framework cogs shown: Convey results and findings; Develop monitoring objectives; Assess and interpret data
Questions to ask when planning reporting
  What is the objective for communicating the results?
  Who is the target audience?
  What message do we want to convey?
  What formats will be used to convey the message?
Statistical perspective
  Clarity on scope of inference: target population/sampled population
  Reporting of precision for results
  Construction of statistical tables
  Construction of presentation-quality statistical graphics

Two of the identified weaknesses of monitoring programs are (1) not being tied to decision-making and (2) not conveying results in a timely manner to key audiences. Monitoring program staff are dominated by scientists who are experienced in writing journal articles. They have less experience in communicating to other audiences. It is also time to remember that the Monitoring Framework is an information system, not just a data generation system.
Producing timely reports requires pre-planning. NASS is an example of an organization with a history of timely production of reports based on survey results. It takes a “production mentality” to make that happen.
A statistical perspective can contribute to effective and timely reporting. The first two items have to do with the scientific defensibility of the report: it must be clear what natural resource the results apply to and how well the results are known. When a table or a graph is constructed, it should be constructed to communicate a specific message. We no longer need to use tables as a data storage device; that can be done in other ways. Why should a table of results by state list the states in alphabetical order? Readers will find it difficult to see a message!
A number of statisticians have contributed to our knowledge of how to construct tables and graphics for presentation purposes, among them Tufte, Wainer, Cleveland, and Carr. For example, Wainer notes: “tables are for communication, not archiving” and “tables can be improved by making them more graphical”.
The next slides show a few statistical graphics that have been used by monitoring programs. They are not always the best choice in all circumstances, but they have been useful.

35 IBI Results Geographic Distribution
[Figure: map of IBI results by region. Labels as extracted: North-Central Appalachians, Western Appalachians, Ridge and Blue Ridge Valleys, (Insufficient Data)]

Use of the “stop light” color model: Red = Poor, Yellow = Fair, Green = Good. Note the clear identification of the portion of streams where there are insufficient data to make an assessment.

36 Estuarine Stressor Comparison
[Figure: Benthic invertebrate condition for the Louisianian Province and the Virginian Province, with stressors associated with degraded condition]
Benthic invertebrate condition (values as extracted, one pair per province): Degraded 18 ± 8% and Undegraded 82 ± 8%; Degraded 30 ± 6% and Undegraded 70 ± 6%
Stressors Associated with Degraded Condition (figure labels in extraction order): Unknown 10%, Unknown 39%, Low Dissolved Oxygen 49%, Habitat 14%, Metals 42%, Low D.O., Contaminants 10%, Contaminants 28%, Both 2%, Toxicity 4%

This is an attempt at displaying associations. It gives only half of the picture: the same stressor percentages may hold for the undegraded portion of the resource. The graphic does make the point that although the percent degraded is not all that different between the two provinces, the stressors are very different.

37 MAIA: Relative Risk Assessment
“The risk of Poor BMI is 1.6 times greater in streams with Poor SED than in streams with OK SED.”

This graphic focuses on the impact of stressors on biotic indicators in streams. It uses relative risk as one way of communicating the impact of a stressor on the biology, which is the same language used to communicate stressor risks to humans: replace the stressors with those related to heart disease and the statement reads the same way. It gives information on how big a problem a stressor is (extent) and on the increase in risk to the biota when the stressor is present.
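
The relative risk statement can be illustrated with a small calculation. The stream lengths below are hypothetical and were chosen only so that the ratio comes out at 1.6; in a survey they would be weighted extent estimates rather than raw counts.

```python
def relative_risk(poor_with_stressor, total_with_stressor,
                  poor_without_stressor, total_without_stressor):
    """Risk of poor biological condition with the stressor present,
    divided by the risk with the stressor absent."""
    risk_with = poor_with_stressor / total_with_stressor
    risk_without = poor_without_stressor / total_without_stressor
    return risk_with / risk_without


# Hypothetical estimated stream km:
# 100 km with Poor SED, of which 40 km have Poor BMI;
# 200 km with OK SED, of which 50 km have Poor BMI.
rr = relative_risk(40.0, 100.0, 50.0, 200.0)
print(round(rr, 2))
```

A relative risk of 1 would mean the stressor makes no difference to the chance of poor biological condition; values above 1 indicate elevated risk where the stressor is present.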

38 West Virginia has defined 25 hydrologic units covering the state and reports on the condition of streams by these units. This graphic is a RowPlot of population estimates for the mean and the standard deviation of the stream condition index, with 95% confidence intervals. Note that it is sorted from good to poor mean scores. What is missing is an additional column of micromaps showing the spatial pattern.

39 Same survey. Now the focus is on presenting summaries of the distribution using boxplots. Provides information on how variable the scores are within a reporting unit.

40 This is an illustration of reporting not only the overall index of stream condition (WVSCI in first column) but the seven components that go into the overall index. Again results sorted by mean. This plot is more technical – would be of interest to those familiar with the construction of the overall index – scientists.

41 Spatial display of survey results from NRI.

42 Lake Ontario Diporeia Spatial Pattern
Example where statistical spatial analyses were used to estimate a surface over an area. It falls short of being a good presentation graphic.

43 Summary
Statistical perspective is pervasive throughout the monitoring framework
Substantial advances in incorporating statistical perspective in monitoring have been made during the last half of the 20th century
Many statistical methodology advances are on the horizon that will improve monitoring cost-effectiveness
Incorporating a statistical perspective throughout the development and implementation of a monitoring program is no longer optional; it is essential

44 When will natural resource monitoring programs be able to support an Environmental Statistics Briefing Room?

