Estimating American pre-settlement forest parameters from General Land Office data: problems and some solutions. Jim Bouldin, University of California.

Presentation transcript:

Estimating American pre-settlement forest parameters from General Land Office data: problems and some solutions.
Jim Bouldin, University of California at Davis

Purpose: To identify problems in the analysis of a critical historical tree data set used to estimate pre-settlement forest conditions, and to provide some solutions to them.

2. Abstract: The General Land Office (GLO) Bearing Tree (BT) data have long been used to quantify pre-settlement forest conditions in many places in the United States. The number of such studies now approaches 100, and their frequency and complexity have been increasing in the last decade. However, almost all studies have an unknown reliability due to one or more potentially severe analytical problems. These include (1) mistaking taxon-based relative frequencies of BT data for relative density, (2) the effect of non-random forest spatial patterns on density estimation using traditional methods, (3) missing tree data, and (4) several possible types of surveyor bias in the selection of trees. Because the issues are almost entirely unrecognized, methods to address them are likewise nearly non-existent. Cumulatively, these problems greatly restrict the full potential of the data and weaken confidence in the estimates from existing studies. Here I present evidence for these issues from simulations and from empirical data from Minnesota, and offer some analytical solutions to them.

3. Background: The GLO collected bearing tree (BT) data as part of its land survey work in the 19th century. The data are highly valuable: they form the potential basis for the historical estimation of such critical variables as species composition, tree density, basal area, canopy cover, biomass, and carbon content, at spatial scales from less than 1 km to multi-state regions. This survey was conducted over ~70% of the land area of the United States (Fig 1).
Bearing tree data were collected at sample points (survey corners) spaced at 0.8 km intervals along a 1.6 km grid (Fig 2). At each point, surveyors sampled one tree (if available) in each of the four quadrants or two halves defined by the survey line(s) passing through the sample point (Fig 3). Surveyors recorded the taxon, diameter, distance and bearing for each tree. However, several analytical problems arise because surveyors were not unbiased in their choice of trees, and these problems are compounded by inappropriate statistical models and techniques.

4. Methods: I used simulation analysis, empirical analysis of Minnesota BT data, and a comprehensive literature review to identify problems and solutions in BT data use. Spatial simulations were performed in ArcView using over 430,000 trees in 8,500+ circular plots. Spatial aggregation was simulated by varying the densities between plots by factors ranging from 4 to 1024, and comparing the densities calculated with two different estimators. Empirical analysis involved the full set of 350,000+ trees from the Minnesota BT data set, focusing on statistical analysis of: (1) taxon-based differences in mean point-to-corner distance, (2) the effect of ignoring missing trees, and (3) surveyor bias for particular bearings or quadrants in their choice of trees. Several new analytical techniques were developed and employed.

5. Results:

(1) Estimates of relative density by taxon are the most common use of GLO BT data. However, virtually all extant estimates assume that relative BT frequency equals relative density, which in turn assumes that mean point-to-tree distances are equal across all taxa. Across the state of Minnesota, this assumption is shown here to be grossly incorrect, leading to large errors for many taxa (Fig 4). The observed differences in mean distance between certain taxa are real and not artifacts of surveyor preference for certain taxa over others (data not presented).
(2) The commonly used point-centered-quarter (PCQ) estimator (Cottam and Curtis, 1956) under-estimates absolute density in direct proportion to the variation in density (spatial non-randomness) across a study area (Table 1). Other estimators are far more accurate in such situations (e.g. Morisita 1957).

(3) No existing study attempts to account for missing trees (one or more trees not recorded by surveyors in given quadrants at a sample point). This can lead to large over-estimates of density (Warde and Petranka, 1981). Table 2 reveals that such errors are evident both statewide and, especially, in individual Ecological Subsections in Minnesota (Fig 6).

(4) Surveyors in Minnesota displayed location-based biases in their choice of BTs, and had different selection criteria depending on the type of corner (Fig. 5). Both invalidate the common assumption that surveyors chose the Q1 through Q4 trees at section corners and the Q1 and Q2 trees at quarter corners, leading to underestimates of actual density (Eq. 1, Table 3).

6. Conclusions:

(1) The assumptions and analytical methods used to model pre-settlement forest structure and composition from GLO BT data in existing studies are shown to very likely be in error or invalid to varying degrees, based on simulated and large-scale empirical data.

(2) The attention paid to methodological issues and to the appropriateness of analytical assumptions in existing studies is very poor. None have used the techniques described here to assess the validity of parameter estimates. Because of this, existing studies, which have recently become increasingly frequent and complex, have an unknown reliability.

(3) The results shown here are conservative in the sense that synergistic interactions between the various error sources, which could magnify the effects shown here, are not assessed.

Figure 1. The GLO survey area (+ Alaska, not shown).
Figure 2. The GLO survey grid arrangement. BT data are collected at both section and quarter corners.
Figure 3. The traditional PCQ sampling method, measuring to the closest (Qn) tree in each of four quadrants at a section corner.
Figure 4. Relative densities of major taxa, calculated as traditional relative frequency (top), vs distance-weighted relative frequency (= relative density, bottom).
Figure 5. Frequency of BT occurrence by 10 degree angle sector for quarter corners (top) and section corners (bottom), for a common species, tamarack, Larix laricina. Peaks in the bottom graph correspond to quadrant middles and valleys correspond to quadrant edges.
Figure 6. Ecological Subsections in Minnesota (Tables 2 and 3).
Table 1. Tree densities, and percent errors from actual density, under a range of spatial aggregations, when calculated with the traditional vs more robust methods.
Table 2. Missing tree fractions and overestimation percentages, by subsection, due to not accounting for missing trees in density estimates (Warde and Petranka, 1981). The statewide average over-estimate was 216%.
Table 3. Underestimation factors (C, Eq. 1), due to surveyor bias toward BTs in quadrant middles at section corners.
Equation 1. Formula for C, the density under-estimation factor caused by preference for certain bearings within angle sectors (Fig. 5, Table 3).

References:
Cottam, G. & Curtis, J. T. (1956) The use of distance measures in phytosociological sampling. Ecology 37,
Morisita, M. (1957) A new method for the estimation of density by the spacing method, applicable to non-randomly distributed populations. Physiology and Ecology-Kyoto 7, (Japanese). (English translation (1960) Division of Range Management and Wildlife Habitat Research, U.S. Forest Service, M-5123).
Warde, W. & Petranka, J. W. (1981) A correction factor table for missing point-center quarter data. Ecology 62(2),
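
Illustration for Result (1): the poster does not state how the distance weighting behind Figure 4 is computed, so the following is only a minimal sketch of one plausible scheme. It assumes that the density in the stands where a taxon occurs scales as 1 / (mean distance)^2, as in the Cottam and Curtis relation, and therefore weights each taxon's BT count by the inverse of its squared mean distance. The function names, taxa, counts and distances are all hypothetical.

```python
def relative_frequency(counts):
    """Traditional estimate: each taxon's share of all bearing trees."""
    total = sum(counts.values())
    return {taxon: n / total for taxon, n in counts.items()}


def distance_weighted_relative_density(counts, mean_distances):
    """Assumed weighting (not taken from the poster): count / mean_distance**2."""
    weights = {
        taxon: counts[taxon] / mean_distances[taxon] ** 2 for taxon in counts
    }
    total = sum(weights.values())
    return {taxon: w / total for taxon, w in weights.items()}


if __name__ == "__main__":
    # Hypothetical numbers: equal BT counts, different mean distances (m).
    counts = {"tamarack": 100, "oak": 100}
    mean_distances = {"tamarack": 5.0, "oak": 10.0}
    print(relative_frequency(counts))
    print(distance_weighted_relative_density(counts, mean_distances))
```

With equal counts, relative frequency gives each taxon 50%, while the assumed weighting shifts the share toward the taxon recorded at shorter distances. That is the kind of discrepancy Result (1) describes, although the size of the effect in Figure 4 depends on the author's actual method.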
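
Illustration for Result (2): a small simulation can show why an estimator based on the overall mean distance loses accuracy when density varies between plots. This is a sketch under stated assumptions, not a reproduction of the ArcView simulations. Trees are assumed to be locally Poisson within each plot, half of the sample points fall in sparse plots and half in dense plots of equal area, and the second estimator uses a point-wise form often attributed to Morisita (1957) in later literature; the poster does not say which variant was used. All function names and parameter values are hypothetical.

```python
import math
import random


def quadrant_distances(local_density, rng, q=4):
    """Nearest-tree distance in each of q equal sectors around one point.

    Under a local Poisson pattern, the sector area out to the nearest tree,
    (pi * r**2) / q, is exponentially distributed with rate local_density.
    """
    return [math.sqrt(q * rng.expovariate(local_density) / math.pi)
            for _ in range(q)]


def cottam_curtis_density(points):
    """Traditional PCQ estimate: 1 / (mean of all distances)**2."""
    distances = [r for point in points for r in point]
    mean_r = sum(distances) / len(distances)
    return 1.0 / mean_r ** 2


def morisita_density(points):
    """Point-wise estimate: ((q - 1) / (pi * n)) * sum_i (q / sum_j r_ij**2)."""
    n = len(points)
    q = len(points[0])
    total = sum(q / sum(r * r for r in point) for point in points)
    return (q - 1) / (math.pi * n) * total


def simulate(aggregation_factor, n_points=4000, base_density=0.01, seed=1):
    """Return (true, traditional, point-wise) densities in trees per m^2."""
    rng = random.Random(seed)
    densities = [base_density, base_density * aggregation_factor]
    # Alternate sample points between sparse and dense plots.
    points = [quadrant_distances(densities[i % 2], rng) for i in range(n_points)]
    true_density = sum(densities) / 2
    return true_density, cottam_curtis_density(points), morisita_density(points)


if __name__ == "__main__":
    for factor in (1, 4, 64, 1024):
        true_d, cc, mo = simulate(factor)
        print(f"x{factor:<5} true={true_d:.4f}  PCQ={cc:.4f}  point-wise={mo:.4f}")
```

In this setup the traditional estimate should fall further below the true density as the aggregation factor grows, while the point-wise estimate should stay close to it, which is qualitatively the pattern summarized in Table 1.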
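
Illustration for Result (3): the sketch below shows, under simplifying assumptions, how ignoring quadrants without a recorded tree inflates a density estimate. It assumes a homogeneous Poisson forest and that a tree goes unrecorded whenever the nearest tree in a quadrant lies beyond a fixed search distance. That truncation rule and all parameter values are hypothetical, and the Warde and Petranka (1981) correction table itself is not reproduced here.

```python
import math
import random


def naive_density_with_missing(true_density, max_distance,
                               n_points=5000, seed=2):
    """Return (missing fraction, density estimated from recorded trees only)."""
    rng = random.Random(seed)
    recorded = []
    n_quadrants = 0
    for _ in range(n_points):
        for _ in range(4):
            n_quadrants += 1
            # Nearest-tree distance in a 90 degree quadrant under Poisson.
            r = math.sqrt(4 * rng.expovariate(true_density) / math.pi)
            if r <= max_distance:
                recorded.append(r)  # beyond max_distance: treated as missing
    missing_fraction = 1 - len(recorded) / n_quadrants
    mean_r = sum(recorded) / len(recorded)
    return missing_fraction, 1.0 / mean_r ** 2


if __name__ == "__main__":
    true_density = 0.01  # trees per m^2 (hypothetical)
    for limit in (math.inf, 20.0, 10.0):
        frac, naive = naive_density_with_missing(true_density, limit)
        print(f"limit={limit:>5} m  missing={frac:.2%}  naive={naive:.4f}")
```

Because only the shorter distances are kept, the mean distance shrinks and the naive estimate rises above the true density as the missing fraction grows.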
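
Illustration for Result (4): the sector frequencies plotted in Figure 5 can be tabulated and screened for non-uniformity with a simple chi-square statistic. This is a generic sketch and not the poster's Equation 1, whose formula for C is not given in the transcript. It assumes bearings have already been converted to azimuths in degrees (0 to 360); the function names and example data are hypothetical.

```python
def sector_counts(bearings_deg, sector_width=10):
    """Count bearing trees in each angular sector of the given width."""
    n_sectors = 360 // sector_width
    counts = [0] * n_sectors
    for bearing in bearings_deg:
        index = int((bearing % 360) // sector_width) % n_sectors
        counts[index] += 1
    return counts


def chi_square_uniform(counts):
    """Chi-square statistic against equal expected counts in every sector."""
    total = sum(counts)
    expected = total / len(counts)
    return sum((c - expected) ** 2 / expected for c in counts)


if __name__ == "__main__":
    # Hypothetical data: trees concentrated at quadrant middles.
    biased = [45, 135, 225, 315] * 90
    counts = sector_counts(biased)
    print(counts)
    print(chi_square_uniform(counts))
```

A statistic near zero is consistent with surveyors choosing trees without regard to bearing, whereas a large value, compared against a chi-square distribution with (number of sectors - 1) degrees of freedom, indicates the kind of preference for quadrant middles described for section corners.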