Seminar 3 Data requirements, limitations, and challenges: Inverse modeling of seed and seedling dispersal Likelihood Methods in Forest Ecology October.

Presentation transcript:

Seminar 3. Data requirements, limitations, and challenges: Inverse modeling of seed and seedling dispersal. Likelihood Methods in Forest Ecology, October 9th – 20th, 2006

Approaches to Estimation of Seed and Seedling Dispersal Functions
- Direct sampling around isolated trees (David Greene)
- Develop mechanistic models with directly measurable parameters (Ran Nathan)
- Inverse modeling using likelihood methods and neighborhood models (Eric Ribbens, Jim Clark, and a rapidly growing community of practitioners…)

The questions…
- What are the shapes of the dispersal functions?
- How does fecundity vary as a function of tree size?
- What other factors determine the spatial distribution of seeds and seedlings around parent trees?
  - Wind direction (anisotropy)
  - Secondary dispersal
  - Density- and distance-dependent seed predation and pathogens
  - Substrate conditions
  - Light levels

The basic approach: field methods
- Map the distribution of potential parent trees within a stand
- Sample the density of seeds or seedlings at mapped locations within the stand
- Measure any additional features at the location of the seed traps or seedling quadrats

The Probability Model
- Observations consist of counts
- Assume the counts are either Poisson or negative binomial distributed
- Poisson PDF: $P(x \mid \lambda) = \dfrac{e^{-\lambda}\,\lambda^{x}}{x!}$, where x = observed density (integer) and λ = predicted density (continuous)
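As a minimal sketch of how the Poisson assumption becomes a log-likelihood, the Python below sums the log-probabilities of observed counts given model-predicted means. The function names and the example numbers are illustrative assumptions, not values from the seminar.

```python
import math

def poisson_log_pmf(x, lam):
    """Log of the Poisson probability of count x given a predicted mean lam > 0."""
    # lgamma(x + 1) equals log(x!) and avoids overflow for large counts
    return x * math.log(lam) - lam - math.lgamma(x + 1)

def poisson_log_likelihood(observed, predicted):
    """Sum of log-probabilities, assuming the observations are independent."""
    return sum(poisson_log_pmf(x, lam) for x, lam in zip(observed, predicted))

# Hypothetical seed-trap counts and model-predicted densities
counts = [0, 3, 1, 7]
predicted = [0.5, 2.0, 1.5, 6.0]
log_lik = poisson_log_likelihood(counts, predicted)
```

In an inverse-modeling fit, the predicted values would come from the scientific model, and an optimizer would search for the parameters that maximize this sum.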

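Negative Binomial PDF
- $P(x \mid m, k) = \dfrac{\Gamma(k + x)}{\Gamma(k)\, x!} \left(\dfrac{m}{m + k}\right)^{x} \left(\dfrac{k}{m + k}\right)^{k}$, where Γ(·) is the notation for the gamma function
- The shape of the PDF is controlled by both the expected mean (m) and a “shape” parameter (k)
- The variance is m + m²/k, so the distribution is overdispersed (variance > mean) for any finite k > 0; as k becomes large it approaches the Poisson (variance = mean)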
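A short sketch of the negative binomial in the mean and shape (m, k) parameterization, which is assumed here to be the one intended on the slide. The function name is hypothetical. Working on the log scale with `math.lgamma` keeps the gamma-function terms from overflowing.

```python
import math

def negbin_log_pmf(x, m, k):
    """Log-probability of count x under a negative binomial with mean m and shape k."""
    return (math.lgamma(k + x) - math.lgamma(k) - math.lgamma(x + 1)
            + x * math.log(m / (m + k))
            + k * math.log(k / (m + k)))

# Numerical check of the moments for m = 4, k = 2 (illustrative values)
m, k = 4.0, 2.0
probs = [math.exp(negbin_log_pmf(x, m, k)) for x in range(2000)]
mean = sum(x * p for x, p in enumerate(probs))
variance = sum((x - mean) ** 2 * p for x, p in enumerate(probs))
# The variance should be close to m + m**2 / k, which exceeds the mean
```

With a very large k the log-probabilities converge on the Poisson values, which is one way to see the Poisson as a limiting case.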

The basic “scientific” model
- Seed rain at a given location is the sum of the inputs from N parent trees, with the input from any given tree a function of:
  - Size (typically DBH) and
  - Distance to the parent

How does total seed production vary with tree size?
- Common assumption: seed production is a function of DBH² (following Ribbens et al. 1994), i.e. seed production = STR × (DBH/30) raised to an exponent fixed at 2, where STR = total standardized seed production of a 30 cm DBH tree
- Is this a reasonable assumption? Is it supported by either independent data or theory?
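The size scaling can be sketched in a few lines. The name `alpha` for the exponent is an assumed label (the slide's own symbol is not recoverable), and the STR value in the example is made up.

```python
def seed_production(dbh_cm, STR, alpha=2.0):
    """Total seeds from a tree of diameter dbh_cm, scaled so a 30 cm tree yields STR."""
    # alpha = 2 makes output proportional to DBH squared (roughly basal area)
    return STR * (dbh_cm / 30.0) ** alpha

# A hypothetical STR of 100 seeds for a 30 cm tree
seeds_30 = seed_production(30.0, STR=100.0)
seeds_60 = seed_production(60.0, STR=100.0)  # four times seeds_30 when alpha = 2
```

Treating `alpha` as a fitted parameter rather than fixing it at 2 is one way to test the assumption against data.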

How does seed rain vary with distance from a parent tree?
- Two basic classes of functions are commonly used*:
  - Monotonically declining (negative exponential)
  - Lognormal
*See Greene et al. (2004), J. Ecol. for a discussion…
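One common way to write the two classes is given below. This is an assumption about the functional forms, not the seminar's exact notation: d is the distance to the parent, and B, θ, X₀ and X_b are hypothetical names for the fitted parameters.

```latex
% Assumed forms; parameter names are illustrative.
% Monotonically declining (negative exponential when \theta = 1):
g_{\mathrm{exp}}(d) = \exp\!\left(-B\, d^{\theta}\right)
% Lognormal, with mode at d = X_0 and spread controlled by X_b:
g_{\mathrm{logn}}(d) = \exp\!\left[-\frac{1}{2}\left(\frac{\ln(d / X_0)}{X_b}\right)^{2}\right]
```

The first form is largest at the base of the tree, while the second peaks at a distance X₀ away from it, which is the main qualitative difference between the two classes.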

One more trick…
Normalizing the dispersal function [g(dist)] so that STR is in meaningful units: divide g(dist) by a normalizer equal to the “arcwise” (i.e. 360°) integration of the dispersal function
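A sketch of the normalizer computed numerically, assuming that the arcwise integration means integrating the kernel over all directions and distances, i.e. the integral of 2πr·g(r) dr. The function name, step size, and the exponential kernel with B = 0.1 are illustrative assumptions.

```python
import math

def kernel_normalizer(kernel, r_max=2000.0, dr=0.05):
    """Midpoint-rule approximation of the integral of 2*pi*r*kernel(r) from 0 to r_max."""
    steps = int(r_max / dr)
    total = 0.0
    for i in range(steps):
        r = (i + 0.5) * dr                      # midpoint of the i-th annulus
        total += 2.0 * math.pi * r * kernel(r) * dr
    return total

B = 0.1
eta = kernel_normalizer(lambda r: math.exp(-B * r))
exact = 2.0 * math.pi / B ** 2                  # closed form for exp(-B * r)
```

Dividing the kernel by this value makes it integrate to one over the plane, so STR can be read as the total number of seeds produced by a 30 cm tree.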

So, the basic scientific models…
- Lognormal form
- Exponential form

The Scientific Models
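Putting the pieces together, the sketch below predicts seed rain at one trap as the sum over mapped parents of size-scaled fecundity times a normalized kernel. It assumes a simple exponential kernel exp(-B·d), whose normalizer has the closed form 2π/B². All names (`predicted_seed_rain`, `alpha`, `B`) and the example stand are hypothetical.

```python
import math

def predicted_seed_rain(trap_xy, parents, STR, alpha, B):
    """Expected seeds per unit area at trap_xy.

    parents: iterable of (x, y, dbh_cm) for mapped potential parent trees.
    """
    eta = 2.0 * math.pi / B ** 2        # normalizer for the kernel exp(-B * d)
    total = 0.0
    for x, y, dbh in parents:
        d = math.hypot(x - trap_xy[0], y - trap_xy[1])
        total += (dbh / 30.0) ** alpha * math.exp(-B * d) / eta
    return STR * total

# Hypothetical stand: two parents and one trap at the origin
parents = [(5.0, 0.0, 30.0), (0.0, 12.0, 45.0)]
rain = predicted_seed_rain((0.0, 0.0), parents, STR=1000.0, alpha=2.0, B=0.1)
```

These predicted values are what would be passed as the means of the Poisson or negative binomial probability model when computing the likelihood.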

Anisotropy: does direction matter?
For the lognormal dispersal function: incorporate the effect of direction from the source tree on modal dispersal distance¹
¹ Staelens, J., L. Nachtergale, S. Luyssaert, and N. Lust. A model of wind-influenced leaf litterfall in a mixed hardwood forest. Canadian Journal of Forest Research.

Shape of the wind direction effect
When would this matter? (Just to increase goodness of fit and improve parameter estimation?)

Potential Dataset Limitations
- Censored data: not all parents are accounted for
- Insufficient variation in predicted values: parents are too uniformly distributed
- Two different populations treated as one: not all potential parents actually produce seeds
- Lack of independence: spatial autocorrelation among nearby samples

Effects of Search Radius

Sampling along the edge...

What if all of the trees are uniformly spaced?
- This produces relatively similar neighborhoods for all observations…
- Random vs. strategic sampling…

Goodness of Fit – Sites with Different Densities

What if all of the trees are the same size? Tradeoffs between STR and the size exponent

What if not all trees produce seed?

Can the data discriminate between different functions?

Beware of simplifying assumptions in your model...

Parameter Estimation – Varying the size exponent

What is the minimum size of a reproductive adult?
- Most studies have arbitrarily assumed that all adults over a low minimum size (10 – 15 cm DBH) contribute seeds.
- One approach: estimate the minimum (don’t assume it)
- How could we determine the effective minimum reproductive size?

Parent size and seedling production in a Puerto Rican rainforest
Source: Uriarte et al. (2005) J. Ecology

Scaling reproductive output to tree size: Maximum likelihood parameter estimates
Columns: Species | size exponent | min. size (cm)
- Casearia arborea
- Dacryodes excelsa | 0.51 | NA
- Guarea guidonia
- Inga laurina
- Manilkara bidentata
- Prestoea acuminata
- Schefflera morototoni
- Sloanea berteriana
- Tabebuia heterophylla
Source: Uriarte, M., C. D. Canham, J. Thompson, J. K. Zimmerman, and N. Brokaw. Seedling recruitment in a hurricane-driven tropical forest: light limitation, density-dependence and the spatial distribution of parent trees. Journal of Ecology 93:

Should there be an “intercept” in the model?
- Allowing for long-distance dispersal via a “bath” term: add a constant b to the predicted seed rain, where b is an average input of seeds even when there are no parents in the neighborhood…
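Written out, a model with a bath term might look like the following. The symbols are assumptions chosen for illustration: R_j is the predicted seed rain at location j, d_ij the distance from parent i to location j, α the size exponent, g the dispersal function, and η its normalizer.

```latex
% Assumed notation; b >= 0 is the distance-independent "bath" input.
R_j = b + \mathrm{STR} \sum_{i=1}^{N}
      \left(\frac{\mathrm{DBH}_i}{30}\right)^{\alpha}
      \frac{g(d_{ij})}{\eta}
```

When no parents fall inside the neighborhood the sum is empty and the prediction reduces to b, which also keeps the Poisson mean strictly positive for traps that still catch seeds.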

For seedlings: does light influence germination?
- Light multiplier M(GLI), with 0 < M(GLI) < 1
- Parameters: Lopt; Llo = slope to Lopt; Lhi = slope away from Lopt

Light Availability

Is there evidence of density dependence in seedling establishment?
- Add yet another multiplier... (parameters C and δ)
[Figure: DD effect (0-1) vs. conspecific seedling density]
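One plausible form for such a multiplier, offered as an assumption because the slide's formula is not shown, is a declining exponential in conspecific density with parameters C and δ. The function name and example values are hypothetical.

```python
import math

def density_dependence_multiplier(conspecific_density, C, delta):
    """Multiplier in (0, 1]: equals 1 at zero density and declines as density rises.

    Assumed form: exp(-C * density ** delta), with C >= 0 and delta > 0.
    """
    return math.exp(-C * conspecific_density ** delta)

# Illustrative values only
effect_empty = density_dependence_multiplier(0.0, C=0.1, delta=0.5)
effect_crowded = density_dependence_multiplier(25.0, C=0.1, delta=0.5)
```

If the fitted C is indistinguishable from zero the multiplier stays at 1, which corresponds to no evidence of density dependence.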

Negative Conspecific Density Dependence

Spatial autocorrelation

Dealing with spatial autocorrelation among observations…
- Remember: the formula for calculating log-likelihood assumes that observations are independent…
- We have been conditioned to assume that two observations taken at locations close together are likely not to be independent (a legacy of Stuart Hurlbert)
- How do you determine whether this is true? Moran’s I and other indices of spatial autocorrelation

A critical distinction…
- Remember: the issue is whether the residuals (the error terms in the probability model) are independent, NOT whether the raw observations are…
- If your scientific model “explains” why two nearby observations have similar values, then the fact that they are similar is NOT evidence of lack of independence*…
*despite assertions to the contrary in some papers on the subject

So, examine your residuals for spatial autocorrelation
- A “best-case” species…
[Figure: Moran’s I vs. distance class (m)]
Examples from a study of seedling recruitment in a New Zealand rainforest (data from Elaine Wright)
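A minimal sketch of Moran's I computed on model residuals for a single distance class, using binary weights (1 if a pair of sample locations falls in the class, 0 otherwise). The function name and the toy transect are assumptions for illustration. Repeating the calculation over successive distance classes gives a correlogram like the ones shown.

```python
import math

def morans_i(coords, residuals, d_min, d_max):
    """Moran's I for pairs whose separation lies in [d_min, d_max)."""
    n = len(residuals)
    mean = sum(residuals) / n
    dev = [r - mean for r in residuals]
    denom = sum(d * d for d in dev)
    num = 0.0
    weight_sum = 0.0
    for i in range(n):
        for j in range(n):
            if i == j:
                continue
            if d_min <= math.dist(coords[i], coords[j]) < d_max:
                num += dev[i] * dev[j]
                weight_sum += 1.0
    if weight_sum == 0.0 or denom == 0.0:
        return float("nan")             # no pairs in this class, or no variation
    return (n / weight_sum) * (num / denom)

# Toy transect with residuals that alternate in sign at 1 m spacing
coords = [(float(i), 0.0) for i in range(6)]
residuals = [1.0, -1.0, 1.0, -1.0, 1.0, -1.0]
i_near = morans_i(coords, residuals, 0.5, 1.5)   # adjacent pairs
i_far = morans_i(coords, residuals, 1.5, 2.5)    # pairs two steps apart
```

Values near zero across the short distance classes are what the best case looks like; judging significance would need a permutation test or similar, which this sketch omits.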

Another species…
- A worse case…
[Figure: Moran’s I vs. distance class (m)]

Causes and consequences of fine-scale spatial autocorrelation…
- The causes are probably legion:
  - Many trees don’t produce seed in any given mast year
  - Many factors can cluster input of seeds or survival of seedlings
- The consequences are important but not fatal:
  - Generally very little bias in parameter estimates themselves
  - But estimates of the variance of the parameters will be biased (low)
- Do the thought experiment or test this with real data: what would happen if you duplicated some observations in the dataset and then redid the analysis?
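The duplication thought experiment can be run on the simplest possible case, a single Poisson mean, where the maximum likelihood estimate is the sample mean and its asymptotic standard error is sqrt(mean / n). The counts below are made up and the function name is hypothetical.

```python
import math

def poisson_mle_with_se(counts):
    """MLE of a common Poisson mean and its asymptotic standard error."""
    n = len(counts)
    lam_hat = sum(counts) / n
    se = math.sqrt(lam_hat / n)      # from the Fisher information n / lambda
    return lam_hat, se

counts = [2, 0, 5, 3, 1, 4, 2, 3]    # hypothetical seed-trap counts
lam_once, se_once = poisson_mle_with_se(counts)
lam_twice, se_twice = poisson_mle_with_se(counts + counts)  # every observation duplicated
ratio = se_once / se_twice           # how much the apparent precision improved
```

The estimate itself does not move, but the standard error shrinks by a factor of sqrt(2) even though no new information was added. Perfectly redundant observations are the extreme case of positively autocorrelated residuals, which is why the parameter estimates stay roughly unbiased while their variances come out too low.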