Estimation and Model Selection for Geostatistical Models

Presentation transcript:

Estimation and Model Selection for Geostatistical Models. Kathryn M. Georgitis and Alix I. Gitelman, Oregon State University; Jennifer A. Hoeting, Colorado State University

Designs and Models for Aquatic Resource Surveys (DAMARS), R82-9096-01. The research described in this presentation has been funded by the U.S. Environmental Protection Agency through the STAR Cooperative Agreement CR82-9096-01 Program on Designs and Models for Aquatic Resource Surveys at Oregon State University. It has not been subjected to the Agency's review and therefore does not necessarily reflect the views of the Agency, and no official endorsement should be inferred.

Talk Outline: (1) Stream Sulfate Concentration, (2) G.I.S. Data Sources, (3) Bayesian Spatial Model, (4) Implementation Problems, (5) What exactly is the problem? (6) Simulation results

Original Objective: Model sulfate concentration in streams in the Mid-Atlantic U.S. using a Bayesian geostatistical model

Why stream sulfate concentration? Indirectly toxic to fish and aquatic biota: decrease in streamwater pH; increase in metal concentrations (Al). Observed positive spatial relationship with atmospheric SO₄²⁻ deposition (Kaufmann et al. 1991).

Wet Atmospheric Sulfate Deposition http://www.epa.gov/airmarkets/cmap/mapgallery/mg_wetsulfatephase1.html

The Data: MAHA/MAIA water chemistry data: 644 stream locations. Watershed variables: % forest, % agriculture, % urban, % mining, % within ecoregions with high sulfate adsorption soils. National Atmospheric Deposition Program (NADP).

MAHA/MAIA Stream Locations

Map of NADP and MAHA/MAIA Locations

Sketch of watershed with overlaid landcover map

Bayesian Geostatistical Model (1). Notation: Y(s) is the observed ln(SO₄²⁻) concentration at stream locations; X(s) is the matrix of watershed explanatory variables; β is the vector of regression coefficients; D is the matrix of pairwise distances; φ is 1/range; τ² is the partial sill; and σ² is the nugget.
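The model equation itself did not survive in the transcript. One form consistent with the definitions on this slide is sketched below. It is a reconstruction, not the authors' own equation: it assumes the symbols β, φ, τ² and σ² for the regression coefficients, 1/range, partial sill and nugget, and it assumes an exponential correlation function (the form the talk later reports running in WinBUGS), applied elementwise to the distance matrix D.

```latex
Y(\mathbf{s}) \sim N_n\bigl(X(\mathbf{s})\,\beta,\ \Sigma\bigr),
\qquad
\Sigma = \tau^{2}\exp(-\phi D) + \sigma^{2} I
```

Under this form the total sill is τ² + σ², and dropping the nugget corresponds to setting σ² = 0.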

Bayesian Geostatistical Model, priors: β ~ N_p(0, h²I); φ ~ Uniform(a, b); 1/τ² ~ Gamma(g, h); 1/σ² ~ Gamma(f, l) (Banerjee et al. 2004, and GeoBUGS documentation)

Semi-Variogram of ln(SO₄²⁻) [figure; annotated with range, partial sill, and nugget]
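For readers who want to reproduce this kind of plot, the following is a minimal sketch of a method-of-moments (Matheron) empirical semivariogram. The function name, the binning rule (half the maximum pairwise distance) and the placeholder data are assumptions for illustration; the talk does not say how its semivariogram was computed, and the sulfate data are not reproduced here.

```python
import numpy as np

def empirical_semivariogram(coords, values, n_bins=10, max_dist=None):
    """Method-of-moments semivariogram estimate.

    coords: (n, 2) array of locations; values: (n,) array of observations.
    Returns bin centers, semivariance per bin, and pair counts per bin.
    """
    coords = np.asarray(coords, dtype=float)
    values = np.asarray(values, dtype=float)
    # Index every unordered pair of locations once (i < j).
    i, j = np.triu_indices(len(values), k=1)
    dist = np.linalg.norm(coords[i] - coords[j], axis=1)
    sq_diff = (values[i] - values[j]) ** 2
    if max_dist is None:
        max_dist = dist.max() / 2.0  # assumed rule of thumb, not from the talk
    edges = np.linspace(0.0, max_dist, n_bins + 1)
    centers = 0.5 * (edges[:-1] + edges[1:])
    gamma = np.full(n_bins, np.nan)
    counts = np.zeros(n_bins, dtype=int)
    for b in range(n_bins):
        in_bin = (dist > edges[b]) & (dist <= edges[b + 1])
        counts[b] = in_bin.sum()
        if counts[b] > 0:
            # Semivariance is half the mean squared difference in the bin.
            gamma[b] = 0.5 * sq_diff[in_bin].mean()
    return centers, gamma, counts

# Placeholder data with no spatial structure (hypothetical, for illustration only).
rng = np.random.default_rng(0)
coords = rng.uniform(0, 1, size=(200, 2))
values = rng.normal(size=200)
centers, gamma, counts = empirical_semivariogram(coords, values)
```

On a plot of `gamma` against `centers`, the nugget is read off as the intercept near distance zero, the sill as the plateau, and the partial sill as the difference between the two.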

Results using WinBUGS 4.1 (n = 644): tried different covariance functions; only the exponential without a nugget worked. Computationally intensive: 1000 iterations took approx. 2 1/4 hours.

New Objective: Why is this not working? Large N problem? Possible solutions: (1) SMCMC, which ‘accelerates convergence by simultaneously updating multivariate blocks of (highly correlated) parameters’ (Sargent et al. 2000, Cowles 2003, Banerjee et al. 2004): φ (1/range) did not converge. (2) Subset data to n = 322, SMCMC & WinBUGS: φ still did not converge, and posterior intervals for all parameters were dissimilar.

Is the problem the prior specification? Investigated sensitivity to priors. Original priors: β ~ N_p(0, h²I); φ ~ Uniform(a, b); 1/τ² ~ Gamma(g, h); 1/σ² ~ Gamma(f, l). For φ (1/range): tried Gamma and different Uniform distributions (Banerjee et al. 2004, Berger et al. 2001). Variance components: tried different Gamma distributions and half-Cauchy (Gelman 2004).

Is the problem the presence of a nugget? Simulations: RandomFields package in R; using MAHA coordinates (n = 322); constant mean; exponential covariance with and without a nugget. Prior sensitivity (Berger et al. 2001, Gelman 2004).
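The simulation setup on this slide can be sketched as follows. This is not the authors' code: they used the RandomFields package in R, whereas this Python stand-in draws the field by Cholesky factorization. The random coordinates replace the MAHA coordinates, which are not available here, and the parameter values (1/range of 10, partial sill 1, nugget 2, so that nugget/(partial sill + nugget) = 2/3) are assumptions chosen to echo values mentioned elsewhere in the talk.

```python
import numpy as np

def exponential_cov(coords, phi, partial_sill, nugget=0.0):
    """Covariance matrix partial_sill * exp(-phi * d) + nugget * I."""
    coords = np.asarray(coords, dtype=float)
    diff = coords[:, None, :] - coords[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))
    return partial_sill * np.exp(-phi * dist) + nugget * np.eye(len(coords))

def simulate_field(coords, mean, phi, partial_sill, nugget=0.0, rng=None):
    """Draw one realization of a constant-mean Gaussian field at coords."""
    rng = np.random.default_rng() if rng is None else rng
    cov = exponential_cov(coords, phi, partial_sill, nugget)
    # Tiny jitter guards the factorization against round-off when nugget = 0.
    chol = np.linalg.cholesky(cov + 1e-10 * np.eye(len(cov)))
    return mean + chol @ rng.standard_normal(len(cov))

rng = np.random.default_rng(42)
coords = rng.uniform(0, 1, size=(322, 2))  # hypothetical stand-in for the MAHA coordinates
y_no_nugget = simulate_field(coords, mean=0.0, phi=10.0, partial_sill=1.0, rng=rng)
y_nugget = simulate_field(coords, mean=0.0, phi=10.0, partial_sill=1.0,
                          nugget=2.0, rng=rng)
```

Fitting the same model to `y_no_nugget` and `y_nugget` is one way to check whether the nugget alone explains the convergence problems.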

Posterior Intervals for φ (1/range) Using Different Priors [figure panels: prior φ ~ Uniform(4, 6); prior φ ~ Uniform(0, 100)]

Posterior Intervals for Partial Sill Using Different Priors for φ (1/range) [figure panels: prior φ ~ Uniform(4, 6); prior φ ~ Uniform(0, 100)]

Is the Spatial Signal too weak? Simulations were using nugget/sill = 2/3; try using a range of nugget/sill ratios. Previous research: Mardia & Marshall (1984): spherical with and without nugget; Zimmerman & Zimmerman (1991): R.E.M.L. vs. M.L.E. for exponential without nugget; Lark (2000): M.O.M. vs. M.L.E. for spherical with nugget.

Is the Spatial Signal too weak? φ (1/range) = 10 and φ = 2.5; 100 realizations for each combination.
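The ML and REML estimates compared in this simulation study could be computed along the following lines. This is a sketch under stated assumptions, not the authors' implementation: it assumes a constant mean, an exponential covariance with nugget, optimization on the log scale with L-BFGS-B, and arbitrary starting values and bounds. The REML criterion drops an additive constant that does not affect the estimates. The simulated data (n = 100, 1/range of 10, partial sill 1, nugget 2) are hypothetical.

```python
import numpy as np
from scipy.linalg import cho_factor, cho_solve
from scipy.optimize import minimize

def pairwise_dist(coords):
    diff = coords[:, None, :] - coords[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1))

def neg_loglik(log_params, dist, y, reml=False):
    """Negative (restricted) log-likelihood with the constant mean profiled out.

    log_params holds the logs of (phi, partial sill, nugget).
    """
    phi, partial_sill, nugget = np.exp(log_params)
    n = len(y)
    cov = partial_sill * np.exp(-phi * dist) + nugget * np.eye(n)
    try:
        factor = cho_factor(cov, lower=True)
    except np.linalg.LinAlgError:
        return 1e10  # treat a numerically singular covariance as a very poor fit
    ones = np.ones(n)
    info = ones @ cho_solve(factor, ones)           # 1' Sigma^-1 1
    mu_hat = (ones @ cho_solve(factor, y)) / info   # GLS estimate of the mean
    resid = y - mu_hat
    quad = resid @ cho_solve(factor, resid)
    logdet = 2.0 * np.log(np.diag(factor[0])).sum()
    p = 1 if reml else 0                            # number of mean parameters removed
    nll = 0.5 * ((n - p) * np.log(2 * np.pi) + logdet + quad)
    if reml:
        nll += 0.5 * np.log(info)  # REML term log|X' Sigma^-1 X| for X = column of ones
    return nll

def fit(dist, y, reml=False, start=(5.0, 1.0, 1.0)):
    """Return estimates of (phi, partial sill, nugget)."""
    res = minimize(neg_loglik, np.log(start), args=(dist, y, reml),
                   method="L-BFGS-B", bounds=[(-8.0, 8.0)] * 3)
    return np.exp(res.x)

# One hypothetical realization; a bias study would repeat this many times.
rng = np.random.default_rng(1)
coords = rng.uniform(0, 1, size=(100, 2))
dist = pairwise_dist(coords)
true_cov = 1.0 * np.exp(-10.0 * dist) + 2.0 * np.eye(100)
y = np.linalg.cholesky(true_cov) @ rng.standard_normal(100)
theta_ml = fit(dist, y, reml=False)
theta_reml = fit(dist, y, reml=True)
```

Repeating the last block over many realizations and averaging `theta_ml - truth` and `theta_reml - truth` gives the kind of bias comparison the following slides report.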

Simulation Results for φ = 10: Bias for ML and REML Estimates

Simulation Results for φ = 10: Bias for ML and REML Estimates

Simulation Results for φ = 2.5: Bias for ML and REML Estimates

Simulation Results for φ = 2.5: Bias for ML and REML Estimates

Conclusions: Covariance model selection problem. ML, REML, Bayesian estimation (Harville 1974). Infill asymptotic properties of M.L.E.: Ying 1993: Ornstein-Uhlenbeck without nugget, 2-dim., lattice design; Chen et al. 2000: Ornstein-Uhlenbeck with nugget, 1-dim.; Zhang 2004: exponential without nugget, found more skewed distributions with increasing range.

Simulation Results for φ = 2.5: Bias for ML and REML Estimates

Simulation Results for φ = 2.5: Bias for ML and REML Estimates

Simulation Results for φ = 10: Bias for ML and REML Estimates

Results from SMCMC and WinBUGS