Distribution Function Estimation in Small Areas for Aquatic Resources Spatial Ensemble Estimates of Temporal Trends in Acid Neutralizing Capacity Mark.

Presentation transcript:

Distribution Function Estimation in Small Areas for Aquatic Resources: Spatial Ensemble Estimates of Temporal Trends in Acid Neutralizing Capacity
Mark Delorey, F. Jay Breidt
Colorado State University
This research is funded by U.S. EPA – Science To Achieve Results (STAR) Program, Cooperative Agreement # CR

Project Funding
The work reported here was developed under the STAR Research Assistance Agreement CR awarded by the U.S. Environmental Protection Agency (EPA) to Colorado State University. This presentation has not been formally reviewed by EPA. The views expressed here are solely those of the presenter and of STARMAP, the program he represents. EPA does not endorse any products or commercial services mentioned in this presentation.

Outline
Statement of the problem: how to get a set of estimates that are good for multiple inferences of acid trends in watersheds?
Hierarchical model and Bayesian inference
Constrained Bayes estimators
  - adjusting the variance of the estimators
Conditional auto-regressive (CAR) model
  - introducing spatial correlation
Constrained Bayes with CAR
Summary

The Problem
Evaluation of the Clean Air Act Amendments of 1990
  - examine acid neutralizing capacity (ANC)
  - surface waters are acidic if ANC < 0
  - supply of acids from atmospheric deposition and watershed processes exceeds buffering capacity
Temporal trends in ANC within watersheds (8-digit HUCs)
  - characterize the spatial ensemble of trends
  - make a map, construct a histogram, plot an empirical distribution function

Data Set
86 HUCs in Mid-Atlantic Highlands
ANC in at least two years from 1993–1998
HUC-level covariates:
  - area
  - average elevation
  - average slope, max slope
  - percents agriculture, urban, and forest
  - spatial coordinates
  - dry acid deposition from NADP

Region of Study

Locations of Sites

Small Area Estimation
Probability sample across region
  - regional-level inferences are model-free
  - samples are not sufficiently dense in small watersheds (HUC-8)
  - need to incorporate auxiliary information through a model
Two standard types of small area models (Rao, 2003)
  - area-level: watersheds
  - unit-level: site within watershed

Two Inferential Goals
Interested in estimating individual HUC-specific slopes
Also interested in the ensemble:
  spatially-indexed true values:
  spatially-indexed estimates:
  - subgroup analysis: what proportion of HUCs have ANC increasing over time?
  - “empirical” distribution function (edf):

Deconvolution Approach
Treat this as a measurement error problem:
Deconvolve:
  - parametric: assume F in parametric class
  - semi-parametric: assume F well-approximated within class (like splines, normal mixtures)
  - non-parametric: assume E F [e i  ] is smooth
Not so appropriate for heteroskedastic measurements, explanatory variables, two inferential goals

Hierarchical Area-Level Model Extend model specification by describing parameter uncertainty: Prior specification:

Bayesian Inference
Individual estimates: use posterior means
Do Bayes estimates yield a good ensemble estimate?
  - use edf of Bayes estimates to estimate F?
No: Bayes estimates are “over-shrunk”
  - too little variability to give good representation of edf (Louis 1984, Ghosh 1992)
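The over-shrinkage point can be illustrated with a minimal sketch. It assumes a simple normal-normal area-level model with known variances (direct estimate y_h given theta_h is N(theta_h, sigma2_h), and theta_h is N(prior_mean, tau2)); this model, the function name `posterior_means`, and the simulated values are illustrative assumptions, not the specification used in the presentation.

```python
import numpy as np

def posterior_means(y, prior_mean, tau2, sigma2):
    """Posterior means under the assumed model
    y_h | theta_h ~ N(theta_h, sigma2_h),  theta_h ~ N(prior_mean, tau2)."""
    y = np.asarray(y, dtype=float)
    sigma2 = np.asarray(sigma2, dtype=float)
    shrink = sigma2 / (sigma2 + tau2)  # weight placed on the prior mean
    return shrink * prior_mean + (1.0 - shrink) * y

# Simulated (hypothetical) data: 86 areas, equal sampling and prior variances.
rng = np.random.default_rng(0)
m, tau2 = 86, 1.0
sigma2 = np.full(m, 1.0)
theta = rng.normal(0.0, np.sqrt(tau2), size=m)    # true area-level values
y = theta + rng.normal(0.0, np.sqrt(sigma2))      # noisy direct estimates
theta_bayes = posterior_means(y, 0.0, tau2, sigma2)

# The spread of the Bayes estimates is smaller than the spread of the direct
# estimates; in expectation it is also smaller than the spread of the truth.
print(np.var(theta), np.var(y), np.var(theta_bayes))
```

With equal variances each direct estimate is pulled halfway to the prior mean, so the ensemble of posterior means is less variable than the direct estimates by construction.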

Adjusted Shrinkage
Posterior means not good for both individual and ensemble estimates
Improve by reducing shrinkage
  - sample mean of Bayes estimates already matches posterior mean of the mean of the true values
  - adjust shrinkage so that sample variance of estimates matches posterior variance of true values
Louis (1984), Ghosh (1992), Cressie and Stern (1991)

Constrained Bayes Estimates Compute the scalars Form the constrained Bayes (CB) estimates as where
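As a hedged sketch of the unweighted constrained Bayes construction in Ghosh (1992): with Bayes estimates equal to the posterior means, H1 is the sum over areas of the posterior variance of each true value's deviation from the ensemble mean, H2 is the sum of squared deviations of the Bayes estimates from their own mean, and the CB estimate spreads each Bayes estimate away from that mean by the factor a = sqrt(1 + H1/H2). The function name `constrained_bayes` and the use of posterior draws (for example from MCMC output) are assumptions made for illustration; the weighted and spatial versions are not covered here.

```python
import numpy as np

def constrained_bayes(theta_draws):
    """Unweighted constrained Bayes estimates from posterior draws.

    theta_draws: array of shape (S, m), S posterior draws for m areas.
    Returns (cb_estimates, a), where a >= 1 is the expansion factor.
    """
    draws = np.asarray(theta_draws, dtype=float)
    theta_b = draws.mean(axis=0)                           # Bayes estimates
    centered = draws - draws.mean(axis=1, keepdims=True)   # theta_i - theta_bar, per draw
    h1 = centered.var(axis=0).sum()                        # sum_i Var(theta_i - theta_bar | y)
    h2 = ((theta_b - theta_b.mean()) ** 2).sum()           # spread of the Bayes estimates
    a = np.sqrt(1.0 + h1 / h2)
    return theta_b.mean() + a * (theta_b - theta_b.mean()), a

# Hypothetical posterior draws: 2000 draws for 5 areas.
rng = np.random.default_rng(1)
draws = rng.normal(loc=[-0.5, 0.0, 0.2, 0.4, 1.0], scale=0.3, size=(2000, 5))
cb, a = constrained_bayes(draws)
```

By construction the CB estimates keep the mean of the Bayes estimates, and their sum of squared deviations equals the Monte Carlo estimate of the posterior expected sum of squared deviations of the true values.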

Shrinkage Comparisons for the Slope Ensemble

Numerical Illustration
Compare edfs of estimates to posterior mean of F:
Comparison of ensemble estimates at selected quantiles:
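The comparison described here can be sketched as follows, assuming posterior draws of the true values are available as an array. The helper names `edf`, `posterior_mean_edf`, and `proportion_increasing` are hypothetical and introduced only for illustration.

```python
import numpy as np

def edf(values, grid):
    """Empirical distribution function of `values`, evaluated at each grid point."""
    values = np.asarray(values, dtype=float)
    grid = np.asarray(grid, dtype=float)
    return (values[None, :] <= grid[:, None]).mean(axis=1)

def posterior_mean_edf(theta_draws, grid):
    """Average, over posterior draws, of the edf of the true values."""
    draws = np.asarray(theta_draws, dtype=float)
    return np.mean([edf(d, grid) for d in draws], axis=0)

def proportion_increasing(estimates):
    """Share of areas whose estimated trend slope is positive."""
    return float(np.mean(np.asarray(estimates, dtype=float) > 0))
```

A set of point estimates (Bayes or CB) can then be judged as an ensemble by comparing `edf(estimates, grid)` with `posterior_mean_edf(theta_draws, grid)` on a common grid, and the subgroup question about increasing ANC reduces to `proportion_increasing(estimates)`.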

Estimated EDFs of the Slope Ensemble (curves shown: CB, Posterior Mean, Bayes)

Spatial Model
The model is specified in terms of an unknown coefficient vector, an adjacency matrix C = (c_ij), a parameter measuring spatial dependence, a known diagonal matrix of scaling factors for the variance in each HUC, and an unknown parameter.
Adjacency matrix C can reflect watershed structure

Conditional Auto Regressive (CAR) Model
Let A_h denote the set of neighboring HUCs for HUC h
The previous formulation is equivalent to:
Cressie and Stern (1991)
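For orientation, the standard textbook form of a Gaussian CAR model is sketched below. All notation here is assumed for illustration and may differ from the symbols on the original slides: theta is the vector of true values, X beta the mean, phi the spatial dependence parameter, C = (c_hk) the adjacency matrix with zero diagonal, M = diag(m_hh) the known scaling matrix, and tau^2 the unknown variance parameter.

```latex
% Joint form (assumed notation):
\theta \sim N\!\left( X\beta,\; \tau^{2} (I - \phi C)^{-1} M \right)

% Equivalent conditional form, with A_h the set of neighbors of HUC h:
\theta_h \mid \theta_{-h} \sim
  N\!\left( x_h^{\top}\beta + \phi \sum_{k \in A_h} c_{hk}\,(\theta_k - x_k^{\top}\beta),\;
            \tau^{2} m_{hh} \right)

% Validity conditions: c_{hk}\, m_{kk} = c_{kh}\, m_{hh} for all h, k,
% and I - \phi C positive definite.
```

The symmetry condition is what makes the joint covariance matrix symmetric, and the positive definiteness condition restricts the admissible range of the spatial dependence parameter.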

HUC Structure
First level (2-digit) divides U.S. into 21 major geographic regions
Second level (4-digit) identifies area drained by a river system, closed basin, or coastal drainage area
Third level (6-digit) creates accounting units of surface drainage basins or combination of basins
Fourth level (8-digit) distinguishes parts of drainage basins and unique hydrologic features

Neighborhood Structure
All watersheds within the same HUC-6 region were considered part of the same neighborhood
No spatial relationship among HUC-4 regions or HUC-2 regions considered at this point
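One way to encode this neighborhood rule is sketched below, assuming each watershed is identified by its 8-digit HUC code, so that two watersheds are neighbors exactly when their first six digits agree. The function name `huc6_adjacency` and the example codes are illustrative only.

```python
import numpy as np

def huc6_adjacency(huc8_codes):
    """0/1 adjacency matrix: HUC-8 watersheds are neighbors if they share a HUC-6 prefix."""
    # zfill restores a leading zero if codes were read as integers.
    codes = [str(c).zfill(8) for c in huc8_codes]
    prefixes = np.array([c[:6] for c in codes])
    C = (prefixes[:, None] == prefixes[None, :]).astype(int)
    np.fill_diagonal(C, 0)  # a watershed is not its own neighbor
    return C

# Illustrative codes: the first two share the HUC-6 prefix 020503.
C = huc6_adjacency(["02050301", "02050302", "02070001"])
n_neighbors = C.sum(axis=1)  # number of neighbors of each HUC
```

The resulting matrix is symmetric and block-structured by HUC-6 region, and a watershed that is alone in its HUC-6 region has no neighbors under this rule.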

Model Specifications Adjacency matrix:  = diag  hh =, h = 1,…,m; n h = # neighbors of HUC h  hk = 0, h ≠ k  can be fixed or random

Constrained Bayes with CAR (  fixed) If  is known, Stern and Cressie (1999) show how to solve for H 1 (Y) and H 2 (Y) under the mean and variance constraints: and respectively, where

When  is Unknown or Random We place a uniform prior on  and minimize the Lagrangian: where to get a system of equations that can be used to solve for Posterior quantities can be estimated using BUGS or other software

Spatial Structure

Summary
In the Bayesian context, posterior means are overshrunk; to obtain estimates appropriate for the ensemble, the shrinkage needs to be adjusted
In CAR, if the model parameter is known, CB estimators can be found following Stern and Cressie (1999); if it is unknown, CB estimators can still be found numerically
Contour plot indicates that trend slopes of ANC are smoothed and somewhat homogenized within HUC

Ongoing Work
Replace spatial CAR with geostatistical model; model site responses
Is CB estimate of rate the same as rate from CB estimates?

Other Issues
Restrict to acid-sensitive waters
Combine probability and convenience samples
Modify spatial structure