Applications of Nonparametric Survey Regression Estimation in Aquatic Resources F. Jay Breidt, Siobhan Everson-Stewart, Alicia Johnson, Jean D. Opsomer.


Nonparametric Model-Assisted Survey Regression Estimation
F. Jay Breidt & Jean D. Opsomer

Model-Assisted Estimation

Auxiliary Information
- Use auxiliary information available for the entire aquatic resource of interest, in addition to the sample data.
- Example: the spatial location of every lake in the population is known for EPA's Environmental Monitoring and Assessment Program (EMAP) Northeastern Lakes study.

General Form of the Model-Assisted Estimator
- Estimate the population total as the sum of model-based predictions for all population elements, plus a design-bias adjustment.

Classical Parametric Survey Regression Estimator
- Model-based predictions come from regressing the sample response on the auxiliary variable.

A Nonparametric Approach

Motivation for Nonparametric Methods
- The regression estimator is inefficient if the true relationship between the response and the auxiliary information is not linear.
- Breidt and Opsomer (2000) replaced parametric regression by nonparametric regression.
- Model-based predictions come from a local linear smooth (kernel regression).

Local Linear Regression
- Smooth at a point by performing locally weighted least squares regression.
- Weights come from a kernel function, K.
  - The kernel may be a density or another function, such as the Epanechnikov kernel, (3/4)(1 - u^2) I{|u| < 1}.
  - The kernel is scaled by a bandwidth, h.
  - Large h leads to a smoother, more global linear regression.
  - Small h leads to a rougher, more local linear regression.
- The intercept in the locally weighted least squares fit is the smooth at the point.
- Modify for the survey context by incorporating design weights.
- Plug into the model-assisted estimator.

Nonparametric Survey Regression Estimator
- The nonparametric estimator of the total takes the model-assisted form, with the nonparametric model-based prediction computed from a local design matrix and a local weighting matrix.
- The estimator:
  - is asymptotically design unbiased and consistent;
  - is competitive with classical survey regression when the parametric model is correct;
  - dominates the classical estimator when the parametric model is misspecified;
  - admits a consistent variance estimator.

For more information, see Breidt, F.J. and Opsomer, J.D. (2000). Local Polynomial Regression Estimation in Survey Sampling. Annals of Statistics 28, 1026-1053.

Extension to CDF Estimation
Colorado State MS Project: Alicia Johnson

Objectives
- Extend nonparametric regression estimation to finite population cumulative distribution function (CDF) estimation and compare to parametric techniques.

Approach
- Replaced the response variable by an indicator: 1 if the response is at or below the point at which the CDF is evaluated, 0 otherwise.
- Smoothed the indicator versus the auxiliary variable, x.
- Generated seven populations with various mean functions and variance terms.
- Performed a simulation study to compare the nonparametric regression CDF estimator to standard CDF estimators, both for estimation of the CDF at the median and for estimation of the median.

Findings
- For both CDF estimation and estimation of the median:
  - Compared the nonparametric regression estimator to Horvitz-Thompson and parametric estimators.
  - The nonparametric regression estimator performed well in terms of mean square error, especially when the parametric model was misspecified.
  - Model-assisted approaches had lower relative bias than model-based approaches.
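The poster's equations are not part of this transcript, so the following is only a minimal sketch of the local linear survey regression estimator and its CDF extension. It assumes the usual model-assisted form (sum of predictions over the population plus a design-weighted sum of sample residuals) and kernel weights divided by the inclusion probabilities. The function names, the toy population, and the systematic sample are illustrative assumptions, not the authors' code.

```python
import numpy as np

def epanechnikov(u):
    """Epanechnikov kernel (3/4)(1 - u^2) on |u| < 1, zero elsewhere."""
    u = np.asarray(u, dtype=float)
    return np.where(np.abs(u) < 1.0, 0.75 * (1.0 - u**2), 0.0)

def local_linear_smooth(x0, x_s, y_s, pi_s, h):
    """Design-weighted local linear smooth at x0.

    Returns the intercept of a weighted least squares line fitted to the
    sample, with weights K((x_j - x0) / h) / (pi_j * h).
    Assumes at least two distinct sample x-values lie within h of x0.
    """
    d = x_s - x0
    w = epanechnikov(d / h) / (pi_s * h)        # kernel weight times design weight
    X = np.column_stack([np.ones_like(d), d])   # local design matrix
    XtW = X.T * w                               # X' W with W = diag(w)
    beta = np.linalg.solve(XtW @ X, XtW @ y_s)
    return beta[0]

def llr_total(x_pop, sample_idx, y_s, pi_s, h):
    """Model-assisted total: predictions for every population element
    plus the design-bias adjustment computed from sample residuals."""
    x_s = x_pop[sample_idx]
    m_hat = np.array([local_linear_smooth(x0, x_s, y_s, pi_s, h) for x0 in x_pop])
    adjustment = np.sum((y_s - m_hat[sample_idx]) / pi_s)
    return m_hat.sum() + adjustment

def llr_cdf(t, x_pop, sample_idx, y_s, pi_s, h):
    """CDF extension: smooth the indicator of y <= t and divide by N."""
    indicator = (y_s <= t).astype(float)
    return llr_total(x_pop, sample_idx, indicator, pi_s, h) / len(x_pop)

# Toy population (hypothetical): response exactly linear in x.
x_pop = np.linspace(0.0, 1.0, 200)
y_pop = 2.0 + 3.0 * x_pop
sample_idx = np.arange(0, 200, 5)               # every fifth element, n = 40
pi_s = np.full(sample_idx.size, sample_idx.size / x_pop.size)
y_s = y_pop[sample_idx]

est_total = llr_total(x_pop, sample_idx, y_s, pi_s, h=0.25)
est_cdf_mid = llr_cdf(np.median(y_pop), x_pop, sample_idx, y_s, pi_s, h=0.25)
print(est_total, y_pop.sum(), est_cdf_mid)
```

Because the toy response is exactly linear, the local fits reproduce it and the estimated total matches the true total up to rounding. With real data the bandwidth h would need to be chosen with care, and a neighborhood containing too few sample points makes the local fit singular.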
Extension to Spatial Sampling
Colorado State MS Project: Siobhan Everson-Stewart

Objectives
- Extend nonparametric regression estimation to spatial sampling and compare to parametric techniques.

Approach
- Replaced univariate kernel regression with bivariate kernel regression.
- Used a product Epanechnikov kernel.
- Performed a simulation study to compare the nonparametric regression estimator to standard estimators.
- Created a smooth, spatially correlated surface over the unit square; varied the strength of correlation, planar trend, variation in the surface, random noise, and sample size.

Findings
- Compared the performance of the Horvitz-Thompson, regression, and kernel regression estimators.
- Parametric planar regression did well when the surface contained a planar portion.
- The local planar regression estimator performed well, especially when the parametric model was misspecified.

Application to Northeastern Lakes

Population and Study Design
- EMAP surveyed lakes in the northeastern United States from 1991 to 1996.
- The aquatic resource of interest is over 20,000 lakes in 8 states.
- 330 individual lakes were visited, each from one to six times.
- Many measurements were taken on each lake, including several lake chemistry levels.
- Acid neutralizing capacity (ANC) is a measure of a lake's ability to buffer itself.

Auxiliary Information
- For every lake in the region of interest, auxiliary information included spatial location, elevation, and ecoregion.
- Spatial location is used here for illustration.
- The approach is easy to extend semiparametrically with parametric terms for elevation and ecoregion.

CDF Estimation in Spatial Sampling
- Applied to the Northeastern Lakes data set.
- Combined the CDF estimation and spatial location extensions.
- Estimated the CDF of ANC using local planar regression (LPR).

Confidence Interval Calculation
- Lakes are considered acidic if ANC < 0.
- Calculated a 95% confidence interval for the CDF at zero, which estimates the proportion of acidic lakes in the region.
- EPA's National Surface Waters Survey estimated 4.2% of lakes in the northeastern region of the US to be acidic.
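As a rough illustration of how the spatial extension and the CDF extension combine, the sketch below fits a design-weighted local plane with a product Epanechnikov kernel and uses it in a model-assisted CDF estimator. The grid population, the sample layout, the planar toy response, and all function names are hypothetical. The EMAP data and the poster's variance and confidence interval calculations are not reproduced here.

```python
import numpy as np

def product_epanechnikov(du, dv, h):
    """Product of two Epanechnikov kernels, one per spatial coordinate."""
    ku = np.where(np.abs(du / h) < 1.0, 0.75 * (1.0 - (du / h) ** 2), 0.0)
    kv = np.where(np.abs(dv / h) < 1.0, 0.75 * (1.0 - (dv / h) ** 2), 0.0)
    return ku * kv / h**2

def local_planar_smooth(loc0, loc_s, z_s, pi_s, h):
    """Intercept of a design-weighted local plane fitted around loc0.

    Assumes at least three non-collinear sample locations fall inside
    the kernel window around loc0.
    """
    du = loc_s[:, 0] - loc0[0]
    dv = loc_s[:, 1] - loc0[1]
    w = product_epanechnikov(du, dv, h) / pi_s
    X = np.column_stack([np.ones_like(du), du, dv])  # local design matrix
    XtW = X.T * w
    return np.linalg.solve(XtW @ X, XtW @ z_s)[0]

def lpr_total(loc_pop, sample_idx, z_s, pi_s, h):
    """Model-assisted total using local planar regression predictions."""
    loc_s = loc_pop[sample_idx]
    m_hat = np.array([local_planar_smooth(p, loc_s, z_s, pi_s, h) for p in loc_pop])
    return m_hat.sum() + np.sum((z_s - m_hat[sample_idx]) / pi_s)

def lpr_cdf(t, loc_pop, sample_idx, y_s, pi_s, h):
    """Estimated CDF at t: smooth the indicator of y <= t over location."""
    indicator = (y_s <= t).astype(float)
    return lpr_total(loc_pop, sample_idx, indicator, pi_s, h) / len(loc_pop)

# Hypothetical population: a 20 x 20 grid of locations on the unit square.
g = np.linspace(0.0, 1.0, 20)
uu, vv = np.meshgrid(g, g, indexing="ij")
loc_pop = np.column_stack([uu.ravel(), vv.ravel()])
y_pop = 10.0 * loc_pop[:, 0] + 5.0 * loc_pop[:, 1] - 3.0   # planar toy response

# Hypothetical sample: every second grid point in each direction (n = 100).
ii, jj = np.meshgrid(np.arange(20), np.arange(20), indexing="ij")
sample_idx = np.flatnonzero(((ii % 2 == 0) & (jj % 2 == 0)).ravel())
pi_s = np.full(sample_idx.size, sample_idx.size / len(loc_pop))
y_s = y_pop[sample_idx]

total_hat = lpr_total(loc_pop, sample_idx, y_s, pi_s, h=0.3)
share_below_zero = lpr_cdf(0.0, loc_pop, sample_idx, y_s, pi_s, h=0.3)
print(total_hat, y_pop.sum(), share_below_zero)
```

Evaluating the estimated CDF at zero mirrors the poster's use of the CDF of ANC at zero as the proportion of acidic lakes. Elevation and ecoregion could enter as additional parametric columns of the local design matrix, but that semiparametric variant is not shown.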
95% LPR confidence interval for the proportion of acidic lakes: (3.0%, 7.5%). This interval contains the National Surface Waters Survey estimate.

Figure: Map of lake population and lakes included in the EMAP Northeastern Lakes survey.

Figure: Illustration of local linear regression. Curves at the bottom of the graph are kernel weights. The solid lines show the local weighted least squares fit at the points of interest. The dotted line is the kernel smooth.

Figure: Illustration of the model mean and standard deviation bounds (left) and the CDF (right) for one of seven generated populations.

Figure: Cumulative distribution function of ANC based on local planar regression (LPR) smooth on spatial location, with 95% pointwise confidence intervals. For comparison, design-based empirical CDF and confidence bounds are also shown.

Figure: CI for Proportion of Acidic Lakes with National Surface Waters Survey Estimate.

For more information, see Everson-Stewart (2003), Nonparametric survey regression estimation in two-stage spatial sampling, unpublished master's project, Colorado State University, available at http://www.stat.colostate.edu/starmap/everson-stewart.report.pdf.

For more information, see Johnson, A. (2003), Estimating Distribution Functions from Survey Data, unpublished master's project, Colorado State University, available at http://www.stat.colostate.edu/starmap/johnsonaa.report.pdf.

The research described in this poster has been funded by the U.S. Environmental Protection Agency (EPA) through Science To Achieve Results (STAR) Program Cooperative Agreements CR-829095, awarded to Colorado State University, and CR-829096, awarded to Oregon State University. The poster has not been subjected to the Agency's review and therefore does not necessarily reflect the views of the Agency, and no official endorsement should be inferred.

Table: Relative biases and mean square error ratios (relative to model-assisted local linear, LLR) for DB (design-based Horvitz-Thompson), CD0 and CD1 (parametric model-based using ratio and regression models), RKM0 and RKM1 (parametric model-assisted using ratio and regression models), and LLRB (local linear model-based).
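The table values themselves are not in this transcript. As a hedged sketch of how such summaries are commonly computed from simulation replicates, the snippet below assumes relative bias is (mean estimate minus truth) divided by truth, and that an MSE ratio divides an estimator's mean square error by that of the reference estimator (LLR in the table). The replicate values are made up purely to show the arithmetic.

```python
import numpy as np

def relative_bias(estimates, truth):
    """(average estimate - true value) / true value."""
    return (np.mean(estimates) - truth) / truth

def mse(estimates, truth):
    """Mean square error of the replicate estimates around the truth."""
    return np.mean((np.asarray(estimates, dtype=float) - truth) ** 2)

def mse_ratio(estimates, reference_estimates, truth):
    """MSE of an estimator relative to a reference estimator."""
    return mse(estimates, truth) / mse(reference_estimates, truth)

# Made-up replicate estimates of a quantity whose true value is 10.
truth = 10.0
llr_reps = np.array([9.0, 11.0, 10.0, 10.0])   # stands in for the LLR reference
db_reps = np.array([12.0, 8.0, 12.0, 8.0])     # stands in for a noisier estimator
mb_reps = np.array([11.0, 11.0, 11.0, 11.0])   # stands in for a biased estimator

print(relative_bias(db_reps, truth), mse_ratio(db_reps, llr_reps, truth))
print(relative_bias(mb_reps, truth), mse_ratio(mb_reps, llr_reps, truth))
```

Under these definitions, a ratio above 1 means the estimator has larger mean square error than the LLR reference.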

