
1 Adjustment Procedures to Account for Nonignorable Missing Data in Environmental Surveys Breda Munoz Virginia Lesser R82-9096-01.




1 Adjustment Procedures to Account for Nonignorable Missing Data in Environmental Surveys
Breda Munoz, Virginia Lesser
R82-9096-01

2 This presentation was supported under STAR Research Assistance Agreement No. CR82-9096-01 awarded by the U.S. Environmental Protection Agency to Oregon State University. It has not been formally reviewed by EPA. The views expressed in this presentation are solely those of the authors, and EPA does not endorse any products or commercial services mentioned in this presentation.

3 Outline
- Missing data in environmental surveys
- Nonignorable missing data mechanism
- Model-based approach for nonignorable missing data
- Design-based estimation and nonignorable missing data
- Illustration
- Summary

4 Missing Data in Environmental Surveys
- Researchers in environmental studies must obtain access to selected sites to gather field data
- Denial of access:
  - common problem in environmental surveys
  - unit non-response
  - affects the results of data analysis

5 Response Disposition: 1995/1996 EMAP North Dakota Prairie Wetlands Studies (Lesser, 2001)

Result                      1995   1996
Private landowners
  Agreed to access           43%    40%
  Refused access             36%    37%
  Undeliverable               2%     2%
  Not returned/no contact    16%    14%
Public land                   3%     7%
Total                       100%   100%

6 Introduction
- The 1995-1997 Maryland Biological Stream Survey reported an overall access denial rate of 10% (Boward et al., 1999)
- ODFW habitat surveys, overall rate of access denial (Flitcroft et al., 2002):
  - 1998: 10.0%
  - 1999: 6.0%
  - 2000: 12.5%

7 Assumptions
- A probability sampling design is used to collect outcomes of a spatial random process Y
- A collection of sampling sites is selected using the probability sampling design
- Auxiliary variables

8 Missing Mechanism: Missing Completely at Random (MCAR)
[Diagram relating X1, X2, Y and R]
Smith, Skinner and Clark (1999), Little and Rubin (2002)

9 Missing Mechanism: Missing at Random (MAR)
[Diagram relating X1, X2, Y and R]
Smith, Skinner and Clark (1999), Little and Rubin (2002)

10 Missing Mechanism: Nonignorable
[Diagram relating X1, X2, Y and R]
Smith, Skinner and Clark (1999), Little and Rubin (2002)
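In terms of the response indicator R, the outcome Y and the auxiliary variables X = (X1, X2), the three mechanisms are usually defined as below. These are the standard textbook definitions for unit nonresponse; the conditional-probability notation is an assumption and is not taken from the slides.

```latex
\begin{align*}
\text{MCAR:} \quad & \Pr(R = 1 \mid Y, X) = \Pr(R = 1) \\
\text{MAR:} \quad & \Pr(R = 1 \mid Y, X) = \Pr(R = 1 \mid X) \\
\text{Nonignorable:} \quad & \Pr(R = 1 \mid Y, X) \text{ depends on } Y \text{ even after conditioning on } X
\end{align*}
```

Under the nonignorable case, adjusting on X alone cannot remove the nonresponse bias, which is why the missing mechanism has to be modeled explicitly.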

11 Model-based Approach
- Under a nonignorable mechanism, we model the joint probability of the data and the missing mechanism indicator (“response” indicator)
- The joint model combines a data model and a missing mechanism model, with covariates
- R(s_i) ~ Bernoulli(p_i)
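One common way to write such a joint model is the selection-model factorization sketched below. It is offered only as a plausible reading of the "data model" and "missing mechanism model" labels; the parameter symbols θ and ψ and the exact conditioning are assumptions, not the slide's notation.

```latex
\[
f\bigl(y, r \mid x; \theta, \psi\bigr)
  = \underbrace{f\bigl(y \mid x; \theta\bigr)}_{\text{data model}}
    \;\underbrace{f\bigl(r \mid y, x; \psi\bigr)}_{\text{missing mechanism model}},
\qquad
R(s_i) \sim \mathrm{Bernoulli}(p_i), \quad
p_i = \Pr\bigl(R(s_i) = 1 \mid y(s_i), x(s_i); \psi\bigr)
\]
```

The mechanism is nonignorable precisely because p_i is allowed to depend on y(s_i), the value that may go unobserved.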

12 Model-assisted estimation and nonignorable missing data
- Assume the parameter of interest is the total of the response Y

13 Model-assisted estimation and nonignorable missing data
- Continuous form of the Horvitz-Thompson estimator for the total (Cordy, 1993)
- Let [...] be a collection of fixed values
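As a rough sketch of the idea, a Horvitz-Thompson estimate of a total weights each observed value by the reciprocal of its inclusion probability; in the continuous form of Cordy (1993) that role is played by the inclusion density at each sampled site. The function name and the example numbers below are hypothetical and serve only as illustration.

```python
def horvitz_thompson_total(values, inclusion):
    """Horvitz-Thompson estimate of a total: sum of y_i / pi_i over the sample."""
    if len(values) != len(inclusion):
        raise ValueError("values and inclusion must have the same length")
    if any(p <= 0 for p in inclusion):
        raise ValueError("inclusion probabilities (or densities) must be positive")
    return sum(y / p for y, p in zip(values, inclusion))


# Hypothetical sample of three sites with unequal inclusion probabilities.
y_sample = [2.0, 4.0, 1.5]
pi_sample = [0.5, 0.25, 0.5]
total_hat = horvitz_thompson_total(y_sample, pi_sample)
```

Sites that were unlikely to be selected receive large weights, so the estimate is unbiased under full response; nonresponse breaks that property, which motivates the adjustments that follow.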

14 Model-assisted estimation (cont.)
- Sample size n: n* observed, n - n* missing (nonignorable missingness)

15 Model-assisted estimation (cont.)
[...] denotes the [...]

16 Model-assisted estimation (cont.)
- Likelihood:

17 Model-assisted estimation (cont.)
- Reparameterize the model parameters as expected cell counts (Baker and Laird, 1988)

18 Model-assisted estimation (cont.)
- Use the EM algorithm to estimate the expected counts of the missing cells, M_ij
- E-step:

19 Model-assisted estimation (cont.)
- M-step: iterative proportional fitting (IPF) (Bishop et al., 1975)
  - Algorithm based on fitting the marginal totals
- The EM algorithm always converges to a solution when IPF is used in the M-step (Baker and Laird, 1988)
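The EM and IPF steps can be sketched for a two-way table of respondents, with rows indexed by an auxiliary class and columns by a class of Y, plus a count of nonrespondents per row. This is a minimal sketch under stated assumptions, not the authors' implementation: the log-linear model [XY][YR] (response depends on the class of Y), the function names and the example counts are all assumptions, and every count is taken to be positive.

```python
def ipf_xy_yr(z, cycles=20):
    """IPF fit of the log-linear model [XY][YR] to a table z[i][j][r]."""
    I, J = len(z), len(z[0])
    mu = [[[1.0, 1.0] for _ in range(J)] for _ in range(I)]
    for _ in range(cycles):
        # Scale the fit to match the X-by-Y margin of z.
        for i in range(I):
            for j in range(J):
                target, current = sum(z[i][j]), sum(mu[i][j])
                scale = target / current if current > 0 else 0.0
                for r in range(2):
                    mu[i][j][r] *= scale
        # Scale the fit to match the Y-by-R margin of z.
        for j in range(J):
            for r in range(2):
                target = sum(z[i][j][r] for i in range(I))
                current = sum(mu[i][j][r] for i in range(I))
                scale = target / current if current > 0 else 0.0
                for i in range(I):
                    mu[i][j][r] *= scale
    return mu


def em_missing_cells(n_obs, m_row, iters=500, tol=1e-9):
    """Estimate expected missing-cell counts M[i][j] from respondent counts
    n_obs[i][j] and nonrespondent row totals m_row[i]."""
    I, J = len(n_obs), len(n_obs[0])
    # Start by spreading each row's nonrespondents evenly over the Y classes.
    M = [[m_row[i] / J] * J for i in range(I)]
    for _ in range(iters):
        # M-step: fit the model to the completed table (r=0 observed, r=1 missing).
        z = [[[float(n_obs[i][j]), M[i][j]] for j in range(J)] for i in range(I)]
        mu = ipf_xy_yr(z)
        # E-step: reallocate each row's nonrespondents in proportion to the fit.
        new_M = []
        for i in range(I):
            row_fit = sum(mu[i][j][1] for j in range(J))
            new_M.append([m_row[i] * mu[i][j][1] / row_fit for j in range(J)])
        change = max(abs(new_M[i][j] - M[i][j]) for i in range(I) for j in range(J))
        M = new_M
        if change < tol:
            break
    return M


# Hypothetical counts: 2 auxiliary classes by 2 classes of Y.
n_obs = [[30, 10], [20, 25]]
m_row = [10, 15]
M = em_missing_cells(n_obs, m_row)
```

Each E-step keeps the nonrespondent row totals fixed, so the rows of `M` always sum to `m_row`; only the split across classes of Y is driven by the model.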

20 Model-assisted estimation (cont.)
- Possible estimators for the total of Y:
  - Cell adjustment, using an adjustment weight (Little and Rubin, 2002)

21 Model-assisted estimation (cont.)
  - Column adjustment:

22 Model-assisted estimation (cont.)
  - Row adjustment:
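One plausible reading of the cell, column and row adjustments is the weighting-class form, in which each respondent's Horvitz-Thompson term is inflated by (respondents + estimated nonrespondents) / respondents for its cell, its column or its row. The sketch below assumes that form; it may differ from the authors' estimators, and all names and numbers are hypothetical.

```python
def adjustment_weights(n_obs, M):
    """Weighting-class adjustment weights from respondent counts n_obs[i][j]
    and estimated missing-cell counts M[i][j] (assumed form, see lead-in)."""
    I, J = len(n_obs), len(n_obs[0])
    cell = [[(n_obs[i][j] + M[i][j]) / n_obs[i][j] for j in range(J)]
            for i in range(I)]
    col_n = [sum(n_obs[i][j] for i in range(I)) for j in range(J)]
    col_m = [sum(M[i][j] for i in range(I)) for j in range(J)]
    column = [(col_n[j] + col_m[j]) / col_n[j] for j in range(J)]
    row = [(sum(n_obs[i]) + sum(M[i])) / sum(n_obs[i]) for i in range(I)]
    return cell, column, row


def adjusted_total(units, weight_for):
    """units: (y, pi, i, j) tuples for respondents; weight_for(i, j) -> weight."""
    return sum(weight_for(i, j) * y / pi for y, pi, i, j in units)


# Hypothetical respondent counts and estimated missing-cell counts.
n_obs = [[30, 10], [20, 25]]
M_hat = [[6.0, 4.0], [5.0, 10.0]]
cell_w, col_w, row_w = adjustment_weights(n_obs, M_hat)

# Two hypothetical respondents: (y, inclusion probability, row, column).
units = [(2.0, 0.5, 0, 0), (4.0, 0.25, 1, 1)]
total_cell = adjusted_total(units, lambda i, j: cell_w[i][j])
total_col = adjusted_total(units, lambda i, j: col_w[j])
total_row = adjusted_total(units, lambda i, j: row_w[i])
```

The row weights depend only on the observed nonrespondent totals, whereas the cell and column weights rely on the model-based split of those totals across classes of Y.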

23 Model-assisted estimation (cont.)
- Variance estimators are obtained using the bootstrap
- The bootstrap produces asymptotically valid variance estimates (Efron, 1994)

24 Illustration
- We simulate a continuous multivariate normal spatial random process for y
- Population: John Day Middle Fork stream reaches
  - 143 stream reaches divided into survey segments (~1 mile)
  - 6536 survey segments
  - Area of 785 mi²

25 Illustration
- The population of stream reaches was stratified into 6 strata based on the number of survey segments: “<10”, “10-20”, “20-30”, “30-50”, “50-100”, “>100”
- Nonignorable missing data were generated as: [...]
- Missing rates of 15%, 30% and 50% were created
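The sketch below is not the authors' generating rule. It shows one common way to produce nonignorable missingness at a chosen rate: the response probability follows a logistic function of y itself, and the intercept is tuned by bisection until the average missing probability hits the target. The logistic form, the slope, the simulated y values and all names are assumptions.

```python
import math
import random


def response_probability(y, intercept, slope):
    """Probability of responding, as a logistic function of the outcome y."""
    return 1.0 / (1.0 + math.exp(-(intercept + slope * y)))


def calibrate_intercept(ys, slope, target_missing, lo=-50.0, hi=50.0):
    """Bisection on the intercept so the mean missing probability hits the target."""
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        missing = sum(1.0 - response_probability(y, mid, slope) for y in ys) / len(ys)
        if missing > target_missing:
            lo = mid  # too much missingness: raise the intercept
        else:
            hi = mid
    return 0.5 * (lo + hi)


rng = random.Random(1)
ys = [rng.gauss(1.7, 1.5) for _ in range(2000)]  # hypothetical outcomes
slope = -0.8  # assumed: larger y makes access denial more likely

intercepts = {rate: calibrate_intercept(ys, slope, rate) for rate in (0.15, 0.30, 0.50)}
responded = {
    rate: [rng.random() < response_probability(y, a, slope) for y in ys]
    for rate, a in intercepts.items()
}
```

Because the response probability depends on y, respondents are a biased subset of the sample, and the bias grows with the missing rate.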


27 Population Summary

            Stratum 1  Stratum 2  Stratum 3  Stratum 4  Stratum 5  Stratum 6
Size              246        433        269       1059       1208       3321
Class 1        64.23%     65.13%     64.31%     65.44%     65.48%     61.70%
Class 2        35.77%     34.87%     35.69%     34.56%     34.52%     38.30%
Minimum         -2.07      -2.99      -3.96      -2.18      -2.37      -5.47
Mean             1.63       1.68       1.66       1.70       1.73       1.80
Max              7.01       7.95       8.04       6.15       8.65       9.87

28 Illustration
- Sample size n = 100
- Allocation proportional to the number of survey segments in each stratum
- Q_1 = first sample quantile
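Proportional allocation of n = 100 over the six strata can be sketched as follows, using the stratum sizes from the population summary (246, 433, 269, 1059, 1208 and 3321 survey segments). The rounding rule (largest remainders) is an assumption, since the slides do not say how fractional allocations were resolved.

```python
def proportional_allocation(stratum_sizes, n):
    """Allocate n sample units proportionally to stratum sizes,
    resolving fractions by the largest-remainder rule (an assumption)."""
    total = sum(stratum_sizes)
    exact = [n * size / total for size in stratum_sizes]
    alloc = [int(x) for x in exact]  # floor, since all values are non-negative
    leftover = n - sum(alloc)
    # Give the remaining units to the strata with the largest fractional parts.
    by_remainder = sorted(range(len(exact)),
                          key=lambda h: exact[h] - alloc[h],
                          reverse=True)
    for h in by_remainder[:leftover]:
        alloc[h] += 1
    return alloc


segments = [246, 433, 269, 1059, 1208, 3321]
allocation = proportional_allocation(segments, 100)
```

Roughly half of the sample falls in the largest stratum, mirroring its share of the 6536 survey segments.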


30 Modified Bootstrap
- We draw 1000 random samples of size 100 from the observed sample:
  - Independently across strata
  - Maintain proportional allocation
  - Maintain the row totals by the auxiliary variable
- For each of the 1000 samples, we estimate the total
- We obtain a standard error and MSE for each estimate
- We repeat this process 1000 times
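The resampling scheme can be sketched as a grouped bootstrap: units are resampled with replacement inside each group, independently across groups, and every group keeps its original size. Grouping by stratum preserves the allocation; grouping by (stratum, auxiliary class) would also hold the row totals fixed. This is a simplified sketch, not the authors' code, and the function names, the plug-in estimator and the sample values are hypothetical.

```python
import random
import statistics


def grouped_bootstrap(sample_by_group, estimator, n_boot=1000, seed=0):
    """Resample with replacement within each group, keeping group sizes fixed,
    and return the estimate computed on each bootstrap sample."""
    rng = random.Random(seed)
    estimates = []
    for _ in range(n_boot):
        resample = {
            group: [rng.choice(units) for _ in range(len(units))]
            for group, units in sample_by_group.items()
        }
        estimates.append(estimator(resample))
    return estimates


def standard_error(estimates):
    """Bootstrap standard error: standard deviation of the replicate estimates."""
    return statistics.stdev(estimates)


def mean_squared_error(estimates, true_value):
    """MSE around a known true value (available here because y is simulated)."""
    return sum((e - true_value) ** 2 for e in estimates) / len(estimates)


# Hypothetical observed values in two strata and their population sizes.
sample = {"stratum 1": [1.2, 0.4, 2.9], "stratum 2": [3.1, 1.8, 2.2, 0.7]}
sizes = {"stratum 1": 246, "stratum 2": 433}


def expansion_total(resample):
    # Stratum size times stratum mean, summed over strata.
    return sum(sizes[g] * sum(v) / len(v) for g, v in resample.items())


boot = grouped_bootstrap(sample, expansion_total, n_boot=1000, seed=42)
se = standard_error(boot)
```

In the study, the estimator passed in would be one of the adjusted totals rather than the plain expansion total used here.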

31 Summary

