Multiple Imputation Methods for Imputing Earnings in the Survey of Income and Program Participation (SIPP) María García, Chandra Erdman, and Ben Klemens.

1 Multiple Imputation Methods for Imputing Earnings in the Survey of Income and Program Participation (SIPP) María García, Chandra Erdman, and Ben Klemens

2 Outline
• Background on the Survey of Income and Program Participation (SIPP)
• Methods for missing data imputation
  - Randomized hot deck
  - SRMI
• Simulation study
• Evaluation
• Concluding remarks

3 Background on the SIPP
• Longitudinal survey, data collected in panels with interviews at set frequencies (2-4 years)
• Demographic characteristics, assets, liabilities, labor force participation, earnings, etc.
• Provide comprehensive information about income and program participation
• Evaluate federal, state, and local programs and provide measures of economic well-being

4 Background on the SIPP
• Hot deck for most missing data imputation
• Recent major redesign
• Research ways to improve data processing
  - Explore alternative imputation methods
  - Focus on missing monthly job-level earnings (twelve variables)
  - Sequential Regression Multivariate Imputation (SRMI, Raghunathan et al., 2001)

5 Sequential Regression Multivariate Imputation (SRMI)

6 SRMI

7 Simulation Study
• SRMI
  - R package mi (Su et al., 2011)
  - Job-level earnings indicator: imputed with logistic regression
  - Where the monthly earnings indicator is imputed as positive, the corresponding missing earnings are imputed using SRMI
• Hot deck
  - TEA’s randomized hot deck (Klemens, 2012)
• Multiple imputation
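The two-step procedure on this slide (an indicator model followed by an amount model) can be sketched as below. This is a minimal illustration in Python, not the authors' R code and not the mi package: the single covariate `hours_scaled`, the log-linear amount model, the helper names, and the gradient-ascent fit are all assumptions made for the example. It also conditions on point estimates instead of drawing model parameters, so it understates imputation uncertainty relative to a full SRMI run.

```python
import math
import random


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def fit_logistic(xs, ys, steps=500, lr=0.5):
    """Fit P(y = 1 | x) = sigmoid(a + b * x) by gradient ascent on the log-likelihood."""
    a = b = 0.0
    n = len(xs)
    for _ in range(steps):
        grad_a = sum(y - sigmoid(a + b * x) for x, y in zip(xs, ys)) / n
        grad_b = sum((y - sigmoid(a + b * x)) * x for x, y in zip(xs, ys)) / n
        a += lr * grad_a
        b += lr * grad_b
    return a, b


def fit_line(xs, ys):
    """Ordinary least squares for y = c + d * x; returns (c, d, residual_sd)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    d = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    c = my - d * mx
    rss = sum((y - (c + d * x)) ** 2 for x, y in zip(xs, ys))
    return c, d, math.sqrt(rss / max(n - 2, 1))


def impute_two_stage(records, rng):
    """Return a completed copy of records; 'earnings' is None where missing.

    Hypothetical record layout: {'hours_scaled': float, 'earnings': float or None}.
    """
    observed = [r for r in records if r["earnings"] is not None]
    # Stage 1: model whether the job has positive earnings in the month.
    a, b = fit_logistic([r["hours_scaled"] for r in observed],
                        [1 if r["earnings"] > 0 else 0 for r in observed])
    # Stage 2: model log earnings among observed positive earners only.
    positive = [r for r in observed if r["earnings"] > 0]
    c, d, sd = fit_line([r["hours_scaled"] for r in positive],
                        [math.log(r["earnings"]) for r in positive])
    completed = []
    for r in records:
        if r["earnings"] is not None:
            completed.append(dict(r))  # observed values are never altered
        elif rng.random() < sigmoid(a + b * r["hours_scaled"]):
            draw = c + d * r["hours_scaled"] + rng.gauss(0.0, sd)
            completed.append({**r, "earnings": math.exp(draw)})
        else:
            completed.append({**r, "earnings": 0.0})
    return completed
```

Calling `impute_two_stage` several times with fresh random draws yields several completed data sets, which is the starting point for the multiple imputation listed on the slide.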

8 Simulation Study
• Simulation data
  - Complete 2004 SIPP panel data, treated as “true”
  - Randomly select multiple sets of 10% of observations for which the job-level earnings are to be set to missing (100 repetitions)
• Explanatory variables
  - Age, sex, race, education, occupation, industry, firm size, job type, hours, lead, lag, etc.
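The masking step described on this slide can be sketched as follows. The record layout (a list of dicts with an `earnings` field) and the function names are assumptions for illustration; the study works with twelve monthly job-level earnings variables, whereas this sketch masks a single field.

```python
import random


def mask_earnings(records, fraction, rng):
    """Return (masked copy, indices set to missing) with `fraction` of rows masked."""
    n_missing = round(fraction * len(records))
    chosen = set(rng.sample(range(len(records)), n_missing))
    masked = [{**r, "earnings": None} if i in chosen else dict(r)
              for i, r in enumerate(records)]
    return masked, chosen


def replicate_masks(records, fraction=0.10, repetitions=100, seed=0):
    """Draw independent missingness patterns from the same complete data."""
    rng = random.Random(seed)
    return [mask_earnings(records, fraction, rng) for _ in range(repetitions)]
```

Because the complete data are left untouched, every imputed value can later be compared with the value that was masked.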

9 Average Difference in RMSE (SRMI – Hot Deck)
Month  Mean_Diff  SE_Diff
Jan        -1257      141
Feb         -944      189
Mar        -2778      150
Apr        -1517      122
May        -2029       69
Jun        -2330       54
Jul        -2327       84
Aug        -2617       91
Sep        -2041      187
Oct        -4370      399
Nov        -1369      403
Dec        -1314      122
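One plausible reading of Mean_Diff and SE_Diff is the mean of the per-repetition RMSE differences and its standard error over the simulated data sets. The slide does not define the columns, so the pairing by repetition and the standard-error formula below are assumptions, and the function name is hypothetical.

```python
import math
from statistics import mean, stdev


def summarize_paired_difference(rmse_srmi, rmse_hot_deck):
    """Mean and standard error of (SRMI - hot deck) RMSE across repetitions."""
    if len(rmse_srmi) != len(rmse_hot_deck):
        raise ValueError("both methods need one RMSE per repetition")
    diffs = [a - b for a, b in zip(rmse_srmi, rmse_hot_deck)]
    # Standard error of the mean difference: sample SD over sqrt(repetitions).
    return mean(diffs), stdev(diffs) / math.sqrt(len(diffs))
```

Under this reading, a negative mean difference indicates a lower RMSE for SRMI than for the hot deck.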

10 Between-Imputation, Within-Imputation, and Total Variance of Mean Monthly Earnings for Some Months
                SRMI                            Hot deck
Month  Variance     Mean      SE     Low     Upp    Mean      SE     Low     Upp
Jan    Between     12.30    1.02   10.29   14.31  115.05   11.52   92.45  137.65
       Within     576.06    8.98  558.45  593.67  664.64   39.28  587.57  741.70
       Total      591.44    9.11  573.57  609.31  808.45   45.75  718.69  898.21
Apr    Between      7.08    0.63    5.85    8.31   64.63    7.34   50.22   79.03
       Within     337.57    2.75  332.18  342.96  365.80   24.71  317.32  414.27
       Total      346.42    2.96  340.62  352.22  446.58   32.23  383.35  509.81
Aug    Between      5.56    0.47    4.64    6.48  106.28    7.97   90.64  121.93
       Within     537.56    3.37  530.94  544.18  675.71   77.98  522.72  828.70
       Total      544.51    3.42  537.79  551.22  808.57   77.95  655.63  961.51
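The between-imputation, within-imputation, and total variances in this table are the quantities used in Rubin's combining rules for multiple imputation, where total = within + (1 + 1/m) × between for m imputations. The slides do not state the formula or m, so treating the table this way is an assumption; the reported values are at least consistent with it for m = 4 (for January under SRMI, 576.06 + 1.25 × 12.30 ≈ 591.44). A minimal sketch, with a hypothetical function name:

```python
from statistics import mean, variance


def combine_rubin(estimates, variances):
    """Combine m completed-data estimates and their variances.

    Returns (pooled estimate, between, within, total).
    """
    m = len(estimates)
    if m < 2 or m != len(variances):
        raise ValueError("need at least two imputations with one variance each")
    pooled = mean(estimates)
    within = mean(variances)       # average completed-data variance
    between = variance(estimates)  # sample variance of the estimates (divisor m - 1)
    total = within + (1.0 + 1.0 / m) * between
    return pooled, between, within, total
```

The between component reflects uncertainty due to the missing values themselves, which a single imputation cannot estimate.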

11 RMSE of Mean Monthly Earnings
Month     SRMI  Hot Deck  Difference
Jan      15.47     22.15       -6.67
Feb      17.90     19.95       -2.05
Mar      27.74     25.19        2.54
Apr      22.80     22.95       -0.15
May      18.64     20.87       -2.24
Jun      25.85     17.50        8.35
Jul       9.25     20.73      -11.48
Aug      23.08     24.27       -1.19
Sep      20.03     20.70       -0.67
Oct      33.66     67.33      -33.66
Nov      13.27     27.83      -14.55
Dec      38.87     26.70       11.76
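If the RMSE here is taken over the simulation repetitions, comparing each repetition's estimate of mean monthly earnings with the value from the complete data, it can be computed as below. That definition is an assumption, since the slide does not spell it out, and the function name is hypothetical.

```python
import math


def rmse_of_estimates(estimates, truth):
    """Root mean squared error of repeated estimates around a known true value."""
    return math.sqrt(sum((e - truth) ** 2 for e in estimates) / len(estimates))
```

The Difference column then appears to be the SRMI value minus the hot deck value for each month, so negative entries favor SRMI.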

12 Concluding Remarks
• Results show that the model-based approach to imputation is a feasible alternative to hot deck for imputing missing values in the SIPP and should be explored further.
• The model can incorporate more information than the hot deck without depleting the donor pool.
• Possibility to use any available auxiliary information (e.g., administrative data).
• Set up the model in a multiple imputation environment so we can estimate variances.
• Disadvantage of using package mi for SRMI: computationally intensive.
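The donor-pool remark can be illustrated with a generic randomized hot deck: each missing value is filled from a randomly chosen donor in the same cell, and cells are defined by cross-classifying the matching variables. This is not TEA's implementation; the record fields and the function name are hypothetical. Adding variables makes the cells smaller, so some recipients end up with no donor at all.

```python
import random
from collections import defaultdict


def randomized_hot_deck(records, keys, rng):
    """Fill missing 'earnings' from a random donor sharing the same values of `keys`.

    Returns (completed copy, number of recipients left without a donor).
    """
    donors = defaultdict(list)
    for r in records:
        if r["earnings"] is not None:
            donors[tuple(r[k] for k in keys)].append(r["earnings"])
    completed, unmatched = [], 0
    for r in records:
        if r["earnings"] is not None:
            completed.append(dict(r))
            continue
        pool = donors.get(tuple(r[k] for k in keys))
        if pool:
            completed.append({**r, "earnings": rng.choice(pool)})
        else:
            completed.append(dict(r))  # empty cell: value stays missing
            unmatched += 1
    return completed, unmatched
```

Running it once with `keys=["sex"]` and once with `keys=["sex", "occupation"]` on the same records shows the trade-off: finer cells give closer matches but leave more recipients unmatched, whereas a regression model can use both variables for every record.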

13 Thank you! maria.m.garcia@census.gov

