1 SIPP IMPUTATION SCHEME AND DISCUSSION ITEMS Presenters: Nat McKee - Branch Chief Census Bureau Demographic Surveys Division (DSD) Income Surveys Programming.

Presentation transcript:

1 SIPP IMPUTATION SCHEME AND DISCUSSION ITEMS
Presenters:
Nat McKee - Branch Chief, Census Bureau Demographic Surveys Division (DSD), Income Surveys Programming Branch (SIPP)
Zelda McBride - Supervisor, Census Bureau Demographic Surveys Division (DSD), Income Surveys Programming Branch (SIPP)
ASA/SRM SIPP WORKING GROUP MEETING
September 16, 2008

2 OVERVIEW OF IMPUTATION
TYPES OF MISSING DATA
Item Non-Response: refusals, blanks, don’t know, incompatible answers
– Handled via hot deck imputation
Unit Non-Response: person-level non-interviews or insufficient partial interviews
– Handled via Type Z and/or hot deck imputation

3 HOT DECK OVERVIEW
File is sorted geographically, so allocated data are likely to come from a geographically proximate case
Replace missing data items with reported data from another similar person/household

4 EDITING STEPS
Before Pass 1 – cold (initial) values are in the decks; missing data are not imputed yet
Pass 1 – cold values are replaced by the live hot data, but the edits are not saved
Pass 2 – the last values updated in Pass 1 are the starting values for the edit pass
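
The two passes on the editing-steps slide can be sketched as follows. This is a minimal illustration, not the production edit: it assumes a single hot deck cell, and the names `two_pass_edit` and `COLD_VALUE` (and the cold value 6) are hypothetical.

```python
from typing import Optional

COLD_VALUE = 6  # hypothetical cold (initial) value for a single deck cell


def two_pass_edit(reported: list[Optional[int]]) -> list[int]:
    """Return edited values in file order; None marks a missing item."""
    deck = COLD_VALUE  # before Pass 1: the deck holds the cold value

    # Pass 1: reported values overwrite the deck; no edited output is kept.
    for value in reported:
        if value is not None:
            deck = value

    # Pass 2: start from the deck as Pass 1 left it and save the edits.
    edited = []
    for value in reported:
        if value is not None:
            deck = value      # reported data refreshes the deck
        edited.append(deck)   # a missing item takes the current deck value
    return edited
```

Under these assumptions, a missing item at the very top of the file receives the last value reported in Pass 1 rather than the cold value; the cold value is used only when no case in the file reported the item.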

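5 GENDER X AGE CATEGORIES – INITIAL VALUES
Example item: What did you have for lunch today? 1-Hamburger 2-Yogurt 3-Salad 4-Chicken 5-Roast Beef 6-Other
[Table: initial (cold) values by age category (rows; the first row label begins "1. Under") and gender (columns: Male, Female); cell values not recoverable]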

6 VALUES AFTER PASS 1 BEFORE EDITING
[Table: hot deck values after Pass 1 for example respondents Nat, Tracy, Zelda, Jeff, and Martha; cell values not recoverable]

7 COUNTERS FOR DONOR USAGE
[Table: donor usage counters by gender (columns: M, F); values not recoverable]

8 IMPUTING FOR MISSING DATA
Process sequentially by unit for each section: demographics, household characteristics, labor force, assets, general income, health insurance, and program participation
If the item is not missing, the reported value replaces the hot deck value
If the item is missing, it takes the last hot deck value and the counter is incremented
Rerunning the same edit/imputation program gives the same results each time (i.e., rerun with no changes: same donors, same results)
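
A sequential hot deck with donor-usage counters, as described on this slide, might look like the sketch below. It shows only the edit pass. The function name `hot_deck`, its parameters, and the choice to keep one running counter per cell (never reset when a new donor arrives) are assumptions made for illustration; the slides do not say how the counters are kept.

```python
from collections import defaultdict


def hot_deck(records, cell_of, item, cold_values):
    """Impute `item` in place, processing records in file order.

    records     -- list of dicts, already sorted (e.g. geographically)
    cell_of     -- function mapping a record to its hot deck cell
    cold_values -- dict giving the starting value for each cell
    Returns a dict counting how often each cell supplied a value.
    """
    deck = dict(cold_values)      # copy so the caller's cold values survive
    uses = defaultdict(int)
    for rec in records:
        cell = cell_of(rec)
        if rec[item] is not None:
            deck[cell] = rec[item]    # reported value replaces the deck value
        else:
            rec[item] = deck[cell]    # missing: take the last deck value
            uses[cell] += 1           # and increment the counter
    return dict(uses)
```

Because the procedure uses no random draws and depends only on file order, running it again on the same sorted input picks the same donors, which is the reproducibility property the slide describes.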

9 IMPUTATION MATRICES
Each matrix is defined with stratifying parameters relevant to the item
Sex, race, and age (in categories) are used frequently in matrices
Other specialized variables are used where relevant; for example, when imputing class of worker, a recode of industry is used in the matrix
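
One way to picture an imputation matrix is as a lookup keyed by the stratifying variables. The sketch below builds such a key from sex, race, and an age category. The age break points, the field names, and the function names are hypothetical; the slides do not give the actual categories.

```python
import bisect

# Hypothetical age categories: under 15, 15-24, 25-44, 45-64, 65 and over.
AGE_BREAKS = [15, 25, 45, 65]


def age_category(age: int) -> int:
    """Return the 0-based index of the age category containing `age`."""
    return bisect.bisect_right(AGE_BREAKS, age)


def matrix_cell(person: dict) -> tuple:
    """Return the hot deck cell (sex, race, age category) for a person."""
    return (person["sex"], person["race"], age_category(person["age"]))
```

A specialized matrix would add or substitute variables in the tuple, for example an industry recode when imputing class of worker.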

10 USING PREVIOUS WAVE DATA
Waves 2+ sometimes use previous wave data as a parameter in the hot deck
Advantage – more consistency from wave to wave
Disadvantage – a particular donor has the potential to influence every wave

11 ALLOCATION FLAGS
0 – no imputation (initialized)
1 – hot deck imputation
2 – set to cold value
3 – logical (derived)
4 – used previous wave data
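
For analysts reading the flags in code, the five values can be wrapped in an enumeration. The numeric codes come from the slide; the member names and the helper `was_allocated` are hypothetical, and treating every nonzero flag as "allocated" is an assumption.

```python
from enum import IntEnum


class AllocationFlag(IntEnum):
    NOT_IMPUTED = 0     # no imputation (initialized)
    HOT_DECK = 1        # hot deck imputation
    COLD_VALUE = 2      # set to cold value
    LOGICAL = 3         # logical (derived)
    PREVIOUS_WAVE = 4   # used previous wave data


def was_allocated(flag: int) -> bool:
    """True for any flag other than 0; raises ValueError for unknown codes."""
    return AllocationFlag(flag) is not AllocationFlag.NOT_IMPUTED
```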

12 TYPE Z NONINTERVIEW
Type Z Noninterview = Noninterviewed Person Within Interviewed Household
[Table: EPPINTVW (Wave 3), Frequency and Percent by category; codes and values not recoverable]
– Noninterview in all 4 months
– Interview (Self)
– Interview (Proxy)
– Non-Interview - Type Z
– Non-Interview - Pseudo Type Z
– Children under 15 during ref period

13 TYPE Z IMPUTATION
Type Z Imputation = hierarchical sorting and merging operation that matches type Z noninterviews with respondents based on demographic characteristics available for both
Imputes the entire record from a single donor

14 ELIGIBILITY FOR TYPE Z IMPUTATION
Type Z noninterview in Wave 1, or in Wave 2+ with no previous wave information available
[Table: Type Z Eligibility, TYPZIMP (Wave 3), Frequency and Percent for Not Eligible and Eligible; values not recoverable]

15 ELIGIBILITY FOR TYPE Z DONORS
Interview or sufficient partial interview
Sufficient partial = reached the first asset question (completed Demographics, Labor Force Recipiency, General Income Recipiency, and Asset Intro)

16 TYPE Z PROCESS
Determine if person is type Z or donor
Create separate files for type Z and donors

17 TYPE Z PROCESS - CONTINUED
Create 4 levels of match keys for each person on both files
– Match keys are based on rotation group plus various demographic variables: age, race, sex, veteran status, marital status, relationship to reference person, educational attainment, parental status, spouse’s interview status
– Level 1 keys are the most restrictive, level 4 the least (designed to always find a match)

18 TYPE Z PROCESS - CONTINUED
Sort both files by match keys
Match files
Select best match for each type Z case
– Level 1 match = best, level 4 = worst
Transfer data from donor record to type Z record for matched cases
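
The hierarchical match in the type Z process can be sketched as below. The slides describe a sort-and-merge of two files; this illustration swaps in dictionary lookups, which give the same "most restrictive level first" behaviour on small inputs. The function name, the field names, the contents of `KEY_LEVELS`, and the tie-break (first donor in file order) are all hypothetical.

```python
# Hypothetical match keys, most restrictive (level 1) to least (level 4).
KEY_LEVELS = [
    ("rot", "sex", "race", "age_cat", "marital"),
    ("rot", "sex", "race", "age_cat"),
    ("rot", "sex", "age_cat"),
    ("rot",),
]


def match_type_z(type_z, donors, key_levels=KEY_LEVELS):
    """Pair each type Z record with a donor at the best available level.

    Returns a list of (type_z_record, donor_record, level) tuples, with
    level counted from 1; donor and level are None if nothing matches.
    """
    # Index donors once per level; the first donor in file order wins ties.
    indexes = []
    for fields in key_levels:
        index = {}
        for donor in donors:
            index.setdefault(tuple(donor[f] for f in fields), donor)
        indexes.append((fields, index))

    matches = []
    for person in type_z:
        found = (person, None, None)
        for level, (fields, index) in enumerate(indexes, start=1):
            donor = index.get(tuple(person[f] for f in fields))
            if donor is not None:
                found = (person, donor, level)
                break             # stop at the most restrictive match
        matches.append(found)
    return matches
```

The final step on the slide, transferring data from the donor record to the type Z record, would follow for each matched pair and is left out here.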

19 LITTLE TYPE Z
Used in the labor force edit to get job and labor force data from a donor

20 QUESTIONS?

21 DISCUSSION ISSUES ON HOW TO IMPROVE CURRENT IMPUTATIONS
1. What do we gain by doing type Z imputations vs. hot deck imputations? What are the trade-offs?
2. What is the threshold (or how should a threshold be determined) for identifying hot-deck overuse for a particular donor/cell? Does this need to be adjusted as the sample size changes (as in the case of a sample cut)?

22 DISCUSSION ISSUES ON HOW TO IMPROVE CURRENT IMPUTATIONS (CONTINUED)
3. What is the threshold (or how should a threshold be determined) for determining cold-deck overuse?
4. How do we determine the optimum size for a particular hot deck? Is there a relationship between the number of cells in a hot deck matrix and the number of cases in the universe?

23 DISCUSSION ISSUES ON HOW TO IMPROVE CURRENT IMPUTATIONS (CONTINUED)
5. Currently, we do not distinguish between reported data and imputed data in the stratifying variables for particular hot decks. Do we need to be concerned about this?
6. Is there an objective, simple way to choose stratifying variables in a hot deck?
7. What methods/criteria should be used to determine the quality of imputations?