1 Introduction to Survey Data Analysis Linda K. Owens, PhD Assistant Director for Sampling & Analysis Survey Research Laboratory University of Illinois.

Presentation transcript:

1 Introduction to Survey Data Analysis
Linda K. Owens, PhD, Assistant Director for Sampling & Analysis, Survey Research Laboratory, University of Illinois at Chicago

2 Focus of the seminar
• Data cleaning/missing data
• Sampling bias reduction

3 When analyzing survey data
1. Understand & evaluate survey design
2. Screen the data
3. Adjust for sampling design

4 1. Understand & evaluate survey
• Conductor of survey
• Sponsor of survey
• Measured variables
• Unit of analysis
• Mode of data collection
• Dates of data collection

5 1. Understand & evaluate survey
• Geographic coverage
• Respondent eligibility criteria
• Sample design
• Sample size & response rate

6 Levels of measurement
• Nominal
• Ordinal
• Interval
• Ratio

7 2. Data screening
Examine raw frequency distributions for:
(a) out-of-range values (outliers)
(b) missing values

8 2. Data screening
Out-of-range values:
• Delete data
• Recode values
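The screening step above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the item is a hypothetical 1-5 scale, 9 is assumed to be the "refused" code, and `screen` and `recode` are made-up helper names, not part of any survey package.

```python
from collections import Counter

# Hypothetical raw responses to a 1-5 agreement item.
# Assumptions: 9 is the "refused" code, None is a skipped item.
responses = [1, 2, 2, 5, 9, 3, None, 7, 4, 2]
VALID = {1, 2, 3, 4, 5}
MISSING_CODES = {9}

def screen(values, valid, missing_codes):
    """Return raw frequencies, the count of missing values, and out-of-range values."""
    freq = Counter(values)
    missing = sum(n for v, n in freq.items() if v is None or v in missing_codes)
    out_of_range = {
        v: n for v, n in freq.items()
        if v is not None and v not in valid and v not in missing_codes
    }
    return freq, missing, out_of_range

def recode(values, valid):
    """Recode every value outside the valid set to None (treated as missing)."""
    return [v if v in valid else None for v in values]

freq, missing, out_of_range = screen(responses, VALID, MISSING_CODES)
cleaned = recode(responses, VALID)
```

Here the out-of-range value 7 is recoded to missing rather than deleting the whole case, which is one of the two options listed on the slide.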

9 Missing data:
• can reduce effective sample size
• may introduce bias

10 Reasons for missing data
• Refusals (question sensitivity)
• Don't know responses (cognitive problems, memory problems)
• Not applicable
• Data processing errors
• Questionnaire programming errors
• Design factors
• Attrition in panel studies

11 Effects of ignoring missing data
• Reduced sample size: loss of statistical power
• Data may no longer be representative: introduces bias
• Difficult to identify effects

12 Assumptions on missing data
• Missing completely at random (MCAR)
• Missing at random (MAR)
• Ignorable
• Nonignorable

13 Missing completely at random (MCAR)
• Missingness is independent of all variables.
• Cases with complete data are indistinguishable from cases with missing data.
• Missing cases are a random subsample of the original sample.

14 Missing at random (MAR)
• The probability of a variable being observed is independent of the true value of that variable, controlling for one or more other variables.
• Example: Probability of missing income is unrelated to income within levels of education.
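The income and education example can be made concrete with a small, deliberately contrived data set. In this sketch (all values are hypothetical) income is missing only for some high-education respondents, and the missing cases are chosen so that the observed mean within each education level equals the true mean for that level, which is what MAR implies on average.

```python
from statistics import mean

# (education level, true income in $1000s, income observed?) -- hypothetical data
records = [
    ("low", 20, True), ("low", 30, True), ("low", 40, True), ("low", 50, True),
    ("high", 60, False), ("high", 80, True), ("high", 100, True), ("high", 120, False),
]

true_mean = mean(income for _, income, _ in records)

# Complete-case mean: ignores education, so low-education cases are over-represented.
complete_case_mean = mean(income for _, income, observed in records if observed)

# Adjusted mean: average within education levels, then weight each level
# by its share of the full sample.
groups = {}
for education, income, observed in records:
    group = groups.setdefault(education, {"n": 0, "observed": []})
    group["n"] += 1
    if observed:
        group["observed"].append(income)

adjusted_mean = sum(g["n"] * mean(g["observed"]) for g in groups.values()) / len(records)
```

The complete-case mean understates the true mean here, while conditioning on education recovers it.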

15 Ignorable missing data
• The data are MAR.
• The missing data mechanism is unrelated to the parameters we want to estimate.

16 Nonignorable missing data
• The pattern of data missingness is non-MAR: the probability that a value is missing depends on the unobserved value itself.

17 Methods of handling missing data
• Listwise (casewise) deletion: uses only complete cases
• Pairwise deletion: uses all available cases
• Dummy variable adjustment: missing value indicator method
• Mean substitution: substitute mean value computed from available cases (cf. unconditional or conditional)
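A minimal Python sketch of three of these methods follows, using a hypothetical four-case data set in which `None` marks a missing value. The function names are illustrative, and the mean substitution shown is the unconditional variant.

```python
from statistics import mean

# Hypothetical cases; None marks a missing value.
rows = [
    {"age": 25, "income": 30, "hours": 40},
    {"age": 35, "income": None, "hours": 45},
    {"age": 45, "income": 50, "hours": None},
    {"age": 55, "income": 70, "hours": 35},
]
VARIABLES = ("age", "income", "hours")

def listwise(rows):
    """Keep only cases that are complete on every variable."""
    return [r for r in rows if all(r[v] is not None for v in VARIABLES)]

def pairwise(rows, x, y):
    """Keep every case that is complete on the two variables being analyzed."""
    return [(r[x], r[y]) for r in rows if r[x] is not None and r[y] is not None]

def mean_substitute(rows, var):
    """Replace missing values of var with the mean of the available cases."""
    fill = mean(r[var] for r in rows if r[var] is not None)
    return [dict(r, **{var: fill if r[var] is None else r[var]}) for r in rows]

complete = listwise(rows)
age_income = pairwise(rows, "age", "income")
filled = mean_substitute(rows, "income")
```

Listwise deletion leaves two cases, while each pairwise analysis retains three, which is why pairwise results can rest on different subsamples from one statistic to the next.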

18 Methods of handling missing data
• Regression methods: predict value based on regression equation with other variables as predictors
• Hot deck: identify the case most similar to the case with a missing value and impute that case's value
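Both ideas can be sketched with one predictor. The data are hypothetical (years of education and income), the regression is ordinary least squares fitted on the complete cases, and the hot deck shown is a nearest-neighbor variant chosen for brevity; production hot-deck procedures often draw donors at random within imputation classes instead.

```python
# (years of education, income in $1000s) -- hypothetical; None marks missing income
cases = [(10, 25), (12, 31), (13, None), (16, 43), (18, 49)]
complete = [(x, y) for x, y in cases if y is not None]

def fit_line(pairs):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(pairs)
    mean_x = sum(x for x, _ in pairs) / n
    mean_y = sum(y for _, y in pairs) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in pairs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope

def regression_impute(x, pairs):
    """Predict the missing value from the fitted regression equation."""
    intercept, slope = fit_line(pairs)
    return intercept + slope * x

def hot_deck_impute(x, pairs):
    """Borrow the value of the donor whose predictor is closest to x."""
    _, donor_value = min(pairs, key=lambda pair: abs(pair[0] - x))
    return donor_value

regression_value = regression_impute(13, complete)
hot_deck_value = hot_deck_impute(13, complete)
```

The regression imputes a value on the fitted line, whereas the hot deck always imputes a value that some respondent actually reported.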

19 Methods of handling missing data
• Maximum likelihood methods: use all available data to generate maximum likelihood-based statistics.

20 Methods of handling missing data
• Multiple imputation: combines the methods of ML to produce multiple data sets with imputed values for missing cases
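Once the multiple data sets exist, each is analyzed separately and the results are combined. The slide does not describe that step; the sketch below shows the pooling formulas commonly called Rubin's rules, with hypothetical estimates and variances from three imputed data sets.

```python
from statistics import mean, variance

def pool(estimates, variances):
    """Combine per-data-set results (Rubin's rules; not taken from the slides).

    estimates: the point estimate from each imputed data set
    variances: the squared standard error from each imputed data set
    """
    m = len(estimates)
    pooled_estimate = mean(estimates)
    within = mean(variances)        # average sampling variance
    between = variance(estimates)   # variance across imputations (m - 1 denominator)
    total_variance = within + (1 + 1 / m) * between
    return pooled_estimate, total_variance

# Hypothetical results from m = 3 imputed data sets.
pooled_estimate, total_variance = pool([10.0, 12.0, 11.0], [4.0, 5.0, 3.0])
```

The between-imputation term is what lets the pooled standard error reflect uncertainty about the imputed values, something a single imputation cannot do.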

21 Types of survey sample designs
• Simple Random Sampling
• Systematic Sampling
• Complex sample designs
    • stratified designs
    • cluster designs
    • mixed mode designs

22 Why complex sample designs?
• Increased efficiency
• Decreased costs

23 Why complex sample designs?
• Statistical software packages that assume SRS typically underestimate the sampling variance of a complex sample.
• Not accounting for the impact of complex sample design can lead to a biased estimate of the sampling variance and, when the variance is underestimated, to an inflated Type I error rate.
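A small numerical sketch shows the size of the problem. It assumes three hypothetical equal-sized clusters selected with replacement, ignores finite population corrections, and compares the SRS formula with a between-cluster variance estimator. The ratio of the two is often called the design effect, a term the slides do not use.

```python
from statistics import mean, variance

# Hypothetical clustered sample: respondents in the same cluster resemble each other.
clusters = [[1, 1, 2], [4, 5, 5], [8, 9, 9]]
values = [y for cluster in clusters for y in cluster]

def srs_variance_of_mean(values):
    """Variance of the sample mean if the cases were a simple random sample."""
    return variance(values) / len(values)

def cluster_variance_of_mean(clusters):
    """Variance of the sample mean based on variation between cluster means.

    Assumes equal-sized clusters sampled with replacement.
    """
    cluster_means = [mean(cluster) for cluster in clusters]
    return variance(cluster_means) / len(clusters)

var_srs = srs_variance_of_mean(values)
var_cluster = cluster_variance_of_mean(clusters)
design_effect = var_cluster / var_srs
```

With these numbers the SRS formula understates the variance by a factor of almost four, so standard errors would be about half their proper size.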

24 Sample weights
• Used to adjust for differing probabilities of selection.
• In theory, simple random samples are self-weighted.
• In practice, simple random samples are likely to also require adjustments for nonresponse.
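The adjustment for differing selection probabilities is usually a weight equal to the inverse of each respondent's probability of selection. The sketch below uses hypothetical values in which the last two cases come from an oversampled group.

```python
# (outcome value, probability of selection) -- hypothetical sample
sample = [(10, 0.1), (20, 0.1), (30, 0.5), (40, 0.5)]

def weighted_mean(sample):
    """Mean with each case weighted by the inverse of its selection probability."""
    weights = [1 / p for _, p in sample]
    total = sum(w * y for w, (y, _) in zip(weights, sample))
    return total / sum(weights)

unweighted = sum(y for y, _ in sample) / len(sample)
weighted = weighted_mean(sample)
```

The unweighted mean is pulled toward the oversampled group, and the weights undo that. If every case had the same selection probability, the weights would cancel, which is the sense in which a simple random sample is self-weighted.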

25 Types of sample weights
• Poststratification weights: designed to bring the sample proportions in demographic subgroups into agreement with the population proportions in the subgroups.
• Nonresponse weights: designed to inflate the weights of survey respondents to compensate for nonrespondents with similar characteristics.
• "Blow-up" (expansion) weights: provide estimates for the total population of interest.
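Poststratification and expansion weights reduce to simple ratios. The sketch below assumes a hypothetical sample of 100 with too many women relative to a 50/50 population of 10,000; all figures are made up for illustration.

```python
# Hypothetical sample composition and known population figures.
sample_counts = {"women": 70, "men": 30}
population_share = {"women": 0.5, "men": 0.5}
POPULATION_SIZE = 10_000

n = sum(sample_counts.values())

# Poststratification weight: population proportion / sample proportion.
post_weights = {
    group: population_share[group] / (count / n)
    for group, count in sample_counts.items()
}

# Weighted sample shares now match the population shares.
weighted_share = {
    group: post_weights[group] * count / n
    for group, count in sample_counts.items()
}

# Expansion ("blow-up") weight: number of population members each respondent represents.
expansion_weights = {
    group: population_share[group] * POPULATION_SIZE / count
    for group, count in sample_counts.items()
}
estimated_total = sum(expansion_weights[g] * sample_counts[g] for g in sample_counts)
```

Each man in this sample counts for more than each woman because men are under-represented, and the expansion weights sum over respondents to the population size.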

26 Syntax examples of design-based analysis in STATA, SUDAAN, & SAS
STATA
    svyset strata strata
    svyset psu psu
    svyset pweight finalwt
    svyreg fatintk age male black hispanic
SUDAAN
    proc regress data="c:\nhanes.sav" filetype=spss design=wr;
    nest strata psu;
    weight finalwt;
    subgroup sex race;
    levels 2 3;
    model fatintk = age sex race;

27 Syntax examples of design-based analysis in STATA, SUDAAN, & SAS
SAS
    proc surveyreg data=nhanes;
    strata strata;
    cluster psu;
    class sex race;
    model fatintk = age sex race;
    weight finalwt;
    run;

28 In summary, when analyzing survey data...
• Understand & evaluate survey design.
• Screen the data: deal with missing data & outliers.
• If necessary, adjust for study design using weights and appropriate computer software.

29 Thank You!