Mannheim Research Institute for the Economics of Aging www.mea.uni-mannheim.de SHARE Data Cleaning Stephanie Stuck MEA Vienna November 5/6 th.


1 Mannheim Research Institute for the Economics of Aging (www.mea.uni-mannheim.de). SHARE Data Cleaning. Stephanie Stuck, MEA. Vienna, November 5/6th

2 General philosophy
- Respondents are the experts on their own lives; in general we (still) take their answers very seriously.
- Only change data if you are sure it is wrong. If answers seem implausible but you are not sure what to do, indicate this via a flag variable.

3 General rules
- Please use data files with the original sampid to check and correct data (don't use the data version with sampid2).
- Always write programs to correct data (Stata do-files or SPSS sps files). Please never change data directly (e.g. no changes in editors).

4 General rules
- Keep original variables (name: "varname_original").
- Add flag variables to indicate changes (name: "varname_flag").
- Save corrected data files with a new name (e.g. "filename_corrected").
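
The naming rules above can be sketched in a short program. The slides ask for Stata or SPSS programs; Python is used here only for illustration, and the helper `correct_variable`, the variable `yrbirth`, and the sample records are hypothetical. The sketch assumes records are held as dictionaries and that a two-digit year of birth is a known typing error.

```python
def correct_variable(records, varname, is_wrong, fix):
    """Return corrected copies of the records; the input list is not modified.

    For every record this adds:
      <varname>_original  the value as delivered
      <varname>_flag      1 if the value was changed, 0 otherwise
    """
    corrected = []
    for rec in records:
        new = dict(rec)  # copy, so the raw data stays untouched
        new[f"{varname}_original"] = rec.get(varname)
        if is_wrong(rec):
            new[varname] = fix(rec)
            new[f"{varname}_flag"] = 1
        else:
            new[f"{varname}_flag"] = 0
        corrected.append(new)
    return corrected


# Hypothetical example data: the second year of birth was typed with two digits.
raw = [
    {"sampid": "A-001", "yrbirth": 1942},
    {"sampid": "A-002", "yrbirth": 42},
]

corrected = correct_variable(
    raw,
    "yrbirth",
    is_wrong=lambda r: r["yrbirth"] is not None and r["yrbirth"] < 100,
    fix=lambda r: 1900 + r["yrbirth"],
)
```

The corrected records would then be written to a new file (for example one named "filename_corrected"), so that the delivered data file is never overwritten.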

5 General rules
- Don't always take wave 1 information for granted; it can be wrong, too.
- Sometimes we will have to change wave 1 data, too.
- We will have another release of wave 1 data together with the public release of wave 2.
- We will probably already have a minor update of release 2.0.1 early next year.

6 Very next steps
- Check for country-specific deviations, e.g. especially routing errors, ep071, ep098, hc module etc.
- Send information on all country-specific deviations to MEA. Please don't forget an English translation or explanation of the deviations.
- Information on important deviations in central variables should be available to all FRB authors together with release 0.

7 Very next steps
Check financial amounts for implausible values, e.g. negative or very high amounts:
- outliers
- zero values
- wrong currencies
- typing errors
- "drunken interviewers" problem
Also consider frequencies of payments etc.
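
A check of this kind might look as follows. This is a minimal sketch in Python rather than Stata or SPSS; the function `flag_amount` and the upper limit of 100000 are assumptions made for illustration, and a real limit would depend on the variable, the currency, and the payment frequency. The function only reports reasons for review and does not change any value.

```python
def flag_amount(value, upper_limit):
    """Return a list of reasons why an amount needs a closer look.

    An empty list means nothing was found; a missing value is not flagged.
    """
    reasons = []
    if value is None:
        return reasons
    if value < 0:
        reasons.append("negative")
    elif value == 0:
        reasons.append("zero")
    elif value > upper_limit:
        reasons.append("very_high")
    return reasons


# Hypothetical amounts; the limit of 100000 is an arbitrary assumption.
amounts = [1200.0, -50.0, 0.0, 950000.0, None]
flags = [flag_amount(a, upper_limit=100000.0) for a in amounts]
```

Flagged values still need a human decision: a very high amount may be a real outlier, a wrong currency, or a typing error.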

8 Wrong sampid, cvid or respid
MEA already checks for mismatches within and between waves.
- Please ask survey agencies and send all information you have on renamed cases, mismatches etc. to MEA.
- Whenever you find new information on mismatches, e.g. in remarks, send the information to MEA.
- Please send data files with old and new ids for renamed cases to MEA, and provide information on date and reason (if possible) in additional variables.
- Sometimes only the CV or only the individual modules (DN etc.) have to be renamed (especially, but not only, if respondents are exchanged within households). Please don't forget to provide information on where changes have to be made.
MEA will correct files and send lists with hard cases to country teams to check or to ask survey agencies again.
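
One possible layout for such a rename file, and how it could be applied, is sketched below in Python. The field names (`old_sampid`, `new_sampid`, `date`, `reason`, `modules`), the ids, and the date are hypothetical; the slides only ask for old and new ids plus date, reason, and where the change applies. This is not MEA's actual procedure.

```python
def apply_renames(records, renames, module):
    """Return copies of the records with sampid replaced where a rename
    applies to the given module (e.g. "CV" or "DN")."""
    lookup = {r["old_sampid"]: r for r in renames if module in r["modules"]}
    out = []
    for rec in records:
        new = dict(rec)
        new["sampid_original"] = rec["sampid"]
        new["sampid_flag"] = 0
        rename = lookup.get(rec["sampid"])
        if rename is not None:
            new["sampid"] = rename["new_sampid"]
            new["sampid_flag"] = 1
        out.append(new)
    return out


# Hypothetical rename list: this case is renamed in the DN module only.
renames = [
    {
        "old_sampid": "A-010",
        "new_sampid": "A-011",
        "date": "2007-10-01",
        "reason": "respondents exchanged within household",
        "modules": {"DN"},
    },
]

cv = [{"sampid": "A-010"}]
dn = [{"sampid": "A-010"}, {"sampid": "A-020"}]

cv_fixed = apply_renames(cv, renames, "CV")  # unchanged, rename is DN-only
dn_fixed = apply_renames(dn, renames, "DN")  # A-010 becomes A-011
```

Recording the affected modules with each rename keeps the case where only the CV or only the individual modules have to be renamed explicit.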

9 General checks
- Corrections based on checks of frequency distributions, e.g. outliers, values out of range
- Corrections based on consistency checks within and between modules and waves

10 More concrete
- Check for empty cases
- Check for duplicates
- Check year of birth between coverscreen (cv_r and cv_h) and dn module, drop-offs and vignettes respectively, and possibly with the gross sample
- Check gender CV/DN vs. drop-off/vignettes
- Check for consistency of dates:
- Check information on marital status:
- Check respondent dummies
- Check ch module against coverscreen
- Check relation to coverscreen respondent
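
Three of these checks (duplicates, empty cases, and year of birth between coverscreen and dn module) can be sketched as below. Python is used for illustration only; the function names, the variable `yrbirth`, and the sample records are hypothetical, and "empty" is assumed to mean that every variable other than the id is missing.

```python
from collections import Counter


def find_duplicates(records, key):
    """Ids that occur more than once."""
    counts = Counter(rec[key] for rec in records)
    return sorted(k for k, n in counts.items() if n > 1)


def find_empty_cases(records, key):
    """Ids of records where every variable other than the id is missing."""
    return [
        rec[key]
        for rec in records
        if all(v is None for name, v in rec.items() if name != key)
    ]


def yrbirth_mismatches(cv, dn, key="respid", var="yrbirth"):
    """(id, cv value, dn value) where both values exist but differ."""
    dn_by_id = {rec[key]: rec for rec in dn}
    mismatches = []
    for rec in cv:
        other = dn_by_id.get(rec[key])
        if other is None:
            continue  # no dn record to compare against
        a, b = rec.get(var), other.get(var)
        if a is not None and b is not None and a != b:
            mismatches.append((rec[key], a, b))
    return mismatches


# Hypothetical coverscreen and dn records.
cv = [
    {"respid": "R1", "yrbirth": 1940},
    {"respid": "R2", "yrbirth": 1951},
    {"respid": "R3", "yrbirth": None},
    {"respid": "R4", "yrbirth": 1938},
    {"respid": "R4", "yrbirth": 1938},
]
dn = [
    {"respid": "R1", "yrbirth": 1940},
    {"respid": "R2", "yrbirth": 1915},
    {"respid": "R4", "yrbirth": 1938},
]

duplicates = find_duplicates(cv, "respid")
empty = find_empty_cases(cv, "respid")
mismatches = yrbirth_mismatches(cv, dn)
```

A mismatch list like this only shows that the two sources disagree. Which one is right has to be decided case by case, and cases that stay unclear get a flag rather than a correction.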

11 Interviewer remarks
- Go through the remarks. A lot of them are not helpful, but some are very important (e.g. exchanged respondent, amounts apply to all family members, different time horizons etc.).
- Categorize problems as much as possible.
- Write programs to correct data if possible.
- Flag cases where unsure.
- Collect information on questions that caused a lot of problems or didn't work, for future waves.

12 Open questions
- Go through open questions and code answers into original values if possible.
- Priority list of variables: education, employment status
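
Where open answers repeat the wording of an existing category, part of the recoding can be done by program. The sketch below is in Python for illustration; the codebook entries and code values are invented and do not come from the SHARE questionnaire. Answers with no exact match are left uncoded for manual review.

```python
def code_open_answer(text, codebook):
    """Return (code, flag): the matching code and 1, or (None, 0) if the
    answer is not in the codebook and needs manual review."""
    key = " ".join(text.lower().split())  # lower-case, collapse whitespace
    if key in codebook:
        return codebook[key], 1
    return None, 0


# Hypothetical codebook; real code values come from the questionnaire.
codebook = {"retired": 1, "self employed": 2}

answers = ["  Retired ", "Self  employed", "helps in family shop"]
coded = [code_open_answer(a, codebook) for a in answers]
```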

13 How to go on
- Your experience is much appreciated.
- Please send information on what you have done, what problems you found etc. to MEA.
- MEA will send out more information: results of our discussion now, "checking lists", "common problems", etc.
- We should have another meeting or workshop, maybe in February, or we could have an extra meeting, e.g. in Mannheim.

