Copyright 2010, The World Bank Group. All Rights Reserved. Data Processing and Tabulation, Part I.



Importance of data processing
If data aren’t clean and well edited:
 – analysis will be flawed
 – data may be inconsistent across years
 – country data may not compare well with those of other countries
 – publications may need to be reissued when errors are found

First steps
Data from paper surveys need to be carefully entered into the computer.
 – Minimize typing errors and correct those that are found
 – Code verbatim and “other, specify” responses if they actually belong in an existing category
 – Develop a system for controlling which questionnaires have and have not yet been entered
Gather questionnaires that have issues so they will be easier to access and investigate later.
Make sure the data file is an appropriate size and the number of records is on target.
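
One way to control which questionnaires have been entered is to compare the IDs that were issued with the IDs keyed so far. The sketch below is only an illustration: the function name `entry_status` and the ID values are hypothetical, and a real control system would also record who entered each form and when.

```python
def entry_status(assigned_ids, entered_ids):
    """Compare issued questionnaire IDs with the IDs keyed so far."""
    assigned = set(assigned_ids)
    seen = set()
    duplicates = set()
    for qid in entered_ids:
        if qid in seen:
            duplicates.add(qid)  # keyed more than once
        seen.add(qid)
    return {
        "not_yet_entered": sorted(assigned - seen),
        "entered_twice": sorted(duplicates),
        "unknown_ids": sorted(seen - assigned),  # likely typing errors in the ID
    }


# Hypothetical example: four questionnaires issued, four records keyed.
status = entry_status(
    assigned_ids=[101, 102, 103, 104],
    entered_ids=[101, 102, 102, 999],
)
```

Here the record count (four) is on target, yet two questionnaires are still missing, which is why a count check alone is not enough.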

What to look for
Response values that are invalid
 – A code of “7” would be invalid for a question where the possible response codes are “1” and “2”
 – Find invalid codes by producing a frequency distribution or by sorting the data on that variable.
 – Determine the correct response by checking the paper questionnaire or by looking at which subsequent questions were answered.
Logically inconsistent responses
 – Responses that can’t both be true. For example, a respondent who is recorded as being younger than her child
 – Determine the correct response by checking the paper questionnaire or by looking at which subsequent questions were answered.
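
The two checks above can be sketched in a few lines. Everything named here is an assumption for illustration: the question name `q5`, the field names `age` and `oldest_child_age`, and the sample records. The code only flags records; the correction still comes from the paper questionnaire.

```python
from collections import Counter

# Hypothetical codebook: question q5 allows only codes 1 and 2.
VALID_CODES = {"q5": {1, 2}}

records = [
    {"id": 1, "q5": 1, "age": 41, "oldest_child_age": 15},
    {"id": 2, "q5": 2, "age": 19, "oldest_child_age": 23},
    {"id": 3, "q5": 7, "age": 35, "oldest_child_age": 9},
    {"id": 4, "q5": 1, "age": 28, "oldest_child_age": None},
]


def frequency(records, question):
    """Frequency distribution of the codes recorded for one question."""
    return Counter(rec[question] for rec in records)


def invalid_code_ids(records, question):
    """IDs of records whose code is not in the codebook."""
    allowed = VALID_CODES[question]
    return [rec["id"] for rec in records if rec[question] not in allowed]


def younger_than_child_ids(records):
    """IDs of respondents recorded as no older than their oldest child."""
    return [
        rec["id"]
        for rec in records
        if rec["oldest_child_age"] is not None
        and rec["age"] <= rec["oldest_child_age"]
    ]


q5_freq = frequency(records, "q5")            # code 7 shows up in the table
bad_codes = invalid_code_ids(records, "q5")   # which questionnaires to pull
inconsistent = younger_than_child_ids(records)
```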

What to look for
Impossible responses
 – Responses that can’t be true. For example, a respondent who reports working more than 7 days a week.
 – Check the paper questionnaire to determine if it is a typo and what the correct answer is.
Improbable responses
 – Responses that are more likely to be a mistake than to be true. For example, a 16-year-old recorded as having a PhD.
 – Check the paper questionnaire to determine correct responses.
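
A range check can separate the two cases by severity. In this sketch the field names and the age threshold of 22 for a PhD are assumptions chosen for illustration, not rules from the source; an agency would set its own limits.

```python
MIN_PLAUSIBLE_PHD_AGE = 22  # assumed threshold, for illustration only


def check_record(rec):
    """Return a list of (severity, field) flags for one record."""
    flags = []
    # Impossible: a week has only 7 days.
    if not 0 <= rec["days_worked"] <= 7:
        flags.append(("impossible", "days_worked"))
    # Improbable: possible in principle, but more likely a keying mistake.
    if rec["education"] == "PhD" and rec["age"] < MIN_PLAUSIBLE_PHD_AGE:
        flags.append(("improbable", "education"))
    return flags


flags_a = check_record({"age": 16, "education": "PhD", "days_worked": 5})
flags_b = check_record({"age": 40, "education": "secondary", "days_worked": 9})
flags_c = check_record({"age": 40, "education": "PhD", "days_worked": 7})
```

Keeping the severity in the flag lets impossible values block a record while improbable ones are only queued for review.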

What to look for
Omissions of responses or sections of the questionnaire
 – The response may have been missed during data entry or the interviewer may have failed to follow the skip instructions.
 – Check the paper questionnaire and subsequent responses to determine what happened and if an answer is available.
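
Omissions are easy to list once blank answers are stored consistently. The sketch assumes blanks are stored as `None` and that `q1` and `q2` are questions everyone should answer; both are hypothetical choices.

```python
def find_omissions(records, required_questions):
    """Return (record id, question) pairs where a required answer is blank."""
    return [
        (rec["id"], question)
        for rec in records
        for question in required_questions
        if rec.get(question) is None
    ]


records = [
    {"id": 1, "q1": 2, "q2": 1},
    {"id": 2, "q1": None, "q2": 1},
    {"id": 3, "q1": 1},  # q2 was never keyed
]

omissions = find_omissions(records, ["q1", "q2"])
```

The resulting list tells the editor which paper questionnaires to pull and which question to look at on each.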

What to look for
Responses to questions or sections of the questionnaire that should not have been asked
 – The interviewer may have failed to follow the skip instructions correctly. For example, both the employment and unemployment sections may have been asked of the same person, but only one of those sections can be appropriate.
 – Use the responses and/or the sorting question to determine which section is valid and which has nonsense answers. Delete the nonsense section answers from the clean file.
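
The sketch below blanks out the off-path section in a copy of the record, leaving the raw record untouched. It assumes the sorting question (`worked_last_week`, with code 1 meaning "worked") is itself correct, which should be confirmed against the responses first. The question names `e1`, `e2`, `u1`, and `u2` are hypothetical.

```python
import copy

EMPLOYMENT_QS = ["e1", "e2"]      # hypothetical employment section
UNEMPLOYMENT_QS = ["u1", "u2"]    # hypothetical unemployment section


def clean_skip_violations(rec):
    """Return (cleaned copy, list of questions that were blanked)."""
    cleaned = copy.deepcopy(rec)
    # Assumption: code 1 on the sorting question routes to the employment section.
    off_path = UNEMPLOYMENT_QS if rec["worked_last_week"] == 1 else EMPLOYMENT_QS
    removed = [q for q in off_path if cleaned.get(q) is not None]
    for q in removed:
        cleaned[q] = None
    return cleaned, removed


raw = {"id": 7, "worked_last_week": 1, "e1": 3, "e2": 40, "u1": 2, "u2": None}
cleaned, removed = clean_skip_violations(raw)
```

Returning the list of blanked questions gives an audit trail of what was deleted from the clean file.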

What to look for
The number of responses to questions
 – This should be relatively stable and differences should be largely explainable by skips taking people to different follow-up questions. Unexplained differences may indicate skips that were not followed, coding errors, or other types of issues.
The number of missing values
 – The number of missing values for a question should be appropriate. Questions that everyone gets should have few missing values. Questions that only some people get should have more.
 – When a question sorts people to one of two follow-up questions, the number of responses to one of the questions should be about the same as the number of missing values for the other question.
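
The comparison between two follow-up questions can be sketched as a pair of counts. The question names and records are made up for illustration, and blanks are assumed to be stored as `None`.

```python
def response_counts(records, question):
    """Count answered and missing values for one question."""
    answered = sum(1 for rec in records if rec.get(question) is not None)
    return {"answered": answered, "missing": len(records) - answered}


# Hypothetical sorting question "works": 1 routes to job_type,
# 2 routes to reason_not_working.
records = [
    {"id": 1, "works": 1, "job_type": "farm", "reason_not_working": None},
    {"id": 2, "works": 2, "job_type": None, "reason_not_working": "student"},
    {"id": 3, "works": 1, "job_type": "shop", "reason_not_working": None},
    {"id": 4, "works": 2, "job_type": None, "reason_not_working": None},
]

job = response_counts(records, "job_type")
reason = response_counts(records, "reason_not_working")

# Responses to one follow-up should roughly equal missing values on the other.
gap = reason["missing"] - job["answered"]
```

A gap of zero is what a clean file would show; here the gap of one points to record 4, which answered neither follow-up.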

Variable transformations
Some variables need to be transformed from what is easy for respondents to provide into what can be readily used by researchers.
 – If the respondent’s month and year of birth were recorded, age can be created from these and the month and year of the interview.
 – Recording age and other such variables on the file eliminates the need to recreate them every time someone works with the data and reduces the risk of errors.
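
A minimal sketch of the age calculation follows. Because only month and year are known, it has to assume something about respondents interviewed in their birth month; here the birthday is treated as already passed. That convention is an assumption and should be documented whichever way it is decided.

```python
def age_at_interview(birth_year, birth_month, interview_year, interview_month):
    """Age in completed years from month and year of birth and of interview."""
    age = interview_year - birth_year
    # The birthday has not yet occurred in the interview year.
    if interview_month < birth_month:
        age -= 1
    return age


age_before_birthday = age_at_interview(1980, 6, 2010, 5)
age_in_birth_month = age_at_interview(1980, 6, 2010, 6)
```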

Variable transformations - Counts
Knowing how many of something there are or how many times it was done can be useful.
 – Variables for presence and number of children in the household, for example, can be created from the roster
 – Children in the household may be the children of different adults (for example, sisters living together), so it is important to associate the correct adults and children.
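
The roster counts can be sketched as follows. The roster layout is hypothetical: each person has a household ID, a line number, an age, and the line number of their mother if she lives in the household. The child cutoff of age 17 is also an assumption.

```python
from collections import Counter

CHILD_MAX_AGE = 17  # assumed definition of "child"

# Hypothetical roster: household 1 has two sisters, each with her own children.
roster = [
    {"hh": 1, "line": 1, "age": 34, "mother_line": None},
    {"hh": 1, "line": 2, "age": 30, "mother_line": None},
    {"hh": 1, "line": 3, "age": 8, "mother_line": 1},
    {"hh": 1, "line": 4, "age": 5, "mother_line": 2},
    {"hh": 1, "line": 5, "age": 2, "mother_line": 2},
    {"hh": 2, "line": 1, "age": 61, "mother_line": None},
]

children = [p for p in roster if p["age"] <= CHILD_MAX_AGE]

# Household-level variables: presence and number of children.
children_per_hh = Counter(p["hh"] for p in children)
household_vars = {
    hh: {"has_children": children_per_hh[hh] > 0, "n_children": children_per_hh[hh]}
    for hh in sorted({p["hh"] for p in roster})
}

# Person-level variable: own children, linked through the mother's line number.
own_children = Counter(
    (p["hh"], p["mother_line"]) for p in children if p["mother_line"] is not None
)
```

Counting through `mother_line` rather than by household keeps each sister's children attached to the right adult.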

Variable transformations - Recodes
Recodes change the category labels in a variable.
 – For example, marital status may contain “divorced,” “separated,” and “widowed” as separate categories but it may be that the agency will frequently want to combine these three groups. A recode of the marital status variable with these combined into one group (and with any other desired combinations) could be made.
 – The original variable with the three separate groups should be maintained, as it provides valuable information.
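
A recode is a lookup from old categories to new ones, written to a new variable so the original survives. The category labels and the name `marital_recode` below are hypothetical.

```python
# Hypothetical mapping; the combined label is an illustration only.
MARITAL_RECODE = {
    "married": "married",
    "never married": "never married",
    "divorced": "formerly married",
    "separated": "formerly married",
    "widowed": "formerly married",
}


def add_marital_recode(rec):
    """Return a copy of the record with the recode added beside the original."""
    out = dict(rec)
    # Unmapped or missing categories become None so they are easy to spot.
    out["marital_recode"] = MARITAL_RECODE.get(rec["marital"])
    return out


widow = add_marital_recode({"id": 1, "marital": "widowed"})
unknown = add_marital_recode({"id": 2, "marital": "cohabiting"})
```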

Variable transformations – Creating Concepts
Multiple variables can be used together to create a concept of interest.
 – For example, unemployed can be created from variables for work during the past week, active job search, availability for work, waiting to start, and waiting to be recalled.
 – The new variable makes the data easier to use and minimizes the risk of it being created incorrectly in the future.
 – The variables that make up the new concept should be kept on the file because they are used to build other concepts as well and because they can be used on their own for analysis.
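
The sketch below combines the five component variables into one flag. The rule it uses (did not work, is available, and either searched or is waiting to start or be recalled) is an assumption made for illustration; the source lists the inputs but not the rule, and each agency's official definition should be used instead. Field names are hypothetical.

```python
def is_unemployed(rec):
    """Derive an unemployed flag from its component variables (assumed rule)."""
    if rec["worked_last_week"]:
        return False
    seeking_or_waiting = (
        rec["searched"] or rec["waiting_to_start"] or rec["waiting_recall"]
    )
    return bool(rec["available"] and seeking_or_waiting)


def add_unemployed(rec):
    """Return a copy with the derived flag; the components stay on the file."""
    out = dict(rec)
    out["unemployed"] = is_unemployed(rec)
    return out


searcher = add_unemployed({
    "worked_last_week": False, "searched": True, "available": True,
    "waiting_to_start": False, "waiting_recall": False,
})
worker = add_unemployed({
    "worked_last_week": True, "searched": True, "available": True,
    "waiting_to_start": False, "waiting_recall": False,
})
not_available = add_unemployed({
    "worked_last_week": False, "searched": True, "available": False,
    "waiting_to_start": False, "waiting_recall": False,
})
```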

Variable transformations – Math
Mathematical operations, such as creating rates, means, and medians, may be desirable.
 – The unemployment rate may be created by dividing the number of unemployed by the civilian labor force and multiplying this by 100.
 – Decisions about rounding and how many decimal places to store must be made.
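
The rate calculation is a one-line formula, but the rounding decision deserves to be explicit. In this sketch the default of one decimal place is an assumption, and Python's built-in `round` is used; an agency with its own rounding rule would substitute it here.

```python
def unemployment_rate(unemployed, labor_force, decimals=1):
    """Unemployed as a percentage of the civilian labor force."""
    if labor_force <= 0:
        raise ValueError("labor force must be positive")
    return round(unemployed / labor_force * 100, decimals)


rate = unemployment_rate(75, 1500)
rate_third = unemployment_rate(1, 3)
rate_whole = unemployment_rate(1, 3, decimals=0)
```

Storing the unrounded counts alongside the published rate lets later users recompute it at a different precision.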