Copyright 2010, The World Bank Group. All Rights Reserved. PROCESSING, Part 1 Data capture, editing, imputation and tabulation Quality assurance for census.

Presentation transcript:

COMPONENTS OF PROCESSING
Data Capture
Editing
Imputation
Tabulation

DATA CAPTURE
Key Entry
Scanning
Direct Entry

KEY ENTRY
Key entry as a data capture technique has advantages and disadvantages.
Pro
– Relatively inexpensive
– Skills readily available
– Creates employment

KEY ENTRY (cont’d)
Con
– Time consuming
– Requires many workstations
– Error prone

SCANNING
Scanning is a process similar to photocopying, but the current technology has advanced far beyond simple photocopying. There are three levels of scanning:
1. OMR: Optical Mark Recognition
2. OCR: Optical Character Recognition
3. ICR: Intelligent Character Recognition

SCANNING (cont’d)
Pro
– Fast processing
– Reliable

SCANNING (cont’d)
Con
– High upfront costs for equipment, software, and training
– Very precise requirements for paper, printing and processing

DIRECT ENTRY
Most recent innovations
– Enumerators use hand-held computers
– Where internet access is common, self-enumeration through the internet

DIRECT ENTRY
Pro
More efficient
– Saves a step (immediate data capture)
Improves data quality
– Editing at respondent level
– Timeliness
Reduces some costs
– Printing questionnaires

DIRECT ENTRY
Con
Requires better trained enumerators
Riskier
– Hardware failure
– PDA loss
– Requires electricity
Expensive

EDITING
The editing process
– Identifies errors
– Identifies non-response
– Identifies logical inconsistencies
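
The three things the editing process identifies can be illustrated with a minimal sketch. The field names, the 0 to 120 age range and the two age thresholds are assumptions made for this example, not rules taken from the slides; a real census edit specification would define its own.

```python
# Hypothetical field names and age thresholds, chosen only for illustration.
REQUIRED_FIELDS = ("age", "sex", "marital_status", "education")
MIN_AGE_MARRIED = 15
MIN_AGE_UNIVERSITY = 18


def edit_record(record):
    """Return a list of edit flags for one person record (empty if clean)."""
    flags = []

    # Non-response: a required item was left blank.
    for field in REQUIRED_FIELDS:
        if record.get(field) is None:
            flags.append(f"missing:{field}")

    age = record.get("age")

    # Error: a value outside the plausible range.
    if age is not None and not 0 <= age <= 120:
        flags.append("range:age")

    # Logical inconsistencies between items that are each valid on their own.
    if age is not None and 0 <= age <= 120:
        if record.get("marital_status") == "married" and age < MIN_AGE_MARRIED:
            flags.append("inconsistent:age,marital_status")
        if record.get("education") == "university" and age < MIN_AGE_UNIVERSITY:
            flags.append("inconsistent:age,education")

    return flags
```

The function only flags problems; deciding how to resolve them is a separate step.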

DEALING WITH ERRORS
Replacement (Imputation)
Weighting

IMPUTATION
Two classes of imputation
– Deterministic
– Stochastic

DETERMINISTIC IMPUTATION
Will yield the same answer each time
– Missing data may be calculable from other values, e.g. citizenship can be calculated from place of birth
– Missing data are imputed by a sequential donor technique or any other method that will yield identical results
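
The citizenship example can be sketched as a deductive rule. The rule used here (a person born in the home country with no reported citizenship is assigned home-country citizenship) and the names `HOME_COUNTRY`, `place_of_birth` and `citizenship` are assumptions for illustration only; the actual rule depends on national law and on the edit specification.

```python
# Hypothetical country label and field names, for illustration only.
HOME_COUNTRY = "Homeland"


def impute_citizenship(record):
    """Deductively fill a missing citizenship from place of birth.

    Returns a copy of the record; the same input always gives the same output.
    """
    out = dict(record)
    if out.get("citizenship") is None and out.get("place_of_birth") == HOME_COUNTRY:
        out["citizenship"] = HOME_COUNTRY
        out["citizenship_imputed"] = True  # keep an audit trail of imputed items
    return out
```

Records born elsewhere are left untouched by this rule and would need a different method.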

DETERMINISTIC IMPUTATION
Six main types
– Deductive or Logical
– Mean Value
– Ratio/Regression
– Sequential Hot Deck
– Sequential Cold Deck
– Nearest-Neighbor
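
As one illustration of the types listed above, a sequential hot deck can be sketched as follows. This is a simplified version under stated assumptions: records are processed in file order, the donor is the most recent valid value seen in the same imputation class, and a cold-deck table supplies starting values before any donor has been seen. The function and field names are hypothetical.

```python
def sequential_hot_deck(records, field, class_field, cold_deck):
    """Fill missing `field` values from the last valid value in the same class.

    `cold_deck` maps each class to a starting value, used until a real donor
    has been seen. Given the same file order, the result is always the same.
    """
    last_seen = dict(cold_deck)
    result = []
    for rec in records:
        rec = dict(rec)  # do not modify the caller's data
        cls = rec[class_field]
        if rec.get(field) is None:
            rec[field] = last_seen[cls]
            rec[f"{field}_imputed"] = True
        else:
            last_seen[cls] = rec[field]  # this record becomes the current donor
        result.append(rec)
    return result
```

Because the donor depends on file order, sorting the file (for example by geography) before imputation changes which donor each record receives.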

STOCHASTIC IMPUTATION
Can yield different results if the process is rerun
– Use of a random donor or other randomized approach
– Use of randomized residuals to create realistic data
Most deterministic methods have a stochastic counterpart
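
A random-donor approach can be sketched as below. This is a minimal illustration, not a production method: donors are drawn from all records with a valid value, without imputation classes, and the function and field names are hypothetical. The optional `seed` is an addition for this example so that a run can be reproduced when needed; without it, reruns may give different results.

```python
import random


def random_donor_impute(records, field, seed=None):
    """Fill missing `field` values with the value of a randomly chosen donor."""
    rng = random.Random(seed)
    donors = [r[field] for r in records if r.get(field) is not None]
    if not donors:
        raise ValueError(f"no donor records with a valid {field!r}")
    result = []
    for rec in records:
        rec = dict(rec)  # do not modify the caller's data
        if rec.get(field) is None:
            rec[field] = rng.choice(donors)
            rec[f"{field}_imputed"] = True
        result.append(rec)
    return result
```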

IMPUTATION
Which inconsistency to change?
The general rule is to change as few values as possible
Which values to impute?
– For a 10-year-old married university graduate, do we change the age, or the education and marital status?
– Changing only the age can make the record consistent, so change the age
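
The "change as few values as possible" rule can be sketched as a brute-force search over the smallest set of fields whose values could be changed to make the record pass every edit. The value domains, the edit rules and the age thresholds below are assumptions for illustration; production systems use far more efficient methods than exhaustive search.

```python
from itertools import combinations, product

# Hypothetical value domains and edit thresholds, for illustration only.
DOMAINS = {
    "age": range(0, 100),
    "marital_status": ("never_married", "married"),
    "education": ("none", "primary", "secondary", "university"),
}


def passes_edits(rec):
    """Return True if the record violates none of the (assumed) edit rules."""
    if rec["marital_status"] == "married" and rec["age"] < 15:
        return False
    if rec["education"] == "university" and rec["age"] < 18:
        return False
    return True


def minimal_change_fields(rec):
    """Return the smallest tuple of fields that, if changed, can fix the record.

    Subsets are tried in order of increasing size, so the first hit is minimal.
    Ties are broken by the field order in DOMAINS.
    """
    fields = tuple(DOMAINS)
    for size in range(len(fields) + 1):
        for subset in combinations(fields, size):
            for values in product(*(DOMAINS[f] for f in subset)):
                candidate = {**rec, **dict(zip(subset, values))}
                if passes_edits(candidate):
                    return subset
    return None
```

For the 10-year-old married university graduate, the search finds that changing the age alone is enough, whereas keeping the age would require changing both marital status and education.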

VALIDATION
After imputation the data are consistent, but they may or may not be correct
Validation is the final step before certification and release

VALIDATION
The data may be subject to bias from:
– Design
– Training
– Enumerator
– Respondent
– Processing

CERTIFICATION
Final step before data release
NSO’s expression of confidence in the data