Ethiopian 2007 CENSUS DATA CAPTURING AND PROCESSING

Ethiopian 2007 Census Data Capturing and Processing. Central Statistical Agency (CSA), April 2008

Background Information: A Population and Housing Census is the largest data capturing exercise a country can undertake; it involves capturing millions of forms. The Central Statistical Agency (CSA) started using older techniques, such as punched card readers, as early as the 1960s. Two Population and Housing Censuses have so far been conducted in Ethiopia. The first was carried out in 1984.

Background Information (cont'd): During the 1984 Census, data capture was done by manual keyboard entry on a mainframe computer, using the FORMSPEC data entry system. It took more than 2 years to capture the data for about 42 million people. For the 1994 Census, data capture was again done by manual keyboard entry, this time on PCs, using the CENTRY data entry system (IMPS).

Background Information (cont'd): For the 1994 Census, it took about 18 months to capture the data for a population of about 53 million. About 180 data entry clerks and around 90 PCs were used, and the entry work was done in 2 shifts.

Some Limitations of the Manual Keyboard Entry Method: It is time consuming, so timely data are not available and the data represent the current situation less well. It is subject to additional non-sampling errors, namely human error from manual keying. Because of the volume of data, 100% verification, as is done for sample surveys, is difficult.

Limitations (cont'd): It involves a great deal of human resource management, and a large number of data entry operators and a large amount of equipment are required.

The Need for Alternative Solutions: The need for timely census results, together with the limitations discussed above, forced the Agency to look for alternatives. This is especially important for a large volume of data such as a census. Hence the need to use scanning technology.

The Scanning Technology: Scanning technology in general implements two basic techniques: mark recognition, as in the Optical Mark Reader (OMR), and character recognition, as in Optical Character Recognition (OCR) and Intelligent Character Recognition (ICR).

Scanning Technology (cont'd): OMR is the recognition of shaded marks (blobs) on the forms. The position of each blob on a form determines the alphanumeric character it represents. Character recognition is the recognition of alphanumeric characters on forms and is of 2 types: OCR, which is the recognition of machine-printed characters, and ICR.

Scanning Technology (cont'd): ICR refers to the capture of hand-printed characters from a form. For scanning the 2007 Census, the Optical Mark Reader (OMR) technique was selected. The scanners used are PhotoScribe Series PS900 scanners (a DRS scanning technology product).
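
The idea that the position of a shaded blob determines the character it stands for can be illustrated with a minimal sketch. It assumes a hypothetical numeric field in which each column holds ten bubbles for the digits 0 to 9; the actual layout of the 2007 Census forms and the decode logic inside the DRS software are not described in the slides, so the function name and grid layout here are illustrative only.

```python
from typing import List, Optional


def decode_omr_field(columns: List[List[bool]]) -> List[Optional[int]]:
    """Decode a numeric OMR field (hypothetical layout).

    Each column is a list of 10 booleans, one per bubble, where index 0
    is the bubble for digit 0 and index 9 the bubble for digit 9.
    A column decodes to a digit only if exactly one bubble is shaded;
    otherwise it decodes to None and would need manual review.
    """
    digits: List[Optional[int]] = []
    for column in columns:
        shaded_rows = [row for row, filled in enumerate(column) if filled]
        digits.append(shaded_rows[0] if len(shaded_rows) == 1 else None)
    return digits
```

A two-column age field with the bubble for 3 shaded in the first column and the bubble for 7 in the second would decode to the digits 3 and 7.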

DRS PhotoScribe Series PS900: High-speed imaging mark reader; Windows XP Professional; CD-R/RW drive; network connectivity; TFT monitor, keyboard and mouse; speed of up to 8,500 forms/hour.

The Scanning Process in General: It mainly involves (1) scanning / data capture, including image capture; (2) validation and key-correction of the scanned data; and (3) exporting the scanned and key-corrected data to ASCII (text) format, the format suitable for electronic processing.
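
The export step can be pictured as writing each captured record as one fixed-width line of ASCII text. The sketch below is an assumption for illustration: the field names and widths in LAYOUT are hypothetical and do not come from the Census record layout, and the DRS export itself is not shown in the slides.

```python
from typing import Dict, List, Tuple

# Hypothetical record layout: (field name, width in characters).
LAYOUT: List[Tuple[str, int]] = [
    ("ea_id", 8),
    ("household", 3),
    ("person", 2),
    ("sex", 1),
    ("age", 3),
]


def to_fixed_width(record: Dict[str, int]) -> str:
    """Render one record as a fixed-width, zero-padded ASCII line."""
    parts = []
    for name, width in LAYOUT:
        text = str(record[name])
        if len(text) > width:
            raise ValueError(f"{name}={text} does not fit in {width} characters")
        parts.append(text.zfill(width))
    return "".join(parts)
```

Fixed-width text of this kind is easy to describe with a data dictionary, which is what downstream editing and tabulation packages generally expect.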

Learning from Experiences of Other Countries: A study tour was made to two African countries. Tanzania (to learn from their successes): Data capture for the 2002 Census of Tanzania was done in about 26 days, and general report tables were produced within 3 months of the start of scanning.

Experiences of Other Countries (cont'd): Ghana (to learn from their difficulties): Data capture for the 2000 Census took about 6 months (forms from 29,000 EAs). Three scanners were used (Kodak, Fujitsu); the larger scanner was a Kodak 500D, with a speed of about 500 forms/min. Power failure was one of the major problems, and some data were lost as a result. A large generator was installed to minimize the effect of the frequent power cuts.

Major Benefits of the Scanning Technology: A significant decrease in the time required to capture the data, which helps deliver timely data and satisfy users' needs (policy makers, planners, researchers, etc.). No need to store millions of paper forms for a long time, since scanning captures the whole content of a questionnaire as an electronic image.

Requirements for Effective Scanning: Proper training on both hardware and software. This helps the Agency to "own" the technology, that is, to be able to use it after the departure of the trainers / technical advisors. A reliable network system. A well-organized space for forms and data flow.

[Diagram: Structured space for file flow. Labelled areas: Data Processing Center; Warehouse; Registering & Organizing EAs Received from the Field; Retrieval; Receiving the Questionnaires; Registering EAs for Scanning; Waiting Room; Scanning Room; Key-Correction Room; Processing Center; Store. Steps are numbered 1 to 8.]

Requirements for Effective Scanning (cont'd): Proper file management and care: checking batch (EA) IDs and the orientation of forms; ensuring the EA code on each box is the same as the one on the questionnaires; properly recording incoming and outgoing questionnaires. Close attention to detecting errors in the scanning process is also required.
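
The check that the EA code on a box matches the code on every questionnaire inside it is simple to express once the codes are available as data. The following is a minimal sketch under that assumption; the function name and the string form of the EA code are hypothetical.

```python
from typing import List


def forms_not_matching_box(box_ea_id: str, form_ea_ids: List[str]) -> List[int]:
    """Return the 1-based positions of forms whose EA code differs from the box label."""
    return [
        position
        for position, ea_id in enumerate(form_ea_ids, start=1)
        if ea_id != box_ea_id
    ]
```

An empty result means the box is consistent; any position returned points at a form that may have been misfiled.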

Requirements for Effective Scanning (cont'd): Ensuring proper paper throughput through the scanner. Ensuring smooth running of the scanning machines through maintenance and daily cleaning. An arrangement to minimize the effect of power interruptions.

Major Activities Accomplished in the Course of the Census Taking: Data from the Pilot Census were successfully scanned (OMR), key-corrected, exported to text format, tabulated and tested. One scanner (PS900 PhotoScribe) was used to capture the pilot data. Technical experts from the DRS company assisted in capturing, validating and exporting the pilot data. Training in scanning technology was given: 16 professionals were trained.

Major Activities Accomplished (cont'd): Hardware and software training was conducted; the training took about 7 working days in total. SOSKITW for Windows, a DRS software package for scanning, was introduced. Components of the SOSKITW software: SOSGen, used to generate scanning decodes for completed OMR forms (how marks on forms are interpreted and stored); SOSInp, used to scan, validate and export scanned data.

Major Activities Accomplished (cont'd): Equipment purchased and installed: 10 additional PS900 iM2 DRS scanners and 16 high-capacity PCs for key-correction. Census data processing work plan prepared, covering: recruitment of temporary staff; staff training (scanning technology, CSPro); retrieval and organization of completed forms; scanning and validation; computer editing and tabulation. (For each activity, the duration and responsible body are indicated.)

Major Activities Accomplished (cont'd): Census data processing teams organized: batch header database group; scanning and validation team; technical desk heads; shift supervisors; two senior programmers responsible for the overall scanning process. Other sub-professional staff assigned: 4 batch header scanning technicians and 16 data validation workers.

Major Activities Accomplished (cont'd): The scanning room was organized, with an air conditioner installed, and a high-capacity automatic generator was installed to ensure uninterrupted power supply. Batch Header Database organized: EA Control Forms were completed in 2 parts during dispatch, with the same EA ID and the same Enumerator Number on both parts and the number of households in the EA filled in; the scannable part was detached and scanned in the office.

Completed Census Forms: Completed forms were retrieved from the field (about 90,000 EAs), and reception and organization of the filled-in forms was completed. About 33 teams of 3 persons each were organized for registering and organizing forms. Retrieval of each EA was checked and registered, and the presence of all form types was checked for each EA. Control forms were also used to check the completeness of EAs.

Completed Census Forms (cont'd): Types of the 2007 Census forms: short questionnaires; long questionnaires; Household Listing Forms; Summary Forms; Community Level Forms; EA Control Forms (Batch Header Forms). On the EA Control Forms, the EA ID and the number of households are filled in and a unique Enumerator Number is assigned; they are scanned to create the EA Database.
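
Checking that every form type is present for an EA amounts to a set difference against the list of expected types. The sketch below assumes, for illustration only, that each EA box should contain at least one form of each of the six types listed on this slide; the short type labels are hypothetical.

```python
from typing import Iterable, List

# Hypothetical short labels for the six form types named on the slide.
REQUIRED_FORM_TYPES = {
    "short",
    "long",
    "household_listing",
    "summary",
    "community",
    "ea_control",
}


def missing_form_types(received: Iterable[str]) -> List[str]:
    """Return the required form types not found in an EA box, sorted by name."""
    return sorted(REQUIRED_FORM_TYPES - set(received))
```

An EA whose result is an empty list can be registered as complete; otherwise the missing types can be followed up with the field office.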

Long Questionnaire

Batch Control Form Summary Form

Actual Scanning Process (Census Forms): Organized forms are taken from the store to the waiting room. Batch header information is printed and associated with its respective EA box. The existence of each EA is verified, and checked EAs are sent to the scanning room. Scanned forms are finally sent back to the stores. Captured data are validated and key-corrected. Key-correction involves checking and correcting missing marks, multi-marks and partial marks.
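
The three conditions sent to key-correction can be sketched as a classification over the darkness of each bubble in a single-answer field. The thresholds and the 0.0 to 1.0 darkness scale below are assumptions for illustration; the slides do not describe how the DRS software decides that a mark is full, partial or absent.

```python
from typing import List


def classify_field(darkness: List[float], full: float = 0.6, faint: float = 0.25) -> str:
    """Classify one single-answer OMR field from its bubble darkness values.

    darkness holds one value per bubble, from 0.0 (blank) to 1.0 (fully shaded).
    Both thresholds are hypothetical and chosen only for this sketch.
    """
    full_marks = [value for value in darkness if value >= full]
    partial_marks = [value for value in darkness if faint <= value < full]
    if len(full_marks) > 1:
        return "multi-mark"
    if partial_marks:
        return "partial mark"
    if len(full_marks) == 1:
        return "ok"
    return "missing mark"
```

Only fields classified as "ok" would pass straight through; the other three outcomes correspond to the cases an operator resolves against the scanned image.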

Actual Scanning Process (cont'd): Scanned and validated data are exported to text format, the format suitable for computer editing and tabulation. Backups of the scanned / captured data are taken on the database server and, externally, on high-capacity tape cartridges (HP Ultrium Data Cartridge, 400 GB).

Actual Scanning Process (cont'd): All Census forms have been scanned. The scanning for the 10 sedentary Regions was carried out from mid-August 2007 to mid-December 2007. The scanning for the Affar and Somali Regions took about one month including checking (mid-January to mid-February 2008). 44 scanning operators were assigned and 11 scanners were used, in 2 shifts per day, 7 days per week. Validation and key-correction of the scanned data is done.

Census Forms Scanning Process Key-Correction

Data Cleaning / Computer Editing: A Batch Edit Program, based on edit specifications provided by subject matter specialists, is developed and run on the scanned, key-corrected and exported data. The software used for editing is the Census and Survey Processing System (CSPro). The Batch Edit Application (.bch) is the CSPro component used to clean the data through editing and imputation.
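
To give a feel for what editing and imputation means in practice, here is a minimal sketch in Python rather than in the CSPro language. It applies one hypothetical edit rule (age must be an integer from 0 to 110) and a simple hot-deck style imputation that borrows the most recent valid age seen for the same sex. The rule, the limit and the field names are assumptions; the actual edit specifications of the 2007 Census are not given in the slides.

```python
from typing import Dict, List, Optional

Record = Dict[str, Optional[int]]


def edit_ages(records: List[Record], max_age: int = 110) -> List[Record]:
    """Return edited copies of the records with invalid ages imputed where possible.

    A valid age updates the donor value for that sex. An invalid age is
    replaced by the current donor for the same sex and flagged as imputed.
    If no donor exists yet, the record is left unchanged and not flagged.
    """
    donors: Dict[Optional[int], int] = {}
    edited: List[Record] = []
    for original in records:
        record = dict(original)
        age = record.get("age")
        is_valid = isinstance(age, int) and 0 <= age <= max_age
        record["age_imputed"] = 0
        if is_valid:
            donors[record["sex"]] = age
        elif record["sex"] in donors:
            record["age"] = donors[record["sex"]]
            record["age_imputed"] = 1
        edited.append(record)
    return edited
```

Keeping an imputation flag on each record makes it possible to report afterwards how much of a variable was changed by the edit.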

Report Generation / Tabulation: Raising factors are attached to the edited long questionnaire data. Tabulation programs (in CSPro) are prepared and tested. Tables will be produced in accordance with the Tabulation Plan. Final data will be organized in various formats (ASCII, SPSS) and sent to the Central Databank for archiving and dissemination.
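
A raising factor is a weight attached to each sampled record so that sample counts can be scaled up to population estimates; it is commonly the inverse of the sampling fraction. The sketch below shows a weighted one-way tabulation under that reading. The field names and weights are hypothetical, and the actual tables were produced in CSPro, not in Python.

```python
from collections import defaultdict
from typing import Dict, Hashable, List


def weighted_table(records: List[dict], key: str, weight: str = "raising_factor") -> Dict[Hashable, float]:
    """Sum the raising factors of the records within each category of `key`."""
    totals: Dict[Hashable, float] = defaultdict(float)
    for record in records:
        totals[record[key]] += record[weight]
    return dict(totals)
```

With a raising factor of 5.0 on each record, for example, every sampled person stands for five people in the estimated totals.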

Problems Encountered. I. Scanning: A batch might slip through unscanned during data capture, or be scanned only in part. Scanned forms are sometimes misplaced in the wrong boxes. Storage space on the scanning machines is limited: when a scanner's disk becomes full, scanning becomes difficult, so scanned images must constantly be moved to the storage server. Scanned images sometimes cannot be located on the storage server.
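
Batches that are skipped or only partly scanned can be caught by reconciling the scanned output against the household counts held in the batch header database. The sketch below assumes, for simplicity, one questionnaire per household, which may not hold for the real forms; the function and status labels are hypothetical.

```python
from typing import Dict


def reconcile_batches(expected: Dict[str, int], scanned: Dict[str, int]) -> Dict[str, str]:
    """Compare expected household counts per EA with scanned questionnaire counts.

    Returns only the EAs that need attention, each with a status label.
    """
    report: Dict[str, str] = {}
    for ea_id, expected_count in expected.items():
        scanned_count = scanned.get(ea_id, 0)
        if scanned_count == 0:
            report[ea_id] = "unscanned"
        elif scanned_count < expected_count:
            report[ea_id] = "partially scanned"
        elif scanned_count > expected_count:
            report[ea_id] = "excess forms"
    for ea_id in scanned:
        if ea_id not in expected:
            report[ea_id] = "not in batch header database"
    return report
```

An "excess forms" status may point to forms misplaced in the wrong box, which is another of the problems noted on this slide.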

Problems Encountered (cont'd). II. Key Correction: Problems were encountered in retrieving scanned images for key correction. Key correction took a long time because it is done manually. As stated earlier, the key correction process was based on fixing missing marks, multi-marks and partial marks.

Problems Encountered (cont'd). III. Processing the data: The large volume of data takes a long time to process (8 hrs). Frequent power failures badly affect the processing sessions. The tabulation component of the CSPro software sometimes fails unpredictably (it is a newly developed tabulation system).

In summary: Registration and organization of all completed Census forms is done. The scanning and key correction of the Census questionnaires is completed. The scanning of the Household Listing forms is done. Draft preliminary Census results have been produced. Additional comment: A quick manual review (editing and coding) of the filled-in forms might be needed prior to the scanning process.

Thank You !