SADC Course in Statistics The Use of Optical Character Recognition Technology In National Statistical Offices.

What is Optical Character Recognition?
It is a technology that recognises and captures alphanumeric characters on a computer at high speed. It provides a complete form-processing and document-capture solution. It is sometimes called Optical Character Reader (OCR) or Intelligent Character Reader (ICR).

Why do National Statistical Offices require OCR?
Many National Statistical Offices (NSOs) are moving away from traditional manual data capture by adopting Optical Character Recognition technology. Its use may offer the following benefits: It allows the NSO to process information more quickly, more accurately and more efficiently, so that data can be released and disseminated timeously to support evidence-based decision making.

Why require OCR? (contd)
It reduces data entry time and increases accuracy compared with manual data entry by operators. It allows validation rules to be built into the system to validate and correct the data. Errors can be highlighted in different colours, which makes review and correction easier. Scanned forms are stored digitally, which removes the need for physical storage of questionnaires, since these can be destroyed after the initial scanning, recognition and repair.

Why require OCR? (contd)
The system stores data in a database, which facilitates data analysis. It reduces the number of data entry personnel required.
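The validation rules mentioned among the benefits above could look something like the following minimal sketch. The field names, the allowed ranges and the `validate_record` helper are illustrative assumptions, not part of any particular OCR product or of the course material. Each captured record is checked field by field, and failing fields are flagged for an operator to review.

```python
# Hypothetical validation rules for captured questionnaire fields.
# Field names and ranges are assumptions chosen for illustration only.

def in_range(value, low, high):
    """True if value is an integer string between low and high inclusive."""
    try:
        return low <= int(value) <= high
    except ValueError:
        # Non-numeric or empty text, e.g. a misread character.
        return False

RULES = {
    "age": lambda v: in_range(v, 0, 120),
    "sex": lambda v: v in {"1", "2"},
    "household_size": lambda v: in_range(v, 1, 50),
}

def validate_record(record):
    """Return the names of fields that are missing or fail their rule."""
    return [field for field, rule in RULES.items()
            if not rule(record.get(field, ""))]

records = [
    {"age": "34", "sex": "1", "household_size": "5"},
    {"age": "3A", "sex": "2", "household_size": "4"},   # misread digit
    {"age": "27", "sex": "7", "household_size": ""},    # bad code, blank field
]

# Map each record index to the fields an operator should review.
flagged = {i: validate_record(r) for i, r in enumerate(records)}
print(flagged)  # {0: [], 1: ['age'], 2: ['sex', 'household_size']}
```

In a real system the flagged fields would typically be the ones shown to the operator in a different colour during the repair step.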

What are the disadvantages of OCR?
Enumerators gather data in the field much more slowly, because filling in OCR/ICR forms requires care to write inside the specified boxes. OCR is severely limited when it comes to human handwriting: variation in enumerator handwriting can cause problems in form processing and may decrease the character recognition rate. Errors in filling in questionnaires also decrease the recognition rate. Printing that is too dark or too light can reduce the recognition rate of characters.

Factors to consider when implementing OCR
Although OCR can speed up data processing, analysis and ultimately the release of data, adopting the technology is an organisational decision. The following considerations come to mind: Does the organisation have the capacity to use the technology? If not, is it possible to outsource the skills, can the outsourcing be funded, and is there a possibility of building capacity in the near future? How does the quality of data obtained through OCR/ICR compare with that obtained through human labour, particularly at data entry?

Factors to consider (contd)
Differences in the error rate between OCR/ICR and traditional data entry personnel. The cost of the technology compared with the use of human labour. In the South African case, the planned use of OCR technology in Census 2001 was expected to reduce costs by between 30 and 40 percent compared with the 1996 Census. In essence, the above factors ask whether Optical Character Recognition is an appropriate technology for National Statistical Offices.

Factors to consider (contd)
The need to define clearly the roles and responsibilities of the District Office, Provincial Office and Head Office. This entails deciding where manual editing of questionnaires, data entry, final analysis and production of statistical data or information will be done. Pilot testing of questionnaires to evaluate enumerator training, data entry by enumerators and the OCR technology itself, e.g. character recognition. This activity requires funding, and the question to ask is: do National Statistical Offices have the funds to carry out these activities?

How to obtain good results from scanning?
There are three requirements: the quality of the form; appropriate preparation of field staff and their supplies; and appropriate design of the quality control activities.

Quality of the form
The quality of the form may be improved in the following ways: Select paper of adequate quality. Use paper heavier than 80 grams per square metre to avoid paper jams and to prevent the scanner from reading through to the other side of the page. Source a reliable printing press. Select an appropriate drop-out colour, usually red, so that the system picks up only the meaningful information from an OCR form. It is advisable to use marks or ticks as much as possible. Avoid open-ended questions.

Preparation of field staff and their supplies
Emphasis should be placed on the following aspects: Careful handling and filing of materials and documents. This means that enumerators should have appropriate supplies such as a document bag, several black pencils, and correctors or erasers, among other supplies. Training of field staff should pay attention to how to write numeric and alphabetic characters so as to achieve maximum character recognition. Spend time emphasising handwriting that scans well.

Field staff and their supplies (contd)
Give adequate instructions stating that: each box should contain only one character; characters should not extend outside the designated boxes; unnecessary marks such as dots and strokes are prohibited; strokes should not end with extensions; all lines should be connected without breaks; and all lines and dots should be written with the same pressure. Ensure that all answers in the questionnaire are numeric codes.

Field staff and their supplies (contd)
Instructions should explain the causes of OCR reading errors, e.g. a form in bad condition because it is dirty, folded or crumpled, or a form that is incompletely filled in.

Quality control process
A number of quality control processes have to be put in place to ensure the following: that all questionnaires have been scanned completely, with no omissions or duplications; and that quality assurance tests are carried out on the recognition output so that acceptable recognition rates are maintained.
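The two quality control checks above can be sketched as follows. This is only an illustration under stated assumptions: the questionnaire identifiers, the function names `check_scan_completeness` and `recognition_rate`, and the idea of measuring recognition against a manually keyed verification sample are hypothetical and not taken from the course material.

```python
from collections import Counter

def check_scan_completeness(expected_ids, scanned_ids):
    """Compare the questionnaire IDs that should exist with those scanned.

    Returns three sorted lists: missing (omissions), duplicated
    (scanned more than once) and unexpected (not on the control list).
    """
    counts = Counter(scanned_ids)
    expected = set(expected_ids)
    missing = sorted(expected - set(counts))
    duplicated = sorted(i for i, n in counts.items() if n > 1)
    unexpected = sorted(set(counts) - expected)
    return missing, duplicated, unexpected

def recognition_rate(recognised, keyed):
    """Share of characters where OCR output matches a manually keyed sample.

    Both arguments are lists of field strings in the same order; the keyed
    values are treated as the truth. Characters missing from the OCR
    output count as errors.
    """
    total = sum(len(k) for k in keyed)
    correct = sum(r == k
                  for rec, key in zip(recognised, keyed)
                  for r, k in zip(rec, key))
    return correct / total if total else 0.0

# Hypothetical control list and scan log for one enumeration area.
expected = ["Q001", "Q002", "Q003", "Q004"]
scanned = ["Q001", "Q002", "Q002", "Q004"]
missing, duplicated, unexpected = check_scan_completeness(expected, scanned)
print(missing, duplicated, unexpected)  # ['Q003'] ['Q002'] []

# Hypothetical verification sample: one of six characters was misread.
rate = recognition_rate(["1234", "07"], ["1284", "07"])
print(round(rate, 3))  # 0.833
```

An office would compare the computed rate against whatever acceptance threshold it has set for the census or survey, and re-scan or re-key batches that fall below it.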

Sources
PG/DOCUMENTS/STATISTICS/JOURNALVOL1FULL.PDF
http://intranet.unescap.org/stat/pop-it/pop-guide/capture_ch06.pdf
National Sample Census of Agriculture 2002/2003, Volume 1: Technical and Operation Report, September 2006.
