UN Workshop on Data Capture, Dar es Salaam Session 7 Data Capture

Presentation transcript:

UN Workshop on Data Capture, Dar es Salaam. Session 7: Data Capture. Richard Lang, International Manager

Agenda: OCR (Optical Character Recognition); ICR (Intelligent Character Recognition); DFR (Dynamic Form Recognition). 12/9/2018

OCR = optical character recognition. The technology dates to 1929, when Gustav Tauschek obtained a patent on OCR in Germany for a mechanical device that used templates. The first commercial system was installed at Reader's Digest in 1955 and was donated to the Smithsonian Institution years later. Today, recognition of machine-written text is considered largely a solved problem, with accuracy rates exceeding 99%.

OCR: Beta Systems is well experienced with these recognition engines in banks. Germany: OCR-A, with the special characters ⑁ (chair), ⑀ (hook) and ⑂ (fork). Austria: OCR-B, with + (plus).

ICR (Intelligent Character Recognition): A handwriting recognition system. Thanks to ongoing development, the technique is far ahead of OCR. It allows a computer to learn different styles of handwriting during or before processing, which improves accuracy and recognition rates.

ICR Process: Capture the image with scanners, then process it by ICR and/or OCR. Segmentation is a very important step: the engine decides, using homogeneity criteria, whether pixels belong to the foreground or to the background. Human editors can make that decision depending on the context. Compare computed tomography, where the computer reconstructs the picture from X-ray measurements taken at different angles. The first step yields only a suitable starting point (sets of pixels). A growing process then links all neighbouring pixels (computation of valleys and peaks with a high degree of confidence).
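
The segmentation step can be sketched roughly as follows. This is a minimal illustration, not the engine described on the slide: it assumes a grey-scale image stored as a list of rows, a fixed threshold of 128 as the homogeneity criterion, and 4-connected neighbours for the growing process. The function names `binarize` and `grow_regions` are hypothetical.

```python
from collections import deque

def binarize(gray, threshold=128):
    """Mark pixels darker than the threshold as foreground (1), others as background (0)."""
    return [[1 if px < threshold else 0 for px in row] for row in gray]

def grow_regions(binary):
    """Group 4-connected foreground pixels into regions, growing each from a seed pixel."""
    height, width = len(binary), len(binary[0])
    seen = [[False] * width for _ in range(height)]
    regions = []
    for y in range(height):
        for x in range(width):
            if binary[y][x] == 0 or seen[y][x]:
                continue
            # New seed found: link every reachable neighbouring foreground pixel.
            queue = deque([(y, x)])
            seen[y][x] = True
            region = []
            while queue:
                cy, cx = queue.popleft()
                region.append((cy, cx))
                for ny, nx in ((cy - 1, cx), (cy + 1, cx), (cy, cx - 1), (cy, cx + 1)):
                    if 0 <= ny < height and 0 <= nx < width:
                        if binary[ny][nx] and not seen[ny][nx]:
                            seen[ny][nx] = True
                            queue.append((ny, nx))
            regions.append(region)
    return regions

# Made-up 3x5 image: two dark strokes on a white background.
page = [
    [255, 20, 255, 255, 30],
    [255, 25, 255, 255, 35],
    [255, 255, 255, 255, 40],
]
regions = grow_regions(binarize(page))
```

Each returned region is a list of pixel coordinates that could be passed on to the later steps as one candidate character.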

ICR Process: Pre-processing. Deskew; shift, rotate; stretch.
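
As a rough illustration of these pre-processing operations, the sketch below applies them to point coordinates rather than to a full image. It assumes the skew is estimated from two reference points that should lie on a horizontal line; the function names and the sample coordinates are hypothetical.

```python
import math

def skew_angle(left, right):
    """Angle in radians of the line through two reference points given as (x, y)."""
    return math.atan2(right[1] - left[1], right[0] - left[0])

def rotate(point, angle, origin=(0.0, 0.0)):
    """Rotate a point by the given angle around the origin point."""
    ox, oy = origin
    dx, dy = point[0] - ox, point[1] - oy
    c, s = math.cos(angle), math.sin(angle)
    return (ox + c * dx - s * dy, oy + s * dx + c * dy)

def shift(point, dx, dy):
    """Translate a point by (dx, dy)."""
    return (point[0] + dx, point[1] + dy)

def stretch(point, sx, sy):
    """Scale a point by independent horizontal and vertical factors."""
    return (point[0] * sx, point[1] * sy)

# Two marks that should sit on one horizontal line, scanned slightly skewed.
left, right = (0.0, 0.0), (100.0, 5.0)
angle = skew_angle(left, right)
# Deskew: rotate by the opposite angle so the line becomes horizontal again.
corrected = rotate(right, -angle, origin=left)
```

A real engine would resample the whole image with the same transform; the point-based version only shows the geometry.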

ICR Process: Image enhancement. Decrease or increase contrast; clean up (de-noise, halftone removal) so that the recognition engine gives the best results.
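
A minimal sketch of two such clean-up operations, under simple assumptions: contrast is increased by a linear min-max stretch to the range 0 to 255, and de-noising removes foreground pixels that have no foreground neighbour. Both function names are hypothetical, and halftone removal is not covered.

```python
def stretch_contrast(gray):
    """Linearly rescale grey values so the darkest becomes 0 and the brightest 255."""
    flat = [px for row in gray for px in row]
    lo, hi = min(flat), max(flat)
    if hi == lo:
        # A uniform image has no contrast to stretch; return a copy unchanged.
        return [row[:] for row in gray]
    return [[round((px - lo) * 255 / (hi - lo)) for px in row] for row in gray]

def remove_specks(binary):
    """Clear foreground pixels (1) whose eight neighbours are all background (0)."""
    height, width = len(binary), len(binary[0])
    out = [row[:] for row in binary]
    for y in range(height):
        for x in range(width):
            if binary[y][x] != 1:
                continue
            window = sum(
                binary[ny][nx]
                for ny in range(max(0, y - 1), min(height, y + 2))
                for nx in range(max(0, x - 1), min(width, x + 2))
            )
            if window == 1:  # only the pixel itself is set
                out[y][x] = 0
    return out

enhanced = stretch_contrast([[100, 150], [125, 200]])
cleaned = remove_specks([[1, 1, 0, 0], [0, 0, 0, 0], [0, 0, 0, 1]])
```

In the example, the two adjacent pixels survive as part of a stroke while the isolated pixel in the corner is treated as dirt.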

ICR Process: Feature extraction (data reduction).
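
One common way to reduce a character image to a few numbers is zoning, sketched below. The slide does not say which features the engine uses, so this is only an assumed example: the binary glyph is split into a grid of zones and each zone is replaced by its share of foreground pixels. The name `zone_features` is hypothetical.

```python
def zone_features(glyph, zones=2):
    """Return the foreground density of each cell in a zones x zones grid.

    Assumes the glyph is a rectangular list of 0/1 rows with at least
    `zones` rows and `zones` columns.
    """
    height, width = len(glyph), len(glyph[0])
    features = []
    for zy in range(zones):
        for zx in range(zones):
            y0, y1 = zy * height // zones, (zy + 1) * height // zones
            x0, x1 = zx * width // zones, (zx + 1) * width // zones
            cell = [glyph[y][x] for y in range(y0, y1) for x in range(x0, x1)]
            features.append(sum(cell) / len(cell))
    return features

# A 4x4 glyph resembling a handwritten "1".
one = [
    [0, 0, 1, 0],
    [0, 1, 1, 0],
    [0, 0, 1, 0],
    [0, 0, 1, 0],
]
features = zone_features(one)
```

Here 16 pixels are reduced to 4 feature values, which is the data reduction the slide refers to.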

ICR Process: Classification. Example: a one was written. Result: 90 % = 1, 8 % = 7, 2 % = 4.
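
The percentages in this example can be read as normalised classifier scores. The sketch below assumes the engine produces a non-negative raw score per candidate character; the raw values 45, 4 and 1 are invented so that the result matches the slide, and `rank_candidates` is a hypothetical name.

```python
def rank_candidates(scores):
    """Convert raw scores to percentages and sort candidates, best first."""
    total = sum(scores.values())
    if total == 0:
        return []
    ranked = [(char, 100 * score / total) for char, score in scores.items()]
    return sorted(ranked, key=lambda pair: pair[1], reverse=True)

# Invented raw scores for a handwritten "1".
ranked = rank_candidates({"7": 4, "1": 45, "4": 1})
best_char, best_confidence = ranked[0]
```

The first entry of the ranked list is the recognised character; the remaining entries are the alternatives with their confidences.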

ICR Algorithm: Neural network; kNN (k-Nearest Neighbour); SVM (Support Vector Machine). SVMs simultaneously minimize the empirical classification error and maximize the geometric margin; hence they are also known as maximum margin classifiers.
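
To make two of these terms concrete, the sketch below shows a plain k-nearest-neighbour vote and the geometric margin of a separating line. It is illustrative only: the two-dimensional feature vectors are invented, the separating line is picked by hand rather than trained, and an actual SVM would search for the line that makes this margin as large as possible. It uses `math.dist`, which requires Python 3.8 or later.

```python
import math
from collections import Counter

def knn_classify(sample, training, k=3):
    """Return the majority label among the k training vectors nearest to sample."""
    nearest = sorted(training, key=lambda item: math.dist(sample, item[0]))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]

def geometric_margin(weights, bias, labelled_points):
    """Smallest signed distance from any point to the line weights . x + bias = 0.

    labelled_points holds (vector, sign) pairs with sign +1 or -1; a positive
    result means every point lies on its correct side of the line.
    """
    norm = math.hypot(*weights)
    return min(
        sign * (sum(w * x for w, x in zip(weights, vec)) + bias) / norm
        for vec, sign in labelled_points
    )

# Invented feature vectors for two digit classes.
training = [
    ((0.10, 0.90), "1"),
    ((0.20, 0.80), "1"),
    ((0.15, 0.85), "1"),
    ((0.90, 0.20), "7"),
    ((0.80, 0.10), "7"),
]
predicted = knn_classify((0.2, 0.9), training)

# Hand-picked separating line y = x, written as -x + y = 0.
signed = [(vec, 1 if label == "1" else -1) for vec, label in training]
margin = geometric_margin((-1.0, 1.0), 0.0, signed)
```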

ICR Process: After the classification alternatives are weighed, the appropriate confidence is provided for each. Recognition can be limited to the most probable characters: for example, if only the characters 3, 6 and 0 are possible, the engine can be limited to this set and the results are much better. Voting machine. Usability: security, efficiency and accuracy.
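
Both ideas can be sketched in a few lines. The example assumes that the engine exposes a confidence per candidate character and that voting means combining the top answers of several recognition engines by simple majority; neither detail is spelled out on the slide, and all names and numbers are hypothetical.

```python
from collections import Counter

def restrict(confidences, allowed):
    """Keep only the allowed characters and renormalise confidences to 100 percent."""
    kept = {ch: score for ch, score in confidences.items() if ch in allowed}
    total = sum(kept.values())
    if total == 0:
        return {}
    return {ch: 100 * score / total for ch, score in kept.items()}

def best(confidences):
    """Return the character with the highest confidence, or None if there is none."""
    return max(confidences, key=confidences.get) if confidences else None

def vote(answers):
    """Return the answer given by a strict majority of engines, else None (reject)."""
    if not answers:
        return None
    winner, count = Counter(answers).most_common(1)[0]
    return winner if count > len(answers) / 2 else None

# Invented confidences: unrestricted, the engine slightly prefers "8".
raw = {"8": 45, "3": 40, "6": 10, "0": 5}
unrestricted = best(raw)
restricted = best(restrict(raw, {"3", "6", "0"}))

agreed = vote(["3", "3", "8"])    # two of three engines agree
rejected = vote(["3", "8", "6"])  # no majority, so the field goes to manual correction
```

When the field definition rules out "8", the restricted result switches to "3", which is the improvement the slide describes.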

Dynamic Field Recognition: No fixed position is required. If only half of the form is available, that half is still readable. No special forms are required. No timing tracks are necessary on the forms for OMR, yet results are available at the same time. No cleaning of LEDs in the scanner is necessary. Robust against vertical or horizontal stretching or shrinking (e.g. from different printers).

Dynamic Field Recognition recognizes: features (a word as a pixel cloud), boxes, lines and symbols.

Hardware and Software Requirements: Hardware: scanner, PC, network. Disc storage is only necessary if images are needed for audit purposes. Software: scan software; one recognition and voting software for OMR, OCR, ICR and barcode.

Cost Comparatives in general (columns: OMR from image | Dedicated OMR Scanner). Forms Design: Same. Forms Production: - | Up to 50% more. Enumerator Training Up to double the cost Scanners PC Low cost PC PC Operators Servers. Cost of more/new flexibility: low | high.

ICR Advantages: Better than manual keying, which reaches 90 % (plus) correct keys, has a higher substitution rate than automated recognition, is time consuming and allows deliberate manipulation. Better than OMR, because OMR is space consuming. Better than OCR, because OCR handles only machine-written text and is therefore of limited use.

ICR Advantages: Clear accuracy for OMR, because the software removes dirt depending on the mark size and shape. It can detect a line and ignore dirt, which gives a clear result.

ICR Advantages: Barcode, OCR, OMR and ICR recognition with one software.

ICR Advantages: Pro: Only rejected characters or fields need correction; the rest of the form stays untouched. Open to new technologies in the future (faster, better quality). Standardized correction mode. The handwriting of the corresponding country will be recognized. The previously mentioned advantages are not repeated here.

Thank you for your attention