UNSD Census Workshop Data Capture: Intelligent Character Recognition


1 UNSD Census Workshop
Day 2 - Session 8
Data Capture: Intelligent Character Recognition
Andy Tye – International Manager
DRS are Worldwide specialists in Census data capture

2 Data Capture Intelligent Character Recognition (ICR) Elements
- Form design
- Hardware/Software requirements
- Scanners
- Computer infrastructure
- Workflow
- Accuracy
- Advantages
- Disadvantages

3 Data Capture - ICR Forms design
- Typical stock grade paper (90 GSM)
- Corner Stones advised
- Dropout colour is recommended but not essential

4 Data Capture - ICR
Hardware requirements
- Image Scanners (TWAIN or ISIS)
- Database Server (Full redundancy)
- Storage Server – Terabytes (RAID 5, Mirrored, etc.)
- Network (Gb preferred)
- Administrator PC
- CS-Pro PCs
- Key correction PCs (Verification)
- Character Inspection PCs (Mass verification - optional)
- Scanner PCs
- Automatic data capture PCs
Software requirements
- MS-SQL or other database
- Data Storage, Archive and Retrieval
- Backup Software
- Software for Administrator PC
- CS-Pro for analysis and reporting PCs
- Software for Key correction PCs
- Software for Character inspection PCs
- Software for Scanner PCs
- Software for automatic data capture

5 Data Capture - ICR Typical Workflow ICR

6 Data Capture - ICR Typical Workflow
Paper Movement – Processing Centre/s

7 Data Capture - ICR Typical Workflow Receiving

8 Data Capture - ICR Typical Workflow Logging/Checking
- Open Batch
- Verify Contents
- Register Batch

9 Data Capture - ICR Typical Workflow Sifting
- Orientation
- Other Forms

10 Data Capture - ICR Typical Workflow Spine removal
- Cut Booklets
- 30,000/day

11 Data Capture - ICR Typical Workflow Scanning
- Double Sided
- High Speed
- Double Detection
- Ease of Use

12 Data Capture - ICR Typical Workflow Scanning/sorting
Automatic Identification Data Capture

13 Data Capture - ICR Typical Workflow Storage
- Conditions
- Retrieval
- Space

14 Data Capture - ICR Typical Workflow
Image Movement/Data Extraction – Processing Centre/s

15 Data Capture - ICR Typical Workflow Image interpretation
- Automated Process
- Background Task
- Page Identification
- De-skew
- Image Clean up
- Pre-defined Areas

16 Data Capture - ICR Typical Workflow Character inspection
- Tiling
- High Confidence
- Operator Decision
- Field Context
- Tall to Short

17 Data Capture - ICR Typical Workflow Key correction
- Low Confidence
- Operator Decision
- From Context
- External Verification
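
The split between character inspection (high confidence) and key correction (low confidence) can be sketched roughly as below. This is only an illustration: the slides give no numeric threshold, so the 0.90 cut-off, the `RecognisedChar` record and the `route` function are all hypothetical names, and grouping high-confidence characters by their proposed value is an assumption about what "tiling" shows the operator.

```python
from collections import defaultdict
from dataclasses import dataclass

# Hypothetical cut-off: the slides do not state a numeric threshold.
HIGH_CONFIDENCE = 0.90


@dataclass
class RecognisedChar:
    form_id: str
    field: str
    value: str         # character proposed by the ICR engine
    confidence: float  # assumed to be an engine score between 0 and 1


def route(chars, threshold=HIGH_CONFIDENCE):
    """Split engine output into inspection tiles and a key correction queue."""
    tiles = defaultdict(list)  # high confidence, grouped by proposed value
    key_correction = []        # low confidence, to be re-keyed by an operator
    for ch in chars:
        if ch.confidence >= threshold:
            tiles[ch.value].append(ch)
        else:
            key_correction.append(ch)
    return dict(tiles), key_correction


sample = [
    RecognisedChar("F1", "age", "7", 0.99),
    RecognisedChar("F1", "age", "1", 0.42),
    RecognisedChar("F2", "age", "7", 0.95),
]
tiles, queue = route(sample)
```

With this sample, both characters read as "7" land on the same inspection tile, while the doubtful "1" goes to key correction.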

18 Data Capture - ICR Typical Workflow Key Correction
- ASCII File
- CSV Format
- 1 Line/Form
- CSPro Import
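
As a rough sketch of the "CSV format, 1 line/form" output, the snippet below writes one comma-separated line per corrected form. The field names, the `forms_to_csv` helper and the column order are hypothetical; the slides do not describe the actual record layout or what the CSPro import expects.

```python
import csv
import io


def forms_to_csv(forms, field_order):
    """Return CSV text with exactly one line per form, columns in field_order."""
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    for form in forms:
        # Missing fields are written as empty cells so every line has the same width.
        writer.writerow([form.get(name, "") for name in field_order])
    return buf.getvalue()


# Hypothetical corrected forms; real field names would come from the form design.
forms = [
    {"form_id": "0001", "age": "34", "sex": "F"},
    {"form_id": "0002", "age": "7"},
]
csv_text = forms_to_csv(forms, ["form_id", "age", "sex"])
```

Keeping every line the same width means the importing side can rely on column position alone.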

19 Data Capture - ICR Typical Workflow ICR

20 Data Capture - ICR Accuracy
This is always the first question.
Handprint
- Numeric only in isolated fields: 98%
- Numeric only in semi-constrained fields: 95-96%
- Alpha upper case only: 90%
- Alpha lower case only: 85-87%
- Alpha mixed case: 75-80%
- Alpha/Numeric mixed case: 50% or less
Reduce by 5% if there are special characters (not a-z and 0-9).
The accuracy level post data correction (i.e. the final output accuracy) should be 100% (subject to good operators).
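
The slide does not say whether these percentages apply per character or per field. If they are read as per-character rates, and errors are assumed to be independent, the chance that a whole field is read correctly falls quickly with field length. The `field_accuracy` helper below is a hypothetical illustration of that assumption, not a figure from the presentation.

```python
def field_accuracy(char_accuracy, length):
    """Probability that every character in a field is correct,
    assuming independent per-character errors (an assumption)."""
    return char_accuracy ** length


# A 5-digit numeric field at the 98% rate quoted for isolated numeric fields.
five_digit_numeric = field_accuracy(0.98, 5)

# A 10-letter upper-case name at the 90% rate quoted for upper-case alpha.
ten_letter_upper = field_accuracy(0.90, 10)
```

Under these assumptions the 5-digit field comes out at roughly 90% and the 10-letter name at roughly 35%, which is why operator correction still matters after recognition.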

21 Data Capture - ICR Accuracy continued…
- The accuracy of all modern ICR engines is broadly comparable
- The major differences between suppliers' solutions are the methods and workflow used in each offering
- Detecting a false positive takes 10 times longer than entering a character recognised with low confidence, so false positives (substitutions) are the most expensive errors
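
To see why substitutions dominate the correction budget, the 10x ratio can be put into a simple effort estimate. Only the factor of 10 comes from the slide; the character counts and the seconds per keyed character below are made-up inputs, and `correction_effort` is a hypothetical helper.

```python
FALSE_POSITIVE_FACTOR = 10  # ratio stated on the slide


def correction_effort(low_confidence, false_positives, seconds_per_key_entry):
    """Estimated operator seconds: low-confidence characters cost one unit
    each, false positives cost FALSE_POSITIVE_FACTOR units each."""
    units = low_confidence + FALSE_POSITIVE_FACTOR * false_positives
    return seconds_per_key_entry * units


# Made-up batch: 1000 low-confidence characters, 100 false positives,
# 2 seconds to key one low-confidence character.
effort_seconds = correction_effort(1000, 100, 2.0)
```

In this made-up batch the 100 false positives cost as much operator time as all 1000 low-confidence characters.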

22 Data Capture - ICR Accuracy continued…
Accuracy can be improved by:
- Restricting the responses to any given question
- Using external verification
- Using multiple ICR engines to ‘vote’ (which is expensive)
- Training your ICR engines on local handwriting styles (if possible)
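
A minimal sketch of the multi-engine ‘vote’ follows, assuming each engine returns one proposed character per position. The slides do not say how votes are combined; strict majority is an assumption here, and `vote` is a hypothetical name. A result of `None` stands for "no agreement, send to an operator".

```python
from collections import Counter


def vote(readings):
    """Return the character proposed by a strict majority of engines,
    or None when no single value has more than half of the votes."""
    if not readings:
        return None
    value, count = Counter(readings).most_common(1)[0]
    return value if count > len(readings) / 2 else None


agreed = vote(["7", "7", "1"])    # two of three engines agree
disputed = vote(["7", "1", "4"])  # no majority
```

Running every image through several engines multiplies licence and processing costs, which matches the slide's note that voting is expensive.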

23 Data Capture - ICR Advantages
- No specialist hardware required
- An image archive can be automatically produced of every form
- Very high speed scanning can be achieved
- Both OMR and ICR can be interpreted using ICR software
- Forms designed for ICR are relatively easy to fill in
- Locally printed forms can be used
- Allows capturing much more complex data than with OMR alone

24 Data Capture - ICR Disadvantages
- Significant hardware/software and trained IT staff will be required
- Accuracy dependent on manual intervention
- High calibre IT staff are required to support the ICR system
- More complex cost/benefit analysis than with OMR alone

25 Data Capture - ICR Indicative Costs
For 65 Million Population Census (20M Single Sided A4 household form)
- Processing period of 12 Weeks (8 hours/day, 5 days/week)
- Hardware: $800k-$1M in total
- Software: $700k-$1.3M in total
- Total Indicative Costs are $1.5M to $2.3M
- No. of Staff in total 6-10 Managers PC Operators
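
The figures on this slide imply a required throughput and a cost per form, which can be checked with simple arithmetic. All inputs below are taken from the slide; the derived values assume processing is spread evenly over the whole period, which the slide does not state.

```python
FORMS = 20_000_000
WEEKS = 12
DAYS_PER_WEEK = 5
HOURS_PER_DAY = 8

working_days = WEEKS * DAYS_PER_WEEK           # 60 working days
working_hours = working_days * HOURS_PER_DAY   # 480 working hours

forms_per_day = FORMS / working_days           # about 333,000 forms/day
forms_per_hour = FORMS / working_hours         # about 41,700 forms/hour

# Hardware plus software, low and high ends of the quoted ranges (USD).
total_low = 800_000 + 700_000
total_high = 1_000_000 + 1_300_000

cost_per_form_low = total_low / FORMS          # about $0.075 per form
cost_per_form_high = total_high / FORMS        # about $0.115 per form
```

The hardware and software ranges add up to the quoted $1.5M to $2.3M total, so the slide's figures are internally consistent.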

26 Data Capture - OMR Summary
- ICR offers considerable flexibility at the cost of higher skilled IT personnel
- The single most important factor for timely and accurate data capture is to make sure ‘the forms are filled in correctly and are returned in good condition’

27 UNSD Census Workshop
Day 2 - Session 8
Thank you for listening
Andy Tye – International Manager

