Published by Emma Freeman. Modified over 9 years ago.
1
Character Recognition Internals
Dr. István Marosi
ScanSoft-Recognita, Inc., Hungary
SSIP 2005, Szeged
2
04 Jul 2005, Istvan Marosi
OCR Internals
Main tasks of an OCR system:
- Image acquisition
- Layout recognition
- Text recognition
- User-assisted correction
- Result exportation
3
OCR Internals
Main tasks of an OCR system:
- Image acquisition
  - Get image
    - B/W scanning
    - Gray scanning
    - Color scanning
    - Load from image file
  - Preprocess image
- Layout recognition
- Text recognition
- User-assisted correction
- Result exportation
4
OCR Internals
Main tasks of an OCR system:
- Image acquisition
  - Get image
  - Preprocess image
    - Color separation
    - Thresholding
    - Despeckling
    - Rotation
    - Deskewing
- Layout recognition
- Text recognition
- User-assisted correction
- Result exportation
[Figures: color separation; de-speckle, de-skew]
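Two of the preprocessing steps listed above, thresholding and despeckling, can be sketched as follows. This is a minimal illustration and not the method of any engine named in the talk. The function names, the fixed threshold of 128 and the rule "a black pixel with no black 8-neighbour is a speck" are all assumptions.

```python
def threshold(gray, level=128):
    """Binarize a gray image: 1 = black (ink) where the value is below `level`.

    `level` is an assumed fixed threshold; real systems usually adapt it.
    """
    return [[1 if value < level else 0 for value in row] for row in gray]


def despeckle(bw):
    """Clear isolated black pixels, i.e. those with no black 8-neighbour."""
    height, width = len(bw), len(bw[0])
    cleaned = [row[:] for row in bw]
    for y in range(height):
        for x in range(width):
            if bw[y][x] != 1:
                continue
            # Count black pixels in the 3x3 window, then drop the pixel itself.
            black_neighbours = sum(
                bw[ny][nx]
                for ny in range(max(0, y - 1), min(height, y + 2))
                for nx in range(max(0, x - 1), min(width, x + 2))
            ) - 1
            if black_neighbours == 0:
                cleaned[y][x] = 0
    return cleaned
```

For example, `despeckle(threshold(gray))` removes a lone dark pixel but keeps two dark pixels that touch each other.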
5
The Preprocessed Image
Joined chars
8
The Preprocessed Image
Broken chars
11
OCR Internals
Main tasks of an OCR system:
- Image acquisition
- Layout recognition
  - Text zones
    - Columns of flowed text
    - Tables
    - Inverse text
  - Graphic zones
- Text recognition
- User-assisted correction
- Result exportation
12
OCR Internals
Main tasks of an OCR system:
- Image acquisition
- Layout recognition
  - Text zones
  - Graphic zones
    - Line art
    - Photo
- Text recognition
- User-assisted correction
- Result exportation
13
OCR Internals
Main tasks of an OCR system:
- Image acquisition
- Layout recognition
- Text recognition
  - Segmentation
  - Calculation of feature vector elements
  - Classification
  - Language analysis
  - Voting
- User-assisted correction
- Result exportation
14
Segmentation
Which pixel groups belong to a single letter?
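A simple first answer to this question treats every connected group of black pixels as one character. The sketch below labels 8-connected components and returns their bounding boxes. The function name and the 0/1 list-of-lists image format are assumptions made for illustration. Joined characters come out as one box and broken characters as several, so this can only be a starting point.

```python
from collections import deque


def connected_components(bw):
    """Return bounding boxes (left, top, right, bottom) of 8-connected black blobs."""
    height, width = len(bw), len(bw[0])
    seen = [[False] * width for _ in range(height)]
    boxes = []
    for y in range(height):
        for x in range(width):
            if bw[y][x] != 1 or seen[y][x]:
                continue
            # Breadth-first flood fill from an unvisited black pixel.
            queue = deque([(y, x)])
            seen[y][x] = True
            left, top, right, bottom = x, y, x, y
            while queue:
                cy, cx = queue.popleft()
                left, right = min(left, cx), max(right, cx)
                top, bottom = min(top, cy), max(bottom, cy)
                for ny in range(max(0, cy - 1), min(height, cy + 2)):
                    for nx in range(max(0, cx - 1), min(width, cx + 2)):
                        if bw[ny][nx] == 1 and not seen[ny][nx]:
                            seen[ny][nx] = True
                            queue.append((ny, nx))
            boxes.append((left, top, right, bottom))
    return sorted(boxes)  # left-to-right reading order within a line
```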
22
Calculation of FV Elements: Contour Tracing
- Find a (new) white-black transition
- Follow the "edge" of the pixels using the MIN or MAX rule
- Keep track of the white-black transitions already traced
- Collect information while going around
- Repeat the process on new shapes
25
Contour Tracing
- Find a (new) white-black transition
- Follow the "edge" of the pixels using the MIN or MAX rule
  First rule (a and b are the pixels labelled in the slide figure):
    if black(a) then turn(ccw)
    else if black(b) then forward
    else turn(cw)
  Second rule:
    if white(b) then turn(cw)
    else if white(a) then forward
    else turn(ccw)
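The two rules can be turned into a small edge-following tracer. The slide figure that defines pixels a and b is lost in this transcript, so the sketch assumes that the tracer walks along pixel edges with black on its right-hand side, that a is the pixel ahead on the left and that b is the pixel ahead on the right. Under that assumption the first rule joins diagonally touching black pixels and the second rule separates them. Which of the two the talk calls MIN and which MAX cannot be recovered from the transcript. All names below are hypothetical.

```python
# Headings in image coordinates (y grows downward): 0=right, 1=down, 2=left, 3=up.
# Adding 1 to a heading is a clockwise turn on screen.
STEP = [(1, 0), (0, 1), (-1, 0), (0, -1)]
# Offsets from the current pixel-corner vertex to the two pixels ahead.
AHEAD_LEFT = [(0, -1), (0, 0), (-1, 0), (-1, -1)]   # assumed to be pixel "a"
AHEAD_RIGHT = [(0, 0), (-1, 0), (-1, -1), (0, -1)]  # assumed to be pixel "b"


def trace_contour(bw, px, py, join_diagonals=True):
    """Trace the contour that starts along the top edge of black pixel (px, py).

    The pixel above (px, py) must be white. Returns the heading of every
    unit step taken until the tracer is back at its starting edge.
    """
    height, width = len(bw), len(bw[0])

    def black(x, y):
        return 0 <= x < width and 0 <= y < height and bw[y][x] == 1

    x, y, heading = px, py, 0
    start = (x, y, heading)
    headings = []
    while True:
        headings.append(heading)
        x, y = x + STEP[heading][0], y + STEP[heading][1]
        a = black(x + AHEAD_LEFT[heading][0], y + AHEAD_LEFT[heading][1])
        b = black(x + AHEAD_RIGHT[heading][0], y + AHEAD_RIGHT[heading][1])
        if join_diagonals:  # first rule: test black(a), then black(b)
            turn = -1 if a else (0 if b else 1)
        else:               # second rule: test white(b), then white(a)
            turn = 1 if not b else (0 if not a else -1)
        heading = (heading + turn) % 4
        if (x, y, heading) == start:
            return headings
```

Started on an outer edge the tracer runs clockwise on screen. Started on the top edge of a black pixel that lies directly below a hole, it runs counter-clockwise around the hole.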
27
Some Easily Calculable Data
Problem #1
- Turning CW: $I_n = I_{n-1} + 1$
- Turning CCW: $I_n = I_{n-1} - 1$
- Going forward: $I_n = I_{n-1}$
28
Some Easily Calculable Data
Problem #2
- Turning CW: $I_n = I_{n-1} + 1$
- Turning CCW: $I_n = I_{n-1} - 1$
- Going forward: $I_n = I_{n-1}$
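The transcript does not preserve what Problems #1 and #2 ask. One thing this recurrence does yield is the net number of quarter turns: a closed contour traced clockwise ends at +4 and one traced counter-clockwise ends at -4, which is enough to tell an outer contour from a hole if the tracer always keeps black on the same side. The sketch below assumes the contour is given as a list of step headings (0=right, 1=down, 2=left, 3=up, with y growing downward). That encoding and the function name are assumptions.

```python
def net_turns(headings):
    """Apply I_n = I_{n-1} +1 (CW), -1 (CCW), +0 (forward) around a closed contour."""
    total = 0
    # Pair every step with the next one, wrapping around to close the contour.
    for current, following in zip(headings, headings[1:] + headings[:1]):
        delta = (following - current) % 4
        if delta == 1:
            total += 1      # clockwise quarter turn
        elif delta == 3:
            total -= 1      # counter-clockwise quarter turn
        elif delta == 2:
            raise ValueError("a contour tracer never reverses direction")
    return total
```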
29
Some Easily Calculable Data
Problem #3
- Going up: $I_n = I_{n-1} - X_n$
- Going down: $I_n = I_{n-1} + X_n$
- Going right: $I_n = I_{n-1}$
- Going left: $I_n = I_{n-1}$
30
Some Easily Calculable Data
Problem #4
- Going up: $I_n = I_{n-1} - X_n$
- Going down: $I_n = I_{n-1} + X_n$
- Going right: $I_n = I_{n-1}$
- Going left: $I_n = I_{n-1}$
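The problem statements for #3 and #4 are also missing from the transcript. Summed over a closed contour of unit steps, this recurrence gives the signed enclosed area (a discrete form of Green's theorem), so reading it as an area computation is an interpretation, not something the slides state. The sketch assumes $X_n$ is the x coordinate of the vertical edge being walked, y grows downward, and the contour is a list of step headings (0=right, 1=down, 2=left, 3=up). The function name is hypothetical.

```python
def signed_area(start_x, headings):
    """Accumulate I_n = I_{n-1} + X_n (down) / - X_n (up) along a closed contour.

    The result is positive for a contour traced clockwise on screen and
    negative for one traced counter-clockwise, e.g. around a hole.
    """
    x, area = start_x, 0
    for heading in headings:
        if heading == 0:      # going right: I unchanged, x advances
            x += 1
        elif heading == 2:    # going left: I unchanged, x retreats
            x -= 1
        elif heading == 1:    # going down
            area += x
        elif heading == 3:    # going up
            area -= x
        else:
            raise ValueError("heading must be 0, 1, 2 or 3")
    return area
```

A 2-by-3 rectangle whose top-left corner is at x = 5, traced clockwise, gives 3 * 7 - 3 * 5 = 6.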
32
Classification; Training models
Restricted Coulomb Energy (RCE) Network (Dr. Leon Cooper, Dr. Charles Elbaum)
[Figure labels: A, B]
34
Classification; Training models
Nestor Learning System (NLS)
35
Classification; Training models
Nestor Learning System (NLS)
Default radius $R_{\max}$
42
Classification; Training models
Nestor Learning System (NLS)
Decreased radius
44
Classification; Training models
Nestor Learning System (NLS)
Decreased radius, $R_{\min}$
45
Classification; Training models
Nestor Learning System (NLS)
Pass 2
Decreased radius
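The slide sequence above (default radius, decreased radius, a lower limit, a second pass) matches the commonly described RCE training loop, sketched below. This is not the Nestor implementation. The function names, the use of Euclidean distance and the choice to stop when a pass changes nothing are assumptions.

```python
import math


def train_rce(samples, r_max, r_min, max_passes=10):
    """Train prototypes [center, label, radius] from (vector, label) samples."""
    prototypes = []
    for _ in range(max_passes):
        changed = False
        for vector, label in samples:
            covered = False
            for prototype in prototypes:
                center, proto_label, radius = prototype
                distance = math.dist(vector, center)
                if distance >= radius:
                    continue
                if proto_label == label:
                    covered = True
                elif radius > r_min:
                    # A wrong-class prototype fires: shrink it, but not below r_min.
                    prototype[2] = max(distance, r_min)
                    changed = True
            if not covered:
                # No prototype of the right class fires: commit a new one.
                prototypes.append([vector, label, r_max])
                changed = True
        if not changed:
            break  # a full pass without changes: training has settled
    return prototypes


def classify(prototypes, vector):
    """Return the set of labels whose prototypes contain the vector."""
    return {label for center, label, radius in prototypes
            if math.dist(vector, center) < radius}
```

An empty result set means "unknown", and a set with more than one label means the sample falls in a region that the radii could not disambiguate.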
47
Voting
Text recognition in OmniPage Pro
OCR engines available:
- Caere’s engine (codename: Salt & Pepper)
- Recognita’s engine (codename: Paprika)
- ScanSoft’s engine (codename: Fireworx)
48
Voting
Text recognition in OmniPage Pro
OCR engines available:
- Caere’s engine (Salt & Pepper)
  - Uses a matrix-matching-based algorithm
  - Feature set: 40 cells of an 8x5 grid
  - Good overall description of a shape
  - Weaker at detailed structure
- Recognita’s engine (Paprika)
  - Uses a contour-tracing-based algorithm
  - Feature set: convex and concave arcs on the contour
  - Good detailed description of a shape
  - Weaker at overall structure
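A grid feature set of this kind can be sketched as follows. The slide says only "40 cells of an 8x5 grid", so the orientation (8 rows by 5 columns), the cell value (fraction of black pixels) and the function name are assumptions, and this is not Caere's actual feature extractor.

```python
def grid_features(glyph, rows=8, cols=5):
    """Map a 0/1 glyph bitmap to rows*cols black-pixel densities, row by row."""
    height, width = len(glyph), len(glyph[0])
    black = [[0] * cols for _ in range(rows)]
    total = [[0] * cols for _ in range(rows)]
    for y in range(height):
        for x in range(width):
            # Integer scaling assigns every pixel to exactly one grid cell.
            r, c = y * rows // height, x * cols // width
            total[r][c] += 1
            black[r][c] += glyph[y][x]
    # Cells that receive no pixel (glyph smaller than the grid) count as empty.
    return [black[r][c] / total[r][c] if total[r][c] else 0.0
            for r in range(rows) for c in range(cols)]
```

Because each cell averages over its pixels, the vector captures the overall shape but blurs small details, which fits the strengths and weaknesses listed on the slide.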
50
Voting
Text recognition in OmniPage Pro
OCR engines available:
- Caere’s engine (Salt & Pepper)
- Recognita’s engine (Paprika)
- ScanSoft’s engine (Fireworx)
Segmentation algorithms:
- Developed by independent groups
- Have different strengths and weaknesses
51
Voting
Text recognition in OmniPage Pro
- OCR engines available
- Segmentation algorithms
Conclusion:
- They are complementary
- Let’s create a voting system
52
Voting
Voting strategies
- External "black box" voting: ~20% gain
[Diagram labels: Image; Paprika, Salt & Pepper, Fireworx; Txt 1, Txt 2, Txt 3; Vote; Dict; Final Txt]
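External voting sees only the text each engine emits. A minimal sketch: take the majority word at each position and use the dictionary to break ties. It assumes the engine outputs are already aligned word by word, which real OCR output does not guarantee, and the function names and tie-breaking order are assumptions. It is not the voting logic of OmniPage Pro.

```python
from collections import Counter


def vote_word(candidates, dictionary):
    """Pick one word from the engines' candidates for a single position."""
    counts = Counter(candidates)
    best = max(counts.values())
    # Candidates with the highest vote count, kept in engine order.
    tied = [word for word in candidates if counts[word] == best]
    for word in tied:
        if word.lower() in dictionary:
            return word
    return tied[0]


def vote_text(outputs, dictionary):
    """Vote position by position over whitespace-separated engine outputs."""
    columns = zip(*(text.split() for text in outputs))
    return " ".join(vote_word(words, dictionary) for words in columns)
```

Three engines that each misread a different word can still produce a clean line, which is the reason for combining engines with complementary weaknesses.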
53
Voting
Voting strategies
- External "black box" voting
- Internal "shape" voting
[Diagram labels: Image; Salt & Pepper, Fireworx, Paprika; Txt 1, Txt 2, Txt 3; Bronze; Dict; Final Txt]
54
Voting
Paprika
Original segmentation: every independent connected component is a character
- Good segmentation: recognize
- Bad segmentation: reject
[Diagram labels: Image; Recognize original segmentation; K.B.]
55
Voting
Paprika (diagram built up step by step over slides 55 to 58)
- Image
- Recognize original segmentation (K.B.)
- Train adaptive classifier from original shapes (Adaptive K.B.)
- Recognize broken and joined shapes
  - Try several segmentations
  - Loop if unrecognizable
- Train adaptive classifier from "ugly" shapes
- Recognize more broken and joined shapes
[Other diagram labels: Txt 1, Txt 2, Txt 3, Dict]
59
Voting
Voting strategies: ~60% gain
[Diagram labels: Image; Salt & Pepper, Fireworx, Paprika; Txt 1, Txt 3; Bronze; Dict; Final Txt]
60
OCR Internals
Main tasks of an OCR system:
- Image acquisition
- Layout recognition
- Text recognition
- User-assisted correction
  - By the user’s random editing...
    - Pop-up verifier
    - Manual training
  - By proofreading of doubtful words
- Result exportation
61
OCR Internals
Main tasks of an OCR system:
- Image acquisition
- Layout recognition
- Text recognition
- User-assisted correction
  - By the user’s random editing...
  - By proofreading of doubtful words
    - Correct: user dictionary
    - Changed: IntelliTrain
      - Remember trained characters
      - Apply them on following pages
- Result exportation
62
IntelliTrain
Recognized word: sorneUüng
63
IntelliTrain
Recognized word: sorneUüng
Fixed word: something
65
IntelliTrain
Recognized word: sorneUüng
Fixed word: something
Substitutions found: "m" (recognized as "rn"), "thi" (recognized as "Uü")
66
IntelliTrain
Recognized word: sorneUüng
Fixed word: something
Substitutions found: "m" (recognized as "rn"), "thi" (recognized as "Uü")
Perform automatically:
- Learn the image pattern and the substitution info
- Find similar substituted ("blue") text on the current page
- Match it against the pattern of the substitution and correct it
- Find such errors on following pages, too
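The "substitutions found" step can be reproduced at the text level by aligning the recognized word with the fixed word. The sketch below uses Python's standard `difflib.SequenceMatcher` for the alignment. The function name is hypothetical, and the image-pattern learning that IntelliTrain also performs is outside the scope of this sketch.

```python
from difflib import SequenceMatcher


def find_substitutions(recognized, fixed):
    """Return (recognized_fragment, fixed_fragment) pairs where the words differ."""
    matcher = SequenceMatcher(None, recognized, fixed, autojunk=False)
    return [
        (recognized[i1:i2], fixed[j1:j2])
        for tag, i1, i2, j1, j2 in matcher.get_opcodes()
        if tag != "equal"
    ]


print(find_substitutions("sorneUüng", "something"))
# prints [('rn', 'm'), ('Uü', 'thi')]
```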
67
OCR Internals
Main tasks of an OCR system:
- Image acquisition
- Layout recognition
- Text recognition
- User-assisted correction
- Result exportation
  - Combine pages into a document
    - Header / footer recognition
    - Page numbers
    - Hyperlinks (e.g. "See Table 20")
  - Save results
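Spotting cross-references such as "See Table 20" in the recognized text can be sketched with a regular expression. The slide names only the "See Table 20" example, so the pattern, the inclusion of "Figure" and the function name are assumptions for illustration.

```python
import re

# Matches "See Table 20" or "see Figure 3"; the keyword list is an assumption.
CROSS_REFERENCE = re.compile(r"\b[Ss]ee\s+(Table|Figure)\s+(\d+)")


def find_cross_references(text):
    """Return (kind, number) for every cross-reference found in the text."""
    return [(match.group(1), int(match.group(2)))
            for match in CROSS_REFERENCE.finditer(text)]
```

Each hit could then be linked to the page on which the matching table caption was recognized.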
68
OCR Internals
Main tasks of an OCR system:
- Image acquisition
- Layout recognition
- Text recognition
- User-assisted correction
- Result exportation
  - Combine pages into a document
  - Save results
    - doc file
    - e-mail
    - Speech synthesizer