
1 Autonomous Cleaning of Corrupted Scanned Documents: A Generative Modeling Approach
Zhenwen Dai and Jörg Lücke, Frankfurt Institute for Advanced Studies, Dept. of Physics, Goethe-University Frankfurt

2 A document cleaning problem

3 What method can save us? Optical Character Recognition (OCR)

4 OCR Software?
Input vs. OCR output (FineReader 11). The OCR pipeline relies on character segmentation followed by character classification.

5 What method can save us? Optical Character Recognition (OCR)
Automatic Image Inpainting

6 Automatic Image Inpainting

7 Automatic Image Inpainting
Inpainting is unable to identify the defects, because corruptions and characters consist of the same features; a solution requires knowledge of explicit character representations.

8 What else?
Optical Character Recognition (OCR)? Automatic image inpainting? Image denoising? The problem requires a new solution!

9 Our Approach
The training data is only the page of the corrupted document itself; there is no label information and (currently) only a limited alphabet. (Figure: input vs. our approach.)

10 How does it work without supervision?
Characters are salient, self-repeating patterns, whereas corruptions are more irregular. The idea is related to sparse coding. (Figure: input vs. our approach.)

11 The Flow of Our Approach
Cut the document into image patches, learn a character model (here the characters b, a, y, s, e) on those patches, then perform character detection and recognition.
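A minimal sketch of the first step, cutting the page into patches, assuming NumPy, a grayscale page as a 2D array, and an illustrative patch size and stride (not values from the talk):

```python
import numpy as np

def cut_into_patches(page, patch_size=(40, 40), stride=10):
    """Cut a scanned page (2D array) into overlapping square patches.

    Returns the stacked patches together with the top-left coordinate of
    each patch, so later detections can be mapped back onto the document.
    """
    H, W = page.shape
    ph, pw = patch_size
    patches, positions = [], []
    for y in range(0, H - ph + 1, stride):
        for x in range(0, W - pw + 1, stride):
            patches.append(page[y:y + ph, x:x + pw])
            positions.append((y, x))
    return np.stack(patches), positions
```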

12 A Probabilistic Generative Model
The model describes a character generation process. A character representation consists of two sets of parameters: feature vectors (RGB color) and mask parameters.

13 A Tour of Generation
Select a character according to its prior probability (e.g. 0.2 for each of five characters). Translate it to its position (e.g. by [12,10]^T). Generate a background from a pixel-wise background distribution. Overlap the character with the background according to its mask. (Figure: character masks, features, and the pixel-wise background distribution.)
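The four generation steps can be written as a small sketch. It assumes one learned template per character (a feature image plus a per-pixel mask in [0, 1]) and an independent Gaussian per background pixel; the names and the placement of the template inside the patch are illustrative, not taken from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

def generate_patch(features, masks, prior, bg_mean, bg_std, patch_shape, shift):
    """Generate one image patch following the four steps on the slide."""
    # 1. Select a character according to the prior probabilities (e.g. 0.2 each).
    c = rng.choice(len(prior), p=prior)

    # 2. Translate the character template to its position, e.g. by [12, 10]^T
    #    (the shift is assumed to keep the template inside the patch).
    feat_canvas = np.zeros(patch_shape)
    mask_canvas = np.zeros(patch_shape)
    h, w = features[c].shape
    dy, dx = shift
    feat_canvas[dy:dy + h, dx:dx + w] = features[c]
    mask_canvas[dy:dy + h, dx:dx + w] = masks[c]

    # 3. Generate a background from a pixel-wise (per-pixel Gaussian) distribution.
    background = rng.normal(bg_mean, bg_std)

    # 4. Overlap the character with the background according to the mask.
    return mask_canvas * feat_canvas + (1.0 - mask_canvas) * background
```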

14 Maximum Likelihood
Iterative parameter update rules from EM involve the prior probabilities, the per-patch posteriors, the full parameter set, and the feature std. A posterior distribution is needed for every image patch in the update rules.
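The slide lists only the ingredients of the update rules, so the following is a structural sketch of one EM iteration for a simplified mixture over characters; it omits the translation and mask/background structure of the actual model, and all names are made up.

```python
import numpy as np

def em_iteration(patches, templates, priors, loglik_fn):
    """One EM iteration for a simplified mixture over character templates.

    E-step: posterior responsibility of each character for each patch.
    M-step: posterior-weighted re-estimation of priors and templates.
    """
    K = len(templates)
    # E-step: log r[n, k] = log p(patch_n | template_k) + log prior_k (up to a constant)
    log_r = np.stack(
        [loglik_fn(patches, templates[k]) + np.log(priors[k]) for k in range(K)],
        axis=1,
    )
    log_r -= log_r.max(axis=1, keepdims=True)   # numerical stability
    r = np.exp(log_r)
    r /= r.sum(axis=1, keepdims=True)           # normalize per patch

    # M-step: priors and templates as posterior-weighted averages.
    new_priors = r.mean(axis=0)
    new_templates = [
        (r[:, k, None, None] * patches).sum(axis=0) / r[:, k].sum()
        for k in range(K)
    ]
    return new_templates, new_priors
```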

15 Posterior Computation Problem
A posterior distribution is needed for every image patch in the update rules. The hidden space covers which character it is (A, B, C, D, E, ...) and where it is located, so exact inference resembles exhaustive template matching. We therefore use a pre-selection approximation within truncated variational EM (Lücke & Eggert, JMLR 2010; Yuille & Kersten, TiCS 2006).

16 An Intuitive Illustration of Pre-selection
Select some local features according to the parameters: very few features in an image patch already narrow the candidates (A, B, C, D, E) down to a small number of good guesses (e.g. B and D). (Lücke & Eggert, JMLR 2010; Yuille & Kersten, TiCS 2006)
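A sketch of the pre-selection idea under truncated variational EM, assuming a cheap score from a few local features is used to keep only a handful of candidate characters before the expensive posterior is evaluated; the function names are illustrative.

```python
import numpy as np

def preselect_and_infer(patch, templates, cheap_score_fn, exact_posterior_fn,
                        n_candidates=3):
    """Truncated inference for one patch.

    A cheap, feature-based score ranks all characters; only the best few
    guesses are kept, and the exact posterior is computed on that
    truncated hidden space instead of on all characters and positions.
    """
    scores = np.array([cheap_score_fn(patch, t) for t in templates])
    candidates = np.argsort(scores)[::-1][:n_candidates]  # a few good guesses
    return {int(c): exact_posterior_fn(patch, templates[c]) for c in candidates}
```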

17 Learn the Character Representations
Input: image patches (preprocessed with Gabor wavelets). A learning course of about 25 minutes, shown as six snapshots (1-6) of the characters, masks, features, and the std as a heat map.
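Since the slide states that the learning input consists of image patches preprocessed with Gabor wavelets, here is a minimal preprocessing sketch using scikit-image's Gabor filter; the frequency and number of orientations are illustrative.

```python
import numpy as np
from skimage.filters import gabor

def gabor_features(patch, frequency=0.2, n_orientations=4):
    """Stack Gabor filter magnitude responses at several orientations."""
    responses = []
    for k in range(n_orientations):
        theta = k * np.pi / n_orientations
        real, imag = gabor(patch, frequency=frequency, theta=theta)
        responses.append(np.sqrt(real ** 2 + imag ** 2))
    return np.stack(responses, axis=-1)
```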


19 Document Cleaning
How to recognize characters against noise? It is a non-trivial task: character segmentation fails, and our model assumes one character per patch. We therefore try to exploit the learned model as much as possible.

20 Document Cleaning Procedure
Run inference on every patch with the learned model. If a detection is accepted (fully visible = 1), paint a clean character at the detected position and erase the character from the original document. (Figure: original vs. reconstructed patches; clean characters extracted from the corrupted document.)

21 Document Cleaning Procedure
Run inference on every patch with the learned model and iterate until no more reconstructions appear (about 1 minute per iteration). Reconstructions with fully visible = 1 are accepted; those with fully visible = 0 (e.g. more than one character per patch) are rejected and revisited in the next iteration.
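A sketch of the resulting cleaning loop. The accept/reject logic (fully visible = 1 vs. 0) and the "iterate until no more reconstructions" stopping rule follow the slides; the helper functions for patch cutting, inference, painting, and erasing are hypothetical placeholders.

```python
import numpy as np

def clean_document(document, model, cut_fn, infer_fn, paint_fn, erase_fn):
    """Iteratively move detected characters from the corrupted page to a clean one."""
    document = document.copy()
    clean_page = np.full_like(document, document.max())   # start from a blank page
    while True:
        accepted = 0
        patches, positions = cut_fn(document)
        for patch, pos in zip(patches, positions):
            char, shift, fully_visible = infer_fn(patch, model)
            # Reject noise, partially visible characters, or patches that
            # contain more than one character (fully visible = 0).
            if char is None or not fully_visible:
                continue
            paint_fn(clean_page, char, pos, shift)   # paint a clean character
            erase_fn(document, char, pos, shift)     # erase it from the original
            accepted += 1
        if accepted == 0:   # no more reconstructions: stop iterating
            return clean_page
```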

22 Before Cleaning

23 After Iteration 1

24 After Iteration 2

25 After Iteration 3

26 More Experiments
More characters (9 chars); an unusual character set (Klingon); irregular placement (randomly placed, rotated characters); occlusion by spilled ink. (Figure: original vs. reconstructed for the 9-chars, Klingon, and occluded settings.)

27 Recognition Rates

28 False Positives

29 Not only a Character Model
The same model can detect and count cells in microscopy image data, in collaboration with Thilo Figge and Carl Svensson.

30 Summary
We addressed the corrupted-document cleaning problem and followed a probabilistic generative approach: autonomous cleaning of a document is possible, with demonstrated efficiency and robustness. The dataset will be available online soon. Future directions: extend to larger alphabets by incorporating prior knowledge about documents, and extend to various other applications.

31 Acknowledgement

32 Thanks for your attention!

33 Learned Character Representations
Cut the document into small patches. Run the learning algorithm.

34 Performance
Conditions: "bayes", 9 chars, Klingon, randomly placed, occluded.
Recognition rates: OCR 56.5%, 75.4%, 0.8%, 41.6%; our algorithm 100%, 97.4%.
False positives: 297, 285, 231, 86, 413, 3, 6.

35 Document Cleaning Procedure
Character vs. noise? MAP inference can only choose among the learned characters, so we define a novel quality measure based on the difference between the MAP character's mask parameters and the mask posterior (threshold: 0.5).
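The slide only names the ingredients of the measure (the MAP character's mask parameters, the mask posterior, their difference, and the 0.5 threshold), so this is just one hedged reading of it with illustrative names:

```python
import numpy as np

def accept_as_character(map_mask_params, mask_posterior, threshold=0.5):
    """Character vs. noise: accept the MAP character only if the mask
    expected under the posterior agrees well enough with the mask
    parameters of the MAP character (mean absolute difference below
    the threshold)."""
    difference = np.abs(map_mask_params - mask_posterior).mean()
    return difference < threshold
```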


