Zhenwen Dai, Jörg Lücke, Frankfurt Institute for Advanced Studies

Presentation transcript:

Autonomous Cleaning of Corrupted Scanned Documents: A Generative Modeling Approach. Zhenwen Dai, Jörg Lücke. Frankfurt Institute for Advanced Studies, Dept. of Physics, Goethe-University Frankfurt

A document cleaning problem

What method can save us? Optical Character Recognition (OCR)

OCR Software: Character Segmentation, Character Classification. [Figure: input vs. OCR (FineReader 11)]

What method can save us? Optical Character Recognition (OCR) Automatic Image Inpainting

Automatic Image Inpainting

Automatic Image Inpainting: Unable to identify the defects, because corruption and characters consist of the same features. A solution requires knowledge of explicit character representations.

What else? Optical Character Recognition (OCR), Automatic Image Inpainting, Image Denoising? … The problem requires a new solution!

Our Approach: The training data is only the page of the corrupted document, with no label information and (currently) a limited alphabet. [Figure: input vs. our approach]

How does it work without supervision? Characters are salient self-repeating patterns. Corruptions are more irregular. Related to Sparse Coding. [Figure: input vs. our approach]

The Flow of Our Approach: Cut into Image Patches; Learning a Character Model on Image Patches; Character Detection & Recognition. [Figure: characters b, a, y, s, e]

A Probabilistic Generative Model: A character generation process. A character representation (parameters): feature vectors (RGB color) and mask parameters.

A Tour of Generation: 1. Select a character (prior probabilities: 0.2, 0.2, 0.2, 0.2, 0.2). 2. Translate to the position (translation by [12, 10]^T). 3. Generate a background (pixel-wise background distribution). 4. Overlap the character with the background according to the mask. [Figure labels: masks, features, learning]
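The four generation steps can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' implementation: pixels are single numbers rather than RGB feature vectors, each mask parameter is read as the probability that a character pixel covers the background, and the names `generate_patch` and `bg_sampler` are hypothetical.

```python
import random

def generate_patch(masks, features, prior, bg_sampler, patch_size, rng):
    """Sample one patch: pick a character, shift it, composite it over a background.

    masks[k][y][x]    : assumed probability that pixel (y, x) of character k is visible
    features[k][y][x] : assumed scalar feature value of character k at (y, x)
    prior[k]          : prior probability of character k
    bg_sampler(rng)   : hypothetical callable returning one background pixel value
    """
    height, width = patch_size
    # 1. Select a character according to the prior probabilities.
    k = rng.choices(range(len(prior)), weights=prior)[0]
    mask, feat = masks[k], features[k]
    ch, cw = len(mask), len(mask[0])
    # 2. Translate: choose a shift that keeps the character inside the patch.
    dy = rng.randrange(height - ch + 1)
    dx = rng.randrange(width - cw + 1)
    # 3. Generate a pixel-wise background.
    patch = [[bg_sampler(rng) for _ in range(width)] for _ in range(height)]
    # 4. Overlap: a character pixel replaces the background with the
    #    probability given by its mask parameter.
    for y in range(ch):
        for x in range(cw):
            if rng.random() < mask[y][x]:
                patch[dy + y][dx + x] = feat[y][x]
    return k, (dy, dx), patch
```

With five characters and `prior = [0.2] * 5`, each character is selected equally often, which matches the uniform prior shown on the slide.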

Maximum Likelihood: Iterative parameter update rules from EM. A posterior distribution is needed for every image patch in the update rules. [Figure labels: prior prob., std, posterior, parameter set; iterations t0, t1, t2, ..., tn]
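To show why the update rules need a posterior for every patch, here is a minimal M-step for a plain mixture model. It is a deliberately simplified stand-in, not the update rules from the slides: it has no translation, mask, or std parameters, and the function name `m_step` and its argument layout are assumptions.

```python
def m_step(patches, posteriors):
    """One M-step of a plain mixture model (no translation, no mask).

    patches    : N flat lists of D feature values
    posteriors : N lists of K posterior probabilities, one list per patch
    Returns the updated prior probabilities and the K mean feature vectors.
    """
    n, k, d = len(patches), len(posteriors[0]), len(patches[0])
    # Total posterior mass assigned to each component.
    resp = [sum(post[c] for post in posteriors) for c in range(k)]
    # Prior update: average posterior probability of each component.
    prior = [r / n for r in resp]
    # Mean update: posterior-weighted average of the patches.
    means = [
        [
            sum(post[c] * p[i] for p, post in zip(patches, posteriors))
            / max(resp[c], 1e-12)  # guard against an unused component
            for i in range(d)
        ]
        for c in range(k)
    ]
    return prior, means
```

Every sum above runs over all patches, so the cost of computing the posteriors dominates each iteration.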

Posterior Computation Problem: A posterior distribution is needed for every image patch in the update rules. Inference over the hidden space: Which character (A? B? C? D? E?) and where? Similar to template matching. A pre-selection approximation (truncated variational EM) (Lücke & Eggert, JMLR 2010) (Yuille & Kersten, TiCS 2006)

An Intuitive Illustration of Pre-selection: Select some local features according to the parameters. Very few features; a number of good guesses. [Figure: characters A to E, features in image patches, pre-selected candidates B and D] (Lücke & Eggert, JMLR 2010) (Yuille & Kersten, TiCS 2006)
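The pre-selection idea can be sketched as follows: rank all hypotheses with a cheap score, keep only the best few, and normalize the posterior over that reduced set. This is a generic sketch of a truncated posterior, not the authors' code; `cheap_score`, `log_joint`, and `keep` are hypothetical names, and a candidate would typically be a (character, position) pair.

```python
import math

def truncated_posterior(cheap_score, log_joint, candidates, keep):
    """Approximate the posterior using only the `keep` best pre-selected candidates.

    cheap_score(h) : inexpensive score computed from a few local features
    log_joint(h)   : full log joint probability of hypothesis h and the patch
    """
    # Pre-selection: keep the candidates with the highest cheap score.
    selected = sorted(candidates, key=cheap_score, reverse=True)[:keep]
    # Evaluate the expensive model only on the selected candidates.
    logs = [log_joint(h) for h in selected]
    top = max(logs)
    # Subtract the maximum before exponentiating for numerical stability.
    weights = [math.exp(v - top) for v in logs]
    total = sum(weights)
    return {h: w / total for h, w in zip(selected, weights)}
```

All candidates outside the selected set receive zero posterior mass, which is what makes the approximation cheap.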

Learn the Character Representations: Input: image patches (Gabor wavelets). A learning course (about 25 mins). [Figure: six snapshots (1 to 6) of the learned chars, with mask, feature, and std shown as heat maps]

Document Cleaning: How to recognize characters against noise? Character segmentation fails. Our model assumes one char per patch. It is a non-trivial task. Try to exploit the model as much as possible.

Document Cleaning Procedure: Clean characters from the corrupted document. 1. Inference of every patch with the learned model. 2. Paint a clean character at the detected position. 3. Erase the character from the original document. [Figure: original vs. reconstructed; Fully visible = 1: Accept]

Document Cleaning Procedure: Inference of every patch with the learned model. Iterate until no more reconstruction (about 1 min per iteration). [Figure: iterations 1 and 2, original vs. reconstructed; Fully visible = 1: Accept; Fully visible = 0 (more than one character per patch): Reject]
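The iterative procedure can be sketched as a simple loop. The sketch below is an assumption-laden outline rather than the presented system: `infer`, `erase`, and `paint` are hypothetical callbacks standing in for model inference, removal of a character from the document, and drawing the clean character.

```python
def clean_document(document, positions, infer, erase, paint, max_iters=10):
    """Iteratively move accepted characters from `document` to a clean canvas.

    infer(document, pos)             : returns a detection, or None if rejected
    erase(document, pos, detection)  : removes the detected character in place
    paint(canvas, pos, detection)    : draws the clean character on the canvas
    """
    canvas = {}
    for _ in range(max_iters):
        accepted = 0
        for pos in positions:
            detection = infer(document, pos)
            if detection is None:
                continue  # rejected, e.g. more than one character in the patch
            paint(canvas, pos, detection)
            erase(document, pos, detection)
            accepted += 1
        if accepted == 0:
            break  # no more reconstruction: stop iterating
    return canvas
```

Erasing accepted characters is what lets a patch that was rejected in one iteration be accepted in a later one, once its neighbor has been removed.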

Before Cleaning

After Iteration 1

After Iteration 2

After Iteration 3

More Experiments: More characters (9 chars). Unusual character set (Klingon). Irregular placement (randomly placed, rotated). Occluded by spilled ink. [Figure: original vs. reconstructed for 9 chars, Klingon, and Occluded]

Recognition Rates

False Positives

Not only a Character Model: Detect and count cells on microscopic image data (in collaboration with Thilo Figge and Carl Svensson).

Summary: Addressed the corrupted document cleaning problem. Followed a probabilistic generative approach. Autonomous cleaning of a document is possible. Demonstrated efficiency and robustness. The dataset will be available online soon. Future directions: Extend to a large alphabet by incorporating prior knowledge of documents. Extend to various different applications.

Acknowledgement http://fias.uni-frankfurt.de/cnml

Thanks for your attention!

Learned Character Representations Cut the document into small patches. Run the learning algorithm.
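Cutting a document into small patches can be sketched as a sliding window. The slides do not state the patch size or whether patches overlap, so `patch_h`, `patch_w`, and `stride` below are free parameters chosen purely for illustration, and the image is assumed to be a list of pixel rows.

```python
def cut_into_patches(image, patch_h, patch_w, stride):
    """Cut a 2-D image (list of rows) into patches keyed by their top-left corner."""
    rows, cols = len(image), len(image[0])
    patches = {}
    for top in range(0, rows - patch_h + 1, stride):
        for left in range(0, cols - patch_w + 1, stride):
            # Slice patch_h rows, then patch_w columns from each row.
            patches[(top, left)] = [
                row[left:left + patch_w] for row in image[top:top + patch_h]
            ]
    return patches
```

A stride smaller than the patch size gives overlapping patches, so a character that straddles one patch boundary can still appear whole in another patch.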

Performance. Data sets: “bayes”, 9 chars, Klingon, Randomly placed, Occluded. Recognition Rates: OCR 56.5%, 75.4%, 0.8%, 41.6%; Our algorithm 100%, 97.4%. False Positives: 297, 285, 231, 86, 413, 3, 6.

Document Cleaning Procedure: Character vs. Noise? MAP inference can only choose among learned characters. Define a novel quality measure. [Figure labels: characters y, a; MAP mask param., mask posterior, difference; Threshold: 0.5]
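One possible reading of this slide is that a detection is accepted when the mask parameters of the MAP character agree closely with the mask posterior inferred from the patch. The sketch below encodes that reading only. The mean absolute difference, the direction of the comparison, and the name `fully_visible` are all assumptions; only the threshold value 0.5 comes from the slide.

```python
def fully_visible(map_mask, mask_posterior, threshold=0.5):
    """Accept a detection if the learned mask and the inferred mask agree.

    map_mask       : flat list of mask parameters of the MAP character
    mask_posterior : flat list of per-pixel mask posteriors for the patch
    Assumed measure: mean absolute difference, compared against the threshold.
    """
    diffs = [abs(m - p) for m, p in zip(map_mask, mask_posterior)]
    mean_diff = sum(diffs) / len(diffs)
    # A small difference means the character explains the patch well.
    return mean_diff < threshold
```

Under this reading, a patch of pure noise still gets a MAP character, but its mask posterior disagrees with the learned mask and the detection is rejected.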