Lukáš Neumann and Jiří Matas
Centre for Machine Perception, Department of Cybernetics
Czech Technical University, Prague

Presentation transcript:

Neumann, Matas, ICDAR 2015

• Problem Introduction
• Contributions:
  1. Text Fragments: a generalization of character detection
  2. Stroke Support Pixels
  3. Text-line Resegmentation
• Experiments
• Conclusion

• Text
  ◦ Anything that can be represented as a sequence of Unicode characters

Scene Text (Text in the Wild)
• Typically short snippet(s) of text, arbitrary script and orientation, non-standard fonts, out-of-vocabulary words, complex backgrounds
• Image/video taken by a camera
[Figure: example images labeled “Text in the wild” and “Other text”]

• Region-based methods assume that one region (connected component) represents one character
• We generalize this assumption by detecting arbitrary Text Fragments in a single pass
• Text Fragment
  ◦ Part of a Character
  ◦ Character
  ◦ Group of Characters
  ◦ Word

• Text Fragments in the majority of scripts and fonts share the “strokeness” property
• This observation was popularized by the Stroke Width Transform [1], which uses it to detect individual characters

[1] B. Epshtein et al., “Detecting text in natural scenes with stroke width transform,” in CVPR

• Text Fragment candidates are detected as MSERs over multiple scales and color projections
• MSERs are classified as either
  ◦ Character (a character or a character part)
  ◦ Multi-character (a group of characters or words)
  ◦ Background
• Characters and multi-characters are grouped into text lines with an efficient exhaustive search strategy [2]
• Each text line is refined using a local text model
• Character segmentations are recognized using an OCR module trained on synthetic data [3]

[2] L. Neumann, J. Matas, “Text localization in real-world images using efficiently pruned exhaustive search,” in ICDAR 2011
[3] L. Neumann, J. Matas, “On combining multiple segmentations in scene text recognition,” in ICDAR

• The area A of a stroke is approximately equal to the product of the stroke axis length s_l and the stroke width s_w
• The stroke area ratio A_s / A is a very discriminative feature for eliminating non-text regions
• A character can be “drawn” by a circular brush with a possibly changing diameter d_i, equal to the stroke width s_w, sweeping a curve S (the stroke axis)
• The non-constant diameter models characters made of strokes of different widths
[Figure: a stroke annotated with its width s_w, axis length s_l, brush diameters d_i and axis S]
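A worked instance of the area relation may help. The numbers are illustrative assumptions, not values from the talk, and the sum over unit steps along the axis is an assumed discretization of the brush model:

```latex
% Constant-width stroke: area is roughly axis length times stroke width.
A \approx s_l \cdot s_w,
\qquad \text{e.g. } s_l = 40\,\text{px},\; s_w = 5\,\text{px}
\;\Rightarrow\; A \approx 200\,\text{px}^2

% Brush diameter d_i sampled at unit steps along the stroke axis S:
A_s \approx \sum_{i \in S} d_i
```

Under these assumptions a region that really is a stroke has A_s / A close to 1, while a compact blob has a short axis relative to its area, so its ratio is well below 1.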

• The stroke is “in the mind of the writer” (it could easily be found in an online handwriting setup)
• The Stroke Support Pixels (SSPs) are a subset of pixels that lie on the stroke (but unlike a skeleton, the subset does not have to be continuous)
• The subset is found as the local maxima in a region’s distance map
• Stroke area discretization effects are compensated by weighting all SSPs in a 3x3 neighborhood
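The steps above can be sketched in a few lines of plain Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the distance map uses the city-block metric, each SSP with distance d is assumed to support a stroke of width 2d - 1, and the weight 3 / N (N being the number of SSPs in the 3x3 neighborhood, including the pixel itself) is one plausible way to realize the weighting. All function names are hypothetical.

```python
from collections import deque

def distance_map(mask):
    """City-block distance from every region pixel to the nearest background pixel."""
    h, w = len(mask), len(mask[0])
    H, W = h + 2, w + 2  # one-pixel background border around the region
    dist = [[None] * W for _ in range(H)]
    queue = deque()
    for r in range(H):
        for c in range(W):
            inside = 1 <= r <= h and 1 <= c <= w and mask[r - 1][c - 1]
            if not inside:
                dist[r][c] = 0
                queue.append((r, c))
    # Multi-source breadth-first search starting from all background pixels.
    while queue:
        r, c = queue.popleft()
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nr, nc = r + dr, c + dc
            if 0 <= nr < H and 0 <= nc < W and dist[nr][nc] is None:
                dist[nr][nc] = dist[r][c] + 1
                queue.append((nr, nc))
    return [row[1:-1] for row in dist[1:-1]]  # strip the border again

def stroke_support_pixels(dist):
    """Region pixels whose distance is a (non-strict) local maximum over 8 neighbours."""
    h, w = len(dist), len(dist[0])
    ssp = set()
    for r in range(h):
        for c in range(w):
            d = dist[r][c]
            if d == 0:
                continue  # background pixel
            neighbours = [dist[r + dr][c + dc]
                          for dr in (-1, 0, 1) for dc in (-1, 0, 1)
                          if (dr, dc) != (0, 0)
                          and 0 <= r + dr < h and 0 <= c + dc < w]
            if all(d >= n for n in neighbours):
                ssp.add((r, c))
    return ssp

def stroke_area_ratio(mask):
    """Estimate A_s / A for a binary region (assumed width and weighting, see text)."""
    dist = distance_map(mask)
    ssp = stroke_support_pixels(dist)
    area = sum(1 for row in mask for v in row if v)
    stroke_area = 0.0
    for r, c in ssp:
        # Number of SSPs in the 3x3 neighbourhood, counting the pixel itself.
        n = sum(1 for dr in (-1, 0, 1) for dc in (-1, 0, 1)
                if (r + dr, c + dc) in ssp)
        width = 2 * dist[r][c] - 1  # assumed stroke width supported by this pixel
        stroke_area += (3.0 / n) * width
    return stroke_area / area

bar = [[True] * 12 for _ in range(3)]   # thin stroke: 3 px wide, 12 px long
blob = [[True] * 9 for _ in range(9)]   # compact blob: 9 x 9 square
print(stroke_area_ratio(bar), stroke_area_ratio(blob))
```

In this toy setup the thin bar scores close to 1 and the square blob scores much lower, which is the behavior the stroke area ratio is meant to capture.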

• SSPs are less sensitive to discretization effects and scale change than standard skeleton algorithms, and their detection is trivial

Classes: Character/Fragment, Multi-character, Background
* only not rotation invariant; replaced in current work to achieve full rotation invariance

• Key feature in the classification
• Works for a wide variety of scripts and fonts
• Example: MSERs 460
[Figure legend: Character, Multi-character, Non-character MSER]

• Not all characters (or even their fragments or groups) are detected as MSERs
• Characters that are detected can have many different segmentations (over-complete representation)
• The detected Text Fragments are used to initialize an iterative hypothesis-verification process
• For each text line, a local color model is iteratively updated using a standard graph cut framework
• The graph cut is initialized using the stroke support pixels
• Note that, unlike with MSERs, the segmentation is not limited to thresholding a scalar value
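The iterative update of a local color model can be illustrated with a deliberately simplified sketch. It does not implement graph cut: the pairwise smoothness term is dropped and each pixel is simply assigned to the nearer of two mean colors (a seeded two-means loop). The seeds play the role of the stroke support pixels; the function names and the toy image are hypothetical.

```python
def mean_color(pixels):
    """Component-wise mean of a non-empty list of RGB tuples."""
    n = len(pixels)
    return tuple(sum(p[k] for p in pixels) / n for k in range(3))

def sq_dist(a, b):
    """Squared Euclidean distance between two colors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def resegment(image, seeds, max_iters=10):
    """Refine a text/background labeling of one text line.

    image: dict mapping (row, col) to an (R, G, B) tuple
    seeds: set of coordinates assumed to be text (stand-in for SSPs)
    Returns the set of coordinates labeled as text.
    """
    fg = set(seeds)
    for _ in range(max_iters):
        bg = set(image) - fg
        if not fg or not bg:
            break  # a color model cannot be estimated from an empty class
        fg_mean = mean_color([image[p] for p in fg])
        bg_mean = mean_color([image[p] for p in bg])
        # Verification step: relabel every pixel by the closer color model.
        new_fg = {p for p, col in image.items()
                  if sq_dist(col, fg_mean) < sq_dist(col, bg_mean)}
        if new_fg == fg:
            break  # labeling is stable
        fg = new_fg
    return fg

# Toy text line: dark red text on a light background, one seed pixel.
TEXT, PAPER = (200, 30, 30), (230, 230, 220)
line = {(0, c): (TEXT if 2 <= c <= 6 else PAPER) for c in range(10)}
print(sorted(resegment(line, seeds={(0, 4)})))
```

Even from a single seed the loop recovers the whole text run, because the labeling works on full color vectors and is not restricted to a threshold on one scalar channel. A real graph cut would additionally penalize label changes between neighboring pixels.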

[Figure panels: Source Image, MSER detection, Initialization, Iteration #1, Iteration #2, Final iteration (#6); stroke support pixels shown in green]
After every iteration:
• the text box position is re-estimated
• connected components are classified (character, multi, non-char)

[Figure panels: Source Image, Text Fragment detection, Final Segmentation]
Latin (stencil), Hebrew Script

[Figure panels: Source Image, Text Fragment detection, Final Segmentation]
Indian (Kannada), “Latin”, Armenian Script

ICDAR 2013 Dataset, Text Localization
Columns: pipeline, recall, precision, f
Methods compared:
• Proposed method
• Yin et al. [4]
• TexStar (ICDAR’13 winner)
• our previous method [3]
• Kim (ICDAR’11 winner)

[4] X.-C. Yin, X. Yin, K. Huang, and H.-W. Hao, “Robust text detection in natural scene images,” TPAMI
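For readers unfamiliar with the column f, the following sketch assumes it denotes the customary f-measure, the harmonic mean of recall and precision. The input values are hypothetical and are not results from the talk.

```python
def f_measure(recall, precision):
    """Harmonic mean of recall and precision; defined as 0.0 when both are zero."""
    if recall + precision == 0:
        return 0.0
    return 2 * recall * precision / (recall + precision)

# Hypothetical values for illustration only.
print(f_measure(0.5, 1.0))
```

The harmonic mean is dominated by the smaller of the two values, so a method cannot score well on f by trading all of its recall for precision or the reverse.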

[Figure: example images with the text TAXI, CARLING, D8LL, iMac, THE DOLLAR ARMS, PANTENE PROV]

• Arbitrary Text Fragments are detected in a single pass
• An efficiently calculated “strokeness” feature is exploited to discriminate between Text Fragments and background clutter
• Detected text lines are refined by re-segmentation in an iterative hypothesis-verification process that exploits local text line properties
• Results are competitive with the state of the art
• Online demo available at
• Current and future work
  ◦ Rotation-invariant real-time character detector (~5 fps)
  ◦ OCR accuracy improvement

Thank you for your attention!