SEGMENTATION FOR HANDWRITTEN DOCUMENTS Omar Alaql, Feb. 20, 2014.

Outline Optical Character Recognition (OCR). OCR for Historical Documents. Text Line Segmentation Approaches: – Projection Profile. – Hough Transform. – Level Set Method. – Affinity Propagation. – Steerable Directional Technique.

Optical Character Recognition (OCR) The electronic translation of images into machine-editable text.

Optical Character Recognition (OCR) Any optical character recognition system goes through four major stages: 1) Preprocessing. 2) Segmentation. 3) Feature extraction (representation). 4) Recognition.

Optical Character Recognition (OCR) Preprocessing: – Noise reduction. – Binarization or conversion to a grayscale image. – Reducing the amount of data to be analyzed.

Optical Character Recognition (OCR) Segmentation: – The isolation of the various writing units, such as paragraphs, sentences, words, or letters.

Optical Character Recognition (OCR) Representation (feature extraction): – Extracts the most relevant information from the text image, which helps the recognition stage to recognize the text. – This information consists of the features of each symbol that are needed to distinguish it from other symbols.

Optical Character Recognition (OCR) Recognition: – The recognition stage is the last and the main decision-making stage. – It is a classification process that identifies each unknown symbol and assigns it to a predefined class. – This classification is based on the extracted features, which are the output of the previous stage.

OCR for Historical Documents Processing historical documents is a challenging task for several reasons: 1) Lack of standard alphabets and presence of unknown fonts. 2) Low quality.

OCR for Historical Documents 3) The lack of constraints on page layout.

OCR for Historical Documents 4) The complexity of handwriting. 5) The variability of skew between different text lines and within the same text line.

OCR for Historical Documents 6) Spaces between lines are narrow and variable. 7) The existence of small components. 8) The difficulty of distinguishing noise from text.

Text Line Segmentation Approaches There are many techniques for text line segmentation: – Projection Profile. – Hough Transform. – Level Set Method. – Affinity Propagation. – Steerable Directional Technique.

Projection Profile The horizontal projection profile is obtained by summing the pixel values along the horizontal axis for each y value.

Projection Profile Example: – Input image.

Projection Profile Example: – Skew correction.

Projection Profile Example: – Horizontal projection.

Projection Profile Example: – Peak detection.

Projection Profile Example: – Positions for segmentation.

Projection Profile Example: – Image for each text line.
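The example steps above can be sketched in a few lines of Python. This is a minimal illustration, not the presenter's implementation: it assumes the page is already binarized (1 = ink, 0 = background) and deskewed, it stores the image as nested lists, and it cuts wherever the profile drops to a threshold rather than running the peak detection shown on the slides. The names `horizontal_projection`, `segment_lines`, and the toy `page` are hypothetical.

```python
def horizontal_projection(image):
    """Return the number of ink pixels (value 1) in each row of a binary image."""
    return [sum(row) for row in image]


def segment_lines(profile, threshold=0):
    """Return (start, end) row ranges whose profile value exceeds `threshold`.

    `end` is exclusive. Rows at or below the threshold are treated as gaps
    between text lines (an assumption made for this sketch).
    """
    lines = []
    start = None
    for y, value in enumerate(profile):
        if value > threshold and start is None:
            start = y                      # entering a text line
        elif value <= threshold and start is not None:
            lines.append((start, y))       # leaving a text line
            start = None
    if start is not None:                  # text line touches the bottom edge
        lines.append((start, len(profile)))
    return lines


# Toy binary page with two text lines (hypothetical data).
page = [
    [0, 0, 0, 0, 0, 0],
    [0, 1, 1, 0, 1, 0],
    [1, 1, 1, 1, 1, 0],
    [0, 0, 0, 0, 0, 0],
    [0, 1, 0, 1, 1, 1],
    [0, 1, 1, 1, 0, 0],
    [0, 0, 0, 0, 0, 0],
]

profile = horizontal_projection(page)            # one value per row
line_ranges = segment_lines(profile)             # positions for segmentation
line_images = [page[a:b] for a, b in line_ranges]  # one sub-image per text line
```

Raising `threshold` above zero would let the sketch tolerate a few stray ink pixels in the gap rows, at the cost of possibly splitting faint lines.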

Projection Profile For skewed or fluctuating text lines, the image may be divided into vertical strips: – Subdivide the page into columns. – Determine the minimal values of the histograms resulting from the horizontal projections of each column. – Draw a horizontal stroke at each minimal value inside a column. – Linking these strokes across columns separates two adjacent lines.
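A rough sketch of the strip-based variant follows. It assumes a binary image stored as nested lists (1 = ink), and it simplifies "minimal values" to runs of completely empty rows that have ink above and below them; a real profile would need a proper valley search. The function names and the toy page are hypothetical.

```python
def strip_profiles(image, n_strips):
    """Split the image into vertical strips and return one horizontal
    projection profile (ink count per row) for each strip."""
    width = len(image[0])
    bounds = [i * width // n_strips for i in range(n_strips + 1)]
    return [
        [sum(row[bounds[i]:bounds[i + 1]]) for row in image]
        for i in range(n_strips)
    ]


def valley_rows(profile):
    """Return the middle row of every run of empty rows that has ink
    both above and below it (a simplified stand-in for profile minima)."""
    valleys = []
    run_start = None
    seen_ink = False
    for y, value in enumerate(profile):
        if value == 0:
            if seen_ink and run_start is None:
                run_start = y              # a gap begins below some ink
        else:
            if run_start is not None:
                valleys.append((run_start + y - 1) // 2)
                run_start = None
            seen_ink = True
    return valleys


# Toy skewed page: the right half sits two rows lower than the left half,
# so the full-width profile has no empty row at all (hypothetical data).
page = [
    [1, 1, 1, 1, 0, 0, 0, 0],
    [1, 1, 1, 1, 0, 0, 0, 0],
    [0, 0, 0, 0, 1, 1, 1, 1],
    [0, 0, 0, 0, 1, 1, 1, 1],
    [1, 1, 1, 1, 0, 0, 0, 0],
    [1, 1, 1, 1, 0, 0, 0, 0],
    [0, 0, 0, 0, 1, 1, 1, 1],
    [0, 0, 0, 0, 1, 1, 1, 1],
]

full_profile = [sum(row) for row in page]
strokes = [valley_rows(p) for p in strip_profiles(page, n_strips=2)]
```

Here the full-width profile is flat, while each strip yields its own separator row; joining the per-strip rows from left to right gives a piecewise separator between the two text lines.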

Hough Transform The Hough transform is used for locating straight lines in images. A text line is the line that best matches the alignment of the black pixels. An infinite number of lines could pass through any given black pixel.

Hough Transform There are two ways to represent a line: – y = mx + c – x cos θ + y sin θ = ρ – Each line has a unique pair of parameters, (m, c) or (ρ, θ), and each pair corresponds to one cell of the accumulator. – A cell receives a vote each time its line passes through a black pixel. – The text line is the line whose accumulator cell has the most votes.
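The voting scheme can be illustrated with a small, unoptimized sketch that uses the (ρ, θ) form. It assumes a binary image as nested lists (1 = black pixel), quantizes θ into `n_theta` bins over [0, π), and rounds ρ to the nearest integer; these choices, and the names `hough_lines` and `strongest_line`, are assumptions made for illustration.

```python
import math


def hough_lines(image, n_theta=180):
    """Let every black pixel vote for all (rho, theta) lines through it."""
    height, width = len(image), len(image[0])
    max_rho = int(math.ceil(math.hypot(height, width)))
    thetas = [math.pi * t / n_theta for t in range(n_theta)]
    # accumulator[rho + max_rho][theta index]; the offset allows negative rho.
    accumulator = [[0] * n_theta for _ in range(2 * max_rho + 1)]
    for y, row in enumerate(image):
        for x, pixel in enumerate(row):
            if not pixel:
                continue
            for t, theta in enumerate(thetas):
                rho = int(round(x * math.cos(theta) + y * math.sin(theta)))
                accumulator[rho + max_rho][t] += 1
    return accumulator, thetas, max_rho


def strongest_line(accumulator, thetas, max_rho):
    """Return (rho, theta, votes) for the accumulator cell with the most votes."""
    votes, r, t = max(
        (votes, r, t)
        for r, row in enumerate(accumulator)
        for t, votes in enumerate(row)
    )
    return r - max_rho, thetas[t], votes


# Toy image: a single horizontal stroke on row 2 (hypothetical data).
image = [[0] * 7 for _ in range(5)]
image[2] = [1] * 7

rho, theta, votes = strongest_line(*hough_lines(image))
```

For this toy stroke all seven pixels agree on ρ = 2, with θ within a few degrees of 90°; several neighbouring angle bins tie for such a short segment, so the reported angle is only as precise as the stroke is long.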

Level-set Method Instead of segmenting the binary image directly, the image is converted to a probability map, in which each element represents the probability that the pixel belongs to a text line.

Level-set Method The probability map is analyzed using the level set method to segment text lines by determining the boundaries of neighboring text lines. The zero level set, which represents the boundary, automatically grows, merges, and stops at the final text line boundary.

Connected Components Clustering A grouping algorithm gathers the connected components into clusters, where each cluster represents a separate text line.

Affinity Propagation The algorithm first estimates the local orientation at each primary component of a word to build a sparse similarity graph. – At each point, the region is divided into five regions. – The Breadth-First Search algorithm is applied to find the disjoint sets in the similarity graph. – Within each set, there exists a path from each element to every other element.
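The breadth-first search step can be sketched as follows. The sketch assumes the similarity graph has already been built and is given as a list of undirected edges between component indices; how the edges are derived from the local orientation is not covered here, and the name `disjoint_sets` and the sample edges are hypothetical.

```python
from collections import deque


def disjoint_sets(n_nodes, edges):
    """Group nodes 0..n_nodes-1 into sets in which every node is reachable
    from every other node, using breadth-first search."""
    neighbours = {node: [] for node in range(n_nodes)}
    for a, b in edges:
        neighbours[a].append(b)
        neighbours[b].append(a)

    seen = set()
    groups = []
    for start in range(n_nodes):
        if start in seen:
            continue
        seen.add(start)
        queue = deque([start])
        group = []
        while queue:
            node = queue.popleft()
            group.append(node)
            for nxt in neighbours[node]:
                if nxt not in seen:
                    seen.add(nxt)
                    queue.append(nxt)
        groups.append(sorted(group))
    return groups


# Six components; edges link components judged to lie on the same text line.
groups = disjoint_sets(6, [(0, 1), (1, 2), (3, 4)])
```

Each returned group would correspond to one candidate text line; an isolated component, such as node 5 here, ends up in a set of its own.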

Steerable Directional Local Profile Technique The steerable directional technique is one of the connected-component-based approaches. An adaptive local connectivity map (ALCM) is generated using a steerable directional filter.

Steerable Directional Local Profile Technique First, a steerable filter is used to determine the foreground intensity along multiple directions at each pixel while generating the ALCM.

Steerable Directional Local Profile Technique The ALCM is then binarized using an adaptive thresholding algorithm to get a rough estimate of the location of the text lines.
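The idea of measuring foreground density along several directions and then thresholding the result can be illustrated with a heavily simplified sketch. It is not the published ALCM filter: it sums ink along short straight segments at a handful of fixed angles, keeps the largest sum per pixel, and binarizes with the global mean of the map as a stand-in for the adaptive thresholding mentioned above. All names, the angle set, and the segment length are assumptions.

```python
import math


def directional_density(image, half_length=2, angles_deg=(0, -10, 10, -20, 20)):
    """For each pixel, sum the ink along short segments at several angles and
    keep the largest sum together with the angle that produced it."""
    height, width = len(image), len(image[0])
    density = [[0] * width for _ in range(height)]
    angle_map = [[0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            best_sum, best_angle = -1, 0
            for angle in angles_deg:
                dx = math.cos(math.radians(angle))
                dy = math.sin(math.radians(angle))
                total = 0
                for step in range(-half_length, half_length + 1):
                    sx = int(round(x + step * dx))
                    sy = int(round(y + step * dy))
                    if 0 <= sx < width and 0 <= sy < height:
                        total += image[sy][sx]
                if total > best_sum:       # ties keep the earlier angle
                    best_sum, best_angle = total, angle
            density[y][x] = best_sum
            angle_map[y][x] = best_angle
    return density, angle_map


def binarize(density):
    """Threshold the map at its global mean (a simple stand-in, not adaptive)."""
    flat = [value for row in density for value in row]
    threshold = sum(flat) / len(flat)
    return [[1 if value > threshold else 0 for value in row] for row in density]


# Toy image: one horizontal stroke on row 2 (hypothetical data).
image = [[0] * 9 for _ in range(5)]
image[2] = [1] * 9

density, angle_map = directional_density(image)
rough_lines = binarize(density)
```

The per-pixel `angle_map` is kept alongside the density because a local orientation estimate is often useful in its own right, for example when neighbouring text lines are too close for the density map to separate.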

Steerable Directional Technique This approach has difficulties and limitations when binarizing the ALCM images, especially when the text lines in the document are very close to each other.

Steerable Directional Technique To solve the problem: 1) A steerable dynamic directional filter is applied. The angle value is taken instead of the density value.

Steerable Directional Technique 2) A mode filter is applied to extract each paragraph in the document and its orientation.
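A mode filter replaces each value with the most frequent value in its neighbourhood, which smooths a per-pixel angle map into regions of uniform orientation. The sketch below assumes a square window and an angle map stored as nested lists of quantized angles; the function name, the window radius, and the sample map are hypothetical.

```python
from collections import Counter


def mode_filter(label_map, radius=1):
    """Replace each entry with the most common value in the surrounding
    (2 * radius + 1) square window, clipped at the image border."""
    height, width = len(label_map), len(label_map[0])
    out = [[0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            window = [
                label_map[j][i]
                for j in range(max(0, y - radius), min(height, y + radius + 1))
                for i in range(max(0, x - radius), min(width, x + radius + 1))
            ]
            out[y][x] = Counter(window).most_common(1)[0][0]
    return out


# A region oriented at 10 degrees with one outlier estimate in the middle.
angles = [
    [10, 10, 10],
    [10, 40, 10],
    [10, 10, 10],
]
smoothed = mode_filter(angles)
```

In this toy map the isolated 40-degree estimate is outvoted by its neighbours, leaving a single region with one orientation.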

Steerable Directional Technique 3) A steerable static directional filter is applied. – The direction of the kernel is taken from the paragraph map.

Steerable Directional Technique 4) Thresholding.

Horizontal Projection Technique To use the projection technique: – First: paragraph segmentation.

Horizontal Projection Technique To use the projection technique: – Second: skew correction.

Horizontal Projection Technique To use the projection technique: – Third: horizontal projection.

Horizontal Projection Technique To use the projection technique: – Fourth: profile analysis. Some drawbacks make finding the maxima and minima in the profile more complicated: – A short line produces a low peak that might be ignored. – Very narrow lines, or lines that include many overlapping components, do not produce significant peaks.

Horizontal Projection Technique To use the projection technique: – Fourth: profile analysis. To solve this problem, the profile should be smoothed.
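The effect of smoothing on peak detection can be shown with a small sketch. It assumes a centered moving average as the smoothing filter (the slides do not say which filter is used) and a very simple local-maximum test; the function names and the sample profile are hypothetical.

```python
def smooth(profile, window=3):
    """Centered moving average; the window shrinks at both ends."""
    half = window // 2
    smoothed = []
    for i in range(len(profile)):
        chunk = profile[max(0, i - half): i + half + 1]
        smoothed.append(sum(chunk) / len(chunk))
    return smoothed


def local_maxima(profile):
    """Indices that rise above their left neighbour and do not fall
    below their right neighbour (end points are ignored)."""
    return [
        i
        for i in range(1, len(profile) - 1)
        if profile[i - 1] < profile[i] >= profile[i + 1]
    ]


# A jagged profile covering two text lines (hypothetical data).
raw_profile = [0, 5, 2, 6, 0, 0, 4, 1, 5, 0]

raw_peaks = local_maxima(raw_profile)
smooth_peaks = local_maxima(smooth(raw_profile))
```

On this sample the raw profile yields four peaks for two text lines, while the smoothed profile yields two. A wider window smooths more aggressively but can also merge the peaks of closely spaced lines, so the window size would need tuning per document.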