Document Analysis: Segmentation & Layout Analysis. Prof. Rolf Ingold, University of Fribourg. Master course, spring semester 2008.

Presentation transcript:


© Prof. Rolf Ingold

Outline
- Objectives of layout analysis
- Classification of layout analysis methods
- Top-down methods
- Run length smearing algorithm
- Bottom-up methods
- Connected component extraction
- A model-driven approach
- Conclusions

Objectives of layout analysis and segmentation
- The role of segmentation is to split a document image into regions of interest
- Regions of interest may be of different granularity levels: graphics or text blocks, text lines, words, characters
- The goal of layout analysis is to obtain a hierarchical description of the segmented objects

Segmentation strategies
- Segmentation produces a hierarchy of physical objects
- Two strategies can be used
  - top-down segmentation: starting with the entire image, split it recursively down to elementary shapes
  - bottom-up segmentation: starting at pixel level, detect connected components and group them hierarchically
- Hybrid methods combine both strategies
- Segmentation methods can be
  - data-driven, i.e., using only data properties (without contextual knowledge)
  - model-driven, i.e., using contextual knowledge

Top-down methods
- Top-down methods decompose the entire page into a hierarchy of rectangular regions
- Top-down approaches include
  - recursive XY-cuts
  - horizontal and vertical projection profile analysis
  - white streams (spaces) analysis
  - run length smoothing algorithm (RLSA)

Recursive XY-Cut
- The page is cut alternately horizontally and vertically along white spaces
- Robust for most modern printed documents
- Assumes that page images are not skewed
- Does not work for all kinds of layouts
  - Non-rectangular formatting
  - Complex mosaics (illustration next)
- The resulting hierarchy may not reflect the natural structure (illustration below)
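The alternating cuts described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the course's implementation: the image is a list of rows with 1 for black and 0 for white, the page is not skewed, cuts are found from row and column projection profiles, and the names `split_runs`, `xy_cut` and the parameter `min_gap` (minimum width of a white gap that triggers a cut) are hypothetical.

```python
def split_runs(profile, min_gap):
    """Return (start, end) ranges of inked stretches of a projection profile,
    separated by white gaps at least min_gap entries wide."""
    runs, start, end, gap = [], None, 0, 0
    for i, value in enumerate(profile):
        if value:
            if start is None:
                start = i
            end, gap = i + 1, 0
        elif start is not None:
            gap += 1
            if gap >= min_gap:
                runs.append((start, end))
                start = None
    if start is not None:
        runs.append((start, end))
    return runs


def xy_cut(img, box=None, min_gap=2, cut_rows=True, other_failed=False):
    """Recursively split box = (x0, y0, x1, y1) and return the leaf boxes.
    Cuts alternate between rows and columns; recursion stops when neither
    direction finds a sufficiently wide white gap."""
    if box is None:
        box = (0, 0, len(img[0]), len(img))
    x0, y0, x1, y1 = box
    if cut_rows:   # projection onto the vertical axis: ink count per row
        profile = [sum(img[y][x0:x1]) for y in range(y0, y1)]
    else:          # projection onto the horizontal axis: ink count per column
        profile = [sum(img[y][x] for y in range(y0, y1)) for x in range(x0, x1)]
    runs = split_runs(profile, min_gap)
    if not runs:
        return []  # completely white region
    boxes = [(x0, y0 + a, x1, y0 + b) if cut_rows else (x0 + a, y0, x0 + b, y1)
             for a, b in runs]
    if len(runs) == 1:
        if other_failed:
            return boxes  # no cut possible in either direction: leaf region
        return xy_cut(img, boxes[0], min_gap, not cut_rows, True)
    leaves = []
    for child in boxes:
        leaves += xy_cut(img, child, min_gap, not cut_rows, False)
    return leaves


page = [
    [1, 1, 0, 0, 0, 1, 1],
    [1, 1, 0, 0, 0, 1, 1],
    [0, 0, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0, 0],
    [1, 1, 1, 1, 1, 1, 1],
]
print(xy_cut(page))
```

On this toy page the first cut separates the top band from the bottom line, and a second cut splits the top band into two columns. The value of `min_gap` plays the role of the thresholds the slides describe as critical.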

Top-Down Segmentation
- Recursive splitting can be performed by horizontal and vertical profile analysis
  - images need to be "unskewed"!

Top-Down Segmentation (2)
- The order in which X-Y cuts are performed is critical

White streams analysis
- Principle: detect maximal rectangular white blocks
  - split regions recursively according to thresholds

Run Length Smearing Algorithm (RLSA)
- The Run Length Smearing Algorithm (RLSA) is a morphological operator
  - it replaces white runs whose length is less than or equal to a given threshold by black runs
  - it can be applied horizontally as well as vertically
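A minimal sketch of the smearing step on a single scan line, assuming pixels are coded 1 for black and 0 for white. The function name `smear_row` is hypothetical, and the choice to smear only white runs enclosed between two black pixels (leaving runs that touch the image border untouched) is an assumption; the slides do not say how border runs are handled.

```python
def smear_row(row, threshold):
    """Turn every white run of length <= threshold that lies between two
    black pixels into black. Returns a new list; the input is not modified."""
    out = list(row)
    last_black = None
    for i, value in enumerate(row):
        if value == 1:
            if last_black is not None and 0 < i - last_black - 1 <= threshold:
                for j in range(last_black + 1, i):
                    out[j] = 1
            last_black = i
    return out


line = [1, 0, 0, 1, 0, 0, 0, 0, 1, 0]
print(smear_row(line, 2))  # the 2-pixel gap is filled, the 4-pixel gap is kept
```

Applying the same function to each column instead of each row gives the vertical variant.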

RLSA based segmentation
- RLSA can be used to segment a page into blocks in three steps
  - RLSA is applied horizontally
  - RLSA is applied vertically
  - the two results are combined with a logical AND
- Threshold values are critical and have to be chosen
  - according to the document class
  - using statistical white space analysis
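The three steps can be sketched as follows, again assuming 1 = black and 0 = white and smearing only white runs enclosed by black pixels. The names `smear`, `rlsa_blocks`, `th_h` and `th_v` are hypothetical, and the thresholds in the usage example are arbitrary; in practice they would come from the document class or from white space statistics, as stated above.

```python
def smear(line, threshold):
    """1-D RLSA: fill white runs of length <= threshold enclosed by black."""
    out, last_black = list(line), None
    for i, value in enumerate(line):
        if value:
            if last_black is not None and 0 < i - last_black - 1 <= threshold:
                out[last_black + 1:i] = [1] * (i - last_black - 1)
            last_black = i
    return out


def rlsa_blocks(img, th_h, th_v):
    """Horizontal RLSA, vertical RLSA, then a pixel-wise logical AND."""
    horizontal = [smear(row, th_h) for row in img]
    smeared_columns = [smear(column, th_v) for column in zip(*img)]
    vertical = [list(row) for row in zip(*smeared_columns)]  # transpose back
    return [[h & v for h, v in zip(h_row, v_row)]
            for h_row, v_row in zip(horizontal, vertical)]


ring = [[1, 1, 1],
        [1, 0, 1],
        [1, 1, 1]]
corners = [[1, 0, 1],
           [0, 0, 0],
           [1, 0, 1]]
print(rlsa_blocks(ring, 1, 1))     # the hole is filled in both directions
print(rlsa_blocks(corners, 1, 1))  # no pixel is filled in both directions
```

A white pixel becomes black only if it is smeared in both directions, which is why the second example is left unchanged.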

Bottom-up methods
- Bottom-up methods start at the pixel level and group pixels together into a hierarchy of
  - multi-rectangular regions (shapes delimited by horizontal and vertical segments)
  - arbitrary shapes
- Bottom-up methods use
  - connected component extraction
  - region grouping

Connected components
- In a binary image, a connected component is a set of black pixels connected by 4- or 8-adjacency
- (Illustration: the same pattern gives five 4-connected components but two 8-connected components)
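To make the difference between the two adjacencies concrete, here is a small flood-fill sketch that counts components under either definition. The function name `count_components` and the toy image are illustrative assumptions (1 = black, 0 = white); the image was chosen so that it reproduces the "five versus two" situation mentioned above.

```python
from collections import deque

N4 = [(1, 0), (-1, 0), (0, 1), (0, -1)]
N8 = N4 + [(1, 1), (1, -1), (-1, 1), (-1, -1)]


def count_components(img, k=8):
    """Count connected sets of black pixels using k-adjacency (k = 4 or 8)."""
    offsets = N8 if k == 8 else N4
    height, width = len(img), len(img[0])
    seen, count = set(), 0
    for y in range(height):
        for x in range(width):
            if img[y][x] and (x, y) not in seen:
                count += 1                      # new component found
                seen.add((x, y))
                queue = deque([(x, y)])
                while queue:                    # breadth-first flood fill
                    cx, cy = queue.popleft()
                    for dx, dy in offsets:
                        nx, ny = cx + dx, cy + dy
                        if (0 <= nx < width and 0 <= ny < height
                                and img[ny][nx] and (nx, ny) not in seen):
                            seen.add((nx, ny))
                            queue.append((nx, ny))
    return count


diagonals = [
    [1, 0, 0, 0, 1, 0],
    [0, 1, 0, 0, 0, 1],
    [0, 0, 1, 0, 0, 0],
]
print(count_components(diagonals, 4), count_components(diagonals, 8))
```

Pixels that touch only diagonally belong to the same component under 8-adjacency but to different components under 4-adjacency.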

Extraction of connected components
- Connected components can be extracted by different algorithms
  - by a one-pass scan of the full image, from top to bottom and from left to right
  - by a border following algorithm, starting from a border pixel that is assumed to be known

Scanning based CC Extraction

for each scan line l_y
    for each black run r
        if on line l_(y-1) there is no run k-adjacent to r
            create a new component containing r
        else if on line l_(y-1) there exists one run r' k-adjacent to r
            add r to the component containing r'
        else if on line l_(y-1) there exist several runs r_i k-adjacent to r
            merge all components containing such an r_i
            add r to that merged component
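The run-based scan above might be implemented along the following lines. This is a sketch under assumptions: runs are half-open column ranges `(start, end)`, 1 = black, component merging uses a small union-find structure (the slides do not say how merges are stored), and the names `label_runs` and `runs_of` are hypothetical.

```python
def runs_of(row):
    """Return the black runs of a scan line as half-open (start, end) ranges."""
    runs, start = [], None
    for x, value in enumerate(list(row) + [0]):   # sentinel closes a last run
        if value and start is None:
            start = x
        elif not value and start is not None:
            runs.append((start, x))
            start = None
    return runs


def label_runs(img, k=8):
    """One top-to-bottom pass; returns (y, start, end, component) per run."""
    parent = []                                   # union-find forest

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]         # path halving
            i = parent[i]
        return i

    slack = 1 if k == 8 else 0                    # 8-adjacency allows diagonal contact
    previous, labelled = [], []
    for y, row in enumerate(img):
        current = []
        for start, end in runs_of(row):
            touching = {find(label) for p_start, p_end, label in previous
                        if p_start < end + slack and start < p_end + slack}
            if not touching:                      # no adjacent run: new component
                label = len(parent)
                parent.append(label)
            else:                                 # one or several: merge them all
                label = min(touching)
                for other in touching:
                    parent[other] = label
            current.append((start, end, label))
            labelled.append((y, start, end, label))
        previous = current
    return [(y, s, e, find(label)) for y, s, e, label in labelled]


shape_u = [
    [1, 0, 1],
    [1, 0, 1],
    [1, 1, 1],
]
print(len({component for *_, component in label_runs(shape_u, 4)}))
```

In the U-shaped example the two vertical strokes start as separate components and are merged when the bottom run touches both, which is the third case of the pseudocode.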

Border following algorithm

consider P_0 ∈ S having a 4-neighbor Q_0 ∉ S
P ← P_0; Q ← Q_0; d ← direction of Q according to P;
repeat
    let R_i be the neighbor of P in direction (d+i) mod 8
    if R_2 ∉ S then Q ← R_2; d ← (d+2) mod 8;
    else if R_1 ∉ S then P ← R_2; Q ← R_1;
    else P ← R_1; d ← (d-2) mod 8;
    add P to the contour
until P = P_0 and Q = Q_0

(Figure: P, Q, direction d, and the neighbors R_1 and R_2)
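A possible Python rendering of the border following loop is sketched below. It rests on several assumptions: S is the set of black pixels given as (x, y) tuples, the membership symbols lost in the slide are read as "R2 not in S" and "R1 not in S" (the only reading under which Q always stays a background 4-neighbor of P), directions are numbered in 45-degree steps with even numbers for 4-neighbors, and the name `trace_border` is hypothetical. Consecutive repetitions of the same pixel are dropped from the returned contour.

```python
# Direction index -> (dx, dy), image coordinates with y growing downward.
# Even indices are 4-neighbours; direction d+1 lies between d and d+2.
DIRS = [(1, 0), (1, -1), (0, -1), (-1, -1), (-1, 0), (-1, 1), (0, 1), (1, 1)]


def trace_border(S, p0, q0):
    """Follow the border of pixel set S, starting at p0 (in S) whose
    4-neighbour q0 is outside S. Returns the visited contour pixels."""
    d = DIRS.index((q0[0] - p0[0], q0[1] - p0[1]))
    assert p0 in S and q0 not in S and d % 2 == 0
    p, q = p0, q0
    contour = []
    while True:
        r1 = (p[0] + DIRS[(d + 1) % 8][0], p[1] + DIRS[(d + 1) % 8][1])
        r2 = (p[0] + DIRS[(d + 2) % 8][0], p[1] + DIRS[(d + 2) % 8][1])
        if r2 not in S:          # turn around P: R2 becomes the background pixel
            q, d = r2, (d + 2) % 8
        elif r1 not in S:        # step to the 4-neighbour R2, background is R1
            p, q = r2, r1
        else:                    # step diagonally to R1, Q stays where it is
            p, d = r1, (d - 2) % 8
        if not contour or contour[-1] != p:
            contour.append(p)
        if p == p0 and q == q0:
            return contour


square = {(0, 0), (1, 0), (0, 1), (1, 1)}
print(trace_border(square, (0, 0), (0, -1)))
```

For the 2x2 square the trace visits all four pixels once and returns to the start pixel, at which point both P and Q are back in their initial configuration.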

Illustration of connected components

Connected components from RLSA
- Connected components can be used to detect characters
- Words can be located using RLSA

Grouping components
- Grouping connected components is non-trivial
- Grouping rules are based on
  - relative positioning
  - distances and thresholds
  - component classification
- Parameters can be estimated statistically

Allen's relations in 2D space
- The relative positioning of two rectangles generates 169 configurations!
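The figure of 169 follows from Allen's 13 interval relations applied independently to the horizontal and vertical extents of the two rectangles (13 x 13 = 169). The sketch below checks this by enumeration; the function name `allen` and the relation labels are the usual textbook names, chosen here for illustration.

```python
from itertools import product


def allen(a, b):
    """Name of the Allen relation between intervals a = (a1, a2), b = (b1, b2)."""
    (a1, a2), (b1, b2) = a, b
    if a2 < b1:
        return "before"
    if a2 == b1:
        return "meets"
    if b2 < a1:
        return "after"
    if b2 == a1:
        return "met-by"
    if a1 == b1 and a2 == b2:
        return "equals"
    if a1 == b1:
        return "starts" if a2 < b2 else "started-by"
    if a2 == b2:
        return "finishes" if a1 > b1 else "finished-by"
    if b1 < a1 and a2 < b2:
        return "during"
    if a1 < b1 and b2 < a2:
        return "contains"
    return "overlaps" if a1 < b1 else "overlapped-by"


# All proper intervals with endpoints in {0, 1, 2, 3}.
intervals = [(s, e) for s in range(4) for e in range(s + 1, 4)]
relations_1d = {allen(a, b) for a, b in product(intervals, repeat=2)}

# A rectangle is a pair (x-interval, y-interval); its relation to another
# rectangle is the pair of the relations on each axis.
relations_2d = set(product(relations_1d, repeat=2))
print(len(relations_1d), len(relations_2d))
```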

Threshold estimation
- Thresholds can be estimated from the statistical distributions of
  - horizontal spacing, for grouping characters into words and words into text lines
  - vertical spacing, for grouping text lines into text blocks
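One simple way to derive such a threshold, offered only as an illustrative assumption since the slides do not name an estimator, is to look for the widest jump between consecutive observed gap widths: intra-word gaps and inter-word gaps then fall on either side of it. The names `gap_threshold` and `group_by_gap` and the sample coordinates are hypothetical.

```python
def gap_threshold(gaps):
    """Midpoint of the widest jump between consecutive distinct gap widths."""
    values = sorted(set(gaps))
    if len(values) < 2:
        raise ValueError("need at least two distinct gap widths")
    _, low, high = max((b - a, a, b) for a, b in zip(values, values[1:]))
    return (low + high) / 2


def group_by_gap(intervals, threshold):
    """Group left-to-right sorted (start, end) extents; a gap wider than
    the threshold starts a new group."""
    groups = [[intervals[0]]]
    for previous, current in zip(intervals, intervals[1:]):
        if current[0] - previous[1] > threshold:
            groups.append([current])
        else:
            groups[-1].append(current)
    return groups


# Horizontal extents of five characters on one text line (made-up values).
chars = [(0, 4), (6, 10), (13, 17), (27, 31), (33, 37)]
gaps = [b[0] - a[1] for a, b in zip(chars, chars[1:])]
threshold = gap_threshold(gaps)
print(threshold, group_by_gap(chars, threshold))
```

The same idea applies to vertical gaps between text lines when grouping them into blocks; a real system would estimate the threshold from many lines rather than one.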

Distributions of component sizes
- Components can be classified according to their size into
  - symbols
  - letters
  - hairlines
  - punctuation

Region grouping

Docstrum
- The docstrum method [O'Gorman] uses a graph that connects each connected component to its k closest neighbors
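The neighbor graph can be sketched as follows, assuming each connected component is represented by its centroid. The function name `k_nearest` is hypothetical and the brute-force search is chosen for brevity; recording the distance and angle of each edge reflects how docstrum-style analyses typically use the graph, but the details here are an illustration rather than O'Gorman's exact procedure.

```python
import math


def k_nearest(centroids, k=5):
    """Return edges (i, j, distance, angle_in_degrees) linking every
    component i to its k closest neighbours j (brute force, O(n^2 log n))."""
    edges = []
    for i, (xi, yi) in enumerate(centroids):
        others = sorted((math.hypot(xj - xi, yj - yi), j)
                        for j, (xj, yj) in enumerate(centroids) if j != i)
        for distance, j in others[:k]:
            xj, yj = centroids[j]
            angle = math.degrees(math.atan2(yj - yi, xj - xi))
            edges.append((i, j, distance, angle))
    return edges


# Three character centroids on one text line (made-up coordinates).
centroids = [(0, 0), (10, 0), (25, 0)]
for edge in k_nearest(centroids, k=1):
    print(edge)
```

Histograms of the edge angles and distances can then suggest the text line orientation and the typical character and line spacing.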

Model-driven layout analysis [Azokly95]

Component hierarchy

Generic macrostructures
- In a model-driven approach, generic macrostructures are used
  - a formal language describes margins and separators

Formal description of macrostructures

VOLUME Article IS
  WIDTH = 160; HEIGHT = 240;
  PAGE Garde IS ... END;
  PAGE Paire IS
    HSEP hs1 = (4, 3, LEFT, RIGHT, BLANK);
    LAYER Principal IS
      VSEP vs1 = (40, 65, TOP, hs1, BLANK);
      VSEP vs2 = ([50,60], 4, hs1, BOTTOM, BLANK);
      REGION Centre = (vs2, RIGHT, hs1, BOTTOM, ANY, NORMAL);
      REGION Marge = (LEFT, vs2, hs1, BOTTOM, TEXT, SMALL);
      ...
    END;
    LAYER Secondaire IS
      HSEP hs2 = ([10,220], 2, LEFT, RIGHT, BLANK) SUBST hs1;
      HSEP hs3 = ([20,240], 2, LEFT, RIGHT, BLANK) SUBST BOTTOM;
      REGION Figure = (LEFT, RIGHT, hs2, hs3, {TABLE, GRAPHICS});
    END;
  PAGE Impaire IS ... END;
END;

Evaluation of segmentation results
- Segmentation is rarely perfect; it generates
  - undersegmentation: real components are merged
  - oversegmentation: a single component is split
- Special metrics have been developed to evaluate a segmentation result
- Scientific contests were organized at ICDAR'03 and ICDAR'05
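As a rough illustration of how the two error types could be counted, the sketch below compares detected rectangles with ground-truth rectangles. It is not one of the metrics used in the ICDAR contests: the matching rule (two regions match if their intersection covers at least `min_overlap` of the smaller one), the function names and the sample boxes are all assumptions made for this example.

```python
def area(r):
    """Area of rectangle r = (x0, y0, x1, y1); zero if it is degenerate."""
    return max(0, r[2] - r[0]) * max(0, r[3] - r[1])


def intersection(a, b):
    """Area shared by rectangles a and b."""
    return area((max(a[0], b[0]), max(a[1], b[1]),
                 min(a[2], b[2]), min(a[3], b[3])))


def count_errors(truth, detected, min_overlap=0.5):
    """Return (oversegmented, undersegmented): the number of ground-truth
    regions matched by several detections, and the number of detections
    matching several ground-truth regions."""
    def match(g, d):
        return intersection(g, d) >= min_overlap * min(area(g), area(d)) > 0

    over = sum(1 for g in truth if sum(match(g, d) for d in detected) > 1)
    under = sum(1 for d in detected if sum(match(g, d) for g in truth) > 1)
    return over, under


truth = [(0, 0, 10, 10), (20, 0, 30, 10), (40, 0, 50, 10)]
detected = [(0, 0, 5, 10), (5, 0, 10, 10), (20, 0, 50, 10)]
print(count_errors(truth, detected))
```

In the example the first ground-truth block is split into two detections, and the last detection merges two ground-truth blocks.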

Conclusion
- Segmentation is a crucial step in document analysis
- Segmentation is almost solved for
  - printed documents with regular layout
  - form analysis
- Results are rarely perfect
- Contextual knowledge may improve the results
- Advanced pattern recognition methods are required
- Segmentation remains an open problem for unconstrained handwriting and graphical documents