Document Analysis: Segmentation & Layout Analysis
Prof. Rolf Ingold, University of Fribourg
Master course, spring semester 2008

© Prof. Rolf Ingold

Outline
- Objectives of layout analysis
- Classification of layout analysis methods
- Top-down methods
- Run length smearing algorithm
- Bottom-up methods
- Connected component extraction
- A model-driven approach
- Conclusions

Objectives of layout analysis and segmentation
- The role of segmentation is to split a document image into regions of interest
- Regions of interest may be of different granularity levels: graphics or text blocks, text lines, words, characters
- The goal of layout analysis is to obtain a hierarchical description of the segmented objects

Segmentation strategies
- Segmentation produces a hierarchy of physical objects
- Two strategies can be used:
  - top-down segmentation: starting with the entire image, split it recursively down to elementary shapes
  - bottom-up segmentation: starting at pixel level, detect connected components and group them hierarchically
- Hybrid methods combine both strategies
- Segmentation methods can be:
  - data-driven, i.e., using only data properties (without contextual knowledge)
  - model-driven, i.e., using contextual knowledge

Top-down methods
- Top-down methods decompose the entire page into a hierarchy of rectangular regions
- Top-down approaches use:
  - recursive XY-cuts
  - horizontal and vertical projection profile analysis
  - white streams (spaces) analysis
  - the run length smearing algorithm (RLSA)

Recursive XY-Cut
- The page is cut alternately horizontally and vertically along white spaces
- Robust for most modern printed documents
- Assumes that page images are not skewed
- Does not work for all kinds of layouts:
  - non-rectangular formatting
  - complex mosaics (illustration next)
- The resulting hierarchy may not reflect the natural structure (illustration below)
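The alternating cuts described above can be sketched as follows. This is a minimal illustration, not the course's implementation: it assumes a binary image given as a list of rows with 1 for black pixels, and the names `widest_gap`, `xy_cut` and `min_gap` are hypothetical. At each step the region is trimmed to its content, cut at the widest white gap in the preferred direction, and the two parts are processed with the other direction preferred.

```python
def widest_gap(profile, min_gap):
    """Return (start, end) of the widest zero run of length >= min_gap, or None."""
    best, start = None, None
    for i, value in enumerate(profile):
        if value == 0:
            if start is None:
                start = i
        else:
            if start is not None and i - start >= min_gap:
                if best is None or i - start > best[1] - best[0]:
                    best = (start, i)
            start = None
    return best


def xy_cut(img, box=None, min_gap=2, horizontal=True):
    """Return leaf regions as (top, left, bottom, right), bottom/right exclusive."""
    if box is None:
        box = (0, 0, len(img), len(img[0]))
    top, left, bottom, right = box
    # Projection profiles: black pixel counts per row and per column.
    rows = [sum(img[y][left:right]) for y in range(top, bottom)]
    cols = [sum(img[y][x] for y in range(top, bottom)) for x in range(left, right)]
    ys = [i for i, v in enumerate(rows) if v]
    xs = [i for i, v in enumerate(cols) if v]
    if not ys:
        return []  # region is entirely white
    # Trim white margins so that only interior gaps remain.
    top, bottom = top + ys[0], top + ys[-1] + 1
    left, right = left + xs[0], left + xs[-1] + 1
    rows = rows[ys[0]:ys[-1] + 1]
    cols = cols[xs[0]:xs[-1] + 1]
    # Try the preferred direction first, then the other one.
    for horiz in (horizontal, not horizontal):
        gap = widest_gap(rows if horiz else cols, min_gap)
        if gap is None:
            continue
        a, b = gap
        if horiz:
            parts = [(top, left, top + a, right), (top + b, left, bottom, right)]
        else:
            parts = [(top, left, bottom, left + a), (top, left + b, bottom, right)]
        return [leaf for p in parts for leaf in xy_cut(img, p, min_gap, not horiz)]
    return [(top, left, bottom, right)]  # no cut possible: elementary region
```

On a page with two blocks side by side above a full-width block, the sketch first cuts horizontally and then splits the upper part vertically. It returns only the leaves; keeping the intermediate boxes would give the hierarchy mentioned on the slide.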

Run Length Smearing Algorithm (RLSA)
- The Run Length Smearing Algorithm (RLSA) is a morphological operator
  - it replaces white runs whose length is less than or equal to a given threshold by black runs
  - it can be applied horizontally as well as vertically

RLSA-based segmentation
- RLSA can be used to segment a page into blocks in three steps:
  - RLSA is applied horizontally
  - RLSA is applied vertically
  - the two results are combined with a logical AND operator
- Threshold values are critical and have to be chosen
  - according to the document class
  - using statistical white space analysis
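The three steps can be sketched as below, again on a binary image stored as a list of rows with 1 for black. The function names are hypothetical, and one detail is an assumption of this sketch rather than something stated on the slide: only white runs enclosed by black pixels on both sides are smeared, so white runs touching the image border are left untouched.

```python
def rlsa_1d(line, threshold):
    """Fill white runs of length <= threshold that lie between two black pixels."""
    out = list(line)
    last_black = None
    for i, value in enumerate(line):
        if value:
            if last_black is not None and 0 < i - last_black - 1 <= threshold:
                for k in range(last_black + 1, i):
                    out[k] = 1
            last_black = i
    return out


def rlsa_horizontal(img, threshold):
    return [rlsa_1d(row, threshold) for row in img]


def rlsa_vertical(img, threshold):
    # Smear each column, then transpose back to row-major order.
    cols = [rlsa_1d(col, threshold) for col in zip(*img)]
    return [list(row) for row in zip(*cols)]


def rlsa_blocks(img, h_threshold, v_threshold):
    """Combine horizontal and vertical smearing with a pixel-wise logical AND."""
    h = rlsa_horizontal(img, h_threshold)
    v = rlsa_vertical(img, v_threshold)
    return [[a & b for a, b in zip(rh, rv)] for rh, rv in zip(h, v)]
```

For instance, `rlsa_1d([1, 0, 0, 1, 0, 0, 0, 1, 0], 2)` fills the first gap (length 2) but not the second (length 3). The black regions of `rlsa_blocks` are the candidate blocks, which can then be labelled by connected component extraction.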

Bottom-up methods
- Bottom-up methods start at the pixel level and group pixels into a hierarchy of
  - multi-rectangular regions (shapes delimited by horizontal and vertical segments)
  - arbitrary shapes
- Bottom-up methods use
  - connected component extraction
  - region grouping

Extraction of connected components
- Connected components can be extracted by different algorithms:
  - by a one-pass scan of the full image, from top to bottom and from left to right
  - by a border following algorithm, starting from a border pixel that is assumed to be known

Scanning-based CC extraction

for each scan line l_y
    for each black run r
        if on line l_(y-1) there is no run k-adjacent to r
            create a new component containing r
        else if on line l_(y-1) there exists one run r' k-adjacent to r
            add r to the component containing r'
        else if on line l_(y-1) there exist several runs r_i k-adjacent to r
            merge all components containing such an r_i
            add r to that component
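A runnable sketch of this scan is given below. It assumes a binary image as a list of rows (1 for black) and uses a union-find structure to perform the merge step, which is a choice of this sketch and not something prescribed by the slide. The names `runs_of` and `connected_components` are hypothetical; `k` is 4 or 8 as in the k-adjacency of the pseudocode.

```python
def runs_of(row):
    """Return the black runs of a row as (start, end) pairs, end exclusive."""
    runs, start = [], None
    for x, value in enumerate(row):
        if value and start is None:
            start = x
        elif not value and start is not None:
            runs.append((start, x))
            start = None
    if start is not None:
        runs.append((start, len(row)))
    return runs


def connected_components(img, k=8):
    """Return components as lists of (y, start, end) runs."""
    parent = []  # union-find forest over provisional labels

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    slack = 1 if k == 8 else 0  # 8-adjacency also accepts diagonal contact
    labelled = []               # every run with its provisional label
    prev = []                   # runs of the previous scan line
    for y, row in enumerate(img):
        cur = []
        for s, e in runs_of(row):
            touching = [find(lab) for ps, pe, lab in prev
                        if ps < e + slack and s < pe + slack]
            if not touching:
                label = len(parent)       # create a new component
                parent.append(label)
            else:
                label = min(touching)     # merge all touched components
                for t in touching:
                    parent[t] = label
            cur.append((s, e, label))
            labelled.append((y, s, e, label))
        prev = cur
    components = {}
    for y, s, e, label in labelled:
        components.setdefault(find(label), []).append((y, s, e))
    return list(components.values())
```

A U-shaped stroke shows why the merge case is needed: its two arms start as separate components and are joined only when the scan reaches the run that connects them.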

Border following algorithm

consider P_0 ∈ S having a 4-neighbor Q_0 ∉ S
P ← P_0; Q ← Q_0; d ← direction of Q as seen from P
repeat
    let R_i be the neighbor of P in direction (d+i) mod 8
    if R_2 ∉ S then
        Q ← R_2; d ← (d+2) mod 8
    else if R_1 ∉ S then
        P ← R_2; Q ← R_1
    else
        P ← R_1; d ← (d−2) mod 8
    add P to the contour
until P = P_0 and Q = Q_0
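The border follower can be sketched in Python under a few stated assumptions: S is the set of black pixels (value 1), Q and the tested neighbours R_1 and R_2 are checked for being outside S, pixels outside the image count as background, and the eight directions are numbered counter-clockwise starting from east. Unlike the pseudocode, the sketch records P only when it actually moves, to avoid consecutive duplicates. The name `follow_border` is hypothetical.

```python
# Directions 0..7 as (dx, dy) in image coordinates (y grows downwards):
# E, NE, N, NW, W, SW, S, SE.
DIRS = [(1, 0), (1, -1), (0, -1), (-1, -1), (-1, 0), (-1, 1), (0, 1), (1, 1)]


def follow_border(img, p0, q0):
    """Follow the border of the component containing p0; points are (x, y)."""
    height, width = len(img), len(img[0])

    def in_s(p):
        x, y = p
        return 0 <= x < width and 0 <= y < height and img[y][x] == 1

    def step(p, d):
        dx, dy = DIRS[d % 8]
        return (p[0] + dx, p[1] + dy)

    if not in_s(p0) or in_s(q0):
        raise ValueError("p0 must be black and q0 must be background")
    d = DIRS.index((q0[0] - p0[0], q0[1] - p0[1]))
    if d % 2:
        raise ValueError("q0 must be a 4-neighbor of p0")

    p, q = p0, q0
    contour = []
    while True:
        r1, r2 = step(p, d + 1), step(p, d + 2)
        if not in_s(r2):
            q, d = r2, (d + 2) % 8   # rotate Q around P
        elif not in_s(r1):
            p, q = r2, r1            # advance along a straight edge
        else:
            p, d = r1, (d - 2) % 8   # turn around a concave corner
        if not contour or contour[-1] != p:
            contour.append(p)
        if p == p0 and q == q0:
            break
    return contour
```

In each branch Q remains a background 4-neighbor of P in direction d, which is the invariant that makes the stopping test `P = P_0 and Q = Q_0` meaningful.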

Grouping components
- Grouping connected components is non-trivial
- Grouping rules are based on
  - relative positioning
  - distances and thresholds
  - component classification
- Parameters can be estimated statistically

Threshold estimation
- Thresholds can be estimated from the statistical distributions of
  - horizontal spacing, for grouping characters into words and words into text lines
  - vertical spacing, for grouping text lines into text blocks
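One possible way to derive such a threshold from a gap distribution is sketched below. The slide does not say which estimator is used; this example swaps in an Otsu-style criterion, choosing the split of the sorted gap values that maximizes the between-class variance. It assumes the gaps form two populations (for example inter-character and inter-word spaces), and the name `estimate_gap_threshold` is hypothetical.

```python
def estimate_gap_threshold(gaps):
    """Return a threshold separating small gaps from large ones, or None."""
    values = sorted(gaps)
    n = len(values)
    total = sum(values)
    best_score, best_threshold = -1.0, None
    left_sum = 0.0
    for i in range(1, n):
        left_sum += values[i - 1]
        if values[i] == values[i - 1]:
            continue  # a threshold must fall between two distinct values
        w0, w1 = i, n - i
        m0, m1 = left_sum / w0, (total - left_sum) / w1
        score = w0 * w1 * (m0 - m1) ** 2  # proportional to between-class variance
        if score > best_score:
            best_score = score
            best_threshold = (values[i - 1] + values[i]) / 2
    return best_threshold
```

With horizontal gaps measured in pixels along one text line, gaps above the returned value would be treated as word boundaries. The function returns `None` when all gaps are equal, since no split exists in that case.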

Formal description of macrostructures

VOLUME Article IS
    WIDTH = 160; HEIGHT = 240;
    PAGE Garde IS ... END;
    PAGE Paire IS
        HSEP hs1 = (4, 3, LEFT, RIGHT, BLANK);
        LAYER Principal IS
            VSEP vs1 = (40, 65, TOP, hs1, BLANK);
            VSEP vs2 = ([50,60], 4, hs1, BOTTOM, BLANK);
            REGION Centre = (vs2, RIGHT, hs1, BOTTOM, ANY, NORMAL);
            REGION Marge = (LEFT, vs2, hs1, BOTTOM, TEXT, SMALL);
            ...
        END;
        LAYER Secondaire IS
            HSEP hs2 = ([10,220], 2, LEFT, RIGHT, BLANK) SUBST hs1;
            HSEP hs3 = ([20,240], 2, LEFT, RIGHT, BLANK) SUBST BOTTOM;
            REGION Figure = (LEFT, RIGHT, hs2, hs3, {TABLE, GRAPHICS});
        END;
    PAGE Impaire IS ... END;
END;

Evaluation of segmentation results
- Segmentation is rarely perfect; it generates
  - undersegmentation: real components are merged
  - oversegmentation: a single component is split
- Special metrics have been developed to evaluate a segmentation result
- Scientific contests were organized at ICDAR'03 and ICDAR'05
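The two error types can be counted with a very simple overlap test, sketched below. This is only an illustration and not one of the metrics used in the ICDAR contests: regions are assumed to be axis-aligned boxes `(top, left, bottom, right)` with exclusive bottom and right, and the names `count_split_merge` and `min_cover` are hypothetical.

```python
def area(box):
    return (box[2] - box[0]) * (box[3] - box[1])


def overlap_area(a, b):
    """Area of the intersection of two boxes (0 if they are disjoint)."""
    top, left = max(a[0], b[0]), max(a[1], b[1])
    bottom, right = min(a[2], b[2]), min(a[3], b[3])
    return max(0, bottom - top) * max(0, right - left)


def count_split_merge(truth, detected, min_cover=0.1):
    """Return (oversegmented truth regions, undersegmenting detected regions)."""
    def hits(box, others):
        # Two boxes match when they share at least min_cover of the smaller one.
        return sum(1 for o in others
                   if overlap_area(box, o) >= min_cover * min(area(box), area(o))
                   and overlap_area(box, o) > 0)

    over = sum(1 for t in truth if hits(t, detected) > 1)     # split regions
    under = sum(1 for d in detected if hits(d, truth) > 1)    # merged regions
    return over, under
```

A ground-truth region matched by several detected regions counts as oversegmented, and a detected region matching several ground-truth regions counts as an undersegmentation.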

Conclusion
- Segmentation is a crucial step in document analysis
- Segmentation is almost solved for
  - printed documents with regular layout
  - form analysis
- Results are rarely perfect
  - contextual knowledge may improve the results
  - advanced pattern recognition methods are required
- Segmentation remains an open problem for uncontrolled handwriting and graphical documents
