Document Analysis: Document Image Processing. Prof. Rolf Ingold, University of Fribourg. Master course, spring semester 2008.


1 Document Analysis: Document Image Processing. Prof. Rolf Ingold, University of Fribourg. Master course, spring semester 2008

2 Outline
- Image acquisition
- Image enhancement
- Foreground / background separation
  - Binarization
  - Color clustering
- Skew detection and correction
  - Skew estimation
  - Deskewing
- Text normalization
© Prof. Rolf Ingold

3 Image acquisition
- Document images are acquired by
  - drum scanners
  - flatbed scanners
  - high resolution digital cameras
  - specialized book scanners
- or extracted from
  - 3D scene images
  - video sequences

4 Image quality
- Various types of document images
  - binary images (fax)
  - gray level images (256 levels)
  - RGB images (24 bits, or more)
- at different resolutions
  - 200 dpi (low fax quality)
  - dpi (standard resolution for office automation)
  - Mpixels for A4 format
  - 600 dpi or higher for special applications
- Images may be
  - degraded
  - distorted, non planar
  - noisy, with artifacts (JPEG)

5 Document image examples
[Figure: 200 dpi images and 400 dpi images]

6 Overview of document image processing
- Image preprocessing is an initial step of document analysis
  - it aims at preparing the image for further processing
- The most important initial steps are
  - Image enhancement
  - Binarization, i.e., foreground / background separation
  - Skew correction
- More specialized techniques are used locally
  - Text size normalization
  - Slant correction
  - ...

7 Image enhancement
- Classical image filtering algorithms are applied
  - To reduce or remove color information
  - To enhance the contrast between foreground and background
  - To correct irregular illumination
  - To strengthen contours
  - To smooth contours
  - To remove salt and pepper noise
  - To thin or thicken strokes
  - …
- Image enhancement is often combined with segmentation or shape analysis

8 Foreground / background separation
- Document image analysis requires the separation of foreground (ink) from background (paper)
- Foreground / background separation is trivial for simple document classes
  - Binarization is determined by an appropriate threshold
- Problems arise in the following situations
  - Non-uniform background (mixed colors and "reverse video")
  - Textured backgrounds
  - Halftoning artifacts
  - Non-uniformly illuminated documents
  - Degraded documents (bad inking, old paper, holes, …)
  - Paper transparency, ink bleeding through
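
The slide does not say how an "appropriate threshold" is chosen. As a hedged illustration, the sketch below uses Otsu's criterion (an assumption, not named on this slide), which picks the gray level that maximizes the between-class variance of the histogram. The function name and the flat list of pixel values are illustrative choices.

```python
def otsu_threshold(pixels, levels=256):
    """Return the gray level t that best splits pixels into {<= t} and {> t}."""
    hist = [0] * levels
    for p in pixels:
        hist[p] += 1
    total = len(pixels)
    sum_all = sum(i * h for i, h in enumerate(hist))
    best_t, best_score = 0, -1.0
    w0 = 0    # number of pixels with level <= t
    sum0 = 0  # sum of their gray levels
    for t in range(levels):
        w0 += hist[t]
        sum0 += t * hist[t]
        w1 = total - w0
        if w0 == 0 or w1 == 0:
            continue  # one class is empty, no split to score
        m0 = sum0 / w0
        m1 = (sum_all - sum0) / w1
        score = w0 * w1 * (m0 - m1) ** 2  # proportional to between-class variance
        if score > best_score:
            best_score, best_t = score, t
    return best_t


# Dark ink (around 10) on bright paper (around 200).
pixels = [10] * 50 + [200] * 50
t = otsu_threshold(pixels)
binary = [1 if p <= t else 0 for p in pixels]  # 1 = ink, 0 = paper
```

A single global threshold like this is only expected to work in the "simple document" case; the problem situations listed on the slide are where local methods come in.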

9 Binarization in presence of dithering
- When the image is dithered, a low-pass filter should be applied first to smooth the background
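
A minimal sketch of that smoothing step, assuming a simple mean (box) filter as the low-pass filter; the slide does not specify which filter is meant, and the function name and radius are illustrative.

```python
def box_blur(img, radius=1):
    """Mean filter over a (2*radius+1) x (2*radius+1) window, clipped at the borders."""
    h, w = len(img), len(img[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            window = [img[j][i]
                      for j in range(max(0, y - radius), min(h, y + radius + 1))
                      for i in range(max(0, x - radius), min(w, x + radius + 1))]
            out[y][x] = sum(window) / len(window)
    return out


# A checkerboard dither pattern standing in for a mid-gray background.
dither = [[255 if (x + y) % 2 == 0 else 0 for x in range(6)] for y in range(6)]
smooth = box_blur(dither)
```

After smoothing, the alternating 0/255 pattern collapses to values near mid-gray, so a threshold no longer picks out individual dither dots as ink.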

10 Niblack's method
- Niblack's method uses a local threshold [formula image not preserved in the transcript], where
  - μx,y and σx,y are respectively the mean and standard deviation of gray levels in an N x N neighborhood around pixel (x,y)
  - k is a constant between 0 and 1 (suggested value 0.2)
  - R is the range of gray levels
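
The slide's own formula is not available here, so the following is only a sketch of the commonly cited form of Niblack's rule, written for dark text as T = mean - k * std over the local window. That form, the sign convention and the border clipping are assumptions; this form does not use R, which the slide's formula evidently did.

```python
import math


def local_stats(img, x, y, n):
    """Mean and standard deviation in an n x n window centred on (x, y), clipped at borders."""
    h, w = len(img), len(img[0])
    r = n // 2
    window = [img[j][i]
              for j in range(max(0, y - r), min(h, y + r + 1))
              for i in range(max(0, x - r), min(w, x + r + 1))]
    mean = sum(window) / len(window)
    std = math.sqrt(sum((v - mean) ** 2 for v in window) / len(window))
    return mean, std


def niblack_binarize(img, n=15, k=0.2):
    """Binary image (1 = ink, 0 = paper) using the assumed rule T = mean - k * std."""
    out = []
    for y in range(len(img)):
        row = []
        for x in range(len(img[0])):
            mean, std = local_stats(img, x, y, n)
            row.append(1 if img[y][x] < mean - k * std else 0)
        out.append(row)
    return out


# Bright 5 x 5 patch with one dark pixel in the middle.
patch = [[200] * 5 for _ in range(5)]
patch[2][2] = 20
result = niblack_binarize(patch, n=5)
```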

11 Sauvola's method
- Sauvola et al. proposed a variant which assumes that text is dark on a bright background [formula image not preserved in the transcript], where
  - R = 128, k = 0.5
- Problems remain when the hypothesis is not true (even after reversing)
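
As a hedged illustration, the commonly published form of Sauvola's threshold is T = mean * (1 + k * (std / R - 1)), which matches the parameters R = 128 and k = 0.5 given on the slide. Whether the slide showed exactly this expression is an assumption, as are the window size and border handling below.

```python
import math


def sauvola_binarize(img, n=15, k=0.5, R=128):
    """Binary image (1 = ink, 0 = paper), assuming dark text on a bright background."""
    h, w = len(img), len(img[0])
    r = n // 2
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            # n x n neighbourhood around (x, y), clipped at the image borders
            window = [img[j][i]
                      for j in range(max(0, y - r), min(h, y + r + 1))
                      for i in range(max(0, x - r), min(w, x + r + 1))]
            mean = sum(window) / len(window)
            std = math.sqrt(sum((v - mean) ** 2 for v in window) / len(window))
            threshold = mean * (1 + k * (std / R - 1))
            out[y][x] = 1 if img[y][x] < threshold else 0
    return out


patch = [[200] * 5 for _ in range(5)]
patch[2][2] = 20
ink = sauvola_binarize(patch, n=5)
flat = sauvola_binarize([[200] * 5 for _ in range(5)], n=5)
```

In a flat region the standard deviation is zero, so the threshold drops to half the local mean and nothing is marked as ink, which is the behaviour the dark-on-bright hypothesis relies on.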

12 Binarization in case of colored background
[Figure: binarization by global thresholding and by Sauvola's method]

13 Comparison of binarization techniques
- Original image
- Fisher
- Fisher (wind.)
- Yanowitz B.
- Niblack
- Sauvola et al.
- INSA, Lyon
from F. Lebourgeois, INSA, Lyon

14 Color clustering
- For richly colored documents
  - Checks, forms, …
  - Geographic maps
  - Historical documents
  - Advertising
  foreground / background separation is performed by color clustering
- Color clustering may be achieved automatically
  - k-means
  - Gaussian mixtures
  - …
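
A minimal k-means sketch for the clustering step, operating on a list of RGB tuples. The farthest-point initialisation, the iteration cap and the function names are illustrative assumptions chosen to keep the example deterministic; the slide only names k-means as one option.

```python
def dist2(a, b):
    """Squared Euclidean distance between two colors."""
    return sum((p - q) ** 2 for p, q in zip(a, b))


def kmeans_colors(pixels, k, iterations=20):
    """Cluster RGB tuples into k groups; returns (centers, labels)."""
    # Farthest-point initialisation: deterministic and avoids duplicate centres.
    centers = [pixels[0]]
    while len(centers) < k:
        centers.append(max(pixels, key=lambda p: min(dist2(p, c) for c in centers)))
    labels = [0] * len(pixels)
    for _ in range(iterations):
        # Assignment step: nearest centre for every pixel.
        labels = [min(range(k), key=lambda j: dist2(p, centers[j])) for p in pixels]
        # Update step: each centre moves to the mean of its members.
        new_centers = []
        for j in range(k):
            members = [p for p, lab in zip(pixels, labels) if lab == j]
            if members:
                new_centers.append(tuple(sum(ch) / len(members) for ch in zip(*members)))
            else:
                new_centers.append(centers[j])  # keep the centre of an empty cluster
        if new_centers == centers:
            break
        centers = new_centers
    return centers, labels


# Paper-white, dark ink and a red stamp color.
pixels = [(250, 250, 245), (255, 255, 255), (240, 248, 250),
          (20, 20, 30), (10, 15, 25),
          (200, 30, 30), (210, 20, 40)]
centers, labels = kmeans_colors(pixels, k=3)
```

Deciding which cluster is "background" is a separate step; one plausible heuristic is to take the most populated cluster, but that is not stated on the slide.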

15 Skew detection and correction
- Most document image recognition algorithms need text that is perfectly aligned horizontally and vertically
- Very often, acquisition systems are not accurate enough
- Skew correction requires two steps
  - Skew estimation (with a precision < 1 degree)
  - Image deskewing (rotation by a small angle)
- For book reading systems, page curvature calls for more sophisticated image correction algorithms

16 Skew estimation
- Many different methods have been proposed for skew estimation for printed documents
  - Margin detection
    - by white stream analysis
    - by projection profile analysis
  - Hough transforms
    - at pixel level
    - of centers of connected components
  - Linear regressions
    - of centers of connected components
- Most methods can be applied on down-sampled images
- Skew detection for handwriting is more difficult, but less useful

17 Projection profiles
- Projection profiles are simple histograms that accumulate pixels along each row or each column
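
A short sketch of both profiles on a binary image stored as a list of rows (1 = ink, 0 = paper). The representation and function names are assumptions for illustration.

```python
def horizontal_profile(binary):
    """Number of ink pixels in each row."""
    return [sum(row) for row in binary]


def vertical_profile(binary):
    """Number of ink pixels in each column."""
    return [sum(col) for col in zip(*binary)]


# Two short "text lines" separated by an empty row.
page = [
    [0, 0, 0, 0, 0, 0],
    [1, 1, 1, 1, 1, 0],
    [1, 1, 0, 1, 1, 0],
    [0, 0, 0, 0, 0, 0],
    [1, 1, 1, 0, 0, 0],
    [0, 0, 0, 0, 0, 0],
]
rows = horizontal_profile(page)
cols = vertical_profile(page)
```

On well-aligned text the horizontal profile alternates between high counts (text lines) and zeros (interline gaps), which is what makes it usable for skew estimation and line segmentation.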

18 Hough Transform
- The Hough transform is a global transformation
  - mapping the spatial space (x, y) to a parametric space (ρ, θ)
  - each pixel is accumulated on a beam of lines defined in polar coordinates [formula image not preserved in the transcript]
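
The slide's formula is missing from the transcript. The standard polar (normal) form of a line, which is presumably what was shown, is given below as an assumption; ρ is the distance of the line from the origin and θ the direction of its normal.

```latex
\rho = x \cos\theta + y \sin\theta
```

For a fixed pixel (x, y), every θ yields one ρ, so the pixel votes along a sinusoid in (ρ, θ) space; pixels lying on a common line produce sinusoids that intersect in a single accumulator cell.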

19 Skew estimation by Hough transform
- The Hough transform makes it possible to estimate the skew angle
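
One way this can work, sketched under stated assumptions: restrict θ to directions near 90 degrees (normals of nearly horizontal text lines), accumulate ρ = x cos θ + y sin θ for every ink pixel, and keep the angle whose accumulator has the strongest single peak. The peak criterion, the angle range, the step and the function name are illustrative choices, not taken from the slide.

```python
import math
from collections import Counter


def hough_skew(points, max_skew_deg=5.0, step_deg=0.5):
    """Estimate the skew (degrees) of roughly horizontal lines from (x, y) ink coordinates."""
    best_angle, best_votes = 0.0, -1
    steps = int(round(2 * max_skew_deg / step_deg))
    for s in range(steps + 1):
        skew = -max_skew_deg + s * step_deg
        theta = math.radians(90.0 + skew)  # normal of a line tilted by `skew`
        cos_t, sin_t = math.cos(theta), math.sin(theta)
        # Accumulate rho in bins of width 1 pixel for this theta.
        votes = Counter(round(x * cos_t + y * sin_t) for x, y in points)
        peak = max(votes.values())
        if peak > best_votes:
            best_votes, best_angle = peak, skew
    return best_angle


# Two synthetic text baselines tilted by 2 degrees (y grows downward, as in images).
slope = math.tan(math.radians(2.0))
points = [(x, y0 + round(x * slope)) for y0 in (10, 40) for x in range(200)]
estimate = hough_skew(points)
```

In line with the earlier remark that most methods work on down-sampled images, the point set could be reduced (for example to connected component centers) before voting.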

20 Deskewing of document image
- Deskewing requires an image rotation
  - rotation of color or gray level images needs re-sampling
  - rotation of binary images has several pitfalls
    - it introduces distortions and noise
    - it is not reversible (except for Pythagorean angles)
- Deskewing can also be approximated
  - by combining two affine transforms
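
The slide does not say which two affine transforms are meant. A plausible reading, sketched here as an assumption, is a vertical shear followed by a horizontal shear: for a small angle their product is close to a rotation. Pixels pushed outside the frame are simply dropped in this illustration.

```python
import math


def shear_deskew(img, skew_deg):
    """Approximate a rotation by -skew_deg using a vertical then a horizontal shear."""
    h, w = len(img), len(img[0])
    t = math.tan(math.radians(skew_deg))
    # Vertical shear: move column x up by round(x * t) so tilted text lines become level.
    step1 = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            ny = y - round(x * t)
            if img[y][x] and 0 <= ny < h:
                step1[ny][x] = img[y][x]
    # Horizontal shear: move row y right by round(y * t) so strokes tilted by the skew stand upright.
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            nx = x + round(y * t)
            if step1[y][x] and 0 <= nx < w:
                out[y][nx] = step1[y][x]
    return out


# A 40-pixel baseline tilted by 5 degrees in a 12 x 40 binary image.
t = math.tan(math.radians(5.0))
skewed = [[0] * 40 for _ in range(12)]
for x in range(40):
    skewed[3 + round(x * t)][x] = 1
level = shear_deskew(skewed, 5.0)
```

Because each shear only shifts whole columns or rows by integer amounts, no re-sampling is needed, at the cost of a small scale error that grows with the square of the angle.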

21 Rotation of binary images
- Pixel based rotations of binary images introduce distortions
  - this artifact can be avoided by connected component replacement

22 Rotation of binary images (2)
- Better results are obtained by rotating the original gray level image (before binarization)

23 Normalization of character size
- For text recognition, normalization of character sizes is often required
- Size normalization can be achieved
  - By bounding boxes of isolated characters
  - By base line, ascenders and descenders

24 Normalization techniques for handwriting
- In case of handwriting additional normalization may be applied
  - size normalization for ascenders and descenders
  - slant correction
- Slant estimation is performed by averaging the direction of the median of straight vertical segments

25 Run Length Smearing Algorithm (RLSA)
- The Run Length Smearing Algorithm (RLSA) consists of replacing white runs by black runs if their length is smaller than a given threshold
  - it can be applied horizontally or vertically
- RLSA is often useful for segmentation
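
A minimal sketch of RLSA on a binary image stored as a list of rows (1 = black, 0 = white). Leaving white runs that touch the image border untouched is an assumption of this sketch, as are the function names; the slide only states the run-length rule.

```python
def rlsa_row(row, threshold):
    """Fill white runs shorter than `threshold` that lie between two black pixels."""
    out = list(row)
    run_start = None    # index where the current white run began
    seen_black = False  # becomes True once a black pixel has been met
    for i, v in enumerate(row):
        if v == 1:
            if seen_black and run_start is not None and i - run_start < threshold:
                for j in range(run_start, i):
                    out[j] = 1
            seen_black = True
            run_start = None
        elif run_start is None:
            run_start = i
    return out


def rlsa(img, threshold, vertical=False):
    """Apply RLSA along rows, or along columns when vertical=True."""
    if vertical:
        cols = [rlsa_row(list(c), threshold) for c in zip(*img)]
        return [list(r) for r in zip(*cols)]
    return [rlsa_row(r, threshold) for r in img]


smeared = rlsa_row([0, 0, 1, 0, 0, 1, 0, 0, 0, 0, 1, 0], threshold=3)
column = rlsa([[1], [0], [1]], threshold=2, vertical=True)
```

With a threshold around the inter-character gap, horizontal smearing merges characters into word or line blobs, which is why the technique is commonly used ahead of block segmentation.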

