# Document Image Processing


Topics: Geometrical Transforms, Linear Filters, Morphological Operations, Connected Component Labeling, Binarization, Contour Tracing, X-Y Cuts, Smearing, Fourier Transform, Hough Transform, Docstrum, Moments and Features.

Transform Invariants Geometric Transforms

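Affine Transforms

Affine transforms cover linear combinations of translation, scaling, and rotation. I(x, y) is the original image and I'(x', y') is the transformed image:

$$x' = a_{11}x + a_{12}y + b_1, \qquad y' = a_{21}x + a_{22}y + b_2$$

| Type | Properties |
|---|---|
| Translation | $a_{11} = a_{22} = 1$, $a_{12} = a_{21} = 0$; only the offsets $b_1, b_2$ are nonzero |
| Scaling | $a_{12} = a_{21} = 0$ |
| Rotation | $a_{11} = \cos\varphi$, $a_{12} = -\sin\varphi$, $a_{21} = \sin\varphi$, $a_{22} = \cos\varphi$ |
| Slant | $a_{11} = 1$, $a_{12} = \tan\beta$, $a_{21} = 0$, $a_{22} = 1$ |

$\varphi$: rotation angle; $\beta$: slant angle.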
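As a rough illustration of the coefficients above, the sketch below applies an affine map to a single point. The offsets `b1`, `b2`, the helper names, and the sign conventions for rotation and slant are assumptions made for this example; the slides' own figures are not available in the transcript.

```python
import math

def affine(point, a11, a12, a21, a22, b1=0.0, b2=0.0):
    """Map (x, y) to (x', y') = (a11*x + a12*y + b1, a21*x + a22*y + b2)."""
    x, y = point
    return (a11 * x + a12 * y + b1, a21 * x + a22 * y + b2)

def rotation(phi):
    """Coefficients (a11, a12, a21, a22) for a rotation by phi radians."""
    c, s = math.cos(phi), math.sin(phi)
    return (c, -s, s, c)

def slant(beta):
    """Coefficients for a horizontal slant (shear) by beta radians (assumed form)."""
    return (1.0, math.tan(beta), 0.0, 1.0)

rotated = affine((1.0, 0.0), *rotation(math.pi / 2))    # close to (0, 1)
slanted = affine((0.0, 1.0), *slant(math.pi / 4))       # close to (1, 1)
shifted = affine((1.0, 0.0), 1, 0, 0, 1, b1=3, b2=-2)   # pure translation
```

To warp a whole image, one would normally iterate over output pixels and apply the inverse map, so that every output pixel receives a value.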

Linear Filters

Convolution equation; smoothing (low-pass filter); vertical-line-sensitive filter; vertical-edge-sensitive filter; enhancement filter; Laplacian edge operator.
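The kernels shown on the slides are not preserved, so the sketch below uses common textbook choices as stand-ins: a 3x3 mean kernel for smoothing, a Prewitt-style kernel for vertical edges, and the 4-neighbor Laplacian. It implements the usual discrete convolution, I'(x, y) = Σ_j Σ_i k(i, j) · I(x − i, y − j), restricted to positions where the kernel fits entirely inside the image.

```python
def convolve2d(image, kernel):
    """'Valid' 2-D convolution: the output shrinks by (kernel size - 1)."""
    kh, kw = len(kernel), len(kernel[0])
    h, w = len(image), len(image[0])
    # Convolution flips the kernel in both axes (correlation would not).
    flipped = [row[::-1] for row in kernel[::-1]]
    out = []
    for y in range(h - kh + 1):
        row = []
        for x in range(w - kw + 1):
            acc = 0
            for j in range(kh):
                for i in range(kw):
                    acc += flipped[j][i] * image[y + j][x + i]
            row.append(acc)
        out.append(row)
    return out

# Assumed kernels (the slide's own kernels are not in the transcript).
SMOOTH = [[1 / 9] * 3 for _ in range(3)]
VERTICAL_EDGE = [[-1, 0, 1] for _ in range(3)]
LAPLACIAN = [[0, 1, 0], [1, -4, 1], [0, 1, 0]]

# A vertical step edge: dark on the left, bright on the right.
image = [[0, 0, 10, 10, 10] for _ in range(4)]
smooth = convolve2d(image, SMOOTH)
edges = convolve2d(image, VERTICAL_EDGE)
lap = convolve2d(image, LAPLACIAN)
```

The smoothing output ramps gradually across the step, while the edge and Laplacian outputs are nonzero only near it. The sign of the edge response depends on the kernel-flip convention.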

Morphological Operators

Dilation: For each background pixel, superimpose the structuring element on top of the input image so that the origin of the structuring element coincides with the input pixel position. If at least one pixel in the structuring element coincides with a foreground pixel in the image underneath, the input pixel is set to the foreground value. If all the corresponding pixels in the image are background, the input pixel is left at the background value.

Erosion: For each foreground pixel, superimpose the structuring element on top of the input image so that the origin of the structuring element coincides with the input pixel position. If every pixel in the structuring element coincides with a foreground pixel in the image underneath, the input pixel is left as is. If any pixel coincides with background, the input pixel is changed to background.
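The two procedures can be sketched directly from their descriptions. In this example the image is a list of rows of 0/1 values, the structuring element is a list of (dy, dx) offsets from its origin, and pixels outside the image are treated as background; the cross-shaped element and the border rule are assumptions chosen for illustration.

```python
# 3x3 cross-shaped structuring element, origin at the centre (assumed).
CROSS = [(0, 0), (-1, 0), (1, 0), (0, -1), (0, 1)]

def _get(img, y, x):
    """Read a pixel; anything outside the image counts as background."""
    if 0 <= y < len(img) and 0 <= x < len(img[0]):
        return img[y][x]
    return 0

def dilate(img, se=CROSS):
    """A background pixel becomes foreground if any element pixel hits foreground."""
    return [[1 if img[y][x] or any(_get(img, y + dy, x + dx) for dy, dx in se) else 0
             for x in range(len(img[0]))]
            for y in range(len(img))]

def erode(img, se=CROSS):
    """A foreground pixel survives only if every element pixel hits foreground."""
    return [[1 if img[y][x] and all(_get(img, y + dy, x + dx) for dy, dx in se) else 0
             for x in range(len(img[0]))]
            for y in range(len(img))]

img = [[0] * 5 for _ in range(5)]
img[2][2] = 1
dilated = dilate(img)        # the single pixel grows into a cross
restored = erode(dilated)    # eroding the cross leaves only its centre
```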

Opening and Closing

Opening: erosion followed by dilation using the same kernel. Closing: dilation followed by erosion using the same kernel.
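A compact way to experiment with opening and closing is to store the foreground as a set of (y, x) coordinates. The sketch below assumes a symmetric 3x3 square kernel that contains its origin, and an unbounded image plane; both are simplifications for illustration.

```python
# Symmetric 3x3 square kernel as a set of (dy, dx) offsets (assumed).
SQUARE = {(dy, dx) for dy in (-1, 0, 1) for dx in (-1, 0, 1)}

def dilate(fg, se=SQUARE):
    """Union of the kernel translated to every foreground pixel."""
    return {(y + dy, x + dx) for (y, x) in fg for (dy, dx) in se}

def erode(fg, se=SQUARE):
    """Keep pixels where the whole kernel fits inside the foreground.

    Iterating over fg only is valid because the kernel contains its origin.
    """
    return {(y, x) for (y, x) in fg
            if all((y + dy, x + dx) in fg for (dy, dx) in se)}

def opening(fg, se=SQUARE):
    return dilate(erode(fg, se), se)

def closing(fg, se=SQUARE):
    return erode(dilate(fg, se), se)

block = {(y, x) for y in range(5) for x in range(5)}
noisy = block | {(10, 10)}      # block plus an isolated speck
holed = block - {(2, 2)}        # block with a one-pixel hole

opened = opening(noisy)         # the speck is removed
closed = closing(holed)         # the hole is filled
```

Opening removes foreground detail smaller than the kernel, while closing fills background gaps smaller than the kernel.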

Hit and Miss

The kernel has 1s, 0s, and don't-care entries. If the 1s and 0s in the kernel exactly match the 1s and 0s in the image, the pixel underneath the origin is set to 1; otherwise it is set to 0.

- Corner-finding kernels: the final result is the OR of the outputs.
- Kernel used to locate isolated points in a binary image.
- Kernel used to locate the end points on a binary skeleton: four hit-and-miss passes, one for each rotation.
- Kernels used to locate the triple points (junctions) on a skeleton.
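The matching rule can be sketched as follows, with `None` standing for a don't-care entry. The two kernels are commonly used examples (an isolated-point kernel and a convex-corner kernel) and are assumptions here, since the slide's kernel figures are not preserved; pixels outside the image are treated as 0.

```python
ISOLATED = [[0, 0, 0],
            [0, 1, 0],
            [0, 0, 0]]

CORNER = [[None, 1, None],      # None = don't care
          [0,    1, 1],
          [0,    0, None]]

def rot90(kernel):
    """Rotate a square kernel by 90 degrees clockwise."""
    return [list(row) for row in zip(*kernel[::-1])]

def hit_and_miss(img, kernel):
    h, w = len(img), len(img[0])
    k = len(kernel) // 2
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            match = True
            for j, krow in enumerate(kernel):
                for i, want in enumerate(krow):
                    if want is None:
                        continue
                    yy, xx = y + j - k, x + i - k
                    got = img[yy][xx] if 0 <= yy < h and 0 <= xx < w else 0
                    if got != want:
                        match = False
            out[y][x] = 1 if match else 0
    return out

def find_corners(img):
    """OR together the outputs of the corner kernel in its four rotations."""
    found = set()
    kernel = CORNER
    for _ in range(4):
        hits = hit_and_miss(img, kernel)
        found |= {(y, x) for y, row in enumerate(hits) for x, v in enumerate(row) if v}
        kernel = rot90(kernel)
    return found

square = [[1 if 1 <= y <= 3 and 1 <= x <= 3 else 0 for x in range(5)] for y in range(5)]
corners = find_corners(square)

dots = [[1, 1, 0, 0, 0],
        [0, 0, 0, 0, 0],
        [0, 0, 1, 0, 0]]
isolated = hit_and_miss(dots, ISOLATED)
```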

Thinning

NT(P1) = number of 0-to-1 transitions in the ordered sequence <P2, P3, ..., P9, P2>. NZ(P1) = number of non-zero neighbors of P1.

Set P1 to 0 if all of the following hold:

- 1 < NZ(P1) < 7
- NT(P1) = 1
- P2·P4·P8 = 0 or NT(P2) ≠ 1
- P2·P4·P6 = 0 or NT(P4) ≠ 1

Use both kernels and their 90° variations. Consider all pixels on the boundaries of foreground regions. Delete any pixel that has more than one foreground neighbor, as long as doing so does not locally disconnect the region. Iterate until convergence.
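The NT/NZ deletion test can be sketched as below. The neighbor layout (P2 directly above P1, then P3 to P9 clockwise) and the parallel update, in which all deletable pixels are marked first and then removed together, are assumptions; the slide's neighborhood figure is not part of the transcript.

```python
# Offsets (dy, dx) of P2..P9, clockwise, starting from the pixel above P1.
RING = [(-1, 0), (-1, 1), (0, 1), (1, 1), (1, 0), (1, -1), (0, -1), (-1, -1)]

def px(img, y, x):
    """Read a pixel; anything outside the image counts as 0."""
    return img[y][x] if 0 <= y < len(img) and 0 <= x < len(img[0]) else 0

def ring(img, y, x):
    return [px(img, y + dy, x + dx) for dy, dx in RING]

def nz(img, y, x):
    """Number of non-zero neighbors."""
    return sum(ring(img, y, x))

def nt(img, y, x):
    """Number of 0-to-1 transitions in <P2, P3, ..., P9, P2>."""
    r = ring(img, y, x)
    return sum(1 for a, b in zip(r, r[1:] + r[:1]) if a == 0 and b == 1)

def deletable(img, y, x):
    p2, _, p4, _, p6, _, p8, _ = ring(img, y, x)
    return (img[y][x] == 1
            and 1 < nz(img, y, x) < 7
            and nt(img, y, x) == 1
            and (p2 * p4 * p8 == 0 or nt(img, y - 1, x) != 1)
            and (p2 * p4 * p6 == 0 or nt(img, y, x + 1) != 1))

def thin(img):
    img = [row[:] for row in img]
    while True:
        marked = [(y, x) for y in range(len(img)) for x in range(len(img[0]))
                  if deletable(img, y, x)]
        if not marked:
            return img
        for y, x in marked:        # delete only after the whole pass is marked
            img[y][x] = 0

# A solid bar, three pixels thick, inside a 5x7 image.
bar = [[1 if 1 <= y <= 3 and 1 <= x <= 5 else 0 for x in range(7)] for y in range(5)]
skeleton = thin(bar)
```

On this bar the procedure converges after one deleting pass and leaves a one-pixel-thick horizontal segment along the middle row.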

Voronoi Diagrams and Convex Hulls

Thickening can be performed by thinning the background. The convex hull of a binary shape can be visualized by imagining an elastic band stretched around the shape: the band follows the convex contours of the shape but bridges the concave contours.

Structuring elements 1a and 1b are used for skeletonization of the background: on each thickening iteration, until convergence, each element is used in turn, and in each of its 90° rotations. Structuring elements 2a and 2b are used similarly to prune the skeleton until convergence, which gives the Voronoi diagram.

Connected Component Labeling
Scan the image by moving along each row until reaching a point p to be labeled. Examine the neighbors of p that have already been encountered in the scan: (i) the pixel to the left of p, (ii) the pixel above it, and (iii, iv) the two upper diagonal pixels.

- If all four neighbors are 0, assign a new label to p.
- Else, if only one neighbor is 1, assign its label to p.
- Else, if more than one of the neighbors is 1, assign one of their labels to p and note the equivalences.

After completing the scan, the equivalent label pairs are sorted into equivalence classes and a unique label is assigned to each class.
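A minimal sketch of this two-pass scheme is shown below. It records equivalences with a union-find structure, which is one common choice rather than something the slides prescribe, and uses 8-connectivity to match the four neighbors listed above.

```python
def label_components(img):
    h, w = len(img), len(img[0])
    labels = [[0] * w for _ in range(h)]
    parent = [0]                       # parent[i] for provisional label i; 0 = background

    def find(a):
        while parent[a] != a:
            parent[a] = parent[parent[a]]   # path halving
            a = parent[a]
        return a

    # First pass: provisional labels plus equivalences.
    for y in range(h):
        for x in range(w):
            if not img[y][x]:
                continue
            seen = [labels[yy][xx]
                    for yy, xx in ((y, x - 1), (y - 1, x - 1), (y - 1, x), (y - 1, x + 1))
                    if 0 <= yy and 0 <= xx < w and labels[yy][xx]]
            if not seen:
                parent.append(len(parent))          # brand-new label
                labels[y][x] = len(parent) - 1
            else:
                labels[y][x] = min(seen)
                for other in seen:                  # note the equivalences
                    ra, rb = find(other), find(labels[y][x])
                    if ra != rb:
                        parent[max(ra, rb)] = min(ra, rb)

    # Second pass: replace each label by a compact label for its class.
    final = {}
    for y in range(h):
        for x in range(w):
            if labels[y][x]:
                root = find(labels[y][x])
                labels[y][x] = final.setdefault(root, len(final) + 1)
    return labels, len(final)

img = [[1, 0, 1, 0, 0],
       [1, 0, 1, 0, 1],
       [1, 1, 1, 0, 0]]
labels, count = label_components(img)
```

The U-shaped region first receives two provisional labels, one per arm, which are merged when the scan reaches the bottom row.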

Binarization

Figure panels: original gray scale; global threshold; adaptive threshold (T = mean) with 7x7 neighborhood; adaptive threshold (T = mean - C) with 7x7 neighborhood, for C = 7 and C = 10; adaptive threshold using T = median instead of the mean.
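The thresholding variants named in the panels can be sketched as follows. The output convention (1 for pixels brighter than the threshold, 0 otherwise) and the clipping of the neighborhood at the image border are assumptions made for this example.

```python
import statistics

def global_threshold(gray, t):
    """One threshold for the whole image."""
    return [[1 if v > t else 0 for v in row] for row in gray]

def adaptive_threshold(gray, size=7, c=7, stat="mean"):
    """Per-pixel threshold T = (mean or median of a size x size window) - c."""
    h, w = len(gray), len(gray[0])
    r = size // 2
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            window = [gray[yy][xx]
                      for yy in range(max(0, y - r), min(h, y + r + 1))
                      for xx in range(max(0, x - r), min(w, x + r + 1))]
            if stat == "median":
                local = statistics.median(window)
            else:
                local = sum(window) / len(window)
            out[y][x] = 1 if gray[y][x] > local - c else 0
    return out

# Uniform bright page with a single dark pixel in the middle.
gray = [[200] * 9 for _ in range(9)]
gray[4][4] = 50

mean_c0 = adaptive_threshold(gray, size=7, c=0)   # T = mean
mean_c7 = adaptive_threshold(gray, size=7, c=7)   # T = mean - 7
median_c7 = adaptive_threshold(gray, size=7, c=7, stat="median")
```

With T = mean, pixels in perfectly uniform regions sit exactly at the threshold and are classified as dark, which is one reason for subtracting a constant C.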

Contour Tracing

Chain Code Contours

Features

Geometrical features:

- Sizes in the x and y directions, aspect ratio, perimeter, area
- Maximum and minimum distances from the boundary to the center of mass
- Compactness = Perimeter² / (4π · Area)
- Signatures = projection profiles

Structural features:

- Number of holes
- Euler number = number of components - number of holes

Moments, with $m_{pq} = \sum_x \sum_y x^p y^q \, I(x, y)$:

- $m_{00}$ = area of the object
- $(m_{10}/m_{00},\; m_{01}/m_{00})$ = center of mass
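A few of these features can be computed directly on a binary image, as in the sketch below. The moment definition m_pq = Σ x^p y^q I(x, y) is the standard one, and the perimeter is measured by counting pixel edges that border the background; that perimeter definition is an assumption, since several are in use.

```python
import math

def moment(img, p, q):
    """Raw moment m_pq = sum over pixels of x^p * y^q * I(x, y)."""
    return sum((x ** p) * (y ** q) * v
               for y, row in enumerate(img) for x, v in enumerate(row))

def perimeter(img):
    """Count pixel edges separating foreground from background."""
    h, w = len(img), len(img[0])
    count = 0
    for y in range(h):
        for x in range(w):
            if img[y][x]:
                for dy, dx in ((-1, 0), (1, 0), (0, -1), (0, 1)):
                    yy, xx = y + dy, x + dx
                    if not (0 <= yy < h and 0 <= xx < w) or not img[yy][xx]:
                        count += 1
    return count

def projection_profiles(img):
    """Row sums and column sums (the 'signatures')."""
    return [sum(row) for row in img], [sum(col) for col in zip(*img)]

# A 4x4 square inside a 6x8 image.
img = [[1 if 1 <= y <= 4 and 2 <= x <= 5 else 0 for x in range(8)] for y in range(6)]

area = moment(img, 0, 0)
centroid = (moment(img, 1, 0) / area, moment(img, 0, 1) / area)
compactness = perimeter(img) ** 2 / (4 * math.pi * area)
rows, cols = projection_profiles(img)
```

With this perimeter definition a square has compactness 4/π, and elongated or ragged shapes score higher.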

X-Y Cuts

Autocorrelation function of the projection profile, where k is the lag parameter. If k = k_p is the first peak following the peak at k = 0, the sharpness of that peak is given by (equation not recoverable from the transcript).
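The slide's equations are missing, so the sketch below uses the plain unnormalized autocorrelation R(k) = Σ_i p(i) · p(i + k) as an assumed definition and locates the first peak after k = 0. It does not attempt the sharpness measure. On a page of evenly spaced text lines, that first peak falls at the line pitch.

```python
def horizontal_profile(img):
    """Number of foreground pixels in each row."""
    return [sum(row) for row in img]

def autocorrelation(profile):
    """R(k) = sum_i p(i) * p(i + k) for every lag k (assumed, unnormalized form)."""
    n = len(profile)
    return [sum(profile[i] * profile[i + k] for i in range(n - k)) for k in range(n)]

def first_peak(r):
    """Smallest lag k > 0 that is a local maximum of R, or None."""
    for k in range(1, len(r) - 1):
        if r[k] > r[k - 1] and r[k] >= r[k + 1]:
            return k
    return None

# Synthetic page: text lines 2 pixels tall, separated by 2 blank rows.
page = [[1] * 5 if y % 4 < 2 else [0] * 5 for y in range(16)]

profile = horizontal_profile(page)
r = autocorrelation(profile)
pitch = first_peak(r)
```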

Smearing: Run Length Smearing (RLS)

Change runs of white pixels whose length is below a threshold to black. The final result is the logical AND of the vertical RLS and horizontal RLS outputs.
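A small sketch of run-length smearing is given below, with black pixels stored as 1 and white as 0. It fills only white runs that are bounded by black on both sides, which is an assumption; some formulations also fill short runs that touch the image border.

```python
def smear_row(row, threshold):
    """Turn white runs shorter than `threshold`, bounded by black, into black."""
    out = list(row)
    run_start = None        # index where the current white run began
    seen_black = False      # True once a black pixel lies to the left
    for i, v in enumerate(row):
        if v:
            if seen_black and run_start is not None and i - run_start < threshold:
                for j in range(run_start, i):
                    out[j] = 1
            seen_black = True
            run_start = None
        elif run_start is None:
            run_start = i
    return out

def horizontal_rls(img, threshold):
    return [smear_row(row, threshold) for row in img]

def vertical_rls(img, threshold):
    columns = [smear_row(list(col), threshold) for col in zip(*img)]
    return [list(row) for row in zip(*columns)]     # transpose back

def rls(img, h_threshold, v_threshold):
    """Logical AND of the horizontal and vertical smearing results."""
    hor = horizontal_rls(img, h_threshold)
    ver = vertical_rls(img, v_threshold)
    return [[a & b for a, b in zip(hr, vr)] for hr, vr in zip(hor, ver)]

row = smear_row([1, 0, 0, 1, 0, 0, 0, 0, 1, 0], threshold=3)

img = [[1, 0, 1],
       [0, 0, 0],
       [1, 0, 1]]
hor = horizontal_rls(img, 2)
ver = vertical_rls(img, 2)
both = rls(img, 2, 2)
```

In practice the horizontal threshold is usually larger than the vertical one, so that characters merge into words and lines before lines merge into blocks.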

Fourier Transform

Document Images and FT

Hough Transform

Parametric form. Global peaks in accumulator space.
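The slide's equation is not preserved; the sketch below assumes the usual normal parametrization ρ = x·cos θ + y·sin θ. Each point votes for every (θ, ρ) cell it could lie on, and the global peak of the accumulator identifies the dominant line. The one-degree angle step and the rounding of ρ to whole pixels are arbitrary choices for illustration.

```python
import math
from collections import Counter

def hough_lines(points, theta_step_deg=1):
    """Accumulate votes in (theta in degrees, rho in pixels) space."""
    acc = Counter()
    for theta_deg in range(0, 180, theta_step_deg):
        t = math.radians(theta_deg)
        c, s = math.cos(t), math.sin(t)
        for x, y in points:
            rho = round(x * c + y * s)
            acc[(theta_deg, rho)] += 1
    return acc

# 100 collinear points on the horizontal line y = 5.
points = [(x, 5) for x in range(100)]

acc = hough_lines(points)
(theta, rho), votes = acc.most_common(1)[0]
```

For document images, the points are typically component centroids or bottom-of-character points, and the angle of the peak gives the skew of the text lines.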

Docstrum

Slope histograms use local information:

- Connect each mark (component) with its K (= 4 to 6) nearest neighbors.
- Build a histogram of the slopes.
- More efficient than projection profiles.

The docstrum is the radius-and-angle plot of the slopes.
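The neighbor-pairing step can be sketched as below, assuming each component is represented by its centroid. The brute-force neighbor search and the folding of angles into 0 to 179 degrees are choices made for this example; the small test grid uses K = 2 so that only within-line neighbors are picked up.

```python
import math
from collections import Counter

def docstrum(centroids, k=5):
    """Return (distance, angle in radians) pairs to each centroid's k nearest neighbors."""
    pairs = []
    for i, (x0, y0) in enumerate(centroids):
        others = [(math.hypot(x - x0, y - y0), math.atan2(y - y0, x - x0))
                  for j, (x, y) in enumerate(centroids) if j != i]
        pairs.extend(sorted(others)[:k])     # nearest first
    return pairs

def slope_histogram(pairs):
    """Histogram of neighbor angles in whole degrees, folded into [0, 180)."""
    return Counter(round(math.degrees(a)) % 180 for _, a in pairs)

# Three text lines of six characters: 10 px between characters, 30 px between lines.
centroids = [(10 * c, 30 * r) for r in range(3) for c in range(6)]

pairs = docstrum(centroids, k=2)
hist = slope_histogram(pairs)
skew = hist.most_common(1)[0][0]
```

Plotting the (distance, angle) pairs in polar form gives the docstrum itself; the dominant angle estimates skew and the dominant distances estimate character and line spacing.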
