Chaincode Generation Contour separation extracted by algorithm Image Chaincode contour Represented as an array of coordinates and corresponding slopes.

Presentation transcript:

1 Chaincode Generation
Input
- Image
Output
- Chaincode contour (contour separation extracted by algorithm)
- Represented as an array of coordinates and corresponding slopes (0..7) at each contour point
Figure: the eight contour directions, numbered 0 to 7.
Figure: chain record layout with fields Y, X, Status, Slope, Mode, Curvature; data and information entries repeat until an end-of-chain marker.
CEDAR
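A minimal sketch of the output representation, assuming the standard Freeman numbering for the eight directions (0 = east, counting counter-clockwise, with y growing downward). The slide does not spell out the layout, so the `DIRECTIONS` table and the `chaincode` helper are illustrative assumptions.

```python
# Assumed direction layout around the current pixel (y grows downward):
#   3 2 1
#   4 . 0
#   5 6 7
DIRECTIONS = {
    (1, 0): 0, (1, -1): 1, (0, -1): 2, (-1, -1): 3,
    (-1, 0): 4, (-1, 1): 5, (0, 1): 6, (1, 1): 7,
}

def chaincode(points):
    """Return (x, y, slope) triples for a closed contour given as (x, y) points."""
    chain = []
    n = len(points)
    for i, (x, y) in enumerate(points):
        nx, ny = points[(i + 1) % n]  # wrap around: the contour is closed
        chain.append((x, y, DIRECTIONS[(nx - x, ny - y)]))
    return chain

# Hypothetical 2x2 square contour.
square = [(0, 0), (1, 0), (1, 1), (0, 1)]
slopes = [slope for _, _, slope in chaincode(square)]
```

Each entry pairs a coordinate with the slope code of the step to the next contour point, which matches the "array of coordinates and corresponding slopes" described above.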

2 Algorithm
Input
- Array of bytes representing pixels.
- Value 0 for black and 255 for white.
Output
- Contour representation of the image
Steps
- Start at upper right corner of image
- Travel to the left, then down a row, until you move from a white pixel to a black pixel
- Is it marked? If no, it is a new object, so mark the pixel and store it
- Travel counter-clockwise around the boundary, storing visited pixels and marking pixels as necessary, until you return to the start of the contour
- At lower left corner? If yes, stop; if no, continue the scan
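The scanning step can be sketched as follows. This covers only the search for the next unmarked white-to-black transition, not the boundary walk itself; `find_contour_start` and the `marked` set are hypothetical names, and the area to the right of the image is assumed to be white.

```python
BLACK, WHITE = 0, 255

def find_contour_start(image, marked=None):
    """Scan right-to-left, top-to-bottom for the first unmarked white-to-black step."""
    marked = marked or set()
    for row, pixels in enumerate(image):
        previous = WHITE  # assumption: outside the image counts as white
        for col in range(len(pixels) - 1, -1, -1):
            current = pixels[col]
            if previous == WHITE and current == BLACK and (row, col) not in marked:
                return row, col
            previous = current
    return None  # reached the lower left corner without finding a new object

image = [
    [255, 255, 255, 255],
    [255,   0,   0, 255],
    [255,   0,   0, 255],
    [255, 255, 255, 255],
]
start = find_contour_start(image)
```

In a full tracer, every boundary pixel visited from `start` would be added to `marked` so that later scan hits on the same object are skipped.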

3 Pre-scan Digit Recognition
Use fast digit recognizer (POLY or CP) on each appropriate component in address block
Input
- Chaincode contour of connected components in address block
Output
- Recognition choice with confidence on each component
- Confidence of characters is typically low
- Confidence of “real” numerals is typically high

4 POLY Digit Recognizer
Method
–1240 binary pixel pair features used
–Linear discriminant classifier used
Performance
–1000 digits per second on an RS 6000
–94% recognition rate on a standard test set
–Useful in separating alpha characters and numerals
Feature Extraction
–Set of 1240 binary (on, off) features
–Features are based on whether particular pixels or pairs of pixels are BLACK
–Pixel pairs are empirically determined
–Consider distinguishing “7” and “2”

5 Classification
–Uses linear discriminant functions
–Training: 1241 weights (one for each of the 1240 features, plus a constant) are determined for each of the 10 classes
–Testing: for each new test image do the following:
  For each digit class (0..9), create a sum consisting of all the weights corresponding to a feature that is “on”, then add in the constant
  Compare the 10 sums and choose the largest value
  This is the top choice class
–Output: ranked list of the 10 classes sorted by the sums
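The testing procedure above can be sketched in a few lines. The weights, constants, and the three-feature toy input are made-up values for illustration; a real POLY model would carry 1240 weights plus a constant per class.

```python
def rank_classes(features, weights, constants):
    """features: list of 0/1 flags; weights[c][i]: weight of feature i for class c."""
    sums = []
    for digit, (class_weights, constant) in enumerate(zip(weights, constants)):
        # Add the weights of the features that are "on", then the class constant.
        total = constant + sum(w for w, on in zip(class_weights, features) if on)
        sums.append((total, digit))
    sums.sort(key=lambda pair: pair[0], reverse=True)
    return [digit for _, digit in sums]

# Hypothetical toy model: 3 classes, 3 binary features.
weights = [[1.0, -2.0, 0.5], [0.2, 0.3, 0.1], [-1.0, 2.0, 2.0]]
constants = [0.0, 1.0, -0.5]
ranking = rank_classes([1, 0, 1], weights, constants)
```

Because the features are binary, the dot product reduces to summing the weights of the "on" features, which is what makes this classifier fast.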

6 CP, Digit Recognizer
Method
–Combines a 3-layer back propagation neural network classifier using Curvature features with POLY
–Top 2 choices of POLY and top 2 choices of the Curvature recognizer are combined using logistic regression
Performance
–170 digits per second on an RS 6000
–96% recognition rate on standard test set

7 Curvature, Digit Recognizer
Input
–Binary image of digit, size normalized by imposing a 4x4 grid on the image
–Since the features are region based (as opposed to pixel based), this form of size normalization is effective

8 Feature Extraction
–Set of 296 real-valued features
208 based on contour shape (slope and curvature)
–For each of the 16 regions in the 4x4 grid determine
  percent of pixels with each of the 8 possible slopes
  percent of pixels with each of 5 ranges of curvature, computed over a 12-pixel neighborhood window
Figure: slopes S numbered 0 to 7; curvatures dS in five ranges, -2 to 2.
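The count of 208 follows from 16 regions times (8 slopes + 5 curvature ranges). The slope half can be sketched as below, assuming a uniform 4x4 grid over the bounding box and fractions taken over the contour pixels of each region; `slope_fractions` is a hypothetical helper and the curvature half is omitted.

```python
def slope_fractions(chain, width, height, grid=4, n_slopes=8):
    """chain: (x, y, slope) triples. Returns grid*grid*n_slopes fractions."""
    counts = [[0] * n_slopes for _ in range(grid * grid)]
    for x, y, slope in chain:
        col = min(x * grid // width, grid - 1)
        row = min(y * grid // height, grid - 1)
        counts[row * grid + col][slope] += 1
    features = []
    for region in counts:
        total = sum(region)
        # Fraction of the region's contour pixels having each slope code.
        features.extend(c / total if total else 0.0 for c in region)
    return features

# Hypothetical four-point chain inside an 8x8 bounding box.
chain = [(0, 0, 0), (1, 0, 0), (7, 7, 6), (6, 7, 4)]
features = slope_fractions(chain, width=8, height=8)
```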

9 (Feature Extraction, continued)
84 based on stroke transitions between regions
–Chaincode represents the contour as a sequence of boundary pixels, so there is a notion of “moving” from one part of the image to another
–In a 4x4 grid, there are 84 possible transitions
4 based on size, location, and number of interior contours
–Image is divided into 3 regions: UPPER, MIDDLE, and LOWER
–Determine the center of the region bounded by interior contours
–Location of center determines which of 3 features is set
–Value of feature is ratio of “hole” area to area of bounding box
–Last feature stores number of interior contours present
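One reading that reproduces the figure of 84 is to count directed moves between 8-adjacent regions of the grid. That interpretation is an assumption, not something the slide states; the check below only shows that the arithmetic works out.

```python
def count_transitions(grid=4):
    """Count ordered pairs of distinct, 8-adjacent cells in a grid x grid layout."""
    count = 0
    for r in range(grid):
        for c in range(grid):
            for dr in (-1, 0, 1):
                for dc in (-1, 0, 1):
                    inside = 0 <= r + dr < grid and 0 <= c + dc < grid
                    if (dr, dc) != (0, 0) and inside:
                        count += 1
    return count

transitions = count_transitions()
```

The 4 corner cells contribute 3 moves each, the 8 edge cells 5 each, and the 4 interior cells 8 each, giving 12 + 40 + 32 = 84. Together with the other groups this gives 208 + 84 + 4 = 296 features.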

10 Classification
–Uses a 3-layer back propagation neural network
  296 input nodes for feature values
  80 hidden nodes
  10 output nodes (1 for each digit class)
–Connections between nodes have associated weights determined during training
–Output node reporting the highest value corresponds to classifier’s top choice
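A forward pass through a network of this shape might look like the sketch below. The sigmoid activation, the random untrained weights, and the random input vector are assumptions for illustration; the slides give only the layer sizes.

```python
import math
import random

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def layer(inputs, weights, biases):
    """Fully connected layer: one weight row and one bias per output node."""
    return [sigmoid(b + sum(w * x for w, x in zip(row, inputs)))
            for row, b in zip(weights, biases)]

def init(n_out, n_in, rng):
    """Small random weights stand in for values learned by back propagation."""
    weights = [[rng.uniform(-0.1, 0.1) for _ in range(n_in)] for _ in range(n_out)]
    return weights, [0.0] * n_out

rng = random.Random(0)
w_hidden, b_hidden = init(80, 296, rng)   # 296 inputs -> 80 hidden nodes
w_out, b_out = init(10, 80, rng)          # 80 hidden -> 10 digit classes

features = [rng.random() for _ in range(296)]  # placeholder feature vector
hidden = layer(features, w_hidden, b_hidden)
outputs = layer(hidden, w_out, b_out)
top_choice = max(range(10), key=lambda digit: outputs[digit])
```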

11 Thresholding Performance Graph

12 GSC - Top Level
Steps (Input to Output)
- Put bounding box around the image
- Hyper GSC Recognizer
- Is confidence level of first class 0?
  - YES: GSC Recognizer, then Output
  - NO: Output

13 GSC, Digit Recognizer
Method
–512 binary valued features representing Gradient, Structural, Concavity characteristics of the image
–Uses a nearest neighbor classifier
Performance
–100 digits per second on an RS 6000
–97% recognition rate on standard test set

14 Image Processing
–Size normalization accomplished by imposing a 4x4 grid
  Grid is determined by partitioning the image horizontally and vertically into 4 equal pixel mass partitions
Figure: Uniform and Variable Gridding (axis: % Reject).
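Equal pixel mass partitioning along one axis can be sketched as follows, assuming `mass[i]` holds the number of black pixels in column (or row) `i`. The function name and the fallback to uniform cuts for an empty image are assumptions.

```python
def equal_mass_cuts(mass, parts=4):
    """Return parts-1 cut indices so each slice holds about 1/parts of the mass."""
    total = sum(mass)
    if total == 0:
        # No black pixels: fall back to a uniform grid (assumption).
        return [len(mass) * k // parts for k in range(1, parts)]
    cuts, running, k = [], 0, 1
    for i, m in enumerate(mass):
        running += m
        # Place the k-th cut once the running mass reaches k/parts of the total.
        while k < parts and running * parts >= total * k:
            cuts.append(i + 1)
            k += 1
    return cuts

# Hypothetical column profile: mass concentrated at both ends.
column_mass = [4, 0, 0, 4, 4, 4]
cuts = equal_mass_cuts(column_mass)
```

Applying the same function to the row profile gives the horizontal cuts, and the two sets of cuts together define the variable 4x4 grid.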

15 Feature Extraction
–Set of 512 binary features
–Choice of features motivated by belief that multi-scale features have the best chance of capturing the difference between classes of digits or characters

16 (Feature Extraction, continued)
–192 Gradient features (finest scale)
  Gradient is the angle perpendicular to the local direction of the contour boundary and is computed at every pixel
  Quantized to 12 different ranges of angles
  Histograms of occurrences of angles (ranges) are computed for each of the 16 regions in the 4x4 grid
  Histogram values that cross a threshold are turned “on”
–192 Structural features (intermediate scale)
  12 structures consisting of groups of pixels that form mini-strokes
  –horizontal stroke: upper and lower surfaces
  –vertical stroke: left and right surfaces
  –diagonal rising: upper and lower surfaces
  –diagonal falling: upper and lower surfaces
  –corners (4)
  If any pixel group falling in a region (4x4) satisfies the rule for a mini-stroke, the feature is “on”

17 (Feature Extraction, continued)
–128 Concavity features (coarsest scale)
  16 pixel density features
  –Does the percentage of “on” pixels in a region (4 x 4) exceed a threshold?
  32 large stroke features
  –Does a region (4 x 4) contain a horizontal run or vertical run of “on” pixels greater in length than a threshold?
  80 concavity features
  –Does a region (4 x 4) contain a concavity pointing up, down, left, or right, or an enclosed “hole”?
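The large stroke test for a single region can be sketched as below: one bit for a long horizontal run and one for a long vertical run, which over 16 regions gives the 32 features listed above. The helper names and the toy region are illustrative assumptions.

```python
def longest_run(bits):
    """Length of the longest run of consecutive "on" pixels."""
    best = current = 0
    for bit in bits:
        current = current + 1 if bit else 0
        best = max(best, current)
    return best

def large_stroke_bits(region, threshold):
    """Return (horizontal_bit, vertical_bit) for one region of 0/1 pixels."""
    horizontal = max(longest_run(row) for row in region)
    vertical = max(longest_run(col) for col in zip(*region))
    return int(horizontal > threshold), int(vertical > threshold)

# Hypothetical region containing one horizontal stroke.
region = [
    [1, 1, 1, 1],
    [0, 0, 0, 0],
]
bits = large_stroke_bits(region, threshold=2)
```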

18 Classification
–Identifies 6 nearest neighbors from among templates
–Takes the weighted vote of the neighbors, where each neighbor’s vote is weighted by its proximity to the test vector
–Performance of classifier is dependent on how representative the templates are of the set of “all possible” digits
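A weighted 6-nearest-neighbor vote over binary feature vectors could be sketched like this. Hamming distance and the `1 / (1 + distance)` weighting are assumptions; the slides say only that votes are weighted by proximity.

```python
from collections import defaultdict

def hamming(a, b):
    """Number of positions where two equal-length bit vectors differ."""
    return sum(x != y for x, y in zip(a, b))

def classify(test, templates, k=6):
    """templates: list of (bit_vector, label). Returns labels ranked by vote."""
    nearest = sorted(templates, key=lambda t: hamming(test, t[0]))[:k]
    votes = defaultdict(float)
    for vector, label in nearest:
        votes[label] += 1.0 / (1 + hamming(test, vector))  # closer = heavier vote
    return sorted(votes, key=votes.get, reverse=True)

# Hypothetical 4-bit templates; real GSC vectors have 512 bits.
templates = [
    ([1, 1, 0, 0], "7"), ([1, 1, 0, 1], "7"), ([0, 0, 1, 1], "2"),
    ([0, 0, 1, 0], "2"), ([1, 0, 0, 0], "1"), ([0, 1, 1, 1], "2"),
    ([1, 1, 1, 1], "8"),
]
ranking = classify([1, 1, 0, 0], templates)
```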

19 Gradient Features
Steps (Input to Output)
- Put a 4 x 4 non-uniform grid on the image by sampling at equimass divisions of the histogram
- Smooth the image by filtering
- Convolve the image with 3x3 Sobel operators to find the gradients
- Divide the range of direction into 12 non-overlapping regions, each of 2*pi/12 radians
- Do histogram-based thresholding for each sampling region
- In each of the 4x4 regions, if the number of pixels with gradient values in a particular range exceeds the threshold, set the corresponding bit in the feature vector to 1 (12-bit feature vector for each region, corresponding to the 12 bins of directions)
- 12x4x4 = 192-bit feature vector
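The Sobel and quantization steps can be sketched on a tiny image as follows. Smoothing and the non-uniform grid are left out, the threshold value is arbitrary, and `gradient_bins` and `region_bits` are hypothetical names.

```python
import math

SOBEL_X = [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]]
SOBEL_Y = [[-1, -2, -1], [0, 0, 0], [1, 2, 1]]

def gradient_bins(image, n_bins=12):
    """Map each interior pixel with a nonzero gradient to a direction bin."""
    bins = {}
    for r in range(1, len(image) - 1):
        for c in range(1, len(image[0]) - 1):
            gx = sum(SOBEL_X[i][j] * image[r + i - 1][c + j - 1]
                     for i in range(3) for j in range(3))
            gy = sum(SOBEL_Y[i][j] * image[r + i - 1][c + j - 1]
                     for i in range(3) for j in range(3))
            if gx == 0 and gy == 0:
                continue  # flat neighborhood: no direction to record
            angle = math.atan2(gy, gx) % (2 * math.pi)
            bins[(r, c)] = min(int(angle / (2 * math.pi / n_bins)), n_bins - 1)
    return bins

def region_bits(bin_values, threshold, n_bins=12):
    """Histogram the bins of one region and turn on counts above the threshold."""
    counts = [0] * n_bins
    for b in bin_values:
        counts[b] += 1
    return [int(count > threshold) for count in counts]

# Hypothetical 4x4 image with a vertical edge (0 = background, 1 = ink).
image = [[0, 0, 1, 1] for _ in range(4)]
bins = gradient_bins(image)
bits = region_bits(bins.values(), threshold=2)
```

Here the whole image is treated as one region; the full feature vector would concatenate the 12 bits from each of the 16 grid regions.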

20 Structural Features
Steps (Input to Output)
- Place a 4x4 fixed grid on the image
- Apply a set of 12 rules to each pixel to find the stroke and corner features
- For each of the 4x4 regions, is the number of pixels satisfying a rule > the threshold set for the rule?
  - YES: set the corresponding bit in the feature vector to 1 (12-bit feature vector for each region, signifying the 12 rules)
  - NO: set the corresponding bit in the feature vector to 0
- 12x4x4 = 192-bit feature vector as structural features

21 Concavity Features
Steps (Input to Output)
- Place a 4x4 fixed grid on the image
- Convolve the image with a starlike operator by shooting rays in 8 directions and determining what each ray hits
- Define eight types of pixels depending on the way the rays shot out from the pixel hit the boundary
- For each type of pixel define a threshold
- For each type of pixel set aside a bit in the feature vector for each of the regions
- Is (number of pixels of the corresponding type)/(area of region) > threshold set for the type of pixel?
  - YES: set the corresponding bit in the feature vector to 1
  - NO: set the corresponding bit in the feature vector to 0
- 8x4x4-bit feature vector as the concavity features
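The starlike operator can be sketched as a loop that walks outward from a background pixel in each of the 8 directions and records whether ink is hit before the image border. The ordering of `RAYS` and the interpretation of the hit pattern (all rays hit suggests an enclosed hole, only the upward ray escaping suggests a concavity pointing up) are assumptions.

```python
# (row step, column step) for 8 rays, starting east and going counter-clockwise.
RAYS = [(0, 1), (-1, 1), (-1, 0), (-1, -1), (0, -1), (1, -1), (1, 0), (1, 1)]

def ray_hits(image, r, c):
    """For each ray from (r, c), report whether it hits an "on" pixel."""
    hits = []
    for dr, dc in RAYS:
        rr, cc = r + dr, c + dc
        hit = False
        while 0 <= rr < len(image) and 0 <= cc < len(image[0]):
            if image[rr][cc]:
                hit = True
                break
            rr, cc = rr + dr, cc + dc
        hits.append(hit)
    return hits

# Hypothetical shapes: a closed ring and a "U" that is open at the top.
ring = [
    [1, 1, 1, 1, 1],
    [1, 0, 0, 0, 1],
    [1, 0, 0, 0, 1],
    [1, 0, 0, 0, 1],
    [1, 1, 1, 1, 1],
]
cup = [[1, 0, 0, 0, 1]] + ring[1:]
ring_hits = ray_hits(ring, 2, 2)
cup_hits = ray_hits(cup, 2, 2)
```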

22 Word Recognition Control
Input
- Word Image
- Lexicon
- Word Recognizer - 1 (WMR)
- Word Recognizer - 2 (CMR)
Flowchart steps
- Call WMR with expanded lexicon, giving WMR results
- Call CMR with n-best WMR choices (n<11), giving CMR results
- Classifiers concur? (YES / NO)
Flowchart branch labels
- conf = HI, conf = MED, conf = LO, conf(top) = MED
Output
- REJECT
- ACCEPT WMR top choice
- ACCEPT CMR top choice
- ACCEPT common top choice

23 WMR
Input
- Chaincode of Word Image
- Lexicon
Steps
- Over-segmentation of word into characters so that no two characters remain merged
- Features extracted from each segment
- Match one or more (up to four) segments with each character of a single lexicon entry
- Derive “goodness” of match between segments and a lexicon entry
- Score match for all lexicon entries
Output
- Rank the lexicon based on matching score

24 WMR Features
74 chaincode based features are extracted: 2 global and 72 local features. The distribution of the 8 directional slopes for 9 (3 x 3) sub-images forms the 72 local features.
–global features: $F^{g}_{i} = \mathrm{sigmoid}\left(\frac{H_i - V_i}{V_i}\right)$ for $i = 1, 2$, where
  $H_1 = X_{max} - X_{min}$, $V_1 = Y_{max} - Y_{min}$ for aspect ratio
  $H_2 = N_{horizontal\_stroke}$, $V_2 = N_{vertical\_stroke}$ for stroke ratio
–local features: $F^{l}_{ij} = \frac{s_{ij}}{N_i S_j}$ for $i = 1, 2, \ldots, 9$ and $j = 0, 1, \ldots, 7$, where
  $s_{ij}$ = number of components with slope $j$ from sub-image $i$
  $N_i$ = number of components from sub-image $i$
  $S_j = \max_i \left(\frac{s_{ij}}{N_i}\right)$
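A small numeric sketch of these features, reading the garbled formulas as a global feature sigmoid((H - V) / V) and a local feature s_ij / (N_i * S_j) with S_j the maximum of s_ij / N_i over sub-images. The logistic form of the sigmoid and the zero-count guards are assumptions, and the example uses a 2 x 2 count table instead of the full 9 x 8 one.

```python
import math

def global_feature(h, v):
    """sigmoid((H - V) / V), assuming the standard logistic sigmoid and v > 0."""
    return 1.0 / (1.0 + math.exp(-(h - v) / v))

def local_features(s):
    """s[i][j]: number of components with slope j in sub-image i."""
    totals = [sum(row) for row in s]                        # N_i
    ratios = [[(sij / n if n else 0.0) for sij in row]      # s_ij / N_i
              for row, n in zip(s, totals)]
    peaks = [max(row[j] for row in ratios)                  # S_j
             for j in range(len(s[0]))]
    return [[(row[j] / peaks[j] if peaks[j] else 0.0) for j in range(len(row))]
            for row in ratios]

# Hypothetical counts: 2 sub-images, 2 slope codes.
counts = [[2, 2], [1, 3]]
local = local_features(counts)
aspect = global_feature(10, 10)
```

Dividing by S_j scales each slope column so that its largest value is 1, which keeps all 72 local features in the range 0 to 1.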

25 WMR
Figure: segmentation lattice over positions 1 to 9 with candidate character distances: w[7.6] w[7.2] r[3.8] w[5.0] w[8.6] o[7.6] r[6.3] d[4.9] w[5.0] o[6.6] o[6.0] o[7.2] o[10.6] d[6.5] d[4.4] r[7.5] r[6.4] o[7.8] r[8.6] o[8.7] r[7.4] r[7.6] o[8.3] o[7.7] r[5.8] o[6.1]
- Find the best way of accounting for characters ‘w’, ‘o’, ‘r’, ‘d’ by consuming all segments 1 to 8 in the process
- Distance between the first character ‘w’ of lexicon entry ‘word’ and the image between:
  - segments 1 and 4 is 5.0
  - segments 1 and 3 is 7.2
  - segments 1 and 2 is 7.6
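Finding the best way to consume all segments is a natural fit for dynamic programming. The sketch below assumes the score of a lexicon entry is the sum of per-character distances (lower is better) and that `dist` is supplied by a character matcher; the toy distance table is made up and is not the lattice from the slide.

```python
def best_match(word, n_segments, dist, max_span=4):
    """dist(ch, start, end): distance between ch and segments start..end (1-based)."""
    INF = float("inf")
    # best[k][e]: lowest total distance for the first k characters
    # when they consume exactly segments 1..e.
    best = [[INF] * (n_segments + 1) for _ in range(len(word) + 1)]
    best[0][0] = 0.0
    for k, ch in enumerate(word, start=1):
        for end in range(1, n_segments + 1):
            for span in range(1, max_span + 1):   # a character uses 1 to 4 segments
                start = end - span + 1
                if start < 1:
                    break
                previous = best[k - 1][start - 1]
                if previous < INF:
                    cost = previous + dist(ch, start, end)
                    best[k][end] = min(best[k][end], cost)
    return best[len(word)][n_segments]

# Hypothetical distances for a two-letter entry over four segments.
table = {("a", 1, 1): 3.0, ("a", 1, 2): 1.0, ("b", 2, 4): 5.0, ("b", 3, 4): 2.0}
score = best_match("ab", 4, lambda ch, s, e: table.get((ch, s, e), 100.0))
```

Running this for every lexicon entry and sorting by the returned score would give the ranked lexicon.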

26 CMR
Input
- Chaincode of word image
- Lexicon
Steps
- Over-segmentation of characters so that no two characters remain merged
- Features extracted from each segment
- Recognize one or more (up to four) segments as a single character of the alphabet
- Obtain character strings (ASCII) corresponding to the segments in the word image
- Derive “goodness” of match between character string and lexicon entries
Output
- Rank the lexicon based on “goodness” score
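The slides do not say how the "goodness" of match between a recognized string and a lexicon entry is computed. As a stand-in, the sketch below ranks entries by Levenshtein edit distance; this is a swapped-in technique for illustration, not the CMR scoring itself.

```python
def edit_distance(a, b):
    """Levenshtein distance between strings a and b."""
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            current.append(min(previous[j] + 1,               # delete from a
                               current[j - 1] + 1,            # insert into a
                               previous[j - 1] + (ca != cb))) # substitute
        previous = current
    return previous[-1]

def rank_lexicon(recognized, lexicon):
    """Order lexicon entries from best to worst match."""
    return sorted(lexicon, key=lambda entry: edit_distance(recognized, entry))

# Hypothetical recognizer output and lexicon.
ranked = rank_lexicon("wora", ["ward", "lord", "word", "world"])
```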

27 CMR
Figure: recognition graph over the segments with candidate characters and confidences: i[.8], l[.8] u[.5], v[.2] w[.6], m[.3] w[.7] i[.7] u[.3] m[.2] m[.1] r[.4] d[.8] o[.5]
- Image from segment 1 to 3 is a ‘u’ with 0.5 confidence
- Image from segment 1 to 4 is a ‘w’ with 0.7 confidence
- Image from segment 1 to 5 is a ‘w’ with 0.6 confidence and an ‘m’ with 0.3 confidence
- Find the best path in the graph from segment 1 to 8 (w o r d)
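A best-path search over such a graph can be sketched as below. It assumes nodes are positions along the word, each edge carries one character hypothesis, and a path is scored by the product of its confidences; the edge list is a made-up example loosely modeled on the slide, not a transcription of its figure.

```python
def best_path(edges, first, last):
    """edges: (start, end, char, confidence) with end > start.

    Returns (score, text) for the highest-scoring path, or None if unreachable.
    """
    best = {first: (1.0, "")}
    # Sorting by start visits every edge into a node before any edge out of it.
    for start, end, char, confidence in sorted(edges):
        if start in best:
            score, text = best[start]
            candidate = (score * confidence, text + char)
            if end not in best or candidate[0] > best[end][0]:
                best[end] = candidate
    return best.get(last)

# Hypothetical graph over positions 1 to 8.
edges = [
    (1, 4, "w", 0.7), (1, 3, "u", 0.5), (3, 4, "i", 0.7),
    (4, 5, "o", 0.5), (5, 6, "r", 0.4), (6, 8, "d", 0.8),
]
score, text = best_path(edges, 1, 8)
```

Here the single ‘w’ edge (0.7) beats the ‘u’ plus ‘i’ alternative (0.5 x 0.7 = 0.35), so the best path spells "word".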

28 Hover System
Figure: image features (img ftrs) are compared with lexicon features (lex ftrs).
- Levels: word level, phrase level
- Match tests: length, gaps, word lengths, ascenders, descenders
- Match results are combined (+) into an accept or reject decision

