Spatial Business Detection and Recognition from Images Alexander Darino Weeks 10 & 11.


1 Spatial Business Detection and Recognition from Images Alexander Darino Weeks 10 & 11

2 STR Implementation: “Automatic Detection and Recognition of Signs From Natural Scenes”. Steps: multiresolution-based potential character detection; character/layout geometry and color properties analysis; local affine rectification; refined detection.

3 One font per classifier, covering a-z and A-Z. Generate alphabet templates; resize and center the templates; divide each into a 7x7 grid. Apply several 2D Gabor filters to each grid patch (different orientations, frequencies, and variances); for each pixel, a filter yields the real and imaginary components of the transformation. Feed the data into Linear Discriminant Analysis, which reduces the features and forms the classifier at the same time.
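The grid step above can be sketched in a few lines. This is only an illustration, assuming a template stored as a 2D list whose height and width are multiples of 7; the function name `split_into_grid` and the 28x28 checkerboard template are hypothetical and do not come from the slides.

```python
def split_into_grid(image, rows=7, cols=7):
    """Split a 2D list into rows x cols equally sized patches (row-major order)."""
    height, width = len(image), len(image[0])
    if height % rows or width % cols:
        raise ValueError("image size must be a multiple of the grid size")
    patch_h, patch_w = height // rows, width // cols
    patches = []
    for r in range(rows):
        for c in range(cols):
            # Slice out the block of rows, then the block of columns within each row.
            block = image[r * patch_h:(r + 1) * patch_h]
            patches.append([row[c * patch_w:(c + 1) * patch_w] for row in block])
    return patches


# Hypothetical 28x28 template (a checkerboard stands in for a rendered letter).
template = [[(x + y) % 2 for x in range(28)] for y in range(28)]
patches = split_into_grid(template)
```

Each of the 49 patches would then be filtered and handed to its own classifier, as the slide describes.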

4 2D Gabor Filter: a Gaussian envelope multiplied by a sine wave, applied to the image by convolution
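A minimal sketch of a complex 2D Gabor kernel, using the common textbook parameterization (wavelength, orientation `theta`, Gaussian width `sigma`, aspect ratio `gamma`, phase `psi`). The slides do not give the parameter values that were used, so the ones in the bank below are placeholders.

```python
import cmath
import math


def gabor_kernel(size, wavelength, theta, sigma, gamma=1.0, psi=0.0):
    """Return a size x size complex Gabor kernel as a list of lists (size must be odd)."""
    if size % 2 == 0:
        raise ValueError("size must be odd so the kernel has a center pixel")
    half = size // 2
    kernel = []
    for y in range(-half, half + 1):
        row = []
        for x in range(-half, half + 1):
            # Rotate coordinates so the wave travels along direction theta.
            xr = x * math.cos(theta) + y * math.sin(theta)
            yr = -x * math.sin(theta) + y * math.cos(theta)
            envelope = math.exp(-(xr ** 2 + (gamma * yr) ** 2) / (2 * sigma ** 2))
            carrier = cmath.exp(1j * (2 * math.pi * xr / wavelength + psi))
            row.append(envelope * carrier)  # real part: cosine, imaginary part: sine
        kernel.append(row)
    return kernel


# Placeholder bank: 4 orientations x 2 wavelengths.
bank = [gabor_kernel(7, wl, k * math.pi / 4, sigma=2.0)
        for k in range(4) for wl in (4.0, 8.0)]
```

Convolving a patch with each kernel in the bank gives the real and imaginary responses mentioned on the previous slide.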

5 Training Process

6 Character Determination: Each grid patch has its own LDA classifier; each classifier returns a vector of probabilities, one per symbol. To classify the overall character, recursively consider all 9-neighborhoods and multiply the corresponding probabilities together. When only one grid patch remains, the highest probability wins.
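One possible reading of the recursive step is sketched below. It assumes that each 3x3 neighborhood collapses into a single cell, so a 7x7 grid shrinks to 5x5, then 3x3, then 1x1; the slides do not spell this out, and the renormalization after each product is an addition to keep the numbers from shrinking toward zero. The function names are hypothetical.

```python
def combine_neighborhoods(grid):
    """Collapse every 3x3 neighborhood into one cell by multiplying probabilities."""
    n = len(grid)
    k = len(grid[0][0])  # number of symbols
    out = []
    for r in range(n - 2):
        out_row = []
        for c in range(n - 2):
            probs = [1.0] * k
            for dr in range(3):
                for dc in range(3):
                    cell = grid[r + dr][c + dc]
                    probs = [p * q for p, q in zip(probs, cell)]
            total = sum(probs)
            # Renormalize (assumption); fall back to uniform if everything is zero.
            out_row.append([p / total for p in probs] if total > 0 else [1.0 / k] * k)
        out.append(out_row)
    return out


def classify(grid, symbols):
    """Reduce an odd-sized square grid of probability vectors to one symbol."""
    while len(grid) > 1:
        grid = combine_neighborhoods(grid)
    final = grid[0][0]
    return symbols[max(range(len(final)), key=final.__getitem__)]


# Toy input: every patch classifier slightly prefers 'G' over 'B'.
toy_grid = [[[0.6, 0.4] for _ in range(7)] for _ in range(7)]
label = classify(toy_grid, "GB")
```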

7

8 Recognition Process. Color properties analysis: choose the channel with the highest confidence of distinguishing foreground from background. Binarization: threshold at 50% of the value given by Otsu’s method. Intermediate representation: trim, resize, and center the binary image. Perform OCR on variations of the intermediate representation: stretched, eroded (Gaussian-based), and dilated. Aggregate and return the votes.
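The binarization step can be sketched as follows: a plain Otsu threshold on an 8-bit channel, scaled by 0.5 before it is applied. The polarity (bright pixels count as foreground) and the toy pixel values are assumptions made for the example, not details from the slides.

```python
def otsu_threshold(pixels, levels=256):
    """Return the gray level that maximizes between-class variance (Otsu's method)."""
    hist = [0] * levels
    for p in pixels:
        hist[p] += 1
    total = len(pixels)
    sum_all = sum(level * count for level, count in enumerate(hist))
    best_t, best_var = 0, -1.0
    w_bg, sum_bg = 0, 0
    for t in range(levels):
        w_bg += hist[t]
        if w_bg == 0:
            continue  # no pixels at or below t yet
        w_fg = total - w_bg
        if w_fg == 0:
            break  # every pixel is already in the background class
        sum_bg += t * hist[t]
        mean_bg = sum_bg / w_bg
        mean_fg = (sum_all - sum_bg) / w_fg
        between = w_bg * w_fg * (mean_bg - mean_fg) ** 2
        if between > best_var:
            best_var, best_t = between, t
    return best_t


def binarize(pixels, scale=0.5):
    """Mark pixels brighter than scale * Otsu threshold as foreground (1)."""
    threshold = scale * otsu_threshold(pixels)
    return [1 if p > threshold else 0 for p in pixels]


# Toy channel: dark background, a faint blurred edge, bright strokes.
channel = [20] * 60 + [90] * 10 + [200] * 30
mask = binarize(channel)
```

On this toy data the halved threshold keeps the faint edge pixels as foreground, while the unscaled Otsu threshold would drop them.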

9 Recognition Process Example: “G” using Trebuchet-MS Classifier Query Character (Actual Size) Intermediate Representation (Actual Size)

10 abcdefghijklmno pqrstuvwxyz ABCDEFGHIJKLMN OPQRSTUVWXYZ

11 Recognition Process Example: “G” using Trebuchet-MS Classifier Variation (Actual Size) Identified Character: g Variation (Actual Size) Identified Character: s Variation (Actual Size) Identified Character: G

12 Recognition Process Example: “G” using Trebuchet-MS Classifier Variation (Actual Size) Identified Character: g Variation (Actual Size) Identified Character: g Variation (Actual Size) Identified Character: B

13 Recognition Process Example: “G” using Trebuchet-MS Classifier Variation (Actual Size) Identified Character: G Variation (Actual Size) Identified Character: G Variation (Actual Size) Identified Character: B

14 Recognition Process Example: “G” using Trebuchet-MS Classifier Variation (Actual Size) Identified Character: B Variation (Actual Size) Identified Character: B Variation (Actual Size) Identified Character: G

15 Recognition Process Example: “G” using Trebuchet-MS Classifier Variation (Actual Size) Identified Character: G Variation (Actual Size) Identified Character: B Variation (Actual Size) Identified Character: a

16 Recognition Process Example: “G” using Trebuchet-MS Classifier. Final Results: B: 5/15; G: 5/15; g: 3/15; a: 1/15 (6.7%); s: 1/15 (6.7%)
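The tally above can be reproduced from the fifteen per-variation results listed on slides 11 to 15. The sketch below is only a counting helper; `aggregate_votes` and `is_tie` are hypothetical names.

```python
from collections import Counter


def aggregate_votes(predictions):
    """Return (symbol, count, share) tuples, most frequent first."""
    counts = Counter(predictions)
    total = len(predictions)
    return [(symbol, n, n / total) for symbol, n in counts.most_common()]


def is_tie(results):
    """True when the two leading symbols received the same number of votes."""
    return len(results) > 1 and results[0][1] == results[1][1]


# Identified characters in slide order (slides 11 to 15, three per slide).
votes = list("gsGggBGGBBBGGBa")
results = aggregate_votes(votes)
tally = {symbol: n for symbol, n, _ in results}
```

For this example `is_tie(results)` is true, which is exactly the G/B ambiguity the slide shows.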

17 “GEORGE” (Trebuchet-MS) Votes: E: 14/15 t: 1/15

18 “GEORGE” (Trebuchet-MS) Votes: j: 13/15 i: 2/15. ‘j’ is the default when unable to decide. Should invert during preprocessing.

19 “GEORGE” (Trebuchet-MS) Votes: j: 13/15 i: 1/15 M: 1/15. ‘j’ is the default when unable to decide. Should invert during preprocessing.

20 “GEORGE” (Trebuchet-MS) Votes: B: 5/15 G: 5/15 g: 3/15 a: 1/15 s: 1/15

21 “GEORGE” (Trebuchet-MS) Votes: j: 12/15 Y: 2/15 X: 1/15. ‘j’ is the default when unable to decide. Should invert during preprocessing or training.

22 Note on the “Inversion Problem”: Easy to fix; a common problem in OCR systems. Will likely detect and correct it during the preprocessing stage rather than during training. More training data: slower, less reliable. Preprocessing: like trying many different lenses at the eye doctor and taking your best guess with each lens.
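One simple way inversion could be detected during preprocessing is a border heuristic: if most border pixels of the binary image are foreground, the polarity is probably flipped. This is an assumption made for illustration, not the method chosen in the slides, and it presumes the character is centered with some margin around it.

```python
def needs_inversion(binary):
    """Guess that polarity is flipped when most border pixels are foreground (1)."""
    h, w = len(binary), len(binary[0])
    border = [binary[r][c]
              for r in range(h) for c in range(w)
              if r in (0, h - 1) or c in (0, w - 1)]
    return sum(border) * 2 > len(border)


def normalize_polarity(binary):
    """Return a copy of the image with foreground = 1, inverting if needed."""
    if needs_inversion(binary):
        return [[1 - p for p in row] for row in binary]
    return [row[:] for row in binary]


# Toy 5x5 image stored with flipped polarity: only the center pixel is 0.
flipped = [[1] * 5 for _ in range(5)]
flipped[2][2] = 0
fixed = normalize_polarity(flipped)
```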

23 “BAKERY” (Actual: ‘Tw-Cen-MT’, Used: ‘Arial’) abcdefghijklmnopqrstuvwxyz ABCDEFGHIJKLMNOPQRSTUVWXYZ abcdefghijklmnopqrstuvwxyz ABCDEFGHIJKLMNOPQRSTUVWXYZ

24 “BAKERY” (Actual: ‘Tw-Cen-MT’, Used: ‘Arial’) Votes: B: 9/15 j: 3/15 H: 2/15 F: 1/15

25 “BAKERY” (Actual: ‘Tw-Cen-MT’, Used: ‘Arial’) Votes: A: 9/15 j: 5/15 n: 1/15

26 “BAKERY” (Actual: ‘Tw-Cen-MT’, Used: ‘Arial’) Votes: K: 12/15 j: 2/15 H: 1/15

27 “BAKERY” (Actual: ‘Tw-Cen-MT’, Used: ‘Arial’) Votes: E: 5/15 j: 3/15 L: 3/15 r: 2/15 F: 2/15

28 “BAKERY” (Actual: ‘Tw-Cen-MT’, Used: ‘Arial’) Votes: p: 12/15 j: 3/15 PR

29 “BAKERY” (Actual: ‘Tw-Cen-MT’, Used: ‘Arial’) Votes: Y: 12/15 j: 3/15

30 “UNIVERSITY” (Used: Times New Roman) abcdefghijklmnopqrstuvwxyz ABCDEFGHIJKLMNOPQRSTUVWXYZ

31 “UNIVERSITY” (Used: Times New Roman) Votes: U: 8/15 C: 3/15 j: 2/15 s: 1/15 O: 1/15

32 “UNIVERSITY” (Used: Times New Roman) Votes: N: 12/15 j: 3/15

33 “UNIVERSITY” (Used: Times New Roman) Votes: l(‘el’): 9/15 I(‘eye’): 6/15

34 “UNIVERSITY” (Used: Times New Roman) Votes: v: 9/15 j: 3/15 V: 3/15

35 “UNIVERSITY” (Used: Times New Roman) Votes: F: 9/15 L: 5/15 l (‘el’): 1/15

36 “UNIVERSITY” (Used: Times New Roman) Votes: G: 9/15 j: 6/15

37 “UNIVERSITY” (Used: Times New Roman) Votes: j: 12/15 x: 2/15 w: 1/15

38 “UNIVERSITY” (Used: Times New Roman) Votes: j: 5/15 C: 4/15 O: 4/15 x: 2/15

39 “UNIVERSITY” (Used: Times New Roman) Votes: T: 9/15 l: 3/15 i: 1/15 j: 1/15 L: 1/15

40 “UNIVERSITY” (Used: Times New Roman) Votes: Y: 10/15 j: 3/15 i: 2/15

41 Evaluation: The biggest weaknesses are in the preprocessing stage: OCR is sensitive to thresholding and color inversion, and color modeling occasionally chooses a bad channel for OCR (this happens more often on low-resolution images). Works surprisingly well for low-resolution images. The font does not need to be exact, but the proportions need to be roughly the same.

42 How do I use this information?

43 The Big Picture (diagram; labels only): Latitude, Longitude; Geocoding; Reverse Geocoding; Nearby Businesses; Image; STR; Detected Text; Business Name Matching; Business Identification; Business Spatial Detection

44 Old Approach: Form words from the highest-voted characters. Compare them to the lexicon using Levenshtein distance. Use the existing ranking system afterwards. Examples: BOKFRY > BAKERY (L-DIST = 2); GFQRGF > GEORGE (L-DIST = 3).
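The lexicon comparison can be sketched with a standard dynamic-programming Levenshtein distance. The three-word lexicon is a stand-in built from the words used elsewhere in the slides; `closest_word` is a hypothetical helper, and the existing ranking system is not modeled here.

```python
def levenshtein(a, b):
    """Edit distance with unit-cost insertions, deletions, and substitutions."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # match or substitute
        prev = curr
    return prev[-1]


def closest_word(query, lexicon):
    """Return the lexicon entry with the smallest distance to the query."""
    return min(lexicon, key=lambda word: levenshtein(query, word))


lexicon = ["GEORGE", "BAKERY", "UNIVERSITY"]  # stand-in lexicon
match = closest_word("BOKFRY", lexicon)
```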

45 New Approach (Lexicon-assisted STR): Minimize the Levenshtein distance over the best combination of voted characters, one candidate per position. Use the existing ranking system afterwards. Voted characters by position:
B O K F P Y
G U H E R I
J A j L I l
>>> BAKERY (L-DIST = 0)
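A sketch of how the minimization could work without enumerating every combination: run the usual edit-distance recurrence, but treat a substitution as free whenever the lexicon letter appears among the candidates voted for that position. The candidate sets below read the letters on this slide as three rows of six, one column per character position; that reading, the case-sensitive comparison, and the function names are assumptions.

```python
def lattice_distance(candidates, word):
    """Smallest edit distance between `word` and any string formed by
    picking one candidate character per position."""
    prev = list(range(len(word) + 1))
    for i, options in enumerate(candidates, start=1):
        curr = [i]
        for j, ch in enumerate(word, start=1):
            cost = 0 if ch in options else 1  # free if any voted character matches
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost))
        prev = curr
    return prev[-1]


def best_match(candidates, lexicon):
    """Return (word, distance) for the closest lexicon entry."""
    return min(((word, lattice_distance(candidates, word)) for word in lexicon),
               key=lambda pair: pair[1])


# Columns of the slide's letter grid: voted characters for each position.
candidates = ["BGJ", "OUA", "KHj", "FEL", "PRI", "YIl"]
result = best_match(candidates, ["GEORGE", "BAKERY", "UNIVERSITY"])
```

Because each position contributes at most one character to an alignment, the per-position choices are independent, so this recurrence gives the same minimum as trying all combinations.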

46 The End Result: Bruegger's Bagels. Category: Bagels. Address: Market Sq, Pittsburgh, PA 15222. Phone: (412) 281-2515. Rating: Not Rated.

47 Next Steps: Fix STR preprocessing (bug in the color modeling code found online; inversion determination; multiple thresholds). Word matching: generate templates of words/logos instead of letters. Text detector: fix character/word fragmentation by reading papers that address the issue.

48 Next Steps: Test more images; fix problems as they arise. Ideas to consider: feed grid-patch probability vectors into an SVM instead of “smoothing”; generate “disambiguation classifiers” to differentiate between the top contending votes (remember how ‘G’ and ‘B’ got confused? Dynamically create a classifier to tell them apart) and between commonly confused letters, e.g. E/F, l/i/j, o/c; don’t consider statistically insignificant confidences.

49 Next Steps: Text Detection. Look into it after more work has been done on STR. Issues to address: intracharacter segmentation, intercharacter segmentation, word segmentation. Needed to make the STR system automated like before.

50 Thank You

