Text Detection in Images and Video

1 Text Detection in Images and Video

2 Outline
Background
Video Demo (Google Translate, HuayuNavi, Camcard)
Optical character recognition (OCR)
ICDAR Competition
Part I: Caption detection in video
Part II: Scene text detection
Conclusion

3 Background
Detecting text and captions in videos is important and in great demand for video retrieval, annotation, indexing, and content analysis. Text provides high-level semantic information such as program name, speaker name, speech content, sports scores, date, time, and location. Scene text detection attempts to extract textual information from images and videos in more natural settings.

4 Possible Applications
Annotation, tagging, indexing, search
Translation
Navigation
Book cover recognition
License plate recognition
Computerized aid for the visually impaired
Automatic geocoding of businesses

5 Demo Video
Google Translate, HuayuNavi, Camcard

6 OCR Tesseract OCR (pptx file)

7 ICDAR Competitions
International Conference on Document Analysis and Recognition (ICDAR) Competitions
Robust Reading Competition, Challenge 2: Reading Text in Scene Images

8 Text Localization Task

9 Word Recognition Task

10 Part I: Caption Detection
Text From Corners: A Novel Approach to Detect Text and Caption in Videos, IEEE Transactions on Image Processing, 2011.

11 Existing Methods
Most existing approaches can be classified into three categories:
texture-based methods
connected-component-based methods
edge-based methods

12 Corner Features
Corner points offer three advantages:
Corners are frequent and essential patterns in text regions.
The distribution of corner points in text regions is usually orderly.
Using corner points yields more flexible and efficient criteria, under which the margin between text and non-text regions in the feature space is discriminative.

13 Corner Extraction

14 Feature Description (1/2)
Morphological dilation is applied to the binary corner image. Corner points are dense and usually regularly placed along a horizontal string, so dilation merges them into connected regions. Text can then be detected effectively from the shape properties of the regions formed.
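The dilation step described above can be illustrated with a minimal sketch. It is not the paper's implementation: the rectangular structuring element, its size, the toy image, and the helper names `dilate` and `count_regions` are all assumptions chosen for illustration.

```python
def dilate(image, half_height, half_width):
    """Binary dilation with a (2*half_height+1) x (2*half_width+1) rectangle.

    The rectangle and its size are illustrative assumptions.
    """
    rows, cols = len(image), len(image[0])
    out = [[0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            if image[r][c]:
                # Paint the structuring element around each foreground pixel.
                for rr in range(max(0, r - half_height), min(rows, r + half_height + 1)):
                    for cc in range(max(0, c - half_width), min(cols, c + half_width + 1)):
                        out[rr][cc] = 1
    return out


def count_regions(image):
    """Count 4-connected foreground regions with an iterative flood fill."""
    rows, cols = len(image), len(image[0])
    seen = [[False] * cols for _ in range(rows)]
    regions = 0
    for r in range(rows):
        for c in range(cols):
            if image[r][c] and not seen[r][c]:
                regions += 1
                seen[r][c] = True
                stack = [(r, c)]
                while stack:
                    y, x = stack.pop()
                    for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                        if (0 <= ny < rows and 0 <= nx < cols
                                and image[ny][nx] and not seen[ny][nx]):
                            seen[ny][nx] = True
                            stack.append((ny, nx))
    return regions


# Toy binary corner image: four corner points along one horizontal string.
corners = [
    [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
    [0, 1, 0, 0, 1, 0, 0, 1, 0, 0, 1, 0],
    [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
]
merged = dilate(corners, half_height=1, half_width=2)
```

In this toy case the four isolated corner points become a single connected region after dilation, which is the kind of region whose shape properties are examined next.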

15 Feature Description (2/2)
Five region properties:
Area -> Ra
Saturation -> Rs
Orientation -> Ro
Aspect ratio -> Ras
Position -> Rc
Bounding box: the smallest rectangle that completely encloses a region formed by the corner points.

16 Area
The area of a region is defined as the number of foreground pixels in the region enclosed by a rectangular bounding box.

17 Saturation
Saturation specifies the proportion of the pixels in the bounding box that belong to the region. It can be calculated as the region's area Ra divided by the area of its bounding box.
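Area and saturation as defined on these slides can be sketched as follows. This is a minimal illustration, assuming a region is given as a set of (row, column) foreground pixels; the function names and the toy regions are hypothetical and do not come from the paper.

```python
def bounding_box(pixels):
    """Return (top, left, bottom, right) of the smallest enclosing rectangle."""
    rows = [r for r, _ in pixels]
    cols = [c for _, c in pixels]
    return min(rows), min(cols), max(rows), max(cols)


def area(pixels):
    """Area: number of foreground pixels in the region."""
    return len(pixels)


def saturation(pixels):
    """Saturation: region area divided by bounding-box area (inclusive pixel counts)."""
    top, left, bottom, right = bounding_box(pixels)
    box_area = (bottom - top + 1) * (right - left + 1)
    return area(pixels) / box_area


# An L-shaped region fills only part of its 3 x 3 bounding box.
l_shape = {(0, 0), (1, 0), (2, 0), (2, 1), (2, 2)}

# A solid 2 x 6 block fills its bounding box completely.
solid = {(r, c) for r in range(2) for c in range(6)}
```

A densely filled region, such as a dilated text line, has saturation close to 1, while a sparse or irregular region scores lower.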

18 Orientation
Orientation is defined as the angle between the x-axis and the major axis of the ellipse that has the same second moments as the region.
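One common way to compute such an angle uses the second central moments of the region. The sketch below assumes the ellipse in question is the one with the same second moments as the region, and that pixels are given as (x, y) points with y increasing upward; in image coordinates, where rows grow downward, the sign of the angle flips. The function name is hypothetical.

```python
import math


def orientation_degrees(points):
    """Angle in degrees between the x-axis and the major axis of the region.

    Uses theta = 0.5 * atan2(2 * mu11, mu20 - mu02), where mu20, mu02 and
    mu11 are the second central moments of the point set.
    """
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    mu20 = sum((x - mean_x) ** 2 for x, _ in points) / n
    mu02 = sum((y - mean_y) ** 2 for _, y in points) / n
    mu11 = sum((x - mean_x) * (y - mean_y) for x, y in points) / n
    return math.degrees(0.5 * math.atan2(2 * mu11, mu20 - mu02))


horizontal = [(i, 0) for i in range(10)]      # flat, like a horizontal caption
rising = [(i, i) for i in range(10)]          # diagonal going up to the right
falling = [(i, -i) for i in range(10)]        # diagonal going down to the right
```

Horizontal captions give an angle near 0 degrees, so a threshold on this value is one plausible way to reject strongly slanted non-text regions.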

19 Aspect Ratio and Position
Aspect Ratio: Aspect Ratio of a bounding box is defined as the ratio of its width to its height. Position: We describe the position of a region with its centroid.
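Aspect ratio and position can be sketched in the same style. As before, this assumes a region is a set of (row, column) foreground pixels; the function names and the example region are illustrative only.

```python
def aspect_ratio(pixels):
    """Width of the bounding box divided by its height (inclusive pixel counts)."""
    rows = [r for r, _ in pixels]
    cols = [c for _, c in pixels]
    width = max(cols) - min(cols) + 1
    height = max(rows) - min(rows) + 1
    return width / height


def centroid(pixels):
    """Position of the region as the mean (row, column) of its pixels."""
    n = len(pixels)
    mean_row = sum(r for r, _ in pixels) / n
    mean_col = sum(c for _, c in pixels) / n
    return mean_row, mean_col


# A wide, short block such as a dilated caption line: rows 3-4, columns 10-15.
caption_like = {(r, c) for r in range(3, 5) for c in range(10, 16)}
```

Caption lines are typically much wider than they are tall, so they produce aspect ratios well above 1, and the centroid records where in the frame the region sits.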

20 Language-independent

21 Part II: Scene Text Detection
Paper 1: Detecting text in natural scenes with stroke width transform, CVPR 2010.
Paper 2: How salient is scene text? Proceedings of the IAPR International Workshop on Document Analysis Systems.
Paper 3: Stroke Filter

22 Paper 1 Detecting text in natural scenes with stroke width transform, CVPR 2010.

23 Detected Text Area in Natural Scenes

24 The Extraction Process

25 Stroke Width Transform (SWT)

26 Flowchart

27 Performance Evaluation

28 Paper 2 How Salient is Scene Text?

29 How Salient is Scene Text?
Comparing the performance of four attention models in scene text detection, with Torralba's model evaluated on both color and intensity:
Torralba's saliency map (color)
Torralba's saliency map (intensity)
Harel's GBVS
Zhang's Fast Saliency
Itti's saliency map (N2P2CI)

30 Input and Ground Truth

31 Results

32 Performance Comparison

33 Stroke Filter (1/3) Original

34 Results

35 Stroke Filter (2/3) Improved

36 Stroke Filter (3/3) Fast

37 Conclusions
Caption detection is the easier problem because captions are artificially overlaid on the video.
Robust scene text detection is more challenging because the environment is unconstrained.
Detecting Chinese text in natural images is still an open problem.

