
Presentation transcript:

A New Approach for Overlay Text Detection and Extraction From Complex Video Scene
Wonjun Kim and Changick Kim, Member, IEEE

Abstract Overlay text provides important semantic clues for video content analysis tasks such as video information retrieval and summarization, since the content of the scene or the editor's intention can be well represented by the inserted text. Most previous approaches to extracting overlay text from videos are based on low-level features, such as edge, color, and texture information. However, existing methods experience difficulties in handling text with varying contrast or text inserted into a complex background.

II. OVERLAY TEXT REGION DETECTION

A. Transition Map Generation As a rule of thumb, if the background of overlay text is dark, then the overlay text tends to be bright; conversely, the overlay text tends to be dark if its background is bright. The intensities of three consecutive pixels decrease logarithmically at the boundary of bright overlay text due to color bleeding caused by lossy video compression. It is also observed that the intensities of three consecutive pixels increase exponentially at the boundary of dark overlay text.

A. Transition Map Generation

A. Transition Map Generation

A. Transition Map Generation The threshold value TH is empirically set to 80 in consideration of the logarithmical change.
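The exact weighting used to detect the logarithmic/exponential intensity change is not legible on the slides, so the following is only a simplified sketch of the idea: flag a pixel as a transition pixel when the combined intensity change across three consecutive horizontal pixels exceeds TH = 80. The function name and the plain sum of absolute differences are assumptions, not the paper's formula.

```python
import numpy as np

def transition_map(gray, th=80):
    """Simplified transition-map sketch: mark pixel x as a transition
    pixel when the intensity changes across the triple (x-1, x, x+1)
    together exceed th. The paper weights these differences to model
    logarithmic/exponential change; a plain sum is used here."""
    gray = gray.astype(np.int32)
    h, w = gray.shape
    tmap = np.zeros((h, w), dtype=np.uint8)
    d1 = np.abs(gray[:, 1:-1] - gray[:, :-2])   # |I(x) - I(x-1)|
    d2 = np.abs(gray[:, 2:] - gray[:, 1:-1])    # |I(x+1) - I(x)|
    tmap[:, 1:-1] = ((d1 + d2) > th).astype(np.uint8)
    return tmap
```

A sharp text boundary (e.g., intensity jumping from 0 to 200) produces a combined change well above 80, while smooth background gradients stay below it.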

B. Candidate Region Extraction If the gap between two nonzero points in the same row is shorter than 5% of the image width, the pixels in between are filled with 1s. Connected components smaller than a threshold value are then removed. The minimum size of an overlay text region, used to remove small components, is set to 300. The threshold value for the overlay text region update is set to 500.
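The two steps above (horizontal gap filling, then small-component removal) can be sketched as follows; the function name and the BFS-based 8-connected labeling are implementation choices of this sketch, not taken from the paper:

```python
import numpy as np
from collections import deque

def extract_candidates(tmap, min_size=300):
    """Sketch of candidate-region extraction: fill horizontal gaps
    shorter than 5% of the image width between transition pixels,
    then drop 8-connected components smaller than min_size pixels."""
    h, w = tmap.shape
    max_gap = int(0.05 * w)
    filled = tmap.copy()
    for y in range(h):
        xs = np.flatnonzero(tmap[y])
        for a, b in zip(xs[:-1], xs[1:]):
            if 1 < b - a <= max_gap:          # short gap -> fill with 1s
                filled[y, a:b + 1] = 1
    # BFS labeling of 8-connected components; remove small ones
    seen = np.zeros_like(filled, dtype=bool)
    for sy in range(h):
        for sx in range(w):
            if filled[sy, sx] and not seen[sy, sx]:
                comp, q = [(sy, sx)], deque([(sy, sx)])
                seen[sy, sx] = True
                while q:
                    cy, cx = q.popleft()
                    for dy in (-1, 0, 1):
                        for dx in (-1, 0, 1):
                            ny, nx = cy + dy, cx + dx
                            if (0 <= ny < h and 0 <= nx < w
                                    and filled[ny, nx] and not seen[ny, nx]):
                                seen[ny, nx] = True
                                comp.append((ny, nx))
                                q.append((ny, nx))
                if len(comp) < min_size:       # min_size = 300 in the paper
                    for cy, cx in comp:
                        filled[cy, cx] = 0
    return filled
```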

C. Overlay Text Region Determination Since most overlay text is placed horizontally in the video, vertically longer candidates can be easily eliminated. The density of transition pixels is a good criterion as well. The local binary pattern (LBP) is a very simple and efficient tool for representing the consistency of texture using only the intensity pattern.

C. Overlay Text Region Determination Let DEN denote the density of transition pixels in a candidate region, easily obtained by dividing the number of transition pixels by the size of the region. For each of the N candidate regions, POT is defined as the product of DEN and the number of different LBPs in the region, the latter normalized by the maximum possible number of different LBPs (i.e., 256). If the POT of a candidate region is larger than a predefined value, the corresponding region is finally determined to be an overlay text region. The threshold on POT is empirically set to 0.05 based on various experimental results.
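The POT test described above can be sketched as follows. The slide's symbols are lost, so this follows only the verbal definition (transition-pixel density times the normalized count of distinct LBPs); the function names and the specific 8-neighbour LBP ordering are assumptions.

```python
import numpy as np

def lbp_code(gray, y, x):
    """8-bit LBP: threshold the 8 neighbours of (y, x) against the centre."""
    c = gray[y, x]
    bits = [gray[y + dy, x + dx] >= c
            for dy, dx in [(-1, -1), (-1, 0), (-1, 1), (0, 1),
                           (1, 1), (1, 0), (1, -1), (0, -1)]]
    return sum(b << i for i, b in enumerate(bits))

def pot(region_gray, region_tmap, pot_th=0.05):
    """POT sketch: density of transition pixels in the candidate region
    times the number of distinct LBPs, normalized by 256. Regions with
    POT above pot_th (0.05 in the paper) are kept as overlay text."""
    h, w = region_gray.shape
    density = region_tmap.sum() / float(h * w)
    codes = {lbp_code(region_gray, y, x)
             for y in range(1, h - 1) for x in range(1, w - 1)}
    score = density * (len(codes) / 256.0)
    return score, score > pot_th
```

A flat, textureless region yields a single LBP code and hence a very low POT, while genuine text regions combine dense transitions with many distinct patterns.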

C. Overlay Text Region Determination

D. Overlay Text Region Refinement First, a horizontal projection is performed to accumulate the transition pixel counts in each row of the detected overlay text region, forming a histogram of the number of transition pixels per row. Then the null points, i.e., pixel rows without transition pixels, are removed and the separated regions are re-labeled. The projection is then conducted vertically and null points are removed once again.
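A minimal sketch of this projection-based refinement (for brevity it only tightens a single region's bounding box by dropping null rows and columns, rather than re-labeling separated parts; the function name is an assumption):

```python
import numpy as np

def refine_region(region_tmap):
    """Drop rows with no transition pixels (horizontal projection),
    then columns (vertical projection), tightening the text region."""
    rows = region_tmap.sum(axis=1) > 0   # per-row transition counts
    cols = region_tmap.sum(axis=0) > 0   # per-column transition counts
    return region_tmap[rows][:, cols]
```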

E. Overlay Text Region Update

III. OVERLAY TEXT EXTRACTION Among several algorithms proposed for this task, an effective method proposed by Lyu et al. consists of color polarity classification and color inversion (if necessary), followed by adaptive thresholding, dam point labeling, and inward filling.

A. Color Polarity Computation Such inconsistent results complicate the subsequent text extraction steps. Thus, our goal in this subsection is to check the color polarity and invert the pixel intensities if needed, so that the output text region of this module always contains text that is bright compared to its surrounding pixels.

A. Color Polarity Computation Given the binarized text region, the boundary pixels, which belong to the left, right, top, and bottom lines of the text region, are examined and the number of white pixels among them is counted. If the number of white boundary pixels is less than 50% of the total number of boundary pixels, the text region is regarded as the "bright text on dark background" scenario, which requires no polarity change. In addition, the position of the first encountered transition pixel in each row of the text region, together with its value on the binary image, is used in this test.
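The boundary-counting half of the polarity test can be sketched as below (the per-row transition-pixel check is omitted because its exact formula is lost from the slide; the function name is an assumption):

```python
import numpy as np

def needs_inversion(binary_region):
    """Polarity test sketch: count white pixels on the four boundary
    lines of the binarized text region. Fewer than 50% white means
    bright text on a dark background (no inversion needed); a mostly
    white border suggests dark text, so the region should be inverted."""
    top, bottom = binary_region[0, :], binary_region[-1, :]
    left, right = binary_region[1:-1, 0], binary_region[1:-1, -1]
    boundary = np.concatenate([top, bottom, left, right])
    white_ratio = (boundary > 0).mean()
    return white_ratio >= 0.5
```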

B. Overlay Text Extraction First, each overlay text region is expanded by two pixels on each side to utilize the continuity of the background. This expanded outer region is denoted as ER. Then, the pixels inside the text region are compared to the pixels in ER so that pixels connected to the expanded region can be excluded. We denote the text region as TR and the expanded text region as ETR (i.e., TR together with ER). The gray-scale pixels on ETR are mapped to a binary output image whose pixels are all initialized as "White". A window is then moved horizontally over the region with a step size of 8, followed by a second window moved vertically with its corresponding step size. If the intensity of a pixel is smaller than the local threshold computed by Otsu's method within each window, the corresponding binary pixel is set to "Black".
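A sketch of the horizontal pass of this local binarization is given below. The window sizes are not legible on the slide, so win=16 is an assumption; only the step size of 8 comes from the text, and the vertical pass and the inward-filling step are omitted.

```python
import numpy as np

def otsu_threshold(values):
    """Otsu's method on an array of 8-bit intensities: pick the
    threshold maximizing between-class variance."""
    hist = np.bincount(values.ravel(), minlength=256).astype(float)
    total = hist.sum()
    cum = np.cumsum(hist)
    cum_mean = np.cumsum(hist * np.arange(256))
    best_t, best_var = 0, -1.0
    for t in range(256):
        w0 = cum[t] / total
        w1 = 1.0 - w0
        if w0 == 0 or w1 == 0:
            continue
        mu0 = cum_mean[t] / cum[t]
        mu1 = (cum_mean[-1] - cum_mean[t]) / (total - cum[t])
        var = w0 * w1 * (mu0 - mu1) ** 2
        if var > best_var:
            best_var, best_t = var, t
    return best_t

def binarize_text_region(etr_gray, win=16, step=8):
    """Horizontal pass sketch: slide a win-wide window with step 8;
    pixels at or below the local Otsu threshold become Black (0),
    starting from an all-White (255) output."""
    h, w = etr_gray.shape
    out = np.full((h, w), 255, dtype=np.uint8)
    for x0 in range(0, max(w - win, 0) + 1, step):
        window = etr_gray[:, x0:x0 + win]
        t = otsu_threshold(window)
        patch = out[:, x0:x0 + win]
        patch[window <= t] = 0
    return out
```

Computing the threshold per window rather than globally is what lets the method cope with backgrounds whose brightness varies along the text line.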

B. Overlay Text Extraction

IV. EXPERIMENTAL RESULTS A. Performance Comparison of the Transition Map With the Harris Corner Map Here, P denotes the set of pixels detected by each method and T denotes the number of pixels belonging to the overlay text.

B. Performance Evaluation of Overlay Text Detection

B. Performance Evaluation of Overlay Text Detection

B. Performance Evaluation of Overlay Text Detection The detection performance is reported in terms of false positives (FP) and false negatives (FN).

B. Performance Evaluation of Overlay Text Detection

C. Performance Evaluation of Overlay Text Extraction

C. Performance Evaluation of Overlay Text Extraction

C. Performance Evaluation of Overlay Text Extraction The evaluation uses the probabilities of overlay text pixels and of background pixels in the ground-truth image, together with the error probability of classifying overlay text pixels as background pixels and the error probability of classifying background pixels as overlay text pixels.
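The slide's symbols for these four quantities are lost, so the following is only a sketch of how they can be estimated from a ground-truth/result pair of binary images (the function name and exact estimator forms are assumptions):

```python
import numpy as np

def extraction_errors(gt, result):
    """Estimate the evaluation quantities described above from binary
    ground-truth and extraction-result images (nonzero = text pixel):
    the prior probabilities of text and background pixels, plus the
    two misclassification error probabilities."""
    gt = gt.astype(bool)
    result = result.astype(bool)
    p_text = gt.mean()                  # probability of overlay text pixels
    p_bg = 1.0 - p_text                 # probability of background pixels
    # text pixels wrongly classified as background, and vice versa
    p_text_as_bg = (gt & ~result).sum() / max(gt.sum(), 1)
    p_bg_as_text = (~gt & result).sum() / max((~gt).sum(), 1)
    return p_text, p_bg, p_text_as_bg, p_bg_as_text
```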

Thank you for listening! Good afternoon! 2009.03.26