Word Spotting DTW.



Word Spotting DTW: Introduction; The Basic Idea; Pruning; DTW; Matching Words With DTW; Experimental Results; Summary.

Introduction Libraries hold an enormous number of handwritten historical documents and would like to make them available electronically. Such large collections can only be accessed efficiently if a searchable index exists. The current state-of-the-art approach is to create the index manually.

Introduction – cont. The quality of historical documents is degraded by faded ink, stained paper, etc. Traditional Optical Character Recognition (OCR) techniques, which usually recognize words character by character, fail on such documents.


The Basic Idea For handwritten manuscripts written by a single author, the images of multiple instances of the same word are likely to look similar. Word spotting exploits this similarity and provides an alternative approach to index generation.

Word Spotting Each page in the document collection is segmented into words. The different instances of a word are clustered together using image matching. A human can then tag the n most interesting clusters with the appropriate ASCII equivalent for indexing.

Matching Good matching performance can be achieved by a technique that skews, resizes, and aligns two candidate words and then compares them pixel by pixel. We will use DTW.

Pruning Running a matching algorithm becomes expensive as the collection size grows. Pruning techniques that discard unlikely matches are therefore used.

Pruning Techniques Word pairs are pruned based on the area and aspect ratio of their bounding boxes. Words are also required to have the same number of descenders (strokes below the baseline). The idea is to require that the two words of a pair have similar pruning statistics.
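The pruning rules above can be sketched as a simple pair filter. This is a minimal illustration, not the presenters' implementation: the `WordBox` type, the function name, and both ratio thresholds (1.5) are hypothetical choices, since the slides give no concrete values.

```python
from dataclasses import dataclass


@dataclass
class WordBox:
    """Bounding-box statistics of one segmented word (hypothetical type)."""
    width: int       # bounding-box width in pixels, assumed > 0
    height: int      # bounding-box height in pixels, assumed > 0
    descenders: int  # number of strokes below the baseline


def is_candidate_pair(a, b, max_area_ratio=1.5, max_aspect_ratio=1.5):
    """Return True if the pair survives pruning and should be matched.

    Both thresholds are assumed values chosen for illustration only.
    """
    area_a, area_b = a.width * a.height, b.width * b.height
    aspect_a, aspect_b = a.width / a.height, b.width / b.height

    # Ratios are taken as larger / smaller so that they are always >= 1.
    area_ok = max(area_a, area_b) / min(area_a, area_b) <= max_area_ratio
    aspect_ok = max(aspect_a, aspect_b) / min(aspect_a, aspect_b) <= max_aspect_ratio

    return area_ok and aspect_ok and a.descenders == b.descenders
```

Only pairs for which `is_candidate_pair` returns `True` would be passed on to the more expensive matching step.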

[Figure: word image annotated with ascenders, upper baseline, lower baseline, and descenders.]

DTW Dynamic time warping (DTW) is used to compute a distance between two time series. A time series is a list of samples taken from a signal, ordered by time. Naive approach: resample one of the series and then compare the two sample by sample. This does not produce intuitive results, as it compares samples that might not correspond well.

DTW DTW recovers the optimal alignment between the sample points of the two time series. [Figure: two time series aligned along the time axis.]

Comparison between Naive & DTW Any distance (Euclidean, Manhattan, …) that aligns the i-th point on one time series with the i-th point on the other will produce a poor similarity score. [Figure: point-to-point alignment along the time axis.] A non-linear (elastic) alignment produces a more intuitive similarity measure, allowing similar shapes to match even if they are out of phase on the time axis. [Figure: elastic alignment along the time axis, with point i matched to point i+2.]

DTW The DTW distance between two time series $x_1, \ldots, x_m$ and $y_1, \ldots, y_n$ is $D(m,n)$, computed with the recurrence $D(i,j) = \min\{D(i,j-1),\, D(i-1,j),\, D(i-1,j-1)\} + d(i,j)$. The local distance $d(i,j)$ between $x_i$ and $y_j$ varies with the application. This calculation realizes a local continuity constraint.
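The recurrence above maps directly onto a dynamic-programming table. The sketch below is one possible rendering under stated assumptions: the boundary conditions (D(0,0) = 0, all other border cells infinite) and the default local distance |a - b| are not given on the slide and are chosen here for illustration.

```python
import math


def dtw_distance(x, y, dist=lambda a, b: abs(a - b)):
    """Fill the table D(i,j) = min{D(i,j-1), D(i-1,j), D(i-1,j-1)} + d(i,j).

    Assumed boundary conditions: D(0,0) = 0 and infinity on the rest of
    row 0 and column 0, so every path starts at (1,1) and ends at (m,n).
    """
    m, n = len(x), len(y)
    D = [[math.inf] * (n + 1) for _ in range(m + 1)]
    D[0][0] = 0.0

    for i in range(1, m + 1):
        for j in range(1, n + 1):
            d_ij = dist(x[i - 1], y[j - 1])  # local distance d(i,j)
            D[i][j] = d_ij + min(D[i][j - 1], D[i - 1][j], D[i - 1][j - 1])

    return D[m][n]
```

For example, `dtw_distance([0, 1, 2, 1, 0], [0, 0, 1, 2, 1, 0])` is zero because the repeated first sample of the second series can be absorbed by the warping, which a strict sample-by-sample comparison cannot do.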

Warping Function To find the best alignment between A and B, one needs to find the path through the grid, $P = p_1, \ldots, p_s, \ldots, p_k$ with $p_s = (i_s, j_s)$, that minimizes the total distance between them. P is called a warping function. [Figure: grid spanned by time series A and time series B, showing the warping path from $p_1$ to $p_k$.]

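Time-Normalized Distance Measure Time-normalized distance between A and B: $D(A,B) = \frac{\sum_{s=1}^{k} d(p_s)\, w_s}{\sum_{s=1}^{k} w_s}$, where $d(p_s)$ is the distance between $i_s$ and $j_s$ and $w_s > 0$ is a weighting coefficient. Best alignment path between A and B: $P_0 = \arg\min_P D(A,B)$. [Figure: grid spanned by time series A and time series B, showing the warping path from $p_1$ to $p_k$.]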

Matching words with DTW

Matching words with DTW The inter-character and intra-character spacing in handwriting is subject to large variations. DTW offers a more flexible way to compensate for these variations than linear scaling. We first normalize the slant and skew angle of the candidate images. From each word image, four features per image column are extracted and combined into a single time series.

Matching Words With DTW For each image $I$ with height $h$ and width $w$, we extract a time series $X(I) = x_1, \ldots, x_w$, where $x_i = (f_1(I,i), f_2(I,i), f_3(I,i), f_4(I,i))$ and $f_1, \ldots, f_4$ are the four features extracted per image column.

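Matching Words With DTW In order to run the DTW algorithm on two time series $X(I)$ and $Y(J)$, we define a local distance function: $d(x_i, y_j) = \sum_{k=1}^{4} (f_k(I,i) - f_k(J,j))^2$. Now the DTW algorithm can be run to determine a warping path $(i_1, j_1), \ldots, (i_K, j_K)$ between $X$ and $Y$, where $K$ is the length of the path, with matching cost $D(X,Y) = \sum_{k=1}^{K} d(x_{i_k}, y_{j_k})$.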
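As an illustration of the local distance and matching cost above, the following sketch runs DTW on sequences of four-element feature tuples and also recovers the warping path by backtracking. The function names, the boundary conditions, and the backtracking rule (always step to the cheapest predecessor cell) are assumptions made for this example; the slides do not specify them.

```python
import math


def local_distance(xi, yj):
    """Sum of squared differences over the per-column features."""
    return sum((a - b) ** 2 for a, b in zip(xi, yj))


def dtw_with_path(X, Y):
    """Return (matching cost, warping path) for two non-empty sequences.

    The path is a list of 1-based index pairs (i_k, j_k), mirroring the
    slide notation. Boundary conditions are assumed, not from the slides.
    """
    m, n = len(X), len(Y)
    D = [[math.inf] * (n + 1) for _ in range(m + 1)]
    D[0][0] = 0.0
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            cost = local_distance(X[i - 1], Y[j - 1])
            D[i][j] = cost + min(D[i][j - 1], D[i - 1][j], D[i - 1][j - 1])

    # Backtrack from (m, n) to (1, 1) through the cheapest predecessors.
    path = []
    i, j = m, n
    while i > 0 and j > 0:
        path.append((i, j))
        _, i, j = min(
            (D[i - 1][j - 1], i - 1, j - 1),
            (D[i - 1][j], i - 1, j),
            (D[i][j - 1], i, j - 1),
        )
    path.reverse()
    return D[m][n], path
```

The returned cost equals the sum of `local_distance` over the returned path, so the path can be inspected to see which image columns were matched to each other.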

DTW Features Projection profiles. Word profiles: upper word profiles and lower word profiles. Background/ink transitions.

Projection Profile Projection profiles capture the distribution of ink along one dimension of a word image. A vertical projection profile is computed by summing the inverted intensity values in each image column separately: $PP(I,c) = \sum_{r=1}^{h} (255 - I(r,c))$.
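A direct rendering of the projection profile formula is short. The sketch assumes the image is a list of rows of 8-bit grayscale values (0 = black ink, 255 = white background); that representation and the function name are choices made for this example.

```python
def projection_profile(image):
    """Vertical projection profile: for each column c, sum 255 - I(r, c).

    `image` is assumed to be a non-empty list of equally long rows of
    grayscale values in 0..255, with dark pixels representing ink.
    """
    height = len(image)
    width = len(image[0])
    return [
        sum(255 - image[r][c] for r in range(height))
        for c in range(width)
    ]
```

Columns with more ink yield larger values, so the profile traces how ink is distributed along the horizontal axis of the word.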

[Figure: (a) original image, slant/skew/baseline-normalized and cleaned; (b) normalized projection profile.]

Word Profiles Word profiles capture part of the outlining shape of a word. Upper and lower word profiles are computed by going along the upper (lower) boundary of the word’s bounding box and recording, for each image column, the distance to the nearest “ink” pixel in that column.

Word Profiles Due to a number of factors, some image columns may not contain ink pixels. Therefore, these gaps are closed by linearly interpolating between the two closest points.
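The profile extraction and gap closing described above might look as follows. This is a sketch under assumptions: the ink threshold of 128, the image representation (rows of grayscale values, dark = ink), and the handling of gaps at the left or right edge (copying the nearest known value) are not specified on the slides.

```python
def column_profile(image, from_top=True, ink_threshold=128):
    """Distance from the top (or bottom) edge to the nearest ink pixel
    in each column; None marks a column without ink."""
    height, width = len(image), len(image[0])
    rows = range(height) if from_top else range(height - 1, -1, -1)
    profile = []
    for c in range(width):
        distance = None
        for steps, r in enumerate(rows):
            if image[r][c] < ink_threshold:  # assumed: dark pixels are ink
                distance = steps
                break
        profile.append(distance)
    return profile


def fill_gaps(profile):
    """Close gaps (None entries) by linear interpolation between the two
    closest known points; edge gaps copy the nearest known value."""
    known = [i for i, v in enumerate(profile) if v is not None]
    if not known:
        return list(profile)
    filled = list(profile)
    for c, value in enumerate(profile):
        if value is not None:
            continue
        left = max((k for k in known if k < c), default=None)
        right = min((k for k in known if k > c), default=None)
        if left is None:
            filled[c] = profile[right]
        elif right is None:
            filled[c] = profile[left]
        else:
            t = (c - left) / (right - left)
            filled[c] = profile[left] + t * (profile[right] - profile[left])
    return filled
```

Calling `fill_gaps(column_profile(image))` gives the upper word profile, and `fill_gaps(column_profile(image, from_top=False))` gives the lower one.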

Upper Boundary

Lower Boundary

Background/Ink Transitions The features above do not capture the inner structure of a word. This feature, nbit(I, c), records for every image column the number of transitions from background to ink pixels, where ink pixels are determined by a threshold.
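Counting background-to-ink transitions per column can be sketched as below. The threshold value of 128, the top-to-bottom scan direction, and treating the area above the image as background are assumptions for this example.

```python
def ink_transitions(image, ink_threshold=128):
    """Number of background-to-ink transitions in each image column.

    `image` is assumed to be a list of rows of grayscale values; a pixel
    counts as ink when its value is below `ink_threshold` (assumed).
    """
    height, width = len(image), len(image[0])
    counts = []
    for c in range(width):
        count = 0
        previous_is_ink = False  # area above the image treated as background
        for r in range(height):
            is_ink = image[r][c] < ink_threshold
            if is_ink and not previous_is_ink:
                count += 1
            previous_is_ink = is_ink
        counts.append(count)
    return counts
```

A column crossing two separate strokes yields 2, a solid stroke yields 1, and an empty column yields 0.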

Experimental Results Data sets and processing. Results.

Data Sets And Processing Experiments were conducted on two test sets of different quality: acceptable quality (set 1) and very degraded quality (set 2). The tests were divided into four sets: 15 images in test set 1; the entire test set 1; 32 images in test set 2; the entire test set 2.

Test Sets

Results SC: shape context matching. XOR: the images are aligned to compensate for shear and scale changes, and then a difference image is computed. EDM: Euclidean distance map; larger regions are weighted more heavily.

Results
Test set/Algorithm: XOR, SSD, SLH, SC, EDM, DTW
A: 54.14%, 52.66%, 42.43%, 48.67%, 72.61%, 73.71%
B: n/a 65.34%
C: 48.11% 49.56% 58.81%
D: 51.81%

Summary & Conclusions The DTW approach performs better than a number of other techniques in accuracy and speed. Future work will focus on further improvements in speed and accuracy: pruning and optimizations in DTW.