Visual Speech Recognition Using Hidden Markov Models Kofi A. Boakye CS280 Course Project.

1 Visual Speech Recognition Using Hidden Markov Models Kofi A. Boakye CS280 Course Project

2 Motivation
Visual articulation provides good information source for speech
  –Lip-reading humans can intelligibly recognize speech
  –Visual information provides robustness to noise
Can enhance speech recognition in various applications
  –Text annotation of multimedia data
  –Automatic computer dictation
  –Lip-reading in mobile phones for noisy environments

3 Project Overview
Visual speech recognition task using Tulips1 database
Recognition performed by training HMMs on extracted features
Cross-validation procedure used for training and testing
Experimented with features and HMM architecture
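The recognition step described on this slide can be sketched as follows: one HMM per digit, with a test sequence assigned to the digit whose model gives the highest likelihood under the forward algorithm. This is a minimal illustration, not the project's implementation. The 1-D Gaussian emissions, the two-state left-to-right topology, and every name here (`forward_loglik`, `classify`, `toy_models`) are assumptions made for the example, and the model parameters are set by hand rather than trained.

```python
import math

def log_or_neg_inf(p):
    """log(p), with log(0) mapped to -inf instead of raising."""
    return math.log(p) if p > 0 else -math.inf

def log_gauss(x, mean, var):
    """Log density of a 1-D Gaussian."""
    return -0.5 * (math.log(2.0 * math.pi * var) + (x - mean) ** 2 / var)

def logsumexp(values):
    """Numerically stable log(sum(exp(v)))."""
    m = max(values)
    if m == -math.inf:
        return -math.inf
    return m + math.log(sum(math.exp(v - m) for v in values))

def forward_loglik(obs, log_pi, log_A, means, variances):
    """Log-likelihood of a 1-D observation sequence under a Gaussian HMM."""
    n = len(log_pi)
    # Initialisation: start distribution times first emission.
    alpha = [log_pi[j] + log_gauss(obs[0], means[j], variances[j]) for j in range(n)]
    # Recursion: sum over predecessor states, then emit.
    for x in obs[1:]:
        alpha = [
            logsumexp([alpha[i] + log_A[i][j] for i in range(n)])
            + log_gauss(x, means[j], variances[j])
            for j in range(n)
        ]
    return logsumexp(alpha)

def classify(obs, models):
    """Return the label whose HMM scores the sequence highest."""
    return max(models, key=lambda label: forward_loglik(obs, *models[label]))

def make_model(means, variances):
    """Hypothetical two-state left-to-right HMM with fixed transitions."""
    pi = [1.0, 0.0]
    A = [[0.5, 0.5], [0.0, 1.0]]
    log_pi = [log_or_neg_inf(p) for p in pi]
    log_A = [[log_or_neg_inf(p) for p in row] for row in A]
    return (log_pi, log_A, means, variances)

# Hand-set toy models: "one" rises from 0 to 1, "two" falls from 1 to 0.
toy_models = {
    "one": make_model([0.0, 1.0], [0.1, 0.1]),
    "two": make_model([1.0, 0.0], [0.1, 0.1]),
}

print(classify([0.0, 0.1, 0.9, 1.0], toy_models))
```

In practice the parameters would be estimated from training sequences (for example with Baum-Welch) and the emissions would be multivariate; the classification rule stays the same.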

4 Tulips1
Small public audiovisual database
Consists of 12 speakers (9 male, 3 female) saying first four English digits
Video format:
  –Digitized (8-bit grayscale pgm) images of lips of size 100x75
  –Sampling rate: 30fps
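The slides mention a cross-validation procedure without saying how the folds were formed. With 12 speakers, one common choice is to hold out one speaker at a time, so that no speaker appears in both training and test data. The sketch below assumes that scheme; the function name and the `(speaker, digit, features)` tuple layout are hypothetical.

```python
def leave_one_speaker_out(samples):
    """Yield (held_out_speaker, train, test) once per speaker.

    `samples` is a list of (speaker_id, digit, features) tuples.
    This fold scheme is an assumption, not taken from the slides.
    """
    speakers = sorted({s[0] for s in samples})
    for held_out in speakers:
        train = [s for s in samples if s[0] != held_out]
        test = [s for s in samples if s[0] == held_out]
        yield held_out, train, test

# Placeholder data: 12 speakers, digits 1-4, no real features.
samples = [(spk, digit, None) for spk in range(12) for digit in (1, 2, 3, 4)]
folds = list(leave_one_speaker_out(samples))
print(len(folds))
```

Accuracy would then be averaged over the folds, each time training the digit models on the remaining speakers only.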

5 Features
Contour features
  –6 features related to geometry of the mouth and lips (hand generated)
PCA on raw image pixels
  –Experimented with different numbers of components
Image preprocessing + PCA
Processing included:
  1)Symmetry enforcement
  2)Lowpass filtering (9x9 Gaussian kernel, σ=1.5) and subsampling (5 )
  3)Compression and linearization
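The preprocessing steps and the PCA projection listed above might look like the NumPy sketch below. Several details are assumptions, since the slide only names the steps: symmetry enforcement is taken to mean averaging the image with its left-right mirror, the subsampling factor is taken to be 5, compression is taken to mean keeping only one half of the now-symmetric image, and linearization is taken to mean flattening to a vector. The frames are random placeholders, not Tulips1 data, and all function names are hypothetical.

```python
import numpy as np

def gaussian_kernel(size=9, sigma=1.5):
    """Normalised 2-D Gaussian kernel (9x9, sigma=1.5 as on the slide)."""
    ax = np.arange(size) - (size - 1) / 2.0
    g = np.exp(-(ax ** 2) / (2.0 * sigma ** 2))
    k = np.outer(g, g)
    return k / k.sum()

def lowpass(img, kernel):
    """Same-size 2-D filtering with edge-replicated borders."""
    pad = kernel.shape[0] // 2
    padded = np.pad(img, pad, mode="edge")
    out = np.zeros_like(img, dtype=float)
    h, w = img.shape
    for dy in range(kernel.shape[0]):
        for dx in range(kernel.shape[1]):
            out += kernel[dy, dx] * padded[dy:dy + h, dx:dx + w]
    return out

def preprocess(img, step=5):
    """Symmetrise, lowpass, keep one half, subsample, flatten."""
    img = img.astype(float)
    sym = 0.5 * (img + img[:, ::-1])           # assumed: average with mirror image
    smooth = lowpass(sym, gaussian_kernel())
    half = smooth[:, : (smooth.shape[1] + 1) // 2]  # assumed: other half is redundant
    small = half[::step, ::step]               # assumed subsampling factor of 5
    return small.ravel()                       # linearise to a feature vector

def pca_fit(X, n_components):
    """Return the data mean and the top principal directions (as rows)."""
    mean = X.mean(axis=0)
    _, _, vt = np.linalg.svd(X - mean, full_matrices=False)
    return mean, vt[:n_components]

def pca_transform(X, mean, components):
    """Project rows of X onto the principal directions."""
    return (X - mean) @ components.T

# Placeholder frames: 40 random 8-bit images, 75 rows by 100 columns.
rng = np.random.default_rng(0)
frames = rng.integers(0, 256, size=(40, 75, 100))
X = np.stack([preprocess(f) for f in frames])
mean, components = pca_fit(X, n_components=10)
Z = pca_transform(X, mean, components)
print(X.shape, Z.shape)
```

The number of components is a free parameter here; the slides report experimenting with different values.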

6 Results
Contour Features
  –Best choice: 5 states and 1 Gaussian
  –Note high accuracy with even 1 state
  –Indicates importance of delta components
Raw Image Features
  –Best choice: 10 components
  –Similar performance to contour features, which require human assistance
  –Demonstrates power of PCA
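The remark about delta components can be made concrete. A one-state HMM has no temporal structure of its own, so any information about motion has to come from the features; appending frame-to-frame differences is one way to supply it. The slides do not define how the deltas were computed, so the simple backward difference below, with a zero delta for the first frame, is an assumption, as is the function name.

```python
def add_deltas(frames):
    """Append first-order differences to each frame's feature vector.

    `frames` is a list of equal-length feature lists, one per video frame.
    The first frame has no predecessor, so its delta is set to zero.
    """
    out = []
    for t, cur in enumerate(frames):
        prev = frames[t - 1] if t > 0 else cur
        delta = [c - p for c, p in zip(cur, prev)]
        out.append(list(cur) + delta)
    return out

# Two frames of a hypothetical 2-D feature (e.g. mouth width and height).
print(add_deltas([[1.0, 2.0], [1.5, 1.0]]))
```

Each output vector is twice as long as the input: the static features followed by their change since the previous frame.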

7 Results
Preprocessed Image Features
  –Procedure produces fair performance
  –Even better with addition of PCA

8 Conclusions
For given task, HMMs proved very effective
HMM architecture significantly affects results
Delta features appear to be quite useful
Feature selection
  –Contour features best
    Generation can potentially be automatic
  –Within limited exploration, “blind” statistical technique (i.e., PCA) superior to image-specific one
