Speech controlled keyboard Instructor: Dr. John G. Harris TA: M. Skowronski Andréa Matsunaga Maurício O. Tsugawa ©2002,


1 Speech controlled keyboard
  Instructor: Dr. John G. Harris
  TA: M. Skowronski
  Andréa Matsunaga (ammatsun@ufl.edu)
  Maurício O. Tsugawa (tsugawa@ufl.edu)
  ©2002, UFL-COE-ECE
  EEL 6586 - Automatic Speech Processing
  Final Project, Spring 2002

2 Agenda
  Introduction
  Challenges
  Project Description
  Results
  Demo

3 Introduction
  Why Speech Recognition?
  Why keyboard?

4 Challenges
  Vocabulary size (not so big, but 4x HW#4)
    Homework #4, part B5: 100% (on training data), 99%~100% (on test data)
  Finding good training data
  Real-time processing using Matlab
    Matlab is not multithreaded
    audiorecorder does not offer much control

5 Project Description
  Real-Time Recording
  HMM Engine

6 Recording
  Data Acquisition Toolbox from MathWorks
    more control than audiorecorder
    very simple triggering scheme
  End point detection based on:
    short-time zero crossing
    short-time energy
  Trigger level adjusted during recording
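
The two end point cues named on this slide can be sketched as follows. This is a minimal illustration in Python, not the project's Matlab code: the function names, the fixed thresholds, and the rule "a frame is active if either cue exceeds its threshold" are all assumptions made for the example. The slide's trigger level that adapts during recording is not modeled here.

```python
import math

def frame_signal(samples, frame_len):
    """Split samples into consecutive non-overlapping frames (partial tail dropped)."""
    n_frames = len(samples) // frame_len
    return [samples[i * frame_len:(i + 1) * frame_len] for i in range(n_frames)]

def short_time_energy(frame):
    """Mean squared amplitude of one frame."""
    return sum(x * x for x in frame) / len(frame)

def zero_crossing_rate(frame):
    """Fraction of adjacent sample pairs whose signs differ."""
    if len(frame) < 2:
        return 0.0
    crossings = sum(1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0))
    return crossings / (len(frame) - 1)

def detect_endpoints(samples, frame_len, energy_thresh, zcr_thresh):
    """Return (first, last) indices of active frames, or None if no frame is active.

    Assumption for this sketch: a frame counts as active when its energy OR its
    zero-crossing rate exceeds a fixed threshold.
    """
    active = [
        i for i, frame in enumerate(frame_signal(samples, frame_len))
        if short_time_energy(frame) > energy_thresh
        or zero_crossing_rate(frame) > zcr_thresh
    ]
    if not active:
        return None
    return active[0], active[-1]

# Hypothetical input: silence, a short tone burst, then silence again.
burst = [0.5 * math.sin(2 * math.pi * 5 * i / 100) for i in range(100)]
signal = [0.0] * 100 + burst + [0.0] * 100
endpoints = detect_endpoints(signal, frame_len=50, energy_thresh=0.01, zcr_thresh=0.3)
```

Energy mainly picks up voiced sounds, while the zero-crossing rate helps with low-energy unvoiced sounds such as fricatives, which is the usual reason for combining the two cues.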

7 HMM Engine
  Frame window size: 15 ms (no overlap)
  Feature vector:
    MFCC from Malcolm Slaney's mfcc.m (12 coefficients, no c[0])
    Delta (12 coefficients, K=2)
    Delta-delta (12 coefficients, K=1)
  HMM models:
    1 female and 1 male model for each digit
    1 model for letters (insufficient database)
    8 states (second classification with 4 states for class "E")
  EM iterations: 10
  Classifier: Viterbi (hmm_vit from h2m)
    Utterance classified according to the max log likelihood of all HMMs.
  Noise reduction: cepstral mean subtraction for both TRAIN and TEST data.
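
The 36-dimensional feature vector on this slide (12 MFCC + 12 delta + 12 delta-delta) and the cepstral mean subtraction step can be sketched as below. This is an illustration in Python rather than the project's Matlab code, under several assumptions: K is read as the half-width of the common regression formula for deltas, edge frames are replicated, delta-deltas are computed from the deltas, and mean subtraction is applied before the deltas. The function names are hypothetical, and the MFCC extraction itself (mfcc.m) is not reproduced.

```python
def cepstral_mean_subtraction(frames):
    """Subtract each coefficient's mean over the utterance from every frame."""
    n, dim = len(frames), len(frames[0])
    means = [sum(f[d] for f in frames) / n for d in range(dim)]
    return [[f[d] - means[d] for d in range(dim)] for f in frames]

def delta(frames, K):
    """Regression deltas over +/-K frames.

    Assumed formula: d_t = sum_k k * (c[t+k] - c[t-k]) / (2 * sum_k k^2),
    with indices clamped to the utterance (edge frames replicated).
    """
    n, dim = len(frames), len(frames[0])
    denom = 2 * sum(k * k for k in range(1, K + 1))
    out = []
    for t in range(n):
        row = []
        for d in range(dim):
            acc = 0.0
            for k in range(1, K + 1):
                ahead = frames[min(t + k, n - 1)][d]
                behind = frames[max(t - k, 0)][d]
                acc += k * (ahead - behind)
            row.append(acc / denom)
        out.append(row)
    return out

def build_features(mfcc):
    """Concatenate static, delta (K=2) and delta-delta (K=1) coefficients per frame."""
    static = cepstral_mean_subtraction(mfcc)
    d1 = delta(static, K=2)
    d2 = delta(d1, K=1)
    return [s + a + b for s, a, b in zip(static, d1, d2)]

# Hypothetical 6-frame utterance with 12 coefficients per frame.
mfcc = [[float(t + d) for d in range(12)] for t in range(6)]
features = build_features(mfcc)  # 6 frames, 36 values each
```

Because the deltas are built from differences between frames, a constant offset in the cepstrum cancels out of them; only the static coefficients are changed by the mean subtraction.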

8 Results
  Using about 40 utterances per class as training data:
    About 92% accuracy on training data
  HW4 Extra Credit
    Recognition dropped from 99% to about 10%!

9 Results
  Using only digits: very good recognition
  Using digits+alphabet (groups listed from Poor to Good):
    {E, {B, V}, {D, G}, {P, T}, {C, Z}}
    {F, X, S}, {L, M, N}, {A, K, J, {H, 8}}
    {O} {I, 5, 9} {Q, 2} {U} {R} {W} {Y}
    {0} {1} {3} {4} {6} {7}

10 Demo
  Please enjoy the demo!

11 Thank you! Questions?

