1 7-Speech Recognition: Speech Recognition Concepts; Speech Recognition Approaches; Recognition Theories; Bayes' Rule; Simple Language Model; P(A|W); Network Types


2 7-Speech Recognition (Cont’d): HMM Calculating Approaches; Neural Components; Three Basic HMM Problems; Viterbi Algorithm; State Duration Modeling; Training in HMM

3 Recognition Tasks: Isolated Word Recognition (IWR), Connected Word (CW), and Continuous Speech Recognition (CSR); Speaker Dependent, Multiple Speaker, and Speaker Independent; Vocabulary Size –Small: <20 –Medium: >100, <1000 –Large: >1000, <10000 –Very Large: >10000

4 Speech Recognition Concepts [Diagram labels: Text, NLP, Phone Sequence, Speech Processing, Speech; Speech Synthesis, Speech Recognition, Speech Understanding] Speech recognition is the inverse of speech synthesis.

5 Speech Recognition Approaches: Bottom-Up Approach; Top-Down Approach; Blackboard Approach

6 Bottom-Up Approach [Diagram labels: Signal Processing, Feature Extraction, Segmentation, Sound Classification Rules, Phonotactic Rules, Lexical Access, Language Model, Voiced/Unvoiced/Silence, Knowledge Sources, Recognized Utterance]

7 Top-Down Approach [Diagram labels: Unit Matching System, Feature Analysis, Lexical Hypothesis, Syntactic Hypothesis, Semantic Hypothesis, Utterance Verifier/Matcher, Inventory of Speech Recognition Units, Word Dictionary, Grammar, Task Model, Recognized Utterance]

8 Blackboard Approach [Diagram labels: Environmental Processes, Acoustic Processes, Lexical Processes, Syntactic Processes, Semantic Processes, Blackboard]

9 Recognition Theories Articulatory-Based Recognition –Uses the articulatory system for recognition –This theory has been the most successful so far Auditory-Based Recognition –Uses the auditory system for recognition Hybrid-Based Recognition –A hybrid of the two theories above Motor Theory –Models the speaker's intended gestures

10 Recognition Problem We have a sequence of acoustic symbols and want to find the words expressed by the speaker. Solution: find the most probable word sequence given the acoustic symbols.

11 Recognition Problem A: acoustic symbols. W: word sequence. We should find $\hat{W}$ so that $P(\hat{W} \mid A) = \max_{W} P(W \mid A)$

12 Bayes' Rule

13 Bayes' Rule (Cont’d)
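
The derivation these two slides point to can be sketched as follows, assuming the usual formulation in which the recognizer picks the word sequence $\hat{W}$ with the highest posterior probability. The symbols A and W are the ones defined on slide 11; the `align` environment (from `amsmath`) is a presentation choice, not part of the slides.

```latex
\begin{align}
\hat{W} &= \arg\max_{W} P(W \mid A) \\
        &= \arg\max_{W} \frac{P(A \mid W)\, P(W)}{P(A)} \\
        &= \arg\max_{W} P(A \mid W)\, P(W)
\end{align}
```

The last step drops $P(A)$ because it does not depend on $W$. What remains is the acoustic term $P(A \mid W)$ and the language-model term $P(W)$.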

14 Simple Language Model Computing the word-sequence probability P(W) directly is very difficult and requires a very large database, so trigram and bigram models are used instead.

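15 Simple Language Model (Cont’d) Trigram: $P(W) \approx \prod_{i} P(w_i \mid w_{i-2}, w_{i-1})$ Bigram: $P(W) \approx \prod_{i} P(w_i \mid w_{i-1})$ Monogram (unigram): $P(W) \approx \prod_{i} P(w_i)$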

16 Simple Language Model (Cont’d) Computing method: $P(W_3 \mid W_1 W_2) \approx \dfrac{\text{number of occurrences of } W_3 \text{ after } W_1 W_2}{\text{total number of occurrences of } W_1 W_2}$ Ad hoc method:
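
The counting method above can be sketched in a few lines of Python. This is a minimal illustration, not the presenter's implementation: the function name `trigram_probability` and the toy corpus are hypothetical, and unseen contexts simply return 0.0 where a real system would smooth or back off.

```python
from collections import Counter


def trigram_probability(tokens, w1, w2, w3):
    """Relative-frequency estimate of P(w3 | w1, w2) from a token list."""
    # Count every consecutive triple (w_i, w_{i+1}, w_{i+2}).
    trigrams = Counter(zip(tokens, tokens[1:], tokens[2:]))
    # Count (w1, w2) only where a following word exists, so the estimates
    # over all possible w3 sum to one.
    contexts = Counter(zip(tokens, tokens[1:-1]))
    context_count = contexts[(w1, w2)]
    if context_count == 0:
        return 0.0  # unseen context (assumption: no smoothing in this sketch)
    return trigrams[(w1, w2, w3)] / context_count


# Hypothetical toy corpus, for illustration only.
corpus = "the cat sat on the mat the cat ran on the mat".split()
p_sat = trigram_probability(corpus, "the", "cat", "sat")
p_mat = trigram_probability(corpus, "on", "the", "mat")
```

Here "the cat" occurs twice and is followed by "sat" once, so `p_sat` is 1/2, while "on the" is always followed by "mat".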

17 Error Production Factors: Prosody (recognition should be prosody-independent); Noise (noise should be prevented); Spontaneous Speech

18 P(A|W) Computing Approaches: Dynamic Time Warping (DTW); Hidden Markov Model (HMM); Artificial Neural Network (ANN); Hybrid Systems

Dynamic Time Warping

Use Dynamic Programming (DP): if the best path (s,t) --> (u,v) passes through (w,x), then it consists of the best path (s,t) --> (w,x) followed by the best path (w,x) --> (u,v). - Recursive: O(n!) or O((I*J)!) - DP: O(I*J)
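
The DP idea can be illustrated with a minimal DTW sketch. It makes assumptions the slides do not state: each frame is a single number, the local distance is the absolute difference, and a cell may be reached from its left, lower, or diagonal neighbour. The name `dtw_distance` is hypothetical.

```python
def dtw_distance(a, b):
    """Accumulated DTW distance between two sequences of numbers."""
    inf = float("inf")
    I, J = len(a), len(b)
    # D[i][j] = cost of the best warping path aligning a[:i] with b[:j].
    D = [[inf] * (J + 1) for _ in range(I + 1)]
    D[0][0] = 0.0
    for i in range(1, I + 1):
        for j in range(1, J + 1):
            local = abs(a[i - 1] - b[j - 1])  # local frame distance (assumed)
            # The best path into (i, j) extends the best path into one of
            # its predecessors, so each cell is computed exactly once.
            D[i][j] = local + min(D[i - 1][j], D[i][j - 1], D[i - 1][j - 1])
    return D[I][J]


# The second sequence repeats one frame; warping absorbs the repetition.
d = dtw_distance([1, 2, 3], [1, 2, 2, 3])
```

The two nested loops visit each of the I*J cells once, which is where the O(I*J) cost comes from.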

Dynamic Time Warping For I = J = 50 (taking n = 50): O(recursive) / O(DP) = 50! / (50 * 50) ≈ 3.04 * 10^64 / 2500 ≈ 1.2 * 10^61

Dynamic Time Warping Detail of DTW - Distance - Selecting Same Frames

Dynamic Time Warping

Search Limitation: - First & End Interval - Global Limitation - Local Limitation

Dynamic Time Warping Global Limitation:

Dynamic Time Warping Local Limitation:
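
The figures for these limitations are not reproduced in the transcript, so the following is only a sketch of one common way to realise them. The global limitation is modelled as a fixed-width band around the diagonal, and the local limitation as the three allowed predecessor cells. Both choices, the scalar frames, and the name `dtw_banded` are assumptions rather than details taken from the slides.

```python
def dtw_banded(a, b, band):
    """DTW distance with a band of half-width `band` around the diagonal."""
    inf = float("inf")
    I, J = len(a), len(b)
    D = [[inf] * (J + 1) for _ in range(I + 1)]
    # First and end interval: the path starts at the first frame pair
    # and must end at the last frame pair, D[I][J].
    D[0][0] = 0.0
    for i in range(1, I + 1):
        # Global limitation: only cells with |i - j| <= band are searched.
        for j in range(max(1, i - band), min(J, i + band) + 1):
            local = abs(a[i - 1] - b[j - 1])
            # Local limitation: a cell is reached only from its left,
            # lower, or diagonal neighbour.
            D[i][j] = local + min(D[i - 1][j], D[i][j - 1], D[i - 1][j - 1])
    return D[I][J]  # stays infinite if the band excludes the end point


wide = dtw_banded([1, 2, 3], [1, 2, 2, 3], band=1)
narrow = dtw_banded([1, 2, 3], [1, 2, 2, 3], band=0)
```

With `band=1` the alignment is still found, while `band=0` forbids any warping, so two sequences of different length cannot be aligned and the result stays infinite.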

29 Artificial Neural Network Simple Computation Element of a Neural Network
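
The slide's figure is not preserved, so as a rough sketch: a simple computation element is commonly modelled as a weighted sum of its inputs plus a bias, passed through a nonlinearity. The sigmoid used here and the name `neuron` are assumptions, not details from the slide.

```python
import math


def neuron(inputs, weights, bias):
    """Output of one computation element: sigmoid(w . x + bias)."""
    # Weighted sum of the inputs plus the bias term.
    s = sum(w * x for w, x in zip(weights, inputs)) + bias
    # Sigmoid nonlinearity (assumed; other activations are equally possible).
    return 1.0 / (1.0 + math.exp(-s))


y = neuron([0.0, 0.0], [1.0, 1.0], 0.0)  # weighted sum is 0
```

A weighted sum of zero lands on the midpoint of the sigmoid, so `y` is 0.5.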

30 Artificial Neural Network (Cont’d) Neural Network Types –Perceptron –Time Delay [Figure: Time Delay Neural Network (TDNN) Computational Element]

31 Artificial Neural Network (Cont’d) Single Layer Perceptron

32 Artificial Neural Network (Cont’d) Three Layer Perceptron
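
A forward pass through a layered perceptron can be sketched as below. The slide figures are missing, so this assumes "three layer" means three layers of computation elements (two hidden, one output) with sigmoid activations. The function names and the zero weights in the example are hypothetical placeholders.

```python
import math


def sigmoid(s):
    return 1.0 / (1.0 + math.exp(-s))


def layer(inputs, weights, biases):
    """One layer: each row of `weights` feeds one computation element."""
    return [
        sigmoid(sum(w * x for w, x in zip(row, inputs)) + b)
        for row, b in zip(weights, biases)
    ]


def perceptron_forward(x, layers):
    """Feed x through a list of (weights, biases) pairs, one per layer."""
    for weights, biases in layers:
        x = layer(x, weights, biases)
    return x


# Hypothetical 2-3-3-1 network with all weights and biases set to zero.
layers = [
    ([[0.0, 0.0]] * 3, [0.0] * 3),       # first hidden layer: 2 inputs -> 3
    ([[0.0, 0.0, 0.0]] * 3, [0.0] * 3),  # second hidden layer: 3 -> 3
    ([[0.0, 0.0, 0.0]], [0.0]),          # output layer: 3 -> 1
]
out = perceptron_forward([1.0, -1.0], layers)
```

A single layer perceptron is the same computation with only one `(weights, biases)` pair in `layers`.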

33 Hybrid Methods Hybrid Neural Network and Matched Filter for Recognition [Figure labels: Speech, Acoustic Features, Delays, Pattern Classifier, Output Units]

34 Neural Network Properties –The system is simple, but training needs many iterations –No specific structure is prescribed –Despite the simplicity, the results are good –The training set is large, so training should be done offline –Accuracy is relatively good
