Minimum Phoneme Error Based Heteroscedastic Linear Discriminant Analysis for Speech Recognition Bing Zhang and Spyros Matsoukas BBN Technologies Present.

1 Minimum Phoneme Error Based Heteroscedastic Linear Discriminant Analysis for Speech Recognition. Bing Zhang and Spyros Matsoukas, BBN Technologies. Presented by Shih-Hung Liu, 2006/05/16

2 Outline
–Review: PCA, LDA, HLDA
–Introduction
–MPE-HLDA
–Experimental results
–Conclusions

3 Review - PCA

4 Review - LDA

5 Review - HLDA

7 Introduction for MPE-HLDA
In speech recognition systems, feature analysis is usually employed for better classification accuracy and complexity control. In recent years, extensions to the classical LDA have been widely adopted. Among them, HDA seeks to remove the equal-variance constraint imposed by LDA, and ML-HLDA takes the HMM structure (e.g., diagonal-covariance Gaussian mixture state distributions) into consideration.
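As a point of reference for the classical LDA mentioned on this slide, here is a minimal NumPy sketch. It is not taken from the presentation. The function name `lda_projection`, the synthetic two-class data, and the choice of solving the eigenproblem of Sw^{-1} Sb are illustrative assumptions. The single pooled within-class scatter `Sw` is where the equal-variance assumption enters, which is the constraint HDA and HLDA relax.

```python
import numpy as np

def lda_projection(X, y, p):
    """Return a (p, d) LDA projection for data X (n, d) with integer labels y."""
    classes = np.unique(y)
    d = X.shape[1]
    mean_all = X.mean(axis=0)
    Sw = np.zeros((d, d))  # within-class scatter, pooled over all classes
    Sb = np.zeros((d, d))  # between-class scatter
    for c in classes:
        Xc = X[y == c]
        mc = Xc.mean(axis=0)
        Sw += (Xc - mc).T @ (Xc - mc)
        diff = (mc - mean_all)[:, None]
        Sb += len(Xc) * (diff @ diff.T)
    # Keep the eigenvectors of Sw^{-1} Sb with the largest eigenvalues.
    evals, evecs = np.linalg.eig(np.linalg.solve(Sw, Sb))
    order = np.argsort(evals.real)[::-1][:p]
    return evecs.real[:, order].T

# Synthetic data (hypothetical): two classes separated along the first axis.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal([0, 0, 0], 1.0, (200, 3)),
               rng.normal([4, 0, 0], 1.0, (200, 3))])
y = np.array([0] * 200 + [1] * 200)
A = lda_projection(X, y, p=1)
Z = X @ A.T  # projected 1-dimensional features
```

With this toy data the class means remain well separated after projection to one dimension.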

8 Introduction for MPE-HLDA
Despite the differences between the above techniques, they share some common limitations. First, none of them assumes any prior knowledge of confusable hypotheses, so the features they select are bound to be suboptimal for recognition. Second, their objective functions are not directly related to the WER. For example, we found that HLDA could select totally non-discriminant features while improving its objective function by mapping all training samples to a single point in space along some dimensions.

9 Introduction
LDA and HLDA
–Better classification accuracy
–Some common limitations: none of them assumes any prior knowledge of confusable hypotheses, and their objective functions do not directly relate to the word error rate (WER)
–As a result, it is often unknown whether the selected features will do well in testing just by looking at the values of the objective functions
Minimum Phoneme Error
–Minimizes phoneme errors in lattice-based training frameworks
–Since this criterion is closely related to WER, MPE-HLDA tends to be more robust than other projection methods, which makes it potentially better suited for a wider variety of features
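To make the MPE idea concrete, the sketch below computes an expected phoneme accuracy over a small list of competing hypotheses. It is a simplification, not the lattice-based procedure of the paper: it uses an explicit hypothesis list instead of a lattice, exact edit distance instead of a per-arc accuracy approximation, and made-up phoneme strings and log scores. The names `expected_phoneme_accuracy` and `kappa` are assumptions for illustration.

```python
import math

def edit_distance(ref, hyp):
    """Levenshtein distance between two phoneme sequences."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        cur = [i]
        for j, h in enumerate(hyp, start=1):
            cur.append(min(prev[j] + 1,               # deletion
                           cur[j - 1] + 1,            # insertion
                           prev[j - 1] + (r != h)))   # substitution or match
        prev = cur
    return prev[-1]

def expected_phoneme_accuracy(ref, hyps, log_scores, kappa=1.0):
    """Posterior-weighted average of (reference length - phoneme errors)."""
    m = max(log_scores)  # subtract the max for numerical stability
    weights = [math.exp(kappa * (s - m)) for s in log_scores]
    total = sum(weights)
    acc = [len(ref) - edit_distance(ref, h) for h in hyps]
    return sum(w / total * a for w, a in zip(weights, acc))

# Hypothetical example: one correct hypothesis and two confusable ones.
ref = ["k", "ae", "t"]
hyps = [["k", "ae", "t"], ["k", "ah", "t"], ["b", "ae", "d"]]
before = expected_phoneme_accuracy(ref, hyps, [-10.0, -10.0, -10.0])
after = expected_phoneme_accuracy(ref, hyps, [-8.0, -10.0, -12.0])
```

The criterion rises when the model shifts probability mass toward hypotheses with fewer phoneme errors (`after` is larger than `before`). This is the sense in which it tracks recognition errors more closely than a likelihood objective does.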

10 MPE-HLDA
MPE-HLDA model
MPE-HLDA aims at minimizing the expected number of phoneme errors introduced by the MPE-HLDA model in a given hypothesis lattice, or equivalently at maximizing the function [equation not preserved in the transcript]
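The slide's own equation is not available in this transcript. For orientation only, the MPE criterion is commonly written in the discriminative-training literature in the form below. The notation here is an assumption and may differ from the paper's equation: $O_r$ is the $r$-th training utterance, $w$ ranges over hypotheses in its lattice, $w_r$ is the reference, $\kappa$ is an acoustic scale, and $\mathrm{Acc}(w, w_r)$ is the phoneme accuracy of $w$ against the reference.

```latex
F_{\mathrm{MPE}}
  = \sum_{r=1}^{R}
    \frac{\sum_{w} p(O_r \mid w)^{\kappa}\, P(w)\, \mathrm{Acc}(w, w_r)}
         {\sum_{w'} p(O_r \mid w')^{\kappa}\, P(w')}
```

In MPE-HLDA the acoustic likelihoods $p(O_r \mid w)$ depend on the projection matrix $A$, so maximizing this function over $A$ is what ties the feature projection to phoneme errors.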

11 MPE-HLDA

12 MPE-HLDA
It can be shown that the derivative of (4) with respect to A is [equation not preserved in the transcript]

13 MPE-HLDA

14 MPE-HLDA
Therefore, Eq. (6) can be rewritten as [equation not preserved in the transcript; dimension annotations on the slide: 39×39, 39×162]

15 MPE-HLDA Implementation
In theory, the derivative of the MPE-HLDA objective function can be computed based on Eq. (12), via a single forward-backward pass over the training lattices. In practice, however, it is not possible to fit all the full covariance matrices in memory, so the computation is split into two steps:
–First, run a forward-backward pass over the training lattices to accumulate statistics.
–Second, use these statistics together with the full covariance matrices to synthesize the derivative.
The paper used gradient descent to update the projection matrix.

16 MPE-HLDA Overview

17 MPE-HLDA Overview

18 Implementation
(Symbols on this slide were lost in extraction; gaps are marked [...].)
1. Initialize the feature projection matrix [...] by LDA or HLDA, and the MPE-HLDA model [...]
2. Set [...]
3. Compute covariance statistics in the original feature space
–(a) Do a maximum likelihood update of the MPE-HLDA model in the feature space defined by [...]
–(b) Do single-pass retraining using [...] to generate [...] and [...] in the original feature space
4. Optimize the feature projection matrix:
–(a) Set [...]
–(b) Project [...] and [...] using [...] to get the model in the reduced subspace
–(c) Run an F-B pass on the lattices using [...] to compute [...] and [...]
–(d) Use [...], [...] and the statistics from 4(c) to compute the MPE derivative
–(e) Update [...] to [...] using gradient descent
–(f) Set [...]; go to 4(b) unless converged
5. Optionally, set [...] and go to 3
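The inner loop of step 4 can be sketched as follows. This is a rough illustration under stated assumptions, not the presenters' implementation. It assumes the reduced-space model is obtained by projecting each Gaussian's mean as A μ and keeping only the diagonal of A Σ Aᵀ. The real MPE derivative from steps 4(c) and 4(d) is replaced by a toy gradient with a known optimum `A_target`. The step size, iteration cap, and convergence threshold are all hypothetical.

```python
import numpy as np

def project_gaussians(A, means, covs):
    """Project full-covariance Gaussians into the subspace defined by A.

    A: (p, n), means: (M, n), covs: (M, n, n).
    Returns projected means (M, p) and diagonal variances (M, p).
    """
    proj_means = means @ A.T
    # diag(A Sigma A^T) per Gaussian, without forming each p x p matrix
    proj_vars = np.einsum("pi,mij,pj->mp", A, covs, A)
    return proj_means, proj_vars

def toy_gradient(A, A_target):
    """Gradient of the toy objective -||A - A_target||^2 (placeholder only)."""
    return -2.0 * (A - A_target)

rng = np.random.default_rng(0)
n, p, M = 6, 2, 3
means = rng.normal(size=(M, n))
B = rng.normal(size=(M, n, n))
covs = B @ B.transpose(0, 2, 1) + np.eye(n)   # symmetric positive definite
A = rng.normal(size=(p, n))                    # stands in for the LDA/HLDA init
A_target = rng.normal(size=(p, n))             # hypothetical optimum of the toy objective

for iteration in range(200):
    proj_means, proj_vars = project_gaussians(A, means, covs)  # step 4(b)
    grad = toy_gradient(A, A_target)   # stand-in for steps 4(c)-(d)
    A_new = A + 0.1 * grad             # step 4(e): gradient step on A
    converged = np.linalg.norm(A_new - A) < 1e-8
    A = A_new                          # step 4(f)
    if converged:
        break
```

Keeping the full covariances in the original space and re-projecting them on every iteration is what lets the projection matrix change without retraining the Gaussians from scratch.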

19 Experiments
–DARPA EARS research project
–CTS: 800/2300 hrs for ML training, 370 hrs of held-out data for MPE-HLDA training
–BN: 600 hrs from Hub4 and TDT for ML training, 330 hrs of held-out data for MPE-HLDA estimation
–Features: PLP (15 dim) with 1st, 2nd, and 3rd derivative coefficients (60 dim in total)
–EARS 2003 Evaluation test set
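The 60-dimensional input arises from stacking the 15 static PLP coefficients with their first, second, and third time derivatives (15 × 4 = 60). The sketch below shows that stacking on random placeholder frames. It uses simple central differences via `np.gradient`, which is an assumption. The presentation does not say how the derivatives were computed, and speech front ends often use a regression over a window of frames instead.

```python
import numpy as np

def add_derivatives(static, order=3):
    """Stack static features (T, d) with their 1st..order-th time derivatives."""
    blocks = [static]
    for _ in range(order):
        # Differentiate the previous block along the time axis
        blocks.append(np.gradient(blocks[-1], axis=0))
    return np.hstack(blocks)

# Placeholder for 100 frames of 15-dimensional PLP features
plp = np.random.default_rng(0).normal(size=(100, 15))
features = add_derivatives(plp)  # shape (100, 60)
```

A projection such as LDA, HLDA, or MPE-HLDA is then applied to each 60-dimensional frame to obtain the lower-dimensional features used by the recognizer.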

20 Experiments

21 Experiments

22 Conclusions
We have taken a first look at a new feature analysis method, MPE-HLDA. The results show that it is effective in reducing recognition error and that it is more robust than other commonly used analysis methods such as LDA and HLDA.

