Minimum Phoneme Error Based Heteroscedastic Linear Discriminant Analysis For Speech Recognition Bing Zhang and Spyros Matsoukas, BBN Technologies, 50 Moulton.


1 Minimum Phoneme Error Based Heteroscedastic Linear Discriminant Analysis For Speech Recognition. Bing Zhang and Spyros Matsoukas, BBN Technologies, 50 Moulton St., Cambridge. Reporter: Chang Chih Hao

2 Introduction
LDA and HLDA
–Better classification accuracy
–Some common limitations: neither assumes any prior knowledge of confusable hypotheses, and their objective functions do not directly relate to the word error rate (WER)
Minimum Phoneme Error (MPE)
–Minimizes phoneme errors in lattice-based training frameworks
–Because this criterion is closely related to WER, MPE-HLDA tends to be more robust than other projection methods, which makes it potentially better suited to a wider variety of features

3 MPE Objective Function MPE-HLDA model. MPE-HLDA aims at minimizing the expected number of phoneme errors introduced by the MPE-HLDA model in a given hypothesis lattice, or equivalently maximizing the function
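The slide's equation did not survive extraction. As a rough illustration of the idea only, the sketch below computes an expected phoneme accuracy as a posterior-weighted average over competing hypotheses. It assumes the lattice is replaced by a small explicit list of hypotheses, and the names `expected_accuracy`, `log_scores`, `accuracies` and `scale` are hypothetical. It is not the paper's lattice-based formulation.

```python
import math

def expected_accuracy(log_scores, accuracies, scale=1.0):
    """Posterior-weighted average accuracy over competing hypotheses.

    log_scores[i] : combined log score of hypothesis i (assumed given)
    accuracies[i] : phoneme accuracy of hypothesis i against the reference
    scale         : hypothetical score-scaling factor (an assumption here)
    """
    scaled = [scale * s for s in log_scores]
    # Subtract the maximum before exponentiating for numerical stability.
    m = max(scaled)
    weights = [math.exp(s - m) for s in scaled]
    total = sum(weights)
    posteriors = [w / total for w in weights]
    return sum(p * a for p, a in zip(posteriors, accuracies))

# Toy stand-in for a lattice: three hypotheses with made-up scores.
hyps_log_scores = [-10.0, -11.0, -13.0]
hyps_accuracy = [5.0, 4.0, 2.0]
value = expected_accuracy(hyps_log_scores, hyps_accuracy)
```

Raising the score of an accurate hypothesis relative to its competitors increases this value, which is the sense in which maximizing it corresponds to minimizing the expected number of phoneme errors.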

4 MPE Objective Function

5 MPE Objective Function It can be shown that the derivative of Eq. (4) with respect to A is

6 MPE Objective Function

7 MPE Objective Function Therefore, Eq. (6) can be rewritten as (matrix dimensions annotated on the slide: 39×39 and 39×162)

8 MPE-HLDA Implementation
In theory, the derivative of the MPE-HLDA objective function can be computed based on Eq. (12) via a single forward-backward pass over the training lattices. In practice, however, it is not possible to fit all the full covariance matrices in memory, so the computation is split into two steps:
–First, run a forward-backward pass over the training lattices to accumulate the required statistics.
–Second, use these statistics together with the full covariance matrices to synthesize the derivative.
The paper used gradient descent to update the projection matrix.
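The two-step pattern (accumulate compact statistics in one pass, then synthesize the derivative from them and take gradient steps) can be sketched on a toy problem. The sketch below deliberately swaps in a simple least-squares objective, the sum of ||A x - t||^2, as a stand-in. It is not the MPE-HLDA objective and involves no lattices or covariance matrices, and all function names are hypothetical.

```python
def zeros(rows, cols):
    return [[0.0] * cols for _ in range(rows)]

def outer(u, v):
    return [[ui * vj for vj in v] for ui in u]

def mat_add(X, Y):
    return [[a + b for a, b in zip(rx, ry)] for rx, ry in zip(X, Y)]

def mat_mul(X, Y):
    cols = list(zip(*Y))
    return [[sum(a * b for a, b in zip(row, col)) for col in cols] for row in X]

def accumulate_stats(samples, p, n):
    """Step 1: one pass over the data, keeping only compact statistics."""
    Sxx = zeros(n, n)  # sum of x x^T
    Stx = zeros(p, n)  # sum of t x^T
    for x, t in samples:
        Sxx = mat_add(Sxx, outer(x, x))
        Stx = mat_add(Stx, outer(t, x))
    return Sxx, Stx

def gradient_from_stats(A, Sxx, Stx):
    """Step 2: derivative of sum ||A x - t||^2 w.r.t. A is 2 (A Sxx - Stx)."""
    ASxx = mat_mul(A, Sxx)
    return [[2.0 * (a - s) for a, s in zip(ra, rs)] for ra, rs in zip(ASxx, Stx)]

def loss(A, samples):
    total = 0.0
    for x, t in samples:
        y = [sum(a * xi for a, xi in zip(row, x)) for row in A]
        total += sum((yi - ti) ** 2 for yi, ti in zip(y, t))
    return total

# Made-up toy data: (input vector x, target vector t) pairs.
samples = [([1.0, 0.0, 1.0], [1.0, 0.0]),
           ([0.0, 1.0, 1.0], [0.0, 1.0]),
           ([1.0, 1.0, 0.0], [1.0, 1.0])]

A = zeros(2, 3)
Sxx, Stx = accumulate_stats(samples, p=2, n=3)
initial_loss = loss(A, samples)

step = 0.05  # hypothetical fixed step size
for _ in range(200):
    G = gradient_from_stats(A, Sxx, Stx)
    A = [[a - step * g for a, g in zip(ra, rg)] for ra, rg in zip(A, G)]
final_loss = loss(A, samples)
```

The statistics are gathered once and reused at every update, so the data itself is not revisited inside the gradient loop. That is the property the two-step split is meant to illustrate.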

9 MPE-HLDA Implementation

10 Experimental Framework
Global feature projection: A (p×n) · L (n×l) · input (l×1) → output (p×1)
–There is more useful information in longer contexts
–Reduces the computational cost

11 Experimentation Conversational Telephone Speech (CTS)
–2300 hours of training data: 800 hours for training the initial ML model, 1500 hours of held-out training data
–Lattice generation
–Discriminative training
–MPE-HLDA: only 370 hours
–Test sets: Eval03, Dev04

12 Experimentation Conversational Telephone Speech (CTS)
–Feature: frame-concatenated PLP cepstra
–15 frames, l = 225, n = 130, p = 60
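To make the dimensions concrete, the sketch below stacks 15 frames into a 225-dimensional vector and applies two projections of sizes n×l and p×n. The 15 coefficients per frame are inferred from l = 225 divided by 15 frames. The edge-frame padding, the random matrix values, and all function names are assumptions for illustration, not details taken from the paper.

```python
import random

FRAMES = 15           # context window, from the slide
DIM_PER_FRAME = 15    # inferred: l / FRAMES = 225 / 15
l, n, p = 225, 130, 60

def random_matrix(rows, cols, rng):
    # Placeholder values only; real projections would be estimated from data.
    return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]

def mat_vec(M, v):
    return [sum(m * x for m, x in zip(row, v)) for row in M]

def concatenate(frames, center, width=FRAMES):
    """Stack `width` frames centred on `center`, repeating edge frames
    at utterance boundaries (a padding choice assumed here)."""
    half = width // 2
    stacked = []
    for t in range(center - half, center + half + 1):
        t = min(max(t, 0), len(frames) - 1)
        stacked.extend(frames[t])
    return stacked

rng = random.Random(0)
# Made-up utterance: 40 frames of 15 cepstral coefficients each.
utterance = [[rng.gauss(0.0, 1.0) for _ in range(DIM_PER_FRAME)] for _ in range(40)]
L = random_matrix(n, l, rng)  # n x l
A = random_matrix(p, n, rng)  # p x n

x = concatenate(utterance, center=20)  # l x 1
z = mat_vec(L, x)                      # n x 1
y = mat_vec(A, z)                      # p x 1
```

Here `x`, `z` and `y` have lengths 225, 130 and 60 respectively, matching l, n and p on the slide.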

13 Experimentation

14 Broadcast News (BN)
–600 hours: training the initial model (Hub4 and TDT)
–330 hours: held-out data

15 Thanks
