
1 Presented by: Fang-Hui, Chu Automatic Speech Recognition Based on Weighted Minimum Classification Error Training Method Qiang Fu, Biing-Hwang Juang School of Electrical & Computer Engineering Georgia Institute of Technology ASRU 2007

2 Outline Introduction Weighted word error rate The minimum risk decision rule & weighted MCE method Training scenarios & weighting strategies in ASR Experimental results for weighted MCE Conclusion & future work

3 Review of Bayes decision theory A conditional loss for classifying an observation into a class event, and the resulting expected loss function If we impose the assumption that the error loss function is uniform, we obtain the maximum a posteriori (MAP) decision rule –It transforms the classifier design problem into a distribution estimation problem –This formulation has several limitations!
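The slide's equations did not survive the transcript. The standard forms from Bayes decision theory that it reviews are, with notation assumed here (C_i a class, X an observation, e(C_i, C_j) the cost of deciding C_i when C_j is true):

```latex
% Conditional loss of deciding class C_i given observation X:
R(C_i \mid X) = \sum_{j=1}^{M} e(C_i, C_j)\, P(C_j \mid X)
% Expected loss of a decision rule C(\cdot):
\mathcal{L} = \int R\big(C(X) \mid X\big)\, p(X)\, dX
% Under a uniform (0/1) error cost, minimizing the conditional
% loss reduces to the MAP decision rule:
C_{\mathrm{MAP}}(X) = \arg\max_i P(C_i \mid X)
```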

4 Introduction In a variety of ASR applications, some errors should be considered more critical than others in terms of the system objective –Keyword spotting systems, speech understanding systems, … –Differentiating the significance of recognition errors is necessary, and a non-uniform error cost function becomes appropriate –This transforms classifier design into an error cost minimization problem instead of a distribution estimation problem

5 An example for non-uniform error rate Here is an example of using a non-uniform error rate; the weighted word error rate (WWER) is calculated over the following transcriptions:
0 (reference): AT N. E. C. THE NEED FOR INTERNATIONAL MANAGERS WILL KEEP RISING
1: AT ANY SEE THE NEED FOR INTERNATIONAL MANAGERS WILL KEEP RISING
2: AT N. E. C. NEEDS FOR INTERNATIONAL MANAGER 'S WILL KEEP RISING
The two recognition results have the same word error rate when every word is weighted equally. But which is better?
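The WWER idea can be sketched as a weighted edit distance over words. The weight function and toy sentences below are illustrative assumptions; the paper derives its actual word weights from a significance measure not detailed in this transcript.

```python
# Hypothetical sketch of a weighted word error rate (WWER).
def wwer(ref, hyp, weight):
    """Weighted edit distance over words, normalized by the total
    reference weight. `weight` maps a word to its error cost."""
    n, m = len(ref), len(hyp)
    # dp[i][j] = minimum weighted cost aligning ref[:i] to hyp[:j]
    dp = [[0.0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        dp[i][0] = dp[i - 1][0] + weight(ref[i - 1])   # deletion costs
    for j in range(1, m + 1):
        dp[0][j] = dp[0][j - 1] + weight(hyp[j - 1])   # insertion costs
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            if ref[i - 1] == hyp[j - 1]:
                dp[i][j] = dp[i - 1][j - 1]            # match: no cost
            else:
                sub = dp[i - 1][j - 1] + weight(ref[i - 1])
                dele = dp[i - 1][j] + weight(ref[i - 1])
                ins = dp[i][j - 1] + weight(hyp[j - 1])
                dp[i][j] = min(sub, dele, ins)
    total = sum(weight(w) for w in ref)
    return dp[n][m] / total

# Toy example: uniform weights reduce WWER to the ordinary WER.
ref = "the need for managers".split()
hyp = "a need for manager".split()
print(wwer(ref, hyp, lambda w: 1.0))  # 2 errors / 4 ref words = 0.5
```

With non-uniform weights, two hypotheses with the same number of word errors can receive different WWER scores, which is exactly the point of the slide's example.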

6 An example for non-uniform error rate cont. An example of the weighted word error rate, with each word's error weight in parentheses (reference total weight: 35.502):
0 (reference): AT (2.317) N. (3.138) E. (3.135) C. (2.784) THE (1.275) NEED (3.675) FOR (2.027) INTERNATIONAL (3.259) MANAGERS (3.797) WILL (2.481) KEEP (3.689) RISING (3.925)
1: AT (2.317) ANY (3.038) SEE (3.503) THE (1.275) NEED (3.675) FOR (2.027) INTERNATIONAL (3.259) MANAGERS (3.797) WILL (2.481) KEEP (3.689) RISING (3.925)
2: AT (2.317) N. (3.138) E. (3.135) C. (2.784) NEEDS (3.966) FOR (2.027) INTERNATIONAL (3.259) MANAGER'S (3.719) WILL (2.481) KEEP (3.689) RISING (3.925)

7 The Minimum Risk decision rule The minimum risk (MR) decision rule involves a weighted combination of the a posteriori probabilities of all the classes
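The rule itself is not shown in the transcript; in its standard form (notation assumed: e(C_i, C_j) is the cost of deciding C_i when C_j is true) it is:

```latex
C_{\mathrm{MR}}(X) = \arg\min_i \sum_{j=1}^{M} e(C_i, C_j)\, P(C_j \mid X)
```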

8 A practical MR rule

9 A practical MR rule cont. We can prescribe a discriminant function for each class and define the practical decision rule for the recognizer accordingly; the alternative system loss then follows
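The slide's equations are missing from the transcript. A plausible rendering, with assumed notation (g_i a class discriminant, Λ the model parameters, e(·,·) the error cost), is:

```latex
% Decide the class whose discriminant score is largest:
C(X) = C_k, \quad k = \arg\max_i g_i(X; \Lambda)
% The alternative system loss is the expected error cost
% incurred by this practical decision rule:
L(\Lambda) = \mathbb{E}_X\!\left[ e\big(C(X),\, C_{\mathrm{true}}(X)\big) \right]
```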

10 A practical MR rule cont. An approximation then needs to be made to the summands

11 The weighted MCE method The objective function of the weighted MCE is
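The objective function itself did not survive the transcript. A hedged sketch following the standard MCE construction, where the symbols below (misclassification measure d_i, sigmoid smoothing ℓ, error cost ε) are assumptions rather than the paper's exact notation:

```latex
% Sigmoid smoothing of the misclassification measure d:
\ell(d) = \frac{1}{1 + e^{-\gamma d}}
% Weighted MCE scales each smoothed error by a non-uniform
% error cost \varepsilon_i(X_r) for token X_r of class C_i:
L_{\mathrm{W\text{-}MCE}}(\Lambda) = \sum_{r=1}^{R} \sum_{i=1}^{M}
  \varepsilon_i(X_r)\, \ell\big(d_i(X_r; \Lambda)\big)\,
  \mathbb{1}\!\left[X_r \in C_i\right]
```

Setting every ε to 1 recovers the conventional MCE objective, which is how the method generalizes uniform-cost training.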

12 Training Scenarios Intra-level training –The training and recognition decisions are on the same semantic level as the performance measure Inter-level training –The training and recognition decisions are on a different semantic level from the performance metric –Minimizing the cost of wrong recognition decisions does not directly optimize the recognizer's performance in terms of the evaluation metric –To alleviate this inconsistency, the error weighting strategy can be built in a cross-level fashion

13 Two types of error cost User-defined cost –Usually characterized by the system requirements, and relatively straightforward Data-defined cost –More complicated –Wrong decisions occur because the underlying data observation deviates from the distribution represented by the models –"Bad" data? Or "bad" models? –It is possible to measure the "reliability" of the errors by introducing data-defined weighting

14 Error weighting for intra-level training In the intra-level training situation, the system performance is directly measured by the loss of wrong recognition decisions We can absorb both types of the error weighting into the error cost function as one universal functional form The objective function for the weighted MCE could be written as :

15 Error weighting for inter-level training We need to use cross-level weighting in this case to break down the high level cost and impose the appropriate weights upon the low level models The user-defined weighting of the weighted MCE in the inter-level training can be written as :

16 Error weighting for inter-level training cont. The data-defined weighting of the weighted MCE in inter-level training can be written analogously A W-MCE objective function including both weighting functions under the inter-level training scenario can then be written

17 Weighted MCE & MPE/MWE method MPE/MWE is a training method with a weighted objective function that mimics training errors:

18 Weighted MCE & MPE/MWE method cont. Maximizing the original MPE/MWE objective function is equivalent to minimizing the modified objective function In summary, MPE/MWE builds an objective function that incorporates the non-uniform error cost of each training utterance –W-MCE and MPE/MWE are both rooted in Bayes decision theory, directed at the same aim of designing the optimal classifier that minimizes the non-uniform error cost
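The MPE/MWE objective referenced here is not shown in the transcript; its commonly cited form (notation assumed: O_r the r-th utterance, s a hypothesis, κ an acoustic scaling factor, A(s, s_r) the raw phone/word accuracy of s against the reference s_r) is:

```latex
F_{\mathrm{MPE}}(\Lambda) = \sum_{r=1}^{R}
  \frac{\sum_{s} p_{\Lambda}(O_r \mid s)^{\kappa}\, P(s)\, A(s, s_r)}
       {\sum_{s'} p_{\Lambda}(O_r \mid s')^{\kappa}\, P(s')}
```

Each utterance contributes the posterior-weighted average accuracy of its hypotheses, so maximizing F rewards models that concentrate probability on low-error hypotheses, mirroring the cost-weighted view taken by W-MCE.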

19 W-MCE implementation In our experiments, we assume that the weighting function only contains the data-defined weighting for simplicity

20 Experiments Database: WSJ0

