Boosting Training Scheme for Acoustic Modeling Rong Zhang and Alexander I. Rudnicky Language Technologies Institute, School of Computer Science Carnegie Mellon University

Boosting Training Scheme for Acoustic Modeling Rong Zhang and Alexander I. Rudnicky Language Technologies Institute, School of Computer Science Carnegie Mellon University

Reference Papers
[ICASSP 03] Improving The Performance of An LVCSR System Through Ensembles of Acoustic Models
[EUROSPEECH 03] Comparative Study of Boosting and Non-Boosting Training for Constructing Ensembles of Acoustic Models
[ICSLP 04] A Frame Level Boosting Training Scheme for Acoustic Modeling
[ICSLP 04] Optimizing Boosting with Discriminative Criteria
[ICSLP 04] Apply N-Best List Re-Ranking to Acoustic Model Combinations of Boosting Training
[EUROSPEECH 05] Investigations on Ensemble Based Semi-Supervised Acoustic Model Training
[ICSLP 06] Investigations of Issues for Using Multiple Acoustic Models to Improve Continuous Speech Recognition

Improving The Performance of An LVCSR System Through Ensembles of Acoustic Models ICASSP 2003 Rong Zhang and Alexander I. Rudnicky Language Technologies Institute, School of Computer Science Carnegie Mellon University

Introduction
An ensemble of classifiers is a collection of single classifiers
–The ensemble selects a hypothesis based on the vote of its components (plurality voting, majority voting, or weighted voting)
Bagging and Boosting are the two most successful algorithms for constructing ensembles
This paper describes work on applying ensembles of acoustic models to the problem of LVCSR
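As a small illustration of the voting step mentioned above, the sketch below implements plain plurality voting over the component classifiers' outputs. The function name and data layout are my own, not from the paper.

```python
from collections import Counter

def plurality_vote(hypotheses):
    """Return the hypothesis proposed by the largest number of component classifiers."""
    return Counter(hypotheses).most_common(1)[0][0]

# Example: three of four models agree on the same hypothesis
print(plurality_vote([
    "show me flights to boston",
    "show me flights to boston",
    "show me flights to austin",
    "show me flights to boston",
]))
```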

Bagging vs. Boosting
Bagging
–In each round, bagging randomly selects a number of examples from the original training set and produces a new single classifier based on the selected subset
–The final classifier is built by choosing the hypothesis best agreed on by the single classifiers
Boosting
–In boosting, the single classifiers are trained iteratively, so that hard-to-classify examples receive increasing emphasis
–A parameter that measures each classifier’s importance is determined with respect to its classification accuracy
–The final hypothesis is the weighted majority vote of the single classifiers
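For concreteness, the following sketch shows an AdaBoost-style boosting loop of the kind described above: a model is trained on weighted data, its weighted error determines an importance parameter, and misrecognized utterances get more weight in the next round. The callables `train_acoustic_model` and `recognize` are hypothetical placeholders, not the paper's actual training and decoding routines.

```python
import math

def boosting_rounds(utterances, references, num_rounds, train_acoustic_model, recognize):
    """AdaBoost-style training loop (illustrative sketch only).

    train_acoustic_model(utterances, weights) -> model   (hypothetical)
    recognize(model, utterance) -> hypothesis             (hypothetical)
    """
    n = len(utterances)
    weights = [1.0 / n] * n                     # start from a uniform distribution
    models, alphas = [], []

    for _ in range(num_rounds):
        model = train_acoustic_model(utterances, weights)
        # weighted error: total weight of utterances the new model gets wrong
        errors = [recognize(model, u) != ref for u, ref in zip(utterances, references)]
        eps = sum(w for w, e in zip(weights, errors) if e)
        if eps <= 0.0 or eps >= 0.5:
            break                                # perfect or no better than chance
        alpha = 0.5 * math.log((1 - eps) / eps)  # model importance

        # emphasize misrecognized utterances, de-emphasize correct ones
        weights = [w * math.exp(alpha if e else -alpha) for w, e in zip(weights, errors)]
        total = sum(weights)
        weights = [w / total for w in weights]

        models.append(model)
        alphas.append(alpha)

    return models, alphas
```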

Algorithms
The first algorithm is based on the intuition that an incorrectly recognized utterance should receive more attention in training
–For example, if the weight of an utterance is 2.6, we first add two copies of the utterance to the new training set, and then add a third copy with probability 0.6 (see the sketch below)
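A minimal sketch of that resampling rule: the integer part of the weight gives the number of guaranteed copies, and the fractional part gives the probability of one extra copy. The function name is mine, not the paper's.

```python
import math
import random

def resample_by_weight(utterances, weights, rng=random.Random(0)):
    """Duplicate each utterance according to its (possibly fractional) weight.

    A weight of 2.6 yields two guaranteed copies plus a third copy
    with probability 0.6, as in the example on the slide.
    """
    new_training_set = []
    for utt, w in zip(utterances, weights):
        copies = int(math.floor(w))
        new_training_set.extend([utt] * copies)
        if rng.random() < w - copies:        # fractional remainder
            new_training_set.append(utt)
    return new_training_set
```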

Algorithms
The exponential increase in the size of the training set is a severe problem for Algorithm 1
Algorithm 2 is proposed to address this problem

Algorithms
In Algorithms 1 and 2, there is no mechanism for measuring how important a model is relative to the others
–A good model should play a more important role than a bad one (see the weighted-vote sketch below)
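One way to realize this idea is a weighted vote over the hypotheses produced by the individual models, with each model's vote scaled by its importance (for instance, the alpha values from the boosting loop sketched earlier). This is an illustrative sketch, not necessarily the combination scheme evaluated in the paper.

```python
from collections import defaultdict

def weighted_vote(hypotheses, alphas):
    """Pick the hypothesis with the largest total model weight.

    hypotheses: one recognized string per model
    alphas:     importance weight of each model
    """
    scores = defaultdict(float)
    for hyp, alpha in zip(hypotheses, alphas):
        scores[hyp] += alpha
    return max(scores, key=scores.get)
```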

Experiments
Corpus: CMU Communicator system
Experimental results:

A Frame Level Boosting Training Scheme for Acoustic Modeling ICSLP 2004 Rong Zhang and Alexander I. Rudnicky Language Technologies Institute, School of Computer Science Carnegie Mellon University

Introduction
In the current Boosting algorithm, the utterance is the basic unit used for acoustic model training
Our analysis shows that there are two notable weaknesses in this setting
–First, the objective function of the current Boosting algorithm is designed to minimize utterance error instead of word error
–Second, in the current algorithm, an utterance is treated as a single unit for resampling
This paper proposes a frame level Boosting training scheme for acoustic modeling to address these two problems

Frame Level Boosting Training Scheme
The metric used in Boosting training is the frame level conditional probability, rather than the word level probability
Objective function: defined in terms of a pseudo-loss for frame t, which describes the degree of confusion of this frame for recognition (equations not reproduced in this transcript)

Frame Level Boosting Training Scheme
Training scheme:
–How to resample the frame level training data? Duplicate each frame a number of times determined by its weight, and concatenate the copies into a new utterance for acoustic model training (see the sketch below)
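A hedged sketch of that frame-level resampling step, mirroring the rounding rule used for utterances earlier: each frame is duplicated in proportion to its weight and the copies form a new training "utterance". The exact scheme in the paper may differ; names here are illustrative only.

```python
import math
import random

def resample_frames(frames, frame_weights, rng=random.Random(0)):
    """Build a new training utterance by duplicating frames according to their weights.

    frames:        list of acoustic feature vectors for one utterance
    frame_weights: one (possibly fractional) weight per frame
    """
    new_utterance = []
    for frame, w in zip(frames, frame_weights):
        copies = int(math.floor(w))
        if rng.random() < w - copies:        # fractional remainder gives one extra copy
            copies += 1
        new_utterance.extend([frame] * copies)
    return new_utterance
```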

Experiments
Corpus: CMU Communicator system
Experimental results:

Discussions
Some speculations on this outcome:
–1. The mismatch between the training criterion and the target in principle still exists in the new scheme
–2. The frame based resampling method depends exclusively on the weights calculated in the training process
–3. Forced alignment is used to determine the correct word for each frame
–4. A new training scheme considering both utterance and frame level recognition errors may be more suitable for accurate modeling