Maximum Likelihood Linear Regression for Speaker Adaptation of Continuous Density Hidden Markov Models C. J. Leggetter and P. C. Woodland Department of.

Maximum Likelihood Linear Regression for Speaker Adaptation of Continuous Density Hidden Markov Models C. J. Leggetter and P. C. Woodland Department of Engineering, University of Cambridge, Trumpington Street, Cambridge CB2 1PZ, U.K. Computer Speech and Language (1995) Present by Hsu Ting-Wei 2006.03.16

2 Introduction Speaker adaptation techniques fall into two main categories: –Speaker normalization The input speech is normalized to match the speaker that the system is trained to model –Model adaptation techniques The parameters of the model set are adjusted to improve the modeling of the new speaker MAP method –Only update the parameters of models which are observed in the adaptation data MLLR method (Maximum Likelihood Linear Regression) –All model states can be adapted even if no model-specific data is available Speaker HMM Models Say: “ Hello! ”

3 MLLR’s adaptation approach This method requires an initial speaker independent continuous density HMM system MLLR takes some adaptation data from a new speaker and updates the model mean parameters to maximize the likelihood of the adaptation data The other HMM parameters are not adapted since the main differences between speakers are assumed to be characterized by the means

4 MLLR’s adaptation approach (cont.) Consider the case of a continuous density HMM system with Gaussian output distributions. A particular distribution s,is characterized by a mean vector, and a covariance matrix Given a parameterized speech frame vector, the probability density of that vector being generated by distribution s is where n is the dimension of the observation vector S speech frame vector

5 MLLR’s adaptation approach (cont.) We use the following equation We can simply it where So the probability density function for the adapted system becomes n*(n+1) extended mean vector 要調適的分佈的 mean 值所串起的向量 offset = 1, include an offset in the regression offset = 0, ignore offsets 若調適語者的錄音環境與初始模型錄音環境不同時，可以加入的一項參數 [ 參考資料 ] Original.. (n+1)*1 transformation matrices (1)

6 MLLR’s adaptation approach (cont.) The transformation matrices are calculated to maximize the likelihood of the adaptation data The transformation matrices can be implemented using the forward–backward algorithm A more general approach is adopted in which the same transformations matrix is used for several distributions. If some of the distributions are not observed in the adaptation data, a transformation may still be applied (global transformation)

7 Estimation of MLLR regression matrices 1.Definition of auxiliary function objective function S speech frame vector E-step

8 Estimation of MLLR regression matrices (cont.) 2.Maximization of auxiliary function only related with mean (2) (3)

9 Estimation of MLLR regression matrices (cont.) 2.Maximization of auxiliary function (cont.) (4) expanding this term

10 Estimation of MLLR regression matrices (cont.) 2.Maximization of auxiliary function (cont.)

11 Estimation of MLLR regression matrices (cont.) 2.Maximization of auxiliary function (cont.) M-step <= 估測的 general form (5)

12 Estimation of MLLR regression matrices (cont.) 3.Re-estimation formula for tied regression matrices [(n+1)*1][1*(n+1)] =(n+1) *(n+1) 當調適語料不夠多時，可以將調適語料中相關性較大的狀態分為同一類，利用在同一類別中所收集到的語料來估測 Ws 。

13 Estimation of MLLR regression matrices (cont.) 3.Re-estimation formula for tied regression matrices (cont.) (7)is denoted by n*(n+1) matrix Y(7)is denoted by n*(n+1) matrix Z ?

14 Special cases of MLLR 1.Least squares regression (XX ’ )YX ’

15 Special cases of MLLR (cont.) 1.Least squares regression (cont.)

16 Special cases of MLLR (cont.) 2.Single variable linear regression

17 Special cases of MLLR (cont.) 2.Single variable linear regression (cont.) M-step

18 Defining regression classes When regression matrices are tied across mixture components, each matrix is associated with many mixture components. For the tied approach to be effective it is desirable to put all the mixture components which will use similar transforms into the same class. Two approaches for defining regression classes were considered: –Based on broad phonetic classes All mixture components in any model representing the same broad phonetic class (e.g. fricatives, nasals, etc.) were placed in the same regression class. –Based on clustering of mixture components The mixture components were compared using a likelihood measure and similar components placed in the same regression class.

19 Experiment: Full regression matrix V.S. Diagonal regression matrix SD SI diagonal full : a lot of parameters

20 Experiment: Full matrix using global regression class SD SI adapted

21 Experiment: Supervised v.s Unsupervised SD SI supervised unsupervised

22 Conclusion MLLR can be applied to continuous density HMMs with a large number of Gaussians and is effective with small amounts of adaptation data.

Maximum Likelihood Linear Regression for Speaker Adaptation of Continuous Density Hidden Markov Models C. J. Leggetter and P. C. Woodland Department of.

Similar presentations

Presentation on theme: "Maximum Likelihood Linear Regression for Speaker Adaptation of Continuous Density Hidden Markov Models C. J. Leggetter and P. C. Woodland Department of."— Presentation transcript:

Similar presentations

About project

Feedback

Log in

Auth with social network:

Maximum Likelihood Linear Regression for Speaker Adaptation of Continuous Density Hidden Markov Models C. J. Leggetter and P. C. Woodland Department of.

Similar presentations

Presentation on theme: "Maximum Likelihood Linear Regression for Speaker Adaptation of Continuous Density Hidden Markov Models C. J. Leggetter and P. C. Woodland Department of."— Presentation transcript:

Similar presentations

About project

Feedback