HIWIRE MEETING Nancy, July 6-7, 2006 José C. Segura, Ángel de la Torre

HIWIRE Meeting – Nancy, 6-7 June, 2006
Schedule
- Non-linear feature normalization for mobile platform
  - Integration scheme
  - Results and discussion
- Rapid speaker adaptation
  - Combination of adaptation at signal level and acoustic model level
  - Results and discussion
- Assessment of two non-linear techniques for feature normalization
  - Non-linear parametric equalization
  - Model based feature compensation (VTS)
- New improvements in robust VAD
  - Model based VAD

Non-linear Parametric Equalization
Feature normalization
- Motivation of PEQ:
  - Limitation of linear methods:
    - Cepstral Mean Normalization
    - Cepstral Mean and Variance Normalization
  - Limitation of non-linear methods (HEQ, OSEQ):
    - Speech/non-speech ratio
    - Estimation problems
- Parametric Equalization PEQ:
  - Two Gaussian Model (speech / non-speech)
  - Training of clean Gaussians; estimation of noisy Gaussians
  - Non-linear transformation: combination of two linear transformations (one for speech, one for non-speech)
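
The two-Gaussian idea can be sketched for a single cepstral coefficient as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: each class (speech, non-speech) is modelled as a 1-D Gaussian, the per-class linear transformation is assumed to be a mean-and-variance mapping from the noisy Gaussian onto the clean one, and the two linear outputs are assumed to be blended with the class posteriors given y. The function names, class labels, priors and all numeric values are hypothetical.

```python
import math

def gauss_pdf(y, mean, std):
    """Density of a 1-D Gaussian with the given mean and standard deviation."""
    z = (y - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))

def peq_transform(y, clean, noisy, priors):
    """Equalize one noisy feature value y (assumed form of the PEQ mapping).

    clean, noisy: dicts mapping class name -> (mean, std)
    priors:       dict mapping class name -> prior probability
    """
    # Class posteriors P(c|y), computed with the noisy Gaussians.
    lik = {c: priors[c] * gauss_pdf(y, *noisy[c]) for c in noisy}
    total = sum(lik.values())
    x_hat = 0.0
    for c in noisy:
        post = lik[c] / total
        mu_y, sd_y = noisy[c]
        mu_x, sd_x = clean[c]
        # Per-class linear map: noisy Gaussian -> clean Gaussian.
        x_hat += post * (mu_x + (sd_x / sd_y) * (y - mu_y))
    return x_hat

# Hypothetical parameters for one coefficient.
clean = {"speech": (10.0, 2.0), "nonspeech": (-2.0, 1.0)}
noisy = {"speech": (5.0, 1.0), "nonspeech": (0.0, 1.0)}
priors = {"speech": 0.5, "nonspeech": 0.5}
equalized = peq_transform(5.0, clean, noisy, priors)
```

Because the posteriors depend on y, the overall mapping is non-linear even though each branch is linear. When the noisy and clean Gaussians coincide, the mapping reduces to the identity.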

Non-linear Parametric Equalization
Aurora-2 results:
              Aver. WER   Relative improv.
  BASELINE    34.1 %       0.0 %
  OSEQ        17.5 %      48.6 %
  PEQ         18.6 %      45.3 %
Aurora-4 results:
              Aver. WER   Relative improv.
  BASELINE    45.6 %       0.0 %
  OSEQ        37.5 %      17.8 %
  PEQ         31.5 %      30.1 %

Non-linear Parametric Equalization
- Additional problem of non-linear transformations:
  - Once the transformation is estimated, it is an "instantaneous transformation"
  - Temporal correlations are not exploited
- Temporal Smoothing (TES):
  - Each equalized cepstrum is time-filtered with an ARMA filter that restores the autocorrelation of clean data

Non-linear Parametric Equalization
Aurora-2 results:
              No TES                   TES
              Aver. WER   Improv.     Aver. WER   Improv.
  BASELINE    34.1 %       0.0 %      31.6 %       6.5 %
  OSEQ        17.5 %      48.6 %      15.5 %      54.3 %
  PEQ         18.6 %      45.3 %      -           -
Aurora-4 results:
              No TES                   TES
              Aver. WER   Improv.     Aver. WER   Improv.
  BASELINE    45.6 %       0.0 %      43.4 %       4.9 %
  OSEQ        37.5 %      17.8 %      35.5 %      22.2 %
  PEQ         31.5 %      30.1 %      30.7 %      32.6 %

Model Based Feature Compensation (VTS)
- VTS feature normalization:
  - Performed in the log-FBE domain (prior to the DCT)
  - Based on a Gaussian mixture model trained with clean speech
  - Allows feature compensation and uncertainty estimation
- Summary of VTS (vector Taylor series approach):
  1. Given the noisy conditions, VTS provides a noisy Gaussian from each clean Gaussian
  2. The noisy Gaussian mixture model allows the computation of the probabilities P(k|y)
  3. An estimate of the clean speech x is then possible
  4. An estimate of the uncertainty is also possible

Model Based Feature Compensation (VTS)
Step 1: Estimation of a noisy Gaussian from a clean Gaussian:
  [equations not preserved in this transcript]
where the functions g_0, f_0 and h_0 are evaluated at the mean of the clean Gaussian and at the mean of the noise:
  [equations not preserved in this transcript]
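
As a rough illustration of Step 1, the sketch below applies the textbook first-order Taylor expansion of the log-domain mismatch function y = log(exp(x) + exp(n)) to one log-FBE channel with scalar (diagonal) variances. It is an assumption-laden sketch, not the slide's formulation: the slide's own definitions of g_0, f_0 and h_0 were not preserved, so `g0` and `f0` here are the usual zeroth-order shift and first-order derivative, and any second-order correction (presumably the role of h_0) is omitted. The function name and the example values are hypothetical.

```python
import math

def vts_noisy_gaussian(mu_x, var_x, mu_n, var_n):
    """First-order VTS for one log-FBE channel (assumed standard form).

    mu_x, var_x: mean and variance of the clean-speech Gaussian
    mu_n, var_n: mean and variance of the noise
    Returns the mean and variance of the corresponding noisy Gaussian.
    """
    d = mu_n - mu_x
    # Zeroth-order term: shift y - x evaluated at the expansion point.
    g0 = math.log1p(math.exp(d))
    # First-order term: dy/dn at the expansion point (dy/dx is 1 - f0).
    f0 = 1.0 / (1.0 + math.exp(-d))
    mu_y = mu_x + g0
    var_y = (1.0 - f0) ** 2 * var_x + f0 ** 2 * var_n
    return mu_y, var_y

# Hypothetical example: speech and noise at equal level (0 dB local SNR).
mu_y, var_y = vts_noisy_gaussian(0.0, 1.0, 0.0, 1.0)
```

At high SNR (mu_x far above mu_n) the noisy Gaussian tends to the clean one; at low SNR it tends to the noise Gaussian, which matches the expected masking behaviour in the log domain.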

Model Based Feature Compensation (VTS)
Step 2: Estimation of P(k|y):
  [equation not preserved in this transcript]
where: [symbol not preserved] is the k-th Gaussian evaluated at the noisy speech y, and P(k) is the a-priori probability of the Gaussian.
Step 3: Estimation of clean speech:
  [equation not preserved in this transcript]
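
Steps 2 and 3 can be sketched for a single channel as follows. Two assumptions are made because the slide's equations were not preserved: P(k|y) is taken to be the Bayes posterior built from the priors P(k) and the noisy Gaussians, and the clean-speech estimate is taken to be y minus the posterior-weighted mean shift of each Gaussian (a common zeroth-order choice in VTS compensation). All names and numbers are hypothetical.

```python
import math

def log_gauss(y, mean, var):
    """Log-density of a 1-D Gaussian with the given mean and variance."""
    return -0.5 * (math.log(2.0 * math.pi * var) + (y - mean) ** 2 / var)

def posteriors(y, priors, noisy_means, noisy_vars):
    """P(k|y) by Bayes' rule over the noisy Gaussian mixture."""
    logs = [math.log(p) + log_gauss(y, m, v)
            for p, m, v in zip(priors, noisy_means, noisy_vars)]
    top = max(logs)  # subtract the maximum for numerical stability
    weights = [math.exp(value - top) for value in logs]
    total = sum(weights)
    return [w / total for w in weights]

def estimate_clean(y, post, shifts):
    """Assumed estimator: remove the posterior-weighted mean shift.

    shifts[k] is the difference between the noisy and clean means of
    Gaussian k (for example, the zeroth-order VTS term).
    """
    return y - sum(p * s for p, s in zip(post, shifts))

# Hypothetical two-component mixture observed at y = 0.
post = posteriors(0.0, [0.5, 0.5], [-1.0, 1.0], [1.0, 1.0])
x_hat = estimate_clean(0.0, post, [1.0, 3.0])
```

Working with log-likelihoods avoids underflow when y lies far from every component.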

Model Based Feature Compensation (VTS)
Step 4: Estimation of uncertainty:
The uncertainty of the clean speech can be estimated as:
  [equation not preserved in this transcript]
and, from the estimation of the clean speech:
  [equation not preserved in this transcript]
Assuming small values of the variance of the noise:
  [equation not preserved in this transcript]
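
Since the slide's expressions are missing, the sketch below shows only the generic way an uncertainty can be attached to a mixture-based estimate: the variance of x given y under the posterior mixture, via the law of total variance. It is not necessarily the small-noise-variance approximation used on the slide. The per-Gaussian conditional means and variances are taken as inputs, and how they are obtained is left open; all names and values are hypothetical.

```python
def mixture_uncertainty(post, comp_means, comp_vars):
    """Mean and variance of x given y under a posterior mixture.

    post:       P(k|y) for each Gaussian k
    comp_means: assumed estimate of x given y and Gaussian k
    comp_vars:  assumed variance of x given y and Gaussian k
    """
    x_hat = sum(p * m for p, m in zip(post, comp_means))
    second_moment = sum(p * (v + m * m)
                        for p, m, v in zip(post, comp_means, comp_vars))
    # Var[x|y] = E[x^2|y] - E[x|y]^2
    return x_hat, second_moment - x_hat * x_hat

# Hypothetical example: two equally likely components.
x_hat, uncertainty = mixture_uncertainty([0.5, 0.5], [-1.0, 1.0], [1.0, 1.0])
```

The result combines the within-Gaussian variance with the spread between the per-Gaussian estimates, so it grows when the posterior is split between Gaussians that disagree.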

Model Based Feature Compensation (VTS)
Aurora-2 results:
                          Aver. WER   Relative improv.
  BASELINE                34.1 %       0.0 %
  VTS + MVN               14.0 %      58.9 %
  VTS + MVN + UNCERT.     13.5 %      60.0 %
Some considerations about VTS:
- Computational load
- Better than HEQ, PEQ, etc., but only valid for additive noise or channel distortion
- Estimation of noise is critical
- There are some approximations in the formulation
- Uncertainty: small improvement (insertions, substitutions, deletions)
- Alternative: model-based compensation based on numerical integration of pdfs

Model-based VAD
Fundamentals of model-based VAD:
- Gaussian mixture model in the log-FBE domain
- Gaussian mixture model trained with clean speech
- VTS provides a noisy version of the GMM
- From the noisy GMM, P(k|y) can be estimated for each observation y and each Gaussian k
- The a-priori probability P(V|k) of the k-th Gaussian being speech can be estimated from the training data
- Then, the probability P(V|y) of the noisy observation y being speech is given by:
  [equation not preserved in this transcript]
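
Given the quantities listed above, a natural reading is that P(V|y) is obtained by summing P(V|k) P(k|y) over the Gaussians k. The slide's own equation was not preserved, so this form is an assumption, as are the function names and the 0.5 decision threshold used below.

```python
def speech_probability(post, p_speech_given_k):
    """Assumed form: P(V|y) = sum over k of P(V|k) * P(k|y)."""
    return sum(pk * pv for pk, pv in zip(post, p_speech_given_k))

def is_speech(post, p_speech_given_k, threshold=0.5):
    """Frame-level decision; the threshold value is a hypothetical choice."""
    return speech_probability(post, p_speech_given_k) >= threshold

# Hypothetical frame: most posterior mass on a speech-like Gaussian.
frame_post = [0.2, 0.8]    # P(k|y) from the noisy GMM
speech_prior = [0.1, 0.9]  # P(V|k) estimated from training data
p_v = speech_probability(frame_post, speech_prior)
decision = is_speech(frame_post, speech_prior)
```

In this reading the decision depends only on which Gaussians explain the frame, not on its energy, which is consistent with the point made on the following slide.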

Model-based VAD
Some considerations about model-based VAD:
- VAD decision relies on a Gaussian mixture model trained with clean speech (based on speech events observed in the training database)
- Not based on energy
- Based on observations in the log-FBE domain
- VTS adapts the Gaussian mixture to noisy conditions: the performance of the VAD is expected to be stable for a wide range of SNRs
- Computational load

Model-based VAD
Model-based VAD for different SNRs:
  [figure not preserved in this transcript]

Model-based VAD
Comparison with other VADs: HR1 and HR0 evaluated for AURORA-2
  [two slides of figures, not preserved in this transcript]

Model-based VAD
Aurora-2 recognition results (WAcc):
              WF        WF+FD
  G.729       57.1 %    57.8 %
  AMR.1       66.3 %    65.0 %
  AMR.2       78.3 %    78.5 %
  AFE         75.3 %    79.0 %
  VTS-VAD     78.4 %    80.2 %
Baseline: 60.5 % (no VAD, no WF, no FD)