
1 Robust Speech Recognition
V. Barreaud, LORIA

2 Mismatch Between Training and Testing
- Mismatch influences scores
- Causes of mismatch
  - Speech variation
  - Inter-speaker variation

3 Robust Approaches
- Three categories
  - Noise-resistant features (speech var.)
  - Speech enhancement (speech var. + inter-speaker var.)
  - Model adaptation for noise (speech var. + inter-speaker var.)
[Figure: recognition system diagram. Labels: training, testing, features encoding, models, word sequence, Spk. A, Spk. B]

4 Contents
- Overview
  - Noise-resistant features
  - Speech enhancement
  - Model adaptation
- Stochastic matching
- Our current work

5 Noise-Resistant Features
- Acoustic representation
  - Emphasis on the less affected evidence
- Models inspired by the auditory system
  - Filter banks, loudness curve, lateral inhibition
- Slow variation removal
  - Cepstral mean normalization, time derivatives
- Linear discriminant analysis
  - Searches for the best parameterization
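The "slow variation removal" bullet can be illustrated with a minimal sketch, assuming cepstral frames are given as lists of floats. The function names `cepstral_mean_normalize` and `deltas` are hypothetical and do not come from the slides. The sketch shows that both operations cancel a constant offset added to every frame, which is how a fixed channel appears in the cepstral domain.

```python
def cepstral_mean_normalize(frames):
    """Subtract the per-coefficient mean over the utterance from every frame."""
    n = len(frames)
    dim = len(frames[0])
    mean = [sum(f[d] for f in frames) / n for d in range(dim)]
    return [[f[d] - mean[d] for d in range(dim)] for f in frames]


def deltas(frames):
    """Time derivative approximated by a centered difference.

    The first and last frames are replicated at the edges (an assumption).
    """
    last = len(frames) - 1
    out = []
    for t in range(len(frames)):
        prev = frames[max(t - 1, 0)]
        nxt = frames[min(t + 1, last)]
        out.append([(b - a) / 2.0 for a, b in zip(prev, nxt)])
    return out


# Toy cepstra and the same cepstra shifted by a constant offset.
clean = [[1.0, 2.0], [3.0, 6.0], [5.0, 1.0]]
offset = [0.5, -2.0]
shifted = [[c + o for c, o in zip(f, offset)] for f in clean]
```

Applying either function to `clean` and to `shifted` yields the same values up to rounding, so a constant mismatch between training and testing disappears from these features.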

6 Speech Enhancement
- Parameter mapping
  - Stereo data
  - Observation subspace
- Bayesian estimation
  - Stochastic modeling of speech and noise
- Template-based estimation
  - Restriction to a subspace
  - Output is noise free
  - Various templates and combination methods
- Spectral subtraction
  - Noise and speech uncorrelated
  - Slowly varying noise
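The two assumptions listed under spectral subtraction can be made concrete with a minimal sketch. If noise and speech are uncorrelated, their power spectra add, and if the noise varies slowly, it can be estimated once from frames assumed to contain no speech. The function names, the use of leading frames as the noise-only segment, and the flooring rule are all assumptions for illustration, not details from the slides.

```python
def estimate_noise(power_frames, n_noise_frames):
    """Average the first frames, assumed to contain noise only."""
    dim = len(power_frames[0])
    head = power_frames[:n_noise_frames]
    return [sum(f[k] for f in head) / len(head) for k in range(dim)]


def spectral_subtract(power_frames, noise_power, floor=0.01):
    """Subtract the noise power per frequency bin.

    Results below a fraction `floor` of the noisy power are clamped to it,
    so the output never becomes zero or negative.
    """
    return [
        [max(p - n, floor * p) for p, n in zip(frame, noise_power)]
        for frame in power_frames
    ]


# Two noise-only frames followed by one frame of speech plus noise.
frames = [[1.0, 1.0], [1.0, 1.0], [5.0, 3.0]]
noise = estimate_noise(frames, n_noise_frames=2)
enhanced = spectral_subtract(frames, noise)
```

The floor is a common practical safeguard: without it, bins where the noise estimate exceeds the observed power would go negative.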

7 Model Adaptation for Noise
- HMM decomposition or PMC
  - Viterbi algorithm searches in an N×M state HMM
  - Noise and speech recognized simultaneously
  - Complex noises recognized
- State-dependent Wiener filtering
  - Wiener filtering in the spectral domain faces non-stationarity
  - HMMs divide speech into quasi-stationary segments
  - Wiener filters specific to the state
- Discriminative training
  - Classical technique trains models independently
  - Error-corrective training
  - Minimum classification error training
- Training data contamination
  - Training set corrupted with noisy speech
  - Depends on the test environment
  - Lower discriminative scores

8 Stochastic Matching: Introduction
- General framework
- In feature space
- In model space

9 Stochastic Matching: General Framework
- HMM models Λ_X, trained in the training space X
- Y = {y_1, …, y_t}: observation in the testing space
- [Equation involving Y and W lost in extraction]

10 Stochastic Matching: In Feature Space
- Estimation step: auxiliary function
- Maximization step

11 Stochastic Matching: In Feature Space (2)
- Simple distortion function
- Computation of the simple bias
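The slide's equations did not survive extraction. As a hedged illustration of a "simple bias", the sketch below assumes the distortion is a constant additive offset, y_t = x_t + b, and replaces the HMM by a small set of diagonal-covariance Gaussians with equal priors. The bias update (a posterior-weighted, variance-normalized average of the residuals y_t - μ) is the form usually given for this case in the stochastic matching literature. All function names are hypothetical.

```python
import math


def log_gauss(x, mean, var):
    """Log density of a diagonal-covariance Gaussian."""
    return -0.5 * sum(
        math.log(2.0 * math.pi * v) + (xi - m) ** 2 / v
        for xi, m, v in zip(x, mean, var)
    )


def posteriors(x, means, variances):
    """Posterior weight of each Gaussian for one frame (equal priors assumed)."""
    logs = [log_gauss(x, m, v) for m, v in zip(means, variances)]
    top = max(logs)
    weights = [math.exp(l - top) for l in logs]
    total = sum(weights)
    return [w / total for w in weights]


def estimate_bias(frames, means, variances, n_iter=5):
    """EM estimate of an additive bias b such that y_t - b matches the models."""
    dim = len(frames[0])
    bias = [0.0] * dim
    for _ in range(n_iter):
        num = [0.0] * dim
        den = [0.0] * dim
        for y in frames:
            # Estimation step: weights computed on the compensated frame.
            z = [yi - bi for yi, bi in zip(y, bias)]
            gamma = posteriors(z, means, variances)
            # Statistics for the maximization step.
            for g, m, v in zip(gamma, means, variances):
                for i in range(dim):
                    num[i] += g * (y[i] - m[i]) / v[i]
                    den[i] += g / v[i]
        bias = [n / d for n, d in zip(num, den)]
    return bias


# Toy models and test frames that sit exactly 1.5 above the model means.
means = [[0.0], [10.0]]
variances = [[1.0], [1.0]]
frames = [[1.5], [11.5], [1.5], [11.5]]
bias = estimate_bias(frames, means, variances)
```

On this toy data the estimate converges to the injected offset of 1.5; with a real recognizer the weights would come from the HMM state and mixture posteriors rather than from a flat Gaussian set.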

12 Stochastic Matching: In Model Space
- Random additive bias sequence B = {b_1, …, b_t}, independent of speech: a stochastic process with mean μ_b and diagonal covariance Σ_b

13 On-Line Frame-Synchronous Noise Compensation
- Relies on the stochastic matching method
- Transformation parameter estimated along with the optimal path
- Uses forward probabilities
[Figure: sequence of observations y_2, y_3, y_4; bias computation b_1, b_2, b_3, b_4; transformed observations z_2, z_3, z_4, z_5 passed to recognition]

14 Theoretical Framework and Issue
- On-line, frame-synchronous
- Cascade of errors
1. Initialize the bias of the first frame: b_0 = 0
2. Compute [symbol lost in extraction] and then b
3. Transform the next frame with b
4. Go to the next frame
- Classical stochastic matching
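The four numbered steps can be sketched as a loop. This is a deliberately simplified illustration, not the method from the slides: the HMM and its forward probabilities are replaced by a hard choice of the closest diagonal Gaussian, and the bias is a running variance-weighted average of the residuals. The name `online_compensate` and both simplifications are assumptions.

```python
import math


def log_gauss(x, mean, var):
    """Log density of a diagonal-covariance Gaussian."""
    return -0.5 * sum(
        math.log(2.0 * math.pi * v) + (xi - m) ** 2 / v
        for xi, m, v in zip(x, mean, var)
    )


def online_compensate(frames, means, variances):
    """Compensate each frame with the bias estimated from the frames before it."""
    dim = len(frames[0])
    bias = [0.0] * dim  # Step 1: b_0 = 0.
    num = [0.0] * dim
    den = [0.0] * dim
    transformed = []
    for y in frames:  # Step 4: proceed frame by frame.
        # Step 3: transform the frame with the current bias.
        z = [yi - bi for yi, bi in zip(y, bias)]
        transformed.append(z)
        # Hard decision on the compensated frame (stand-in for the decoder).
        best = max(
            range(len(means)),
            key=lambda k: log_gauss(z, means[k], variances[k]),
        )
        # Step 2: update the statistics, then the bias.
        for i in range(dim):
            num[i] += (y[i] - means[best][i]) / variances[best][i]
            den[i] += 1.0 / variances[best][i]
        bias = [n / d for n, d in zip(num, den)]
    return transformed, bias


means = [[0.0], [10.0]]
variances = [[1.0], [1.0]]
frames = [[1.5], [11.5], [1.5], [11.5]]
transformed, bias = online_compensate(frames, means, variances)
```

The first frame is left uncompensated because no bias is available yet. The sketch also hints at the cascade of errors: each bias depends on earlier decisions, so a wrong early assignment feeds a wrong bias into every later frame.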

15 Viterbi Hypothesis vs Linear Combination
- The Viterbi hypothesis takes into account only the "most probable" state and Gaussian component.
- Linear combination
[Figure: states at times t and t+1]
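One possible reading of this contrast, sketched under assumptions: under the Viterbi hypothesis each frame contributes to the bias through a single Gaussian, whereas a linear combination weights every Gaussian by its posterior probability. The helper `component_weights` and the equal-prior Gaussian set are hypothetical stand-ins for the HMM states and forward probabilities mentioned in the slides.

```python
import math


def log_gauss(x, mean, var):
    """Log density of a diagonal-covariance Gaussian."""
    return -0.5 * sum(
        math.log(2.0 * math.pi * v) + (xi - m) ** 2 / v
        for xi, m, v in zip(x, mean, var)
    )


def component_weights(z, means, variances, hard=False):
    """Weight of each Gaussian for one frame.

    hard=True  keeps only the most probable component (Viterbi-like).
    hard=False returns normalized posteriors (linear combination).
    """
    logs = [log_gauss(z, m, v) for m, v in zip(means, variances)]
    if hard:
        best = max(range(len(logs)), key=logs.__getitem__)
        return [1.0 if k == best else 0.0 for k in range(len(logs))]
    top = max(logs)
    weights = [math.exp(l - top) for l in logs]
    total = sum(weights)
    return [w / total for w in weights]


means = [[0.0], [10.0]]
variances = [[25.0], [25.0]]
frame = [4.0]
hard_w = component_weights(frame, means, variances, hard=True)
soft_w = component_weights(frame, means, variances)
```

For an ambiguous frame such as this one, the hard weights commit entirely to the first Gaussian while the soft weights keep a substantial share on the second, which limits the damage of a wrong decision.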

16 Experiments
- Phone numbers in a running car
- Forced Align
  - Transcription + optimum path
- Free Align
  - Optimum path
- Wild Align
  - No data

17 Perspectives
- Error recovery problem
  - A forgetting process
  - A model of the distortion function
  - Environmental clues
- More elaborate transform

