1 A Tutorial on Bayesian Speech Feature Enhancement
Friedrich Faubel, SCALE Workshop, January 2010

2 I Motivation

3 Speech Recognition System Overview
A speech recognition system converts speech to text. It basically consists of two components:
Front End: extracts speech features from the audio signal
Decoder: finds the sentence (sequence of acoustic states) that is the most likely explanation for the observed sequence of speech features
Speech → Front End → Decoder → Text

4 Speech Feature Extraction: Windowing
The audio signal is cut into short, overlapping frames by sliding a window function along the waveform.

8 Speech Feature Extraction: Time-Frequency Analysis
Performing spectral analysis separately for each frame yields a time-frequency representation.

10 Speech Feature Extraction: Perceptual Representation
Emulation of the logarithmic frequency and intensity perception of the human auditory system: a Mel filter bank followed by logarithmic compression yields log Mel features.
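To make the front end concrete, here is a minimal log Mel feature extraction sketch in Python/NumPy. The sampling rate, frame length, shift, FFT size, and filter count are typical values assumed for illustration; they are not taken from the slides.

import numpy as np

def hz_to_mel(f):
    return 2595.0 * np.log10(1.0 + f / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def log_mel_features(signal, fs=16000, frame_len=400, shift=160, n_fft=512, n_mel=24):
    # 1. Windowing: cut the signal into short, overlapping frames
    window = np.hamming(frame_len)
    n_frames = 1 + (len(signal) - frame_len) // shift
    frames = np.stack([signal[t * shift : t * shift + frame_len] * window
                       for t in range(n_frames)])
    # 2. Time-frequency analysis: power spectrum of each frame
    power = np.abs(np.fft.rfft(frames, n_fft)) ** 2
    # 3. Perceptual representation: triangular Mel filters ...
    mel_pts = np.linspace(hz_to_mel(0.0), hz_to_mel(fs / 2.0), n_mel + 2)
    bins = np.floor((n_fft + 1) * mel_to_hz(mel_pts) / fs).astype(int)
    fbank = np.zeros((n_mel, n_fft // 2 + 1))
    for k in range(n_mel):
        lo, mid, hi = bins[k], bins[k + 1], bins[k + 2]
        fbank[k, lo:mid] = (np.arange(lo, mid) - lo) / max(mid - lo, 1)
        fbank[k, mid:hi] = (hi - np.arange(mid, hi)) / max(hi - mid, 1)
    # ... followed by logarithmic compression of the filter bank energies
    return np.log(power @ fbank.T + 1e-10)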

11 Background Noise
Background noise distorts the speech features.
Result: the features no longer match the features used during training.
Consequence: severely degraded recognition performance.

12 Overview of the Tutorial
I - Motivation
II - The effect of noise on speech features
III - Transforming probabilities
IV - The MMSE solution to speech feature enhancement
V - Model-based speech feature enhancement
VI - Experimental results
VII - Extensions

13 II The Effect of Noise: The Interaction Function

14 Interaction Function
Principle of superposition: signals are additive. Clean speech + noise = noisy speech.

15 Interaction Function
In the signal domain we have the following relationship between noisy speech y[t], clean speech s[t] and noise n[t]:
y[t] = s[t] + n[t]
After Fourier transformation, this becomes:
Y(f) = S(f) + N(f)
Taking the magnitude square on both sides, we get:
|Y(f)|² = |S(f)|² + |N(f)|² + 2·Re(S(f)·N*(f))

33 Interaction Function
Hence, in the power spectral domain we have:
|Y(f)|² = |S(f)|² + |N(f)|² + 2·|S(f)|·|N(f)|·cos θ(f)
where 2·|S(f)|·|N(f)|·cos θ(f) is the phase term and θ(f) is the relative phase between the speech and the noise signal.

37 Interaction Function
The relative phase between two waves describes their relative offset in time (delay).

38 Interaction Function
When two sound sources are present, the following can happen: constructive interference (amplification), destructive interference (attenuation), or complete cancellation.

39 Interaction Function
Since the relative phase is random, the phase term is zero in the average. Hence, in the power spectral domain we have, on average:
|Y(f)|² ≈ |S(f)|² + |N(f)|²

41 Interaction Function
In the log power spectral domain, with y, s and n denoting the log power spectra, that becomes:
y = log(exp(s) + exp(n)) = s + log(1 + exp(n − s))
(Acero, 1990)
But is that really right?

47 Interaction Function
Not exactly: the mean of a nonlinearly transformed random variable is not necessarily equal to the nonlinear transform of the random variable's mean, i.e. E[g(X)] ≠ g(E[X]) in general.
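A quick numeric check of this point, sketched in Python/NumPy; the Gaussian parameters for the log power domain are illustrative assumptions, not values from the slides.

import numpy as np

rng = np.random.default_rng(0)

def interact(s, n):
    # log power spectral interaction function, phase term neglected:
    # y = log(exp(s) + exp(n)) = s + log(1 + exp(n - s))
    return np.logaddexp(s, n)

# illustrative Gaussian log power distributions for speech and noise
s = rng.normal(2.0, 1.0, size=1_000_000)
n = rng.normal(0.0, 1.0, size=1_000_000)

print(np.mean(interact(s, n)))  # E[g(s, n)]: mean of the transformed variable
print(interact(2.0, 0.0))       # g(E[s], E[n]): transform of the means, noticeably different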

49 Interaction Function
Phase-averaged relationship between clean and noisy speech:

50 III Transforming Probabilities

51 Transforming Probabilities: Motivation
In the signal domain we have the relationship y[t] = s[t] + n[t]. In the log Mel domain that translates to the nonlinear interaction function:
y = s + log(1 + exp(n − s))

52 Transforming Probabilities: Motivation
(Figure: noisy speech power as a function of clean speech power and noise power.)

56 Transforming Probabilities: Motivation
The nonlinear transformation results in a non-Gaussian probability distribution for the noisy speech features.

57 Transforming Probabilities: Introduction
Transformation of a random variable x with probability density function p_x(x): the transformation y = g(x) maps each x to a y. Conversely, each y can be identified with x = g⁻¹(y).
Idea: use g⁻¹ to map the distribution of y to the distribution of x (change of variables). This yields the fundamental transformation law of probability:
p_y(y) = p_x(g⁻¹(y)) · |det J_{g⁻¹}(y)|
where the second factor is the absolute value of the Jacobian determinant of g⁻¹.
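As an illustration of the transformation law (not specific to speech), here is a small Python/NumPy sketch for y = exp(x) with x standard normal; in one dimension the Jacobian determinant is just the derivative d g⁻¹/dy = 1/y. The example function and parameters are assumptions for illustration.

import numpy as np

rng = np.random.default_rng(0)

# x ~ N(0, 1), y = g(x) = exp(x); g^{-1}(y) = log(y)
x = rng.normal(size=500_000)
y = np.exp(x)

def p_x(x):
    return np.exp(-0.5 * x**2) / np.sqrt(2 * np.pi)

def p_y(y):
    # fundamental transformation law: p_y(y) = p_x(g^{-1}(y)) * |d g^{-1}/dy|
    return p_x(np.log(y)) / y

# compare against a histogram of the transformed samples
hist, edges = np.histogram(y, bins=100, range=(0.01, 5.0), density=True)
centers = 0.5 * (edges[:-1] + edges[1:])
print(np.max(np.abs(hist - p_y(centers))))  # small for large sample sizes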

64 Transforming Probabilities: Monte Carlo
Idea: approximate the probability distribution by samples drawn from the distribution (a discrete probability mass approximating the pdf; samples can be drawn by inverting the cumulative distribution function). Then transform each sample; a histogram of the transformed samples approximates the transformed pdf.
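A minimal Monte Carlo sketch in Python/NumPy, again with assumed Gaussian parameters and the phase-neglected interaction function:

import numpy as np

rng = np.random.default_rng(1)

# draw samples from the clean speech and noise models (single Gaussians here)
s = rng.normal(2.0, 1.0, size=100_000)
n = rng.normal(0.0, 0.5, size=100_000)

# transform each sample through the interaction function
y = np.logaddexp(s, n)

# a histogram of the transformed samples approximates the transformed pdf
pdf, edges = np.histogram(y, bins=80, density=True)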

68 Transforming Probabilities: Local Linearization
Idea: locally linearize the interaction function around the means of speech and noise, using a first-order Taylor series expansion (the vector Taylor series approach; Moreno, 1996).
Note: a linear transformation of a Gaussian random variable results in a Gaussian random variable.
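A scalar sketch of this linearization in Python/NumPy: for the interaction function y = log(e^s + e^n), the partial derivatives at the expansion point are sigmoids of the mean difference. This is a per-band, diagonal-covariance sketch for illustration, not the full vector Taylor series implementation.

import numpy as np

def g(s, n):
    # interaction function in the log power domain
    return np.logaddexp(s, n)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def linearized_noisy_gaussian(mu_s, var_s, mu_n, var_n):
    # first-order Taylor expansion of y = g(s, n) around the means;
    # a linear map of Gaussians is Gaussian, so y stays Gaussian
    a = sigmoid(mu_s - mu_n)      # dg/ds at the expansion point
    b = sigmoid(mu_n - mu_s)      # dg/dn at the expansion point
    mu_y = g(mu_s, mu_n)
    var_y = a**2 * var_s + b**2 * var_n
    return mu_y, var_y

print(linearized_noisy_gaussian(2.0, 1.0, 0.0, 0.25))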

73 Transforming Probabilities: The Unscented Transform
Idea: as in Monte Carlo, select points, but in a deterministic fashion and in such a way that they capture the mean and covariance of the distribution. Transform the points through the nonlinearity, then re-estimate the parameters of the Gaussian distribution from the transformed points.
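A generic unscented transform sketch in Python/NumPy, using the standard Julier-Uhlmann sigma points; the weighting parameter kappa is an assumption, not specified on the slides.

import numpy as np

def unscented_transform(mu, cov, f, kappa=1.0):
    # select 2d + 1 sigma points that capture the mean and covariance
    d = len(mu)
    L = np.linalg.cholesky((d + kappa) * cov)
    points = np.vstack([mu, mu + L.T, mu - L.T])
    w = np.full(2 * d + 1, 1.0 / (2.0 * (d + kappa)))
    w[0] = kappa / (d + kappa)
    # transform the points through the nonlinearity f
    fy = np.array([np.atleast_1d(f(p)) for p in points])
    # re-estimate the parameters of the Gaussian distribution
    mu_y = w @ fy
    diff = fy - mu_y
    cov_y = (w[:, None] * diff).T @ diff
    return mu_y, cov_y

# example: propagate a 2-D Gaussian of (s, n) through the interaction function
mu = np.array([2.0, 0.0])
cov = np.array([[1.0, 0.0], [0.0, 0.25]])
print(unscented_transform(mu, cov, lambda x: np.logaddexp(x[0], x[1])))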

77 Transforming Probabilities: The Unscented Transform
(Figure: comparison of local linearization and the unscented transform.)

78 Transforming Probabilities: The Unscented Transform
The points selected by the unscented transform lie on lines through the center point. After the nonlinear transformation, the points might no longer lie on a line. Hence, we can measure the degree of nonlinearity as the average distance of each triple of points from a linear fit of those three points. This can be shown to be closely related to the R² measure used in linear regression.

89 Transforming Probabilities: The Unscented Transform
With a high degree of nonlinearity, the Gaussian fit does not represent the true transformed distribution well.

90 Transforming Probabilities: An Adaptive Level of Detail Approach
Idea: splitting a Gaussian into two Gaussian components decreases the covariance of each component and thereby the degree of nonlinearity each component is exposed to.

92 Transforming Probabilities: An Adaptive Level of Detail Approach
Algorithm, Adaptive Level of Detail Transform (ALoDT):
1. Start with one Gaussian g.
2. Transform that Gaussian with the UT.
3. Identify the Gaussian component with the highest degree of nonlinearity.
4. Split that component into two Gaussians g1, g2.
5. Transform g1 and g2 with the UT.
6. While #(Gaussians) < N: repeat from step 3.
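A sketch of the splitting step in Python/NumPy, using a common moment-preserving split along the principal axis; the exact splitting scheme used in the original work may differ. With alpha < 1, the shrunken covariance stays positive semidefinite, and the mixture of the two halves reproduces the mean and covariance of the original Gaussian.

import numpy as np

def split_gaussian(w, mu, cov, alpha=0.5):
    # split one Gaussian into two equally weighted components along its
    # principal axis, preserving the overall mean and covariance
    eigval, eigvec = np.linalg.eigh(cov)
    v = eigvec[:, -1]                          # principal axis
    offset = alpha * np.sqrt(eigval[-1]) * v
    cov_new = cov - np.outer(offset, offset)   # shrink along the split axis
    return [(w / 2, mu + offset, cov_new), (w / 2, mu - offset, cov_new)]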

93 Transforming Probabilities: An Adaptive Level of Detail Approach
(Figures: density approximation with the Adaptive Level of Detail Transform, for the plain unscented transform and for ALoDT-2, ALoDT-4, ALoDT-8, ALoDT-16 and ALoDT-32.)

99 Transforming Probabilities: An Adaptive Level of Detail Approach
Kullback-Leibler divergence (KLD) between the approximated and the true distribution (Monte Carlo with 10M samples), for the Adaptive Level of Detail Transform with N components:
N:    1      2      4      8      16     32
KLD:  0.190  0.078  0.025  0.017  0.007  0.004
From N = 1 to N = 32, the KLD decreases by a factor of about 48.

100 IV Speech Feature Enhancement: The MMSE Solution

101 Speech Feature Enhancement: The MMSE Solution
Idea: train the speech recognition system on clean speech, then try to map distorted features to clean speech features.
Systematic approach: derive an estimator for clean speech given noisy speech.

102 Speech Feature Enhancement: The MMSE Solution
Let ŝ(y) be an estimator for the clean speech s, given the noisy speech y. Then the expected mean square error introduced by using ŝ(y) instead of the true s is:
MSE(ŝ) = E[ ||s − ŝ(y)||² ]
Minimizing the MSE with respect to ŝ yields the optimal estimator with respect to the MMSE criterion, the conditional expectation:
ŝ_MMSE(y) = E[s | y] = ∫ s · p(s | y) ds
But how to obtain this distribution p(s | y)?

110 Speech Feature Enhancement: The MMSE Solution
Idea: assume that the joint distribution of S and Y is Gaussian (as in stereo-based stochastic mapping; Afify, 2007). Then the conditional distribution of S|Y is again Gaussian,
p(s | y) = N(s; μ_{S|Y}, Σ_{S|Y})
with conditional mean and covariance matrix
μ_{S|Y} = μ_S + Σ_{SY} Σ_{YY}⁻¹ (y − μ_Y)
Σ_{S|Y} = Σ_{SS} − Σ_{SY} Σ_{YY}⁻¹ Σ_{YS}
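These conditioning formulas translate directly into code; a minimal Python/NumPy sketch follows, with block means and covariances passed as arguments (the names are illustrative):

import numpy as np

def conditional_gaussian(mu_s, mu_y, cov_ss, cov_sy, cov_yy, y):
    # conditional distribution of S given Y = y for a jointly Gaussian (S, Y)
    gain = cov_sy @ np.linalg.inv(cov_yy)      # Sigma_SY Sigma_YY^{-1}
    mu_cond = mu_s + gain @ (y - mu_y)         # conditional mean
    cov_cond = cov_ss - gain @ cov_sy.T        # conditional covariance
    return mu_cond, cov_cond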

117 Speech Feature Enhancement: The MMSE Solution
Under the Gaussian assumption, this integral is easily obtained:
ŝ_MMSE(y) = E[s | y] = μ_{S|Y} = μ_S + Σ_{SY} Σ_{YY}⁻¹ (y − μ_Y)
This is exactly what you get with the vector Taylor series approach (Moreno, 1996).
Problem: speech is known to be multimodal.

121 Speech Feature Enhancement: The MMSE Solution
To account for multimodality, model clean speech with a Gaussian mixture and introduce the index k of the mixture component as a hidden variable. Then the MMSE estimator can be rewritten as:
ŝ_MMSE(y) = ∫ s · p(s | y) ds
          = ∫ s · Σ_k p(s, k | y) ds
          = Σ_k ∫ s · p(k | y) p(s | k, y) ds     (pull the sum out of the integral)
          = Σ_k p(k | y) ∫ s · p(s | k, y) ds     (p(k | y) is independent of s)
Here p(k | y) is the probability that the clean speech originated from the k-th Gaussian, given the noisy speech spectrum y; it follows from the joint distribution via Bayes' theorem:
p(k | y) = p(y | k) p(k) / Σ_j p(y | j) p(j)
The inner integral ∫ s · p(s | k, y) ds = E[s | k, y] is the clean speech estimate of the k-th Gaussian.
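Putting the pieces together, here is a sketch of the resulting GMM-based MMSE estimate in Python: per-component noisy speech likelihoods weight the per-component clean speech estimates. All parameter names are illustrative, and the per-component estimates E[s | k, y] are assumed to have been computed already (e.g. with the conditional Gaussian formulas above).

import numpy as np
from scipy.stats import multivariate_normal

def mmse_estimate(y, weights, mu_y_k, cov_y_k, s_hat_k):
    # GMM-based MMSE estimate: posterior-weighted sum of per-component
    # clean speech estimates s_hat_k[k] = E[s | k, y]
    lik = np.array([multivariate_normal.pdf(y, mu_y_k[k], cov_y_k[k])
                    for k in range(len(weights))])
    post = weights * lik
    post /= post.sum()                 # p(k | y) via Bayes' theorem
    return sum(post[k] * s_hat_k[k] for k in range(len(weights)))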

136 V Model-Based Speech Feature Enhancement

137 Model-Based Speech Feature Enhancement
The distribution of clean speech is modeled as a Gaussian mixture; the noise is modeled as a single Gaussian. The presence of noise changes the clean speech distribution according to the interaction function. The joint distribution of clean and noisy speech is then constructed based on this model.

148 Model-Based Speech Feature Enhancement
Noise estimation: find the noise distribution (mean and covariance of the noise) that is the most likely explanation for the observed noisy speech features.
Problem: the observations also depend on speech!

151 Model-Based Speech Feature Enhancement
Since the mixture component from which the clean speech originated is a hidden variable, the Expectation Maximization algorithm is used (Rose, 1994; Moreno, 1996).

156 Model-Based Speech Feature Enhancement
Expectation step: construct the joint distribution of clean and noisy speech using the current noise parameter estimate; then calculate the posterior p(k | y_t) for each frame t and mixture component k.
Maximization step: re-estimate the noise parameters by accumulating statistics of the instantaneous noise estimates E[n | k, y_t] for each possible k, weighted by the probability p(k | y_t) that the clean speech originated from that Gaussian.
But how to obtain the required distribution p(n | k, y)? We have the joint distribution of noise and noisy speech; what we need is the conditional. But that is just the conditional Gaussian distribution, with conditional mean and covariance computed exactly as for p(s | y) above.
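A sketch of the M-step accumulation in Python/NumPy; the array shapes are assumptions for illustration (T frames, K mixture components, D feature dimensions):

import numpy as np

def m_step_noise_mean(posteriors, inst_noise_means):
    # posteriors: (T, K) array of p(k | y_t)
    # inst_noise_means: (T, K, D) array of instantaneous estimates E[n | k, y_t]
    weighted = posteriors[:, :, None] * inst_noise_means
    # accumulate over frames and components; the posteriors sum to T
    return weighted.sum(axis=(0, 1)) / posteriors.sum()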

170 VI Experimental Results

171 Experimental Results: Speech Recognition Experiments
- clean speech from the MC-WSJ-AV corpus
- noise from the NOISEX-92 database (artificially added)
- MFCC with 13 components, stacking of 15 frames, LDA
- cepstral mean and variance normalization
- 1743 acoustical states; Gaussians

172 Experimental Results: WER, destroyer engine noise

173 Experimental Results: WER, factory noise

174 VII Extensions

175 Extensions
Sequential noise estimation:
- Sequential expectation maximization (SEM), Kim, 1998
- Interacting Multiple Model (IMM) Kalman filter, Kim, 1999
- Particle filter, Yao, 2001
Improve speech recognition through:
- Combination with Joint Uncertainty Decoding, Shinohara, 2008
- Combination with bounded conditional mean imputation?


Download ppt "A Tutorial on Bayesian Speech Feature Enhancement"

Similar presentations


Ads by Google