Yi-zhang Cai, Jeih-weih Hung, 2012/08/17. Presenter: 汪逸婷


1 Yi-zhang Cai, Jeih-weih Hung, 2012/08/17. Presenter: 汪逸婷

2
• Introduction
• Background methods
• Presented methods with NMF
• Experimental Results
• Conclusion

3
• The effect of noise in speech signals
  ◦ This thesis focuses on developing new algorithms to reduce the noise effect in speech recognition
[Figure: clean speech is distorted by the channel effect (convolutional noise) and by background noise (additive noise), yielding noisy speech]

4
• Noise causes a serious mismatch in the modulation spectrum of speech feature streams
  ◦ We try to normalize the modulation spectra under different SNR cases.
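The slides do not define the modulation spectrum. A minimal sketch, assuming the usual definition (the DFT of one feature coefficient tracked over time), is given here. The function names `modulation_spectrum` and `rebuild_stream` are hypothetical and not taken from the presentation.

```python
import numpy as np

def modulation_spectrum(feature_stream):
    """Return (magnitude, phase) of the DFT taken along the time axis.

    feature_stream: 1-D array holding one feature coefficient per frame.
    """
    spectrum = np.fft.rfft(np.asarray(feature_stream, dtype=float))
    return np.abs(spectrum), np.angle(spectrum)

def rebuild_stream(magnitude, phase, n_frames):
    """Rebuild a feature stream from a (possibly normalized) magnitude
    spectrum, reusing the original phase (an assumption of this sketch)."""
    return np.fft.irfft(magnitude * np.exp(1j * phase), n=n_frames)
```

Under this assumption, normalization would modify only the magnitude and then call `rebuild_stream` with the untouched phase.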

5
• Nonnegative matrix factorization (NMF) is a novel method for processing the modulation spectrum.
• By using the following criterion: [equation not preserved in the transcript]
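The criterion on this slide did not survive in the transcript. One common NMF criterion, assumed here and not confirmed by the slides, is the Euclidean (Frobenius) cost, where V is the nonnegative data matrix and W, H are the basis and encoding matrices:

```latex
\min_{W \ge 0,\; H \ge 0} \; \lVert V - WH \rVert_F^2
  = \sum_{i,j} \bigl( V_{ij} - (WH)_{ij} \bigr)^2
```

Other costs, such as a generalized Kullback-Leibler divergence, are also widely used, so the authors' actual choice may differ.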

6
• In general, the two nonnegative matrices W and H are obtained in an iterative manner
  ◦ With an initial guess of W and H, the following multiplicative updating rules are employed to reach a local minimum of the cost function: [update equations not preserved in the transcript]
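The update equations are missing from the transcript. As an illustration only, the sketch below uses the standard multiplicative updates for the Euclidean cost, which is an assumption about what the slide showed. The function name `nmf`, the random initialization, and the `eps` guard against division by zero are all choices of this sketch.

```python
import numpy as np

def nmf(V, rank, n_iter=200, eps=1e-9, seed=0):
    """Factor a nonnegative matrix V (n x m) as W (n x rank) @ H (rank x m)."""
    rng = np.random.default_rng(seed)
    n, m = V.shape
    # Initial guess: strictly positive random matrices.
    W = rng.random((n, rank)) + eps
    H = rng.random((rank, m)) + eps
    for _ in range(n_iter):
        # Multiplicative updates keep every entry nonnegative.
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        W *= (V @ H.T) / (W @ H @ H.T + eps)
    return W, H
```

Because each update only multiplies by a nonnegative ratio, W and H stay nonnegative without any explicit clipping.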

7
• Conventional NMF for dealing with the modulation spectrum is relatively complicated:
  ◦ It uses an iterative approach to find the best possible encoding vector h (given that the basis matrix W is fixed):
    ▪ Iteration: [equation not preserved in the transcript]
    ▪ Termination: [condition not preserved in the transcript]
  ◦ It processes the entire modulation frequency band of speech features
    ▪ Only the first-half, low-frequency part is important for speech recognition
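The iteration and termination rules for this step are not preserved. The sketch below assumes the Euclidean multiplicative update for h with W held fixed, and stops when the relative change in h falls below a tolerance. Both the update form and the stopping rule are assumptions, and `encode` is a hypothetical name.

```python
import numpy as np

def encode(v, W, n_iter=100, tol=1e-6, eps=1e-9):
    """Find a nonnegative encoding h such that W @ h approximates v."""
    h = np.ones(W.shape[1])
    WtV = W.T @ v          # fixed across iterations because W is fixed
    WtW = W.T @ W
    for _ in range(n_iter):
        h_new = h * WtV / (WtW @ h + eps)
        # Assumed termination rule: small relative change in h.
        converged = np.linalg.norm(h_new - h) <= tol * (np.linalg.norm(h) + eps)
        h = h_new
        if converged:
            break
    return h
```

The normalized magnitude spectrum would then be `W @ encode(v, W)`. The loop runs once per feature stream, which is the per-utterance cost the next slide aims to reduce.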

8
• Two ways to reduce the complexity
  ◦ Orthogonal projection
    ▪ Find the orthogonal basis B for the basis matrix W, which can be done off-line
    ▪ Obtain the new modulation spectrum by projecting the old modulation spectrum onto the column space of B
  ◦ Updating the low-frequency modulation spectrum
    ▪ Reducing the computation while keeping the effect of enhancement in modulation spectrum
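Both ideas can be sketched in a few lines. The slides do not say how B is computed, so a reduced QR decomposition is assumed here. The names `orthonormal_basis`, `project`, and `update_low_band` are hypothetical, and `B_low` is assumed to come from a basis matrix trained on the low sub-band only.

```python
import numpy as np

def orthonormal_basis(W):
    """Off-line step: orthonormal basis B for the column space of W
    (assumes W has full column rank)."""
    B, _ = np.linalg.qr(W)  # reduced QR: B has the same shape as W
    return B

def project(v, B):
    """On-line step: orthogonal projection of v onto the column space of B."""
    return B @ (B.T @ v)

def update_low_band(magnitude, B_low, n_low):
    """Replace only the first n_low modulation-frequency points by their
    projection; the higher-frequency points are left unchanged."""
    out = np.array(magnitude, dtype=float)
    out[:n_low] = project(out[:n_low], B_low)
    return out
```

The projection is a single matrix-vector product pair with no iteration. It carries no nonnegativity constraint, so the output can contain negative entries in principle.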

11
• Experimental setup
  ◦ Aurora-2 database
    ▪ Clean-condition training
    ▪ Noise environments:
      Test set A: subway, babble, car, and exhibition noises
      Test set B: restaurant, street, airport, and train station noises
      Test set C: MIRS subway and MIRS street noises
    ▪ SNRs: clean, 20 dB, 15 dB, 10 dB, 5 dB, 0 dB, -5 dB
  ◦ HMM for each digit: 16 states and 20 Gaussian mixtures per state

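12
• The iteration function: [equation not preserved in the transcript]
• The projection function: [equation not preserved in the transcript]
• The computational complexity (for a feature sequence): [table not preserved in the transcript]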

14
• Occurrence of negative spectral magnitudes (averaged over a feature sequence)
  ◦ The probability of producing negative magnitudes in NMF(p,f) and PCA(p,f) is very small, even though they have no nonnegativity constraints

15
• The recognition accuracy (%) of each method for different numbers of frequency points forming the low sub-band

17
• The two proposed schemes substantially reduce the computational complexity of NMF
• Using the orthogonal projection in NMF further improves the recognition accuracy
• Normalizing only the low sub-band gives results very similar to full-band normalization

