Robust Speech Feature Decorrelated and Liftered Filter-Bank Energies (DLFBE) Proposed by K.K. Paliwal, in EuroSpeech 99.


1 Robust Speech Feature Decorrelated and Liftered Filter-Bank Energies (DLFBE) Proposed by K.K. Paliwal, in EuroSpeech 99

2 DLFBE --- Preliminary
* MFCC is very successful in speech recognition.
* MFCCs are computed from the speech signal in three steps:
  1. Compute the FFT power spectrum of the speech signal.
  2. Apply a Mel-spaced filter-bank to the power spectrum to get N energies (N = 20~60).
  3. Compute the discrete cosine transform (DCT) of the log filter-bank energies to get uncorrelated MFCCs (M = 10).
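The three steps on this slide can be sketched in plain Python. This is only an illustrative sketch, not the setup used in the paper: the triangular filter shape, the mel formula `2595 * log10(1 + f / 700)`, the 8 kHz sampling rate, the naive DFT (used instead of an FFT to stay dependency-free) and all function names are assumptions introduced here.

```python
import cmath
import math

def power_spectrum(frame):
    """Step 1: power spectrum of one frame (naive DFT, bins 0..n/2)."""
    n = len(frame)
    spec = []
    for k in range(n // 2 + 1):
        s = sum(x * cmath.exp(-2j * math.pi * k * t / n)
                for t, x in enumerate(frame))
        spec.append(abs(s) ** 2)
    return spec

def hz_to_mel(f):
    return 2595.0 * math.log10(1.0 + f / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def mel_filterbank_energies(spec, sample_rate, n_filters):
    """Step 2: N energies from triangular filters spaced evenly in mel."""
    nyquist = sample_rate / 2.0
    n_bins = len(spec)
    mel_max = hz_to_mel(nyquist)
    # n_filters + 2 edge frequencies, evenly spaced on the mel scale
    edges = [mel_to_hz(mel_max * i / (n_filters + 1))
             for i in range(n_filters + 2)]
    energies = []
    for j in range(1, n_filters + 1):
        lo, mid, hi = edges[j - 1], edges[j], edges[j + 1]
        e = 0.0
        for k, p in enumerate(spec):
            f = k * nyquist / (n_bins - 1)
            if lo < f <= mid:
                e += p * (f - lo) / (mid - lo)   # rising slope
            elif mid < f < hi:
                e += p * (hi - f) / (hi - mid)   # falling slope
        energies.append(e)
    return energies

def mfcc_from_energies(energies, n_ceps):
    """Step 3: DCT of the log energies, keeping c_1..c_M."""
    n = len(energies)
    log_e = [math.log(e + 1e-12) for e in energies]  # small floor avoids log(0)
    return [sum(le * math.cos(math.pi * i * (k + 0.5) / n)
                for k, le in enumerate(log_e))
            for i in range(1, n_ceps + 1)]

# Hypothetical test frame: two sinusoids sampled at 8 kHz.
sample_rate = 8000
frame = [math.sin(2 * math.pi * 1000 * t / sample_rate)
         + 0.5 * math.sin(2 * math.pi * 2500 * t / sample_rate)
         for t in range(256)]
spec = power_spectrum(frame)
fbe = mel_filterbank_energies(spec, sample_rate, n_filters=20)  # N = 20
mfcc = mfcc_from_energies(fbe, n_ceps=10)                       # M = 10
print(len(spec), len(fbe), len(mfcc))
```

The intermediate list `fbe` holds the filter-bank energies (FBEs) that the DLFBE feature works on directly, without the final DCT step.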

3 DLFBE --- Motivation
* MFCC has two drawbacks:
  1. It does not have any physical interpretation.
  2. Liftering of cepstral coefficients has no effect in modern speech recognition (discussed later).
* The two problems of the FBEs used in the 1950s, 60s and 70s (i.e., their number and their correlation) can be solved today.

4 Liftering --- What and How
* Liftering is the reweighting of cepstral coefficients, used in the DTW framework of speech recognition.
* The dissimilarity between the test vector and the mean vector is measured by the Euclidean distance.

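5 Liftering --- What and How (cont’d)
* Each i-th cepstral coefficient is multiplied by its corresponding liftering coefficient; the set of liftering coefficients is the lifter.
* The liftered distance can also be written in a more general form.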

6 Liftering --- What and How (cont’d)

7 The types of lifters are listed below:
  1. Linear lifter
  2. Statistical lifter
  3. Sinusoidal lifter
  4. Exponential lifter
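The four lifter types can be sketched as weight sequences applied inside a weighted Euclidean distance. The slide formulas are not preserved in this transcript, so the forms below are commonly cited ones and should be read as assumptions: `w_i = i` (linear), `w_i = 1 / sigma_i` (statistical), `w_i = 1 + (M/2) sin(pi i / M)` (sinusoidal) and `w_i = i^s exp(-i^2 / (2 tau^2))` (exponential). The defaults `s = 1.5` and `tau = 5` and all function names are likewise illustrative.

```python
import math

def linear_lifter(m):
    """Type 1: weights grow linearly with the cepstral index."""
    return [float(i) for i in range(1, m + 1)]

def statistical_lifter(std_devs):
    """Type 2: inverse standard deviation of each coefficient."""
    return [1.0 / s for s in std_devs]

def sinusoidal_lifter(m):
    """Type 3: raised-sine weights, largest near the middle index."""
    return [1.0 + (m / 2.0) * math.sin(math.pi * i / m)
            for i in range(1, m + 1)]

def exponential_lifter(m, s=1.5, tau=5.0):
    """Type 4: i**s * exp(-i**2 / (2 * tau**2)); s and tau are assumed values."""
    return [i ** s * math.exp(-i * i / (2.0 * tau * tau))
            for i in range(1, m + 1)]

def liftered_distance(c, mu, w):
    """Weighted squared Euclidean distance between test and mean vectors."""
    return sum((wi * (ci - mi)) ** 2 for ci, mi, wi in zip(c, mu, w))

# Hypothetical 10-dimensional cepstral vectors.
c = [0.9, -0.4, 0.3, 0.1, -0.2, 0.05, 0.0, -0.1, 0.02, 0.01]
mu = [1.0, -0.5, 0.2, 0.0, -0.1, 0.0, 0.05, -0.05, 0.0, 0.0]
for name, w in [("linear", linear_lifter(10)),
                ("sinusoidal", sinusoidal_lifter(10)),
                ("exponential", exponential_lifter(10))]:
    print(name, round(liftered_distance(c, mu, w), 4))
```

With these forms, types 1 and 2 give increasing weight to higher coefficients, while types 3 and 4 peak at a middle index and taper toward both ends.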

8 Liftering --- Discussion and Why
* Multiplicative weighting in the cepstral domain is equivalent to convolution in the spectral domain.

Lifter type    Spectral domain   Cepstral domain
Type 1 and 2   HP filter         Emphasizes the higher cepstral coefficients
Type 3 and 4   BP filter         De-emphasizes the higher and lower cepstral coefficients

9 Liftering --- Experiment on DTW

10 Liftering on CDHMM (??) --- Why
* The observation probability leads to a Mahalanobis distance measure.

11 Liftering on CDHMM (??) --- Why liftering matrix for MFCC where

12 Liftering on CDHMM (??) --- Why
* Thus, cepstral liftering has no effect on the recognition process when used with continuous-observation Gaussian-density HMMs.
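A small numerical check makes this conclusion concrete. The sketch below assumes a single diagonal-covariance Gaussian whose mean and variance are estimated from training vectors; the data, the lifter weights and the function names are made up for illustration. Liftering scales each mean by w_i and each variance by w_i squared, so the Mahalanobis distance is unchanged.

```python
def mean_and_variance(vectors):
    """Per-dimension maximum-likelihood mean and variance."""
    n = len(vectors)
    dim = len(vectors[0])
    mean = [sum(v[i] for v in vectors) / n for i in range(dim)]
    var = [sum((v[i] - mean[i]) ** 2 for v in vectors) / n
           for i in range(dim)]
    return mean, var

def mahalanobis_sq(x, mean, var):
    """Squared Mahalanobis distance for a diagonal covariance."""
    return sum((xi - mi) ** 2 / vi for xi, mi, vi in zip(x, mean, var))

def apply_lifter(vec, w):
    return [wi * vi for wi, vi in zip(w, vec)]

# Hypothetical training vectors, test vector and lifter weights.
train = [[1.0, 0.5, -0.2], [1.4, 0.1, 0.3], [0.8, 0.9, -0.1], [1.1, 0.4, 0.2]]
test_vec = [1.2, 0.3, 0.0]
w = [1.0, 2.0, 3.0]

mean, var = mean_and_variance(train)
d_plain = mahalanobis_sq(test_vec, mean, var)

# Lifter both the training data and the test vector, then re-estimate.
mean_l, var_l = mean_and_variance([apply_lifter(v, w) for v in train])
d_lift = mahalanobis_sq(apply_lifter(test_vec, w), mean_l, var_l)

print(d_plain, d_lift)  # the two distances agree up to rounding
```

The log-determinant term of the Gaussian does change, but only by a constant that depends on the lifter weights and is the same for every state and model, so the ranking of hypotheses is not affected.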

13 Decorrelation of FBE --- Why/How
* FBEs are correlated, so they cannot be used directly with CDHMMs.
* LP techniques can be used to remove this defect; the LP coefficients can be obtained by the covariance method of LP analysis.

14 Liftering of FBE --- How
* FIR filter, N = M + L
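One possible reading of slides 13 and 14 is sketched below: a linear predictor of order L is fitted across the channel index of the FBE vectors with the covariance method (least squares over the unwindowed range, pooled over the training vectors), and the resulting prediction-error FIR filter turns N energies into M = N - L outputs. This interpretation, the synthetic data and every name in the code are assumptions; the filter coefficients used in the paper are not given in the transcript.

```python
import random

def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [bi] for row, bi in zip(a, b)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[piv] = m[piv], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (m[r][n] - sum(m[r][c] * x[c] for c in range(r + 1, n))) / m[r][r]
    return x

def lp_covariance(vectors, order):
    """Covariance-method LP coefficients a_1..a_L pooled over all vectors."""
    phi = [[0.0] * order for _ in range(order)]
    rhs = [0.0] * order
    for e in vectors:
        for k in range(order, len(e)):
            for i in range(1, order + 1):
                rhs[i - 1] += e[k] * e[k - i]
                for j in range(1, order + 1):
                    phi[i - 1][j - 1] += e[k - i] * e[k - j]
    return solve(phi, rhs)

def prediction_error(e, a):
    """FIR filter 1 - sum_j a_j z^-j along the channel index: N in, N - L out."""
    order = len(a)
    return [e[k] - sum(a[j - 1] * e[k - j] for j in range(1, order + 1))
            for k in range(order, len(e))]

# Synthetic, strongly correlated "FBE" vectors (hypothetical data).
rng = random.Random(0)

def synthetic_fbe(n):
    e = [rng.gauss(0.0, 1.0)]
    for _ in range(n - 1):
        e.append(0.9 * e[-1] + rng.gauss(0.0, 0.3))
    return e

train = [synthetic_fbe(24) for _ in range(50)]   # N = 24
a = lp_covariance(train, order=2)                # L = 2
out = prediction_error(train[0], a)              # M = N - L = 22
print(a, len(out))
```

Because the filter is applied without padding, each output needs L earlier channels, which is where the length relation N = M + L comes from in this reading.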

15 DLFBE --- Experiment
* Speaker-independent (SI) isolated-word recognition using the ISOLET spoken-letter database.
* 90 training utterances from 90 speakers (45 female, 45 male).
* 30 testing utterances from 30 speakers (15 female, 15 male).

16 DLFBE --- Experiment (cont’d)

17

18 Robust Speech Feature: Noise-Invariant Representation for Speech Signal, Group Delay Function (GDF) Method. Proposed by Bayya & Yegnanarayana in EuroSpeech '99

19 GDF --- Motivation
* Background noise is a prominent source of mismatch. It is roughly eliminated by the following methods:
  1. Compensation, which causes overestimation and underestimation side effects:
     * Pre-processing: SS (spectral subtraction), HP, BP
     * FN (feature normalization)
     * Model adaptation: parameter transform

20 GDF --- Motivation (cont’d)
  2. New features, e.g. LPCMEL, PLP (projection concept): not completely noise resistant.
* All of the above use power/amplitude as the speech feature. Why not use phase information as a feature? Phase information may be helpful in speech recognition.

21 GDF --- What/How
* The GDF is defined by Eq. (#.1) from the normalized autocorrelation of a short segment of a signal.

22 GDF --- What/How (cont’d)
* Eq. (#.2); compare (#.1) and (#.2).

23 Truncated version of GDF
* Easy to implement.

24 GDF --- What/How (cont’d)
* The window used here is a Hanning window.

25 GDF --- Why & Experiment
* Frame length = 5 ms, frame shift = 1 ms.
* The modified autocorrelation sequence is averaged over 20 frames, and the GDF is then computed as defined above.
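The processing chain on this slide can be sketched as follows. The defining equations are not preserved in the transcript, so this uses the standard group delay of a sequence x[n] (the negative derivative of its Fourier phase, computed from the transforms of x[n] and n x[n]) as a stand-in. The plain normalized autocorrelation (the slide's "modified" version is not specified), the one-sided Hanning taper, the 8 kHz sampling rate, the maximum lag and all names are assumptions as well.

```python
import cmath
import math

def normalized_autocorrelation(frame, max_lag):
    """Autocorrelation r[0..max_lag] divided by r[0]."""
    r0 = sum(x * x for x in frame)
    return [sum(frame[t] * frame[t + k] for t in range(len(frame) - k)) / r0
            for k in range(max_lag + 1)]

def hanning_taper(length):
    """Decaying half of a Hanning window: 1 at lag 0, 0 at the last lag."""
    return [0.5 * (1.0 + math.cos(math.pi * k / (length - 1)))
            for k in range(length)]

def group_delay(seq, n_freqs):
    """Group delay (in samples) at n_freqs frequencies in [0, pi)."""
    delays = []
    for m in range(n_freqs):
        w = math.pi * m / n_freqs
        x = sum(v * cmath.exp(-1j * w * n) for n, v in enumerate(seq))
        y = sum(n * v * cmath.exp(-1j * w * n) for n, v in enumerate(seq))
        power = abs(x) ** 2
        # tau(w) = Re{X(w) conj(Y(w))} / |X(w)|^2; 0.0 where the spectrum vanishes
        delays.append((x.real * y.real + x.imag * y.imag) / power
                      if power > 1e-12 else 0.0)
    return delays

# Hypothetical signal at 8 kHz: 5 ms frames (40 samples), 1 ms shift (8 samples).
sample_rate = 8000
signal = [math.sin(2 * math.pi * 700 * t / sample_rate)
          + 0.6 * math.sin(2 * math.pi * 1800 * t / sample_rate)
          for t in range(200)]
frame_len, shift, n_frames, max_lag = 40, 8, 20, 20

frames = [signal[s:s + frame_len] for s in range(0, n_frames * shift, shift)]
acs = [normalized_autocorrelation(f, max_lag) for f in frames]
# Average the autocorrelation sequences over the 20 frames.
avg_r = [sum(r[k] for r in acs) / len(acs) for k in range(max_lag + 1)]
windowed = [r * t for r, t in zip(avg_r, hanning_taper(max_lag + 1))]
gdf = group_delay(windowed, 64)
print(len(frames), len(avg_r), len(gdf))
```

A quick sanity check of `group_delay` is a unit impulse delayed by 3 samples, for which the function returns 3 at every frequency.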

26 GDF --- Why & Experiment (cont’d)

27 GDF --- Experiment
* Isolated-digit recognition

      Clean   Noisy
SI    97%     95%     YES
SD    96.5%   94.5%   NO

Due to large dynamic range?

