Covariation and weighting of harmonically decomposed streams for ASR


2 Covariation and weighting of harmonically decomposed streams for ASR
[Figure: production of /z/, with periodic and aperiodic contributions]
Outline:
– Introduction
– Pitch-scaled harmonic filter
– Recognition experiments
– Results
– Conclusion

3 INTRODUCTION: Motivation and aims
Most speech sounds are either voiced or unvoiced, and the two classes have very different properties:
– voiced: quasi-periodic signal from phonation
– unvoiced: aperiodic signal from turbulence noise
Do these properties help humans to recognize speech in noise?
Perhaps ASR can use the same information, by computing separate features for the two parts.
Are the two contributions complementary?
http://www.ee.surrey.ac.uk/Personal/P.Jackson/Columbo/

4 INTRODUCTION: Voiced and unvoiced parts of a speech signal
[Figure: production of /z/, showing the periodic contribution and the aperiodic contribution]

5 METHOD: Pitch-scaled harmonic filter
[Block diagram. Labels: speech waveform s(n); pitch extraction (raw f0); pitch optimisation (optimised f0, N_opt); PSHF; time shifting; re-splicing; periodic waveform v̂(n); aperiodic waveform û(n)]
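
The core idea behind a pitch-scaled decomposition can be illustrated with a short sketch. It is not the authors' implementation: it assumes a rectangular analysis frame that spans exactly four pitch periods, an integer pitch period in samples, and a synthetic test signal, and it omits the pitch optimisation, time shifting and re-splicing stages named on the slide. Under those assumptions the harmonics of the pitch fall on every fourth DFT bin, so those bins give a periodic estimate and the remainder gives an aperiodic estimate. The names `harmonic_split`, `PERIOD` and `PERIODS_PER_FRAME` are hypothetical.

```python
import cmath
import math
import random


def dft(x):
    """Plain O(N^2) discrete Fourier transform, adequate for one short frame."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
            for k in range(n)]


def idft(spectrum):
    """Inverse DFT, returning the real part of each sample."""
    n = len(spectrum)
    return [sum(spectrum[k] * cmath.exp(2j * math.pi * k * t / n)
                for k in range(n)).real / n
            for t in range(n)]


def harmonic_split(frame, periods_per_frame):
    """Split a frame holding a whole number of pitch periods into a
    periodic estimate (harmonic bins only) and an aperiodic remainder."""
    spectrum = dft(frame)
    # Harmonics of the pitch (and DC) sit on bins that are multiples of
    # periods_per_frame; every other bin is zeroed.
    harmonic_bins = [c if k % periods_per_frame == 0 else 0j
                     for k, c in enumerate(spectrum)]
    periodic = idft(harmonic_bins)
    aperiodic = [s - v for s, v in zip(frame, periodic)]
    return periodic, aperiodic


# Synthetic example (assumed values): 200 Hz pitch at an 8 kHz sampling rate.
PERIOD = 40                # pitch period in samples
PERIODS_PER_FRAME = 4      # assumption of this sketch
N = PERIOD * PERIODS_PER_FRAME

rng = random.Random(0)
voiced = [math.sin(2 * math.pi * n / PERIOD)
          + 0.5 * math.sin(4 * math.pi * n / PERIOD) for n in range(N)]
noise = [0.1 * rng.gauss(0, 1) for _ in range(N)]
frame = [v + u for v, u in zip(voiced, noise)]

periodic, aperiodic = harmonic_split(frame, PERIODS_PER_FRAME)
```

With a rectangular frame, keeping every fourth bin is equivalent to averaging the four periods in the frame, so `periodic` repeats exactly every `PERIOD` samples and `aperiodic` holds whatever does not repeat.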

6 METHOD: Decomposition example (waveforms)
[Figure panels: Original, Periodic, Aperiodic]

7 METHOD: Decomposition example (spectrograms)
[Figure panels: Original, Periodic, Aperiodic]

8 METHOD: Decomposition example (MFCC spectrograms)
[Figure panels: Original, Periodic, Aperiodic]

9 METHOD: Speech database: Aurora 2.0
From the TIdigits database of connected English digit strings (male & female speakers), filtered with G.712 at 8 kHz.
[Diagram: TRAIN and TEST sets]

10 METHOD: Description of the experiments
Baseline experiment: [base]
– standard parameterisation of the original waveforms (i.e., MFCC, +Δ, +ΔΔ)
PCA experiments: [pca26, pca78, pca13 and pca39]
– decorrelation of the feature vectors, and reduction of the number of coefficients
Split experiments: [split, split1]
– adjustment of stream weights (periodic vs. aperiodic)
Caveat: pitch values were derived from the clean speech files, for the entire database!
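
The +Δ and +ΔΔ terms in the baseline parameterisation are dynamic features appended to the static MFCCs. The sketch below uses the linear-regression delta formula common in HTK-style front ends; the window half-width of 2, the edge handling (repeating the first and last frame) and the 13 static coefficients are assumptions of this example, not details given on the slide. The function names are hypothetical.

```python
def deltas(frames, theta=2):
    """Regression-based time derivative of a feature sequence.

    frames: list of per-frame feature lists (all the same length).
    theta:  half-width of the regression window (assumed value).
    """
    num_frames = len(frames)
    dim = len(frames[0])
    denom = 2 * sum(k * k for k in range(1, theta + 1))
    out = []
    for t in range(num_frames):
        row = []
        for d in range(dim):
            num = 0.0
            for k in range(1, theta + 1):
                # Repeat the edge frames when the window runs off the ends.
                ahead = frames[min(t + k, num_frames - 1)][d]
                behind = frames[max(t - k, 0)][d]
                num += k * (ahead - behind)
            row.append(num / denom)
        out.append(row)
    return out


def add_dynamics(static):
    """Append delta and delta-delta coefficients to each static frame."""
    d1 = deltas(static)
    d2 = deltas(d1)
    return [s + a + b for s, a, b in zip(static, d1, d2)]


# Toy input: 10 frames of 13 "MFCCs" that rise by one unit per frame.
static = [[float(t)] * 13 for t in range(10)]
features = add_dynamics(static)   # 13 static + 13 delta + 13 delta-delta
```

With 13 static coefficients this gives 39 features per frame, which would be consistent with the 13/39 and 26/78 figures in the experiment names if each of the two streams contributes 13 MFCCs.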

11 METHOD: Parameterisations
[Block diagrams, one per parameterisation; the order of the blocks within each diagram is not recoverable from this transcript]
BASE: waveform, MFCC, +Δ, +Δ², features
PCA26, PCA78, PCA13, PCA39: PSHF, MFCC, +Δ, +Δ², cat, PCA
SPLIT, SPLIT1: PSHF, MFCC, +Δ, +Δ², cat
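
The "cat" and "PCA" blocks can be sketched as follows: per-frame features from the periodic and aperiodic streams are concatenated, then projected onto the leading principal components, which decorrelates them and can reduce their number. This is a toy illustration only. The dimensions (2 + 2 features reduced to 2), the synthetic data and the power-iteration eigen-solver are assumptions chosen to keep the example dependency-free; they are not the authors' configuration, and a library eigen-solver would normally be used instead.

```python
import math
import random


def covariance(rows):
    """Mean vector and sample covariance matrix of a list of frames."""
    n, dim = len(rows), len(rows[0])
    mean = [sum(r[d] for r in rows) / n for d in range(dim)]
    centred = [[r[d] - mean[d] for d in range(dim)] for r in rows]
    cov = [[sum(c[i] * c[j] for c in centred) / (n - 1) for j in range(dim)]
           for i in range(dim)]
    return mean, cov


def principal_components(cov, k, iterations=500):
    """Top-k (eigenvalue, eigenvector) pairs of a symmetric matrix,
    found by power iteration with deflation."""
    dim = len(cov)
    work = [row[:] for row in cov]
    rng = random.Random(0)
    components = []
    for _ in range(k):
        v = [rng.gauss(0, 1) for _ in range(dim)]
        norm = math.sqrt(sum(x * x for x in v))
        v = [x / norm for x in v]
        for _ in range(iterations):
            w = [sum(work[i][j] * v[j] for j in range(dim)) for i in range(dim)]
            norm = math.sqrt(sum(x * x for x in w))
            if norm == 0.0:
                break
            v = [x / norm for x in w]
        eigenvalue = sum(v[i] * sum(work[i][j] * v[j] for j in range(dim))
                         for i in range(dim))
        components.append((eigenvalue, v))
        # Deflate: remove the component just found before the next pass.
        for i in range(dim):
            for j in range(dim):
                work[i][j] -= eigenvalue * v[i] * v[j]
    return components


def project(rows, mean, components):
    """Project mean-centred frames onto the given components."""
    return [[sum((r[d] - mean[d]) * vec[d] for d in range(len(r)))
             for _, vec in components] for r in rows]


# Synthetic frames: the first feature of each stream shares a common source,
# so the concatenated vector has correlated dimensions.
rng = random.Random(1)
frames = []
for _ in range(200):
    shared = rng.gauss(0, 1)
    periodic_feat = [shared + 0.1 * rng.gauss(0, 1), rng.gauss(0, 1)]
    aperiodic_feat = [shared + 0.1 * rng.gauss(0, 1), 0.5 * rng.gauss(0, 1)]
    frames.append(periodic_feat + aperiodic_feat)   # the "cat" step

mean, cov = covariance(frames)
components = principal_components(cov, k=2)
reduced = project(frames, mean, components)          # 4 features -> 2
```

The projected features are uncorrelated with each other, and the variance of each equals the corresponding eigenvalue.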

12 RESULTS: Full-sized PCA results

13 RESULTS: Variance of Principal Components
[Figure panels: PCA26, PCA39; clean + multi]

14 PCA26 experiment results
[Figure panels: CLEAN, MULTI]

15 RESULTS: Summary of best PCA results

16 Split experiment results

17 RESULTS: Sample Split results
Note: for Split, the same stream-weight values were used in training as in testing.
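
Stream weighting in a two-stream recogniser is usually applied as exponents on the per-stream output probabilities, that is, as weights on their log-likelihoods. The sketch below shows that combination for one model state with a single diagonal-covariance Gaussian per stream. The single-Gaussian model, the toy parameters and the convention that the two weights sum to one are assumptions for illustration; the slides do not state how the weights were normalised. The function names are hypothetical.

```python
import math


def diag_gauss_loglik(x, mean, var):
    """Log-likelihood of x under a diagonal-covariance Gaussian."""
    return -0.5 * sum(math.log(2 * math.pi * v) + (xi - m) ** 2 / v
                      for xi, m, v in zip(x, mean, var))


def two_stream_loglik(o_periodic, o_aperiodic, state, w_periodic, w_aperiodic):
    """Weighted sum of per-stream log-likelihoods for one model state."""
    lp = diag_gauss_loglik(o_periodic, *state["periodic"])
    la = diag_gauss_loglik(o_aperiodic, *state["aperiodic"])
    return w_periodic * lp + w_aperiodic * la


# Toy state: (mean, variance) per stream, two features each (assumed values).
state = {
    "periodic": ([0.0, 1.0], [1.0, 1.0]),
    "aperiodic": ([0.5, -0.5], [2.0, 2.0]),
}
obs_periodic = [0.1, 0.9]
obs_aperiodic = [3.0, -3.0]   # a poor match, e.g. a noise-corrupted stream

# Sweep the periodic weight; the aperiodic weight is its complement.
scores = {w: two_stream_loglik(obs_periodic, obs_aperiodic, state, w, 1.0 - w)
          for w in (0.0, 0.5, 1.0)}
```

Raising the weight of the better-matched stream raises the combined score, which is the effect that adjusting periodic and aperiodic stream weights exploits.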

18 Split1 experiment results

19 RESULTS: Summary of PCA & Split results

20 CONCLUSION: Conclusions
PSHF module split Aurora’s speech waveforms into two synchronous streams (periodic and aperiodic):
– large improvements over the single-stream Baseline
Split was better than all PCA combinations:
– PCA26/13 better than PCA78/39, and PCA13 best
– Split1 marginally better than Split
Periodic speech segments give robustness to noise.
Further work:
– Modeling: how best to combine the streams?
– LVCSR: evaluate front end on TIMIT (phone recognition)
– Robust pitch tracking

21 COLUMBO PROJECT: Harmonic decomposition applied to ASR
Philip J.B. Jackson (1), David M. Moreno (2), Javier Hernando (2), Martin J. Russell (3)
http://www.ee.surrey.ac.uk/Personal/P.Jackson/Columbo/

