Covariation and weighting of harmonically decomposed streams for ASR

Outline: Introduction; Pitch-scaled harmonic filter; Recognition experiments; Results; Conclusion
[Figure: production of /z/ (aperiodic and periodic)]

INTRODUCTION: Motivation and aims
Most speech sounds are either voiced or unvoiced, and the two classes have very different properties:
–voiced: quasi-periodic signal from phonation
–unvoiced: aperiodic signal from turbulence noise
Do these properties allow humans to recognize speech in noise?
Maybe we can use this information to help ASR, by computing separate features for the two parts.
Are the two contributions complementary?
http://www.ee.surrey.ac.uk/Personal/P.Jackson/Columbo/

INTRODUCTION: Voiced and unvoiced parts of a speech signal
[Figure: production of /z/, showing the aperiodic contribution and the periodic contribution]

METHOD: Pitch-scaled harmonic filter
[Block diagram: the speech waveform s(n) passes through pitch extraction (raw f0) and pitch optimisation (optimised f0, N_opt) into the PSHF, with re-splicing and time-shifting stages, to give the periodic waveform v̂(n) and the aperiodic waveform û(n).]
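A minimal sketch of the idea behind a pitch-scaled harmonic filter, not the project's implementation: if the analysis frame spans a whole number of pitch periods (four is assumed here), the harmonics of f0 fall exactly on every fourth DFT bin, so those bins give a periodic estimate and the remainder an aperiodic one. The function name `harmonic_split`, the rectangular window, and keeping the DC bin in the periodic part are simplifying assumptions; windowing, pitch optimisation, re-splicing and time shifting are omitted.

```python
import numpy as np

def harmonic_split(frame, periods=4):
    """Split one frame into periodic and aperiodic estimates.

    Assumes `frame` spans exactly `periods` pitch periods, so the
    harmonics of f0 sit on DFT bins 0, periods, 2*periods, ...
    (hypothetical helper, rectangular window, illustration only).
    """
    frame = np.asarray(frame, dtype=float)
    spectrum = np.fft.rfft(frame)

    # Mark the bins that coincide with harmonics of the pitch.
    harmonic_bins = np.zeros(spectrum.shape, dtype=bool)
    harmonic_bins[::periods] = True

    # Periodic estimate: keep only harmonic bins, then invert.
    periodic = np.fft.irfft(np.where(harmonic_bins, spectrum, 0), n=len(frame))

    # Aperiodic estimate: whatever the harmonic bins do not explain.
    aperiodic = frame - periodic
    return periodic, aperiodic


# Example: 200 Hz pitch at 8 kHz gives a 40-sample period, so a
# four-period frame is 160 samples long.
n = np.arange(160)
voiced = np.sin(2 * np.pi * n / 40) + 0.5 * np.sin(2 * np.pi * 2 * n / 40)
noise = 0.1 * np.random.default_rng(0).standard_normal(160)
periodic, aperiodic = harmonic_split(voiced + noise)
```

By construction the two estimates sum back to the input frame, and a perfectly periodic input leaves an aperiodic estimate that is zero to numerical precision.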

METHOD: Decomposition example (waveforms)
[Figure panels: Original, Periodic, Aperiodic]

METHOD: Decomposition example (spectrograms)
[Figure panels: Original, Periodic, Aperiodic]

METHOD: Decomposition example (MFCC spectrograms)
[Figure panels: Original, Periodic, Aperiodic]

METHOD: Speech database: Aurora 2.0
From the TIdigits database of connected English digit strings (male & female speakers), filtered with G.712 at 8 kHz.
[Figure panels: TRAIN, TEST]

METHOD: Description of the experiments
Baseline experiment: [base]
–standard parameterisation of the original waveforms (i.e., MFCC, +Δ, +ΔΔ)
PCA experiments: [pca26, pca78, pca13 and pca39]
–decorrelation of the feature vectors, and reduction of the number of coefficients
Split experiments: [split, split1]
–adjustment of stream weights (periodic vs. aperiodic)
Caveat: pitch values were derived from the clean speech files, for the entire database!
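The "MFCC, +Δ, +ΔΔ" parameterisation and the concatenation of the two streams can be sketched as follows. This is an illustration under stated assumptions: 13 static coefficients per stream, a regression window of ±2 frames, and edge padding are all assumed, and the helper names `deltas`, `add_dynamics` and `two_stream_vector` are hypothetical. The static MFCCs are taken as given.

```python
import numpy as np

def deltas(feats, width=2):
    """Regression deltas over +/- `width` frames (edges padded by repetition)."""
    feats = np.asarray(feats, dtype=float)
    n_frames = len(feats)
    padded = np.pad(feats, ((width, width), (0, 0)), mode="edge")
    denom = 2 * sum(k * k for k in range(1, width + 1))
    num = np.zeros_like(feats)
    for k in range(1, width + 1):
        ahead = padded[width + k : width + k + n_frames]
        behind = padded[width - k : width - k + n_frames]
        num += k * (ahead - behind)
    return num / denom

def add_dynamics(static):
    """Append delta and delta-delta coefficients to static features."""
    d = deltas(static)
    dd = deltas(d)
    return np.hstack([static, d, dd])

def two_stream_vector(periodic_static, aperiodic_static):
    """Concatenate the periodic and aperiodic streams frame by frame."""
    return np.hstack([add_dynamics(periodic_static),
                      add_dynamics(aperiodic_static)])


# Example with placeholder features: 10 frames, 13 coefficients per stream.
ramp = np.arange(10.0)[:, None] * np.ones((1, 13))
combined = two_stream_vector(ramp, ramp)
```

With 13 static coefficients per stream, each stream grows to 39 coefficients and the concatenated vector to 78, which matches the sizes named in the PCA experiment labels.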

METHOD: Parameterisations
[Block diagrams of the feature chains, listing the blocks in each:
BASE: waveform, MFCC, +Δ, +ΔΔ, features
PCA26, PCA78, PCA13, PCA39: PSHF, MFCC, +Δ, +ΔΔ, cat, PCA
SPLIT, SPLIT1: PSHF, MFCC, +Δ, +ΔΔ, cat]

RESULTS: Full-sized PCA results

RESULTS: Variance of Principal Components
[Figure panels: PCA26, PCA39; clean + multi]
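For readers unfamiliar with how principal-component variances arise, here is a generic PCA sketch via eigen-decomposition of the feature covariance. It is not the project's code: the function names `fit_pca` and `project`, the random placeholder data, and the choice of keeping 13 of 26 components are assumptions for illustration only.

```python
import numpy as np

def fit_pca(features):
    """Return the mean, principal directions and their variances.

    `features` has one row per frame. Components are sorted so the
    largest variance comes first.
    """
    features = np.asarray(features, dtype=float)
    mean = features.mean(axis=0)
    cov = np.cov(features - mean, rowvar=False)
    variances, components = np.linalg.eigh(cov)  # ascending eigenvalues
    order = np.argsort(variances)[::-1]
    return mean, components[:, order], variances[order]

def project(features, mean, components, n_keep):
    """Decorrelate the features and keep the first `n_keep` components."""
    return (np.asarray(features, dtype=float) - mean) @ components[:, :n_keep]


# Example with correlated placeholder data: 500 frames of 26 coefficients.
rng = np.random.default_rng(0)
data = rng.standard_normal((500, 26)) @ rng.standard_normal((26, 26))
mean, components, variances = fit_pca(data)
reduced = project(data, mean, components, n_keep=13)
explained = np.cumsum(variances) / np.sum(variances)
```

The `variances` array is what a plot of principal-component variance would show, and `explained` gives the cumulative share of total variance retained as more components are kept.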

PCA26 experiment’s results
[Figure panels: CLEAN, MULTI]

RESULTS: Summary of best PCA results

Split experiment’s results

RESULTS: Sample Split results
Note: the same stream-weight values were used in training as in testing, for Split.
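Stream weighting can be illustrated with a common multi-stream HMM convention (used, for example, in HTK), in which a state's log-likelihood is the weighted sum of the per-stream log-likelihoods. The sketch below assumes a single diagonal-covariance Gaussian per stream and two 39-dimensional streams; the function names and the example weights are hypothetical and do not reproduce the settings of the Split experiments.

```python
import numpy as np

def diag_gauss_logpdf(x, mean, var):
    """Log density of a diagonal-covariance Gaussian at `x`."""
    x, mean, var = (np.asarray(a, dtype=float) for a in (x, mean, var))
    return -0.5 * np.sum(np.log(2 * np.pi * var) + (x - mean) ** 2 / var)

def weighted_stream_loglik(streams, models, weights):
    """Combine per-stream log-likelihoods with exponent weights.

    streams: list of observation vectors, one per stream
    models:  list of (mean, var) pairs, one per stream
    weights: list of stream weights (illustrative values)
    """
    return sum(w * diag_gauss_logpdf(x, m, v)
               for x, (m, v), w in zip(streams, models, weights))


# Example: periodic and aperiodic observations for one frame.
obs_periodic = np.zeros(39)
obs_aperiodic = np.ones(39)
unit_model = (np.zeros(39), np.ones(39))
score = weighted_stream_loglik([obs_periodic, obs_aperiodic],
                               [unit_model, unit_model],
                               weights=[0.5, 0.5])
```

Setting the weights to (1, 0) or (0, 1) reduces the score to a single stream, so sweeping the weights between those extremes shows how much each stream contributes.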

Split1 experiment’s results

RESULTS: Summary of PCA & Split results

CONCLUSION: Conclusions
PSHF module split Aurora’s speech waveforms into two synchronous streams (periodic and aperiodic)
–large improvements over the single-stream Baseline
Split was better than all PCA combinations:
–PCA26/13 better than PCA78/39, and PCA13 best
–Split1 marginally better than Split
Periodic speech segments give robustness to noise.
Further work
–Modeling: how best to combine the streams?
–LVCSR: evaluate front end on TIMIT (phone recognition).
–Robust pitch tracking

COLUMBO PROJECT: Harmonic decomposition applied to ASR
Philip J.B. Jackson¹, David M. Moreno², Javier Hernando², Martin J. Russell³
