Communications & Multimedia Signal Processing Formant Tracking LP with Harmonic Plus Noise Model of Excitation for Speech Enhancement Qin Yan Communication.

2 Communications & Multimedia Signal Processing
Formant Tracking LP with Harmonic Plus Noise Model of Excitation for Speech Enhancement
Qin Yan
Communication & Multimedia Signal Processing Group, School of Engineering and Design, Brunel University
6 July 2005

3 Outline
- Parameter estimation of HNM (incl. pitch/harmonic tracking in noise)
- HNM of excitation
- Formant tracking LP with HNM of excitation in speech enhancement

4 Overview of Speech Enhancement System
Figure - Block diagram of the system (input: Noisy Speech; output: Enhanced Speech). Blocks: VAD; LP Model of Noise; LP-Analysis and LP-Spectral Subtraction; LP Pole Analysis; Formant Candidate Estimation; Kalman Filter; Vowel/Consonant Classification; LP Spectrum Reconstruction; Residual De-noising; HNM of Residual; Speech Reconstruction. The blocks are grouped into a Formant Track Restoration Module and a Residual Restoration Module.

5 Harmonic plus Noise Model
In HNM, speech is decomposed into two parts: harmonic and noise.
Harmonic: L(t) denotes the number of harmonics included in the harmonic part, and ω0 denotes the pitch frequency.
Noise: h is a time-varying autoregressive (AR) model and b is white Gaussian noise.
Synthesized speech: the sum of the harmonic part and the noise part.
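The decomposition can be illustrated with a minimal sketch for a single frame. It is not taken from the slides: the function names, the per-harmonic amplitudes and phases, the gain, and the AR sign convention are all assumptions, and the AR coefficients are held fixed within the frame, whereas the slide describes a time-varying AR model.

```python
import math
import random


def harmonic_part(amps, phases, f0_hz, fs_hz, n_samples):
    """Sum of L sinusoids at integer multiples of the pitch frequency f0."""
    out = []
    for m in range(n_samples):
        t = m / fs_hz
        out.append(sum(a * math.cos(2.0 * math.pi * k * f0_hz * t + p)
                       for k, (a, p) in enumerate(zip(amps, phases), start=1)))
    return out


def noise_part(ar_coeffs, gain, n_samples, rng):
    """White Gaussian noise b shaped by an all-pole (AR) filter h.

    Assumed convention: n[m] = gain * b[m] - sum_i a_i * n[m - i].
    """
    out = []
    for m in range(n_samples):
        acc = gain * rng.gauss(0.0, 1.0)
        for i, a in enumerate(ar_coeffs, start=1):
            if m - i >= 0:
                acc -= a * out[m - i]
        out.append(acc)
    return out


def synthesize_frame(amps, phases, f0_hz, fs_hz, n_samples,
                     ar_coeffs, gain, rng):
    """Synthesized speech = harmonic part + noise part."""
    h = harmonic_part(amps, phases, f0_hz, fs_hz, n_samples)
    n = noise_part(ar_coeffs, gain, n_samples, rng)
    return [x + y for x, y in zip(h, n)]
```

Setting `gain` to zero reduces the output to the harmonic part alone, which is a convenient check that the two parts are simply added.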

6 HNM - Pitch Tracking
Griffin's method defines an error function for clean speech.
In noisy conditions the error function is modified to include SNR-dependent weights W(l).
The error function can be extended into the frequency domain, where it is expressed in terms of a quantity r.
Each frame outputs several pitch candidates (N = 3), and the Viterbi algorithm generates the final pitch tracks.
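The last step, choosing one candidate per frame with the Viterbi algorithm, might look like the sketch below. The transition cost (absolute log2 ratio between consecutive pitch values, scaled by a hypothetical `jump_weight`) is an assumption; the slides do not say which cost was used. `errors[t][i]` stands for the error-function value of candidate `i` in frame `t`.

```python
import math


def viterbi_pitch_track(candidates, errors, jump_weight=1.0):
    """Pick one pitch candidate per frame by dynamic programming.

    candidates[t][i]: i-th pitch candidate (Hz) of frame t.
    errors[t][i]:     error-function value of that candidate (lower is better).
    Minimises the sum of candidate errors plus an assumed transition cost
    jump_weight * |log2(f_t / f_{t-1})| between consecutive frames.
    """
    cost = [list(errors[0])]
    back = [[-1] * len(candidates[0])]
    for t in range(1, len(candidates)):
        row_cost, row_back = [], []
        for j, f in enumerate(candidates[t]):
            best_i, best = 0, float("inf")
            for i, f_prev in enumerate(candidates[t - 1]):
                c = cost[t - 1][i] + jump_weight * abs(math.log2(f / f_prev))
                if c < best:
                    best_i, best = i, c
            row_cost.append(best + errors[t][j])
            row_back.append(best_i)
        cost.append(row_cost)
        back.append(row_back)

    # Trace back from the cheapest final state.
    j = min(range(len(cost[-1])), key=cost[-1].__getitem__)
    path = []
    for t in range(len(candidates) - 1, -1, -1):
        path.append(candidates[t][j])
        j = back[t][j]
    path.reverse()
    return path
```

With `jump_weight=0` the search degenerates to taking the lowest-error candidate in every frame; a positive weight penalises octave jumps between neighbouring frames.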

7 Pitch Tracking Results (1)
Figure - Illustration of the error function from a voiced frame.

8 Pitch Tracking Results (2)
Figure - An illustration of pitch tracks of a speech signal at a sampling frequency of 16 kHz.
Table - Comparison of average RMSE of pitch tracking from car-noisy speech at an SNR of 0 dB.

Pitch Tracking         Average RMSE   Average % error
Without SNR weights    23.3           4.9%
With SNR weights       18.7           2.7%
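For reference, the two metrics in the table could be computed as in the sketch below. The slides do not define the percentage error, so taking it as the mean relative deviation from a reference pitch track is an assumption, as are the function names.

```python
import math


def pitch_rmse(estimated, reference):
    """Root-mean-square error between two pitch tracks of equal length."""
    sq = [(e - r) ** 2 for e, r in zip(estimated, reference)]
    return math.sqrt(sum(sq) / len(sq))


def pitch_percent_error(estimated, reference):
    """Mean of |estimate - reference| / reference, in percent (assumed definition)."""
    rel = [abs(e - r) / r for e, r in zip(estimated, reference)]
    return 100.0 * sum(rel) / len(rel)
```

Both functions assume the tracks cover the same voiced frames, so the reference pitch is never zero.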

9 HNM - Harmonic Tracking
Figure - Block diagram of harmonic tracking. Blocks: Noisy Speech; VAD; Noise model; FFT; Pitch Tracking; Peak picking; Throw away short trajectories; output: Harmonic frequency bin tracks.
The experiments show that it is better to perform harmonic tracking on the whole speech signal than on the excitation.
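Two of the blocks, peak picking and discarding short trajectories, can be sketched as follows. This is only an illustration under assumptions: the search width `search_bins`, the minimum length `min_len`, and the representation of a trajectory as a list of per-frame presence flags are hypothetical choices, not details from the slides.

```python
def pick_harmonic_bins(mag, f0_hz, fs_hz, n_fft, n_harm, search_bins=2):
    """For each harmonic k * f0, return the FFT bin with the largest
    magnitude within +/- search_bins of the nominal harmonic bin."""
    bins = []
    for k in range(1, n_harm + 1):
        centre = round(k * f0_hz * n_fft / fs_hz)
        lo = max(0, centre - search_bins)
        hi = min(len(mag) - 1, centre + search_bins)
        if lo > hi:
            break  # harmonic lies above the available spectrum
        bins.append(max(range(lo, hi + 1), key=mag.__getitem__))
    return bins


def drop_short_runs(present, min_len):
    """Clear any run of consecutive True flags shorter than min_len frames.

    present[t] says whether a harmonic track has a peak in frame t.
    """
    out = list(present)
    start = None
    for t, flag in enumerate(out + [False]):  # sentinel closes a final run
        if flag and start is None:
            start = t
        elif not flag and start is not None:
            if t - start < min_len:
                out[start:t] = [False] * (t - start)
            start = None
    return out
```

`mag` is expected to be the magnitude spectrum of one frame (bins 0 to n_fft/2), and `f0_hz` the pitch estimate for that frame from the pitch tracker.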

11 Synthesis of Excitation (1)
Figure - Block diagram of excitation synthesis (input: Noisy Speech; output: Enhanced Speech). Blocks: VAD; Noise model; MMSE; LP Analysis; LP Spectrum; Excitation; Pitch estimation; Harmonic tracking; UV decision; Unvoiced excitation; Voiced excitation; Std & Phase; WGN; Synthesized unvoiced excitation; Synthesized voiced excitation.

12 Synthesis of Excitation (2)
The voiced excitation and the unvoiced excitation are defined in terms of b(m), unit white Gaussian noise; e(m), the original excitation; and a, the phases of the original excitation.
Examples shown: car-noisy speech at an SNR of 0 dB; clean speech; speech cleaned by MMSE only; speech cleaned by MMSE and HNM of excitation.
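One plausible reading of these definitions, offered only as a sketch, is that the unvoiced excitation is white Gaussian noise scaled to the standard deviation of e(m), and the voiced excitation is a sum of harmonics that reuse the phases of e(m). The exact formulas are not preserved in this transcript, so the function names, the harmonic amplitudes `amps`, and the single-frequency DFT used to read the phases are all assumptions.

```python
import cmath
import math
import random
import statistics


def phase_at(e, freq_hz, fs_hz):
    """Phase of the original excitation e(m) at one frequency (single-bin DFT)."""
    acc = sum(x * cmath.exp(-2j * math.pi * freq_hz * m / fs_hz)
              for m, x in enumerate(e))
    return cmath.phase(acc)


def synth_voiced(e, amps, f0_hz, fs_hz):
    """Harmonic sum that keeps the phases of the original excitation."""
    phases = [phase_at(e, k * f0_hz, fs_hz) for k in range(1, len(amps) + 1)]
    return [sum(a * math.cos(2.0 * math.pi * k * f0_hz * m / fs_hz + p)
                for k, (a, p) in enumerate(zip(amps, phases), start=1))
            for m in range(len(e))]


def synth_unvoiced(e, rng):
    """Unit white Gaussian noise b(m) scaled by the standard deviation of e(m)."""
    sigma = statistics.pstdev(e)
    return [sigma * rng.gauss(0.0, 1.0) for _ in e]
```

For a frame that is an exact single sinusoid at the pitch frequency, `synth_voiced` with unit amplitude reproduces the input, which shows that the phase is carried over from the original excitation.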

13 Future Work
- Refine voiced/unvoiced classification in pitch estimation.
- Evaluate the improved pitch estimation method more precisely against manually corrected pitch tracks.
- Integrate FTLP smoothing into the current speech enhancement system.
- Evaluate the whole proposed speech enhancement system with ISD, PESQ and perceptual tests.

14 Thank You!

