Vocal Tract & Lip Shape Estimation By MS Shah & Vikash Sethia Supervisor: Prof. PC Pandey EE Dept, IIT Bombay AIM-2003, EE Dept, IIT Bombay, 27 th June,


1 Vocal Tract & Lip Shape Estimation. By MS Shah & Vikash Sethia. Supervisor: Prof. PC Pandey, EE Dept, IIT Bombay. AIM-2003, EE Dept, IIT Bombay, 27th June 2003

2 ABSTRACT: The display of intensity, pitch, and vocal tract shape is considered to be helpful in speech training of the hearing impaired. A speech analysis package has been developed in MATLAB for displaying speech waveforms, pitch and energy contours, the spectrogram, and the areagram (a two-dimensional plot of the cross-sectional area of the vocal tract as a function of time and position along the tract length). Vocal tract shape estimation works satisfactorily for vowels, but during stop closures the place of closure cannot be estimated because the signal energy is very low. There is a need to investigate methods for predicting the vocal tract shape during stop closure from the shapes estimated on either side of the closure. Work is in progress on lip shape estimation, which may find application in video telephony.

3 Introduction
- Hearing impairment → lack of auditory feedback during speech production → speech impairment
- Speech training of hearing-impaired children by visual (using a mirror) and tactile feedback: some important features and efforts are not distinguishable
- Speech training aids: display of articulatory efforts and acoustic parameters (vocal tract and lip shape, pitch, and energy variations)

4 Vocal tract shape estimation
General model for the speech production system [equation not preserved in this transcript], where
s(n) = speech signal,
u(n) = glottal excitation,
g(n) = glottis impulse response,
v(n) = impulse response of the vocal tract,
r(n) = impulse response of radiation from the lips.
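Given the definitions on this slide, the model is presumably the usual source-filter cascade, in which the excitation passes through the glottis, vocal tract, and lip radiation stages in turn. The relation below is that standard form, offered as an assumption and not as a copy of the slide, with * denoting convolution:

```latex
s(n) = u(n) * g(n) * v(n) * r(n)
\quad\Longleftrightarrow\quad
S(z) = U(z)\,G(z)\,V(z)\,R(z)
```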

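5 Vocal tract shape estimation (cont.): acoustic tube model of the vocal tract
At the m-th section, expressions are given for the volume velocity, the pressure, and the reflection coefficient [equations not preserved in this transcript].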
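In a lossless tube model, the reflection coefficient at each junction depends only on the areas of the two adjoining sections, so a list of coefficients fixes the area profile up to a scale factor. A minimal Python sketch of that relation follows. The sign convention, the function names, and the arbitrary starting area are assumptions, since the slide's own equations are not available.

```python
def areas_from_reflection(coeffs, first_area=1.0):
    """Chain relative section areas from junction reflection coefficients.

    Assumed convention (not confirmed by the slides):
        r_m = (A_{m+1} - A_m) / (A_{m+1} + A_m)
    which rearranges to A_{m+1} = A_m * (1 + r_m) / (1 - r_m).
    Only relative areas are recovered; first_area is an arbitrary scale.
    """
    areas = [first_area]
    for r in coeffs:
        if not -1.0 < r < 1.0:
            raise ValueError("reflection coefficients must lie in (-1, 1)")
        areas.append(areas[-1] * (1.0 + r) / (1.0 - r))
    return areas


def reflection_from_areas(areas):
    """Inverse mapping under the same assumed convention."""
    return [(b - a) / (b + a) for a, b in zip(areas, areas[1:])]
```

With the opposite sign convention, which some texts use, the same code applies after negating every coefficient.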

6 Vocal tract shape estimation (cont.): speech analysis model (Wakita, 1973)
Assumption: the vocal tract is represented as an all-pole filter [transfer function not preserved in this transcript].
Algorithmic steps:
- Inverse filtering to obtain the error signal, using the LMS technique
- The resulting set of simultaneous equations is solved with Robinson's algorithm to obtain the reflection coefficients and relative area values
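The steps above can be sketched in Python as an autocorrelation analysis followed by a recursive solve of the normal equations. The sketch uses the Levinson-Durbin recursion, a standard solver for these equations that yields the reflection coefficients as a by-product. Whether it matches the exact variant of Robinson's algorithm used in the authors' package is an assumption, and the function names are illustrative.

```python
def autocorrelation(frame, max_lag):
    """Biased autocorrelation r[0..max_lag] of one analysis frame."""
    n = len(frame)
    return [sum(frame[t] * frame[t + lag] for t in range(n - lag))
            for lag in range(max_lag + 1)]


def levinson_durbin(r, order):
    """Solve the all-pole normal equations recursively.

    Returns (a, ks, err) where a holds the inverse-filter coefficients of
    A(z) = 1 + a[1] z^-1 + ... + a[order] z^-order, ks holds the
    reflection coefficients, and err is the final prediction error energy.
    """
    if r[0] <= 0.0:
        raise ValueError("frame has no energy; the analysis is undefined")
    a = [1.0] + [0.0] * order
    err = r[0]
    ks = []
    for m in range(1, order + 1):
        acc = r[m] + sum(a[i] * r[m - i] for i in range(1, m))
        k = -acc / err
        ks.append(k)
        updated = a[:]
        for i in range(1, m):
            updated[i] = a[i] + k * a[m - i]
        updated[m] = k
        a = updated
        err *= 1.0 - k * k
    return a, ks, err
```

The zero-energy check is relevant to the stop-closure problem noted in the abstract: when a frame carries almost no signal, the recursion has nothing reliable to work with.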

7 Implementation
■ Set-up: PC with sound card for signal acquisition (sampling rate used: 11.025 kSa/s)
■ "VTAG-1" developed for speech processing & display
- Pre-emphasis for 6 dB/octave equalization; analysis window: 256-sample Hamming with 50% overlap
- Robinson's algorithm for obtaining reflection coefficients & area values
- Bezier-form algorithm for interpolation of area values
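The front-end described on this slide can be sketched as a first-order pre-emphasis filter followed by overlapping Hamming-windowed frames. This is a Python illustration, not the MATLAB code of VTAG-1. The pre-emphasis coefficient of 0.95 is an assumed value (the slide only states the 6 dB/octave slope), and the function names are hypothetical.

```python
import math


def pre_emphasis(x, alpha=0.95):
    """First-order difference y[n] = x[n] - alpha * x[n-1].

    alpha = 0.95 is an assumed, commonly used value.
    """
    if not x:
        return []
    return [x[0]] + [x[n] - alpha * x[n - 1] for n in range(1, len(x))]


def hamming(length):
    """Symmetric Hamming window of the given length."""
    return [0.54 - 0.46 * math.cos(2.0 * math.pi * n / (length - 1))
            for n in range(length)]


def windowed_frames(x, length=256, overlap=0.5):
    """Split x into Hamming-windowed frames with fractional overlap."""
    hop = int(length * (1.0 - overlap))
    window = hamming(length)
    return [[x[start + n] * window[n] for n in range(length)]
            for start in range(0, len(x) - length + 1, hop)]
```

At 11.025 kSa/s, a 256-sample frame spans about 23.2 ms and the 128-sample hop about 11.6 ms.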

8 VTAG-1 result for the all-vowel word /aIje/

9 Synthesized vowels /a/, /u/, /i/

10 Amplitude/pitch modulated synthesized vowel /a/. Panels: amplitude modulated; pitch modulated; amplitude & pitch modulated

11 Spectrograms for V-C-V sequences /aka/, /aga/, /ata/, /ada/

12 Areagrams for V-C-V sequences /aka/, /aga/, /ata/, /ada/

13 Lip shape estimation
- Mouth parameters: [figure not preserved in this transcript]
- Parameter estimation:
  - Pitch tracking: odd harmonics are absent when the analysis window length = 2 × pitch period
  - Magnitude spectrum above 4000 Hz clipped to zero
  - Mean & variance used for generation of predictor surfaces
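The window-length remark can be checked numerically: when a periodic signal is analysed over exactly two pitch periods, its harmonics fall on the even DFT bins and the odd bins vanish. The Python sketch below demonstrates this and applies the 4000 Hz clipping. Taking the plain mean and variance of the clipped magnitude values is only one reading of the slide and is an assumption, as are the function names.

```python
import cmath
import math
import statistics


def dft_magnitude(x):
    """Magnitude of the DFT of x for bins 0..len(x)//2 (direct O(N^2) sum)."""
    n = len(x)
    return [abs(sum(x[t] * cmath.exp(-2j * math.pi * k * t / n)
                    for t in range(n)))
            for k in range(n // 2 + 1)]


def two_period_spectrum(signal, period, fs, cutoff_hz=4000.0):
    """Spectrum of a window two pitch periods long, zeroed above cutoff_hz."""
    window = signal[:2 * period]
    bin_hz = fs / len(window)
    return [m if k * bin_hz <= cutoff_hz else 0.0
            for k, m in enumerate(dft_magnitude(window))]


def spectrum_stats(spectrum):
    """Mean and variance of the magnitude values (an assumed definition)."""
    return statistics.mean(spectrum), statistics.pvariance(spectrum)


# Synthetic voiced-like signal: fundamental plus second harmonic,
# with a pitch period of 50 samples at 11025 samples/s.
PERIOD, FS = 50, 11025
signal = [math.sin(2 * math.pi * t / PERIOD)
          + 0.5 * math.sin(2 * math.pi * 2 * t / PERIOD)
          for t in range(2 * PERIOD)]
spectrum = two_period_spectrum(signal, PERIOD, FS)
odd_bin_peak = max(spectrum[1::2])   # numerically close to zero
mean_value, variance_value = spectrum_stats(spectrum)
```

In practice the pitch period would come from the pitch tracker, so an error in the period estimate would show up as energy leaking into the odd bins.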

14 Lip shape estimation results. Pitch and mean vs. variance result (1): synthesized amplitude modulated vowel /u/

15 Pitch and mean vs. variance result (2): synthesized pitch/amplitude modulated vowel /a/

16 Pitch and mean vs. variance result (3): synthesized pitch modulated vowel /i/

17 Summary
■ Analysis & display package VTAG-1 developed for pitch/energy variation, spectrogram, & areagram (2-D plot of vocal tract area), to investigate the problems in estimating vocal tract shape for use in a speech training aid for hearing-impaired children.

18 Summary (cont.)
■ Area estimation for vowels: not affected by amplitude & pitch variation
■ Area estimation during stop closure: the place of closure cannot be estimated from the analysis result during stop closure
■ Further work:
- Investigate methods for predicting the vocal tract area during stop closure from the areas estimated on either side of the closure
- Implement an algorithm for generating predictor surfaces for extracting lip shape estimation parameters

