Published by Kathlyn Theresa Norris. Modified over 5 years ago.

1
Complete Discrete Time Model

The complete model covers periodic, noise, and impulsive inputs. For a periodic input:

1) R(z): radiation impedance. It has been shown that R(z) can be approximated as $R(z) \approx 1 - \alpha z^{-1}$ with $\alpha$ slightly less than 1, or as $R(z) = 1 - z^{-1}$, a differentiator.
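To see why this term acts as a differentiator, the sketch below evaluates the magnitude response of a first-order radiation model on the unit circle. It assumes the form $R(z) = 1 - \alpha z^{-1}$; the function name `radiation_response` and the value $\alpha = 0.98$ are illustrative choices, not taken from the slides.

```python
import numpy as np

def radiation_response(alpha, n_freqs=512):
    """Magnitude of R(z) = 1 - alpha * z^-1 for omega in [0, pi] (assumed model)."""
    omega = np.linspace(0.0, np.pi, n_freqs)
    response = 1.0 - alpha * np.exp(-1j * omega)
    return omega, np.abs(response)

# alpha = 0.98 is a hypothetical value; alpha = 1 gives the pure differentiator.
omega, mag = radiation_response(alpha=0.98)
print(mag[0], mag[-1])  # small gain at DC, largest gain at omega = pi
```

The gain rises monotonically with frequency, which is the high-pass (differentiating) behavior attributed to the radiation load.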

2
Therefore, in continuous time, the output can be written as (approximately) the time derivative of the volume velocity at the lips. R(z) can be moved to the glottis in the previous figure.

2) G(z): z-transform of the glottal flow input g[n] over one cycle. It can be approximated by $G(z) = \dfrac{1}{(1 - \beta z)^2}$. If $\beta < 1$, this gives two identical poles outside the unit circle and two zeros at infinity (maximum phase).

3) V(z): all-pole vocal-tract function.
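A minimal sketch of the glottal term follows, assuming the two-pole form $G(z) = 1/(1 - \beta z)^2$, which matches the description above (two identical poles outside the unit circle for $\beta < 1$). The function name `glottal_two_pole` and the value $\beta = 0.9$ are hypothetical. The stable impulse response is anticausal: it is the time reversal of the causal response $(k+1)\beta^k$ of $1/(1 - \beta z^{-1})^2$.

```python
import numpy as np

def glottal_two_pole(beta, length=50):
    """Anticausal impulse response of G(z) = 1 / (1 - beta*z)^2, 0 < beta < 1.

    Returns (n, g) with the time index n running from -(length - 1) up to 0.
    """
    k = np.arange(length)              # k = -n
    causal = (k + 1) * beta ** k       # response of 1 / (1 - beta * z^-1)^2
    n = -k[::-1]                       # time axis for the reversed sequence
    g = causal[::-1]                   # time reversal maps z^-1 to z
    return n, g

n, g = glottal_two_pole(beta=0.9)      # beta = 0.9 is an illustrative value
```

The sequence builds up gradually and ends abruptly at n = 0, which is the qualitative shape expected of a maximum-phase glottal pulse.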

3
Therefore V(z) and R(z) are minimum phase, while G(z) is maximum phase.

Some related work

Zeros of Z-Transform (ZZT) Decomposition of Speech for Source-Tract Separation
Baris Bozkurt, Boris Doval, Christophe D'Alessandro, Thierry Dutoit

This study proposes a new spectral decomposition method for source-tract separation. It is based on a new spectral representation called the Zeros of Z-Transform (ZZT), which is an all-zero representation of the z-transform of the signal. We show that separate patterns exist in ZZT representations of speech signals for the glottal flow and the vocal tract contributions. The ZZT decomposition simply consists of grouping the zeros into two sets, according to their location in the z-plane. This type of decomposition leads to separating the glottal flow contribution (without a return phase) from the vocal tract contributions in the z domain.
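The grouping step can be illustrated with a toy example. The sketch below assumes the split is made at the unit circle (zeros outside attributed to the maximum-phase glottal part, zeros inside to the minimum-phase part), in line with the phase properties stated above; the function name `zzt_decompose` is hypothetical, and practical details of the published method such as analysis windowing are ignored here.

```python
import numpy as np

def zzt_decompose(frame):
    """Split the zeros of a frame's z-transform by location relative to |z| = 1."""
    frame = np.asarray(frame, dtype=float)
    # X(z) = sum_n x[n] z^-n has the same zeros as the polynomial whose
    # coefficients are x[0], x[1], ... in descending powers of z.
    zeros = np.roots(frame)
    outside = zeros[np.abs(zeros) > 1.0]   # maximum-phase group
    inside = zeros[np.abs(zeros) <= 1.0]   # minimum-phase group
    return outside, inside

# Toy frame: a minimum-phase factor convolved with a maximum-phase factor.
min_phase = np.array([1.0, -0.5])   # zero at z = 0.5
max_phase = np.array([1.0, -2.0])   # zero at z = 2
frame = np.convolve(min_phase, max_phase)
outside, inside = zzt_decompose(frame)
```

Each group of zeros can be turned back into a sequence with `np.poly`, which recovers the two factors up to a gain.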

4
A Method for Glottal Formant Frequency Estimation
Baris Bozkurt, Boris Doval, Christophe D'Alessandro, Thierry Dutoit

This study presents a method for estimation of glottal formant frequency (Fg) from speech signals. Our method is based on zeros of z-transform decomposition of speech spectra into two spectra: a glottal flow dominated spectrum and a vocal tract dominated spectrum. Peak picking is performed on the amplitude spectrum of the glottal flow dominated part. The algorithm is tested on synthetic speech. It is shown to be effective especially when the glottal formant and the first formant of the vocal tract are not too close. In addition, tests on a real speech example are also presented where open quotient estimates from EGG signals are used as reference and correlated with the glottal formant frequency estimates.

Improved Differential Phase Spectrum Processing for Formant Tracking
Baris Bozkurt, Boris Doval, Christophe D'Alessandro, Thierry Dutoit

This study presents an improved version of our previously introduced formant tracking algorithm. The algorithm is based on processing the negative derivative of the argument of the chirp-z transform (termed the differential phase spectrum) of a given speech signal. No modeling is included in the procedure, only peak picking on the differential phase spectrum. We discuss the effects of the roots of the z-transform on the differential phase spectrum and the need to ensure that all zeros are at some distance from the circle where the chirp-z transform is computed. For that, we include an additional zero-decomposition step in our previously presented algorithm to improve its robustness. The final version of the algorithm is tested for analysis of synthetic speech and real speech signals and compared to two other formant tracking systems.

5
If the differentiation at the output (radiation impedance) is applied to the glottal flow, the resulting derivative of the glottal flow looks more like a pulse.

[Figure: glottal flow and glottal flow derivative]
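The effect can be reproduced numerically. The sketch below applies the first difference $1 - z^{-1}$ to one cycle of a synthetic glottal pulse. The pulse shape is an assumption made for illustration (a Rosenberg-type pulse with a slow cosine rise and a faster cosine fall); the slides do not specify one, and all sample counts are hypothetical.

```python
import numpy as np

def rosenberg_pulse(n_open=40, n_close=16, n_total=100):
    """One cycle of a Rosenberg-type glottal flow pulse (assumed shape)."""
    g = np.zeros(n_total)
    rise = np.arange(n_open)
    fall = np.arange(n_close)
    g[:n_open] = 0.5 * (1.0 - np.cos(np.pi * rise / n_open))             # slow opening
    g[n_open:n_open + n_close] = np.cos(np.pi * fall / (2.0 * n_close))  # fast closing
    return g

flow = rosenberg_pulse()
# First difference, i.e. filtering with R(z) = 1 - z^-1 (zero initial state).
flow_derivative = np.diff(flow, prepend=0.0)
```

Because the closing phase is shorter than the opening phase, the derivative has a sharp negative peak near glottal closure that dominates the broad positive lobe.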

6
[Figures: noise input; impulse input]

The combination of the three inputs may be linear or nonlinear.

7
OTHER ZEROS OF THE VOCAL TRACT

In the noise and impulse source states, oral tract constrictions may give zeros as well as poles (absorption of energy by cavity anti-resonances), so V(z) may have zeros. The vocal tract function is generally mixed phase. Maximum-phase elements of the vocal tract can also contribute to a more gradual attack of the speech waveform.

The modeling described here is called the SOURCE-FILTER MODEL of speech production.

8
Vocal Fold and Vocal Tract Interaction

In the source-filter model, it is assumed that the glottal impedance is infinite and that the glottal airflow is not influenced by the vocal tract. However, the pressure in the vocal tract cavity above the glottis backs up (resists) against the glottal flow.

9
The electrical analog uses the following quantities.

[Figure: electrical analog circuit]

$P_{sg}$: subglottal (lung) pressure.
$p(t)$: sound pressure corresponding to a single first formant in front of the glottis (it has been found that the other formants have a negligible effect on glottal flow).
$Z_g(t)$: time-varying impedance of the glottis.
R, L, C: these parameters model the first formant, with formant (center) frequency $\Omega_0$ and 3-dB bandwidth $B_0$.

10
$Z_g(t)$ accounts for the interaction between the glottal flow and the vocal tract. If $Z_g(t)$ is comparable to the impedance of the first formant, there will be considerable interaction, and $\Omega_0$ and $B_0$ will be affected. $Z_g(t)$ has also been found to be nonlinear; in its expression, $k = 1.1$ and $A(t)$ is the smallest time-varying area of the glottal slit. The resulting equations are nonlinear and time-varying.

11
Numerical solution of the above equations reveals that the skewness of the glottal flow is due in part to $A(t)$ and in part to the loading effect of the first formant. The numerical solution also yields a ripple component.

[Figure: glottal flow derivative]

12
The problem can be analyzed approximately by linearizing the differential equation, using a first-order Taylor series expansion that is valid if $x \ll 1$.
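The linearization step can be sketched as follows. The slide's own equations were not preserved, so this derivation rests on assumptions: the transglottal pressure drop is taken to follow the orifice relation $P_{sg} - p(t) = k\rho\,u_g^2(t) / (2A^2(t))$, where $\rho$ (air density) and $u_g(t)$ (glottal volume velocity) are symbols introduced here, and the expanded function is taken to be $\sqrt{1 - x}$ with $x = p(t)/P_{sg}$.

```latex
\begin{align}
u_g(t) &= A(t)\sqrt{\frac{2\,\bigl(P_{sg} - p(t)\bigr)}{k\rho}}
        = U_{sc}(t)\,\sqrt{1 - \frac{p(t)}{P_{sg}}},
  \qquad U_{sc}(t) = A(t)\sqrt{\frac{2P_{sg}}{k\rho}} \\
\sqrt{1 - x} &\approx 1 - \tfrac{1}{2}x \quad (x \ll 1) \\
u_g(t) &\approx U_{sc}(t) - g_0(t)\,p(t),
  \qquad g_0(t) = \frac{U_{sc}(t)}{2P_{sg}}
\end{align}
```

Under these assumptions the glottis behaves as a time-varying current source $U_{sc}(t)$ in parallel with a time-varying conductance $g_0(t)$, both proportional to the glottal area $A(t)$.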

13
where [definitions missing]. By differentiation: [equation missing].
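A plausible form of the differentiated equation is sketched below. It is a reconstruction under stated assumptions, not the slide's own text: the first formant is modeled as a parallel RLC circuit driven by the glottal flow, and the linearized glottal flow is $u_g(t) \approx U_{sc}(t) - g_0(t)\,p(t)$, with $g_0(t)$ a time-varying glottal conductance.

```latex
\begin{align}
C\,\frac{dp}{dt} + \frac{p(t)}{R} + \frac{1}{L}\int_{-\infty}^{t} p(\tau)\,d\tau
  &= U_{sc}(t) - g_0(t)\,p(t) \\
C\,\frac{d^2p}{dt^2} + \Bigl[\frac{1}{R} + g_0(t)\Bigr]\frac{dp}{dt}
  + \Bigl[\frac{1}{L} + \dot g_0(t)\Bigr]p(t)
  &= \dot U_{sc}(t)
\end{align}
```

The second line follows from differentiating the first, using $\frac{d}{dt}\bigl[g_0(t)\,p(t)\bigr] = \dot g_0(t)\,p(t) + g_0(t)\,\dot p(t)$. The result is linear in $p(t)$ but has time-varying coefficients.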

14
The corresponding Norton equivalent circuit uses $U_{sc}(t)$, which is now a time-varying source.

[Figure: Norton equivalent circuit]

15
Formant Frequency and Bandwidth Modulation

Because we have a linear but time-varying equation, the formant frequency and bandwidth are time-varying, i.e., they are modulated. The Laplace transform does not apply, but the equation can be solved at each time instant as a constant-coefficient equation.

$g_0(t)$ is proportional to $A(t)$ (glottal area).
The bandwidth increase is proportional to the glottal area ($B_1(t) \ge B_0$ since $A(t) \ge 0$).
The formant frequency shift depends on the derivative of the glottal area ($\Omega_1(t)$ may be above or below $\Omega_0$).
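The frozen-time view can be tried numerically. The sketch below assumes a parallel RLC model of the first formant ($\Omega_0^2 = 1/(LC)$, $B_0 = 1/(RC)$) loaded by a glottal conductance $g_0(t)$, which gives $B_1(t) = B_0\,[1 + R\,g_0(t)]$ and $\Omega_1^2(t) = \Omega_0^2\,[1 + L\,\dot g_0(t)]$. These expressions, the function name `frozen_time_formant`, and every numeric value are assumptions for illustration, not taken from the slides.

```python
import numpy as np

def frozen_time_formant(g0, dt, omega0, b0, c=1.0):
    """Instantaneous bandwidth and formant frequency for a loaded parallel RLC.

    Assumed model: B1 = B0 * (1 + R*g0), Omega1^2 = Omega0^2 * (1 + L*dg0/dt).
    """
    r = 1.0 / (b0 * c)               # from B0 = 1 / (R C)
    l = 1.0 / (omega0 ** 2 * c)      # from Omega0^2 = 1 / (L C)
    g0_dot = np.gradient(g0, dt)     # numerical derivative of the conductance
    b1 = b0 * (1.0 + r * g0)
    omega1 = omega0 * np.sqrt(np.maximum(1.0 + l * g0_dot, 0.0))
    return b1, omega1

# Hypothetical example: 500 Hz formant, 50 Hz bandwidth, 5 ms open phase.
dt = 1.0 / 16000.0
t = np.arange(80) * dt
area = np.sin(np.pi * t / 0.005) ** 2          # normalized glottal area, peak 1
omega0 = 2.0 * np.pi * 500.0
b0 = 2.0 * np.pi * 50.0
g0 = 2.5 * b0 * area                           # scaled so peak B1 is 3.5 * B0
b1, omega1 = frozen_time_formant(g0, dt, omega0, b0)
```

With these values the bandwidth follows the glottal area and never drops below $B_0$, while the formant frequency rises slightly above $\Omega_0$ as the glottis opens and falls slightly below it as the glottis closes.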

16
[Figure panels: glottal area; bandwidth; formant factor]

17
In the minimum bandwidth modulation cases (/i/, /u/), $B_1(t)$ increases by a factor of 3 to 4. The multiplier of $\Omega_1(t)$ ranges from about 0.8 to 1.2.

Conclusions

The increase of $B_1(t)$ within a glottal cycle yields the truncation of the glottal flow (sharp closing of the folds). It is due to a decrease in the impedance at the glottis as the glottis opens. The reduced glottal impedance $Z_g(t)$ yields a pressure drop across the glottis.

18
Truncation Effect (Using Klatt Synthesiser)

19
Truncation Effect
