Brad Story, Speech, Language, and Hearing Sciences, University of Arizona. Research supported by NIH R01-04789. Advances in Speech Synthesis, ASA Fall 2009.

2 Advances in simulation of sentence-level speech production with kinematic models of the vocal tract and vocal folds. Brad Story, Speech, Language, and Hearing Sciences, University of Arizona. Research supported by NIH R01-04789. Advances in Speech Synthesis, ASA Fall 2009, San Antonio, TX.

3 Goal: Develop a model to facilitate understanding of the human sound production system: the nonlinear interaction of source and filter (vocal tract); the acoustics produced by the vocal folds and vocal tract (anatomic/physiologic scaling, time-varying changes); and the perceptual response to sounds produced by the system. Brad Story, U. of Arizona

4 To some degree, the model should allow access to three levels: coordinated movement, speech production, and speech perception. Implication: the model must produce intelligible speech at the syllable, word, or sentence level.

5 Replicate Natural Speech. Fleshpoint tracking (U. Wisconsin X-ray microbeam): temporal patterns of vocal tract shape change, constriction locations,… to model parameters. Tracking acoustic characteristics: acoustic characteristics to parameters for vocal fold vibration and vocal tract shape. [Figure: formant tracks F1, F2, F3 for "The black cat"; axes: Frequency (Hz) vs. Time]

6 Parts of the Model. 1. Kinematic vocal fold model (voice source*): driven motion of the medial surfaces; glottal flow is interactive with vocal tract pressures. 2. Vocal tract: 1D tubular waveguide. [Figure labels: trachea, voice source, vocal tract] *Titze, I.R. (2006). The myoelastic aerodynamic theory of phonation, NCVS, pp

7 1. Model of Vocal Fold Kinematics: adductory maneuver + vibrational displacement. Glottal width combines a slowly-varying postural component with a vibrational displacement. [Figure: medial surfaces of the vocal folds, with dimensions L and T]

8 1. Model of Vocal Fold Kinematics: adductory maneuver + vibrational displacement. Glottal width combines a slowly-varying postural component with a vibrational displacement. [Figure: medial surfaces of the vocal folds; glottal width; glottal area]
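The decomposition on this slide (slowly-varying posture plus vibrational displacement) can be illustrated with a minimal sketch. This is not the kinematic model used in the talk: the sinusoidal vibration, the single-slice rectangular glottis, and every name and numeric value below are assumptions chosen only for illustration.

```python
import numpy as np

def glottal_area(t, posture_cm, f0_hz=110.0, vib_amp_cm=0.1, length_cm=1.2):
    """Toy glottal area (cm^2): posture plus vibration, clipped at closure.

    posture_cm : slowly-varying glottal width (scalar or array like t)
    f0_hz, vib_amp_cm, length_cm : hypothetical illustrative values
    """
    t = np.asarray(t, dtype=float)
    # Vibrational displacement, assumed sinusoidal for this sketch
    vibration = vib_amp_cm * np.sin(2.0 * np.pi * f0_hz * t)
    # The width cannot go negative: the folds meet at the midline
    width = np.maximum(posture_cm + vibration, 0.0)
    # Rectangular glottis assumption: area = length * width
    return length_cm * width

# Adductory maneuver: the folds move from apart (0.3 cm) to nearly together
t = np.arange(0.0, 0.05, 1.0 / 44100.0)
posture = np.linspace(0.3, 0.02, t.size)
area = glottal_area(t, posture)
```

Early in the maneuver the glottis never closes; once the postural width drops below the vibration amplitude, the area reaches zero during part of each cycle.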

9 Example: typical vibration. [Figure panels: medial surfaces of the vocal folds; glottal area]

10 2. Model of Time-Varying Vocal Tract Shape. Three Categories of Vocal Tract Movements. Shaping: slowly-varying changes to the shape of the entire vocal tract (vowels). Valving: modulate vowels with constrictions (consonants). Tuning: modify parts of the vocal tract shape to enhance voice quality or facilitate voice production. Gracco, V.L. (1992); Perkell, J. (1969); Ohman, S. E. G. (1966; 1967).

11 TubeTalker*: hierarchical control tiers. Tier I: overall vocal tract shaping. Tier II: valving. Together they give the composite time-varying vocal tract. *Story, (2005). JASA, 117,

12 Tier I (Shaping): Derived from principal component analysis of vocal tract shapes for a collection of vowels from a specific speaker. [Figure: average (mean) VT shape; [F1, F2] vowel space] Story and Titze, (1998). J. Phonetics; Story, (2005), JASA
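As a rough illustration of what a principal component analysis of vowel shapes yields, the sketch below extracts a mean shape and two shaping modes, then rebuilds a shape from two coefficients. The input "diameters" are synthetic placeholders, not measured vocal tract data. The coefficient names q1 and q2 follow the slides, while the function name and the circular cross-section used to convert diameter to area are assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
n_vowels, n_sections = 10, 44
x = np.linspace(0.0, 1.0, n_sections)  # normalized glottis-to-lips position

# Synthetic placeholder data: one diameter function (cm) per vowel
base = 1.5 + 0.5 * np.sin(np.pi * x)
weights = rng.normal(size=(n_vowels, 2))
diameters = (base
             + 0.4 * weights[:, :1] * np.cos(np.pi * x)
             + 0.2 * weights[:, 1:] * np.cos(2.0 * np.pi * x))

# PCA via SVD of the mean-centered shapes
mean_shape = diameters.mean(axis=0)
centered = diameters - mean_shape
_, _, vt = np.linalg.svd(centered, full_matrices=False)
modes = vt[:2]                # two leading shaping modes
coeffs = centered @ modes.T   # per-vowel [q1, q2]

def area_from_coeffs(q1, q2):
    """Area function (cm^2) from two mode coefficients (hypothetical helper)."""
    diameter = mean_shape + q1 * modes[0] + q2 * modes[1]
    # Circular cross-section assumption: area = (pi / 4) * diameter^2
    return (np.pi / 4.0) * np.maximum(diameter, 0.0) ** 2
```

Setting both coefficients to zero returns the mean shape, and each vowel in the collection corresponds to one point in the [q1, q2] plane.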

13 Tier I (Shaping): Mapping of [F1, F2] frequencies to model coefficients; interpolate between trajectories. [Figure: "The black cat" trajectory in the [F1, F2] vowel space and in the [q1, q2] coefficient space]
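One way to picture a formant-to-coefficient mapping is a precomputed table that is searched for the closest formant pair. The sketch below uses a nearest-neighbor lookup, which is a simpler stand-in for the interpolation mentioned on the slide. The forward function `toy_formants` is a made-up smooth map, not a vocal tract acoustic model, and all constants are placeholders.

```python
import numpy as np

def toy_formants(q1, q2):
    """Placeholder forward map from [q1, q2] to [F1, F2] in Hz (not acoustics)."""
    f1 = 500.0 + 60.0 * q1 - 20.0 * q2 + 2.0 * q1 * q2
    f2 = 1500.0 + 30.0 * q1 + 80.0 * q2
    return f1, f2

# Tabulate the forward map on a regular coefficient grid
q_axis = np.linspace(-4.0, 4.0, 81)
q1_grid, q2_grid = np.meshgrid(q_axis, q_axis, indexing="ij")
f1_grid, f2_grid = toy_formants(q1_grid, q2_grid)

def coeffs_from_formants(f1, f2):
    """Return the grid [q1, q2] whose tabulated formants are closest to (f1, f2)."""
    dist2 = (f1_grid - f1) ** 2 + (f2_grid - f2) ** 2
    i, j = np.unravel_index(np.argmin(dist2), dist2.shape)
    return q1_grid[i, j], q2_grid[i, j]

# Round trip: coefficients -> formants -> recovered coefficients
q_true = (1.34, -0.72)
q_est = coeffs_from_formants(*toy_formants(*q_true))
```

Applying the lookup to each frame of a measured [F1, F2] track would give a coefficient trajectory; a real implementation would interpolate between table entries rather than snap to the grid.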

14 Tier I: Vowel transitions (overall shape changes) for "The black cat". [Figure labels: postural component; vocal tract deformation due to articulation]

15 Tier I: modulation of the overall shape of the vocal tract. [Figure: continuous time-varying formant frequencies F1, F2, F3 for "The black cat", vowel transitions only]

16 Vocal Fold and Respiratory Parameters for "The black cat": fundamental frequency; separation of the vocal folds at the vocal processes; respiratory pressure.

17 Tier II (consonant valving): requires specification of constriction location, degree of closure, and temporal characteristics. [Figure panels: constriction location/degree; time course of the constriction]

18 Constriction location and timing. [Figure: U. Wisconsin X-ray microbeam data for "The black cat"; movement in the midsagittal plane; time-varying cross-distance]

19 Tier I (Shaping): vowel shapes are imposed on the neutral vocal tract. Tier II (Valving): consonant perturbations are imposed on the vowel substrate. Example phrase: "The black cat".
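The idea of imposing a consonant perturbation on a vowel substrate can be sketched as a scaling function that multiplies the vowel area function. The constriction is specified by the three ingredients named in the slides: location, degree, and time course. The Gaussian spatial profile, the raised-cosine time course, the uniform vowel shape, and all parameter names and values are assumptions for illustration, not the functions used in TubeTalker.

```python
import numpy as np

def constriction_scale(x, t, location_cm, extent_cm, degree, t_peak, duration):
    """Multiplier in [0, 1] applied to the vowel area function at time t."""
    # Spatial profile: strongest at the constriction location (assumed Gaussian)
    spatial = np.exp(-0.5 * ((x - location_cm) / extent_cm) ** 2)
    # Time course: raised cosine that goes 0 -> 1 -> 0, centered on t_peak
    phase = np.clip((t - t_peak) / duration, -0.5, 0.5)
    temporal = 0.5 * (1.0 + np.cos(2.0 * np.pi * phase))
    return 1.0 - degree * temporal * spatial

def composite_area(vowel_area, x, t, **constriction):
    """Vowel substrate multiplied by the consonant scaling function."""
    return vowel_area * constriction_scale(x, t, **constriction)

x = np.linspace(0.0, 17.5, 44)        # distance from the glottis (cm)
vowel_area = np.full_like(x, 3.0)     # placeholder uniform vowel shape (cm^2)
lip_closure = dict(location_cm=17.5, extent_cm=1.0, degree=1.0,
                   t_peak=0.10, duration=0.08)

before = composite_area(vowel_area, x, 0.0, **lip_closure)   # no constriction yet
closed = composite_area(vowel_area, x, 0.10, **lip_closure)  # full closure at lips
```

In this sketch, moving `location_cm` away from the lips changes the place of articulation while leaving the vowel substrate untouched.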

20 Shaping + Valving = composite time-varying vocal tract shape. [Figure: time-varying formant frequencies F1, F2, F3 for "The black cat"; vowel transitions only vs. vowel transitions + consonant perturbations]

21 Spectrographic Comparison: Simulated vs. Natural. [Figure panels: Simulated; Natural]

22 Modification: change constriction location. "The black cat" becomes "The black bat".

23 Modification: change constriction location and open nasal port. "The black cat" becomes "The black gnat". [Figure legend: marker indicates nasal port open]

24 black cat; black bat; black gnat

25 Variations of the phrase "Black cat": vowels; vowels w/adduct; normal; widened epilarynx; constricted epilarynx; long duration, tremor; altered F0 contour; short VT, high F0; shortened VT, increased F0; Halloween voice.

26 The End. Operation Black Cat.

