1
A formant-trajectory model and its usage in comparing coarticulatory effects in dysarthric and normal speech
Xiaochuan Niu and Jan P. H. van Santen
Center for Spoken Language Understanding, OGI School of Science and Engineering at Oregon Health & Science University, USA
MAVEBA 2003 Florence, Italy December 10-12, 2003
2
What is Dysarthria?
Group of speech disorders
–Weakness / incoordination of speech muscles
–Result of damage to the brain or nerves
Results in unintelligible speech
3
Long Term Project Goal
Long term goal: Speech transformation
–Device that works in real time
–Not: Amplifier, spectral filter
–But: Correct for dynamic articulatory problems
Based on a dynamic model of coarticulation
Today’s talk:
–Test (very simple) model of vowel dynamics
4
Observation: Vowel Formants
Median Formants in Vowel Centers [pics]
5
Framework
Formant Trajectories
–(linear or non-linear) interpolation
–between vowel targets
Three mechanisms for vowel triangle data:
1. More coarticulation (interpolation too smooth)
2. More random variability
3. Incorrect targets
6
Mechanism 1: Coarticulation
Average formants of any given vowel …
–… more strongly dependent on …
–… the average of the virtual formants …
–… of the surrounding consonants
7
Mechanism 2: Random Variability
Average formants of any given vowel …
–… result of broad distributions that are …
–… skewed by the boundaries of vowel space
8
Mechanism 3: Incorrect Targets
Average formants of any given vowel …
–… result of a tendency …
–… to move articulators in the wrong direction
9
Linear Coarticulation Model
$$F(t \mid p, v, n) = A_{pt} F_p + B_{nt} F_n + (I - A_{pt} - B_{nt}) F_v$$
Dimensions: the formant vectors $F(t \mid p, v, n)$, $F_p$, $F_n$, $F_v$ are 3x1; the matrices $A_{pt}$, $B_{nt}$, $I - A_{pt} - B_{nt}$ are 3x3
10
Linear Coarticulation Model
$$F(t \mid p, v, n) = A_{pt} F_p + B_{nt} F_n + (I - A_{pt} - B_{nt}) F_v$$
Dimensions: the formant vectors $F(t \mid p, v, n)$, $F_p$, $F_n$, $F_v$ are 3x1; the matrices $A_{pt}$, $B_{nt}$, $I - A_{pt} - B_{nt}$ are 3x3
$F(t \mid p, v, n)$: Observed formant vector
t: Time
p: Preceding consonant
v: Vowel
n: Next consonant
11
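Linear Coarticulation Model
$$F(t \mid p, v, n) = A_{pt} F_p + B_{nt} F_n + (I - A_{pt} - B_{nt}) F_v$$
Dimensions: the formant vectors $F(t \mid p, v, n)$, $F_p$, $F_n$, $F_v$ are 3x1; the matrices $A_{pt}$, $B_{nt}$, $I - A_{pt} - B_{nt}$ are 3x3
$F(t \mid p, v, n)$: Observed formant vector
t: Time
p: Preceding consonant
v: Vowel
n: Next consonant
$A_{pt}$, $B_{nt}$: Weight matrices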
12
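Linear Coarticulation Model
$$F(t \mid p, v, n) = A_{pt} F_p + B_{nt} F_n + (I - A_{pt} - B_{nt}) F_v$$
Dimensions: the formant vectors $F(t \mid p, v, n)$, $F_p$, $F_n$, $F_v$ are 3x1; the matrices $A_{pt}$, $B_{nt}$, $I - A_{pt} - B_{nt}$ are 3x3
$F(t \mid p, v, n)$: Observed formant vector
t: Time
p: Preceding consonant
v: Vowel
n: Next consonant
$F_p$, $F_n$, $F_v$: Target formants
$A_{pt}$, $B_{nt}$: Weight matrices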
13
Linear Coarticulation Model
$$F(t \mid p, v, n) = A_{pt} F_p + B_{nt} F_n + (I - A_{pt} - B_{nt}) F_v$$
Dimensions: the formant vectors $F(t \mid p, v, n)$, $F_p$, $F_n$, $F_v$ are 3x1; the matrices $A_{pt}$, $B_{nt}$, $I - A_{pt} - B_{nt}$ are 3x3
Based on earlier work by Broad, Oehman, Lindblom, Schouten, Pols, Stevens, …
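The linear coarticulation model can be evaluated directly as a weighted combination of three target vectors. The sketch below is illustrative only: the function name `formant_trajectory`, the formant values in Hz, and the weights are hypothetical and do not come from the talk.

```python
import numpy as np

def formant_trajectory(A_pt, B_nt, F_p, F_n, F_v):
    """Evaluate F(t|p,v,n) = A F_p + B F_n + (I - A - B) F_v at one time point.

    A_pt, B_nt: 3x3 weight matrices; F_p, F_n, F_v: length-3 target formant vectors.
    """
    I = np.eye(3)
    return A_pt @ F_p + B_nt @ F_n + (I - A_pt - B_nt) @ F_v

# Hypothetical targets (F1, F2, F3 in Hz) and weights, for illustration only.
F_p = np.array([300.0, 1800.0, 2600.0])   # preceding-consonant target
F_n = np.array([350.0, 1200.0, 2400.0])   # next-consonant target
F_v = np.array([700.0, 1200.0, 2500.0])   # vowel target
A = 0.2 * np.eye(3)
B = 0.1 * np.eye(3)

F = formant_trajectory(A, B, F_p, F_n, F_v)
print(F)
```

With zero weight matrices the function returns the vowel target unchanged; larger weights pull the observed formants toward the consonant targets, which is how the model expresses stronger coarticulation.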
14
How to use for transformation?
$$F(t \mid p, v, n) = A_{pt} F_p + B_{nt} F_n + (I - A_{pt} - B_{nt}) F_v$$
Dimensions: the formant vectors $F(t \mid p, v, n)$, $F_p$, $F_n$, $F_v$ are 3x1; the matrices $A_{pt}$, $B_{nt}$, $I - A_{pt} - B_{nt}$ are 3x3
15
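How to use for transformation?
$$F(t \mid p, v, n) = A_{pt} F_p + B_{nt} F_n + (I - A_{pt} - B_{nt}) F_v$$
Dimensions: the formant vectors $F(t \mid p, v, n)$, $F_p$, $F_n$, $F_v$ are 3x1; the matrices $A_{pt}$, $B_{nt}$, $I - A_{pt} - B_{nt}$ are 3x3
implies the estimate
$$\hat{F}_v = (I - A_{pt} - B_{nt})^{-1} \left( F(t \mid p, v, n) - A_{pt} F_p - B_{nt} F_n \right)$$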
16
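How to use for transformation?
$$F(t \mid p, v, n) = A_{pt} F_p + B_{nt} F_n + (I - A_{pt} - B_{nt}) F_v$$
Dimensions: the formant vectors $F(t \mid p, v, n)$, $F_p$, $F_n$, $F_v$ are 3x1; the matrices $A_{pt}$, $B_{nt}$, $I - A_{pt} - B_{nt}$ are 3x3
implies the estimate
$$\hat{F}_v = (I - A_{pt} - B_{nt})^{-1} \left( F(t \mid p, v, n) - A_{pt} F_p - B_{nt} F_n \right)$$
Partial consonant recognition observed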
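Inverting the model for the vowel target can be sketched as a small linear solve. This is a minimal illustration that assumes the weight matrices and consonant targets are already known and that the matrix I - A - B is invertible; the function name `estimate_vowel_target` and all numeric values are hypothetical.

```python
import numpy as np

def estimate_vowel_target(F_obs, A_pt, B_nt, F_p, F_n):
    """Solve (I - A - B) F_v = F_obs - A F_p - B F_n for the vowel target F_v."""
    M = np.eye(3) - A_pt - B_nt
    # np.linalg.solve avoids forming the explicit inverse; it raises
    # LinAlgError if M is singular (e.g. when the weights sum to identity).
    return np.linalg.solve(M, F_obs - A_pt @ F_p - B_nt @ F_n)

# Hypothetical targets (Hz) and weights, for illustration only.
F_p = np.array([300.0, 1800.0, 2600.0])
F_n = np.array([350.0, 1200.0, 2400.0])
F_v = np.array([700.0, 1200.0, 2500.0])
A = np.diag([0.2, 0.3, 0.1])
B = np.diag([0.1, 0.2, 0.1])

# Synthesize an "observed" vector with the forward model, then invert it.
F_obs = A @ F_p + B @ F_n + (np.eye(3) - A - B) @ F_v
F_v_hat = estimate_vowel_target(F_obs, A, B, F_p, F_n)
print(F_v_hat)
```

The round trip recovers the vowel target only because the synthetic observation is noise-free; with measured formants the estimate would inherit measurement error, amplified when the vowel's own weight is small.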
18
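Application I
$$F(t \mid p, v, n) = A_{pt} F_p + B_{nt} F_n + (I - A_{pt} - B_{nt}) F_v$$
Dimensions: the formant vectors $F(t \mid p, v, n)$, $F_p$, $F_n$, $F_v$ are 3x1; the matrices $A_{pt}$, $B_{nt}$, $I - A_{pt} - B_{nt}$ are 3x3
Model $F(t \mid p, v, n)$ at vowel midpoints
Each token may have different values of $A_{pt}$ and $B_{nt}$
No assumptions about dependency of weights on time
But: assume synchronicity for formant changes (one weight shared by all three formants):
$$A_{pt} = \begin{bmatrix} a_{pt} & 0 & 0 \\ 0 & a_{pt} & 0 \\ 0 & 0 & a_{pt} \end{bmatrix}, \qquad B_{nt} = \begin{bmatrix} b_{nt} & 0 & 0 \\ 0 & b_{nt} & 0 \\ 0 & 0 & b_{nt} \end{bmatrix}$$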
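The talk does not state how the per-token weights were estimated. As one possible illustration, if synchronicity is read as a single scalar weight per consonant shared by all three formants, and if the three targets are assumed known, the two scalars for one token follow from a least-squares fit of F - F_v = a (F_p - F_v) + b (F_n - F_v). The function name `fit_token_weights` and all numeric values are hypothetical.

```python
import numpy as np

def fit_token_weights(F_obs, F_p, F_n, F_v):
    """Least-squares estimate of scalar weights (a, b) for one vowel token.

    Rearranged model: F_obs - F_v = a * (F_p - F_v) + b * (F_n - F_v),
    i.e. three equations (one per formant) in two unknowns.
    """
    X = np.column_stack([F_p - F_v, F_n - F_v])   # 3x2 design matrix
    y = F_obs - F_v
    coef = np.linalg.lstsq(X, y, rcond=None)[0]
    return coef[0], coef[1]

# Hypothetical targets (Hz), for illustration only.
F_p = np.array([300.0, 1800.0, 2600.0])
F_n = np.array([350.0, 1200.0, 2400.0])
F_v = np.array([700.0, 1200.0, 2500.0])

# Synthetic midpoint observation generated with a = 0.2, b = 0.1.
F_mid = 0.2 * F_p + 0.1 * F_n + 0.7 * F_v
a_hat, b_hat = fit_token_weights(F_mid, F_p, F_n, F_v)
print(a_hat, b_hat)
```

Because each token contributes only three equations, this sketch treats the targets as given; estimating targets and weights jointly would require pooling many tokens and is not shown here.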
19
Application I: Targets [jj/ll]
20
Application I: Targets [00/09]
21
Application I: Weights [jj/ll]
22
Application I: Weights [00/09]
23
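Application II
$$F(t \mid p, v, n) = A_{pt} F_p + B_{nt} F_n + (I - A_{pt} - B_{nt}) F_v$$
Dimensions: the formant vectors $F(t \mid p, v, n)$, $F_p$, $F_n$, $F_v$ are 3x1; the matrices $A_{pt}$, $B_{nt}$, $I - A_{pt} - B_{nt}$ are 3x3
Model $F(t \mid p, v, n)$ at vowel midpoints
$A_{pt}$ and $B_{nt}$ same for all tokens
Assumptions are made about dependency of weights on time
But: no synchronicity for formant changes (each formant has its own weight):
$$A_{pt} = \begin{bmatrix} a_{pt} & 0 & 0 \\ 0 & a'_{pt} & 0 \\ 0 & 0 & a''_{pt} \end{bmatrix}, \qquad B_{nt} = \begin{bmatrix} b_{nt} & 0 & 0 \\ 0 & b'_{nt} & 0 \\ 0 & 0 & b''_{nt} \end{bmatrix}$$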
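With diagonal weight matrices shared across tokens, the model decouples into one two-parameter regression per formant. The sketch below shows one way such weights could be fitted by pooling tokens; it assumes the targets for every token are known and is not the estimation procedure used in the talk, which the slides do not describe. The function name `fit_diagonal_weights` and the synthetic data are hypothetical.

```python
import numpy as np

def fit_diagonal_weights(F_obs, F_p, F_n, F_v):
    """Fit per-formant weights shared by all tokens.

    All inputs have shape (num_tokens, 3). For each formant k the model is
    F_obs[:, k] - F_v[:, k] = a[k] * (F_p[:, k] - F_v[:, k])
                            + b[k] * (F_n[:, k] - F_v[:, k]).
    Returns (a, b), the diagonals of A and B, each of shape (3,).
    """
    a = np.empty(3)
    b = np.empty(3)
    for k in range(3):
        X = np.column_stack([F_p[:, k] - F_v[:, k], F_n[:, k] - F_v[:, k]])
        y = F_obs[:, k] - F_v[:, k]
        a[k], b[k] = np.linalg.lstsq(X, y, rcond=None)[0]
    return a, b

# Synthetic, noise-free tokens with random hypothetical targets (Hz).
rng = np.random.default_rng(0)
F_p = rng.uniform(200.0, 3000.0, size=(20, 3))
F_n = rng.uniform(200.0, 3000.0, size=(20, 3))
F_v = rng.uniform(200.0, 3000.0, size=(20, 3))
a_true = np.array([0.20, 0.30, 0.10])
b_true = np.array([0.10, 0.25, 0.05])
F_obs = a_true * F_p + b_true * F_n + (1.0 - a_true - b_true) * F_v

a_hat, b_hat = fit_diagonal_weights(F_obs, F_p, F_n, F_v)
print(a_hat, b_hat)
```

Dropping the synchronicity assumption costs four extra parameters (three weights per consonant side instead of one), which pooling over tokens makes identifiable.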
24
Application I: Targets [00/09]
25
Application II: Weights [00/09]
26
Conclusions
Proposed linear model of vowel dynamics
–To be used for formant “correction”
When used as analytic instrument
–Gave meaningful results
Strikingly “normal” target values
–Without any normalizing bias in the estimation process
Clear evidence for enhanced coarticulation