Investigating Pitch Accent Recognition in Non-native Speech

1 Investigating Pitch Accent Recognition in Non-native Speech
Gina-Anne Levow August 4, 2009

2 Roadmap
Motivation: Prosody Recognition in Non-native Speech
  Prosody and Language Learning
  Prosody Recognition in Non-native Speech
LEAP Corpus
Modeling Pitch Accent for Recognition
Analysis of Pitch Accent in Learner Speech
Pitch Accent Recognition:
  Within-group
  Cross-group
Conclusion

3 Prosody and Language Learning
Acquisition of prosody is essential for language learners
  Contributes to semantic and pragmatic information as well as quality
Less emphasized in instruction (class, CALL (Chun, 1998))
  Challenging to characterize
  Often requires individual attention
Computer-assisted Language Learning (CALL)
  Potential for flexible, individual feedback
  Many prior approaches emphasize scoring (Teixeira et al., 2000; Tepperman et al., 2008)
Goal: Automatic prosodic labeling for targeted, individual feedback
  Current focus: English pitch accent

4 Automatic Prosodic Labeling of Non-native Speech
Significant strides in prosody labeling
  Acoustic-only methods: > 80% accuracy
    Syllable-based, binary
    Native speakers, mostly broadcast news
Challenges:
  Characterization, comparison of learner prosody
    Are pitch accents reliably produced?
    Can recognition reach competitive levels?
  Little prosodically labeled learner speech
    Can other sources be employed?

5 LEAP Corpus
“Learning Prosody in a Foreign Language” (Milde & Gut, 2002): papers on DB, agreement, etc.
Focus on prosodically labeled English set
  Read speech: analogous to language lab
‘Extended’ EToBI tagset (Silverman, 1992)
  14 pitch accent tags, 14 phrase/boundary tags
  Collapse to standard sets:
    Analysis: 4-way: High, Downstepped High, Low, Unaccented
    Classification: Binary: Accented/Unaccented

6 LEAP Corpus
Range of speakers, L1s, experience
37 recordings: ~300 syllables each
26 speakers

ID   Description
c1   Non-native, before prosody training
c2   Non-native, after first prosody training
c3   Non-native, after second prosody training
e1   Non-native, before travel abroad
e2   Non-native, after travel abroad
sl   “super-learner”, near-native
na   Native

7 Modeling Pitch Accent
Pitch accent identity, realization depend on context
  Pitch is relative:
    To speaker range
    To neighboring accents, phrase range (e.g. downstep)
  Coarticulatory effects: modeling improves recognition (e.g. Sun, 2002)
Approach based on Pitch Target Approximation Model
  Tone/pitch accent target exponentially approached
  Linear target: height, slope (Xu et al., 1999)
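
The idea of a linear target that is approached exponentially can be sketched in a few lines. The formula used here, y(t) = beta * exp(-lam * t) + slope * t + height, is one common way of writing a pitch target approximation model and is an assumption, not something stated on the slide. The function and parameter names (`pitch_target_contour`, `beta`, `lam`) are hypothetical.

```python
import math


def linear_target(times, slope, height):
    """Underlying linear pitch target T(t) = slope * t + height."""
    return [slope * t + height for t in times]


def pitch_target_contour(times, slope, height, beta, lam):
    """Surface contour that decays exponentially toward the linear target.

    Assumed form: y(t) = beta * exp(-lam * t) + slope * t + height,
    where beta is the initial offset from the target and lam >= 0
    controls how quickly the target is approached.
    """
    if lam < 0:
        raise ValueError("lam must be non-negative")
    return [beta * math.exp(-lam * t) + slope * t + height for t in times]


# Example: a syllable that starts 30 Hz below a slightly falling target.
times = [0.0, 0.05, 0.10, 0.15, 0.20]
contour = pitch_target_contour(times, slope=-20.0, height=200.0, beta=-30.0, lam=25.0)
target = linear_target(times, slope=-20.0, height=200.0)
```

In this sketch the gap between `contour` and `target` shrinks over the syllable, which is the behavior the slope and height features are meant to capture.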

8 Local Feature Extraction
Base features:
  Pitch, Intensity: max, mean, min, range (Praat, speaker normalized)
  Pitch at 5 points across voiced region
  Duration
  Initial, final in phrase
Slope:
  Linear fit to last half of pitch contour
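
A minimal sketch of how these per-syllable features could be computed, assuming the pitch and intensity tracks have already been extracted (for example with Praat) and speaker-normalized, and that each track is a non-empty list of frame values for the voiced region. All function names and dictionary keys are hypothetical, and the even spacing of the 5 pitch points is an assumption.

```python
from statistics import mean


def summary_stats(values):
    """Max, mean, min and range of one track."""
    hi, lo = max(values), min(values)
    return {"max": hi, "mean": mean(values), "min": lo, "range": hi - lo}


def sample_points(values, n=5):
    """Pick n evenly spaced samples across the voiced region (assumed spacing)."""
    last = len(values) - 1
    return [values[round(i * last / (n - 1))] for i in range(n)]


def least_squares_slope(values):
    """Slope of a least-squares line fit of values against frame index."""
    n = len(values)
    if n < 2:
        return 0.0
    x_mean = (n - 1) / 2
    y_mean = mean(values)
    num = sum((i - x_mean) * (v - y_mean) for i, v in enumerate(values))
    den = sum((i - x_mean) ** 2 for i in range(n))
    return num / den


def local_features(pitch, intensity, duration, phrase_initial, phrase_final):
    """Assemble the local feature dictionary for one syllable."""
    feats = {}
    for name, track in (("pitch", pitch), ("intensity", intensity)):
        for stat, value in summary_stats(track).items():
            feats[f"{name}_{stat}"] = value
    for i, p in enumerate(sample_points(pitch, 5)):
        feats[f"pitch_pt{i}"] = p
    feats["duration"] = duration
    feats["phrase_initial"] = int(phrase_initial)
    feats["phrase_final"] = int(phrase_final)
    # Slope: linear fit to the last half of the pitch contour.
    feats["pitch_slope"] = least_squares_slope(pitch[len(pitch) // 2:])
    return feats
```

Fitting the slope to frame indices rather than to time in seconds is a simplification; with a fixed frame rate the two differ only by a constant factor.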

9 Context Features
Local context:
  Extended features
    Adjacent points of preceding, following syllables
  Difference features
    Difference between
      Pitch max, mean, mid, slope
      Intensity max, mean
    Of preceding, following and current syllable
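
The difference features can be sketched as simple subtractions between the current syllable and each neighbor. This is an illustration under stated assumptions: the feature keys are hypothetical, the sign convention (current minus neighbor) is a guess, and using 0.0 when a neighbor is missing is a placeholder choice so that every syllable gets a vector of the same length.

```python
# Hypothetical keys for the values named on the slide.
DIFF_KEYS = (
    "pitch_max", "pitch_mean", "pitch_mid", "pitch_slope",
    "intensity_max", "intensity_mean",
)


def difference_features(prev, cur, nxt, keys=DIFF_KEYS):
    """Differences between the current syllable and its neighbors.

    prev, cur, nxt are per-syllable feature dicts; prev or nxt may be
    None at the edge of an utterance, in which case 0.0 is used.
    """
    feats = {}
    for key in keys:
        feats[f"{key}_minus_prev"] = cur[key] - prev[key] if prev is not None else 0.0
        feats[f"{key}_minus_next"] = cur[key] - nxt[key] if nxt is not None else 0.0
    return feats
```

A positive `pitch_max_minus_prev` then means the current syllable peaks higher than the one before it, which is the kind of local contrast examined in the analysis slides.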

10 Analysis of Learner Pitch Accent
Pitch height characterizes accent, but
  Key feature is contrast with neighbors
Contrasts:
  Unaccented vs High accented syllables
  Early learners (e1, c1) and native
  Pitch height and pitch deltas w.r.t. previous syllable

11 Contrasts
Pitch delta:
  High significantly larger than unaccented
    All groups
  Differences significantly larger for native than early learner
Pitch height:
  e1: No significant difference between High, unaccented
  c1, na: Significant difference between High, unaccented
All speakers understand local contrast
Some learners do not have reliable global control
Potential for effective pitch accent recognition

12 Contrasts in Learner Prosody
[Figure: two panels, Pitch Delta and Pitch Height]

13 Pitch Accent Recognition in Non-native Speech I
Classifier: Support Vector Machine
  Linear kernel, LibSVM (Chang & Lin, 2001)

14 Pitch Accent Recognition in Non-native Speech II
Cross-group training with native and near-native speakers
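
The cross-group setup can be sketched as a split on the speaker-group IDs from the corpus table: train on native (`na`) and near-native (`sl`) recordings, then evaluate on the remaining learner groups. The record layout and function name are hypothetical, and treating all other groups as the test set is an assumption about the protocol.

```python
# Groups used for training in the cross-group condition.
TRAIN_GROUPS = frozenset({"na", "sl"})


def cross_group_split(recordings, train_groups=TRAIN_GROUPS):
    """Split recordings into (train, test) by speaker-group ID.

    Each recording is assumed to be a dict with a "group" key holding
    one of the corpus IDs (c1, c2, c3, e1, e2, sl, na).
    """
    train = [r for r in recordings if r["group"] in train_groups]
    test = [r for r in recordings if r["group"] not in train_groups]
    return train, test


# Example with placeholder recording names.
recordings = [
    {"name": "rec_a", "group": "na"},
    {"name": "rec_b", "group": "sl"},
    {"name": "rec_c", "group": "c1"},
    {"name": "rec_d", "group": "e1"},
]
train, test = cross_group_split(recordings)
```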

15 Conclusion
Non-native pitch accent:
  Even early learners exhibit key local contrasts
  Learners exhibit smaller contrasts than natives
  Some learners do not achieve reliable global control
Non-native pitch accent recognition:
  Within-group training achieves competitive accuracies
  Cross-group training also effective
    No significant degradation for binary classification
Potential effectiveness for CALL

16 Future Work
Integrate non-native prosodic labeling in CALL setting
Explore utility for tone languages
Identify learner errors, relative to gold standard
Employ resynthesis of learner’s speech for focused feedback
Further explore effect of learner L1, for very early learners

17 Thanks
LEAP Corpus (Ulrike Gut)
LibSVM (C.-C. Chang and C.-J. Lin)
This work was supported by: NSF IIS #:
