The long-term retention of fine-grained phonetic details: evidence from a second language voice identification training task. Steve Winters, CAA Presentation.

Presentation transcript:

The long-term retention of fine-grained phonetic details: evidence from a second language voice identification training task
Steve Winters
CAA Presentation, Victoria, BC, October 13, 2010

Basic Precepts
Exemplar theory: listeners store in memory every speech experience they have in their lifetime (Johnson, 2007), including all details of those experiences.
Variability forms an inherent (and informative) part of linguistic representations.
Evidence: interactions in speech processing between indexical and linguistic information.
1. Word recognition is easier for familiar voices (Nygaard and Pisoni, 1998).
2. Talker recognition is easier in familiar languages (Goggin et al., 1991; Perrachione et al., 2009).

Bilingual Talker Interactions
Winters et al. (2008) tested generalization of bilingual voice recognition across languages.
1. Listeners trained to identify voices speaking in English showed reduced identification accuracy in German (language-dependent knowledge).
2. Listeners trained to identify voices speaking in German showed equivalent ID accuracy in English (language-independent knowledge).
Levi et al. (submitted): listeners trained to identify talkers speaking in German do not show a word recognition advantage for those talkers in English.

L2 Speech Perception
Indexical and linguistic information do not seem to interact when listeners learn to identify German voices.
Q: Are L2 stimuli not stored in exemplar fashion? I.e., are phonetic details lost in memory?
Note: non-native sound contrasts can often be difficult for second language learners to acquire.
  Japanese listeners have difficulty discriminating between English /l/ and /r/ (Miyawaki et al., 1975).
  English listeners have difficulty discriminating between Thai voiced and unaspirated stops (Abramson & Lisker, 1970).
Perhaps listeners only store in memory what they know how to label (Pierrehumbert, 2001).

Empirical Ambitions
Thai contains a variety of phonetic features which are not contrastive in English: lexical tones, vowel length, three-way VOT contrast (voiced ~ unaspirated ~ aspirated stops)…
Can listeners encode this information in long-term memory?
Experimental goal: train listeners to identify Thai voices which are associated with a particular phonetic property (an implicit perception task).

Experimental Design
Example talker identification training paradigm:
  Talker A is associated with Tone 1
  Talker B is associated with Tone 2
  Talker C is associated with Tone 3, etc.
Q1: How much do these phonetic associations improve talker identification accuracy over a control condition?
Q2: How much is identification accuracy impaired when the tone-talker associations no longer hold?
Generalization:
  Talker A is presented with not-Tone 1
  Talker B is presented with not-Tone 2, etc.
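
The training and generalization conditions described on this slide can be sketched as trial lists. This is a minimal illustration only, not the experiment's actual stimulus-presentation code: the talker labels D and E, the tone numbering, the trial counts, and both function names are assumptions made for the example.

```python
import random

# Hypothetical talker-tone pairing, following the slide's example
# (Talker A with Tone 1, Talker B with Tone 2, ...). Talkers D and E
# and the numeric tone labels are assumptions for illustration.
ASSOCIATIONS = {"A": 1, "B": 2, "C": 3, "D": 4, "E": 5}
TONES = sorted(set(ASSOCIATIONS.values()))


def training_trials(n_per_talker, rng):
    """Training: each talker is heard only with its associated tone."""
    trials = [(talker, tone)
              for talker, tone in ASSOCIATIONS.items()
              for _ in range(n_per_talker)]
    rng.shuffle(trials)
    return trials


def generalization_trials(n_per_talker, rng):
    """Generalization: each talker is heard only with other tones."""
    trials = []
    for talker, trained_tone in ASSOCIATIONS.items():
        other_tones = [t for t in TONES if t != trained_tone]
        for _ in range(n_per_talker):
            trials.append((talker, rng.choice(other_tones)))
    rng.shuffle(trials)
    return trials


rng = random.Random(0)  # fixed seed so the sketch is reproducible
train = training_trials(4, rng)
test = generalization_trials(4, rng)
```

In `train`, tone is a perfect predictor of talker; in `test`, a listener who relies on tone alone will always pick the wrong talker, which is what Q2 probes.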

Experimental Design
Four different training conditions:
1. Tone-talker associations
2. VOT-talker associations
3. (Vowel-talker associations)
4. Control: no consistent associations between talkers and phonetic properties
Anticipated hierarchy of talker ID accuracy: Tone associations > Vowel associations > VOT associations (primarily for reasons of cue duration)

Exp. 1: Talker-Tone Associations
21 native English listeners learned to identify 5 Thai/English bilingual voices.
Training paradigm: 6 learning sessions (2 on each day), each with familiarization, training with feedback, and testing.
In these training sessions, each voice produced only Thai words with a particular tone: High, Mid, Low, Falling, Rising.
Final day of experiment: generalization
1. English words
2. Novel Thai words in which previous tone-talker associations no longer held.

Talker-Tone Demo

Rising Mid Low High Falling

Talker-Tone Results

Rapid (and consistent) learning of voices during training.
Generalization:
  No effect of language.
  Worse performance than on initial session.
  Note: Thai generalization performance statistically equivalent to performance on first feedback session.
Generalization mistakes: 37.6% gave the talker associated with the stimulus tone in training (remember that chance = 1/4 = 25%, i.e. one of the four incorrect talkers).
Conclusion: listeners used tone as a cue to voice identity.
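
The chance baseline for generalization mistakes can be worked through in a few lines. This sketch uses only the figures given on the slides (five talkers, 37.6% tone-matched mistakes); the variable names are illustrative, and no significance test is attempted because the number of mistaken trials is not reported here.

```python
# Exp. 1 used five talkers, so a mistaken response picks one of four others.
N_TALKERS = 5
N_WRONG_OPTIONS = N_TALKERS - 1

# In generalization the stimulus tone is never the true talker's trained tone,
# so exactly one wrong option is the talker trained with that tone.
chance_rate = 1 / N_WRONG_OPTIONS

observed_rate = 0.376  # proportion reported on the slide

# How many times more often than chance the tone-matched talker was chosen.
ratio_to_chance = observed_rate / chance_rate

print(f"chance = {chance_rate:.1%}, observed = {observed_rate:.1%}, "
      f"ratio = {ratio_to_chance:.2f}")
```

On these figures, listeners picked the tone-matched talker about one and a half times as often as chance would predict.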

Exp. 2: Talker-VOT Associations
20 native English listeners learned to identify six Thai/English bilingual voices.
Identical training paradigm (with a few more stimuli).
In training sessions, each voice produced only Thai words with a particular Voice Onset Time (VOT) type: voiced, unaspirated, aspirated.
Note: two voices associated with each VOT type.
Generalization: novel English and novel Thai words (without the same Talker-VOT associations).

Talker-VOT Demo

Aspirated Unaspirated Aspirated Voiced Unaspirated

Talker-VOT Results

Result #1: Listeners do learn to identify the voices, although the pace of learning is slower than in the Tone condition.
Possible confounds:
  More voices to learn in VOT condition (6)
  Two voices associated with each VOT type
Result #2: Performance does drop off significantly in generalization.
  → Listeners use VOT distinctions to identify voices.
  → VOT distinctions are encoded in memory.
Note: Allen & Miller, 2004; Francis and Driscoll, 2006

Talker-VOT Mistakes
In generalization, there are three potential mistake types.
Stimuli: Talker (VOT Type A) - Word (VOT Type B)
  Mistake #1: Respond with other talker of Type A. (chance: 1/5)
  Mistake #2: Respond with talker of Type B. (chance: 2/5)
  Mistake #3: Respond with unrelated talker. (chance: 2/5)
Totals:
  Mistake #1 (talker bias): 20.2%
  Mistake #2 (stimulus bias): 46.3%
  Mistake #3 (neither): 33.4%
→ VOT similarities are more salient than voice similarities.
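
The three mistake types and their chance rates can be derived by enumerating the five wrong response options. The sketch below is illustrative only: the talker labels T1 to T6, their assignment to VOT types, and the function names are hypothetical, since the slides state only that two voices were associated with each VOT type.

```python
from fractions import Fraction

# Hypothetical talker labels and assignments; the slides only say that
# two voices were associated with each of the three VOT types.
TALKER_VOT = {
    "T1": "voiced", "T2": "voiced",
    "T3": "unaspirated", "T4": "unaspirated",
    "T5": "aspirated", "T6": "aspirated",
}


def classify_mistake(true_talker, word_vot, response):
    """Label an incorrect response on a generalization trial."""
    if response == true_talker:
        raise ValueError("response is correct, not a mistake")
    response_vot = TALKER_VOT[response]
    if response_vot == TALKER_VOT[true_talker]:
        return "talker bias"      # Mistake #1: other talker of Type A
    if response_vot == word_vot:
        return "stimulus bias"    # Mistake #2: talker of Type B
    return "neither"              # Mistake #3: unrelated talker


def chance_rates(true_talker, word_vot):
    """Share of the wrong response options falling in each mistake type."""
    wrong = [t for t in TALKER_VOT if t != true_talker]
    counts = {"talker bias": 0, "stimulus bias": 0, "neither": 0}
    for response in wrong:
        counts[classify_mistake(true_talker, word_vot, response)] += 1
    return {kind: Fraction(n, len(wrong)) for kind, n in counts.items()}


# A voiced-trained talker producing a word with an aspirated stop.
rates = chance_rates("T1", "aspirated")

# Observed shares of mistakes, as reported on the slide.
observed = {"talker bias": 0.202, "stimulus bias": 0.463, "neither": 0.334}
above_chance = {kind: observed[kind] > float(rates[kind]) for kind in rates}
```

Under these assumptions, only the stimulus-bias share (46.3% against a 40% baseline) clearly exceeds chance, while talker bias sits at roughly its 20% baseline.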

Exp. 3: Control Condition
20 native English listeners learned to identify six Thai/English bilingual voices.
Identical training paradigm to Experiment 2.
No consistent associations in training between voices and particular phonetic properties.
Note: essentially equivalent to German training in Winters et al. (2008)… with fewer speakers and with a different language.

Results: Experiments 1-3
In Training:
  Tone accuracy > Control and VOT accuracy in all six sessions.
  VOT accuracy > Control in sessions 3-6.
  In all conditions: accuracy is higher in session 6 than in session 1.
In Generalization:
  No differences between learning conditions.
  But in Control: accuracy is higher for Thai stimuli than for English stimuli.

Discussion
Listeners are storing in memory low-level acoustic cues to non-native sound contrasts, when they are associated with talker identity.
Lexical tones provide more salient cues than VOT, but even VOT distinctions can be a cue to talker identity.
Generalization to novel tokens works best in a Control condition…
…even though rate of learning is slower in this condition, as well.

Conclusions
These results provide further evidence for exemplar-based speech processing.
Listeners encode in memory any potential cue which can be used to perform a listening task,
  even if those cues are not distinctive in the listener’s native language…
  or are not necessarily accessible to conscious reflection.
Note: a perceptual reliance on highly specific phonetic details… can make generalization hard.

Thanks! Thanks go to Kelly-Ann Casey, Tara Dainton and Sue Jackson, for all of their work in recording speakers, editing stimuli, analyzing data and running subjects through the listening experiments. This work was supported by a University of Calgary University Research Grants Committee starter grant.

Future Directions
1. Stronger test of exemplar-based memory: token recognition of training items
2. Is knowledge of talkers’ voices generalizable across different voice qualities?
3. Which phonetic properties support a familiar talker advantage in word recognition across languages?
4. Does learning to identify talkers associated with particular phonetic properties facilitate the learning of non-native sound contrasts?

Experiment 4: Vowels
Still in progress!
9 native English listeners learned to identify six Thai/English bilingual voices.
Identical training paradigm to Experiment 2.
Each talker consistently produced only front, central, or back Thai vowels.
In Generalization: talker-vowel quality associations no longer held.
Voice/name labels were randomized between listeners.

The Thai Vowel Space
[Vowel chart; recoverable labels: i, u, e, o, a; "two talkers"]
Note: there are also long/short vowel contrasts

Performance in the Vowel condition is no better (or worse) than the Control…yet.

One Persistent Issue: Talker Distinctiveness

One future direction: How much do talker representations depend on voice quality?

Imponderables
Q: What cues do the listeners use to make the cross-language transfer?
One future direction: copy Thai tones onto English words.
Do language-dependent effects emerge in:
  English word recognition?
  English talker identification?
Also try the same trick with vowel-talker associations.
“Linguistically irrelevant” vs. “linguistically relevant” language-independent talker information.

More Future Directions
A stronger test of exemplar memory:
  Listeners store in memory consistent cues to talker identity…
  Do they also store in memory inconsistent talker cues (found in particular tokens)?
Plan: train listeners to identify talkers with particular (focused) phonetic associations.
Test them on training token recognition with words that differ in focused and unfocused phonetic properties.

More Future Directions
Could talker identification training (with talker-property associations) aid L2 learners in the acquisition of non-native sound contrasts?
Compare sound identification training regimens that:
1. alternate with talker identification training
2. alternate with a different listening task
Does learning improve more with:
1. One-to-one talker-property associations?
2. Many-to-many talker-property associations?