The Role of Linguistic Knowledge in the Encoding of Words and Voices in Memory Steve Winters, Karen Lichtman and Silke Weber Second Language Research Forum.

1 The Role of Linguistic Knowledge in the Encoding of Words and Voices in Memory
Steve Winters, Karen Lichtman and Silke Weber
Second Language Research Forum, October 16, 2011

2 A Basic Distinction
Abercrombie (1967) famously distinguished between the “indexical” and “linguistic” properties of speech.
The linguistic properties of speech support the identification of linguistic (phonemic, etc.) contrasts.
The indexical properties of speech “reveal personal characteristics of the speaker”:
1. Dialect/group membership/gender
2. Talker idiosyncrasies
3. Emotional/mental state

3 Indexical Properties
Abercrombie (1967): a speaker provides a personal “medium” for a particular linguistic message. The personal indices in the medium are “extra-linguistic”.
“Usually…many things about a medium which is being used as a vehicle for a given language are not relevant to linguistic communication. Such ‘extra-linguistic’ properties of the medium, however, may fulfill other functions which may sometimes even be more important than linguistic communication, and which can never be completely ignored.”

4 Unfiltered
Exemplar theories of speech perception (Johnson, 2007) have emphasized the utility of not breaking down the signal into linguistic and indexical components.
Conjecture: listeners store unanalyzed representations of speech that are “rich” with informative detail: linguistic representations might include indexical (talker-specific) information, and indexical representations might include linguistic information.
Generalizations emerge on the fly, from summed activations of similar exemplars.
Strong form: listeners store in memory every detail of every speech event they experience in their lifetimes.
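The idea that generalizations emerge from summed activations of similar exemplars can be illustrated with a small sketch. This is not the model from Johnson (2007); it assumes a generic similarity-based scheme in which each stored exemplar is a feature vector with a label, activation decays exponentially with distance from the incoming token, and category evidence is the sum of activations. The function name `categorize`, the `sensitivity` parameter, and the two-dimensional toy features are all hypothetical.

```python
import math
from collections import defaultdict


def categorize(probe, exemplars, sensitivity=1.0):
    """Return relative category evidence for a probe token.

    probe: sequence of numeric features for the incoming token.
    exemplars: non-empty list of (features, label) pairs stored in memory.
    sensitivity: assumed scaling of how fast activation falls off with distance.
    """
    evidence = defaultdict(float)
    for features, label in exemplars:
        # Each stored exemplar is activated according to its similarity
        # to the probe (exponential decay over Euclidean distance).
        distance = math.dist(probe, features)
        evidence[label] += math.exp(-sensitivity * distance)
    # Normalize the summed activations so they can be compared across labels.
    total = sum(evidence.values())
    return {label: activation / total for label, activation in evidence.items()}
```

In this sketch nothing is abstracted in advance: the labels could just as well be words or talkers, and the same stored tokens serve both kinds of decision.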

5 Interactions
Core evidence for exemplar theory comes from interactions between indexical and linguistic properties.
Indexical properties influence processing of linguistic information in speech:
1. Recognition memory: same-voice repeats of words are easier to recognize. (Palmeri et al., 1993)
2. Familiar talkers are easier to understand. (Nygaard et al., 1994)
Conversely, language affects identification of indexical properties:
3. It’s easier to identify someone speaking a language you know. (Goggin et al., 1991; Perrachione et al., 2009)

6 L2 Exemplars?
The foundational evidence for exemplar theory comes primarily from English-language studies.
Do listeners encode the same amount of detail in memory from exemplars of L2 speech?
Note: listeners can often exhibit perceptual insensitivity to fine-grained details of L2 speech. E.g.:
  Collapse of non-native VOT distinctions (Abramson & Lisker, 1970)
  Japanese listeners’ difficulty in acquiring the English /l/-/r/ distinction (Strange & Dittman, 1984)
However, Hall (2008) proposes an exemplar model in which sensitivity to L2 contrasts may be acquired over time.

7 L2 non-interactions
Winters et al. (2008) tested identification of bilingual talkers across languages.
1. English-speaking listeners were trained to identify either:
  bilinguals speaking English, or
  bilinguals speaking German
2. Listeners were then tested on the same talkers speaking the other language:
  English → German: loss of ID accuracy
  German → English: no loss in ID accuracy
German training yields language-independent representations of talker voices.
Example word pair: mine / mein

8 L2 non-interactions, continued
Levi et al. (in press) tested the Familiar Talker Advantage across languages.
Same training task.
Listeners were tested on their ability to identify English words played in noise, spoken by trained (familiar) and untrained (unfamiliar) voices.
Results: the word recognition benefit for familiar voices was exhibited only by English-trained listeners.
→ The Familiar Talker Advantage does not transfer across languages.

9 Persistent Patterns
1. English-trained listeners displayed interactions between linguistic and talker categories in both voice learning experiments.
2. German-trained listeners displayed no interactions between linguistic and talker categories in either experiment.
Implications:
  English-trained listeners develop richly detailed, exemplar-like representations of voices.
  German-trained listeners develop sparser, language-independent representations of voices.

10 What Matters?
Training on an indexical task in an unknown L2 does not yield exemplar-style representations that integrate indexical and linguistic information.
Hypothesis: an unknown language can be ignored during voice identification training, because it is:
1. meaningless to listeners
2. irrelevant to the task
Task independence may facilitate development of language-independent representations of voices.
Q: Is there a double dissociation? Do English listeners ignore voices when listening to German words?

11 Experimental Plan
Test the effect of same vs. new voices on continuous word recognition memory in an L2.
Task: listeners hear a series of words and must decide if each word is “new” or a “repeat” of an earlier word in the list.
The catch: some words are repeated in the same voice; others are repeated in a different voice.
Recall: same-voice repeats are easier to recognize (Palmeri et al., 1993).
Q: Is this also true in an unfamiliar language?

12 Materials
10 L1 German / L2 English talkers
  5 male, 5 female
  Similar dialect
  Similar in perceived nativeness
These talkers produced:
  360 CVC English words (e.g., buzz, cheek)
  360 CVC German words (e.g., hoch, Rahm)
Stimuli: German words only
  5 male talkers in one series
  5 female talkers in another series

13 Perception Experiment
Listeners heard 160 (distinct) trials in each series:
  80 trials were new words; 80 were repeats (“old”)
  Among the repeats: 40 repeats in the original voice, 40 repeats in a new voice
Listeners:
  17 English L1 listeners with no knowledge of German
  19 English L1/German L2 learners, recruited from 4th-semester language classes; they rated themselves 2.66 on a scale of 1-5 on ability to speak, read, and write German
  16 German L1 listeners
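The trial structure described here (80 new words, 40 same-voice repeats, 40 different-voice repeats per series) could be assembled along the following lines. This is only a sketch under stated assumptions: the slides do not say how voices were assigned or how far apart a word and its repeat were placed, so the random voice assignment, the uncontrolled lag, and the names `build_series`, `n_same_voice`, and the status labels are hypothetical.

```python
import random


def build_series(words, talkers, n_same_voice, seed=0):
    """Build one continuous-recognition series in which every word occurs twice.

    The first n_same_voice words (after shuffling) are repeated in their
    original voice; the remaining words are repeated in a different voice.
    """
    rng = random.Random(seed)
    words = list(words)
    rng.shuffle(words)

    # Decide, per word, which talker says it first and which says the repeat.
    plan = {}
    for i, word in enumerate(words):
        first_voice = rng.choice(talkers)
        if i < n_same_voice:
            repeat_voice = first_voice
        else:
            repeat_voice = rng.choice([t for t in talkers if t != first_voice])
        plan[word] = (first_voice, repeat_voice)

    # Shuffle two slots per word; the earlier slot is always the "new" trial,
    # so a repeat can never precede its first presentation.
    slots = words * 2
    rng.shuffle(slots)
    seen = set()
    trials = []
    for word in slots:
        first_voice, repeat_voice = plan[word]
        if word not in seen:
            seen.add(word)
            trials.append({"word": word, "voice": first_voice, "status": "new"})
        else:
            status = "old_same" if repeat_voice == first_voice else "old_diff"
            trials.append({"word": word, "voice": repeat_voice, "status": status})
    return trials
```

Calling `build_series` with 80 words, 5 talkers, and `n_same_voice=40` gives a 160-trial list with the 80/40/40 split above. A real design would probably also constrain the lag between a word and its repeat.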

14 Analysis
Responses were analyzed in terms of signal detection theory.
Sensitivity (d’) quantifies how easily listeners can distinguish between “old” and “new” items.
For each listener, separate sensitivity scores were calculated for:
  Old items in the same voice
  Old items in a different voice
Reaction times were also collected for each response.
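For readers unfamiliar with the measure, d’ is conventionally computed as z(hit rate) minus z(false-alarm rate), where a hit is an “old” response to a repeated word and a false alarm is an “old” response to a new word. The sketch below shows that calculation. The slides do not say how hit or false-alarm rates of 0 or 1 were handled, so the log-linear correction used here (adding 0.5 to each count) is an assumption, as is the function name `d_prime`.

```python
from statistics import NormalDist


def d_prime(hits, misses, false_alarms, correct_rejections):
    """Sensitivity d' = z(hit rate) - z(false-alarm rate).

    hits / misses: "old" / "new" responses to repeated words.
    false_alarms / correct_rejections: "old" / "new" responses to new words.
    """
    # Assumed log-linear correction: keeps both rates strictly between
    # 0 and 1 so the inverse normal CDF is always defined.
    hit_rate = (hits + 0.5) / (hits + misses + 1.0)
    fa_rate = (false_alarms + 0.5) / (false_alarms + correct_rejections + 1.0)
    z = NormalDist().inv_cdf
    return z(hit_rate) - z(fa_rate)
```

To get separate scores per listener, the function would be called once with the same-voice repeats and once with the different-voice repeats, using the same false-alarm counts from the new words both times, e.g. `d_prime(30, 10, 20, 60)` versus `d_prime(25, 15, 20, 60)` for made-up counts.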

15 [Figure: English monolinguals, German L2 learners, German natives]

16 [Figure: English monolinguals, German L2 learners, German natives; same vs. diff]

17 Results: Sensitivity
There were significant main effects of:
  Repeat Type (Old Voice vs. New Voice): p = .016
  Listener Group: p < .001
…but no interaction between the two.
Also: no effect of talker group (males vs. females).
Post-hoc analysis of repeat type effect: same-voice repeats > different-voice repeats
Listener group performance hierarchy: German natives > Eng L1/Ger L2 learners > Eng monolinguals

18 Bridging the Phonetic Gap
Responses were also broken down by the “exotic” phonetic content of the various German words.
Non-English sound          Examples
/x/                        sich, Loch
Affricates: /pf/, /ts/     Pfiff, Kopf
Initial /r/                Rad, Rock
Final /r/                  Bier, leer
Final /l/                  Ball, toll
Front, rounded V           kühl, schön
Long /o/ or /e/            Boot, Weg

19 Phonetic Breakdown
Breaking down the responses by phonetic content might make clear how (and why) the German L2 learners were able to perform better than the monolingual listeners.
However: there were unbalanced numbers in each set of exotic sounds, so it wasn’t possible to run statistical tests.
Nonetheless, here are three prominent patterns…

20 Monolingual listeners struggled to recognize words with velar fricatives in them.
[Figure: English monolinguals, German L2 learners, German natives]

21 German listeners were better able to recognize words with final /r/ across speakers.
[Figure: English monolinguals, German L2 learners, German natives]

22 English listeners’ recognition accuracy for words with mid vowels was also more affected by voice changes between repetitions.
[Figure: English monolinguals, German L2 learners, German natives]

23 Discussion
1. Voice information affects word recognition even when listeners hear words from a language they do not know.
2. Exemplar-style processing of L2 stimuli is possible, if listeners focus on the linguistic content of the signal, rather than its indexical properties.
3. Also: memory for words in a language increases as experience with that language increases.
4. Big picture: there is an apparent interaction asymmetry:
  L2 linguistic information may be ignored
  L2 voice information is not ignored
→ Perhaps listeners only store in memory what they know how to label. (Pierrehumbert, 2001)

24 The Devil in the Details
Some exotic sounds (velar fricatives) were just too weird for the monolingual listeners to successfully encode.
For other exotic sounds, non-native listeners exhibited more sensitivity to token-specific phonetic details.
This resembles the findings of Goldinger (1996).
→ It is easier to generalize from larger populations of exemplars in memory.
Work still to be done: frequency-based analysis.
Another fun (future) direction: continuous voice recognition memory task. Pilot testing has revealed that this task is as challenging as you’d think it would be.

25 Implications
These results pose something of a challenge to both models of speech perception.
Exemplar-like representations may only emerge under a limited set of conditions; i.e., when both linguistic and indexical information are interpretable by (and important to) the listener.
→ Second language stimuli make an interesting proving ground for exemplar theories of speech perception. They provide a way to investigate the phonetic, semantic and task-based limits on what details of speech listeners are able to encode in memory.

26 The End!

27 References
Abercrombie, D. (1967) Elements of General Phonetics. Edinburgh: Edinburgh University.
Johnson, K.A. (2007) Decisions and mechanisms in exemplar-based phonology. In Experimental Approaches to Phonology, In Honor of John Ohala (M.J. Sole, P. Beddor & M. Ohala, eds.), 25-40. Oxford University Press.
Palmeri, T.J., Goldinger, S.D. & Pisoni, D.B. (1993) Episodic encoding of voice attributes and recognition memory for spoken words. Journal of Experimental Psychology: Learning, Memory and Cognition, 19, 309-328.
Nygaard, L.C., Sommers, M.S. & Pisoni, D.B. (1994) Speech perception as a talker-contingent process. Psychological Science, 5, 42-46.
Goggin, J.P., Thompson, C.P., Strube, G. & Simental, L.R. (1991) The role of language familiarity in voice identification. Memory & Cognition, 19 (5), 448-458.
Perrachione, T.K., Pierrehumbert, J.B. & Wong, P.C.M. (2009) Differential neural contributions to native- and foreign-language talker identification. Journal of Experimental Psychology: Human Perception and Performance, 35, 1950-1960.
Abramson, A.S. & Lisker, L. (1970) Discriminability along the voicing continuum: cross language tests. Proceedings of the 6th International Congress of Phonetic Sciences, Prague, 569-573.
Strange, W. & Dittman, S. (1984) Effects of discrimination training on the perception of /r/-/l/ by Japanese adults learning English. Perception & Psychophysics, 36, 131-145.
Hall, K.C. (2008) Testing an exemplar-based model of contrast and allophony against evidence from second language acquisition. Unpublished manuscript. Available at
Winters, S.J., Levi, S.V. & Pisoni, D.B. (2008) Identification and discrimination of bilingual talkers across languages. Journal of the Acoustical Society of America, 123 (6), 4524-4538.
Levi, S.V., Winters, S.J. & Pisoni, D.B. (in press) The Familiar Talker Advantage is language-specific: evidence from cross-linguistic training. Journal of the Acoustical Society of America.
Pierrehumbert, J. (2001) Exemplar dynamics: word frequency, lenition and contrast. In Frequency Effects and the Emergence of Linguistic Structure (J. Bybee and P. Hopper, eds.), 137-157. Amsterdam: Benjamins.
Goldinger, S.D. (1996) Words and voices: episodic traces in spoken word identification and recognition memory. Journal of Experimental Psychology: Learning, Memory and Cognition, 22, 1166-1183.

29 Reaction times for correct “new” responses were significantly longer than for correct responses to both types of “old” words (both p < .01).
No effect of listener group.
No difference between repeat types.

