Rules and analogy in Russian loanword adaptation and novel verb formation Vsevolod Kapatsinski Indiana University Dept. of Linguistics & Cognitive Science.


Rules and analogy in Russian loanword adaptation and novel verb formation Vsevolod Kapatsinski Indiana University Dept. of Linguistics & Cognitive Science Program Speech Research Lab LSA 2007

Russian stem extensions
  -i-: event → event+i+ 'happen'
  -a-: eat → it+a+ 'eat'
  Source: The Big Dictionary of Youth Slang, 2003
  Borrowed verbs
  New verbs formed from nouns

Which stem extensions are more productive?

The questions
  How can we predict the choice of the stem extension?
  Is one extension applied by default?
    Predicted by the Dual Mechanism Model (Pinker and Prince 1988, 1994)
  Locality effects
    Analogical vs. schema-based accounts?
    Do parts of the root adjacent to the root-suffix boundary influence suffix choice more than more distant parts of the root?
    Do parts of the root that are not adjacent to the root-suffix boundary influence the choice of the suffix?
      Unexpected under the Rule-Based Learner (Albright and Hayes 2003)

Part I. Defaultness

Phonotactic influences: It’s not all phonotactic

Phonotactics do not explain all the variation
  Can analogy to existing words predict the stem extension taken by a borrowed verb?
  Analogy:
    The borrowed verb will take the stem extension of the majority of its neighbors.
    Verbs are neighbors if their roots share at least 2/3 of their phonemes.
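The neighbor definition and the majority vote can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the slides do not say how "share at least 2/3 of their phonemes" was computed, so the sketch counts phonemes matched in order and divides by the length of the longer root. The function names and the toy lexicon (including the stem extensions assigned to each root) are hypothetical.

```python
from collections import Counter
from difflib import SequenceMatcher


def are_neighbors(root_a, root_b):
    """True if the roots share at least 2/3 of their phonemes.

    Assumption: "shared" means matched in order, and the proportion is
    taken relative to the longer root. The slides do not spell this out.
    """
    matcher = SequenceMatcher(None, root_a, root_b, autojunk=False)
    shared = sum(block.size for block in matcher.get_matching_blocks())
    # Integer comparison avoids floating-point issues at exactly 2/3.
    return 3 * shared >= 2 * max(len(root_a), len(root_b))


def predict_extension(root, lexicon):
    """Return the majority stem extension among the root's neighbors.

    Returns None when there are no neighbors or the vote is tied,
    i.e. when analogy makes no prediction.
    """
    votes = Counter(
        extension
        for other, extension in lexicon
        if other != root and are_neighbors(root, other)
    )
    ranked = votes.most_common(2)
    if not ranked or (len(ranked) == 2 and ranked[0][1] == ranked[1][1]):
        return None
    return ranked[0][0]


# Hypothetical toy lexicon: roots are tuples of phonemes, and the stem
# extensions are invented for illustration only.
TOY_LEXICON = [
    (("k", "a", "p"), "a"),
    (("k", "a", "k"), "a"),
    (("k", "i", "m"), "i"),
    (("g", "r", "o", "m"), "i"),
]

print(predict_extension(("k", "a", "m"), TOY_LEXICON))
```

Roots are represented as tuples so that multi-letter transcriptions such as "ch" or "sh" can count as single phonemes. Under this metric, sharing only the final consonant (kam vs. grom) is not enough to make two roots neighbors.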

Analogical predictions
  [Figure: the root kam linked to its neighbors kap, kak, kaz, kar, kaj, kad, kim, xam, kum, kajm, kach, each taking -a or -i]

Similarity effect
  [Figure: N=598; N=1085]

Final consonant as a predictor
  [Figure: KAM and the 11 roots sharing its final consonant (kajM, xaM, kuM, groM, toM, weM, shtorM, skoroM, KiM, duM, xroM), split 8/11 vs. 3/11 between -i and -a]
  m → -i
  Not just place: b → -i (41/54), p → -a (36/57)
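For comparison with analogy, the final-consonant predictor amounts to a lookup table from each root-final consonant to the extension that most verbs ending in that consonant take. The sketch below builds such a table; the function name and the toy lexicon are hypothetical, and the extensions in the toy data are invented. In the slide's own counts, the corresponding shares would be 41/54 (about 76%) for b → -i and 36/57 (about 63%) for p → -a.

```python
from collections import Counter, defaultdict
from fractions import Fraction


def final_consonant_table(lexicon):
    """Map each final consonant to (majority extension, its share of verbs).

    A tie goes to the extension encountered first, which is an arbitrary
    choice made for this sketch.
    """
    counts = defaultdict(Counter)
    for root, extension in lexicon:
        counts[root[-1]][extension] += 1
    table = {}
    for consonant, tally in counts.items():
        extension, n = tally.most_common(1)[0]
        table[consonant] = (extension, Fraction(n, sum(tally.values())))
    return table


# Hypothetical toy data; the extensions are invented for illustration.
TOY_LEXICON = [
    (("x", "a", "m"), "i"),
    (("k", "u", "m"), "i"),
    (("g", "r", "o", "m"), "i"),
    (("k", "i", "m"), "a"),
    (("k", "a", "p"), "a"),
    (("t", "o", "p"), "a"),
    (("l", "i", "p"), "i"),
]

for consonant, (extension, share) in final_consonant_table(TOY_LEXICON).items():
    print(consonant, extension, float(share))
```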

Analogy vs. Final consonant Breakdown by stem extension

When analogy makes no prediction
  In 8.5% of verbs, analogy makes no prediction
    Numbers of neighbors taking each stem extension are equal
    OR
    No neighbors
  What determines stem extension choice then?

Equal numbers of neighbors: N=98 (5.5%)
  When there are equal numbers of neighbors rooting for -a and -i, coronals are not associated with either stem extension.
  What about verbs that have no neighbors?

Number of neighbors = 0: N=59 (3%)
  When there are no neighbors, coronals are always followed by -i.

Interim Summary
  Analogy accounts for 87% of the data excluding velars
  Analogy performs better than specifying the final consonant
  Analogy predicts -i better than it predicts -a (93% vs. 70%)
  When there are no neighbors, coronals are always followed by -i

An issue for the Dual Mechanism Model
  Pinker and Prince (1988, 1994):
    One suffix should be more productive than the other with novel lexical items that are not similar to existing ones
      -i > -a after coronals → -i is the default
    This suffix is applied by default. Hence, analogy should be less able to predict when this suffix will occur.
      Analogy is less able to predict occurrence of -a → -a is the default
  Possible accounts:
    Analogy
    Associations between parts of the root and suffixes
      Associations should be stronger when the distance between the suffix and the part of the root is small

Part II. Locality

Do neighbors that don't share the final C matter?
  Albright and Hayes (2003):
    The only segment strings that can be associated with a suffix are uninterrupted segment strings that include the final segment
  Weaker version:
    Suffixes can be associated with adjacent phonological chunks more strongly than with non-adjacent ones

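Testing the hypothesis of lack of non-local dependencies
  [Figure: KAM and its neighbors, with the segments shared with KAM capitalized: KAp, KAk, KAz, KAr, KAj, KAd, KiM, xAM, KuM, KAjM, KAch, each taking -a or -i]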

Adjacent dependencies are stronger

Combining predictors
  If we know
    What do most neighbors sharing final C take?
    What do most words with this final C take?
  Do we need to know
    What do most neighbors that do not share final C take?

Final consonant vs. final-sharing neighbors
  [Figure: KAM with final-sharing roots KiM, XaM, KuM, KAjM, loM, groM, weM, greM, etc.]
  Previously, sharing just the final C was not enough to be considered neighbors

Non-local dependencies still important
  Logistic Regression:
    Final C: χ² = 31.0
    Neighbors sharing final C: χ² =
    Neighbors not sharing final C: χ² =
    → Local dependencies are stronger
  All predictors are significant at p<.0005
    → Non-local dependencies do exist
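One way to obtain a χ² per predictor from a logistic regression is a likelihood-ratio test: fit the model with and without the predictor and take twice the difference in log-likelihood. The slides do not say which χ² statistic was reported, so the sketch below is an assumption in that respect. It also uses plain gradient ascent instead of a statistics package, and the cell counts are synthetic, so the printed values have no relation to the χ² values in the talk.

```python
import math


def predict(weights, features):
    """Probability of -i given binary predictors; weights[0] is the intercept."""
    z = weights[0] + sum(w * x for w, x in zip(weights[1:], features))
    return 1.0 / (1.0 + math.exp(-z))


def fit_logistic(data, steps=3000, rate=0.5):
    """Fit by gradient ascent on the mean log-likelihood.

    data: list of (features, label, count) with label 1 for -i, 0 for -a.
    """
    total = sum(count for _, _, count in data)
    weights = [0.0] * (len(data[0][0]) + 1)
    for _ in range(steps):
        gradient = [0.0] * len(weights)
        for features, label, count in data:
            error = count * (label - predict(weights, features))
            gradient[0] += error
            for j, x in enumerate(features):
                gradient[j + 1] += error * x
        weights = [w + rate * g / total for w, g in zip(weights, gradient)]
    return weights


def log_likelihood(weights, data):
    total = 0.0
    for features, label, count in data:
        p = predict(weights, features)
        total += count * math.log(p if label == 1 else 1.0 - p)
    return total


def lr_chi_square(data, index):
    """Likelihood-ratio chi-square for dropping the predictor at `index`."""
    reduced = [(f[:index] + f[index + 1:], y, n) for f, y, n in data]
    full_ll = log_likelihood(fit_logistic(data), data)
    reduced_ll = log_likelihood(fit_logistic(reduced), reduced)
    return 2.0 * (full_ll - reduced_ll)


# Synthetic counts (NOT the talk's data). Each key is
# (final C favors -i, final-sharing neighbors favor -i,
#  non-final-sharing neighbors favor -i); each value is the number of
# verbs out of 10 in that cell that take -i.
CELLS = {
    (1, 1, 1): 9, (1, 1, 0): 8, (1, 0, 1): 6, (1, 0, 0): 5,
    (0, 1, 1): 5, (0, 1, 0): 4, (0, 0, 1): 2, (0, 0, 0): 1,
}
DATA = [(cell, 1, k) for cell, k in CELLS.items()]
DATA += [(cell, 0, 10 - k) for cell, k in CELLS.items()]

NAMES = ["final C", "neighbors sharing final C", "neighbors not sharing final C"]
for position, name in enumerate(NAMES):
    print(name, round(lr_chi_square(DATA, position), 2))
```

The synthetic cells were chosen so that the two local predictors have larger effects than the non-local one while the non-local one still contributes, mirroring the qualitative pattern on the slide.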

Conclusion
  Huge similarity effects for both stem extensions
    All productive suffixes sensitive to similarity
  But, after coronals
    -a is less predictable than -i based on analogy
    -i is more productive than -a when there are no analogical models nearby
  → Defining attributes of a DMM default are dissociable (cf. Kapatsinski 2005)

Conclusion
  -a is less predictable than -i based on analogy
  Possible reason:
    There are more -i verbs than -a verbs in the lexicon
  Possible analogical solution:
    Thus, a given neighbor is more likely to bear -i than it is to bear -a
    Thus, occurrence of an -a neighbor is more salient than occurrence of an -i neighbor

Conclusion
  After coronals
    -i is more productive than -a when there are no analogical models nearby
    -i and -a are equally productive when there are as many neighbors bearing -i as neighbors bearing -a
  Interpretation:
    Use analogy whenever possible;
    if both alternatives have equal support, then they are equally acceptable;
    if no analogical models, use phonotactics
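The interpretation above reads as a three-step decision procedure, which can be written out as a small function. This is a sketch only: the function name is hypothetical, the coronal inventory is an assumed illustrative set, and the phonotactic fallback covers coronals alone because that is the only case the slides report.

```python
# Assumed, illustrative set of coronal consonants; not taken from the talk.
CORONALS = frozenset({"t", "d", "s", "z", "n", "l", "r"})


def acceptable_extensions(votes_a, votes_i, final_consonant):
    """Return the set of acceptable stem extensions for a new verb.

    votes_a / votes_i: number of neighbors taking -a / -i.
    Returns None where the slides make no claim (no neighbors and a
    non-coronal final consonant).
    """
    # Step 1: use analogy whenever it makes a prediction.
    if votes_a > votes_i:
        return {"a"}
    if votes_i > votes_a:
        return {"i"}
    # Step 2: equal, non-zero support makes both equally acceptable.
    if votes_a > 0:
        return {"a", "i"}
    # Step 3: no analogical models, so fall back on phonotactics.
    if final_consonant in CORONALS:
        return {"i"}
    return None


print(acceptable_extensions(2, 2, "t"))
print(acceptable_extensions(0, 0, "t"))
```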

Conclusion
  Analogy or schemas?
    Activate similar words?
    Activate sublexical chunks associated with suffixes?
  Locality effects support the schematic account (cf. Albright and Hayes 2003):
    Dependencies between adjacent segments are easier to learn than dependencies between non-adjacent ones (e.g., Hudson Kam and Newport 2005)
  While adjacent dependencies are stronger, non-adjacent dependencies seem to also play a role in suffix choice (contra Albright and Hayes 2003).

Thank you!

Acknowledgements
  N.I.H. for financial support through a training grant to David Pisoni and the Speech Research Lab
  Tessa Bent, Adam Buchwald, Joan Bybee, and Susannah Levi for helpful discussion

References
  Albright, A., and B. Hayes. 2003. Rules vs. analogy in English past tenses: A computational/experimental study. Cognition 90.
  Bybee, J. L. Morphology: A study of the relation between meaning and form. Benjamins.
  Bybee, J. L. Regular morphology and the lexicon. Language and Cognitive Processes.
  Kapatsinski, V. M. 2005. Characteristics of a rule-based default are dissociable: Evidence against the Dual Mechanism Model. In S. Franks, F. Y. Gladney, and M. Tasseva-Kurtchieva, eds. Formal Approaches to Slavic Linguistics 13: The South Carolina Meeting. Michigan Slavic Publications.
  Pinker, S., and A. Prince. 1988. On language and connectionism: Analysis of a parallel distributed processing model of language acquisition. Cognition 28.
  Pinker, S., and A. Prince. 1994. Regular and irregular morphology and the psychological status of rules of grammar. In S. D. Lima, R. L. Corrigan, and G. K. Iverson, eds. The reality of linguistic rules. Benjamins.

Breakdown by place of articulation of final C

Extracting the dependencies
  For a dependency between a part of the root and a suffix to be formed, many roots must share the same sublexical chunk and the same stem extension
  Is this the case?
  What are the major schemas?
  Are they all local?

Separate networks for -a and -i verbs
  [Figure: kam and its neighbors kap, kak, kaz, kar, kaj, kad, kim, xam, kum, kajm, kach, divided into an -a network and an -i network]

The most connected -a verbs (min number of neighbors = 20)

The most connected -i verbs (min number of neighbors = 35)

Adding some less connected –i verbs (min #of neighbors = 20)
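The thresholds in the three network slides above (a minimum number of neighbors of 20 or 35) suggest filtering each network by how many neighbors a verb has. A minimal sketch of that filter, assuming the neighbor relation has already been computed within one network (the -a verbs or the -i verbs) and is given as a list of pairs; the function name and the toy edges are hypothetical.

```python
from collections import defaultdict


def most_connected(edges, min_neighbors):
    """Return {verb: neighbor count} for verbs with enough neighbors.

    edges: iterable of (verb, verb) pairs that count as neighbors.
    """
    neighbors = defaultdict(set)
    for verb_a, verb_b in edges:
        neighbors[verb_a].add(verb_b)
        neighbors[verb_b].add(verb_a)
    return {
        verb: len(linked)
        for verb, linked in neighbors.items()
        if len(linked) >= min_neighbors
    }


# Hypothetical toy edges within a single network.
TOY_EDGES = [("kam", "kap"), ("kam", "kak"), ("kam", "kaz"), ("kap", "kak")]

print(most_connected(TOY_EDGES, min_neighbors=2))
```

Sets are used so that a pair listed twice, or in both orders, is counted only once.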

Conclusion
  There are large clusters of verbs in the lexicon in which all verbs are similar to each other in exactly the same way, which could give rise to schema formation.
  Many such schemas would not involve sharing segments that are adjacent to the suffix.