Rule reliability and productivity: Velar palatalization in Russian and artificial grammar. Vsevolod Kapatsinski, Indiana University

Presentation transcript:

1 Rule reliability and productivity: Velar palatalization in Russian and artificial grammar. Vsevolod Kapatsinski, Indiana University. Laboratory Phonology XI, 30 June – 2 July 2008. Work supported by NIH Training Grant DC and NIH Research Grant DC-00111.

2 The puzzle of productivity loss: Morphophonemic rules can lose productivity while having no exceptions in the lexicon. How does this happen? If there are a lot of examples supporting a rule, why would it fail?

3 Case study: Velar palatalization in Russian
k → tʃ, g → ʒ / _ -i (verbal stem extension), -ek/-ik (nominal diminutive), -ok (nominal diminutive)
Exceptionless in the lexicon (Levikova 2003, Sheveleva 1974). Fully productive before -ek and -ok, but only partially productive before -i and -ik. Why?

4 Hypothesis: Rules are extracted from the lexicon. Rules compete for inputs. Competition is resolved by relative reliability. Reliability = the number of inputs that undergo the rule divided by the number of inputs that could undergo the rule (Albright and Hayes 2003, Pierrehumbert 2006). For example, for [] → ed, reliability = # of verbs that take -ed / # of verbs in English.

5 Rule-Based Learner (Albright and Hayes 2003)
Takes in a lexicon of pairs of morphologically related words:
blok, blotʃi-
sok, sotʃi-
sobak, sobatʃi-
zavtrak, zavtraka-
Generalizes rules from it and weights them by reliability:
k → tʃi / o_ (1.0)
k → tʃi / V[+back;-high]_ (0.75)
[] → a / ak_ (0.5)
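The reliability weights on this slide can be reproduced with a few lines of Python. This is only a minimal sketch: the three rule contexts are written by hand as hypothetical helper functions (`after_o`, `after_back_nonhigh_vowel`, `after_ak`) rather than induced from the data as in Albright and Hayes's learner, and the function name `reliability` is an illustrative choice, not part of any published implementation.

```python
from fractions import Fraction

# Toy lexicon from the slide: (base, derived stem).
LEXICON = [
    ("blok", "blotʃi"),
    ("sok", "sotʃi"),
    ("sobak", "sobatʃi"),
    ("zavtrak", "zavtraka"),
]

def reliability(lexicon, context, change):
    """Inputs that undergo the rule / inputs that could undergo it."""
    in_scope = [(base, out) for base, out in lexicon if context(base)]
    hits = [(base, out) for base, out in in_scope if change(base, out)]
    return Fraction(len(hits), len(in_scope))

# Structural changes (hand-coded for this sketch).
def palatalize_and_add_i(base, out):
    return out == base[:-1] + "tʃi"   # k -> tʃi

def add_a(base, out):
    return out == base + "a"          # [] -> a

# Rule contexts (hand-coded assumptions, not induced).
def after_o(base):
    return base.endswith("ok")

def after_back_nonhigh_vowel(base):
    return base.endswith("k") and base[-2] in "oa"

def after_ak(base):
    return base.endswith("ak")

r_after_o = reliability(LEXICON, after_o, palatalize_and_add_i)
r_after_back = reliability(LEXICON, after_back_nonhigh_vowel, palatalize_and_add_i)
r_add_a = reliability(LEXICON, after_ak, add_a)

print(float(r_after_o), float(r_after_back), float(r_add_a))
```

The broader the context, the more inputs fall in the rule's scope, which is why the rule stated over all back non-high vowels is less reliable than the one restricted to o.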

6 Rule-Based Learner (Albright and Hayes 2003)
Generalizes rules from the lexicon and weights them by reliability:
k → tʃi / o_ (1.0)
k → tʃi / V[+back;-high]_ (0.75)
[] → a / ak_ (0.5)
For each distinct output that an input can become, there will be one rule that is more reliable than the other rules producing that output from that input. For bok → botʃi:
k → tʃi / o_ (1.0)
k → tʃi / V[+back;-high]_ (0.75)
The probability of an output given an input is the reliability of the most reliable applicable rule producing that output, divided by the sum of the reliabilities of the most reliable rules leading to each different output:
bok → botʃi: 1/(1+0.5) = 67%
bok → boka: 0.5/(1+0.5) = 33%

7 Lexicon:
blok, blotʃi-
sok, sotʃi-
lak, latʃi-
zavtrak, zavtraka-
Rules:
k → tʃi / o_ (1.0)
k → tʃi / V[+back;-high]_ (0.75)
[] → a / ak_ (0.5)
Predictions:
bak → batʃi: 0.75/(0.75+0.5) = 60%
bak → baka: 0.5/(0.75+0.5) = 40%
*baki → palatalization never fails before -i

8 Lexicon, velar-final:
blok, blotʃi-
sok, sotʃi-
sobak, sobatʃi-
zavtrak, zavtraka-
Lexicon, non-velar-final:
plat, plati-
kos, kosi-
trub, trubi-
var, vari-
ver, veri-
sol, soli-
voz, vozi-
sor, sori-
[…]ar, […]ari-
Rules:
k → tʃi / o_ (1.0)
k → tʃi / V[+back;-high]_ (0.75)
[] → i / C_ (0.69)
[] → a / ak_ (0.5)
Predictions:
bak → batʃi: 0.75/(0.75+0.5+0.69) = 39%
bak → baka: 0.5/(0.75+0.5+0.69) = 26%
bak → baki: 0.69/(0.75+0.5+0.69) = 36% → palatalization fails
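The percentages on slides 6 to 8 follow directly from the normalization described on slide 6. The sketch below recomputes them; the function name `output_probabilities` and the dictionary layout are illustrative assumptions, and the reliabilities are simply copied from the slides.

```python
def output_probabilities(best_rule_reliability):
    """Normalize the reliability of the best rule for each competing output.

    `best_rule_reliability` maps each candidate output to the reliability
    of the most reliable applicable rule that produces it.
    """
    total = sum(best_rule_reliability.values())
    return {out: r / total for out, r in best_rule_reliability.items()}

# Slide 6: bok, with k -> tʃi / o_ (1.0) competing against [] -> a / ak_ (0.5).
bok = output_probabilities({"botʃi": 1.0, "boka": 0.5})

# Slide 7: bak, where only the broader palatalizing rule (0.75) applies.
bak_without_plain_i = output_probabilities({"batʃi": 0.75, "baka": 0.5})

# Slide 8: bak again, once non-velar stems support [] -> i / C_ (0.69).
bak_with_plain_i = output_probabilities({"batʃi": 0.75, "baka": 0.5, "baki": 0.69})

for name, dist in [("bok", bok),
                   ("bak (slide 7)", bak_without_plain_i),
                   ("bak (slide 8)", bak_with_plain_i)]:
    print(name, {out: f"{p:.0%}" for out, p in dist.items()})
```

Nothing about the palatalizing rules changes between slides 7 and 8. The unpalatalized output baki becomes available only because a competing rule gains support from non-velar stems.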

9 [Figure: diagram relating stored words derived from a velar-final input and bearing -i, stored words derived from a non-velar input and bearing -i, and new inputs that end in a velar and take -i. Regions are labeled "-i is preceded by an alveopalatal in the output" and "-i is preceded by a velar in the output".]

10 [Figure: the same diagram, labeled for -ek and -ok.]

11 [Figure: the same diagram, labeled for -i and -ik.]

12 Testing the hypothesis
Borrowings from English in online communication
– Inputs: Take all verbs and nouns that end in /k/ or /g/ from the British National Corpus, e.g., lock, plus a sample of verbs and nouns ending in other stops (for nouns, with matched preceding-vowel proportions).
– Outputs: Choose a suffix (for a verb, -i, -a, or -ova; for a noun, -ik, -ek, or -ok) and choose whether to change the stem. For a verb: lokatʲ, lokovatʲ, lotʃitʲ, lokitʲ. For a noun: lotʃok, lokok, lotʃek, lokek, lotʃik, lokik.
– Count: Submit the possible outputs to Google. Rate of velar palatalization failure: lokitʲ / (lotʃitʲ + lokitʲ).
Sample sizes shown on the slide: 56 velar-final and 140 non-velar-final; 20 velar-final and 40 non-velar-final.
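As a small illustration of the failure-rate measure defined on this slide, the sketch below computes it from a pair of hit counts. The function name `failure_rate` and the counts in the usage line are hypothetical and chosen only for illustration; they are not data from the study.

```python
def failure_rate(unpalatalized_hits, palatalized_hits):
    """Rate of velar palatalization failure for one borrowed stem.

    unpalatalized_hits: hits for the form keeping the velar (e.g. lokitʲ)
    palatalized_hits:   hits for the palatalized form (e.g. lotʃitʲ)
    Returns None when neither form is attested, since the rate is undefined.
    """
    total = unpalatalized_hits + palatalized_hits
    if total == 0:
        return None
    return unpalatalized_hits / total

# Hypothetical counts, for illustration only.
example_rate = failure_rate(unpalatalized_hits=11, palatalized_hits=14)
print(example_rate)
```

Only forms bearing the triggering suffix enter the ratio, so the measure is independent of how often the stem takes a different suffix such as -a or -ova.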

13 Results: Stem extensions. Velars favor -a over -i, while -i is favored elsewhere. [Figure: likelihood of taking -i by base type: velar-final, labial-final, coronal-final.]

14 Results: Stem extensions. Velar palatalization is likely to fail before -i despite being exceptionless, and -i is favored by non-velar-final inputs. [Figure annotation: mean 44%.]

15 Results: Diminutives. -ik is favored by non-velars; -ok and -ek are favored by velars. Velar palatalization fails only before -ik. [Figure annotations: mean 0%, mean 1%, mean 35%.]

16 Results: Diminutives. -ik is favored by non-velars; -ok and -ek are favored by velars. Velar palatalization fails only before -ik. [Figure annotations: mean 0%, mean 1%, mean 35%. Second panel, by base-final consonant (g; k; p, b, t, d) and suffix (-ek, -ik, -ok): mean 10%, mean 0%, mean 100%.]

17 Evidence from artificial grammar. Issue: do speakers avoid using -i after velars because velar palatalization is unproductive before -i, or is velar palatalization unproductive before -i because -i is mostly used after non-velars?

18 Evidence from artificial grammar
Native English speakers exposed to two artificial languages (percentages, followed by the counts given on the slide):
Pattern                     BLUE        RED
{k;g} → {tʃ;dʒ}i            100% (30), as listed once on the slide
{t;d;p;b} → {t;d;p;b}i      25% (8)     75% (24)
{t;d;p;b} → {t;d;p;b}a      75% (24)    25% (8)

19 Paradigm (Bybee and Newman 1995)

20 Paradigm: The subject repeats the singular-plural pair.

21 Paradigm

22 Paradigm: The subject says the plural.

23 Results: As expected, -i is more productive with non-velars in the Red Language. [Figure: BLUE vs. RED; difference marked ***.]

24 Results: The rate of velar palatalization is lower in the Red Language than in the Blue Language. Prediction confirmed. [Figure: BLUE vs. RED; difference marked *; training pattern shown as 100% (30).]

25 Results: The more productive -i is with non-velar-final inputs for a subject, the less productive velar palatalization is for the same subject. [Figure: correlation marked ***.]

26 Constraining the model: Processing stages
Two-stage model:
– Stage I: -i vs. -a
– Stage II: g → ʒ vs. "do nothing"
One-stage model:
– g → ʒi vs.
– g → ga vs.
– C → Ci

27 Context effects: Velar palatalization is likely to fail before -i despite being exceptionless. [Figure annotation: mean 44%.]

28 Explaining context effects
Context effects are due to differences in the relative reliabilities of specific velar-changing rules:
g → ʒi / V[+back;-high]_ (.475)    log: .475 vs. .232
g → ʒi / V[-high]_ (.350)
g → ʒi / V_ (.272)
g → ʒi / [+voice]_ (.195)    ping: .195 vs. .232
[] → i / C[+voiced]_ (.232)
Suppose that the decision on whether to change the stem is made in the context of an already chosen suffix (-i). In this context, all velar-changing rules are completely reliable (they are exceptionless):
g → ʒ / V[+back;-high]_i (1.0)    log: 1.0 vs. .756
g → ʒ / V[-high]_i (1.0)
g → ʒ / V_i (1.0)
g → ʒ / [+voice]_i (1.0)    ping: 1.0 vs. .756
[] → [] / C[+voiced]_i (.756)
Thus, relative reliability predicts context effects only if the suffix and the stem change are chosen simultaneously.
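The contrast between the two architectures can be made concrete by plugging the slide's reliabilities for log and ping into the normalization from slide 6. This is a sketch under a simplifying assumption: only the palatalized and unpalatalized -i outputs are compared, and outputs with other suffixes are left out. The names `p_palatalize`, `ONE_STAGE` and `TWO_STAGE` are illustrative.

```python
def p_palatalize(pal_reliability, no_change_reliability):
    """Share of -i outputs that are palatalized, given the best rule of each type."""
    return pal_reliability / (pal_reliability + no_change_reliability)

# (best palatalizing rule, best non-changing rule), copied from the slide.
ONE_STAGE = {"log": (0.475, 0.232), "ping": (0.195, 0.232)}
TWO_STAGE = {"log": (1.0, 0.756), "ping": (1.0, 0.756)}

one_stage = {word: p_palatalize(*rels) for word, rels in ONE_STAGE.items()}
two_stage = {word: p_palatalize(*rels) for word, rels in TWO_STAGE.items()}

for word in ("log", "ping"):
    print(f"{word}: one-stage {one_stage[word]:.2f}, two-stage {two_stage[word]:.2f}")
```

Under these assumptions the one-stage model predicts more palatalization for log than for ping, while the two-stage model assigns both words the same value, so only the one-stage model yields a context effect.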

29 Constraining the model: Decision rule
The Rule-Based Learner relies on a stochastic decision between competing rules. The speaker cannot go for the most reliable rule all the time:
– The most reliable rule in both the Blue Language and the Red Language is the palatalizing one → under a deterministic choice, the two languages should not differ.
– Albright and Hayes (2003): Novel verbs that are similar to many regular English verbs are more likely to take the regular past tense than novel verbs that are similar to neither regular nor irregular English verbs. The regular rule is the most reliable one in both cases, so under a deterministic choice the two classes of words should not differ.
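To see why a deterministic decision rule cannot separate the two artificial languages, the sketch below derives rough rule reliabilities from the training counts on slide 18 and compares the two decision rules. Several things here are assumptions made for illustration rather than the author's analysis: the rules are stated over all consonant-final stems, the reliabilities are computed by hand in the same way as [] → i / C_ on slide 8, and the palatalization rate is taken over -i outputs only, mirroring the failure-rate measure on slide 12.

```python
from fractions import Fraction

# Training counts from slide 18.
TRAINING = {
    "BLUE": {"velar_pal_i": 30, "nonvelar_i": 8, "nonvelar_a": 24},
    "RED": {"velar_pal_i": 30, "nonvelar_i": 24, "nonvelar_a": 8},
}

def reliabilities(counts):
    """Hand-derived reliabilities of the rules applicable to a velar-final input."""
    total = sum(counts.values())  # all consonant-final training stems
    return {
        "pal+i": Fraction(1),                               # exceptionless for velars
        "plain+i": Fraction(counts["nonvelar_i"], total),   # [] -> i / C_
        "plain+a": Fraction(counts["nonvelar_a"], total),   # [] -> a / C_
    }

def deterministic_choice(rel):
    """Always apply the single most reliable rule."""
    return max(rel, key=rel.get)

def stochastic_pal_rate(rel):
    """Expected share of -i outputs that are palatalized under reliability matching."""
    return rel["pal+i"] / (rel["pal+i"] + rel["plain+i"])

choice = {lang: deterministic_choice(reliabilities(c)) for lang, c in TRAINING.items()}
rate = {lang: stochastic_pal_rate(reliabilities(c)) for lang, c in TRAINING.items()}

for lang in TRAINING:
    print(lang, choice[lang], f"{float(rate[lang]):.2f}")
```

With these numbers the deterministic rule picks palatalization in both languages, whereas the stochastic rule predicts a lower palatalization rate in the Red Language, which is the direction of the difference reported on slide 24.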

30 Summary
If:
– Rules compete.
– The outcome of competition is influenced by reliability (Albright and Hayes 2003, Pierrehumbert 2006).
– Known words are retrieved from the lexicon, not generated by the grammar.
Then: A rule can lose productivity yet remain exceptionless if the triggering affix comes to be used mostly with segments that cannot undergo the rule.
To account for the present results:
– Competition between rules must be resolved stochastically.
– The suffix and the stem shape must be chosen during a single decision stage.

31 References
Albright, A., and B. Hayes. 2003. Rules vs. analogy in English past tenses: A computational / experimental study. Cognition, 90.
Bybee, J., and J. Newman. 1995. Are stem changes as natural as affixes? Linguistics, 33.
Kapatsinski, V. M. Characteristics of a rule-based default are dissociable: Evidence against the Dual Mechanism Model. In S. Franks, F. Y. Gladney, and M. Tasseva-Kurktchieva, eds. Formal Approaches to Slavic Linguistics 13: The South Carolina Meeting. Ann Arbor, MI: Michigan Slavic Publications.
Levikova, S. I. 2003. Bol'shoj slovar' molodezhnogo slenga. [The big dictionary of youth slang]. Moscow: Fair-Press.
Pierrehumbert, J. B. 2006. The statistical basis of an unnatural alternation. In L. Goldstein, D. H. Whalen, and C. Best (eds), Laboratory Phonology VIII: Varieties of Phonological Competence. Berlin: Mouton de Gruyter.
Sheveleva, M. S. 1974. Obratnyj slovar' russkogo jazyka. [Reverse dictionary of Russian]. Moscow: Sovetskaja Enciklopedija.