Paul De Palma George Luger Departments of Computer Science Gonzaga University University of New Mexico 1.


1
Paul De Palma, George Luger
Departments of Computer Science
Gonzaga University, University of New Mexico
(depalma@gonzaga.edu)

2
• Reversal of the expected linear ordering of sounds
  • Instead of xy we find yx
• Examples
  • tl shift: borrowed noun chipotle → chipolte (SAE: a spice)
  • ts shift: binyan 5 hitsader → histader (Modern Hebrew: “he got organized”)
  • hr shift: dative singular tehernek → dative plural terhek (Hungarian: “load”)
  • rh shift: expected tiirhisaskhus → actual tihriasku (Pawnee: “he is called”)
• Metathesis Myth: sporadic, irregular, due to performance errors
• String of sounds realized as xy in language A can be yx in language B
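The xy → yx reordering can be illustrated with a small sketch. The class and method names here are hypothetical (not from the slides), and each letter is treated as one sound segment, which is only an orthographic approximation.

```java
// Hypothetical helper, not from the slides: swaps the first occurrence of an
// adjacent two-segment sequence xy to yx, treating each letter as one segment.
final class MetathesisDemo {
    static String metathesize(String word, String pair) {
        if (pair.length() != 2) {
            throw new IllegalArgumentException("pair must have exactly two segments");
        }
        int i = word.indexOf(pair);
        if (i < 0) {
            return word; // sequence not present: nothing to reorder
        }
        String swapped = "" + pair.charAt(1) + pair.charAt(0);
        return word.substring(0, i) + swapped + word.substring(i + 2);
    }

    public static void main(String[] args) {
        System.out.println(metathesize("chipotle", "tl")); // prints "chipolte"
        System.out.println(metathesize("hitsader", "ts")); // prints "histader"
    }
}
```

The Hungarian and Pawnee examples also involve other changes to the word, so a plain adjacent swap like this does not reproduce them.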

3
• A usage-based phonological account (Elizabeth Hume)
  • Primarily synchronic
  • Can be extended to language change
• Utterance Selection Theory (William Croft)
• Genetic Algorithm operationalizes Utterance Selection Theory

4
• Metathesis requires two conditions
  1. An indeterminate speech signal
  2. Output that conforms to existing patterns in language
• Example: chipotle/chipolte
  • In SAE, tl (stop consonant preceding a lateral) is indeterminate
  • Stop consonant following the lateral is frequent in post-vocalic position (cold, sold, mold, fold, molt, bolt, jolt, colt)
  • SAE speakers transform tl to lt
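The second condition (output that conforms to existing patterns) can be sketched as a pattern count over a word list. This is a toy illustration only: the list holds just the eight words cited on the slide, the letter classes are orthographic stand-ins for vowels and stops, and all names are hypothetical.

```java
import java.util.List;
import java.util.regex.Pattern;

final class PatternAttestation {
    // The eight example words from the slide; a real study would use a corpus.
    static final List<String> SLIDE_WORDS =
        List.of("cold", "sold", "mold", "fold", "molt", "bolt", "jolt", "colt");

    // Orthographic approximations (assumptions): vowel letters and stop letters.
    static final Pattern LATERAL_STOP = Pattern.compile("[aeiou]l[ptkbdg]"); // e.g. "olt"
    static final Pattern STOP_LATERAL = Pattern.compile("[aeiou][ptkbdg]l"); // e.g. "otl"

    // Number of words in which the pattern occurs at least once.
    static long count(List<String> words, Pattern p) {
        return words.stream().filter(w -> p.matcher(w).find()).count();
    }

    public static void main(String[] args) {
        System.out.println("lateral+stop: " + count(SLIDE_WORDS, LATERAL_STOP));
        System.out.println("stop+lateral: " + count(SLIDE_WORDS, STOP_LATERAL));
    }
}
```

On this tiny list every word contains the postvocalic lateral+stop sequence and none contains stop+lateral, which is the asymmetry the slide appeals to.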

5
• Natural Selection requires:
  • A population of individuals with distinct characteristics
  • A mechanism for replicating those characteristics
  • Interaction among individuals and the environment
  • Selective pressure from the environment producing differential reproduction of the individuals and characteristics
• Extended to language:
  • Language: A population of utterances (not a system of signs or a collection of words and rules that operate on them)
  • Normal replication: utterance conforms to the conventions of language use
  • Altered replication: utterance violates convention
  • Selection: gradual establishment of a new convention through use

6
• Operationalizes (i.e., renders computationally precise)
  • Usage-based account of metathesis
  • Usage-based account of language change
• Based loosely on the Darwinian notion of natural selection

7
GA() {
    Initialize(population);     // build initial population
    ComputeCost(population);    // apply cost function
    Sort(population);           // rank population
    while (population has not converged on a good-enough solution) {
        Pair(population);            // decide which members reproduce
        Mate(population);            // exchange characteristics
        Mutate(population);          // randomly perturb genes
        Sort(population);            // rank population
        TestConvergence(population); // has a new species appeared?
    }
}

8
Embodies most of the theory being modeled. For example,
1. Prevocalic stop (e.g., te) is more salient than a postvocalic stop. Give a fitness boost.
2. Penalize words with postvocalic stops (e.g., et).
3. Glottals (e.g., g), liquids (e.g., l), glides (e.g., w) bleed into adjacent sounds when followed by a stop (e.g., t). Penalize sequences like lt.
4. A stop followed by any non-stop consonant (e.g., tl) is perceptually weak. Penalize stop/non-stop consonant sequences.
5. A stop followed by a strident (e.g., ts) is perceptually weak. Penalize prestrident stops.
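A minimal sketch of how the five rules above might be combined into a single score. The slides give no numeric weights, so the unit boost and penalty here are hypothetical, as are the class name and the letter classes, which approximate phonological classes by spelling (with h standing in for the glottal class).

```java
final class PhonotacticFitness {
    // Orthographic stand-ins for phonological classes (assumptions).
    static final String VOWELS = "aeiou";
    static final String STOPS = "ptkbdg";
    static final String STRIDENTS = "sz";
    static final String BLEEDERS = "hlrwy"; // glottal h, liquids l r, glides w y

    // Hypothetical unit weights; the slides do not specify values.
    static final int BOOST = 1;
    static final int PENALTY = 1;

    static boolean isIn(String cls, char c) {
        return cls.indexOf(c) >= 0;
    }

    static int fitness(String word) {
        int score = 0;
        for (int i = 0; i < word.length(); i++) {
            char c = word.charAt(i);
            if (!isIn(STOPS, c)) {
                continue; // every rule is stated relative to a stop
            }
            char prev = i > 0 ? word.charAt(i - 1) : '#';               // '#' = word edge
            char next = i + 1 < word.length() ? word.charAt(i + 1) : '#';
            if (isIn(VOWELS, next)) score += BOOST;       // rule 1: prevocalic stop
            if (isIn(VOWELS, prev)) score -= PENALTY;     // rule 2: postvocalic stop
            if (isIn(BLEEDERS, prev)) score -= PENALTY;   // rule 3: glottal/liquid/glide + stop
            boolean nextIsConsonant = next != '#' && !isIn(VOWELS, next);
            if (nextIsConsonant && !isIn(STOPS, next)) score -= PENALTY; // rule 4: stop + non-stop
            if (isIn(STRIDENTS, next)) score -= PENALTY;  // rule 5: stop + strident
        }
        return score;
    }

    public static void main(String[] args) {
        for (String w : new String[] {"chipotle", "chipolte", "hitsader", "histader"}) {
            System.out.println(w + " -> " + fitness(w));
        }
    }
}
```

With these assumed weights the metathesized forms outscore their bases (chipolte 0 vs. chipotle -2, histader 1 vs. hitsader -3). Other weightings would change the margins, and possibly the ordering.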

9
• Each utterance in the population is tagged with a collection of boosts and penalties
• The collection makes the underlying phonological theory computationally precise

10
• Encode the GA as a collection of objects in Java executable under Linux
• Parameters
  • Population size: 64 strings
  • Mutation factor: 0.5%
  • For each of 1, 2, 4 base strings in the population, begin at parity then double the number of target strings three times
  • Fill out the balance of the population with randomly generated character sequences
• For each population configuration
  • Run GA 250 times
  • 250 generations per run
  • Collect results per run
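The twelve base:target configurations and the random fill can be sketched as follows. This is not the authors' code: the names are hypothetical, and the alphabet (lowercase a to z) and the length of the random strings (same as the base word) are assumptions the slides leave open.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;

final class PopulationSetup {
    static final int POPULATION_SIZE = 64; // from the slide

    // Base counts 1, 2, 4; targets start at parity and are doubled three times.
    static List<int[]> configurations() {
        List<int[]> configs = new ArrayList<>();
        for (int base : new int[] {1, 2, 4}) {
            int target = base;
            for (int k = 0; k < 4; k++) {
                configs.add(new int[] {base, target});
                target *= 2;
            }
        }
        return configs;
    }

    // Base and target tokens first, then random strings up to the population size.
    static List<String> initialPopulation(String baseWord, int baseCount,
                                          String targetWord, int targetCount, Random rng) {
        List<String> population = new ArrayList<>();
        for (int i = 0; i < baseCount; i++) population.add(baseWord);
        for (int i = 0; i < targetCount; i++) population.add(targetWord);
        while (population.size() < POPULATION_SIZE) {
            population.add(randomString(baseWord.length(), rng));
        }
        return population;
    }

    // Assumed alphabet: lowercase a-z.
    static String randomString(int length, Random rng) {
        StringBuilder sb = new StringBuilder(length);
        for (int i = 0; i < length; i++) {
            sb.append((char) ('a' + rng.nextInt(26)));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        for (int[] c : configurations()) {
            System.out.println(c[0] + ":" + c[1]);
        }
    }
}
```

The ratios run from 1:1 through 4:32, matching the rows of the results tables.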

11
1. Input an initial population of the base word and the target word.
2. Generate random sequences of characters that fill out the population.
3. Assign a fitness value to each of the sequences that comprise the population.
4. Sort the population by fitness value.
5. Collect the population into two-tuples from highest to lowest fitness.
6. Exchange pieces of sounds between each pair.
7. Randomly shift a fixed fraction of the sounds (analogous to the action of chemical/biological/radiological mutagens on individuals).
8. Sort the population. Stop if some predetermined condition is met, else go to step 3.
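Steps 5 through 7 can be sketched as three small operators. The slides do not say how pieces are exchanged, so single-point crossover is assumed here. The per-character mutation over a lowercase a to z alphabet and all names are likewise hypothetical. Only the 0.5% rate comes from the parameters slide.

```java
import java.util.Random;

final class GeneticOperators {
    static final double MUTATION_RATE = 0.005; // 0.5%, from the parameters slide

    // Step 5: group an already-sorted population into consecutive pairs
    // (1st with 2nd, 3rd with 4th, ...). Assumes an even population size.
    static String[][] pair(String[] sorted) {
        String[][] pairs = new String[sorted.length / 2][];
        for (int i = 0; i < pairs.length; i++) {
            pairs[i] = new String[] {sorted[2 * i], sorted[2 * i + 1]};
        }
        return pairs;
    }

    // Step 6: single-point crossover (an assumption). Swaps the tails of two
    // equal-length strings at position cut, where 0 <= cut <= length.
    static String[] mate(String a, String b, int cut) {
        return new String[] {
            a.substring(0, cut) + b.substring(cut),
            b.substring(0, cut) + a.substring(cut)
        };
    }

    // Step 7: replace each character with a random letter with probability rate.
    static String mutate(String s, double rate, Random rng) {
        char[] chars = s.toCharArray();
        for (int i = 0; i < chars.length; i++) {
            if (rng.nextDouble() < rate) {
                chars[i] = (char) ('a' + rng.nextInt(26));
            }
        }
        return new String(chars);
    }

    public static void main(String[] args) {
        String[] children = mate("chipotle", "chipolte", 6);
        System.out.println(children[0] + " " + children[1]); // prints "chipotte chipolle"
    }
}
```

How offspring replace parents between generations is not specified in the slides, so that part of the loop is left out of the sketch.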

12
• chipotle/chipolte
  • After 60 generations chipolte tokens are 95% of the population
  • chipotle disappears within 3 generations
• hitsader/histader
  • After 48 generations histader tokens are 97.3% of the population
  • hitsader disappears within 2 generations

13
• Accurate but underspecified
• Computational model supplies missing precision
• Usage-based aspect modeled as a frequency effect
  • Target tokens tend to stabilize more quickly, and at a higher fraction, as their number in the initial population increases
  • The larger the number of base tokens in the initial population, the better the performance

14
• Hume’s account of metathesis can be reframed as an account of (one type of) language change
• Can be rendered computationally precise using the Genetic Algorithm

15 15

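16

| Ratio of Base to Target | Generation Chipotle Disappeared | Generation Chipolte Stabilized | Percent of Chipolte Tokens at Stabilization |
|---|---|---|---|
| 1:1 | 3 | 119 | 73.4 |
| 1:2 | 3 | 68 | 93.7 |
| 1:4 | 3 | 60 | 96.8 |
| 1:8 | 2 | 44 | 98.4 |
| 2:2 | 3 | 90 | 92.1 |
| 2:4 | 3 | 72 | 96.8 |
| 2:8 | 2 | 50 | 98.4 |
| 2:16 | 2 | 31 | 98.4 |
| 4:4 | 3 | 58 | 96.8 |
| 4:8 | 2 | 44 | 98.4 |
| 4:16 | 2 | 33 | 98.4 |
| 4:32 | 1 | 26 | 98.4 |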

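17

| Ratio of Base to Target | Generation Hitsader Disappeared | Generation Histader Stabilized | Percent of Histader Tokens at Stabilization |
|---|---|---|---|
| 1:1 | 2 | 79 | 84.3 |
| 1:2 | 2 | 65 | 98.4 |
| 1:4 | 2 | 55 | 98.4 |
| 1:8 | 2 | 39 | 98.4 |
| 2:2 | 2 | 59 | 98.4 |
| 2:4 | 2 | 47 | 98.4 |
| 2:8 | 2 | 43 | 98.4 |
| 2:16 | 1 | 33 | 98.4 |
| 4:4 | 2 | 49 | 98.4 |
| 4:8 | 1 | 43 | 98.4 |
| 4:16 | 1 | 32 | 98.4 |
| 4:32 | 1 | 23 | 100 |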
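The percentages in the two results tables are consistent with whole token counts in a 64-string population, truncated to one decimal place (for example, 63 of 64 tokens gives 98.4). This reading is an inference from the numbers, not something the slides state. The helper below is a hypothetical sketch of that arithmetic.

```java
final class TokenShare {
    // Percentage of the population held by a token count, truncated (not
    // rounded) to one decimal place. Truncation is an assumption that happens
    // to reproduce the tabulated values.
    static double percentTruncated(int tokens, int populationSize) {
        return Math.floor(1000.0 * tokens / populationSize) / 10.0;
    }

    public static void main(String[] args) {
        for (int tokens : new int[] {47, 54, 59, 60, 62, 63, 64}) {
            System.out.println(tokens + "/64 -> " + percentTruncated(tokens, 64));
        }
    }
}
```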

18
• Use transcribed corpora to determine the frequency of both vulnerable cues and the targets of metathetic change
• Use frequencies to weight penalties and rewards (adding precision to statements like, “[they] contribute to indeterminacy: /t/ with perceptually vulnerable cues and /l/ with stretched out features,” Hume, 2004, p. 223)
• Generate all instances of metathesis within a language

19
Croft, W. (2000). Explaining Language Change: An Evolutionary Approach. Harlow, England: Pearson.
Hume, E. (2004). The Indeterminacy/Attestation Model of Metathesis. Language 80(2): 203-237.

