Computer Simulations. Decision Trees 1. Decision trees (classification trees) Designed to find the combination of variables that accounts for most of.


1 Computer Simulations

2 Decision Trees 1. Decision trees (classification trees) Designed to find the combination of variables that accounts for most of the data. (The measured data are nominal.) A. Splits are made that maximize one category in one branch and minimize it in the other. B. All combinations of independent variables are tried. C. Branches are added until adding more doesn’t give you a better fit. D. No statistical significance is calculated.
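The split search in A to C can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the rows are invented toy data in a (store, word, try, outcome) layout, and the scoring (how many rows each branch's most common outcome gets right) is a simple stand-in for the purity measures that real tree software uses.

```python
from collections import Counter

# Hypothetical toy rows: (store, word, try, outcome). Not real study data.
ROWS = [
    ("Kleins", "floor", "emphatic", "no-R"),
    ("Kleins", "fourth", "normal", "no-R"),
    ("Kleins", "floor", "normal", "no-R"),
    ("Macys", "fourth", "normal", "no-R"),
    ("Macys", "floor", "emphatic", "R"),
    ("Saks", "floor", "normal", "R"),
    ("Saks", "floor", "emphatic", "R"),
    ("Saks", "fourth", "normal", "no-R"),
]

def majority_hits(rows):
    """Number of rows that the branch's most common outcome gets right."""
    if not rows:
        return 0
    return Counter(r[-1] for r in rows).most_common(1)[0][1]

def best_split(rows, n_vars=3):
    """Try every (variable, value) binary split and keep the best-scoring one."""
    best = None
    for var in range(n_vars):
        for value in sorted({r[var] for r in rows}):
            left = [r for r in rows if r[var] == value]
            right = [r for r in rows if r[var] != value]
            hits = majority_hits(left) + majority_hits(right)
            # Strict ">" keeps the first split found when scores tie.
            if best is None or hits > best[0]:
                best = (hits, var, value)
    return best

baseline = majority_hits(ROWS)       # fit with no split at all
hits, var, value = best_split(ROWS)  # fit after the best single split
print(baseline, hits, var, value)
```

A branch would only be added when `hits` exceeds `baseline`, which mirrors point C: stop when another split no longer improves the fit.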

3 Decision Trees 2. Labov’s Department Store Study How was it done? What does it show? How do you judge the results, e.g. if there is 25% deletion in Saks and 30% in Macy’s? What if there is interaction (two or more variables working together)? Logistic regression is one method; decision trees are another.

4 Decision Trees For the department store study: Dependent variable: pronunciation of /r/. Independent variables: Store: Klein’s, Macy’s, Saks. Word: fourth, floor. Try #: 1st (normal) or 2nd (emphatic). Question: How do the independent variables affect the dependent variable?

5 Decision Trees Lines of data look like this: Kleins, floor, emphatic, no-R. Kleins, fourth, emphatic, R. Saks, floor, non-emphatic, no-R.

6 1. Clerks in Klein’s do not pronounce /r/ (195/216, 90.3% correct). 2. Fourth is pronounced without an /r/ (192/270, 71.1% correct). 3. Clerks in Saks pronounce the /r/ in floor (52/82, 63.4% correct). 4. Clerks in Macy’s pronounce /r/ in floor as an emphatic second response (31/51, 60.8% correct).
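The "% correct" figures on this slide are simply hits divided by the cases each rule covers. A short check, using only the counts given above (the labels are paraphrases added for readability):

```python
# (rule label, correctly predicted cases, cases covered by the rule)
RULES = [
    ("Klein's: no /r/", 195, 216),
    ("fourth: no /r/", 192, 270),
    ("Saks, floor: /r/", 52, 82),
    ("Macy's, floor, emphatic: /r/", 31, 51),
]

def percent_correct(hits, total):
    """Share of covered cases the rule predicts correctly, to one decimal."""
    return round(100 * hits / total, 1)

for label, hits, total in RULES:
    print(f"{label}: {percent_correct(hits, total)}%")
```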

7 Decision Trees 3. Oprah Winfrey’s pronunciation of [aj] as [aj] or [a] (monophthongization) (Mendoza-Denton, Hay, Jannedy) What social factors affect it? The researchers included: A. The person Oprah was talking about B. The race of that person C. The gender of the person D. Class of word (I is a pronoun, light is a noun (or verb)) E. Frequency of the word F. What sound precedes [aj]

8 Decision Trees Lines of data look like this: (“So, I talked to Tina the other day.”) Tina, black, female, I, 5443.7, [ow]

9 The decision tree only found a good fit with two variables: A. The person Oprah was talking about B. The kind of sound that precedes [aj] What are the “rules” this tree gives? Decision trees are good at making sense of messy data.

10 Analogical Modeling

11 Generative Approach We need to conserve storage space in our brain. Store only what is unpredictable: sang, thought. Use rules to derive or parse the predictable: add –ed to regular verbs (walked, formed). Connections between irregulars are OK, but not between regulars: rang~sang (causes *brang), drove~dove (causes *arrove). ***steam~seam, truck~luck~tuck***

12 Spanish Stress Generative approach Most stress is predictable, so it isn’t stored. Rules are applied in production. If a word ends in C (except –s and –n), stress the final syllable: /animal/ > [animál], /motor/ > [motór]. If a word ends in V (or in –s, –n), stress the penult syllable: /tisa/ > [tísa], /komen/ > [kómen]. Antepenult stress is unpredictable, so it must be stored or marked somehow: /periódiko/ > [periódiko], /depósito/ > [depósito]
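The two default rules plus stored exceptions can be written as a short function. This is a minimal sketch, assuming phonemic spellings without accent marks; the name `STORED_STRESS` and the dictionary format are illustrative choices, not part of the original analysis.

```python
VOWELS = set("aeiou")

# Unpredictable (antepenult) stress has to be listed word by word.
STORED_STRESS = {"periodiko": "antepenult", "deposito": "antepenult"}

def rule_stress(word):
    """Return the stress position assigned by lexical marking or default rule."""
    if word in STORED_STRESS:
        return STORED_STRESS[word]
    last = word[-1]
    # Final vowel, -n, or -s: stress the penult.
    if last in VOWELS or last in "ns":
        return "penult"
    # Any other final consonant: stress the final syllable.
    return "final"

for w in ["animal", "motor", "tisa", "komen", "periodiko"]:
    print(w, rule_stress(w))
```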

13 Analogical Approach What evidence is there that we need to conserve storage space in our brain? There is lots of evidence that we store many details of words. Store everything, not just what is unpredictable: sang, thought, walked, formed. All words form connections with others that are semantically, phonetically, morphologically, or relationally similar. Semantically: cut/tear, break/breach. Phonetically: tribe/bribe. Morphologically: sit/sat, reveal/revelation. Relationally (collocates): homework/school, nurse/shot

14 Spanish Stress Analogical approach All words are stored with stress. No rules needed. Find similar words and apply the stress they have. Animál has final stress due to its neighbors with final stress (tamál, formár, ...). Antepenult stress IS predictable based on other words with this stress. Lots of them end in -iko and -ito: /periódiko/ > [periódiko], /depósito/ > [depósito]
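"Find similar words and apply the stress they have" can be illustrated with a very simple nearest-neighbor search. This is a simplified stand-in, not Skousen's Analogical Modeling algorithm: similarity here is just the number of shared final segments, and the mini-lexicon and the test word katoliko are invented for the example.

```python
from collections import Counter

# Hypothetical mini-lexicon: phonemic form -> stored stress position.
LEXICON = {
    "tamal": "final", "formar": "final", "kanal": "final",
    "tisa": "penult", "komen": "penult", "mesa": "penult",
    "periodiko": "antepenult", "metodiko": "antepenult",
    "deposito": "antepenult",
}

def shared_ending(a, b):
    """Count how many final segments two forms have in common."""
    n = 0
    for x, y in zip(reversed(a), reversed(b)):
        if x != y:
            break
        n += 1
    return n

def analogize(word, lexicon=LEXICON):
    """Return (predicted stress, neighbors) from the most similar stored words."""
    scores = {w: shared_ending(word, w) for w in lexicon if w != word}
    best = max(scores.values())
    neighbors = [w for w, s in scores.items() if s == best]
    votes = Counter(lexicon[w] for w in neighbors)
    return votes.most_common(1)[0][0], neighbors

print(analogize("animal"))
print(analogize("katoliko"))
```

Returning the neighbors alongside the prediction makes it easy to see which stored words are responsible for a given stress assignment.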

15 Analogical modeling How do you test the theory that analogy explains stress? You need a model

16 Analogical modeling How do you test the theory that analogy explains stress? You need a model ANALOGICAL MODELING OF LANGUAGE (Royal Skousen)

17 Analogical modeling You need a database that approximates what speakers know. For Spanish it’s the 5000 most frequent words. You need information about the words that is used to find similar words: phonemic and morphological information about Spanish words. Example: relatando. Phones by syllable: re= la= tan =do=. Morphology: re= la= tan =do=, gerund. Stress placement: penult.

18 Analogical modeling Tool question Where would you get the 5000 most frequent words?

19 Analogical modeling Outcome is probabilistic

20 Analogical modeling Results (based on majority rule)
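A probabilistic outcome can be turned into a single prediction in two ways: pick the outcome with the most support (majority rule), or draw an outcome at random in proportion to its support. The sketch below assumes invented support counts for one test word; the function names are illustrative.

```python
import random

# Hypothetical support for each stress pattern for one test word.
support = {"penult": 14, "final": 5, "antepenult": 1}

def outcome_probabilities(support):
    """Convert raw support counts into probabilities that sum to 1."""
    total = sum(support.values())
    return {outcome: n / total for outcome, n in support.items()}

def pick_by_majority(support):
    """Majority rule: always choose the best-supported outcome."""
    return max(support, key=support.get)

def pick_at_random(support, rng):
    """Choose an outcome with probability proportional to its support."""
    outcomes = list(support)
    weights = [support[o] for o in outcomes]
    return rng.choices(outcomes, weights=weights, k=1)[0]

probs = outcome_probabilities(support)
print(probs)
print(pick_by_majority(support))
print(pick_at_random(support, random.Random(0)))
```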

21 Analogical modeling Leave one out simulation Take each word out one by one Pretend you don’t know where the stress is Use analogy to predict it

22 Analogical modeling Leave one out simulation Take each word out one by one Pretend you don’t know where the stress is Use analogy to predict it Outcome 94.4% correct

23 Analogical modeling Leave one out simulation Take each word out one by one Pretend you don’t know where the stress is Use analogy to predict it Outcome 94.4% correct How does this compare to applying rules?

24 Analogical modeling Leave one out simulation Take each word out one by one Pretend you don’t know where the stress is Use analogy to predict it Outcome 94.4% correct How does this compare to applying rules? Rules get 86.6% correct
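The leave-one-out procedure itself is easy to sketch. The example below assumes a toy lexicon and a crude shared-ending predictor in place of the real 5000-word database and the AM algorithm, so its accuracy says nothing about the 94.4% and 86.6% figures; it only shows the shape of the simulation.

```python
from collections import Counter

# Hypothetical toy lexicon: phonemic form -> stress position.
LEXICON = {
    "tamal": "final", "animal": "final", "kanal": "final",
    "formar": "final", "tomar": "final", "sofa": "final",
    "tisa": "penult", "misa": "penult", "mesa": "penult",
    "komen": "penult", "tomen": "penult",
    "periodiko": "antepenult", "metodiko": "antepenult",
    "deposito": "antepenult",
}

def shared_ending(a, b):
    """Count how many final segments two forms have in common."""
    n = 0
    for x, y in zip(reversed(a), reversed(b)):
        if x != y:
            break
        n += 1
    return n

def predict(word, lexicon):
    """Vote among the stored words that share the longest ending with word."""
    scores = {w: shared_ending(word, w) for w in lexicon}
    best = max(scores.values())
    votes = Counter(lexicon[w] for w, s in scores.items() if s == best)
    return votes.most_common(1)[0][0]

def leave_one_out(lexicon):
    """Remove each word in turn, predict its stress from the rest, score it."""
    errors = []
    for word, stress in lexicon.items():
        rest = {w: s for w, s in lexicon.items() if w != word}
        if predict(word, rest) != stress:
            errors.append(word)
    accuracy = (len(lexicon) - len(errors)) / len(lexicon)
    return accuracy, errors

accuracy, errors = leave_one_out(LEXICON)
print(f"{accuracy:.1%} correct; errors: {errors}")
```

In this toy run the only miss is sofa, which is pulled toward penult stress by tisa, misa, and mesa. That is a regularization error, the kind of error worth comparing with what speakers do.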

25 Analogical modeling Is AM doing it the way people do? Hochberg taught kids made-up words, then observed their stress errors. AM made regularization and irregularization errors in the same direction.

26

27 Analogy in phonology càpitalístic: capi[ɾ]alistic. mìlitarístic: mili[th]aristic. Same prosodic structure, but different realizations of /t/. The same rule should apply to both.

28 Analogy in phonology Steriade’s (2000) rule predicts a flap in both cases. capi[ɾ]al: the rule explains the flap in capi[ɾ]alistic. mili[t]ary: analogy overrides the rule, producing the stop in mili[th]aristic.

29 Analogy in phonology Analogy says there is no rule; it’s all analogy. Let’s determine the outcome (e.g. [ɾ] or [t]) based on the similarity of the test form to a database of stored instances.

30 Analogy in phonology Analogy says there is no rule; it’s all analogy. Let’s determine the outcome (e.g. [ɾ] or [t]) based on the similarity of the test form to a database of stored instances. 3,719 instances of allophones of /t/ taken from TIMIT (a tool from LDC!). 630 speakers read 10 sentences. Utterances transcribed: 644 [ɾ], 234 [ʔ], 284 [Ø], 760 [t], 860 [t], 969 [th], and 48 [d].

31 Analogy in phonology Each instance of /t/ is encoded to include its allophonic realization and the context it appears in. The phones or boundaries three slots to the left and right of /t/, and stress, are encoded. e.g. I know I didn't meet her: 1) [ɾ], 2) word boundary, 3) [m], 4) [i], 5) word boundary, 6) [ɚ], 7) pause, 8) primary stress, 9) unstressed
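The nine-slot encoding can be sketched as a small function. This is an illustration under assumptions: the symbols "#" (word boundary), "pause", and "PAD" (filler when the utterance edge is reached), as well as the function and argument names, are invented here and are not the coding scheme of the original study.

```python
PAD = "PAD"  # hypothetical filler for slots beyond the utterance edge

def encode_t(segments, t_index, allophone, stress_here, stress_next, width=3):
    """Build one record: allophone, 3 left slots, 3 right slots, 2 stress values."""
    padded = [PAD] * width + list(segments) + [PAD] * width
    i = t_index + width                  # position of /t/ in the padded list
    left = padded[i - width:i]           # three slots to the left of /t/
    right = padded[i + 1:i + 1 + width]  # three slots to the right of /t/
    return (allophone, *left, *right, stress_here, stress_next)

# End of "I know I didn't meet her": ... # m i t # ɚ pause
segments = ["#", "m", "i", "t", "#", "ɚ", "pause"]
record = encode_t(segments, 3, "ɾ", "primary", "unstressed")
print(record)
```

Records built this way can be compared slot by slot, which is what lets a test form such as capitalistic be matched against stored instances.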

32 Analogy in phonology Test words: capitalistic, negativistic, positivistic, primitivistic, relativistic, habitability, irritability, immutability, dissatisfaction. Two simulations: 1) Base words of test words contain [ɾ] in database. 2) Base words of test words contain [th] in database.

33

34 Analogy in phonology The pronunciation of the base form influences that of the derived form by analogy. The base form is not the only word influencing the derived form. capi[th]alistic is predicted at 90%, yet capital accounts for only 30% of this. Words such as appetite, hepatitis, and particular also influence the outcome.

35 Ambisyllabicity Common in English. Merriam-Webster: si.lly, ho.llow, ba.lance. Cambridge: sill.y, ho.llow or holl.ow, bal.ance. People vary in their perceptions and practices. This has implications for doubled consonants (ambisyllabicity). Frequently observed in the data: Hessari / Hesaari

