# 15 lines representing a bull Traditional statistics Assumes data is independent Comparative methods.

## Presentation transcript:

15 lines representing a bull

Traditional statistics: assumes data is independent. Comparative methods.

Fish: English Fish, Danish Fisk, Dutch Visch. Ryba: Czech Ryba, Russian Ryba, Bulgarian Riba. 23 other languages, 34 other languages.

1: who, three. Average: 17. 35: person, dirty.

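Languages and meanings, with the cognate class of each word for "sea":

| Language | "here" | "sea" | Cognate class ("sea") | "water" | "when" |
|---|---|---|---|---|---|
| English | here | sea | (A) | water | when |
| German | hier | see, meer | (A, B) | wasser | wenn |
| French | ici | mer | (B) | eau | quand |
| Italian | qui, qua | mare | (B) | acqua | quando |
| Greek | edo | thalasa | (C) | nero | pote |
| Hittite | kaa | aruna- | (D) | watar | kuwapi |

The same "sea" data coded as presence (1) or absence (0) of each cognate class:

| Language | sea (A) | meer (B) | thalasa (C) | aruna- (D) |
|---|---|---|---|---|
| English | 1 | 0 | 0 | 0 |
| German | 1 | 1 | 0 | 0 |
| French | 0 | 1 | 0 | 0 |
| Italian | 0 | 1 | 0 | 0 |
| Greek | 0 | 0 | 1 | 0 |
| Hittite | 0 | 0 | 0 | 1 |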
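The recoding from cognate classes to a presence/absence matrix can be sketched in a few lines of Python. The dictionary simply restates the "sea" classes from the slide; the helper name `encode_cognates` is hypothetical and not taken from the presentation.

```python
# Cognate classes for the meaning "sea", restated from the slide.
# A language can belong to more than one class (German: see = A, meer = B).
SEA_CLASSES = {
    "English": {"A"},
    "German": {"A", "B"},
    "French": {"B"},
    "Italian": {"B"},
    "Greek": {"C"},
    "Hittite": {"D"},
}


def encode_cognates(classes_by_language):
    """Return (sorted class labels, {language: binary string})."""
    labels = sorted(set().union(*classes_by_language.values()))
    rows = {
        language: "".join("1" if label in classes else "0" for label in labels)
        for language, classes in classes_by_language.items()
    }
    return labels, rows


labels, rows = encode_cognates(SEA_CLASSES)
for language, bits in rows.items():
    print(language, bits)
```

Each column of the resulting matrix can then be treated as one binary character evolving along the language tree.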

Q: rate matrix over two states, 0 = non-cognate and 1 = cognate. [Figure: Q matrix and 0/1 character states changing over time; time scale 1000 years]
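A two-state rate matrix Q of this kind determines the probability of moving between state 0 (non-cognate) and state 1 (cognate) over a time span t, via P(t) = exp(Qt). The sketch below uses the standard closed form for a two-state continuous-time Markov chain. The rate values are placeholders chosen for illustration, since the entries of the slide's matrix did not survive in the transcript, and the function name is hypothetical.

```python
import math


def transition_probabilities(q01, q10, t):
    """Two-state continuous-time Markov chain.

    q01: rate of 0 -> 1 (non-cognate to cognate), q10: rate of 1 -> 0.
    Returns the 2x2 matrix P(t) = exp(Q t), using the closed form for
    Q = [[-q01, q01], [q10, -q10]].
    """
    total = q01 + q10
    decay = math.exp(-total * t)
    p01 = q01 / total * (1.0 - decay)
    p10 = q10 / total * (1.0 - decay)
    return [[1.0 - p01, p01], [p10, 1.0 - p10]]


# Placeholder rates per 1000 years (assumed values, not from the slides).
P = transition_probabilities(q01=0.1, q10=0.3, t=1.0)
for row in P:
    print([round(p, 4) for p in row])
```

Each row of P(t) sums to 1. At t = 0 the matrix is the identity, and for large t both rows approach the stationary distribution (q10, q01) / (q01 + q10).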

Results = Data + Method

Most probable tree. Random tree. -58204 log units. 4.1 × 10^14107. Infinite number of poor trees.

Outgroup, Greek, Indo-Iranian, Slavic, Germanic, Celtic, Romance

Name: 3 cognate classes

Class A: Gypsy (Alav), Persian (Esm)

Class B: Latvian (Vards), Lithuanian (Vardas)

Class C: all the rest, e.g. Hindi (Nam), Greek (Onoma), Italian (Nome)

Transitions between classes: B to A, A to B, C to A, A to C, B to C, C to B, etc.

Estimating an instantaneous transition rate for each: too many parameters, not enough data.
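To see why the parameter count becomes a problem, note that if every ordered pair of cognate classes is given its own transition rate (an assumption about the model being criticised here), k classes need k(k - 1) rates. A short illustrative sketch, with a hypothetical helper name:

```python
from itertools import permutations


def directed_transitions(classes):
    """List every ordered pair of distinct classes, one rate per pair."""
    return [f"{a}->{b}" for a, b in permutations(classes, 2)]


rates = directed_transitions(["A", "B", "C"])
print(len(rates), rates)

# The count grows quadratically with the number of classes.
for k in (2, 3, 10):
    print(k, "classes need", k * (k - 1), "rates")
```

With only one observed word per language for each meaning, even the six rates for three classes are hard to estimate, whereas a two-class coding needs only two.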

2 cognate classes: Class 1, Class 2. Slow rate, fast rate.

Red Salt Five

Mean rates for the 200 words: mean = 3.05 ± 1.82, median = 2.74, min. = 0.09, max. = 9.27. 100-fold difference.

Slow: two, who, one, night, to die

Fast: dirty, to turn, to stab

Word half-life: 50% chance of the word being replaced by a non-cognate form

| Half-life | Years |
|---|---|
| Mean | 5260 |
| Median | 2530 |
| Min | 750 |
| Max | 76530 |

Based on IE being 8000 years old
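These half-lives are consistent with exponential decay at the replacement rates given for the 200 words, if those rates are read as expected replacements per 10,000 years (an assumption, based on the "per 10k years" label on the I-E tree slide). Under that reading the half-life is ln 2 divided by the rate. A minimal sketch, with a hypothetical function name:

```python
import math

YEARS_PER_UNIT = 10_000  # assumes rates are expressed per 10,000 years


def half_life_years(rate):
    """Years until there is a 50% chance the word has been replaced."""
    return math.log(2) / rate * YEARS_PER_UNIT


for label, rate in [("median", 2.74), ("fastest", 9.27), ("slowest", 0.09)]:
    print(f"{label}: rate {rate} -> about {half_life_years(rate):.0f} years")
```

The median rate of 2.74 gives roughly 2530 years, matching the slide. The rounded extreme rates land close to, but not exactly on, the slide's 750 and 76530 years. The mean half-life (5260) is an average over per-word half-lives, so it is not expected to equal ln 2 divided by the mean rate.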

I-E trees showing variation in rates of lexical replacement, per 10k years: One 0.43, Ear 0.88, Sand 4.5. [Figure: clades labelled Romance, Germanic, Greek, Slavic, Indo-Iranian]

Spoken word frequency, British National Corpus: N = 4840 words, mean = 194, geometric mean = 35.94, median = 25

Distribution of frequency of word use (20-100 million words). Most words are used < 100 times per million.

Frequency of use is very stable throughout IE: r = 0.87, r = 0.88, r = 0.87

Frequency vs. rate of lexical evolution: r = -0.37, r = -0.35, r = -0.41, r = -0.32

Parts of speech: conjunctions, prepositions, adjectives, verbs, nouns, special adverbs, pronouns, numbers. R² = 0.50, R² = 0.48. Numbers, pronouns, special adverbs: stronger selection?

Some similarities between linguistic and genetic systems
