Jing-Shin Chang: Morphology & Finite-State Transducers


1 Morphology & Finite-State Transducers
Morphology: the study of the constituents of words
  - Word = {a set of morphemes, combined in language-dependent ways}
  - morpheme: minimal meaning-bearing unit
  - e.g., books = book + s, cats = cat + s
Classes of Morphemes
  - stem (root)
  - affixes
Morphological Parsing (or Analysis):
  - breaking down surface forms (or input forms) into stem and affixes
  - e.g., foxes = "fox" + "-es" (+N, +PL)
  - stemming: mapping a surface form to its stem (extracting the stem from the surface form)
Morphological Generation:
  - generating surface forms from a stem and morphological features
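The parsing and stemming steps on this slide can be sketched in a few lines of Python. This is only an illustration, not the method used in the lecture: the lexicon `NOUN_STEMS` and the function names `parse_noun` and `stem` are hypothetical, and the sketch ignores spelling rules, so it would also accept a misspelling such as "bookes".

```python
# Hypothetical toy lexicon of noun stems (an assumption, not from the slides).
NOUN_STEMS = {"fox", "book", "cat"}

def parse_noun(surface):
    """Return a list of (stem, features) analyses for a surface noun form."""
    analyses = []
    # A bare stem is analysed as a singular noun.
    if surface in NOUN_STEMS:
        analyses.append((surface, ["+N", "+SG"]))
    # Try the longer suffix first so that "foxes" is split as fox + -es.
    for suffix in ("es", "s"):
        if surface.endswith(suffix):
            candidate = surface[: -len(suffix)]
            if candidate in NOUN_STEMS:
                analyses.append((candidate, ["+N", "+PL"]))
    return analyses

def stem(surface):
    """Stemming: keep only the stem of the first analysis, if any."""
    analyses = parse_noun(surface)
    return analyses[0][0] if analyses else surface

print(parse_noun("foxes"))
print(stem("cats"))
```

Parsing returns the stem together with its morphological features, while stemming discards the features and keeps only the stem.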

2 Morphology & Finite-State Transducers
Applications:
  - spell checking, tokenization for parsing
Knowledge for Morphological Analysis
  - morphological rules (morphotactics): constituents of words and their order
  - spelling rules (orthographic rules): spelling changes
Dictionary/Lexicon:
  - list of stems and affixes
  - stems of regular words (plus irregular variants) as indexing keys
  - not efficient to enumerate all morphological variants
  - some morphemes are productive: they can be applied to all words or to new words (impossible to list all of them)
  - morphological variants depend on spelling as well as pronunciation
  - morphologically complex languages (e.g., Turkish) may have a large number of morphological variants
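One way to read the lexicon design above: store regular stems and irregular variants as keys, and derive regular variants by rule instead of listing them. The sketch below assumes a tiny made-up lexicon; `STEMS`, `IRREGULAR`, and `lookup` are hypothetical names, and the single "+s" rule stands in for the full set of morphological and spelling rules.

```python
# Hypothetical lexicon: regular stems with their part of speech.
STEMS = {"cat": "N", "fox": "N", "mouse": "N", "ox": "N"}
# Irregular variants are listed explicitly and point back to their stem.
IRREGULAR = {"mice": ("mouse", "+PL"), "oxen": ("ox", "+PL")}

def lookup(surface):
    """Return (stem, pos, feature) for a surface form, or None if unknown."""
    # 1. Irregular variants are indexed directly.
    if surface in IRREGULAR:
        base, feature = IRREGULAR[surface]
        return (base, STEMS[base], feature)
    # 2. Bare stems are indexed directly as well.
    if surface in STEMS:
        return (surface, STEMS[surface], "+SG")
    # 3. Regular variants are not stored; a productive rule recovers them.
    if surface.endswith("s") and surface[:-1] in STEMS:
        return (surface[:-1], STEMS[surface[:-1]], "+PL")
    return None

print(lookup("mice"))
print(lookup("cats"))
```

Because the plural rule is productive, adding a new stem to `STEMS` makes its regular plural analysable without any further entries.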

3 Morphology & Finite-State Transducers
Models for morphological analysis/generation
  - generate-and-test: enumerate all possibilities and test them against constraints
  - FSA / two-level FST model: modeling the lexicon, morphological rules and orthographic rules as finite-state automata or transducers
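A finite-state transducer can be sketched as a transition table that maps each lexical symbol to a surface string while moving between states. The example below is a simplified assumption, not the two-level model itself: whole stems are treated as single symbols, the state names are invented, and no orthographic rules (such as e-insertion in "foxes") are modeled.

```python
# Transitions: (state, lexical symbol) -> (surface output, next state).
# States and symbols are illustrative assumptions.
TRANSITIONS = {
    ("start", "cat"): ("cat", "stem"),
    ("start", "book"): ("book", "stem"),
    ("stem", "+N"): ("", "noun"),
    ("noun", "+SG"): ("", "final"),
    ("noun", "+PL"): ("s", "final"),
}
FINAL_STATES = {"final"}

def transduce(symbols):
    """Map a lexical symbol sequence to a surface string, or None if rejected."""
    state = "start"
    outputs = []
    for symbol in symbols:
        key = (state, symbol)
        if key not in TRANSITIONS:
            return None  # no transition: the morphotactics reject this order
        output, state = TRANSITIONS[key]
        outputs.append(output)
    # Accept only if the machine stops in a final state.
    return "".join(outputs) if state in FINAL_STATES else None

print(transduce(["cat", "+N", "+PL"]))
print(transduce(["cat", "+PL"]))
```

The same table encodes both the allowed morpheme order (which transitions exist) and the lexical-to-surface mapping (what each transition outputs); an incomplete or misordered input is rejected.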

4 English Morphology
Morphology:
  - the study of the way words are built up from smaller meaning-bearing units (morphemes)
  - morpheme: the minimal meaning-bearing unit in a language
Classes of Morphemes
  - stem (root): main morpheme of the word, supplying the main meaning
  - affixes: add additional meanings
Affixes:
  - prefixes: un-happy
  - suffixes: eat-s
  - infixes: inserted inside the stem
      - Philippine language Tagalog: hingi ("borrow") => h-um-ingi (agent of borrow)
  - circumfixes:
      - German: sagen ("to say") => ge-sag-t ("said") [past participle]

5 English Morphology
Affixes:
  - concatenative: prefixes and suffixes
  - non-concatenative: infixes and templatic morphology
Templatic: root-and-pattern
  - Semitic languages (e.g., Arabic, Hebrew)
  - Hebrew: lmd ("learn", "study") (tri-consonantal root)
  - active voice template CaCaC => lamad ('he studied')
  - intensive template CiCeC => limed ('he taught')
  - intensive passive template CuCaC => lumad ('he was taught')
Multiple affixes: un-believabl-y
Agglutinative languages:
  - languages that tend to string affixes together (Turkish, Japanese, Korean)
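The root-and-pattern idea can be illustrated by filling the C slots of a template with the consonants of the root, in order. The function name `apply_template` is a hypothetical choice for this sketch; the root and templates are the Hebrew examples from the slide.

```python
def apply_template(root, template):
    """Fill each 'C' slot in the template with the next root consonant."""
    if template.count("C") != len(root):
        raise ValueError("template slots and root consonants do not match")
    consonants = iter(root)
    # Vowels in the template are copied as-is; 'C' consumes one consonant.
    return "".join(next(consonants) if ch == "C" else ch for ch in template)

for template in ("CaCaC", "CiCeC", "CuCaC"):
    print(template, "=>", apply_template("lmd", template))
```

Unlike prefixing or suffixing, the root and the pattern are interleaved, which is why this kind of morphology is called non-concatenative.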

6 English Morphology
Inflection:
  - stem + morphemes => same class
  - e.g., book + s => books (same meaning, same part of speech)
Derivation:
  - stem + morphemes => different class
  - e.g., computerize + ation => computerization [verb => noun]

7 English Morphology
Inflectional Morphology
  - only nouns, verbs, adjectives and adverbs can be inflected
Noun: plural, possessive
  - Regular: plural (+s / +es / +ies), possessive (+'s, +s')
  - Irregular: ox => ox-en, mouse => mice
Verb (main/ordinary, modal/auxiliary, primary/be):
  - Forms: stem (present / infinitive), -s (present, 3rd person singular), -ing (gerund / present participle), -ed (past / past participle / perfect)
  - Regular: +s / +es / -y+ies; -e+ing / +ing / +.ing (consonant doubling); +d / +ed / +.ed
  - Irregular: e.g., eat => ate, eaten (+en); catch => caught
  - Consonant doubling: (short vowel) + single consonant => doubled consonant
  - -c => -ck (picnicked)
Adjective/Adverb: comparative/superlative
  - happy => happier, happiest, happily
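The regular patterns listed above can be approximated with a few string rules. This is a rough heuristic sketch, not a complete account of English spelling: the function names `plural` and `ing_form` are hypothetical, irregular forms are not handled, and the consonant-doubling test looks only at letters, so it is reliable mainly for one-syllable verbs (it would wrongly double the "t" in "visit").

```python
VOWELS = "aeiou"

def plural(noun):
    """Regular noun plural: +es after sibilants, -y+ies, otherwise +s."""
    if noun.endswith(("s", "x", "z", "ch", "sh")):
        return noun + "es"
    if noun.endswith("y") and len(noun) > 1 and noun[-2] not in VOWELS:
        return noun[:-1] + "ies"
    return noun + "s"

def ing_form(verb):
    """Regular -ing form: -c => -ck, drop silent e, or double the consonant."""
    if verb.endswith("c"):
        return verb + "king"  # picnic => picnicking
    if verb.endswith("e") and not verb.endswith(("ee", "ie", "oe", "ye")):
        return verb[:-1] + "ing"  # make => making
    # Letter-based stand-in for "short vowel + single consonant".
    if (len(verb) >= 3 and verb[-1] not in VOWELS + "wxy"
            and verb[-2] in VOWELS and verb[-3] not in VOWELS):
        return verb + verb[-1] + "ing"  # stop => stopping
    return verb + "ing"

for noun in ("cat", "fox", "city", "boy"):
    print(noun, "=>", plural(noun))
for verb in ("make", "stop", "eat", "picnic"):
    print(verb, "=>", ing_form(verb))
```

Irregular forms such as mice or caught would have to come from the lexicon, since no spelling rule predicts them.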

8 English Morphology
Derivational Morphology
  - usually results in a different class
  - needs part-of-speech (POS) conversion from the root POS and affixes to get the correct POS
Nominalization: V/A => N
  - computerize => computerization
  - more examples …
N/V => A
  - computation => computational
  - more examples …
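The POS conversion step can be sketched as a table from each derivational suffix to the class it attaches to and the class it produces. The table `DERIVATIONS` and the function `derive` are hypothetical, the three suffixes are only a small sample, and the single e-dropping rule is an assumed simplification of the spelling changes involved.

```python
# Hypothetical table: suffix -> (POS the suffix attaches to, resulting POS).
DERIVATIONS = {
    "ation": ("V", "N"),  # computerize (V) => computerization (N)
    "al": ("N", "A"),     # computation (N) => computational (A)
    "ize": ("N", "V"),    # computer (N) => computerize (V)
}

def derive(stem, pos, suffix):
    """Attach a derivational suffix and return (new word, new POS)."""
    required_pos, result_pos = DERIVATIONS[suffix]
    if pos != required_pos:
        raise ValueError(f"-{suffix} attaches to {required_pos}, not {pos}")
    # Simplified spelling rule: drop a final "e" before a vowel-initial suffix.
    if stem.endswith("e") and suffix[0] in "aeiou":
        stem = stem[:-1]
    return stem + suffix, result_pos

print(derive("computerize", "V", "ation"))
print(derive("computation", "N", "al"))
```

Chaining calls models multiple derivations: the output POS of one step becomes the input POS of the next.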

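9 Chinese Morphology
Chinese Morphemes
  - hard to distinguish from characters, words and compound words
  - free morphemes
  - bound morphemes
Examples
  - 副-總統 (vice-president), 前-妻 (ex-wife), 非-經濟 (因素) (non-economic (factors))
  - 學生-們 (students; plural marker)
  - 哈日-族 (Japanophiles), 銀髮-族 (the silver-haired, i.e., seniors)
  - 工業-化 (industrialize), 綠-化 (make green), 藍-化 (turn blue), 腐-化 (corrupt, decay), 石-化 (petrify; petrochemical), 神-化 (deify)
  - 公務-員 (civil servant), 業務-員 (sales representative), 推銷-員 (salesperson), 運動-員 (athlete)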
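The bound morphemes in these examples can be separated from their bases with a simple prefix and suffix match. This is only a sketch under strong assumptions: the lists `PREFIXES` and `SUFFIXES` contain just the affixes from the examples above, `split_affix` is a hypothetical name, and, because Chinese morphemes are hard to tell apart from characters and words, a matcher like this will over-segment ordinary words that happen to begin or end with the same character.

```python
# Bound morphemes taken from the slide's examples (illustrative only).
PREFIXES = ("副", "前", "非")
SUFFIXES = ("們", "族", "化", "員")

def split_affix(word):
    """Return (prefix, base, suffix); unused slots are empty strings."""
    if len(word) > 1 and word.startswith(PREFIXES):
        return (word[0], word[1:], "")
    if len(word) > 1 and word.endswith(SUFFIXES):
        return ("", word[:-1], word[-1])
    return ("", word, "")

for word in ("副總統", "工業化", "運動員", "學生"):
    print(word, "=>", split_affix(word))
```

A practical analyser would also check the remaining base against a lexicon before accepting the split.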
