CSA3050: Natural Language Algorithms. Morphological Parsing (CSA3050 NLP Algorithms, October 2004)

Presentation transcript:

Slide 1: CSA3050: Natural Language Algorithms. Morphological Parsing

Slide 2: Morphology
Morphemes: the smallest units in a word that bear some meaning, such as rabbit and s, are called morphemes.
Morphology: the combination of morphemes to form words that are legal in some language.
Two kinds of morphology:
– Inflectional
– Derivational

Slide 3: Inflectional/Derivational Morphology
Inflectional:
– +s plural, +ed past
– category preserving
– productive: always applies (esp. new words, e.g. fax)
– systematic: same semantic effect
Derivational:
– +ment
– category changing: escape+ment
– not completely productive: detractment*
– not completely systematic: apartment

Slide 4: Noun Inflections
            Regular            Irregular
Singular    cat    church      mouse    ox
Plural      cats   churches    mice     oxen

Slide 5: Morphological Parsing
Input word: cats → Morphological Parser → Output analysis: cat N PL
The output is a string of morphemes.
Reversibility?

Slide 6: Morphological Parsing
The goal of morphological parsing is to find out what morphemes a given word is built from.
mouse → mouse N SG
mice → mouse N PL
foxes → fox N PL

Slide 7: 2 Steps
1. Split the word up into its possible components, using + to indicate possible morpheme boundaries.
   cats → cat + s
   foxes → fox + s
   foxes → foxe + s
2. Look up the categories of the stems and the meaning of the affixes, using a lexicon of stems and affixes.
   cat + s → cat + N + PL
   fox + s → fox + N + PL
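The two steps can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the lexicon contents, the function names (`split_candidates`, `lookup`, `parse`) and the restriction to the plural +s are hypothetical and not part of the slides.

```python
# Hypothetical toy lexicon: stem -> category (only the slides' cat and fox).
STEMS = {"cat": "N", "fox": "N"}
# Hypothetical affix table: only the plural +s is modelled.
AFFIXES = {"s": "PL"}


def split_candidates(word):
    """Step 1: propose every stem + s split, without consulting the lexicon."""
    candidates = []
    if word.endswith("s"):
        candidates.append((word[:-1], "s"))   # cats -> cat + s, foxes -> foxe + s
    if word.endswith("es"):
        candidates.append((word[:-2], "s"))   # foxes -> fox + s
    return candidates


def lookup(candidates):
    """Step 2: keep the splits whose stem and affix are both in the lexicon."""
    analyses = []
    for stem, affix in candidates:
        if stem in STEMS and affix in AFFIXES:
            analyses.append(f"{stem} + {STEMS[stem]} + {AFFIXES[affix]}")
    return analyses


def parse(word):
    """Run step 1 and then step 2."""
    return lookup(split_candidates(word))
```

Here `split_candidates("foxes")` proposes both foxe + s and fox + s, and the lexicon lookup in step 2 is what discards the spurious stem foxe.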

Slide 8: Step 1: Surface → Intermediate FST

Slide 9: Step 1: Surface → Intermediate Operation

Slide 10: 2. Intermediate → Morphemes
Possible inputs to the transducer are:
– Regular noun stem: cat
– Regular noun stem + s: cat+s
– Singular irregular noun stem: mouse
– Plural irregular noun stem: mice

Slide 11: 2. Intermediate → Morphemes Transducer

Slide 12: Handling Stems
cat/cat
mice/mouse

Slide 13: Completed Stage 2

Slide 14: Joining Stages 1 and 2
If the two transducers run in a cascade (i.e. we let the second transducer run on the output of the first one), we can do a morphological parse of (some) English noun phrases.
We can also change the direction of translation (in translation mode), so the same transducer can be used to generate a surface form from an underlying form.
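One way to picture the cascade and the change of direction is to treat each stage as a finite relation, i.e. a set of (input, output) string pairs. The sketch below is an illustration only: the pairs are a hypothetical fragment built from the slides' examples, and the names `compose`, `invert` and `outputs` are assumptions, not the course's code.

```python
# Each stage as a finite relation: a set of (input, output) pairs.
# Hypothetical fragment covering only the slides' example words.
STAGE1 = {("cats", "cat+s"), ("foxes", "fox+s"), ("foxes", "foxe+s"), ("mice", "mice")}
STAGE2 = {("cat+s", "cat N PL"), ("fox+s", "fox N PL"), ("mice", "mouse N PL")}


def compose(first, second):
    """Cascade: run `second` on every output of `first`."""
    return {(a, c) for (a, b) in first for (b2, c) in second if b == b2}


def invert(relation):
    """Swap the two sides, turning a parser into a generator."""
    return {(b, a) for (a, b) in relation}


def outputs(relation, word):
    """All strings the relation pairs with `word`, sorted for stable display."""
    return sorted(out for (inp, out) in relation if inp == word)


PARSER = compose(STAGE1, STAGE2)   # surface form -> morphemes
GENERATOR = invert(PARSER)         # morphemes -> surface form
```

With these definitions `outputs(PARSER, "foxes")` yields only the fox analysis, because the candidate foxe+s has no continuation in the second stage, and `outputs(GENERATOR, "mouse N PL")` recovers mice.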

Slide 15: Prolog
The transducer specifications we have seen translate easily into Prolog format, except for the other transition. (The symbols # and + are quoted so that Prolog reads them as separate atoms.)
arc(1,3,z:z).
arc(1,3,s:s).
arc(1,3,x:x).
arc(1,2,'#':'+').
arc(1,3, ).

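Slide 16: Handling other arcs
Prolog tries the clauses in order, and each cut commits to the first clause that matches, so the final catch-all clause plays the role of the other transition.
arc(1,3,z:z) :- !.
arc(1,3,s:s) :- !.
arc(1,3,x:x) :- !.
arc(1,2,'#':'+') :- !.
arc(1,3,X:X) :- !.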
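The same "specific arcs first, other arc as fallback" idea can be sketched in Python with a dictionary lookup that defaults to a catch-all entry. The table mirrors the five arcs leaving state 1 listed above; the names `OTHER`, `ARCS_FROM_1` and `step` are hypothetical.

```python
OTHER = object()  # sentinel standing in for the "other" label

# Arcs leaving state 1, keyed by input symbol: (next_state, output_symbol).
# An output of None means "copy the input symbol", like X:X in the Prolog clause.
ARCS_FROM_1 = {
    "z": (3, "z"),
    "s": (3, "s"),
    "x": (3, "x"),
    "#": (2, "+"),
    OTHER: (3, None),
}


def step(arcs, symbol):
    """Follow the arc for `symbol`; use the OTHER arc if no specific arc exists."""
    next_state, output = arcs.get(symbol, arcs[OTHER])
    return next_state, (symbol if output is None else output)
```

For example, `step(ARCS_FROM_1, "#")` takes the specific arc to state 2 and outputs +, while `step(ARCS_FROM_1, "a")` falls through to the other arc and copies a unchanged.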

Slide 17: Combining Rules
Consider the word “berries”. Two rules are involved:
– berry + s
– y → ie under certain circumstances
Combinations of such rules can be handled in two ways:
– Cascade, i.e. sequentially
– Parallel
Algorithms exist for combining transducers together in series or in parallel. Such algorithms involve computations over regular relations.
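A cascade for "berries" might look like the sketch below, which applies the spelling rule to the intermediate form berry+s and then deletes the boundary marker. The slides do not spell out the "certain circumstances", so the context used here (y after a consonant and directly before +s) is an assumption, as are the function names.

```python
import re

VOWELS = "aeiou"


def y_to_ie(intermediate):
    """Spelling rule (assumed context): y -> ie after a non-vowel, before +s."""
    return re.sub(rf"(?<=[^{VOWELS}])y(?=\+s$)", "ie", intermediate)


def remove_boundary(intermediate):
    """Delete the morpheme boundary marker to obtain the surface form."""
    return intermediate.replace("+", "")


def generate(intermediate):
    """Cascade: run the spelling rule first, then remove the boundary."""
    return remove_boundary(y_to_ie(intermediate))
```

Under these assumptions `generate("berry+s")` passes through berrie+s on its way to the surface form, whereas an input such as boy+s is left alone by the spelling rule because its y follows a vowel.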

Slide 18: 3 Related Frameworks
– Regular languages
– Regular expressions
– FSA

Slide 19: Regular Relations
– Regular relations
– Augmented regular expressions
– Finite state transducers

Slide 20: Putting it all together
Execution of the FSTi takes place in parallel.

Slide 21: Kaplan and Kay; The Xerox View
– FSTi are aligned but separate
– FSTi intersected together

Slide 22: Summary
– Morphological processing can be handled by finite state machinery.
– Finite State Transducers are formally very similar to Finite State Automata.
– They are formally equivalent to regular relations, i.e. sets of pairings of sentences of regular languages.