CSA3050: Natural Language Algorithms. Morphological Parsing. (CSA3050 NLP Algorithms, October 2004)


1 CSA3050: Natural Language Algorithms: Morphological Parsing

2 Morphology
Morphemes: the smallest units in a word that bear some meaning, such as rabbit and s, are called morphemes.
Morphology: the combination of morphemes to form words that are legal in some language.
Two kinds of morphology
–Inflectional
–Derivational

3 Inflectional/Derivational Morphology
Inflectional
–+s plural, +ed past
–category preserving
–productive: always applies (esp. new words, e.g. fax)
–systematic: same semantic effect
Derivational
–+ment
–category changing: escape+ment
–not completely productive: detractment*
–not completely systematic: apartment

4 Noun Inflections
           Regular              Irregular
Singular   cat     church       mouse   ox
Plural     cats    churches     mice    oxen

5 Morphological Parsing
Input Word: cats → Morphological Parser → Output Analysis: cat N PL
Output is a string of morphemes.
Reversibility?

6 Morphological Parsing
The goal of morphological parsing is to find out what morphemes a given word is built from.
mouse → mouse N SG
mice → mouse N PL
foxes → fox N PL

7 2 Steps
1. Split the word up into its possible components, using + to indicate possible morpheme boundaries.
cats → cat + s
foxes → fox + s
foxes → foxe + s
2. Look up the categories of the stems and the meaning of the affixes, using a lexicon of stems and affixes.
cat + s → cat + N + PL
fox + s → fox + N + PL
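
The two steps above can be sketched in Python as follows. This is a minimal illustration, not the course's implementation: the lexicon entries, the function names, and the SG tag for bare stems are assumptions made for the example. Step 1 deliberately overgenerates (foxes yields both fox+s and foxe+s) and step 2 discards the candidates whose stem is not in the lexicon.

```python
# Hypothetical toy lexicon; the entries are illustrative assumptions.
STEMS = {"cat": "N", "fox": "N"}
AFFIXES = {"s": "PL"}


def split_candidates(word):
    """Step 1: propose intermediate forms, marking morpheme boundaries with +."""
    candidates = [word]                      # the word may be a bare stem
    if word.endswith("es"):
        candidates.append(word[:-2] + "+s")  # foxes -> fox+s (inserted e dropped)
    if word.endswith("s"):
        candidates.append(word[:-1] + "+s")  # cats -> cat+s, foxes -> foxe+s
    return candidates


def lookup(candidate):
    """Step 2: replace stem and affix by their lexicon entries, or return None."""
    stem, _, affix = candidate.partition("+")
    if stem not in STEMS:
        return None
    if affix == "":
        return f"{stem} + {STEMS[stem]} + SG"  # SG for bare stems is an assumption
    if affix in AFFIXES:
        return f"{stem} + {STEMS[stem]} + {AFFIXES[affix]}"
    return None


def parse(word):
    """Run step 1, then keep only the candidates that step 2 accepts."""
    return [a for a in map(lookup, split_candidates(word)) if a is not None]


print(parse("cats"))
print(parse("foxes"))
```

Because foxe is not in the lexicon, only the fox+s candidate survives for foxes.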

8 Step 1: Surface → Intermediate FST

9 Step 1: Surface → Intermediate Operation

10 2. Intermediate → Morphemes
Possible inputs to the transducer are:
Regular noun stem: cat
Regular noun stem + s: cat+s
Singular irregular noun stem: mouse
Plural irregular noun stem: mice
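
The four input classes listed above could be handled roughly as in the sketch below. It uses plain dictionary lookups rather than a real transducer, and the stem sets and the name to_morphemes are hypothetical, chosen only to cover the slide's examples.

```python
# Hypothetical stem lists covering only the examples on the slides.
REGULAR_STEMS = {"cat", "fox"}
IRREGULAR_SINGULAR = {"mouse"}
IRREGULAR_PLURAL = {"mice": "mouse"}  # plural form -> underlying stem


def to_morphemes(intermediate):
    """Map an intermediate form such as cat+s to a morpheme string, or None."""
    stem, boundary, suffix = intermediate.partition("+")
    if stem in REGULAR_STEMS:
        if not boundary:
            return f"{stem} N SG"            # regular noun stem
        if suffix == "s":
            return f"{stem} N PL"            # regular noun stem + s
        return None
    if boundary:
        return None                          # irregular stems take no +s here
    if stem in IRREGULAR_SINGULAR:
        return f"{stem} N SG"                # singular irregular noun stem
    if stem in IRREGULAR_PLURAL:
        return f"{IRREGULAR_PLURAL[stem]} N PL"  # plural irregular noun stem
    return None


for form in ["cat", "cat+s", "mouse", "mice"]:
    print(form, "->", to_morphemes(form))
```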

11 2. Intermediate → Morphemes Transducer

12 Handling Stems
cat/cat
mice/mouse

13 Completed Stage 2

14 Joining Stages 1 and 2
If the two transducers run in a cascade (i.e. we let the second transducer run on the output of the first one), we can do a morphological parse of (some) English nouns.
We can also change the direction of translation (in translation mode).
This transducer can also be used for generating a surface form from an underlying form.
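
A rough way to see both the cascade and the reversibility is to treat each stage as a finite set of string pairs. The sketch below is an illustration under that assumption: the pairs listed and the helper names compose, invert, and translate are not from the slides, and a real transducer would represent the relation with states and arcs instead of enumerating pairs.

```python
# Each stage as a finite relation: a set of (input, output) pairs.
# The pairs are illustrative assumptions covering a handful of nouns.
STAGE1 = {  # surface -> intermediate
    ("cat", "cat"), ("cats", "cat+s"),
    ("fox", "fox"), ("foxes", "fox+s"),
    ("mouse", "mouse"), ("mice", "mice"),
}
STAGE2 = {  # intermediate -> morphemes
    ("cat", "cat N SG"), ("cat+s", "cat N PL"),
    ("fox", "fox N SG"), ("fox+s", "fox N PL"),
    ("mouse", "mouse N SG"), ("mice", "mouse N PL"),
}


def compose(r1, r2):
    """Cascade: feed every output of r1 into r2."""
    return {(a, c) for (a, b) in r1 for (b2, c) in r2 if b == b2}


def invert(relation):
    """Swap the two sides to change the direction of translation."""
    return {(b, a) for (a, b) in relation}


def translate(relation, item):
    """Return every string the relation pairs with item."""
    return sorted(b for (a, b) in relation if a == item)


PARSER = compose(STAGE1, STAGE2)   # surface -> morphemes
GENERATOR = invert(PARSER)         # morphemes -> surface

print(translate(PARSER, "foxes"))
print(translate(GENERATOR, "mouse N PL"))
```

Inverting the composed relation gives the generator at no extra cost, which is the sense in which the same machinery serves both parsing and generation.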

15 Prolog
The transducer specifications we have seen translate easily into Prolog format, except for the other transition.
arc(1,3,z:z).
arc(1,3,s:s).
arc(1,3,x:x).
arc(1,2,'#':'+').   % # and + are quoted so that #:+ is not read as a single atom
arc(1,3, ).         % the other transition, which has no direct Prolog counterpart

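16 Handling other arcs
arc(1,3,z:z) :- !.
arc(1,3,s:s) :- !.
arc(1,3,x:x) :- !.
arc(1,2,'#':'+') :- !.
arc(1,3,X:X) :- !.
When a query matches an earlier clause, the cut stops Prolog from also trying the final catch-all clause X:X, which plays the role of the other arc.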
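
The same idea, explicit arcs first and a catch-all afterwards, can be sketched outside Prolog. The Python below is only an illustration: the table layout and the names EXPLICIT_ARCS, OTHER_TARGET, and step are assumptions, and reading the first symbol of each pair as the input side is also an assumption.

```python
# Explicit arcs out of state 1, mirroring the Prolog facts:
# (state, first symbol of the pair) -> (next state, second symbol of the pair)
EXPLICIT_ARCS = {
    (1, "z"): (3, "z"),
    (1, "s"): (3, "s"),
    (1, "x"): (3, "x"),
    (1, "#"): (2, "+"),
}

# States that have an "other" arc, with its target state.
OTHER_TARGET = {1: 3}


def step(state, symbol):
    """Follow an explicit arc if one exists, otherwise fall back to "other"."""
    if (state, symbol) in EXPLICIT_ARCS:
        return EXPLICIT_ARCS[(state, symbol)]
    if state in OTHER_TARGET:
        # "other" copies any symbol that has no explicit arc from this state.
        return OTHER_TARGET[state], symbol
    return None  # no transition possible


print(step(1, "#"))
print(step(1, "a"))
```

Checking the explicit table before the fallback plays the same role as clause order plus cut in the Prolog version.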

17 Combining Rules
Consider the word “berries”. Two rules are involved
–berry + s
–y → ie under certain circumstances.
Combinations of such rules can be handled in two ways
–Cascade, i.e. sequentially
–Parallel
Algorithms exist for combining transducers together in series or in parallel. Such algorithms involve computations over regular relations.
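
The cascade option can be illustrated for “berries” with ordinary string rewriting. This is a sketch, not a transducer implementation: the function names are hypothetical, and the condition assumed for the y → ie rule (y after a consonant, directly before the +s boundary) is a simplification of the "certain circumstances" the slide leaves open.

```python
import re


def add_plural(stem):
    """Rule 1: attach the plural morpheme with an explicit boundary."""
    return stem + "+s"


def y_to_ie(form):
    """Rule 2: rewrite y as ie, assuming it follows a consonant and precedes +s."""
    return re.sub(r"(?<=[^aeiou])y(?=\+s$)", "ie", form)


def remove_boundary(form):
    """Final step: delete the boundary symbol to get the surface form."""
    return form.replace("+", "")


def generate_plural(stem):
    """Cascade: each rule runs on the output of the previous one."""
    return remove_boundary(y_to_ie(add_plural(stem)))


for stem in ["berry", "boy", "cat"]:
    print(stem, "->", generate_plural(stem))
```

In the parallel arrangement the rules would instead all inspect the same pair of strings at once; that variant is not shown here.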

18 3 Related Frameworks
REGULAR LANGUAGES, REGULAR EXPRESSIONS, FSA

19 Regular Relations
REGULAR RELATIONS, AUGMENTED REGULAR EXPRESSIONS, FINITE STATE TRANSDUCERS

20 Putting it all together
Execution of the FSTi takes place in parallel.

21 Kaplan and Kay; The Xerox View
FSTi are aligned but separate
FSTi intersected together

22 Summary
Morphological processing can be handled by finite state machinery.
Finite State Transducers are formally very similar to Finite State Automata.
They are formally equivalent to regular relations, i.e. sets of pairings of sentences of regular languages.

