Natural Language Processing - English Grammar -


Presentation on theme: "Natural Language Processing - English Grammar -"— Presentation transcript:

1 74.406 Natural Language Processing - English Grammar -
(Mostly) English Grammar
  Morphology, Word Classes, POS Tagging
  Grammar Extensions on the Sentence and Phrase Level
    Sentence Level Constructs
    Noun Phrase - Modifications
    Verb Phrase - Subcategorization
(Jurafsky, Ch. 3, 6.1, 8 and 9; Allen Ch. 2)

2 Morphology

3 Basics of Morphology
Morpheme = "minimal meaning-bearing unit in a language", e.g. cats = cat + -s
Non-Concatenative Morphology
  templatic morphology: modify word templates
  Hebrew: lmd (study, learn) - limed ("he taught") - lumad ("he was taught")
Concatenative Morphology
  word stem + prefix + suffix (+ infix + circumfix)
Inflectional Morphology
  word stem + grammatical morpheme; same word class; e.g. cat+s
Derivational Morphology
  word stem + grammatical morpheme; other word class; e.g. mob+b+ing

4 Inflectional Morphology
word stem + grammatical morpheme, e.g. cat+s
only for nouns, verbs, and some adjectives
Nouns
  plural:
    regular: +s, +es
    irregular: mouse - mice; ox - oxen
    rules for exceptions: e.g. -y -> -ies, as in butterfly - butterflies
  possessive: +'s, +'
Verbs
  main verbs (sleep, eat, walk)
  modal verbs (can, will, should)
  primary verbs (be, have, do)
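The noun plural rules on this slide can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `pluralize`, the `IRREGULAR` table, and the exact set of sibilant endings are not from the slides, and real English has many more exceptions.

```python
# Toy sketch: only the plural rules named on the slide are covered.
IRREGULAR = {"mouse": "mice", "ox": "oxen"}  # assumed mini-lexicon of irregular plurals
VOWELS = "aeiou"


def pluralize(noun: str) -> str:
    """Return the plural of an English noun using a handful of toy rules."""
    if noun in IRREGULAR:
        # Irregular forms are listed in the lexicon, not derived by rule.
        return IRREGULAR[noun]
    if noun.endswith("y") and len(noun) > 1 and noun[-2] not in VOWELS:
        # Exception rule: consonant + y -> ies (butterfly -> butterflies).
        return noun[:-1] + "ies"
    if noun.endswith(("s", "x", "z", "ch", "sh")):
        # Regular +es after a sibilant ending.
        return noun + "es"
    # Regular +s.
    return noun + "s"


for word in ["cat", "fox", "butterfly", "day", "mouse", "ox"]:
    print(word, "->", pluralize(word))
```

The order of the checks matters: the irregular lookup comes first so that listed forms always win over the general rules.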

5 Inflectional Morphology (verbs)
Verb Inflections only for: main verbs (sleep, eat, walk); primary verbs (be, have, do)

Morphological Form      Regularly Inflected Form
stem                    walk      merge     try       map
-s form                 walks     merges    tries     maps
-ing participle         walking   merging   trying    mapping
past; -ed participle    walked    merged    tried     mapped

Morphological Form      Irregularly Inflected Form
stem                    eat       catch     cut
-s form                 eats      catches   cuts
-ing participle         eating    catching  cutting
-ed past                ate       caught    cut
-ed participle          eaten     caught    cut
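As a rough illustration of how the regular forms in the first table come about, the following Python sketch generates the -s, -ing, and -ed forms for the four regular stems. The function names and the consonant-doubling heuristic are assumptions made for this example; irregular verbs such as eat or catch would need lexicon entries instead.

```python
VOWELS = "aeiou"


def _consonant_y(stem: str) -> bool:
    """True for stems ending in consonant + y, such as 'try'."""
    return len(stem) > 1 and stem.endswith("y") and stem[-2] not in VOWELS


def _doubles_final_consonant(stem: str) -> bool:
    """Toy heuristic (assumption): double the last letter of a one-vowel
    stem that ends consonant-vowel-consonant, such as 'map'."""
    if len(stem) < 3:
        return False
    c1, v, c2 = stem[-3], stem[-2], stem[-1]
    one_vowel = sum(ch in VOWELS for ch in stem) == 1
    return (one_vowel and c1 not in VOWELS and v in VOWELS
            and c2 not in VOWELS and c2 not in "wxy")


def s_form(stem: str) -> str:
    if _consonant_y(stem):
        return stem[:-1] + "ies"            # try -> tries
    if stem.endswith(("s", "x", "z", "ch", "sh")):
        return stem + "es"
    return stem + "s"                       # walk -> walks


def ing_form(stem: str) -> str:
    if stem.endswith("e") and not stem.endswith("ee"):
        return stem[:-1] + "ing"            # merge -> merging
    if _doubles_final_consonant(stem):
        return stem + stem[-1] + "ing"      # map -> mapping
    return stem + "ing"


def ed_form(stem: str) -> str:
    if stem.endswith("e"):
        return stem + "d"                   # merge -> merged
    if _consonant_y(stem):
        return stem[:-1] + "ied"            # try -> tried
    if _doubles_final_consonant(stem):
        return stem + stem[-1] + "ed"       # map -> mapped
    return stem + "ed"


for stem in ["walk", "merge", "try", "map"]:
    print(stem, s_form(stem), ing_form(stem), ed_form(stem))
```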

6 Inflectional and Derivational Morphology (adjectives)
Adjective Inflections and Derivations:

Affix           Example      Result
prefix un-      unhappy      adjective, negation
suffix -ly      happily      adverb, mode
suffix -er      happier      adjective, comparative
suffix -est     happiest     adjective, superlative
suffix -ness    happiness    noun

plus combinations, like unhappiest, unhappiness.
Distinguish different adjective classes, which can or cannot take certain inflectional or derivational forms, e.g. no negation for big.

7 Morphological Processing
Knowledge
  lexical entry: stem plus possible prefixes and suffixes, plus word classes, e.g. endings for verb forms (see tables above)
  rules: how to combine stem and affixes, e.g. add s to form the plural of a noun, as in dogs
  orthographic rules: spelling, e.g. double the consonant, as in mapping
Processing: Finite State Transducers
  take the information above and analyze a word token / generate a word form

8 Fig. 3.3 FSA for verb inflection.

9 Fig. 3.4 Simple FSA for adjective inflection.
Fig. 3.5 More detailed FSA for adjective inflection.
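A finite-state acceptor for adjective inflection in the spirit of the simple FSA can be sketched as a transition table over morpheme classes. The state names, the toy root lexicon `ADJ_ROOTS`, and the pre-segmented input (a list of morphemes) are assumptions for this example, not a reproduction of the figures.

```python
ADJ_ROOTS = {"happy", "big", "clear", "cool", "real"}  # assumed toy lexicon
SUFFIXES = {"er", "est", "ly"}


def morpheme_class(morpheme: str):
    """Map a morpheme to the arc label it can follow in the FSA."""
    if morpheme == "un":
        return "un-"
    if morpheme in ADJ_ROOTS:
        return "adj-root"
    if morpheme in SUFFIXES:
        return "suffix"
    return None


# (state, arc label) -> next state; the un- prefix is optional.
TRANSITIONS = {
    ("q0", "un-"): "q1",
    ("q0", "adj-root"): "q2",
    ("q1", "adj-root"): "q2",
    ("q2", "suffix"): "q3",
}
ACCEPTING = {"q2", "q3"}


def accepts(morphemes) -> bool:
    """Run the FSA over a sequence of morphemes."""
    state = "q0"
    for morpheme in morphemes:
        state = TRANSITIONS.get((state, morpheme_class(morpheme)))
        if state is None:  # no arc for this morpheme: reject
            return False
    return state in ACCEPTING


print(accepts(["un", "happy", "est"]))  # unhappiest
print(accepts(["happy", "er", "est"]))  # two suffixes in a row
print(accepts(["un", "big", "est"]))    # accepted, although "unbig" is not English
```

The last call shows why a more detailed FSA is needed: with a single class of adjective roots the machine overgenerates, so roots have to be split into classes according to the affixes they allow.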

10 Fig. 3.7 Compiled FSA for noun inflection.

11 Fig. 3.12 Lexical and intermediate tape of a FS Transducer
Fig Lexical, intermediate, and surface tape after spelling transformation.
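The three tapes can be imitated with two ordinary string functions: one maps the lexical level to the intermediate level, the other applies a spelling rule to reach the surface form. This is a sketch, not a real transducer; the function names, the `stem+N+SG/PL` input format, and the restriction to the e-insertion rule are assumptions for this example. Here `^` marks a morpheme boundary and `#` the end of the word.

```python
import re


def lexical_to_intermediate(lexical: str) -> str:
    """'fox+N+PL' -> 'fox^s#', 'fox+N+SG' -> 'fox#'."""
    stem, pos, number = lexical.split("+")
    if pos != "N":
        raise ValueError("this sketch only handles nouns")
    return stem + ("^s#" if number == "PL" else "#")


def intermediate_to_surface(intermediate: str) -> str:
    """Apply e-insertion, then erase the boundary symbols."""
    # e-insertion: insert e between a stem ending in s, x, or z and the suffix -s.
    with_e = re.sub(r"([sxz])\^s#", r"\1^es#", intermediate)
    return with_e.replace("^", "").replace("#", "")


def generate(lexical: str) -> str:
    """Compose both steps: lexical tape -> intermediate tape -> surface tape."""
    return intermediate_to_surface(lexical_to_intermediate(lexical))


for entry in ["fox+N+PL", "cat+N+PL", "cat+N+SG"]:
    print(entry, "->", lexical_to_intermediate(entry), "->", generate(entry))
```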

12 Word Classes and POS Tagging

13 Word Classes
Sort words into categories according to:
  morphological properties
    Which types of morphological forms do they take?
    e.g. form plural: noun+s; 3rd person: verb+s
  distributional properties
    What other words or phrases can occur nearby?
    e.g. possessive pronoun before noun
  semantic coherence
    Classify according to similar semantic type.
    e.g. nouns refer to object-like entities

14 Open vs. Closed Word Classes
Open Class Types
  The set of words in these classes can change over time, with the development of the language, e.g. spaghetti and download
  Open Class Types: nouns, verbs, adjectives, adverbs

15 Open vs. Closed Word Classes
Closed Class Types
  The set of words in these classes is largely fixed and hardly ever changes within a language.
  Closed Class Types: prepositions, determiners, pronouns, conjunctions, auxiliary verbs, particles, numerals

16 Open Class Words: Nouns
denote objects, concepts, entities, events
Proper Nouns
  names for specific individual objects, entities
  e.g. the Eiffel Tower, Dr. Kemke
Common Nouns
  names for categories, classes, abstracts, events
  e.g. fruit, banana, table, freedom, sleep, race, ...
Count Nouns
  enumerable entities, e.g. two bananas
Mass Nouns
  not countable items, e.g. water, salt, freedom

17 Open Class Words: Verbs
denote actions, processes, and states, e.g. smoke, dream, rest, run
several morphological forms, e.g.
  non-3rd person                                  eat
  3rd person                                      eats
  progressive / present participle / gerundive    eating
  past participle                                 eaten
  simple past                                     ate

18 Open Class Words: Verbs (2)
Verbs - use of morphological forms, examples:
  non-3rd person      eat      I eat. We eat. They eat.
  3rd person          eats     He eats. She eats. It eats.
  progressive         eating   He is eating. He will be eating. He has been eating.
    e.g. present participle    He is eating.
         gerundive             Eating scorpions [NP] is common in China.
         use as adjective      Eating children [NP] are common at McDonalds.
  past participle     eaten    He has eaten the scorpion. The scorpion was eaten.
  simple past         ate      He ate the scorpion.

19 Open Class Words: Adjectives
denote qualities or properties of objects, e.g. heavy, blue, content
most languages have concepts for
  colour - white, green, ...
  age - young, old, ...
  value - good, bad, ...
not all languages have adjectives as a separate class

20 Open Class Words: Adverbs 1
denote modifications of actions (verbs) or qualities (adjectives)
  e.g. walk slowly or heavily drunk
Directional or Locational Adverbs
  specify direction or location, e.g. go home, stay here

21 Open Class Words: Adverbs 2
Degree Adverbs
  specify extent of process, action, property, e.g. extremely slow, very modest
Manner Adverbs
  specify manner of action or process, e.g. walk slowly, run fast
Temporal Adverbs
  specify time of event or action, e.g. yesterday, Monday

22 Closed Word Classes
Closed Class Types:
  Prepositions: on, under, over, at, from, to, with, ...
  Determiners: a, an, the, ...
  Pronouns: he, she, it, his, her, who, I, ...
  Conjunctions: and, or, as, if, when, ...
  Auxiliary verbs: can, may, should, are, ...
  Particles: up, down, on, off, in, out, ...
  Numerals: one, two, three, ..., first, second, ...

23 Closed Word Class: Prepositions
occur before noun phrases; describe relations, often spatial or temporal relations
  e.g. on the table (spatial)
       in two hours (temporal)

24 Closed Word Class: Pronouns
reference to entities, events, relations etc.
Personal Pronouns
  refer to persons or entities, e.g. you, he, it, ...
Possessive Pronouns
  possession or relation between person and object, e.g. his, her, my, its, ...
Wh-Pronouns
  reference in question or back reference, e.g. Who did this ..., Frieda, who is 80 years old ...

25 Closed Word Class: Conjunctions
join phrases or sentences; semantics is varied and complex
Coordinating Conjunction
  joins two phrases or sentences on the same level through conjunctions like and, or, but, ...
  e.g. He takes a cat and a dog.
       He takes a dog and she takes a cat.
Subordinating Conjunction
  connects embedded phrases, e.g. through that
  e.g. He thinks that the cat is nicer than the dog.

26 Closed Word Class: Auxiliary Verbs
Mark semantic features of the main verb. Often describe tense and modality aspects. Semantics is difficult.
Tense
  addition expressing present, past or future, ...
  e.g. He will take the cat home.
Aspect
  addition expressing whether the action is completed or still ongoing
  e.g. He is taking the cat home. (incomplete)
Mood
  addition expressing necessity or possibility of the action
  e.g. He can take the cat home. (possible)

27 Closed Word Class: Copula, Modal Verbs
Copula (be, do, have) and Modal Verbs (can, should, ...) are subclasses of Auxiliary Verbs.
Describe state, process, or tense / modality of action.
Semantics: difficult (e.g. modal logic)
  State / Process: be and do
    e.g. He is at home. He does nothing.
  Tense: have
    e.g. He has taken the cat home.
  Modality: can, ought to, should, must
    e.g. He can take the cat home. (possibility)

28 POS Tagging - Taggers
Methods for POS Tagging:
  Rule-Based Tagging
    use a dictionary to assign POS; then use rules to disambiguate words
  Stochastic Tagging
    determines tags based on the probability of the occurrence of the tag, given the observed word, in the context of the preceding tags.
    Similar to Hidden Markov Models (probabilistic finite state machines).
  Learn tagging rules.
Problem in POS Tagging: Ambiguity
Problem in POS Tagging: Which tag set to use?
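To make the stochastic idea concrete, here is a minimal Viterbi sketch for a bigram HMM tagger. Everything numeric is invented for illustration: the three-tag set, the transition table `TRANS` (probability of a tag given the preceding tag), and the emission table `EMIT` (probability of a word given a tag) are toy assumptions, not estimates from any corpus.

```python
TAGS = ["DT", "NN", "VB"]

# Invented toy probabilities: P(tag | previous tag); "<s>" marks the sentence start.
TRANS = {
    ("<s>", "DT"): 0.6, ("<s>", "NN"): 0.3, ("<s>", "VB"): 0.1,
    ("DT", "NN"): 0.9, ("DT", "VB"): 0.05, ("DT", "DT"): 0.05,
    ("NN", "VB"): 0.5, ("NN", "NN"): 0.4, ("NN", "DT"): 0.1,
    ("VB", "DT"): 0.5, ("VB", "NN"): 0.3, ("VB", "VB"): 0.2,
}
# Invented toy probabilities: P(word | tag). "race" and "ends" are ambiguous.
EMIT = {
    ("DT", "the"): 0.7,
    ("NN", "race"): 0.02, ("VB", "race"): 0.03,
    ("NN", "ends"): 0.01, ("VB", "ends"): 0.04,
}


def viterbi(words, tags=TAGS, trans=TRANS, emit=EMIT):
    """Return the most probable tag sequence for a non-empty list of words."""
    # best[i][t] = (probability of the best path ending in tag t at word i,
    #               previous tag on that path)
    best = [{t: (trans.get(("<s>", t), 0.0) * emit.get((t, words[0]), 0.0), None)
             for t in tags}]
    for i in range(1, len(words)):
        column = {}
        for t in tags:
            e = emit.get((t, words[i]), 0.0)
            column[t] = max(
                (best[i - 1][p][0] * trans.get((p, t), 0.0) * e, p) for p in tags
            )
        best.append(column)
    # Backtrack from the most probable final tag.
    last = max(tags, key=lambda t: best[-1][t][0])
    path = [last]
    for i in range(len(words) - 1, 0, -1):
        last = best[i][last][1]
        path.append(last)
    return list(reversed(path))


print(viterbi("the race ends".split()))
```

In this toy model "race" is slightly more likely to be emitted by VB than by NN, yet the tagger picks NN after "the" because the transition from DT to NN is far more probable. That is the disambiguation by context the slide describes.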

29 POS Tagging - Tagsets
Tagsets for English
  Penn Treebank, 45 tags
  Brown corpus, 87 tags
  C5 tagset, 61 tags
  C7 tagset, 146 tags
For references see Jurafsky, p.296
C5 and C7 tagsets are listed in Appendix C

30 Fig Penn Treebank, 45 tags

31 Fig. 8.5 English modal verbs and frequency counts from the CELEX on-line dictionary.

32 Ambiguity in POS Tagging
Fig Word types and ambiguity in the Brown corpus.

33 Sentence Level Constructs

34 Sentence Level Constructs I
declarative    “This flight leaves at 9 am.”    S → NP VP
imperative     “Book this flight for me.”       S → VP

35 Sentence Level Constructs II
yes-no-question    “Does this flight leave at 9 am?”          S → Aux NP VP
wh-question        “When does this flight leave Winnipeg?”    S → Wh-NP Aux NP VP

36 Noun Phrase Modification 1
Noun Phrase Modifiers
  head = the central noun of the NP
  modifiers = additions to the head noun, included in the NP
    modifiers before the head noun (prenominal)
    modifiers after the head noun (post-nominal)
  examples: determiners, adjectives, PPs
    e.g. the young man
         the girl with the red hat

37 Noun Phrase Modification - Prenominal
determiner: the, a, this, some, ...
predeterminer: all the flights
cardinal numbers, ordinal numbers: one flight, the first flight, ...
quantifiers: much, little

38 Noun Phrase Modification - Prenominal
adjectives: a first-class flight, a long flight
adjective phrase: the least expensive flight
Grammar Rule
  NP → (Det) (Card) (Ord) (Quant) (AP) Nominal
PROJECT!

39 Noun Phrase Modification - Postnominal
prepositional phrase PP
  all flights from Chicago
  Nominal → Nominal PP (PP) (PP)
non-finite clause, gerundive postmodifiers
  all flights arriving after 7 pm
  Nominal → GerundVP
  GerundVP → GerundV NP | GerundV PP | ...
relative clause
  a flight that serves breakfast
  Nominal → Nominal RelClause
  RelClause → (who | that) VP

40 Verb Subcategorization
Different verbs accept or need different constituents or complements.
  VP = Verb + other constituents (complements)
  e.g. He buys the books.
Verbs can be classified according to the complements they accept or need.
  e.g. give needs two complements: He gave her the books.
       sleep accepts no complement: He sleeps.

41 Verb Complements
sentential complement    VP → Verb inf-sentence    I want to fly from Boston to Chicago.
NP complement            VP → Verb NP              I want this flight.
no complement            VP → Verb                 I sleep.

42 Other Verb Complements
Prepositional Phrases + other Modifiers can be added to specify the location or time of the action, state, or event described by the verb.
  VP → Verb PP PP     I fly from Boston to Chicago.
  VP → Verb PP        I sleep in the barn.
  VP → Verb PP ADV    I sleep in the barn tonight.

43 Assignment 1-B
Extend the grammar in the Earley Parser by integrating:
  complex VPs through sub-categorization and complements
  complex NPs through pre- and post-modifiers
  some adverbs (e.g. temporal or manner) plus rule extensions
You should define 3-5 new / modified rules in each category.
Write down the new rules, and add sample parse outputs generated with the parser program to illustrate how your rules work (the last chart state is sufficient).
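For orientation, here is a minimal Earley recognizer over a small grammar built from rules shown on the earlier slides. It is not the course's parser program: the dictionary layout of `GRAMMAR` and `LEXICON`, the category names Pronoun, Noun, and Prep, and the function name `earley_recognize` are assumptions for this sketch, and it only answers whether a sentence is accepted, without building parse trees.

```python
# Phrase-structure rules: left-hand side -> list of right-hand sides.
GRAMMAR = {
    "S": [["NP", "VP"], ["VP"], ["Aux", "NP", "VP"]],
    "NP": [["Det", "Nominal"], ["Pronoun"]],
    "Nominal": [["Noun"], ["Nominal", "PP"]],
    "VP": [["Verb"], ["Verb", "NP"], ["Verb", "PP"]],
    "PP": [["Prep", "NP"]],
}
# Assumed toy lexicon: word -> set of possible parts of speech.
LEXICON = {
    "this": {"Det"}, "the": {"Det"},
    "flight": {"Noun"}, "barn": {"Noun"},
    "i": {"Pronoun"}, "me": {"Pronoun"},
    "book": {"Verb", "Noun"}, "sleep": {"Verb"}, "leave": {"Verb"},
    "does": {"Aux"}, "in": {"Prep"}, "for": {"Prep"},
}


def earley_recognize(words, start="S"):
    """Return True if the word list can be derived from the start symbol."""
    n = len(words)
    # chart[i] holds states (lhs, rhs, dot position, origin) ending at position i.
    chart = [[] for _ in range(n + 1)]

    def add(i, state):
        if state not in chart[i]:
            chart[i].append(state)

    for rhs in GRAMMAR[start]:
        add(0, (start, tuple(rhs), 0, 0))

    for i in range(n + 1):
        j = 0
        while j < len(chart[i]):  # chart[i] may grow while it is processed
            lhs, rhs, dot, origin = chart[i][j]
            j += 1
            if dot < len(rhs):
                symbol = rhs[dot]
                if symbol in GRAMMAR:
                    # Predictor: expand the non-terminal after the dot.
                    for expansion in GRAMMAR[symbol]:
                        add(i, (symbol, tuple(expansion), 0, i))
                elif i < n and symbol in LEXICON.get(words[i].lower(), ()):
                    # Scanner: the next word has the expected part of speech.
                    add(i + 1, (lhs, rhs, dot + 1, origin))
            else:
                # Completer: advance every state that was waiting for lhs.
                for l2, r2, d2, o2 in list(chart[origin]):
                    if d2 < len(r2) and r2[d2] == lhs:
                        add(i, (l2, r2, d2 + 1, o2))

    return any(l == start and d == len(r) and o == 0 for l, r, d, o in chart[n])


for sentence in ["I sleep in the barn", "Book this flight for me",
                 "Does this flight leave", "flight this book"]:
    print(sentence, "->", earley_recognize(sentence.split()))
```

New rules, such as the prenominal modifiers or adverb extensions asked for in the assignment, would be added as extra entries in `GRAMMAR` and `LEXICON`; the recognizer itself does not change.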

