RECOGNISING NOMINALISATIONS Supervisors: Dr. Alex Lascarides Dr. Mirella Lapata (Andrew) Yuk On KONG University of Edinburgh.


1 RECOGNISING NOMINALISATIONS Supervisors: Dr. Alex Lascarides Dr. Mirella Lapata (Andrew) Yuk On KONG University of Edinburgh

2 DEFINITION “Nominalisation refers to the process of forming a noun from some other word-class (e.g. red+ness) or (in classical transformational grammar especially) the derivation of a noun phrase from an underlying clause (e.g. Her answering of the letter… from She answered the letter). The term is also used in the classification of relative clauses (e.g. What concerns me is her attitude)…” (Crystal 1997)

3 Only nominalisations from verbs (1st definition) are considered here, e.g. "statement" from "state". Problem: given a WORD, is it a noun, and is it derived from a verb or not? Nominalisations derived from verbs are very productive in English and are usually created by means of suffixation (i.e., suffixes that form nouns are attached to verb bases).

4 EXCLUSIONS
–Nominals, e.g. the poor, the wounded
–Nominalisations NOT from verbs, e.g. redness
–-ing forms, e.g. the making of the movie
–Antidisestablish-ment-arian-ism

5 REGULAR?
nominalise → nominalisation
interpret → interpretation
interrupt → interruption
associate → association
delete → deletion
break → breakage
leak → leakage

6 confine → confinement
refine → refinement (but define → definition)
submit → submission
admit → admission (but also admittance)
remit → remission; remittance; remit

7 VERB=NOUN
debate → debate (not debation); debater
pay → pay
love → love
boss → boss
stand → stand
purchase → purchase
lie → lie (“tell a lie”) (cf. lie down)

8 VERB=NOUN (except stress)
transfer → transfer
transport → transport
import → import
rebel → rebel; (rebellion)

9 1 VERB, >1 NOUNS
collect → collection; collector
interpret → interpretation; interpreter
cover → cover; coverage
conduct → conduction; conductor
depend → dependant/dependent; dependence; dependency

10 SEMANTICS
conduct → conduction (conduct electricity/heat)
conduct → conduct (behave/organise)

11 WHEN TO USE WHICH SUFFIX
-tion/-sion
-er/-or
debate → debater
talk → talker
collect → collector
conduct → conductor

12 IRREGULAR NOMINALISATION
choose → choice
succeed → success; succession; successor
decide → decision
sell → sale

13 PSEUDO-NOMINALISATION
mote → motion?? (mote is a noun: a very small piece of dust)
depart → departure; department???
apart → apartment????

14 WHY BOTHER? The identification of nominalisations and their associated verbs (e.g. "statement" and "state") is important for a number of NLP tasks:
–machine translation
–information retrieval
–automatic learning of machine-readable dictionaries
–grammar induction

15 HOW? Nominalisation is a productive morphological phenomenon. List all acceptable nominalised forms? What about new words?

16 Techniques NOT focusing on nominalisations:
–build rules
–machine-learning approaches to induce morphological structures using large corpora
–knowledge-free induction of inflectional morphologies (Schone and Jurafsky 2001)

17 SCHONE AND JURAFSKY (2001) Schone and Jurafsky (2001) worked on acquiring cognates and morphological variants, using:
–Induced semantics: Latent Semantic Analysis (LSA)
–Induced orthographic info
–Induced syntactic info
–Transitive information
–Affix frequencies

18 GOAL OF THIS STUDY The principal goal of this project is to develop a system which can recognise nominalisations, together with the verbs from which they are derived.

19 EXPERIMENT 1 (baseline)
–identify nouns using the tags in the corpus
–identify potential nominalisations from the list of nouns with a list of nominalisation suffixes
–find the corresponding potential verb for each by identifying the verb (from among verbs as tagged) that shares with it the greatest number of letters in sequence
–accept a nominalisation-verb pair if the percentage of letters matched is > 50%, and discard any other
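The baseline steps can be sketched roughly as follows. This is only an illustration under stated assumptions, not the project's actual code: the suffix list is made up for the example, "letters in sequence" is read as the longest contiguous run of shared letters, and the percentage is taken relative to the noun's length. The slides do not specify any of these choices, and the function names are hypothetical.

```python
from difflib import SequenceMatcher

# Illustrative suffix list (an assumption; the slides do not give the real list).
NOMINAL_SUFFIXES = ("tion", "sion", "ment", "age", "ance", "ence", "er", "or", "al", "ure")


def longest_common_run(a: str, b: str) -> int:
    """Length of the longest contiguous run of letters shared by a and b."""
    matcher = SequenceMatcher(None, a, b, autojunk=False)
    return matcher.find_longest_match(0, len(a), 0, len(b)).size


def baseline_pairs(nouns, verbs, threshold=0.5):
    """Pair each suffix-bearing noun with its best-matching verb.

    The score is the shared run divided by the noun's length
    (the choice of denominator is an assumption).
    """
    pairs = {}
    for noun in nouns:
        if not noun.endswith(NOMINAL_SUFFIXES):
            continue  # not a potential nominalisation
        best_verb = max(verbs, key=lambda v: longest_common_run(noun, v), default=None)
        if best_verb is None:
            continue  # no verbs to compare against
        score = longest_common_run(noun, best_verb) / len(noun)
        if score > threshold:
            pairs[noun] = best_verb
    return pairs


nouns = ["statement", "collection", "redness", "department"]
verbs = ["state", "collect", "depart", "walk"]
print(baseline_pairs(nouns, verbs))
```

With these toy lists, "department" is paired with "depart", which shows how a purely orthographic baseline lets through the pseudo-nominalisations of slide 13.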

20 EXPERIMENT 2 Use a decision tree to build a model. Possible features include:
–letter similarity between verbs and nouns
–suffix frequency
–verb frequency
–verb semantics
–subject of noun
–subject of verb
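As a rough illustration, the three features that need only strings and counts could be assembled per noun-verb pair as below. The function name, the feature names and the way the suffix is cut off after the shared stem are all assumptions made for this sketch. The semantic and subject features, and the decision-tree learner itself, are left out.

```python
from collections import Counter
from difflib import SequenceMatcher


def pair_features(noun, verb, suffix_counts, verb_counts):
    """Build a small feature dictionary for one candidate (noun, verb) pair."""
    shared = SequenceMatcher(None, noun, verb, autojunk=False).find_longest_match(
        0, len(noun), 0, len(verb)
    )
    # Treat whatever follows the shared stem in the noun as its suffix (an assumption).
    suffix = noun[shared.a + shared.size:]
    return {
        "letter_similarity": shared.size / len(noun),
        "suffix_frequency": suffix_counts[suffix],  # a Counter gives 0 if unseen
        "verb_frequency": verb_counts[verb],
    }


# Toy counts, invented for the example.
suffix_counts = Counter({"ment": 3, "ion": 5})
verb_counts = Counter({"state": 10, "collect": 4})
print(pair_features("statement", "state", suffix_counts, verb_counts))
```

Feature dictionaries of this kind could then be passed to any decision-tree learner.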

21 EVALUATION Experiments will be based on the British National Corpus (BNC). The obtained nominalisations will be evaluated against the CELEX morphological lexicon and manually annotated data, using precision, recall and F-score.
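A minimal sketch of the scoring, assuming the system output and the gold standard (CELEX or manual annotation) are both available as sets of (nominalisation, verb) pairs, and assuming the balanced F-score (F1), since the slide does not say which F-score is meant. The function name is hypothetical and the example pairs are invented.

```python
def precision_recall_f1(predicted, gold):
    """Compare predicted (noun, verb) pairs against a gold-standard set."""
    predicted, gold = set(predicted), set(gold)
    true_positives = len(predicted & gold)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


predicted = {("statement", "state"), ("collection", "collect"), ("department", "depart")}
gold = {("statement", "state"), ("collection", "collect"), ("decision", "decide")}
print(precision_recall_f1(predicted, gold))
```

Here two of the three predicted pairs are correct and two of the three gold pairs are found, so precision, recall and F1 all come out at two thirds.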

22 BRITISH NATIONAL CORPUS
–Over 100 million words
–Corpus of modern English
–Both spoken (10%) and written (90%)
–Each word is automatically tagged by the CLAWS stochastic POS tagger
–65 different tags
–Encoded using SGML to represent POS tags and a variety of other structural properties of texts (e.g. headings, paragraphs, lists, etc.)

23 Shopping including collection of prescriptions Daysitting and nightsitting

24 CELEX
–English, Dutch and German
–Annotated by humans using lemmata from two dictionaries of English
–52,446 lemmata and 160,594 wordforms
–Orthographic, phonological, morphological, syntactic and frequency information
–Morphological structure, e.g. ((celebrate),(ion))

25 MILESTONES
6/2002: Experiment 1 (baseline)
7/2002: Experiment 2
8/2002: Write-up
9/2002: Finalise report

