Whose presentation is this? SUBJ(present, Violeta Seretan) OBL(collaborate, Lorenzo Thione) PP-OBJ(with, Lorenzo Thione) SUBJ(supervise, Martin van den.


1 Whose presentation is this? SUBJ(present, Violeta Seretan) OBL(collaborate, Lorenzo Thione) PP-OBJ(with, Lorenzo Thione) SUBJ(supervise, Martin van den Berg) (Decoding the predicate-argument structure of nominalizations)

2 10/25/2005 Overview
nominalization problem
NOMLEX resource
Denominalizer service based on NOMLEX
additional resources (CSLI)
APIs for NOMLEX, CSLI
related and future work
demo

3 Text normalization for QA
Mark Twain published Adventures of Huckleberry Finn in 1885 in America.
  – Who published H.F.?
  – Where was H.F. published?
  – When was H.F. published?
QA/NLU needs to deal with a large spectrum of variation in text:
  1. morphological: published, publishes
  2. syntactic: H.F. was published
  3. lexical: {novel, book, masterpiece, work} {publish, write, author, appear}
  4. nominalization: the publication
Normalization (via parsing):
  1. base word form: publishes -> publish; published -> publish
  2. canonical word order: SUBJ(publish, Mark Twain); OBJ(publish, H.F.)
Lexical semantic resources:
  3. synonyms, hyponyms, hypernyms, …
What about nominalization?

4 Nominalization
Since the publication of Huckleberry Finn in 1885, there have been many reactions to the novel, some of them quite extreme.
  – When was H.F. published?
Nominalization: NP having “a systematic correspondence with a clause structure” (Quirk et al. 1985)
Goal: decoding the clause structure
  publication of Huckleberry Finn -> OBJ(publish, Huckleberry Finn)
Diagram labels: deverbal noun, nominalization, matrix verb

5 Mapping nominal arguments into verbal roles
Mark Twain’s publication of his book
  possessive determiner, PP adjunct (nominal arguments)
the book publication by Mark Twain
  modifier, PP adjunct (nominal arguments)
Mark Twain - publish - book
  SUBJECT, OBJECT (verbal roles)

6 Role ambiguity
Rome’s destruction – SUBJ or OBJ?
  OBJ(destroy, Rome)
  SUBJ(destroy, Rome)
A. Rome’s destruction by barbarians: OBJ
B. Rome’s destruction of Carthage: SUBJ
Rome’s destruction – OBJ (by default)
John’s admiration – SUBJ (by default)

7 NOMLEX – NOMinalization LEXicon
Macleod et al., New York University
1’025 deverbal nouns
detailed mapping from nominal arguments to verb roles
:ORTH "destruction"
:VERB "destroy"
:VERB-SUBC ((NOM-NP :SUBJECT ((N-N-MOD)
                              (DET-POSS)
                              (PP :PVAL ("by")))
                    :OBJECT ((DET-POSS)
                             (N-N-MOD)
                             (PP :PVAL ("of")))
                    :REQUIRED ((OBJECT :DET-POSS-ONLY T
                                       :N-N-MOD-ONLY T))))
Callouts: default role; role to assign
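NOMLEX entries are written as Lisp-style keyword lists, so a very small S-expression reader is enough to load one. The following Python sketch is only an illustration: the function names (`tokenize`, `parse`, `keyed`) are made up here, the entry is abbreviated from the "destruction" example, and nothing in it reflects the Java or Perl tooling mentioned in the slides.

```python
import re

def tokenize(text):
    # Tokens are parentheses, double-quoted strings, or bare symbols (:ORTH, NOM-NP, ...).
    return re.findall(r'\(|\)|"[^"]*"|[^\s()"]+', text)

def parse(tokens):
    """Consume tokens from the front and return nested lists and strings."""
    token = tokens.pop(0)
    if token == "(":
        items = []
        while tokens[0] != ")":
            items.append(parse(tokens))
        tokens.pop(0)  # discard the matching ")"
        return items
    return token.strip('"')  # strings lose their quotes, symbols are unchanged

def keyed(items):
    """Turn a flat [:KEY, value, :KEY, value, ...] list into a dict."""
    return dict(zip(items[0::2], items[1::2]))

# Abbreviated entry (the :REQUIRED clause is left out to keep the sketch short).
ENTRY_TEXT = """
(NOM :ORTH "destruction"
     :VERB "destroy"
     :VERB-SUBC ((NOM-NP :SUBJECT ((N-N-MOD) (DET-POSS) (PP :PVAL ("by")))
                         :OBJECT ((DET-POSS) (N-N-MOD) (PP :PVAL ("of"))))))
"""

entry = parse(tokenize(ENTRY_TEXT))
fields = keyed(entry[1:])          # skip the leading NOM symbol
subcat = fields[":VERB-SUBC"][0]   # first subcategorization frame
roles = keyed(subcat[1:])          # skip the frame name NOM-NP
print(fields[":VERB"], subcat[0], roles[":OBJECT"])
```

Each role then maps to the list of nominal positions that can realize it, which is the information the denominalizer needs.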

8 NOMLEX / XML
(NOM :ORTH "accusation"
     :PLURAL "accusations"
     :PLURAL-FREQ "not rare"
     :VERB "accuse"
     :NOUN-SUBC ((NOUN-PP :PVAL ("about")))
     :NOM-TYPE ((VERB-NOM))
     :VERB-SUBJ ((DET-POSS)
                 (N-N-MOD)
                 (PP :PVAL ("by")))
     :SUBJ-ATTRIBUTE ((COMMUNICATOR))
     :OBJ-ATTRIBUTE ((COMMUNICATOR))
     :VERB-SUBC ((NOM-NP-PP :SUBJECT ((DET-POSS)
                                      (N-N-MOD)
                                      (PP :PVAL ("by")))
                            :OBJECT ((PP :PVAL ("against")))
                            :PVAL ("of"))
                 (NOM-NP :SUBJECT ((DET-POSS) …
Perl

9 NOMLEX API in Java
com.fxpal.sake.test (NomLexInterface)
com.fxpal.ltng.services.normalization.noun.nomlex (NomLex, NomLexEntry, NomLexClassConstants, Subcat)

10 How useful?
Oracle acquired PeopleSoft at the end of last year.
Oracle’s acquisition of PeopleSoft at the end of last year…
Google hits, 10/25/2005:
"Oracle acquisition of PeopleSoft"
"Oracle acquired PeopleSoft"
"Oracle's PeopleSoft acquisition"
"Oracle acquires PeopleSoft" 1’020
"Oracle has acquired PeopleSoft" 248
"Oracle will acquire PeopleSoft" 424
More hits: 587 ~14’

11 Argument-role mapping
Oracle's acquisition of PeopleSoft
  possessive, PP (of)
:ORTH "acquisition"
:VERB "acquire"
:VERB-SUBC ((NOM-NP :SUBJECT ((DET-POSS)
                              (N-N-MOD)
                              (PP :PVAL ("by")))
                    :OBJECT ((N-N-MOD)
                             (PP :PVAL ("of")))))
acquire, Oracle -> SUBJ(acquire, Oracle)
acquire, PeopleSoft -> OBJ(acquire, PeopleSoft)

12 Denominalizer
Input: sentence
Output: pairs nominal argument – verb role for each nominalization
  (noun, (argument – role)*)*
Examples:
Oracle's acquisition of PeopleSoft finally materialized after an 18-month struggle between the two companies.
  (acquisition, (Oracle - SUBJECT) (PeopleSoft - OBJECT))
Oracle acquisition finally materialized.
  (acquisition, (Oracle - SUBJECT) (Oracle - OBJECT))

13 Algorithm
parse sentence
for each deverbal noun
    get noun arguments
    for each NOMLEX entry for noun
        for each subcat of the entry
            1. match arguments against subcat
            2. filter assignment results
    select a subcat
    output assignments for selected subcat
Note: overlapping nominalizations ok: an increase in product sales
com.fxpal.ltng.services.normalization.noun.*
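The loop structure of the algorithm can be sketched in a few lines of Python. This is a toy reading, not the com.fxpal implementation: parsing and argument extraction are assumed to have happened already, the lexicon layout and the "PP:by" / "PP:of" position encoding are invented for the example, the filter applies only a one-filler-per-role check, and "select a subcat" is approximated by preferring the subcat that assigns the most arguments.

```python
from itertools import product

# Toy lexicon (hypothetical layout): noun -> entries, each entry a list of subcats,
# each subcat mapping a verbal role to the nominal positions that can realize it.
LEXICON = {
    "acquisition": [[
        {"SUBJECT": {"DET-POSS", "N-N-MOD", "PP:by"},
         "OBJECT": {"N-N-MOD", "PP:of"}},
    ]],
}

def match(arguments, subcat):
    """Step 1: list the roles each argument's position allows."""
    return {filler: [role for role, positions in subcat.items() if position in positions]
            for filler, position in arguments}

def filter_assignments(alternatives):
    """Step 2: keep assignments in which no role is filled twice."""
    fillers = [f for f in alternatives if alternatives[f]]
    combos = product(*(alternatives[f] for f in fillers))
    return [dict(zip(fillers, combo)) for combo in combos
            if len(set(combo)) == len(combo)]

def denominalize(noun, arguments):
    best = []
    for entry in LEXICON.get(noun, []):
        for subcat in entry:
            assignments = filter_assignments(match(arguments, subcat))
            # Assumed selection heuristic: prefer the subcat covering most arguments.
            covered = len(assignments[0]) if assignments else 0
            if covered > (len(best[0]) if best else 0):
                best = assignments
    return noun, best

print(denominalize("acquisition", [("Oracle", "DET-POSS"), ("PeopleSoft", "N-N-MOD")]))
print(denominalize("acquisition", [("Oracle", "N-N-MOD")]))
```

With a single ambiguous modifier, as in "Oracle acquisition", both readings survive, which matches the two role pairs the Denominalizer reports for that phrase.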

14 Matching
Oracle's acquisition of PeopleSoft finally materialized.
Arguments (acquisition):
  POSS(acquisition, Oracle)
  ADJUNCT(acquisition, of)
  PP-OBJ(of, PeopleSoft)
NOM-NP :SUBJECT ((DET-POSS)
                 (N-N-MOD)
                 (PP :PVAL ("by")))
       :OBJECT ((N-N-MOD)
                (PP :PVAL ("of")))
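Matching amounts to asking, for each argument, which roles list its nominal position. A minimal Python sketch follows; the dictionary layout, the `match` function and the "PP:by" / "PP:of" encoding of prepositional positions are assumptions made for the illustration.

```python
# Positions allowed per role, taken from the NOM-NP frame of "acquisition".
ACQUISITION_NOM_NP = {
    "SUBJECT": {"DET-POSS", "N-N-MOD", "PP:by"},
    "OBJECT": {"N-N-MOD", "PP:of"},
}

def match(arguments, subcat):
    """Map each argument filler to the roles whose position list contains its position."""
    return {
        filler: [role for role, positions in subcat.items() if position in positions]
        for filler, position in arguments
    }

# "Oracle's acquisition of PeopleSoft": a possessive and an of-PP.
arguments = [("Oracle", "DET-POSS"), ("PeopleSoft", "PP:of")]
alternatives = match(arguments, ACQUISITION_NOM_NP)
print(alternatives)
```

Here each argument has exactly one candidate role, so no further filtering is needed.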

15 Filtering
Oracle's PeopleSoft acquisition finally materialized.
Arguments (acquisition):
  POSS(acquisition, Oracle)
  MOD(acquisition, PeopleSoft)
NOM-NP :SUBJECT ((DET-POSS)
                 (N-N-MOD)
                 (PP :PVAL ("by")))
       :OBJECT ((N-N-MOD)
                (PP :PVAL ("of")))
Alternatives:
  Oracle: SUBJECT
  PeopleSoft: SUBJECT, OBJECT

16 NOMLEX constraints (1)
Uniqueness Constraint: A verbal role may be filled only once.
Oracle's PeopleSoft acquisition
Matching alternatives:
  Oracle: SUBJECT
  PeopleSoft: SUBJECT, OBJECT
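One way to apply the uniqueness constraint is to enumerate every combination of candidate roles and discard those that use a role twice. The Python sketch below does this for the example; `unique_assignments` is a name chosen for the illustration, and brute-force enumeration is an assumption that is only reasonable because a nominalization has few arguments.

```python
from itertools import product

def unique_assignments(alternatives):
    """Return all role assignments in which each verbal role is filled at most once."""
    fillers = list(alternatives)
    kept = []
    for roles in product(*(alternatives[f] for f in fillers)):
        if len(set(roles)) == len(roles):  # no role appears twice
            kept.append(dict(zip(fillers, roles)))
    return kept

alternatives = {"Oracle": ["SUBJECT"], "PeopleSoft": ["SUBJECT", "OBJECT"]}
print(unique_assignments(alternatives))
```

Since Oracle can only be the subject, the constraint forces PeopleSoft into the object role.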

17 NOMLEX constraints (2)
Ordering Constraint: If there are multiple pre-nominal arguments, they must appear in the order: SUBJECT, INDIRECT OBJECT, DIRECT OBJECT, OBLIQUE.
FX’s printer sales grew by 50%.
Matching alternatives:
  FX: SUBJECT, OBJECT
  printer: SUBJECT, OBJECT
order: FX, printer
verbal roles: SUBJECT, OBJECT
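The ordering constraint can be checked by ranking the roles and requiring the ranks to be non-decreasing across the pre-nominal arguments. In this Python sketch the rank table and `respects_order` are illustrative names, and treating a plain OBJECT like a DIRECT OBJECT is an assumption; the two candidates are the assignments that already satisfy uniqueness.

```python
# Required left-to-right order of roles among pre-nominal arguments.
ROLE_RANK = {"SUBJECT": 0, "INDIRECT OBJECT": 1, "DIRECT OBJECT": 2, "OBJECT": 2, "OBLIQUE": 3}

def respects_order(prenominal_fillers, assignment):
    """True if the roles of the pre-nominal fillers appear in the required order."""
    ranks = [ROLE_RANK[assignment[filler]] for filler in prenominal_fillers]
    return ranks == sorted(ranks)

# "FX's printer sales": surface order is FX, then printer.
candidates = [
    {"FX": "SUBJECT", "printer": "OBJECT"},
    {"FX": "OBJECT", "printer": "SUBJECT"},
]
kept = [a for a in candidates if respects_order(["FX", "printer"], a)]
print(kept)
```

Only the reading in which FX sells printers survives.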

18 NOMLEX constraints (3)
Obligatoriness Constraint: By default, the subject and object are optional. A NOMLEX entry can specify obligatory roles to be filled.
circulation - REQUIRED (SUBJECT)
  blood circulation -> SUBJ(circulate, blood)
destruction - REQUIRED ((OBJECT :DET-POSS-ONLY T
                                :N-N-MOD-ONLY T))
  Rome’s destruction -> OBJ(destroy, Rome)
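A simplified version of the obligatoriness check keeps only those assignments that fill every required role. The Python sketch below is deliberately coarse: `fills_required` is a hypothetical helper, the candidate lists are assumed to come from an earlier matching step, and the :DET-POSS-ONLY / :N-N-MOD-ONLY conditions on the requirement are ignored.

```python
def fills_required(assignment, required_roles):
    """True if every obligatory role is filled by some argument."""
    return set(required_roles) <= set(assignment.values())

# "Rome's destruction": the possessive could in principle fill either role,
# but the entry for "destruction" requires an OBJECT.
rome = [{"Rome": "SUBJECT"}, {"Rome": "OBJECT"}]
print([a for a in rome if fills_required(a, {"OBJECT"})])

# "blood circulation": the entry for "circulation" requires a SUBJECT.
blood = [{"blood": "SUBJECT"}, {"blood": "OBJECT"}]
print([a for a in blood if fills_required(a, {"SUBJECT"})])
```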

19 Selectional Restrictions
com.fxpal.ltng.services.normalization.noun.csli (Nouns, Verbs, NounsVerbs)

20 Applying selectional restrictions
room reservation
Alternatives:
  room - SUBJECT, OBJECT
  reserve - selectional restrictions: SUBJECT: sentient; OBJECT: *
  room - location, physobj
semantic types for about 5000 N
selectional restrictions for about 5000 V
459/941 verbs from NOMLEX (48.77%)
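The "room reservation" example can be replayed as a type check: a role is kept when the verb places no restriction on it or when the noun carries one of the required semantic types. The tables and the function `allowed_roles` in this Python sketch are hypothetical stand-ins for the CSLI data, with `None` standing for the unrestricted "*".

```python
# Hypothetical excerpts of the noun type and verb restriction tables.
NOUN_TYPES = {"room": {"location", "physobj"}}
VERB_RESTRICTIONS = {"reserve": {"SUBJECT": {"sentient"}, "OBJECT": None}}  # None means "*"

def allowed_roles(noun, verb, roles):
    """Drop candidate roles whose selectional restriction the noun cannot satisfy."""
    types = NOUN_TYPES.get(noun, set())
    restrictions = VERB_RESTRICTIONS.get(verb, {})
    kept = []
    for role in roles:
        wanted = restrictions.get(role)
        if wanted is None or types & wanted:
            kept.append(role)
    return kept

print(allowed_roles("room", "reserve", ["SUBJECT", "OBJECT"]))
```

A room is not sentient, so only the object reading remains.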

21 Coverage extension
What if a noun is not in NOMLEX?
1. additional deverbal nouns in the CSLI data
   4’087 “event nouns”
   3348 new, 739 already in NOMLEX
   3348/ % more data
2. NOMLEX template:
   NOM-NP :SUBJECT ((DET-POSS)
                    (N-N-MOD)
                    (PP :PVAL ("by")))
          :OBJECT ((DET-POSS)
                   (N-N-MOD)
                   (PP :PVAL ("of")))
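The fallback can be pictured as a two-step lookup: use the NOMLEX entry when there is one, otherwise apply the generic template to nouns known from the CSLI event-noun list. In the Python sketch below the table contents, the noun "takeover" and the function `subcat_for` are invented for the illustration.

```python
# Generic NOM-NP template applied to nouns that NOMLEX does not cover.
DEFAULT_NOM_NP = {
    "SUBJECT": {"DET-POSS", "N-N-MOD", "PP:by"},
    "OBJECT": {"DET-POSS", "N-N-MOD", "PP:of"},
}

# Hypothetical excerpts of the two resources.
NOMLEX = {
    "acquisition": {"SUBJECT": {"DET-POSS", "N-N-MOD", "PP:by"},
                    "OBJECT": {"N-N-MOD", "PP:of"}},
}
CSLI_EVENT_NOUNS = {"takeover"}

def subcat_for(noun):
    """Prefer the NOMLEX entry; fall back to the template for CSLI event nouns."""
    if noun in NOMLEX:
        return NOMLEX[noun]
    if noun in CSLI_EVENT_NOUNS:
        return DEFAULT_NOM_NP
    return None  # not recognized as a deverbal noun

print(subcat_for("takeover") is DEFAULT_NOM_NP)
```

The template is more permissive than a typical entry, for example by allowing a possessive object.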

22 Future work
extensive test and evaluation
other nominalization data
  – deverbal noun recognition
  – mapping information (FrameNet)
other lexical resources
  PropBank – semantic roles
  VerbLex – selectional restrictions
role assignment in context
  – word sense disambiguation, anaphora, discourse
  – collocations
    the author will make no accusation
    SUBJ(make, author) -> SUBJ(accuse, author)

23 Related work
PUNDIT system (Dahl et al., 1987)
SNOWY QA system (Hull and Gomez 1996)
NOMLEX for IE (Meyers et al., 1998)
N-N interpretation (Lapata 2002, Girju et al. 2004)

24 References
Dahl, Deborah A., Martha S. Palmer, and Rebecca J. Passonneau. 1987. Nominalizations in PUNDIT. In Proceedings of the 25th Annual Meeting of the Association for Computational Linguistics, Stanford, CA.
Girju, Roxana, Ana-Maria Giuglea, Marian Olteanu, Ovidiu Fortu, Orest Bolohan, and Dan Moldovan. 2004. Support vector machines applied to the classification of semantic relations in nominalized noun phrases. In Proceedings of the HLT-NAACL Workshop on Computational Lexical Semantics.
Hull, Richard and Fernando Gomez. 1996. Semantic Interpretation of Nominalizations. In Proceedings of the Thirteenth National Conference on Artificial Intelligence, Portland, Oregon, August 1996.
Lapata, Maria. 2002. The Disambiguation of Nominalisations. Computational Linguistics 28:3.
Macleod, Catherine, Ralph Grishman, Adam Meyers, Leslie Barrett, and Ruth Reeves. Nomlex: A lexicon of nominalizations. In Proceedings of the 8th International Congress of the European Association for Lexicography, pages 187–193, Liège, Belgium.
Meyers, A., et al. 1998. Using NOMLEX to produce nominalization patterns for information extraction. In Proceedings of the COLING-ACL Workshop on Computational Treatment of Nominals.
Quirk, R., S. Greenbaum, G. Leech, and J. Svartvik. 1985. A Comprehensive Grammar of the English Language. Longman, Harlow.
Terada, Akira and Takenobu Tokunaga. Corpus based method of transforming nominalized phrases into clauses for text mining application. IEICE Transactions on Information and Systems, Vol. E86-D, No. 9.

25 Thank you!

26 Selectional restrictions data
CSLI resource:
  – nouns: 4447
      semantic types (ontology)
  – verbs: 4858
      subcategorizations
      selectional restrictions
  – noun-verb: 5700 V (9415 N)
      noun-verb pairs

27 Grammatical Transfer
NOMLEX        XLE                                         Example
DET-POSS      POSS                                        Rome's destruction
PP            ADJUNCT, PP-OBJ (POS=NOUN)                  destruction of Carthage
TO-INF        XCOMP                                       the desire to leave
AS-NP-PHRASE  ADJUNCT, PP-OBJ (as, POS=NOUN)              his resignation as chairman
N-N-MOD       MOD                                         the room reservation
P-ING         ADJUNCT, PP-OBJ (POS=VERB)                  the accusation against launching
ING           ADJUNCT, QA_PROG(+)                         my appreciation being there
FOR-TO-INF    ADJUNCT, SUBJ                               the wish for him to go
ADVP          ADJUNCT (POS=ADV)                           his departure abroad
AS-ING        ADJUNCT, PP-OBJ (as, POS=VERB), QA_PROG(+)  characterization as being
AS-ADJP       ADJUNCT, PP-OBJ (as, POS=ADJ)               the characterization as useful
P-POSSING     ADJUNCT, PP-OBJ (POS=VERB), POSS            the acceptance of his talking
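Three rows of this table (DET-POSS, N-N-MOD and PP) are enough to turn parser triples into NOMLEX positions for the Oracle example. The Python sketch below covers only those rows; the triple layout `(relation, head, dependent)`, the function name and the "PP:of" encoding are assumptions, and the part-of-speech conditions from the table are not checked.

```python
def to_nomlex_positions(noun, triples):
    """Convert XLE-style dependency triples into (filler, NOMLEX position) pairs."""
    # Object of each preposition, e.g. PP-OBJ(of, PeopleSoft) -> {"of": "PeopleSoft"}.
    pp_objects = {head: dep for rel, head, dep in triples if rel == "PP-OBJ"}
    arguments = []
    for rel, head, dep in triples:
        if head != noun:
            continue
        if rel == "POSS":
            arguments.append((dep, "DET-POSS"))
        elif rel == "MOD":
            arguments.append((dep, "N-N-MOD"))
        elif rel == "ADJUNCT" and dep in pp_objects:
            # The adjunct is a preposition; its object is the actual argument.
            arguments.append((pp_objects[dep], "PP:" + dep))
    return arguments

triples = [
    ("POSS", "acquisition", "Oracle"),
    ("ADJUNCT", "acquisition", "of"),
    ("PP-OBJ", "of", "PeopleSoft"),
]
print(to_nomlex_positions("acquisition", triples))
```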

28 FrameNet
aim: word – semantico-syntactic mapping
semantic roles: frame elements (frame-specific)
BNC corpus (100M words); American English – LDC, ANC
more than 600 frames, about words
Example: accusation
frame: Judgment_communication
FE (for this word) and their realization:
  communicator, evaluee, reason
  not expressed (27/48); possessive determiner (6/48); PP (from) (2/48); …
  not expressed (40/48); PP (against) (5/48); PP (about) (3/48); …
  PP (of) (9/48); S (that) (9/48); not expressed (8/48); …
  PP (about) (3/48); …

29 NOMLEX constraints (4)
restrictions on possible combinations
  – specified in NOMLEX entry
adaptation
  :NOT ((AND :SUBJECT ((DET-POSS) (N-N-MOD))
             :OBJECT ((N-N-MOD))))
*plants' weather adaptation
plants’ adaptation to weather
Note: Not implemented (cannot decide which assignment to remove).

30 Denominalizer UI
parse, triples, output
com.fxpal.sake.test.DenominalizerTest

