Slide 1: Putting Meaning Into Your Trees
Martha Palmer, with Paul Kingsbury, Olga Babko-Malaya, Scott Cotton, Nianwen Xue, Shijong Ryu, Ben Snyder
PropBanks I and II site visit, University of Pennsylvania, October 30, 2003

Slide 2: Proposition Bank: From Sentences to Propositions
All of the following surface forms express the same proposition, meet(Powell, Zhu Rongji):
- Powell met Zhu Rongji
- Powell met with Zhu Rongji
- Powell and Zhu Rongji met
- Powell and Zhu Rongji had a meeting
- ...
A longer sentence yields nested propositions:
  When Powell met Zhu Rongji on Thursday they discussed the return of the spy plane.
  meet(Powell, Zhu)
  discuss([Powell, Zhu], return(X, plane))
Verbs such as debate, consult, join, wrestle, and battle share the same two-participant frame pattern as meet(Somebody1, Somebody2).

Slide 3: Capturing semantic roles*
- JK broke [ARG1 the LCD Projector].
- [ARG1 The windows] were broken by the hurricane.
- [ARG1 The vase] broke into pieces when it toppled over.
The broken thing carries the same ARG1 label whether it surfaces as the grammatical object or as the subject.
*See also FrameNet.
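The point of the three examples is that the role label stays constant across syntactic alternations. A minimal sketch of what such labeled spans might look like as data (the RoleSpan structure and the hand-filled labels below are illustrative only, not PropBank's release format):

    from dataclasses import dataclass

    @dataclass
    class RoleSpan:
        label: str   # e.g. "ARG0", "ARG1"
        text: str    # the surface string filling that role

    # Hand-labeled versions of the slide's sentences: the broken thing
    # is ARG1 whether it shows up as the object or as the subject.
    examples = [
        ("broke", [RoleSpan("ARG0", "JK"), RoleSpan("ARG1", "the LCD Projector")]),
        ("were broken", [RoleSpan("ARG1", "The windows"), RoleSpan("ARG0", "the hurricane")]),
        ("broke", [RoleSpan("ARG1", "The vase")]),
    ]

    for rel, spans in examples:
        print(rel, {s.label: s.text for s in spans})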

Slide 4: Outline
- Introduction
- Proposition Bank
- Starting with Treebanks
- Frames files
- Annotation process and status
- PropBank II
- Automatic labeling of semantic roles
- Chinese Proposition Bank

Slide 5: A TreeBanked Sentence
Analysts have been expecting a GM-Jaguar pact that would give the U.S. car maker an eventual 30% stake in the British company.

    (S (NP-SBJ Analysts)
       (VP have (VP been (VP expecting
          (NP (NP a GM-Jaguar pact)
              (SBAR (WHNP-1 that)
                (S (NP-SBJ *T*-1)
                   (VP would (VP give
                      (NP the U.S. car maker)
                      (NP (NP an eventual (ADJP 30 %) stake)
                          (PP-LOC in (NP the British company))))))))))))
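The bracketed Treebank string on this slide can be loaded and inspected directly with NLTK's Tree class; a small sketch, assuming NLTK is installed:

    from nltk import Tree

    parse = Tree.fromstring("""
    (S (NP-SBJ Analysts)
       (VP have (VP been (VP expecting
          (NP (NP a GM-Jaguar pact)
              (SBAR (WHNP-1 that)
                (S (NP-SBJ *T*-1)
                   (VP would (VP give
                      (NP the U.S. car maker)
                      (NP (NP an eventual (ADJP 30 %) stake)
                          (PP-LOC in (NP the British company))))))))))))
    """)

    parse.pretty_print()                                  # ASCII drawing of the tree
    print(parse.leaves())                                 # the original token sequence
    print([st.label() for st in parse.subtrees()][:6])    # first few phrase labels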

Slide 6: The same sentence, PropBanked
The argument labels are attached to constituents of the same parse tree:

    (S Arg0 (NP-SBJ Analysts)
       (VP have (VP been (VP expecting
          Arg1 (NP (NP a GM-Jaguar pact)
              (SBAR (WHNP-1 that)
                (S Arg0 (NP-SBJ *T*-1)
                   (VP would (VP give
                      Arg2 (NP the U.S. car maker)
                      Arg1 (NP (NP an eventual (ADJP 30 %) stake)
                          (PP-LOC in (NP the British company))))))))))))

The resulting propositions:
    expect(Analysts, GM-J pact)
    give(GM-J pact, US car maker, 30% stake)
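Annotations of exactly this kind ship with NLTK as a PropBank sample aligned with its Treebank sample, so they can be read programmatically. A sketch, assuming the 'propbank' and 'treebank' corpora have been downloaded via nltk.download (the instance index 103 is arbitrary):

    import nltk
    from nltk.corpus import propbank

    nltk.download("propbank", quiet=True)
    nltk.download("treebank", quiet=True)

    inst = propbank.instances()[103]          # one annotated predicate
    print(inst.fileid, inst.sentnum, inst.wordnum)
    print(inst.roleset)                       # frameset id, e.g. 'rise.01'
    print(inst.predicate)                     # tree pointer to the verb
    for argloc, argid in inst.arguments:      # (tree pointer, label) pairs
        print(argid, argloc)

    # Resolve the predicate pointer against the gold parse tree
    tree = inst.tree
    print(" ".join(inst.predicate.select(tree).leaves()))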

Slide 7: Frames File Example: expect
Roles:
  Arg0: expecter
  Arg1: thing expected
Example (transitive, active): Portfolio managers expect further declines in interest rates.
  Arg0: Portfolio managers
  REL: expect
  Arg1: further declines in interest rates
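Frames files are distributed as XML, and NLTK's PropBank reader (which bundles them) can look a roleset up by its id. A sketch, assuming the frame file for expect is present in the installed propbank corpus:

    from nltk.corpus import propbank

    expect = propbank.roleset("expect.01")    # the frames-file entry for 'expect'
    print(expect.attrib["name"])              # short gloss of the roleset
    for role in expect.findall("roles/role"):
        print("Arg%s: %s" % (role.attrib["n"], role.attrib["descr"]))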

Slide 8: Frames File Example: give
Roles:
  Arg0: giver
  Arg1: thing given
  Arg2: entity given to
Example (double object): The executives gave the chefs a standing ovation.
  Arg0: The executives
  REL: gave
  Arg2: the chefs
  Arg1: a standing ovation

Slide 9: Trends in Argument Numbering
- Arg0 = agent
- Arg1 = direct object / theme / patient
- Arg2 = indirect object / benefactive / instrument / attribute / end state
- Arg3 = start point / benefactive / instrument / attribute
- Arg4 = end point

Slide 10: Ergative/Unaccusative Verbs
Roles (no Arg0 for unaccusative verbs):
  Arg1 = logical subject, patient, thing rising
  Arg2 = EXT, amount risen
  Arg3* = start point
  Arg4 = end point
Examples:
  Sales rose 4% to $3.28 billion from $3.16 billion.
  The Nasdaq composite index added 1.01 to on paltry volume.

Slide 11: Function Tags for English/Chinese (arguments or adjuncts?)
Variety of ArgM's (Arg# > 4):
  TMP - when?
  LOC - where at?
  DIR - where to?
  MNR - how?
  PRP - why?
  TPC - topic
  PRD - this argument refers to or modifies another
  ADV - others
  CND - conditional
  DGR - degree
  FRQ - frequency
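For downstream use these functional tags are often expanded into readable glosses. A minimal convenience mapping built from the tags listed on this slide (the dictionary and helper below are illustrative, not an official table):

    # Gloss table for the ArgM function tags above (illustrative only).
    ARGM_GLOSS = {
        "TMP": "when?",
        "LOC": "where at?",
        "DIR": "where to?",
        "MNR": "how?",
        "PRP": "why?",
        "TPC": "topic",
        "PRD": "secondary predication (refers to/modifies another argument)",
        "ADV": "other adverbial",
        "CND": "conditional",
        "DGR": "degree",
        "FRQ": "frequency",
    }

    def gloss(label):
        """Expand a label such as 'ARGM-TMP' to a readable description."""
        tag = label.split("-", 1)[-1]
        return ARGM_GLOSS.get(tag, tag)

    print(gloss("ARGM-TMP"))   # -> when?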

Slide 12: Inflection
Verbs are also marked for tense/aspect:
  - Passive/Active
  - Perfect/Progressive
  - Third singular (is, has, does, was)
  - Present/Past/Future
  - Infinitives/Participles/Gerunds/Finites
Modals and negation are marked as ArgMs.
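In the released data this information is packed into a compact inflection field on each instance. A sketch of reading it through NLTK's PropBank reader (assumes the propbank corpus is installed; the instance chosen is arbitrary):

    from nltk.corpus import propbank

    inst = propbank.instances()[0]
    infl = inst.inflection
    # One-character codes for form, tense, aspect, person, and voice;
    # '-' marks an unspecified value.
    print(infl.form, infl.tense, infl.aspect, infl.person, infl.voice)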

Slide 13: Word Senses in PropBank
- The instruction to ignore word sense proved not feasible for 700+ verbs:
  - Mary left the room.
  - Mary left her daughter-in-law her pearls in her will.
- Frameset leave.01, "move away from": Arg0: entity leaving; Arg1: place left
- Frameset leave.02, "give": Arg0: giver; Arg1: thing given; Arg2: beneficiary
- How do these relate to traditional word senses as in WordNet?
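The coarse frameset distinctions can be enumerated per lemma. A sketch using NLTK's reader (assumes the frame file for 'leave' is included in the installed propbank corpus):

    from nltk.corpus import propbank

    for roleset in propbank.rolesets("leave"):
        print(roleset.attrib["id"], "-", roleset.attrib["name"])
        for role in roleset.findall("roles/role"):
            print("  Arg%s: %s" % (role.attrib["n"], role.attrib["descr"]))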

Slide 14: Overlap between Groups and Framesets – 95%
(Diagram: the WordNet senses of "develop", WN1–WN20, partitioned into Frameset1 and Frameset2.)
Palmer, Dang & Fellbaum, NLE 2004

Slide 15: Annotator accuracy – ITA 84%

Slide 16: English PropBank Status (w/ Paul Kingsbury & Scott Cotton)
- Create Frame File for each verb - DONE (3282 lemmas, framesets)
- First pass: automatic tagging (Joseph Rosenzweig)
- Second pass: double-blind hand correction (118K predicates – all but 300 done)
- Third pass: Solomonization (adjudication) – Betsy Klipple, Olga Babko-Malaya – 400 left
- Frameset tags: 700+, double-blind, almost adjudicated, 92% ITA
- Quality control and general cleanup

Slide 17: Quality Control and General Cleanup
- Frame File consistency checking
- Coordination with NYU: ensuring compatibility of frames and format
- Leftover tasks: have, be, become; adjectival usages
- General cleanup
- Tense tagging
- Finalizing treatment of split arguments (e.g., say) and symmetric arguments (e.g., match)
- Supplementing sparse data with the Brown corpus for selected verbs

Slide 18: Summary of English PropBank (Paul Kingsbury, Olga Babko-Malaya, Scott Cotton)
Columns: Genre, Words, Frames Files, Frameset Tags, Released.
- Wall Street Journal* (financial subcorpus): 300K; released July 2002
- Wall Street Journal* (Penn TreeBank II): 1000K; released Dec. 2003? (March 2003)
- English Translation of Chinese TreeBank (* ITIC funding): 100K; <1500; released July 2004
- Sinorama, English corpus (NSF-ITR funding): 150K; <2000; released July 2005
- English half of DLI Military Corpus (ARL funding): 50K; <1000; released July 2005

Slide 19: PropBank II
- Nominalizations (NYU)
  - Lexical frames DONE
- Event variables (including temporals and locatives)
- More fine-grained sense tagging
  - Tagging nominalizations with WordNet senses
  - Selected verbs and nouns
- Nominal coreference (not names)
- Clausal discourse connectives – selected subset

Slide 20: PropBank I
Also, [Arg0 substantially lower Dutch corporate tax rates] helped [Arg1 [Arg0 the company] keep [Arg1 its tax outlay] [Arg3-PRD flat] [ArgM-ADV relative to earnings growth]].

    REL: help   Arg0: tax rates     Arg1: the company keep its tax outlay flat
    REL: keep   Arg0: the company   Arg1: its tax outlay   Arg3-PRD: flat   ArgM-ADV: relative to earnings growth

PropBank II layers illustrated on this example: event variables (ID# h23, k16); nominal reference; sense tags (help2,5, tax rate1, keep1, company1); discourse connectives.

Slide 21: Summary of Multilingual TreeBanks, PropBanks (Parallel Corpora)
Columns: Text, Treebank, PropBank I, PropBank II.
- Chinese Treebank: Text: Chinese 500K, English 400K; Treebank: Chinese 500K, English 100K; PropBank I: Chinese 500K, English 350K*; PropBank II: Chinese 100K, English 100K
- Arabic Treebank: Text: Arabic 500K, English 500K; Treebank: Arabic 500K, English 100K
- Korean Treebank: Text: Korean 180K, English 50K; Treebank: Korean 180K, English 50K; PropBank I: Korean 100K+, English 50K
* Also 1M word English monolingual PropBank

Slide 22: Agenda
- PropBank I, 10:30 – 10:50
  - Automatic labeling of semantic roles
  - Chinese Proposition Bank
- Proposition Bank II, 10:50 – 11:30
  - Event variables – Olga Babko-Malaya
  - Sense tagging – Hoa Dang
  - Nominal coreference – Edward Loper
  - Discourse tagging – Aravind Joshi
- Research Areas, 11:30 – 12:00
  - Moving forward – Mitch Marcus
  - Alignment improvement via dependency structures – Yuan Ding
  - Employing syntactic features in MT – Libin Shen
- Lunch, 12:00 – 1:30, White Dog
- Research Area, 1:30 – 1:45
  - Clustering – Paul Kingsbury
- DOD Program presentation, 1:45 – 2:15
- Discussion, 2:15 – 3:00