SALSA: The Saarbrücken Lexical Semantics Annotation & Acquisition Project
Aljoscha Burchardt, Katrin Erk, Anette Frank, Andrea Kowalski, Sebastian Padó, Manfred Pinkal

Semantic Annotation in SALSA
- Manual semantic annotation of 0.8 million words of syntactically annotated German newspaper text (TIGER Corpus, Releases 1 and 2)
- Annotation with frames and frame elements, staying as close as possible to the Berkeley FrameNet database
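As an illustration of what a single frame annotation carries, the sketch below models one annotated predicate in Python. It is only a simplified picture: it assumes token indices as anchors, whereas SALSA annotates on top of the TIGER syntax trees. The class names, the example sentence, and the frame and role labels are hypothetical choices made for this example and are not taken from the SALSA release.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class FrameElement:
    name: str          # role label (illustrative)
    tokens: List[int]  # indices of the tokens that realize the role


@dataclass
class FrameInstance:
    frame: str         # frame label (illustrative)
    target: List[int]  # indices of the frame-evoking predicate tokens
    elements: List[FrameElement] = field(default_factory=list)

    def role_text(self, sentence: List[str], name: str) -> str:
        """Return the surface string of the named role, or "" if it is absent."""
        for fe in self.elements:
            if fe.name == name:
                return " ".join(sentence[i] for i in fe.tokens)
        return ""


# Hypothetical example sentence: "The minister announced new reforms."
sentence = ["Der", "Minister", "kündigte", "neue", "Reformen", "an", "."]

annotation = FrameInstance(
    frame="Statement",
    target=[2, 5],  # the separable verb "kündigte ... an" is discontinuous
    elements=[
        FrameElement("Speaker", [0, 1]),
        FrameElement("Message", [3, 4]),
    ],
)

print(annotation.role_text(sentence, "Speaker"))
print(annotation.role_text(sentence, "Message"))
```

Keeping the target as a list of indices rather than a single position is a design choice that lets the same structure cover discontinuous predicates such as German separable verbs.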

SALSA: What's special?
- SALSA is about German
- Cross-lingual divergences?

Cross-lingual Divergences
- Convincing cross-lingual portability results (E → D) in general
- Adaptation necessary because of:
  - inappropriate granularity of distinctions between frame elements (FEs)
  - missing FEs
  - (rare cases of) inappropriate granularity of frames

SALSA: What's special?
- SALSA is about German
- Cross-lingual divergences?
- Corpus-driven lexicon development through exhaustive full-text annotation
- Difficult cases
- Incompleteness of Berkeley FrameNet

Difficult cases
- Metaphors
- Support verb constructions
- Idioms

Difficult phenomena: Some Figures
                     Sample of 246 lemmas    Sub-corpus nehmen ("to take")
                     Number        %         Number        %
Standard readings      4638    85.7%             42    17.4%
Metaphor                369     6.8%             38    15.8%
Support                 326     6.0%            132    54.8%
Idiom                    79     1.5%             29    12.0%
Non-literal use         774    14.3%            199    82.6%
Total                  5412   100.0%            241   100.0%
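The percentages on this slide can be re-derived from the raw counts. The short Python check below is a sketch added for illustration: the function names are hypothetical, and the total for the 246-lemma sample is computed here as the sum of the four reading types rather than read off the slide.

```python
def shares(counts):
    """Return each count as a percentage of the column total, rounded to one decimal."""
    total = sum(counts.values())
    return {label: round(100 * n / total, 1) for label, n in counts.items()}


def non_literal(counts):
    """Sum every reading type except the standard (literal) readings."""
    return sum(n for label, n in counts.items() if label != "Standard readings")


# Instance counts per reading type, as given on the slide.
sample = {"Standard readings": 4638, "Metaphor": 369, "Support": 326, "Idiom": 79}
nehmen = {"Standard readings": 42, "Metaphor": 38, "Support": 132, "Idiom": 29}

for name, counts in (("sample", sample), ("nehmen", nehmen)):
    total = sum(counts.values())
    print(name, total, shares(counts), non_literal(counts))
```

The contrast between the two columns is the point of the slide: non-literal uses are a minority across the sample but the large majority for nehmen, where support verb constructions alone account for more than half of the instances.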

SALSA corpus: Release I
- Total size of annotated instances
- Consistent annotation through different verification steps
- All occurrences/readings of > 400 German verbal predicates (different frequency bands)
- Scheduled for Summer 2006

The SALTO Annotation Tool

SALSA II: Automatic Annotation and Acquisition
- Fred, Rosy, and Shalmaneser: a tool chain for shallow semantic analysis → Talk by Katrin and Sebastian

SALSA II: Automation
- Fred, Rosy, and Shalmaneser: a tool chain for shallow semantic analysis → Talk by Katrin and Sebastian
- The Detour System (through WordNet to FrameNet) → Talk by Anette and Al

Fred & Rosy Fred, Detour & Rosy

SALSA II: Automation
- Fred, Rosy, and Shalmaneser: a tool chain for shallow semantic analysis → Talk by Katrin and Sebastian
- The Detour System (through WordNet to FrameNet) → Talk by Anette and Al
- Cross-lingual projection of frame-semantic information → Katrin and Sebastian

Cross-lingual Projection
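A common way to project frame-semantic annotation across languages is to carry each annotated span over a word alignment. The Python sketch below illustrates that general idea only and is not the SALSA implementation: the function name, the sentence pair, the role labels, and the hand-written alignment are all hypothetical.

```python
from typing import Dict, List, Set, Tuple


def project(span: List[int], alignment: Set[Tuple[int, int]]) -> List[int]:
    """Map source token indices to the sorted target indices aligned to them."""
    return sorted({t for s, t in alignment if s in span})


english = ["The", "minister", "announced", "new", "reforms", "."]
german = ["Der", "Minister", "kündigte", "neue", "Reformen", "an", "."]

# Hand-written (source index, target index) pairs; a real system would
# obtain these from an automatic word aligner.
alignment = {(0, 0), (1, 1), (2, 2), (2, 5), (3, 3), (4, 4), (5, 6)}

# Annotation on the English side; labels are illustrative.
source_roles: Dict[str, List[int]] = {
    "TARGET": [2],
    "Speaker": [0, 1],
    "Message": [3, 4],
}

projected = {role: project(span, alignment) for role, span in source_roles.items()}

for role, indices in projected.items():
    print(role, [german[i] for i in indices])
```

In this toy case the single English predicate token maps to two non-adjacent German tokens, which shows why a projected span cannot be assumed to be contiguous.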

SALSA II: Automation & Application
- Fred, Rosy, and Shalmaneser: a tool chain for shallow semantic analysis → Talk by Katrin and Sebastian
- The Detour System (through WordNet to FrameNet) → Talk by Anette and Al
- Cross-lingual projection of frame-semantic information → Katrin and Sebastian
- Textual Entailment (RTE) → Anette and Al

t: In 1983, Aki Kaurismäki directed his first full-time feature.
h: Aki Kaurismäki directed a film.

t: In 1983, Aki Kaurismäki directed his first full-time feature.
h: Aki Kaurismäki directed a film.
- WordNet related
- Grammatically related
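One way to read this example is that h is covered by t when both evoke the same frame and every role filler in h is matched by an identical or lexically related filler in t. The Python sketch below is a toy illustration of that reading, not the system presented in the talk: the frame and role labels are placeholders rather than FrameNet inventory, and the small `HYPERNYMS` table is a hand-written stand-in for a WordNet lookup.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass
class Frame:
    name: str              # frame label (placeholder)
    roles: Dict[str, str]  # role label -> head word of the filler


# Hand-written stand-in for a WordNet hypernym lookup (assumption).
HYPERNYMS = {"feature": {"film"}}


def lexically_covers(t_word: str, h_word: str) -> bool:
    """True if the words are identical or h_word is a listed hypernym of t_word."""
    return t_word == h_word or h_word in HYPERNYMS.get(t_word, set())


def frame_entails(t: Frame, h: Frame) -> bool:
    """True if h's frame matches t's and every role of h is covered by t."""
    if t.name != h.name:
        return False
    return all(
        role in t.roles and lexically_covers(t.roles[role], filler)
        for role, filler in h.roles.items()
    )


t = Frame("Directing", {"Director": "Aki Kaurismäki", "Work": "feature", "Time": "1983"})
h = Frame("Directing", {"Director": "Aki Kaurismäki", "Work": "film"})

print(frame_entails(t, h))
print(frame_entails(h, t))
```

The check is deliberately asymmetric: t may contain extra roles such as the year, but h may not add information that t lacks.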

SALSA: Future Work
- Bootstrapping frame information by data expansion techniques
- Linking lexical semantic resources with upper-model ontologies
- Analysis of non-compositional phenomena
- A worked-out semantic lexicon
- Application to textual entailment