Modelling Human Thematic Fit Judgments IGK Colloquium 3/2/2005 Ulrike Padó.

Overview
- (Very) quick introduction to my framework
- Testing the Semantic Module
  - Different input corpora
  - Smoothing
- Comparing the Semantic Module to standard selectional preference methods

Modelling Semantic Processing
General idea: build a probabilistic, large-scale, broad-coverage model of syntactic and semantic sentence processing.

Semantic Processing
Assign thematic roles on the basis of co-occurrence statistics from semantically annotated corpora.
Corpus-based frequency estimates of:
- Semantic subcategorisation (probability of seeing the role with the verb)
- Selectional preferences (probability of seeing the argument head in a role, given the verb frame)
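
As a sketch of these two kinds of frequency estimate, the toy below derives both probabilities from (verb, role, head) counts and multiplies them into a single fit score. The counts, function names, and the multiplicative combination are illustrative assumptions, not the actual model:

```python
from collections import Counter

# Toy annotated corpus: (verb, role, argument head) triples,
# standing in for counts extracted from PropBank/FrameNet.
triples = [
    ("eat", "Agent", "man"), ("eat", "Patient", "apple"),
    ("eat", "Patient", "bread"), ("eat", "Agent", "child"),
    ("eat", "Patient", "apple"),
]

verb_role = Counter((v, r) for v, r, h in triples)
verb_role_head = Counter(triples)
verb = Counter(v for v, r, h in triples)

def p_role_given_verb(role, v):
    # Semantic subcategorisation: P(role | verb)
    return verb_role[(v, role)] / verb[v]

def p_head_given_role_verb(head, role, v):
    # Selectional preference: P(head | role, verb)
    return verb_role_head[(v, role, head)] / verb_role[(v, role)]

def thematic_fit(v, role, head):
    # One simple way to combine the two estimates into a fit score.
    return p_role_given_verb(role, v) * p_head_given_role_verb(head, role, v)

print(thematic_fit("eat", "Patient", "apple"))  # (3/5) * (2/3) ≈ 0.4
```

With real corpora the raw maximum-likelihood estimates above are exactly what goes sparse, which is what the smoothing discussion addresses.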

Testing the Semantic Module
Evaluate just the thematic fit of verbs and argument phrases.
Evaluation:
1. Correlate predictions with human judgments
2. Role labelling (prefer the correct role)
Try:
- Different input corpora
- Smoothing
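
The first evaluation step, correlating model scores with human plausibility ratings, is typically done with a rank correlation such as Spearman's ρ. A self-contained pure-Python sketch; the model scores and ratings are invented:

```python
def rank(values):
    # Assign 1-based ranks, averaging over ties.
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # average rank of the tied group
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def spearman_rho(xs, ys):
    # Spearman's rho = Pearson correlation computed on the ranks.
    rx, ry = rank(xs), rank(ys)
    n = len(xs)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = sum((a - mx) ** 2 for a in rx) ** 0.5
    sy = sum((b - my) ** 2 for b in ry) ** 0.5
    return cov / (sx * sy)

model = [0.40, 0.10, 0.55, 0.05]   # model thematic-fit scores
human = [6.2, 2.1, 6.8, 1.5]       # human plausibility ratings
print(spearman_rho(model, human))  # ≈ 1.0: identical orderings
```

Using ranks rather than raw values matters here: model probabilities and human rating scales are not directly comparable, but their orderings are.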

Training Data
Frequency counts from:
- PropBank (ca. … verb types)
  - Very specific domain
  - Relatively flat, syntax-based annotation
- FrameNet (ca. … verb types)
  - Deep semantic annotation: frames code situations and group verbs that describe similar events, together with their arguments
  - Extracted from a balanced corpus
  - Skewed sample through frame-wise annotation

Development/Test Data
Development: 60 verb-argument pairs from McRae et al. (1998)
- Two judgments for each data point: Agent/Patient
- Used to determine the optimal parameters of clustering (number of clusters, smoothing)
Test: 50 verb-argument pairs, 100 data points

Sparse Data
Raw frequencies are sparse:
- 1 (Dev) / 2 (Test) pairs seen in PropBank
- 0 (Dev) / 2 (Test) pairs seen in FrameNet
Use semantic classes as a level of abstraction: class-based smoothing.

Smoothing
Reconstruct probabilities for unseen data.
Smoothing by verb and noun classes:
- Count class members instead of word tokens
Compare two alternatives:
- Hand-constructed classes
- Induced verb classes (clustering)
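
The "count class members instead of word tokens" idea can be sketched as follows. The noun classes and counts are invented, and distributing the pooled class count uniformly over class members is one simple choice among several:

```python
from collections import Counter

# Hypothetical noun classes (in the talk: WordNet synsets or induced clusters).
noun_class = {"apple": "FOOD", "bread": "FOOD", "cake": "FOOD",
              "man": "PERSON", "child": "PERSON"}

# Observed (verb, role, head) counts; ("eat", "Patient", "cake") is unseen.
seen = Counter({("eat", "Patient", "apple"): 2, ("eat", "Patient", "bread"): 1})

def class_count(verb, role, cls):
    # Pool the counts of every head that belongs to the class.
    return sum(n for (v, r, h), n in seen.items()
               if v == verb and r == role and noun_class.get(h) == cls)

def smoothed_p(head, verb, role):
    # P(head | role, verb) estimated via the head's class:
    # P(class | role, verb) times a uniform P(head | class).
    cls = noun_class[head]
    members = sum(1 for c in noun_class.values() if c == cls)
    total = sum(n for (v, r, _), n in seen.items() if v == verb and r == role)
    return class_count(verb, role, cls) / total / members

print(smoothed_p("cake", "eat", "Patient"))  # ≈ 0.33: nonzero although unseen
```

The unseen pair ("eat", "cake") inherits probability mass from its FOOD classmates, which is exactly what raw frequencies cannot do.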

Hand-constructed Verb and Noun Classes
- WordNet: use the top-level ontology and synsets as noun classes
- VerbNet: use the top-level classes for verbs
- Presumably correct and reliable
Result: no significant correlations with human data for either training corpus.

Induced Verb Classes
Automatically cluster verbs:
- Group by similarities of argument heads, paths from argument to verb, frame, and role labels
- Determine the optimal number of clusters and the parameters of the clustering algorithm on the development set

Induced Classes (PB/FN)

                 PB covered   PB ρ / sig   FN covered   FN ρ / sig
Raw data         2            -/-          2            -/-
All arguments    59           ns           12           ρ=0.55, p<0.05
Just NPs         48           ns           16           ρ=0.56, p<0.05

Results
- Hand-built classes do not work (with this amount of data)
- The module achieves reliable correlations with the FN data
  - An important result for the overall feasibility of my model

Adding Noun Classes (PB/FN)

                                Covered   ρ / Significance
Raw data (PB / FN)              2 / 2     -/-
PB, all args, noun classes      4         ρ=1, p<0.01
FN, just NPs, noun classes      18        ρ=0.63, p<0.01

Results
- Hand-built classes do not work (with this amount of data)
- The module achieves reliable correlations with the FN data
- Adding noun classes helps a little more

Comparison with Selectional Preference Methods
We have established that our system reliably predicts human data. How does it compare to standard computational linguistics methods?

Selectional Preference Methods
- Clark & Weir (2002): add data points by finding the topmost class in WordNet that still reliably mirrors the target word's frequency
- Resnik (1996): quantify the contribution of a WordNet class to the overall preference strength of the verb
Both rely on WordNet noun classes, with no verb-class smoothing.
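
Resnik's measure can be sketched directly from its definition: a verb's preference strength is the KL divergence of P(class | verb) from the prior P(class), and each class's selectional association is its share of that strength. The class names and probabilities below are invented for illustration:

```python
import math

# Toy distributions over WordNet-style noun classes; numbers are invented.
p_class = {"FOOD": 0.3, "PERSON": 0.5, "ARTIFACT": 0.2}            # prior P(c)
p_class_given_verb = {"FOOD": 0.8, "PERSON": 0.1, "ARTIFACT": 0.1}  # P(c | eat)

def preference_strength(p_cv, p_c):
    # Resnik's S(v): KL divergence of P(c | v) from the prior P(c).
    return sum(p * math.log(p / p_c[c]) for c, p in p_cv.items() if p > 0)

def selectional_association(c, p_cv, p_c):
    # Class c's contribution to the verb's overall preference strength.
    return p_cv[c] * math.log(p_cv[c] / p_c[c]) / preference_strength(p_cv, p_c)

print(selectional_association("FOOD", p_class_given_verb, p_class))
```

A verb that takes arguments in proportion to the prior has preference strength near zero; "eat", which strongly prefers FOOD over the prior, concentrates its association mass on that class.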

Selectional Preference Methods (PB/FN)

                    Covered   ρ / Significance   Labelling (Cov/Acc)
Sem. Module 1       18        ρ=0.63, p<0.01     …%/47.4%
Sem. Module 2       16        ρ=0.56, p<0.05     …%/60%
Clark & Weir (PB)   72        ns                 84%/50%
Clark & Weir (FN)   23        ns                 36%/50%
Resnik (PB)         75        ns                 74%/48.6%
Resnik (FN)         46        ns                 50%/48%

Results
Too little input data:
- No results for the selectional preference models
- Small coverage for the Semantic Module
The Semantic Module manages to make predictions all the same:
- It relies on verb clusters, and verbs are less sparse than nouns in small corpora
Next step: annotate a larger corpus with FN roles.

Annotating the BNC
Annotate a large, balanced corpus: the BNC
- More data points for verbs covered in FN
- More verb coverage (though purely syntactic annotation for unknown verbs)
Results:
- Annotation is relatively sensible and reliable for non-FN verbs
- Frame-wise annotation in FN causes problems for FN verbs