Outline
P1EDA's simple features currently implemented
–And their ablation test
Features we have reviewed from the literature
–(Let's briefly visit them)
–Iftene's
–MacCartney et al. (Stanford system)
–BIUTEE gap-mode features
Discussion: what we want to (re-)implement and bring back into EOP
–As aligners,
–or as features.

Current features for mk.1
Basic idea: simple features first.
Word coverage ratio
–How much of H's components (here, tokens) is covered by the components of T?
–"base alignment score"
Content word coverage ratio
–Content words are more important than non-content words (prepositions, articles, etc.).
–"Penalize if content words are missed." (Both ratios are sketched below.)
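A minimal sketch of the two ratios, assuming tokens carry a POS tag and that the aligners expose some aligned(h, t_tokens) predicate; the tag set and the interface are illustrative, not the actual EOP API:

STOPWORD_POS = {"DET", "ADP", "CCONJ", "PART", "PUNCT"}  # non-content POS tags (assumed tag set)

def coverage_features(h_tokens, t_tokens, aligned):
    # aligned(h, t_tokens) -> True if some aligner links h to a token of T (assumed interface)
    covered = [h for h in h_tokens if aligned(h, t_tokens)]
    word_coverage = len(covered) / len(h_tokens)
    # content words only: penalize missed content words more directly
    content = [h for h in h_tokens if h.pos not in STOPWORD_POS]
    covered_content = [c for c in content if aligned(c, t_tokens)]
    content_coverage = len(covered_content) / len(content) if content else 1.0
    return word_coverage, content_coverage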

Current features for mk.1
Proper noun coverage ratio
–Proper nouns (or named entities) are quite specific. Missing (unaligned) PNs should be penalized severely.
–Iftene's rules on NEs: named-entity drops always mean non-entailment; the only exception is the dropping of a first name.
Verb coverage ratio
–The two most effective features of an alignment-based system (Stanford) were:
–Is the main predicate of the Hypothesis covered?
–Are the arguments of that predicate covered? (Both checks are sketched below.)
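A sketch of the two Stanford-style checks over a dependency parse of H; the Token fields and the dependency labels are assumptions, not a fixed EOP schema:

from collections import namedtuple

# head: index of the governing token (-1 for the root); deprel: dependency label (assumed schema)
Token = namedtuple("Token", "lemma pos head deprel")

def predicate_features(h_tokens, aligned_lemmas):
    # the main predicate of H is taken to be the root of its dependency tree
    root_idx = next(i for i, t in enumerate(h_tokens) if t.head == -1)
    predicate_covered = h_tokens[root_idx].lemma in aligned_lemmas
    # direct arguments of the main predicate
    args = [t for t in h_tokens
            if t.head == root_idx and t.deprel in ("nsubj", "dobj", "iobj")]
    arguments_covered = all(a.lemma in aligned_lemmas for a in args)
    return predicate_covered, arguments_covered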

Current results (with optimal settings on mk.1 features and aligners)
English: 67.0 % (accuracy)
–Aligners: identical.lemma, WordNet, VerbOcean, Meteor paraphrase
–Features: word, content word, PN coverage
Italian: % (accuracy)
–Aligners: identical.lemma, Italian WordNet
–Features: word, content word, verb coverage
German: 64.5 % (accuracy)
–Aligners: identical.lemma, GermaNet
–Features: word, content word, PN coverage

Ablation test: impact of features (accuracy (impact))

                  | ALL features (not nec. best) | w/o Verb Coverage | w/o PN Coverage | w/o Content word Coverage
EN (WN, VO, Para) |                              | (-0.25)           | 66.0 (+0.75)    | (+1.625)
IT (WN, Para)     |                              | (+0.625)          | (-0.25)         | (+2.5)
DE (GN)           |                              | ( )               | (+1.125)        | 63.0 (+1.875)

Ablation test: impact of aligners (with the best features of the previous slide)
EN (67.0 with all of the following + base)
–without WordNet: (+1.875)
–without VerbOcean: (+0.25)
–without Paraphrase (Meteor): (+2.125)
IT ( with the following + base)
–without WordNet (IT): (+0.125)
–without Paraphrase (Vivi's): (-0.5)
DE (62.25 with the following + base)
–without GermaNet: (+0.125)
–without Paraphrase (Meteor): 64.5 (-2.25)

FEATURES IN LITERATURE (PREVIOUS RTE SYSTEMS)

Iftene's RTE system
Approach: alignment score and threshold (sketched below)
–The alignment has two parts: positive contribution parts and negative contribution parts.
–A (manually designed) score function combines the various scores into one final, global alignment score.
–A threshold is learned to separate "entailment" (better than the threshold) from "non-entailment" (everything else).
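A minimal sketch of the scheme; the weighted-sum combination is illustrative, Iftene's actual score function is hand-designed and more elaborate:

def global_score(positive_scores, negative_scores, pos_weights, neg_weights):
    # positive contribution parts add to the score, negative parts subtract from it
    gain = sum(w * s for w, s in zip(pos_weights, positive_scores))
    penalty = sum(w * s for w, s in zip(neg_weights, negative_scores))
    return gain - penalty

def decide(score, threshold):
    # the threshold itself is learned on training data
    return "ENTAILMENT" if score > threshold else "NON-ENTAILMENT"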

Iftene's RTE system
Base unit of alignment: node-edge-node triples of the dependency tree
–(Hypothesis) node – edge – node.
–Text node-edge-node triples are compared against them with extended (partial) matching.
–The alignment score forms the baseline score; WordNet and other resources are used on those matches.
Additional scores are designed to reflect various good / bad matches.

Iftene's RTE system, features
Numerical compatibility rule (positive rule)
–Numbers and quantities are normally not mapped by lexical resources + local alignment:
"at least 80 percent" -> "more than 70 percent"
"killed 109 people on board and four workers" -> "killed 113 people"
–A special calculator was used to compute the compatibility of the numeric expressions.
–Reported some impact (1%+) on accuracy.
–Our choice: a possible aligner candidate? (see the interval sketch below)
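One way to realize such a calculator is to map each expression onto the interval of values it allows and test containment. A toy sketch (the phrase grammar is an assumption; open and closed bounds are conflated, and summation cases like "109 ... and four" -> "113" are not handled):

import re

def to_interval(expr):
    # the set of values a numeric expression allows (toy grammar)
    n = float(re.search(r"\d+(?:\.\d+)?", expr).group())
    if "at least" in expr or "more than" in expr:
        return (n, float("inf"))
    if "at most" in expr or "less than" in expr:
        return (float("-inf"), n)
    return (n, n)  # bare number: exact value

def numerically_compatible(t_expr, h_expr):
    # T's expression entails H's if every value T allows is also allowed by H
    t_lo, t_hi = to_interval(t_expr)
    h_lo, h_hi = to_interval(h_expr)
    return h_lo <= t_lo and t_hi <= h_hi

assert numerically_compatible("at least 80 percent", "more than 70 percent")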

Iftene's RTE system, features
Negation rules
–A truth value is annotated on every verb.
–Traverse the dependency tree and check for the presence of "not", "never", "may", "might", "cannot", "could", etc. (sketched below)
Particle rules
–The particle "to" gets special checking: it is strongly influenced by the active verb, adverb, or noun before it.
–Search for positive (believe, glad, claim) and negative (failed, attempted) cues.
"Non-matching parts" → add a negative score.
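A sketch of the negation check over a dependency parse; the cue list is the one from the slide, the children structure is an assumed parse representation:

NEGATION_CUES = {"not", "never", "may", "might", "cannot", "could"}

def verb_truth(tokens, children):
    # tokens: list of (lemma, pos) pairs; children[i]: indices of the dependents of token i
    truth = {}
    for i, (lemma, pos) in enumerate(tokens):
        if pos.startswith("V"):
            dependents = {tokens[j][0] for j in children[i]}
            truth[i] = "uncertain" if dependents & NEGATION_CUES else "positive"
    return truth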

Iftene's RTE system, features
Named entity rule
–If an NE in the Hypothesis is not mapped:
–outright rejection as non-entailment.
–Exception: for a human name, dropping (not aligning) the first name is okay.
–Our choice? An NER aligner would be nice. (The poor man's NE coverage check == the current proper noun coverage feature; the rule itself is sketched below.)
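The rule as a sketch; the entity representation and the is_aligned interface are assumptions:

def named_entity_rule(h_entities, is_aligned):
    # h_entities: list of (label, tokens) found in the Hypothesis; is_aligned(token) -> bool
    for label, tokens in h_entities:
        if label == "PERSON" and len(tokens) > 1:
            required = tokens[1:]  # exception: the first name may be dropped
        else:
            required = tokens
        if not all(is_aligned(tok) for tok in required):
            return "NON-ENTAILMENT"  # any other unmapped NE: outright rejection
    return None  # rule does not fire; leave the decision to the other features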

Stanford TE system
Stanford TE system (MacCartney et al.)
–1) Do monolingual alignment: trained on gold (manually prepared) alignments.
–2) Get an alignment score: no negative elements in this alignment step.
–3) Apply feature extraction: design features that reflect various linguistic phenomena.

Stanford TE system, polarity features
Polarity features
–The polarity of T and H is checked via the presence of negative linguistic markers: negation (not), downward-monotone markers (no, few), restricting prepositions (without, except).
–Features on polarity: polarity of T, polarity of H, and are the two polarities of T-H the same?
Our choice?
–TruthTeller would be better.
–On the other hand, simple word-based approaches might be useful for other languages (sketched below).
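The simple word-based variant, as a minimal sketch (marker list taken from the slide):

NEGATIVE_MARKERS = {"not", "no", "few", "without", "except"}

def polarity_features(t_words, h_words):
    t_negative = any(w in NEGATIVE_MARKERS for w in t_words)
    h_negative = any(w in NEGATIVE_MARKERS for w in h_words)
    return {
        "polarity_T": t_negative,
        "polarity_H": h_negative,
        "polarity_same": t_negative == h_negative,
    }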

Stanford TE system, modality / factivity features
Modality preservation feature
–Record modal changes from T to H and generate a nominal feature (the weakening pattern is sketched below):
"could be XX" (T) -> "XX" (H) → "WEAK_NO"
"cannot YY" (T) -> "not YY" (H) → "WEAK_YES"
Factivity preservation feature
–Focus on verbs that affect "truth" or "factivity":
"tried to escape" (T) -> "escape" (H) (feature: false)
"managed to escape" (T) -> "escape" (H) (feature: true)
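A sketch of the weakening pattern only; the modal list and the reading of the labels are assumptions built on the slide's first example:

WEAKENING_MODALS = {"could", "may", "might"}

def modality_feature(t_words, h_words):
    t_modal = any(w in WEAKENING_MODALS for w in t_words)
    h_modal = any(w in WEAKENING_MODALS for w in h_words)
    if t_modal and not h_modal:
        return "WEAK_NO"   # a hedged claim in T stated as fact in H
    if not t_modal and h_modal:
        return "WEAK_YES"  # a fact in T weakened to a possibility in H
    return "NEUTRAL"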

Stanford TE system, adjunction feature
If T and H are both in a positive context:
–"A dog barked" -> "A dog barked loudly" (not safe adding)
–"A dog barked carefully" -> "A dog barked" (safe dropping)
If T and H are both in a negative context:
–"The dog did not bark" -> "The dog did not bark loudly" (safe adding)
–"The dog did not bark loudly" -> "The dog did not bark" (not safe dropping)
Features: "not safe adjunct drop detected", "not safe adjunct addition detected", … (condensed into a sketch below)
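The safety logic condensed into one function; the feature names are illustrative:

def adjunct_feature(context_positive, change):
    # change: "add" if H has an adjunct T lacks, "drop" if an adjunct of T is missing in H
    if context_positive and change == "add":
        return "NOT_SAFE_ADJUNCT_ADDITION"  # "barked" does not entail "barked loudly"
    if not context_positive and change == "drop":
        return "NOT_SAFE_ADJUNCT_DROP"      # "did not bark loudly" does not entail "did not bark"
    return None  # dropping in a positive context / adding in a negative context is safe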

Stanford TE system
Some other features …
Antonym feature
–Antonyms found in the aligned region (with WordNet).
Date/Numbers feature
–A binary feature indicating that "dates described in the T-H aligned region do not match".
Quantifier feature
–Quantifiers modifying two aligned parts "do not match".

BIUTEE gap mode
Main approach of gap mode
–Transform the Text as close as possible to the Hypothesis with reliable rules → T'
–Evaluate the T'-H pair by extracting features over it.
Two sets of features
–Lexical gap features
–Predicate-argument gap features

BIUTEE gap mode, lexical gap features
Two numerical feature values
–a score for non-predicate words,
–a score for predicate words.
Not all words are equal: missing rare terms are penalized more heavily (weight: log probability).
The sum of all missing terms' weights forms one feature value (sketched below).
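A minimal sketch; unigram_prob and is_predicate stand for resources supplied elsewhere (assumed interfaces):

import math

def lexical_gap_features(missing_words, unigram_prob, is_predicate):
    # rare missing words are penalized more heavily: weight(w) = -log p(w)
    non_pred_gap, pred_gap = 0.0, 0.0
    for w in missing_words:
        weight = -math.log(unigram_prob(w))
        if is_predicate(w):
            pred_gap += weight
        else:
            non_pred_gap += weight
    return non_pred_gap, pred_gap  # the two numeric feature values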

BIUTEE gap mode, predicate-argument gap features
Requires predicate-argument structures
–How well are the Hypothesis structures covered by those of the Text?
Degree of matching, from full match to partial match
–Full match: same head words, same governing predicate, same set of content words.
–Category I, II, III partial matches are defined.
Five numeric features represent the degree of match (counting sketched below)
–number of matched NE arguments, number of matched non-NE arguments, number of arguments in Cat I, II, III.
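Counting the five features, assuming some categorizer match_category(arg) -> "FULL" | "I" | "II" | "III" | None is available (an assumption; BIUTEE's actual categories are defined over head words, governing predicates, and content-word sets):

def pred_arg_features(h_args, match_category, is_ne):
    # h_args: argument nodes of the Hypothesis' predicate-argument structures
    counts = {"ne": 0, "non_ne": 0, "I": 0, "II": 0, "III": 0}
    for arg in h_args:
        category = match_category(arg)
        if category == "FULL":
            counts["ne" if is_ne(arg) else "non_ne"] += 1
        elif category in ("I", "II", "III"):
            counts[category] += 1
    return (counts["ne"], counts["non_ne"], counts["I"], counts["II"], counts["III"])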

Priorities?
Features that we might hope to try soon …
–Main verb of H matched? Its arguments matched?
–Weighted coverage (such as IDF) on word coverage
–Date matcher (an aligner)
–Features that use TruthTeller alignments (number of matching/non-matching predicate truth values)
–Polarity/Modality/Factivity features (cheaper than TruthTeller …)