Recognizing Textual Entailment Progress towards RTE 4 Scott Settembre University at Buffalo, SNePS Research Group

Recognizing Textual Entailment Challenge (RTE) - Overview
The task is to develop a system that determines whether the first sentence of a given pair "entails" the second
The pair of sentences is called the Text-Hypothesis pair (or T-H pair)
Participants are provided with 800 sample T-H pairs annotated with the correct entailment answers
The final test set consists of 800 non-annotated samples
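A development pair couples a Text with a Hypothesis plus an annotated answer. A minimal sketch of one such record follows; the field names are illustrative assumptions, not the official RTE distribution format:

```python
from dataclasses import dataclass

# Illustrative record for one Text-Hypothesis pair; the field names
# are assumptions, not the official RTE data format.
@dataclass
class THPair:
    pair_id: int
    task: str        # "IE", "IR", "SUM", or "QA"
    text: str
    hypothesis: str
    entailment: str  # "YES" or "NO"; annotated in the development set only

dev_pair = THPair(
    pair_id=1,
    task="IR",
    text="Catastrophic floods in Europe endanger lives and cause human "
         "tragedy as well as heavy economic losses",
    hypothesis="Flooding in Europe causes major economic losses.",
    entailment="YES",
)
```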

Development set examples
Example of a YES result
Text: "As much as 200 mm of rain have been recorded in portions of British Columbia, on the west coast of Canada, since Monday."
Hypothesis: "British Columbia is located in Canada."
Example of a NO result
Text: "Blue Mountain Lumber is a subsidiary of Malaysian forestry transnational corporation, Ernslaw One."
Hypothesis: "Blue Mountain Lumber owns Ernslaw One."

Entailment Task Types
There are 4 different entailment tasks:
–"IE" or Information Extraction
Text: "An Afghan interpreter, employed by the United States, was also wounded."
Hypothesis: "An interpreter worked for Afghanistan."
–"IR" or Information Retrieval
Text: "Catastrophic floods in Europe endanger lives and cause human tragedy as well as heavy economic losses"
Hypothesis: "Flooding in Europe causes major economic losses."

Entailment Task Types - continued
The two remaining entailment tasks are:
–"SUM" or Multi-document Summarization
Text: "Sheriff's officials said a robot could be put to use in Ventura County, where the bomb squad has responded to more than 40 calls this year."
Hypothesis: "Police use robots for bomb-handling."
–"QA" or Question Answering
Text: "Israel's Prime Minister, Ariel Sharon, visited Prague."
Hypothesis: "Ariel Sharon is the Israeli Prime Minister."

RTE Results
Our two runs submitted this year (2007) scored:
–62.62% (501 correct out of 800)
–61.00% (488 correct out of 800)
For the 3rd RTE Challenge of 2007, 62.62% ties for 12th out of 26 teams.
–Top scores were 80%, 72%, 69%, and 67%.
–Median: 61.75%
–Range: 49% to 80% (up from 75.38% last year)

RTE Results
Category breakdown was consistent with last year (last year's averages in brackets):
–QA (question answering) average was 71% [75%]
–IR (information retrieval) average was 66% [63%]
–SUM (summarization) average was 58% [61.5%]
–IE (information extraction) average was 52% [51%]
This ordering of the entailment categories was consistent across the groups as well.

Best Performers
Hickl, one of the top performers, used techniques like these:
–Lexical relationships, using WordNet *
–N-gram and word similarity *
–Anaphora resolution
–Machine learning techniques *
–Entailment corpora, more than provided by RTE
–Logical inference using background knowledge
* Also used by our submission

Best Performers
Another top performer, Tatu (72%), focused mainly on these techniques:
–Lexical relationships, using WordNet
–Anaphora resolution
–Logical inference using background knowledge
Good performances also came from LSA, Latent Semantic Analysis:
–A 67% score came from using LSA (a top-4 performer)
–Only 3 teams used LSA; the other 2 scored low (58%, 55%)

List of Techniques Used
Lexical similarity, using a dictionary/thesaurus source
–WordNet, DIRT, and the MS Office dictionary were used
N-gram and word similarity (also "bag of words")
Syntactic matching and aligning
Semantic role labeling
–FrameNet, PropBank, VerbNet
Corpus (web-based) statistics
–LSA - Latent Semantic Analysis
Machine learning classification
–ANNs (neural networks), HMMs, SVMs (support vector machines)
Anaphora resolution
Entailment corpora, background knowledge
Logical inference

Logical Inference Techniques Used
SNePS should be here!
Extended WordNet or WordNet 3.0
–Expresses word relationships as logic rules
DIRT - a paraphrase database of world knowledge
–Expresses equivalent paraphrases in terms of rules
–e.g. X kills Y → X attacks Y
–note: this rule did not contain ("and Y dies")
FrameNet
–Uses a frame to express a relationship between "objects" in a script along with other "objects", like roles, situations, and events
Specifically developed semantic inference modules
Oddly, no one used OpenCyc

New Technique for our RTE 4 Submission
Latent Semantic Analysis - LSA
LSI, a technique developed back in 1988, addressed search indexing
LSA improved upon LSI in the 1990s, applied to summarization and evaluation
Important for us because the result can be expressed as a metric or a feature vector, which fits right into the RTE Tool
Helps overcome the "poverty of the stimulus" problem by "accommodating a very large number of local co-occurrence relations simultaneously" [Landauer, Foltz, & Laham 1998]

How LSA Works
The process includes:
–Setting up a matrix of words to words or words to documents
–Performing a Singular Value Decomposition (SVD) on that matrix
–Reducing the resulting three smaller matrices by removing rows of zero coefficients
–We then reconstruct the original matrix, which essentially relates words (cells) that had not been directly related to each other initially, and redistributes the correlation between them
–Then, depending on what relationship we are trying to find, we can extract the feature vectors we wish to compare and calculate the cosine between them
Uh huh, so what does this all mean? Let's look at an oversimplified example
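The steps above can be sketched in a few lines of numpy; the truncation rank k and the word-by-document layout are free choices here, not fixed by the slides:

```python
import numpy as np

def lsa_smooth(A, k):
    """Truncate the SVD of a word/document matrix A to rank k and
    reconstruct it; the reconstruction redistributes correlation to
    cells that were not directly related in the original counts."""
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    return U[:, :k] @ np.diag(s[:k]) @ Vt[:k, :]

def cosine(u, v):
    """Cosine between two extracted feature vectors (rows or columns
    of the smoothed matrix)."""
    u, v = np.asarray(u, float), np.asarray(v, float)
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))
```

With k equal to the full rank, `lsa_smooth` reproduces the input exactly; the interesting behavior comes from choosing k smaller than the rank.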

LSA - Oversimplified Example - part 1
We have two documents: D1 is about dogs, D2 about cats
D1 contains the words "dog", "pet", "leash", "walk", "bark"
D2 contains the words "cat", "roam", "jump", "purr", "pet"
At this level, we may not know if any of these words are related, especially if we have many documents and many words
But we can see that the word "pet" is in both documents
This "may" imply that there is a relationship between some words in D1 and D2, simply because "pet" occurred in both

LSA - Oversimplified Example - part 2
If we construct a matrix, it would look like this:

      dog  pet  leash  walk  bark  cat  roam  jump  purr
D1     1    1     1     1     1     0    0     0     0
D2     0    1     0     0     0     1    1     1     1

In the "pet" column we can see there is a commonality
After applying SVD and reducing, the reconstructed matrix smooths these counts: cells that were zero pick up small nonzero values
We can see now there is some relation between the documents, no longer concentrated on just one common word
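Running this example through an actual SVD makes the smoothing concrete; here the dog/cat documents are encoded as a binary 2x9 word-count matrix and truncated to rank 1:

```python
import numpy as np

words = ["dog", "pet", "leash", "walk", "bark", "cat", "roam", "jump", "purr"]
A = np.array([[1, 1, 1, 1, 1, 0, 0, 0, 0],   # D1: the dog document
              [0, 1, 0, 0, 0, 1, 1, 1, 1]],  # D2: the cat document
             dtype=float)

# Keep only the strongest singular value, then reconstruct the matrix.
U, s, Vt = np.linalg.svd(A, full_matrices=False)
A_smoothed = s[0] * np.outer(U[:, 0], Vt[0, :])

# "cat" had count 0 in D1, but the shared word "pet" links the two
# documents, so the reconstruction gives it a nonzero weight in D1.
print(round(A_smoothed[0, words.index("cat")], 2))  # → 0.5
```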

LSA - How I Plan to Apply It
I will be creating two matrices:
–One matrix will contain data ONLY from entailed sentence pairs
–The second matrix will be for non-entailed pairs
Each matrix will contain word-to-passage comparisons:
–Each row will contain a word that has been used in a successful entailment
–Each column will contain the passage to be entailed from
–SVD is performed on each, reduced, then recombined
Now, to determine if a new pair is entailed:
–We calculate the feature vector associated with each word and each matrix
–We then calculate the cosine between each word/matrix vector pair
–Then we "combine" the cosines for each vector set (one vector set from each matrix)
–Finally we apply a linear discriminant function to classify entailment (from the RTE Tool of RTE 3)
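A hypothetical sketch of that decision step, assuming the per-word feature vectors have already been extracted from the two smoothed matrices; averaging as the "combine" step and the discriminant weights are placeholders, since the real weights would be fit by the RTE Tool's linear discriminant:

```python
import numpy as np

def cosine(u, v):
    """Cosine between an LSA word vector and a passage vector."""
    u, v = np.asarray(u, float), np.asarray(v, float)
    denom = np.linalg.norm(u) * np.linalg.norm(v)
    return float(u @ v / denom) if denom else 0.0

def entailment_decision(word_vecs_yes, word_vecs_no,
                        passage_vec_yes, passage_vec_no,
                        w=(1.0, -1.0), b=0.0):
    """Combine the cosines from the entailed-pairs matrix and the
    non-entailed matrix (here by averaging), then apply a linear
    discriminant: w[0]*yes + w[1]*no + b > 0.  The weights w and b
    are placeholder values, not trained parameters."""
    yes_score = np.mean([cosine(v, passage_vec_yes) for v in word_vecs_yes])
    no_score = np.mean([cosine(v, passage_vec_no) for v in word_vecs_no])
    return bool(w[0] * yes_score + w[1] * no_score + b > 0)
```

With the placeholder weights, a pair is labeled entailed when its hypothesis words align more closely with the entailed-pairs space than with the non-entailed one.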

LSA - Progress
Developing LSA in ACL
Using Matrix package from
Benefits of using LSA:
–No need to program rules or compare sentence structures
–Mimics performance that humans have [see ref from before]
–No need to consider all information, since correlations can be created between words even if a specific word has not been seen before
Drawbacks of using LSA:
–I need a lot more data; unsure how much (I may be able to calculate this later)
–Linear algebra is complicated for my small symbolic brain
–I'm at the mercy of the literature, though I have made some innovations in LSA use

RTE Challenge - Final Notes
See the continued progress at:
–RTE Web Site:
–Textual Entailment resource pool:
–Actual rankings released in June 2007 at:
November 15, 2007 - SNeRG Meeting - Scott Settembre