Automatic recognition of discourse relations (Lecture 3)

Can RST analysis be done automatically?
- In the papers we’ll read, the question is really about local rhetorical relations.
- Part of the problem is the availability of training data for automatic labelling.
- Manual annotation is slow and expensive.
- Lots of data can be collected cleverly, but is it appropriate? (the SL’07 paper)

ME’02
Discourse relations are often signaled by cue phrases:
- CONTRAST: but
- EXPLANATION-EVIDENCE: because
But not always: in a manually annotated corpus, only 25% of contrast and explanation-evidence relations are marked explicitly by a cue phrase.
Implicit examples:
- Mary liked the play, John hated it. (contrast)
- He wakes up early every morning. There is a construction site opposite his building. (explanation-evidence)

Cleverly labeling data through patterns with cue phrases
CONTRAST
- [BOS…EOS][BOS But…EOS]
- [BOS…][but…EOS]
- [BOS…][although…EOS]
- [BOS Although…,][…EOS]
CAUSE-EXPLANATION
- [BOS…][because…EOS]
- [BOS Because…,][…EOS]
- [BOS…EOS][Thus,…EOS]

Extraction patterns
CONDITION
- [BOS If…,][…EOS]
- [BOS If…][then…EOS]
- [BOS…][if…EOS]
ELABORATION
- [BOS…EOS][BOS…for example…EOS]
- [BOS…][which…EOS]
Plus two no-relation classes: NO-RELATION-SAME-TEXT and NO-RELATION-DIFF-TEXT.
(A pattern-matching sketch follows below.)
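
To make the pattern idea concrete, here is a minimal sketch of cue-phrase-based example extraction in the spirit of ME’02. The pattern inventory and regular expressions below are illustrative assumptions, not the paper’s exact patterns:

```python
import re

# Each pattern matches a pair of adjacent text spans; the cue phrase itself
# is deliberately left out of the captured arguments, so a classifier trained
# on the result cannot simply memorize the marker.
PATTERNS = [
    # [BOS ... EOS] [BOS But ... EOS]
    ("CONTRAST", re.compile(r"^(?P<arg1>[^.!?]+[.!?])\s+But\s+(?P<arg2>[^.!?]+[.!?])$")),
    # [BOS ... ] [because ... EOS]
    ("CAUSE-EXPLANATION", re.compile(r"^(?P<arg1>[^,]+),?\s+because\s+(?P<arg2>.+)$")),
    # [BOS If ... ,] [then ... EOS]
    ("CONDITION", re.compile(r"^If\s+(?P<arg1>[^,]+),\s*(?:then\s+)?(?P<arg2>.+)$")),
]

def extract_examples(sentence_pair):
    """Yield (relation, arg1, arg2) triples, with the cue phrase stripped."""
    for relation, pattern in PATTERNS:
        match = pattern.match(sentence_pair)
        if match:
            yield relation, match.group("arg1").strip(), match.group("arg2").strip()

for example in extract_examples("John is good at math. But Paul fails almost every class."):
    print(example)
    # ('CONTRAST', 'John is good at math.', 'Paul fails almost every class.')
```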

Main idea
Pairs of words can trigger a given relation:
- John is good in math and sciences. Paul fails almost every class he takes. The pair (good, fails) signals CONTRAST.
- The word pair (embargo, legally) is another example.
Features for classification: the Cartesian product of the words in the two text spans being annotated (a sketch follows below).
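
A minimal sketch of the Cartesian-product word-pair features, assuming naive whitespace tokenization (the papers use proper tokenizers):

```python
from itertools import product

def word_pair_features(span1, span2):
    """All (w1, w2) pairs with w1 from the first span and w2 from the second."""
    tokens1 = span1.lower().split()
    tokens2 = span2.lower().split()
    return [(w1, w2) for w1, w2 in product(tokens1, tokens2)]

pairs = word_pair_features("John is good in math",
                           "Paul fails almost every class")
print(len(pairs))  # 5 * 5 = 25 word-pair features, including ('good', 'fails')
```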

Probability of word pairs given a relation
- Choose the relation RLk that maximizes log P(W1, W2 | RLk) + log P(RLk) (a sketch follows below).
- Classification results are well above the baseline.
- Using only content words did not seem to be very helpful.
- The model does not perform that well on manually annotated examples.
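
A minimal Naïve Bayes sketch of this scoring rule: pick the relation r maximizing the sum over word pairs of log P(w1, w2 | r) plus log P(r). The counting and add-one smoothing below are my simplifications, not the paper’s exact estimation:

```python
import math
from collections import Counter, defaultdict

class WordPairNB:
    def __init__(self):
        self.pair_counts = defaultdict(Counter)  # relation -> counts of (w1, w2)
        self.rel_counts = Counter()              # relation -> number of examples

    def train(self, examples):
        """examples: iterable of (relation, list of (w1, w2) pairs)."""
        for relation, pairs in examples:
            self.rel_counts[relation] += 1
            self.pair_counts[relation].update(pairs)

    def classify(self, pairs):
        total = sum(self.rel_counts.values())
        best, best_score = None, float("-inf")
        for relation, count in self.rel_counts.items():
            pair_c = self.pair_counts[relation]
            denom = sum(pair_c.values()) + len(pair_c) + 1  # crude add-one smoothing
            score = math.log(count / total)                 # log P(r)
            for pair in pairs:
                score += math.log((pair_c[pair] + 1) / denom)  # log P(w1, w2 | r)
            if score > best_score:
                best, best_score = relation, score
        return best
```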

Discussion
- It would be interesting to see the list of the most informative word pairs per relation.
- Is there an intrinsic difference between clauses explicitly marked for a relation and those where the relation is implicit?

B-GMR’07: offers several improvements over ME’02
- Tokenizing and stemming: improves accuracy and reduces model size.
- Vocabulary size limit / minimum frequency: using the 6,400 most frequent words works best.
- Using a stoplist: performance deteriorates (as in the original ME’02 paper!).
- Topic segmentation for better example collection (a preprocessing sketch follows below).
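
A sketch of that preprocessing pipeline, assuming NLTK’s Porter stemmer, whitespace tokenization, and the 6,400-word cutoff from the slide (the tokenizer and stemmer choices are mine, not necessarily B-GMR’07’s):

```python
from collections import Counter
from nltk.stem import PorterStemmer  # pip install nltk

stemmer = PorterStemmer()

def preprocess(spans, vocab_size=6400):
    """Stem all tokens, then keep only the vocab_size most frequent items."""
    stemmed = [[stemmer.stem(tok) for tok in span.lower().split()] for span in spans]
    freq = Counter(tok for span in stemmed for tok in span)
    vocab = {tok for tok, _ in freq.most_common(vocab_size)}
    # Map out-of-vocabulary tokens to a single UNK symbol to shrink the model.
    return [[tok if tok in vocab else "<UNK>" for tok in span] for span in stemmed]
```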

SL’07
Using automatically labeled examples to classify rhetorical relations: is it a good idea?
The answer is no, as already hinted at by the other papers.

Two classifiers
- A word-pair-based Naïve Bayes classifier.
- A multi-feature (41 features) BoosTexter model, with positional, length, lexical, POS, temporal, and cohesion (pronouns and ellipsis) features (a toy sketch follows below).
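
As an illustration only, a toy sketch of what a few of these feature families might look like; the feature names and definitions are my assumptions, not SL’07’s actual 41 features:

```python
import nltk  # requires: nltk.download('averaged_perceptron_tagger')

PRONOUNS = {"he", "she", "it", "they", "him", "her", "them"}

def span_pair_features(arg1, arg2, sentence_index, n_sentences):
    tokens1, tokens2 = arg1.split(), arg2.split()
    tags1 = [tag for _, tag in nltk.pos_tag(tokens1)]
    return {
        "relative_position": sentence_index / n_sentences,               # positional
        "len_arg1": len(tokens1),                                        # length
        "len_arg2": len(tokens2),
        "first_word_arg2": tokens2[0].lower(),                           # lexical
        "arg1_has_past_tense": any(t in ("VBD", "VBN") for t in tags1),  # POS/temporal
        "arg2_has_pronoun": any(w.lower() in PRONOUNS for w in tokens2), # cohesion
    }
```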

An explicit note made here, but not in the previous papers:
- The distribution of different relations in the automatically extracted corpus does not reflect the true distribution.
- In all studies the data is downsampled (see the sketch below).
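
A minimal sketch of that balancing step, assuming downsampling to the size of the rarest relation class (the papers’ exact recipe may differ):

```python
import random

def downsample(examples_by_relation, seed=0):
    """Keep an equal number of examples per relation: the size of the rarest class."""
    rng = random.Random(seed)
    n = min(len(exs) for exs in examples_by_relation.values())
    return {rel: rng.sample(exs, n) for rel, exs in examples_by_relation.items()}
```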

Testing on explicit relations
- Results deteriorate for both machine learning approaches, though they are still better than random.
- The natural (automatically collected) data does not seem suitable for training: models do not generalize well to examples that occur naturally without unambiguous discourse markers.

Training on manually labeled, unmarked data
- Less training data is available.
- Worse for the Naïve Bayes classifier, but good for the BoosTexter model.
- Why? Perhaps semantic redundancy between discourse markers and the context they appear in?

Using the Penn Discourse TreeBank
- Implicit relations: performance is not that good.
- Explicit relations: performance is closer to that on the automatically collected test set.
- Cheap data collection for this task is probably not that good an idea after all!