Improving a Pipeline Architecture for Shallow Discourse Parsing

Yangqiu Song, Haoruo Peng, Parisa Kordjamshidi, Mark Sammons, and Dan Roth
Cognitive Computation Group, University of Illinois


Problem

Text that humans produce (documents, letters, emails, reports, news articles) has a coherent structure that imposes order over individual sentences. This structure can be thought of as relations between sentences, paragraphs, section headers, and titles that a human reader perceives in terms of explicit or implicit connections relating the information contained in each. Discourse structure is complex, and inferring it requires deep understanding of text. The task of shallow discourse parsing simplifies the problem, with the goal of establishing first steps toward inferring the deeper structure: shallow discourse parsers identify and label local discourse relations, picking out statements of interest and the connections between statements that occur within one or two sentences of each other.


Discourse Structure

Connectives may be explicit, in which case the connection is expressed as one or more words (such as "next", "if ... then", "as a result", "in contrast"). They may also be implicit, in which case the reader infers them from the semantics of neighboring statements. For example:

Explicit connective, sense Comparison.Concession:
He added that "[having just one firm do this isn't going to mean a hill of beans] (Argument 1). But [if this prompts others to consider the same thing, then it may become much more important] (Argument 2)."

Implicit connective, sense Contingency.Cause.Result:
According to The Times, [Barings Bank was hammered by losses incurred by rogue trader Nick Leeson] (Argument 1). [The bank had to file for bankruptcy] (Argument 2).

The Shallow Discourse Parsing shared task identifies 15 possible types of discourse connection, and requires participating systems to identify the extent of the statements (the arguments) that are connected. This means leaving out irrelevant content such as attribution clauses ("He added that" and "According to The Times" in the examples above).


System Architecture

The system is a pipeline:

Documents -> Preprocessor -> Explicit Connective Identifier -> Argument Position Identifier -> Argument Labeler -> Explicit Connective Classifier -> Implicit Connective Classifier -> Attribution Identifier

Preprocessor: The input is raw text, which is processed with a part-of-speech tagger, a chunker, and syntactic constituent and dependency parsers. The evaluation used the Penn Treebank and a new, blind test set.

Explicit Connective Identifier: A list of 99 connective phrases is used to identify candidate connectives, which are then classified using syntactic path features.
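To make the candidate stage concrete, here is a minimal sketch of phrase-list matching. The phrase list shown is a small hypothetical subset of the 99 entries, and the matching code is our illustration, not the system's implementation; each matched span would then be accepted or rejected by a classifier over syntactic path features.

```python
# Sketch: identify candidate explicit connectives by matching a phrase list.
# The phrases below are a small illustrative subset, not the actual 99 entries.
CONNECTIVE_PHRASES = [
    ("as", "a", "result"),
    ("in", "contrast"),
    ("because",),
    ("but",),
    ("however",),
]

def find_connective_candidates(tokens):
    """Return (start, end) token spans that match a known connective phrase."""
    lowered = [t.lower() for t in tokens]
    spans = []
    for start in range(len(lowered)):
        for phrase in CONNECTIVE_PHRASES:
            end = start + len(phrase)
            if tuple(lowered[start:end]) == phrase:
                spans.append((start, end))
    return spans

tokens = "The plan failed and as a result the bank had to file".split()
print(find_connective_candidates(tokens))  # -> [(4, 7)]
```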
Argument Position Identifier: For each predicted connective, a classifier predicts whether the first argument is in the previous sentence or in the same sentence as the connective.

Argument Labeler: Argument candidates are generated from syntactic parse constituents, and the candidates are scored and ranked using a machine-learned model.

Explicit Connective Classifier: A multi-class classifier using lexico-syntactic features labels the senses of the explicit connectives identified earlier.

Implicit Connective Classifier: Candidate sentence pairs are considered for implicit connectives. A multi-class classifier using lexical features from the two sentences predicts either "no connective" or one of the possible connective labels (a sketch follows this list).

Attribution Identifier: An attribution clause identifier labels candidate segments of arguments for exclusion from the final output.
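As an illustration of lexical features over a sentence pair, the sketch below builds word-pair features (the cross product of the two sentences' words), a common choice in implicit-relation classification; whether the system used exactly this feature set is an assumption on our part.

```python
# Sketch: word-pair features for an adjacent sentence pair, one common way to
# realize "lexical features from the two sentences" (an assumption on our
# part, not necessarily the system's exact feature set).
def word_pair_features(sent1_tokens, sent2_tokens):
    """Map a candidate sentence pair to sparse word-pair indicator features."""
    feats = {}
    for w1 in set(t.lower() for t in sent1_tokens):
        for w2 in set(t.lower() for t in sent2_tokens):
            feats["pair=%s_%s" % (w1, w2)] = 1.0
    return feats

s1 = "The bank was hammered by losses".split()
s2 = "It had to file for bankruptcy".split()
print(len(word_pair_features(s1, s2)))  # -> 36 (6 x 6 unique tokens)
```

A multi-class classifier (e.g., a linear model) over such sparse features would then predict "no connective" or one of the sense labels.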
Results

We present results for the individual components (based on cross-validation over the training corpus), and for the components and the complete system on the blind test corpus. Error analysis suggests that argument boundaries are problematic when the gold arguments omit some internal content, and that sense classification requires better abstraction over argument content.

Table 1. Component-level performance measured on training data (10-fold cross-validation). '*' indicates that the training data came from a different discourse data set. Where the source reports a single score for a row, it is listed under F1.

Component                        Prec    Rec     F1
Explicit Connective Identifier   92.97   93.91   93.44
Argument Positions                  -       -    98.15
Argument 1                       64.41   64.95   64.68
Argument 2                       87.06   86.06   86.56
Explicit Connective Sense           -       -    83.18
Implicit Connective Sense           -       -    34.58
Attribution Identification*      82.94   58.02   68.27

Table 2. Component-level and system performance measured on the blind test set.

Component                        Prec    Rec     F1
Explicit Connective Identifier   89.11   86.87   87.98
Argument 1                       49.52   51.61   50.55
Argument 2                       66.83   69.64   68.21
Argument 1 & Argument 2          40.48   42.18   41.31
Connective Sense (all)           21.02   16.81   16.49
Parser (overall)                 17.62   18.36   17.98
Explorations and Innovations

Our system follows Lin et al. (2014) quite closely in the design of the pipeline architecture and the basic features. We experimented with a number of enhancements to the original model.

Function Words: We experimented with an abstraction of constituency parse tree path features that uses function words rather than node labels where appropriate. This improved the performance of the argument classifier.
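A minimal sketch of the function-word abstraction, assuming a path is represented as a sequence of (label, word) nodes from the connective to the candidate constituent; the node encoding and the function-word test are illustrative assumptions, not the exact implementation.

```python
# Sketch: abstract a constituency parse-tree path by substituting the function
# word itself for the node label where the node covers a function word.
# The node representation and function-word test are illustrative assumptions.
FUNCTION_WORD_TAGS = {"IN", "TO", "DT", "CC", "MD"}  # POS tags treated as function words

def path_feature(path):
    """path: list of (label, word_or_None) pairs from connective to candidate.
    Leaf nodes carry their word; internal nodes carry None."""
    parts = []
    for label, word in path:
        if word is not None and label in FUNCTION_WORD_TAGS:
            parts.append(word.lower())   # keep the function word itself
        else:
            parts.append(label)          # fall back to the node label
    return "^".join(parts)

path = [("IN", "if"), ("SBAR", None), ("S", None), ("VP", None), ("MD", "may")]
print(path_feature(path))  # -> if^SBAR^S^VP^may
```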
Polarity Features: Following work by Peng et al. (2015), we used polarity features to improve the performance of the implicit connective classifier. Polarity features relate the positive/negative sentiment of the predicates of consecutive sentences in a way that accounts for their context.

Entity Context Features: We tried to use general statistical context features to improve implicit connective prediction. These features are based on correlations between entities and the verbs, nouns, and other entities in their immediate context. However, we were not able to improve overall performance on this task.

Joint Inference: We modeled global characteristics of sequences of connective labels by using neighboring label predictions as features when classifying connectives. The implementation used these features at test time but not during training (i.e., joint inference, not joint learning). However, we were not able to improve system performance with this approach.
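A hedged sketch of the test-time scheme: connectives are first classified independently, then re-classified with their neighbors' first-pass labels added as features. The two-pass control flow and feature names are our assumptions, not the exact implementation.

```python
# Sketch: augment each connective's features with neighboring predicted labels
# at test time only (joint inference, not joint learning). The two-pass scheme
# and feature names are illustrative assumptions.
def classify_with_neighbor_features(instances, base_features, predict):
    """instances: connectives in document order.
    base_features(inst) -> feature dict; predict(feats) -> label string."""
    # Pass 1: independent predictions from the base features alone.
    labels = [predict(base_features(inst)) for inst in instances]
    # Pass 2: re-predict with the neighbors' first-pass labels as extra features.
    final = []
    for i, inst in enumerate(instances):
        feats = dict(base_features(inst))
        if i > 0:
            feats["prev_label=" + labels[i - 1]] = 1.0
        if i + 1 < len(labels):
            feats["next_label=" + labels[i + 1]] = 1.0
        final.append(predict(feats))
    return final

# Tiny demo with a stub classifier that reacts only to a neighbor-label cue.
def stub_predict(feats):
    return "Contingency" if "prev_label=Comparison" in feats else "Comparison"

print(classify_with_neighbor_features(["c1", "c2"], lambda inst: {}, stub_predict))
# -> ['Comparison', 'Contingency']
```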

References

Ziheng Lin, Hwee Tou Ng, and Min-Yen Kan. "A PDTB-Styled End-to-End Discourse Parser". Natural Language Engineering, 2014.

Haoruo Peng, Daniel Khashabi, and Dan Roth. "Solving Hard Coreference Problems". Proceedings of the 2015 Conference of the North American Chapter of the Association for Computational Linguistics, 2015.

This work is supported by DARPA, NIGMS, and the Multimodal Information Access & Synthesis Center at UIUC.