
Discourse Parsing in the Penn Discourse Treebank: Using Discourse Structures to Model Coherence and Improve User Tasks
Ziheng Lin, Ph.D. Thesis Proposal
Advisors: Prof Min-Yen Kan and Prof Hwee Tou Ng

Introduction
 A text is usually understood through its discourse structure
 Discourse parsing is the process of:
   Identifying discourse relations, and
   Constructing the internal discourse structure
 A number of discourse frameworks have been proposed:
   Mann & Thompson (1988)
   Lascarides & Asher (1993)
   Webber (2004)
   …

Introduction
 The Penn Discourse Treebank (PDTB):
   Is a large-scale discourse-level annotation
   Follows Webber's framework
 Understanding a text's discourse structure is useful:
   Discourse structure and textual coherence are strongly connected, so discourse parsing is useful for modeling coherence
   Discourse parsing also helps downstream NLP applications:
    Contrast, Restatement → summarization
    Cause → QA

Introduction
 Research goals:
 1. Design an end-to-end PDTB-styled discourse parser
 2. Propose a coherence model based on discourse structures
 3. Show that discourse parsing improves a downstream NLP application

Outline
1. Introduction
2. Literature review
   1. Discourse parsing
   2. Coherence modeling
3. Recognizing implicit discourse relations
4. A PDTB-styled end-to-end discourse parser
5. Modeling coherence using discourse relations
6. Proposed work and timeline
7. Conclusion

Discourse parsing
 Recognize the discourse relations between two text spans, and
 Organize these relations into a discourse structure
 Two main classes of relations in the PDTB:
   Explicit relations: signaled by an explicit discourse connective such as however or because
   Implicit relations: no discourse connective, and therefore harder to recognize → parsing implicit relations is a hard task

Discourse parsing
 Marcu & Echihabi (2002):
   Word pairs extracted from two text spans
   Collected implicit relations by removing connectives
 Wellner et al. (2006):
   Connectives, distance between text spans, and event-based features
   Discourse GraphBank: explicit and implicit relations
 Soricut & Marcu (2003):
   Probabilistic models for sentence-level segmentation and parsing
   RST Discourse Treebank (RST-DT)
 duVerle & Prendinger (2009):
   SVM to identify discourse structure and label relation types
   RST-DT
 Also: Wellner & Pustejovsky (2007), Elwell & Baldridge (2008), Wellner (2009)

Coherence modeling
 Barzilay & Lapata (2008):
   Local coherence
   The distribution of discourse entities exhibits certain regularities in sentence-to-sentence transitions
   Model coherence using an entity grid
 Barzilay & Lee (2004):
   Global coherence
   Newswire reports follow certain patterns of topic shift
   Used a domain-specific HMM to capture topic shifts in a text

Outline
1. Introduction
2. Literature review
3. Recognizing implicit discourse relations
   1. Methodology
   2. Experiments
4. A PDTB-styled end-to-end discourse parser
5. Modeling coherence using discourse relations
6. Proposed work and timeline
7. Conclusion

Methodology
 Supervised learning with a maximum entropy classifier
 Four feature classes:
   Contextual features
   Constituent parse features
   Dependency parse features
   Lexical features

Methodology: Contextual features
 Dependencies between two adjacent discourse relations r1 and r2:
   independent
   fully embedded argument
   shared argument
   properly contained argument
   pure crossing
   partially overlapping argument
 Fully embedded argument and shared argument are the most common in the PDTB

Methodology: Contextual features
 For an implicit relation curr that we want to classify, look at the two surrounding relations, prev and next
 Six binary features relate curr to prev and next (the feature table was a figure in the original slides; see the sketch below)
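The table listing the six features did not survive transcription. The following minimal Python sketch shows one plausible reading, assuming each relation is a dict whose "arg1"/"arg2" values are (start, end) token offsets; the span representation and the exact feature definitions are assumptions for illustration, not the thesis code.

    # A minimal sketch (not the thesis code): six binary contextual features
    # for an implicit relation `curr`, assuming arguments are (start, end)
    # token offsets. Exact definitions in the thesis may differ.

    def extent(rel):
        """Overall extent of a relation, from its earliest to its latest offset."""
        (s1, e1), (s2, e2) = rel["arg1"], rel["arg2"]
        return min(s1, s2), max(e1, e2)

    def embedded_in(inner, outer):
        """True if `inner` lies fully inside one argument of `outer`."""
        s, e = extent(inner)
        return any(os_ <= s and e <= oe for os_, oe in (outer["arg1"], outer["arg2"]))

    def shares_argument(r1, r2):
        """True if the two relations have an identical argument span."""
        return bool({r1["arg1"], r1["arg2"]} & {r2["arg1"], r2["arg2"]})

    def contextual_features(prev, curr, next_):
        return {
            "prev_embedded_in_curr": prev is not None and embedded_in(prev, curr),
            "curr_embedded_in_prev": prev is not None and embedded_in(curr, prev),
            "next_embedded_in_curr": next_ is not None and embedded_in(next_, curr),
            "curr_embedded_in_next": next_ is not None and embedded_in(curr, next_),
            "prev_shares_arg_with_curr": prev is not None and shares_argument(prev, curr),
            "curr_shares_arg_with_next": next_ is not None and shares_argument(curr, next_),
        }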

Methodology: Constituent parse features
 Collect all production rules from the arguments' parse trees, e.g.:
   S → NP VP
   NP → PRP
   PRP → "We"
   …
 Three binary features check whether a rule appears in Arg1, in Arg2, and in both (sketched below)
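A minimal sketch of production-rule extraction using NLTK; the two bracketed toy parses and the feature naming are illustrative, not the thesis implementation.

    # A minimal sketch using NLTK; toy parses, illustrative feature names.
    from nltk import Tree

    def production_rules(tree):
        """All rewrite rules in a parse tree, e.g. 'S -> NP VP'."""
        return {str(p).replace("'", '"') for p in tree.productions()}

    arg1 = Tree.fromstring("(S (NP (PRP We)) (VP (VBD had) (NP (NNS problems))))")
    arg2 = Tree.fromstring("(S (NP (PRP They)) (VP (VBD helped) (NP (PRP us))))")
    rules1, rules2 = production_rules(arg1), production_rules(arg2)

    features = {}
    for rule in rules1 | rules2:
        features[f"{rule}:arg1"] = rule in rules1
        features[f"{rule}:arg2"] = rule in rules2
        features[f"{rule}:both"] = rule in rules1 and rule in rules2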

Methodology: Dependency parse features
 Encode additional information at the word level
 Collect all words with the dependency types of their dependents, e.g.:
   "had" ← nsubj dobj
   "problems" ← det nn advmod
   "at" ← dep
 Three binary features check whether a rule appears in Arg1, in Arg2, and in both (sketched below)
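A minimal sketch of dependency-rule extraction; the (head, type, dependent) triples below are invented stand-ins for real dependency-parser output.

    # A minimal sketch; the triples are illustrative, not parser output.
    from collections import defaultdict

    def dependency_rules(triples):
        """One rule per head word: the sorted dependency types of its
        dependents, e.g. '"had" <- dobj nsubj'."""
        types = defaultdict(list)
        for head, dep_type, dependent in triples:
            types[head].append(dep_type)
        return {f'"{h}" <- {" ".join(sorted(ts))}' for h, ts in types.items()}

    arg_triples = [("had", "nsubj", "We"), ("had", "dobj", "problems"),
                   ("problems", "det", "the"), ("problems", "nn", "cash"),
                   ("problems", "advmod", "recently")]
    print(dependency_rules(arg_triples))
    # {'"had" <- dobj nsubj', '"problems" <- advmod det nn'}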

Methodology: Lexical features
 Marcu & Echihabi (2002) showed that word pairs are a good signal for classifying discourse relations:
   Arg1: John is good in math and sciences.
   Arg2: Paul fails almost every class he takes.
   The pair (good, fails) is a good indicator of a Contrast relation
 Stem and collect all word pairs from Arg1 × Arg2 as features (sketched below)
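A minimal sketch of word-pair extraction in the spirit of Marcu & Echihabi (2002), using NLTK's Porter stemmer; the feature encoding is illustrative.

    # A minimal sketch: stem both arguments, take the Cartesian product.
    from itertools import product
    from nltk.stem import PorterStemmer

    stem = PorterStemmer().stem

    def word_pair_features(arg1_tokens, arg2_tokens):
        s1 = {stem(w.lower()) for w in arg1_tokens}
        s2 = {stem(w.lower()) for w in arg2_tokens}
        return {f"{w1}|{w2}" for w1, w2 in product(s1, s2)}

    pairs = word_pair_features("John is good in math and sciences .".split(),
                               "Paul fails almost every class he takes .".split())
    # e.g. the pair "good|fail" can signal a Contrast relation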

Outline
1. Introduction
2. Literature review
3. Recognizing implicit discourse relations
   1. Methodology
   2. Experiments
4. A PDTB-styled end-to-end discourse parser
5. Modeling coherence using discourse relations
6. Proposed work and timeline
7. Conclusion

Experiments
 With feature selection:
   Employed MI to select the top 100 rules and the top 500 word pairs (word pairs being sparser)
   Production rules, dependency rules, and word pairs all gave significant improvements (p < 0.01)
   Applying all feature classes yields the highest accuracy, 40.2%
 Results show the predictiveness of the feature classes:
   production rules > word pairs > dependency rules > context features
 Results table (per-class accuracies and exact counts were lost in transcription):

                      w/o feature selection    w/ feature selection
                      count      accuracy      count    accuracy
   Production rules   11,…       …             100      …
   Dependency rules   5,…        …             100      …
   Word pairs         105,…      …             500      …
   Context            yes        28.5%         yes      28.5%
   All                           35.0%                  40.2%
   Baseline                      26.1%

 (a sketch of MI-based feature selection follows)
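A minimal sketch of MI-based selection over binary features. It scores each feature by the contribution of its presence to the feature-label mutual information (a common shortcut, not necessarily the thesis's exact formulation); `data` is a hypothetical list of (feature_set, relation_label) pairs.

    # A minimal sketch of MI-based feature selection over binary features.
    import math
    from collections import Counter

    def top_k_by_mi(data, k):
        n = len(data)
        f_count, l_count, joint = Counter(), Counter(), Counter()
        for feats, label in data:
            l_count[label] += 1
            for f in feats:
                f_count[f] += 1
                joint[f, label] += 1

        def mi(f):
            score = 0.0
            for label, lc in l_count.items():
                c = joint[f, label]
                if c:
                    # c/n = P(f,l); c*n/(f_count[f]*lc) = P(f,l) / (P(f) P(l))
                    score += (c / n) * math.log(c * n / (f_count[f] * lc))
            return score

        return sorted(f_count, key=mi, reverse=True)[:k]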

Experiments
 Question: can any of these feature classes be omitted while keeping the same level of performance?
 Add feature classes in the order of their predictiveness:
   production rules > word pairs > dependency rules > context features
 The results confirm that:
   each additional feature class contributes a marginal performance improvement, and
   all feature classes are needed for optimal performance
 Incremental results (intermediate accuracies lost in transcription):
   production rules only: …
   + word pairs: …
   + dependency rules: …
   + context features (all): 40.2%

Conclusion
 Implemented an implicit discourse relation classifier
 Features include:
   Modeling of the context of the relations
   Features extracted from constituent and dependency trees
   Word pairs
 Achieved an accuracy of 40.2%, a 14.1% (absolute) improvement over the baseline
 With a component that handles implicit relations in place, we continue on to design a full parser

Outline
1. Introduction
2. Literature review
3. Recognizing implicit discourse relations
4. A PDTB-styled end-to-end discourse parser
   1. System overview
   2. Components
   3. Experiments
5. Modeling coherence using discourse relations
6. Proposed work and timeline
7. Conclusion

System overview
 The parsing algorithm mimics the PDTB annotation procedure
 Input: a free text T
 Output: the discourse structure of T in the PDTB style
 Three steps (see the skeleton sketch below):
   Step 1: label Explicit relations
   Step 2: label Non-Explicit relations (Implicit, AltLex, EntRel, and NoRel)
   Step 3: label attribution spans
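A skeletal Python sketch of the three steps; the callable arguments stand in for the trained components described next, and all names and interfaces are assumptions made for illustration, not the thesis code.

    # A skeletal sketch of the three-step pipeline; interfaces are assumed.

    CONNECTIVES = {"because", "however", "after", "but", "and"}  # tiny subset of the PDTB's 100

    def parse(sentences, is_discourse_usage, label_arguments, explicit_sense,
              non_explicit_sense, label_attribution):
        relations = []
        # Step 1: Explicit relations, anchored on connective candidates.
        explicit_sents = set()
        for i, sent in enumerate(sentences):
            for tok in sent:
                if tok.lower() in CONNECTIVES and is_discourse_usage(tok, sent):
                    arg1, arg2 = label_arguments(tok, i, sentences)
                    relations.append(("Explicit", explicit_sense(tok, arg1, arg2),
                                      arg1, arg2))
                    explicit_sents.add(i)
        # Step 2: Non-Explicit relations (Implicit/AltLex/EntRel/NoRel) between
        # adjacent sentence pairs not linked by an Explicit relation.
        for i in range(1, len(sentences)):
            if i not in explicit_sents:
                sense = non_explicit_sense(sentences[i - 1], sentences[i])
                relations.append(("Non-Explicit", sense,
                                  sentences[i - 1], sentences[i]))
        # Step 3: attribution spans within the labeled relations.
        return [label_attribution(rel) for rel in relations]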

System overview
 (system pipeline diagram, shown as a figure in the original slides)

Outline
1. Introduction
2. Literature review
3. Recognizing implicit discourse relations
4. A PDTB-styled end-to-end discourse parser
   1. System overview
   2. Components
   3. Experiments
5. Modeling coherence using discourse relations
6. Proposed work and timeline
7. Conclusion

Components: Connective classifier
 Use the syntactic features from Pitler & Nenkova (2009)
 A connective's context and POS give an indication of its discourse usage
   E.g., after is a discourse connective when it is followed by a present participle, as in "after rising 3.9%"
 New contextual features for a connective C (sketched below):
   C POS
   prev + C, prev POS, prev POS + C POS
   C + next, next POS, C POS + next POS
   The path from C to the root
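A minimal sketch of these contextual features for a connective candidate at token index i, given tokens and POS tags; the parse-tree path to the root is only indicated in a comment, since it requires a constituent parse. Feature names are illustrative.

    # A minimal sketch of the new contextual features for connective C.
    def connective_context_features(tokens, pos, i):
        c, cp = tokens[i].lower(), pos[i]
        pw, pp = (tokens[i - 1].lower(), pos[i - 1]) if i > 0 else ("<S>", "<S>")
        nw, np_ = ((tokens[i + 1].lower(), pos[i + 1])
                   if i + 1 < len(tokens) else ("</S>", "</S>"))
        return {
            "cPOS": cp,
            "prev+c": f"{pw}|{c}", "prevPOS": pp, "prevPOS+cPOS": f"{pp}|{cp}",
            "c+next": f"{c}|{nw}", "nextPOS": np_, "cPOS+nextPOS": f"{cp}|{np_}",
            # plus e.g. "root_path": "IN<SBAR<VP<S" read off the parse tree
        }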

Components: Argument labeler
 Label the Arg1 and Arg2 spans in two steps:
   Step 1: identify the locations of Arg1 and Arg2
   Step 2: label their spans
 Step 1, the argument position classifier:
   Arg2 is always associated with the connective
   Use contextual and lexical information to locate Arg1
 Step 2, the argument extractor:
   Case 1: Arg1 and Arg2 are in the same sentence
   Case 2: Arg1 is in some previous sentence: assume the immediately previous one

Components: Explicit classifier
 Human agreement:
   94% on Level-1 types
   84% on Level-2 types
 We train and test on Level-2 types
 Features:
   Connective C
   C POS
   C + prev

Components: Non-Explicit classifier
 Non-Explicit relations: Implicit, AltLex, EntRel, NoRel
 Modify the implicit relation classifier to also cover AltLex, EntRel, and NoRel
 AltLex relations are signaled by non-connective expressions such as "That compared with", which usually appear at the beginning of Arg2
 Add another three features that check the first three words of Arg2 (sketched below)
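A minimal sketch of the three added features over the first three words of Arg2, where AltLex cues such as "That compared with" tend to appear; the feature names are illustrative.

    # A minimal sketch of the three Arg2-prefix features.
    def arg2_prefix_features(arg2_tokens):
        words = [w.lower() for w in arg2_tokens[:3]]
        words += ["<none>"] * (3 - len(words))
        return {f"arg2_first_{i + 1}": w for i, w in enumerate(words)}

    print(arg2_prefix_features("That compared with an operating loss".split()))
    # {'arg2_first_1': 'that', 'arg2_first_2': 'compared', 'arg2_first_3': 'with'}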

Components: Attribution span labeler
 Label the attribution spans of Explicit, Implicit, and AltLex relations
 Consists of two steps:
   Step 1: split the text into clauses
   Step 2: decide which clauses are attribution spans
 Features from the curr, prev, and next clauses (sketched below):
   Unigrams of curr
   Lowercased and lemmatized verbs in curr
   First term of curr; last term of curr; last term of prev; first term of next
   Last term of prev + first term of curr; last term of curr + first term of next
   Position of curr in the sentence
   Production rules extracted from curr
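A minimal sketch of these clause-level features; clause splitting, verb lemmatization, and production-rule extraction are assumed to happen upstream, and the string encodings are illustrative.

    # A minimal sketch of the attribution-span clause features.
    def attribution_features(prev, curr, next_, position, curr_verbs, curr_rules):
        feats = {f"uni={w.lower()}" for w in curr}            # unigrams of curr
        feats |= {f"verb={v}" for v in curr_verbs}            # lowercased, lemmatized verbs
        feats |= {f"first={curr[0].lower()}", f"last={curr[-1].lower()}"}
        if prev:
            feats.add(f"prev_last={prev[-1].lower()}")
            feats.add(f"prev_last+first={prev[-1].lower()}|{curr[0].lower()}")
        if next_:
            feats.add(f"next_first={next_[0].lower()}")
            feats.add(f"last+next_first={curr[-1].lower()}|{next_[0].lower()}")
        feats.add(f"position={position}")                     # position of curr in sentence
        feats |= {f"rule={r}" for r in curr_rules}            # production rules from curr
        return feats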

Outline
1. Introduction
2. Literature review
3. Recognizing implicit discourse relations
4. A PDTB-styled end-to-end discourse parser
   1. System overview
   2. Components
   3. Experiments
5. Modeling coherence using discourse relations
6. Proposed work and timeline
7. Conclusion

Experiments
 Each component in the pipeline can be tested along two dimensions:
   Whether there is error propagation from the previous component (EP vs. no EP), and
   Whether gold-standard parse trees and sentence boundaries, or automatic parsing and sentence splitting, are used (GS vs. Auto)
 Three settings:
   GS + no EP: per-component evaluation
   GS + EP
   Auto + EP: fully automated end-to-end evaluation

Experiments
 Connective classifier and argument extractor results (tables were figures in the original slides; see the backup slides for the numbers)

Experiments
 Explicit classifier and Non-Explicit classifier results (tables were figures in the original slides; see the backup slides for the numbers)

Experiments
 Attribution span labeler results (table was a figure in the original slides)
 Evaluating the whole pipeline:
   GS + EP gives an F1 of 46.8% under partial match and 33% under exact match
   Auto + EP gives an F1 of 38.18% under partial match and 20.64% under exact match

Conclusion
 Designed and implemented an end-to-end PDTB-styled parser
 Incorporated the implicit relation classifier into the pipeline
 Evaluated the system both component-wise and with error propagation
 Reported an overall system F1 for partial match of 46.8% with gold-standard parses and 38.18% with full automation

Outline
1. Introduction
2. Literature review
3. Recognizing implicit discourse relations
4. A PDTB-styled end-to-end discourse parser
5. Modeling coherence using discourse relations
   1. A relation transition model
   2. A refined approach: discourse role matrix
   3. Conclusion
6. Proposed work and timeline
7. Conclusion

A relation transition model
 Recall: Barzilay & Lapata (2008)'s coherence representation models sentence-to-sentence transitions of entities
 Well-written texts follow certain patterns of argumentative moves
   These are reflected in relation transition patterns
 A text T can be represented as a relation transition sequence, e.g. Cause → Cause → Contrast → Restatement → Expansion (shown as a figure in the original slides)

A relation transition model
 Method and preliminary results (see the sketch below):
   Extract the relation bigrams from the relation transition sequence:
    [Cause Cause], [Cause Contrast], [Contrast Restatement], [Restatement Expansion]
   A training/test instance is a pair of relation sequences:
    S_gs = gold-standard sequence
    S_p = permuted sequence
   Task: rank the pair (S_gs, S_p)
    Ideally, S_gs should be ranked higher, i.e., judged more coherent
   Baseline: 50%
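A minimal sketch of the bigram features and of ranking a (gold, permuted) pair. In the proposal the ranking function would be learned; the weights below are made up purely for illustration.

    # A minimal sketch: relation bigrams and a toy pairwise ranking score.
    from collections import Counter

    def relation_bigrams(sequence):
        return Counter(zip(sequence, sequence[1:]))

    s_gs = ["Cause", "Cause", "Contrast", "Restatement", "Expansion"]
    s_p = ["Contrast", "Cause", "Expansion", "Cause", "Restatement"]

    weights = {("Cause", "Cause"): 0.4, ("Contrast", "Restatement"): 0.3}  # toy

    def score(sequence):
        return sum(weights.get(bg, 0.0) * n
                   for bg, n in relation_bigrams(sequence).items())

    assert score(s_gs) > score(s_p)  # the gold sequence should rank higher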

A relation transition model
 The relation transition sequence is sparse
 We expect longer articles to give more predictable sequences
 Perform experiments with different sentence-count thresholds

Outline
1. Introduction
2. Literature review
3. Recognizing implicit discourse relations
4. A PDTB-styled end-to-end discourse parser
5. Modeling coherence using discourse relations
   1. A relation transition model
   2. A refined approach: discourse role matrix
   3. Conclusion
6. Proposed work and timeline
7. Conclusion

A refined approach: discourse role matrix
 Instead of looking at the discourse roles of sentences, look at the discourse roles of terms
 Use sub-sequences of discourse roles as features (sketched below):
   Comp.Arg2 → Exp.Arg2, Comp.Arg1 → nil, …
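A minimal sketch of the discourse role matrix and its sub-sequence features; the toy matrix (term → discourse role per sentence, "nil" when the term is absent) is invented for illustration.

    # A minimal sketch of role sub-sequence features over a toy matrix.
    from collections import Counter

    matrix = {
        "japan": ["Comp.Arg2", "nil", "Exp.Arg1"],
        "trade": ["Comp.Arg2", "Exp.Arg2", "nil"],
    }

    def role_subsequence_features(matrix, n=2):
        """Count length-n sub-sequences of discourse roles down each term column."""
        feats = Counter()
        for roles in matrix.values():
            for i in range(len(roles) - n + 1):
                feats[" -> ".join(roles[i:i + n])] += 1
        return feats

    print(role_subsequence_features(matrix))
    # Counter({'Comp.Arg2 -> nil': 1, 'nil -> Exp.Arg1': 1,
    #          'Comp.Arg2 -> Exp.Arg2': 1, 'Exp.Arg2 -> nil': 1})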

A refined approach: discourse role matrix
 Experiments:
   Compared with Barzilay & Lapata (2008)'s entity grid model (results were a figure in the original slides)

Outline
1. Introduction
2. Literature review
3. Recognizing implicit discourse relations
4. A PDTB-styled end-to-end discourse parser
5. Modeling coherence using discourse relations
6. Proposed work and timeline
   1. Literature review on several NLP applications
   2. Proposed work
   3. Timeline
7. Conclusion

Literature review on several NLP applications
 Text summarization:
   Discourse plays an important role in text summarization
   Marcu (1997) showed that the RST tree is a good indicator of salience in text
   PDTB relations are helpful in summarization:
    Generic summarization: utilize Instantiation and Restatement relations to recognize redundancy
    Update summarization: use Contrast relations to locate updates

Literature review on several NLP applications
 Argumentative zoning (AZ):
   Proposed by Teufel (1999) to automatically reconstruct the rhetorical moves of argumentation in academic writing
   Label sentences with 7 tags:
    aim, textual, own, background, contrast, basis, and other
   AZ has been shown to help in:
    Summarization (Teufel & Moens, 2002)
    Citation indexing (Teufel et al., 2006)

Literature review on several NLP applications
 Why-QA:
   Aims to answer generic questions of the form "Why X?"
   Verberne et al. (2007) showed that discourse structure in the RST framework is helpful in a why-QA system
   Prasad and Joshi (2008) generated why-questions using the causal relations in the PDTB
   We believe that the PDTB's hierarchical relation typing will help in designing a why-QA system

Proposed work
 Work done:
   A system to automatically recognize implicit relations (Sec 3; EMNLP 2009)
   An end-to-end discourse parser (Sec 4; journal article in preparation)
   A coherence model based on discourse structures (Sec 5; ACL 2011)
 Next, I propose to work on one of the NLP applications
   Aim: show that discourse parsing can improve the performance of this application

Timeline
 2010 Sep – Dec: Continue working on the coherence model (Done)
 2010 Nov – Dec: Write an ACL submission on the coherence model (Done)
 2011 Jan – May: Work on an NLP application (In progress)
 2011 May – Jul: Thesis write-up
 2011 Aug: Thesis defense

Outline
1. Introduction
2. Literature review
3. Recognizing implicit discourse relations
4. A PDTB-styled end-to-end discourse parser
5. Modeling coherence using discourse relations
6. Proposed work and timeline
7. Conclusion

Conclusion
 Designed and implemented an implicit discourse relation classifier for the PDTB
 Designed and implemented an end-to-end discourse parser in the PDTB representation
 Proposed a coherence model based on discourse structures
 Proposed work: apply discourse parsing to one downstream NLP application
   Summarization, argumentative zoning, or why-QA
 (Parser demo)

Thank you!

Backup slides

The Penn Discourse Treebank
 A discourse-level annotation over the WSJ corpus
 Adopts a binary predicate-argument view of discourse relations
 Explicit relations, signaled by discourse connectives:
   Arg2: When he sent letters offering 1,250 retired major leaguers the chance of another season,
   Arg1: 730 responded.
 Implicit relations:
   Arg1: "I believe in the law of averages," declared San Francisco batting coach Dusty Baker after game two.
   Arg2: [accordingly] "I'd rather see a so-so hitter who's hot come up for the other side than a good hitter who's cold."

The Penn Discourse Treebank
 AltLex relations:
   Arg1: For the nine months ended July 29, SFE Technologies reported a net loss of $889,000 on sales of $23.4 million.
   Arg2: [AltLex: That compared with] an operating loss of $1.9 million on sales of $27.4 million in the year-earlier period.
 EntRel:
   Arg1: Pierre Vinken, 61 years old, will join the board as a nonexecutive director Nov. 29.
   Arg2: Mr. Vinken is chairman of Elsevier N.V., the Dutch publishing group.

The Penn Discourse Treebank
 (figure in the original slides)

Experiments
 Classifier: OpenNLP MaxEnt
 Training data: Sections 2–21 of the PDTB
 Test data: Section 23 of the PDTB
 Feature selection: use mutual information (MI) to select features for production rules, dependency rules, and word pairs separately
 Majority baseline: 26.1%, where all instances are classified as Cause

Components: Argument labeler
 (figure in the original slides)

A relation transition model
 (two alternative representations of the relation transition sequence, shown as figures in the original slides)

Experiments
 The classifier labels no instances of Synchrony, Pragmatic Cause, Concession, or Alternative
   The proportions of these four types are too small: in total, only 4.76% of the training data
 As Cause is the most predominant type, it has high recall but low precision

Methodology: Constituent parse features
 The syntactic structure within one argument may constrain the relation type and the syntactic structure of the other argument:
   (a) Arg1: But the RTC also requires "working" capital to maintain the bad assets of thrifts that are sold
       Arg2: [subsequently] That debt would be paid off as the assets are sold
   (b) Arg1: It would have been too late to think about on Friday.
       Arg2: [so] We had to think about it ahead of time.

Components: Connective classifier
 The PDTB defines 100 discourse connectives
 Features from Pitler and Nenkova (2009), e.g. for the connective because:
   Connective: because
   Self category: IN
   Parent category: SBAR
   Left sibling category: none
   Right sibling category: S
   Right sibling contains a VP: yes
   Right sibling contains a trace: no

Experiments
 Connective classifier:
   Adding the lexico-syntactic and path features significantly (p < 0.001) improves accuracy and F1 for both GS and Auto
   The connective with the highest number of incorrect labels is and
    and is always regarded as an ambiguous connective

Experiments
 Argument position classifier:
   Performance drops when EP and Auto are added
   The degradation is mostly due to the SS class
    False positives propagated from the connective classifier
    For GS + EP: 30/36 classified as SS
    For Auto + EP: 46/52 classified as SS
   → the difference between SS and PS is largely due to error propagation

Experiments
 Argument extractor, argument node identifier:
   F1 for Arg1, Arg2, and Rel (Arg1 + Arg2)
   Arg1/Arg2 nodes for subordinating connectives are the easiest to locate:
    97.93% F1 for Arg2, 86.98% F1 for Rel
   Performance for discourse adverbials is the lowest:
    their Arg1 and Arg2 nodes are not strongly bound

Experiments
 Argument extractor:
   Report both partial and exact match
   GS + no EP gives a satisfactory Rel F1 of 86.24% for partial match
   Performance for exact match is much lower than human agreement (90.2%)
   Most misses are due to small portions of text being deleted from, or added to, the spans by the annotators

Experiments
 Explicit classifier:
   Human agreement: 84%
   A baseline that uses only the connective as a feature yields an F1 of 86% under GS + no EP
   Adding the new features improves this to 86.77%

Experiments
 Non-Explicit classifier:
   A majority baseline (all instances classified as EntRel) gives an F1 in the low 20s
   GS + no EP gives an F1 of 39.63%
   Performance for GS + EP and Auto + EP is much lower
    but still outperforms the baseline by ~6%

Experiments
 Attribution span labeler:
   GS + no EP achieves F1 of 79.68% and 65.95% for partial and exact match, respectively
   With EP: the degradation is mostly due to a drop in precision
   With Auto: the degradation is mostly due to a drop in recall

Experiments
 Evaluating the whole pipeline:
   Look at the Explicit and Non-Explicit relations that are correctly identified
   A relation is correct if its relation type is classified correctly and both Arg1 and Arg2 are labeled correctly (partially or exactly)
   GS + EP gives an F1 of 46.8% under partial match and 33% under exact match
   Auto + EP gives an F1 of 38.18% under partial match and 20.64% under exact match
   A large portion of the misses come from the Non-Explicit relations

A lexical model
 Lapata (2003) proposed a sentence ordering model
 Assume the coherence of adjacent sentences is estimated from lexical word pairs; the coherence of the text is then the product over adjacent sentence pairs (formulas reconstructed below)
 RST enforces two possible canonical orders of text spans:
   Satellite before nucleus (e.g., conditional)
   Nucleus before satellite (e.g., restatement)
 A word-pair-based model can be used to check whether these orderings are enforced
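The two formulas on this slide were images in the original deck. A plausible reconstruction following Lapata (2003), where w^{(i)}_j denotes the j-th word of sentence S_i (the notation is an assumption, not taken from the slides):

    % Plausible reconstruction of the missing formulas; notation follows
    % Lapata (2003): sentence-pair coherence from the Cartesian product of
    % word pairs, text coherence as the product over adjacent pairs.
    P(S_i \mid S_{i-1}) = \prod_{j,k} P\bigl(w^{(i)}_j \mid w^{(i-1)}_k\bigr),
    \qquad
    P(T) = \prod_{i=2}^{n} P(S_i \mid S_{i-1})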

A lexical model
 Method and preliminary results:
   Extract (w_{i-1,j}, C, w_{i,k}) triples as features
   Use mutual information to select the top n features, n = 5000
   Accuracy = 70%; baseline = 50%

Experiments
 Without feature selection:
   Production rules and word pairs yield significantly better performance
   Contextual features perform slightly better than the baseline
   Dependency rules perform slightly below the baseline, and applying all feature classes does not yield the highest accuracy → noise
 Results table (per-class accuracies and exact counts were lost in transcription):

                      count      accuracy
   Production rules   11,…       …
   Dependency rules   5,…        …
   Word pairs         105,…      …
   Context            yes        28.5%
   All                           35.0%
   Baseline                      26.1%

Components: Argument labeler: Argument position classifier
 Relative positions of Arg1:
   SS: in the same sentence as the connective (60.9%)
   PS: in some previous sentence of the connective (39.1%)
   FS: in some sentence following the sentence of the connective (0%; only 8 instances, thus ignored)
 Classify the relative position of Arg1 as SS or PS
 Features:
   Connective C, C POS
   Position of C in the sentence (start, middle, end)
   prev1, prev1 POS, prev1 + C, prev1 POS + C POS
   prev2, prev2 POS, prev2 + C, prev2 POS + C POS

Components: Argument labeler: Argument extractor
 When Arg1 is classified as being in the same sentence (SS) as Arg2, it can be one of:
   Arg1 before Arg2
   Arg2 before Arg1
   Arg1 embedded within Arg2
   Arg2 embedded within Arg1
 The Arg1 and Arg2 nodes in the parse tree can be syntactically related in one of three ways (shown as a figure in the original slides)

Components: Argument labeler: Argument extractor
 Design an argument node identifier to identify the Arg1 and Arg2 subtree nodes within the sentence parse tree
 Features:
   Connective C
   C's syntactic category (subordinating, coordinating, adverbial)
   Numbers of left and right siblings of C
   Path P of C to the node under consideration
   Path P, and whether the size of C's left sibling is greater than one
   The relative position of the node to C

Components: Argument labeler: Argument extractor
 When Arg1 is classified as being in some previous sentence (PS), use the majority classifier:
   Label the immediately previous sentence as Arg1 (correct 76.9% of the time)