Project LOGO End-to-End Discourse Parser Evaluation Sucheta Ghosh, Sara Tonelli, Giuseppe Riccardi, Richard Johansson Department of Information Engineering.

1 End-to-End Discourse Parser Evaluation
Sucheta Ghosh, Sara Tonelli, Giuseppe Riccardi, Richard Johansson
Department of Information Engineering and Computer Science, University of Trento, Italy

2 Content
- Introduction
- Discourse Parser: what + why + how
- Discourse Parser & Penn Discourse TreeBank (PDTB)
- Our contribution
- Architecture
- Feature
- Result
- Conclusion
End2End Disc Pars Eval

3 Introduction
- What: we refer to a coherent, structured group of sentences or expressions as a discourse
- Why: discourse structure represents the meaning of the document
- How: process flow: data (discourse) segmentation -> discourse parsing -> discourse structure
- Discourse structure includes relations (a connective and its arguments) lexically anchored in the document text
- Common data sources: the Rhetorical Structure Theory (RST) treebank and the Penn Discourse TreeBank (PDTB); we used the PDTB

4 Examples from PDTB (1)
Arg1 -> I never gamble too far.
Explicit Connective -> In particular
Arg2 -> I quit after one try, whether I win or lose. [EXPANSION]
- Each annotated relation includes a connective, two arguments and a sense label for the connective
- The connective occurs between the two arguments, at the beginning of a sentence, or inside an argument
- The top-level senses of the three-layered hierarchy: TEMPORAL, CONTINGENCY, COMPARISON, EXPANSION
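
The annotation described on this slide can be pictured as a small record type. The sketch below is only an illustration: the class name `ExplicitRelation` and its field names are assumptions made for this example, not the PDTB file format or the authors' code. It encodes the slide's example and checks the sense against the four top-level classes.

```python
from dataclasses import dataclass

# The four top-level sense classes listed on the slide.
TOP_LEVEL_SENSES = ("TEMPORAL", "CONTINGENCY", "COMPARISON", "EXPANSION")


@dataclass(frozen=True)
class ExplicitRelation:
    """Hypothetical container for one explicit discourse relation."""
    arg1: str
    connective: str
    arg2: str
    sense: str

    def __post_init__(self):
        # Reject anything that is not one of the top-level senses.
        if self.sense not in TOP_LEVEL_SENSES:
            raise ValueError(f"unknown top-level sense: {self.sense!r}")


# The example relation from the slide.
example = ExplicitRelation(
    arg1="I never gamble too far.",
    connective="In particular",
    arg2="I quit after one try, whether I win or lose.",
    sense="EXPANSION",
)
```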

5 Examples from PDTB (2)
- When Mr. Green won a $240,000 verdict in a land condemnation case against the State in June 1983, he says, Judge O'Kicki unexpectedly awarded him an additional $100,000. [TEMPORAL]
- As an indicator of the tight grain supply situation in the U.S., market analysts said that late Tuesday the Chinese government, which often buys U.S. grains in quantity, turned instead to Britain to buy 500,000 metric tons of wheat. [COMPARISON]
- Since McDonald's menu prices rose this year, the actual deadline may have been more. [CONTINGENCY]
(On the original slides, Arg1 is italicized, connectives are underlined and Arg2 is boldfaced)

6 PDTB Corpus Statistics
- Arg2 is always in the same sentence as the connective
- 60.9% of the annotated Arg1 spans are in the same sentence as the connective; 39.1% are in a previous sentence (30.1% adjacent, 9.0% non-adjacent)
- We used these statistics to establish the baseline

7 Our Contribution
- Developed an end-to-end discourse parser that retrieves discourse structure (an explicit connective and its two argument spans), starting from a text paragraph
- Evaluation
  - Established the system with gold-standard data (PTB + PDTB)
  - Evaluated it against a baseline
  - Implemented the same method in an automated system
- Improved the automated system in terms of applicability
  - Overlapping discourse segmentation technique (+2/-2 window) applied to the complete text
- Followed a chunking strategy for classification
- The discourse model is a cascaded CRF
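
One possible reading of the overlapping +2/-2 segmentation is a sliding window over sentences. The slide does not say what unit the window counts, so the sketch below assumes sentences; the function name `overlapping_windows` and its parameters are hypothetical and not taken from the authors' system.

```python
def overlapping_windows(sentences, before=2, after=2):
    """Return one (center_index, window) pair per sentence.

    Each window holds up to `before` sentences of left context, the
    center sentence, and up to `after` sentences of right context, so
    neighbouring windows overlap. Windows are clipped at the text edges.
    """
    windows = []
    for i in range(len(sentences)):
        lo = max(0, i - before)
        hi = min(len(sentences), i + after + 1)
        windows.append((i, sentences[lo:hi]))
    return windows


# Placeholder sentences standing in for a real document.
sents = ["s0", "s1", "s2", "s3", "s4", "s5"]
windows = overlapping_windows(sents)
```

Under this reading, a connective's sentence always sits inside a window that also covers the two preceding sentences, where the corpus statistics place most Arg1 spans.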

8 End-to-End Architecture
[Architecture diagram. Recoverable component labels: Doc, Parser, Parse_Tree; Chunklink (by Sabine Buchholz, CoNLL00 task); AddDiscourse (Pitler & Nenkova 09) for connective and sense detection; RootExtract + Morpha (Johansson; Minnen et al.) for morphological and all other features; Pruner; Arg2; Arg1]

9 Features
Features used for Arg1 and Arg2 segmentation and labeling:
- F1. Token (T)
- F2. Sense of Connective (CONN)
- F3. IOB chain (IOB)
- F4. PoS tag
- F5. Lemma (L)
- F6. Inflection (INFL)
- F7. Main verb of main clause (MV)
- F8. Boolean feature for MV (BMV)
Additional feature used only for Arg1:
- F9. Arg2 Labels
For more details: Ghosh et al., IJCNLP 2011
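
For a chunking-style CRF, the per-token features above are typically written one token per line, one feature per column, with the label last. The sketch below assumes that layout. The column order, the label string `I-Arg2` and every feature value in the example are hypothetical placeholders; the slides do not show the real encoding.

```python
# Column order is an assumption; the names follow the abbreviations on the slide.
FEATURE_COLUMNS = ["T", "CONN", "IOB", "POS", "L", "INFL", "MV", "BMV"]


def to_crf_row(features, label):
    """Serialize one token as a tab-separated row: feature columns, then label."""
    missing = [c for c in FEATURE_COLUMNS if c not in features]
    if missing:
        raise KeyError(f"missing feature columns: {missing}")
    return "\t".join([features[c] for c in FEATURE_COLUMNS] + [label])


# All values below are made up for illustration only.
row = to_crf_row(
    {
        "T": "quit",
        "CONN": "EXPANSION",
        "IOB": "I-S/B-VP",
        "POS": "VBP",
        "L": "quit",
        "INFL": "0",
        "MV": "quit",
        "BMV": "1",
    },
    label="I-Arg2",
)
```

For Arg1 labeling, the Arg2 label (F9) would become one more feature column in front of the Arg1 label.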


12 Evaluation & Baseline
- Metrics: precision, recall and F1 measure
- Scoring schemes: Exact Match, where a classified span is correct only if it exactly coincides with the gold-standard span
- Baseline (based on the statistics given in the annotation manual):
  - Arg2: label all tokens in the text span between the connective and the beginning of the next sentence
  - Arg1: label all tokens in the text span from the end of the previous sentence to the connective position; if the connective occurs at the beginning of a sentence, label the previous sentence
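
The baseline rules and exact-match scoring above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' evaluation script. Sentences are token lists, a span is a list of `(sentence_index, token_index)` pairs, and spans are keyed by a hypothetical relation id. Punctuation is not treated specially, and the exact counting used in the paper may differ.

```python
def baseline_arg2(sentences, sent_idx, conn_start, conn_len=1):
    """Arg2 baseline: every token after the connective up to the sentence end."""
    end = len(sentences[sent_idx])
    return [(sent_idx, t) for t in range(conn_start + conn_len, end)]


def baseline_arg1(sentences, sent_idx, conn_start):
    """Arg1 baseline: tokens before the connective in its sentence, or the
    whole previous sentence when the connective is sentence-initial."""
    if conn_start > 0:
        return [(sent_idx, t) for t in range(conn_start)]
    if sent_idx == 0:
        return []  # no previous sentence to fall back on
    prev = sent_idx - 1
    return [(prev, t) for t in range(len(sentences[prev]))]


def exact_match_prf(predicted, gold):
    """Precision, recall and F1 where a span counts only if it equals the gold span."""
    correct = sum(
        1 for key, span in predicted.items()
        if key in gold and list(span) == list(gold[key])
    )
    precision = correct / len(predicted) if predicted else 0.0
    recall = correct / len(gold) if gold else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    return precision, recall, 2 * precision * recall / (precision + recall)


# Tokenized version of the PDTB example from slide 4.
sentences = [
    ["I", "never", "gamble", "too", "far", "."],
    ["In", "particular", ",", "I", "quit", "after", "one", "try", "."],
]
arg1 = baseline_arg1(sentences, sent_idx=1, conn_start=0)
arg2 = baseline_arg2(sentences, sent_idx=1, conn_start=0, conn_len=2)
```

Because "In particular" opens its sentence, the Arg1 baseline falls back to the whole previous sentence, which happens to match the gold Arg1 in this example.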

13 Exact Arg2 Results: Comparison Viewgraph
[Chart of P, R and F1 for Baseline, Gold-Standard, Automatic, AutoConn+GoldSPT, GoldConn+AutoSPT and Lightweight (Auto); the values are not preserved in this transcript]

14 Exact Arg1 Results: Comparison Viewgraph
[Chart of P, R and F1 for Baseline, Gold-Standard, Automatic, AutoConn+GoldSPT, GoldConn+AutoSPT and Lightweight (Auto); the only value preserved in this transcript is 0.19 for the Baseline]

15 Features
The IOB (Inside-Outside-Begin) chain corresponds to the syntactic categories of all constituents on the path between the root node and the current leaf node of the tree. For example, the IOB chain feature for "flashed" is I-S/E-VP/E-SBAR/E-S/C-VP, where B-, I-, E- and C- indicate whether the given token is, respectively, at the beginning, inside, or at the end of the constituent, or is a single-token chunk.

16 Conclusion
- The automatic end-to-end system performs nearly as well as the gold-standard system
- We are moving towards a lightweight version of the pipeline: shallow, with less dependence on SPTs
- We wish to explore more features
- We improved our result by 5 points for Arg1 classification using a previous-sentence feature (Ghosh et al., IJCNLP 2011)

17 Thank you
Sucheta Ghosh, Sara Tonelli, Giuseppe Riccardi, Richard Johansson
Department of Information Engineering and Computer Science, University of Trento, Italy
{ghosh,

18 Previous Work
- Task limited to retrieving the argument heads (Wellner et al. 2007, Elwell et al. 2008)
- Dinesh et al. (2005) extracted complete arguments with boundaries, but only for a restricted class of connectives
- The identification of Arg1 has been only partially addressed in previous work (Prasad 2010)
- Automatic surface-sense classification (at class level) has already reached the upper bound of inter-annotator agreement (Pitler and Nenkova, 2009)

19 Data & Tools
- Corpus used: Penn Discourse TreeBank (PDTB)
- For the gold-standard system, the Penn TreeBank (PTB) corpus is used
- Third-party software and scripts used:
  - Stanford Syntactic Tree Parser (Klein & Manning 2003)
  - AddDiscourse, for explicit connective classification (Pitler and Nenkova 2009)
  - ChunkLink.pl, to extract IOB chains (by Sabine Buchholz, CoNLL Shared Task 2000)
  - RootExtractor, a syntactic parse tree (SPT) processor (by Richard Johansson)
  - Morpha (Minnen et al. 2001)
  - Conditional Random Field: CRF++ by Taku Kudo

20 Overall Architecture
- A syntactic tree parser is used for the automatic systems
- A connective detection and classification tool is used for the automatic systems
- PDTB and PTB are not used during the end-to-end automatic testing phase

21 End2End Testing Phase

22 Conditional Random Field
- We use the CRF++ tool (http://crfpp.sourceforge.net/) for sequence labeling classification (Lafferty et al., 2001), with second-order Markov dependency between tags.
- Besides the individual specification of each feature in the feature description template, the features are also represented in various combinations.
- We used this tool because the output of CRF++ is compatible with the CoNLL 2000 chunking shared task, and we view our task as a discourse chunking task.
- In addition, linear-chain CRFs for sequence labeling offer advantages over both generative models like HMMs and classifiers applied at each sequence position.
- Sha and Pereira (2003) also claim that, as a single model, CRFs outperform other models for shallow parsing.
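
The feature description template mentioned above is a plain-text file in which each line names a feature by its row offset and column, written `%x[row,col]`. The sketch below generates such lines for single features and for feature combinations. The window offsets, the number of columns and the combined pair are assumptions chosen for illustration; the authors' actual template is not shown on the slides.

```python
def crfpp_template(n_columns, offsets=(-1, 0, 1), pairs=()):
    """Build CRF++ template lines.

    n_columns: number of feature columns in the data file
    offsets:   relative token positions to look at for each column
    pairs:     (col_a, col_b) combinations taken at the current token
    """
    lines = []
    uid = 0
    # One unigram feature per (column, offset).
    for col in range(n_columns):
        for off in offsets:
            lines.append(f"U{uid:02d}:%x[{off},{col}]")
            uid += 1
    # Combined features: two columns of the current token joined with "/".
    for col_a, col_b in pairs:
        lines.append(f"U{uid:02d}:%x[0,{col_a}]/%x[0,{col_b}]")
        uid += 1
    # "B" asks CRF++ to add features over adjacent output tags.
    lines.append("B")
    return lines


# Eight columns for F1-F8; combining column 0 (token) with column 3 (PoS)
# is only an example of a feature combination.
template = crfpp_template(n_columns=8, pairs=[(0, 3)])
```

Writing `"\n".join(template)` to a file gives a template that can be passed to CRF++ at training time together with the column-formatted training data.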

23 Hill Climbing Algorithm
function HILL-CLIMBING(problem) returns a state that is a local maximum
    current ← MAKE-NODE(problem.INITIAL-STATE)
    loop do
        neighbor ← a highest-valued successor of current
        if neighbor.VALUE ≤ current.VALUE then return current.STATE
        current ← neighbor
[Artificial Intelligence: Stuart J. Russell]
- The hill climbing search algorithm is the most basic local search technique. At each step the current node is replaced by the best neighbor; here that is the neighbor with the highest VALUE, but if a heuristic cost estimate h is used, we would find the neighbor with the lowest h.
- Hill climbing is a greedy, fast local search
- We optimized the selected feature set with a feature ablation technique, leaving out one feature at a time
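
A minimal sketch of how hill climbing and leave-one-out ablation could be applied to feature selection follows. The neighbourhood (toggle one feature on or off) and the integer `GAIN` table are assumptions for illustration only. In practice the value of a feature set would come from training and scoring a model, which this toy scoring function replaces.

```python
def hill_climb(initial, neighbors, value):
    """Greedy local search: move to the best neighbor until none improves."""
    current = initial
    while True:
        candidates = list(neighbors(current))
        if not candidates:
            return current
        best = max(candidates, key=value)
        if value(best) <= value(current):
            return current  # local maximum reached
        current = best


def toggle_one(all_features):
    """Neighbors of a feature set: add or remove exactly one feature."""
    def neighbors(selected):
        for feature in all_features:
            yield selected ^ {feature}
    return neighbors


def ablation(selected, value):
    """Score drop caused by leaving out each selected feature in turn."""
    return {f: value(selected) - value(selected - {f}) for f in selected}


# Made-up contributions standing in for real development-set scores.
GAIN = {"T": 30, "CONN": 20, "IOB": 25, "POS": -5}


def toy_value(selected):
    return sum(GAIN[f] for f in selected)


best_set = hill_climb(frozenset(), toggle_one(GAIN), toy_value)
drops = ablation(best_set, toy_value)
```

With these toy numbers the search adds T, IOB and CONN in turn and stops, since adding POS or removing any selected feature lowers the score.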

24 Features
The IOB (Inside-Outside-Begin) chain corresponds to the syntactic categories of all the constituents on the path between the root node and the current leaf node of the tree. For the token "flashed", the corresponding feature would be I-S/E-VP/E-SBAR/E-S/C-VP, where B-, I-, E- and C- indicate whether the given token is, respectively, at the beginning, inside, or at the end of the constituent, or is a single-token chunk. In this case, "flashed" is at the end of every constituent in the chain, except for the first S, which it is inside, and the last VP, which dominates one single leaf.
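
The chain can be computed by walking the parse tree and checking, for every constituent above a token, where the token falls within that constituent's span. The sketch below is an illustration, not ChunkLink.pl. It assumes a tree encoded as nested tuples `(label, child, ...)` with plain strings as tokens, and it omits part-of-speech nodes. The example sentence is hypothetical, chosen so that "flashed" reproduces the chain quoted on the slide.

```python
def n_leaves(node):
    """Number of tokens dominated by a node."""
    if isinstance(node, str):
        return 1
    return sum(n_leaves(child) for child in node[1:])


def iob_chains(tree):
    """Return [(token, chain)] with one prefixed label per dominating constituent."""
    out = []

    def walk(node, start, ancestors):
        if isinstance(node, str):
            tags = []
            for label, first, last in ancestors:
                if first == last:
                    prefix = "C"  # constituent holds a single token
                elif start == first:
                    prefix = "B"  # token begins the constituent
                elif start == last:
                    prefix = "E"  # token ends the constituent
                else:
                    prefix = "I"  # token is strictly inside
                tags.append(f"{prefix}-{label}")
            out.append((node, "/".join(tags)))
            return start + 1
        span = (node[0], start, start + n_leaves(node) - 1)
        pos = start
        for child in node[1:]:
            pos = walk(child, pos, ancestors + [span])
        return pos

    walk(tree, 0, [])
    return out


# Hypothetical sentence: "He said that lights flashed ."
tree = ("S", "He",
        ("VP", "said",
         ("SBAR", "that",
          ("S", "lights",
           ("VP", "flashed")))),
        ".")
chains = dict(iob_chains(tree))
```

Here `chains["flashed"]` is `I-S/E-VP/E-SBAR/E-S/C-VP`: the token is inside the top S, ends the VP, SBAR and inner S, and is the only token of the innermost VP.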

25 Result: Gold-lbl & Auto
[Two tables, Gold-labeled Sys Output and Automatic Sys Output, each reporting P, R and F1 under Exact, Partial and Overlap scoring for Arg2 and Arg1; the automatic table reports Arg1 separately for semi-auto and full-auto. Baseline results were shown in blue. The numeric values are not preserved in this transcript.]

26 Combo Result
[Two tables, Gold Conn + Auto SPT and Auto Conn + Gold SPT, each reporting P, R and F1 under Exact, Partial and Overlap scoring for Arg2 and Arg1. The numeric values are not preserved in this transcript.]

27 Result: replc. IOB chain
[Table reporting P, R and F1 under Exact, Partial and Overlap scoring for Arg2 and Arg1. The numeric values are not preserved in this transcript.]

