Progress update
Lin Ziheng
System overview
Components – Connective classifier
Features from Pitler and Nenkova (2009):
– Connective: because
– Self category: IN
– Parent category: SBAR
– Left sibling category: none
– Right sibling category: S
– Right sibling contains a VP: yes
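These categories are read directly off the constituent parse. A minimal sketch of how they could be computed with an nltk tree, assuming the tree position of the connective's preterminal is known (the function name and example tree are illustrative, not the system's code):

    from nltk import Tree

    def syntactic_features(tree, conn_pos):
        """conn_pos: tree position (tuple) of the connective's preterminal."""
        self_cat = tree[conn_pos].label()           # e.g. IN
        parent = tree[conn_pos[:-1]]
        parent_cat = parent.label()                 # e.g. SBAR
        idx = conn_pos[-1]
        left = parent[idx - 1].label() if idx > 0 else "none"
        right_node = parent[idx + 1] if idx + 1 < len(parent) else None
        right = right_node.label() if isinstance(right_node, Tree) else "none"
        # Does the right sibling dominate a VP anywhere below it?
        right_has_vp = isinstance(right_node, Tree) and any(
            st.label() == "VP" for st in right_node.subtrees())
        return {"self": self_cat, "parent": parent_cat, "left_sib": left,
                "right_sib": right, "right_sib_has_VP": right_has_vp}

    tree = Tree.fromstring(
        "(S (NP (PRP He)) (VP (VBD left) (SBAR (IN because) "
        "(S (NP (PRP it)) (VP (VBD rained))))))")
    print(syntactic_features(tree, (1, 1, 0)))  # preterminal of 'because'

Run on this bracketed tree, the sketch reproduces the slide's example values: self category IN, parent SBAR, no left sibling, and a right sibling S that contains a VP.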
Components – Connective classifier
New features:
– Conn POS
– Prev word + conn: even though, particularly since
– Prev word POS
– Prev word POS + conn POS
– Conn + next word
– Next word POS
– Conn POS + next word POS
– All lemmatized verbs in the sentence containing the conn
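A sketch of how these lexical features could be assembled, assuming the sentence is a list of (word, POS) pairs, the connective occupies a single token at index i, and NLTK's WordNet lemmatizer (which needs the wordnet data installed) handles the verb feature; the helper name and feature keys are illustrative:

    from nltk.stem import WordNetLemmatizer

    def connective_context_features(tagged, i):
        lemmatizer = WordNetLemmatizer()
        conn, conn_pos = tagged[i]
        prev_w, prev_p = tagged[i - 1] if i > 0 else ("NONE", "NONE")
        next_w, next_p = tagged[i + 1] if i + 1 < len(tagged) else ("NONE", "NONE")
        feats = {
            "conn_POS": conn_pos,
            "prev+conn": prev_w.lower() + "_" + conn.lower(),  # e.g. even_though
            "prev_POS": prev_p,
            "prevPOS+connPOS": prev_p + "_" + conn_pos,
            "conn+next": conn.lower() + "_" + next_w.lower(),
            "next_POS": next_p,
            "connPOS+nextPOS": conn_pos + "_" + next_p,
        }
        # All lemmatized verbs in the sentence containing the connective.
        for w, p in tagged:
            if p.startswith("VB"):
                feats["verb=" + lemmatizer.lemmatize(w.lower(), "v")] = 1
        return feats

Multiword connectives such as "even though" would need a token span rather than a single index; the sketch glosses over that.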
Components – Argument labeler
Argument labeler – Argument position classifier
Relative positions of Arg1:
– Arg1 and Arg2 in the same sentence: SS (60.9%)
– Arg1 in the immediately previous sentence: IPS (30.1%)
– Arg1 in some non-adjacent previous sentence: NAPS (9.0%)
– Arg1 in some following sentence: FS (0%, only 8 instances)
FS is ignored
Argument labeler – Argument position classifier
Features:
– Connective string
– Conn POS
– Conn position in the sentence: first, second, third, third last, second last, or last
– Prev word
– Prev word POS
– Prev word + conn
– Prev word POS + conn POS
– Second prev word
– Second prev word POS
– Second prev word + conn
– Second prev word POS + conn POS
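The position feature can be computed from both ends of the token list. A minimal sketch, assuming a single-token connective at index i; the "other" fallback for mid-sentence positions is an assumption, since the slide only names six positions:

    def conn_position(tokens, i):
        n = len(tokens)
        head = {0: "first", 1: "second", 2: "third"}
        tail = {n - 1: "last", n - 2: "second_last", n - 3: "third_last"}
        if i in head:
            return head[i]
        if i in tail:
            return tail[i]
        return "other"  # assumption: fallback for positions the slide doesn't name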
Argument labeler – Argument extractor
SS cases: handcrafted a set of syntactically motivated rules to extract Arg1 and Arg2
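The slide does not list the rules, but one representative rule for subordinating connectives can be sketched: take the clause node dominating the connective as Arg2 and the remainder of the sentence as Arg1. A simplified illustration on an nltk tree (the real extractor has a larger rule set):

    from nltk import Tree

    def extract_ss_args(tree, conn_pos):
        # Climb from the connective to the nearest clause node (SBAR or S).
        node_pos = conn_pos[:-1]
        while node_pos and tree[node_pos].label() not in ("SBAR", "S"):
            node_pos = node_pos[:-1]
        # Partition the leaves by whether they fall under that clause node.
        leaf_positions = tree.treepositions("leaves")
        words = tree.leaves()
        arg2 = [w for w, p in zip(words, leaf_positions)
                if p[:len(node_pos)] == node_pos]
        arg1 = [w for w, p in zip(words, leaf_positions)
                if p[:len(node_pos)] != node_pos]
        return arg1, arg2

On the "He left because it rained" tree from the earlier sketch, this yields Arg1 = "He left" and Arg2 = "because it rained".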
Argument labeler – Argument extractor
An example (figure omitted)
Argument labeler – Argument extractor
IPS cases: label the sentence containing the connective as Arg2 and the immediately previous sentence as Arg1
NAPS cases:
– Arg1 is located in the second previous sentence in 45.8% of the NAPS cases
– Use the majority decision and assume Arg1 is always in the second previous sentence
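A sketch of this inter-sentential heuristic, mapping the position label to an index into the document's sentence list (names are illustrative):

    def select_arg1_sentence(position_label, conn_sent_idx):
        if position_label == "IPS":
            return conn_sent_idx - 1   # immediately previous sentence
        if position_label == "NAPS":
            return conn_sent_idx - 2   # majority decision (45.8% of NAPS)
        raise ValueError("SS cases are handled by the syntactic extractor")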
Components – Explicit classifier
Prasad et al. (2008) reported human agreement of 94% on Level 1 classes and 84% on Level 2 types
A baseline using only connectives as features gives 95.7% and 86% on Sec. 23
– Difficult to improve accuracy on the test section
3 types of features:
– Connective string
– Conn POS
– Conn + prev word
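A sketch of this three-feature representation (names are illustrative):

    def explicit_features(conn, conn_pos_tag, prev_word):
        return {"conn": conn.lower(),
                "conn_POS": conn_pos_tag,
                "prev+conn": prev_word.lower() + "_" + conn.lower()}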
Components – Non-explicit classifier
Non-explicit: Implicit, AltLex, EntRel, NoRel
– 11 Level 2 types for Implicit/AltLex, plus EntRel and NoRel → 13 types
4 feature sets from Lin et al. (2009):
– Contextual features
– Constituent parse features
– Dependency parse features
– Word-pair features
3 features to capture AltLex: Arg2_word1, Arg2_word2, Arg2_word3
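A simplified sketch of two of these feature sets, assuming token lists for the two arguments: word-pair features as the cross product of Arg1 and Arg2 tokens, plus the three Arg2_wordN features, which are taken here to be the first three words of Arg2 (an assumption based on the feature names; keys and helper name are illustrative):

    def nonexplicit_features(arg1_tokens, arg2_tokens):
        feats = {}
        for w1 in arg1_tokens:
            for w2 in arg2_tokens:
                feats["pair=%s_%s" % (w1.lower(), w2.lower())] = 1
        # AltLex cues tend to appear at the start of Arg2.
        for k in range(3):
            if k < len(arg2_tokens):
                feats["arg2_word%d=%s" % (k + 1, arg2_tokens[k].lower())] = 1
        return feats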
Components – Attribution span labeler
Two steps: split the text into clauses, then decide which clauses are attribution spans
Rule-based clause splitter:
– first split a sentence into clauses by punctuation
– for each clause, further split it if one of the following production links is found: VP → SBAR, S → SINV, S → S, SINV → S, S → SBAR, VP → S
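A sketch of the production-link check, assuming an nltk Tree for the clause; a clause is split further when some parent/child label pair in its parse matches one of the listed links (helper name is illustrative):

    from nltk import Tree

    SPLIT_LINKS = {("VP", "SBAR"), ("S", "SINV"), ("S", "S"),
                   ("SINV", "S"), ("S", "SBAR"), ("VP", "S")}

    def has_split_link(clause_tree):
        for node in clause_tree.subtrees():
            for child in node:
                if isinstance(child, Tree) and \
                        (node.label(), child.label()) in SPLIT_LINKS:
                    return True
        return False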
Components – Attribution span labeler
Attr span classifier features (curr, prev and next clauses):
– Unigrams of curr
– Lowercased and lemmatized verbs in curr
– The first and last terms of curr
– The last term of prev
– The first term of next
– The last term of prev + the first term of curr
– The last term of curr + the first term of next
– The position of curr in the sentence
– Punctuation rules extracted from curr
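A partial sketch of these clause-context features, assuming each clause is a non-empty token list and j indexes the current clause within the sentence; the verb and punctuation features are abbreviated, and helper and keys are illustrative:

    def attribution_features(clauses, j):
        curr = clauses[j]
        prev = clauses[j - 1] if j > 0 else ["NONE"]
        nxt = clauses[j + 1] if j + 1 < len(clauses) else ["NONE"]
        feats = {"uni=" + w.lower(): 1 for w in curr}   # unigrams of curr
        feats["first_curr"] = curr[0]
        feats["last_curr"] = curr[-1]
        feats["last_prev"] = prev[-1]
        feats["first_next"] = nxt[0]
        feats["last_prev+first_curr"] = prev[-1] + "_" + curr[0]
        feats["last_curr+first_next"] = curr[-1] + "_" + nxt[0]
        feats["position"] = str(j)                      # position in sentence
        return feats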
Evaluation
Train: Sec. 02–21, dev: Sec. 22, test: Sec. 23
Each component is tested:
– without and with error propagation (EP) from the previous component
– with gold standard (GS) parse trees and sentence boundaries, and with an automatic (Auto) parser and sentence splitter
Evaluation – Connective classifier
GS: the new features increased accuracy and F1 by 2.05% and 3.05%
Auto: increased accuracy and F1 by 1.71% and 2.54%
Contextual info is helpful
Evaluation – Argument position classifier
Able to accurately label SS
But performs badly on the NAPS class
– Due to the similarity between the IPS and NAPS classes
Evaluation – Argument extractor
Human agreement on partial and exact matches: 94.5% and 90.2%
Exact F1 is much lower than partial F1
– Due to small portions of text being deleted
Evaluation – Explicit classifier
Baseline: using only connective strings – 86%
GS + no EP: F1 increased by 0.44%
Evaluation – Non-explicit classifier
Majority baseline: all instances classified as EntRel
Adding EP degrades F1 by ~13%, but it still outperforms the baseline by ~6%
Evaluation – Attribution span labeler
When EP is added: the decrease in F1 is largely due to the drop in precision
When Auto is added: the decrease in F1 is largely due to the drop in recall
Evaluation – The whole pipeline
Definition: a relation is correct if its relation type is classified correctly, and both Arg1 and Arg2 are partially or exactly matched
GS + EP:
– Partial: 46.38% F1
– Exact: 31.72% F1
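A sketch of this scoring rule, with argument spans simplified to token-index sets and "partial match" approximated as any token overlap (an assumption; the slide does not define the partial-match criterion precisely):

    def relation_correct(pred, gold, exact=False):
        if pred["type"] != gold["type"]:
            return False
        def span_match(a, b):
            # exact: identical spans; partial: any shared token (assumption)
            return a == b if exact else bool(a & b)
        return span_match(pred["arg1"], gold["arg1"]) and \
               span_match(pred["arg2"], gold["arg2"])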
Ongoing changes
– Joint learning
– Change the rule-based argument extractor to a machine learning approach