Beyond NomBank: A Study of Implicit Arguments for Nominal Predicates Matthew Gerber and Joyce Y. Chai Department of Computer Science Michigan State University.


1 Beyond NomBank: A Study of Implicit Arguments for Nominal Predicates Matthew Gerber and Joyce Y. Chai Department of Computer Science Michigan State University East Lansing, Michigan, USA Language & Interaction Research

2 A Motivating Example
Georgia-Pacific and Nekoosa produce market pulp, containerboard and white paper. The goods could be manufactured closer to customers, saving shipping costs.
What can traditional SRL systems tell us?
– Who is the producer?
– What is produced?
– What is manufactured?
But that's not the whole story…
– Who is the manufacturer?
– Who ships what to whom?
These unstated participants are implicit arguments.

8 Nominal SRL: "…saving shipping costs."
Nominal predicates are not new:
– NomBank (Meyers, 2007): 115k manual SRL analyses for 4,700 predicates
Identifying NomBank arguments:
– Jiang and Ng (2006); Liu and Ng (2007)
Many predicates lack arguments in NomBank:
– 2008 CoNLL Shared Task (Surdeanu et al., 2008)
– Gerber et al. (NAACL, 2009): a predicate filter improves performance
These approaches do not address the recovery of implicit arguments.

9 Implicit Argument Identification
Research questions:
– Where are implicit arguments?
– Can we recover them?
Related work:
– Japanese anaphora: indirect anaphora (Sasano et al., 2004); zero-anaphora (Imamura et al., 2009)
– Implicit arguments: fine-grained domain model (Palmer et al., 1986); SemEval Task 10 (Ruppenhofer et al., 2010)

10 Outline Implicit Argument Annotation and Analysis Model Formulation and Features Evaluation Conclusions and Future Work

11 Data Annotation
Ten most prominent NomBank predicates, selected by:
– Derivation from a verbal role set (ship → shipment)
– Frequency of the nominal predicate
– Difference between verbal/nominal argument counts
[John] shipped [the package]. (argument count: 2)
Shipping costs will decline. (argument count: 0)
Predicate instances annotated: 1,254
Independently annotated by two annotators:
– Cohen's Kappa: 67%
– Agreement: both unfilled, or both filled identically
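The Cohen's Kappa agreement figure above can be computed directly from the two annotators' per-position labels. A minimal sketch, where each item is an argument position labeled either "unfilled" or with a filler id; the toy labels are hypothetical, not the paper's data:

```python
from collections import Counter

def cohens_kappa(labels_a, labels_b):
    """Cohen's kappa for two annotators labeling the same items."""
    assert len(labels_a) == len(labels_b)
    n = len(labels_a)
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Expected chance agreement, from each annotator's label distribution.
    freq_a, freq_b = Counter(labels_a), Counter(labels_b)
    expected = sum(freq_a[l] * freq_b[l] for l in freq_a) / (n * n)
    return (observed - expected) / (1 - expected)

# Hypothetical annotations for five argument positions.
a = ["unfilled", "c1", "c2", "unfilled", "c1"]
b = ["unfilled", "c1", "c3", "unfilled", "c1"]
print(round(cohens_kappa(a, b), 3))  # -> 0.706
```

Agreement here is strict, as on the slide: a position counts as agreed only when both annotators leave it unfilled or fill it with the same constituent.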

12 Annotation Analysis
[Chart: average number of expressed arguments per predicate; series: pre-annotation, post-annotation, verb form]

14 Annotation Analysis
Average arguments across all predicates:
– Pre-annotation: 1.1
– Post-annotation: 1.8
– Verb form: 2.0
Overall percentage of possible roles filled:
– Pre-annotation: 28.0%
– Post-annotation: 46.2% (a 65% relative increase)
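The jump in filled roles from 28.0% pre-annotation to 46.2% post-annotation corresponds to roughly a 65% relative increase, which can be checked directly:

```python
pre, post = 28.0, 46.2  # percent of possible roles filled
relative_increase = (post - pre) / pre
print(f"{relative_increase:.0%}")  # -> 65%
```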

15 Annotation Analysis
– Only 55% of implicit arguments are within the current sentence
– 90% are within the current or previous three sentences

16 Outline Implicit Argument Annotation and Analysis Model Formulation and Features Evaluation Conclusions and Future Work

17 Model Formulation
Candidate selection:
– Core PropBank/NomBank arguments (candidates c1, c2, c3 in the example below)
– Two-sentence candidate window
Coreference chaining
Binary classification function over each candidate
Georgia-Pacific and Nekoosa produce market pulp, containerboard and white paper. The goods could be manufactured closer to customers, saving shipping costs.
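The formulation above reduces implicit argument recovery to binary decisions: for each implicit position and each candidate constituent, does the candidate fill the position? A minimal sketch of that classifier, where the two features (a PMI-style association score and sentence distance) and all data points are hypothetical stand-ins for the paper's much richer SRL and discourse features:

```python
# Binary candidate classifier sketch; features and data are hypothetical.
from sklearn.feature_extraction import DictVectorizer
from sklearn.linear_model import LogisticRegression

train = [
    ({"pmi": 2.5, "sent_dist": 1}, 1),   # strongly associated, nearby -> fills
    ({"pmi": 1.8, "sent_dist": 2}, 1),
    ({"pmi": -0.5, "sent_dist": 1}, 0),  # weak association -> does not fill
    ({"pmi": -1.2, "sent_dist": 3}, 0),
]

vec = DictVectorizer()
X = vec.fit_transform([feats for feats, _ in train])
y = [label for _, label in train]

# solver="liblinear" matches the LibLinear package used in the paper.
clf = LogisticRegression(solver="liblinear").fit(X, y)

x_new = vec.transform({"pmi": 2.0, "sent_dist": 1})
print(int(clf.predict(x_new)[0]))  # -> 1 (candidate fills the position)
```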

18 Model Features SRL structure Discourse structure Other

19 Model Features: SRL Structure
VerbNet role transition
– The candidate fills the product role of VerbNet class create (as arg1 of "produce"); the missing arg1 of "shipping" maps to the theme role of class send
– Feature value: create.product → send.theme
– Captures script-like properties of events
– Multiple values are possible
Georgia-Pacific and Nekoosa produce market pulp, containerboard and white paper. The goods could be manufactured closer to customers, saving shipping costs.
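The role-transition feature pairs the VerbNet class.role a candidate already fills with the class.role of the missing position. A sketch of the feature-string construction, where VN_MAP is a hypothetical stand-in for the real PropBank-to-VerbNet mapping:

```python
# Hypothetical (predicate, PropBank arg) -> VerbNet class.role mapping;
# the real mapping comes from VerbNet/SemLink resources.
VN_MAP = {
    ("produce", "arg1"): "create.product",
    ("ship", "arg1"): "send.theme",
}

def role_transition_feature(cand_pred, cand_arg, target_pred, target_arg):
    """Pair the VerbNet class.role the candidate fills with the
    class.role of the implicit position."""
    src = VN_MAP.get((cand_pred, cand_arg))
    dst = VN_MAP.get((target_pred, target_arg))
    if src is None or dst is None:
        return None  # no mapping -> feature absent
    return f"{src}->{dst}"

# Candidate is arg1 of "produce"; is it the implicit arg1 of "shipping"?
print(role_transition_feature("produce", "arg1", "ship", "arg1"))
# -> create.product->send.theme
```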

21 Model Features: SRL Structure
Narrative event chains (Chambers and Jurafsky, 2008)
– PMI(manufacture.arg1, ship.arg1)
– Computed from SRL output over Gigaword (Graff, 2003)
– Advantages: better coverage + relationship strength
Georgia-Pacific and Nekoosa produce market pulp, containerboard and white paper. The goods could be manufactured closer to customers, saving shipping costs.
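The PMI statistic above can be estimated from counts of argument positions filled by the same entity. The sketch below uses a tiny hypothetical set of coreference chains in place of SRL output over Gigaword:

```python
import math
from collections import Counter

# Each "chain" is the set of predicate.arg positions filled by one
# coreferent entity; data is hypothetical.
chains = [
    {"manufacture.arg1", "ship.arg1"},
    {"manufacture.arg1", "ship.arg1", "sell.arg1"},
    {"manufacture.arg0", "sell.arg0"},
    {"ship.arg1"},
]

pair, single = Counter(), Counter()
for chain in chains:
    for pos in chain:
        single[pos] += 1
    for a in chain:
        for b in chain:
            if a < b:               # count each unordered pair once
                pair[(a, b)] += 1

n = len(chains)

def pmi(a, b):
    """Pointwise mutual information between two argument positions."""
    key = (a, b) if a < b else (b, a)
    return math.log((pair[key] / n) / ((single[a] / n) * (single[b] / n)))

print(round(pmi("manufacture.arg1", "ship.arg1"), 3))  # -> 0.288
```

A positive PMI indicates the two positions tend to be filled by the same entity, exactly the script-like regularity the feature exploits.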

22 Model Features: Discourse Structure
Penn Discourse TreeBank (Prasad et al., 2008)
– Feature value: Contingency.Cause.Result
– Might help identify salient discourse segments
Georgia-Pacific and Nekoosa produce market pulp, containerboard and white paper. The goods could be manufactured closer to customers, saving shipping costs.

23 Outline Implicit Argument Annotation and Analysis Model Formulation and Features Evaluation Conclusions and Future Work

24 Evaluation Setting
Data processing: gold SRL labels, OpenNLP coreference, Gigaword
Training (sections 2-21, 24):
– 816 annotated predicates
– 650 implicitly filled argument positions
– LibLinear logistic regression
Testing (section 23):
– 437 annotated predicates
– 246 implicitly filled argument positions
Baseline heuristic: matching argument positions
Armstrong agreed to sell its carpet operations to Shaw Industries. The sale could help Armstrong.
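A sketch of one plausible reading of the matching-positions baseline: fill an implicit argN of a nominal predicate (e.g. "sale") with the nearest preceding argN filler of another instance of the same underlying role set (e.g. verbal "sell"). The function and data are illustrative assumptions, not the paper's implementation; the example mirrors the Armstrong sentence above.

```python
# Hypothetical baseline sketch: nearest prior filler of the same argN
# for the same underlying predicate role set.
def baseline_fill(prior_instances, target_role_set, missing_arg):
    """prior_instances: (role_set, {argN: filler}) pairs in document
    order; return the nearest matching filler, or None."""
    for role_set, args in reversed(prior_instances):
        if role_set == target_role_set and missing_arg in args:
            return args[missing_arg]
    return None

prior = [
    ("sell", {"arg0": "Armstrong",
              "arg1": "its carpet operations",
              "arg2": "Shaw Industries"}),
]
# "The sale could help Armstrong": who is the seller (arg0 of sale)?
print(baseline_fill(prior, "sell", "arg0"))  # -> Armstrong
```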


26 Evaluation Setting
Methodology (Ruppenhofer et al., 2010):
– Ground-truth implicit arguments: the annotated filler constituents
– Predicted implicit argument: the model's highest-scoring candidate
– Prediction score: token overlap (Dice coefficient) between the predicted and ground-truth constituents
– P: total prediction score / prediction count
– R: total prediction score / true implicit positions
Georgia-Pacific and Nekoosa produce market pulp, containerboard and white paper. The goods could be manufactured closer to customers, saving shipping costs.
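The precision/recall computation can be sketched as follows, assuming the per-prediction score is Dice token overlap with the ground-truth constituent; the spans and counts below are hypothetical:

```python
def dice(pred_tokens, gold_tokens):
    """Dice overlap between predicted and gold token spans."""
    p, g = set(pred_tokens), set(gold_tokens)
    if not p or not g:
        return 0.0
    return 2 * len(p & g) / (len(p) + len(g))

def precision_recall(predictions, n_true_positions):
    """predictions: (pred_tokens, gold_tokens-or-None) pairs; a None
    gold means the prediction was spurious and scores 0."""
    scores = [dice(p, g) if g is not None else 0.0 for p, g in predictions]
    total = sum(scores)
    return total / len(predictions), total / n_true_positions

# Hypothetical token-index spans for three predictions.
preds = [
    (range(0, 3), range(0, 3)),   # exact match -> score 1.0
    (range(5, 9), range(6, 8)),   # partial overlap
    (range(10, 12), None),        # spurious prediction
]
p, r = precision_recall(preds, n_true_positions=4)
print(round(p, 3), round(r, 3))  # -> 0.556 0.417
```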

27 Evaluation Results
Overall F1:
– Baseline: 26.5%
– Discriminative: 42.3%
– Human annotator: 58.4% (two-sentence window); 67.0% (unlimited window)

28 Evaluation Results
[Table: per-predicate results for sale, price, investor, bid, plan, cost, loss, loan, investment, fund; columns: # filled, baseline F1 (%), discriminative F1 (%), p, oracle recall (%); numeric cells not legible in the transcript]

29 Feature Ablation
Ablation sets:
– SRL structure (e.g., VerbNet role transition)
– Non-SRL information
– Discourse structure
Percent change (p-value):
                    P               R              F1
Remove SRL str          (<0.01)    -36.1 (<0.01)  -35.7 (<0.01)
Remove non-SRL      -26.3 (<0.01)  -11.9 (0.05)   -19.2 (<0.01)
Remove discourse      0.2 (0.95)     1.0 (0.66)     0.7 (0.73)

30 Improvements Versus Baseline
"Olivetti has denied that it violated the rules, asserting that the shipments were properly licensed. However, the legality of these sales is still an open question."
Who is the seller?
Two key pieces of information:
– Coreference chain for Olivetti ("Olivetti exports…", "Olivetti supplies…")
– Relationships between exporting, supplying, and sales

31 Outline Implicit Argument Annotation and Analysis Model Formulation and Features Evaluation Conclusions and Future Work

32 Conclusions and Future Work
Implicit arguments are prevalent:
– They add 65% to the coverage of NomBank
Most implicit arguments are near the predicate:
– 55% in the current sentence
– 90% within the current or previous three sentences
Implicit arguments can be automatically extracted:
– SRL structure is currently the most informative feature set
– This is a difficult task and much work remains
Ongoing investigations:
– Global inference instead of local classification
– Unsupervised knowledge acquisition
Data:

