Published by Michael Cochran. Modified over 4 years ago.

1
Feature Forest Models for Syntactic Parsing
Yusuke Miyao
University of Tokyo

2
Probabilistic models for NLP
Widely used for disambiguation of linguistic structures
Ex.) POS tagging
A pretty girl is crying
[Figure: lattice of candidate POS tags (NN, DT, VBZ, JJ, VBG) for each word, annotated with the primitive probability P(NN|a/NN, pretty)]

5
Implicit assumption
Processing state = Primitive probability
– Efficient algorithm for searching
– Avoid exponential explosion of ambiguities
A pretty girl is crying
[Figure: lattice of candidate POS tags (NN, DT, VBZ, JJ, VBG) for each word; POS tag = processing state = primitive probability]
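
To illustrate why tying each processing state to one primitive probability makes search efficient, here is a minimal Viterbi sketch over a POS-tag lattice. It is not taken from the talk: the first-order model, the function name `viterbi`, and all scores are hypothetical, unnormalized toy values chosen only for illustration.

```python
import math

def viterbi(words, tags, emit, trans, start):
    """Return the best tag sequence under a first-order model (log scores).

    Each lattice cell (position, tag) is one processing state carrying one
    primitive probability, so the search costs O(len(words) * len(tags)**2)
    instead of enumerating len(tags)**len(words) sequences.
    """
    # best[i][t]: best log score of any tagging of words[:i+1] that ends in t
    best = [{t: start[t] + emit[t].get(words[0], -math.inf) for t in tags}]
    back = []
    for word in words[1:]:
        scores, pointers = {}, {}
        for t in tags:
            prev = max(tags, key=lambda p: best[-1][p] + trans[p][t])
            scores[t] = best[-1][prev] + trans[prev][t] + emit[t].get(word, -math.inf)
            pointers[t] = prev
        best.append(scores)
        back.append(pointers)
    # Follow the back-pointers from the best final state.
    path = [max(tags, key=lambda t: best[-1][t])]
    for pointers in reversed(back):
        path.append(pointers[path[-1]])
    return list(reversed(path))

# Hypothetical toy model (values are illustrative, not estimated from data).
TAGS = ["DT", "JJ", "NN", "VBZ", "VBG"]
LOW, HIGH = math.log(0.05), math.log(0.8)
start = {t: LOW for t in TAGS}
start["DT"] = HIGH
trans = {p: {t: LOW for t in TAGS} for p in TAGS}
for p, t in [("DT", "JJ"), ("JJ", "NN"), ("NN", "VBZ"), ("VBZ", "VBG")]:
    trans[p][t] = HIGH
emit = {
    "DT": {"a": math.log(0.9)},
    "JJ": {"pretty": math.log(0.5)},
    "NN": {"girl": math.log(0.5), "pretty": math.log(0.1), "crying": math.log(0.1)},
    "VBZ": {"is": math.log(0.9)},
    "VBG": {"crying": math.log(0.5)},
}

tagged = viterbi("a pretty girl is crying".split(), TAGS, emit, trans, start)
print(tagged)
```

The dynamic program works only because the score decomposes over lattice cells; the following slides question whether that decomposition is always appropriate.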

8
Is the assumption right?
Ex.) Shallow parsing, NE recognition
– B (Begin), I (Internal), O (Other) tags are introduced to represent multi-word tags
A pretty girl is crying
[Figure: lattice of candidate chunk tags (NP-B, NP-I, VP-B, VP-I, O) for each word]

11
Is the assumption right?
Ex.) Syntactic parsing
– Non-local dependencies are not represented
What do you want to give?
[Figure: parse tree with VP and S nodes, annotated with the primitive probability P(VP|VPto give)]

13
Problem of existing models
Processing state = Primitive probability
How can the probability of ambiguous structures be modeled with more flexibility?

14
Possible solution
A complete structure is a primitive event
– Ex.) Shallow parsing
A pretty girl is crying
[Figure: lattice of candidate chunk tags (NP-B, NP-I, VP-B, VP-I, O) for each word]

16
Possible solution
A complete structure is a primitive event
– Ex.) Shallow parsing
Probability of the sequence of multi-word tags
A pretty girl is crying
[Figure: all possible sequences of multi-word NP and VP tags over the sentence]

18
Possible solution
A complete structure is a primitive event
– Ex.) Syntactic parsing
What do you want to give?
[Figure: parse tree with VP and S nodes]

20
Possible solution
A complete structure is a primitive event
– Ex.) Syntactic parsing
Probability of argument structures
what do you want to give
[Figure: argument structure over the words, with ARG1, ARG2, and MODIFY dependencies]

21
Problem
Complete structures have exponentially many ambiguities
A pretty girl is crying
[Figure: exponentially many sequences of multi-word NP and VP tags over the sentence]
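
The growth can be made concrete with a small enumeration. The sketch below rests on simplifying assumptions that are not in the slides: every word belongs to exactly one contiguous chunk, each chunk is labeled NP or VP, and there is no O tag. The function name `chunk_sequences` is hypothetical.

```python
from itertools import product

def chunk_sequences(n_words, labels):
    """Enumerate every split of n_words into contiguous labeled chunks.

    Each result is a list of (label, chunk_length) pairs.
    """
    results = []
    # Each of the n_words - 1 gaps between words is either a boundary or not.
    for boundaries in product([False, True], repeat=n_words - 1):
        lengths, run = [], 1
        for is_boundary in boundaries:
            if is_boundary:
                lengths.append(run)
                run = 1
            else:
                run += 1
        lengths.append(run)
        # Every chunk independently takes one of the labels.
        for labeling in product(labels, repeat=len(lengths)):
            results.append(list(zip(labeling, lengths)))
    return results

sequences = chunk_sequences(5, ["NP", "VP"])
print(len(sequences))  # 162, i.e. 2 * 3**4
```

Under these assumptions a sentence of n words with k labels has k(k+1)^(n-1) complete structures, so treating each one as a separate primitive event quickly becomes infeasible to enumerate.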

23
Proposal
Feature forest model [Miyao and Tsujii, 2002]
Exponentially many trees are packed
Features are assigned to each conjunctive node
[Figure: a feature forest of conjunctive nodes and disjunctive nodes, with features attached to the conjunctive nodes]
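
One way to picture the packing is as a small data structure. This is a minimal sketch, not the authors' implementation: the class names `Conjunctive` and `Disjunctive`, the helpers `count_trees` and `build_chain`, and the feature strings are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class Disjunctive:
    """A choice point: exactly one alternative is selected in any tree."""
    alternatives: List["Conjunctive"] = field(default_factory=list)

@dataclass
class Conjunctive:
    """A structure fragment: carries features and needs all of its daughters."""
    features: List[str]
    daughters: List[Disjunctive] = field(default_factory=list)

def count_trees(node, memo=None):
    """Count the trees packed below node, visiting each shared node once."""
    memo = {} if memo is None else memo
    key = id(node)
    if key not in memo:
        if isinstance(node, Disjunctive):
            memo[key] = sum(count_trees(c, memo) for c in node.alternatives)
        else:
            total = 1
            for daughter in node.daughters:
                total *= count_trees(daughter, memo)
            memo[key] = total
    return memo[key]

def build_chain(depth):
    """Build `depth` binary choices in a row; both alternatives share the rest."""
    rest = None
    for level in reversed(range(depth)):
        daughters = [rest] if rest is not None else []
        rest = Disjunctive([
            Conjunctive([f"level{level}=a"], daughters),
            Conjunctive([f"level{level}=b"], daughters),
        ])
    return rest

forest = build_chain(20)
print(count_trees(forest))  # 1048576 trees from 20 disjunctive and 40 conjunctive nodes
```

Because the alternatives share their daughters, the number of nodes grows linearly with depth while the number of packed trees doubles at each level.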

25
Feature forest model
Feature forest models can be efficiently estimated without exponential explosion [Miyao and Tsujii, 2002]
When the forest is unpacked, the model is equivalent to a maximum entropy model [Berger et al., 1996]
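
The equivalence can be checked numerically on a toy forest. The sketch below assumes a log-linear score in which each tree's weight is the exponential of the summed feature weights of its conjunctive nodes. It computes the normalizer once on the packed forest and once by unpacking every tree. The class names, helper names, feature names, and weights are hypothetical, and the recursion is in the spirit of an inside pass rather than a reproduction of the authors' estimation algorithm.

```python
import math
from itertools import product

class Conj:
    def __init__(self, features, daughters=()):
        self.features = list(features)    # features firing at this node
        self.daughters = list(daughters)  # disjunctive daughters, all required

class Disj:
    def __init__(self, alternatives):
        self.alternatives = list(alternatives)  # conjunctive nodes, choose one

def node_weight(conj, weights):
    return math.exp(sum(weights.get(f, 0.0) for f in conj.features))

def inside(node, weights, memo):
    """Sum of tree weights below node, computed once per shared node."""
    if id(node) not in memo:
        if isinstance(node, Disj):
            value = sum(inside(c, weights, memo) for c in node.alternatives)
        else:
            value = node_weight(node, weights)
            for daughter in node.daughters:
                value *= inside(daughter, weights, memo)
        memo[id(node)] = value
    return memo[id(node)]

def unpack(node):
    """Yield every packed tree as the list of its conjunctive nodes."""
    if isinstance(node, Disj):
        for c in node.alternatives:
            yield from unpack(c)
    else:
        for combo in product(*(list(unpack(d)) for d in node.daughters)):
            yield [node] + [c for part in combo for c in part]

# Hypothetical toy forest: `shared` is reachable from both top-level alternatives.
shared = Disj([Conj(["leaf=x"]), Conj(["leaf=y"])])
modifier = Disj([Conj(["mod=m"]), Conj(["mod=n"])])
root = Disj([Conj(["top=a"], [shared]), Conj(["top=b"], [shared, modifier])])
weights = {"top=a": 0.5, "leaf=x": -0.2, "mod=m": 1.0}

z_packed = inside(root, weights, {})
trees = list(unpack(root))
z_unpacked = sum(
    math.exp(sum(weights.get(f, 0.0) for c in tree for f in c.features))
    for tree in trees
)
print(len(trees), z_packed, z_unpacked)
```

The two normalizers agree, which is the sense in which the packed model matches a maximum entropy model over the unpacked trees; only the packed computation avoids visiting every tree.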

28
Application to parsing
Applying a feature forest model to disambiguation of argument structures
How can the exponential ambiguities of argument structures be represented with a feature forest?
– Argument structures are not trees, but DAGs (including reentrant structures)

29
Packing argument structures
An example including reentrant structures
She neglected the fact that I wanted to argue.
[Figure: two candidate argument structures over the nodes want, I, argue1/argue2, and fact, with ARG1/ARG2 edges and a reentrancy index 1]

30
Packing argument structures
Inactive parts: Argument structures whose arguments are all instantiated
Inactive parts are packed into conjunctive nodes
She neglected the fact that I wanted to argue.
[Figure: step-by-step packing of the argument structures for "I", "argue1", "want", "argue2", and "fact" into conjunctive nodes linked by A1/A2 arguments]

39
Feature forest representation of argument structures
She neglected the fact that I wanted to argue.
Conjunctive nodes correspond to argument structures whose arguments are all instantiated
[Figure: feature forest for the sentence, with conjunctive nodes built from she, neglect, fact, want, argue1, argue2, and I, linked by A1/A2 arguments]

40
Experiments
Grammar: a treebank grammar of HPSG [Miyao and Tsujii, 2003]
– Extracted from Sections 02-21 of the Penn Treebank [Marcus et al., 1994]
Training: Sections 02-21 of the Penn Treebank
Test: sentences from Section 22 covered by the grammar
Measure: Accuracy of dependencies in argument structures

41
Experiments
Features: the combinations of
– Surface strings/POS
– Labels of dependencies (ARG1, ARG2, …)
– Labels of lexical entries (head noun, transitive, …)
– Distance
Estimation algorithm: Limited-memory BFGS algorithm [Nocedal, 1980] with MAP estimation [Chen & Rosenfeld, 1999]
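
As a rough sketch of the kind of objective such an estimation minimizes, the code below computes a penalized negative log-likelihood and its gradient for a tiny log-linear model. It assumes the MAP estimation amounts to a zero-mean Gaussian prior on the weights with variance `sigma2`. To stay dependency-free, plain gradient descent is swapped in for limited-memory BFGS, so it shows the objective and not the optimizer from the slides. The function names, feature names, and toy events are hypothetical.

```python
import math

def neg_log_posterior(weights, events, sigma2):
    """Return (value, gradient) of the penalized negative log-likelihood.

    events: list of (candidates, gold_index); a candidate maps feature -> value.
    """
    value = sum(w * w for w in weights.values()) / (2.0 * sigma2)
    grad = {f: w / sigma2 for f, w in weights.items()}
    for candidates, gold in events:
        scores = [sum(weights.get(f, 0.0) * v for f, v in c.items())
                  for c in candidates]
        shift = max(scores)  # subtract the maximum for numerical stability
        log_z = shift + math.log(sum(math.exp(s - shift) for s in scores))
        value -= scores[gold] - log_z
        for c, s in zip(candidates, scores):
            p = math.exp(s - log_z)
            for f, v in c.items():
                grad[f] = grad.get(f, 0.0) + p * v   # model expectation
        for f, v in candidates[gold].items():
            grad[f] = grad.get(f, 0.0) - v           # empirical count
    return value, grad

def fit(events, sigma2=1.0, step=0.1, iterations=500):
    """Gradient descent stand-in for L-BFGS (an assumption of this sketch)."""
    weights = {f: 0.0 for cands, _ in events for c in cands for f in c}
    for _ in range(iterations):
        _, grad = neg_log_posterior(weights, events, sigma2)
        for f in weights:
            weights[f] -= step * grad[f]
    return weights

# Hypothetical toy events: two candidates each, with the index of the correct one.
events = [
    ([{"dep=ARG1": 1.0, "dist=1": 1.0}, {"dep=ARG2": 1.0, "dist=1": 1.0}], 0),
    ([{"dep=ARG1": 1.0, "dist=2": 1.0}, {"dep=ARG2": 1.0, "dist=2": 1.0}], 0),
    ([{"dep=ARG1": 1.0, "dist=1": 1.0}, {"dep=ARG2": 1.0, "dist=2": 1.0}], 1),
]
weights = fit(events)
print(weights)
```

In a feature forest model the candidates are not listed explicitly; the model expectations in the gradient would instead be computed on the packed forest.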

42
Preliminary results
Estimation time: 143 min.
Accuracy (precision/recall):

                 exact          partial
Baseline         48.1 / 47.4    57.1 / 56.2
Unigram          77.3 / 77.4    81.1 / 81.3
Feature forest   85.5 / 85.3    88.4 / 88.2
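
For readers unfamiliar with the measure, precision and recall over dependencies can be computed as in the sketch below. The representation of a dependency as a (head, label, dependent) triple and the example triples are assumptions made for illustration; the slides do not define how exact and partial matches are distinguished, so the sketch does not attempt that distinction.

```python
def precision_recall(predicted, gold):
    """Precision and recall of predicted dependencies against gold ones."""
    predicted, gold = set(predicted), set(gold)
    correct = len(predicted & gold)
    precision = correct / len(predicted) if predicted else 0.0
    recall = correct / len(gold) if gold else 0.0
    return precision, recall

# Hypothetical dependencies for "She neglected the fact that I wanted to argue."
gold = {
    ("neglect", "ARG1", "she"),
    ("neglect", "ARG2", "fact"),
    ("want", "ARG1", "I"),
    ("want", "ARG2", "argue"),
    ("argue", "ARG1", "I"),
}
predicted = gold | {("argue", "ARG2", "fact")}  # one spurious dependency

precision, recall = precision_recall(predicted, gold)
print(f"precision={precision:.3f} recall={recall:.3f}")
```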

43
Conclusion
Feature forest models allow the probabilistic modeling of complete structures without exponential explosion
The application to syntactic parsing resulted in high accuracy

44
Ongoing work
Refinement of the grammar and tuning of estimation parameters
Development of efficient algorithms for best-first/beam search
