
1 HPSG parser development at U-Tokyo
Takuya Matsuzaki, University of Tokyo

2 Topics
– Overview of the U-Tokyo HPSG parsing system
– Supertagging with the Enju HPSG grammar

3 Overview of the U-Tokyo parsing system
Two different algorithms:
– Enju parser: supertagging + CKY algorithm for TFS
– Mogura parser: supertagging + CFG-filtering
Two disambiguation models:
– one trained on PTB-WSJ
– one trained on PTB-WSJ + Genia (biomedical)

4 Supertagger-based parsing [Clark and Curran, 2004; Ninomiya et al., 2006]
Supertagging [Bangalore and Joshi, 1999]: selecting a few LEs for a word by using a probabilistic model of P(LE | sentence)
[Figure: the words “I like it”, each paired with candidate lexical entries (feature structures with HEAD, SUBJ, and COMPS values), some with large and some with small probabilities]

5 Supertagger-based parsing [Clark and Curran, 2004; Ninomiya et al., 2006]
Ignore the LEs with small probabilities: the input to the parser is the set of LEs with P > threshold.
[Figure: the same “I like it” example, with a threshold line cutting off the low-probability lexical entries]
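As a concrete illustration of this thresholding step, here is a minimal Python sketch; the data format, the lexical-entry names, and the threshold value are all invented for the example and are not Enju's actual interface.

```python
# Minimal sketch of probability-threshold pruning of lexical entries (LEs).
# All names and numbers below are illustrative assumptions.

def prune_supertags(tagged_sentence, threshold=0.05):
    """Keep, per word, only the LEs whose probability exceeds the threshold.

    tagged_sentence: list of (word, [(lexical_entry, probability), ...])
    """
    pruned = []
    for word, candidates in tagged_sentence:
        kept = [(le, p) for le, p in candidates if p > threshold]
        if not kept:  # never prune a word down to nothing; keep the 1-best LE
            kept = [max(candidates, key=lambda c: c[1])]
        pruned.append((word, kept))
    return pruned

# Toy probabilities for "I like it":
sentence = [
    ("I",    [("np_le", 0.95), ("det_le", 0.01)]),
    ("like", [("trans_verb_le", 0.80), ("prep_le", 0.15)]),
    ("it",   [("np_le", 0.97)]),
]
print(prune_supertags(sentence))
```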

6 Flow in the Enju parser
1. POS tagging by a CRF-based model
2. Morphological analysis (inflected → base form) by the WordNet dictionary
3. Multi-supertagging by a MaxEnt model
4. TFS CKY parsing + MaxEnt disambiguation on the multi-supertagged sentence
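To make the data flow concrete, the toy below mirrors stages 1-3 with trivial stand-ins; every function and lexicon here is a hypothetical placeholder (Enju itself is not a Python library), and stage 4 is only indicated in a comment.

```python
# Toy stand-ins for the Enju pipeline stages; all names and data are
# illustrative assumptions, not Enju's real interface.

def crf_pos_tag(words):
    # Stage 1: POS tagging (the real system uses a CRF model).
    toy_lexicon = {"I": "PRP", "like": "VBP", "it": "PRP"}
    return [toy_lexicon.get(w, "NN") for w in words]

def lemmatize(words, pos_tags):
    # Stage 2: inflected -> base form (Enju consults the WordNet dictionary).
    return [w.lower() for w in words]

def multi_supertag(words, pos_tags):
    # Stage 3: several candidate LEs per word with MaxEnt probabilities.
    toy_tags = {
        "PRP": [("np_le", 0.95)],
        "VBP": [("trans_verb_le", 0.80), ("intrans_verb_le", 0.15)],
    }
    return [toy_tags.get(p, [("noun_le", 0.90)]) for p in pos_tags]

def enju_flow(words):
    pos_tags = crf_pos_tag(words)
    lemmas = lemmatize(words, pos_tags)
    supertags = multi_supertag(words, pos_tags)
    # Stage 4, TFS CKY parsing + MaxEnt disambiguation, is omitted here.
    return list(zip(words, lemmas, pos_tags, supertags))

print(enju_flow(["I", "like", "it"]))
```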

7 Flow in the Mogura parser
1. POS tagging by a CRF-based model
2. Morphological analysis (inflected → base form) by the WordNet dictionary
3. Supertagging by a MaxEnt model
4. Selection of a (probably) constraint-satisfying supertag assignment
5. TFS shift-reduce parsing on the singly-supertagged sentence

8 Previous supertagger-based parsing [Clark and Curran, 2004; Ninomiya et al., 2006]
[Figure: repeats slide 5: LEs with small probabilities are ignored, and only the LEs with P > threshold are passed to the parser]

9 Supertagging is “almost parsing”
[Figure: the lexical entry chosen for “like” (HEAD verb, with SUBJ and COMPS lists) already determines how it combines with its subject and its complement]

10 A dilemma in the previous method
Fewer LEs → faster parsing, but too few LEs → more risk of no well-formed parse trees
[Figure: the “I like it” example with only a single lexical entry left per word]

11 Mogura overview
[Figure: the input sentence “I like it” goes through the supertagger, which produces a table of candidate LEs per word with probabilities; LE assignments are then enumerated in order of probability, and deterministic disambiguation produces the final analysis]

12 Enumeration of the maybe-parsable LE assignments
[Figure: from the supertagging result (a per-word table of LEs with probabilities), the highest-probability LE sequences are enumerated one by one and passed through the CFG-filter]
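One way to realize this enumeration is lazy best-first search over the lattice of per-word candidate indices with a priority queue. The sketch below is an assumed mechanism, not Mogura's actual code; the CFG-filter call is only indicated in a comment.

```python
import heapq

def enumerate_assignments(candidates):
    """Yield LE assignments in descending order of probability product.

    candidates: one list per word of (lexical_entry, prob) pairs,
    each sorted by descending prob.
    """
    def score(idx):
        p = 1.0
        for w, i in enumerate(idx):
            p *= candidates[w][i][1]
        return p

    start = (0,) * len(candidates)
    heap, seen = [(-score(start), start)], {start}
    while heap:
        neg_p, idx = heapq.heappop(heap)
        yield -neg_p, [candidates[w][i][0] for w, i in enumerate(idx)]
        # Successors: advance one word's candidate index by one step.
        for w in range(len(idx)):
            if idx[w] + 1 < len(candidates[w]):
                nxt = idx[:w] + (idx[w] + 1,) + idx[w + 1:]
                if nxt not in seen:
                    seen.add(nxt)
                    heapq.heappush(heap, (-score(nxt), nxt))

cands = [
    [("np_le", 0.95), ("det_le", 0.05)],
    [("trans_verb_le", 0.80), ("prep_le", 0.20)],
    [("np_le", 0.97), ("adv_le", 0.03)],
]
for p, assignment in enumerate_assignments(cands):
    print(round(p, 4), assignment)
    # In Mogura, each assignment would be checked by the CFG-filter here,
    # and the first one that passes would go on to the shift-reduce parser.
```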

13 Parsing with a CFG that approximates the HPSG [Kiefer and Krieger, 2000; Torisawa et al., 2000]
– Approximation = elimination of some constraints in the grammar (long-distance dependencies, number, case, etc.)
– Covering property: if an LE assignment is parsable by the HPSG, it is also parsable by the approximating CFG
– CFG parsing is much faster than HPSG parsing
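The filtering test itself amounts to ordinary CKY recognition over the approximating CFG. Below is a minimal sketch with an invented toy grammar in Chomsky normal form; the real CFG compiled from the Enju grammar is far larger.

```python
# Toy approximating CFG: binary rules (left, right) -> parent, plus
# preterminal rules mapping supertags to nonterminals. Invented for
# illustration; not the CFG actually extracted from the HPSG.
BINARY = {("NP", "VP"): "S", ("V", "NP"): "VP"}
PRETERMINAL = {"np_le": "NP", "trans_verb_le": "V"}

def cfg_accepts(supertags, start="S"):
    """CKY recognition: is this supertag sequence parsable by the CFG?"""
    n = len(supertags)
    # chart[i][j] = nonterminals spanning positions i (incl.) to j (excl.)
    chart = [[set() for _ in range(n + 1)] for _ in range(n)]
    for i, tag in enumerate(supertags):
        if tag in PRETERMINAL:
            chart[i][i + 1].add(PRETERMINAL[tag])
    for span in range(2, n + 1):
        for i in range(n - span + 1):
            j = i + span
            for k in range(i + 1, j):
                for left in chart[i][k]:
                    for right in chart[k][j]:
                        parent = BINARY.get((left, right))
                        if parent:
                            chart[i][j].add(parent)
    return start in chart[0][n]

print(cfg_accepts(["np_le", "trans_verb_le", "np_le"]))  # True
print(cfg_accepts(["trans_verb_le", "np_le", "np_le"]))  # False
```

By the covering property, a False here safely discards the assignment, while a True only means the full HPSG might accept it.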

14 Results on PTB-WSJ

Parser                         Grammar      Accuracy            Speed
MST parser                     dependency   90.02% (LAS)        4.5 snt/sec
Sagae’s parser                 dependency   89.01% (LAS)        21.6 snt/sec
Berkeley parser                CFG          89.27% (LF1)        4.7 snt/sec
Charniak’s parser              CFG          89.55% (LF1)        2.2 snt/sec
Charniak’s parser + reranker   CFG          % (LF1)             1.9 snt/sec
Enju parser                    HPSG         88.87% (PAS-LF1)    2.7 snt/sec
Mogura parser                  HPSG         88.07% (PAS-LF1)    22.8 snt/sec

15 Supertagging with the Enju grammar
Input: POS-tagged sentence
Number of supertags (lexical templates): 2,308
Current implementation:
– Classifier: MaxEnt, pointwise prediction (i.e., no dependencies among neighboring supertags)
– Features: words and POS tags in a -2/+3 window
92% token accuracy (1-best, only on covered tokens)
It’s “almost parsing”: 98-99% parsing accuracy (PAS F1) given correct lexical assignments
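A sketch of the window feature extraction described above; the feature-name strings are illustrative, and the real Enju feature templates (and their conjunctions) differ in detail.

```python
def window_features(words, pos_tags, i):
    """Words and POS tags in a -2/+3 window around position i,
    with out-of-range positions padded."""
    feats = []
    for offset in range(-2, 4):  # offsets -2, -1, 0, +1, +2, +3
        j = i + offset
        w = words[j] if 0 <= j < len(words) else "<PAD>"
        p = pos_tags[j] if 0 <= j < len(words) else "<PAD>"
        feats.append(f"w[{offset}]={w}")
        feats.append(f"p[{offset}]={p}")
    return feats

words = ["I", "like", "it"]
pos_tags = ["PRP", "VBP", "PRP"]
print(window_features(words, pos_tags, 1))
# A MaxEnt classifier scores each of the 2,308 lexical templates from these
# features, independently at each position (pointwise prediction).
```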

16-21 Pointwise supertagging
[Figure, built up step by step over slides 16-21: the words w1...w8 and their POS tags P1...P8 are the input; the lexical entries S1...S8 are the output, predicted one position at a time]

22 Supertagging: future directions
Basic strategy: do more work in supertagging (rather than in parsing)
Pros:
– Model/algorithm is simpler → easy error analysis, various features without extending the parsing algorithm, and a fast trial-and-error cycle for feature engineering
Cons:
– No tree structure → feature design is sometimes tricky/ad hoc: e.g., “nearest preceding verb/noun” instead of “possible modifiee of a PP” (a sketch of such a feature follows)
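For instance, a “nearest preceding verb/noun” surrogate could look like the sketch below; the POS-prefix test is a deliberate simplification of what a real feature template would use.

```python
def nearest_preceding(pos_tags, i, prefixes=("VB", "NN")):
    """Index of the nearest word before position i whose POS tag starts
    with one of the given prefixes, or None if there is no such word."""
    for j in range(i - 1, -1, -1):
        if pos_tags[j].startswith(prefixes):
            return j
    return None

#        I      saw    a     man   with   a     telescope
pos = ["PRP", "VBD", "DT", "NN", "IN", "DT", "NN"]
print(nearest_preceding(pos, 4))  # 3: the noun "man", a rough proxy for
                                  # the possible modifiee of the PP "with ..."
```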

23 Supertagging: future directions
– Recovery from POS-tagging errors in the supertagging stage
– Incorporation of shallow-processing results (e.g., chunking, NER, coordination structure prediction) as new features
– Comparison across other languages/grammar frameworks

24 Thank you!

25 Deterministic disambiguation
Implemented as a shift-reduce parser:
– Deterministic parsing: only one analysis at a time
– The next parsing action is selected using a scoring function:

    next action = argmax_{a ∈ A} F(a, S, Q)

where F is the scoring function (trained with the averaged-perceptron algorithm [Collins and Duffy, 2002]), features are extracted from the stack state S and the lookahead queue Q, and A is the set of possible actions (the CFG forest is used as a ‘guide’).
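A minimal sketch of the deterministic loop; the scoring function here is a toy stand-in for the trained averaged perceptron, and plain pair-building stands in for applying the HPSG schemata.

```python
def toy_score(action, stack, queue):
    # Hypothetical stand-in for F(a, S, Q): prefer REDUCE whenever two
    # items sit on the stack; a real scorer uses learned feature weights.
    if action == "SHIFT":
        return 1.0 if queue else float("-inf")
    return 2.0 if len(stack) >= 2 else float("-inf")

def shift_reduce(words, score=toy_score):
    stack, queue = [], list(words)
    while queue or len(stack) > 1:
        actions = ["SHIFT", "REDUCE"]  # in Mogura, A is pruned by the CFG forest
        best = max(actions, key=lambda a: score(a, stack, queue))  # argmax F(a, S, Q)
        if best == "SHIFT":
            stack.append(queue.pop(0))
        else:
            right, left = stack.pop(), stack.pop()
            stack.append((left, right))  # in reality: apply an HPSG schema
    return stack[0]

print(shift_reduce(["I", "like", "it"]))  # (('I', 'like'), 'it') with this toy scorer
```

Note that this toy scorer reduces eagerly; the trained model traced on slides 26-31 instead shifts all three words before reducing.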

26 Example
[Figure: initial state. The stack S is empty; the queue Q holds “I”, “like”, “it”, each with its selected lexical entry]

27 argmax F(a, S, Q) = SHIFT
[Figure: “I” is shifted from Q onto S]

28 argmax F(a, S, Q) = SHIFT
[Figure: “like” is shifted onto S]

29 argmax F(a, S, Q) = SHIFT
[Figure: “it” is shifted onto S]

30 argmax F(a, S, Q) = REDUCE(Head_Comp)
[Figure: “like” and “it” are combined by the Head-Comp Schema into a phrase with an empty COMPS list]

31 argmax F(a, S, Q) = REDUCE(Subj_Head)
[Figure: “I” and “like it” are combined by the Subj-Head Schema into a complete analysis with empty SUBJ and COMPS lists]

